Lyndsay’s A-Z of Microsoft Copilot: H is for Hallucinations
- Lyndsay Ansell
- AI, Microsoft Copilot
- Mar 20, 2026
Join me exploring Microsoft Copilot through each letter of the alphabet. This week, H is for Hallucinations — and no, I haven’t been up to anything dodgy.
If you’ve been following this blog series, you’ll have noticed a bit of a theme emerging.
In B is for Be Careful, I showed you that Copilot confidently told me the wrong week of the year.
In C is for Create, I shared the moment it decided we’d hired 45 new people (we are a company of 8).
And in A is for Agents, I mentioned that when I used Researcher, I had to explicitly tell it not to guess at answers — because left to its own devices, it would rather invent something plausible than admit it didn’t know.
All of those moments? That’s hallucination.
So what is a hallucination?
A hallucination is when an AI confidently gives you information that is completely, utterly wrong.
Not “slightly off” wrong. Not “a bit outdated” wrong. We’re talking invented-from-thin-air wrong. Names, statistics, dates: all presented with the same breezy confidence, as if they were indisputable.
The word “hallucination” is borrowed from psychology, and it’s a perfect fit. Just as a person experiencing a hallucination genuinely believes they’re seeing something real, the AI doesn’t know it’s making things up. It presents a fabrication as fact, with zero awareness that it has done so.
That’s what makes it such a sneaky problem.
Why does Copilot hallucinate? What’s happening in there?
Copilot (and AI like it) is built on something called a Large Language Model (LLM). An LLM is trained on a vast amount of text — books, websites, articles, documentation — and through that training it learns patterns: patterns of language, of facts, and of how ideas connect.
When you ask Copilot a question, it isn’t going away to “look up the answer” the way you might Google something. It’s doing something more like extreme-speed pattern-matching. It’s asking itself: given everything I’ve ever “read,” what is the most likely thing to come next?
This is why it’s so impressive when it works, and so bafflingly wrong when it doesn’t.
If you ask it something it has seen a lot of, like well-documented topics or widely published facts, it’s very likely to be accurate, because the pattern is clear and consistent. But if you ask it something more niche, something it has limited training data on, or something that requires it to know your specific reality (like how many people work in your company), the pattern-matching gets wobbly. And rather than say “I don’t know,” it fills in the gap with something that sounds right.
Think of it a bit like autocomplete on your phone. Autocomplete predicts the next word based on patterns. Sometimes it’s brilliant. Sometimes it finishes your text message with something wildly inappropriate. It doesn’t know it’s wrong. It just gives you its best guess.
Copilot is a more sophisticated version of the same thing. And when it guesses wrong, it guesses wrong with complete conviction.
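If you like seeing an idea in code, here’s a tiny sketch of that autocomplete analogy: a toy “next word predictor” in Python, built from a made-up corpus (every word and count here is invented for illustration). Notice that it answers with exactly the same confidence whether it has seen a pattern many times or only once.

```python
from collections import Counter, defaultdict

# A tiny, made-up "training corpus" -- purely for illustration.
corpus = (
    "the report is due friday . the report is late . "
    "the meeting is on friday . the office is closed ."
).split()

# Learn which word tends to follow which (a simple bigram model).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most likely next word, with no notion of 'I'm not sure'."""
    candidates = following.get(word)
    if not candidates:
        # A real LLM never hits this branch: it assigns *some* probability
        # to every possible next word, so it always produces an answer.
        return "<never seen this word>"
    return candidates.most_common(1)[0][0]

print(predict_next("report"))  # 'is' -- backed by two examples
print(predict_next("office"))  # 'is' -- backed by one, same apparent confidence
```

The real thing has billions of parameters instead of a little word-count table, but the failure mode is the same shape: the output gives you no hint of how thin the evidence behind it was.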
It’s not just how these models are built that causes hallucinations; it’s also how they’re trained. Research has shown that AI models are essentially taught to favour confident guessing over admitting uncertainty.
Remember exams? The advice was always “don’t leave an answer blank; write something and you may get some marks.” AI models are trained on the same basis: sounding confident pays off, and saying “I don’t know” does not.
Which is exactly why, back in my A is for Agents blog, I had to explicitly tell Researcher not to guess when it didn’t know an answer. Left to its own devices, it had been trained to fill the gap with something plausible rather than hold its hands up. Once you understand why that happens, that tip makes a lot more sense, and feels a lot more important.
How do you spot a hallucination?
This is where you need your detective hat on. Here are the telltale signs:
🚩 Specific numbers and statistics that you can’t trace
If Copilot tells you “37% of employees said…” and you can’t find where that figure came from — be suspicious. Numbers feel authoritative, which makes them a favourite hallucination flavour.
🚩 Names, titles, or details that don’t quite check out
In my Researcher experiment, I had to explicitly tell it not to guess answers it didn’t know. Left unchecked, it had no problem inventing plausible-sounding ones. If names, job titles, or specific references look a little “off,” verify them.
🚩 No sources, or sources that don’t hold up
Copilot will often provide citations when it’s pulling from real data — this is one of the benefits of grounding (see G is for Grounding if you missed it!). If there are no sources, or the source links don’t actually say what Copilot claims they say, hallucination alert.
🚩 Content that sounds great but seems too neat
This one is subtler. Hallucinations often read beautifully. They’re well-structured, confident, and just plausible enough. If something sounds too perfectly tailored to what you wanted to hear, give it a second look.
🚩 Copilot fills in gaps you didn’t ask it to fill
Remember when Copilot Create couldn’t access my files and just… used “template data” instead? It didn’t flag this upfront — it just quietly filled in the blanks with made-up content. If you haven’t given Copilot enough context and it still produces something detailed and specific, ask yourself: where did it get that from?
The top 3 ways to reduce hallucinations
1. Ground it in real data
We covered this in G is for Grounding, but it’s even more relevant here. The more specific, real-world context you give Copilot, the less likely it is to go rogue and invent things.
Attach the relevant files. Use the Work toggle when you’re asking about work things. Reference specific documents or meetings. When Copilot has solid, verifiable information to draw from, it’s far less likely to need to fill gaps by guessing.
A vague prompt = a higher hallucination risk. A grounded, specific prompt = a much lower one.
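For the curious, here’s what grounding boils down to mechanically. Copilot does this for you when you attach a file or flip the Work toggle, but the principle is simply “send the real data along with the question.” A minimal Python sketch, where the headcount figures and the prompt wording are entirely my own invention:

```python
# Stand-in for the contents of an attached HR file (numbers are made up).
headcount_data = """New starters this year: 2 (Jan: 1, Sep: 1)
Total headcount: 8"""

# Vague prompt: the model has nothing real to work from, so it guesses.
vague_prompt = "Write a summary of our hiring this year."

# Grounded prompt: the data travels with the question, and the model is
# told what to do when the data runs out.
grounded_prompt = (
    "Using ONLY the data below, write a summary of our hiring this year. "
    "If the data doesn't cover something, say so rather than guessing.\n\n"
    f"--- DATA ---\n{headcount_data}\n--- END DATA ---"
)

print(grounded_prompt)
```

Same question, wildly different hallucination risk. That, at heart, is all “grounding” means.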
2. Tell it explicitly not to guess
This one sounds almost too simple, but it works. In my A is for Agents blog, I specifically told Researcher: “If you don’t know the answer, say so — do not guess.” And it made a real difference.
You can do the same in your prompts:
- “If you are unsure of any facts, flag them rather than assume.”
- “Only include information you can verify from the files I’ve provided.”
- “If you can’t find the answer, tell me — don’t make something up.”
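Those phrases work typed straight into the Copilot chat box. If you ever build on a language model yourself, the same trick becomes a standing instruction: a “system message” that applies to the whole conversation. Here’s a minimal sketch using the OpenAI Python SDK as a stand-in (the model name and question are illustrative, and this isn’t how Copilot is wired internally — Microsoft doesn’t publish that):

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {
            # The anti-guessing instruction, baked in up front so it
            # governs every answer in the conversation.
            "role": "system",
            "content": (
                "If you are unsure of any facts, flag them rather than "
                "assume. If you can't find the answer, say so. Do not "
                "make something up."
            ),
        },
        {"role": "user", "content": "How many people joined us this year?"},
    ],
)
print(response.choices[0].message.content)
```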
3. Always check the sources (and the toggle)
As I mentioned in B is for Be Careful, the Work/Web toggle matters a lot. If you’re asking Copilot about something that requires current, factual, internet-based information and you’re on the Work toggle — you’re giving it the wrong tools and increasing the chance it’ll improvise.
Beyond the toggle: check the citations. When Copilot is properly grounded, it should point you to the sources it used. Click them. Read them. Make sure they actually say what Copilot claims. It’s a small habit that can save you a lot of embarrassment.
TL;DR
A hallucination is when Copilot confidently makes something up. You can spot them by watching for unsourced stats, suspiciously neat output, and content that appears from nowhere. You can reduce them by grounding your prompts, telling Copilot not to guess, and always checking your sources.
In short: Copilot is brilliant, but it isn’t infallible. Trust, but verify. Always.
Next time, we’ll see what I is for…
Have you had a memorable Copilot hallucination? I’d love to hear about it in the comments — bonus points if it’s career-safe enough to share! 😄