1. Zero-shot vs few-shot, in one minute
Zero-shot prompting means you give the model an instruction and nothing else — no examples. You're trusting it to know what "good" looks like from its training: "Classify this review as positive, negative, or neutral." Most everyday prompting is zero-shot, and for common tasks it works fine.
Few-shot prompting means you include a handful of worked examples (input-output pairs) right inside the prompt, then give the model the real input last. The model doesn't get re-trained; it follows the pattern it sees in your examples. This is sometimes called in-context learning: the examples teach the model, for this one request, exactly what shape the answer should take. One-shot is the same idea with a single example.
The whole question of this guide is when the examples earn their place. Examples cost tokens and effort, and a sloppy example actively hurts — so the goal isn't "always add examples," it's knowing the moment they flip a result from unreliable to reliable.
The mental model: zero-shot tells the model the rule. Few-shot shows the model the rule being applied. When the rule is easy to state, telling is enough. When "good" is easier to demonstrate than to describe, showing wins.
2. The same task, both ways
Take a concrete job: tag incoming support messages with a category and an urgency. Here's the zero-shot version — pure instruction.
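The prompt itself is missing from this copy of the guide, so what follows is only a minimal sketch of what a zero-shot version could look like. The category names, urgency levels, function name, and sample message are all assumptions made for illustration, not the guide's originals.

```python
# Hypothetical label sets; the guide does not list the real ones.
CATEGORIES = ["Billing", "Bug", "Feature Request"]
URGENCIES = ["High", "Medium", "Low"]


def build_zero_shot_prompt(message: str) -> str:
    """Return an instruction-only prompt: the rule is stated, never shown."""
    return (
        "Tag the support message below with a category and an urgency.\n"
        f"Categories: {', '.join(CATEGORIES)}\n"
        f"Urgency levels: {', '.join(URGENCIES)}\n\n"
        f"Message: {message}"
    )


# Illustrative input, invented for this sketch.
print(build_zero_shot_prompt("The export button crashes the app every time."))
```

Note that nothing in this prompt demonstrates the output layout, which is the gap the next paragraph describes.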
You'll get a reasonable answer — but the format is a gamble. One run returns a paragraph, the next a JSON object, the next a different label like "Technical Issue" that you never asked for. The model is guessing at your house style because you never showed it. Now the few-shot version, with two examples doing the work:
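The few-shot prompt is also missing from this copy, so here is a hedged reconstruction. Only the `category | urgency` format and the two judgments (a double charge is High, a someday-wish is Low) come from the guide; the example wording, the label names, and the `build_few_shot_prompt` helper are assumptions.

```python
# Hypothetical worked examples as (message, category, urgency).
EXAMPLES = [
    ("I was charged twice for this month's subscription.", "Billing", "High"),
    ("It would be nice to have a dark mode someday.", "Feature Request", "Low"),
]


def build_few_shot_prompt(message: str) -> str:
    """Instruction, then worked examples, then the real input left unfinished."""
    lines = [
        "Tag each support message as: category | urgency",
        "Categories: Billing, Bug, Feature Request",  # assumed label set
        "",
    ]
    for text, category, urgency in EXAMPLES:
        lines.append(f"Message: {text}")
        lines.append(f"Tags: {category} | {urgency}")
        lines.append("")
    # The final block has no answer, so the model is invited to complete it
    # in the same shape as the examples above.
    lines.append(f"Message: {message}")
    lines.append("Tags:")
    return "\n".join(lines)


# Illustrative input, invented for this sketch.
print(build_few_shot_prompt("The export button crashes the app every time."))
```

Ending the prompt on an open `Tags:` line is a design choice: it leaves the model one obvious thing to do, which is to fill in the same two fields the examples filled in.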
The two examples pin down three things at once: the exact label set (no surprise "Technical Issue"), the output format (category | urgency, every time), and the urgency judgment — a double-charge is High, a someday-wish is Low, so a broken core feature lands High without you spelling out the rule. The model copies the demonstrated pattern instead of inventing one. Same model, far more consistent output — because you showed instead of only told.
Why this works: the model finishes patterns. When the last line of your prompt is an unfinished example, it tends to complete it in the shape of the examples above. That is the core mechanism behind few-shot, and it is why your examples must be exactly right.
3. The decision rule: show or just tell?
Don't reach for examples by default: they cost tokens, and a bad one teaches the wrong thing. Start zero-shot, then add examples only when the task hits one of four triggers. The first three follow; the fourth, a catch-all, closes this section.
Output must match a shape
You need a precise schema, table, label set, or layout every single time — JSON keys, a fixed CSV, a strict tag vocabulary. One or two examples lock the format far more reliably than describing it in prose.
"Good" is easier to show than say
Tone, voice, and house style resist description. If you'd struggle to write the rule but can paste a perfect example, that's a few-shot job — the example carries the style you can't articulate.
The model keeps repeating a mistake
It mishandles one recurring case — abbreviations, negation, a boundary call. Add an example that demonstrates the right handling of exactly that case, and the error usually disappears.
When to stay zero-shot
Keep it zero-shot — and just tighten the instruction — when any of these is true:
- The task is common and well-understood. Summarize, translate, rewrite, simple classification — modern models already nail these. Examples add tokens for no gain.
- You can't write good examples. A wrong or off-pattern example is worse than none — the model will faithfully copy the flaw. No clean example? Don't fake one.
- You want the shortest, cheapest prompt. For high-volume API calls, a tight zero-shot instruction that already works beats paying for example tokens on every request.
- The instruction itself isn't sharp yet. Fix the wording first — a clear role and explicit constraints often solve the problem before examples are needed.
The fourth few-shot trigger is the catch-all: the task is niche or domain-specific, such as your internal jargon, an unusual labeling scheme, or a format the model has probably never seen. There, examples aren't optional; they're how the model learns your world for the length of the prompt.
4. How many examples — and which ones
More examples are not automatically better. Past a handful, extra examples mostly add cost, and they can even bias the model toward whatever pattern dominates your set. Here's the practical ladder.
One-shot
A single example. Great for locking format fast — the model now knows the shape. Weak at teaching variation, since it's seen only one case.
Two to five (the sweet spot)
Enough to show the common case and a tricky one, without bloating the prompt. This is where few-shot earns its keep for most real tasks.
Ten-plus (rarely)
Diminishing returns and rising cost. If you genuinely need many examples, that's often a signal to fine-tune or restructure the task instead.
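To make the rising cost concrete, here is a back-of-the-envelope sketch. The figures (40 tokens per example, 100,000 requests) are hypothetical, and real token counts depend on the model's tokenizer.

```python
def example_overhead_tokens(tokens_per_example: int, num_examples: int,
                            requests: int) -> int:
    """Extra input tokens spent on examples across a batch of requests."""
    return tokens_per_example * num_examples * requests


# Hypothetical workload: 40 tokens per example, 100,000 requests.
for n in (1, 3, 10):
    print(n, example_overhead_tokens(40, n, 100_000))
# prints:
# 1 4000000
# 3 12000000
# 10 40000000
```

The overhead grows linearly with the number of examples, and it is paid again on every call, so the jump from three examples to ten should buy a clear gain in reliability to be worth it.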
Which examples you pick matters more than how many. Two principles do most of the work:
- Cover diversity, not repetition. Three different cases — including a boundary or tricky one — teach more than ten near-identical ones. The model generalizes from the range it sees.
- Make every example flawless. The model copies your examples literally, mistakes included. One mislabeled example can teach the model to repeat that exact error in its output. Proofread examples harder than you proofread the instruction.
And mind the order: models often weight the most recent examples a little more heavily, so put your cleanest, most representative example last, right before the real input.
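Part of that proofreading can be automated. The sketch below assumes a hypothetical label vocabulary and examples stored as `(message, category, urgency)` tuples. It flags labels outside the vocabulary and duplicated messages, and moves a chosen example to the end. It cannot catch an example that carries a valid but wrong label, so it supplements a careful read rather than replacing one.

```python
from collections import Counter

# Hypothetical label vocabulary for the support-tagging task.
ALLOWED_CATEGORIES = {"Billing", "Bug", "Feature Request"}
ALLOWED_URGENCIES = {"High", "Medium", "Low"}


def check_examples(examples):
    """Return a list of mechanical problems found in the example set."""
    problems = []
    for text, category, urgency in examples:
        if category not in ALLOWED_CATEGORIES:
            problems.append(f"unknown category {category!r} in: {text}")
        if urgency not in ALLOWED_URGENCIES:
            problems.append(f"unknown urgency {urgency!r} in: {text}")
    # Near-identical inputs teach little, so flag exact repeats.
    seen = Counter(text.strip().lower() for text, _, _ in examples)
    for text, count in seen.items():
        if count > 1:
            problems.append(f"duplicate example ({count}x): {text}")
    return problems


def put_last(examples, index):
    """Return a copy of examples with the one at `index` moved to the end."""
    reordered = list(examples)
    reordered.append(reordered.pop(index))
    return reordered


# Invented examples; the third uses a label outside the vocabulary.
examples = [
    ("I was charged twice this month.", "Billing", "High"),
    ("Dark mode would be nice someday.", "Feature Request", "Low"),
    ("Login fails after the update.", "Technical Issue", "High"),
]
print(check_examples(examples))        # flags the 'Technical Issue' label
print(put_last(examples, 0)[-1][1])    # prints "Billing"
```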
5. Zero-shot vs few-shot, side by side
| Dimension | Zero-shot | Few-shot |
|---|---|---|
| Prompt size & cost | Shortest — instruction only | Larger — every example costs tokens on every call |
| Format control | Inconsistent; the model guesses your style | Tight — examples pin the exact shape |
| Best for | Common, well-understood tasks | Specific formats, house style, niche or edge cases |
| Main risk | Drift in format and label vocabulary | A bad example teaches the wrong pattern |
| Effort to write | Low — one clear instruction | Higher — you must craft and verify examples |
The two aren't rivals so much as a progression: start zero-shot, and graduate a prompt to few-shot the moment it starts drifting in format, missing your style, or fumbling the same edge case twice. If even good examples don't make it reliable, the task may want a chain of focused prompts instead of one bigger prompt.
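One way to notice format drift is to run the zero-shot prompt several times and measure how many outputs match the expected shape. The sketch below assumes the `category | urgency` format with hypothetical label names, and the 0.95 threshold is an arbitrary placeholder to tune for your own task.

```python
import re

# Expected shape "category | urgency"; the label names are hypothetical.
PATTERN = re.compile(r"^(Billing|Bug|Feature Request) \| (High|Medium|Low)$")


def conforming_share(outputs):
    """Fraction of model outputs that match the expected shape exactly."""
    if not outputs:
        raise ValueError("need at least one output to measure drift")
    matches = sum(1 for out in outputs if PATTERN.match(out.strip()))
    return matches / len(outputs)


def should_add_examples(outputs, threshold=0.95):
    """Suggest graduating to few-shot when too many outputs drift."""
    return conforming_share(outputs) < threshold


# Invented outputs standing in for repeated zero-shot runs.
sample = [
    "Bug | High",
    "Technical Issue | High",                  # label outside the vocabulary
    '{"category": "Bug", "urgency": "High"}',  # wrong format
    "Billing | Low",
]
print(conforming_share(sample))     # prints 0.5
print(should_add_examples(sample))  # prints True
```

This only measures format and label vocabulary. Style drift and edge-case mistakes still need a human look at the outputs.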
6. The 30-second recap
- Zero-shot = instruction, no examples. Shortest and cheapest; great for common tasks the model already nails.
- Few-shot = a few input-output examples in the prompt, then the real input. The model copies the demonstrated pattern.
- Add examples when you need a precise format, a style you can show but not describe, a fix for a recurring edge case, or for niche/domain tasks. Otherwise stay zero-shot.
- Two to five examples is the sweet spot — diverse over repetitive, every one flawless, cleanest example last.
- The trap: a wrong example is worse than none. The model reproduces the flaw exactly.
The fastest way to internalize this isn't reading it once — it's practicing until "show or just tell?" is an instant call. That's what PromptSharp is built for: one short lesson a day, real before/after examples, until sharp prompting comes out of your fingers without thinking.