Definition
Few-shot prompting is the technique of providing a small number of input-output examples (demonstrations) within the prompt to guide the LLM's behavior on a new, unseen input. The model infers the desired pattern from the examples and applies it — no training or fine-tuning required.
Core Concept
`
[Example 1 Input] → [Example 1 Output]
[Example 2 Input] → [Example 2 Output]
[Example 3 Input] → [Example 3 Output]
[New Input] → ??? (model predicts the output)
`
Terminology
| Term | Examples in Prompt |
|------|-------------------|
| Zero-shot | 0 |
| One-shot | 1 |
| Few-shot | 2–10+ (typically 3–5) |
| Many-shot | 10–100+ |
"In-context learning" is the general term for all shot-based prompting — the model learns from examples in context without weight updates.
Why Few-Shot Works
During pre-training, the model saw countless patterns of the form "X → Y". When shown examples in a prompt, it pattern-matches: "this looks like the pattern where the answer follows the examples in a consistent format."
Key insight: No gradient updates occur. The model learns to solve the task purely from the attention mechanism reading the examples in context.
Few-Shot Examples by Task Type
Sentiment Classification
`
Review: "I loved the product!" → Sentiment: Positive
Review: "Terrible experience." → Sentiment: Negative
Review: "It was okay." → Sentiment: Neutral
Review: "Best purchase I've made this year." → Sentiment:
`
Named Entity Extraction
`
Text: "Apple released the iPhone 15 in September."
Entities: {company: "Apple", product: "iPhone 15", date: "September"}
Text: "Tesla CEO Elon Musk announced the Cybertruck launch."
Entities: {company: "Tesla", person: "Elon Musk", product: "Cybertruck"}
Text: "Microsoft acquired Activision Blizzard for $68.7 billion."
Entities:
`
Format Conversion
`
Input: name=John age=30 city=NYC
Output: {"name": "John", "age": 30, "city": "NYC"}
Input: name=Alice age=25 city=LA
Output: {"name": "Alice", "age": 25, "city": "LA"}
Input: name=Bob age=45 city=Chicago
Output:
`
Few-Shot Best Practices
Example Quality
- Use high-quality, representative examples
- Cover edge cases and diversity of inputs
- Consistent format across all examples
- Place easier examples first, harder last
- For classification: balance classes across examples
- The last example before the test input is most influential
- More examples generally help, up to a point
- 3–5 examples often sufficient for simple tasks
- Complex tasks may benefit from 10+
- Diminishing returns beyond ~20 examples for many tasks
- Zero-Shot, Chain of Thought, Prompt, Context Window, In-Context Learning, RAG
Example Ordering
Example Count
Label Space Coverage
For classification, include examples of all possible output classes.
Few-Shot vs. Fine-Tuning
| Aspect | Few-Shot | Fine-Tuning |
|--------|----------|-------------|
| Training required | No | Yes |
| Cost | Token cost per call | GPU compute (one-time) |
| Flexibility | Easy to change examples | Requires retraining |
| Context efficiency | Uses context window | No context overhead |
| Performance | Good for most tasks | Better for consistent, high-volume |
| Latency | Higher (longer prompt) | Lower (no examples in prompt) |
Few-Shot Chain of Thought (CoT)
Combining few-shot with chain of thought:
`
Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. How many balls does he have?
A: Roger starts with 5 balls. 2 cans × 3 balls = 6 new balls. 5 + 6 = 11 balls. Answer: 11
Q: The cafeteria had 23 apples. They used 20 to make lunch and bought 6 more. How many do they have?
A:
`
The reasoning chain in the example guides the model to show its work.
Dynamic Few-Shot (RAG-based)
Instead of hard-coding examples, dynamically retrieve the most relevant examples from a database:
1. Embed the new query
2. Find most similar examples in an example store
3. Inject those as the few-shot demonstrations
4. More relevant examples → better performance
Few-Shot in Practice
| Platform | Implementation |
|----------|---------------|
| OpenAI API | Add (user, assistant) example turns before the real user message |
| Claude API | Same pattern in messages array |
| HuggingFace | Format using model's chat template |
| LangChain | FewShotPromptTemplate handles formatting automatically |
Limitations
| Limitation | Notes |
|------------|-------|
| Token cost | Each example costs tokens |
| Context window | Many examples fill up available context |
| Label leakage | Model may overfit to example patterns |
| Sensitivity to examples | Wrong/poor examples degrade performance |
| Not true learning | Model doesn't retain knowledge after the call |