Skip to main content

Menu

LEVEL 0
0/5 XP
HomeAboutTopicsPricingMy VaultStatsPractice TestsCertifications

Categories

πŸŽ“ Certifications
πŸ€– Artificial Intelligence
☁️ Cloud and Infrastructure
πŸ’Ύ Data and Databases
πŸ’Ό Professional Skills
🎯 Programming and Development
πŸ”’ Security and Networking
πŸ“š Specialized Topics
CheatGrid
HomeAboutTopicsPricingMy VaultStatsPractice TestsCertifications
LVLEVEL 0
0/5 XP
GitHub
Β© 2026 CheatGridβ„’. All rights reserved.
Privacy PolicyTerms of UseAboutContact

Prompt Engineering Cheat Sheet

Prompt Engineering Cheat Sheet

Tables
Back to Generative AI
Next Topic: Qdrant Vector Database Cheat Sheet
🎯Take a practice test on this topic5 practice tests Β· 168 questionsβ†’

Prompt engineering is the practice of designing and optimizing textual instructions that guide large language models (LLMs) and other AI systems to generate desired outputs. Born from the rise of transformer-based models like GPT, Claude, and Gemini, prompt engineering has evolved from simple question-answer patterns into a sophisticated discipline involving reasoning frameworks, output control, and security considerations. As models grow more capable, the field is converging with context engineering β€” the broader practice of shaping all information a model receives β€” making the structure, format, and context of prompts as important as the words themselves.


Quick Index108Β entriesΒ Β·Β 16Β tables
Mind Map

16 tables, 108 concepts. Select a concept node to jump to its table row.

Preparing mind map...

Table 1: Core Prompting Approaches

Foundational ways to shape an LLM's answer using the wording of a single prompt, before reaching for reasoning chains or tools. They differ mainly in how much you show the model (no examples, one, or several), how you frame the request (persona, situational context, explicit rules), and whether the model is asked to clarify the question for itself first.

TechniqueExampleDescription
Zero-shot prompting
Translate to French: Hello
β€’ Model performs task without examples, relying solely on pre-training knowledge
β€’ fast but less reliable for complex or domain-specific tasks
Few-shot prompting
English: cat β†’ French: chat
English: dog β†’ French: chien
English: bird β†’ ?
β€’ Provides 2–5 example input-output pairs before the query
β€’ significantly improves accuracy and consistency for nuanced tasks
One-shot prompting
Example: "angry" β†’ negative
Classify: "delightful" β†’ ?
β€’ Single demonstration example
β€’ useful when task is straightforward but model needs format guidance
Role prompting
You are an expert oncologist.
Explain CAR-T therapy.
β€’ Assigns a persona or expertise to the model
β€’ most effective for controlling tone, style, and output format rather than expanding factual knowledge
Instruction following
List three benefits. Use bullet points.
Keep under 50 words.
β€’ Explicit directives on what, how, and constraints
β€’ essential for controlling output length, format, and style
Contextual prompting
Background: User is a beginner.
Task: Explain neural networks.
Provides situational information (audience, constraints, domain) to shape response appropriately
Rephrase and Respond (RaR)
Rephrase and expand this question,
then answer: Why is the sky blue?
β€’ Model rephrases the question before answering
β€’ improves accuracy by resolving ambiguity in the original phrasing

Table 2: Reasoning and Decomposition Techniques

These prompts get a model to work through a problem instead of guessing an answer in one leap. Some show step-by-step reasoning, others split a task into smaller parts, branch and backtrack across options, or interleave reasoning with real tool calls. Their gains depend heavily on model scale, and a written reasoning trace is not a guaranteed account of what the model actually did.

MethodExampleDescription
Chain-of-Thought (CoT)
Q: 23 + 47 = ?
A: 23 + 47 = 20 + 40 + 3 + 7 = 60 + 10 = 70
β€’ Prompts model to show step-by-step reasoning
β€’ an emergent ability that helps large models on math and logic, but can fail or hurt on small models
Zero-shot CoT
Let's think step by step.
β€’ Triggers reasoning without examples
β€’ effective shortcut when few-shot is impractical; redundant on reasoning models (o1/o3/R1)
Self-consistency
Generate 5 answers via CoT β†’ select majority answer
β€’ Samples multiple independent reasoning paths and takes the majority answer
β€’ improves reliability, but costs N times the tokens and latency
Tree of Thoughts (ToT)
Evaluate 3 approaches β†’ explore best 2 β†’ backtrack if stuck
β€’ Models reasoning as branching exploration with self-evaluation and backtracking
β€’ uses search over partial paths, handling planning and multi-path problems
Least-to-Most prompting
Step 1: Simplify equation
Step 2: Solve for x using Step 1
β€’ Decomposes a problem into ordered subproblems, each fed the previous answer
β€’ generalizes to problems harder than the examples shown
ReAct (Reasoning + Acting)
Thought: Need population data
Action: search("France population")
Observation: 67M β†’ Answer
β€’ Interleaves reasoning traces with tool actions and observations
β€’ grounds reasoning in retrieved results, reducing hallucination
Plan-and-Solve (PS+)
First, devise a plan to solve this.
Then carry out the plan step by step.
β€’ Model plans subtasks before executing them
β€’ a zero-shot method that reduces the missing-step and calculation errors of zero-shot CoT
Step-back prompting
Before answering, what general
principles apply to this problem?
β€’ Model identifies high-level concepts or first principles before specifics
β€’ improves reasoning on knowledge-intensive and abstract problems
Graph of Thoughts (GoT)
thought_1 + thought_2 β†’ aggregated_insight
Loop back for refinement
β€’ Organizes reasoning as a directed graph that can merge branches and loop back
β€’ most flexible for complex interdependent reasoning
Thread of Thought (ThoT)
Walk me through this context
step by step, summarizing as you go.
β€’ Segments and analyzes long or chaotic contexts methodically
β€’ plug-and-play technique for tasks with extended or noisy input
Auto-CoT
Cluster questions by diversity β†’ auto-generate CoT demos
β€’ Automatically constructs chain-of-thought demonstrations without manual effort
β€’ samples diverse questions so the occasional wrong auto-generated chain does little harm
Self-Ask
Are follow-up questions needed?
Yes: What is...? β†’ intermediate answer
Final answer: ...
β€’ Model generates and answers sub-questions before the main answer
β€’ improves compositional and multi-hop reasoning

Table 3: Output Control and Formatting

These techniques shape what the model returns and how it is structured, so downstream code can parse it and humans can read it. A key distinction runs through the table: prompt-only instructions (asking for JSON, a length, or "do not" rules) are soft requests the model can miss, while API-level features like structured outputs and max_tokens enforce hard constraints.

TechniqueExampleDescription
Structured output (JSON)
Return as JSON: {"name": str, "age": int}
β€’ Enforces a specific schema (JSON, XML, YAML)
β€’ only schema-enforced structured outputs (constrained decoding) guarantee conformance; a prompt-only "Return JSON" can still emit invalid or extra text
XML tag structuring
<context>text</context>
<instructions>summarize</instructions>
β€’ Wraps prompt sections in semantic XML tags
β€’ reduces ambiguity by marking boundaries; an Anthropic best practice especially effective with Claude, but clear delimiters help most models
Delimiters and sections
### Input
text
### Output
summary
β€’ Uses markers (###, ```, ---) to separate sections
β€’ reduces ambiguity about what content the model should process vs. generate |
Output length control
Summarize in exactly 3 sentences.
Keep under 100 tokens.
β€’ Specifies word/sentence/token count
β€’ a stated count is a soft target the model may miss; max_tokens is a hard truncation that can cut output mid-word and break JSON
Format templates
<summary>
<title>...</title>
<body>...</body>
</summary>
β€’ Provides a markup skeleton for the model to fill
β€’ keeps nested or hierarchical output consistent; especially effective with XML
Enumerated instructions
1. Extract entities
2. Classify sentiment
3. Return as table
β€’ Numbered steps clarify sequence and expectations
β€’ improves task adherence when multiple operations are required
Negative prompting (constraints)
Do NOT include personal opinions.
Avoid bullet points.
β€’ Specifies what to exclude from output
β€’ unreliable on its own because models handle negation poorly; pair with positive framing (say what to include)

Table 4: Advanced Reasoning Patterns

These patterns push a model past a single answer by adding structure: writing its own prompts or examples, critiquing and verifying its work, offloading hard computation to code, or trading layout for speed. Knowing what each one actually changes (and where it quietly fails) is what separates a reliable pipeline from a fragile one.

PatternExampleDescription
Meta-prompting
Generate a prompt to classify movie reviews.
β€’ Model writes or optimizes prompts for a task
β€’ enables iterative self-improvement and automated prompt engineering
Generated knowledge prompting
First, list relevant facts about photosynthesis.
Now answer: What role does chlorophyll play?
β€’ Model generates intermediate knowledge before answering
β€’ improves factual accuracy on knowledge-intensive queries, using its own training (not external retrieval)
Self-Refine
Draft β†’ Critique your draft β†’
Revise based on feedback β†’ repeat
β€’ Model iteratively generates, critiques, and refines its own output
β€’ no external model needed; lifts quality on generation tasks, but self-critique alone does not reliably fix reasoning errors
Chain of Verification (CoVe)
Answer β†’ generate verification questions β†’
answer each independently β†’ revise
β€’ Model plans verification questions, answers them independently of the draft, then revises
β€’ significantly reduces hallucinations in factual tasks
Directional stimulus prompting
Keywords: protein, folding, disease
Write an abstract.
Provides hints or cues (keywords, themes) to steer generation toward desired content; a small trained policy model can generate the hints for a frozen LLM
Program-Aided Language (PAL)
Write Python to solve: "If x^2 = 16, find x"
def solve(): return sqrt(16)
β€’ Model generates executable code as the reasoning step
β€’ offloads arithmetic to an interpreter that runs the code, for higher accuracy
Skeleton-of-Thought (SoT)
First: generate outline with 5 sections.
Then: write each section in parallel.
β€’ Creates structural outline first, then parallelizes content generation
β€’ reduces latency by up to 2.4x for long outputs
Chain of Density (CoD)
Summary 1: sparse (50 words)
Summary 2: denser (same length, +3 entities)
Iterate 5 times
β€’ Iteratively packs more entities into a fixed-length summary
β€’ produces human-preferred summaries by the later steps
Active-Prompt
Measure uncertainty on unlabeled questions β†’ annotate most uncertain β†’ add to few-shot pool
β€’ Uses uncertainty sampling to select which examples a human should annotate
β€’ improves few-shot performance with minimal human labeling
Analogical prompting
Recall relevant problems similar to this,
then solve by analogy.
β€’ Model self-generates relevant examples before solving the task
β€’ eliminates manual few-shot curation; improves math and code reasoning
Cumulative reasoning
Generate propositions iteratively β†’
verify each β†’ accumulate into final answer
β€’ Uses a proposer, verifier, and reporter to build the answer from verified steps
β€’ verifying each proposition before accumulating it is what sets it apart from plain chain-of-thought

Table 5: Message Roles and Context Structure

Chat models read a list of role-tagged messages rather than one block of text. The system (and newer developer) role sets standing behavior, user carries the live request, and assistant holds the model's prior replies. Because each request is stateless, your app resends the whole list every turn to maintain context, and higher-privilege roles outrank the user role when instructions conflict.

RoleExampleDescription
System message
You are a helpful assistant specializing in Python.
β€’ Sets global behavior, persona, and constraints
β€’ applied before all user messages as persistent context, but it is guidance, not a security boundary
User message
How do I reverse a list in Python?
β€’ Contains user query or command
β€’ the primary input the assistant responds to
Assistant message
Use list.reverse() or slicing: lst[::-1]
β€’ Model's previous response, supplied back as history
β€’ you can also write one to prefill or steer the next answer
Multi-turn context
[user] "Define recursion"
[assistant] "..."
[user] "Give example"
β€’ Each request is stateless, so the client resends the full history every turn
β€’ longer chats cost more tokens and can exceed the context window
Developer message
[developer] "Always respond in JSON format"
β€’ OpenAI's newer app-developer instruction role
β€’ ranks above user messages in the instruction hierarchy and is meant to win conflicts

Table 6: Prompt Chaining and Workflow Orchestration

Once a task is too big for one prompt, you compose several model calls into a pipeline. These patterns range from simple sequential chains to retrieval, tool use, routing, and self-directing agents. A key theme: the model proposes structured steps, but your code executes tools, routes branches, and enforces stop conditions.

TechniqueExampleDescription
Prompt chaining
Prompt 1: Extract entities β†’ output_1
Prompt 2: Classify entities from {output_1}
β€’ Decomposes a task into sequential LLM calls
β€’ each prompt's output feeds the next, so steps stay simple and easy to debug
Retrieval-Augmented Generation (RAG)
1. Retrieve docs about "mitochondria"
2. Prompt: "Using {docs}, explain ATP synthesis"
β€’ Fetches external documents at query time and adds them to the prompt
β€’ grounds answers in current or proprietary data without retraining the model
Function calling (tool use)
tools: [{"name": "get_weather",
"parameters": {"location": "string"}}]
β€’ Model selects a structured tool schema and emits name plus arguments
β€’ your application code runs the tool, so validate arguments before executing
Agentic workflows
Agent: Plan β†’ Act β†’ Observe β†’ Refine β†’ Act
β€’ Model directs its own steps, choosing tools based on each result
β€’ loops toward a goal, so a max-iteration cap is needed to avoid runaway cost
Conditional branching (routing)
If sentiment=negative: call escalation_prompt
Else: call thank_you_prompt
β€’ Classifies the input, then routes it to a specialized prompt
β€’ separates concerns so each branch stays focused on one kind of case
ReWOO (Reasoning Without Observation)
Plan all tool calls upfront β†’
execute β†’ synthesize
β€’ Decouples planning from observation
β€’ a planner writes the full plan with placeholders, workers run tools, a solver combines results, cutting LLM calls vs ReAct

Table 7: Sample Selection and Example Design

Which examples you put in a few-shot prompt, and in what order, often moves accuracy more than how many you add. This table covers the main ways to choose demonstrations, from query-matched and balanced sets to contrastive pairs, plus the biases that make order and label balance matter.

StrategyExampleDescription
Similarity-based selection
Choose examples most similar to query via embedding distance
β€’ Provides contextually relevant demonstrations
β€’ often outperforms random, but similar examples cluster and can lose diversity
Stratified sampling
2 positive, 2 negative, 1 neutral sentiment
β€’ Ensures balanced coverage of categories
β€’ counters majority-label bias when data is imbalanced
Contrastive examples
Correct: "Step A β†’ B β†’ C"
Incorrect: "Step A β†’ C (missing B)"
β€’ Shows both correct and incorrect cases
β€’ helps the model see which reasoning steps to avoid
Example ordering
Place most relevant or recent examples last
β€’ LLMs exhibit recency bias
β€’ reordering the same examples can swing accuracy from near chance to near best
Random sampling
Pick 5 random examples from dataset
β€’ Baseline approach
β€’ fast but mirrors data skew and ignores query relevance

Table 8: Generation Parameters and Sampling

These settings control how a model turns its next-token probabilities into actual text: how much randomness to allow, which low-probability tokens to discard, how long to keep going, and when to stop. Tuning them well is the difference between focused, parseable output and creative-but-unreliable rambling.

ParameterExampleDescription
Temperature
temperature=0.0 (deterministic)
temperature=1.0 (creative)
β€’ Controls randomness, not answer quality
β€’ lower = more focused/repetitive, higher = more diverse/creative
β€’ typical range 0–2; even 0 is not guaranteed bit-for-bit identical across runs
Top-p (nucleus sampling)
top_p=0.9
β€’ Keeps the smallest token set whose cumulative probability β‰₯ p, then samples from it
β€’ adapts the candidate count to the model's confidence
β€’ vendors recommend tuning temperature or top_p, not both
Max tokens
max_tokens=150
β€’ Hard cap on output length that truncates the moment it is hit
β€’ not a target length; can cut mid-sentence and break JSON
β€’ prevents runaway generation and controls cost
Top-k sampling
top_k=40
β€’ Restricts sampling to the k most likely tokens (a fixed count)
β€’ simpler than top-p but a blunt cutoff that ignores the distribution's shape
Frequency penalty
frequency_penalty=0.5
Reduces repetition by penalizing tokens in proportion to how often they have already appeared (count-based)
Presence penalty
presence_penalty=0.6
Encourages topic diversity with a flat one-time penalty applied once a token has appeared at all, regardless of count
Min-p sampling
min_p=0.05
β€’ Keeps tokens above a fraction of the top token's probability (base value Γ— top probability)
β€’ adaptive: strict when one token dominates, relaxed when the model is uncertain
β€’ pairs well with temperature > 1
Stop sequences
stop=["###", "\n\n"]
β€’ Terminates generation when a specified string is produced (content-based, unlike the length-based max-tokens cap)
β€’ useful for structured outputs and preventing runaway text

Table 9: Multimodal and Vision-Language Prompting

Multimodal prompting feeds a model more than text. You pass an image or audio clip alongside your question, and the model reasons over both. Keep in mind these models do not "see" or "hear" perfectly: they give approximate object counts, struggle with precise spatial detail, and can hallucinate text when reading documents, so verify anything high-stakes.

ApproachExampleDescription
Image + text prompting
[image of chart]
What trend does this show?
β€’ Combines visual and textual input
β€’ model analyzes image content to answer text query
Visual question answering
[image of room]
How many chairs are visible?
Model performs object counting, detection, or scene understanding from image. Counts are approximate, so verify them
OCR and document understanding
[scanned receipt]
Extract total amount.
Reads and interprets text within images, including tables, forms, and structured documents. May hallucinate plausible but wrong values
Image captioning
[photo]
Generate detailed caption.
Model produces natural language description of image content
Visual reasoning
[two images]
Which object is larger?
Requires comparison or relational reasoning across visual inputs. Precise spatial localization is unreliable
Audio prompting
[audio clip]
Transcribe and summarize this meeting.
β€’ Processes speech or audio input natively
β€’ supported by multimodal models like GPT-4o for transcription, analysis, and translation

Table 10: Safety and Robustness

Securing an LLM application means assuming its prompts and the data it reads are adversarial. These techniques cover the layered defenses that matter most: keeping untrusted input from overriding your instructions, validating what the model emits, training models to refuse harmful requests, and testing your system the way an attacker would before it ships.

TechniqueExampleDescription
Prompt injection defense
Use input handling, instruction delimiters, and privilege limits
Mitigates attacks where input tries to override developer instructions or exfiltrate data. OWASP ranks prompt injection as LLM01, the top LLM risk
Output validation
Check output against a schema, encode it, or screen with a secondary LLM
Treats model output as untrusted before it reaches a browser, database, or shell, preventing XSS, SQL injection, or command execution
Constitutional AI principles
Model self-critiques against rules like Refuse harmful requests. Be helpful and honest.
A training method (RLAIF): the model critiques and revises its own answers against a set of principles, not a runtime word filter
Red-teaming prompts
Run adversarial probes such as injection and jailbreak attempts before launch
Adversarial testing to find vulnerabilities before attackers do. Expected by the NIST AI RMF and OWASP LLM Top 10
Jailbreak resistance
Detect attempts to bypass safety via role-play, encoding, or indirection
Models trained to recognize and refuse disguised harmful requests, targeting the model's safety rules (distinct from injection)
Indirect prompt injection defense
Separate trusted instructions from untrusted external data using privilege boundaries
Prevents attackers from embedding hidden instructions in documents, emails, or tool outputs the model processes. The key risk for agentic and RAG systems

Table 11: Emotion and Persona Techniques

These techniques shape how a model speaks and reasons by giving it a role, an audience, or an emotional frame. They mostly steer tone, depth, and perspective, and their effect on factual accuracy is far weaker and less reliable than popular advice suggests.

TechniqueExampleDescription
Expert persona
You are a Pulitzer Prize-winning journalist.
Write a headline.
β€’ Assigns specific expertise or identity
β€’ mainly shapes tone, depth, and style, and does not reliably boost factual accuracy
Multi-persona prompting
Summon three experts (security, UX, backend).
Have them collaborate on a review.
β€’ One model simulates multiple expert personas collaborating in a single self-collaboration
β€’ produces more thorough, multi-perspective outputs
Emotional prompting
This is very important to my career.
Please give your best answer.
β€’ Adds emotional stakes or urgency
β€’ reported gains in earlier studies, but effects are mixed and model-dependent, often weaker on frontier models
Simulated Theory of Mind (SimToM)
Put yourself in the reader's shoes.
What would they find confusing?
β€’ Two-stage perspective-taking: filter context to what a character knows, then answer from that view
β€’ improves reasoning about beliefs and supports more empathetic responses

Table 12: Optimization and Automation

These methods move prompt work from hand-tuning to measured, repeatable engineering: tools that auto-generate and score prompts (APE, DSPy), ways to compare and version prompts in production (A/B testing, prompt versioning), a parameter-efficient training alternative (soft prompts), and an inference trick that reuses a repeated prefix to cut cost and latency (prompt caching).

MethodExampleDescription
Automatic Prompt Engineering (APE)
Generate prompt candidates β†’ score on a dataset β†’ select best performer
β€’ A model proposes instruction candidates, which are then scored and filtered on a validation set
β€’ replaces manual trial-and-error
DSPy framework
Define signatures β†’ framework compiles and optimizes prompts from examples and a metric
β€’ Declarative approach where prompts are compiled, then iteratively improved against a metric, not hand-written
β€’ discards variations that do not score better
A/B testing prompts
Run variant A vs B on the same inputs β†’ measure accuracy, latency, cost β†’ deploy winner
β€’ Empirical comparison to select the best prompt for production
β€’ needs enough samples for statistical confidence, since LLM outputs vary
Prompt tuning (soft prompts)
Learn a few continuous embedding vectors prepended to the input while the model stays frozen
β€’ Trains small learnable vectors, not readable text, leaving model weights frozen
β€’ parameter-efficient alternative to full fine-tuning
Prompt versioning
Track each prompt change as an immutable, identified version with eval metrics
β€’ Manages prompt iterations in production
β€’ enables exact rollback, A/B testing, and regression tracking
Prompt caching
Place static system instructions first β†’ variable content last
β€’ Providers reuse the computed prefix for an exact match, cutting cost up to ~90% and latency up to ~80%
β€’ caches the input prefix, never the response; supported by OpenAI, Anthropic, Google

Table 13: Specialized Patterns and Emerging Techniques

These are newer or niche prompting patterns, several from single recent papers, that squeeze more reliability out of a model without touching its weights. They lean on tricks like self-reflection, voting across reasoning chains, picking complex examples, and even repeating the prompt, so treat the emerging ones as promising rather than settled and verify before production use.

PatternExampleDescription
Reflexion
Review your answer. What could be improved?
Revise β†’ iterate
β€’ Model self-critiques and writes a verbal reflection it stores in memory as context for the next attempt
β€’ reinforces the agent without any weight update or fine-tuning
Complexity-based prompting
Select few-shot examples with the most reasoning steps
β€’ Prefers demonstrations with higher reasoning complexity (longer chains)
β€’ can also vote over the most complex chains at decoding; raises multi-step accuracy
Maieutic prompting
Generate explanation tree β†’ prune contradictory branches
β€’ Builds an abductive, recursive tree of explanations
β€’ frames the answer as a satisfiability problem over their logical relations to find the most consistent one
Universal Self-Consistency
Apply self-consistency to non-reasoning tasks (e.g., classification, extraction)
β€’ Has the LLM itself pick the most consistent of several candidate answers
β€’ extends majority-voting benefits to free-form tasks where answers cannot be counted
Prompt repetition
What are the causes of inflation?
What are the causes of inflation?
β€’ Repeating the prompt twice gives a bidirectional-context effect in causal models, reported to help non-reasoning LLMs
β€’ doubles input token cost but adds no generated tokens or latency
DR-CoT (Dynamic Recursive CoT)
Recurse on sub-steps β†’ truncate context β†’
vote across reasoning chains
β€’ Combines recursive refinement, dynamic context truncation within a token budget, and majority voting
β€’ helps parameter-efficient models rival larger ones; voting can still fail under shared model bias

Table 14: Prompting for Reasoning Models

Reasoning models such as OpenAI o1/o3, Claude with extended thinking, and DeepSeek-R1 think before they answer, so they reward concise goal statements over heavy step-by-step scaffolding. These techniques cover how to steer their hidden reasoning, tune its depth against cost and latency, and avoid instructions that older models needed but these models do not.

TechniqueExampleDescription
Goal-oriented prompting
Solve for x where 3x + 7 = 22.
Show the solution process and final result.
β€’ State desired outcome clearly without prescribing steps
β€’ reasoning models (o1/o3/R1) perform best with concise goal statements
Extended thinking (budget tokens)
thinking: {type: "enabled",
budget_tokens: 10000}
β€’ Allocates a reasoning scratchpad for Claude models, drawn from max_tokens and at least 1024
β€’ model thinks step-by-step in a hidden block before producing the answer
Reasoning effort control
reasoning_effort: "high"
β€’ Adjusts how deeply the same model reasons before answering
β€’ "low" for simple tasks, "high" for complex problems; controls cost and latency
Avoid explicit CoT instructions
Do not add "think step by step" to o1/o3/R1
β€’ Reasoning models already reason internally
β€’ explicit CoT is redundant and can increase latency without benefit

Table 15: Domain-Specific Applications

Prompt engineering plays out differently across common LLM tasks. Code and data work reward low temperature and strict structure, summarization and translation lean on dedicated techniques like Chain of Density and few-shot terminology, and grounded question answering depends on retrieval. Knowing the right tool and setting per task is what separates a reliable pipeline from a flaky one.

DomainExampleDescription
Code generation
Write a Python function to merge two sorted lists.
β€’ Produces runnable code, but output can look correct yet miss details like an import
β€’ favor low temperature and always test the result
Data extraction
Extract: name, email, phone from:
"Contact John at john@ex.com"
β€’ Pulls structured fields from unstructured text
β€’ structured outputs enforce a JSON schema at decode time, far more reliable than free-text or plain JSON mode
Summarization
Summarize this article in 2 sentences.
β€’ Condenses long text into key points
β€’ Chain of Density packs more entities into a fixed length and reduces lead bias
Creative writing
Write a haiku about autumn.
β€’ Generates poetry, stories, or dialogue
β€’ higher temperature adds variety, but too high turns coherent text into nonsense
Translation
Translate to German: "Good morning"
β€’ Converts text between languages
β€’ few-shot examples of approved terminology improve accuracy on domain terms
Question answering
Based on: {document}, answer: Who founded the company?
β€’ Provides a factual answer from context
β€’ RAG grounds answers in private or fresh sources, but is unneeded for general knowledge
Sentiment analysis
Classify sentiment: "I loved this movie!" β†’ positive
β€’ Determines emotional tone
β€’ few-shot with diverse examples improves handling of sarcasm and mixed reviews

Table 16: Anti-Patterns and Common Pitfalls

These are the prompt habits that quietly wreck output quality, run up cost, or produce confidently wrong answers. For each one, the fix is usually the opposite move: be specific, decompose, show examples, set limits, mark boundaries, ground recent facts with retrieval, and match sampling to the task. The last row is an evolving guideline, heavy step-by-step scaffolding helps weaker models but can over-constrain frontier ones.

PatternExampleDescription
Vague instructions
Tell me about AI.
β€’ Lacks specificity
β€’ produces generic, unfocused output; always specify scope, audience, or format
Overloading single prompt
Mixing 10 unrelated tasks in one prompt
β€’ Splits the model's attention, so every task gets shallow output
β€’ better to chain or decompose into separate prompts
No examples for complex tasks
Zero-shot on nuanced classification
β€’ Underperforms without demonstrations
β€’ 2-5 few-shot examples teach your exact criteria; more than that tends to plateau
Ignoring output length
No length constraint, leading to a 5000-word response
β€’ Generates unnecessarily long outputs and runs up cost at scale
β€’ state the length in words; max-tokens only truncates, it does not shape length
Ambiguous delimiters
Input: text here Output: more text (no clear boundary)
β€’ Model confuses what to process vs. generate
β€’ use ### or ``` to separate sections |
Assuming knowledge cutoff awareness
What happened last week? (model trained months ago)
β€’ Model cannot access real-time data and will confidently invent recent facts
β€’ ground it with RAG or tool use
Wrong parameters for task
Deterministic task with temperature=1.5
β€’ Excessive randomness where consistency is needed (temperature is not truthfulness)
β€’ tune temperature and top-p to the task
Excessive scaffolding for capable models
10-step procedural instructions for a frontier model on an open-ended task
β€’ Over-constraining can hinder autonomous reasoning in frontier models
β€’ describe the desired result and let it choose the route; strict steps still suit procedural, schema-bound tasks

Back to Generative AI
Next Topic: Qdrant Vector Database Cheat Sheet

More in Generative AI

  • Pinecone (Vector Database) Cheat Sheet
  • Qdrant Vector Database Cheat Sheet
  • Advanced RAG Patterns and Optimization Cheat Sheet
  • ColBERT and Late Interaction Retrieval Cheat Sheet
  • LangSmith Cheat Sheet
  • NL-to-SQL and Text-to-Code Generation Cheat Sheet
View all 95 topics in Generative AI

References

Official frameworks & canonical authors

  1. Output validation - https://genai.owasp.org/llmrisk/llm052025-improper-output-handling/
  2. Prompt injection defense - https://genai.owasp.org/llmrisk/llm01-prompt-injection/

Vendor documentation & tools

  1. Agentic workflows - https://www.anthropic.com/research/building-effective-agents
  2. Assistant message - https://developers.openai.com/api/docs/guides/conversation-state
  3. Audio prompting - https://platform.openai.com/docs/guides/audio
  4. Avoid explicit CoT instructions - https://developers.openai.com/api/docs/guides/reasoning-best-practices
  5. Constitutional AI principles - https://www.anthropic.com/research/constitutional-ai-harmlessness-from-ai-feedback
  6. DSPy framework - https://dspy.ai/
  7. Extended thinking (budget tokens) - https://platform.claude.com/docs/en/build-with-claude/extended-thinking
  8. Function calling (tool use) - https://developers.openai.com/api/docs/guides/function-calling
  9. Goal-oriented prompting - https://developers.openai.com/api/docs/guides/prompt-engineering
  10. Image + text prompting - https://huggingface.co/blog/vlms
  11. Indirect prompt injection defense - https://www.anthropic.com/research/prompt-injection-defenses
  12. Instruction following - https://developers.openai.com/api/docs/guides/prompt-engineering/
  13. Min-p sampling - https://huggingface.co/posts/joaogante/319451541682734
  14. Output length control - https://developers.openai.com/api/docs/guides/reasoning
  15. Prompt caching - https://platform.openai.com/docs/guides/prompt-caching
  16. Prompt tuning (soft prompts) - https://huggingface.co/docs/peft/conceptual_guides/prompting
  17. Structured output (JSON) - https://developers.openai.com/api/docs/guides/structured-outputs
  18. System message - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/advanced-prompt-engineering
  19. Top-k sampling - https://huggingface.co/docs/transformers/generation_strategies
  20. User message - https://developers.openai.com/api/docs/guides/text-generation
  21. Vague instructions - https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-the-openai-api
  22. Visual question answering - https://platform.openai.com/docs/guides/vision
  23. XML tag structuring - https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices

Academic papers & preprints

  1. Active-Prompt - https://arxiv.org/abs/2302.12246
  2. Auto-CoT - https://arxiv.org/abs/2210.03493
  3. Automatic Prompt Engineering (APE) - https://arxiv.org/abs/2211.01910
  4. Chain of Density (CoD) - https://arxiv.org/abs/2309.04269
  5. Chain of Verification (CoVe) - https://arxiv.org/abs/2309.11495
  6. Chain-of-Thought (CoT) - https://arxiv.org/abs/2201.11903
  7. Complexity-based prompting - https://arxiv.org/abs/2210.00720
  8. DR-CoT (Dynamic Recursive CoT) - https://www.nature.com/articles/s41598-025-18622-6
  9. Example ordering - https://arxiv.org/abs/2102.09690
  10. Expert persona - https://arxiv.org/abs/2311.10054
  11. Graph of Thoughts (GoT) - https://arxiv.org/abs/2308.09687
  12. Least-to-Most prompting - https://arxiv.org/abs/2205.10625
  13. Maieutic prompting - https://arxiv.org/abs/2205.11822
  14. Multi-persona prompting - https://arxiv.org/abs/2307.05300
  15. Plan-and-Solve (PS+) - https://arxiv.org/abs/2305.04091
  16. Prompt repetition - https://arxiv.org/abs/2512.14982
  17. ReAct (Reasoning + Acting) - https://arxiv.org/abs/2210.03629
  18. Reflexion - https://arxiv.org/abs/2303.11366
  19. Rephrase and Respond (RaR) - https://arxiv.org/abs/2311.04205
  20. ReWOO (Reasoning Without Observation) - https://arxiv.org/abs/2305.18323
  21. Self-Ask - https://arxiv.org/abs/2210.03350
  22. Self-consistency - https://arxiv.org/abs/2203.11171
  23. Self-Refine - https://arxiv.org/abs/2303.17651
  24. Similarity-based selection - https://arxiv.org/abs/2101.06804
  25. Simulated Theory of Mind (SimToM) - https://arxiv.org/abs/2311.10227
  26. Skeleton-of-Thought (SoT) - https://arxiv.org/abs/2307.15337
  27. Step-back prompting - https://arxiv.org/abs/2310.06117
  28. Thread of Thought (ThoT) - https://arxiv.org/abs/2311.08734
  29. Tree of Thoughts (ToT) - https://arxiv.org/abs/2305.10601
  30. Universal Self-Consistency - https://arxiv.org/abs/2311.17311

Articles, guides, & named-author resources

  1. A/B testing prompts - https://dev.to/benchwright/how-to-ab-test-llm-prompts-without-breaking-production-4823
  2. Analogical prompting - https://learnprompting.org/docs/advanced/thought_generation/analogical_prompting
  3. Code generation - https://www.promptingguide.ai/applications/coding
  4. Contextual prompting - https://handsonai.info/agentic-building-blocks/prompts/prompt-engineering/contextual-prompting/
  5. Contrastive examples - https://learnprompting.org/docs/advanced/thought_generation/contrastive_cot
  6. Creative writing - https://www.promptingguide.ai/techniques
  7. Cumulative reasoning - https://learnprompting.org/docs/advanced/self_criticism/cumulative_reasoning
  8. Data extraction - https://www.promptingguide.ai/applications/function_calling
  9. Delimiters and sections - https://www.promptingguide.ai/introduction/settings
  10. Directional stimulus prompting - https://www.promptingguide.ai/techniques/dsp
  11. Emotional prompting - https://learnprompting.org/docs/advanced/zero_shot/emotion_prompting
  12. Excessive scaffolding for capable models - https://www.mindstudio.ai/blog/how-to-prompt-gpt-5-5-outcome-first-prompting
  13. Few-shot prompting - https://www.promptingguide.ai/techniques/fewshot
  14. Generated knowledge prompting - https://www.promptingguide.ai/techniques/knowledge
  15. Jailbreak resistance - https://www.promptfoo.dev/blog/jailbreaking-vs-prompt-injection/
  16. Meta-prompting - https://www.prompthub.us/blog/a-complete-guide-to-meta-prompting
  17. Negative prompting (constraints) - https://www.the-main-thread.com/p/prompting-like-a-parent
  18. OCR and document understanding - https://www.datacamp.com/blog/top-vision-language-models
  19. One-shot prompting - https://learnprompting.org/docs/basics/few_shot
  20. Program-Aided Language (PAL) - https://www.promptingguide.ai/techniques/pal
  21. Prompt chaining - https://www.promptingguide.ai/techniques/prompt_chaining
  22. Prompt versioning - https://www.braintrust.dev/articles/what-is-prompt-versioning
  23. Question answering - https://www.promptingguide.ai/techniques/rag
  24. Red-teaming prompts - https://www.promptfoo.dev/docs/red-team/
  25. Retrieval-Augmented Generation (RAG) - https://aws.amazon.com/what-is/retrieval-augmented-generation/
  26. Role prompting - https://learnprompting.org/docs/basics/roles
  27. Stratified sampling - https://www.prompthub.us/blog/the-few-shot-prompting-guide
  28. Summarization - https://www.prompthub.us/blog/better-summarization-with-chain-of-density-prompting
  29. Temperature - https://learnprompting.org/docs/intermediate/configuration_hyperparameters
  30. Zero-shot CoT - https://www.promptingguide.ai/techniques/cot
  31. Zero-shot prompting - https://www.promptingguide.ai/techniques/zeroshot

Updated: 2026-05-31; Version 2