- Few-Shot & Zero-Shot Prompting: Guide the model with examples (few-shot) or rely on instructions alone (zero-shot) for in-context learning.
- Chain-of-Thought Prompting: Encourage step-by-step reasoning by adding phrases like “Let’s think step by step.”
- Instructional Prompting: Use clear, explicit instructions with imperative verbs to improve response relevance and structure.
- Role-Based Prompting: Assign personas (e.g., “You are a tutor”) to influence tone, depth, and framing of responses.
- Self-Consistency Prompting: Generate multiple answers and choose the most consistent to increase correctness and reliability.
Prompt engineering is the process of crafting and optimizing input prompts to guide a large language model’s output. In practice, a well-designed prompt can significantly influence the quality, accuracy, and style of the model’s response. This is especially important with modern models like ChatGPT (GPT-3.5, GPT-4, etc.), which can perform a wide range of tasks without additional training simply by interpreting the prompt. Early breakthroughs (such as GPT-3) showed that scaling up language models enabled strong performance on many tasks just from prompting alone, without gradient updates or fine-tuning. In essence, prompt engineering has emerged as a key technique to unlock an LLM’s capabilities by telling it what we need in the right way.
- 1 Zero-Shot, One-Shot, and Few-Shot Prompting
- 2 Chain-of-Thought Prompting (Step-by-Step Reasoning)
- 3 Instructional Prompting (Explicit Instructions)
- 4 Role-Based Prompting (Assigning a Persona or Role)
- 5 Reframing or Rewriting Questions (Prompt Rewriting)
- 6 Delimiting and Structuring the Output
- 7 Self-Consistency Prompting
- 8 Retrieval-Augmented Prompting
- 9 Socratic Questioning (Ask and Answer via Sub-Questions)
- 10 Incremental Prompting (Iterative Approach)
- 11 Best Practices
Zero-Shot, One-Shot, and Few-Shot Prompting
One fundamental set of patterns revolves around how many examples (if any) are given in the prompt. In zero-shot prompting, the model is not provided any example; the prompt simply instructs the model on the task or asks a question. The model must respond based on its prior knowledge and understanding of the instruction. Modern instruction-tuned LLMs (like ChatGPT) are quite capable in zero-shot mode for many tasks due to their extensive training.
If zero-shot results are unsatisfactory, one-shot or few-shot prompting can be used. Here, the prompt includes one or more demonstration examples of the task. This technique provides in-context learning, steering the model with examples so it can infer the desired output pattern. The included examples act as conditioning data for the model’s completion.
For more difficult tasks, more examples (3-shot, 5-shot, etc.) can be provided to further improve performance. Research also shows that the choice and format of exemplars matters: keeping a consistent input-output format, and even including examples with placeholder labels, often works better than providing no examples at all.
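As a concrete illustration, here is a minimal sketch of how a few-shot prompt might be assembled in Python. The task, example reviews, labels, and the “Review: … / Sentiment: …” layout are illustrative assumptions, not a prescribed format.

```python
# Minimal sketch: building a few-shot prompt from labeled examples.
# The examples, labels, and formatting below are illustrative assumptions.

examples = [
    ("The battery lasts all day and the screen is gorgeous.", "Positive"),
    ("It stopped working after a week and support never replied.", "Negative"),
    ("Does exactly what the box says, nothing more.", "Neutral"),
]

task = "Classify the sentiment of the review as Positive, Negative, or Neutral."
new_review = "Setup was painless, but the fan noise is hard to ignore."

# Each demonstration follows the same "Review: ... / Sentiment: ..." pattern,
# so the model can infer the expected output format for the final query.
demos = "\n\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
prompt = f"{task}\n\n{demos}\n\nReview: {new_review}\nSentiment:"

print(prompt)  # send this string to the model as a single user message
```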
Chain-of-Thought Prompting (Step-by-Step Reasoning)
Complex problems often benefit from prompts that encourage the model to reason through the problem step by step before giving a final answer. Chain-of-thought (CoT) prompting explicitly asks the model to produce a series of intermediate reasoning steps – a “chain” of reasoning – rather than jumping straight to the conclusion. By verbalizing its reasoning, the model can tackle arithmetic, commonsense, or logical problems that require multiple steps. Academic work by Wei et al. (2022) demonstrated that providing a few chain-of-thought examples (i.e., few-shot CoT) dramatically improved performance on math word problems and other reasoning benchmarks. Such reasoning abilities tend to emerge in sufficiently large models – for example, a 540-billion-parameter model prompted with a handful of chain-of-thought demonstrations surpassed fine-tuned baselines on a math word-problem benchmark.
There are two main approaches to Chain-of-Thought (CoT) prompting in large language models: Few-Shot CoT and Zero-Shot CoT. Each method guides AI models to reason step by step, significantly enhancing performance on complex tasks.
Few-Shot CoT: This approach involves providing one or more examples where a question is followed by a detailed, step-by-step explanation and answer. For instance, a prompt might say: “Q: [Question]? A: Let’s think step by step.” followed by the reasoning process. The model then mimics this structure to solve new problems with similar logic.
Zero-Shot CoT: Even without examples, simply adding a phrase like “Let’s think step by step” to the prompt can activate the model’s reasoning abilities. Kojima et al. (2022) found that this minimal prompt technique often leads to coherent multi-step reasoning and improved accuracy in various reasoning benchmarks.
Chain-of-thought (CoT) prompting is now a go-to strategy for tasks that involve reasoning or multi-step calculations. This prompting technique not only enhances the accuracy of large language models but also improves transparency by revealing the reasoning path taken to reach an answer.
CoT prompting is particularly effective with larger models, which are more capable of following step-by-step instructions and generating coherent reasoning chains. In contrast, smaller language models may struggle to adhere to the prompt or produce inaccurate reasoning sequences.
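To make zero-shot CoT concrete, here is a minimal sketch assuming the OpenAI Python SDK (v1-style client), an API key in the environment, and a placeholder model name; the question is just a classic word-problem example.

```python
# Minimal zero-shot chain-of-thought sketch.
# Assumes the OpenAI Python SDK (v1-style client) with OPENAI_API_KEY set in
# the environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

question = (
    "A cafeteria had 23 apples. It used 20 to make lunch and bought 6 more. "
    "How many apples does it have now?"
)

# Appending the trigger phrase invokes step-by-step reasoning without exemplars.
prompt = f"Q: {question}\nA: Let's think step by step."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whichever chat model you have access to
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```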
Instructional Prompting (Explicit Instructions)
Instructional prompting refers to the practice of writing prompts as clear instructions or commands for the model. This technique takes advantage of the fact that modern large language models (LLMs), such as ChatGPT, are trained to follow human instructions. These models have been fine-tuned on datasets containing instructions and responses, and further optimized with human feedback, to better understand and execute user commands.
Phrasing your query as an explicit instruction often results in more accurate, relevant, and focused outputs from the model. Rather than posing vague or open-ended questions, instructional prompting makes your intent and expectations explicit in how you communicate with the AI.
Key ideas for effective instructional prompts:
Be Clear and Specific: State exactly what you want the model to do. For example, instead of asking “What about climate change and policies?”, instruct the model: “Explain in one paragraph how climate change policies can reduce carbon emissions, in simple terms.” Precision in the prompt reduces ambiguity and improves response quality.
Use Imperative Verbs: Start the prompt with verbs like “List,” “Summarize,” “Translate,” “Write,” or “Explain.” For instance: “Summarize the following text…” or “Translate the text between <French> tags into English.” This aligns with the model’s training to follow direct instructions.
Provide Format or Constraints: If you need a specific format, tell the model explicitly. Examples include: “Provide the answer as bullet points.” or “Give the response in JSON format.” Specifying the style or format helps the model meet your expectations more effectively.
Because models like ChatGPT are aligned to human instructions through methods like instruction tuning and reinforcement learning from human feedback (RLHF), they tend to respond best to well-phrased commands. If a response doesn’t meet your needs, review your prompt and ensure it communicates exactly what is expected.
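The small sketch below contrasts a vague query with an explicit, imperative rewrite that also pins down scope, format, and register. The topic and the specific constraints are illustrative choices, not a required template.

```python
# Illustrative contrast between a vague query and an explicit instruction.
# The topic and constraints are arbitrary examples, not a required template.

vague_prompt = "What about climate change and policies?"

explicit_prompt = (
    "Explain how climate change policies can reduce carbon emissions. "
    "Write exactly one paragraph, use simple terms suitable for a general audience, "
    "and end with one concrete example of such a policy."
)

# The explicit version states the action (Explain), the scope, the format
# (one paragraph), the register (simple terms), and a required element (an example).
for name, prompt in [("vague", vague_prompt), ("explicit", explicit_prompt)]:
    print(f"{name}: {prompt}\n")
```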
Role-Based Prompting (Assigning a Persona or Role)
Another powerful prompting pattern is role-based prompting, where you ask the model to adopt a specific role, perspective, or persona. By starting the prompt with a role assignment—such as “You are a data analyst…” or “Act as a friendly tutor…”—you can guide the tone, vocabulary, and level of detail in the model’s response. This technique is ideal for tailoring output to specific audiences or use cases.
Role-based prompts leverage the model’s ability to condition on context. When you state “You are X,” the model will typically generate responses that align with that role, making the interaction more coherent and purpose-driven.
Common use cases for role-based prompting include:
Professional Roles: “You are a financial advisor. Explain the concept of compound interest to a client in simple terms.” This prompt encourages a professional and instructive tone, suitable for helping laypeople understand financial topics.
Stylistic Personas: “You are a storyteller speaking to children. Tell a short story about a brave rabbit.” The model will likely respond with playful, imaginative language appropriate for a young audience.
Domain Experts: “Act as a medical expert. Provide an overview of diabetes management.” The model’s reply may include technical details and a clinical tone, mimicking expert-level communication.
Historical or Fictional Characters: For creative writing or entertainment, prompts like “You are Albert Einstein, and someone asks you to explain relativity” can yield responses styled after how that figure might speak or explain concepts.
In practice, role-based prompting helps align model responses with your intended purpose. Assigning a persona often results in more relevant and context-aware output, especially in conversational or creative scenarios. While extremely complex or fantastical roles might confuse the model, most reasonable personas are followed effectively. You can also combine role prompts with other techniques. For example: “You are a critical reviewer. Analyze the following text step by step…” blends a role with instruction and chain-of-thought prompting for deeper results.
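In chat-based APIs, the easiest place to set a persona is usually the system message, as in the hedged sketch below (OpenAI Python SDK, v1-style client, assumed; the model name and persona wording are illustrative).

```python
# Role-based prompting via a system message.
# Assumes the OpenAI Python SDK (v1-style client) with OPENAI_API_KEY set;
# the model name and persona wording are illustrative.
from openai import OpenAI

client = OpenAI()

messages = [
    {
        "role": "system",
        "content": (
            "You are a friendly financial advisor who explains concepts to clients "
            "with no finance background, using short sentences and everyday analogies."
        ),
    },
    {"role": "user", "content": "Explain compound interest and why starting early matters."},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```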
Reframing or Rewriting Questions (Prompt Rewriting)
Sometimes, the initial way a question is asked isn’t the most effective for a language model to interpret or solve. Reframing the question—or asking the model to restate it in its own words—can clarify the task, reduce ambiguity, and improve the quality of the response. This prompt engineering technique adjusts the prompt to highlight important details or eliminate confusion.
Why reframing matters: Large language models (LLMs) are highly sensitive to the wording of prompts. Even slight changes in phrasing can dramatically alter how the model interprets and answers a query. Research shows that asking the model to “rephrase and then answer” significantly improves performance, especially on ambiguous or complex questions. This approach allows the model to demonstrate understanding before generating a final answer.
Techniques for effective prompt reframing:
User-Driven Rewriting: If the original question is unclear or yields a weak answer, the user can revise it to include more context or specificity. For example, instead of asking “Is the concert good for kids?”, reframe it as: “Would a child enjoy the content and atmosphere of this concert?” This provides clearer guidance and helps the model assess both content and setting.
Ask the Model to Rephrase: Another effective method is instructing the model to rephrase the question before answering it. For example: “Rephrase the question in simpler terms and then answer it: ‘Was Beethoven born in an even month?’” This technique, known as Rephrase-and-Respond (RaR), allows the model to clarify its understanding—for instance, interpreting “even month” as “a month with an even number like 2, 4, or 12”—before providing an accurate answer.
Provide Additional Context: Sometimes a reframe simply requires clarifying a term within the question. For instance, if the prompt is “What’s the best way to bank?”, adding a note like “(By ‘bank’, I mean doing a turn while skiing.)” can dramatically improve interpretation. Contextualizing vague or multi-meaning terms helps guide the model more precisely.
Prompt reframing is a simple yet powerful tool in prompt engineering. Whether by rewriting the question yourself, asking the model to clarify first, or adding specific context, this method can lead to more accurate, helpful, and aligned outputs. Even highly capable models benefit from well-rephrased queries—so when in doubt, ask the question a different way.
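One way to script the Rephrase-and-Respond idea is a two-step exchange: first ask for a clarified restatement, then answer the restated question. The sketch below assumes the OpenAI Python SDK (v1-style client) and a placeholder model name; splitting the work into two calls is an implementation choice, since a single “rephrase, then answer” prompt also works.

```python
# Two-step Rephrase-and-Respond sketch (a single combined prompt also works).
# Assumes the OpenAI Python SDK (v1-style client) with OPENAI_API_KEY set;
# the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's text reply."""
    response = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

question = "Was Beethoven born in an even month?"

# Step 1: have the model restate the question, resolving ambiguous terms.
rephrased = ask(
    "Rephrase the following question in clearer, more explicit terms. "
    f"Do not answer it yet.\n\nQuestion: {question}"
)

# Step 2: answer the clarified version.
answer = ask(f"{rephrased}\n\nNow answer the question, briefly explaining your reasoning.")
print(answer)
```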
Delimiting and Structuring the Output
Large language models can sometimes confuse instructions with the content if everything is provided in one continuous block of text. Delimiting and structuring your prompt helps clearly separate instructions from the content, improving both the model’s understanding and output quality. This is especially important when working with examples, passages, or any text the model should act upon without misinterpreting it as part of the instruction.
Using Delimiters: One effective technique is to wrap content in special markers such as quotation marks, triple backticks, or XML-style tags. For instance, you might say: “Translate the text delimited by triple quotes into French.” followed by the text between triple quotes. This signals to the model that the enclosed section is the source content, not an instruction.
Common delimiters include:
- Quotation marks: "...text..."
- Code fences: ```...text...```
- XML-style tags: <text>...</text>
- Custom markers: <start>...<end>
Structured Prompts: In addition to delimiters, organizing prompts with bullet points, numbered steps, or labeled sections can improve the clarity and structure of the model’s response. For example, using “Step 1,” “Step 2,” etc., encourages the model to generate outputs in a structured, easy-to-follow format. This is especially helpful for instructional or multi-step tasks, and it can also nudge the model toward chain-of-thought reasoning.
Example usage:
- Summarizing a specific section: “Here is an article. Summarize the section between <start> and <end> tags. <start> ... (text) ... <end>” – This keeps the model focused on the relevant portion.
- Analyzing a quote: “Read the following quote and analyze it: "To be or not to be, that is the question."” – The use of quotation marks clearly distinguishes the quoted material from the instruction.
Delimiting is also a useful safety measure. It helps prevent prompt injection by clearly separating instructions from user-provided text or code. Overall, marking off distinct sections of your prompt—such as examples, inputs, or formatting constraints—ensures that large language models handle the task accurately, maintain boundaries, and produce coherent results.
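A small sketch of a delimited prompt follows; the choice of triple quotes and the exact instruction wording are just one reasonable convention.

```python
# Delimited prompt sketch: the instruction names the delimiter explicitly,
# so the model treats the enclosed passage as data rather than as instructions.
article_excerpt = (
    "Solar adoption in the city rose 40% last year, driven mainly by new "
    "rooftop incentives and falling panel prices."
)

prompt = (
    "Summarize the text delimited by triple quotes in one sentence, "
    "then list its two main claims as bullet points.\n\n"
    f'"""{article_excerpt}"""'
)

print(prompt)  # send as a single user message
```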
Self-Consistency Prompting
Self-consistency prompting is an advanced strategy designed to improve the reliability of answers from large language models by exploring multiple reasoning paths. In traditional chain-of-thought prompting, a model follows a single reasoning trajectory—which may be flawed. Self-consistency enhances this by generating several independent reasoning chains and selecting the answer that appears most frequently or consistently among them.
This concept, introduced by Wang et al. (2023), is based on the idea that well-posed problems typically have a correct answer that multiple reasoning paths will converge upon. Even if each path varies in wording or logic, the right answer often emerges through consensus.
How self-consistency prompting works in practice:
- Instead of relying on one model output, the user prompts the model multiple times—either through repeated calls or a crafted prompt asking for alternative solutions (e.g., “Solve this again, using a different approach”).
- The resulting answers are compared. The most frequent or agreed-upon answer is selected as the final response.
- This method simulates a “majority vote” among several expert perspectives, all generated by the same model.
Why self-consistency works: Large language models often exhibit randomness or variability in reasoning, especially with higher temperature settings. Each run may explore a different way of solving the problem. By aggregating these diverse chains of thought, users can avoid one-off errors and reduce the risk of early missteps in reasoning. In effect, it’s like assembling a panel of AI experts and using their consensus to determine the correct answer.
According to research, self-consistency combined with chain-of-thought prompting significantly boosts performance on tasks requiring complex reasoning. For example, Wang et al. reported an absolute accuracy gain of +17.9% on the GSM8K math word-problem benchmark compared with standard chain-of-thought prompting.
Takeaway: For high-stakes or ambiguous questions, run the prompt multiple times, or explicitly ask the model to reflect and answer again using different reasoning. If the same result appears across runs, it’s a strong signal of correctness. Self-consistency is a simple yet powerful way to improve output quality when accuracy matters most.
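A minimal self-consistency loop can be scripted as below: sample several reasoning chains at a nonzero temperature, extract each final answer, and take a majority vote. It assumes the OpenAI Python SDK (v1-style client) with a placeholder model name; the “Answer: <number>” convention is a simplifying assumption, and production code would parse answers more robustly.

```python
# Self-consistency sketch: sample several chains of thought, then majority-vote.
# Assumes the OpenAI Python SDK (v1-style client) with OPENAI_API_KEY set;
# the model name and the "Answer: <number>" convention are illustrative choices.
from collections import Counter
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder

question = (
    "A store sold 45 shirts on Monday and twice as many on Tuesday. "
    "How many shirts did it sell in total over the two days?"
)
prompt = (
    f"Q: {question}\nA: Let's think step by step, "
    "and finish with a final line of the form 'Answer: <number>'."
)

# Sampling with temperature > 0 yields diverse reasoning paths in one call (n=5).
response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": prompt}],
    temperature=0.8,
    n=5,
)

def extract_answer(text: str) -> str:
    """Pull out the value after the last 'Answer:' marker (deliberately naive)."""
    return text.rsplit("Answer:", 1)[-1].strip() if "Answer:" in text else text.strip()

answers = [extract_answer(choice.message.content) for choice in response.choices]
final_answer, votes = Counter(answers).most_common(1)[0]
print(f"Sampled answers: {answers}")
print(f"Majority answer ({votes}/{len(answers)} votes): {final_answer}")
```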
Retrieval-Augmented Prompting
Large language models have vast knowledge baked into their parameters, but they can still struggle with specific, up-to-date, or obscure information. Retrieval-augmented prompting is a pattern where you supply the model with relevant external information retrieved from outside sources (documents, databases, the web) as part of the prompt. This hybrid approach combines an LLM’s language abilities with an information retrieval component, guiding the model with factual data to ground its responses.
One popular framework for this is Retrieval-Augmented Generation (RAG). In RAG, when given a query, the system first uses a retriever (e.g. a search engine or vector database lookup) to fetch documents relevant to the query. Those documents (or summaries of them) are then appended to the prompt as context, and the model generates the answer using both its internal knowledge and the provided text. By having access to an external knowledge source at prompt time, the model’s outputs can be more up-to-date and factually accurate.
Benefits of retrieval augmentation:
- Factual Consistency: The model can quote or summarize the retrieved documents, which reduces the chance of hallucination (making up facts). It’s effectively using a small reference library for each query.
- Dynamic Knowledge: Unlike the static training data of the model (which might be months or years out of date), retrieval allows access to the latest information or niche knowledge that the model may not have seen. This makes it adaptive to evolving facts.
- Explainability: The retrieved text can be shown as part of the answer (with citations or quotes), making it clear why the model gave a certain answer (because it had document X saying so).
Example application: Suppose you’re using ChatGPT for a question-answering system about recent events. You might have a pipeline where for any user question, you first do a web search or database query, and then feed the top results into the ChatGPT prompt like: “Reference text: [insert an excerpt of relevant info]\n\nQuestion: [user question]\nAnswer:”. ChatGPT will then incorporate the reference text in its answer. If the user asks, “Who won the Best Actress Oscar in 2024?”, your system could retrieve a news article about the 2024 Oscars and include a snippet in the prompt. ChatGPT, seeing that snippet, will produce an answer based on it (greatly increasing the chances it names the correct person, with supporting details, rather than guessing).
From an academic perspective, Lewis et al. (2020) introduced RAG to handle knowledge-intensive tasks by combining a neural retriever with a generative model. They found it achieved strong results on open-domain QA benchmarks, producing answers that were more factual and specific. More recent work and applications have adopted this approach widely, even pairing ChatGPT with real-time search or databases to create robust QA agents. For prompt engineers, the takeaway is: if the task requires factual knowledge, especially time-sensitive or niche info, consider retrieval augmentation. In practice, this means your prompt should include the relevant snippets of information. Always cite or indicate the source in the prompt if possible – not only does it help the model, but it also lets the model include references in its answer (if instructed to do so).
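The sketch below shows the prompt-assembly half of such a pipeline. The `retrieve` function is a hypothetical stand-in (with hard-coded snippets) for whatever search engine or vector-store lookup you actually use, and the OpenAI Python SDK (v1-style client) is assumed for the generation step.

```python
# Retrieval-augmented prompt sketch. `retrieve` is a hypothetical placeholder
# for a real search or vector-store lookup; the OpenAI chat call assumes the
# v1-style SDK with OPENAI_API_KEY in the environment and a placeholder model.
from openai import OpenAI

client = OpenAI()

def retrieve(query: str, k: int = 3) -> list[str]:
    """Hypothetical retriever; replace with a real search or vector-store lookup.
    Hard-coded snippets stand in for retrieved documents here."""
    corpus = [
        "City council approved a rooftop-solar rebate program in March 2024.",
        "Panel prices in the region fell roughly 15% year over year.",
        "A 2023 survey found permitting delays were the top barrier to installation.",
    ]
    return corpus[:k]

def answer_with_context(question: str) -> str:
    snippets = retrieve(question)
    context = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))

    # The reference text is clearly delimited and the instruction asks for grounded
    # answers, which reduces hallucination and enables citations like [1], [2].
    prompt = (
        "Answer the question using only the reference text below. "
        "Cite the snippet numbers you rely on, and say 'I don't know' "
        "if the answer is not in the references.\n\n"
        f"Reference text:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer_with_context("What drove the rise in rooftop solar installations?"))
```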
Socratic Questioning (Ask and Answer via Sub-Questions)
Socratic questioning as a prompt pattern involves the model engaging in a dialogue of asking and answering questions to work through a problem. Inspired by the Socratic method, this technique drives the model to break a complex query into simpler sub-questions, answer those, and use them to construct the final solution. It’s akin to the model having an internal discussion or the user prompting the model step-by-step with pointed questions.
There are a couple of ways this manifests:
- Model self-questions (internal chain): You prompt the model to explicitly generate questions it needs to answer in order to solve the original question. For example: “To answer the question, first list any clarifying questions you’d ask yourself, answer them, then give the final answer.” The model might output: “Q1: What are the key facts of the problem? A1: …; Q2: What is being asked? A2: …; Therefore, [final answer].” This approach forces the model to articulate uncertainties and resolve them, mimicking a stepwise reasoning that can catch mistakes or missing info.
- User-led Socratic prompting: The user (or an automated system) asks the model a sequence of questions: first a broad question, then follow-ups drilling into specifics, guiding the model to the answer. Essentially, you conduct an interview with the model. Each prompt in the sequence is a question focusing on an aspect of the problem, and the model’s previous answer informs the next question.
Socratic prompting has been studied for its potential to yield more accurate and interpretable results. Qi et al. (2023) propose “Socratic questioning” as a divide-and-conquer algorithm that recursively breaks down a problem into sub-problems, which are answered and aggregated. They found this method more robust than standard chain-of-thought in cases where a single linear chain might go astray. By explicitly navigating the problem space through questions, the model can correct itself along the way. In their experiments, this technique improved performance on complex reasoning tasks and even outperformed other advanced prompting methods like Tree-of-Thoughts.
Practical example: Suppose the question is, “How can we increase renewable energy adoption in urban areas?” A Socratic approach might have the model first ask itself (or the user might ask): “What are the main barriers to renewable energy adoption in urban areas?” After the model lists barriers (e.g. cost, infrastructure, policy, awareness), the next question: “For each barrier, what are potential solutions or incentives to overcome it?” The model then enumerates solutions. Finally: “Given those solutions, summarize a coherent strategy to increase adoption.” The end result is an answer that is structured as a synthesis of answers to the sub-questions. This method is powerful for analytical and decision-making queries, as it ensures multiple facets of a problem are considered.
Using Socratic questioning in prompts can be as simple as saying: “Break down the problem into a series of questions and answers, and then give the final answer.” or manually guiding the model with iterative questions. It encourages a thorough exploration of the topic. One caution: the process can be more time-consuming (and costlier in terms of tokens) since it generates more text. But for important problems, the clarity and improved accuracy may be worth it.
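A compact way to apply the self-questioning variant is to fold the sub-question scaffold into a single instruction, as in this sketch (prompt text only; the Q1/A1 labeling and the example question are illustrative choices).

```python
# Socratic-style prompt sketch: ask the model to decompose the problem into
# sub-questions, answer each, and only then synthesize a final answer.
# The scaffold wording is an illustrative choice, not a fixed recipe.
question = "How can we increase renewable energy adoption in urban areas?"

prompt = (
    f"Question: {question}\n\n"
    "Before answering, list the key sub-questions you would need to resolve "
    "(for example, barriers, possible solutions, and trade-offs). "
    "Label them Q1, Q2, ..., answer each one briefly (A1, A2, ...), "
    "and then give a final, synthesized answer under the heading 'Final answer:'."
)

print(prompt)  # send as a single user message
```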
Incremental Prompting (Iterative Approach)
Incremental prompting is the practice of breaking a complex task or query into a sequence of smaller prompts, guiding the model step-by-step rather than asking for everything at once. This approach recognizes that for tasks requiring multiple steps (writing a long essay, performing a multi-part calculation, conducting a dialogue, etc.), it can be effective to prompt incrementally, checking and refining the output at each stage.
Key aspects of incremental prompting:
- Divide and Conquer: Split the task. For instance, instead of asking “Write a detailed report on topic X including A, B, and C,” you might first prompt for an outline: “Give me an outline for a report on X covering A, B, C.” Once the outline is obtained, you then prompt the model to fill in section by section: “Now draft the introduction based on that outline,” then “Explain A in detail,” and so on. By the end, you’ve built the report piece by piece. This ensures each part is focused, and you can correct course if one part is off-track before moving to the next.
- Iterative Refinement: Start with an initial attempt, then refine it. You can have the model produce an answer, then critique or analyze its own answer, and then prompt it again to improve the answer. For example, “Generate a short story about a robot. Now critique your story in terms of character development. Now rewrite the story incorporating your improvements.” Each prompt builds on the last output, incrementally improving quality. OpenAI’s guidance specifically encourages iterative prompt refinement: begin with a draft prompt, observe the result, and then adjust the prompt in subsequent turns to get closer to the desired output.
- Memory Utilization: In a chat setting, you have the conversation history. Use it to incrementally add information. For instance, first prompt: “Explain the theory of relativity in one paragraph.” Model gives an answer. Next prompt: “Now expand that into three paragraphs with an example.” The model remembers its previous answer and elaborates. This incremental build-up can yield a comprehensive final result that might be hard to get in one giant prompt.
Example (Prompt Chaining): An incremental prompt chain for a coding task might be:
- Prompt: “Write a function in Python to check if a number is prime.”
- Model output: (function code)
- Prompt: “Great. Now write tests for that function for edge cases.”
- Model output: (test code)
- Prompt: “Combine the function and tests into a single code block, and include comments explaining each part.”
- Model output: (final combined code with comments)
Here, each step guided the model to produce a component of the final solution. Trying to get the model to do everything in one prompt (write function + tests + comments) might have been more error-prone or resulted in a disorganized answer. The incremental approach allowed checking the function first, then proceeding.
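The same chain can be scripted so that each model reply is carried into the next prompt via the running message history. The sketch below assumes the OpenAI Python SDK (v1-style client) and a placeholder model name; in an interactive chat UI, the conversation itself plays the role of the `history` list.

```python
# Prompt-chaining sketch: each step's output is carried into the next prompt
# through the running message history. Assumes the OpenAI Python SDK (v1-style
# client) with OPENAI_API_KEY set; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder

steps = [
    "Write a function in Python to check if a number is prime.",
    "Great. Now write tests for that function covering edge cases.",
    "Combine the function and tests into a single code block with explanatory comments.",
]

history: list[dict] = []
for step in steps:
    history.append({"role": "user", "content": step})
    response = client.chat.completions.create(model=MODEL, messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    # Inspect each intermediate result before moving on; correct course here if needed.
    print(f"--- Step ---\n{step}\n--- Reply ---\n{reply}\n")
```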
Academic literature also recognizes this pattern. One paper describes it as guiding the model through the process incrementally, breaking down the task into a series of prompts that build on each other. This not only helps the model maintain coherence over long outputs but also allows the prompt engineer to intervene and correct course at intermediate steps. Essentially, think of a dialogue with the model to solve a problem step by step, rather than a one-shot query for a complex output.
Best Practices
Prompt engineering for ChatGPT and similar LLMs is both an art and a science. We’ve discussed a range of patterns – from zero-shot simplicity to sophisticated multi-step reasoning techniques – each with its own use cases. To wrap up, here are some best practices and takeaways for effective prompt engineering:
- Start Simple, Then Iterate: Begin with a clear zero-shot or basic instructional prompt. If the answer isn’t adequate, incrementally refine your prompt or switch to few-shot by adding examples. Prompt design is often an iterative process – don’t expect perfect results on the first try for complex tasks.
- Be Explicit and Clear: The model cannot read your mind. State exactly what you want, whether it’s the format (bullet points, JSON, step-by-step), the role (“as an expert,” “in a friendly tone”), or the focus of the answer. Ambiguity is the enemy of accurate responses.
- Leverage In-Context Learning: Provide examples (one-shot/few-shot) when appropriate, especially if the task has a specific format or nuanced output. Demonstrations can hugely improve performance. Make sure your examples are representative of what you want.
- Encourage Reasoning for Complex Tasks: If a question requires reasoning, don’t hesitate to use chain-of-thought or Socratic techniques. Phrases like “think step by step” or instructing the model to break the problem into sub-questions can lead to more accurate and justified answers.
- Keep Context and Data at Hand: For factual queries, consider supplying relevant context via retrieval (if available). A prompt with the necessary reference text will outperform one without on specific knowledge tasks. Also, use delimiters to clearly separate this reference material from your instructions.
- Monitor for Consistency: If the correctness of the answer is critical, you might use self-consistency – ask the model multiple times (or ask it to reflect on its answer) to see if it converges on the same result. Consistent answers across independent tries are more likely to be correct.
- Stay Within Model Limits: Even the best prompt won’t fix tasks that exceed the model’s capabilities or context window. Use tools (via retrieval or external calculators) when needed, and be mindful of token limits (don’t stuff too many examples or overly long texts unnecessarily).
By applying these patterns and tips, users can dramatically improve their interactions with ChatGPT and other LLMs. Prompt engineering empowers you to steer the model effectively – it’s like writing a brief program in natural language that the model will execute. As research in this field grows and models evolve, new patterns and refinements will emerge, but the core idea remains: the way you ask matters. Craft your prompts with intention, experiment with the techniques above, and you’ll unlock more powerful and accurate outputs from AI language models.