The Correct Way to Use Chain-of-Thought Prompting: Avoiding Common Pitfalls
I recently attended an AI in Finance conference and was surprised to discover that many researchers are using chain-of-thought (CoT) prompting incorrectly. This powerful technique can significantly improve reasoning in LLMs — but only when implemented properly.
Key Takeaways
- Zero-shot CoT requires two separate prompting rounds, not one.
- Combining reasoning and the final answer in a single prompt introduces answer bleeding.
- CoT is for structured reasoning; explainable prompting is for human-readable justification — know when to use which.
What is Zero-Shot Chain-of-Thought?
Zero-shot CoT involves two distinct rounds of prompting without using any task-specific examples:
- Round 1 — Prompt the model to generate step-by-step reasoning.
- Round 2 — Explicitly ask for the final answer based on that reasoning.
This differs from few-shot CoT, which includes labeled examples.
Why two rounds? Separating reasoning from the final answer prevents the model from "anchoring" on a premature conclusion and then rationalizing backwards.
Example Question
Consider the question: “A company just announced a 20% dividend increase while simultaneously reporting declining revenues. Is this news good or bad?”
Incorrect: Single-Stage Approach
Common mistake: Many researchers combine reasoning and answer extraction into a single prompt. This lets the model peek at its own conclusion while it is still "reasoning."
WRONG

```python
# Single prompt — reasoning and answer are entangled
response = llm.generate(
    prompt="Let's think step by step: A company just announced..."
)
# Output includes both reasoning AND final answer in one response
```
Correct: Two-Stage Approach
CORRECT

```python
# STEP 1: Trigger reasoning only
reasoning_prompt = (
    "Q: A company announced a 20% dividend increase "
    "but declining revenues... "
    "A: Let's think step by step."
)
intermediate_response = llm.generate(reasoning_prompt)

# STEP 2: Extract the final answer from that reasoning
answer_prompt = f"""
Based on this analysis: '{intermediate_response}'
Is the news good or bad? Answer ONLY 'good' or 'bad'."""
final_answer = llm.generate(answer_prompt)
```
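The two stages above can be wrapped into a single helper so the flow is reusable. The `llm` object in the snippets is hypothetical, so this sketch takes any `generate` callable and stubs one out for demonstration; the stub's replies are invented, not real model output.

```python
def two_stage_cot(question: str, generate) -> tuple[str, str]:
    """Run zero-shot CoT in two rounds and return (reasoning, answer)."""
    # Round 1: elicit step-by-step reasoning only; no answer is requested
    reasoning = generate(f"Q: {question}\nA: Let's think step by step.")
    # Round 2: extract a constrained final answer from that reasoning
    answer = generate(
        f"Based on this analysis: '{reasoning}'\n"
        "Is the news good or bad? Answer ONLY 'good' or 'bad'."
    )
    return reasoning, answer

# Hypothetical stand-in for a real model, used only to make the sketch runnable
def fake_generate(prompt: str) -> str:
    if "Answer ONLY" in prompt:
        return "bad"  # extraction round: constrained answer
    return "Step 1: A higher dividend with falling revenue raises the payout ratio..."

reasoning, answer = two_stage_cot(
    "A company announced a 20% dividend increase but declining revenues. "
    "Is this news good or bad?",
    fake_generate,
)
```

Passing `generate` as a parameter keeps the two-stage logic independent of any particular LLM client.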
Why This Matters
| Concern | Single-Stage | Two-Stage (Correct) |
|---|---|---|
| Answer bleeding | Model sees its conclusion while reasoning | Reasoning is isolated from the answer |
| Transparency | Tangled output, hard to audit | Clean separation of logic and decision |
| Hallucination risk | Higher — model may fabricate justifications | Lower — reasoning is evaluated independently |
Explainable Prompting vs. Chain-of-Thought
Although both aim to improve interpretability, they serve different purposes:
| Feature | Explainable Prompting | Chain-of-Thought |
|---|---|---|
| Goal | Human-readable justification | Structured multi-step reasoning |
| Output | Single response with embedded rationale | Two-step: reasoning → answer |
| Best for | Summaries, end-user reports | Complex logic, quantitative analysis |
| Prompt style | "Explain why…" | "Let's think step by step" |
Rule of thumb: Use explainable prompting when the audience needs to understand the conclusion. Use CoT when correctness and traceability matter more than readability.
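The difference in prompt style comes down to the template. The exact wording below is an illustrative assumption, not a fixed API, but it shows how the two techniques diverge at the prompt level:

```python
# Explainable prompting: one response, rationale embedded for a human reader
explainable_prompt = (
    "The company raised its dividend while revenues fell. "
    "State whether this is good or bad news, and explain why in plain language."
)

# Chain-of-thought: round-1 prompt only; the answer is extracted in round 2
cot_prompt = (
    "Q: The company raised its dividend while revenues fell. "
    "Is this good or bad news?\n"
    "A: Let's think step by step."
)
```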
Implementation Checklist
- Never include “so the answer is…” in the initial reasoning prompt.
- Always split into two separate prompts/responses.
- Validate intermediate reasoning before extracting the final answer.
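The last checklist item can be automated with a lightweight gate before round 2. The heuristics here (minimum length, leaked-answer phrases) are illustrative assumptions; tune them for your task:

```python
def reasoning_is_valid(
    reasoning: str,
    banned_phrases: tuple[str, ...] = ("so the answer is",),
) -> bool:
    """Reject reasoning that is empty, too short, or already states an answer."""
    text = reasoning.strip().lower()
    if len(text) < 20:  # assumed threshold: too short to be genuine step-by-step reasoning
        return False
    # Reject reasoning that leaks a conclusion, which defeats the two-stage split
    return not any(phrase in text for phrase in banned_phrases)
```

Only call the round-2 extraction prompt when this check passes; otherwise regenerate the reasoning.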
References
- Wei, J. et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS 2022.
- Kojima, T. et al. (2022). Large Language Models are Zero-Shot Reasoners. NeurIPS 2022.
