Prompt Engineering Is Not Magic. It Is Instruction Writing.
You Know How the Engine Works. Now Learn to Drive.
The last post broke down tokens, context windows, attention, and sampling. You know the internals. That knowledge is useless if you keep writing prompts like you are texting a friend.
Prompt engineering is not a creative skill. It is not vibes. It is instruction writing. The same discipline that goes into a military operations order, a network configuration template, or an API specification. Precise. Structured. Unambiguous.
The model does exactly what you tell it to do. When the output is garbage, the input was garbage. Full stop.
Everything in this post is sourced from the official documentation of the three major AI platforms: Anthropic’s Claude prompting best practices, OpenAI’s GPT-5 prompting guide, and Google’s Gemini prompt design strategies. Not blog opinions. Not Twitter threads. Primary sources from the people who built the models.
The Five Principles
Before the techniques. Before the examples. Internalize these.
- Be explicit. The model cannot read your mind. If you did not say it, it does not exist.
- Be structured. Prose instructions produce prose-quality compliance. Structured inputs produce structured outputs.
- Be specific about format. Tell it what the output looks like. JSON, bullet points, a table, a single word. If you leave format open, you get whatever the training data averaged out to.
- Constrain the scope. Every degree of freedom you leave open is a degree of freedom the model fills with its best guess. Best guesses are not good enough.
- Provide examples. One concrete example communicates more than five paragraphs of description.
These principles are universal. Anthropic’s documentation puts it this way: “Think of Claude as a brilliant but new employee who lacks context on your norms and workflows. The more precisely you explain what you want, the better the result.” (Source)
Google’s Gemini 3 guide says the same thing differently: “State your goal clearly and concisely. Avoid unnecessary or overly persuasive language.” (Source)
OpenAI’s GPT-5 guide warns that contradictory instructions force the model to waste reasoning tokens reconciling conflicts instead of solving your actual problem. (Source)
Same conclusion from three independent engineering teams. Clarity wins.
Technique 1: Role and Context Assignment
Tell the model who it is before you tell it what to do. This is not roleplay. It is context loading. A role activates relevant patterns from training data and suppresses irrelevant ones.
Anthropic’s documentation confirms this: “Setting a role in the system prompt focuses Claude’s behavior and tone for your use case. Even a single sentence makes a difference.” (Source)
Google’s Gemini 3 guide adds a critical caveat: assign personas explicitly, but review them carefully because the model “prioritizes maintaining persona adherence even over other instructions.” (Source)
Weak prompt:
What ports should I open on my firewall?
You will get a generic list. Every default port for every common service. Useless.
Strong prompt:
You are a senior network security engineer conducting a firewall
audit for a PCI-DSS compliant e-commerce environment running
Kubernetes on AWS. The environment serves HTTPS traffic through
an ALB, uses RDS PostgreSQL for the backend database, and runs
Redis for session caching.
List only the ports that must be open on the VPC security groups,
grouped by tier (public, application, data). For each port,
specify the protocol, source CIDR restriction, and justification.
Flag any port that introduces PCI-DSS compliance risk.
Same question. Completely different output. The role and context eliminated 90% of the noise before the model generated a single token.
Why it works: Remember attention from the last post. The model weighs every token against every other token. When you load the context with “PCI-DSS,” “Kubernetes,” and “AWS,” the attention mechanism amplifies patterns associated with those domains. Without that context, it distributes attention across everything it knows about firewalls, including home routers and gaming setups.
Anthropic also recommends providing the motivation behind your instructions. Instead of saying “NEVER use ellipses,” explain: “Your response will be read aloud by a text-to-speech engine, so never use ellipses since the text-to-speech engine will not know how to pronounce them.” Claude generalizes from the explanation. (Source)
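Here is what role assignment looks like in code. A minimal sketch using the Anthropic Python SDK: the role and environment context go in the system parameter, separate from the task itself. The model name is illustrative; substitute whatever model you actually run.

```python
# Minimal sketch: role assignment via the system prompt (Anthropic Python SDK).
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-5",  # illustrative; use whatever model you run
    max_tokens=1024,
    # The role and environment context load before the task is ever seen.
    system=(
        "You are a senior network security engineer conducting a firewall "
        "audit for a PCI-DSS compliant e-commerce environment running "
        "Kubernetes on AWS."
    ),
    messages=[{
        "role": "user",
        "content": (
            "List only the ports that must be open on the VPC security "
            "groups, grouped by tier (public, application, data)."
        ),
    }],
)
print(response.content[0].text)
```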
Technique 2: Few-Shot Examples
The model learns patterns from examples faster than it learns from descriptions. This is called few-shot prompting. You provide input/output pairs, and the model extrapolates the pattern.
Anthropic recommends 3 to 5 examples for best results and advises wrapping them in <example> tags so the model can distinguish examples from instructions. (Source)
Google’s Gemini documentation recommends 2 to 5 varied examples and warns against providing too many, which can cause overfitting where the model mimics examples too literally instead of generalizing the pattern. (Source)
Task: Extract structured data from unstructured network alerts.
Prompt:
Extract structured incident data from network alerts.
<example>
Input:
"Critical alert: Unauthorized SSH login attempt detected on
10.0.3.47 from external IP 203.0.113.42 at 14:32 UTC.
5 failed attempts in 60 seconds. Account: root."
Output:
{
  "severity": "critical",
  "event_type": "brute_force_attempt",
  "target_host": "10.0.3.47",
  "source_ip": "203.0.113.42",
  "protocol": "SSH",
  "timestamp": "14:32 UTC",
  "attempt_count": 5,
  "window_seconds": 60,
  "target_account": "root",
  "recommended_action": "Block source IP, force password reset on target account, review auth logs for lateral movement"
}
</example>
Now extract from this alert:
"Warning: DNS exfiltration pattern detected. Host 10.0.1.15
made 847 TXT record queries to suspicious domain
x4k9.badactor.net over 300 seconds starting at 09:17 UTC.
Average query length exceeds normal baseline by 340%."
One example. The model now knows the exact schema, the field names, the value formats, and that you want a recommended_action field with operational guidance. One example replaced an entire specification document.
The rule: One strong example beats three paragraphs of description. Two examples nail edge cases. Three to five is the sweet spot for consistent patterns. Beyond that, you are wasting tokens and risking overfitting.
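If you generate few-shot prompts programmatically, a small builder keeps the <example> tags consistent across every call. A minimal sketch in plain Python; the helper name and the abbreviated alert text are illustrative, not from any SDK.

```python
# Sketch: assemble a few-shot prompt from input/output pairs,
# each wrapped in <example> tags per Anthropic's guidance.
import json

def build_few_shot_prompt(task, examples, query):
    parts = [task]
    for alert_text, extraction in examples:
        parts.append(
            "<example>\n"
            f'Input:\n"{alert_text}"\n'
            f"Output:\n{json.dumps(extraction, indent=2)}\n"
            "</example>"
        )
    parts.append(f'Now extract from this alert:\n"{query}"')
    return "\n\n".join(parts)

examples = [(
    "Critical alert: Unauthorized SSH login attempt detected on 10.0.3.47 "
    "from external IP 203.0.113.42 at 14:32 UTC.",
    {"severity": "critical", "event_type": "brute_force_attempt",
     "target_host": "10.0.3.47", "source_ip": "203.0.113.42"},
)]

prompt = build_few_shot_prompt(
    "Extract structured incident data from network alerts.",
    examples,
    "Warning: DNS exfiltration pattern detected on host 10.0.1.15.",
)
```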
Technique 3: Structured Formatting with Tags
All three major platforms agree on this: structure your prompts with clear delimiters. The specific format varies, but the principle is universal.
Anthropic recommends XML tags and has specifically trained Claude to parse them. Tags like <instructions>, <context>, and <examples> reduce misinterpretation. (Source)
OpenAI recommends <instruction_spec> tags for organizing complex requirements, allowing clear internal referencing across prompt sections. (Source)
Google recommends XML-style tags or Markdown headings, advising you to choose one format and use it consistently within a single prompt. (Source)
Example: Multi-section prompt with XML structure
<role>
You are a senior infrastructure engineer reviewing Kubernetes
deployment manifests for production readiness. You follow the
CIS Kubernetes Benchmark v1.8.
</role>
<standards>
- All containers must run as non-root (runAsNonRoot: true)
- Resource limits are mandatory (no unbounded containers)
- Image tags must be SHA digests, not :latest
- Liveness and readiness probes required on all containers
- No hostNetwork, hostPID, or hostIPC unless approved
</standards>
<output_format>
For each violation, return a JSON object:
{
  "field": "<YAML path>",
  "violation": "<which standard>",
  "severity": "critical | warning",
  "fix": "<exact YAML to replace it with>"
}
Return a JSON array. No commentary. No markdown fences.
</output_format>
<input>
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: gateway
          image: company/api-gateway:latest
          ports:
            - containerPort: 8080
Every section has a purpose. The model knows exactly where the role ends and the standards begin. No ambiguity. No guessing.
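The same structure is easy to generate. A minimal sketch of a section builder, assuming you assemble prompts in Python; the tagged helper is hypothetical, not part of any SDK.

```python
# Sketch: one XML-style tag per section, consistent across prompts.
def tagged(tag, body):
    return f"<{tag}>\n{body.strip()}\n</{tag}>"

manifest_yaml = "apiVersion: apps/v1\nkind: Deployment\n..."  # manifest under review

prompt = "\n\n".join([
    tagged("role", "You are a senior infrastructure engineer reviewing "
                   "Kubernetes deployment manifests for production readiness."),
    tagged("standards", "- All containers must run as non-root\n"
                        "- Resource limits are mandatory"),
    tagged("output_format", "Return a JSON array of violation objects. "
                            "No commentary. No markdown fences."),
    tagged("input", manifest_yaml),
])
```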
Technique 4: Chain of Thought
When you need the model to reason through a problem, tell it to show its work. This forces the model to activate its reasoning pathways before committing to an answer.
Anthropic’s documentation notes that for Claude 4.6, “a prompt like ‘think thoroughly’ often produces better reasoning than a hand-written step-by-step plan. Claude’s reasoning frequently exceeds what a human would prescribe.” They also recommend asking Claude to self-check: “Before you finish, verify your answer against [test criteria].” (Source)
Google’s Gemini 3 guide takes a different approach, recommending split-step verification: first verify that information exists before generating answers. This prevents “plausible-sounding but incorrect outputs.” (Source)
Without chain of thought:
Is this subnet configuration valid?
10.0.1.0/24 with gateway 10.0.2.1
The model might say “yes” or “no” with no way to verify if it actually evaluated the configuration.
With chain of thought:
Evaluate this subnet configuration step by step.
Show your reasoning for each check before the final verdict.
Configuration:
- Subnet: 10.0.1.0/24
- Gateway: 10.0.2.1
Check:
1. Is the gateway IP within the subnet range?
2. Is the subnet mask valid for the stated network?
3. Are there any RFC violations?
4. What is the usable host range?
After all checks, state VALID or INVALID with justification.
Output:
1. Subnet range: 10.0.1.0 to 10.0.1.255. Gateway 10.0.2.1
is NOT within this range. The gateway must reside within
the subnet or be reachable via a connected interface.
2. /24 (255.255.255.0) is valid for 10.0.1.0.
3. No RFC violations on the subnet itself, but gateway
placement violates standard Layer 3 design.
4. Usable range: 10.0.1.1 to 10.0.1.254 (254 hosts).
INVALID. Gateway 10.0.2.1 is outside 10.0.1.0/24.
Correct to 10.0.1.1 or re-architect the routing.
The step-by-step requirement forced the model to evaluate each condition instead of guessing at the aggregate. This is the difference between a tech who checks every cable and one who says “looks fine” from across the room.
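Worth noting: every check in that prompt is also mechanically verifiable. A quick cross-check of the model's verdict using Python's standard ipaddress module:

```python
# Deterministic cross-check of the model's subnet reasoning.
import ipaddress

subnet = ipaddress.ip_network("10.0.1.0/24")
gateway = ipaddress.ip_address("10.0.2.1")

print(gateway in subnet)                # False -> gateway is outside the subnet
print(subnet.netmask)                   # 255.255.255.0, valid for a /24
hosts = list(subnet.hosts())
print(hosts[0], hosts[-1], len(hosts))  # 10.0.1.1 10.0.1.254 254
```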
Technique 5: Output Constraints
If you do not define the output format, the model picks one. It will be verbose, inconsistent, and different every time. Lock it down.
OpenAI’s GPT-5 guide introduces a dual verbosity strategy: set a low verbosity parameter globally while requesting high verbosity only for specific contexts like code output. This keeps status updates concise while maintaining readable code. (Source)
Google’s Gemini 3 documentation notes that Gemini 3 models default to concise, direct answers. If you need more detail, you must explicitly request it. (Source)
Anthropic recommends telling Claude what to do instead of what not to do. Instead of “Do not use markdown in your response,” try “Your response should be composed of smoothly flowing prose paragraphs.” (Source)
Unconstrained:
Analyze this log entry for security issues.
You get a three-paragraph essay. Sometimes bullets. Sometimes narrative. Never consistent.
Constrained:
Analyze the log entry below. Respond with ONLY a JSON object.
No markdown fences. No explanation text.
{
  "log_line": "<original log entry>",
  "classification": "benign | suspicious | malicious",
  "confidence": <float 0.0-1.0>,
  "indicators": ["<specific indicators found>"],
  "mitre_technique": "<ATT&CK technique ID or null>",
  "recommended_action": "<one sentence>"
}
Log entry:
POST /wp-login.php HTTP/1.1 from 198.51.100.23 - 47 requests
in 12 seconds - all returned 401 - User-Agent: python-requests/2.28
Machine-parseable output. Every time. Same schema. Same fields. You can pipe this into a SIEM, a database, or a downstream automation. That is the difference between a tool and a toy.
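Constrained output is only useful if you enforce it. A minimal sketch of a parser that validates the response against the schema above before anything downstream consumes it; the function name is illustrative.

```python
# Validate the constrained response before piping it downstream.
import json

ALLOWED_CLASSES = {"benign", "suspicious", "malicious"}

def parse_triage(raw: str) -> dict:
    result = json.loads(raw)  # raises json.JSONDecodeError on non-JSON output
    if result["classification"] not in ALLOWED_CLASSES:
        raise ValueError(f"unexpected classification: {result['classification']}")
    if not 0.0 <= result["confidence"] <= 1.0:
        raise ValueError(f"confidence out of range: {result['confidence']}")
    return result
```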
Technique 6: Context Placement
Where you put information in your prompt matters as much as what you put in it. This is the practical application of the lost-in-the-middle problem covered in the last post.
Anthropic is explicit: “Put longform data at the top. Place your long documents and inputs near the top of your prompt, above your query, instructions, and examples. Queries at the end can improve response quality by up to 30% in tests, especially with complex, multi-document inputs.” (Source)
Google’s Gemini 3 guide mirrors this: “Place specific questions after large context blocks and anchor reasoning with phrases like ‘Based on the information above.’” It also advises placing critical restrictions at the end of the prompt to prevent the model from dropping them. (Source)
The placement rule:
[LONG REFERENCE DOCUMENTS / DATA]
[YOUR SPECIFIC QUESTION OR TASK]
[CRITICAL CONSTRAINTS AND FORMAT REQUIREMENTS]
Data first. Task second. Constraints last. The model attends most strongly to the beginning (your reference material) and the end (your constraints). Your task sits in between, anchored by both.
For multi-document prompts, Anthropic recommends wrapping each document in indexed tags:
<documents>
  <document index="1">
    <source>firewall_rules_prod.csv</source>
    <document_content>
      ... rule data ...
    </document_content>
  </document>
  <document index="2">
    <source>incident_report_2026-04.pdf</source>
    <document_content>
      ... report content ...
    </document_content>
  </document>
</documents>
Based on the documents above, identify any firewall rules
that would have permitted the attack vector described in
the incident report. Output as a table with columns:
Rule ID, Source, Destination, Port, Risk Assessment.
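If you build these prompts in code, make the placement rule structural so nobody can get it wrong. A minimal sketch of a builder that always emits documents first, task second, constraints last; the helper is hypothetical.

```python
# Builder that hard-codes the placement rule: data, then task, then constraints.
def placed_prompt(documents, task, constraints):
    doc_blocks = "\n".join(
        f'<document index="{i}">\n'
        f"<source>{name}</source>\n"
        f"<document_content>\n{content}\n</document_content>\n"
        f"</document>"
        for i, (name, content) in enumerate(documents, start=1)
    )
    return f"<documents>\n{doc_blocks}\n</documents>\n\n{task}\n\n{constraints}"

prompt = placed_prompt(
    documents=[("firewall_rules_prod.csv", "... rule data ..."),
               ("incident_report_2026-04.pdf", "... report content ...")],
    task="Based on the documents above, identify any firewall rules that "
         "would have permitted the attack vector in the incident report.",
    constraints="Output as a table with columns: Rule ID, Source, "
                "Destination, Port, Risk Assessment.",
)
```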
Technique 7: Negative Constraints
Telling the model what NOT to do is as important as telling it what to do. Models have strong defaults from training. If those defaults conflict with what you need, override them explicitly.
Anthropic’s documentation includes a detailed prompt template for suppressing common unwanted behaviors like excessive markdown, bullet points, and bold/italic formatting. (Source)
Google specifically warns against broad negative instructions like “do not infer.” Instead, specify what the model should use for reasoning. (Source)
Common defaults to suppress:
Do not include introductory text like "Sure!" or
"Here is the analysis."
Do not wrap code in markdown fences unless asked.
Do not add disclaimers about limitations.
Do not explain reasoning unless asked. Output the result.
Do not use placeholder values. If data is missing,
output null.
Real-world example: Generating Terraform configurations.
Generate a Terraform resource block for an AWS security group
allowing inbound HTTPS (443) from 0.0.0.0/0 and SSH (22)
from 10.0.0.0/8 only.
Output ONLY the resource block.
- No provider block
- No variable declarations
- No comments
- No egress rules
- No tags
- Resource name: "web_server_sg"
Without those constraints, the model produces a complete Terraform file with provider config, variables, outputs, inline comments, and invented tags. All of which you strip out. Save the tokens. Save the time.
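Negative constraints pair well with a post-check. A rough sketch that rejects output violating the constraints above before it lands in a .tf file; the banned markers are illustrative, not exhaustive.

```python
# Reject model output that violates the negative constraints.
BANNED_MARKERS = ("provider ", "variable ", "```", "egress", "tags")

def check_terraform_output(raw: str) -> str:
    hits = [m for m in BANNED_MARKERS if m in raw]
    if hits:
        raise ValueError(f"constraint violations in model output: {hits}")
    if not raw.lstrip().startswith("resource "):
        raise ValueError("output is not a bare resource block")
    return raw
```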
Technique 8: Iterative Refinement
One prompt rarely produces a perfect result. Plan for iteration, but do it systematically.
OpenAI’s GPT-5 guide introduces the concept of metaprompting: ask the model itself “what phrases could be added or deleted from this prompt to elicit desired behavior.” Use the model to improve the prompt that drives it. (Source)
Google frames prompt engineering explicitly as “a test-driven and iterative process” and recommends rephrasing requests multiple ways, switching to analogous tasks, and reordering prompt content to test impact. (Source)
Round 1: Get the structure right.
Draft a network diagram description for a three-tier web
application on AWS. Output as a structured list, not prose.
Include VPC layout, subnet tiers, and security group
boundaries.
Round 2: Add precision.
Good structure. Now add:
- Specific CIDR blocks (10.0.0.0/16 VPC)
- NAT gateway placement
- Which tier gets public IPs
- Cross-AZ redundancy notation
Round 3: Pressure test.
Review this architecture for single points of failure.
For each one, propose a mitigation and rate cost impact
as low/medium/high.
Round 4: Use metaprompting.
Review the prompt I used to generate this architecture.
What instructions could I add or remove to get a more
production-ready result on the first pass?
Each round builds on the last. You are not starting over. You are narrowing. And in round 4, you are using the model to improve your own process.
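Round 4 can itself be automated. A minimal sketch using the OpenAI Python SDK's Chat Completions interface; the model name and function name are illustrative.

```python
# Metaprompting: feed the working prompt back to the model for review.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def metaprompt(original_prompt: str) -> str:
    review = client.chat.completions.create(
        model="gpt-5",  # illustrative
        messages=[{
            "role": "user",
            "content": (
                "Review the prompt below. What phrases could be added or "
                "deleted to get a more production-ready result on the "
                f"first pass?\n\n<prompt>\n{original_prompt}\n</prompt>"
            ),
        }],
    )
    return review.choices[0].message.content
```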
The Anti-Patterns
These waste time and tokens. Stop doing them.
Vague delegation. “Make this better” is not an instruction. Better how? Faster? More secure? More readable? If you cannot articulate what “better” means, you do not know what you want. The model cannot fix that.
Prompt stuffing. Dumping 50 pages of documentation and saying “analyze this” guarantees the model will miss the parts that matter. Remember the lost-in-the-middle problem. Curate your context.
Politeness tokens. “Could you please kindly help me with” burns tokens and adds zero signal. The model processes instructions. Give it instructions.
Ambiguous references. “Update it to use the new format” after a long conversation. What is “it”? What is “the new format”? Name the file. Specify the format. The model will not ask for clarification. It will guess.
Over-prompting for modern models. Both Anthropic and OpenAI warn about this. Instructions that were necessary for older models (“CRITICAL: You MUST use this tool”) will cause current models to overtrigger. Anthropic’s guidance: “Where you might have said ‘CRITICAL: You MUST use this tool when…’, you can use more normal prompting like ‘Use this tool when…’” OpenAI’s guidance: soften thoroughness instructions because “GPT-5 is already naturally introspective about context gathering.” (Sources: Anthropic, OpenAI)
Platform-Specific Tips
Claude (Anthropic)
- XML tags are first-class. Claude is specifically trained to parse XML structure. Use <context>, <instructions>, and <examples> to delineate sections.
- 3 to 5 examples is the recommended range for few-shot prompting.
- Long documents go at the top of the prompt, queries at the bottom. This improves response quality up to 30%.
- Ground responses in quotes. For long document tasks, ask Claude to quote relevant parts before answering. This cuts through noise.
- CLAUDE.md files for Claude Code users provide persistent instructions that load every session without burning tokens on repeated prompts.
GPT-5 (OpenAI)
- Metaprompting. Ask GPT-5 to review and improve your prompt. It will suggest phrases to add or remove.
- Reasoning effort parameter. Adjust reasoning_effort (low/medium/high) to control how much the model explores before answering (see the sketch after this list).
- Persistence framing. Use “keep going until the user’s query is completely resolved” to prevent premature task termination.
- Self-reflection rubrics. For complex generation tasks, have the model create a 5 to 7 category excellence rubric internally, then iterate against it.
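A minimal sketch of the reasoning effort parameter in use, via the OpenAI Python SDK; the model name is illustrative.

```python
# Per-request reasoning effort (OpenAI Python SDK, Chat Completions).
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-5",           # illustrative
    reasoning_effort="low",  # "low" | "medium" | "high"
    messages=[{"role": "user", "content": "Summarize this change in one line."}],
)
print(response.choices[0].message.content)
```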
Gemini 3 (Google)
- Keep temperature at 1.0. Gemini 3’s reasoning is optimized for the default. Lowering it can cause looping or degraded performance on complex tasks (see the sketch after this list).
- Split-step verification. Verify information exists before generating answers to prevent confident but incorrect outputs.
- Concise by default. Gemini 3 gives direct, efficient answers. Request detail explicitly if you need it.
- Avoid broad negatives. “Do not infer” causes problems. Instead, specify what the model should use for reasoning.
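A minimal sketch with the google-genai Python SDK, passing the 1.0 default explicitly just to make it visible in code; the model name is illustrative.

```python
# Explicitly keeping Gemini at the default temperature of 1.0 (google-genai SDK).
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment
response = client.models.generate_content(
    model="gemini-3-pro-preview",  # illustrative
    contents="Evaluate: subnet 10.0.1.0/24 with gateway 10.0.2.1. Valid?",
    config=types.GenerateContentConfig(temperature=1.0),
)
print(response.text)
```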
Quick Reference: Technique Cheat Sheet
| Technique | What It Does | When to Use |
|---|---|---|
| Role Assignment | Activates domain-specific patterns | Every prompt with a specialized context |
| Few-Shot Examples | Teaches output format by pattern | Structured extraction, classification, formatting |
| XML/Tag Structure | Eliminates ambiguity between sections | Multi-part prompts, mixed instructions and data |
| Chain of Thought | Forces step-by-step reasoning | Validation, debugging, complex analysis |
| Output Constraints | Locks format and schema | API integrations, automation pipelines |
| Context Placement | Exploits attention distribution | Long documents, multi-source analysis |
| Negative Constraints | Overrides training defaults | Terraform, code gen, any structured output |
| Iterative Refinement | Narrows toward precision over rounds | Architecture, planning, complex deliverables |
| Metaprompting | Uses the model to improve your prompts | Prompt optimization, workflow development |
Bottom Line
Prompt engineering is instruction writing. That is it. No mysticism. No secret sauce.
The model is a system that follows orders literally. Vague orders produce vague results. Precise orders produce precise results. Three independent engineering teams at Anthropic, OpenAI, and Google arrived at the same conclusion: be clear, be structured, be specific, provide examples, and constrain the output.
You would not hand a junior admin a firewall change request that says “make it more secure.” You would specify the exact rules, the exact interfaces, the exact traffic flows, and the expected behavior. Treat the model the same way.
Write your prompts like operations orders. State the situation. Define the task. Specify the constraints. Dictate the format. Provide examples. Execute.
Next post: RAG (retrieval-augmented generation) and how to give your AI access to knowledge it was never trained on. The context window is not the only way to feed data to a model. It is not even the best way.