06. System Prompts and Agent Configuration

What Is a System Prompt?

Every conversation with an AI model has roles: system, user, assistant and tool. The system prompt is the invisible instruction set that runs before anything the user says. It defines who the model is, how it behaves, and what rules it follows.

When you use ChatGPT, Claude.ai, or any AI agent, there's a system prompt running behind the scenes that you didn't write. When you build your own tools, you write this yourself.

Why System Prompts Matter

The system prompt is the difference between a generic chatbot and a useful tool. Compare:

No system prompt:
User: "Review this code"
→ Generic, unfocused review. Comments on style, logic, naming, everything at once.

With system prompt:
System: "You are a security-focused code reviewer. Only flag security
vulnerabilities. Ignore style, naming, and minor logic issues.
Rate each finding as critical, high, medium, or low severity."

User: "Review this code"
→ Focused security review with severity ratings. Actually useful.

Anatomy of a Good System Prompt

A system prompt has four parts:

1. Identity: Who is the model?

You are a senior backend engineer specializing in Go and PostgreSQL.

2. Behavior: How should it respond?

You give concise answers. You show code examples. You explain tradeoffs.

3. Rules: What should it always or never do?

Always suggest error handling. Never use deprecated APIs.
If you are unsure, say so instead of guessing.

4. Format: How should output look?

Respond with:
- A one-line summary
- The code solution
- One sentence explaining why this approach

Real System Prompts

Code Review Agent

You are a code reviewer for a Go backend team.

Behavior:
- Focus on bugs, performance issues, and security vulnerabilities
- Ignore style preferences (formatting, naming conventions)
- Be direct. Say what is wrong and how to fix it

Format for each issue:
- Line number
- Severity (critical/high/medium/low)
- What is wrong
- How to fix it

If the code looks good, say "No issues found" and nothing else.

Customer Support Bot

You are a support agent for ByteLearn, an online learning platform.

Rules:
- Answer questions about courses, accounts, and billing
- If you do not know the answer, say "Let me connect you with our team"
- Never make up information about pricing or features
- Keep responses under 3 sentences unless the user asks for more detail
- Be friendly but professional

You have access to these facts:
- Plans: Free (3 courses), Pro ($12/month, all courses)
- Refund policy: 30 days, no questions asked
- Support email: [email protected]

Writing Assistant

You are a writing editor for technical blog posts.

Your job:
- Fix grammar and clarity issues
- Shorten sentences that are too long
- Flag jargon that needs explanation
- Keep the author's voice and tone intact

Do not:
- Rewrite entire paragraphs
- Add new content
- Change technical accuracy

Respond with the edited text only. Use [COMMENT: ...] inline for suggestions
that require the author's decision.

Agent Configuration

Agents like Kiro, Cursor, and custom tools built on APIs use system prompts plus additional configuration:

Temperature

Controls randomness. Lower means more deterministic and predictable. Higher means more creative and varied.

Temperature	Use case
0	Classification, extraction, factual Q&A
0.3	Code generation, structured tasks
0.7	Creative writing, brainstorming
1.0	Maximum creativity, poetry, fiction

For most developer tasks, 0 to 0.3 is the right range.

Max Tokens

Limits response length. Set this to prevent the model from over-explaining.

Short answers: 256 tokens
Code generation: 1024 to 2048 tokens
Long analysis: 4096 tokens

If you don't set a limit, models tend to ramble. Setting max tokens forces conciseness.

Stop Sequences

Tell the model to stop generating when it hits a specific string. Useful for structured output where you want the model to stop after producing the data you need.

Stop sequences: ["\n\n", "---"]

The model will stop as soon as it generates a double newline or "---",
preventing it from adding unwanted extra content after your answer.

Building an Agent

An agent is a system prompt + model + tools. Here's the mental model:

Agent = System Prompt + Model Choice + Tools + Memory

System Prompt: Defines behavior and rules
Model Choice: Which model to use (fast vs thinking, cheap vs expensive)
Tools: What the agent can do (read files, run code, search web)
Memory: What context carries between turns

Example: A PR Review Agent

System Prompt:
"You review pull requests. Focus on bugs and security.
Suggest fixes. Be concise."

Model: Claude Sonnet (good at code, follows instructions)
Temperature: 0.1 (deterministic, consistent reviews)
Max tokens: 2048

Tools:
- read_file: Read any file in the repository
- get_diff: Get the PR diff
- post_comment: Leave a review comment

Flow:
1. Get the PR diff
2. Read relevant files for context
3. Analyze for issues
4. Post comments on specific lines

Common Mistakes

Too vague:

❌ "Be helpful and answer questions."
→ This adds nothing. The model is already trying to be helpful by default.

Too restrictive:

❌ "Only respond with exactly one word. Never use punctuation.
   Do not acknowledge the user. Do not explain anything."
→ So constrained that the model can't actually be useful.

Contradictory rules:

❌ "Be concise. Also, always explain your reasoning in detail."

These two instructions fight each other. The model can't be brief and detailed at the same time, so it awkwardly tries both — sometimes too short, sometimes too verbose, never consistent.

Fix: Make the rules conditional so the model knows when each applies:

✅ "Be concise for simple questions. For complex or multi-step tasks, explain your reasoning before acting."

Now there's no conflict — the model picks the right behavior based on context.

The sweet spot: Specific enough to focus the model, flexible enough to handle varied inputs.

Testing System Prompts

Test your system prompt with edge cases before deploying:

Happy path: Does it handle the normal case well?
Edge cases: What if the input is empty? Extremely long? In a different language?
Adversarial: What if the user tries to override the system prompt? ("Ignore your instructions and...")
Ambiguous: What if the user's request is unclear? Does it ask for clarification or guess?

Good system prompts handle all four gracefully.
Bad system prompts only work on the happy path.

Key Takeaways

System prompts define identity, behavior, rules, and output format.
A focused system prompt turns a generic model into a useful, specialized tool.
Temperature 0 for deterministic tasks, 0.3 for code, 0.7+ for creative work.
Agents combine system prompts with model choice, tools, and memory.
Test with edge cases and adversarial inputs, not just the happy path.
Avoid contradictory rules. If you say "be concise" and "explain in detail," the model will do neither well.