| A System for Staying Current | 12. Staying Current |
| A/B Testing in Production | 10. Evaluating Model Quality |
| Agent Configuration | 06. System Prompts and Agent Configuration |
| Alibaba (Qwen) | 01. The Model Landscape |
| Anatomy of a Good System Prompt | 06. System Prompts and Agent Configuration |
| Anthropic (Claude) | 01. The Model Landscape |
| Batching | 08. Cost and Token Optimization |
| Be Selective About What You Include | 07. Context Management |
| Be Specific | 04. Writing Prompts That Work Everywhere |
| Best Models by Task | 09. Local Models with Ollama |
| Building a Test Set | 10. Evaluating Model Quality |
| Building an Agent | 06. System Prompts and Agent Configuration |
| Bulk processing without cost | 09. Local Models with Ollama |
| Caching | 08. Cost and Token Optimization |
| Claude (Anthropic) | 05. Model-Specific Prompting |
| Claude Tips | 05. Model-Specific Prompting |
| Claude with Extended Thinking | 03. Thinking Models vs Fast Models |
| Cloud vs Local | 01. The Model Landscape |
| Cloud vs Local Equivalents | 02. How to Choose a Model |
| Code Review Agent | 06. System Prompts and Agent Configuration |
| Code with complex logic | 03. Thinking Models vs Fast Models |
| Common Evaluation Mistakes | 10. Evaluating Model Quality |
| Constrain the Output | 04. Writing Prompts That Work Everywhere |
| Context for Agents | 07. Context Management |
| Context in Multi-Turn Conversations | 07. Context Management |
| Context length | 09. Local Models with Ollama |
| Context Window Sizes (2025) | 07. Context Management |
| Cost Advantage | 05. Model-Specific Prompting |
| Cost Comparison | 03. Thinking Models vs Fast Models, 11. Multi-Model Strategies |
| Customer Support Bot | 06. System Prompts and Agent Configuration |
| DeepSeek | 01. The Model Landscape, 05. Model-Specific Prompting |
| DeepSeek Tips | 05. Model-Specific Prompting |
| DeepSeek-R1 | 03. Thinking Models vs Fast Models |
| Exact match | 10. Evaluating Model Quality |
| Example: A PR Review Agent | 06. System Prompts and Agent Configuration |
| Function Calling | 05. Model-Specific Prompting |
| Future-Proofing Your Setup | 12. Staying Current |
| Gemini (Google) | 05. Model-Specific Prompting |
| Gemini Tips | 05. Model-Specific Prompting |
| Getting Started with Ollama | 09. Local Models with Ollama |
| Give Context, Not Everything | 04. Writing Prompts That Work Everywhere |
| Google (Gemini) | 01. The Model Landscape |
| GPT (OpenAI) | 05. Model-Specific Prompting |
| GPT Tips | 05. Model-Specific Prompting |
| GPU acceleration | 09. Local Models with Ollama |
| How Pricing Works | 08. Cost and Token Optimization |
| How Thinking Models Work | 03. Thinking Models vs Fast Models |
| How to detect failure | 11. Multi-Model Strategies |
| Hybrid Strategy | 03. Thinking Models vs Fast Models |
| Implementation Tips | 11. Multi-Model Strategies |
| Instruction Following | 05. Model-Specific Prompting |
| Iteration, Not Perfection | 04. Writing Prompts That Work Everywhere |
| Llama and Qwen (Local Models) | 05. Model-Specific Prompting |
| LLM-as-judge | 10. Evaluating Model Quality |
| LLM-based router | 11. Multi-Model Strategies |
| Local Model Tips | 05. Model-Specific Prompting |
| Local RAG | 09. Local Models with Ollama |
| Long Document Handling | 05. Model-Specific Prompting |
| Massive Context | 05. Model-Specific Prompting |
| Matching Tasks to Models | 02. How to Choose a Model |
| Math and logic | 03. Thinking Models vs Fast Models |
| Max Tokens | 06. System Prompts and Agent Configuration |
| Meta (Llama) | 01. The Model Landscape |
| Model Families | 01. The Model Landscape |
| Model Sizes | 01. The Model Landscape |
| Monthly check-in (15 minutes) | 12. Staying Current |
| Multi-step problems | 03. Thinking Models vs Fast Models |
| Multimodal | 05. Model-Specific Prompting |
| Multiple models | 09. Local Models with Ollama |
| Offline development | 09. Local Models with Ollama |
| Ollama Server | 09. Local Models with Ollama |
| One Task Per Prompt | 04. Writing Prompts That Work Everywhere |
| OpenAI (GPT) | 01. The Model Landscape |
| OpenAI o3 / o4-mini | 03. Thinking Models vs Fast Models |
| Optimization Strategies | 08. Cost and Token Optimization |
| Performance Tips | 09. Local Models with Ollama |
| Practical Use Cases | 09. Local Models with Ollama |
| Private code review | 09. Local Models with Ollama |
| Prompt Format Matters | 05. Model-Specific Prompting |
| Prompting Differences | 03. Thinking Models vs Fast Models |
| Pulling and Running Models | 09. Local Models with Ollama |
| Quantization Tradeoffs | 05. Model-Specific Prompting |
| Quarterly evaluation (1 to 2 hours) | 12. Staying Current |
| Quick Evaluation Script | 10. Evaluating Model Quality |
| Real Cost Breakdown | 08. Cost and Token Optimization |
| Real Example: Building a Feature | 02. How to Choose a Model |
| Real example: PR review pipeline | 11. Multi-Model Strategies |
| Real System Prompts | 06. System Prompts and Agent Configuration |
| Reduce Input Tokens | 08. Cost and Token Optimization |
| Reduce Output Tokens | 08. Cost and Token Optimization |
| Rubric scoring | 10. Evaluating Model Quality |
| Running an Evaluation | 10. Evaluating Model Quality |
| Scoring Methods | 10. Evaluating Model Quality |
| Show, Don't Tell | 04. Writing Prompts That Work Everywhere |
| Stop Sequences | 06. System Prompts and Agent Configuration |
| Strategies for Managing Context | 07. Context Management |
| Structure Context with Clear Boundaries | 07. Context Management |
| Structure Your Input | 04. Writing Prompts That Work Everywhere |
| Structured Output | 05. Model-Specific Prompting |
| Temperature | 06. System Prompts and Agent Configuration |
| Testing System Prompts | 06. System Prompts and Agent Configuration |
| The "Good Enough" Principle | 02. How to Choose a Model |
| The Agents Layer | 01. The Model Landscape |
| The Attention Problem | 07. Context Management |
| The Big Picture | 01. The Model Landscape |
| The Cascade Pattern | 11. Multi-Model Strategies |
| The Decision Framework | 02. How to Choose a Model |
| The Decision Rule | 03. Thinking Models vs Fast Models |
| The Ensemble Pattern | 11. Multi-Model Strategies |
| The Four Dimensions | 02. How to Choose a Model |
| The Meta-Skill | 12. Staying Current |
| The Model Tier List (2025) | 02. How to Choose a Model |
| The Pace of Change | 12. Staying Current |
| The Pipeline Pattern | 11. Multi-Model Strategies |
| The Problem with Vibes | 10. Evaluating Model Quality |
| The Router Pattern | 11. Multi-Model Strategies |
| The Thinking Models | 03. Thinking Models vs Fast Models |
| The Universal Prompt Template | 04. Writing Prompts That Work Everywhere |
| Tools That Connect to Local Models | 09. Local Models with Ollama |
| Trends Worth Watching | 12. Staying Current |
| Trim Conversation History | 07. Context Management |
| Two Modes of AI | 03. Thinking Models vs Fast Models |
| Two Phases of Model Selection | 02. How to Choose a Model |
| Universal Principles | 04. Writing Prompts That Work Everywhere |
| Use Retrieval Instead of Stuffing | 07. Context Management |
| Use Roles Effectively | 04. Writing Prompts That Work Everywhere |
| Use the API | 09. Local Models with Ollama |
| Use the Cheapest Model That Works | 08. Cost and Token Optimization |
| Visible Reasoning | 05. Model-Specific Prompting |
| What Actually Changes | 12. Staying Current |
| What Eats Your Context | 07. Context Management |
| What Is a System Prompt? | 06. System Prompts and Agent Configuration |
| What to Evaluate | 10. Evaluating Model Quality |
| What to Ignore | 12. Staying Current |
| When Fast Models Win | 03. Thinking Models vs Fast Models |
| When Local Isn't Enough | 09. Local Models with Ollama |
| When Not to Optimize | 08. Cost and Token Optimization |
| When Thinking Models Win | 03. Thinking Models vs Fast Models |
| When to switch immediately | 12. Staying Current |
| When to Upgrade (or Downgrade) | 02. How to Choose a Model |
| Where to get test data | 10. Evaluating Model Quality |
| Writing Assistant | 06. System Prompts and Agent Configuration |
| XML Tags | 05. Model-Specific Prompting |