Source: wshobson/agents (original plugin: llm-application-dev)

# Prompt Engineering Patterns

Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability.
## When to Use This Skill
- Designing complex prompts for production LLM applications
- Optimizing prompt performance and consistency
- Implementing structured reasoning patterns (chain-of-thought, tree-of-thought)
- Building few-shot learning systems with dynamic example selection
- Creating reusable prompt templates with variable interpolation
- Debugging and refining prompts that produce inconsistent outputs
- Implementing system prompts for specialized AI assistants
## Core Capabilities

### 1. Few-Shot Learning
- Example selection strategies (semantic similarity, diversity sampling; sketched after this list)
- Balancing example count with context window constraints
- Constructing effective demonstrations with input-output pairs
- Dynamic example retrieval from knowledge bases
- Handling edge cases through strategic example selection
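A minimal sketch of the semantic-similarity selection strategy above. The bag-of-words `embed` function is a stand-in so the example runs anywhere; in practice you would use an embeddings model and precompute vectors for the example pool. All names here are illustrative.

```python
# Sketch: select the k examples most similar to the query.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Crude bag-of-words "embedding"; swap in a real embeddings model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_examples(query: str, pool: list[dict], k: int = 3) -> list[dict]:
    # Rank the pool by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex["input"])), reverse=True)
    return ranked[:k]

pool = [
    {"input": "count orders per customer", "output": "SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id;"},
    {"input": "users created in the last week", "output": "SELECT * FROM users WHERE created_at > now() - interval '7 days';"},
]
print(select_examples("find users who registered recently", pool, k=1))
```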
### 2. Chain-of-Thought Prompting
- Step-by-step reasoning elicitation
- Zero-shot CoT with "Let's think step by step"
- Few-shot CoT with reasoning traces
- Self-consistency techniques (sampling multiple reasoning paths; sketched after this list)
- Verification and validation steps
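A minimal sketch of self-consistency, assuming a `generate(prompt, temperature=...)` callable that wraps your LLM client. The last-line answer extraction is a deliberately crude placeholder for a task-specific parser.

```python
# Sketch: self-consistency — sample several reasoning paths at non-zero
# temperature, extract each final answer, and return the majority vote.
from collections import Counter

def self_consistent_answer(generate, prompt: str, n_samples: int = 5) -> str:
    # Zero-shot CoT trigger plus repeated sampling.
    cot_prompt = prompt + "\n\nLet's think step by step."
    answers = []
    for _ in range(n_samples):
        completion = generate(cot_prompt, temperature=0.8)
        lines = completion.strip().splitlines()
        answers.append(lines[-1] if lines else "")  # crude final-answer extraction
    return Counter(answers).most_common(1)[0][0]
```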
### 3. Prompt Optimization
- Iterative refinement workflows
- A/B testing prompt variations (sketched after this list)
- Measuring prompt performance metrics (accuracy, consistency, latency)
- Reducing token usage while maintaining quality
- Handling edge cases and failure modes
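A minimal sketch of A/B testing prompt variants against a labeled eval set. `generate` again stands in for your LLM call, and substring matching is the simplest possible scorer; a real harness would also record consistency, latency, and token usage.

```python
# Sketch: score each prompt variant by accuracy on (input, expected) pairs.
def evaluate(generate, template: str, eval_set: list[tuple[str, str]]) -> float:
    correct = sum(
        expected.lower() in generate(template.format(input=inp)).lower()
        for inp, expected in eval_set
    )
    return correct / len(eval_set)

variants = {
    "terse": "Answer briefly: {input}",
    "cot": "Think step by step, then answer: {input}",
}
# Pick the winner (given a generate callable and eval_set):
# best = max(variants, key=lambda name: evaluate(generate, variants[name], eval_set))
```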
### 4. Template Systems
- Variable interpolation and formatting
- Conditional prompt sections (see the sketch after this list)
- Multi-turn conversation templates
- Role-based prompt composition
- Modular prompt components
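Variable interpolation and conditional sections do not need custom machinery; a template engine covers both. A minimal sketch using Jinja2 (an assumed dependency; any engine with conditionals works), where the few-shot section renders only when examples are supplied:

```python
# Sketch: interpolation plus a conditional few-shot section
# (requires the jinja2 package).
from jinja2 import Template

PROMPT = Template(
    "Task: {{ task }}\n\n"
    "{% if examples %}Examples:\n"
    "{% for ex in examples %}Input: {{ ex.input }}\nOutput: {{ ex.output }}\n{% endfor %}"
    "\n{% endif %}"
    "Input: {{ user_input }}\nOutput:"
)

print(PROMPT.render(
    task="Convert English to SQL",
    user_input="find inactive users",
    examples=[{"input": "count orders", "output": "SELECT COUNT(*) FROM orders;"}],
))
```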
### 5. System Prompt Design
- Setting model behavior and constraints
- Defining output formats and structure
- Establishing role and expertise
- Safety guidelines and content policies
- Context setting and background information
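A minimal sketch of composing a system prompt from the components above; the section order and wording are illustrative choices, not a fixed schema.

```python
# Sketch: compose role, constraints, and output format into one
# system prompt, keeping stable sections in a predictable order.
def build_system_prompt(role: str, constraints: list[str],
                        output_format: str, context: str = "") -> str:
    sections = [f"You are {role}."]
    if context:
        sections.append(f"Background: {context}")
    if constraints:
        sections.append("Constraints:\n" + "\n".join(f"- {c}" for c in constraints))
    sections.append(f"Output format: {output_format}")
    return "\n\n".join(sections)

print(build_system_prompt(
    role="an expert SQL developer",
    constraints=["Generate read-only SELECT statements",
                 "Flag any request that implies destructive changes"],
    output_format="a SQL code block followed by a one-line explanation",
))
```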
## Quick Start

```python
from prompt_optimizer import PromptTemplate, FewShotSelector

# Define a structured prompt template
template = PromptTemplate(
    system="You are an expert SQL developer. Generate efficient, secure SQL queries.",
    instruction="Convert the following natural language query to SQL:\n{query}",
    few_shot_examples=True,
    output_format="SQL code block with explanatory comments",
)

# Configure few-shot learning
selector = FewShotSelector(
    examples_db="sql_examples.jsonl",
    selection_strategy="semantic_similarity",
    max_examples=3,
)

# Generate optimized prompt
prompt = template.render(
    query="Find all users who registered in the last 30 days",
    examples=selector.select(query="user registration date filter"),
)
```
## Key Patterns

### Progressive Disclosure

Start with simple prompts, add complexity only when needed:
- Level 1: Direct instruction - "Summarize this article"
- Level 2: Add constraints - "Summarize this article in 3 bullet points, focusing on key findings"
- Level 3: Add reasoning - "Read this article, identify the main findings, then summarize in 3 bullet points"
- Level 4: Add examples - Include 2-3 example summaries with input-output pairs
### Instruction Hierarchy

Structure prompts in a consistent order, from stable context to volatile input:

[System Context] → [Task Instruction] → [Examples] → [Input Data] → [Output Format]
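A minimal sketch of assembling a prompt in that order, skipping parts that are empty. Keeping the stable parts first also plays well with provider-side prefix caching (see Performance Optimization below).

```python
# Sketch: join the hierarchy's parts in order, stable content first,
# volatile input last; empty parts are dropped.
def assemble_prompt(system: str, instruction: str, examples: str = "",
                    input_data: str = "", output_format: str = "") -> str:
    parts = [system, instruction, examples, input_data, output_format]
    return "\n\n".join(p for p in parts if p)

print(assemble_prompt(
    system="You are a careful summarizer.",
    instruction="Summarize the text in 3 bullet points.",
    input_data="Text: ...",
))
```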
### Error Recovery

Build prompts that gracefully handle failures:
- Include fallback instructions
- Request confidence scores
- Ask for alternative interpretations when uncertain
- Specify how to indicate missing information
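A minimal sketch combining several of the points above: the prompt requests a confidence score and an explicit missing-information signal in JSON, so calling code can branch on weak answers. The schema shown is an illustrative choice, not a standard.

```python
# Sketch: an error-tolerant prompt whose JSON output lets callers detect
# and route failures instead of receiving a confident-sounding guess.
RECOVERY_PROMPT = """Answer the question using only the context below.

Context:
{context}

Question: {question}

Respond as JSON with these keys:
- "answer": your answer, or null if the context is insufficient
- "confidence": a number from 0 to 1
- "missing": what additional information you would need, or null
"""
```

Downstream code can then retry retrieval or escalate to a human whenever `answer` is null or `confidence` falls below a threshold.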
## Best Practices
- Be Specific: Vague prompts produce inconsistent results
- Show, Don't Tell: Examples are more effective than descriptions
- Test Extensively: Evaluate on diverse, representative inputs
- Iterate Rapidly: Small changes can have large impacts
- Monitor Performance: Track metrics in production
- Version Control: Treat prompts as code with proper versioning
- Document Intent: Explain why prompts are structured as they are
## Common Pitfalls
- Over-engineering: Starting with complex prompts before trying simple ones
- Example pollution: Using examples that don't match the target task
- Context overflow: Exceeding token limits with excessive examples (see the budget guard sketched below)
- Ambiguous instructions: Leaving room for multiple interpretations
- Ignoring edge cases: Not testing on unusual or boundary inputs
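One guard against the context-overflow pitfall is a hard token budget when packing few-shot examples. A minimal sketch, using word count as a crude stand-in for a real tokenizer:

```python
# Sketch: keep relevance-sorted examples until the budget is spent.
def fit_examples(examples: list[str], budget: int) -> list[str]:
    kept, used = [], 0
    for ex in examples:  # assumes examples are pre-sorted by relevance
        cost = len(ex.split())  # substitute a real token count here
        if used + cost > budget:
            break
        kept.append(ex)
        used += cost
    return kept
```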
## Integration Patterns

### With RAG Systems

```python
# Combine retrieved context with prompt engineering
prompt = f"""Given the following context:
{retrieved_context}
{few_shot_examples}
Question: {user_question}
Provide a detailed answer based solely on the context above. If the context doesn't contain enough information, explicitly state what's missing."""
```
### With Validation

```python
# Add self-verification step
prompt = f"""{main_task_prompt}
After generating your response, verify it meets these criteria:
1. Answers the question directly
2. Uses only information from provided context
3. Cites specific sources
4. Acknowledges any uncertainty
If verification fails, revise your response."""
```
## Performance Optimization

### Token Efficiency
- Remove redundant words and phrases
- Use abbreviations consistently after first definition
- Consolidate similar instructions
- Move stable content to system prompts
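A quick way to quantify a trim pass is to count tokens before and after. The sketch below assumes the `tiktoken` package; the right encoding depends on your model and provider.

```python
# Sketch: measure token savings from rewriting a verbose instruction.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding varies by model

verbose = ("Please kindly take the following article and produce for me a "
           "summary of it in the form of exactly three bullet points.")
tight = "Summarize the article in exactly 3 bullet points."

print(len(enc.encode(verbose)), "->", len(enc.encode(tight)))
```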
### Latency Reduction
- Minimize prompt length without sacrificing quality
- Use streaming for long-form outputs
- Cache common prompt prefixes
- Batch similar requests when possible
## Resources
- `references/few-shot-learning.md`: Deep dive on example selection and construction
- `references/chain-of-thought.md`: Advanced reasoning elicitation techniques
- `references/prompt-optimization.md`: Systematic refinement workflows
- `references/prompt-templates.md`: Reusable template patterns
- `references/system-prompts.md`: System-level prompt design
- `assets/prompt-template-library.md`: Battle-tested prompt templates
- `assets/few-shot-examples.json`: Curated example datasets
- `scripts/optimize-prompt.py`: Automated prompt optimization tool
## Success Metrics
Track these KPIs for your prompts:
- Accuracy: Correctness of outputs
- Consistency: Reproducibility across similar inputs (probe sketched below)
- Latency: Response time (P50, P95, P99)
- Token Usage: Average tokens per request
- Success Rate: Percentage of valid outputs
- User Satisfaction: Ratings and feedback
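Most of these metrics reduce to simple aggregation over logged runs; consistency is the least obvious, so here is a minimal sketch, again assuming a `generate` callable that wraps your LLM client.

```python
# Sketch: consistency as agreement with the modal output across
# repeated runs of the same prompt.
from collections import Counter

def consistency(generate, prompt: str, runs: int = 5) -> float:
    outputs = [generate(prompt).strip() for _ in range(runs)]
    _, modal_count = Counter(outputs).most_common(1)[0]
    return modal_count / runs
```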
## Next Steps
- Review the prompt template library for common patterns
- Experiment with few-shot learning for your specific use case
- Implement prompt versioning and A/B testing
- Set up automated evaluation pipelines
- Document your prompt engineering decisions and learnings