Who this is for: Developers building MCP servers to integrate external APIs or services with LLMs, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
What This Skill Does
Provides a comprehensive guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to effectively interact with external services through well-designed tools.
Core Capabilities
- Agent-Centric Design — Build for workflows, not just API endpoints
- Context Optimization — Return high-signal information, not exhaustive data dumps
- Actionable Errors — Design error messages that guide agents toward correct usage
- Evaluation-Driven — Create realistic evaluation scenarios early
- Language Support — Python (FastMCP) and Node/TypeScript (MCP SDK) guides
- Quality Checklists — Verify implementation quality before deployment
Usage
Start MCP Server Project
Help me build an MCP server for [API/service name]
Guide me through creating an MCP server step by step
Language-Specific Guidance
Show me the Python MCP implementation guide
What are the TypeScript MCP best practices?
Quality Review
Review my MCP server code for quality issues
Check if my MCP tools follow best practices
Example Workflow
User: "Help me build an MCP server for the GitHub API"
Process:
- Study MCP protocol documentation
- Review Python or TypeScript SDK docs
- Analyze GitHub API exhaustively
- Create comprehensive implementation plan
- Implement core infrastructure first
- Build tools systematically
- Review against quality checklist
- Create evaluation scenarios
Output:
MARKDOWN
## MCP Server Development Plan
### Phase 1: Research Complete
- MCP protocol studied
- Python SDK reviewed
- GitHub API documented
### Phase 2: Implementation
- Core utilities created (API helpers, error handling)
- Tools implemented: list_repos, create_issue, get_pull_request
- Input validation with Pydantic
- Response formats: JSON and Markdown
### Phase 3: Quality Review
✓ DRY principle followed
✓ Error handling comprehensive
✓ Type safety complete
✓ Documentation thorough
### Phase 4: Evaluations
10 complex scenarios created for testing
Four-Phase Development Process
Phase 1: Deep Research and Planning
| Step | Description |
|---|---|
| Agent-Centric Design | Build for workflows, not just API endpoints |
| Context Optimization | Make every token count; offer both concise and detailed response options |
| Actionable Errors | Guide agents toward correct usage patterns |
| Natural Task Subdivisions | Tool names reflect how humans think about tasks |
| Evaluation-Driven | Create realistic scenarios early |
Phase 2: Implementation
| Step | Description |
|---|---|
| Project Structure | Single `.py` file or modules (Python); `package.json` + `tsconfig.json` (TypeScript) |
| Core Infrastructure | API helpers, error handling, response formatting, pagination |
| Tool Implementation | Input schemas, docstrings, logic, annotations |
| Language Best Practices | Type hints, async/await, proper imports, build process |
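
The pagination part of the core infrastructure can be sketched as one shared helper that every list-style tool calls. This is a minimal, dependency-free illustration; the name `paginate` and the response keys (`items`, `total`, `has_more`, `next_offset`) are assumptions chosen for this example and are not part of either MCP SDK.

```python
from typing import Any


def paginate(items: list[Any], limit: int = 20, offset: int = 0) -> dict[str, Any]:
    """Return one page of results plus the metadata an agent needs to continue."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if offset < 0:
        raise ValueError("offset must be 0 or greater")

    page = items[offset : offset + limit]
    next_offset = offset + len(page)
    has_more = next_offset < len(items)
    return {
        "items": page,
        "total": len(items),
        "has_more": has_more,
        # Only advertise a next offset when there is something left to fetch.
        "next_offset": next_offset if has_more else None,
    }
```

Returning `has_more` and `next_offset` explicitly lets an agent decide whether another call is worth the tokens without guessing from the page size.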
Phase 3: Review and Refine
| Check | Description |
|---|---|
| DRY Principle | No duplicated code between tools |
| Composability | Shared logic extracted into functions |
| Consistency | Similar operations return similar formats |
| Error Handling | All external calls have error handling |
| Type Safety | Full type coverage |
| Documentation | Comprehensive docstrings/descriptions |
Phase 4: Create Evaluations
| Requirement | Description |
|---|---|
| Independent | Not dependent on other questions |
| Read-only | Only non-destructive operations |
| Complex | Requires multiple tool calls |
| Realistic | Based on real use cases |
| Verifiable | Single, clear correct answer |
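
One way to make the "Verifiable" requirement concrete is to store each evaluation as a question with a single expected answer and score by exact match. The sketch below is an assumption about how such a check could look; `EvalCase`, `run_evaluation`, and the `answer_fn` callback (standing in for an agent connected to your server) are hypothetical names, not an existing harness.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    question: str  # Self-contained and read-only; should need several tool calls
    expected: str  # The single, clearly correct answer


def normalize(text: str) -> str:
    """Compare answers case-insensitively and ignore surrounding whitespace."""
    return text.strip().casefold()


def run_evaluation(cases: list[EvalCase], answer_fn: Callable[[str], str]) -> float:
    """Return the fraction of cases where the produced answer matches exactly."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    correct = sum(
        1
        for case in cases
        if normalize(answer_fn(case.question)) == normalize(case.expected)
    )
    return correct / len(cases)
```

Because every case is independent, the cases can run in any order, and a failing case points at a specific tool or description to improve.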
Tool Design Principles
| Principle | Description |
|---|---|
| Build for Workflows | Consolidate related operations, enable complete tasks |
| Optimize for Context | High-signal info, human-readable identifiers |
| Actionable Errors | Suggest next steps, make errors educational |
| Natural Subdivisions | Tool names reflect human task thinking |
| Evaluation-Driven | Let agent feedback drive improvements |
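
The "Actionable Errors" principle can be illustrated with a small helper that turns an HTTP status code into a message naming the next step. This is a sketch under assumptions: `actionable_error` is a hypothetical helper, and the wording of each hint is illustrative and should be adapted to the API you wrap.

```python
def actionable_error(status: int, resource: str) -> str:
    """Turn an HTTP status code into a message that tells the agent what to try next."""
    hints = {
        401: "Authentication failed. Check that the API token is set and has not expired.",
        403: f"Access to {resource} was denied. Confirm the token has permission for it.",
        404: (
            f"{resource} was not found. List the available items first "
            "and retry with a valid identifier."
        ),
        429: "Rate limit reached. Wait before retrying, or request fewer items per call.",
    }
    default = (
        f"Request for {resource} failed with status {status}. "
        "Retry once, then report the status code."
    )
    return f"Error: {hints.get(status, default)}"
```

For example, `actionable_error(404, "Repository 'octo/demo'")` tells the agent to list the available items before retrying, instead of returning a bare "Not Found".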
Input/Output Design
Input Validation
- Python: Pydantic v2 models with `model_config`
- TypeScript: Zod schemas with `.strict()`
- Include constraints (min/max, regex, ranges)
- Provide clear field descriptions with examples
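
To show what constraints and example-bearing error messages look like in practice, here is a dependency-free sketch that uses a standard-library dataclass in place of Pydantic; a Pydantic v2 model would express the same rules declaratively. The `ListIssuesInput` class, its fields, and the limits are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Assumed identifier shape for this example: "owner/name".
REPO_PATTERN = re.compile(r"[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+")


@dataclass(frozen=True)
class ListIssuesInput:
    """Hypothetical input for a list-issues tool."""

    repo: str        # "owner/name", e.g. "octocat/hello-world"
    limit: int = 20  # Number of issues to return, 1 to 100

    def __post_init__(self) -> None:
        # Regex constraint: reject anything that is not exactly "owner/name".
        if not REPO_PATTERN.fullmatch(self.repo):
            raise ValueError(
                "repo must look like 'owner/name' "
                f"(e.g. 'octocat/hello-world'), got {self.repo!r}"
            )
        # Range constraint: keep responses bounded.
        if not 1 <= self.limit <= 100:
            raise ValueError(f"limit must be between 1 and 100, got {self.limit}")
```

Each error message repeats the expected format with an example, so a failed validation doubles as guidance for the next call.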
Response Formats
- JSON: Structured data for parsing
- Markdown: Human-readable output
- Configurable detail levels (concise/detailed)
- Character limits and truncation strategies
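
The response-format options above can be combined in one shared formatter. The following is a minimal sketch, not a prescribed implementation: `format_response`, the `id`/`title`/`body` item keys, and the `CHARACTER_LIMIT` value are all assumptions made for this example.

```python
import json

CHARACTER_LIMIT = 25_000  # Assumed cap; tune it to the client's context budget


def format_response(
    items: list[dict],
    fmt: str = "markdown",
    detailed: bool = False,
    limit: int = CHARACTER_LIMIT,
) -> str:
    """Render items as JSON or Markdown, dropping trailing items while over the limit."""
    if fmt not in ("json", "markdown"):
        raise ValueError("fmt must be 'json' or 'markdown'")

    def render(subset: list[dict]) -> str:
        truncated = len(subset) < len(items)
        if fmt == "json":
            rows = subset if detailed else [{"id": i["id"], "title": i["title"]} for i in subset]
            return json.dumps({"items": rows, "truncated": truncated})
        lines = []
        for i in subset:
            lines.append(f"- {i['title']} ({i['id']})")
            if detailed:
                lines.append(f"  {i.get('body', '')}")
        if truncated:
            lines.append(f"Showing {len(subset)} of {len(items)}; narrow the query or page for more.")
        return "\n".join(lines)

    subset = list(items)
    text = render(subset)
    # Drop items from the end until the text fits or nothing is left to drop.
    while len(text) > limit and subset:
        subset.pop()
        text = render(subset)
    return text
```

Truncating by whole items keeps the output well-formed in both formats, and the truncation note tells the agent how to get the rest. With a very small `limit`, the note alone may still exceed the cap.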
Tool Annotations
PYTHON
annotations = {
    "readOnlyHint": True,      # Tool does not modify its environment
    "destructiveHint": False,  # Tool performs no destructive updates
    "idempotentHint": True,    # Repeated calls with the same arguments have no additional effect
    "openWorldHint": True,     # Tool interacts with external systems
}
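
When a tool wraps a single REST call, the HTTP method gives a reasonable first guess for these hints. The helper below is a heuristic sketch only: `annotations_for` is a hypothetical function, the mapping is an assumption based on standard HTTP method semantics, and each result should be reviewed against what the tool actually does.

```python
def annotations_for(http_method: str) -> dict[str, bool]:
    """Suggest MCP tool annotation hints from the HTTP method a tool wraps."""
    method = http_method.upper()
    if method not in {"GET", "HEAD", "POST", "PUT", "PATCH", "DELETE"}:
        raise ValueError(f"unsupported HTTP method: {http_method!r}")
    return {
        "readOnlyHint": method in {"GET", "HEAD"},
        # Overwriting or deleting existing data counts as destructive here.
        "destructiveHint": method in {"PUT", "PATCH", "DELETE"},
        # Per HTTP semantics, repeating these methods has no additional effect.
        "idempotentHint": method in {"GET", "HEAD", "PUT", "DELETE"},
        # The tool talks to an external API in every case.
        "openWorldHint": True,
    }
```

For example, `annotations_for("GET")` marks the tool as read-only and idempotent, while `annotations_for("POST")` marks it as neither.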
Testing Guidelines
Important: MCP servers are long-running processes that wait for requests indefinitely. Do not run one directly in your main process, because the call will block until the server exits.
| Method | Description |
|---|---|
| Evaluation Harness | Recommended approach; manages the server process for you over stdio transport |
| tmux | Run the server in tmux to keep it outside the main process |
| Timeout | Use `timeout 5s python server.py` for quick tests |
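
The timeout approach can also be scripted, which is useful where the `timeout` command is unavailable. The sketch below starts a server as a child process, waits briefly, and stops it; `smoke_test` is a hypothetical helper and `server.py` in the usage note is the same placeholder file name as in the table.

```python
import subprocess
import sys


def smoke_test(command: list[str], seconds: float = 5.0) -> str:
    """Start a server command and report whether it is still alive after a short wait."""
    # Keep stdin open as a pipe so a stdio server does not see end-of-file and quit.
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.DEVNULL)
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Still waiting for requests after the timeout: expected for a healthy server.
        proc.kill()
        proc.wait()
        return "still running (stopped after timeout)"
    finally:
        if proc.stdin:
            proc.stdin.close()
    # The process ended on its own, which usually means a startup error.
    return f"exited early with code {proc.returncode}"
```

A call such as `smoke_test([sys.executable, "server.py"])` returns within about five seconds either way; an early exit usually points at an import or configuration error, which the child process prints to stderr.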
Related Use Cases
- Integrating external APIs with Claude Code
- Building custom tool servers for specific workflows
- Creating evaluable MCP servers with test scenarios
- Following MCP protocol best practices
- Implementing Python or TypeScript MCP servers