
AI Coding Assistants 2025: Cursor vs Copilot vs Windsurf vs Claude Code

A comprehensive comparison of AI coding assistants in 2025—Cursor, GitHub Copilot, Windsurf, Claude Code, and more. Features, pricing, use cases, and how to maximize productivity with each tool.


The AI Coding Revolution

AI coding assistants have become indispensable for modern software development. In 2025, these tools have evolved from simple autocomplete to autonomous agents that can understand entire codebases, implement features across multiple files, run tests, and create pull requests.

From industry surveys: "78% of developers now use AI coding tools regularly, with productivity gains of 30-50% reported across various tasks."

This guide provides a comprehensive comparison of the leading AI coding assistants, helping you choose the right tool for your workflow.

Two Categories: IDE vs Terminal

AI coding assistants fall into two distinct categories:

IDE-Based Tools

Full graphical IDE experience with visual diff previews, inline suggestions, and GUI-based interactions.

| Tool | Base | Key Advantage |
| --- | --- | --- |
| Cursor | VS Code fork | Composer multi-file editing |
| GitHub Copilot | VS Code/JetBrains plugin | GitHub PR integration |
| Windsurf | VS Code fork | Cascade session memory |

Best for: Visual developers, those who prefer GUI interactions, teams needing shared IDE settings.

Terminal-Based Tools

CLI-native interfaces for developers who live in the terminal.

| Tool | Interface | Key Advantage |
| --- | --- | --- |
| OpenAI Codex | CLI + IDE extension | GPT-5.2-Codex model, parallel agents |
| Claude Code | CLI + REPL | 200K context, deep reasoning |
| Gemini CLI | CLI | Free tier, multimodal |
| Aider | CLI | Git-native workflow |

Best for: DevOps engineers, vim/emacs users, automation scripts, CI/CD integration.

Open-Source Agentic Extensions

VS Code extensions providing autonomous coding with any model provider.

| Tool | Based On | Key Advantage |
| --- | --- | --- |
| Cline | Original | Plan/Act modes, MCP support, browser use |
| Roo Code | Cline fork | Multi-mode (Code/Architect/Debug), boomerang tasks |
| Kilo Code | Roo Code fork | Orchestrator mode, Memory Bank, $20 free credits |

Best for: Developers wanting open-source flexibility with any AI provider.

Specialized Code Models

Open-source models optimized specifically for code generation.

| Model | Parameters | SWE-bench | Best For |
| --- | --- | --- | --- |
| Qwen 2.5 Coder | 0.5B-32B | 69.6% (32B) | Local deployment, 92 languages |
| DeepSeek Coder V2 | 16B/236B | 68.4% | Cost-effective, MoE |
| CodeLlama | 7B-70B | ~45% | Meta ecosystem |

Quick Comparison

| Tool | Type | Best For | Model Access | Pricing | Key Feature |
| --- | --- | --- | --- | --- | --- |
| Cursor | IDE | Power users, multi-file edits | Claude, GPT-4o, custom | $20/month | Composer mode |
| GitHub Copilot | IDE | GitHub integration, teams | GPT-4o, Claude | $10/month | Agent mode + PR creation |
| Windsurf | IDE | Session continuity, budget | GPT-4o, Claude | $15/month | Cascade memory system |
| OpenAI Codex | Terminal/IDE | OpenAI ecosystem | GPT-5.2-Codex | ChatGPT Plus | Parallel cloud agents |
| Claude Code | Terminal | Large codebases, DevOps | Claude 4 | Usage-based | 200K context, CLI-native |
| Gemini CLI | Terminal | Google ecosystem, free use | Gemini 2.5/3 | Free tier | Multimodal support |
| Cline | VS Code Ext | Open-source flexibility | Any (configurable) | Free (OSS) | Plan/Act modes |
| Roo Code | VS Code Ext | Multi-mode workflows | Any (configurable) | Free (OSS) | Boomerang tasks |
| Kilo Code | VS Code Ext | Free credits, orchestration | 500+ models | Free + credits | Memory Bank |
| Aider | Terminal | Git workflows, open source | Any (configurable) | Free (OSS) | Git-native commits |

Cursor

Overview

Cursor is a VS Code fork that pioneered IDE-native AI with deep codebase understanding. Its Composer mode enables multi-file, autonomous code generation that goes far beyond autocomplete.

Key stats:

  • 40,000+ companies using Cursor
  • Autocomplete powered by Supermaven (fastest in class)
  • Multi-model support: Claude 4, GPT-4o, and custom models

Features Deep Dive

Composer Mode

Composer is Cursor's flagship feature—an agentic coding assistant that can:

  • Understand feature requests in natural language
  • Modify multiple files simultaneously
  • Run terminal commands
  • Execute tests and iterate on failures
  • Create entire features autonomously
Code
User: "Add user authentication with JWT tokens, including
      login/logout endpoints, middleware, and tests"

Composer:
1. Creates auth/jwt.ts with token generation/verification
2. Adds routes/auth.ts with login/logout endpoints
3. Creates middleware/auth.ts for protected routes
4. Updates app.ts to register new routes
5. Generates tests/auth.test.ts
6. Runs tests, fixes any failures

Tab Autocomplete

Cursor's autocomplete (powered by Supermaven) predicts multi-line completions:

Python
# Type: "def calculate_"
# Cursor suggests:
def calculate_total_price(items: list[dict], tax_rate: float = 0.08) -> float:
    """Calculate total price including tax."""
    subtotal = sum(item['price'] * item['quantity'] for item in items)
    return subtotal * (1 + tax_rate)

Codebase Indexing

Cursor indexes your entire codebase for context-aware suggestions:

Python
# Cursor understands your existing patterns
# If your codebase uses a specific ORM pattern:

# Type: "def get_user_by_"
# Cursor suggests based on YOUR existing code:
def get_user_by_email(email: str) -> User | None:
    """Fetch user by email using existing repository pattern."""
    return UserRepository.find_one({"email": email})

Chat with @ References

Reference specific files, functions, or documentation:

Code
@auth.ts @middleware.ts How does the current auth flow work?

@docs Can you explain how to use the payment API?

@git-diff Review these changes for security issues

Cursor Configuration

Configure Cursor's behavior through settings. These control indexing depth, model selection, and context management. Getting these right significantly impacts response quality—more context helps the model understand your codebase, but too much can slow things down.

JSON
// .cursor/settings.json
{
  "cursor.cpp.enableIndexing": true,
  "cursor.general.gitGraphEnabled": true,
  "cursor.chat.showSuggestedFiles": true,
  "cursor.composer.enabled": true,
  "cursor.autocomplete.enabled": true,
  "cursor.autocomplete.useSupermaven": true,

  // Model preferences
  "cursor.models.default": "claude-sonnet-4",
  "cursor.models.composer": "claude-sonnet-4",

  // Context settings
  "cursor.context.maxFiles": 20,
  "cursor.context.includeOpenTabs": true
}

Key settings explained:

  • enableIndexing: Indexes your codebase for semantic search. Essential for "find similar code" and context-aware suggestions. Disable only for huge monorepos where indexing is slow.
  • useSupermaven: Uses Supermaven's fast autocomplete model. Significantly faster than GPT-based completion but may be less accurate for complex patterns.
  • models.default vs models.composer: Use a faster model (Sonnet) for chat, and the same or stronger for Composer's multi-file edits where accuracy matters more than speed.
  • maxFiles: How many files Cursor includes as context. Higher values give better understanding but use more tokens and cost more.

Custom Instructions

The .cursorrules file is your secret weapon for consistent code generation. It tells Cursor about your project's conventions, tech stack, and coding standards. The model reads this before every interaction, so be specific—vague rules get vague results.

Create .cursorrules for project-specific behavior:

Markdown
# .cursorrules

## Project Context
This is a TypeScript backend using:
- Express.js for routing
- Prisma for database ORM
- Jest for testing
- Zod for validation

## Code Style
- Use functional programming patterns where possible
- Prefer `const` over `let`
- Use early returns to reduce nesting
- All functions must have JSDoc comments
- Error handling with custom AppError class

## Testing
- Every new function needs unit tests
- Use test factories for mock data
- Integration tests for API endpoints

## Avoid
- Any use of `any` type
- Console.log in production code
- Hardcoded strings (use constants)

Pricing

| Plan | Price | Features |
| --- | --- | --- |
| Free | $0 | 2,000 completions, 50 slow requests |
| Pro | $20/month | Unlimited completions, 500 fast requests |
| Business | $40/user/month | Team features, admin controls, SSO |

GitHub Copilot

Overview

GitHub Copilot is the most widely adopted AI coding assistant, with deep GitHub integration and a new agent mode that can autonomously create pull requests.

Key stats:

  • 1.8 million paid subscribers
  • Used by 77,000+ organizations
  • Powers Copilot Workspace and agent mode

Features Deep Dive

Agent Mode

Copilot's agent mode can take a GitHub issue and autonomously:

  1. Analyze the issue requirements
  2. Explore the codebase for context
  3. Write the implementation
  4. Create and run tests
  5. Open a pull request
Bash
# Trigger agent mode from CLI
gh copilot agent --issue 123

# Or from GitHub UI:
# Click "Start agent" on any issue

Inline Suggestions

Context-aware completions as you type:

TypeScript
// Type a comment describing what you want:
// Function to validate email addresses using regex

// Copilot suggests:
function validateEmail(email: string): boolean {
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return emailRegex.test(email);
}

Copilot Chat

Integrated chat with codebase awareness:

Code
/explain What does this function do?
/fix There's a bug in the sorting logic
/tests Generate unit tests for the selected code
/doc Add documentation to this class

Pull Request Summaries

Auto-generate PR descriptions:

Markdown
## Summary
This PR adds user authentication with JWT tokens.

## Changes
- Added `auth/jwt.ts` for token management
- Created login/logout API endpoints
- Implemented auth middleware
- Added comprehensive test coverage

## Testing
- [x] Unit tests pass
- [x] Integration tests pass
- [x] Manual testing completed

Copilot Configuration

Fine-tune Copilot's behavior through VS Code settings. The most important setting is which file types get suggestions—enable for code, often disable for prose where suggestions can be distracting.

JSON
// VS Code settings.json
{
  "github.copilot.enable": {
    "*": true,
    "markdown": true,
    "plaintext": false,
    "yaml": true
  },
  "github.copilot.advanced": {
    "temperature": 0.3,
    "top_p": 0.95,
    "max_tokens": 500
  },
  "github.copilot.chat.localeOverride": "en",
  "github.copilot.editor.enableAutoCompletions": true
}

Understanding the settings:

  • enable by file type: The "*": true enables for all files, then you can override specific types. Disable for plaintext to avoid suggestions in notes/docs.
  • temperature: Controls randomness. Lower (0.1-0.3) for more deterministic code; higher (0.7+) for creative suggestions. 0.3 is a good default for most coding.
  • top_p: Nucleus sampling parameter. 0.95 means consider tokens in the top 95% probability mass. Lower values make output more focused.
  • max_tokens: Maximum length of suggestions. 500 is enough for most functions; increase for longer code blocks.
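For intuition, the same three knobs appear on any OpenAI-compatible completion call, so you can experiment with them directly. A minimal sketch using the official openai npm package (the model and prompt are illustrative):

TypeScript
// Sketch: temperature / top_p / max_tokens map directly onto an
// OpenAI-compatible chat completion request.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const completion = await client.chat.completions.create({
  model: "gpt-4o",
  temperature: 0.3, // low randomness: repeatable, deterministic code
  top_p: 0.95,      // nucleus sampling over the top 95% probability mass
  max_tokens: 500,  // cap the suggestion at roughly one function
  messages: [
    { role: "user", content: "Write a TypeScript function that validates email addresses." },
  ],
});

console.log(completion.choices[0].message.content);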

Custom Instructions

Copilot reads .github/copilot-instructions.md to understand your project's conventions. Unlike Cursor's .cursorrules, this file lives in the .github folder (so it can be repo-wide) and follows Markdown format. Include your tech stack, coding standards, and patterns you want Copilot to follow.

Create .github/copilot-instructions.md:

Markdown
# Copilot Instructions

## Language & Framework
- TypeScript with strict mode
- React 18 with hooks
- TanStack Query for data fetching
- Tailwind CSS for styling

## Patterns
- Use custom hooks for reusable logic
- Prefer composition over inheritance
- Use discriminated unions for state
- Error boundaries for error handling

## Code Generation
- Include TypeScript types for all functions
- Add JSDoc for public APIs
- Generate tests using Vitest
- Use `describe`/`it` test structure

Copilot CLI

Bash
# Install
gh extension install github/gh-copilot

# Explain a command
gh copilot explain "git rebase -i HEAD~3"

# Suggest a command
gh copilot suggest "find large files in git history"

# Ask questions
gh copilot ask "How do I squash commits?"

Pricing

| Plan | Price | Features |
| --- | --- | --- |
| Free | $0 | 2,000 completions/month, limited chat |
| Individual | $10/month | Unlimited completions, chat, CLI |
| Business | $19/user/month | Organization policies, audit logs |
| Enterprise | $39/user/month | SSO, IP indemnity, advanced security |

Windsurf

Overview

Windsurf (formerly Codeium) was acquired by Cognition (makers of Devin) in 2025 and offers the best session memory with its Cascade system. It's ideal for developers who value context continuity across sessions.

Key stats:

  • Acquired by Cognition
  • Best-in-class session memory
  • Clear diff previews before applying changes

Features Deep Dive

Cascade System

Cascade maintains context across your entire coding session:

Code
Session 1 (Morning):
"I'm building a REST API for a blog platform"
[Discusses architecture, creates initial routes]

Session 2 (Afternoon):
"Let's add comments to posts"
[Cascade remembers the blog context, existing routes, patterns]

Session 3 (Next day):
"Add authentication"
[Still has full context of blog platform, existing models]

Flows

Pre-built workflows for common tasks:

Code
/flow create-api
→ Walks through: endpoint design, validation, testing, docs

/flow refactor
→ Analyzes code, suggests improvements, applies changes

/flow debug
→ Examines error, traces cause, suggests fix

Diff Preview

See exactly what will change before applying:

Code
// Windsurf shows clear diffs:
- function getUser(id) {
-   return users.find(u => u.id === id);
- }
+ function getUser(id: string): User | undefined {
+   if (!id) {
+     throw new ValidationError('User ID is required');
+   }
+   return users.find(u => u.id === id);
+ }

Windsurf Configuration

Windsurf's configuration controls the Cascade memory system and completion behavior. The key differentiator is memoryDuration: set it to "session" to retain context while the current editor session stays open, or "project" to persist memory across sessions (useful for long-running projects).

JSON
// windsurf.config.json
{
  "cascade": {
    "enabled": true,
    "memoryDuration": "session",
    "contextWindow": 128000
  },
  "completions": {
    "enabled": true,
    "delay": 200,
    "multiline": true
  },
  "chat": {
    "model": "gpt-4o",
    "temperature": 0.2
  },
  "flows": {
    "enableBuiltIn": true,
    "customFlowsPath": ".windsurf/flows"
  }
}

Custom Flows

Create custom workflows:

YAML
# .windsurf/flows/create-component.yaml
name: Create React Component
description: Generate a new React component with tests and stories
steps:
  - prompt: "What should this component do?"
    variable: purpose

  - prompt: "What props does it need?"
    variable: props

  - action: generate
    template: |
      Create a React component that:
      - Purpose: {{purpose}}
      - Props: {{props}}
      - Include TypeScript types
      - Add unit tests
      - Add Storybook story

  - action: create_files
    files:
      - "src/components/{{name}}/index.tsx"
      - "src/components/{{name}}/{{name}}.test.tsx"
      - "src/components/{{name}}/{{name}}.stories.tsx"

Pricing

| Plan | Price | Features |
| --- | --- | --- |
| Free | $0 | Unlimited basic completions |
| Pro | $15/month | Cascade, Flows, premium models |
| Team | $30/user/month | Shared memory, team flows, admin |

Claude Code

Overview

Claude Code is Anthropic's terminal-native coding assistant. With a 200K token context window and deep reasoning capabilities, it excels at understanding entire codebases and complex refactoring tasks.

Key stats:

  • 80.9% on SWE-bench (state-of-the-art)
  • 200K token context (entire large codebases)
  • Terminal-first interface

Features Deep Dive

Codebase Understanding

Claude Code can ingest entire repositories:

Bash
# Initialize Claude Code in your project
claude-code init

# Ask about the codebase
claude-code ask "How does the authentication system work?"

# Get architecture overview
claude-code ask "Explain the overall architecture and key components"

Multi-File Editing

Edit multiple files with a single command:

Bash
claude-code edit "Rename the User model to Account across the entire codebase"

# Claude Code:
# 1. Finds all references to User
# 2. Updates model definition
# 3. Updates all imports
# 4. Updates all usages
# 5. Updates tests
# 6. Shows diff for approval

Agentic Tasks

Run complex multi-step tasks:

Bash
claude-code task "Add a caching layer for the API responses"

# Claude Code autonomously:
# 1. Analyzes current API structure
# 2. Determines best caching strategy
# 3. Implements Redis caching layer
# 4. Adds cache invalidation
# 5. Updates existing endpoints
# 6. Adds configuration options
# 7. Creates tests
# 8. Documents the changes

Claude Code Configuration

Configure Claude Code through a YAML file at your project root. The context section is critical—it determines which files Claude Code indexes and can access. Be strategic: include source and test files, exclude dependencies and build artifacts. The hooks section automates pre/post-edit checks, catching type errors before you see the diff.

YAML
# claude-code.yaml
project:
  name: my-app
  type: typescript

context:
  include:
    - "src/**/*.ts"
    - "tests/**/*.ts"
    - "*.json"
  exclude:
    - "node_modules"
    - "dist"
    - "*.log"

preferences:
  model: claude-sonnet-4
  max_tokens: 8192
  temperature: 0.2

  style:
    typescript:
      strict: true
      prefer_interfaces: true
      use_type_imports: true

  testing:
    framework: vitest
    coverage_threshold: 80

  documentation:
    style: jsdoc
    require_for_public: true

hooks:
  pre_edit:
    - "npm run typecheck"
  post_edit:
    - "npm run lint:fix"
    - "npm run test"

Claude Code API Integration

Use Claude Code programmatically:

Python
from claude_code import ClaudeCode

# Initialize
cc = ClaudeCode(
    api_key="your-api-key",
    project_path="/path/to/project"
)

# Index the codebase
cc.index()

# Ask questions
response = cc.ask(
    "What are the main API endpoints and their purposes?"
)
print(response.answer)
print(response.relevant_files)

# Edit files
result = cc.edit(
    instruction="Add input validation to all API endpoints",
    files=["src/routes/*.ts"],
    dry_run=True  # Preview changes first
)

for change in result.changes:
    print(f"File: {change.file}")
    print(f"Diff:\n{change.diff}")

# Apply changes
if input("Apply changes? (y/n): ") == "y":
    cc.apply(result)

Shell Integration

Bash
# Add to .bashrc/.zshrc
alias cc="claude-code"
alias cca="claude-code ask"
alias cce="claude-code edit"
alias cct="claude-code task"

# Quick commands
cca "What does this function do?" -f src/utils/parser.ts
cce "Add error handling" -f src/api/routes.ts
cct "Write tests for the auth module"

Pricing

Claude Code uses usage-based pricing:

| Usage | Cost |
| --- | --- |
| Input tokens | $3/million tokens |
| Output tokens | $15/million tokens |
| Typical session | $0.10-0.50 |
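A quick back-of-the-envelope estimate against those rates (a sketch; the token counts are illustrative assumptions, not measured values):

TypeScript
// Sketch: estimate one Claude Code session's cost from the rates above.
const INPUT_RATE = 3 / 1_000_000;   // $3 per 1M input tokens
const OUTPUT_RATE = 15 / 1_000_000; // $15 per 1M output tokens

const inputTokens = 60_000; // codebase context plus prompts (assumed)
const outputTokens = 8_000; // generated diffs and explanations (assumed)

const cost = inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
console.log(`Estimated session cost: $${cost.toFixed(2)}`); // ≈ $0.30

A heavy refactoring session with more context lands toward the top of the $0.10-0.50 range; quick questions cost pennies.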

OpenAI Codex

Overview

OpenAI Codex represents OpenAI's ambitious entry into the agentic coding space. Originally launched as a code completion model (the engine behind GitHub Copilot), Codex has evolved into a full-fledged autonomous coding agent that can understand entire codebases, execute multi-step tasks, and even run multiple parallel agents in the cloud.

The current iteration, powered by GPT-5.2-Codex, is specifically optimized for long-horizon coding tasks—meaning it excels at multi-step problems that require sustained context and planning over many tool calls and file modifications.

Key stats:

  • Cloud-based parallel agent execution (up to 5 concurrent agents)
  • Powered by GPT-5.2-Codex (optimized for long-horizon coding)
  • Available as CLI, VS Code extension, and integrated in Cursor/Windsurf
  • Included in ChatGPT Plus/Pro/Team/Enterprise subscriptions
  • Sandbox execution environment for safe code running
  • Native integration with GitHub for PR creation

The Evolution of Codex

Understanding Codex's history helps contextualize its current capabilities:

2021 - Codex Original: Launched as a fine-tuned GPT-3 model for code completion. Powered the original GitHub Copilot. Limited to autocomplete suggestions.

2023 - GPT-4 Code Interpreter: Code execution capabilities added via ChatGPT. Users could upload files and run Python code in a sandbox.

2024 - ChatGPT Canvas: Introduced a dedicated coding interface within ChatGPT with side-by-side editing and iterative refinement.

2025 - Codex Agent (Current): Full autonomous coding agent with:

  • Cloud-based execution environment
  • Parallel agent support
  • Native IDE integration
  • Git/GitHub workflow integration
  • Persistent project understanding

Architecture Deep Dive

Codex operates fundamentally differently from local coding assistants:

Code
Traditional (Cursor, Cline):
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ Your IDE    │───>│   LLM API   │───>│   Response  │
│ (local)     │    │   (cloud)   │    │   (local)   │
└─────────────┘    └─────────────┘    └─────────────┘
     ^                                       │
     └───────────────────────────────────────┘
              (runs locally)

Codex Architecture:
┌─────────────┐    ┌─────────────────────────────┐
│ Your IDE    │───>│      Codex Cloud            │
│ (local)     │    │  ┌─────────────────────┐   │
└─────────────┘    │  │   GPT-5.2-Codex     │   │
     ^             │  │   ┌───────────────┐ │   │
     │             │  │   │ Sandbox VM    │ │   │
     │             │  │   │ - Your code   │ │   │
     │             │  │   │ - Git clone   │ │   │
     │             │  │   │ - Test runner │ │   │
     │             │  │   └───────────────┘ │   │
     │             │  └─────────────────────┘   │
     │             └─────────────────────────────┘
     └──────── (PR/diff returned) ──────────────┘

This cloud-native architecture enables several unique capabilities:

  1. Parallel execution: Multiple agents can work simultaneously without competing for local resources
  2. Isolated environments: Each task runs in a fresh sandbox with your code cloned
  3. Full execution: Agents can actually run tests, builds, and other commands
  4. No local compute: Your machine isn't taxed during heavy AI processing

Features Deep Dive

Parallel Cloud Agents

Codex's most distinctive feature is running multiple agents simultaneously in the cloud. Each agent gets its own isolated environment:

Bash
# Start multiple agents working in parallel
codex task "Implement user authentication with JWT" --background &
codex task "Write comprehensive unit tests for the User model" --background &
codex task "Update API documentation with new endpoints" --background &

# Check status of all running agents
codex status

# Output:
# Agent 1: "Implement user authentication" - 67% complete
#   └─ Created: auth/jwt.ts, auth/middleware.ts
#   └─ Currently: Writing login route tests
#
# Agent 2: "Unit tests for User model" - 45% complete
#   └─ Created: tests/user.model.test.ts
#   └─ Currently: Testing edge cases
#
# Agent 3: "API documentation" - 89% complete
#   └─ Updated: docs/api.md, docs/auth.md
#   └─ Currently: Generating OpenAPI spec

# View specific agent details
codex agent view agent-1

# Cancel an agent
codex agent cancel agent-3

# Wait for all agents to complete
codex wait --all

Each agent works independently and produces either:

  • A pull request with all changes
  • A diff for local review
  • Applied changes (if auto-apply is enabled)

The Sandbox Environment

Every Codex agent runs in a sandboxed Linux VM with:

YAML
Sandbox Specifications:
  OS: Ubuntu 22.04 LTS
  CPU: 4 vCPUs
  RAM: 16GB
  Storage: 50GB ephemeral
  Network: Configurable (default: restricted)
  Timeout: 30 minutes (default), up to 2 hours

Pre-installed:
  - Node.js 20.x, 18.x
  - Python 3.11, 3.10
  - Go 1.21
  - Rust 1.75
  - Java 21
  - .NET 8.0
  - Docker (rootless)
  - Common build tools (make, cmake, etc.)

Available on request:
  - Database servers (PostgreSQL, MySQL, Redis)
  - Custom Docker images
  - Specific language versions

The sandbox can:

  • Clone your repository (via GitHub integration or uploaded zip)
  • Install dependencies (npm install, pip install, etc.)
  • Run tests
  • Execute build scripts
  • Start development servers for testing
  • Make HTTP requests (if network enabled)
Bash
# Enable network access for agents that need external APIs
codex task "Integrate Stripe payments" --sandbox-network=true

# Use a specific sandbox configuration
codex task "Build mobile app" --sandbox-config=mobile.yaml

# Custom sandbox with Docker
codex task "Test in production-like environment" \
  --sandbox-docker="postgres:15,redis:7"

Agent Skills System

Skills are reusable instruction packages that standardize common operations:

YAML
# .codex/skills/security-review.yaml
name: security-review
description: Review code for security vulnerabilities
version: "1.0"

instructions: |
  Perform a comprehensive security review:

  1. OWASP Top 10 check:
     - SQL Injection
     - XSS
     - CSRF
     - Authentication issues
     - Access control
     - Security misconfiguration
     - Cryptographic failures

  2. Dependency vulnerabilities:
     - Run npm audit / pip-audit / cargo audit
     - Check for known CVEs

  3. Secret detection:
     - Scan for hardcoded credentials
     - Check for API keys in code
     - Verify .env files are gitignored

  4. Code patterns:
     - Unsafe deserialization
     - Command injection risks
     - Path traversal vulnerabilities

scripts:
  pre:
    - npm audit
    - npx secret-scanner
  post:
    - npm run lint:security

output:
  format: markdown
  sections:
    - severity_high
    - severity_medium
    - severity_low
    - recommendations

triggers:
  - on_pr: true
  - on_push_to: [main, develop]
YAML
# .codex/skills/migration-helper.yaml
name: migration-helper
description: Help migrate between versions/frameworks

parameters:
  from_version:
    type: string
    required: true
    description: Current version or framework
  to_version:
    type: string
    required: true
    description: Target version or framework

instructions: |
  Migrate codebase from {{from_version}} to {{to_version}}:

  1. Analyze current usage patterns
  2. Identify breaking changes between versions
  3. Create migration plan
  4. Implement changes systematically
  5. Update dependencies
  6. Fix type errors and deprecation warnings
  7. Run test suite, fix failures
  8. Update documentation

  Approach:
  - Make atomic commits for each logical change
  - Preserve git history where possible
  - Add migration notes as comments where behavior changed

Using skills:

Bash
# Run a skill
codex --skill security-review

# Run with parameters
codex --skill migration-helper \
  --param from_version="React 17" \
  --param to_version="React 19"

# List available skills
codex skills list

# Create skill from successful task
codex skills create-from-task task-abc123 --name my-new-skill

Interactive Mode

Codex's interactive REPL provides a conversational coding experience:

Bash
$ codex
Welcome to Codex! Type 'help' for commands.

codex> What does the payment processing module do?

The payment processing module (src/payments/) handles:
- Payment intent creation via Stripe
- Webhook handling for async payment events
- Refund processing
- Subscription management

Key files:
- stripe.ts: Stripe SDK wrapper
- webhooks.ts: Event handlers
- subscriptions.ts: Recurring billing logic

codex> There's a bug where refunds sometimes fail silently. Can you investigate?

I'll investigate the refund handling. Let me:
1. Read the refund code in src/payments/refunds.ts
2. Check the error handling patterns
3. Look at recent related commits

[Reading files...]

I found the issue. In `processRefund()` on line 45, errors from the Stripe
API are caught but not properly propagated:

// Current (problematic):
try {
  await stripe.refunds.create({ payment_intent: intentId });
} catch (e) {
  console.error(e); // Error logged but not thrown
}

// Should be:
try {
  await stripe.refunds.create({ payment_intent: intentId });
} catch (e) {
  console.error('Refund failed:', e);
  throw new RefundError(e.message, { intentId, originalError: e });
}

Would you like me to fix this and add proper error handling throughout the payments module?

codex> Yes, fix it and add tests

Starting task in background... Agent ID: agent-7f3a2b
Use 'codex status' to monitor progress.

codex> /status

Agent agent-7f3a2b: "Fix refund error handling"
Status: In Progress (43%)
├─ ✅ Fixed src/payments/refunds.ts
├─ ✅ Added RefundError class
├─ 🔄 Writing tests for refund.test.ts
└─ ⏳ Pending: Integration test updates

codex> /quit

CLI Commands Reference

Bash
# Authentication
codex auth login           # Sign in with ChatGPT account
codex auth logout          # Sign out
codex auth status          # Check authentication status

# Basic operations
codex "your prompt"        # Quick one-shot task
codex ask "question"       # Ask about codebase (no changes)
codex edit "instruction"   # Edit specific files
codex task "description"   # Full autonomous task

# Agent management
codex status               # List all agents
codex agent view <id>      # View agent details
codex agent logs <id>      # Stream agent logs
codex agent cancel <id>    # Cancel running agent
codex wait <id>            # Wait for agent completion
codex wait --all           # Wait for all agents

# Skills
codex --skill <name>       # Run a skill
codex skills list          # List available skills
codex skills show <name>   # Show skill details
codex skills create        # Create new skill
codex skills delete <name> # Delete a skill

# Configuration
codex config show          # Show current config
codex config set <k> <v>   # Set config value
codex init                 # Initialize project

# Git integration
codex pr create            # Create PR from agent result
codex pr list              # List Codex-created PRs

# Advanced
codex --model <name>       # Use specific model
codex --sandbox-network    # Enable network in sandbox
codex --timeout <mins>     # Set task timeout
codex --verbose            # Verbose output
codex --dry-run            # Preview without executing

Configuration

Codex supports both global and project-level configuration:

YAML
# .codex/config.yaml (project-level)

# Model selection
model: gpt-5.2-codex          # Default model
model_fallback: gpt-4o        # Fallback if primary unavailable

# Approval modes
approval_mode: suggest        # suggest | auto-edit | full-auto
  # suggest: Shows diff, waits for approval
  # auto-edit: Applies non-breaking changes automatically
  # full-auto: Applies all changes (use with caution)

# Context configuration
context:
  include:
    - "src/**/*"
    - "tests/**/*"
    - "docs/**/*.md"
    - "*.json"
    - "*.yaml"
  exclude:
    - "node_modules/**"
    - "dist/**"
    - "*.log"
    - ".env*"
  max_files: 100              # Maximum files to include
  max_file_size: 100KB        # Skip files larger than this

# Skills configuration
skills:
  directory: .codex/skills
  enabled:
    - security-review
    - migration-helper
  disabled:
    - experimental-feature

# Sandbox configuration
sandbox:
  enabled: true
  network: false              # Disable network by default
  timeout: 1800               # 30 minutes
  memory: 16GB
  docker_images:              # Pre-pull these images
    - node:20
    - postgres:15

# Git integration
git:
  auto_commit: false          # Auto-commit changes
  commit_prefix: "[codex]"    # Prefix for commits
  branch_prefix: "codex/"     # Prefix for branches
  create_pr: prompt           # prompt | auto | never

# Output preferences
output:
  format: detailed            # minimal | detailed | verbose
  show_thinking: false        # Show agent reasoning
  syntax_highlight: true

# Safety settings
safety:
  max_files_modified: 20      # Warn if more files changed
  require_tests: true         # Require tests for new code
  no_force_push: true         # Prevent force pushes
  protected_files:            # Never modify these
    - ".env"
    - "secrets.yaml"
YAML
# ~/.codex/config.yaml (global defaults)

# Global preferences
default_model: gpt-5.2-codex
theme: dark
editor: code                  # Editor for viewing diffs

# Authentication
api_base_url: https://api.openai.com/v1

# Usage limits (for cost control)
limits:
  daily_tasks: 50
  concurrent_agents: 3
  warn_at_cost: 10.00

# Telemetry
telemetry: false

IDE Integration

VS Code Extension

The Codex VS Code extension provides a rich GUI experience:

JSON
// VS Code settings.json
{
  "codex.enabled": true,
  "codex.model": "gpt-5.2-codex",

  // Inline suggestions (like Copilot)
  "codex.inlineSuggestions.enabled": true,
  "codex.inlineSuggestions.debounceMs": 200,

  // Agent panel
  "codex.agentPanel.position": "right",
  "codex.agentPanel.showOnStart": true,

  // Auto-apply settings
  "codex.autoApply.readOnlyOperations": true,
  "codex.autoApply.formatting": true,
  "codex.autoApply.imports": true,
  "codex.autoApply.codeChanges": false,

  // Keybindings
  "codex.keybindings.triggerAgent": "ctrl+shift+c",
  "codex.keybindings.explainSelection": "ctrl+shift+e",
  "codex.keybindings.fixError": "ctrl+shift+f"
}

VS Code features:

  • Agent Panel: View and manage running agents
  • Inline Diff Preview: See changes before applying
  • Context Menu: Right-click to trigger Codex actions
  • Problems Integration: Codex can auto-fix diagnostics
  • Terminal Integration: Run Codex commands in integrated terminal

Cursor Integration

Codex works alongside Cursor's native features:

JSON
// Cursor settings
{
  // Use Codex for agentic tasks, Cursor for autocomplete
  "cursor.autocomplete.enabled": true,
  "codex.inlineSuggestions.enabled": false,

  // Trigger Codex from Cursor
  "codex.cursorIntegration": true,
  "codex.cursorHotkey": "ctrl+shift+o"
}

Advanced Usage Patterns

Batch Operations

Process multiple tasks efficiently:

Bash
# Create a batch file
cat > tasks.yaml << EOF
tasks:
  - name: "Add TypeScript types"
    files: ["src/legacy/*.js"]
    instruction: "Convert to TypeScript with strict types"

  - name: "Add error handling"
    files: ["src/api/*.ts"]
    instruction: "Add proper error handling with custom error classes"

  - name: "Generate tests"
    files: ["src/utils/*.ts"]
    instruction: "Generate comprehensive unit tests"
    depends_on: ["Add TypeScript types"]
EOF

# Run batch
codex batch tasks.yaml

# Monitor batch progress
codex batch status

CI/CD Integration

Use Codex in automated pipelines:

YAML
# .github/workflows/codex-review.yaml
name: Codex Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Codex Security Review
        uses: openai/codex-action@v2
        with:
          skill: security-review
          api-key: ${{ secrets.OPENAI_API_KEY }}

      - name: Run Codex Code Quality
        uses: openai/codex-action@v2
        with:
          prompt: |
            Review this PR for:
            - Code quality issues
            - Performance problems
            - Missing tests
            - Documentation gaps
          comment-on-pr: true
YAML
# GitLab CI integration
codex-review:
  stage: review
  image: openai/codex-cli:latest
  script:
    - codex auth login --token $OPENAI_API_KEY
    - codex --skill code-review --output=report.md
    - codex --skill security-review --output=security.md
  artifacts:
    reports:
      codequality: report.md

Custom Model Configuration

Use Codex with specific model settings:

Bash
# Use a specific model version
codex --model gpt-5.2-codex-20250115 "task description"

# Adjust generation parameters
codex --temperature 0.2 --max-tokens 8192 "precise task"

# Use with fine-tuned model
codex --model ft:gpt-5.2-codex:my-org:my-finetune "specialized task"

Pricing

Codex is included in ChatGPT subscriptions with varying limits:

| Plan | Monthly Cost | Agent Minutes | Parallel Agents | Features |
| --- | --- | --- | --- | --- |
| ChatGPT Plus | $20/month | 600 min | 2 | CLI + VS Code |
| ChatGPT Pro | $200/month | Unlimited | 5 | Priority, all features |
| ChatGPT Team | $25/user/month | 1,000 min/user | 3/user | Admin controls |
| Enterprise | Custom | Unlimited | 10+/user | SSO, audit logs, SLA |

API pricing (for programmatic access):

| Model | Input | Output |
| --- | --- | --- |
| gpt-5.2-codex | $3.00/1M tokens | $12.00/1M tokens |
| gpt-5.2-codex-mini | $0.50/1M tokens | $2.00/1M tokens |

Sandbox compute:

  • Included in subscription limits
  • Additional compute: $0.02/minute

Codex vs Other Tools

| Capability | Codex | Claude Code | Cursor | Cline |
| --- | --- | --- | --- | --- |
| Execution model | Cloud sandbox | Local | Local | Local |
| Parallel agents | ✅ 2-10 | ❌ No | ❌ No | ❌ No |
| Test execution | ✅ In sandbox | ✅ Local | ✅ Local | ✅ Local |
| Context window | 128K | 200K | 128K | Model-dependent |
| Offline use | ❌ No | ✅ Yes | ⚠️ Limited | ✅ Yes |
| IDE integration | ✅ Good | ⚠️ CLI-focused | ✅ Native | ✅ VS Code |
| Cost model | Subscription | Pay-per-use | Subscription | Pay-per-use |

When to choose Codex:

  • You need parallel agent execution
  • You want isolated sandbox environments
  • You're already paying for ChatGPT Plus/Pro
  • You need GitHub integration for PR creation
  • You prefer subscription pricing over pay-per-use

When to choose alternatives:

  • You need offline capability
  • You require 200K+ token context
  • You prefer local execution for privacy
  • You want to use Claude or other non-OpenAI models

Gemini CLI

Overview

Gemini CLI is Google's terminal-first coding assistant, offering access to Gemini 2.5/3 models with a generous free tier and deep integration with Google Cloud.

Key stats:

  • Free tier with Gemini 2.5 Flash
  • 1M token context window
  • Native multimodal support (images, diagrams)

Features Deep Dive

Interactive Mode

Bash
# Start interactive session
gemini

> Explain the architecture of this project
[Analyzes codebase, provides overview]

> Generate a REST API for user management
[Creates complete API with routes, controllers, models]

> /image architecture.png What improvements would you suggest?
[Analyzes architecture diagram, provides recommendations]

Code Generation

Bash
# Generate code from description
gemini generate "A TypeScript function that validates credit card numbers using the Luhn algorithm"

# Output:
function validateCreditCard(cardNumber: string): boolean {
  // Remove spaces and dashes
  const cleaned = cardNumber.replace(/[\s-]/g, '');

  // Check if only digits
  if (!/^\d+$/.test(cleaned)) {
    return false;
  }

  // Luhn algorithm
  let sum = 0;
  let isEven = false;

  for (let i = cleaned.length - 1; i >= 0; i--) {
    let digit = parseInt(cleaned[i], 10);

    if (isEven) {
      digit *= 2;
      if (digit > 9) {
        digit -= 9;
      }
    }

    sum += digit;
    isEven = !isEven;
  }

  return sum % 10 === 0;
}

File Operations

Bash
# Explain a file
gemini explain src/complex-algorithm.ts

# Review code
gemini review --security src/auth/

# Refactor
gemini refactor "Convert callbacks to async/await" src/legacy/

# Generate tests
gemini test src/utils/validation.ts -o tests/

Gemini CLI Configuration

YAML
# ~/.gemini/config.yaml
model: gemini-2.5-flash  # or gemini-3-flash for paid
temperature: 0.2
max_tokens: 8192

context:
  auto_include_open_files: true
  max_context_files: 50

output:
  format: markdown
  syntax_highlighting: true

project_detection:
  enabled: true
  config_files:
    - package.json
    - pyproject.toml
    - Cargo.toml
    - go.mod

Pricing

| Tier | Model | Price |
| --- | --- | --- |
| Free | Gemini 2.5 Flash | $0 (rate limited) |
| Pay-as-you-go | Gemini 2.5 Pro | $0.075/1K input tokens |
| Pay-as-you-go | Gemini 3 Flash | $0.50/1M input tokens |

Aider (Open Source)

Overview

Aider is the leading open-source terminal-based coding assistant. It's git-native, meaning it automatically commits changes with descriptive messages.

Key stats:

  • 100% open source (Apache 2.0)
  • Works with any LLM (Claude, GPT-4, Llama, etc.)
  • Git-native: automatic commits with good messages
  • Active community and frequent updates

Features Deep Dive

Git-Native Workflow

Aider automatically commits each change:

Bash
# Start aider
aider

> Add input validation to the login form

# Aider:
# 1. Analyzes the codebase
# 2. Makes changes to relevant files
# 3. Automatically creates a git commit:
#    "feat: Add input validation to login form
#     - Added email format validation
#     - Added password strength requirements
#     - Added error message display"

Multi-Model Support

Bash
# Use Claude
aider --model claude-3-5-sonnet

# Use GPT-4
aider --model gpt-4o

# Use local Ollama model
aider --model ollama/llama3.1

# Use DeepSeek
aider --model deepseek/deepseek-chat

Watch Mode

Aider watches for file changes and responds:

Bash
# Start in watch mode
aider --watch

# Now edit files manually or with another tool
# Aider sees changes and can help integrate them

Architect Mode

For planning before coding:

Bash
aider --architect

> Plan a microservices architecture for an e-commerce platform

# Aider creates a plan, then you approve before implementation

Aider Configuration

YAML
# .aider.conf.yml
model: claude-3-5-sonnet
auto-commits: true
commit-prompt: conventional  # conventional commits format

# File handling
auto-lint: true
lint-cmd: npm run lint
test-cmd: npm test

# Git settings
attribute-author: true
attribute-committer: true
dirty-commits: false

# Context
map-tokens: 1024
map-refresh: auto
subtree-only: false

Usage Examples

Bash
# Add files to chat context
aider src/auth/*.ts tests/auth/*.ts

# Ask questions
aider --message "Explain how the auth flow works"

# Make changes
aider --message "Add rate limiting to the login endpoint"

# Review and fix
aider --message "Review this code for security issues and fix them"

# Generate tests
aider --message "Add comprehensive tests for the User model"

Aider Scripting

Bash
#!/bin/bash
# automated-review.sh

# Run aider in non-interactive mode for CI
aider --yes --message "
Review the changes in this PR for:
1. Security vulnerabilities
2. Performance issues
3. Code style violations

Fix any issues found.
" --model claude-3-5-sonnet

# Check if aider made changes
if [ -n "$(git status --porcelain)" ]; then
    echo "Aider found and fixed issues"
    git push
fi

Pricing

Aider itself is free. You pay for the LLM API:

| Model | Cost per 1M tokens |
| --- | --- |
| Claude 3.5 Sonnet | $3 input / $15 output |
| GPT-4o | $2.50 input / $10 output |
| Local (Ollama) | Free |
| DeepSeek | $0.14 input / $0.28 output |

Cline

Overview

Cline is the original open-source autonomous coding agent for VS Code. It pioneered the human-in-the-loop approach where every action requires approval, making it safe for production use.

Key stats:

  • 100% open-source (Apache 2.0)
  • Works with any API provider (OpenAI, Anthropic, local models)
  • Plan/Act mode separation for strategic thinking
  • MCP (Model Context Protocol) support for extensibility

Features Deep Dive

Plan & Act Mode

Cline's unique approach separates thinking from doing:

Code
Plan Mode (Read-Only):
├── Explore codebase
├── Analyze architecture
├── Create implementation strategy
└── No file modifications allowed

Act Mode (Execute):
├── Implement planned changes
├── Run commands
├── Create/modify files
└── Human approval for each action
Bash
# Configure different models for each mode
# Plan: Use cheaper/faster model for exploration
# Act: Use more capable model for implementation

# In Cline settings:
Plan Model: deepseek-chat (cost-effective)
Act Model: claude-sonnet-4 (high quality)

Browser Integration

Cline can interact with web browsers for testing:

TypeScript
// Cline can:
// 1. Launch a headless browser
// 2. Navigate to your local dev server
// 3. Click, type, scroll
// 4. Capture screenshots
// 5. Read console logs
// 6. Fix runtime errors it discovers

// Example task:
"Test the login flow and fix any errors you find"

// Cline will:
// - Start dev server
// - Open browser to localhost:3000
// - Try to log in
// - Capture any errors
// - Fix the code
// - Verify the fix works

MCP Tool Creation

Extend Cline with custom tools:

JSON
{
  "mcpServers": {
    "database": {
      "command": "node",
      "args": ["./mcp-servers/database.js"],
      "tools": [
        {
          "name": "query_database",
          "description": "Execute read-only SQL queries",
          "inputSchema": {
            "type": "object",
            "properties": {
              "query": { "type": "string" }
            }
          }
        }
      ]
    }
  }
}
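On the server side, the database entry above points at a small Node process that speaks MCP over stdio. A minimal sketch of what that server might look like, using the official @modelcontextprotocol/sdk (runQuery is a hypothetical helper wrapping your database driver):

TypeScript
// Sketch of the database MCP server referenced in the config above.
// Assumes @modelcontextprotocol/sdk is installed; runQuery is hypothetical.
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { runQuery } from "./db.js"; // hypothetical wrapper around your DB driver

const server = new Server(
  { name: "database", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

// Advertise the query_database tool declared in the client config
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "query_database",
      description: "Execute read-only SQL queries",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" } },
        required: ["query"],
      },
    },
  ],
}));

// Handle tool calls, enforcing the read-only contract
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { query } = request.params.arguments as { query: string };
  if (!/^\s*select\b/i.test(query)) {
    return {
      isError: true,
      content: [{ type: "text", text: "Only SELECT queries are allowed." }],
    };
  }
  const rows = await runQuery(query);
  return { content: [{ type: "text", text: JSON.stringify(rows, null, 2) }] };
});

await server.connect(new StdioServerTransport());

Once the server is on disk (compiled to ./mcp-servers/database.js), Cline discovers the tool automatically and can propose queries during tasks, each one still gated by your approval settings.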

Configuration

JSON
// .vscode/settings.json
{
  "cline.apiProvider": "anthropic",
  "cline.apiKey": "${env:ANTHROPIC_API_KEY}",
  "cline.model": "claude-sonnet-4",

  "cline.planMode": {
    "enabled": true,
    "model": "deepseek-chat"
  },

  "cline.autoApprove": {
    "readFiles": true,
    "listFiles": true,
    "writeFiles": false,
    "executeCommands": false
  },

  "cline.contextWindow": 128000,
  "cline.maxTokens": 8192
}

Pricing

Cline is free. You pay only for API usage:

| Provider | Model | Cost (input / output per 1M tokens) |
| --- | --- | --- |
| Anthropic | Claude Sonnet 4 | $3 / $15 |
| OpenAI | GPT-4o | $2.50 / $10 |
| DeepSeek | DeepSeek Chat | $0.14 / $0.28 |
| Local | Ollama/LM Studio | Free |

Roo Code

Overview

Roo Code (formerly Roo Cline) is a fork of Cline with enhanced multi-mode capabilities. It provides specialized "personas" for different tasks and innovative features like boomerang tasks.

Key stats:

  • 900K+ VS Code Marketplace installs
  • Multi-mode: Code, Architect, Ask, Debug, Custom
  • Boomerang tasks for complex workflows
  • Stable codebase indexing

Features Deep Dive

Multi-Mode System

Different modes for different tasks:

Code
Modes:
├── Code Mode
│   └── Implementation focused, writes code
├── Architect Mode
│   └── Planning focused, designs solutions
├── Ask Mode
│   └── Q&A focused, explains code
├── Debug Mode
│   └── Bug-finding focused, traces issues
└── Custom Modes
    └── User-defined personas
TypeScript
// Configure different models per mode
const modeConfig = {
  architect: {
    model: "o3",  // Best for planning
    systemPrompt: "You are a senior software architect..."
  },
  code: {
    model: "claude-sonnet-4",  // Best for implementation
    systemPrompt: "You are an expert programmer..."
  },
  debug: {
    model: "gpt-4o",  // Good at finding issues
    systemPrompt: "You are a debugging expert..."
  }
};

Boomerang Tasks

Chain complex workflows across modes:

Code
Boomerang Task Flow:

1. User: "Add user authentication"
   ↓
2. Architect Mode: Plans the implementation
   - Designs auth flow
   - Identifies files to create/modify
   - Creates task breakdown
   ↓
3. Code Mode: Implements each task
   - Creates auth middleware
   - Adds login/logout routes
   - Updates database schema
   ↓
4. Debug Mode: Verifies implementation
   - Runs tests
   - Checks for security issues
   - Validates edge cases
   ↓
5. Returns to User: Complete with summary

Codebase Indexing

Persistent understanding of your project:

Bash
# Roo Code indexes your codebase
# - Function definitions
# - Class hierarchies
# - Import relationships
# - Test coverage mapping

# Query the index
"What functions call the validateUser method?"
"Show me all API endpoints that don't have tests"
"Find all places where we handle authentication errors"

Configuration

JSON
// .roo/config.json
{
  "modes": {
    "architect": {
      "model": "anthropic/claude-sonnet-4",
      "temperature": 0.3
    },
    "code": {
      "model": "anthropic/claude-sonnet-4",
      "temperature": 0.2
    },
    "debug": {
      "model": "openai/gpt-4o",
      "temperature": 0.1
    }
  },
  "boomerang": {
    "enabled": true,
    "autoTransition": true
  },
  "indexing": {
    "enabled": true,
    "excludePaths": ["node_modules", "dist", ".git"]
  }
}

Pricing

Free and open-source. Pay only for AI provider APIs.

Kilo Code

Overview

Kilo Code is a fork of Roo Code with additional enterprise features, including the Orchestrator mode, Memory Bank, and generous free credits for new users.

Key stats:

  • #1 on OpenRouter (6.1 trillion tokens/month)
  • 750K+ active users
  • $20 free credits for new users
  • Access to 500+ AI models

Features Deep Dive

Orchestrator Mode

Automatically chains specialized agents:

Code
Orchestrator Flow:

User: "Build a REST API for a todo app"

Orchestrator:
├── 1. Architect Agent
│   └── Designs API structure, endpoints, database schema
├── 2. Code Agent
│   └── Implements routes, controllers, models
├── 3. Test Agent
│   └── Writes unit and integration tests
├── 4. Documentation Agent
│   └── Generates API docs and README
└── 5. Review Agent
    └── Checks for issues, suggests improvements

Memory Bank

Persistent project context across sessions:

TypeScript
// Memory Bank stores:
// - Project architecture decisions
// - Coding conventions used
// - Previous task context
// - User preferences

// Example: Memory Bank remembers your patterns
Session 1: "Use Prisma for database"
Session 2: "Add a new model"
// Kilo automatically uses Prisma, not raw SQL

// Memory Bank file structure
.kilo/
├── memory/
│   ├── architecture.md    # Design decisions
│   ├── conventions.md     # Code style
│   ├── context.md         # Ongoing work
│   └── preferences.md     # User preferences

Cross-Platform Sync

Work across devices:

Bash
# Start on mobile (Kilo mobile app)
"Plan a caching system for the API"

# Continue on desktop (VS Code)
"Implement the caching plan"

# Finish on laptop (Cursor)
"Add tests for the cache"

# All context is preserved across devices

Configuration

JSON
// kilo.config.json
{
  "orchestrator": {
    "enabled": true,
    "agents": ["architect", "code", "test", "docs", "review"]
  },
  "memoryBank": {
    "enabled": true,
    "syncToCloud": true
  },
  "models": {
    "default": "claude-sonnet-4",
    "planning": "gemini-3-pro",
    "coding": "claude-sonnet-4",
    "review": "gpt-4o"
  },
  "credits": {
    "warnAt": 5.00,
    "hardLimit": null
  }
}

Pricing

| Tier | Cost | Features |
| --- | --- | --- |
| Free Start | $0 + $20 credit | Full access while credits last |
| Pay-as-you-go | Provider rates | No markup, transparent pricing |
| Promotions | Up to $120 free | Check kilo.ai for current offers |

Qwen Coder

Overview

Qwen 2.5 Coder is Alibaba's state-of-the-art open-source code model, available in sizes from 0.5B to 32B parameters. It's not a tool itself, but powers many tools and can be run locally.

Key stats:

  • 69.6% on SWE-bench (32B model)
  • 92 programming languages
  • 128K context window
  • Apache 2.0 license

Using Qwen Coder

With Ollama (Local)

Bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull Qwen Coder
ollama pull qwen2.5-coder:32b

# Run interactively
ollama run qwen2.5-coder:32b

# Or use with other tools

With Aider

Bash
# Use Qwen Coder with Aider
aider --model ollama/qwen2.5-coder:32b

# Or via OpenRouter
aider --model openrouter/qwen/qwen-2.5-coder-32b-instruct

With Cline/Roo Code/Kilo Code

JSON
// Configure in VS Code settings
{
  "cline.apiProvider": "ollama",
  "cline.ollamaBaseUrl": "http://localhost:11434",
  "cline.model": "qwen2.5-coder:32b"
}

Direct API Usage

Python
from openai import OpenAI

# Via OpenRouter
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your-openrouter-key"
)

response = client.chat.completions.create(
    model="qwen/qwen-2.5-coder-32b-instruct",
    messages=[
        {
            "role": "system",
            "content": "You are an expert programmer."
        },
        {
            "role": "user",
            "content": "Write a Python function to validate email addresses"
        }
    ]
)

print(response.choices[0].message.content)

Model Variants

| Model | Parameters | VRAM Required | Best For |
| --- | --- | --- | --- |
| qwen2.5-coder:0.5b | 0.5B | 1GB | Edge devices |
| qwen2.5-coder:1.5b | 1.5B | 2GB | Light tasks |
| qwen2.5-coder:7b | 7B | 8GB | Good balance |
| qwen2.5-coder:14b | 14B | 16GB | Strong performance |
| qwen2.5-coder:32b | 32B | 32GB | Best quality |

Benchmark Performance

| Benchmark | Qwen 2.5 Coder 32B | GPT-4o | Claude Sonnet |
| --- | --- | --- | --- |
| HumanEval | 92.7% | 90.2% | 92.0% |
| MBPP | 90.2% | 88.1% | 89.5% |
| SWE-bench | 69.6% | 68.4% | 72.7% |
| MultiPL-E | 75.2% | 73.8% | 74.1% |

Pricing

| Method | Cost |
| --- | --- |
| Local (Ollama) | Free (your hardware) |
| OpenRouter | $0.06 input / $0.18 output per 1M tokens |
| Together AI | $0.08 input / $0.24 output per 1M tokens |
| Alibaba Cloud | Variable |

Head-to-Head Comparison

Autocomplete Speed

Based on benchmarks (lower is better):

| Tool | Time to First Suggestion | Multi-line Support |
| --- | --- | --- |
| Cursor (Supermaven) | ~50ms | Excellent |
| GitHub Copilot | ~150ms | Good |
| Windsurf | ~120ms | Good |
| Cline/Roo/Kilo | ~200ms (model-dependent) | Good |
| Claude Code | N/A (not autocomplete-focused) | N/A |
| Gemini CLI | ~200ms | Good |

Agentic Capabilities

| Capability | Cursor | Copilot | Windsurf | Cline/Roo/Kilo | Claude Code | Codex |
| --- | --- | --- | --- | --- | --- | --- |
| Multi-file editing | ✅ Excellent | ✅ Good | ✅ Good | ✅ Excellent | ✅ Excellent | ✅ Excellent |
| Autonomous PR creation | ⚠️ Limited | ✅ Excellent | ⚠️ Limited | ⚠️ Limited | ✅ Good | ✅ Excellent |
| Test generation & running | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Codebase understanding | ✅ Good | ✅ Good | ✅ Good | ✅ Excellent | ✅ Excellent | ✅ Good |
| Terminal command execution | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Native | ✅ Yes |
| Session memory | ⚠️ Limited | ⚠️ Limited | ✅ Excellent | ✅ Good (Kilo best) | ⚠️ Session | ⚠️ Limited |
| Browser testing | ❌ No | ❌ No | ❌ No | ✅ Yes (Cline) | ❌ No | ❌ No |
| MCP support | ⚠️ Limited | ❌ No | ❌ No | ✅ Full | ❌ No | ❌ No |
| Plan/Act separation | ❌ No | ❌ No | ❌ No | ✅ Yes | ❌ No | ❌ No |

Benchmark Performance

On SWE-bench Verified (higher is better):

| Tool/Model | Score | Notes |
| --- | --- | --- |
| Claude Code | 80.9% | State-of-the-art |
| Cursor (Claude) | 78.2% | With Composer |
| Cline (Claude Sonnet 4) | 72.7% | Depends on model used |
| GitHub Copilot | 72.5% | Agent mode |
| Windsurf | 71.8% | With Cascade |
| Qwen 2.5 Coder 32B | 69.6% | Open-source local |
| OpenAI Codex | 69.1% | GPT-5.2-Codex |
| Gemini CLI | 68.4% | Gemini 3 Flash |

Open-Source Tool Comparison

| Feature | Cline | Roo Code | Kilo Code | Aider |
| --- | --- | --- | --- | --- |
| Base | Original | Cline fork | Roo fork | Original |
| Plan/Act modes | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| Multi-mode personas | ❌ No | ✅ Yes | ✅ Yes | ✅ Architect mode |
| Memory Bank | ❌ No | ⚠️ Basic | ✅ Full | ❌ No |
| Orchestrator | ❌ No | ⚠️ Boomerang | ✅ Full | ❌ No |
| Browser use | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| MCP support | ✅ Full | ✅ Full | ✅ Full | ❌ No |
| Git integration | ⚠️ Basic | ⚠️ Basic | ⚠️ Basic | ✅ Native |
| Free credits | ❌ No | ❌ No | ✅ $20 | ❌ No |
| Marketplace installs | 2M+ | 900K+ | 750K+ | CLI only |

Cost Analysis

For a typical developer (100 requests/day, 20 working days/month):

| Tool | Monthly Cost | Cost per Request |
| --- | --- | --- |
| Cursor Pro | $20 | ~$0.01 |
| Copilot Individual | $10 | ~$0.005 |
| Windsurf Pro | $15 | ~$0.0075 |
| Claude Code | ~$30-60 | ~$0.015-0.03 |
| Gemini CLI (Free) | $0 | $0 |
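The per-request column follows directly from the assumed volume (100 requests/day × 20 days = 2,000 requests/month); a quick sanity check:

TypeScript
// Sketch: derive the cost-per-request column from the assumed usage.
const requestsPerMonth = 100 * 20; // 2,000 requests/month

const plans = [
  { tool: "Cursor Pro", monthly: 20 },
  { tool: "Copilot Individual", monthly: 10 },
  { tool: "Windsurf Pro", monthly: 15 },
];

for (const { tool, monthly } of plans) {
  console.log(`${tool}: $${(monthly / requestsPerMonth).toFixed(4)} per request`);
}
// Cursor Pro: $0.0100, Copilot Individual: $0.0050, Windsurf Pro: $0.0075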

Choosing the Right Tool

Decision Framework

Code
START
  │
  ├─ Do you need GitHub integration (PRs, issues)?
  │    └─ YES → GitHub Copilot
  │
  ├─ Do you work in terminal primarily?
  │    └─ YES → Claude Code or Gemini CLI
  │
  ├─ Do you need session memory across days?
  │    └─ YES → Windsurf
  │
  ├─ Do you do complex multi-file refactoring?
  │    └─ YES → Cursor or Claude Code
  │
  ├─ Are you budget-constrained?
  │    └─ YES → Gemini CLI (free) or Copilot ($10)
  │
  └─ DEFAULT → Cursor (best all-around)

Best Tool by Use Case

| Use Case | Recommended | Why |
| --- | --- | --- |
| Daily coding | Cursor | Fast autocomplete + Composer |
| Open source contributions | Copilot | PR agent, GitHub integration |
| Large codebase refactoring | Claude Code | 200K context, deep understanding |
| Team collaboration | Copilot or Windsurf | Team features, shared context |
| Budget-conscious | Gemini CLI | Free tier is generous |
| DevOps/Infrastructure | Claude Code | Understands K8s, Terraform |
| Learning/Students | Copilot Free | Best free tier for IDE |

Multi-Tool Strategy

Many teams use multiple tools strategically:

YAML
# Recommended multi-tool setup
daily_development:
  primary: Cursor
  reason: "Fast autocomplete, Composer for features"

code_review:
  primary: Claude Code
  reason: "Deep codebase understanding for reviews"

pull_requests:
  primary: GitHub Copilot
  reason: "Native PR creation and GitHub integration"

exploration:
  primary: Gemini CLI
  reason: "Free, multimodal for architecture diagrams"

Advanced Tips

Cursor Power Tips

TypeScript
// 1. Use multi-cursor with AI
// Select multiple similar patterns, trigger AI to transform all

// 2. Chain Composer commands
/*
Composer:
1. Create a new API endpoint for /products
2. Add validation using Zod
3. Write integration tests
4. Update OpenAPI spec
*/

// 3. Use @codebase for project-wide context
// "@codebase What's the pattern for error handling here?"

Copilot Power Tips

Bash
# 1. Use workspace agents
gh copilot workspace "Add dark mode to the entire app"

# 2. Inline chat with /commands
# Select code, then:
# /explain - Understand code
# /fix - Fix bugs
# /simplify - Reduce complexity
# /optimize - Improve performance

# 3. Use Copilot in commits
git commit # Copilot suggests message based on diff

Claude Code Power Tips

Bash
# 1. Use --depth for analysis depth
claude-code ask --depth deep "Analyze security vulnerabilities"

# 2. Chain commands with pipes
claude-code explain src/auth.ts | claude-code task "Add the missing error handling"

# 3. Use templates for common tasks
claude-code task --template api-endpoint "Create /users/search"

Building Custom Integrations

VS Code Extension API

TypeScript
import * as vscode from 'vscode';

export function activate(context: vscode.ExtensionContext) {
  // Register custom AI command
  const disposable = vscode.commands.registerCommand(
    'myExtension.aiRefactor',
    async () => {
      const editor = vscode.window.activeTextEditor;
      if (!editor) return;

      const selection = editor.document.getText(editor.selection);

      // Call your preferred AI API
      const response = await callAI({
        prompt: `Refactor this code for better readability:\n\n${selection}`,
        model: 'claude-sonnet-4'
      });

      // Replace selection with AI response
      editor.edit(editBuilder => {
        editBuilder.replace(editor.selection, response.code);
      });
    }
  );

  context.subscriptions.push(disposable);
}

async function callAI(params: { prompt: string; model: string }) {
  // Call the Anthropic Messages API (swap in any provider here)
  const response = await fetch('https://api.anthropic.com/v1/messages', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': process.env.ANTHROPIC_API_KEY!,
      'anthropic-version': '2023-06-01'
    },
    body: JSON.stringify({
      model: params.model,
      max_tokens: 4096,
      messages: [{ role: 'user', content: params.prompt }]
    })
  });

  // The Messages API returns generated text in content[0].text;
  // expose it as `code` to match the caller above.
  const data = await response.json();
  return { code: data.content[0].text };
}

CLI Wrapper

Python
#!/usr/bin/env python3
"""
Custom AI coding assistant CLI that combines multiple providers.
"""

import click
import os
from anthropic import Anthropic
from openai import OpenAI

anthropic = Anthropic()
openai = OpenAI()

@click.group()
def cli():
    """Multi-provider AI coding assistant."""
    pass

@cli.command()
@click.argument('file')
@click.option('--provider', '-p', default='claude', help='AI provider')
def explain(file: str, provider: str):
    """Explain a code file."""
    with open(file, 'r') as f:
        code = f.read()

    prompt = f"Explain this code in detail:\n\n```\n{code}\n```"

    if provider == 'claude':
        response = anthropic.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}]
        )
        click.echo(response.content[0].text)
    elif provider == 'openai':
        response = openai.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}]
        )
        click.echo(response.choices[0].message.content)

@cli.command()
@click.argument('instruction')
@click.argument('files', nargs=-1)
@click.option('--dry-run', is_flag=True, help='Show changes without applying')
def edit(instruction: str, files: tuple, dry_run: bool):
    """Edit files based on instruction."""
    for file in files:
        with open(file, 'r') as f:
            original = f.read()

        prompt = f"""Edit this code according to the instruction.

Instruction: {instruction}

Code:
{original}

Return only the modified code, no explanations."""

        response = anthropic.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=8192,
            messages=[{"role": "user", "content": prompt}]
        )

        new_code = response.content[0].text

        if dry_run:
            click.echo(f"=== Changes for {file} ===")
            # Show diff
            import difflib
            diff = difflib.unified_diff(
                original.splitlines(keepends=True),
                new_code.splitlines(keepends=True),
                fromfile=f'{file} (original)',
                tofile=f'{file} (modified)'
            )
            click.echo(''.join(diff))
        else:
            with open(file, 'w') as f:
                f.write(new_code)
            click.echo(f"Updated {file}")

@cli.command()
@click.argument('description')
@click.option('--output', '-o', default='.', help='Output directory')
def generate(description: str, output: str):
    """Generate code from description."""
    prompt = f"""Generate production-ready code for:

{description}

Include:
- TypeScript types
- Error handling
- JSDoc comments
- Unit tests

Format response as:
=== filename.ts ===
<code>
=== filename.test.ts ===
<code>
"""

    response = anthropic.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=8192,
        messages=[{"role": "user", "content": prompt}]
    )

    content = response.content[0].text

    # Parse and create files
    import re
    files = re.split(r'=== (.+?) ===', content)

    for i in range(1, len(files), 2):
        filename = files[i].strip()
        code = files[i + 1].strip()

        filepath = os.path.join(output, filename)
        os.makedirs(os.path.dirname(filepath) or '.', exist_ok=True)

        with open(filepath, 'w') as f:
            f.write(code)
        click.echo(f"Created {filepath}")

if __name__ == '__main__':
    cli()

Future Outlook

  1. Deeper autonomy: Agents that can handle entire feature requests end-to-end
  2. Better context: 1M+ token context becoming standard
  3. Specialized models: Fine-tuned models for specific frameworks/languages
  4. Voice integration: Code by talking to your assistant
  5. Multi-agent collaboration: Multiple AI agents working together

What to Watch

  • OpenAI Codex: successors to GPT-5.2-Codex and their coding capabilities
  • Claude Code improvements: Longer context, faster inference
  • Cursor evolution: More agentic capabilities
  • Open source alternatives: StarCoder 2, CodeLlama improvements

Conclusion

AI coding assistants have transformed software development. The choice between tools depends on your specific needs:

  • Cursor for power users wanting the best IDE experience
  • GitHub Copilot for teams deeply integrated with GitHub
  • Windsurf for those who value session continuity
  • Claude Code for complex codebases and terminal workflows
  • Gemini CLI for budget-conscious developers and Google ecosystem

Most developers will benefit from trying multiple tools and potentially using them in combination. The productivity gains from AI coding assistants, with 30-50% improvements reported in industry surveys, make them essential for modern development.


Enrico Piovano, PhD

Co-founder & CTO at Goji AI. Former Applied Scientist at Amazon (Alexa & AGI), focused on Agentic AI and LLMs. PhD in Electrical Engineering from Imperial College London. Gold Medalist at the National Mathematical Olympiad.
