AI Coding Assistants 2025: Cursor vs Copilot vs Windsurf vs Claude Code
A comprehensive comparison of AI coding assistants in 2025—Cursor, GitHub Copilot, Windsurf, Claude Code, and more. Features, pricing, use cases, and how to maximize productivity with each tool.
The AI Coding Revolution
AI coding assistants have become indispensable for modern software development. In 2025, these tools have evolved from simple autocomplete to autonomous agents that can understand entire codebases, implement features across multiple files, run tests, and create pull requests.
From industry surveys: "78% of developers now use AI coding tools regularly, with productivity gains of 30-50% reported across various tasks."
This guide provides a comprehensive comparison of the leading AI coding assistants, helping you choose the right tool for your workflow.
Two Categories: IDE vs Terminal
AI coding assistants fall into two distinct categories:
IDE-Based Tools
Full graphical IDE experience with visual diff previews, inline suggestions, and GUI-based interactions.
| Tool | Base | Key Advantage |
|---|---|---|
| Cursor | VS Code fork | Composer multi-file editing |
| GitHub Copilot | VS Code/JetBrains plugin | GitHub PR integration |
| Windsurf | VS Code fork | Cascade session memory |
Best for: Visual developers, those who prefer GUI interactions, teams needing shared IDE settings.
Terminal-Based Tools
CLI-native interfaces for developers who live in the terminal.
| Tool | Interface | Key Advantage |
|---|---|---|
| OpenAI Codex | CLI + IDE Extension | GPT-5.2-Codex model, parallel agents |
| Claude Code | CLI + REPL | 200K context, deep reasoning |
| Gemini CLI | CLI | Free tier, multimodal |
| Aider | CLI | Git-native workflow |
Best for: DevOps engineers, vim/emacs users, automation scripts, CI/CD integration.
Open-Source Agentic Extensions
VS Code extensions providing autonomous coding with any model provider.
| Tool | Based On | Key Advantage |
|---|---|---|
| Cline | Original | Plan/Act modes, MCP support, browser use |
| Roo Code | Cline fork | Multi-mode (Code/Architect/Debug), boomerang tasks |
| Kilo Code | Roo Code fork | Orchestrator mode, Memory Bank, $20 free credits |
Best for: Developers wanting open-source flexibility with any AI provider.
Specialized Code Models
Open-source models optimized specifically for code generation.
| Model | Parameters | SWE-bench | Best For |
|---|---|---|---|
| Qwen 2.5 Coder | 0.5B-32B | 69.6% (32B) | Local deployment, 92 languages |
| DeepSeek Coder V2 | 16B/236B | 68.4% | Cost-effective, MoE |
| CodeLlama | 7B-70B | ~45% | Meta ecosystem |
Quick Comparison
| Tool | Type | Best For | Model Access | Pricing | Key Feature |
|---|---|---|---|---|---|
| Cursor | IDE | Power users, multi-file edits | Claude, GPT-4o, custom | $20/month | Composer mode |
| GitHub Copilot | IDE | GitHub integration, teams | GPT-4o, Claude | $10/month | Agent mode + PR creation |
| Windsurf | IDE | Session continuity, budget | GPT-4o, Claude | $15/month | Cascade memory system |
| OpenAI Codex | Terminal/IDE | OpenAI ecosystem | GPT-5.2-Codex | ChatGPT Plus | Parallel cloud agents |
| Claude Code | Terminal | Large codebases, DevOps | Claude 4 | Usage-based | 200K context, CLI-native |
| Gemini CLI | Terminal | Google ecosystem, free use | Gemini 2.5/3 | Free tier | Multimodal support |
| Cline | VS Code Ext | Open-source flexibility | Any (configurable) | Free (OSS) | Plan/Act modes |
| Roo Code | VS Code Ext | Multi-mode workflows | Any (configurable) | Free (OSS) | Boomerang tasks |
| Kilo Code | VS Code Ext | Free credits, orchestration | 500+ models | Free + credits | Memory Bank |
| Aider | Terminal | Git workflows, open source | Any (configurable) | Free (OSS) | Git-native commits |
Cursor
Overview
Cursor is a VS Code fork that pioneered IDE-native AI with deep codebase understanding. Its Composer mode enables multi-file, autonomous code generation that goes far beyond autocomplete.
Key stats:
- 40,000+ companies using Cursor
- Autocomplete powered by Supermaven (fastest in class)
- Multi-model support: Claude 4, GPT-4o, and custom models
Features Deep Dive
Composer Mode
Composer is Cursor's flagship feature—an agentic coding assistant that can:
- Understand feature requests in natural language
- Modify multiple files simultaneously
- Run terminal commands
- Execute tests and iterate on failures
- Create entire features autonomously
User: "Add user authentication with JWT tokens, including
login/logout endpoints, middleware, and tests"
Composer:
1. Creates auth/jwt.ts with token generation/verification
2. Adds routes/auth.ts with login/logout endpoints
3. Creates middleware/auth.ts for protected routes
4. Updates app.ts to register new routes
5. Generates tests/auth.test.ts
6. Runs tests, fixes any failures
Tab Autocomplete
Cursor's autocomplete (powered by Supermaven) predicts multi-line completions:
# Type: "def calculate_"
# Cursor suggests:
def calculate_total_price(items: list[dict], tax_rate: float = 0.08) -> float:
    """Calculate total price including tax."""
    subtotal = sum(item['price'] * item['quantity'] for item in items)
    return subtotal * (1 + tax_rate)
Codebase Indexing
Cursor indexes your entire codebase for context-aware suggestions:
# Cursor understands your existing patterns
# If your codebase uses a specific ORM pattern:
# Type: "def get_user_by_"
# Cursor suggests based on YOUR existing code:
def get_user_by_email(email: str) -> User | None:
"""Fetch user by email using existing repository pattern."""
return UserRepository.find_one({"email": email})
Chat with @ References
Reference specific files, functions, or documentation:
@auth.ts @middleware.ts How does the current auth flow work?
@docs Can you explain how to use the payment API?
@git-diff Review these changes for security issues
Cursor Configuration
Configure Cursor's behavior through settings. These control indexing depth, model selection, and context management. Getting these right significantly impacts response quality—more context helps the model understand your codebase, but too much can slow things down.
// .cursor/settings.json
{
"cursor.cpp.enableIndexing": true,
"cursor.general.gitGraphEnabled": true,
"cursor.chat.showSuggestedFiles": true,
"cursor.composer.enabled": true,
"cursor.autocomplete.enabled": true,
"cursor.autocomplete.useSupermaven": true,
// Model preferences
"cursor.models.default": "claude-sonnet-4",
"cursor.models.composer": "claude-sonnet-4",
// Context settings
"cursor.context.maxFiles": 20,
"cursor.context.includeOpenTabs": true
}
Key settings explained:
- enableIndexing: Indexes your codebase for semantic search. Essential for "find similar code" and context-aware suggestions. Disable only for huge monorepos where indexing is slow.
- useSupermaven: Uses Supermaven's fast autocomplete model. Significantly faster than GPT-based completion but may be less accurate for complex patterns.
- models.default vs models.composer: Use a faster model (Sonnet) for chat, and the same or stronger for Composer's multi-file edits where accuracy matters more than speed.
- maxFiles: How many files Cursor includes as context. Higher values give better understanding but use more tokens and cost more.
Custom Instructions
The .cursorrules file is your secret weapon for consistent code generation. It tells Cursor about your project's conventions, tech stack, and coding standards. The model reads this before every interaction, so be specific—vague rules get vague results.
Create .cursorrules for project-specific behavior:
# .cursorrules
## Project Context
This is a TypeScript backend using:
- Express.js for routing
- Prisma for database ORM
- Jest for testing
- Zod for validation
## Code Style
- Use functional programming patterns where possible
- Prefer `const` over `let`
- Use early returns to reduce nesting
- All functions must have JSDoc comments
- Error handling with custom AppError class
## Testing
- Every new function needs unit tests
- Use test factories for mock data
- Integration tests for API endpoints
## Avoid
- Any use of `any` type
- Console.log in production code
- Hardcoded strings (use constants)
Pricing
| Plan | Price | Features |
|---|---|---|
| Free | $0 | 2000 completions, 50 slow requests |
| Pro | $20/month | Unlimited completions, 500 fast requests |
| Business | $40/user/month | Team features, admin controls, SSO |
GitHub Copilot
Overview
GitHub Copilot is the most widely adopted AI coding assistant, with deep GitHub integration and a new agent mode that can autonomously create pull requests.
Key stats:
- 1.8 million paid subscribers
- Used by 77,000+ organizations
- Powers Copilot Workspace and agent mode
Features Deep Dive
Agent Mode
Copilot's agent mode can take a GitHub issue and autonomously:
- Analyze the issue requirements
- Explore the codebase for context
- Write the implementation
- Create and run tests
- Open a pull request
# Trigger agent mode from CLI
gh copilot agent --issue 123
# Or from GitHub UI:
# Click "Start agent" on any issue
Inline Suggestions
Context-aware completions as you type:
// Type a comment describing what you want:
// Function to validate email addresses using regex
// Copilot suggests:
function validateEmail(email: string): boolean {
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
return emailRegex.test(email);
}
Copilot Chat
Integrated chat with codebase awareness:
/explain What does this function do?
/fix There's a bug in the sorting logic
/tests Generate unit tests for the selected code
/doc Add documentation to this class
Pull Request Summaries
Auto-generate PR descriptions:
## Summary
This PR adds user authentication with JWT tokens.
## Changes
- Added `auth/jwt.ts` for token management
- Created login/logout API endpoints
- Implemented auth middleware
- Added comprehensive test coverage
## Testing
- [x] Unit tests pass
- [x] Integration tests pass
- [x] Manual testing completed
Copilot Configuration
Fine-tune Copilot's behavior through VS Code settings. The most important setting is which file types get suggestions—enable for code, often disable for prose where suggestions can be distracting.
// VS Code settings.json
{
"github.copilot.enable": {
"*": true,
"markdown": true,
"plaintext": false,
"yaml": true
},
"github.copilot.advanced": {
"temperature": 0.3,
"top_p": 0.95,
"max_tokens": 500
},
"github.copilot.chat.localeOverride": "en",
"github.copilot.editor.enableAutoCompletions": true
}
Understanding the settings:
- enable by file type: The "*": true entry enables suggestions for all file types; more specific entries then override it. Disable for plaintext to avoid suggestions in notes/docs.
- temperature: Controls randomness. Lower (0.1-0.3) for more deterministic code; higher (0.7+) for creative suggestions. 0.3 is a good default for most coding.
- top_p: Nucleus sampling parameter. 0.95 means consider tokens in the top 95% probability mass. Lower values make output more focused.
- max_tokens: Maximum length of suggestions. 500 is enough for most functions; increase for longer code blocks.
Custom Instructions
Copilot reads .github/copilot-instructions.md to understand your project's conventions. Unlike Cursor's .cursorrules, this file lives in the .github folder (so it can be repo-wide) and follows Markdown format. Include your tech stack, coding standards, and patterns you want Copilot to follow.
Create .github/copilot-instructions.md:
# Copilot Instructions
## Language & Framework
- TypeScript with strict mode
- React 18 with hooks
- TanStack Query for data fetching
- Tailwind CSS for styling
## Patterns
- Use custom hooks for reusable logic
- Prefer composition over inheritance
- Use discriminated unions for state
- Error boundaries for error handling
## Code Generation
- Include TypeScript types for all functions
- Add JSDoc for public APIs
- Generate tests using Vitest
- Use `describe`/`it` test structure
Copilot CLI
# Install
gh extension install github/gh-copilot
# Explain a command
gh copilot explain "git rebase -i HEAD~3"
# Suggest a command
gh copilot suggest "find large files in git history"
# Ask questions
gh copilot ask "How do I squash commits?"
Pricing
| Plan | Price | Features |
|---|---|---|
| Free | $0 | 2000 completions/month, limited chat |
| Individual | $10/month | Unlimited completions, chat, CLI |
| Business | $19/user/month | Organization policies, audit logs |
| Enterprise | $39/user/month | SSO, IP indemnity, advanced security |
Windsurf
Overview
Windsurf (formerly Codeium) was acquired by Cognition (makers of Devin) in 2025 and offers the best session memory with its Cascade system. It's ideal for developers who value context continuity across sessions.
Key stats:
- Acquired by Cognition
- Best-in-class session memory
- Clear diff previews before applying changes
Features Deep Dive
Cascade System
Cascade maintains context across your entire coding session:
Session 1 (Morning):
"I'm building a REST API for a blog platform"
[Discusses architecture, creates initial routes]
Session 2 (Afternoon):
"Let's add comments to posts"
[Cascade remembers the blog context, existing routes, patterns]
Session 3 (Next day):
"Add authentication"
[Still has full context of blog platform, existing models]
Flows
Pre-built workflows for common tasks:
/flow create-api
→ Walks through: endpoint design, validation, testing, docs
/flow refactor
→ Analyzes code, suggests improvements, applies changes
/flow debug
→ Examines error, traces cause, suggests fix
Diff Preview
See exactly what will change before applying:
// Windsurf shows clear diffs:
- function getUser(id) {
- return users.find(u => u.id === id);
- }
+ function getUser(id: string): User | undefined {
+ if (!id) {
+ throw new ValidationError('User ID is required');
+ }
+ return users.find(u => u.id === id);
+ }
Windsurf Configuration
Windsurf's configuration controls the Cascade memory system and completion behavior. The key differentiator is memoryDuration: set it to "session" to retain context for the current editing session, or to "project" to persist memory across sessions (useful for long-running projects).
// windsurf.config.json
{
"cascade": {
"enabled": true,
"memoryDuration": "session",
"contextWindow": 128000
},
"completions": {
"enabled": true,
"delay": 200,
"multiline": true
},
"chat": {
"model": "gpt-4o",
"temperature": 0.2
},
"flows": {
"enableBuiltIn": true,
"customFlowsPath": ".windsurf/flows"
}
}
Custom Flows
Create custom workflows:
# .windsurf/flows/create-component.yaml
name: Create React Component
description: Generate a new React component with tests and stories
steps:
- prompt: "What should this component do?"
variable: purpose
- prompt: "What props does it need?"
variable: props
- action: generate
template: |
Create a React component that:
- Purpose: {{purpose}}
- Props: {{props}}
- Include TypeScript types
- Add unit tests
- Add Storybook story
- action: create_files
files:
- "src/components/{{name}}/index.tsx"
- "src/components/{{name}}/{{name}}.test.tsx"
- "src/components/{{name}}/{{name}}.stories.tsx"
Pricing
| Plan | Price | Features |
|---|---|---|
| Free | $0 | Unlimited basic completions |
| Pro | $15/month | Cascade, Flows, premium models |
| Team | $30/user/month | Shared memory, team flows, admin |
Claude Code
Overview
Claude Code is Anthropic's terminal-native coding assistant. With a 200K token context window and deep reasoning capabilities, it excels at understanding entire codebases and complex refactoring tasks.
Key stats:
- 80.9% on SWE-bench (state-of-the-art)
- 200K token context (entire large codebases)
- Terminal-first interface
Features Deep Dive
Codebase Understanding
Claude Code can ingest entire repositories:
# Initialize Claude Code in your project
claude-code init
# Ask about the codebase
claude-code ask "How does the authentication system work?"
# Get architecture overview
claude-code ask "Explain the overall architecture and key components"
Multi-File Editing
Edit multiple files with a single command:
claude-code edit "Rename the User model to Account across the entire codebase"
# Claude Code:
# 1. Finds all references to User
# 2. Updates model definition
# 3. Updates all imports
# 4. Updates all usages
# 5. Updates tests
# 6. Shows diff for approval
Agentic Tasks
Run complex multi-step tasks:
claude-code task "Add a caching layer for the API responses"
# Claude Code autonomously:
# 1. Analyzes current API structure
# 2. Determines best caching strategy
# 3. Implements Redis caching layer
# 4. Adds cache invalidation
# 5. Updates existing endpoints
# 6. Adds configuration options
# 7. Creates tests
# 8. Documents the changes
Claude Code Configuration
Configure Claude Code through a YAML file at your project root. The context section is critical—it determines which files Claude Code indexes and can access. Be strategic: include source and test files, exclude dependencies and build artifacts. The hooks section automates pre/post-edit checks, catching type errors before you see the diff.
# claude-code.yaml
project:
name: my-app
type: typescript
context:
include:
- "src/**/*.ts"
- "tests/**/*.ts"
- "*.json"
exclude:
- "node_modules"
- "dist"
- "*.log"
preferences:
model: claude-sonnet-4
max_tokens: 8192
temperature: 0.2
style:
typescript:
strict: true
prefer_interfaces: true
use_type_imports: true
testing:
framework: vitest
coverage_threshold: 80
documentation:
style: jsdoc
require_for_public: true
hooks:
pre_edit:
- "npm run typecheck"
post_edit:
- "npm run lint:fix"
- "npm run test"
Claude Code API Integration
Use Claude Code programmatically:
from claude_code import ClaudeCode
# Initialize
cc = ClaudeCode(
api_key="your-api-key",
project_path="/path/to/project"
)
# Index the codebase
cc.index()
# Ask questions
response = cc.ask(
"What are the main API endpoints and their purposes?"
)
print(response.answer)
print(response.relevant_files)
# Edit files
result = cc.edit(
instruction="Add input validation to all API endpoints",
files=["src/routes/*.ts"],
dry_run=True # Preview changes first
)
for change in result.changes:
    print(f"File: {change.file}")
    print(f"Diff:\n{change.diff}")
# Apply changes
if input("Apply changes? (y/n): ") == "y":
    cc.apply(result)
Shell Integration
# Add to .bashrc/.zshrc
alias cc="claude-code"
alias cca="claude-code ask"
alias cce="claude-code edit"
alias cct="claude-code task"
# Quick commands
cca "What does this function do?" -f src/utils/parser.ts
cce "Add error handling" -f src/api/routes.ts
cct "Write tests for the auth module"
Pricing
Claude Code uses usage-based pricing:
| Usage | Cost |
|---|---|
| Input tokens | $3/million tokens |
| Output tokens | $15/million tokens |
| Typical session | $0.10-0.50 |
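As a rough sanity check on those figures, here is the arithmetic for a hypothetical session at the listed rates (the token counts are illustrative assumptions):
// Hypothetical session: ~60K input tokens (context + prompts), ~4K output tokens
const inputTokens = 60_000;
const outputTokens = 4_000;
const cost = (inputTokens / 1e6) * 3 + (outputTokens / 1e6) * 15;
console.log(cost.toFixed(2)); // "0.24", within the typical range above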
OpenAI Codex
Overview
OpenAI Codex represents OpenAI's ambitious entry into the agentic coding space. Originally launched as a code completion model (the engine behind GitHub Copilot), Codex has evolved into a full-fledged autonomous coding agent that can understand entire codebases, execute multi-step tasks, and even run multiple parallel agents in the cloud.
The current iteration, powered by GPT-5.2-Codex, is specifically optimized for long-horizon coding tasks—meaning it excels at multi-step problems that require sustained context and planning over many tool calls and file modifications.
Key stats:
- Cloud-based parallel agent execution (up to 5 concurrent agents)
- Powered by GPT-5.2-Codex (optimized for long-horizon coding)
- Available as CLI, VS Code extension, and integrated in Cursor/Windsurf
- Included in ChatGPT Plus/Pro/Team/Enterprise subscriptions
- Sandbox execution environment for safe code running
- Native integration with GitHub for PR creation
The Evolution of Codex
Understanding Codex's history helps contextualize its current capabilities:
2021 - Codex Original: Launched as a fine-tuned GPT-3 model for code completion. Powered the original GitHub Copilot. Limited to autocomplete suggestions.
2023 - GPT-4 Code Interpreter: Code execution capabilities added via ChatGPT. Users could upload files and run Python code in a sandbox.
2024 - ChatGPT Canvas: Introduced a dedicated coding interface within ChatGPT with side-by-side editing and iterative refinement.
2025 - Codex Agent (Current): Full autonomous coding agent with:
- Cloud-based execution environment
- Parallel agent support
- Native IDE integration
- Git/GitHub workflow integration
- Persistent project understanding
Architecture Deep Dive
Codex operates fundamentally differently from local coding assistants:
Traditional (Cursor, Cline):
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Your IDE │───>│ LLM API │───>│ Response │
│ (local) │ │ (cloud) │ │ (local) │
└─────────────┘ └─────────────┘ └─────────────┘
^ │
└───────────────────────────────────────┘
(runs locally)
Codex Architecture:
┌─────────────┐ ┌─────────────────────────────┐
│ Your IDE │───>│ Codex Cloud │
│ (local) │ │ ┌─────────────────────┐ │
└─────────────┘ │ │ GPT-5.2-Codex │ │
^ │ │ ┌───────────────┐ │ │
│ │ │ │ Sandbox VM │ │ │
│ │ │ │ - Your code │ │ │
│ │ │ │ - Git clone │ │ │
│ │ │ │ - Test runner │ │ │
│ │ │ └───────────────┘ │ │
│ │ └─────────────────────┘ │
│ └─────────────────────────────┘
└──────── (PR/diff returned) ──────────────┘
This cloud-native architecture enables several unique capabilities:
- Parallel execution: Multiple agents can work simultaneously without competing for local resources
- Isolated environments: Each task runs in a fresh sandbox with your code cloned
- Full execution: Agents can actually run tests, builds, and other commands
- No local compute: Your machine isn't taxed during heavy AI processing
Features Deep Dive
Parallel Cloud Agents
Codex's most distinctive feature is running multiple agents simultaneously in the cloud. Each agent gets its own isolated environment:
# Start multiple agents working in parallel
codex task "Implement user authentication with JWT" --background &
codex task "Write comprehensive unit tests for the User model" --background &
codex task "Update API documentation with new endpoints" --background &
# Check status of all running agents
codex status
# Output:
# Agent 1: "Implement user authentication" - 67% complete
# └─ Created: auth/jwt.ts, auth/middleware.ts
# └─ Currently: Writing login route tests
#
# Agent 2: "Unit tests for User model" - 45% complete
# └─ Created: tests/user.model.test.ts
# └─ Currently: Testing edge cases
#
# Agent 3: "API documentation" - 89% complete
# └─ Updated: docs/api.md, docs/auth.md
# └─ Currently: Generating OpenAPI spec
# View specific agent details
codex agent view agent-1
# Cancel an agent
codex agent cancel agent-3
# Wait for all agents to complete
codex wait --all
Each agent works independently and produces either:
- A pull request with all changes
- A diff for local review
- Applied changes (if auto-apply is enabled)
The Sandbox Environment
Every Codex agent runs in a sandboxed Linux VM with:
Sandbox Specifications:
OS: Ubuntu 22.04 LTS
CPU: 4 vCPUs
RAM: 16GB
Storage: 50GB ephemeral
Network: Configurable (default: restricted)
Timeout: 30 minutes (default), up to 2 hours
Pre-installed:
- Node.js 20.x, 18.x
- Python 3.11, 3.10
- Go 1.21
- Rust 1.75
- Java 21
- .NET 8.0
- Docker (rootless)
- Common build tools (make, cmake, etc.)
Available on request:
- Database servers (PostgreSQL, MySQL, Redis)
- Custom Docker images
- Specific language versions
The sandbox can:
- Clone your repository (via GitHub integration or uploaded zip)
- Install dependencies (npm install, pip install, etc.)
- Run tests
- Execute build scripts
- Start development servers for testing
- Make HTTP requests (if network enabled)
# Enable network access for agents that need external APIs
codex task "Integrate Stripe payments" --sandbox-network=true
# Use a specific sandbox configuration
codex task "Build mobile app" --sandbox-config=mobile.yaml
# Custom sandbox with Docker
codex task "Test in production-like environment" \
--sandbox-docker="postgres:15,redis:7"
Agent Skills System
Skills are reusable instruction packages that standardize common operations:
# .codex/skills/security-review.yaml
name: security-review
description: Review code for security vulnerabilities
version: "1.0"
instructions: |
Perform a comprehensive security review:
1. OWASP Top 10 check:
- SQL Injection
- XSS
- CSRF
- Authentication issues
- Access control
- Security misconfiguration
- Cryptographic failures
2. Dependency vulnerabilities:
- Run npm audit / pip-audit / cargo audit
- Check for known CVEs
3. Secret detection:
- Scan for hardcoded credentials
- Check for API keys in code
- Verify .env files are gitignored
4. Code patterns:
- Unsafe deserialization
- Command injection risks
- Path traversal vulnerabilities
scripts:
pre:
- npm audit
- npx secret-scanner
post:
- npm run lint:security
output:
format: markdown
sections:
- severity_high
- severity_medium
- severity_low
- recommendations
triggers:
- on_pr: true
- on_push_to: [main, develop]
# .codex/skills/migration-helper.yaml
name: migration-helper
description: Help migrate between versions/frameworks
parameters:
from_version:
type: string
required: true
description: Current version or framework
to_version:
type: string
required: true
description: Target version or framework
instructions: |
Migrate codebase from {{from_version}} to {{to_version}}:
1. Analyze current usage patterns
2. Identify breaking changes between versions
3. Create migration plan
4. Implement changes systematically
5. Update dependencies
6. Fix type errors and deprecation warnings
7. Run test suite, fix failures
8. Update documentation
Approach:
- Make atomic commits for each logical change
- Preserve git history where possible
- Add migration notes as comments where behavior changed
Using skills:
# Run a skill
codex --skill security-review
# Run with parameters
codex --skill migration-helper \
--param from_version="React 17" \
--param to_version="React 19"
# List available skills
codex skills list
# Create skill from successful task
codex skills create-from-task task-abc123 --name my-new-skill
Interactive Mode
Codex's interactive REPL provides a conversational coding experience:
$ codex
Welcome to Codex! Type 'help' for commands.
codex> What does the payment processing module do?
The payment processing module (src/payments/) handles:
- Payment intent creation via Stripe
- Webhook handling for async payment events
- Refund processing
- Subscription management
Key files:
- stripe.ts: Stripe SDK wrapper
- webhooks.ts: Event handlers
- subscriptions.ts: Recurring billing logic
codex> There's a bug where refunds sometimes fail silently. Can you investigate?
I'll investigate the refund handling. Let me:
1. Read the refund code in src/payments/refunds.ts
2. Check the error handling patterns
3. Look at recent related commits
[Reading files...]
I found the issue. In `processRefund()` on line 45, errors from the Stripe
API are caught but not properly propagated:
// Current (problematic):
try {
await stripe.refunds.create({ payment_intent: intentId });
} catch (e) {
console.error(e); // Error logged but not thrown
}
// Should be:
try {
await stripe.refunds.create({ payment_intent: intentId });
} catch (e) {
console.error('Refund failed:', e);
throw new RefundError(e.message, { intentId, originalError: e });
}
Would you like me to fix this and add proper error handling throughout the payments module?
codex> Yes, fix it and add tests
Starting task in background... Agent ID: agent-7f3a2b
Use 'codex status' to monitor progress.
codex> /status
Agent agent-7f3a2b: "Fix refund error handling"
Status: In Progress (43%)
├─ ✅ Fixed src/payments/refunds.ts
├─ ✅ Added RefundError class
├─ 🔄 Writing tests for refund.test.ts
└─ ⏳ Pending: Integration test updates
codex> /quit
CLI Commands Reference
# Authentication
codex auth login # Sign in with ChatGPT account
codex auth logout # Sign out
codex auth status # Check authentication status
# Basic operations
codex "your prompt" # Quick one-shot task
codex ask "question" # Ask about codebase (no changes)
codex edit "instruction" # Edit specific files
codex task "description" # Full autonomous task
# Agent management
codex status # List all agents
codex agent view <id> # View agent details
codex agent logs <id> # Stream agent logs
codex agent cancel <id> # Cancel running agent
codex wait <id> # Wait for agent completion
codex wait --all # Wait for all agents
# Skills
codex --skill <name> # Run a skill
codex skills list # List available skills
codex skills show <name> # Show skill details
codex skills create # Create new skill
codex skills delete <name> # Delete a skill
# Configuration
codex config show # Show current config
codex config set <k> <v> # Set config value
codex init # Initialize project
# Git integration
codex pr create # Create PR from agent result
codex pr list # List Codex-created PRs
# Advanced
codex --model <name> # Use specific model
codex --sandbox-network # Enable network in sandbox
codex --timeout <mins> # Set task timeout
codex --verbose # Verbose output
codex --dry-run # Preview without executing
Configuration
Codex supports both global and project-level configuration:
# .codex/config.yaml (project-level)
# Model selection
model: gpt-5.2-codex # Default model
model_fallback: gpt-4o # Fallback if primary unavailable
# Approval modes
approval_mode: suggest # suggest | auto-edit | full-auto
# suggest: Shows diff, waits for approval
# auto-edit: Applies non-breaking changes automatically
# full-auto: Applies all changes (use with caution)
# Context configuration
context:
include:
- "src/**/*"
- "tests/**/*"
- "docs/**/*.md"
- "*.json"
- "*.yaml"
exclude:
- "node_modules/**"
- "dist/**"
- "*.log"
- ".env*"
max_files: 100 # Maximum files to include
max_file_size: 100KB # Skip files larger than this
# Skills configuration
skills:
directory: .codex/skills
enabled:
- security-review
- migration-helper
disabled:
- experimental-feature
# Sandbox configuration
sandbox:
enabled: true
network: false # Disable network by default
timeout: 1800 # 30 minutes
memory: 16GB
docker_images: # Pre-pull these images
- node:20
- postgres:15
# Git integration
git:
auto_commit: false # Auto-commit changes
commit_prefix: "[codex]" # Prefix for commits
branch_prefix: "codex/" # Prefix for branches
create_pr: prompt # prompt | auto | never
# Output preferences
output:
format: detailed # minimal | detailed | verbose
show_thinking: false # Show agent reasoning
syntax_highlight: true
# Safety settings
safety:
max_files_modified: 20 # Warn if more files changed
require_tests: true # Require tests for new code
no_force_push: true # Prevent force pushes
protected_files: # Never modify these
- ".env"
- "secrets.yaml"
# ~/.codex/config.yaml (global defaults)
# Global preferences
default_model: gpt-5.2-codex
theme: dark
editor: code # Editor for viewing diffs
# Authentication
api_base_url: https://api.openai.com/v1
# Usage limits (for cost control)
limits:
daily_tasks: 50
concurrent_agents: 3
warn_at_cost: 10.00
# Telemetry
telemetry: false
IDE Integration
VS Code Extension
The Codex VS Code extension provides a rich GUI experience:
// VS Code settings.json
{
"codex.enabled": true,
"codex.model": "gpt-5.2-codex",
// Inline suggestions (like Copilot)
"codex.inlineSuggestions.enabled": true,
"codex.inlineSuggestions.debounceMs": 200,
// Agent panel
"codex.agentPanel.position": "right",
"codex.agentPanel.showOnStart": true,
// Auto-apply settings
"codex.autoApply.readOnlyOperations": true,
"codex.autoApply.formatting": true,
"codex.autoApply.imports": true,
"codex.autoApply.codeChanges": false,
// Keybindings
"codex.keybindings.triggerAgent": "ctrl+shift+c",
"codex.keybindings.explainSelection": "ctrl+shift+e",
"codex.keybindings.fixError": "ctrl+shift+f"
}
VS Code features:
- Agent Panel: View and manage running agents
- Inline Diff Preview: See changes before applying
- Context Menu: Right-click to trigger Codex actions
- Problems Integration: Codex can auto-fix diagnostics
- Terminal Integration: Run Codex commands in integrated terminal
Cursor Integration
Codex works alongside Cursor's native features:
// Cursor settings
{
// Use Codex for agentic tasks, Cursor for autocomplete
"cursor.autocomplete.enabled": true,
"codex.inlineSuggestions.enabled": false,
// Trigger Codex from Cursor
"codex.cursorIntegration": true,
"codex.cursorHotkey": "ctrl+shift+o"
}
Advanced Usage Patterns
Batch Operations
Process multiple tasks efficiently:
# Create a batch file
cat > tasks.yaml << EOF
tasks:
- name: "Add TypeScript types"
files: ["src/legacy/*.js"]
instruction: "Convert to TypeScript with strict types"
- name: "Add error handling"
files: ["src/api/*.ts"]
instruction: "Add proper error handling with custom error classes"
- name: "Generate tests"
files: ["src/utils/*.ts"]
instruction: "Generate comprehensive unit tests"
depends_on: ["Add TypeScript types"]
EOF
# Run batch
codex batch tasks.yaml
# Monitor batch progress
codex batch status
CI/CD Integration
Use Codex in automated pipelines:
# .github/workflows/codex-review.yaml
name: Codex Code Review
on:
pull_request:
types: [opened, synchronize]
jobs:
review:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run Codex Security Review
uses: openai/codex-action@v2
with:
skill: security-review
api-key: ${{ secrets.OPENAI_API_KEY }}
- name: Run Codex Code Quality
uses: openai/codex-action@v2
with:
prompt: |
Review this PR for:
- Code quality issues
- Performance problems
- Missing tests
- Documentation gaps
comment-on-pr: true
# GitLab CI integration
codex-review:
stage: review
image: openai/codex-cli:latest
script:
- codex auth login --token $OPENAI_API_KEY
- codex --skill code-review --output=report.md
- codex --skill security-review --output=security.md
artifacts:
reports:
codequality: report.md
Custom Model Configuration
Use Codex with specific model settings:
# Use a specific model version
codex --model gpt-5.2-codex-20250115 "task description"
# Adjust generation parameters
codex --temperature 0.2 --max-tokens 8192 "precise task"
# Use with fine-tuned model
codex --model ft:gpt-5.2-codex:my-org:my-finetune "specialized task"
Pricing
Codex is included in ChatGPT subscriptions with varying limits:
| Plan | Monthly Cost | Agent Minutes | Parallel Agents | Features |
|---|---|---|---|---|
| ChatGPT Plus | $20/month | 600 min | 2 | CLI + VS Code |
| ChatGPT Pro | $200/month | Unlimited | 5 | Priority, all features |
| ChatGPT Team | $25/user/month | 1000 min/user | 3/user | Admin controls |
| Enterprise | Custom | Unlimited | 10+/user | SSO, audit logs, SLA |
API pricing (for programmatic access):
| Model | Input | Output |
|---|---|---|
| gpt-5.2-codex | $3.00/1M tokens | $12.00/1M tokens |
| gpt-5.2-codex-mini | $0.50/1M tokens | $2.00/1M tokens |
Sandbox compute:
- Included in subscription limits
- Additional compute: $0.02/minute
Codex vs Other Tools
| Capability | Codex | Claude Code | Cursor | Cline |
|---|---|---|---|---|
| Execution model | Cloud sandbox | Local | Local | Local |
| Parallel agents | ✅ 2-10 | ❌ No | ❌ No | ❌ No |
| Test execution | ✅ In sandbox | ✅ Local | ✅ Local | ✅ Local |
| Context window | 128K | 200K | 128K | Model-dependent |
| Offline use | ❌ No | ✅ Yes | ⚠️ Limited | ✅ Yes |
| IDE integration | ✅ Good | ⚠️ CLI-focused | ✅ Native | ✅ VS Code |
| Cost model | Subscription | Pay-per-use | Subscription | Pay-per-use |
When to choose Codex:
- You need parallel agent execution
- You want isolated sandbox environments
- You're already paying for ChatGPT Plus/Pro
- You need GitHub integration for PR creation
- You prefer subscription pricing over pay-per-use
When to choose alternatives:
- You need offline capability
- You require 200K+ token context
- You prefer local execution for privacy
- You want to use Claude or other non-OpenAI models
Gemini CLI
Overview
Gemini CLI is Google's terminal-first coding assistant, offering access to Gemini 2.5/3 models with a generous free tier and deep integration with Google Cloud.
Key stats:
- Free tier with Gemini 2.5 Flash
- 1M token context window
- Native multimodal support (images, diagrams)
Features Deep Dive
Interactive Mode
# Start interactive session
gemini
> Explain the architecture of this project
[Analyzes codebase, provides overview]
> Generate a REST API for user management
[Creates complete API with routes, controllers, models]
> /image architecture.png What improvements would you suggest?
[Analyzes architecture diagram, provides recommendations]
Code Generation
# Generate code from description
gemini generate "A TypeScript function that validates credit card numbers using the Luhn algorithm"
# Output:
function validateCreditCard(cardNumber: string): boolean {
// Remove spaces and dashes
const cleaned = cardNumber.replace(/[\s-]/g, '');
// Check if only digits
if (!/^\d+$/.test(cleaned)) {
return false;
}
// Luhn algorithm
let sum = 0;
let isEven = false;
for (let i = cleaned.length - 1; i >= 0; i--) {
let digit = parseInt(cleaned[i], 10);
if (isEven) {
digit *= 2;
if (digit > 9) {
digit -= 9;
}
}
sum += digit;
isEven = !isEven;
}
return sum % 10 === 0;
}
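A quick usage check (the first value is a standard Luhn-valid test number; the second fails the checksum):
console.log(validateCreditCard("4539 1488 0343 6467")); // true
console.log(validateCreditCard("1234 5678 9012 3456")); // false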
File Operations
# Explain a file
gemini explain src/complex-algorithm.ts
# Review code
gemini review --security src/auth/
# Refactor
gemini refactor "Convert callbacks to async/await" src/legacy/
# Generate tests
gemini test src/utils/validation.ts -o tests/
Gemini CLI Configuration
# ~/.gemini/config.yaml
model: gemini-2.5-flash # or gemini-3-flash for paid
temperature: 0.2
max_tokens: 8192
context:
auto_include_open_files: true
max_context_files: 50
output:
format: markdown
syntax_highlighting: true
project_detection:
enabled: true
config_files:
- package.json
- pyproject.toml
- Cargo.toml
- go.mod
Pricing
| Tier | Model | Price |
|---|---|---|
| Free | Gemini 2.5 Flash | $0 (rate limited) |
| Pay-as-you-go | Gemini 2.5 Pro | $0.075/1M input tokens |
| Pay-as-you-go | Gemini 3 Flash | $0.50/1M input tokens |
Aider (Open Source)
Overview
Aider is the leading open-source terminal-based coding assistant. It's git-native, meaning it automatically commits changes with descriptive messages.
Key stats:
- 100% open source (Apache 2.0)
- Works with any LLM (Claude, GPT-4, Llama, etc.)
- Git-native: automatic commits with good messages
- Active community and frequent updates
Features Deep Dive
Git-Native Workflow
Aider automatically commits each change:
# Start aider
aider
> Add input validation to the login form
# Aider:
# 1. Analyzes the codebase
# 2. Makes changes to relevant files
# 3. Automatically creates a git commit:
# "feat: Add input validation to login form
# - Added email format validation
# - Added password strength requirements
# - Added error message display"
Multi-Model Support
# Use Claude
aider --model claude-3-5-sonnet
# Use GPT-4
aider --model gpt-4o
# Use local Ollama model
aider --model ollama/llama3.1
# Use DeepSeek
aider --model deepseek/deepseek-chat
Watch Mode
Aider watches for file changes and responds:
# Start in watch mode
aider --watch
# Now edit files manually or with another tool
# Aider sees changes and can help integrate them
Architect Mode
For planning before coding:
aider --architect
> Plan a microservices architecture for an e-commerce platform
# Aider creates a plan, then you approve before implementation
Aider Configuration
# .aider.conf.yml
model: claude-3-5-sonnet
auto-commits: true
commit-prompt: conventional # conventional commits format
# File handling
auto-lint: true
lint-cmd: npm run lint
test-cmd: npm test
# Git settings
attribute-author: true
attribute-committer: true
dirty-commits: false
# Context
map-tokens: 1024
map-refresh: auto
subtree-only: false
Usage Examples
# Add files to chat context
aider src/auth/*.ts tests/auth/*.ts
# Ask questions
aider --message "Explain how the auth flow works"
# Make changes
aider --message "Add rate limiting to the login endpoint"
# Review and fix
aider --message "Review this code for security issues and fix them"
# Generate tests
aider --message "Add comprehensive tests for the User model"
Aider Scripting
#!/bin/bash
# automated-review.sh
# Run aider in non-interactive mode for CI
aider --yes --message "
Review the changes in this PR for:
1. Security vulnerabilities
2. Performance issues
3. Code style violations
Fix any issues found.
" --model claude-3-5-sonnet
# Check if aider made changes
if [ -n "$(git status --porcelain)" ]; then
echo "Aider found and fixed issues"
git push
fi
Pricing
Aider itself is free. You pay for the LLM API:
| Model | Cost per 1M tokens |
|---|---|
| Claude 3.5 Sonnet | $3 input / $15 output |
| GPT-4o | $2.50 input / $10 output |
| Local (Ollama) | Free |
| DeepSeek | $0.14 input / $0.28 output |
Cline
Overview
Cline is the original open-source autonomous coding agent for VS Code. It pioneered the human-in-the-loop approach where every action requires approval, making it safe for production use.
Key stats:
- 100% open-source (Apache 2.0)
- Works with any API provider (OpenAI, Anthropic, local models)
- Plan/Act mode separation for strategic thinking
- MCP (Model Context Protocol) support for extensibility
Features Deep Dive
Plan & Act Mode
Cline's unique approach separates thinking from doing:
Plan Mode (Read-Only):
├── Explore codebase
├── Analyze architecture
├── Create implementation strategy
└── No file modifications allowed
Act Mode (Execute):
├── Implement planned changes
├── Run commands
├── Create/modify files
└── Human approval for each action
# Configure different models for each mode
# Plan: Use cheaper/faster model for exploration
# Act: Use more capable model for implementation
# In Cline settings:
Plan Model: deepseek-chat (cost-effective)
Act Model: claude-sonnet-4 (high quality)
Browser Integration
Cline can interact with web browsers for testing:
// Cline can:
// 1. Launch a headless browser
// 2. Navigate to your local dev server
// 3. Click, type, scroll
// 4. Capture screenshots
// 5. Read console logs
// 6. Fix runtime errors it discovers
// Example task:
"Test the login flow and fix any errors you find"
// Cline will:
// - Start dev server
// - Open browser to localhost:3000
// - Try to log in
// - Capture any errors
// - Fix the code
// - Verify the fix works
MCP Tool Creation
Extend Cline with custom tools:
{
"mcpServers": {
"database": {
"command": "node",
"args": ["./mcp-servers/database.js"],
"tools": [
{
"name": "query_database",
"description": "Execute read-only SQL queries",
"inputSchema": {
"type": "object",
"properties": {
"query": { "type": "string" }
}
}
}
]
}
}
}
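For reference, here is a minimal sketch of what the ./mcp-servers/database.js server referenced above might look like, assuming the TypeScript MCP SDK (@modelcontextprotocol/sdk) and a hypothetical runReadOnlyQuery helper; adapt it to your actual database client:
// mcp-servers/database.ts: illustrative sketch, not Cline's own code
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
// Hypothetical helper: enforce read-only access before touching the database
async function runReadOnlyQuery(sql: string): Promise<unknown[]> {
  if (!/^\s*select\b/i.test(sql)) throw new Error("Only SELECT queries allowed");
  return []; // wire up your real database client here
}
const server = new McpServer({ name: "database", version: "1.0.0" });
// Expose the query_database tool declared in the config above
server.tool("query_database", { query: z.string() }, async ({ query }) => ({
  content: [{ type: "text", text: JSON.stringify(await runReadOnlyQuery(query)) }],
}));
await server.connect(new StdioServerTransport());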
Configuration
// .vscode/settings.json
{
"cline.apiProvider": "anthropic",
"cline.apiKey": "${env:ANTHROPIC_API_KEY}",
"cline.model": "claude-sonnet-4",
"cline.planMode": {
"enabled": true,
"model": "deepseek-chat"
},
"cline.autoApprove": {
"readFiles": true,
"listFiles": true,
"writeFiles": false,
"executeCommands": false
},
"cline.contextWindow": 128000,
"cline.maxTokens": 8192
}
Pricing
Cline is free. You pay only for API usage:
| Provider | Model | Cost |
|---|---|---|
| Anthropic | Claude Sonnet 4 | $3 input / $15 output per 1M tokens |
| OpenAI | GPT-4o | $2.50 input / $10 output per 1M tokens |
| DeepSeek | DeepSeek Chat | $0.14 input / $0.28 output per 1M tokens |
| Local | Ollama/LM Studio | Free |
Roo Code
Overview
Roo Code (formerly Roo Cline) is a fork of Cline with enhanced multi-mode capabilities. It provides specialized "personas" for different tasks and innovative features like boomerang tasks.
Key stats:
- 900K+ VS Code Marketplace installs
- Multi-mode: Code, Architect, Ask, Debug, Custom
- Boomerang tasks for complex workflows
- Stable codebase indexing
Features Deep Dive
Multi-Mode System
Different modes for different tasks:
Modes:
├── Code Mode
│ └── Implementation focused, writes code
├── Architect Mode
│ └── Planning focused, designs solutions
├── Ask Mode
│ └── Q&A focused, explains code
├── Debug Mode
│ └── Bug-finding focused, traces issues
└── Custom Modes
└── User-defined personas
// Configure different models per mode
const modeConfig = {
architect: {
model: "o3", // Best for planning
systemPrompt: "You are a senior software architect..."
},
code: {
model: "claude-sonnet-4", // Best for implementation
systemPrompt: "You are an expert programmer..."
},
debug: {
model: "gpt-4o", // Good at finding issues
systemPrompt: "You are a debugging expert..."
}
};
Boomerang Tasks
Chain complex workflows across modes:
Boomerang Task Flow:
1. User: "Add user authentication"
↓
2. Architect Mode: Plans the implementation
- Designs auth flow
- Identifies files to create/modify
- Creates task breakdown
↓
3. Code Mode: Implements each task
- Creates auth middleware
- Adds login/logout routes
- Updates database schema
↓
4. Debug Mode: Verifies implementation
- Runs tests
- Checks for security issues
- Validates edge cases
↓
5. Returns to User: Complete with summary
Codebase Indexing
Persistent understanding of your project:
# Roo Code indexes your codebase
# - Function definitions
# - Class hierarchies
# - Import relationships
# - Test coverage mapping
# Query the index
"What functions call the validateUser method?"
"Show me all API endpoints that don't have tests"
"Find all places where we handle authentication errors"
Configuration
// .roo/config.json
{
"modes": {
"architect": {
"model": "anthropic/claude-sonnet-4",
"temperature": 0.3
},
"code": {
"model": "anthropic/claude-sonnet-4",
"temperature": 0.2
},
"debug": {
"model": "openai/gpt-4o",
"temperature": 0.1
}
},
"boomerang": {
"enabled": true,
"autoTransition": true
},
"indexing": {
"enabled": true,
"excludePaths": ["node_modules", "dist", ".git"]
}
}
Pricing
Free and open-source. Pay only for AI provider APIs.
Kilo Code
Overview
Kilo Code is a fork of Roo Code with additional enterprise features, including the Orchestrator mode, Memory Bank, and generous free credits for new users.
Key stats:
- #1 on OpenRouter (6.1 trillion tokens/month)
- 750K+ active users
- $20 free credits for new users
- Access to 500+ AI models
Features Deep Dive
Orchestrator Mode
Automatically chains specialized agents:
Orchestrator Flow:
User: "Build a REST API for a todo app"
Orchestrator:
├── 1. Architect Agent
│ └── Designs API structure, endpoints, database schema
├── 2. Code Agent
│ └── Implements routes, controllers, models
├── 3. Test Agent
│ └── Writes unit and integration tests
├── 4. Documentation Agent
│ └── Generates API docs and README
└── 5. Review Agent
└── Checks for issues, suggests improvements
Memory Bank
Persistent project context across sessions:
// Memory Bank stores:
// - Project architecture decisions
// - Coding conventions used
// - Previous task context
// - User preferences
// Example: Memory Bank remembers your patterns
Session 1: "Use Prisma for database"
Session 2: "Add a new model"
// Kilo automatically uses Prisma, not raw SQL
// Memory Bank file structure
.kilo/
├── memory/
│ ├── architecture.md # Design decisions
│ ├── conventions.md # Code style
│ ├── context.md # Ongoing work
│ └── preferences.md # User preferences
Cross-Platform Sync
Work across devices:
# Start on mobile (Kilo mobile app)
"Plan a caching system for the API"
# Continue on desktop (VS Code)
"Implement the caching plan"
# Finish on laptop (Cursor)
"Add tests for the cache"
# All context is preserved across devices
Configuration
// kilo.config.json
{
"orchestrator": {
"enabled": true,
"agents": ["architect", "code", "test", "docs", "review"]
},
"memoryBank": {
"enabled": true,
"syncToCloud": true
},
"models": {
"default": "claude-sonnet-4",
"planning": "gemini-3-pro",
"coding": "claude-sonnet-4",
"review": "gpt-4o"
},
"credits": {
"warnAt": 5.00,
"hardLimit": null
}
}
Pricing
| Tier | Cost | Features |
|---|---|---|
| Free Start | $20 credit | Full access while credits last |
| Pay-as-you-go | Provider rates | No markup, transparent pricing |
| Promotions | Up to $120 free | Check kilo.ai for current offers |
Qwen Coder
Overview
Qwen 2.5 Coder is Alibaba's state-of-the-art open-source code model, available in sizes from 0.5B to 32B parameters. It's not a tool itself, but powers many tools and can be run locally.
Key stats:
- 69.6% on SWE-bench (32B model)
- 92 programming languages
- 128K context window
- Apache 2.0 license
Using Qwen Coder
With Ollama (Local)
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull Qwen Coder
ollama pull qwen2.5-coder:32b
# Run interactively
ollama run qwen2.5-coder:32b
# Or use with other tools
With Aider
# Use Qwen Coder with Aider
aider --model ollama/qwen2.5-coder:32b
# Or via OpenRouter
aider --model openrouter/qwen/qwen-2.5-coder-32b-instruct
With Cline/Roo Code/Kilo Code
// Configure in VS Code settings
{
"cline.apiProvider": "ollama",
"cline.ollamaBaseUrl": "http://localhost:11434",
"cline.model": "qwen2.5-coder:32b"
}
Direct API Usage
from openai import OpenAI
# Via OpenRouter
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key="your-openrouter-key"
)
response = client.chat.completions.create(
model="qwen/qwen-2.5-coder-32b-instruct",
messages=[
{
"role": "system",
"content": "You are an expert programmer."
},
{
"role": "user",
"content": "Write a Python function to validate email addresses"
}
]
)
print(response.choices[0].message.content)
Model Variants
| Model | Parameters | VRAM Required | Best For |
|---|---|---|---|
| qwen2.5-coder:0.5b | 0.5B | 1GB | Edge devices |
| qwen2.5-coder:1.5b | 1.5B | 2GB | Light tasks |
| qwen2.5-coder:7b | 7B | 8GB | Good balance |
| qwen2.5-coder:14b | 14B | 16GB | Strong performance |
| qwen2.5-coder:32b | 32B | 32GB | Best quality |
Benchmark Performance
| Benchmark | Qwen 2.5 Coder 32B | GPT-4o | Claude Sonnet |
|---|---|---|---|
| HumanEval | 92.7% | 90.2% | 92.0% |
| MBPP | 90.2% | 88.1% | 89.5% |
| SWE-bench | 69.6% | 68.4% | 72.7% |
| MultiPL-E | 75.2% | 73.8% | 74.1% |
Pricing
| Method | Cost |
|---|---|
| Local (Ollama) | Free (your hardware) |
| OpenRouter | $0.18 per 1M tokens |
| Together AI | $0.24 per 1M tokens |
| Alibaba Cloud | Variable |
Head-to-Head Comparison
Autocomplete Speed
Based on benchmarks (lower is better):
| Tool | Time to First Suggestion | Multi-line Support |
|---|---|---|
| Cursor (Supermaven) | ~50ms | Excellent |
| GitHub Copilot | ~150ms | Good |
| Windsurf | ~120ms | Good |
| Cline/Roo/Kilo | ~200ms (depends on model) | Good |
| Claude Code | N/A (not autocomplete-focused) | N/A |
| Gemini CLI | ~200ms | Good |
Agentic Capabilities
| Capability | Cursor | Copilot | Windsurf | Cline/Roo/Kilo | Claude Code | Codex |
|---|---|---|---|---|---|---|
| Multi-file editing | ✅ Excellent | ✅ Good | ✅ Good | ✅ Excellent | ✅ Excellent | ✅ Excellent |
| Autonomous PR creation | ⚠️ Limited | ✅ Excellent | ⚠️ Limited | ⚠️ Limited | ✅ Good | ✅ Excellent |
| Test generation & running | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes |
| Codebase understanding | ✅ Good | ✅ Good | ✅ Good | ✅ Excellent | ✅ Excellent | ✅ Good |
| Terminal command execution | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Native | ✅ Yes |
| Session memory | ⚠️ Limited | ⚠️ Limited | ✅ Excellent | ✅ Good (Kilo best) | ⚠️ Session | ⚠️ Limited |
| Browser testing | ❌ No | ❌ No | ❌ No | ✅ Yes (Cline) | ❌ No | ❌ No |
| MCP support | ⚠️ Limited | ❌ No | ❌ No | ✅ Full | ❌ No | ❌ No |
| Plan/Act separation | ❌ No | ❌ No | ❌ No | ✅ Yes | ❌ No | ❌ No |
Benchmark Performance
On SWE-bench Verified (higher is better):
| Tool/Model | Score | Notes |
|---|---|---|
| Claude Code | 80.9% | State-of-the-art |
| Cursor (Claude) | 78.2% | With Composer |
| GitHub Copilot | 72.5% | Agent mode |
| Windsurf | 71.8% | With Cascade |
| Cline (Claude Sonnet 4) | 72.7% | Depends on model used |
| Qwen 2.5 Coder 32B | 69.6% | Open-source local |
| OpenAI Codex | 69.1% | GPT-5.2-Codex |
| Gemini CLI | 68.4% | Gemini 3 Flash |
Open-Source Tool Comparison
| Feature | Cline | Roo Code | Kilo Code | Aider |
|---|---|---|---|---|
| Base | Original | Cline fork | Roo fork | Original |
| Plan/Act modes | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| Multi-mode personas | ❌ No | ✅ Yes | ✅ Yes | ✅ Architect mode |
| Memory Bank | ❌ No | ⚠️ Basic | ✅ Full | ❌ No |
| Orchestrator | ❌ No | ⚠️ Boomerang | ✅ Full | ❌ No |
| Browser use | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| MCP support | ✅ Full | ✅ Full | ✅ Full | ❌ No |
| Git integration | ⚠️ Basic | ⚠️ Basic | ⚠️ Basic | ✅ Native |
| Free credits | ❌ No | ❌ No | ✅ $20 | ❌ No |
| Marketplace installs | 2M+ | 900K+ | 750K+ | CLI only |
Cost Analysis
For a typical developer (100 requests/day, 20 working days/month):
| Tool | Monthly Cost | Cost per Request |
|---|---|---|
| Cursor Pro | $20 | ~$0.01 |
| Copilot Individual | $10 | ~$0.005 |
| Windsurf Pro | $15 | ~$0.0075 |
| Claude Code | ~$30-60 | ~$0.015-0.03 |
| Gemini CLI (Free) | $0 | $0 |
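The cost-per-request column for the flat-fee plans is simply the monthly price divided by the assumed ~2,000 requests per month; a quick sketch:
// 100 requests/day × 20 working days = 2,000 requests/month
const requestsPerMonth = 100 * 20;
console.log(20 / requestsPerMonth); // Cursor Pro: $0.01
console.log(10 / requestsPerMonth); // Copilot: $0.005
console.log(15 / requestsPerMonth); // Windsurf: $0.0075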
Choosing the Right Tool
Decision Framework
START
│
├─ Do you need GitHub integration (PRs, issues)?
│ └─ YES → GitHub Copilot
│
├─ Do you work in terminal primarily?
│ └─ YES → Claude Code or Gemini CLI
│
├─ Do you need session memory across days?
│ └─ YES → Windsurf
│
├─ Do you do complex multi-file refactoring?
│ └─ YES → Cursor or Claude Code
│
├─ Are you budget-constrained?
│ └─ YES → Gemini CLI (free) or Copilot ($10)
│
└─ DEFAULT → Cursor (best all-around)
Best Tool by Use Case
| Use Case | Recommended | Why |
|---|---|---|
| Daily coding | Cursor | Fast autocomplete + Composer |
| Open source contributions | Copilot | PR agent, GitHub integration |
| Large codebase refactoring | Claude Code | 200K context, deep understanding |
| Team collaboration | Copilot or Windsurf | Team features, shared context |
| Budget-conscious | Gemini CLI | Free tier is generous |
| DevOps/Infrastructure | Claude Code | Understands K8s, Terraform |
| Learning/Students | Copilot Free | Best free tier for IDE |
Multi-Tool Strategy
Many teams use multiple tools strategically:
# Recommended multi-tool setup
daily_development:
primary: Cursor
reason: "Fast autocomplete, Composer for features"
code_review:
primary: Claude Code
reason: "Deep codebase understanding for reviews"
pull_requests:
primary: GitHub Copilot
reason: "Native PR creation and GitHub integration"
exploration:
primary: Gemini CLI
reason: "Free, multimodal for architecture diagrams"
Advanced Tips
Cursor Power Tips
// 1. Use multi-cursor with AI
// Select multiple similar patterns, trigger AI to transform all
// 2. Chain Composer commands
/*
Composer:
1. Create a new API endpoint for /products
2. Add validation using Zod
3. Write integration tests
4. Update OpenAPI spec
*/
// 3. Use @codebase for project-wide context
// "@codebase What's the pattern for error handling here?"
Copilot Power Tips
# 1. Use workspace agents
gh copilot workspace "Add dark mode to the entire app"
# 2. Inline chat with /commands
# Select code, then:
# /explain - Understand code
# /fix - Fix bugs
# /simplify - Reduce complexity
# /optimize - Improve performance
# 3. Use Copilot in commits
git commit # Copilot suggests message based on diff
Claude Code Power Tips
# 1. Use --depth for analysis depth
claude-code ask --depth deep "Analyze security vulnerabilities"
# 2. Chain commands with pipes
claude-code explain src/auth.ts | claude-code task "Add the missing error handling"
# 3. Use templates for common tasks
claude-code task --template api-endpoint "Create /users/search"
Building Custom Integrations
VS Code Extension API
import * as vscode from 'vscode';
export function activate(context: vscode.ExtensionContext) {
// Register custom AI command
const disposable = vscode.commands.registerCommand(
'myExtension.aiRefactor',
async () => {
const editor = vscode.window.activeTextEditor;
if (!editor) return;
const selection = editor.document.getText(editor.selection);
// Call your preferred AI API
const response = await callAI({
prompt: `Refactor this code for better readability:\n\n${selection}`,
model: 'claude-sonnet-4'
});
// Replace selection with AI response
editor.edit(editBuilder => {
editBuilder.replace(editor.selection, response.code);
});
}
);
context.subscriptions.push(disposable);
}
async function callAI(params: { prompt: string; model: string }) {
// Implement API call to Claude, OpenAI, etc.
const response = await fetch('https://api.anthropic.com/v1/messages', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-api-key': process.env.ANTHROPIC_API_KEY!,
'anthropic-version': '2023-06-01'
},
body: JSON.stringify({
model: params.model,
max_tokens: 4096,
messages: [{ role: 'user', content: params.prompt }]
})
});
const data = await response.json();
// The Messages API returns the generated text in content[0].text
return { code: data.content[0].text };
}
CLI Wrapper
#!/usr/bin/env python3
"""
Custom AI coding assistant CLI that combines multiple providers.
"""
import click
import os
from anthropic import Anthropic
from openai import OpenAI
anthropic = Anthropic()
openai = OpenAI()
@click.group()
def cli():
    """Multi-provider AI coding assistant."""
    pass
@cli.command()
@click.argument('file')
@click.option('--provider', '-p', default='claude', help='AI provider')
def explain(file: str, provider: str):
    """Explain a code file."""
    with open(file, 'r') as f:
        code = f.read()
    prompt = f"Explain this code in detail:\n\n```\n{code}\n```"
    if provider == 'claude':
        response = anthropic.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}]
        )
        click.echo(response.content[0].text)
    elif provider == 'openai':
        response = openai.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}]
        )
        click.echo(response.choices[0].message.content)
@cli.command()
@click.argument('instruction')
@click.argument('files', nargs=-1)
@click.option('--dry-run', is_flag=True, help='Show changes without applying')
def edit(instruction: str, files: tuple, dry_run: bool):
    """Edit files based on instruction."""
    for file in files:
        with open(file, 'r') as f:
            original = f.read()
        prompt = f"""Edit this code according to the instruction.
Instruction: {instruction}
Code:
{original}
Return only the modified code, no explanations."""
        response = anthropic.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=8192,
            messages=[{"role": "user", "content": prompt}]
        )
        new_code = response.content[0].text
        if dry_run:
            click.echo(f"=== Changes for {file} ===")
            # Show diff
            import difflib
            diff = difflib.unified_diff(
                original.splitlines(keepends=True),
                new_code.splitlines(keepends=True),
                fromfile=f'{file} (original)',
                tofile=f'{file} (modified)'
            )
            click.echo(''.join(diff))
        else:
            with open(file, 'w') as f:
                f.write(new_code)
            click.echo(f"Updated {file}")
@cli.command()
@click.argument('description')
@click.option('--output', '-o', default='.', help='Output directory')
def generate(description: str, output: str):
    """Generate code from description."""
    prompt = f"""Generate production-ready code for:
{description}
Include:
- TypeScript types
- Error handling
- JSDoc comments
- Unit tests
Format response as:
=== filename.ts ===
<code>
=== filename.test.ts ===
<code>
"""
    response = anthropic.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=8192,
        messages=[{"role": "user", "content": prompt}]
    )
    content = response.content[0].text
    # Parse and create files
    import re
    files = re.split(r'=== (.+?) ===', content)
    for i in range(1, len(files), 2):
        filename = files[i].strip()
        code = files[i + 1].strip()
        filepath = os.path.join(output, filename)
        os.makedirs(os.path.dirname(filepath) or '.', exist_ok=True)
        with open(filepath, 'w') as f:
            f.write(code)
        click.echo(f"Created {filepath}")
if __name__ == '__main__':
    cli()
Future Outlook
2025 Trends
- Deeper autonomy: Agents that can handle entire feature requests end-to-end
- Better context: 1M+ token context becoming standard
- Specialized models: Fine-tuned models for specific frameworks/languages
- Voice integration: Code by talking to your assistant
- Multi-agent collaboration: Multiple AI agents working together
What to Watch
- OpenAI Codex evolution: successors to GPT-5.2-Codex
- Claude Code improvements: Longer context, faster inference
- Cursor evolution: More agentic capabilities
- Open source alternatives: StarCoder 2, CodeLlama improvements
Conclusion
AI coding assistants have transformed software development. The choice between tools depends on your specific needs:
- Cursor for power users wanting the best IDE experience
- GitHub Copilot for teams deeply integrated with GitHub
- Windsurf for those who value session continuity
- Claude Code for complex codebases and terminal workflows
- Gemini CLI for budget-conscious developers and Google ecosystem
Most developers will benefit from trying multiple tools and potentially using them in combination. The productivity gains from AI coding assistants—30-50% reported by most users—make them essential for modern development.