
Google ADK: Building Multi-Agent Systems with Agent Development Kit

A comprehensive guide to Google's Agent Development Kit (ADK)—building agents, creating tools, orchestrating multi-agent systems with subagents, and deploying to production. Includes real examples from the official adk-samples repository.


What is Google ADK?

Agent Development Kit (ADK) is Google's open-source framework for building, evaluating, and deploying AI agents. Introduced at Google Cloud NEXT 2025, ADK powers agents within Google products like Agentspace and Google Customer Engagement Suite (CES).

2025 ADK updates: The framework continues to evolve rapidly—ADK TypeScript v0.2.0 is now officially released for TypeScript/JavaScript developers, and ADK Go v0.3.0 adds agent-to-agent request callbacks and improved extensibility. Four languages are now production-ready: Python, TypeScript, Go, and Java.

Why ADK matters: While frameworks like LangChain and LlamaIndex focus on chains and RAG, ADK was designed specifically for multi-agent systems from the ground up. It treats agent coordination, state management, and workflow orchestration as first-class concerns—not afterthoughts. The architecture provides three primary agent types: LLM Agents (the "brains" using models like Gemini), Workflow Agents (the "managers" orchestrating tasks), and Custom Agents (the "specialists" for specific logic).

The code-first philosophy: ADK makes agent development feel like regular software development. You define agents as Python/TypeScript/Go/Java classes, compose them using familiar patterns (sequential, parallel, hierarchical), and deploy them with standard tooling. No DSLs, no magic—just code.

Rich ecosystem integration: ADK provides a rich tool ecosystem including pre-built tools (Search, Code Execution), Model Context Protocol (MCP) tools, 3rd-party library integration (LangChain, LlamaIndex), and the ability to use other agents as tools (LangGraph, CrewAI). For models, ADK works with Gemini, plus LiteLLM integration for Anthropic, Meta, Mistral, AI21 Labs, and more.

| Feature | ADK | LangChain/LangGraph | CrewAI |
|---|---|---|---|
| Multi-agent native | Yes | Via LangGraph | Yes |
| Model agnostic | Yes (Gemini, GPT-4, Claude) | Yes | Yes |
| Workflow agents | SequentialAgent, ParallelAgent, LoopAgent | LangGraph nodes | Sequential/hierarchical |
| Built-in streaming | Bidirectional audio/video | Token streaming | Token streaming |
| Deployment target | Vertex AI Agent Engine | Any | Any |
| Languages | Python, TypeScript, Go, Java | Python, TypeScript | Python |

Official Resources:

  • ADK Documentation
  • ADK Samples Repository
  • Google Cloud ADK Docs

Installation and Setup

Python Setup

Bash
# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # macOS/Linux
# Windows: .venv\Scripts\activate

# Install ADK
pip install google-adk

TypeScript Setup

Bash
mkdir my-adk-agent && cd my-adk-agent
npm init -y

# Install ADK packages
npm install @google/adk @google/adk-devtools
npm install -D typescript @types/node

Authentication

Create a .env file in your project root:

Bash
# Option 1: Google AI Studio (simpler, for development)
GOOGLE_GENAI_USE_VERTEXAI=FALSE
GOOGLE_API_KEY=your_api_key_here

# Option 2: Vertex AI (for production)
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1

Get your API key from Google AI Studio for development, or configure Vertex AI credentials for production deployments.


Understanding ADK Architecture

Before diving into code, let's understand ADK's core building blocks:

Code
ADK Architecture
├── Agents              # Execution units (LLM-powered or workflow)
│   ├── LlmAgent        # Uses LLM for reasoning and decisions
│   ├── SequentialAgent # Executes sub-agents in order
│   ├── ParallelAgent   # Runs sub-agents concurrently
│   └── LoopAgent       # Repeats until condition met
├── Tools               # Capabilities agents can invoke
│   ├── FunctionTool    # Custom Python/TS functions
│   ├── AgentTool       # Other agents as callable tools
│   └── Built-in        # Google Search, Code Execution, etc.
├── Sessions            # Conversation state management
│   ├── State           # Current conversation data
│   └── Memory          # Cross-session knowledge
└── Runner              # Orchestration engine

The key insight: ADK separates what (agents with their capabilities) from how (workflow agents that orchestrate execution). This separation enables complex multi-agent architectures without spaghetti code.


Creating Your First Agent

Project Structure

ADK expects a specific project structure:

Code
my_agent/
├── __init__.py         # Makes it a Python package
├── agent.py            # Agent definition (must export root_agent)
└── .env                # Environment variables

Basic Agent Definition

The simplest ADK agent has three components: a name, a model, and instructions. Everything else—tools, subagents, output schemas—builds on this foundation.

Python
# my_agent/agent.py
from google.adk.agents import Agent

root_agent = Agent(
    name="helpful_assistant",
    model="gemini-2.0-flash",
    description="A helpful assistant that answers questions.",
    instruction="""You are a helpful assistant.

    When users ask questions:
    1. Think through the problem step by step
    2. Provide clear, concise answers
    3. Ask for clarification if the question is ambiguous
    """
)

Understanding the parameters:

  • name: Unique identifier used in multi-agent systems for routing and logging. Avoid reserved names like user.

  • model: The LLM powering the agent. ADK supports Gemini models natively, plus GPT-4, Claude, and others via LiteLLM integration.

  • description: A concise summary of what this agent does. Critical for multi-agent systems—other agents read this description to decide whether to delegate tasks here.

  • instruction: The system prompt that defines the agent's behavior, personality, constraints, and output format. This is where you spend most of your time tuning agent behavior.

Running the Agent

ADK provides multiple ways to run your agent:

Bash
# Interactive web UI (best for development)
adk web

# Terminal chat interface
adk run my_agent

# API server (for integration)
adk api_server

The web UI at http://localhost:8000 provides conversation history, state inspection, and debugging tools—invaluable during development.


Creating Tools

Tools extend agent capabilities beyond the LLM's built-in knowledge. ADK automatically generates tool schemas from your function signatures.

Function Tools

The simplest way to create a tool: write a Python function with type hints and a docstring. ADK inspects the function to generate an LLM-compatible schema.

Python
from google.adk.agents import Agent

def get_weather(city: str, unit: str = "celsius") -> dict:
    """
    Get the current weather for a city.

    Args:
        city: The name of the city (e.g., "New York", "London")
        unit: Temperature unit - "celsius" or "fahrenheit" (default: celsius)

    Returns:
        dict with status, temperature, and conditions
    """
    # In production, call a real weather API
    weather_data = {
        "new york": {"temp": 22, "conditions": "Sunny"},
        "london": {"temp": 15, "conditions": "Cloudy"},
        "tokyo": {"temp": 28, "conditions": "Humid"},
    }

    city_lower = city.lower()
    if city_lower in weather_data:
        data = weather_data[city_lower]
        temp = data["temp"] if unit == "celsius" else (data["temp"] * 9/5) + 32
        return {
            "status": "success",
            "city": city,
            "temperature": temp,
            "unit": unit,
            "conditions": data["conditions"]
        }

    return {
        "status": "error",
        "error_message": f"Weather data not available for {city}"
    }


def get_current_time(timezone: str = "UTC") -> dict:
    """
    Get the current time in a specific timezone.

    Args:
        timezone: IANA timezone name (e.g., "America/New_York", "Europe/London")

    Returns:
        dict with status and current time
    """
    from datetime import datetime
    import pytz

    try:
        tz = pytz.timezone(timezone)
        current_time = datetime.now(tz)
        return {
            "status": "success",
            "timezone": timezone,
            "time": current_time.strftime("%Y-%m-%d %H:%M:%S"),
            "day_of_week": current_time.strftime("%A")
        }
    except pytz.UnknownTimeZoneError:
        return {
            "status": "error",
            "error_message": f"Invalid timezone: {timezone}"
        }


# Create agent with tools
root_agent = Agent(
    name="weather_time_agent",
    model="gemini-2.0-flash",
    description="Provides weather and time information",
    instruction="""You are a helpful assistant that provides weather and time information.

    When users ask about weather:
    - Use the get_weather tool to fetch current conditions
    - Always mention both temperature and conditions
    - Offer to convert units if the user might prefer different units

    When users ask about time:
    - Use the get_current_time tool with the appropriate timezone
    - If the user doesn't specify a timezone, ask for clarification
    - Include the day of the week in your response
    """,
    tools=[get_weather, get_current_time]
)

Key principles for tool design:

  • Type hints are required: ADK uses them to generate the tool schema. Without type hints, the LLM won't know what parameters to provide.

  • Docstrings are critical: The docstring becomes the tool description sent to the LLM. A vague docstring leads to incorrect tool usage. Be specific about what the tool does, what inputs it expects, and what it returns.

  • Return dictionaries with status: Include a status field ("success", "error", "pending") so the agent can handle failures gracefully. Non-dict returns are wrapped automatically, but explicit structure is clearer.

  • Keep parameters simple: Favor primitive types (str, int, float, bool) over complex objects. The LLM needs to generate these values from natural language.

TypeScript Tools

TypeScript requires explicit tool definition with Zod schemas:

TypeScript
import { FunctionTool, LlmAgent } from '@google/adk';
import { z } from 'zod';

const getWeather = new FunctionTool({
  name: 'get_weather',
  description: 'Get the current weather for a city',
  parameters: z.object({
    city: z.string().describe('The city name'),
    unit: z.enum(['celsius', 'fahrenheit']).default('celsius')
  }),
  execute: async ({ city, unit }) => {
    // Implementation here
    return {
      status: 'success',
      city,
      temperature: 22,
      unit,
      conditions: 'Sunny'
    };
  }
});

export const rootAgent = new LlmAgent({
  name: 'weather_agent',
  model: 'gemini-2.0-flash',
  description: 'Provides weather information',
  instruction: 'You help users with weather queries.',
  tools: [getWeather]
});

Built-in Tools

ADK provides pre-built tools for common capabilities:

Python
from google.adk.agents import Agent
from google.adk.tools import google_search, code_execution

# Agent with Google Search
search_agent = Agent(
    name="research_agent",
    model="gemini-2.0-flash",
    instruction="Research topics using Google Search.",
    tools=[google_search]
)

# Agent with code execution
code_agent = Agent(
    name="code_agent",
    model="gemini-2.0-flash",
    instruction="Write and execute Python code to solve problems.",
    tools=[code_execution]
)

Available built-in tools:

  • Google Search: Web search via Gemini
  • Code Execution: Run Python code in a sandbox
  • Vertex AI Search: Enterprise search over your data
  • BigQuery: Query data warehouses
  • RAG Engine: Retrieval-augmented generation
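
Vertex AI Search, for instance, grounds an agent in your own indexed documents. A minimal sketch, assuming an existing data store (the ID below is a placeholder you would replace with your own):

Python
from google.adk.agents import Agent
from google.adk.tools import VertexAiSearchTool

# Point the tool at an existing Vertex AI Search data store (placeholder path)
docs_search = VertexAiSearchTool(
    data_store_id="projects/your-project/locations/global/collections/default_collection/dataStores/your-store"
)

docs_agent = Agent(
    name="docs_agent",
    model="gemini-2.0-flash",
    instruction="Answer questions using only the company document store.",
    tools=[docs_search]
)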

Multi-Agent Systems

This is where ADK shines. Complex applications compose multiple agents—each specialized for specific tasks—into coordinated systems.

Agent Hierarchy with Subagents

The fundamental pattern: A parent agent has sub_agents that it can delegate to. The parent decides which subagent should handle each request based on descriptions and instructions.

Python
from google.adk.agents import Agent

# Specialized subagents
billing_agent = Agent(
    name="billing_agent",
    model="gemini-2.0-flash",
    description="Handles billing questions, invoices, refunds, and payment issues.",
    instruction="""You are a billing specialist.

    You can help with:
    - Explaining charges and invoices
    - Processing refund requests
    - Updating payment methods
    - Resolving billing disputes

    Always verify the customer's identity before discussing account details.
    For refunds over $100, escalate to a human supervisor.
    """
)

technical_support_agent = Agent(
    name="technical_support",
    model="gemini-2.0-flash",
    description="Handles technical issues, bugs, troubleshooting, and product questions.",
    instruction="""You are a technical support specialist.

    You can help with:
    - Troubleshooting product issues
    - Explaining features and how to use them
    - Reporting bugs to the engineering team
    - Providing workarounds for known issues

    Always ask for error messages and steps to reproduce before diagnosing.
    """
)

general_info_agent = Agent(
    name="general_info",
    model="gemini-2.0-flash",
    description="Answers general questions about products, policies, and company information.",
    instruction="""You answer general questions about our company and products.

    You can help with:
    - Product information and comparisons
    - Company policies (returns, shipping, etc.)
    - Store locations and hours
    - General FAQs
    """
)

# Coordinator agent that routes to specialists
root_agent = Agent(
    name="customer_service_coordinator",
    model="gemini-2.0-flash",
    description="Main customer service agent that routes requests to specialists.",
    instruction="""You are the front-line customer service coordinator.

    Your job is to:
    1. Greet customers warmly
    2. Understand their needs
    3. Route them to the appropriate specialist:
       - billing_agent: For payment, invoice, or refund questions
       - technical_support: For product issues or troubleshooting
       - general_info: For general questions about products or policies

    If you're unsure which specialist to use, ask clarifying questions first.
    Always ensure a smooth handoff with context about what the customer needs.
    """,
    sub_agents=[billing_agent, technical_support_agent, general_info_agent]
)

How delegation works:

  1. User sends a message to the coordinator
  2. Coordinator's LLM reads the message and subagent descriptions
  3. LLM decides to delegate by generating transfer_to_agent(agent_name='billing_agent')
  4. ADK intercepts this, finds the target agent, and transfers the conversation
  5. The subagent handles the request and can transfer back or to another subagent

The description is crucial: The coordinator's LLM uses descriptions to decide where to route. Write descriptions that clearly differentiate each agent's capabilities.

Workflow Agents

For deterministic flows (not LLM-decided), ADK provides workflow agents:

SequentialAgent

Executes subagents in order, passing state between them:

Python
from google.adk.agents import Agent, SequentialAgent

# Step 1: Fetch data
data_fetcher = Agent(
    name="data_fetcher",
    model="gemini-2.0-flash",
    instruction="Fetch relevant data for the user's question.",
    output_key="fetched_data"  # Saves output to state
)

# Step 2: Analyze data
analyzer = Agent(
    name="analyzer",
    model="gemini-2.0-flash",
    instruction="""Analyze the data in {fetched_data}.

    Provide insights and patterns you observe.
    """,
    output_key="analysis"
)

# Step 3: Generate report
reporter = Agent(
    name="reporter",
    model="gemini-2.0-flash",
    instruction="""Based on the analysis in {analysis},
    generate a concise report for the user.

    Include:
    - Key findings
    - Recommendations
    - Next steps
    """
)

# Pipeline executes in order
root_agent = SequentialAgent(
    name="analysis_pipeline",
    sub_agents=[data_fetcher, analyzer, reporter]
)

Understanding output_key: When an agent has output_key="fetched_data", its final response is automatically saved to state['fetched_data']. Subsequent agents can access this via {fetched_data} in their instructions. This is how data flows through the pipeline.

ParallelAgent

Runs subagents concurrently for independent tasks:

Python
from google.adk.agents import Agent, ParallelAgent, SequentialAgent

# These run in parallel
fetch_weather = Agent(
    name="weather_fetcher",
    model="gemini-2.0-flash",
    instruction="Get the weather forecast for the user's location.",
    output_key="weather_data",
    tools=[get_weather]
)

fetch_news = Agent(
    name="news_fetcher",
    model="gemini-2.0-flash",
    instruction="Get the top news headlines.",
    output_key="news_data",
    tools=[get_news]
)

fetch_calendar = Agent(
    name="calendar_fetcher",
    model="gemini-2.0-flash",
    instruction="Get today's calendar events.",
    output_key="calendar_data",
    tools=[get_calendar]
)

# Parallel fetching
parallel_fetch = ParallelAgent(
    name="parallel_fetcher",
    sub_agents=[fetch_weather, fetch_news, fetch_calendar]
)

# Synthesize results
synthesizer = Agent(
    name="synthesizer",
    model="gemini-2.0-flash",
    instruction="""Create a morning briefing from:

    Weather: {weather_data}
    News: {news_data}
    Calendar: {calendar_data}

    Make it concise and actionable.
    """
)

# Full pipeline: parallel fetch, then synthesize
root_agent = SequentialAgent(
    name="morning_briefing",
    sub_agents=[parallel_fetch, synthesizer]
)

Why parallel matters: Each fetch might take 1-2 seconds. Running sequentially: 3-6 seconds. Running in parallel: 1-2 seconds total. For user-facing applications, this latency reduction is significant.

LoopAgent

Repeats execution until a condition is met:

Python
from google.adk.agents import Agent, LoopAgent

# Agent that checks a condition
checker = Agent(
    name="quality_checker",
    model="gemini-2.0-flash",
    instruction="""Review the draft in {current_draft}.

    If quality is acceptable, respond with APPROVED.
    If improvements needed, respond with specific feedback.
    """,
    output_key="feedback"
)

# Agent that improves based on feedback
improver = Agent(
    name="improver",
    model="gemini-2.0-flash",
    instruction="""Improve the draft based on feedback: {feedback}

    Original draft: {current_draft}

    Create an improved version.
    """,
    output_key="current_draft"
)

# Loop until approved (max 5 iterations)
# Assumes state['current_draft'] was seeded by an earlier step or initial input
root_agent = LoopAgent(
    name="refinement_loop",
    max_iterations=5,
    sub_agents=[improver, checker]
)

Loop termination: Loops exit when max_iterations is reached OR when any subagent returns an event with escalate=True. Use this for quality gates, polling, or iterative refinement.
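
In practice, the checker exits the loop by setting that flag from a tool. A minimal sketch, assuming the checker is given this tool and instructed to call it on approval (the exit_loop name is illustrative):

Python
from google.adk.tools import ToolContext

def exit_loop(tool_context: ToolContext) -> dict:
    """Stop the refinement loop. Call only when the draft is APPROVED."""
    # escalate=True ends the enclosing LoopAgent after this turn
    tool_context.actions.escalate = True
    return {"status": "success"}

# checker = Agent(..., tools=[exit_loop],
#                 instruction="... If quality is acceptable, call exit_loop.")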

Agent as Tool (AgentTool)

Sometimes you want an agent callable as a tool rather than a delegation target:

Python
from google.adk.agents import Agent
from google.adk.tools import AgentTool

# Specialist agent
code_reviewer = Agent(
    name="code_reviewer",
    model="gemini-2.0-flash",
    instruction="""Review the provided code for:
    - Bugs and potential issues
    - Security vulnerabilities
    - Performance concerns
    - Style and readability

    Provide specific, actionable feedback.
    """
)

# Wrap as a tool
code_review_tool = AgentTool(agent=code_reviewer)

# Main agent uses the reviewer as a tool
root_agent = Agent(
    name="coding_assistant",
    model="gemini-2.0-flash",
    instruction="""You help users write code.

    When you generate code, always use the code_reviewer tool
    to check it before presenting to the user.
    """,
    tools=[code_review_tool]
)

AgentTool vs sub_agents:

  • sub_agents: The LLM decides to transfer the entire conversation. The subagent takes over.
  • AgentTool: The LLM calls it like any other tool, gets a result, and continues its own response.

Use AgentTool when you want the main agent to orchestrate; use sub_agents when you want full delegation.


Sessions and State Management

Understanding State

State is data stored within a conversation session. Agents read from and write to state to share information:

Python
from google.adk.agents import Agent
from google.adk.sessions import InMemorySessionService

# Agent that writes to state
greeter = Agent(
    name="greeter",
    model="gemini-2.0-flash",
    instruction="""Greet the user and ask for their name.
    When they provide their name, confirm it.
    """,
    output_key="user_name"  # Saves response to state['user_name']
)

# Agent that reads from state
personalizer = Agent(
    name="personalizer",
    model="gemini-2.0-flash",
    instruction="""Create a personalized welcome message for {user_name}.

    Include:
    - A warm welcome using their name
    - Suggestions for what they can do
    """
)

State scope: State is scoped to a session. Different users (different sessions) have isolated state. Within a session, all agents share the same state.
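
Tools can read and write this shared state directly through their ToolContext, alongside the output_key mechanism. A brief sketch:

Python
from google.adk.tools import ToolContext

def save_preference(category: str, value: str, tool_context: ToolContext) -> dict:
    """Save a user preference into session state."""
    # Read existing preferences (or start fresh), update, and write back
    prefs = tool_context.state.get("preferences", {})
    prefs[category] = value
    tool_context.state["preferences"] = prefs
    return {"status": "success", "saved": {category: value}}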

Running with Sessions

Python
from google.adk.agents import Agent
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.genai import types

# Create session service
session_service = InMemorySessionService()

# Create runner
runner = Runner(
    agent=root_agent,
    session_service=session_service,
    app_name="my_app"
)

# Create a session and chat with the agent
# (create_session is async in recent ADK releases)
async def main():
    session = await session_service.create_session(
        app_name="my_app",
        user_id="user_123"
    )

    async def chat(user_message: str) -> str:
        content = types.Content(role="user", parts=[types.Part(text=user_message)])
        final_text = ""
        # run_async yields events; the final-response event carries the answer
        async for event in runner.run_async(
            user_id="user_123",
            session_id=session.id,
            new_message=content
        ):
            if event.is_final_response() and event.content and event.content.parts:
                final_text = event.content.parts[0].text
        return final_text

    print(await chat("Hello!"))

Session services: InMemorySessionService is for development only—data is lost on restart. For production, use persistent backends like Firestore, Cloud SQL, or Redis.
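
For example, ADK ships a DatabaseSessionService for SQLAlchemy-compatible databases. A minimal sketch (the connection URL is a placeholder):

Python
from google.adk.sessions import DatabaseSessionService

# Any SQLAlchemy-style URL works: SQLite locally, Postgres/Cloud SQL in production
session_service = DatabaseSessionService(db_url="sqlite:///./sessions.db")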


Callbacks and Events

Callbacks let you hook into the agent's execution at predefined points—observing, modifying, or controlling behavior without altering the core framework. This is essential for logging, guardrails, caching, and custom business logic.

Callback Types

ADK provides six callback hooks:

| Callback | When It Fires | Common Use Cases |
|---|---|---|
| before_agent | Before agent processes request | Input validation, logging, authentication |
| after_agent | After agent completes | Output transformation, logging |
| before_model | Before LLM call | Prompt modification, caching |
| after_model | After LLM response | Response filtering, logging |
| before_tool | Before tool execution | Permission checks, rate limiting |
| after_tool | After tool completes | Result transformation, logging |

Implementing Callbacks

The key mechanism: Returning None allows default behavior to continue. Returning a specific object overrides the default step entirely—this is how you implement caching, guardrails, and custom routing.

Python
from google.adk.agents import Agent
from google.adk.agents.callback_context import CallbackContext
from google.genai.types import Content, Part
from typing import Optional

# Logging callback - observes but doesn't modify
async def log_before_model(
    callback_context: CallbackContext,
    llm_request
) -> Optional[any]:
    """Log all LLM requests for debugging and monitoring."""
    print(f"[{callback_context.agent_name}] LLM Request:")
    print(f"  Messages: {len(llm_request.contents)} turns")
    print(f"  Tools: {[t.name for t in (llm_request.tools or [])]}")

    # Return None to allow normal LLM call
    return None


# Caching callback - returns cached response to skip LLM
async def cache_before_model(
    callback_context: CallbackContext,
    llm_request
) -> Optional[any]:
    """Return cached responses for repeated queries."""
    # Generate cache key from request. 'cache' below is an app-level store
    # (a dict, Redis client, etc.) assumed to be defined elsewhere.
    cache_key = hash(str(llm_request.contents[-1]))

    cached_response = cache.get(cache_key)
    if cached_response:
        print(f"Cache hit for {cache_key}")
        # Return LlmResponse to skip the actual LLM call
        return cached_response

    # Return None to proceed with LLM call
    return None


# Guardrail callback - blocks dangerous operations
async def guardrail_before_tool(
    callback_context: CallbackContext,
    tool_name: str,
    tool_args: dict
) -> Optional[dict]:
    """Block dangerous tool operations."""

    # Block file deletion
    if tool_name == "delete_file":
        return {
            "status": "error",
            "error_message": "File deletion is not allowed. Please contact an administrator."
        }

    # Block certain paths
    if tool_name in ["read_file", "write_file"]:
        path = tool_args.get("path", "")
        if "/etc/" in path or "/root/" in path:
            return {
                "status": "error",
                "error_message": f"Access to {path} is not permitted."
            }

    # Allow other operations
    return None


# Output filtering callback - modifies responses
async def filter_after_model(
    callback_context: CallbackContext,
    llm_response
) -> Optional[any]:
    """Filter PII from model responses."""
    import re

    # Get the response text
    if llm_response.content and llm_response.content.parts:
        for part in llm_response.content.parts:
            if hasattr(part, 'text') and part.text:
                # Redact email addresses
                part.text = re.sub(
                    r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
                    '[EMAIL REDACTED]',
                    part.text
                )
                # Redact phone numbers
                part.text = re.sub(
                    r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b',
                    '[PHONE REDACTED]',
                    part.text
                )

    return llm_response


# Register callbacks on the agent
root_agent = Agent(
    name="secure_assistant",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant.",
    before_model_callback=log_before_model,
    after_model_callback=filter_after_model,
    before_tool_callback=guardrail_before_tool
)

Understanding callback context: The CallbackContext provides access to:

  • agent_name: Which agent is executing
  • session: The current session object
  • state: Session state for reading/writing data
  • Artifact and memory services when configured

When to use callbacks vs. instructions: Use instructions for behavior the LLM should learn. Use callbacks for hard constraints that must never be violated, regardless of what the LLM decides. Guardrails are safer as callbacks because the LLM can't be prompt-injected into bypassing them.


Bidirectional Streaming

ADK's killer feature is bidirectional (bidi) streaming—real-time voice and video conversations with agents. While other frameworks focus on text, ADK integrates directly with Gemini's Live API for multimodal interactions.

What Bidi-Streaming Enables

  • Voice conversations: Natural speech input and output
  • Video understanding: Process camera feeds in real-time
  • Interruption handling: Users can interrupt agent responses mid-speech
  • Low latency: Sub-second response times for natural conversation

Basic Streaming Setup

Python
from google.adk.agents import Agent
from google.adk.streaming import LiveRequestQueue, run_live
from google.adk.runners import Runner
import asyncio

# Create a standard agent
agent = Agent(
    name="voice_assistant",
    model="gemini-2.0-flash",  # Must support live API
    instruction="""You are a voice assistant.

    Keep responses concise (1-2 sentences) for natural conversation.
    If interrupted, acknowledge and pivot to the new topic.
    """
)

async def voice_conversation():
    """Run a live voice conversation with the agent."""

    # Create the request queue for sending audio/text
    request_queue = LiveRequestQueue()

    # Start the live session
    async with run_live(
        agent=agent,
        request_queue=request_queue,
        response_modalities=["AUDIO", "TEXT"]  # Get both audio and text
    ) as live_session:

        # Send text message (or audio bytes)
        await request_queue.send_text("Hello, what can you help me with?")

        # Process responses
        async for event in live_session:
            if event.type == "audio":
                # Play audio through speakers
                play_audio(event.audio_data)
            elif event.type == "text":
                # Display text transcript
                print(f"Agent: {event.text}")
            elif event.type == "interrupted":
                # User interrupted - agent will stop and listen
                print("(interrupted)")

# Run the conversation
asyncio.run(voice_conversation())

WebSocket Server for Web Apps

For web applications, create a WebSocket server that bridges browser audio to ADK:

Python
from fastapi import FastAPI, WebSocket
from google.adk.agents import Agent
from google.adk.streaming import LiveRequestQueue, run_live
import asyncio

app = FastAPI()

agent = Agent(
    name="web_voice_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful voice assistant for our website."
)

@app.websocket("/ws/voice")
async def voice_websocket(websocket: WebSocket):
    """WebSocket endpoint for voice conversations."""
    await websocket.accept()

    request_queue = LiveRequestQueue()

    async with run_live(
        agent=agent,
        request_queue=request_queue,
        response_modalities=["AUDIO", "TEXT"]
    ) as live_session:

        # Task to receive from browser
        async def receive_from_browser():
            while True:
                data = await websocket.receive_bytes()
                # Forward audio to ADK
                await request_queue.send_audio(data)

        # Task to send to browser
        async def send_to_browser():
            async for event in live_session:
                if event.type == "audio":
                    await websocket.send_bytes(event.audio_data)
                elif event.type == "text":
                    await websocket.send_json({
                        "type": "transcript",
                        "text": event.text
                    })

        # Run both tasks concurrently
        await asyncio.gather(
            receive_from_browser(),
            send_to_browser()
        )

Production considerations:

  • Use the ADK Bidi-streaming Demo as a reference implementation
  • Implement proper audio format handling (sample rate, encoding)
  • Add authentication to WebSocket connections
  • Handle reconnection for dropped connections
  • Monitor latency and audio quality metrics

Long-Term Memory

While sessions handle short-term conversation state, the MemoryService enables agents to remember information across multiple sessions—giving them persistent knowledge about users, past interactions, and learned facts.

Memory vs. State

| Aspect | State (Session) | Memory (Long-term) |
|---|---|---|
| Scope | Single conversation | Across all conversations |
| Lifetime | Session duration | Persistent |
| Access | Direct read/write | Search-based retrieval |
| Use case | Current conversation context | User preferences, past decisions, learned facts |

Setting Up Memory

Python
from google.adk.agents import Agent
from google.adk.memory import InMemoryMemoryService, VertexAiMemoryBankService
from google.adk.tools import PreloadMemoryTool, LoadMemoryTool
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService

# Development: In-memory (non-persistent)
memory_service = InMemoryMemoryService()

# Production: Vertex AI Memory Bank (persistent, semantic search)
# memory_service = VertexAiMemoryBankService(
#     project="your-project-id",
#     location="us-central1",
#     agent_engine_id="your-agent-engine-id"
# )

# Agent with memory tools
agent = Agent(
    name="memory_agent",
    model="gemini-2.0-flash",
    instruction="""You are a personal assistant with long-term memory.

    At the start of conversations:
    - Use PreloadMemory to recall relevant information about this user
    - Reference past conversations when relevant

    During conversations:
    - Remember important facts the user shares (preferences, decisions, etc.)
    - Use LoadMemory if you need to recall something specific

    Be natural about memory - don't constantly remind users what you remember,
    but do use past context to provide better assistance.
    """,
    tools=[
        PreloadMemoryTool(),  # Auto-loads relevant memories at conversation start
        LoadMemoryTool()       # Agent can explicitly search memories
    ]
)

# Configure runner with memory service
session_service = InMemorySessionService()
runner = Runner(
    agent=agent,
    session_service=session_service,
    memory_service=memory_service,
    app_name="my_app"
)

How Memory Works

Ingestion: After a session ends, you can add it to long-term memory:

Python
# After a conversation ends, store it in memory
async def end_session(session_id: str):
    session = await session_service.get_session(
        app_name="my_app",
        user_id="user_123",
        session_id=session_id
    )

    # Add the session's contents to long-term memory
    await memory_service.add_session_to_memory(session)

Retrieval: Agents search memory using the provided tools:

Python
# PreloadMemoryTool automatically runs a search like:
# "What do I know about this user?"

# LoadMemoryTool lets the agent search explicitly:
# Agent decides to call: load_memory(query="user's dietary restrictions")
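
You can also query the memory service directly, outside any agent. A sketch, assuming the memory_service configured earlier:

Python
# Search stored memories for a user (returns matching entries)
response = await memory_service.search_memory(
    app_name="my_app",
    user_id="user_123",
    query="dietary restrictions"
)
for memory in response.memories:
    print(memory)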

Memory in Practice

Code
# First conversation (new user)
user: "I'm allergic to peanuts and prefer vegetarian food."
agent: "Got it! I'll remember that you're allergic to peanuts and prefer vegetarian meals."

# [Session ends, added to memory]

# Second conversation (days later)
user: "Can you recommend a restaurant for tonight?"
agent: "I remember you're vegetarian and allergic to peanuts.
        Let me find restaurants that accommodate those needs..."
# [Agent used PreloadMemoryTool which retrieved past preferences]

Memory best practices:

  • Only store information users would expect to be remembered
  • Implement memory expiration for time-sensitive facts
  • Allow users to view and delete their stored memories (privacy)
  • Use Vertex AI Memory Bank for production (semantic search is much better than keyword matching)

Artifacts: File and Binary Handling

Artifacts enable agents to work with files, images, audio, and other binary data—not just text. They're versioned, can be scoped to sessions or users, and integrate with cloud storage for production.

Artifact Basics

Python
from google.adk.agents import Agent
from google.adk.artifacts import InMemoryArtifactService, GcsArtifactService
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
import google.genai.types as types

# Development: In-memory
artifact_service = InMemoryArtifactService()

# Production: Google Cloud Storage
# artifact_service = GcsArtifactService(bucket_name="my-artifacts-bucket")

# Configure runner with artifact service
runner = Runner(
    agent=root_agent,
    session_service=InMemorySessionService(),
    artifact_service=artifact_service,
    app_name="my_app"
)

Saving Artifacts from Tools

Python
from google.adk.tools import ToolContext
import google.genai.types as types

async def generate_report(
    context: ToolContext,
    report_type: str,
    data: dict
) -> dict:
    """
    Generate a PDF report and save it as an artifact.

    Args:
        report_type: Type of report ("sales", "inventory", etc.)
        data: Data to include in the report
    """
    # Generate PDF bytes (using your preferred library)
    pdf_bytes = create_pdf_report(report_type, data)

    # Create artifact from bytes
    artifact = types.Part.from_bytes(
        data=pdf_bytes,
        mime_type="application/pdf"
    )

    # Save to artifact service
    # Returns version number (1, 2, 3, etc. for each save)
    version = await context.save_artifact(
        filename=f"{report_type}_report.pdf",
        artifact=artifact
    )

    return {
        "status": "success",
        "message": f"Report saved as {report_type}_report.pdf (version {version})",
        "filename": f"{report_type}_report.pdf"
    }


async def analyze_image(
    context: ToolContext,
    image_description: str
) -> dict:
    """
    Load a previously saved image for analysis.

    Args:
        image_description: Description to identify which image
    """
    # Load the artifact
    image_artifact = await context.load_artifact("uploaded_image.png")

    if not image_artifact:
        return {
            "status": "error",
            "error_message": "No image found. Please upload an image first."
        }

    # Access the binary data
    image_bytes = image_artifact.inline_data.data
    mime_type = image_artifact.inline_data.mime_type

    # Process the image...
    analysis_result = process_image(image_bytes)

    return {
        "status": "success",
        "analysis": analysis_result
    }

Artifact Namespacing

Python
# Session-scoped (default): Only accessible within this session
await context.save_artifact("report.pdf", artifact)

# User-scoped: Accessible across all user's sessions
# Prefix filename with "user:"
await context.save_artifact("user:profile_picture.png", artifact)

Session-scoped artifacts: Temporary files for the current conversation (drafts, intermediate results).

User-scoped artifacts: Persistent files tied to the user (profile pictures, saved documents).

Listing Available Artifacts

Python
async def list_user_files(context: ToolContext) -> dict:
    """List all artifacts available in this session."""

    # list_artifacts returns the available artifact filenames
    filenames = await context.list_artifacts()

    return {
        "status": "success",
        "files": filenames
    }

MCP Tools: Model Context Protocol Integration

The Model Context Protocol (MCP) is an open standard for LLM-to-tool communication. ADK can use existing MCP servers as tool providers, giving your agents access to a growing ecosystem of pre-built integrations.

Using MCP Servers in ADK

Python
from google.adk.agents import Agent
from google.adk.tools.mcp_tool import McpToolset
from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams
from mcp import StdioServerParameters
import os

# Connect to a filesystem MCP server
filesystem_tools = McpToolset(
    connection_params=StdioConnectionParams(
        server_params=StdioServerParameters(
            command="npx",
            args=[
                "-y",
                "@modelcontextprotocol/server-filesystem",
                "/Users/me/documents"  # Directory to expose
            ]
        )
    ),
    # Only expose specific tools (security best practice)
    tool_filter=["read_file", "list_directory", "search_files"]
)

# Connect to GitHub MCP server
github_tools = McpToolset(
    connection_params=StdioConnectionParams(
        server_params=StdioServerParameters(
            command="npx",
            args=["-y", "@modelcontextprotocol/server-github"],
            env={"GITHUB_TOKEN": os.environ["GITHUB_TOKEN"]}
        )
    ),
    tool_filter=["list_repos", "get_file_contents", "create_issue"]
)

# Agent with MCP tools
agent = Agent(
    name="dev_assistant",
    model="gemini-2.0-flash",
    instruction="""You are a developer assistant with access to:

    1. Local filesystem (read-only, in ~/documents)
       - Use list_directory to explore
       - Use read_file to view file contents
       - Use search_files to find files

    2. GitHub integration
       - Use list_repos to see available repositories
       - Use get_file_contents to read repo files
       - Use create_issue to report bugs or request features

    Always confirm with the user before creating issues.
    """,
    tools=[filesystem_tools, github_tools]
)

Available MCP Servers

The MCP ecosystem includes servers for:

| Category | Servers |
|---|---|
| Development | GitHub, GitLab, filesystem, Git |
| Productivity | Notion, Linear, Slack, Google Drive |
| Data | PostgreSQL, SQLite, BigQuery |
| Cloud | AWS, GCP, Kubernetes |
| AI/ML | Hugging Face, Qdrant, Pinecone |

Find more at the MCP Servers Registry.

Creating an MCP Server from ADK Tools

You can also expose ADK tools as an MCP server:

Python
from mcp.server.lowlevel import Server
from mcp import types as mcp_types
from google.adk.tools.function_tool import FunctionTool
from google.adk.tools.mcp_tool.conversion_utils import adk_to_mcp_tool_type
import json

# Your ADK tool
def calculate_mortgage(
    principal: float,
    annual_rate: float,
    years: int
) -> dict:
    """Calculate monthly mortgage payment."""
    monthly_rate = annual_rate / 100 / 12
    num_payments = years * 12
    payment = principal * (monthly_rate * (1 + monthly_rate)**num_payments) / \
              ((1 + monthly_rate)**num_payments - 1)
    return {"monthly_payment": round(payment, 2)}

# Wrap as FunctionTool
adk_tool = FunctionTool(calculate_mortgage)

# Create MCP server
app = Server("mortgage-calculator-server")

@app.list_tools()
async def list_tools() -> list[mcp_types.Tool]:
    """Advertise available tools to MCP clients."""
    return [adk_to_mcp_tool_type(adk_tool)]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[mcp_types.Content]:
    """Handle tool invocations from MCP clients."""
    if name == adk_tool.name:
        result = await adk_tool.run_async(args=arguments, tool_context=None)
        return [mcp_types.TextContent(type="text", text=json.dumps(result))]
    raise ValueError(f"Unknown tool: {name}")

# Run server (stdio transport)
if __name__ == "__main__":
    import asyncio
    from mcp.server.stdio import stdio_server

    async def main():
        # stdio_server yields (read, write) streams for the MCP session
        async with stdio_server() as (read_stream, write_stream):
            await app.run(
                read_stream,
                write_stream,
                app.create_initialization_options()
            )

    asyncio.run(main())

OpenAPI Tools: Auto-Generate from API Specs

If you have an OpenAPI (Swagger) specification for a REST API, ADK can automatically generate tools from it—no manual function definitions needed.

Basic OpenAPI Integration

Python
from google.adk.agents import Agent
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset

# Load your OpenAPI spec
openapi_spec = """
openapi: 3.0.0
info:
  title: Pet Store API
  version: 1.0.0
paths:
  /pets:
    get:
      operationId: listPets
      summary: List all pets
      parameters:
        - name: limit
          in: query
          schema:
            type: integer
      responses:
        '200':
          description: A list of pets
  /pets/{petId}:
    get:
      operationId: getPet
      summary: Get a pet by ID
      parameters:
        - name: petId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: A pet
"""

# Create toolset from spec
pet_api_tools = OpenAPIToolset(
    spec_str=openapi_spec,
    spec_str_type="yaml"
)

# Agent automatically gets list_pets and get_pet tools
agent = Agent(
    name="pet_store_agent",
    model="gemini-2.0-flash",
    instruction="""You help users find information about pets in our store.

    Use the list_pets tool to show available pets.
    Use the get_pet tool to get details about a specific pet.
    """,
    tools=[pet_api_tools]
)

OpenAPI with Authentication

Python
from google.adk.tools.openapi_tool.openapi_spec_parser.openapi_toolset import OpenAPIToolset
from google.adk.tools.openapi_tool.auth import ApiKeyAuth, BearerAuth, OAuth2Auth

# API Key authentication
toolset = OpenAPIToolset(
    spec_str=openapi_spec,
    spec_str_type="json",
    auth_scheme="apiKey",
    auth_credential=ApiKeyAuth(
        api_key="your-api-key",
        header_name="X-API-Key"
    )
)

# Bearer token authentication
toolset = OpenAPIToolset(
    spec_str=openapi_spec,
    spec_str_type="json",
    auth_scheme="bearer",
    auth_credential=BearerAuth(token="your-bearer-token")
)

# OAuth2 (for complex flows)
toolset = OpenAPIToolset(
    spec_str=openapi_spec,
    spec_str_type="json",
    auth_scheme="oauth2",
    auth_credential=OAuth2Auth(
        client_id="your-client-id",
        client_secret="your-client-secret",
        token_url="https://auth.example.com/token"
    )
)

How tool names are generated: ADK converts operationId to snake_case (e.g., listPets → list_pets). If no operationId is present, it generates names from the path and method.

Tool descriptions: Automatically extracted from the summary and description fields in your OpenAPI spec. Write good API documentation, and your agent tools will be well-documented too.


Error Handling Patterns

Agents face many potential failures: tools fail, APIs time out, LLMs hallucinate, users provide invalid input. Robust error handling is essential for production agents.

Tool-Level Error Handling

Always return structured errors from tools:

Python
def search_database(query: str, limit: int = 10) -> dict:
    """Search the product database."""

    # Validate inputs
    if not query or len(query) < 2:
        return {
            "status": "error",
            "error_type": "validation",
            "error_message": "Query must be at least 2 characters."
        }

    if limit < 1 or limit > 100:
        return {
            "status": "error",
            "error_type": "validation",
            "error_message": "Limit must be between 1 and 100."
        }

    try:
        results = database.search(query, limit=limit)
        return {
            "status": "success",
            "results": results,
            "count": len(results)
        }
    except ConnectionError as e:
        return {
            "status": "error",
            "error_type": "connection",
            "error_message": "Database connection failed. Please try again.",
            "retry_after_seconds": 5
        }
    except TimeoutError as e:
        return {
            "status": "error",
            "error_type": "timeout",
            "error_message": "Search timed out. Try a more specific query."
        }
    except Exception as e:
        # Log the actual error for debugging
        logger.exception(f"Unexpected error in search_database: {e}")
        return {
            "status": "error",
            "error_type": "internal",
            "error_message": "An unexpected error occurred. Please try again."
        }

Agent Instructions for Error Handling

Include error handling guidance in instructions:

Python
agent = Agent(
    name="resilient_agent",
    model="gemini-2.0-flash",
    instruction="""You are a helpful assistant.

    ## Error Handling

    When a tool returns an error:
    1. Don't apologize excessively - be matter-of-fact
    2. Explain what went wrong in simple terms
    3. Suggest alternatives if available
    4. For "retry_after_seconds" errors, tell the user to wait

    When you're uncertain:
    1. Don't guess or make up information
    2. Tell the user what you don't know
    3. Suggest how they might find the answer

    When user input is unclear:
    1. Ask clarifying questions
    2. Provide examples of valid input
    """,
    tools=[search_database]
)

Callbacks for Centralized Error Handling

Python
async def handle_tool_errors(
    callback_context: CallbackContext,
    tool_name: str,
    tool_result: dict
) -> Optional[dict]:
    """Centralized error handling for all tools."""

    if tool_result.get("status") == "error":
        error_type = tool_result.get("error_type", "unknown")

        # Log all errors
        logger.error(
            f"Tool error: {tool_name}",
            extra={
                "error_type": error_type,
                "error_message": tool_result.get("error_message"),
                "session_id": callback_context.session.id
            }
        )

        # Track error metrics
        metrics.increment(
            "tool_errors",
            tags={"tool": tool_name, "error_type": error_type}
        )

        # For certain errors, modify the response
        if error_type == "authentication":
            return {
                "status": "error",
                "error_message": "Session expired. Please log in again."
            }

    return tool_result  # Return original result

agent = Agent(
    name="monitored_agent",
    model="gemini-2.0-flash",
    instruction="...",
    after_tool_callback=handle_tool_errors
)

Graceful Degradation

Design multi-agent systems to handle subagent failures:

Python
from google.adk.agents import Agent, SequentialAgent

# Primary data source
primary_search = Agent(
    name="primary_search",
    model="gemini-2.0-flash",
    instruction="Search the primary database.",
    tools=[primary_db_search],
    output_key="search_results"
)

# Fallback if primary fails
fallback_search = Agent(
    name="fallback_search",
    model="gemini-2.0-flash",
    instruction="""Check if {search_results} contains an error.
    If so, search the backup database instead.
    If primary succeeded, just pass through the results.
    """,
    tools=[backup_db_search],
    output_key="final_results"
)

# Pipeline with fallback
search_pipeline = SequentialAgent(
    name="resilient_search",
    sub_agents=[primary_search, fallback_search]
)

Testing and Evaluation

ADK provides built-in evaluation capabilities—essential for ensuring agents work correctly before deployment and don't regress over time.

Test File Structure

Create test files with expected inputs and outputs:

JSON
// tests/weather_agent.test.json
{
  "name": "Weather Agent Tests",
  "description": "Test cases for the weather agent",
  "eval_cases": [
    {
      "name": "basic_weather_query",
      "conversation": [
        {
          "role": "user",
          "content": "What's the weather in New York?"
        }
      ],
      "expected_tool_calls": [
        {
          "tool_name": "get_weather",
          "arguments": {"city": "New York"}
        }
      ],
      "expected_response_contains": ["New York", "temperature"],
      "reference_response": "The weather in New York is currently sunny with a temperature of 22°C."
    },
    {
      "name": "unknown_city",
      "conversation": [
        {
          "role": "user",
          "content": "What's the weather in Atlantis?"
        }
      ],
      "expected_tool_calls": [
        {
          "tool_name": "get_weather",
          "arguments": {"city": "Atlantis"}
        }
      ],
      "expected_response_contains": ["not available", "don't have"]
    }
  ]
}

Running Tests

Via CLI:

Bash
# Run all tests for an agent
adk eval my_agent tests/

# Run specific test file
adk eval my_agent tests/weather_agent.test.json

# Show detailed results
adk eval my_agent tests/ --print_detailed_results

Via pytest:

Python
import pytest
from google.adk.evaluation.agent_evaluator import AgentEvaluator

@pytest.mark.asyncio
async def test_weather_agent():
    results = await AgentEvaluator.evaluate(
        agent_module="my_agent",
        eval_dataset_file_path_or_dir="tests/weather_agent.test.json"
    )

    # Check overall pass rate
    assert results.pass_rate >= 0.9, f"Pass rate too low: {results.pass_rate}"

    # Check specific metrics
    for case_result in results.case_results:
        assert case_result.tool_trajectory_score >= 0.8
        assert case_result.response_match_score >= 0.7

Via Web UI:

Bash
adk web  # Navigate to Evaluation tab

Evaluation Metrics

| Metric | What It Measures | When to Use |
|---|---|---|
| tool_trajectory_avg_score | Exact match of tool call sequence | CI/CD, regression testing |
| response_match_score | ROUGE-1 similarity to reference | Quick quality checks |
| final_response_match_v2 | LLM-judged semantic similarity | Nuanced evaluation |
| rubric_based_final_response_quality_v1 | Quality without reference response | Exploratory testing |
| hallucinations_v1 | Groundedness to context | Safety checks |
| safety_v1 | Harmful content detection | Safety checks |
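
Thresholds for these metrics are typically set in a test_config.json alongside your test files. A sketch (exact field names may vary across ADK versions):

JSON
{
  "criteria": {
    "tool_trajectory_avg_score": 1.0,
    "response_match_score": 0.8
  }
}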

Multi-Turn Conversation Tests

JSON
{
  "name": "multi_turn_conversation",
  "conversation": [
    {"role": "user", "content": "I want to book a flight to Paris."},
    {"role": "assistant", "content": "I'd be happy to help you book a flight to Paris. What date are you looking to travel?"},
    {"role": "user", "content": "Next Friday."},
    {"role": "assistant", "content": "And which city will you be departing from?"},
    {"role": "user", "content": "New York."}
  ],
  "expected_tool_calls": [
    {
      "tool_name": "search_flights",
      "arguments": {
        "origin": "New York",
        "destination": "Paris"
      }
    }
  ]
}

Continuous Integration

YAML
# .github/workflows/agent-tests.yml
name: Agent Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install google-adk pytest pytest-asyncio
          pip install -r requirements.txt

      - name: Run agent evaluations
        env:
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
        run: |
          adk eval my_agent tests/ --print_detailed_results

      - name: Run pytest
        run: pytest tests/ -v

Real-World Examples from adk-samples

The google/adk-samples repository contains production-ready examples across multiple domains:

Customer Service Agent

A multi-agent system for handling customer inquiries:

Code
agents/python/customer-service/
├── agent.py           # Main coordinator
├── agents/
│   ├── billing.py     # Billing specialist
│   ├── technical.py   # Technical support
│   └── general.py     # General inquiries
├── tools/
│   ├── crm.py         # Customer data tools
│   └── ticketing.py   # Support ticket tools
└── prompts/
    └── system.py      # Shared prompt templates

Data Science Agent

An agent that performs data analysis:

Code
agents/python/data-science/
├── agent.py
├── tools/
│   ├── pandas_tools.py    # DataFrame operations
│   ├── viz_tools.py       # Chart generation
│   └── stats_tools.py     # Statistical analysis
└── notebooks/             # Example analyses

Travel Concierge

Multi-agent travel planning:

Code
agents/python/travel-concierge/
├── agent.py               # Coordinator
├── agents/
│   ├── flights.py         # Flight search/booking
│   ├── hotels.py          # Hotel recommendations
│   ├── activities.py      # Local activities
│   └── itinerary.py       # Trip planning
└── tools/
    ├── search_apis.py     # External API integrations
    └── calendar.py        # Schedule management

Running the Samples

Bash
# Clone the samples repository
git clone https://github.com/google/adk-samples.git
cd adk-samples

# Navigate to a sample
cd agents/python/customer-service

# Install dependencies
pip install -r requirements.txt

# Set up environment
cp .env.example .env
# Edit .env with your API key

# Run with web UI
adk web

# Or run in terminal
adk run .

Advanced Configuration

Model Configuration

Fine-tune LLM behavior:

Python
from google.adk.agents import Agent
from google.genai.types import GenerateContentConfig

agent = Agent(
    name="creative_writer",
    model="gemini-2.0-flash",
    instruction="Write creative stories based on user prompts.",
    generate_content_config=GenerateContentConfig(
        temperature=0.9,        # Higher = more creative
        max_output_tokens=2000,
        top_p=0.95,
        top_k=40
    )
)
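
As noted earlier, the model parameter also accepts a LiteLLM wrapper for non-Gemini models. A sketch (assumes pip install litellm and the provider's API key in your environment):

Python
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm

# Route to an OpenAI model via LiteLLM; any LiteLLM model string works here
agent = Agent(
    name="gpt_agent",
    model=LiteLlm(model="openai/gpt-4o"),
    instruction="You are a helpful assistant."
)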

Structured Output

Force agents to respond in specific formats:

Python
from pydantic import BaseModel
from typing import List

class TaskList(BaseModel):
    tasks: List[str]
    priority: str
    estimated_hours: float

agent = Agent(
    name="task_planner",
    model="gemini-2.0-flash",
    instruction="Create task lists based on user goals.",
    output_schema=TaskList  # Enforces JSON output matching schema
)

# Note: setting output_schema disables tool calling and agent transfer for
# this agent; it must produce the structured JSON response directly.

Planning Modes

For complex reasoning, enable thinking/planning:

Python
from google.adk.agents import Agent
from google.adk.planners import BuiltInPlanner
from google.genai import types

agent = Agent(
    name="complex_reasoner",
    model="gemini-2.0-flash-thinking",  # Thinking-enabled model
    instruction="Solve complex problems step by step.",
    planner=BuiltInPlanner(
        thinking_config=types.ThinkingConfig(
            thinking_budget=1024,      # Tokens reserved for thinking
            include_thoughts=True      # Include reasoning in response
        )
    )
)

Deployment to Vertex AI

For production, deploy to Vertex AI Agent Engine:

Bash
# Install the Vertex AI SDK
pip install google-cloud-aiplatform

# Deploy
adk deploy --project=your-project-id --region=us-central1

Vertex AI Agent Engine provides:

  • Managed infrastructure (no servers to maintain)
  • Auto-scaling based on traffic
  • Integrated monitoring and logging
  • Version management
  • A/B testing capabilities
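
Deployment can also be driven programmatically from the Vertex AI SDK. A sketch of the documented pattern (project, region, and bucket are placeholders):

Python
import vertexai
from vertexai import agent_engines

vertexai.init(
    project="your-project-id",
    location="us-central1",
    staging_bucket="gs://your-staging-bucket"
)

# Package and deploy the root agent to Agent Engine
remote_app = agent_engines.create(
    agent_engine=root_agent,
    requirements=["google-cloud-aiplatform[adk,agent_engines]"]
)
print(remote_app.resource_name)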

ADK vs Other Frameworks

| Aspect | ADK | LangChain/LangGraph | CrewAI |
|---|---|---|---|
| Primary focus | Multi-agent systems | General LLM apps | Role-based agents |
| Agent composition | Sequential, Parallel, Loop | Graph-based | Sequential, hierarchical |
| State management | Built-in sessions | Manual or LangGraph | Limited |
| Streaming | Bidirectional audio/video | Token streaming | Token streaming |
| Deployment | Vertex AI native | Any | Any |
| Model support | Gemini + LiteLLM | All major | All major |
| Learning curve | Moderate | Steep | Easy |
| Production readiness | High (powers Google products) | High | Medium |

When to choose ADK:

  • Building multi-agent systems with complex coordination
  • Deploying on Google Cloud / Vertex AI
  • Need bidirectional streaming (voice, video)
  • Want production-ready framework from day one

When to choose alternatives:

  • Already invested in LangChain ecosystem
  • Need maximum model/deployment flexibility
  • Simpler single-agent applications

Best Practices

Agent Design

  1. Single responsibility: Each agent should do one thing well. Split complex behaviors into multiple agents.

  2. Clear descriptions: Write descriptions that differentiate agents. Other agents (and debugging tools) rely on these.

  3. Explicit instructions: Be specific about what the agent should and shouldn't do. Include examples for complex behaviors.

  4. Graceful degradation: Handle tool failures and unexpected inputs. Include fallback behaviors in instructions.

Tool Design

  1. Simple parameters: Use primitive types. Avoid complex objects the LLM might struggle to construct.

  2. Comprehensive docstrings: The docstring is the tool's manual for the LLM. Be thorough.

  3. Status in returns: Always include a status field so agents can handle failures.

  4. Idempotent when possible: Tools might be called multiple times. Design for this.

Multi-Agent Systems

  1. Start simple: Begin with a single agent, add complexity only when needed.

  2. Test in isolation: Each subagent should work standalone before composition.

  3. Monitor state: Use the web UI to inspect state flow between agents.

  4. Log everything: In production, log tool calls, transfers, and state changes.


Conclusion

Google ADK provides a production-ready framework for building multi-agent systems. Its strengths:

  • Multi-agent native: Sequential, parallel, and hierarchical workflows built-in
  • Production proven: Powers Google's own agent products
  • Developer friendly: Code-first approach with excellent tooling
  • Flexible deployment: Local development to Vertex AI production

The adk-samples repository provides excellent starting points for common use cases. Start there, understand the patterns, then build your own.


Enrico Piovano, PhD

Co-founder & CTO at Goji AI. Former Applied Scientist at Amazon (Alexa & AGI), focused on Agentic AI and LLMs. PhD in Electrical Engineering from Imperial College London. Gold Medalist at the National Mathematical Olympiad.
