Exa Agent is in beta. It may change before launch, and every request must include the header Exa-Beta: agent-2026-05-07.
Exa Agent is an async, high-compute, usage-based endpoint that handles list building, enrichment, and deep research tasks that require dozens of structured output fields and complex reasoning. Each run can return a natural-language answer, schema-validated JSON, field-level grounding, metadata, and a cost breakdown. You can retrieve completed runs later, list past runs, replay events, or continue from a previously completed run.

When to use Exa Agent

Use Exa Agent when a workflow needs more than a single search or extraction call:
  • Build lists from open-ended criteria, then enrich each result
  • Research entities across many fields with citations
  • Run multi-hop tasks like “find companies, then find their decision makers”
  • Produce structured JSON from a long-running web research task
  • Continue from a previous run with a follow-up request like “find 10 more results”
For simpler low-latency search, start with the Search API.

Quickstart

This example starts a run that builds a structured list of people matching your criteria. It returns JSON in output.structured.

1. Install the Exa SDK

pip install exa-py

2. Set your API key

export EXA_API_KEY="YOUR_EXA_API_KEY"
SDK calls below pass betas=["agent-2026-05-07"] or betas: ["agent-2026-05-07"], which sets the required Exa-Beta header.

3. Create a run

import json
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
run = exa.beta.agent.runs.create(
    betas=["agent-2026-05-07"],
    query="Find engineering leaders at AI infrastructure companies that raised a Series A or B in the last 6 months.",
    output_schema={
        "type": "object",
        "properties": {
            "people": {
                "type": "array",
                "maxItems": 10,
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "job_title": {"type": "string"},
                        "linkedin_url": {"type": "string", "format": "uri"},
                    },
                    "required": ["name", "job_title", "linkedin_url"],
                },
            }
        },
        "required": ["people"],
    },
    effort="auto",
)

print(json.dumps(run.model_dump(), indent=2))
Add Accept: text/event-stream when creating a run to receive server-sent events as the run is queued, started, and completed. The SDK example below requests the stream with stream=True:
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
events = exa.beta.agent.runs.create(
    betas=["agent-2026-05-07"],
    query="Find five recently launched developer tools for evaluating AI agents.",
    stream=True,
)

for event in events:
    print(event.event, event.data)
id: 1
event: agent_run.created
data: {"id":"agent_run_01j...","status":"queued","createdAt":"2026-05-07T21:21:52.051Z"}

id: 2
event: agent_run.started
data: {"id":"agent_run_01j...","status":"running"}

id: 3
event: agent_run.completed
data: {"id":"agent_run_01j...","object":"agent_run","status":"completed"}
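If you consume the event stream over raw HTTP rather than through the SDK, each event arrives as a block of id, event, and data lines separated by a blank line, as in the sample above. The following is a minimal parsing sketch using only the standard library. The helper name parse_sse is hypothetical, and it assumes each event carries a single-line JSON data field, as the sample does; it is not a full server-sent events implementation.

```python
import json
from typing import Iterator


def parse_sse(raw: str) -> Iterator[dict]:
    """Yield one dict per event block in a server-sent events payload.

    Assumption: every `data:` field is a single line of JSON, as in the
    sample stream. Multi-line data fields are not handled.
    """
    for block in raw.strip().split("\n\n"):
        event = {}
        for line in block.splitlines():
            # Skip empty lines and SSE comment lines (which start with ":").
            if not line or line.startswith(":"):
                continue
            # Split on the first colon only, so JSON values stay intact.
            field, _, value = line.partition(":")
            event[field.strip()] = value.strip()
        if "data" in event:
            event["data"] = json.loads(event["data"])
        if event:
            yield event


# Shortened copy of the sample stream shown above.
sample = (
    'id: 1\n'
    'event: agent_run.created\n'
    'data: {"id":"agent_run_01j...","status":"queued"}\n'
    '\n'
    'id: 2\n'
    'event: agent_run.completed\n'
    'data: {"id":"agent_run_01j...","status":"completed"}\n'
)

events = list(parse_sse(sample))
for e in events:
    print(e["id"], e["event"], e["data"]["status"])
```

Splitting each line on the first colon only keeps the colons inside the JSON payload from being treated as field separators.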

4. Poll for completion

If you do not stream events, save the returned id and poll the run until it reaches a terminal status.
import json
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
run_id = "agent_run_01j..."
run = exa.beta.agent.runs.poll_until_finished(
    run_id,
    betas=["agent-2026-05-07"],
    poll_interval=4000,
)

print(json.dumps(run.model_dump(), indent=2))
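For readers who want to see what polling involves, here is a rough sketch of a polling loop written against a stand-in status function instead of the SDK. The names poll_until_terminal and fetch_status are hypothetical. Only the queued, running, and completed statuses appear in this guide; the other terminal status names in the code are placeholder assumptions.

```python
import time
from typing import Callable

# Assumption: "completed" is the only terminal status named in this guide.
# "failed" and "cancelled" are placeholders for other terminal states.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    interval_s: float = 4.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal status.

    Raises TimeoutError once the accumulated wait reaches timeout_s.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"run still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for a real status lookup: replays a fixed sequence of statuses.
calls = []
fake_statuses = iter(["queued", "running", "running", "completed"])


def fake_fetch() -> str:
    status = next(fake_statuses)
    calls.append(status)
    return status


# sleep is replaced with a no-op so the demo finishes immediately.
result = poll_until_terminal(fake_fetch, sleep=lambda seconds: None)
print(result, len(calls))
```

Passing sleep as a parameter keeps the loop testable without real delays; in real use you would leave the default in place and have fetch_status look up the run by its saved id.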
Completed runs include:
  • output.text: a natural-language answer
  • output.structured: validated JSON when you provide outputSchema
  • output.grounding: citations for text or structured fields, when emitted
  • costDollars: the run’s cost breakdown

Return structured JSON

Use outputSchema when you need /agent to return results in a specific format. When you specify an outputSchema, the run returns an object matching it in output.structured. outputSchema supports the JSON Schema specification.
To request contact information, describe the desired contact fields in outputSchema. Use standard JSON Schema shapes such as { "type": "string", "format": "email" } for email addresses, { "type": "string", "format": "phone" } for phone numbers, and { "type": "string", "format": "uri" } for URLs. Bound list sizes with maxItems when possible so that the maximum contact-enrichment cost is predictable.
import json
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
run = exa.beta.agent.runs.create(
    betas=["agent-2026-05-07"],
    query="Find AI infrastructure companies that raised a Series A or B in the last 6 months.",
    effort="auto",
    output_schema={
        "type": "object",
        "properties": {
            "companies": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "round": {"type": "string"},
                        "website": {"type": "string"},
                    },
                    "required": ["name", "round"],
                },
            }
        },
        "required": ["companies"],
    },
)
run = exa.beta.agent.runs.poll_until_finished(
    run.id,
    betas=["agent-2026-05-07"],
)

print(json.dumps(run.output.structured if run.output else None, indent=2))
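The contact-field guidance in this section can be captured in a small schema builder. The sketch below is plain Python with no SDK calls. The function names contact_list_schema and max_contact_fields and the contacts property name are hypothetical; the format values are the ones named in the text above. The second helper shows why maxItems matters: without it, the number of contact lookups has no upper bound.

```python
def contact_list_schema(max_items: int) -> dict:
    """Build an output schema for a bounded list of contacts."""
    return {
        "type": "object",
        "properties": {
            "contacts": {
                "type": "array",
                "maxItems": max_items,  # bounds the number of enriched rows
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "email": {"type": "string", "format": "email"},
                        "phone": {"type": "string", "format": "phone"},
                        "website": {"type": "string", "format": "uri"},
                    },
                    "required": ["name"],
                },
            }
        },
        "required": ["contacts"],
    }


def max_contact_fields(schema: dict) -> dict:
    """Count the most email and phone fields a schema can request.

    Only looks at top-level array properties, which is all this sketch
    needs. Raises ValueError for an array that has no maxItems.
    """
    counts = {"email": 0, "phone": 0}
    for name, prop in schema.get("properties", {}).items():
        if prop.get("type") != "array":
            continue
        max_items = prop.get("maxItems")
        if max_items is None:
            raise ValueError(f"array {name!r} has no maxItems; count is unbounded")
        for field in prop.get("items", {}).get("properties", {}).values():
            fmt = field.get("format")
            if fmt in counts:
                counts[fmt] += max_items
    return counts


schema = contact_list_schema(10)
counts = max_contact_fields(schema)
print(counts)
```

The resulting schema dictionary can be passed as output_schema in the same way as the example above.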

Process input rows

Use input.data when you have existing data that you want to enrich. You can add fields to each entity, surface more entities based on the data you bring in, or both. In the example below, the input is a list of companies and the run produces a research brief for each one.
import json
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
run = exa.beta.agent.runs.create(
    betas=["agent-2026-05-07"],
    query="For each input company, produce a concise research brief with an overview and source URLs.",
    input={
        "data": [
            {"company": "Ramp", "domain": "ramp.com"},
            {"company": "Mercury", "domain": "mercury.com"},
        ]
    },
)

print(json.dumps(run.model_dump(), indent=2))

Process exclusions

Use input.exclusion to exclude certain entries from being surfaced in the run. In the example below, we want to look for the top 10 cutest animals, but we exclude goats and pandas from the run because we already know how cute they are.
import json
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
run = exa.beta.agent.runs.create(
    betas=["agent-2026-05-07"],
    query="Find the top 10 cutest animals. Return each animal's common name and a source URL.",
    input={
        "exclusion": [
            {"animal": "goat"},
            {"animal": "panda"},
        ]
    },
)

print(json.dumps(run.model_dump(), indent=2))
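When the exclusion list comes from rows you already hold, it helps to deduplicate them before sending. This is a small sketch under stated assumptions: build_exclusions is a hypothetical helper, the rows are made-up sample data, and lowercasing the values is a choice made here to match the example above, not documented matching behavior.

```python
def build_exclusions(rows: list, key: str) -> list:
    """Turn existing rows into a deduplicated exclusion list for one key.

    Values are trimmed and lowercased so "Goat" and " goat " collapse to
    one entry. Rows without the key are skipped.
    """
    seen = set()
    exclusions = []
    for row in rows:
        value = row.get(key)
        if value is None:
            continue
        normalized = str(value).strip().lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            exclusions.append({key: normalized})
    return exclusions


# Made-up rows standing in for results you already have.
known_rows = [
    {"animal": "Goat"},
    {"animal": "panda"},
    {"animal": " goat "},
    {"name": "row without an animal key"},
]

agent_input = {"exclusion": build_exclusions(known_rows, "animal")}
print(agent_input)
```

The resulting dictionary has the same shape as the input argument in the example above.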

Continue from a previous run

Use previousRunId to ask a follow-up about a previous run’s response. Follow-up runs share the same run ID as the previousRunId supplied.
import json
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
run = exa.beta.agent.runs.create(
    betas=["agent-2026-05-07"],
    query="Narrow that list to companies hiring in San Francisco.",
    previous_run_id="agent_run_01j...",
)

print(json.dumps(run.model_dump(), indent=2))

Find a run ID

List recent runs and inspect their statuses:
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
runs = exa.beta.agent.runs.list(
    betas=["agent-2026-05-07"],
    limit=10,
)

for run in runs.data:
    query = (run.request or {}).get("query", "")
    print(f"{run.id}\t{run.status}\t{run.created_at}\t{query}")

Production workflows

Most production workflows center around three fields:
  • query: what the agent should do
  • input.data: the data the agent should process
  • outputSchema: the shape of the result you want back
In this example, Agent takes in the companies listed in input.data, produces a concise research brief for each one, and returns structured JSON that matches the outputSchema.
import json
from exa_py import Exa

exa = Exa(api_key="YOUR_EXA_API_KEY")
run = exa.beta.agent.runs.create(
    betas=["agent-2026-05-07"],
    # Describe the work Agent should do for each input row.
    query="For each input company, produce a concise research brief with an overview and source URLs.",
    # Provide the records Agent should research or enrich.
    input={
        "data": [
            {"company": "Ramp", "domain": "ramp.com"},
            {"company": "Mercury", "domain": "mercury.com"},
        ]
    },
    # Require a predictable JSON shape in output.structured after completion.
    output_schema={
        "type": "object",
        "required": ["reports"],
        "properties": {
            "reports": {
                "type": "array",
                "items": {
                    "type": "object",
                    "required": ["company", "domain", "overview", "sourceUrls"],
                    "properties": {
                        "company": {"type": "string"},
                        "domain": {"type": "string"},
                        "overview": {"type": "string"},
                        "sourceUrls": {
                            "type": "array",
                            "items": {"type": "string", "format": "uri"},
                            "minItems": 1,
                        },
                    },
                },
            }
        },
    },
)

# Save run.id and poll until status is completed before reading output.structured.
print(json.dumps(run.model_dump(), indent=2))
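Once a run like this completes, a production pipeline will usually want to confirm that every input row came back with a report. The check below is a sketch only: missing_reports is a hypothetical helper, the structured value is hand-written sample data in the shape of the schema above rather than real output, and matching rows on domain is an assumption about how you would join input to output.

```python
def missing_reports(input_rows: list, structured, key: str = "domain") -> list:
    """Return the input rows that have no matching entry in structured["reports"].

    `structured` is expected to look like output.structured for the schema
    above. None is treated as an empty result.
    """
    reports = (structured or {}).get("reports", [])
    reported = {report.get(key) for report in reports}
    return [row for row in input_rows if row.get(key) not in reported]


input_rows = [
    {"company": "Ramp", "domain": "ramp.com"},
    {"company": "Mercury", "domain": "mercury.com"},
]

# Hand-written sample in the shape of the schema above, not real output.
sample_structured = {
    "reports": [
        {
            "company": "Ramp",
            "domain": "ramp.com",
            "overview": "placeholder overview",
            "sourceUrls": ["https://example.com"],
        }
    ]
}

missing = missing_reports(input_rows, sample_structured)
print([row["company"] for row in missing])
```

Rows returned by the check can be sent again as input.data in a new run.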

Pricing

Exa Agent is in beta and pricing may change before launch.
Costs are usage-based and priced by component:
Component | Price
Agent Compute Units | 1 ACU / $0.0001
Search tool calls | $7 / 1,000 searches
Contact enrichment is separate from the core pricing components above: email contact enrichment is $0.02 / email, and phone number contact enrichment is $0.07 / phone number.
usage.agentComputeUnits measures model computation across the full run. More complex queries, or queries that contain a large input.data field, generally take more reasoning steps, make more tool calls, and consume more ACUs.
Your Agent concurrency limit is one fifth of your account QPS. For pay-as-you-go accounts with default QPS, this means two active Agent runs at a time.
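The component prices above combine into a simple sum. The following is a back-of-the-envelope sketch, not a billing formula: it reads "1 ACU / $0.0001" as $0.0001 per ACU, the usage figures in the example are invented, and rounding the concurrency limit down is an assumption. The function names are hypothetical.

```python
from decimal import Decimal

# Prices taken from the pricing text above.
ACU_PRICE = Decimal("0.0001")                  # assumed to mean dollars per ACU
SEARCH_PRICE = Decimal("7") / Decimal("1000")  # $7 per 1,000 searches
EMAIL_PRICE = Decimal("0.02")                  # per enriched email
PHONE_PRICE = Decimal("0.07")                  # per enriched phone number


def estimate_run_cost(acus: int, searches: int, emails: int = 0, phones: int = 0) -> Decimal:
    """Sum the usage-based components for one run, in dollars."""
    return (
        acus * ACU_PRICE
        + searches * SEARCH_PRICE
        + emails * EMAIL_PRICE
        + phones * PHONE_PRICE
    )


def max_concurrent_runs(account_qps: int) -> int:
    """One fifth of account QPS, rounded down (rounding is an assumption)."""
    return account_qps // 5


# Invented usage figures for illustration only.
cost = estimate_run_cost(acus=20_000, searches=150, emails=10, phones=10)
print(f"${cost:.2f}")

# A QPS of 10 gives two concurrent runs, which matches the stated default
# limit; the default QPS value itself is inferred, not stated above.
print(max_concurrent_runs(10))
```

Decimal is used instead of float so that small per-unit prices add up without binary rounding noise.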

Effort

Use effort to set a cost and reasoning effort preference for a run. Supported values are low, medium, high, xhigh, and auto; the default is auto. If an effort is set, each run is capped at the following costs:
Effort | Price*
low | $25 / 1,000 searches
medium | $100 / 1,000 searches
high | $500 / 1,000 searches
xhigh | $2,000 / 1,000 searches
*Email and phone enrichment is additional and is not included in fixed effort pricing.
