Product · May 25, 2026 · 8 min read

Autonomous Mapping Agents — Hand Off the Boring Half

By Mapping Engineering

Most of what a mapping operator does day-to-day is procedural. A new inventory file lands → trigger a job → poll until done → eyeball the unmatched rows → search the golden dataset for plausible candidates → file a proposed match → check coverage didn't slip → repeat for the next supplier. None of those steps is interesting. All of them are necessary.

Today we're shipping autonomous mapping agents on Pro Max. You give an agent a goal, a slice of inventory to care about, and a budget. It runs the procedural half so you can spend your time on the things only a human should do — investigating weird mismatches, deciding when a duplicate is really a rebrand, calling your supplier rep when something's wrong upstream.

The Problem

If you've operated a mapping pipeline for more than a quarter, this list will look familiar:

  • Coverage drift. A supplier ships new inventory weekly; without someone running mapping promptly, your coverage percentage silently decays.
  • Unmatched-row purgatory. A 5,000-row file produces 200 unmatched rows. Each one needs a human to search the golden dataset, eyeball candidates, and either file a match or escalate.
  • Mismatch report backlog. Reports come in faster than anyone can triage them. The ones that need to be grouped with similar reports sit OPEN for weeks.
  • Cron-job sprawl. Teams glue together internal scripts that pull from S3, hit the mapping API, write results back somewhere. Every script is a snowflake; nobody owns the failure modes.

Agents collapse this list to: "tell the agent your goal."

What an Agent Is

An agent is a tier-gated record in your workspace with:

  • A goal in plain English — "keep coverage above 95% for European hotels and propose matches for everything else".
  • A scope — which suppliers, regions, and inventory sources it cares about.
  • A versioned rule set — what triggers it, what it's allowed to do, what daily spend cap it lives under.
  • Optional data sources and sinks: an S3 bucket to pull new inventory from and a webhook to push results to, both supported in v1.
  • A schedule and event subscriptions — cron tick, new inventory upload, mapping job finished, mismatch report filed.
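
To make the shape of that record concrete, here is a minimal sketch in Python. It is an illustration only: the class names, field names, and event strings are assumptions made for this example, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RuleSet:
    version: int                    # rule sets are versioned
    toggles: Dict[str, bool]        # on/off switch per action category
    daily_spend_cap_cents: int      # daily spend cap the agent lives under


@dataclass
class Agent:
    goal: str                       # plain-English goal
    scope: Dict[str, List[str]]     # suppliers, regions, inventory sources
    rules: RuleSet
    source: Optional[str] = None    # optional data source, e.g. an S3 bucket
    sink: Optional[str] = None      # optional sink, e.g. a webhook URL
    cron: Optional[str] = None      # None means event-only
    events: List[str] = field(default_factory=list)  # event subscriptions


# Hypothetical values; only the goal text is taken from the post.
agent = Agent(
    goal="keep coverage above 95% for European hotels and propose matches for everything else",
    scope={"suppliers": ["*"], "regions": ["EU"], "sources": ["*"]},
    rules=RuleSet(version=1, toggles={"propose_match": True}, daily_spend_cap_cents=500),
    events=["inventory_upload", "mapping_job_finished"],
)
```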

When a trigger fires, the agent runs a cycle: load context, call an LLM with a tool registry, execute whatever the LLM picks (subject to the rule set), persist every step, write a one-line summary, fire notifications. Every cycle is auditable; every mutating action is undoable for 7 days.

Architecture, In One Diagram

[Diagram] Agent architecture: trigger → scheduler → runtime → LLM loop → tools → audit

The runtime lives in the same Kotlin/Spring API process as the rest of the platform — no new services, no new infra besides a handful of Postgres tables prefixed agent_*. Per-agent concurrency is serialised by a DB advisory lock keyed on agent_id; cross-agent parallelism is bounded by a global config cap (default 8).
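
The two concurrency limits can be sketched as follows. This is an in-memory stand-in, not the Kotlin runtime: the set-based lock plays the role of the Postgres advisory lock keyed on agent_id (in Postgres, pg_try_advisory_lock is the non-blocking variant), and a bounded semaphore plays the role of the global cap. The names AgentLocks and run_cycle, and the choice to skip rather than wait, are assumptions for illustration.

```python
import threading

MAX_PARALLEL_AGENTS = 8  # global cap from the post (default 8)


class AgentLocks:
    """In-memory stand-in for a DB advisory lock keyed on agent_id."""

    def __init__(self):
        self._held = set()
        self._guard = threading.Lock()

    def try_acquire(self, agent_id):
        with self._guard:
            if agent_id in self._held:
                return False
            self._held.add(agent_id)
            return True

    def release(self, agent_id):
        with self._guard:
            self._held.discard(agent_id)


locks = AgentLocks()
slots = threading.BoundedSemaphore(MAX_PARALLEL_AGENTS)


def run_cycle(agent_id, cycle):
    """Run `cycle` unless all global slots are busy or this agent is already running."""
    if not slots.acquire(blocking=False):
        return "deferred: global cap reached"
    try:
        if not locks.try_acquire(agent_id):
            return "skipped: agent already running"
        try:
            return cycle()
        finally:
            locks.release(agent_id)
    finally:
        slots.release()
```

A second trigger for the same agent is turned away while the first cycle holds the lock, while a different agent can still run as long as a global slot is free.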

Each cycle's LLM loop:

  1. Build a system prompt with cached static content (identity, scope, rule set, tool defs) and dynamic content (trigger, recent memory notes, open pending items).
  2. Call the LLM with the tool registry.
  3. For each tool_use block in the response, validate against the rule set, snapshot the before-state, dispatch the call to existing API services, persist the result.
  4. Repeat until stop_reason == end_turn or the per-cycle step cap (default 20) is hit.
  5. The final assistant turn must include summary(text) — that string becomes the run's headline.
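
The loop above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the LLM is replaced by a scripted fake, the reply shape (stop_reason plus a tool_calls list) is invented for the example, summary is treated as a tool call, and before-state snapshotting is omitted for brevity.

```python
STEP_CAP = 20  # per-cycle step cap from the post (default 20)


def run_llm_loop(llm, tools, rules, audit, step_cap=STEP_CAP):
    """Drive one cycle; return the summary headline, or None if there was none."""
    messages = []
    for _ in range(step_cap):
        reply = llm(messages)                       # step 2: call the LLM
        for call in reply["tool_calls"]:            # step 3: handle each tool call
            name, args = call["name"], call["args"]
            if name == "summary":                   # step 5: headline closes the cycle
                audit.append(("summary", args["text"]))
                return args["text"]
            if not rules.get(name, False):          # validate against the rule set
                result = {"error": "blocked by rule set"}
            else:
                result = tools[name](**args)        # dispatch to the service
            audit.append((name, args, result))      # persist every step
            messages.append({"tool": name, "result": result})
        if reply["stop_reason"] == "end_turn":      # step 4: stop condition
            break
    return None


def scripted_llm(script):
    """Fake LLM that replays a fixed list of replies, one per call."""
    turns = iter(script)
    return lambda messages: next(turns)


script = [
    {"stop_reason": "tool_use",
     "tool_calls": [{"name": "trigger_mapping_job", "args": {"scope": "US"}}]},
    {"stop_reason": "end_turn",
     "tool_calls": [{"name": "summary", "args": {"text": "Job triggered."}}]},
]
audit = []
headline = run_llm_loop(
    scripted_llm(script),
    tools={"trigger_mapping_job": lambda scope: {"job_id": 1, "scope": scope}},
    rules={"trigger_mapping_job": True},
    audit=audit,
)
```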

A Walkthrough

Let's say you operate a small US hotel marketplace and you've just hit Pro Max. You set up one agent:

  • Goal: "Map all new inventory from our partner suppliers within 30 minutes of upload. File proposed matches for anything unmatched. Alert me if coverage drops below 95%."
  • Scope: All suppliers, US region.
  • Schedule: Event-only (no cron).
  • Rules: mapping_jobs.trigger_on_new_inventory_upload = true, unmatched_results.propose_candidates_for_review = true with max 3 candidates per row, coverage.alert_below_pct = 95.
  • Daily spend cap: $5.
  • Notifications: Slack webhook on action and on escalation.
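
Written down as data, that setup might look roughly like the sketch below. The three dotted rule keys come from the list above; the other two key names and the should_alert helper are hypothetical, added only to show how the coverage threshold would be checked.

```python
rules = {
    "mapping_jobs.trigger_on_new_inventory_upload": True,
    "unmatched_results.propose_candidates_for_review": True,
    "unmatched_results.max_candidates_per_row": 3,  # hypothetical key name
    "coverage.alert_below_pct": 95,
    "daily_spend_cap_cents": 500,                   # the $5 cap; hypothetical key name
}


def should_alert(coverage_pct, rules):
    """True when coverage has dropped below the configured threshold."""
    return coverage_pct < rules["coverage.alert_below_pct"]
```

With these rules, should_alert(96.4, rules) is False and should_alert(94.9, rules) is True.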

You upload a new 2,000-row inventory file from the UI. Within seconds:

  1. InventoryUploadCompletedEvent fires.
  2. AgentScheduler picks it up, acquires the advisory lock on your agent.
  3. The runtime loads the agent, rule set, recent memory, open pending items.
  4. The LLM is called. It sees the trigger context, decides to call trigger_mapping_job with the new file's scope.
  5. A job spins up. The agent checks get_mapping_job for the status but does not busy-loop until it reads COMPLETED: it ends the cycle, and MappingJobFinishedEvent triggers the next one.
  6. On the follow-up cycle: 1,870 of 2,000 matched. The agent calls get_job_results with a filter for unmatched, then search_golden_dataset_by_geo for each row, then propose_match for the top candidate per row — 130 proposed matches land as pending items.
  7. Coverage moves from 96.2% to 96.4%. No alert fires.
  8. The agent calls summary("Processed 2,000-row upload. 1,870 matched, 130 proposed matches filed for review. Coverage 96.4%.") and the cycle closes.
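
The triage in step 6 boils down to a simple pattern: filter the unmatched rows, search for candidates, and file the top one. The sketch below is a deterministic stand-in for what the LLM drives through tool calls; the search and propose callables are placeholders for search_golden_dataset_by_geo and propose_match, and the row shape is an assumption.

```python
def triage_unmatched(rows, search, propose, max_candidates=3):
    """File a proposed match for the top candidate of each unmatched row."""
    filed = 0
    for row in rows:
        if row["matched"]:
            continue
        candidates = search(row)[:max_candidates]  # respect the per-row cap
        if candidates:
            propose(row["id"], candidates[0])      # top candidate only
            filed += 1
    return filed


# Synthetic data shaped like the walkthrough: 2,000 rows, 1,870 matched.
rows = [{"id": i, "matched": i < 1870} for i in range(2000)]
pending = []
filed = triage_unmatched(
    rows,
    search=lambda row: ["golden-%d" % row["id"]],
    propose=lambda row_id, candidate: pending.append((row_id, candidate)),
)
```

Run against the synthetic rows, filed comes out at 130, matching the number of pending items in the walkthrough.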

A Slack message lands in your channel with the summary, a link to the run, and a link to the pending items. Total LLM cost for the cycle: $0.14 (cached system prompt, fast model for the polling step, default model for the proposed-match reasoning).

Guardrails

Three layers of safety because autonomy without safety is just a foot-gun:

Cost governance

Each agent has a per-day spend cap, stored in cents. Before every LLM call, CostGovernor.checkAndReserve(estimatedTokens) checks the running ledger. If the next call would breach the cap, the agent moves to PAUSED_COST immediately, a pending item is filed, and you're notified. The spend cap is soft: by default the agent auto-resumes at midnight in your tenant timezone (configurable). The per-hour action cap is hard: a human has to resume the agent. A separate warning notification fires when spend reaches 80% of the cap.
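
A minimal sketch of the check-and-reserve idea, assuming the token estimate has already been converted to cents. The Python class below is an illustration, not the platform's CostGovernor: the method names, the notification strings, and the reset behaviour are assumptions.

```python
class CostGovernor:
    def __init__(self, daily_cap_cents, warn_pct=80):
        self.cap = daily_cap_cents
        self.warn_pct = warn_pct
        self.spent = 0
        self.state = "ACTIVE"
        self.warned = False
        self.notifications = []

    def check_and_reserve(self, estimated_cents):
        """Reserve spend for the next LLM call, or pause if it would breach the cap."""
        if self.state != "ACTIVE":
            return False
        if self.spent + estimated_cents > self.cap:
            self.state = "PAUSED_COST"
            self.notifications.append("paused: daily cap reached")
            return False
        self.spent += estimated_cents
        if not self.warned and self.spent >= self.cap * self.warn_pct / 100:
            self.warned = True
            self.notifications.append("warning: %d%% of daily cap used" % self.warn_pct)
        return True

    def midnight_reset(self):
        """Soft cap: clear the ledger and auto-resume a cost-paused agent."""
        self.spent, self.warned = 0, False
        if self.state == "PAUSED_COST":
            self.state = "ACTIVE"


gov = CostGovernor(daily_cap_cents=500)  # a $5 daily cap
gov.check_and_reserve(410)               # allowed, crosses the 80% warning line
gov.check_and_reserve(100)               # would reach 510 cents, so the agent pauses
```

Reserving before the call, rather than recording after it, is what lets the check stop a call that would overshoot the cap.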

Rule gating

PartnerMappingResult.matchConfidence is unreliable enough in production that we explicitly do not let the agent use it as a confidence judgement. Instead, the rule set has explicit on/off toggles per action category. Mutating cross-tenant golden data: off entirely. Auto-marking a mismatch fixed: off entirely. Auto-pushing to a data sink: off by default, opt-in only. Auto-changing your mapping_mode_preference: off by default. With the defaults, the agent files pending items for review; it never quietly mutates state.
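
One way to picture the gate, as a sketch: each action category has a policy, and the gate returns one of three outcomes. The category identifiers, the policy labels, and the choice to treat unlisted actions as off by default are all assumptions made for this example.

```python
# Hypothetical category names; the four policies mirror the defaults described above.
DEFAULT_POLICIES = {
    "mutate_cross_tenant_golden_data": "off_entirely",
    "auto_mark_mismatch_fixed": "off_entirely",
    "auto_push_to_data_sink": "off_by_default",
    "auto_change_mapping_mode_preference": "off_by_default",
}


def gate(action, opt_ins=frozenset()):
    """Decide what happens to a requested action: deny, execute, or file for review."""
    policy = DEFAULT_POLICIES.get(action, "off_by_default")
    if policy == "off_entirely":
        return "deny"               # no opt-in can enable these
    if action in opt_ins:
        return "execute"            # the workspace explicitly opted in
    return "file_pending_item"      # default: a human reviews it
```

Note that the gate never looks at a confidence score; the decision depends only on the action category and the workspace's explicit opt-ins.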

7-day undo

Every mutating tool call captures a before_state snapshot before dispatch. Snapshots live in agent_undo_snapshot for 7 days, indexed by expires_at. The workspace Undo center lists them sorted by expiry; one click rolls back, with conflict detection if state has drifted since. Audit log entries persist forever; only the snapshot payloads age out.
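
The snapshot-then-dispatch pattern, with expiry and conflict detection, can be sketched like this. The dictionary stands in for the agent_undo_snapshot table, and the conflict rule used here (the current value must still equal what the agent wrote) is an assumption about how drift might be detected.

```python
from datetime import datetime, timedelta, timezone

UNDO_WINDOW = timedelta(days=7)


class UndoStore:
    def __init__(self):
        self.snapshots = {}  # stand-in for the agent_undo_snapshot table
        self.next_id = 1

    def capture(self, state, key, after_value, now):
        """Snapshot state[key] before applying the mutation; return the snapshot id."""
        snap_id = self.next_id
        self.next_id += 1
        self.snapshots[snap_id] = {
            "key": key,
            "before_state": state[key],
            "after_state": after_value,
            "expires_at": now + UNDO_WINDOW,
        }
        state[key] = after_value
        return snap_id

    def undo(self, state, snap_id, now):
        snap = self.snapshots.get(snap_id)
        if snap is None or now >= snap["expires_at"]:
            return "expired"
        if state[snap["key"]] != snap["after_state"]:
            return "conflict"  # state has drifted since the agent's action
        state[snap["key"]] = snap["before_state"]
        del self.snapshots[snap_id]
        return "rolled_back"


now = datetime(2026, 5, 25, tzinfo=timezone.utc)
state = {"row-42": "unmatched"}
store = UndoStore()
snap = store.capture(state, "row-42", "proposed:golden-7", now)
result = store.undo(state, snap, now + timedelta(days=1))
```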

Bring Your Own LLM (Optional)

The hosted Anthropic key is included with Pro Max — no extra setup. If you want to use your own Anthropic or OpenAI key (e.g. for org-level billing aggregation, or because you have negotiated rates), each agent can be configured with BYO_ANTHROPIC or BYO_OPENAI. Keys go into the same secret vault as your data-source credentials, redacted in audit logs and only visible to OWNER/ADMIN roles.

The cost ledger tracks usage either way, so the daily spend cap protects you whether you're spending Mapping's dollars or your own.

Watching, Pausing, Talking To It

Three surfaces:

  • The Agents UI — list, detail, run timeline with collapsible step-by-step expansion, pending-items inbox, threaded conversations with each agent.
  • The MCP server — list agents, view runs, pause/resume, chat. From Claude Desktop, Cursor, Continue, or any MCP client.
  • Webhooks and Slack — every notification subscription you configure receives a structured payload (on_action, on_escalation, on_pause, on_cost_threshold_pct).

You can stop the world at any time: per-agent pause from the UI, or workspace-wide "pause all" if you need a kill switch.

Pricing & Availability

Agents are included with Pro Max — $740/month or $7,104/year. No new Stripe products to provision; if you're on Pro Max today, agents appear in your sidebar the moment we flip the feature flag for your workspace. Enterprise includes agents as part of its standard scope.

For design partners during the beta period, we're enabling agents per-workspace via an allow-list before flipping the global flag. Email contact@mapping.travel if you want in early.

What's Next

v1 ships the runtime, the create flow, the run viewer, the Undo center, the cost dashboard, and the read/chat subset over MCP. The roadmap behind v1:

  • pgvector-backed memory so the agent's recall stays sharp past the first hundred runs.
  • Customer-defined trigger DSL so triggers can compose more than the v1 set.
  • More data-source types: SFTP and GCS are on the way.
  • Multi-agent collaboration so a coverage agent can hand a row to a triage agent without you in the loop.
  • An agent template marketplace beyond the 3 canned starters.

Until then: open app.mapping.travel, head to Agents → New, give the first one a tight scope and a small daily cap, and let it loose on next Monday's inventory drop.
