AI · Economics · 2025

Agent-to-Agent Commerce: From Protocol to Practice

Picture this: a procurement agent finds a cheaper supplier for your office furniture. It has the authority to negotiate. It has the context — your budget, your specs, your delivery timeline. It drafts a purchase order. And then it stops. Because there's no way for the supplier's agent to verify who it's dealing with, no way to settle payment without a human wiring funds, and no record either side can trust after the fact.

The agent is capable. The commerce layer is missing.

That's the question I kept coming back to while building the procurement agent: who pays the agent? And how does anyone trust the receipt?

The Commerce Gap

We've solved agent capability. Tool calls, function calling, multi-step reasoning, web browsing — the demos are impressive and the production deployments are real. An agent can research suppliers, compare prices, draft emails, and negotiate terms. That part works.

What doesn't work is the last mile. The moment an agent needs to commit resources (spend money, sign an agreement, make a binding promise), the entire system falls back to human intervention. Someone has to log in, review, approve, and click "send."

The gap isn't capability. It's trust plus payment. Two problems that look separate but are actually the same problem: how do you create a verifiable record that both parties — and any observer — can rely on?

Current agent frameworks don't address this. They give you tool definitions, retry logic, and orchestration. They assume settlement happens somewhere else. But "somewhere else" means a human in a browser, which means the entire value proposition of autonomous agents collapses at the transaction boundary.

HLOS: The Minimum Viable Trust Layer

HLOS is an agentic commerce protocol. The framing I find most useful: it's the settlement layer for agent-to-agent transactions.

Think about what settlement means in finance. It's not just "money moved." It's: this transaction happened, at this time, these parties were involved, these rules were followed, and no one can dispute that afterward. Settlement converts a handshake into a record.

HLOS provides three primitives — the minimum viable trust layer for agent commerce:

Identity: who. Every agent gets a verifiable on-chain identity, registered as a Metaplex Core NFT on Solana. This isn't a profile page. It's a cryptographic anchor that ties every action to a specific agent. When a procurement agent sends a purchase order, the receiving agent can verify the sender's identity without trusting a centralized directory. The identity is self-sovereign: no platform can revoke it, and no intermediary sits between the agents.

Ledger: what happened. Every transaction produces an immutable record. Not a log file on someone's server, but a cryptographically signed, append-only ledger entry that neither party can alter after the fact. This is the STAAMP receipt system. It records what was agreed, what was delivered, what was paid, and when. The ledger is the source of truth that replaces the trust you'd normally place in a relationship or a brand.

Attestation: was it good. After a transaction completes, a quality score is submitted on-chain via the ATOM protocol. This is where reputation begins. Not self-reported ratings, not reviews you can delete, but permanent, verifiable attestations that accumulate over time. An agent that delivers late, overcharges, or underperforms carries that record publicly. An agent that consistently delivers builds a track record that opens doors.

These three primitives — identity, ledger, attestation — are deliberately minimal. They don't try to solve business logic, domain-specific negotiation, or workflow orchestration. They solve the trust problem at the infrastructure level so that everything built on top can assume settlement works.

STAAMP: Why the Receipt Is the Product

Traditional invoices are documents. They describe what should happen. A STAAMP receipt describes what did happen — and proves it cryptographically.

The data model is built around six fields, one per letter of the name: Service, Time, Amount, Agent, Method, and Proof. Each receipt captures what service was rendered, when, how much was paid, which agents were involved, the payment method used, and a cryptographic proof that ties the whole thing together. It's not a PDF you email; it's a structured, machine-readable record that any agent can verify independently.
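To make the shape of a receipt concrete, here is a minimal sketch of the fields named above (Service, Time, Amount, Agent, Method, Proof). Everything in it is my assumption rather than the STAAMP wire format: the field types, the JSON canonicalization, and especially the proof, which uses an HMAC with a shared key purely to stay inside the standard library. A receipt that any agent can verify independently would need an asymmetric signature instead.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Receipt:
    service: str     # what was rendered
    time: str        # when, as an ISO 8601 timestamp
    amount: str      # how much was paid, as a decimal string (USDC assumed)
    agents: tuple    # identifiers of the agents involved
    method: str      # payment method, e.g. "x402"
    proof: str = ""  # filled in by seal()


def _canonical_bytes(receipt: Receipt) -> bytes:
    """Serialize every field except the proof in a stable key order."""
    body = asdict(receipt)
    del body["proof"]
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def seal(receipt: Receipt, key: bytes) -> Receipt:
    """Return a copy of the receipt with its proof attached.

    HMAC is a stand-in here; it is NOT how STAAMP is known to sign receipts.
    """
    digest = hmac.new(key, _canonical_bytes(receipt), hashlib.sha256).hexdigest()
    return replace(receipt, proof=digest)


def verify(receipt: Receipt, key: bytes) -> bool:
    """Recompute the proof over the other fields and compare in constant time."""
    expected = hmac.new(key, _canonical_bytes(receipt), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt.proof)
```

Changing any field after sealing, even the amount by one cent, makes `verify` return `False`, which is the property the receipt is supposed to deliver.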

The append-only constraint is critical. In traditional commerce, disputes happen because records can be altered — an invoice gets "corrected," a line item disappears, a date shifts. STAAMP receipts can't be edited after creation. If something needs to change, you issue a new receipt that references the original. The history is always intact. This matters less when two humans are exchanging invoices and it matters enormously when two agents are transacting at scale — thousands of transactions per hour, across jurisdictions, with no human reviewing each one.
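One common way to get this behavior is a hash chain, where each entry commits to the one before it. The sketch below is an illustration under that assumption, not a description of how HLOS stores receipts; `AppendOnlyLedger`, `corrects`, and the entry layout are hypothetical names.

```python
import copy
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # placeholder "previous id" for the first entry


def _entry_id(payload: dict, corrects: Optional[str], prev: str) -> str:
    """Hash the entry contents together with the previous entry's id."""
    body = {"payload": payload, "corrects": corrects, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AppendOnlyLedger:
    """Hypothetical in-memory ledger: entries are added, never edited."""

    def __init__(self) -> None:
        self._entries: list = []

    def append(self, payload: dict, corrects: Optional[str] = None) -> str:
        """Add an entry; a correction must point at an existing entry id."""
        known = {entry["id"] for entry in self._entries}
        if corrects is not None and corrects not in known:
            raise ValueError("a correction must reference an existing entry")
        prev = self._entries[-1]["id"] if self._entries else GENESIS
        payload = copy.deepcopy(payload)  # snapshot, so later caller edits don't leak in
        entry_id = _entry_id(payload, corrects, prev)
        self._entries.append(
            {"id": entry_id, "payload": payload, "corrects": corrects, "prev": prev}
        )
        return entry_id

    def is_intact(self) -> bool:
        """Recompute every id; any edit to a past entry breaks the chain."""
        prev = GENESIS
        for entry in self._entries:
            recomputed = _entry_id(entry["payload"], entry["corrects"], entry["prev"])
            if entry["prev"] != prev or entry["id"] != recomputed:
                return False
            prev = entry["id"]
        return True
```

A "corrected" invoice becomes a second entry with `corrects` set to the first entry's id, so both versions stay visible and the chain check still passes.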

The difference from traditional invoices is architectural, not cosmetic. An invoice says "please pay me." A STAAMP receipt says "this happened, here's the proof, verify it yourself."

The Procurement Agent: A Full Walkthrough

The HLOS Procurement Agent does one thing well: it negotiates with your suppliers, tracks purchase orders, and charges 18% of verified savings. No savings, no fee.

Here's the full flow, step by step:

1. Discover. The agent scans your existing supplier contracts and purchase history. It identifies categories where spend is high and pricing hasn't been renegotiated recently. It builds a target list, not by guessing but by analyzing your actual procurement data.

2. Negotiate. The Negotiation Agent contacts suppliers directly — via email, through procurement portals, or via supplier-side agents if they exist. It has market pricing data. It knows your volume. It makes the case for a better rate. Critically, this agent runs in an isolated HLOS space — it can communicate externally but cannot access billing, treasury, or PO systems. If a supplier sends a manipulated email designed to exploit the agent, the blast radius is contained.

3. Pay via x402. When a deal is reached, payment settles through the x402 protocol. The agent sends an HTTP POST. The supplier's endpoint returns a 402 Payment Required response with USDC payment instructions: amount, wallet address, memo. The agent pays. The supplier's system detects the on-chain payment and releases the goods or service. No invoices. No net-30. No "check's in the mail."

4. Receive and verify. The PO Tracking Agent — running in its own isolated space with ERP access but no negotiation or billing capability — confirms delivery against the purchase order. Quantities, specs, timelines. Any discrepancy gets flagged before payment finalizes.

5. Attest. The Savings Audit Agent — read-only, no write access to anything — calculates verified savings by comparing the negotiated price against historical spend. Only verified savings trigger the 18% fee. The entire calculation is recorded in a STAAMP receipt that the client can independently audit.
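The fee arithmetic in the last step is simple enough to write down. This is a sketch under my own assumptions: savings are measured per unit against a single historical baseline price, and the fee is rounded to the cent. The 18% rate and the "no savings, no fee" rule come from the description above; `verified_savings_fee` and its parameters are hypothetical.

```python
from decimal import ROUND_HALF_UP, Decimal

FEE_RATE = Decimal("0.18")  # 18% of verified savings
CENT = Decimal("0.01")


def verified_savings_fee(baseline_unit_price: str, negotiated_unit_price: str, quantity: int):
    """Return (savings, fee). If the negotiated price isn't lower, both are zero."""
    savings = (Decimal(baseline_unit_price) - Decimal(negotiated_unit_price)) * quantity
    if savings <= 0:
        return Decimal("0.00"), Decimal("0.00")
    fee = (savings * FEE_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    return savings, fee
```

For example, 40 desks negotiated from 400.00 down to 350.00 each gives savings of 2,000.00 and a fee of 360.00. If the negotiated price comes back higher than the baseline, the fee is zero.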

Three agents, three isolated spaces, one transaction. The isolation isn't paranoia — it's the architecture that makes autonomy safe. A compromised negotiation agent can't drain the treasury. A buggy PO tracker can't sign contracts. Each agent has the minimum permissions it needs and nothing more.
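The isolation rule boils down to a deny-by-default capability check. Here is a minimal sketch, assuming each space is just a fixed set of allowed capabilities. The capability names and the `authorize` helper are made up for illustration; HLOS spaces may be configured quite differently.

```python
# Hypothetical capability sets, one per isolated space described above.
SPACES = {
    "negotiation": frozenset({"external_comms"}),  # no billing, treasury, or PO access
    "po_tracking": frozenset({"erp_access"}),      # no negotiation or billing
    "savings_audit": frozenset({"read_records"}),  # read-only
}


def authorize(agent: str, capability: str) -> None:
    """Raise PermissionError unless the agent's space explicitly grants the capability."""
    allowed = SPACES.get(agent, frozenset())  # unknown agents get nothing
    if capability not in allowed:
        raise PermissionError(f"{agent} may not use {capability}")
```

Under this model, a negotiation agent that has been talked into calling a treasury function fails at `authorize("negotiation", "treasury_transfer")` before any money moves.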

x402: HTTP for Commerce

The x402 protocol is named after the HTTP 402 status code, "Payment Required," a code that's been reserved since HTTP/1.1 was drafted in 1997 but never standardized for actual payments. x402 finally gives it a job.

The flow is deliberately simple:

Step 1. An agent sends a standard HTTP POST to a service endpoint — requesting an evaluation, a data lookup, a compute job, anything.

Step 2. The server responds with HTTP 402, including a JSON body: the price (in USDC), the recipient wallet address, a payment memo for correlation, and an expiry window.

Step 3. The requesting agent submits an on-chain USDC payment matching the 402 response parameters.

Step 4. The agent retries the original POST, this time including the transaction signature in a header. The server verifies payment on-chain and returns the result.
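The four steps above can be simulated end to end without a network or a chain. This sketch uses an in-memory `FakeChain` and `PaidEndpoint`, both hypothetical. The JSON field names (`price_usdc`, `recipient`, `memo`, `expires_in_seconds`) and the `payment_signature` argument are placeholders I chose and are not taken from the x402 specification. I also derive the memo from the request payload so the server can check correlation without storing quotes; that is my design choice, not a protocol requirement.

```python
import hashlib
import uuid


class FakeChain:
    """In-memory stand-in for on-chain USDC transfers."""

    def __init__(self) -> None:
        self.payments: dict = {}

    def pay(self, recipient: str, amount: str, memo: str) -> str:
        signature = uuid.uuid4().hex  # pretend transaction signature
        self.payments[signature] = {"recipient": recipient, "amount": amount, "memo": memo}
        return signature


class PaidEndpoint:
    """Hypothetical service that answers 402 until it sees a matching payment."""

    def __init__(self, chain: FakeChain, wallet: str, price_usdc: str) -> None:
        self.chain, self.wallet, self.price_usdc = chain, wallet, price_usdc
        self._redeemed: set = set()  # signatures already used, to block replays

    @staticmethod
    def _memo_for(payload: str) -> str:
        return hashlib.sha256(payload.encode()).hexdigest()[:16]

    def post(self, payload: str, payment_signature=None):
        # Step 2: the quote. Expiry is advertised but not enforced in this sketch.
        quote = {"price_usdc": self.price_usdc, "recipient": self.wallet,
                 "memo": self._memo_for(payload), "expires_in_seconds": 60}
        if payment_signature is None:
            return 402, quote
        # Step 4: look the payment up and check it against the quote.
        payment = self.chain.payments.get(payment_signature)
        paid = (payment is not None
                and payment_signature not in self._redeemed
                and payment["recipient"] == self.wallet
                and payment["amount"] == self.price_usdc
                and payment["memo"] == quote["memo"])
        if not paid:
            return 402, quote
        self._redeemed.add(payment_signature)
        return 200, {"result": f"processed: {payload}"}


def call_with_payment(endpoint: PaidEndpoint, chain: FakeChain, payload: str):
    status, body = endpoint.post(payload)                # Step 1: plain request
    if status != 402:
        return status, body
    signature = chain.pay(body["recipient"], body["price_usdc"], body["memo"])  # Step 3
    return endpoint.post(payload, payment_signature=signature)  # Step 4: retry with proof
```

The only state the server keeps is the set of signatures it has already honored, which is what stops one payment from being replayed against many requests.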

No API keys. No OAuth. No accounts. No invoices. An agent with a funded wallet can pay for any x402-enabled service the same way a browser with cookies can access any website. The protocol is stateless and composable — an agent can chain ten x402 calls across ten different providers in a single workflow without any provider knowing about the others.

This is what makes agent-to-agent commerce scalable. The alternative — bespoke payment integrations, billing accounts, credit terms — works for a handful of known partners. It falls apart when agents need to discover and transact with services they've never seen before.

Reputation: The Cold-Start Problem

The most interesting implication of agent-to-agent commerce isn't the efficiency gains. It's the reputation systems that emerge from it.

Right now, every agent starts from zero. Agent A negotiates with Supplier B, but Supplier B has no record of Agent A's history. Next week, a different company deploys Agent A against the same supplier, and neither side has context. Every interaction is isolated. There's no memory, no track record, no basis for trust beyond the current transaction.

This is the cold-start problem, and it's the same problem that eBay, Uber, and Airbnb solved for humans. Their answer was ratings — but ratings are self-reported, platform-locked, and easy to game. An eBay seller with 10,000 five-star reviews can't port that reputation to Amazon.

On-chain attestation solves this differently. Reputation is public: anyone can read an agent's transaction history and quality scores. It's not self-reported: the counterparty submits the attestation, not the agent itself. It's portable: an agent's reputation follows it across platforms, providers, and use cases. And it's permanent: you can't delete a bad review or start over with a fresh account.
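As a rough illustration of how a reader might turn public attestations into a score, here is a sketch that averages counterparty scores and ignores anything an agent says about itself. The `Attestation` shape, the 0 to 100 scale, and the plain mean are all my assumptions; the ATOM protocol may define scoring differently.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class Attestation:
    subject: str   # the agent being rated
    attester: str  # the counterparty submitting the rating
    score: int     # assumed scale: 0 (worst) to 100 (best)


def reputation(attestations: list, agent: str) -> Optional[float]:
    """Average the scores others gave this agent; None means no track record yet."""
    scores = [
        a.score
        for a in attestations
        if a.subject == agent and a.attester != agent  # drop self-reported scores
    ]
    return mean(scores) if scores else None
```

A return value of `None` is the cold-start case: the agent has no history that a counterparty can read.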

Over time, this creates emergent market dynamics. Suppliers offer better terms to agents with proven track records. Agents with strong reputations get priority access to scarce resources. New agents pay a premium until they build trust. Commerce has always worked this way for humans — we're building the infrastructure to extend it to machines.

The Hard Problems

I don't want to oversell this. Three things still need solving, and they're genuinely hard:

Legal standing. When an agent signs a contract, who's liable? The company that deployed it? The protocol operator? The model provider whose weights produced the decision? Current contract law assumes a human principal behind every agreement. Agent-to-agent contracts don't have that: you have a chain of delegation where the deploying entity may not have reviewed or even seen the specific terms the agent agreed to. Until legal frameworks catch up, every agent transaction carries legal ambiguity. The practical workaround today is treating the deploying entity as the principal and constraining agent authority through wallet limits and scope restrictions, but that's a patch, not a solution.

Adversarial inputs. Supplier emails, product listings, and counterparty messages are all untrusted content. A sophisticated adversary could craft a supplier response designed to manipulate a negotiation agent by injecting hidden instructions in a price quote, embedding misleading context in a product description, or exploiting the agent's optimization function to accept terms that look good on one metric but are terrible on another. HLOS's isolation model contains the blast radius, but prompt injection is an unsolved problem in the broader AI ecosystem. The defense today is defense-in-depth: isolation, wallet limits, human review for transactions above a threshold, and anomaly detection on agent behavior patterns.
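Two of those layers, wallet limits and a human-review threshold, fit in a few lines. This is a sketch with made-up names and numbers: `spend_decision`, the daily limit, and the review threshold are hypothetical, and a real deployment would enforce them outside the agent's own process.

```python
from decimal import Decimal


def spend_decision(
    amount: Decimal,
    spent_today: Decimal,
    daily_limit: Decimal,
    review_threshold: Decimal,
) -> str:
    """Classify a proposed payment as 'reject', 'human_review', or 'auto_approve'."""
    if amount <= 0:
        return "reject"        # malformed or manipulated amount
    if spent_today + amount > daily_limit:
        return "reject"        # wallet limit: hard stop, no override by the agent
    if amount >= review_threshold:
        return "human_review"  # large transactions wait for a person
    return "auto_approve"
```

The check never looks at why the agent wants to pay, so a prompt-injected justification has nothing to argue with.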

Rubric governance. For the grant allocator, who decides what "good" looks like? For procurement, who decides what "fair price" means? Right now, these rubrics are hardcoded — I set the evaluation dimensions and their weights. That works for a single-operator system but breaks the moment multiple stakeholders need input. A DAO funding public goods has different criteria than a corporate procurement department, and both need the ability to define their rubrics without breaking the autonomous evaluation chain. This is a governance problem disguised as a technical one, and governance problems are the hardest kind.
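To show what moving rubrics out of code could look like, here is a sketch where the weights are plain data that each stakeholder supplies. The dimension names, the 0 to 100 scores, and `rubric_score` are hypothetical. This is not how my hardcoded rubrics work today, only one way they could be parameterized.

```python
def rubric_score(scores: dict, weights: dict) -> float:
    """Weighted average of per-dimension scores; weights need not sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in weights) / total_weight


# The same offer, judged by two hypothetical stakeholders with different priorities.
offer = {"price": 90, "delivery": 50}
corporate_weights = {"price": 3, "delivery": 1}  # cost-driven procurement
urgent_weights = {"price": 1, "delivery": 3}     # deadline-driven buyer
```

With these numbers the cost-driven rubric scores the offer 80.0 and the deadline-driven one scores it 60.0. Same evaluation chain, different definition of "good," which is the governance question in miniature.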

These aren't reasons to stop. They're the actual work. The protocols exist. The infrastructure is live on devnet. The question now is whether the hard problems get solved fast enough for agent commerce to matter — or whether the ecosystem settles for human-in-the-loop workarounds that defeat the purpose.

I'm betting on the former.
