11 · Pain Points and Open Problems

"In application security, 99% is a failing grade." (Simon Willison, on prompt injection.)[^1]

Previous sections have described what exists in the agentic-payments stack: mandates (see AP2), SharedPaymentTokens (see ACP), on-chain facilitators (see x402 and crypto), network overlays (see Card Networks), and regulatory scaffolding (see Regulation and Compliance). This section does something different. It steps back and asks: what is structurally broken, and where is the industry's current marketing ahead of the engineering?

Each pain point below is examined under a uniform lens: (a) the problem, (b) why it's hard, (c) current proposed directions with named companies/protocols and citations, (d) what remains unsolved. A severity-versus-tractability prioritization matrix appears at the end. The tone is deliberately critical: vendor decks announcing "end-to-end agent commerce" in Q4 2025 were, in almost every case, pointing at one or two pieces of a problem whose other five pieces are still unsolved.


11.1 Interoperability and protocol fragmentation

(a) Problem

Between April 2025 and January 2026 the industry shipped at least six "agent payments" protocols that overlap in scope but disagree on primitives: Google's AP2 (Intent/Cart/Payment Mandates as W3C VCs),[^2] OpenAI+Stripe's ACP (SharedPaymentTokens, merchant remains Merchant-of-Record),[^3] Coinbase's x402 (HTTP 402 + stablecoin settlement),[^4] Visa's Trusted Agent Protocol (HTTP Message Signatures + Web Bot Auth),[^5] Mastercard's Agent Pay (Agentic Tokens extending the MTS tokenization service),[^6] and Google+Shopify's Universal Commerce Protocol (UCP) announced at NRF 2026.[^7] PayPal, Amex, Crossmint, Skyfire, Nekuda, and Catena's ACK sit between or across these.

(b) Why it's hard

These are not different implementations of the same spec; they make incompatible architectural choices. AP2 puts the user at the centre via signed mandates; ACP puts the merchant at the centre via shared tokens; x402 puts the HTTP resource at the centre via on-chain settlement; Visa's TAP puts the network at the centre via signed bot identity. A merchant that wants to sell to a Google Agent, a ChatGPT agent, and a crypto-native agent at once must implement three overlapping consent/identity/settlement stacks. Worse, the semantics of "mandate," "token," and "agent identity" are not shared: an AP2 Cart Mandate is a W3C Verifiable Credential;[^8] an ACP SharedPaymentToken is a Stripe-issued bearer credential; an x402 payment is an EIP-3009 signed transfer;[^4] none of these translate losslessly.

(c) Current proposals

  • Profile bridges. UCP explicitly positions itself as a superset that can wrap AP2 and ACP flows.[^7] Stripe has announced ACP-to-AP2 mappings in their Agent Toolkit docs.
  • Common substrate. Both AP2 and TAP lean on IETF RFC 9421 HTTP Message Signatures[^9] and the Cloudflare-authored Web Bot Auth draft,[^10] giving at least a shared transport-layer identity primitive.
  • Network compatibility layers. Mastercard's Agent Pay and Visa's TAP are card-rail wrappers: they can in principle tunnel any agent transaction into an existing card authorization, giving the networks a natural "gateway" role.[^5][^6]
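
To make the shared transport primitive concrete, here is a minimal sketch of how an RFC 9421-style signature base is assembled and signed. It is simplified on purpose: it covers only three derived components, and it uses HMAC-SHA256 from the standard library as a stand-in for the asymmetric keys a real agent-identity deployment would use. The function names, the `sig1` label, and the `agent-key-1` key id are illustrative assumptions, not part of AP2, TAP, or Web Bot Auth.

```python
import base64
import hashlib
import hmac


def signature_base(method: str, authority: str, path: str,
                   created: int, keyid: str) -> tuple[str, str]:
    """Build a simplified RFC 9421 signature base over three derived components."""
    covered = '("@method" "@authority" "@path")'
    params = f'{covered};created={created};keyid="{keyid}"'
    lines = [
        f'"@method": {method.upper()}',
        f'"@authority": {authority.lower()}',
        f'"@path": {path}',
        f'"@signature-params": {params}',  # always the last line of the base
    ]
    return "\n".join(lines), params


def sign_request(secret: bytes, method: str, authority: str, path: str,
                 created: int, keyid: str) -> dict[str, str]:
    """Return the two headers a signer would attach (HMAC is a stand-in here)."""
    base, params = signature_base(method, authority, path, created, keyid)
    mac = hmac.new(secret, base.encode("ascii"), hashlib.sha256).digest()
    return {
        "Signature-Input": f"sig1={params}",
        "Signature": f"sig1=:{base64.b64encode(mac).decode('ascii')}:",
    }


def verify_request(secret: bytes, headers: dict[str, str], method: str,
                   authority: str, path: str, created: int, keyid: str) -> bool:
    """Recompute the signature from the request as received and compare."""
    expected = sign_request(secret, method, authority, path, created, keyid)
    return hmac.compare_digest(expected["Signature"], headers.get("Signature", ""))


if __name__ == "__main__":
    hdrs = sign_request(b"demo-secret", "POST", "shop.example", "/checkout",
                        1700000000, "agent-key-1")
    print(hdrs["Signature-Input"])
    print(verify_request(b"demo-secret", hdrs, "POST", "shop.example",
                         "/checkout", 1700000000, "agent-key-1"))
```

The sketch shows why this layer is only a transport primitive: it proves that a key holder sent this request to this path, and says nothing about what the payload authorizes.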

(d) What remains unsolved

There is no semantic interoperability layer. A Cart Mandate signed under AP2 is not automatically acceptable to an ACP merchant; a Visa-signed agent is not automatically trusted by an x402 facilitator. Until a neutral body (W3C, IETF, or ISO TC68) standardizes a mandate/credential envelope across protocols, and not merely the signing format, the ecosystem will see the same N×M adapter explosion that plagued early open banking. Vendor claims of "open" protocols should be read with skepticism: open-source is not the same as open-standard, and controlling the reference implementation of an open-source protocol is a strong form of lock-in.
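
The adapter arithmetic behind that claim can be sketched in a few lines. The six protocol names come from the list in (a); the count of four merchant back-ends is a made-up figure for illustration only.

```python
def direct_adapters(n_protocols: int, m_backends: int) -> int:
    """Adapters needed when every protocol is wired to every back-end directly."""
    return n_protocols * m_backends


def envelope_adapters(n_protocols: int, m_backends: int) -> int:
    """Adapters needed when both sides translate to one shared envelope."""
    return n_protocols + m_backends


protocols = ["AP2", "ACP", "x402", "TAP", "Agent Pay", "UCP"]
backends = 4  # hypothetical number of merchant commerce platforms

print(direct_adapters(len(protocols), backends))    # 6 * 4
print(envelope_adapters(len(protocols), backends))  # 6 + 4
```

The gap widens with every new protocol or platform, which is why a shared envelope matters more than a shared signing format.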


11.2 Agent identity and Know-Your-Agent (KYA)

(a) Problem

Who is this agent, who does it act for, and under what authority? Four largely incompatible answers exist.

(b) Why it's hard

  • ERC-8004 "Trustless Agents" (Ethereum EIP) registers agents on-chain with DIDs and reputation attestations.[^11]
  • Web Bot Auth (Cloudflare/IETF draft) authenticates agent HTTP traffic via signed requests keyed to domain-controlled keys.[^10]
  • AP2 mandates authenticate intent, not agent identity; identity is derived from the principal's signature on the mandate.[^2]
  • X.509 / mTLS (the traditional server-identity stack) is used by enterprise MCP deployments; Visa TAP sits atop this.[^5]
  • Skyfire issues stablecoin-backed agent identities; Nekuda issues Agent Wallets; Catena's ACK defines an agent-commerce identity kit.[^12][^13][^14]

These live at different layers (network vs. transaction vs. principal). There is no consensus on whether agent identity should be chain-native, DNS-native, PKI-native, or credential-native. Each answer assumes a different threat model and a different incumbent (Ethereum vs. Cloudflare vs. Google vs. DigiCert).

(c) Current proposals

  • A "stacking" pattern is emerging: Web Bot Auth for transport identity, AP2/ACP mandates for transaction authority, ERC-8004 for cross-domain reputation.
  • Credential formats are converging on W3C VCs 2.0[^8] and DIDs,[^15] giving at least a common data model even if trust anchors differ.
  • Catena's ACK explicitly tries to be protocol-agnostic, signing mandates that downstream rails (card, x402, ACH) can each consume.[^14]

(d) What remains unsolved

  • Revocation at scale. If an agent key is compromised, how is the revocation propagated to every merchant, facilitator, and issuer in seconds? Certificate revocation has never really worked on the web;[^16] it is unlikely to work better for agents.
  • Principal-binding. None of the current schemes prove, cryptographically, that the human being behind a mandate is not themselves a bot or a compromised account. KYA is not KYC.
  • Reputation systems are gameable. ERC-8004 acknowledges reputation is optional; in practice, any numeric reputation score attached to money will be Sybil-farmed.

11.3 Consent UX and standing mandates

(a) Problem

AP2's model assumes users sign Intent Mandates ("buy concert tickets up to $400 in the next 24h") which the agent then discharges against Cart Mandates.[^2] ACP's model assumes users approve each checkout in ChatGPT.[^3] Real use requires standing mandates ("handle my grocery reorders"), and signing a standing mandate is a cognitive problem cards never had to solve.

(b) Why it's hard

Humans are famously bad at reading EULAs; they are worse at reasoning about parameterized authorization ("up to $N per month, category X, not Y"). The scope of a mandate is a policy language, and policy languages are hard for non-specialists; see the decades-long failure of Android permission dialogs and OAuth consent screens.[^17] Worse, the consequences of a misconfigured mandate are financial, not informational.

(c) Current proposals

  • AP2 includes structured mandate fields (amount caps, merchant allow-lists, expiry) intended to be rendered as constrained UI rather than free text.[^2]
  • Cloud Security Alliance has published a "secure use of AP2" framework recommending short-lived mandates, human-in-the-loop for high-value transactions, and explicit re-consent at policy boundaries.[^18]
  • Mastercard's Agent Pay Acceptance Framework and Visa's TAP both require issuer-side risk scoring that can downgrade a standing mandate to step-up auth.[^5][^6]
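
As a rough illustration of what "structured mandate fields" plus step-up could look like at enforcement time, the sketch below evaluates one proposed purchase against a cap, an allow-list, an expiry, and a step-up threshold. The `Mandate` class, its field names, and the decision strings are hypothetical; they are not the AP2 schema or any network's rule set.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class Mandate:
    cap: Decimal                        # maximum total spend under this mandate
    allowed_merchants: frozenset[str]   # merchant allow-list
    expires_at: datetime                # timezone-aware expiry
    step_up_above: Decimal              # single purchases above this need the user


def evaluate(m: Mandate, merchant: str, amount: Decimal,
             spent_so_far: Decimal, now: datetime) -> str:
    """Return a decision string for one proposed purchase."""
    if now >= m.expires_at:
        return "deny: expired"
    if merchant not in m.allowed_merchants:
        return "deny: merchant not allowed"
    if spent_so_far + amount > m.cap:
        return "deny: cap exceeded"
    if amount > m.step_up_above:
        return "step-up: ask the user"
    return "allow"


if __name__ == "__main__":
    mandate = Mandate(
        cap=Decimal("400"),
        allowed_merchants=frozenset({"grocer.example"}),
        expires_at=datetime(2026, 1, 2, tzinfo=timezone.utc),
        step_up_above=Decimal("100"),
    )
    now = datetime(2026, 1, 1, tzinfo=timezone.utc)
    print(evaluate(mandate, "grocer.example", Decimal("40"), Decimal("0"), now))
    print(evaluate(mandate, "grocer.example", Decimal("150"), Decimal("0"), now))
```

Even this toy policy has four parameters the user must choose, which is the cognitive load that (b) describes.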

(d) What remains unsolved

No one has demonstrated a consent UX that (i) is intelligible to median users, (ii) composes across merchants, and (iii) does not degenerate into "accept all" clicks within a month. The deep problem is adversarial: every friction point in consent UX is also a conversion-killer, so merchants and platforms are structurally incentivized to make consent less rigorous over time. Regulators (CFPB, FCA; see §10) have not yet prescribed a consent-UX floor comparable to PSD2 SCA.[^19]


11.4 Prompt injection and authority leakage

(a) Problem

An agent empowered to move money, reading untrusted web content, is the worst-case deployment of a large language model. A single hostile paragraph on a product page or in an email can subvert the agent's goals.

(b) Why it's hard

Simon Willison's "lethal trifecta" of (1) access to private data, (2) exposure to untrusted content, and (3) ability to externally act describes exactly the agentic-payments operating model.[^1][^20] Prompt injection is not a bug class that can be patched; it is a consequence of the LLM architecture treating instructions and data in the same token stream. As Willison puts it, "99% is a failing grade."[^1]

(c) Current proposals

  • Architectural patterns (planner/executor separation, dual-LLM, capability-bounded tool-use) described in Willison's "Design Patterns for Securing LLM Agents."[^20]
  • Mandate-bound execution: AP2 and ACP both require that the signed Cart/SharedPaymentToken matches the executed transaction, so that injected instructions cannot re-target the payment to a new merchant.[^2][^3] This is a genuine mitigation but only for the final-step substitution attack; it does not stop injection from steering the construction of the cart.
  • Google's threat-modelling work on prompt-injection risk estimation.[^21]
  • Academic literature: arXiv surveys on prompt injection in agent systems;[^22] MDPI 2026 review.[^23]

(d) What remains unsolved

Prompt injection is, as of April 2026, still an open research problem with no general defense. Every deployed agent that reads a third-party web page while holding payment authority has the lethal trifecta. The industry's response, "mandates constrain the damage," is necessary but not sufficient: a mandate for "$400 of groceries" can be fully drained to a merchant the user did not intend, if the injection fully controls cart construction. See section 09 Security and Trust for the full threat decomposition.
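
The limit of mandate-bound execution can be shown with a small sketch. It binds an approval to a canonical hash of the cart, so a swapped merchant at execution time is rejected; it also shows that a cart built under injected instructions passes the same check once it is approved. HMAC stands in for a real signature scheme, and the function names and cart fields are illustrative assumptions rather than AP2 or ACP structures.

```python
import hashlib
import hmac
import json


def cart_digest(cart: dict) -> str:
    """Hash a canonical JSON encoding so key order does not change the digest."""
    canonical = json.dumps(cart, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def approve(key: bytes, cart: dict) -> str:
    """Produce an approval tag bound to exactly this cart (HMAC as a stand-in)."""
    return hmac.new(key, cart_digest(cart).encode("ascii"), hashlib.sha256).hexdigest()


def may_execute(key: bytes, approval: str, transaction: dict) -> bool:
    """Allow execution only if the transaction matches the approved cart."""
    return hmac.compare_digest(approval, approve(key, transaction))


if __name__ == "__main__":
    key = b"demo-key"
    cart = {"merchant": "grocer.example", "total": "39.90", "items": ["milk", "eggs"]}
    tag = approve(key, cart)

    # Final-step substitution: rejected, the digest no longer matches.
    print(may_execute(key, tag, {**cart, "merchant": "attacker.example"}))

    # Injection during cart construction: the agent approves the wrong cart
    # under a standing mandate, and the binding check passes as designed.
    steered = {"merchant": "attacker.example", "total": "399.00", "items": ["gift card"]}
    print(may_execute(key, approve(key, steered), steered))
```

The check protects the integrity of whatever was approved; it cannot tell whether the approved cart reflects the user's intent.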


11.5 Dispute resolution and chargebacks

(a) Problem

An agent "hallucinates" that it should buy two Peloton bikes instead of one, or confuses a $49 subscription for a $4,900 one. Who pays?

(b) Why it's hard

Chargeback rules (Visa VCR, Mastercard Chargeback Guide) were written for human card-present and card-not-present flows. "Did the cardholder authorize?" becomes ambiguous when the cardholder authorized the agent which authorized the transaction. Justt.ai and other chargeback-management vendors have already flagged this as a coming wave of friendly fraud.[^24] Merchants fear a flood of "my agent did it" disputes; issuers fear a flood of "agent-initiated first-party misuse" claims.

(c) Current proposals

  • AP2's Payment Mandate is designed specifically as a non-repudiable artifact: a signed VC that can be shown in a dispute to prove the user's pre-authorization.[^2]
  • Visa TAP and Mastercard Agent Pay carry agent-indicator flags in the authorization message, giving issuers data to route disputes differently.[^5][^6]
  • Stripe's ACP retains the merchant as Merchant-of-Record precisely to preserve existing chargeback mechanics.[^3]
  • Linklaters and the Consumer Bankers Association have flagged the need for new dispute codes specifically for agent-initiated transactions.[^25][^26]

(d) What remains unsolved

Chargeback code schemes have not been updated. The legal question (is a signed mandate conclusive evidence of authorization, or merely presumptive?) has no case law. For on-chain agents (x402), there is no chargeback at all (see §11.12 below). A realistic outcome is that issuers quietly absorb early agent-dispute losses as a customer-acquisition cost; that is not a scalable equilibrium.


11.6 Liability ambiguity (as a business problem)

(a) Problem

When an agent makes a bad purchase, liability can plausibly sit with: the user (they signed the mandate), the agent operator (they built the model), the platform (they operationalized it), the merchant (they accepted an agent transaction), the issuer (they approved the auth), or the network (they routed it). Existing payment-law allocations assume one of the first four.

(b) Why it's hard

The agent operator is usually an AI company whose Terms of Service disclaim consequential damages. The platform is usually the same company. Merchant-acceptance rules (Visa/Mastercard operating regulations) say the merchant must verify authorization, but offer no guidance on accepting a mandate. Section 10 Regulation and Compliance covers the statutory side; the business reality is that someone is writing off the first few hundred million dollars of agent errors, and no one wants it to be them.

(c) Current proposals

  • Contractual allocation via Agent Pay Acceptance Frameworks (Mastercard)[^6] and TAP ecosystem rules (Visa).[^5]
  • Insurance products: Catena Labs and Crossmint have both hinted at agent-action insurance.[^14][^27]
  • Merchant-side indemnity from platforms (Stripe has offered limited chargeback indemnity for ACP transactions in its dev docs).[^3]

(d) What remains unsolved

No statutory safe harbor exists for "good-faith reliance on a signed mandate." Without it, merchants and issuers price agent risk as extreme-tail; consumers face a liability regime less clear than card-present or ACH. The Consumer Bankers Association explicitly called for legislative clarification in its 2025 white paper.[^26]


11.7 Micropayment economics: can stablecoins and x402 actually beat cards?

(a) Problem

A canonical agentic use case is paying $0.001 per API call, $0.05 per piece of scraped content, $1.50 per image generation. Card rails are structurally incapable of this: interchange + assessment + acquirer markup floors at ~$0.05–$0.10 per transaction,[^28] so a 1-cent payment is economically impossible.
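
A quick worked calculation makes the floor effect visible. It uses only the figures quoted above (a $0.05 to $0.10 per-transaction floor and the three example price points, plus the 1-cent case) and expresses the fixed fee as a percentage of the payment; `fee_share` is an illustrative helper name.

```python
from decimal import Decimal

FLOOR_LOW = Decimal("0.05")   # lower bound of the per-transaction floor cited above
FLOOR_HIGH = Decimal("0.10")  # upper bound of the per-transaction floor cited above


def fee_share(payment: Decimal, floor: Decimal) -> Decimal:
    """Fixed fee expressed as a percentage of the payment, to one decimal place."""
    return (floor / payment * 100).quantize(Decimal("0.1"))


examples = [
    ("API call", Decimal("0.001")),
    ("1-cent payment", Decimal("0.01")),
    ("scraped content", Decimal("0.05")),
    ("image generation", Decimal("1.50")),
]

for label, amount in examples:
    low = fee_share(amount, FLOOR_LOW)
    high = fee_share(amount, FLOOR_HIGH)
    print(f"{label:<17} ${amount}: fee is {low}% to {high}% of the payment")
```

At one cent the fee is several times the payment itself; only at the $1.50 price point does the floor fall to a single-digit percentage.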

(b) Why it's hard

Cards have floor fees because they are built around 1970s settlement, fraud insurance, and chargeback reserves. Stablecoin rails promise near-zero marginal cost, but L1 gas is volatile (Ethereum mainnet fees spiked above $10 per transfer multiple times in 2024–25), and L2 or Solana fees, while sub-cent, still depend on network conditions. Coinbase's CDP facilitator offers free x402 on Base for the first 1,000 transactions/month per account,[^4] which is a subsidy, not a cost structure.

(c) Current proposals

  • x402 via Base / Solana / Polygon with stablecoin USDC settlement.[^4]
  • Cloudflare pay-per-crawl using x402 as the pay-per-access rail for AI training data.[^29]
  • Aggregation approaches (sum micropayments then settle over card/ACH) from Skyfire and Nekuda.[^12][^13]
  • Card-network counter-moves: Mastercard Agentic Tokens permit low-value authorizations with pre-aggregated settlement,[^6] though floor economics remain.
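
The aggregation approach in the list above can be sketched as a small ledger that accumulates micropayments and settles once a threshold is reached, so the fixed fee is paid per settlement rather than per payment. The `Aggregator` class, the $5 threshold, and the $0.10 fee are assumptions chosen for illustration; they do not describe how Skyfire or Nekuda actually settle.

```python
from decimal import Decimal


class Aggregator:
    """Accumulate micropayments and settle in batches (illustrative only)."""

    def __init__(self, settle_at: Decimal, fee_per_settlement: Decimal) -> None:
        self.settle_at = settle_at
        self.fee = fee_per_settlement
        self.pending = Decimal("0")
        self.settled = Decimal("0")
        self.fees_paid = Decimal("0")

    def record(self, amount: Decimal) -> bool:
        """Add one micropayment; return True if it triggered a settlement."""
        self.pending += amount
        if self.pending >= self.settle_at:
            self.settled += self.pending
            self.fees_paid += self.fee
            self.pending = Decimal("0")
            return True
        return False

    def effective_fee_rate(self) -> Decimal:
        """Fees paid as a fraction of settled volume (0 if nothing settled yet)."""
        return self.fees_paid / self.settled if self.settled else Decimal("0")


if __name__ == "__main__":
    ledger = Aggregator(settle_at=Decimal("5"), fee_per_settlement=Decimal("0.10"))
    for _ in range(10_000):
        ledger.record(Decimal("0.001"))
    print(ledger.settled, ledger.fees_paid, ledger.effective_fee_rate())
```

Batching turns the fee floor into a small percentage, but the unsettled balance in `pending` is credit exposure that someone has to carry between settlements.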

(d) What remains unsolved

Whether stablecoin rails can beat cards at scale and under adversarial load is untested. Three unresolved factors:

  1. Chargeback and fraud insurance are not free; cards charge for them. Stablecoin micropayments currently have no equivalent insurance layer; whoever builds one will re-introduce fees.
  2. On/off-ramp costs (USD↔USDC) still sit at 0.5–1.5% and dominate at high volume.
  3. Regulatory treatment of stablecoin payments under MiCA (EU), the US GENIUS Act (signed into law in July 2025, with implementing rules still pending), and UK stablecoin rules is not yet settled; see §10.

The honest answer: stablecoins win economically today for genuinely sub-cent machine-to-machine flows; they do not yet win for the $5–$50 consumer transactions that dominate commerce, and vendor projections that assume they will are extrapolating from a subsidized base.


11.8 Privacy: the agent sees everything

(a) Problem

An agent capable of shopping on your behalf has, by construction, access to your identity, payment instruments, purchase history, location, calendar, and (often) inbox. Data minimization, a core principle of GDPR, UK DPA, and CCPA, is structurally at odds with agent usefulness.

(b) Why it's hard

LLMs are stateful in their context window and stateless across sessions, but agent operators retain logs for debugging, safety, abuse detection, and model improvement. "We don't train on your data" is not the same as "we don't retain your data." Privacy-preserving agent architectures (local inference, confidential-compute TEEs, selective-disclosure VCs) exist in principle but are absent from all consumer deployments as of Q1 2026.

(c) Current proposals

  • Selective-disclosure VCs in AP2 (a Payment Mandate can prove "cardholder is over 18 and in the EU" without revealing more).[^2][^8]
  • On-device agents (Apple's rumoured on-device Siri agent; Apple Intelligence's Private Cloud Compute model) provide a template for confidential execution.[^30]
  • Skyfire's pseudonymous agent identities avoid coupling agent traffic to the underlying human.[^12]

(d) What remains unsolved

No deployed consumer agent today runs fully on-device with end-to-end confidential execution. OpenAI, Anthropic, and Google all see the full prompt-plus-tool-stream for consumer agent sessions, which is plausibly the richest commercial dataset ever aggregated. The privacy critique is not that this is illegal; it is that no one has credibly offered a privacy-preserving alternative with equivalent capability.


11.9 Merchant disintermediation and brand risk

(a) Problem

If agents select products, merchants lose the Zero Moment of Truth (Google's own framing, circa 2011). An agent comparing ten detergent SKUs on price and shipping will not be moved by a TV spot. Brand premium collapses; commoditization accelerates.

(b) Why it's hard

Merchants have two structural responses: (i) refuse to sell to agents (which forfeits share to those that do), or (ii) optimize for agent-selection (which means exposing rich structured data, which in turn helps agents commoditize them further). This is a prisoner's dilemma at industry scale. Walmart's early adoption of ChatGPT Instant Checkout[^31] is a defensive move: better to be an agent-preferred retailer than an agent-neutral one.

(c) Current proposals

  • Agent-facing SKU feeds (Shopify has published agent-commerce schemas for UCP).[^7]
  • Differentiated agent experiences. Etsy's agent integration promotes seller stories and authenticity signals that price-only ranking would hide.[^3]
  • Brand-pays-for-placement patterns, analogous to retail-media networks, within agent surfaces.

(d) What remains unsolved

Agent-surface advertising is currently unregulated. If OpenAI or Google accept payment for product placement inside an agent's "recommendation," the distinction from organic selection is invisible to the user. This is a harder version of the early-2000s search-ads disclosure problem, and there is no consensus on how it should be regulated. Expect a CFPB/FTC enforcement action before the end of 2026.


11.10 Trust calibration: how does a user decide $X?

(a) Problem

Every mandate UX requires the user to pick an amount cap, a scope, and an expiry. Users have no intuition for these parameters. Should the monthly cap for a grocery agent be $200? $500? $2,000?

(b) Why it's hard

Trust calibration requires a mental model of agent failure modes (hallucination rate, adversarial robustness, tool reliability) that ordinary users do not possess, and about which the industry itself publishes almost no transparent data. Issuers have no agent-level risk scores comparable to FICO. The user is being asked to price a risk no one has priced before.

(c) Current proposals

  • Issuer-side throttling (U.S. Bank and Citi pilot with Mastercard Agent Pay applies issuer risk scores regardless of the mandate cap).[^6]
  • Category defaults in AP2 ("grocery-category standing mandate" ships with a conservative default cap).[^2]
  • Visa TAP's signal enrichment gives the issuer enough context to apply additional rules.[^5]

(d) What remains unsolved

There is no public benchmark of agent transaction-error rates: not for OpenAI, not for Google, not for Anthropic. Users are asked to delegate money without the kind of product-safety disclosure required for almost every other financial product. Absent benchmark transparency, trust calibration will be learned through loss, which is slow and expensive.


11.11 Agent-to-agent market manipulation and collusion

(a) Problem

If both buyer and seller are agents, the market becomes an algorithmic bargaining game. Known failure modes from algorithmic trading (flash crashes, spoofing, tacit collusion) can recur in consumer commerce.

(b) Why it's hard

Pricing agents that learn from each other can converge on supracompetitive equilibria without explicit communication; the EU's 2023 competition-policy work on algorithmic pricing already flagged this.[^32] When every merchant runs a pricing agent and every consumer runs a purchasing agent, the joint dynamics are an open research question; MDPI's 2025 survey of agent-blockchain systems touches this but does not solve it.[^33]

(c) Current proposals

  • Rate-limiting and anomaly detection at network level (Visa TAP explicitly identifies "agent storms").[^5]
  • On-chain transparency (ERC-8004 attestations produce public logs that can be post-hoc audited).[^11]
  • Regulator-side efforts from the CMA (UK) and DG COMP (EU) to extend algorithmic-pricing guidance.

(d) What remains unsolved

No jurisdiction has rules specifically for agent-vs-agent markets. Collusion via shared model weights (both sides using the same frontier LLM) is not addressed by traditional antitrust frameworks, which assume separable decision-makers.


11.12 Reversibility: the on-chain / card asymmetry

(a) Problem

A card payment is reversible for up to 120 days (chargeback). An x402 payment on Base is irreversible at confirmation. An agent confused about which rail it is using, or which rail its counterparty is using, can make irreversible errors.

(b) Why it's hard

Agents compose tools. An agent might retrieve a card token from Stripe (reversible), then spend stablecoin via x402 (irreversible), then settle an invoice via PayPal (complex dispute rules). The agent must track reversibility semantics per-rail; LLM-based agents are notoriously bad at tracking such state.

(c) Current proposals

  • Mandate-pinned rails: AP2 and ACP both let the user scope a mandate to a specific rail.[^2][^3]
  • Rail-aware routing in agent SDKs (Catena ACK, Crossmint).[^14][^27]
  • On-chain escrow via smart-contract intermediaries to emulate chargeback-like holds for x402.[^4]
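
One way to keep reversibility out of the model's working memory is to encode it as data and enforce it in ordinary code. The sketch below does that for two rails, using the 120-day card window and the irreversibility of x402 stated in (a). The `Rail` record, the `route` function, and the idea of a per-payment irreversible limit are assumptions for illustration; they are not taken from the Catena ACK or Crossmint SDKs.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Rail:
    name: str
    reversible: bool
    dispute_window_days: Optional[int]  # None when no dispute process exists


RAILS = {
    "card": Rail("card", reversible=True, dispute_window_days=120),
    "x402": Rail("x402", reversible=False, dispute_window_days=None),
}


def route(rail_name: str, amount: Decimal, irreversible_limit: Decimal,
          user_confirmed: bool = False) -> str:
    """Decide whether a payment may proceed on the named rail."""
    rail = RAILS.get(rail_name)
    if rail is None:
        return "deny: unknown rail"
    if rail.reversible:
        return "allow"
    # Irreversible rails: small amounts pass, larger ones need explicit confirmation.
    if amount <= irreversible_limit or user_confirmed:
        return "allow"
    return "hold: confirm irreversible payment"


if __name__ == "__main__":
    limit = Decimal("20")
    print(route("card", Decimal("900"), limit))
    print(route("x402", Decimal("900"), limit))
    print(route("x402", Decimal("900"), limit, user_confirmed=True))
```

The design choice is that the agent never has to remember which rail is reversible; the policy layer refuses to proceed when it matters.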

(d) What remains unsolved

Escrow-based "chargeback emulation" for stablecoins exists only in prototype form. There is no cross-rail reversibility standard. Until there is, any agent with both card and on-chain authority is a liability.


11.13 Token cost and environmental ballooning

(a) Problem

A single non-trivial shopping session with a frontier LLM agent can consume 50k–500k tokens across planning, tool calls, and re-plans. At 2026 list prices that is a meaningful fraction of the purchase's interchange, and the carbon footprint is non-trivial.
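
A back-of-envelope calculation shows how the comparison works. Only the 50k to 500k token range comes from the text; the price per million tokens, the interchange rate, and the basket size are hypothetical placeholders, not quoted figures, and should be replaced with real numbers before drawing conclusions.

```python
from decimal import Decimal

USD_PER_MILLION_TOKENS = Decimal("5.00")  # hypothetical blended price, not a list price
INTERCHANGE_RATE = Decimal("0.02")        # hypothetical 2% interchange
BASKET = Decimal("80.00")                 # hypothetical basket size in USD


def inference_cost(tokens: int,
                   usd_per_million: Decimal = USD_PER_MILLION_TOKENS) -> Decimal:
    """Inference cost in USD for a session, rounded to cents."""
    cost = Decimal(tokens) / Decimal(1_000_000) * usd_per_million
    return cost.quantize(Decimal("0.01"))


interchange = (BASKET * INTERCHANGE_RATE).quantize(Decimal("0.01"))

for tokens in (50_000, 500_000):
    cost = inference_cost(tokens)
    ratio = cost / interchange
    print(f"{tokens:,} tokens: ${cost} inference vs ${interchange} interchange "
          f"(ratio {ratio:.2f})")
```

Under these placeholder inputs the token bill ranges from a fraction of interchange to more than all of it, which is why the assumptions matter more than the formula.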

(b) Why it's hard

Agent quality scales with compute. Reducing token usage reduces robustness, which raises failure rates, which raises dispute rates, which raises real cost. The industry has optimized for capability, not efficiency.

(c) Current proposals

  • Caching and prompt-compression (Anthropic prompt-caching, OpenAI structured-output mode).
  • Small-model routing for deterministic sub-tasks (e.g., cart-arithmetic in a small model, persuasion/negotiation in a frontier model).
  • On-device inference for privacy and cost (Apple Private Cloud Compute model).[^30]

(d) What remains unsolved

No public data quantifies the net economic or environmental cost of an agent transaction vs. a human-initiated one. Vendor claims of efficiency gains are, as of 2026, unbacked by audited figures. Expect this to become an ESG-disclosure issue for listed merchants that adopt agents at scale.


11.14 Accessibility and the digital divide

(a) Problem

The agentic-payments stack, as deployed, assumes: a smartphone, a card-linked wallet, fluency in one of a handful of languages, a platform account (OpenAI/Google/Anthropic), and, increasingly, a crypto wallet. Each requirement excludes a population that already experiences payment friction.

(b) Why it's hard

Agents are being positioned as an accessibility win (voice, natural language, lowered skill requirements). That framing is partly true and partly marketing. The underlying account, identity, and dispute infrastructure still presupposes digital-native users. The unbanked and under-banked (around 1.4 billion adults globally per the World Bank Findex) are not served by any of the protocols above, because none address first-mile onboarding.

(c) Current proposals

  • Crossmint + MoneyGram stablecoin payouts as a cash-out path for agent-earned income.[^27]
  • Ant International / Antom integration into Mastercard Agent Pay brings emerging-market rails into scope.[^6]
  • Voice-first agent interfaces (OpenAI Voice Mode, Gemini Live) do lower literacy barriers.

(d) What remains unsolved

None of the announced protocols address disability accessibility at the mandate-UX layer (WCAG compliance for mandate-signing flows is unspecified). Disparate-impact risk is real: if agent commerce gets meaningfully cheaper or faster than non-agent commerce, users who cannot use agents face a price disadvantage. Regulators have not yet addressed this.


11.15 Prioritization matrix

The following matrix rates each pain point on severity (how bad the outcome if unaddressed) and tractability (how likely a credible fix emerges in 18–36 months). Both scales run 1–5. The product is not an objective risk score; it is a reading guide.

| # | Pain point | Severity | Tractability | S×T | Category |
| --- | --- | --- | --- | --- | --- |
| 4 | Prompt injection / authority leakage | 5 | 1 | 5 | Existential, intractable |
| 2 | Agent identity / KYA | 5 | 2 | 10 | Existential, hard |
| 1 | Interoperability / fragmentation | 4 | 3 | 12 | Structural, tractable |
| 5 | Dispute / chargebacks | 5 | 3 | 15 | Structural, tractable |
| 6 | Liability ambiguity | 4 | 2 | 8 | Structural, slow |
| 11 | Agent-vs-agent manipulation | 5 | 1 | 5 | Latent, intractable |
| 12 | Reversibility asymmetry | 4 | 3 | 12 | Structural, tractable |
| 3 | Consent UX | 4 | 2 | 8 | UX, hard |
| 10 | Trust calibration | 4 | 2 | 8 | UX, hard |
| 8 | Privacy | 4 | 2 | 8 | Structural, slow |
| 9 | Merchant disintermediation | 3 | 3 | 9 | Market, evolving |
| 7 | Micropayment economics | 3 | 4 | 12 | Market, tractable |
| 13 | Token cost / environment | 2 | 4 | 8 | Market, tractable |
| 14 | Accessibility | 3 | 2 | 6 | Equity, slow |

Read across rows: prompt injection is the single most dangerous item and the one with the least credible fix; identity, disputes, and reversibility are nearly as severe but at least have plausible 18-month paths. Interoperability looks bad today but is likely to be solved by market consolidation (UCP-ification) more than by heroic engineering. Agent-vs-agent manipulation is the dark horse: low industry attention, high potential severity.
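
For readers who want to re-rank the matrix under their own weights, the ratings can be kept as data. The sketch below reproduces the severity and tractability scores from the matrix, recomputes the product, and sorts by "most severe, least tractable first"; that sort order is one possible reading, chosen here for illustration, not a ranking the matrix itself prescribes.

```python
# (section number, pain point, severity 1-5, tractability 1-5), as rated in the matrix
ROWS = [
    (4, "Prompt injection / authority leakage", 5, 1),
    (2, "Agent identity / KYA", 5, 2),
    (1, "Interoperability / fragmentation", 4, 3),
    (5, "Dispute / chargebacks", 5, 3),
    (6, "Liability ambiguity", 4, 2),
    (11, "Agent-vs-agent manipulation", 5, 1),
    (12, "Reversibility asymmetry", 4, 3),
    (3, "Consent UX", 4, 2),
    (10, "Trust calibration", 4, 2),
    (8, "Privacy", 4, 2),
    (9, "Merchant disintermediation", 3, 3),
    (7, "Micropayment economics", 3, 4),
    (13, "Token cost / environment", 2, 4),
    (14, "Accessibility", 3, 2),
]


def score(severity: int, tractability: int) -> int:
    """The S x T product used as a reading guide."""
    return severity * tractability


# Highest severity first; within equal severity, lowest tractability first.
ranked = sorted(ROWS, key=lambda row: (-row[2], row[3]))

for number, name, sev, tract in ranked:
    print(f"11.{number:<2} {name:<38} S={sev} T={tract} SxT={score(sev, tract)}")
```

Because Python's sort is stable, rows with identical ratings keep the order in which the matrix lists them.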


11.16 Synthesis

If one strips away the vendor announcements, three honest summary statements emerge:

  1. The industry shipped protocols before it shipped trust primitives. AP2, ACP, TAP, x402 all assume identity, consent, and dispute infrastructure that does not yet exist. The protocols are running on I.O.U.s from future standards bodies.
  2. The worst security problem is unsolved and possibly unsolvable. Prompt injection is structural to current LLMs; every agent with both tool authority and untrusted-content exposure is vulnerable. Mandates reduce blast radius; they do not eliminate it.
  3. The economic case beyond novelty is thin. Micropayment economics work for genuine machine-to-machine flows; for consumer commerce, agents mostly re-package existing card rails and add token cost and dispute risk on top. The win is UX, not unit economics, and UX wins evaporate at the first chargeback scandal.

A useful frame is that agentic payments in 2026 are where e-commerce was in 1998: real, growing, profoundly under-regulated, with participants racing to establish default behaviours before anyone has agreed what "correct" means. The pain points above are not bugs to be patched in the next release; they are the research and policy agenda of the next decade.


Sources

[^1]: Simon Willison, "Prompt injection: the AI vulnerability we still can't fix." Quoted in context in his ongoing series, e.g. "Design Patterns for Securing LLM Agents against Prompt Injections" (June 13, 2025). https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/

[^2]: Google, "Agent Payments Protocol (AP2)" specification. https://github.com/google-agentic-commerce/AP2/blob/main/docs/specification.md ; announcement: https://cloud.google.com/blog/products/ai-machine-learning/announcing-agents-to-payments-ap2-protocol

[^3]: Agentic Commerce Protocol (OpenAI + Stripe). https://github.com/agentic-commerce-protocol/agentic-commerce-protocol ; https://agenticcommerce.dev/ ; Stripe docs: https://docs.stripe.com/agentic-commerce/protocol

[^4]: Coinbase, "x402" protocol documentation. https://docs.x402.org/ ; facilitator concept: https://docs.x402.org/core-concepts/facilitator ; repo: https://github.com/coinbase/x402

[^5]: Visa, "Trusted Agent Protocol" (Oct 14, 2025). https://investor.visa.com/news/news-details/2025/Visa-Introduces-Trusted-Agent-Protocol-An-Ecosystem-Led-Framework-for-AI-Commerce/default.aspx ; https://github.com/visa/trusted-agent-protocol

[^6]: Mastercard, "Agent Pay" (Apr 30, 2025). https://paymentexpert.com/2025/04/30/mastercard-microsoft-ai-agent-pay/ ; PayPal expansion (Oct 27, 2025): https://newsroom.paypal-corp.com/2025-10-27-Mastercard-and-PayPal-Join-Forces-To-Accelerate-Secure-Global-Agentic-Commerce

[^7]: Google Developers Blog, "Under the Hood: Universal Commerce Protocol (UCP)" (NRF Big Show, January 2026). https://developers.googleblog.com/under-the-hood-universal-commerce-protocol-ucp/

[^8]: W3C, "Verifiable Credentials Data Model 2.0" (W3C Recommendation). https://www.w3.org/TR/vc-data-model-2.0/

[^9]: IETF RFC 9421, "HTTP Message Signatures" (Feb 2024). https://datatracker.ietf.org/doc/rfc9421/

[^10]: IETF draft-meunier-web-bot-auth-architecture (Cloudflare). https://datatracker.ietf.org/doc/draft-meunier-web-bot-auth-architecture/

[^11]: ERC-8004 "Trustless Agents." https://eips.ethereum.org/EIPS/eip-8004 ; Ethereum Foundation: https://ai.ethereum.foundation/blog/intro-erc-8004

[^12]: Skyfire: TechCrunch (Aug 21, 2024), "Skyfire lets AI agents spend your money." https://techcrunch.com/2024/08/21/skyfire-lets-ai-agents-spend-your-money/

[^13]: Nekuda funding announcement, Crowdfund Insider (May 2025). https://www.crowdfundinsider.com/2025/05/239660-fintech-startup-nekuda-secures-funding-led-by-madrona-ventures-to-enable-agentic-payments/

[^14]: Catena Labs + Agent Commerce Kit (ACK). BusinessWire (May 20, 2025). https://www.businesswire.com/news/home/20250520361792/en/Circle-Co-Founder-Sean-Neville-Takes-Catena-Labs-Out-of-Stealth-with-Plans-to-Build-the-First-AI-Native-Financial-Institution

[^15]: W3C, "Decentralized Identifiers (DIDs) v1.0." https://www.w3.org/TR/did-core/

[^16]: On the historical failure of certificate revocation at web scale, see Adam Langley's well-known critique (referenced in Cloudflare Web Bot Auth design rationale[^10]). General background: https://blog.cloudflare.com/x402/ (illustrating the parallel design concerns).

[^17]: General background on consent-UX failure modes in OAuth and mobile permissions; see e.g. Cloud Security Alliance AP2 guidance[^18] for the agent-era framing.

[^18]: Cloud Security Alliance, "Secure Use of the Agent Payments Protocol (AP2)" (Oct 6, 2025). https://cloudsecurityalliance.org/blog/2025/10/06/secure-use-of-the-agent-payments-protocol-ap2-a-framework-for-trustworthy-ai-driven-transactions

[^19]: For the regulatory baseline see §10 Regulation and Compliance; Linklaters TechInsights, "Agentic payments: legal risks." https://techinsights.linklaters.com/post/102l0hm/agentic-payments-what-are-they-what-are-the-legal-risks-and-whats-next

[^20]: Simon Willison, "Design Patterns for Securing LLM Agents against Prompt Injections" (Jun 13, 2025). https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/

[^21]: Google Security Blog, "How we estimate the risk from prompt injection attacks on AI systems" (Jan 2025). https://security.googleblog.com/2025/01/how-we-estimate-risk-from-prompt.html

[^22]: "Securing AI Agents Against Prompt Injection Attacks," arXiv:2511.15759 (Nov 2025). https://arxiv.org/abs/2511.15759

[^23]: "Prompt Injection Attacks in Large Language Models and AI Agent Systems: A Comprehensive Review," Information 17(1):54 (MDPI, 2026). https://www.mdpi.com/2078-2489/17/1/54

[^24]: Justt.ai, "Agentic Commerce: Preparing for Chargeback and Fraud Risks." https://justt.ai/blog/agentic-commerce-chargeback-risk-preparation/

[^25]: Linklaters TechInsights, "Agentic payments: legal risks." https://techinsights.linklaters.com/post/102l0hm/agentic-payments-what-are-they-what-are-the-legal-risks-and-whats-next

[^26]: Consumer Bankers Association, "White paper: Agentic AI, consumer payments, and the future of regulation" (2025). https://consumerbankers.com/press-release/cba-releases-white-paper-examining-agentic-ai-consumer-payments-and-the-future-of-regulation/

[^27]: Crossmint, "Agentic Payments." https://www.crossmint.com/solutions/agentic-payments ; Circle Ventures investment: https://cryptobriefing.com/circle-ventures-investment-crossmint-stablecoin/

[^28]: For card interchange floor economics see Kearney, "Agentic payments: a new frontier in digital commerce." https://www.kearney.com/industry/financial-services/article/agentic-payments-a-new-frontier-in-digital-commerce ; also Payments Association analysis: https://thepaymentsassociation.org/article/ai-powered-payment-agents-the-next-payments-revolution/

[^29]: Cloudflare, "x402 and pay-per-crawl" (Oct 2025). https://blog.cloudflare.com/x402/ ; press: https://www.cloudflare.com/press/press-releases/2025/cloudflare-collaborates-with-leading-payments-companies-to-secure-and-enable-agentic-commerce/

[^30]: Apple's Private Cloud Compute architecture (referenced in industry analysis as a template for confidential agent execution). See discussion in McKinsey, "The agentic commerce opportunity." https://www.mckinsey.com/capabilities/quantumblack/our-insights/europes-agentic-commerce-moment-decision-influence-is-here-execution-is-coming

[^31]: Walmart × OpenAI Instant Checkout (Oct 14, 2025). https://corporate.walmart.com/news/2025/10/14/walmart-partners-with-openai-to-create-ai-first-shopping-experiences ; CNBC: https://www.cnbc.com/2025/10/14/walmart-openai-chatgpt-shopping.html

[^32]: On algorithmic pricing and tacit collusion: general framing discussed in MDPI agent-blockchain survey[^33]; industry framing in Chainlink, "AI Agent Payments: The Future of Autonomous Commerce." https://chain.link/article/ai-agent-payments

[^33]: "AI Agents Meet Blockchain: A Survey on Secure and Scalable Multi-Agent Systems," MDPI Future Internet 17(2):57. https://www.mdpi.com/1999-5903/17/2/57