Start With the Problem

An open brief on agentic commerce, nine ideas, and what a structured approach to invalidation leaves you with.

Steven Gerson · December 2025

Context: This article documents how I approached an open product brief on agentic commerce. The brief was set in December 2025 as part of an interview process.

The proposition, a network-level trust layer for AI-agent-driven commerce, is described in full. The approach that got there, generating ideas and systematically invalidating them, is the more transferable part.

1. The Brief

The brief was direct: define a product proposition in the agentic commerce space that Mastercard should consider.

Open briefs like this look like an invitation to ideate. The technology is named, you picture applications, you start building a list. That instinct is worth resisting. Ideas generated quickly against a technology, before the problem space has been understood, are usually solutions in search of problems.

The first move was to reframe the question.

2. Reframe Before You Ideate

The brief asked: what can we build with agentic commerce? That is a solution-space question. It produces a list of things that could be built, but leaves the harder questions open: which of them addresses a real problem, and whether the organisation asking is the right one to build it.

A better question: what does Mastercard need to become to thrive in a world where agentic commerce is mainstream? That reframe puts the organisation’s position and the changing landscape at the centre, rather than the technology. The ideas it generates are different in kind: they are grounded in what the organisation already holds and what the market actually requires, rather than in what the technology makes possible.

What can we build? (solution space)

What must we become? (problem space)

Original presentation slide: Asking the Right Questions. Natural instinct: What can we build with agentic commerce? A better question: How will agentic commerce change how buying happens? What does Mastercard need to become to thrive in that world? — From the original presentation (December 2025)

Once the question centres on the organisation rather than the technology, the research it demands changes: understand how the landscape is shifting, and where that shift creates problems the organisation is positioned to solve.

3. Understand the Space

That research started with the domain itself: how agentic commerce was developing, what infrastructure it relied on, and where the trajectory would create problems that current systems could not handle. That meant working through Mastercard’s published data on transaction fraud and authorisation patterns, industry analysis of agent adoption across payment ecosystems, and the technical constraints of how agents interact with payment rails today.

Agents are moving from supervised assistants to independent actors: they are triggered by data and they chain together, so that one agent’s action triggers another system. Increasingly, the buyer is an autonomous agent.

That shift matters for Mastercard specifically. Their fraud, identity, and risk systems are built on human behavioural patterns. Agent behaviour looks different: high-frequency bursts, event-driven purchasing, adaptive drift. Models trained on human patterns misread agent activity in both directions: a burst of twenty purchases in a minute looks fraudulent under human baselines, but may be routine restocking for an agent. Legitimate transactions get blocked. Compromised agents slip through. Both outcomes erode trust and have a direct commercial cost.
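The false-decline half of that mismatch can be shown with a toy velocity check. This is a minimal sketch and not a description of any real fraud model: the `Baseline` class, the `looks_anomalous` function, and both per-minute thresholds are hypothetical, chosen only so that the same burst of twenty purchases is judged differently under a human baseline and an agent baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """Expected purchasing velocity for one kind of buyer (illustrative only)."""
    name: str
    max_purchases_per_minute: int


# Hypothetical thresholds, picked purely to illustrate the mismatch.
HUMAN_BASELINE = Baseline("human", max_purchases_per_minute=3)
RESTOCKING_AGENT_BASELINE = Baseline("restocking-agent", max_purchases_per_minute=50)


def looks_anomalous(timestamps: list[float], baseline: Baseline) -> bool:
    """Return True if any 60-second window holds more purchases than allowed."""
    ordered = sorted(timestamps)
    start = 0
    for end, ts in enumerate(ordered):
        # Move the window start forward until the window spans under 60 seconds.
        while ts - ordered[start] >= 60.0:
            start += 1
        if end - start + 1 > baseline.max_purchases_per_minute:
            return True
    return False


# Twenty purchases, three seconds apart: all inside a single minute.
burst = [i * 3.0 for i in range(20)]

print(looks_anomalous(burst, HUMAN_BASELINE))             # True
print(looks_anomalous(burst, RESTOCKING_AGENT_BASELINE))  # False
```

The same transactions are flagged or passed depending only on which baseline is applied, which is why the proposition later treats agent identity and agent-specific baselines as prerequisites for scoring.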

Original presentation slide: Why Current Models Fail. Human behaviour shows a broad predictable rhythm and stable location. Agent behaviours show high-frequency, event burst, multi-peak, and adaptive drift patterns. Human-trained models misread agent behaviour, causing false declines and fraud leakage. — From the original presentation (December 2025)

If the buyer is shifting from humans to agents, the trust model must shift with them. If Mastercard does not set trust standards for autonomous buyers, competing ecosystems will, gradually shifting trust governance away from the card network.

4. The Graveyard

With the landscape understood, the next step was to go wide. Nine ideas were generated for what Mastercard could build in this space, using AI to accelerate the process. The value of generating quickly is that you can invalidate quickly. Each idea was then tested against a structured framework: is the problem real, does the customer feel it, would they change their behaviour, does this address it better than what they do today, and is it Mastercard’s to build?

Problem real?

Customer feels it?

Would change behaviour?

Reasonable alternative?

Theirs to build?

Most collapsed. Some looked strong on first pass.

Idea	Why it collapsed
Autonomous agents negotiating B2B contracts and pricing	Negotiation is a relationship problem. Simpler procurement tools already address the efficiency side. The problem is felt differently than the idea assumes.
Robots and drones autonomously buying their own replacement parts	Industrial systems already have central orchestrators managing procurement. Decentralised purchasing adds cost and risk without solving a problem the market has.
Autonomous energy trading between devices, homes, and EVs	Regulators actively prohibit decentralised grid autonomy. Central coordination is required for safety. The constraint is structural, and the structure is stable.
AI-driven autonomous travel booking agents	Recommendation engines already solve the real pain point, which is discovery rather than execution. The friction in travel is in choosing what to book.

Four of the nine are shown above. Four more collapsed for similar reasons: the problem assumption was wrong, the reasonable alternative was adequate, or the organisation had no particular advantage in pursuing it.

Each of these failures is informative in a specific way. An idea that fails the “is the problem real” test is different from one that fails the “is it theirs to build” test. The first requires a different customer. The second requires a different organisation. Knowing which assumption broke tells you what would need to change for the idea to work.
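One way to make "which assumption broke" explicit is to record each test as a named check and report the first one an idea fails. The sketch below is an illustration of that bookkeeping, not the tooling used for the brief. The identifiers (`TESTS`, `Idea`, `first_broken_assumption`, `what_would_need_to_change`) are hypothetical, the example idea is a placeholder rather than one of the nine, and the remedy mapping contains only the two cases stated above.

```python
from dataclasses import dataclass
from typing import Optional

# The five tests, in the order they are listed in this section.
TESTS = [
    "problem_real",
    "customer_feels_it",
    "would_change_behaviour",
    "beats_reasonable_alternative",
    "theirs_to_build",
]

# Only the two remedies the text states; the others are left unspecified.
REMEDIES = {
    "problem_real": "a different customer",
    "theirs_to_build": "a different organisation",
}


@dataclass
class Idea:
    name: str
    answers: dict[str, bool]  # test name -> did the idea pass?


def first_broken_assumption(idea: Idea) -> Optional[str]:
    """Return the first test the idea fails, or None if it passes all five."""
    for test in TESTS:
        # A test with no recorded answer counts as unproven, so it fails.
        if not idea.answers.get(test, False):
            return test
    return None


def what_would_need_to_change(test: str) -> str:
    """Map a broken assumption to what would have to change for the idea to work."""
    return REMEDIES.get(test, "not stated")


# Placeholder idea that passes everything except organisational fit.
example = Idea("placeholder idea", {**{t: True for t in TESTS}, "theirs_to_build": False})
broken = first_broken_assumption(example)
print(broken)                             # theirs_to_build
print(what_would_need_to_change(broken))  # a different organisation
```

Returning the name of the failed test, rather than a bare pass or fail, keeps the diagnostic information that a simple go/no-go decision throws away.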

Ideas collapse when the problem they address is felt differently than the idea assumes. Finding which assumption breaks is as useful as finding the idea that works.

5. What Survived

After those eight collapsed, one idea remained: a network-level trust layer that allows issuers to confidently approve AI-agent-driven checkouts. The problem it addresses, rising false declines and missed fraud as agent volume scales, connects directly to the gap identified in section 3. Issuers feel it directly and have a clear financial incentive to solve it.

The proposition has four components:

Agent declaration and identity. Recognising agent-initiated transactions as distinct from human ones.
Delegated permissions. Rules that define what the agent is authorised to do, enforced at the point of authorisation.
Behaviour profiles. Baselines for expected agent activity, establishing what “normal” looks like for a given agent before scoring begins.
Agent-aware fraud scoring. Risk models built specifically for agent behaviour patterns.
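To make the first two components concrete, here is a minimal sketch of a declared agent being checked against its delegated permissions at authorisation. Everything in it is an assumption made for illustration: the `Mandate` and `AuthRequest` types, their fields, and the `check_mandate` function are hypothetical and do not represent Mastercard’s systems or any real authorisation message format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mandate:
    """What the account holder has authorised an agent to do (hypothetical fields)."""
    agent_id: str
    max_amount_minor: int            # per-transaction ceiling, in minor units
    allowed_categories: frozenset[str]


@dataclass(frozen=True)
class AuthRequest:
    agent_id: Optional[str]          # None means no agent was declared
    amount_minor: int
    category: str


def check_mandate(request: AuthRequest, mandates: dict[str, Mandate]) -> tuple[bool, str]:
    """Return (allowed, reason) for the delegated-permissions check only."""
    if request.agent_id is None:
        # Agent declaration: an undeclared transaction is treated as human-initiated.
        return True, "no agent declared"
    mandate = mandates.get(request.agent_id)
    if mandate is None:
        return False, "agent declared but no mandate on file"
    if request.category not in mandate.allowed_categories:
        return False, "category outside mandate"
    if request.amount_minor > mandate.max_amount_minor:
        return False, "amount exceeds mandate"
    return True, "within mandate"


mandates = {
    "restock-bot": Mandate("restock-bot", 50_000, frozenset({"office-supplies"})),
}

print(check_mandate(AuthRequest("restock-bot", 12_000, "office-supplies"), mandates))
# (True, 'within mandate')
print(check_mandate(AuthRequest("restock-bot", 12_000, "electronics"), mandates))
# (False, 'category outside mandate')
```

A check like this is deterministic and cheap, so in this sketch it would run before any behaviour profile or risk score is consulted. The scoring components then only need to judge transactions that are already within what the agent was permitted to do.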

Defining the components is one thing. Validating whether they hold requires a specific order: test the cheapest, most consequential question first.

Desirability: issuers already lose revenue to false declines on legitimate transactions. As agent-initiated volume grows, that loss scales with it. A mechanism that makes agent transactions interpretable at authorisation directly reduces that cost. The problem is felt, and it gets worse without intervention.

Feasibility: each component builds on infrastructure Mastercard already operates. Transaction data, identity verification, and risk scoring already flow through the network at authorisation. Extending those systems to handle agent-specific signals is a tractable engineering problem.

Viability: higher approval rates protect issuer revenue on every approved transaction. A network usage fee at authorisation, consistent with how Mastercard monetises existing network services, creates a commercial model that scales with adoption.

Desirability comes first. Confirm that someone wants the thing before investing in whether it can be built.

6. Is It Theirs to Build?

A desirable, feasible proposition still needs to fit the organisation pursuing it. This is the purpose fit question: does the organisation’s existing position give it a natural advantage? The same idea, pursued by a different actor from a weaker starting position, faces a harder road.

Every proposition in this exercise was tested for fit and adjacency: does this idea compound because of the position already held? The question applies regardless of the domain.

For this proposition, the answer was yes. Mastercard already holds network-level authority over global commerce: interoperability across issuers, neutral governance over disputes and liability, and standards authority developed across decades. The agent trust problem requires exactly those assets. Extending network-level trust to autonomous buyers is a continuation of what Mastercard already does, applied to a new category of buyer.

7. The Thinking Behind It

The approach running through each section above (reframe, research, generate, invalidate, test fit) is something I first developed at Experian, in a team that evaluated ideas rigorously before committing to build them. The Experian career story describes that environment in more detail.

The core distinction is between ideas and propositions. An idea is a partly formed thought about what you might build. A proposition frames the same thought from the point of view of the person it would help, making the assumptions explicit: who has the problem, how deeply do they feel it, and would they change their behaviour to address it. That reframing is what turns a thought into something testable.

The discipline is to defer investment. Test desirability before feasibility. Test feasibility before viability. At each stage, spend the least you can to learn the most you need. If an idea fails a test, identify which assumption broke rather than discarding the work entirely. A failed assumption in one context may hold in another.

This brief is a recent example of those principles applied to an open ideation problem. The technology changes. The questions stay the same: what problem does this actually solve, does the customer feel it, what do they do today without this, is this ours to build, and what is the cheapest way to find out whether we are right?