Guide · June 23, 2026 · 8 min read

Agent washing: how to tell a real autonomous agent from a rebranded chatbot before you sign

Gartner estimates only about 130 of thousands of “agentic AI” vendors are the real thing. A buyer’s field guide to spotting rebranded chatbots and automation before procurement.

By the RankShield Helix team · Published June 23, 2026

Every vendor deck now says “agentic.” The label got popular faster than the technology did, and the gap between the two has a name: agent washing, the rebranding of chatbots, robotic process automation, and assistants as autonomous agents. Gartner put a number on it, estimating only about 130 of thousands of agentic-AI vendors are the genuine article. That is not a rounding error; it is the market.

For a buyer, the risk is concrete. Gartner predicts more than 40% of agentic AI projects will be canceled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. Some of that is agent washing coming due: software that could never do what the label promised.

This guide gives you a field test. It separates the four capabilities a real agent has from the ones a chatbot only imitates, then puts the sharpest question at the top: when the agent acts, can it prove what it did? An accountable agent produces a record you can check yourself, not a screenshot you have to trust.

Key takeaways

Agent washing is real and quantified: Gartner estimates only about 130 of thousands of self-described agentic vendors are genuine, and predicts more than 40% of agentic AI projects will be canceled by the end of 2027 on cost, unclear value, and weak controls.
A real agent clears four bars a chatbot can’t: goal-directed action, tool use with real effect, bounded and reversible autonomy, and an independently verifiable record of what it did.
Make “can it prove its actions?” the first question in your buyer’s test — an accountable agent produces verifiable receipts you can check yourself, not screenshots and internal logs you have to trust.

What “agent washing” means and why Gartner counted only about 130 real vendors

Agent washing is the practice of dressing up existing software — chatbots, RPA scripts, virtual assistants — in the language of autonomous agents without adding the underlying capability. The word does the selling; the product stays the same. Gartner named the pattern directly and estimated that only about 130 of the thousands of vendors claiming to be agentic actually are. The distinction matters because the two categories fail differently. A chatbot that overpromises frustrates users. An “agent” trusted with consequential work it cannot safely perform creates real exposure.

The cost of getting this wrong is already visible in the forecast. Gartner predicts more than 40% of agentic AI projects will be canceled by the end of 2027, pointing to escalating costs, unclear business value, and inadequate risk controls. Read that list as a buyer, not a headline: two of the three failure modes are things you can screen for before you sign. Unclear value and weak controls are questions you ask in procurement — and the vendors who cannot answer them are usually the ones who leaned hardest on the label.

The four capabilities a real agent has that a chatbot doesn’t

Strip away the marketing and an autonomous agent is defined by what it can do, not what it can say. A chatbot responds inside a conversation; an agent pursues a goal across steps, tools, and systems, and stays accountable while it does. Four capabilities separate the real thing from a rebrand. Use them as a checklist — a genuine agent clears all four, and most agent-washed products fail at least one, usually the last.

Goal-directed action: it plans and executes multi-step work toward an objective, rather than answering one prompt at a time.
Tool use with real effect: it can call tools and take actual actions in your systems — not just describe what someone should do.
Bounded, reversible autonomy: its actions have limits and an undo path, so a wrong move is contained rather than catastrophic.
Verifiable accountability: it produces an independently checkable record of what it did, so you can confirm actions without taking the vendor’s word.
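To make the third capability concrete, here is a minimal Python sketch of what "bounded and reversible" can mean in practice. It is an illustration under stated assumptions, not any vendor's design: the class name BoundedExecutor, the max_actions budget, and the toy stock dictionary are all hypothetical. Each action is registered together with its undo step, and the executor refuses work once the budget is spent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class BoundExceeded(Exception):
    """Raised when the agent tries to exceed its configured action budget."""


@dataclass
class BoundedExecutor:
    # Hypothetical bound: how many actions a single task may take.
    max_actions: int
    _undo_stack: List[Callable[[], None]] = field(default_factory=list)

    def run(self, do: Callable[[], None], undo: Callable[[], None]) -> None:
        # Refuse the action before it runs if the budget is already spent.
        if len(self._undo_stack) >= self.max_actions:
            raise BoundExceeded(f"limit of {self.max_actions} actions reached")
        do()
        # Remember how to reverse the action we just took.
        self._undo_stack.append(undo)

    def rollback(self) -> None:
        # Undo completed actions in reverse order, newest first.
        while self._undo_stack:
            self._undo_stack.pop()()


# Usage: a toy "system" the agent acts on (illustrative data only).
stock = {"widgets": 10}
executor = BoundedExecutor(max_actions=2)
executor.run(
    lambda: stock.update(widgets=stock["widgets"] - 3),
    lambda: stock.update(widgets=stock["widgets"] + 3),
)
executor.rollback()  # stock["widgets"] is back to 10
```

The point for a buyer is the shape, not the code: a limit that is enforced before the action runs, and an undo path recorded at the moment the action is taken rather than reconstructed afterward.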

The verifiability test: can the vendor prove each action independently?

The first three capabilities are about power — can it act? The fourth is about trust — can it prove it acted correctly? Put that question at the top of your evaluation, because it is the one agent washing cannot fake. A chatbot in a costume can be scripted to look goal-directed in a demo. What it cannot produce is a record of its actions that you can verify yourself, without trusting the dashboard that generated it. “Independently verifiable” has a precise meaning: a third party, or you, can confirm the agent did what it claims using evidence the vendor cannot quietly alter after the fact.

This is where most “agent” pitches quietly downgrade. Ask how you would prove, three months from now, that a specific action happened as recorded, and watch whether the answer is a verifiable receipt or a screenshot. Internal logs are better than nothing, but logs a vendor controls prove very little when something goes wrong. The stronger answer is a cryptographically signed, tamper-evident trail: each action carries its own proof, anchored so that after-the-fact edits are detectable. RankShield’s Helix is built around exactly this: an agent that can prove what it did, not merely assert it. Make “can it prove its actions?” the first line of your buyer’s test, not the last.
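One common way to make a trail tamper-evident is a hash chain, sketched below in Python. This is a generic illustration, not a description of how RankShield or any other vendor implements it. The function names, the record layout, and the idea of handing the buyer the latest "head" hash are assumptions made for the example. Digital signatures are deliberately left out; a production system would typically also sign each entry with an asymmetric key.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, action: dict) -> str:
    # Hash the action together with the previous entry's hash,
    # so every entry commits to the full history before it.
    payload = json.dumps({"prev": prev_hash, "action": action}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_action(trail: list, action: dict) -> str:
    prev = trail[-1]["hash"] if trail else GENESIS
    digest = entry_hash(prev, action)
    trail.append({"action": action, "prev": prev, "hash": digest})
    # The returned head hash is what gets "anchored": stored somewhere
    # the vendor cannot quietly change (assumption: the buyer keeps it).
    return digest


def verify_trail(trail: list, anchored_head: str) -> bool:
    prev = GENESIS
    for entry in trail:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["action"]):
            return False
        prev = entry["hash"]
    # Even a fully recomputed chain fails unless it ends at the anchored head.
    return prev == anchored_head


# Usage with illustrative actions.
trail = []
append_action(trail, {"step": 1, "op": "refund", "amount": 40})
head = append_action(trail, {"step": 2, "op": "email_customer"})
ok_before = verify_trail(trail, head)   # True

trail[0]["action"]["amount"] = 4000     # an after-the-fact edit
ok_after = verify_trail(trail, head)    # False
```

The check runs without the vendor's dashboard: anyone holding the trail and the anchored head hash can recompute it. That is the property to ask for, whatever mechanism a vendor actually uses.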

Real agent, or a chatbot in a costume?

Score a vendor against the five checks below: the four capabilities above, plus live oversight. Answer honestly. Given Gartner’s finding that most self-described agentic vendors are not the real thing, expect most products to miss at least one.

1. Does it take multi-step action toward a goal, not just answer questions?

2. Can it use tools and take real actions in your systems?

3. Are its actions bounded and reversible?

4. Does it produce an independently verifiable record of what it did?

5. Can you watch it and stop it in real time?

Ten questions to put in your RFP

Turn the field test into procurement language. Drop these ten questions into your RFP verbatim; the ones a vendor dodges tell you as much as the ones they answer. Weight the verifiability and control questions most heavily — they are the hardest to agent-wash and the most expensive to discover you lack after go-live.

Describe a task the product completes end to end, autonomously, and name the specific steps it plans and executes.
Which real actions can it take in our systems, and which are read-only or human-in-the-loop?
What bounds constrain its autonomy, and how do we configure or tighten them?
When it takes a wrong action, what is the undo path and how fast does it reverse?
How does it produce an independently verifiable record of each action — and can we check that record without your dashboard?
Is the action trail tamper-evident, so an after-the-fact edit would be detectable by us or a third party?
Can we watch the agent live and stop it mid-task with a kill switch?
What happens to in-flight actions when we halt it — are they rolled back or left partial?
What concretely distinguishes this from the chatbot, RPA, or assistant we may already own?
What is the measurable business outcome, and how do we verify it rather than take it on faith?
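If you want to turn the answers to the ten questions above into a comparable number, a weighted score is one simple option. The Python sketch below is only an illustration: the weights are hypothetical (questions 5 through 8, covering verifiability and control, count double), and the 0-to-1 rating scale is an assumption. Questions are numbered in the order listed, and a dodged question scores zero.

```python
from typing import Mapping

# Hypothetical weights: questions 5-8 (verifiability and control) count double.
WEIGHTS = {
    1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0,
    5: 2.0, 6: 2.0, 7: 2.0, 8: 2.0,
    9: 1.0, 10: 1.0,
}


def weighted_score(ratings: Mapping[int, float]) -> float:
    """Return a 0-1 score from per-question ratings in [0, 1].

    Questions missing from `ratings` (dodged or unanswered) count as 0.
    """
    for question, rating in ratings.items():
        if question not in WEIGHTS:
            raise ValueError(f"unknown question number: {question}")
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"rating for question {question} must be in [0, 1]")
    earned = sum(w * ratings.get(q, 0.0) for q, w in WEIGHTS.items())
    return earned / sum(WEIGHTS.values())


# Usage: a vendor with strong answers everywhere except questions 5-8.
dodger = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 9: 1.0, 10: 1.0}
score = weighted_score(dodger)  # 6 / 14, about 0.43
```

With these example weights, a vendor who answers six of ten questions perfectly but dodges the verifiability and control questions lands well under half marks, which matches the intent of weighting those questions most heavily. Tune the weights to your own risk profile.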
