Your Website Is Invisible to 800 Million People

And robots.txt can't fix it.

February 2, 2026

Every week, 800 million people ask ChatGPT to help them find things. Products. Services. Answers. Companies.

97% of websites have no AI identity.

They're not blocked. They're just... invisible. AI agents don't know they exist.


The New Discovery Layer

Search is fragmenting. Google still dominates, but increasingly, people start with AI:

  • "Hey ChatGPT, find me a contractor in Denver"
  • "Perplexity, what's the best CRM for small teams?"
  • "Claude, summarize the options for X"

These aren't searches. They're conversations with systems that synthesize answers from... somewhere.

If your site isn't in that "somewhere," you don't exist.


The Broken Bargain

For 30 years, the web ran on a simple exchange: crawlers index your content, search engines send you traffic.

AI broke that deal.

Cloudflare's 2025 data shows Anthropic's crawler hit sites 500,000 times for every single visitor it sent back. Five hundred thousand to one.

That's not indexing. That's extraction.

And publishers noticed. 79% of top news sites now block at least one AI crawler. The response is understandable.

But blocking comes with a cost.


The Trap

A recent Wharton/Rutgers study tracked what happened when major publishers blocked AI bots:

  • 23% total traffic loss
  • 14% drop in human traffic (not just bot removal — actual readers)

The mechanism isn't fully understood, but the pattern is clear: block AI, lose visibility. Allow AI, train your competitors for free.

robots.txt gives you exactly two options: block or allow. On or off. Exploited or invisible.

There's no way to say: "Yes, you can read this. No, you can't train on it. Here's the version I want you to see."
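To make the binary concrete: this is roughly all the expressiveness robots.txt offers per crawler. (The user-agent names are real crawler tokens; the comments are mine.)

```
# Block one AI crawler entirely...
User-agent: GPTBot
Disallow: /

# ...or expose everything to another. There is no
# "read but don't train" and no curated view.
User-agent: ClaudeBot
Allow: /
```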


The Third Option

That's why I've been working on Machine Web Protocol (MWP).

MWP is a simple protocol that lets websites publish AI-readable content on their terms:

  • Selective exposure: Decide what AI agents see
  • Machine-optimized format: Plain text, structured for LLM consumption
  • Terms built in: Signal how your content can be used
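One way terms could ride along is as a header field. To be clear, the `Terms:` line below is a hypothetical illustration, not a field from any published spec:

```
Source-URL: https://acme.com/pricing
Title: Pricing
Terms: read=allow; train=deny; attribution=required
---
Plans start at $29/month...
```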

It's not a replacement for your website. It's a parallel channel — one designed for machines.

Think of it like RSS, but for AI agents.


How It Works

A site implementing MWP adds a /machine/ directory with plain text files:

Source-URL: https://acme.com/about
Published: 2026-01-15T10:00:00Z
Author: Acme Corp
Title: About Acme Corp
Categories: company, widgets, enterprise
---
We're a widget company founded in 2019. We make 
enterprise-grade widgets for teams who need reliability...

The header declares metadata; the body provides clean, machine-readable content.

AI agents check for MWP before (or instead of) scraping HTML. They get clean content. You control what they see.

Simple. No JavaScript rendering. No parsing complex DOM structures. Just text.
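As a sketch, an agent-side client for this format might look like the following Python. The `Key: Value` header and `---` separator mirror the example above; the URL mapping (`/machine/<path>.txt`) is my assumption for illustration, since discovery may be defined differently in the spec.

```python
import urllib.request

def parse_mwp(text):
    """Split an MWP document into (metadata dict, body text).

    The header is 'Key: Value' lines and '---' separates header
    from body, as in the example above.
    """
    header, sep, body = text.partition("\n---\n")
    if not sep:  # no separator: treat the whole file as body
        return {}, text.strip()
    meta = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if colon:
            meta[key.strip()] = value.strip()
    return meta, body.strip()

def fetch_mwp(page_url):
    """Fetch the MWP version of a page.

    The /machine/<path>.txt mapping is assumed for illustration,
    e.g. https://acme.com/about -> https://acme.com/machine/about.txt
    """
    host, _, path = page_url.partition("://")[2].partition("/")
    machine_url = f"https://{host}/machine/{path or 'index'}.txt"
    with urllib.request.urlopen(machine_url) as resp:
        return parse_mwp(resp.read().decode("utf-8"))
```

No HTML parser, no headless browser: the agent reads a header, splits on `---`, and has clean text.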


Why Now

The window is closing.

News publishers expect 43% of their search traffic to disappear in the next three years. Google's AI Overviews now appear on 60% of searches. The answer box is eating the click.

The old playbook — SEO, backlinks, keyword optimization — was built for a world where humans click through to read. That world is shrinking.

The sites that thrive in the AI era won't be the ones with the best meta tags. They'll be the ones with the best machine interfaces.


Check Your Site

Before you do anything else: find out where you stand.

I built a free tool that checks your site's AI visibility in about 10 seconds:

machinewebprotocol.com/check

It'll tell you what AI crawlers can see, what you're blocking, and whether you have any AI identity files in place.


The Spec Is Open

MWP is open source (Apache 2.0).

This isn't a product pitch. It's a protocol proposal. I think the web needs this, and I'm building it in public.

If you have feedback, open an issue. If you think this is the wrong approach, tell me why.

The current system — block or be exploited — isn't sustainable. We need a third option.

MWP is my attempt to build one.


Mike Bumpus builds AI infrastructure at DigitalEgo. He's currently focused on agent protocols, traceable reasoning systems, and making sure the web doesn't accidentally become invisible.