How will the web change with the new wave of generative models?
I believe the ability to scale up the generation of content that is indistinguishable from human-created material has far-reaching implications for the web landscape as we currently understand it.
In this post I will share my thoughts on the shifts that lie ahead, building upon a marvellous article by James Bekter (an OpenAI engineer and ex-Googler) that serves as a perfect foundation for this discussion.
Signal vs Noise
I see more and more "generative AI experts" rambling about how these new tools will "improve productivity". This so-called improvement often amounts to an exponential acceleration of what content farms have already been doing for the past decade: generating poor-quality, valueless content for the sole purpose of monetisation or SEO ranking.
James Bekter's initial insight struck me deeply: human-generated information will be eaten away by AI-generated information, to the point that humans simply stop producing content.
"[...] as AI-generated information drowns out human-generated information, humans will simply stop producing content. So not only is the noise floor about to rise by an order of magnitude, the signal is going to drop in tandem."
James Bekter
While this scenario seems plausible, I'm optimistic that high-quality signal will rise amidst the deluge of mediocrity.
There is, of course, a problem of scale: subpar content will swarm over almost the entire web, with only a relatively modest number of respectable sources remaining alive.
But let's put quantity aside for a moment and focus on quality: I believe these few "survivors" will have the chance to emerge as beacons of reliability, originality, and substance.
Consider this: if anyone can effortlessly generate anything (thanks to AI's ability to rephrase or recreate existing content, boosted by SEO tricks that game the recommender systems), who would be interested in reading it?
Why settle for AI-generated replicas when a simple question to an AI assistant yields the same result?
This could bring about the obsolescence of the web, but it could also become the catalyst for even greater creativity and genuine value in content creation, reserved for those willing to strive for excellence.
You may argue that the AI-generated wave will eventually "consume" these works too, rephrasing, reproducing and regurgitating them; however, I am confident that the new information paradigm won't be based solely on SEO scores anymore: if search engines want to survive, the signal, although weaker, must be given greater relevance.
I will touch on this point further in the next section, but I anticipate that current information gatekeepers like Google will be profoundly interested in preserving access to valuable content.
Trustworthiness, originality, and authenticity in what's delivered might take center stage, and in this context, some blockchain principles could prove invaluable.
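To make this concrete, here is a minimal Python sketch, entirely my own illustration rather than anything proposed in Bekter's article, of one such principle: a publisher signs a cryptographic fingerprint of a piece of content so that readers or search engines can later verify where it came from and that it hasn't been altered. It relies on the third-party "cryptography" package, and all names in it are hypothetical.

```python
# Illustrative sketch only: a publisher signs a content fingerprint so that
# readers (or search engines) can verify who originally published it.
# Requires the third-party "cryptography" package.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def fingerprint(content: str) -> bytes:
    """Hash the content so the signature covers a fixed-size digest."""
    return hashlib.sha256(content.encode("utf-8")).digest()


# The publisher keeps the private key and distributes the public key.
publisher_key = Ed25519PrivateKey.generate()
article = "An original piece of human-written analysis."

# Sign the fingerprint at publication time; the signature travels with the content.
signature = publisher_key.sign(fingerprint(article))

# Anyone holding the public key can later check that the content is unaltered
# and really comes from that publisher.
try:
    publisher_key.public_key().verify(signature, fingerprint(article))
    print("Provenance verified: content matches the publisher's signature.")
except InvalidSignature:
    print("Verification failed: content was altered or signed by someone else.")
```

In a blockchain-flavoured version of this idea, the fingerprint and signature would be anchored in a public ledger, giving the provenance record a timestamp and a history that no single gatekeeper controls.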
A new web economy
The prevailing web economy model propels the acceleration of AI-generated content, with a reward system that prioritises quantity and hype over quality.
I agree with James Bekter that this model will eventually destroy the web as a trustworthy source of information.
As more of the internet becomes AI-generated, humans will no longer be able to effectively use it as an information storage system. They won’t write because their voice will get drowned out by the AI cacophony. They won’t read because – why would you read AI generated content when you can just talk to your own virtual assistant?
James Bekter
He calls GPT an "information virus that feeds on money", and I fully concur.
Companies like Google have become fertile ground for the proliferation of AI-generated content of every sort; text is the most obvious, but images, audio and, probably last, video will follow sooner than we think.
Search engines are currently incapable of recognising "fake" AI-generated content, and their recommender systems won't be able to work reliably.
Indistinguishable from genuine content, both correct and erroneous responses generated by AI models are going to inundate the web. This jeopardizes the credibility of search engines and undermines their foundational purpose.
Should search engines lose their value, no one will use them and no one will be interested in advertising on them, thereby undermining their entire business model.
To ensure their survival as repositories of knowledge and information, companies like Google will soon need to adopt new monetization structures.
Should we trust the future ahead?
While James Bekter envisions a web where published content generates negative revenue as the path towards decentralization and Web 3.0, I believe that the aforementioned qualities of trustworthiness, originality, and authenticity will also play a vital role, if properly valued by whoever the "information gatekeeper" turns out to be (if there is one at all).
Some form of decentralized web might very well become the only way for humans to communicate with each other via computers. The only way to halt the information virus is to make the economic value of producing content negative.
James Bekter
People will continue using search engines if they can find relevant content, and a crucial aspect lies in tracking the origin and reliability of such content, thereby crediting the original creators and discouraging the generation of low-value outputs.
This shift from the current model focused on quantity and speed could pave the way for a new paradigm where quality and persistence become the primary factors driving content consumption, dissemination and reward.