Artificial Intelligence fails, and will always fail
- Victor Hugo Germano
Original post in Brazilian Portuguese
I spent a few months collecting AI failure cases for a follow-up to my last article about the Lies and Promises of AI, and how we are basically being fooled by half-finished products and incredible promises that keep us running in circles while financial speculation mints the world's new billionaires. But I found that approach unproductive, so I am trying a new one.
You probably already know that AI fails consistently in everyday use. If you work in technology, you have experienced it while using Copilot, while trying ten different prompts to refactor a method, or while hunting for a solution to a problem, until you realize it is hard to trust these systems when your job is on the line. We all come to this realization at some point.
While I was leading AI teams for my company, a recurring conversation with executives during our visits and projects was: "Can you guarantee that the model is 100% accurate? We need 100% accuracy, otherwise this system doesn't work!"
When trying to build solutions with Machine Learning and Computer Vision, companies expect the technology to handle every possible occurrence of the problem at hand:
Identifying problems in the operation of industrial machinery
Finding leaks in pipes
Monitoring occupational safety and PPE use
Predicting accidents or property security risks
Counting animals in breeding facilities
My answer has always been the same, since 2018, when we bought a Data Science and Machine Learning company:
"It is not possible to guarantee accuracy in probabilistic models based on Machine Learning. Despite the appeal of these systems, there is an inherent process of approximating results from training data, which leads to some degree of uncertainty - this means that continuous work of fine-tuning and improvement will be necessary to achieve any satisfactory result in the long term"
In plain English: "Don't even try, there's no way."
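To make that concrete, here is a minimal sketch, in Python with purely illustrative numbers (no real project behind them), of why even a flawless test run cannot justify a promise of 100% accuracy:

```python
# Minimal sketch: even a perfect score on a finite test set only supports
# a bounded statistical claim, never a guarantee. Numbers are illustrative.
n_test = 2_000   # hypothetical held-out test set size
alpha = 0.05     # for a 95% confidence interval

# With zero observed errors, the exact (Clopper-Pearson) lower bound on the
# true accuracy has a closed form: (alpha / 2) ** (1 / n).
lower_bound = (alpha / 2) ** (1 / n_test)

print(f"Observed accuracy: 100.0% on {n_test} cases")
print(f"95% lower confidence bound on true accuracy: {lower_bound:.4%}")
# -> roughly 99.82%: the gap to 100% is irreducible statistical uncertainty.
# More test data narrows it but never closes it, which is why "can you
# guarantee 100%?" has no honest answer of yes.
```

And that bound only holds if production data keeps looking like the test data, which, as the next paragraphs argue, it rarely does for long.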
Although there is a recent argument that more data is all it takes, and that the models will eventually generalize to any situation, the evidence so far does not support it.
Furthermore, every system in production needs to be adapted over time. This is exactly why all products sold as AI-First, promising the dream of fully automated and intelligent systems, depend on a legion of people fine-tuning models and updating them with new information found in operation. This is standard practice because it is the only way to keep the system consistent.
Every AI project worth its salt works with at least a tradeoff between accuracy and performance, and each objective requires a real analysis of the best approaches to the problem. Most of these approaches assume that 100% accuracy is effectively out of reach. In many cases the company does not need Machine Learning at all: old strategies from the statistician's toolbox, such as regression, can deliver quite satisfactory results, as the sketch below illustrates.
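As a hedged illustration of that point, the sketch below uses scikit-learn and its bundled breast-cancer dataset as a stand-in for a real business problem; the dataset, the metric, and any acceptance threshold are assumptions for demonstration, not a claim about any specific project:

```python
# Minimal sketch: measure what a plain regression baseline already buys you
# before reaching for anything heavier. The dataset is just a stand-in.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Plain logistic regression: cheap to train, interpretable, easy to audit.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

scores = cross_val_score(baseline, X, y, cv=5, scoring="accuracy")
print(f"Regression baseline accuracy: {scores.mean():.1%} (+/- {scores.std():.1%})")
# If this already clears the business threshold, the extra cost, latency and
# opacity of a larger model need their own justification.
```

If the simple model already meets the need, everything that follows about GenAI and agents becomes a choice, not a necessity.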
But if you want to use Machine Learning, GenAI with multimodal models, or agent solutions, know that your reality will be quite different from what the marketing campaigns suggest.
From killer drones to self-driving cars to smart supermarkets, ALL current products rely on data annotators to sustain the dream of autonomous systems. That includes treating impoverished regions as recruiting grounds for large data-annotation consultancies (Appen, Playment, Zhubajie, Clickworker).

Even though influencers, journalists and tool vendors sell the utopia of a post-scarcity world, where technological advances can transform our production system for the better, in the real world things move a little slower. And unfortunately most of us can't wait until 2027 for AGI (which was 2025, then became 2028, but who knows for sure after 2030).
All technology adoption takes time and money, but above all it tends to be co-opted by marketing teams: framing project investments in the hype of the moment conveys an image of innovation to the market.
What you need to know: the vast majority of AI projects today FAIL. A recent IBM study of 2,000 CEOs points to a reality you won't see in press releases and AI Agent demos:
• CEOs surveyed report that only 25% of AI initiatives have delivered the expected return on investment (ROI) in recent years, and only 16% have been scaled across the enterprise.
• To accelerate progress, two-thirds (65%) of CEOs surveyed say their organization is prioritizing AI use cases based on ROI, and 68% report their organization has clear metrics in place to effectively measure the ROI of innovation.
• Just over half (52%) of CEOs surveyed say their organization is seeing value from investments in generative AI beyond cost savings.
• 59% of CEOs surveyed admit their organization struggles to balance funding for existing operations and investing in innovation when unexpected change occurs, while 67% say they need more budgetary flexibility to seize digital opportunities that drive long-term growth and innovation.
• By 2027, 85% of CEOs surveyed expect their investments in AI-enhanced efficiency and cost savings to have generated a positive ROI, while 77% expect to see a positive return on investments in growth and expansion through AI at scale.
These numbers are quite different from the flood of comments on LinkedIn from those who say they are transforming their operations to seek greater operational efficiency with AI and deliver more value to customers 🤮
And most companies are playing a game that is rarely discussed but matters more than successful products: signaling results to investors. So every time you hear a CEO discussing how their company is AI-first, or how the next few years will be filled with abundance and unimaginable transformation, be wary: this conversation is not for you.
As I've said before: In a market that expects accelerated growth, if it doesn't multiply revenue, it's trash. Anything goes to promote the prospect of infinite growth, and executives have the greatest incentives to lie and exaggerate in their statements.
When you are in an investment round or signaling a new stage of fundraising, generating engagement through absurd promises is important, a playbook every startup executive follows to the letter. Sebastian Siemiatkowski, CEO of payments company Klarna, told Bloomberg in February:
"I am of the opinion that AI can already do all the jobs that we humans do."
But it seems the game has changed: although "autonomous" chatbots are cheap, they are not as good as a trained CX support team. The same CEO commented this week on returning to hiring people for support:
“As cost unfortunately seems to have been too predominant an evaluation factor in the operation, what ended up happening was inferior quality.”
Want to succeed with AI? Work in an industry that does not depend on operational accuracy, where failure does not put your operation at risk and can even be read as another sales opportunity: fake news, social media engagement, and AI slop. In these markets, current GenAI models will never endanger your operation.
Does it have a future?
On top of all this, there is growing concern about the current crisis in the job market and its relationship with the use of artificial intelligence to replace professionals. Brian Merchant takes a deeper look at the current situation and shifts the perspective to what I also believe is the real reason for the layoffs and changes:
"The AI jobs crisis doesn't look like sentient programs popping up all around us and inevitably replacing human jobs en masse. It's a series of management decisions made by executives looking to reduce labor costs and consolidate control in their organizations."
It's interesting to hear something similar to what I've been saying for a long time: you lose your job because of a management decision made by someone whose only incentive is cutting costs in the short term, with no long-term vision. The future impact of the decision simply doesn't matter; somewhere, someone is saying that using AI will cut costs.
"The connection between AI capabilities and their social and economic impact is weak: The bottleneck to impact is the pace of product development and adoption, not AI capabilities."
Arvind Narayanan, co-author of AI Snake Oil
The more I study the impact of using LLMs as knowledge mechanisms for professional work, the more I encounter real problems and future risks that would be unimaginable in other contexts.
AI erodes our ability to think critically about work and life
GenAI solves problems for corporate executives, but almost no real domain problems: unauthorized use of intellectual property; generic, mediocre text that justifies shrinking journalist teams; distorted images that take artists' jobs; padding for mediocre scripts. At the same time, cutting staff brings more profit to the companies that choose the technology, even at the expense of the quality of what gets produced: code, text, or image.
Understand that GenAI is the perfect tool for the social media machine: generic, fast content generates engagement; the algorithm favors that engagement; and the platform depends on it to sell views to companies that pay for advertising. GenAI exists to accelerate our society of inattention, of quick posts, catchphrases and short videos. Personally, I refuse to go down that path, which is exactly why the texts here tend to fare worse than what the algorithm favors.
As Ruha Benjamin tells us: we are trapped inside the imagination of those who monopolize power and resources to benefit a few at the expense of the rest of the world. While tech lords invest in space travel and the search for super artificial intelligence, building luxury bunkers to survive the eventual collapse of society, they shout from every corner that poverty reduction through affordable healthcare and social justice are impossible to achieve.
When we are talking about work with serious consequences for people's lives, the evidence is even worse. Since this is a world of its own, one I intend to dedicate more time to, I will leave an important reference here: a discussion of the evident degradation of medical records produced with AI tools, to the point of rendering such records useless.
Will it ever get better?
What I hope you take from my texts about AI is a common ground of skepticism and criticism toward a market that only hopes to exploit engagement machines and sell an unattainable utopia. Doubt everything you read about artificial intelligence online; most of the news reads more like marketing copy than critical analysis of the market's current state.
In my predictions post earlier this year, I talked about an effort to hold AI tools accountable, and how this could be a driver for better tools and less hype. I think I was wrong in my prediction for the healthcare market, but on the bright side, the latest developments in the copyright world have been quite promising:
The US Copyright Office has just released a pre-publication version of the report “Copyright and Artificial Intelligence Part 3: Generative AI Training” (pdf), which is extremely thorough in detailing the research related to the use of intellectual property for Generative AI training.
The report concludes that “making commercial use of large volumes of copyrighted works to produce meaningful content that competes with them in existing markets, especially when done through illegal access, goes beyond the limits established for fair use.”
And also:
“In our view, American leadership in AI would be best promoted by supporting both of these world-class industries that contribute so much to our economic and cultural advancement. Effective licensing options can ensure that innovation continues to advance without encroaching on intellectual property rights. These groundbreaking technologies should benefit both the innovators who develop them and the creators whose content powers them, as well as the general public.”
This report will be very important for upcoming proceedings over the use of restricted material, and it positions itself in favor of creators (artists, musicians, writers, film directors, etc.).
Since AI is a fairly polarized world, there has also been a recent decision in China on the same topic:
Two recent court rulings in China are important milestones in the protection of intellectual property related to artificial intelligence. One ruled that AI-generated images are not original and therefore not protected by copyright. The other, for the first time, granted protection to the structure and parameters of an AI model under the country’s Anti-Competition Law. I am delving deeper into this topic, but it is worth mentioning for future study.
And as if that weren't enough, companies have increasingly demonstrated that they care little about the justice system, except when it favors their interests:
A federal judge in San Jose has ordered Anthropic to respond to allegations that one of its experts cited a nonexistent academic paper (likely generated by AI) in court filings in a copyright lawsuit brought by the major record labels Universal Music Group, Concord and ABKCO.
The citation, which was supposed to support Anthropic's claim that its chatbot Claude rarely reproduces copyrighted song lyrics, turned out to be a “complete fabrication.” It's as if I handed in a master's thesis that Claude Sonnet wrote for me, but in the real world and with real consequences…
On top of that, we still have cases of well-known LLMs collapsing. This time Grok went off the rails and injected information about South Africa into unrelated questions, such as a query about the Hawk Tuah meme and questions about landscapes. Another vision of a future in which whoever controls the LLMs dictates what is true in the world.
Upper Echelon has a much more negative view of technology than I do: