Author
Puneet Anand
  • April 21st 2024

  • 3 minute read

Hallucination Fails: When AI Makes Up Its Mind and Businesses Pay the Price

This article consolidates real-world instances in which AI-generated inaccuracies hurt businesses that deployed AI without proper monitoring and guardrails in place.

Generative AI and Hallucinations

Generative AI, the technology behind ChatGPT, promises a future filled with intelligent assistants, personalized content, and groundbreaking innovation. It is already revolutionizing various sectors, promising smarter customer service, legal research assistance, and even automated marketing content generation.

But what happens when these powerful tools start to hallucinate, creating fantasies instead of facts, or, in other words, “plausible nonsense”? All generative AI models are probabilistic and non-deterministic. They may answer the same question differently every time, misrepresent facts, and make up new “facts” that read more like fiction.
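To see why the same prompt can yield different answers, here is a toy sketch of temperature-based sampling. It is not any particular vendor's decoder, and the token scores are made up for illustration: the model assigns probabilities to candidate next tokens and draws one at random, so two runs can diverge.

import math, random

def sample_next_token(logits, temperature=1.0):
    # Softmax with temperature: higher temperature flattens the distribution.
    tokens = list(logits.keys())
    scaled = [logits[t] / temperature for t in tokens]
    max_s = max(scaled)
    weights = [math.exp(s - max_s) for s in scaled]
    return random.choices(tokens, weights=weights, k=1)[0]

# Made-up next-token scores for a prompt like "You are eligible for a ..."
logits = {"refund": 2.1, "discount": 2.0, "voucher": 1.2, "unicorn": 0.3}
for run in range(3):
    print(f"Run {run + 1}:", sample_next_token(logits, temperature=0.9))

Greedy decoding (temperature 0) removes the randomness, but not the risk of confidently wrong answers.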

As the real-world stories below show, AI hallucinations can have serious consequences for businesses and consumers alike. Before we dive in, we want to state that we are big fans of the companies mentioned below and their products; it is their AI quality monitoring that needs improvement.

1. Air Canada's Chatbot Misguides Passenger, Costs Airline

A costly dispute unfolded when Air Canada's chatbot erroneously assured a passenger that a bereavement discount could be claimed after his flight, only for the airline to renege on the promise. Despite Air Canada's attempt to absolve itself of responsibility by characterizing the chatbot as a "separate legal entity," the tribunal ruled in favor of the passenger, holding the airline liable for the erroneous advice. This case underscores the legal complexities arising from AI's role in customer interactions and highlights the need for accountability in AI-driven services.

Read more about this story in this News Article

2. Chevrolet's OpenAI-Powered Chatbot Gets Taken For A Ride

In an attempt to enhance customer support services, Chevrolet introduced a chatbot powered by OpenAI. However, users quickly discovered its vulnerability to manipulation. Instead of addressing customer queries, the bot found itself writing code, composing poetry, and even praising Tesla cars over Chevrolet's own offerings. Reddit threads and Twitter brimmed with users sharing exploits, from coaxing the bot into lauding Tesla's superiority to crafting arguments for why one should avoid buying a Chevy.

Read more about this story in this News Article

3. Fake Lawsuit Fiasco

A recent legal case (Mata v. Avianca, 2023) serves as a stark warning. Lawyers used ChatGPT to research a case brief and unknowingly submitted fabricated case citations and fake legal extracts. This "hallucination" had disastrous consequences, leading to the dismissal of the client's case, sanctions against the lawyers, and public humiliation.


Read more about this story in this News Article

4. Chevy Dealership Turns Dollar Store

In a bizarre turn of events, a chatbot deployed by a California car dealership offered to sell a 2024 Chevy Tahoe for a mere dollar, describing it as a "legally binding offer." The dealership, using a ChatGPT-powered bot, found itself at the center of attention as users exploited the bot's vulnerabilities for amusement.

This incident serves as a cautionary tale for businesses embracing AI-powered solutions without fully understanding their capabilities and limitations. From legal repercussions to customer dissatisfaction, the risks of unchecked AI are profound. As we navigate this increasingly AI-driven landscape, it is imperative to approach its implementation with caution, ensuring monitoring and safeguards are in place to mitigate the potential for "Hallucination Fails".


Read more about this story in this News Article

Are these situations avoidable?

In case you are wondering: yes, these situations could have been avoided to a great extent. Every company has a source of truth for its business and the industry it operates in, embodied in a variety of documents and data sources; policies, standards, reports, and real-time information in databases are all good examples. Systems like Aimon's Rely can check generated responses against this source of truth and provide instant feedback on hallucinations (and other quality attributes such as completeness and conciseness), which can then trigger pre-configured actions to keep these situations from happening. Try out our Hallucination Detection Sandbox here to see this in action in real time. We have built a continuous monitoring solution encompassing this functionality and more. We will be happy to jump on a call and share more details. Please reach out to us at info@aimon.ai or connect with me on Linkedin.
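To make the idea concrete, here is a minimal sketch of a grounding check in Python. It flags sentences in a generated answer that have little lexical overlap with the source-of-truth passages; the function names, the Jaccard overlap metric, and the 0.3 threshold are illustrative assumptions of ours, not Aimon's actual API, which goes well beyond simple word overlap.

import re

def tokenize(text):
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def support_score(sentence, sources):
    # Best Jaccard token overlap between the sentence and any source passage.
    sent_tokens = tokenize(sentence)
    best = 0.0
    for passage in sources:
        src_tokens = tokenize(passage)
        if sent_tokens and src_tokens:
            best = max(best, len(sent_tokens & src_tokens) / len(sent_tokens | src_tokens))
    return best

def flag_unsupported(answer, sources, threshold=0.3):
    # Return (sentence, score) pairs whose overlap with the sources is weak.
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    scored = [(s, support_score(s, sources)) for s in sentences if s]
    return [(s, score) for s, score in scored if score < threshold]

# Illustrative policy text and chatbot answer, loosely inspired by the stories above.
sources = [
    "Bereavement fares must be requested before travel. Refunds are not issued "
    "retroactively after the flight has been taken.",
]
answer = ("Bereavement fares must be requested before travel. You can request the "
          "bereavement discount within 90 days after your trip.")
for sentence, score in flag_unsupported(answer, sources):
    print(f"Weakly supported ({score:.2f}): {sentence}")

A production system would rely on semantic similarity and entailment rather than token overlap, but the principle is the same: every claim the model makes is checked against the source of truth before it reaches a customer.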

5. Bonus: That is definitely not reliable!

OK, I will throw in a hallucination I ran into with a popular image generator model. I was creating the Generative AI Reliability community on Discord recently and turned to a few image generator models to help me create a logo for it. My prompt was simple: "happy robot that says reliability on its hat". Guess what? Only one out of four generated images spelled Reliability correctly. That is not reliable!


About Aimon Labs

We are a venture-backed startup focused on reliable and deterministic Generative AI adoption. With a team of patent-holding inventors, ML/AI researchers, and GRC experts, we have built a proprietary, best-in-class Hallucination Detection solution that complements RAG pipelines and identifies hallucinations at the sentence and passage level. We are building a lot more in collaboration with multiple Generative AI innovators. Please reach out to us at info@aimon.ai or connect with me on Linkedin.