Sun Apr 21 / Puneet Anand
Stories where AI inaccuracies caused real operational damage to businesses
Generative AI, the technology behind ChatGPT, promises a future filled with intelligent assistants, personalized content, and groundbreaking innovation. It is reshaping sector after sector, promising smarter customer service, legal assistance, and even automated marketing content generation.
But what happens when these powerful tools start to hallucinate, producing fantasies instead of facts? Or, in other words, "plausible nonsense"? All Gen AI models are probabilistic and non-deterministic: they may answer the same question differently every time, misrepresent facts, and invent new "facts" that read more like fiction.
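As a toy illustration of that non-determinism, here is a minimal Python sketch with made-up token scores (not how any production model actually works): sampling from a probability distribution over candidate answers can return a different result on every run.

```python
import math
import random

# Toy next-token scores for completing "Our refund window is ..."
# (values are illustrative, not taken from a real model).
logits = {"90 days": 2.0, "60 days": 1.8, "30 days": 1.7}

def sample(logits: dict, temperature: float = 1.0) -> str:
    # Softmax with temperature: higher temperature flattens the
    # distribution, making unlikely (possibly wrong) answers more probable.
    weights = {tok: math.exp(score / temperature) for tok, score in logits.items()}
    r = random.uniform(0, sum(weights.values()))
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # fallback for floating-point edge cases

# The same "prompt" can yield a different answer on each call.
print([sample(logits) for _ in range(5)])
```

Production LLMs repeat this kind of sampling over a vocabulary of tens of thousands of tokens at every step, which is why identical prompts can produce different, and sometimes wrong, answers.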
As these real-world stories show, AI hallucinations can have serious consequences for businesses and consumers alike. Before we dive in, one note: we are big fans of the companies mentioned below and their products. It is their AI quality monitoring that needs improvement.
A costly dispute unfolded when Air Canada’s chatbot erroneously assured a grieving passenger that he could claim a bereavement discount after his flight, only for the airline to renege on the promise. Despite Air Canada’s attempt to absolve itself of responsibility by characterizing the chatbot as a “separate legal entity,” the tribunal ruled in favor of the passenger, holding the airline liable for the erroneous advice. This case underscores the legal complexities arising from AI’s role in customer interactions and highlights the need for accountability in AI-driven services.
Read more about this story in this News Article
In an attempt to enhance customer support, a Chevrolet dealership introduced a chatbot powered by OpenAI. Users quickly discovered its vulnerability to manipulation. Instead of addressing customer queries, the bot was soon writing code, composing poetry, and even praising Tesla cars over Chevrolet’s own offerings. Reddit threads and Twitter brimmed with users sharing exploits, from coaxing the bot into lauding Tesla’s superiority to getting it to explain why one should avoid buying a Chevy.
Read more about this story in this News Article
A recent legal case (Mata v. Avianca, 2023) serves as a stark warning. Lawyers relied on ChatGPT to research a case brief, unaware that it had fabricated case citations and fake legal extracts. This “hallucination” had disastrous consequences: the client’s case was dismissed, the lawyers were sanctioned, and the episode drew public humiliation.
Read more about this story in this News Article
In a bizarre turn of events, a chatbot deployed by a California car dealership offered to sell a 2024 Chevy Tahoe for a single dollar and even agreed it was a “legally binding offer.” The dealership, which used a ChatGPT-powered bot, found itself at the center of attention as users exploited the bot’s vulnerabilities for amusement.

This incident serves as a cautionary tale for businesses embracing AI-powered solutions without fully understanding their capabilities and limitations. From legal repercussions to customer dissatisfaction, the risks of unchecked AI are profound. As we navigate this increasingly AI-driven landscape, it is imperative to approach implementation with caution, ensuring monitoring and safeguards are in place to mitigate the potential for “hallucination fails.”
A recent incident at the Cody Enterprise, a Wyoming newspaper, has raised serious concerns about the use of AI in journalism. A reporter was found to have used AI to generate fake quotes and entire stories, including fabricated statements attributed to Wyoming Governor Mark Gordon. The issue was uncovered by a competing journalist who noticed inconsistencies in the language and content of the articles. Following the revelation, the Cody Enterprise issued an apology and committed to establishing strict policies to prevent similar occurrences in the future. This case underscores the growing ethical challenges AI presents in the media industry, where unchecked use can lead to significant misinformation and loss of trust.
Read more about this story in this News Article
In case you are wondering: these situations could largely have been avoided. Every company has a source of truth for its business and industry, embodied in a variety of documents and data sources: policies, standards, reports, and real-time information in databases are all good examples. Systems like AIMon’s Rely can check AI outputs against this source of truth and give instant feedback on hallucinations (and other quality attributes such as completeness and conciseness), feedback that can then trigger pre-configured actions to keep these situations from happening.
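To make the idea concrete, here is a minimal Python sketch of what a grounding check does. This is an illustration only, with made-up policy text and a hypothetical flag_unsupported helper; it is not AIMon’s actual API, and real detectors use trained models rather than simple word overlap.

```python
import re

# A made-up "source of truth": policy sentences a company controls.
SOURCE_OF_TRUTH = [
    "Bereavement fares must be requested before travel.",
    "Discounts cannot be applied retroactively after a flight.",
]

def words(text: str) -> set:
    """Lowercase content words in a piece of text."""
    return set(re.findall(r"[a-z']+", text.lower()))

def flag_unsupported(answer: str, sources: list, threshold: float = 0.5) -> list:
    """Flag answer sentences whose words are mostly absent from the sources."""
    source_vocab = set().union(*(words(s) for s in sources))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        w = words(sentence)
        support = len(w & source_vocab) / max(len(w), 1)
        if support < threshold:
            flagged.append((sentence, round(support, 2)))
    return flagged

# A chatbot answer that contradicts the policy above.
answer = ("You can request a bereavement discount after your flight. "
          "Refunds are granted within 90 days of travel.")
for sentence, score in flag_unsupported(answer, SOURCE_OF_TRUTH):
    print(f"possible hallucination (support={score}): {sentence}")
```

A production system would replace the overlap score with a trained hallucination detector and wire the flags into pre-configured actions, such as blocking the response, adding a citation, or escalating to a human.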
OK, I will throw in a hallucination I ran into myself, with a popular image generator model. I was recently creating the Generative AI Reliability community on Discord and tried a few image generation models to help me create a logo for it. My prompt was simple: “happy robot that says reliability on its hat”. Guess what: only one out of four generated images spelled Reliability correctly. That is not reliable!
We are a venture-backed startup focused on reliable and deterministic Generative AI adoption. With a team of patented inventors, ML/AI researchers, and GRC experts, we have built a proprietary, best-in-class hallucination detection solution that complements RAG systems and identifies hallucinations down to the sentence and passage level. We are building a lot more in collaboration with multiple Generative AI innovators. Please reach out to us at info@aimon.ai or connect with me on LinkedIn.