GAI Is Going Well

Being somewhat inspired by Molly White who writes Web3 is going great. I’ve been tracking the more hilarious and often quite alarming unintended ( or purposeful in some cases ) consequences of using LLMs for over a year now.

There are folks looking for a quick fix , poorly if at all tested apps , and bad guys & good guys figuring out how to use it to their advantage or to protect their livelihood.

I’m saddened that the same old problems I’d started observing over a year ago are the same problems we are getting now and with ever more powerful & capable models the problem I feel will get worse before it gets better as folks ae rushing to take advantage of GAI powered tools yet not actually understanding ( or caring in many cases) how they work nor trying to mitigate for the adverse side effects.

What’s really frustrating is there are ways to help mitigate and plan for these unintended effects some of which I discussed here .

However it seems many folks are not even doing the minimum. Yes it’s understood that putting in place good testing processes, validating input and output, implementing security & privacy guardrails all costs both in terms of the skills required & money but what is that when weighed up compared to your reputation and in some cases your livelihood?

With governments around the world talking about and in some cases already introducing regulations around the use of AI the woes are only going to get worse if basic mitigations are not a default starting point .

Anyway enough of me grumbling about that as the evidence speaks for itself.

I’ll only include entries from Jan 2023 up to mid february 2024 when I wrote this post.

I avoided including too many research papers deciding to focus more on issues identified in the wild ( I may write a post about research in this area at some point but for now that is out of scope!).

I didn’t catch them all and this post would have been way longer if I had !

If any issues seemed to be just general bad practice such as password sharing etc I opted not to include.

Where I had read articles beyond a paywall I tried to not include them. If you want to continue to track these incidents then I suggest https://incidentdatabase.ai/ . It has case studies/reports of failures of deployed AI systems And is ( quoting) “dedicated to indexing the collective history of harms or near harms realised in the real world by the deployment of artificial intelligence systems”

Out of control chat bots

This issue is so old now the fact it’s still a thing and causing headlines I just keep going DOH! Whether it’s unintentional or a deliberate prompt injection attack the consequences are never great for whomever the chatbot belongs to

Jan 2024 a frustrated and tech savvy customer forced DPD’s chatbot to swear and say silly critical things about DPD
- https://www.bbc.co.uk/news/technology-68025677
Feb 2024 Air Canada’s chat bot gave incorrect information to a customer and Air Canada had to pay up for the mistake
- https://www.theregister.com/2024/02/15/air_canada_chatbot_fine/
- https://bc.ctvnews.ca/air-canada-s-chatbot-gave-a-b-c-man-the-wrong-information-now-the-airline-has-to-pay-for-the-mistake-1.6769454

Cutting corners

Cutting corners by relying on AI to do your work for you probably will result in hallucinations and providing you with responses that seem accurate. As these examples show you also need to do your homework to ensure that what is being generated is actually factually accurate, reflects your principles and isn’t biased .

In May 2023 an unfortunate lawyer relied on chatGPT to write a brief that had a number of made up citations. This went as well as you’d expect it to!
- https://simonwillison.net/2023/May/27/lawyer-chatgpt/
- https://www.google.com/amp/s/www.engadget.com/amp/a-lawyer-faces-sanctions-after-he-used-chatgpt-to-write-a-brief-riddled-with-fake-citations-175720636.html
June 2023 Bankrate seemed to be cutting corners by using AI to publish articles yet there seemed to be no human validation so the articles were riddled with mistakes. I think AI generated articles is a valid use case but due to the way LLms work you need human oversight & indeed human input to provide the source data in the first place.
- https://futurism.com/bankrate-ai-generated-article-errors
September 2023 Oops not checking the output from work generated from LLMs
- https://futurism.com/the-byte/paper-retracted-authors-used-chatgpt
Recruitment relying on AI tools without due oversight leading to biased outcomes
- https://www.bbc.com/worklife/article/20240214-ai-recruiting-hiring-software-bias-discrimination

Folks who cut corners with an over reliance on chat bots & AI tools just don’t understand how LLMs work . Seems the warnings that accompany chat bots are just being ignored as a simple check of the output would maybe stop this behaviour? Both copilot and Gemini give you an option to validate the results so I am expecting ( well hoping ) the reports of this type of error to reduce quite a bit .

Plagiarism & copyright violations

This is particularly sensitive. LLMs tend to be trained on web scale data and part of that leads to swallowing up data that the originators are not happy should be freely available to train AIs. This has led to a number of law suits that are still going through the courts . Govts are also struggling with what is fair use and isn’t.

June 2023 a class action suit was taken out against Open AI
- https://www.washingtonpost.com/technology/2023/06/28/openai-chatgpt-lawsuit-class-action/
July 2024 OpenAI had to pause the roll out of the integration with bing due to the bypassing of paywalls. Whatever you may think about AI’s scooping up data from the web, if it’s behind a paywall that does seem a little disingenuous to claim fair use for that. Not saying any one is claiming that is fair use but it’s happened and that data is being monetized already so sneaking in and using it will impact revenue models
- https://www.windowscentral.com/software-apps/chatgpt-pauses-bing-integration-to-stop-people-from-bypassing-paywalls
August 2023 UK publishers urging the UK govt to protect their copyrighted works from being hoovered up by AI
- https://www.theguardian.com/books/2023/aug/31/uk-publishers-association-ai-models-sunak
Dec 2023 Article indicating that Midjourney appears to have been trained on copyrighted works
- https://spectrum-ieee-org.cdn.ampproject.org/c/s/spectrum.ieee.org/amp/midjourney-copyright-2666872100

Divulging stuff that wasn’t for public consumption

Back in April 2023 Samsung appeared to not have enough governance around the use of ChatGPT which led to leakage of proprietary company information.
- Samsung workers made a major error by using ChatGPT | TechRadar
- https://news.cgtn.com/news/2023-04-03/Samsung-finds-data-leak-due-to-use-of-ChatGPT-Korean-media-1iHSzPcMDEk/index.html
In June 2023 It was shown how to trick Nvidia’s NeMo framework into divulging PII data
- https://arstechnica.com/gadgets/2023/06/nvidias-ai-software-tricked-into-leaking-data/
July 2023 It seems that AI’s must adhere to GDPR , right to be forgotten etc
- https://www.theregister.com/2023/07/13/ai_models_forgotten_data/
Nov 2023 It was reported that you could exfiltrate data using bard via an indirect prompt injection. This has been fixed now
- https://embracethered.com/blog/posts/2023/google-bard-data-exfiltration/
Nov 2023 Researchers showed how you could get ChatGPT to divulge its training data ( I toyed whether to include this or not as research but I felt it was a great example of how important it was to know the training data & how using web scale its was hard to prevent this type of problem )
- https://www.404media.co/google-researchers-attack-convinces-chatgpt-to-reveal-its-training-data/

July 2023 Business insider listed companies who were concerned about leakage of private data and thus were restricting use of ChatGPT

Using for nefarious purposes

June 2023 a report on how using ChatGPT’s hallucinatory habits ( and yes this could apply to either chatbots) to propose a made up package the bad guys can then publish their own malicious package in its place. The next time a user asks a similar question they may receive a recommendation from ChatGPT to use the now-existing malicious package
- https://vulcan.io/blog/ai-hallucinations-package-risk
July 2023 more supply chain problems. This time it was demonstrated how to poison a LLM then host the poisoned model via hugging face so that it would spread fake news
- https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news/ There are other research papers like this one that show how LLMs can be poisoned ( This is responsible explaining how the threats can be mitigated)
July 2023 A chat bot WormGPT that has no ethical guardrails and designed to aide in creating malware, phishing attacks was being sold
- https://uk.pcmag.com/ai/147755/wormgpt-is-a-chatgpt-alternative-with-no-ethical-boundaries-or-limitations
Nov 2023 An article discussing how ChatGPT can be used to create ransomware. I like this post as it explains how to mitigate
- https://www.malwarebytes.com/blog/news/2023/11/will-chatgpt-write-ransomware-yes/amp
Dec 2023 An article discussing how AI chatbots can convince other AI chatbots to create nefarious content
- https://www.scientificamerican.com/article/jailbroken-ai-chatbots-can-jailbreak-other-chatbots/
Feb 2024 Microsoft and Open AI revealed that Cybercrime groups, nation-state threat actors, and other adversaries are using AI to improve their cyber attacking skills
- https://www.theverge.com/2024/2/14/24072706/microsoft-openai-cyberattack-tools-ai-chatgpt
- https://openai.com/blog/disrupting-malicious-uses-of-ai-by-state-affiliated-threat-actors

(Deep) Fakes

This is often harmful and there are multiple ways it can manifest . It can be annoying to folks looking for original artefacts. There’s the alarming intentional misdirection by the creation of fake videos and audios. It’s often used to harass and cause distress.

Bad guys love this and Govts are waking up to the consequences. It’s not new but it’s easier to do now .

May 2023 it was reported how the top results from a Google search for Edward Hopper were returning AI generated images. I like Edward Hopper so this irked me somewhat
- https://futurism.com/top-google-result-edward-hopper-ai-generated-fake
May 2023 a similar issue was reported for Johannes Vermeer paintings
- https://incidentdatabase.ai/cite/554/#r3172
June 2023 Deep fake audios have got easier to do thanks to AI Business inside discussed the phenomenon
- https://www.businessinsider.com/ai-voice-generator-phone-scam-imposter-crime-money-cash-2023-6
- https://www.wired.com/story/chatgpt-scams-fraudgpt-wormgpt-crime/
Sept 2023 the BBC reported on AI generated images designed to embarrass and harass young girls
- AI-generated naked child images shock Spanish town of Almendralejo - BBC News
I decided to not continue to list all the horrible ways AI has been used to create evil & disgusting AI images as it just got me mad
Feb 2024 a digitally recreated version of a company’s chief financial officer, along with other employees, who appeared in a video conference call instructed an employee to transfer funds which he didQ.
- https://arstechnica.com/information-technology/2024/02/deepfake-scammer-walks-off-with-25-million-in-first-of-its-kind-ai-heist/
Numerous political related fakes way too many to list
- Feb 2024 Sadiq Khan mayor of London responding to the deep fake audio of him released before Armistice Day https://www.bbc.co.uk/news/uk-68146053
- Jan 2024 Deepfake audio call of Biden urging new hampshire democrats not to vote https://www.nbcnews.com/politics/2024-election/fake-joe-biden-robocall-tells-new-hampshire-democrats-not-vote-tuesday-rcna134984

I didn’t have a category for this one but I wanted to include it anyway.

Part II is here

Posted on Feb 18, 2024 at 17:12