AI's Wild Ride in 2024

alt_text

Image courtesy of imagen3

2024 has been wild when it comes to AI. There have been a lot of good products such as NotebookLM and Cursor ( the latter of which just snuck in to make the list!) to name my two standouts of the year.

AI for good such as GenCast predicts weather and the risks of extreme conditions with state-of-the-art accuracy - Google DeepMind and AlphaFold . The latter was given recognition with the Nobel prize for chemistry . Physics also got in on the act and the Nobel prize for physics was also given out for AI advancements.

Agents have evolved from essentially function calling to something a little more complicated ( I’ll dig into that another time ) with all the big AI players making some sort of agent related announcements. Introducing Gemini 2.0: our new AI model for the agentic era , AI agents — what they are, and how they’ll change the way we work - Source , Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku \ Anthropic and AWS What are AI Agents? - Artificial Intelligence just for starters . OpenAI hadn’t actually announced their’s at the time of writing, but guess it will be one of their 12 Days of OpenAI announcements.

So it’s not all doom and gloom but my GAI is going well collection continues to grow and grow as the adverse effects get ever wilder and in many cases just horrific.

Summarising the trends seen over the year in the in the wild collection of articles

As a result of this curated list “AI slop” is probably my new phrase of 2024!

The mitigations listed in mitigations and toolings imho are really struggling to stay ahead of the game . As soon as something to mitigate various threats and adverse outcomes of using LLM based applications is released such as Prompt Guard-86M | Model Cards and Prompt formats a way to circumvent the guardrail is soon found.

It’s a game of whack a mole .

Watermarking although laudable with solutions such as Watermarking AI-generated text and video with SynthID - Google DeepMind and Meta debuts a tool for watermarking AI-generated videos | TechCrunch which are both ways to show that AI images and videos were AI generated. However they work in different ways, rely on you using specific models and as discussed in The AI Act’s AI Watermarking Requirement Is a Misstep in the Quest for Transparency – Center for Data Innovation having an act that requires watermarking before there is any common standard is a misstep plus bad guys probably won’t be adhering to what lawmakers ask .

Efforts to regulate AI are ongoing and there is an obvious struggle between getting the balance right so innovation isn’t snuffed out or stifling startups who hopefully are working on things more impactful for good rather than yet another AI companion that feeds into their users’ insecurities. It would be a fair assessment to say that the rapid development and deployment of generative AI has outpaced regulatory frameworks, leading to a “wild west” scenario with insufficient safeguards against potential harms.

Summarising the themes collated in the regulating ai advisories collection of articles

The research in this area is broad and the articles collected in the Opinions, research and presentations list reflect this .The list paints a picture of the complexities and challenges of LLMs. While the potential benefits of LLMs are significant, there’s a growing awareness of their limitations and the need to address the associated risks. The main themes covered in this list are :

Growing Awareness of LLM Limitations:

  • Hallucinations: Many articles and research papers discuss the persistent issue of LLMs generating incorrect or nonsensical information, even in highly advanced models. This “hallucination” problem is a significant barrier to trust and reliability.
  • Bias and Safety: Concerns about bias in training data and the potential for LLMs to be used for malicious purposes are prominent. Research highlights how LLMs can reflect the biases of their creators and be vulnerable to jailbreaking attempts.
  • Diminishing Returns: There’s a growing recognition that simply scaling up LLM size and computing resources may not lead to continuous improvements in performance. Some sources suggest that the pace of innovation is slowing and that the industry may be hitting a wall.

Security and Privacy Risks:

  • Data Poisoning and Leakage: Several articles discuss the vulnerabilities of LLMs to data poisoning attacks and the potential for them to leak sensitive information, including training data and user prompts.
  • Cybersecurity Threats: The use of LLMs in cyberattacks is a growing concern, with research highlighting how they can be used to enhance phishing, social engineering, and other malicious activities.
  • AI-Generated Code Security: There are concerns about the security of code generated by LLMs, with research indicating they can introduce vulnerabilities and invent non-existent package names.

Societal and Ethical Implications:

  • Job Displacement: The potential for LLMs to displace human workers, particularly in creative and junior roles, is a recurring theme.
  • Misinformation and Deepfakes: The use of LLMs to generate convincing fake content, including news articles, social media posts, and even deepfakes, is raising alarm bells about the spread of misinformation and the erosion of trust.
  • AI Consciousness and Welfare: Philosophical questions about AI consciousness and the ethical treatment of AI agents are being explored, highlighting the potential for societal division and the need for responsible AI development.

Open Source vs. Closed Models:

  • Open Source Momentum: There’s a push for open-source AI models to foster transparency, collaboration, and innovation. However, there are also concerns about the legal and ethical implications of open-sourcing AI technology.
  • Closed Model Concerns: The dominance of closed models from companies like OpenAI and Google raises questions about control, access, and the potential for these companies to stifle competition.

Regulatory and Mitigation Efforts:

  • Regulation: The need for regulation to address the risks and challenges of LLMs is a common thread. Articles discuss the EU AI Act and other efforts to regulate AI development and deployment.
  • Mitigation Techniques: Research is exploring various techniques to mitigate the risks of LLMs, including jailbreak detection, data curation, and safety alignment

My hope for 2025 is that we continue to see LLMs being used as the foundation to continue to do amazing things like Alphafold , that the mitigation list is way longer and the mitigations more effective . Finally I’d hope humans would just be less horrible to their fellow humans and stop using AI to do terrible things.

A short footnote : Announcing the OWASP LLM and Gen AI Security Project Initiative for Securing Agentic Applications just in time so I could mention it here. AI Agents are the new hotness with new and novel threat vectors joining the over crowded party so it was great to see Owasp are going to tackle this head on.

🎄Seasons greetings and a new year resolution from me will be to continue curating the lists in 2025