Attacks against AI systems

Uploaded On: 22 Feb 2024 Author: Abhijit Limaye Like (83) Comment (0)

2023 - The Year of AI 

The year 2023 was a pivotal year for Artificial Intelligence or AI. The turning point came when OpenAI released ChatGPT, their Large Language Model (LLM), in November 2022. 

‘BigTech’ took notice and all major players rushed to release their own AI models.

Systems like ChatGPT, DALL-E2, Midjourney and Stable Diffusion achieved mainstream adoption. 

As with any technology released to users for public use, we see extremely fast user adoption. With AI, we saw not a wildfire adoption but integrations into mainstream commercial products like Bing search and Microsoft Office productivity Suite.

This widespread adoption and integration of AI technologies is not without concerns around workforce impact, ethical use and increasing attacks against AI systems. 

The blog explores the most important challenges and attacks related to AI. 

Impact on Employment Landscape

As AI systems become more prevalent, discussions about their potential impact on employment have gained traction. While certain jobs may be automated, others are created to support and maintain these advanced systems. The rise of jobs like prompt engineering exemplifies this shift, emphasizing the dynamic nature of the job market as it adapts to the AI revolution.

Impact on Education

Students and members of the teaching faculty are both going to see a very different impact of AI. Students can vastly benefit from the capabilities of models like ChatGPT which can summarize a complex topic or ‘teach as if I am 5 years old.  The challenges with faculty are very different. They must quickly adapt their teaching style by incorporating additional information, facts, data or explanations beyond the standard textbook.  If they do not quickly learn to work with AI, we will see some obsolescence of teachers who rely solely on textbook based teaching.

Inadequate Safeguards and Rule-Bending

The deployment of technology often precedes the establishment of robust safeguards. Users, in their quest to maximize the capabilities of AI, frequently find ways to push the boundaries and bend the rules. This pattern highlights the need for comprehensive ethical frameworks and security measures to mitigate potential risks.

Lack of Regulations and Testing

As we write this, there are no well-defined standards and regulations that regulate the development, deployment or use of AI systems. Traditional software systems work broadly in the  ‘Input-Process-Output' mode, where if the inputs are known, it is easy to validate the expected output.

AI systems like generative AI are based on deep learning and Artificial Neural Network.
(Ref.: https://en.wikipedia.org/wiki/Generative_artificial_intelligence)

By design, there is no well-defined relation between the input and the output of AI systems. It is highly difficult, if not impossible to predict the output of a GenAI system given a particular input.

This presents enormous difficulties in validating and or certifying the AI output to ensure that it does not generate wrong, offensive or unethical output.

With some of the challenges discussed above, let us dive into the possible attack scenarios against AI systems.

Indian AI Initiatives

Indian companies, Government organizations and premier Educational Institutions are not far behind in jumping onto the AI Bandwagon. There are already some notable initiatives and implementations trying to build native language  LLMs using Natural Language Processing or NLP. Bhashini, being developed at IIT Madras for Govt. Of India is a notable initiative aimed at solving the problem of language translation in real-time.  Krutrim, another AI LLM was recently announced by the Ola group is building the AI model and the necessary infrastructure to create new models. Some other notable initiatives are Project Indus by Tech Mahindra and CoRover.ai, a Conversational multi-format AI platform.  India is a very dynamic landscape where some opinion expressed online can have adverse effects on the larger populace.

As this is a very nascent field, Indian AI initiatives will face the same challenges that any other AI models will face. Enough safeguards, regulations and stringent criteria are required before the models are released for public consumption.

Attack Vectors Against AI Systems

Several attack vectors threaten the integrity of AI systems, jeopardizing their reliability and functionality.
Deepfakes pose a real danger where a fake image, video or audio is created that looks identical to the real person. Prominent examples of Deepfakes are the Barack Obama video and most recently Deepfakes of Indian celebrities.
An unassuming person will wrongly believe what is said in the audio / video. The potential dangers of Deepfakes have far reaching consequences like frauds, cyber-crimes or inciting violence.

Jailbreak and bypassing the ‘guard-rails’
A prominent attack vector is jailbreaking, where malicious actors attempt to make the AI systems perform something that is normally not available. An example is the unauthorized manipulation of model parameters to undermine the system's intended purpose, showcasing the vulnerability of AI models to external tampering.

The most recent (and (in)famous) example of an AI jailbreak is called the ‘Grandma locket jailbreak’ where the AI is prompted to solve a secret puzzle in a locket left behind by the dead grandma, while, in reality, the puzzle is a CAPTCHA. Another example of a jailbreak is bypassing the safeguards to reveal the ‘recipe’ of creating a napalm bomb. 

Algorithmic Bias

The persistence of algorithmic bias poses a significant threat. AI models require large amounts of data to be ‘trained’

As training datasets remain under developer control, the potential for biased models persists, leading to unfair and discriminatory outcomes. AI developers can influence the outcome by training the models using a biased dataset. As an example, if I am the AI developer and for some reason I do not like mangos, I can only build the model with data that infers that mangos are bad!

The need for transparent and diverse datasets is crucial to counteract this threat.

False Results Generation

 Attackers may exploit AI systems by inducing them to generate false or incorrect results. This can have far-reaching consequences, especially in applications where decision-making based on AI-generated outputs is critical. Ensuring the robustness of AI models against adversarial attacks becomes imperative in such scenarios.

Open-source Models: from ‘Black box’ to ‘white-box’

Systems like ChatGPT, DALL-E or Midjourney are ‘black boxes’ for users and developers. They don’t see the inner workings of the model to infer anything or to tamper.

However, with open-source models like Meta’s Llama 2, developers and users now get an inside view of how the system was built. 

One of the biggest risks with this approach is a malicious threat actor releasing a tampered open-source model as ‘free’. The model can prompt users to provide personal or financial information which unsuspecting users might just easily provide. 


Staying safe
So far, we have looked at various possible attack scenarios against the AI systems. I would still call it the tip of the iceberg as we do not fully understand the capabilities and use-cases of AI models. 

While it is extremely challenging to protect against different types of attacks, here’s what users of AI systems can do:

Stay Informed

Regularly update yourself on the latest developments in AI security to understand potential risks and vulnerabilities.

Implement Ethical AI Practices

Developers should prioritize ethical considerations throughout the AI development lifecycle, ensuring fairness, transparency, and accountability.

Collaborate for Security

Foster collaboration between AI developers, security experts, and regulatory bodies to create comprehensive guidelines and standards for AI security.

Report Suspicious Activity

Encourage a culture of reporting and addressing any suspicious or anomalous behavior in AI systems promptly, enabling swift responses to potential threats. 

By addressing these actionables, users can actively contribute to the responsible and secure development and deployment of AI technologies, ensuring the continued advancement of this transformative field.