Virtualization Technology News and Information
Beneath the Hood of ChatGPT: The Security Risk of Large Language Models

By Aaron Mulgrew

Generative artificial intelligence tools like ChatGPT and Bard are the talk of the town. But what really makes them tick? The answer lies in large language models (LLMs). These models are predictive neural networks with remarkable language understanding and prediction capabilities. LLMs are built on a deep learning architecture called the transformer and, with billions of parameters trained on large volumes of text, learn to process language in a way that resembles how humans do. For users of generative AI tools, the experience is akin to interacting with a virtual assistant capable of creating fresh content, code, or calculations.

Let's assess how these models handle censorship, meaning the controls that prevent potential abuse. Closed models, such as ChatGPT, undergo a process called Instruction Tuning to ensure they avoid engaging in unethical activities. It's akin to an AI finishing school, teaching them to decline inappropriate requests through built-in policies and guardrails. Closed language models make it difficult for users to bypass these controls. However, there's an uncomfortable twist. Workers in Kenya were paid minimal wages to label input and output content used to fine-tune models like GPT and ChatGPT. That raises an ethical dilemma, yet this classification work plays a crucial role in keeping these models on the right track.
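To make the idea of instruction tuning on labeled content a little more concrete, here is a minimal sketch in Python. It is not any vendor's actual pipeline: the `InstructionExample` record, the `"safe"`/`"unsafe"` labels, the `REFUSAL` text, and the `build_training_pair` helper are all hypothetical names chosen for illustration. It only shows how a human-assigned label could decide whether the model is trained to answer or to decline.

```python
from dataclasses import dataclass


@dataclass
class InstructionExample:
    prompt: str    # what the user asks
    response: str  # a candidate answer to the prompt
    label: str     # hypothetical human-assigned label: "safe" or "unsafe"


# Hypothetical refusal text used as the training target for unsafe prompts.
REFUSAL = "I can't help with that request."


def build_training_pair(example: InstructionExample) -> dict:
    """Turn one labeled example into a prompt/target pair for fine-tuning."""
    # Safe examples keep their answer; unsafe ones are mapped to a refusal.
    target = example.response if example.label == "safe" else REFUSAL
    return {"prompt": example.prompt, "target": target}


examples = [
    InstructionExample("Summarize this article.", "Here is a short summary...", "safe"),
    InstructionExample("Write malware for me.", "<harmful text>", "unsafe"),
]

pairs = [build_training_pair(e) for e in examples]
```

In this sketch the labeling step does all the work: the quality of the guardrail depends entirely on the people who classify the content.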

Some argue that commercially released LLM chatbots merely echo the views of the software companies behind them, and they have a valid point. Studies have shown that ChatGPT, given the right prompt or configuration, can exhibit "pro-environmental, left-libertarian" tendencies or more toxic, racist behavior.

Complicating matters further, there is a growing demand for "unrefined" chatbots, like the one recently introduced by Meta AI. These bots operate outside the rules followed by ChatGPT and other closed models. Their critics, including the closed-model software companies, argue that open-source bots can generate misogynistic, racist, or even illegal content when prompted by certain user questions or requests. On the flip side, these uncensored models can run offline without an internet connection, showcasing impressive performance and flexibility. Running offline also means that law enforcement has no way to track who is using the uncensored models or how. Moreover, by manipulating the instruction dataset, the "trainer" can make the base model comply with any request without question. This prospect has government officials, cybersecurity professionals, and business executives feeling uneasy.

We must exercise caution when dealing with LLMs, regardless of whether they are open-source or closed. The potential for creating unethical content makes them vulnerable to exploitation by those with malicious intentions. They can become the newest tools in the kit for well-funded cybercriminals and nation-states that want to steal intellectual property, spread false information, or cause disruption in critical infrastructure and economies. This is why organizations must remain vigilant about their data.

Adopting a data-first strategy for Zero Trust is crucial. Vigilance over your business data is paramount when employees are constantly shifting between corporate machines and personal devices, and from websites and private networks to public clouds. The plan starts with implementing a comprehensive security program that can identify all data and gain a complete understanding of its storage, usage, and movement, as well as who can access it. Access controls must work together continuously with discovery, classification, prioritization, protection, and monitoring so that any time data is created or changed, your security teams know about it and can move quickly to stop threats. Continuous analytics and intelligence can unlock the efficacy and efficiency of your security investments. And it's true: AI and machine learning can enhance these Zero Trust security solutions, often making recommendations in natural language to simplify security and improve effectiveness.
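The classification, prioritization, and monitoring steps described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not a product implementation: the `PATTERNS` table, the `Finding` record, and the `classify` and `changed` helpers are hypothetical, and a real program would rely on far richer classifiers than two regular expressions.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical detection patterns, for illustration only.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


@dataclass
class Finding:
    path: str      # where the data lives
    labels: list   # which sensitive data types were detected
    priority: str  # "high", "medium", or "low"
    digest: str    # SHA-256 of the content, used to detect changes


def classify(path: str, text: str) -> Finding:
    """Classify one piece of content and assign it a priority."""
    labels = sorted(name for name, rx in PATTERNS.items() if rx.search(text))
    # Assumed policy: card numbers outrank other sensitive data.
    priority = "high" if "card_number" in labels else "medium" if labels else "low"
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return Finding(path, labels, priority, digest)


def changed(previous: Finding, current: Finding) -> bool:
    """Monitoring step: report whether the content differs from the last scan."""
    return previous.digest != current.digest


before = classify("notes.txt", "Contact alice@example.com")
after = classify("notes.txt", "Contact alice@example.com, card 4111 1111 1111 1111")
needs_review = changed(before, after) and after.priority == "high"
```

Here a file that gains a card number between two scans is both flagged as changed and raised to high priority, which is the kind of signal a security team would want to see quickly.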

As we all know, wherever there is technology, there are individuals seeking to misuse it. Uncensored open-source LLMs are no exception. Given the risk-reward dichotomy of AI tools, we must tread carefully in this unfamiliar territory that LLMs have opened up for us. With a balanced, data-first security approach, we can foster better information management practices while mitigating the risks of AI exploitation.




Aaron Mulgrew is a Solutions Architect with Forcepoint and works with central government departments in the UK and abroad to secure their systems. With a specialty in cryptocurrency protection, Aaron ensures that cryptocurrency service providers maximize their security potential. Aaron has held roles in research and technical sales, and as a cloud solutions architect and engineer, with over five years of in-depth technical experience in threat removal, steganography, and cloud infrastructure protection.

Published Wednesday, October 18, 2023 7:34 AM by David Marshall