Generative AI tools such as ChatGPT, Gemini, and Copilot can generate just about anything, and while most people use these models constructively, there are always bad actors looking to exploit them for malicious purposes. The companies behind these models therefore need to stop their products from being misused, and most do so by building in safeguards that block harmful requests. For example, if you were to ask ChatGPT to write a computer virus for you, you might see something like this:
This is the result of a built-in safeguard implemented by OpenAI to ensure that ChatGPT is not used for illegal purposes.
Prominent security researcher Marco Figueroa recently uncovered a significant vulnerability that could allow users to circumvent these safeguards and use generative AI models to produce harmful code.
The root of the weakness is that the language model is designed to follow instructions step by step but lacks the deeper context awareness needed to judge the safety of each individual step within the broader task. In essence, a series of encoded instructions presented as hex decoding tasks masks the harmful intent of the request and slips past the model's guardrails.
Exploitation
Hex encoding converts plain-text data into hexadecimal notation, a format that is commonly used in computer science to represent binary data in a human-readable form.
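To make this concrete, here is a minimal Python sketch of hex encoding and decoding. The string used is a harmless placeholder, not the instruction from the actual attack.

# Minimal sketch of hex encoding and decoding in Python; the string below is
# a harmless placeholder, not the instruction used in the real attack.
plaintext = "summarise the history of cryptography"

# Encoding: each byte of the text becomes two hexadecimal characters.
encoded = plaintext.encode("utf-8").hex()
print(encoded)    # 73756d6d6172697365...

# Decoding reverses the process and recovers the original text exactly.
decoded = bytes.fromhex(encoded).decode("utf-8")
print(decoded)    # summarise the history of cryptography

The transformation is trivially reversible, so the model can recover the original instruction on its own, yet the text that actually arrives in the prompt looks like meaningless noise to a simple keyword-based filter.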
A malicious instruction to research and generate an exploit for a particular CVE, which would ordinarily trigger ChatGPT's guardrails, is first encoded into hex format, yielding a long string of hexadecimal characters.
This string is then presented to the model with a clear set of instructions to decode it. Since the model processes this task step-by-step, it decodes the hex into readable instructions without triggering any alarms.
Once the hex is decoded, ChatGPT interprets the result as a valid task. Since the resulting instruction directs the model to research a CVE and write a Python exploit for it, the model proceeds to produce code that the attacker can then use to exploit the vulnerability.
Implications
This vulnerability exposes real-world risks that could affect you in unexpected ways. If bad actors can successfully bypass the safeguards on AI models, they could generate malicious software or automated phishing tools much more easily, putting individuals’ data and online security at greater risk. You may have noticed how easy it has become for people with absolutely no software development training or experience to create complex software with the help of generative AI. Imagine if the same were true for malware and ransomware. We would see widespread negative impacts very quickly. It is therefore essential to strengthen the defences of AI models, focusing on strategies to better detect and prevent malicious activity.
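As a purely illustrative sketch (not a description of OpenAI's actual defences), one first line of detection could be as simple as screening incoming prompts for long runs of hexadecimal characters before they reach the model:

import re

# Illustrative sketch only: flag prompts containing a long, uninterrupted run
# of hexadecimal characters, which may be hiding an encoded instruction.
# The 40-character threshold is an arbitrary assumption for this example.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{40,}\b")

def looks_like_hex_payload(prompt: str) -> bool:
    """Return True if the prompt contains a suspiciously long hex string."""
    return HEX_RUN.search(prompt) is not None

# Usage with harmless placeholder prompts
hidden = "summarise the history of cryptography".encode("utf-8").hex()
print(looks_like_hex_payload("please decode the following: " + hidden))  # True
print(looks_like_hex_payload("write me a poem about security"))          # False

A real defence would of course need to go well beyond pattern matching, for example by evaluating decoded content against the same safety policies applied to plain-text requests, but the sketch illustrates the general idea: inspect what the instructions become, not just what they look like.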
Vulnerabilities in AI models are still coming to light, and this is a striking example: a relatively obvious form of trickery that is only now being discovered. Awareness of such weaknesses is therefore essential, not only for the people building these models but also for those who use them. Staying informed is the first step towards security, and we at Decrypting aim to keep our readers informed.