Security researchers at cyber risk management company Vulcan.io published a proof of concept of how hackers can use ChatGPT 3.5 to spread malicious code from trusted repositories.
The research calls attention to security risks inherent in relying on ChatGPT suggestions for coding solutions.
The researchers collated frequently asked coding questions on Stack Overflow (a coding question and answer forum).
They chose 40 coding subjects (like parsing, math, scraping technologies, etc.) and used the first 100 questions for each of the 40 subjects.
The next step was to filter for “how to” questions that included programming packages in the query.
Questions asked were in the context of Node.js and Python.
“All of these questions were filtered with the programming language included with the question (node.js, python, go). After we collected many frequently asked questions, we narrowed down the list to only the “how to” questions.
Then, we asked ChatGPT through its API all the questions we had collected.
We used the API to replicate what an attacker’s approach would be to get as many non-existent package recommendations as possible in the shortest space of time.
In addition to each question, and following ChatGPT’s answer, we added a follow-up question where we asked it to provide more packages that also answered the query.
We saved all the conversations to a file and then analyzed their answers.”
They next scanned the answers to find recommendations of code packages that did not exist.
Up to 35% of ChatGPT Code Packages Were Hallucinated
Out of 201 Node.js questions ChatGPT recommended 40 packages that did not exist. That means that 20% of the ChatGPT answers contained hallucinated code packages.
For the Python questions, out of 227 questions, over a third of the answers consisted of hallucinated code packages, 80 packages that did not exist.
Actually, the total amounts of unpublished packages were even higher.
The researchers documented:
“In Node.js, we posed 201 questions and observed that more than 40 of these questions elicited a response that included at least one package that hasn’t been published.
In total, we received more than 50 unpublished npm packages.
In Python we asked 227 questions and, for more than 80 of those questions, we received at least one unpublished package, giving a total of over 100 unpublished pip packages.”
Proof of Concept (PoC)
What follows is the proof of concept. They took the name of one of the non-existent code packages that was supposed to exist on the NPM repository and created one with the same name in that repository.
The file they uploaded wasn’t malicious but it did phone home to communicate that it was installed by someone.
“The program will send to the threat actor’s server the device hostname, the package it came from and the absolute path of the directory containing the module file…”
What happened next is that a “victim” came along, asked the same question that the attacker did, ChatGPT recommended the package containing the “malicious” code and how to install it.
And sure enough, the package is installed and activated.
The researchers explained what happened next:
“The victim installs the malicious package following ChatGPT’s recommendation.
The attacker receives data from the victim based on our preinstall call to node index.js to the long hostname.”
A series of proof of concept images show the details of the installation by the unsuspecting user.
How to Protect Oneself From Bad ChatGPT Coding Solutions
The researchers recommend that before downloading and installing any package it’s a good practice to look for signals that may indicate that the package may be malicious.
Look for things like the creation date, how many downloads were made and for lack of positive comments and lack of any attached notes to the library.
Is ChatGPT Trustworthy?
ChatGPT was not trained to offer correct responses. It was trained to offer responses that sound correct.
This research shows the consequences of that training. This means that it is very important to verify that all facts and recommendations from ChatGPT are correct before using any of it.
Don’t just accept that the output is good, verify it.
Specific to coding, it may be useful to take extra care before installing any packages recommended by ChatGPT.
Read the original research documentation:
Featured image by Shutterstock/Roman Samborskyi