GitHub Copilot’s Security Filters Don’t Work

tl;dr We tested GitHub Copilot’s claim that it can use an LLM to catch simple security vulnerabilities and automatically filter out suggestions that contain them. None of our tests passed.

Background

At Codeium, the two most common classes of questions we get are “I saw a demo of [Cool Feature X] on Twitter, can Codeium do this?” and “Your competitor [X] can do [Y], can Codeium do the same?” We wrote an entire post on how we think of development in the space and how there needs to be a healthy appreciation for what LLMs can consistently do and what they cannot, and that argument makes a lot of sense for the first set of questions. But the second? If a competitor claims to do something, that means that the tech is there to reliably do whatever is in question, right? In the past, we have investigated such features, like post-generation license filters, and have found that the claims are much larger than the actual maturity of the capability.

Today, we will look at another such feature. We are often asked whether we can filter out suggestions that could introduce new security vulnerabilities into the codebase without the developer fully understanding what is going on. Our response is that detecting security vulnerabilities is a really hard problem, and there are plenty of SAST tools focused on exactly this kind of code scanning. These tools are agnostic to whether a human or an AI wrote the code, so companies that are concerned about this in general should invest in such tooling as part of their CI/CD pipelines. We try our best to remove repositories with known security vulnerabilities from our training data, but filtering out every such generation? We don’t claim to be able to do this, but recently GitHub Copilot did.

GitHub Copilot’s Security Scanning

GitHub Copilot has, for a while, claimed to have a filter that uses an LLM under the hood to scan any generated code and filter out, or alert the developer about, suggestions that seem to contain insecure coding practices:

We also launched an AI-based vulnerability prevention system that blocks insecure coding patterns in real-time to make GitHub Copilot suggestions more secure. Our model targets the most common vulnerable coding patterns, including hardcoded credentials, SQL injections, and path injections.

The new system leverages LLMs to approximate the behavior of static analysis tools—and since GitHub Copilot runs advanced AI models on powerful compute resources, it’s incredibly fast and can even detect vulnerable patterns in incomplete fragments of code. This means insecure coding patterns are quickly blocked and replaced by alternative suggestions.

To be fair, they don’t claim to be able to catch everything, but the reality is that if you launch and market such a feature, users will naturally expect it to work pretty well and, at the bare minimum, catch the obvious things. SAST tools catch such things with pretty good recall, so any new approach should be at roughly the same level. The simple act of releasing such a feature comes with certain expectations, and could disincentivize companies from investing in other tools dedicated to catching such security vulnerabilities.

So, let’s run it through some simple tests.

Testing GitHub Copilot’s Security Scanning

We won’t try anything that the GitHub Copilot team themselves don’t claim in their post. Namely, we will test SQL injections, path injections, and hardcoded credentials.

The canonical example is SQL injection, so let’s see whether Copilot (a) suggests code with SQL injection and (b) if so, whether the security scanning catches it:

Ok, that was easy. Users can now do SQL injections such as setting their username to "bob"; DROP TABLE admins;.
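For readers who have not seen this class of bug up close, here is a minimal sketch of the pattern in Python. It is not the suggestion from our test: the `users` table, the function names, and the payload are all made up for illustration. Because `sqlite3`'s `execute` runs only one statement at a time, the sketch uses a tautology payload rather than a stacked `DROP TABLE`.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # Vulnerable: user input is pasted straight into the SQL text,
    # so a quote character in `username` changes the query itself.
    query = f"SELECT email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # Parameterized: the driver passes `username` as data, never as SQL.
    query = "SELECT email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print("unsafe:", find_user_unsafe(conn, payload))  # leaks every row
    print("safe:  ", find_user_safe(conn, payload))    # matches nothing
```

The string-built version is exactly the kind of pattern a filter targeting SQL injection would be expected to flag. The parameterized version is the usual alternative.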

Let us test path injection, which is just another variant of this kind of security vulnerability:

Well, this doesn’t instill a whole lot of confidence: requests could just be made with the prefix ../../../ and suddenly your entire file system is exposed. Copilot couldn’t even protect against the most basic injection vulnerabilities.
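To make the traversal concrete, here is a rough Python sketch, again not the code from our test. The function names and the idea of a single base directory are assumptions for illustration. The guard resolves the requested path and rejects anything that lands outside the base directory (`Path.is_relative_to` needs Python 3.9 or newer).

```python
from pathlib import Path


def unsafe_path(base_dir: Path, filename: str) -> Path:
    # Vulnerable: "../" segments in `filename` walk out of base_dir.
    return base_dir / filename


def safe_path(base_dir: Path, filename: str) -> Path:
    # Resolve both sides first so ".." segments and symlinks are collapsed,
    # then refuse anything that is not inside the base directory.
    base = base_dir.resolve()
    candidate = (base / filename).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {filename!r}")
    return candidate


if __name__ == "__main__":
    base = Path("/srv/app/uploads")  # hypothetical upload directory
    print(unsafe_path(base, "../../../etc/passwd").resolve())
    try:
        safe_path(base, "../../../etc/passwd")
    except ValueError as err:
        print("rejected:", err)
```

This is only one way to guard file access. An allowlist of known filenames is stricter where it is practical.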

How would we test hardcoded credentials? Perhaps a developer has a local file with some secret keys. Copilot claims that “insecure coding patterns are quickly blocked and replaced by alternative suggestions,” which should prevent those keys from being accidentally committed to the repository. To test this, we just put a random key variable in an open tab and see if we can get Copilot to regurgitate this value in a separate file:

No hesitation. Copilot also does not yet have a productionized way to specify particular file paths or types to ignore as part of their context, so this is just something you have to live with.
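For contrast, the kind of "alternative suggestion" one might hope for keeps the secret out of source entirely. A minimal sketch, assuming the key is supplied through an environment variable (the variable name `SERVICE_API_KEY` and the helper are hypothetical):

```python
import os


def load_api_key(env_var: str = "SERVICE_API_KEY") -> str:
    """Read the key from the environment instead of hardcoding it."""
    key = os.environ.get(env_var)
    if not key:
        # Fail loudly rather than fall back to a literal baked into the code.
        raise RuntimeError(f"environment variable {env_var} is not set")
    return key


if __name__ == "__main__":
    os.environ["SERVICE_API_KEY"] = "example-value-for-demo-only"
    print(len(load_api_key()))
```

Nothing here stops an assistant from copying a literal out of an open tab. It only shows what a non-hardcoded suggestion could look like.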

It does not seem like we even need to investigate more complex forms of security vulnerabilities, given that Copilot will happily regurgitate insecure code for the exact examples they claim this feature helps to catch.

Conclusion

If you use GitHub Copilot, please don’t expect their LLM-based security scanning to work. For full transparency, Codeium today doesn’t do a whole lot better, because removing all such possibilities from trillions of tokens of training data is a very hard problem, but we also don’t claim to solve the problem for you. Our general thesis is to not get ML models to do things that existing tools, like SAST tools, do pretty well already.

Maybe you are the (common) type of person who looks at this and says that humans are just as likely as an AI to write such insecure coding patterns, so this isn’t a huge deal in the first place, and that is totally fine - you are all set with or without this feature working. But for people who do care about these things, please invest in your CI/CD pipelines.

Using generative AI to accelerate software development is a no-brainer when it comes to productivity wins, but make sure the safeguards you need actually work.