What the recent npm and GitHub breach means for you
Last week GitHub and npm announced the details of an investigation into an attack that targeted the npm organisation on GitHub. In this article we explore the impact on the average organisation and discuss some steps you can take to minimise the effect of this attack and of similar attacks in future.
How did it happen?
You can read the full details in the post-mortem published by GitHub. In brief, an attacker obtained OAuth tokens issued to authorised consumers of the GitHub API and then used those tokens to selectively target a number of GitHub organisations. By cloning private repositories from the npm GitHub organisation, the attacker was able to access sensitive AWS access keys and secrets that allowed access to npm's cloud infrastructure. From there the attacker was able to exfiltrate, or steal, data including a database backup dating from April 2021.
What was the impact?
The attackers were able to exfiltrate data from internal npm systems. Exactly what was stolen is detailed in the GitHub analysis, so we won't cover it in depth here. Instead we focus on the part that's most likely to have an impact on you or your organisation.
One of the pieces of exfiltrated data was a backup from April 2021 of "skimdb", a public mirror of the CouchDB database behind the npm package registry itself. According to the GitHub announcement, the stolen backup includes two items of particular concern:
- An archive of user information from 2015. This contained npm usernames, password hashes, and email addresses for roughly 100k npm users.
- All private npm package manifests and package metadata as of April 7, 2021.
If your npm user account was one of those 100,000, you should have been notified last week by npm. Affected accounts have had their passwords forcibly changed, and you'll have to go through the reset process to regain access to your account.
The second point, regarding private package manifests, is perhaps more concerning. The GitHub post-mortem goes on to explain in more detail:
This exfiltrated data includes READMEs, package version histories, maintainer email addresses, and package install scripts, but does NOT include the actual package artifacts, i.e., the tarballs themselves.
It's particularly important to note that while the public registry mirror at skimdb.npmjs.com should not make data associated with private packages available to the public, the data stolen in this breach does appear to include such information.
Finally, it's worth mentioning that two specific unnamed organisations were further targeted and suffered a theft of actual package artefacts. These organisations have been notified by GitHub. Had the attackers chosen to do so they would have been in a position to steal this additional, and likely far more sensitive, data from any private package in the registry.
What can you do about it?
In the vast majority of cases the actual package artefacts (the code itself) were not included in the exfiltrated data, which is fortunate. The README files and package install scripts were included, however, and it's quite likely that attackers could make use of any sensitive information they contain. We recommend the following steps at a minimum to reduce possible further impact on your organisation:
- Audit the codebases from which all of your private npm packages are published.
- Pay close attention to any README files. Packages use a file in the root directory (relative to the `package.json` file) named `README` (with any case in the file name and with any extension), and we can safely assume it is the contents of this file that are included in the stolen dataset. Search these files for hostnames, access keys, passwords or any other potentially sensitive data.
- Check the `author` and `contributors` fields in the `package.json` files. Any names and email addresses included in these fields were included in the stolen dataset. It may be sensible to proactively engage with any members of your organisation who have access to inboxes associated with those email addresses to reduce the risk of future targeted phishing attacks.
- Check the `scripts` field in the `package.json` files. We can fairly safely assume that the stolen dataset did not include any files within the package referenced by these scripts, but any inline scripts within `package.json` itself will be included.
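As a starting point for the README and inline script checks, here is a minimal Python sketch. It assumes each package lives in its own directory alongside its `package.json`. The function names (`find_readme`, `scan_text`, `audit_package`) and the three patterns are our own illustrations, not part of any npm or GitHub tooling, and the patterns are deliberately narrow: treat a clean report as "nothing obvious", not as proof that nothing sensitive is present.

```python
from __future__ import annotations

import json
import re
from pathlib import Path

# Illustrative patterns only (an assumption of this sketch); dedicated
# secret scanners ship far larger rule sets.
PATTERNS = {
    # AWS access key IDs: "AKIA" (long-term) or "ASIA" (temporary) + 16 chars.
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # Things like "password: hunter2" or "API_KEY=abc123".
    "credential_assignment": re.compile(
        r"(?i)\b(?:password|passwd|secret|token|api[_-]?key)\b\s*[:=]\s*\S+"
    ),
    # URLs with embedded credentials, e.g. https://user:pass@host/path.
    "url_with_credentials": re.compile(
        r"(?i)\b[a-z][a-z0-9+.-]*://[^\s/:@]+:[^\s/@]+@\S+"
    ),
}


def find_readme(package_dir: Path) -> Path | None:
    """Return the root-level README (any case, any extension), if present."""
    for entry in sorted(package_dir.iterdir()):
        if entry.is_file() and entry.name.lower().split(".")[0] == "readme":
            return entry
    return None


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every suspicious line."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings


def audit_package(package_dir: Path) -> dict[str, list[tuple[int, str]]]:
    """Scan the README and every inline script in package.json."""
    report = {}
    readme = find_readme(package_dir)
    if readme is not None:
        text = readme.read_text(encoding="utf-8", errors="replace")
        report[readme.name] = scan_text(text)
    manifest_path = package_dir / "package.json"
    if manifest_path.is_file():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        for script_name, command in (manifest.get("scripts") or {}).items():
            report[f"scripts.{script_name}"] = scan_text(str(command))
    # Only keep sources that produced at least one finding.
    return {source: hits for source, hits in report.items() if hits}
```

Calling `audit_package(Path("path/to/package"))` returns a dictionary keyed by the README file name or by `scripts.<name>`, so each finding points back to the file or script that needs a manual look. Hostnames are not covered here because a useful pattern depends on your own internal naming scheme.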
How can you minimise the impact of similar future attacks?
- Take care to avoid committing sensitive data such as access keys or passwords to source control repositories.
- If you use GitHub, consider enabling secret scanning on your organisation's repositories. There are numerous third-party security scanning tools available if you use a different platform.
- If you discover or detect secrets committed in plain text to source control repositories, make sure your teams understand that the policy is to rotate those keys immediately, rather than to attempt to rewrite source control history to remove them from the codebase.
- Consider setting up email groups or mailing lists that can be used as contact details for authors and contributors of packages instead of those of individuals. This may reduce the chance of targeted "spear" phishing attacks.
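To see where individual addresses are still published, you could list the emails in the `author` and `contributors` fields of each `package.json` and compare them with the shared mailboxes you have set up. The Python sketch below is one way to do that, under stated assumptions: the function names and the `shared` set are hypothetical, and the email pattern is a loose approximation rather than a full address validator. It handles both forms npm accepts for a person, a string such as `Name <email> (url)` or an object with an `email` key.

```python
from __future__ import annotations

import re

# Loose email pattern for illustration; not a full RFC 5322 validator.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def contact_emails(manifest: dict) -> set[str]:
    """Collect email addresses from the author and contributors fields."""
    people = [manifest.get("author")] + list(manifest.get("contributors") or [])
    emails = set()
    for person in people:
        if isinstance(person, dict):
            # Object form: {"name": ..., "email": ..., "url": ...}
            person = person.get("email", "")
        if isinstance(person, str):
            # String form: "Name <email> (url)", or a bare address.
            emails.update(match.lower() for match in EMAIL.findall(person))
    return emails


def individual_contacts(manifest: dict, shared: set[str]) -> set[str]:
    """Return published addresses that are not known shared mailboxes."""
    return contact_emails(manifest) - {address.lower() for address in shared}
```

For example, with a manifest parsed by `json.load` and `shared = {"platform-team@example.com"}`, `individual_contacts(manifest, shared)` returns the personal addresses that would be worth replacing before the next publish.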