How do large companies protect their source code?

First off, I want to say that just because a company is big doesn't mean their security will be any better.

That said, having done security work at a large number of Fortune 500 companies, including many name brands most people are familiar with, I can say that currently 60-70% of them don't do as much as you'd think they should. Some even give hundreds of third-party companies around the world full access to pull from their codebase, though not necessarily to write to it.

A few use multiple private GitHub repositories for separate projects, with two-factor authentication enabled, tight control over who is granted access, and a process to quickly revoke access when anyone leaves.
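
As a concrete illustration of that kind of hygiene, here is a minimal sketch of the audit-and-revoke side, using the GitHub REST API through the Python requests library. The org name, repo name, and username are placeholders, and real offboarding would cover far more than one repository:

    # Sketch: audit two-factor coverage and revoke a leaver's repo access via
    # the GitHub REST API. "example-org", "example-repo", and "departing-user"
    # are placeholders; a token with org-owner scope is assumed in GITHUB_TOKEN.
    import os
    import requests

    API = "https://api.github.com"
    HEADERS = {
        "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    }

    def members_without_2fa(org):
        """List org members who have not enabled two-factor authentication."""
        resp = requests.get(f"{API}/orgs/{org}/members",
                            params={"filter": "2fa_disabled"}, headers=HEADERS)
        resp.raise_for_status()
        return [m["login"] for m in resp.json()]

    def revoke_collaborator(org, repo, username):
        """Remove a user's access to a private repository when they leave."""
        resp = requests.delete(
            f"{API}/repos/{org}/{repo}/collaborators/{username}",
            headers=HEADERS)
        resp.raise_for_status()

    if __name__ == "__main__":
        print("Members without 2FA:", members_without_2fa("example-org"))
        revoke_collaborator("example-org", "example-repo", "departing-user")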

A few others are very serious about protecting things, so they do everything in house and use what to many other companies would look like excessive levels of security control and employee monitoring. These companies use solutions like Data Loss Prevention (DLP) tools to watch for code exfiltration, internal VPN access to heavily hardened environments just for development with a ton of traditional security controls and monitoring, and, in some cases, full-packet capture of all traffic in the environment where the code is stored. But as of 2015 this situation is still very rare.

Something that may be of interest, and which has always seemed unusual to me, is that the financial industry, especially banks, has far worse security than one would think, and that the pharmaceutical industry is much better than other industries, including many defense contractors. There are some industries that are absolutely horrible about security. I mention this because there are other dynamics at play: it's not just big companies versus small ones; a large part of it has to do with organizational culture.

To answer your question, I'm going to point out that it's the business as a whole making these decisions and not the security teams. If the security teams were in charge of everything, or even knew about all the projects going on, things probably wouldn't look anything like they do today.

That said, keep in mind that most large businesses are publicly traded and, for a number of reasons, tend to be much more concerned with short-term profits, meeting quarterly numbers, and competing for market share against their large competitors than with security risks, even when those risks could effectively destroy the business. Bear that in mind when reading the following answers.

  • If source code were stolen:

    1. Most wouldn't care, and it would have almost no impact on their brand or sales. Keep in mind that the code itself is in many cases not where the value of a company's offering lies. If someone else got a copy of the Windows 10 source, they couldn't suddenly create a company selling a Windows 10 clone OS and be able to support it. The code itself is only part of the solution sold.

    2. Would the product be at greater risk because of this? Yes, absolutely.

  • External Modification: Yes, but this is harder to do and easier to catch. That said, since most companies are not seriously monitoring for this, it's a very real possibility that it has happened to many large companies, especially if back-door access to their software is of significant value to nation-states. This probably happens a lot more often than people realize.

  • Internal Attacker: Depending on how smart the attacker was, this may never even be noticed, or could be made to look like an innocent programming mistake. Outside of background checks and behavior monitoring, there is not much that can prevent this, but hopefully some source-code analysis tools would catch it and force the team to correct it. This is a particularly tough attack to defend against and is the reason a few companies don't outsource work to other countries and do comprehensive background checks on their developers. Static source-code analysis tools are getting better, but there will always be a gap between what they can detect and what can be done.

In a nutshell, the holes will always come out before the fixes, so dealing with most security issues becomes something of a race against time. Security tools help buy you time, but you'll never have "perfect" security, and getting close to it can get very expensive in terms of time (slowing developers down or requiring a lot more man-hours somewhere else).

Again, just because a company is big doesn't mean it has good security. I've seen some small companies with much better security than their larger competitors, and I think this will increasingly be the case, since smaller companies that want to take security more seriously don't have to make massive organizational changes, whereas larger companies will be forced to stick with the way they've done things in the past due to the transition cost.

More importantly, I think it's easier for a new company (of any size, but especially smaller ones) to have security heavily integrated into its core culture, rather than having to change a current/legacy culture like older companies do. There may even be opportunities now to take market share away from a less secure product by creating a very secure version of it. Likewise, I think your question is important for a totally different reason: security is still in its infancy, so we need better solutions in areas like code management, where there is a lot of room for improvement.


Disclaimer: I work for a very big company that does a good job in this area, but my answer is my own personal opinion and is not indicative of my employer's position or policies.

First of all, how to protect code from being leaked:

  • Network Security: This is the obvious one -- if Chinese hackers get credentials into your internal systems, they'll go for your source code (if for no other reason than the fact that the source code will tell them where to go next). So basic computer security has to be a "given".
  • Access Control: Does your receptionist need access to your code repository? Probably not. Limit your exposure.
  • Be selective in hiring and maintain a healthy work environment: DLP measures like scanning outbound email are nifty in theory, but if your engineers are smart enough to be of any use to you at all, they're smart enough to figure out how to circumvent your DLP measures. Your employees shouldn't have a reason to leak your source code. If they do, you've done something horribly, horribly wrong.
  • Monitor your network: This is an extension of the "network security" answer, but with a Data Loss Prevention emphasis. If you see a sudden spike in DNS traffic, that may be your source code getting exfiltrated by an attacker. OK, now ask yourself if you would even know if there was a sudden spike in DNS traffic from your network. Probably not. (A minimal sketch of this kind of baseline monitoring follows this list.)
  • Treat mobile devices differently: Phones and laptops get lost really often. They also get stolen really often. You should never store sensitive information (including source code, customer data, and trade secrets) on mobile devices. Seriously. Never. That doesn't mean you can't use mobile devices to access and edit source code. But if a laptop goes missing, you should be able to remotely revoke any access that laptop has to sensitive data. Typically that means that code and documents are edited "in the cloud" (see c9.io, koding.com, Google Docs, etc.) with proper authentication and all that. This can be done with or without trusting a third party, depending on how much work you want to put into it. If your solution doesn't support two-factor authentication, pick another solution; you want to reduce your exposure with this measure, not increase it.
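
To make the DNS point concrete, here is a minimal monitoring sketch using the scapy packet library. The interface name, window size, and 3x-baseline threshold are all arbitrary assumptions; a real deployment would feed something like this from a network tap into proper alerting:

    # Sketch: flag spikes in outbound DNS query volume (a common exfiltration
    # channel). Requires root; "eth0" and the 3x threshold are assumptions,
    # and the window only rolls over when packets arrive (fine for a sketch).
    import time
    from collections import deque
    from scapy.all import sniff, DNS

    WINDOW = 60                 # seconds per counting window
    history = deque(maxlen=30)  # rolling baseline of recent windows
    count = 0
    window_start = time.time()

    def on_packet(pkt):
        global count, window_start
        if pkt.haslayer(DNS) and pkt[DNS].qr == 0:  # a query, not a response
            count += 1
        if time.time() - window_start >= WINDOW:
            if history:
                baseline = sum(history) / len(history)
                if count > 3 * baseline:
                    print(f"ALERT: {count} DNS queries this window "
                          f"vs baseline ~{baseline:.0f}")
            history.append(count)
            count = 0
            window_start = time.time()

    sniff(filter="udp port 53", prn=on_packet, store=False, iface="eth0")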

Second, how to prevent malicious code modification; there really is only one answer to this question: change control.

For every character of code in your repository, you must know exactly who added (or deleted) that code, and when. This is so easy to do with today's technology that it's almost more difficult not to have change tracking in place. If you use Git or Mercurial or any modestly usable source control system, you get change tracking for free, and you should rely on it heavily.
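
For instance, answering "who added this code, and when?" is a one-liner with Git's pickaxe search. A minimal sketch, where the repository path and search string are placeholders:

    # Sketch: "who added or removed this code, and when?" via Git's pickaxe.
    # The repo path "." and the search string are placeholders.
    import subprocess

    def who_touched(repo, needle):
        """List commits whose diffs added or removed the given string."""
        out = subprocess.run(
            ["git", "-C", repo, "log", "-S", needle,
             "--format=%h %an %ad %s", "--date=short"],
            capture_output=True, text=True, check=True)
        return out.stdout.splitlines()

    for entry in who_touched(".", "connectToLicenseServer"):
        print(entry)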

But to up the trustworthiness a bit, I would add that every change to your repository must be signed off by at least one other person besides the author submitting the change. Tools like Gerrit can make this simple. Many certification regimes require code reviews anyway, so enforcing those reviews at check-in time means that malicious actors can't act alone in getting bad code into your repo, helps prevent poorly written code from being committed, and helps ensure that at least two people understand each change submitted.
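
As a rough illustration of enforcing that rule without Gerrit (this is not Gerrit's actual mechanism), a server-side pre-receive hook could reject pushes containing commits that lack a Signed-off-by trailer from someone other than the author. The trailer convention here is an assumption:

    #!/usr/bin/env python3
    # Sketch of a Git pre-receive hook: reject any pushed commit lacking a
    # Signed-off-by trailer from someone other than the commit author.
    # (A real hook would also exclude commits already reachable from other
    # refs instead of re-checking a new branch's whole history.)
    import re
    import subprocess
    import sys

    ZERO = "0" * 40  # Git's all-zero hash, seen on branch creation/deletion

    def commits_in_range(old, new):
        rng = new if old == ZERO else f"{old}..{new}"
        out = subprocess.run(["git", "rev-list", rng],
                             capture_output=True, text=True, check=True)
        return out.stdout.split()

    for line in sys.stdin:                  # one "old new ref" line per ref
        old, new, ref = line.split()
        if new == ZERO:                     # branch deletion, nothing to check
            continue
        for sha in commits_in_range(old, new):
            show = subprocess.run(
                ["git", "show", "-s", "--format=%ae%n%B", sha],
                capture_output=True, text=True, check=True).stdout
            author_email, _, message = show.partition("\n")
            signers = re.findall(r"^Signed-off-by:.*<(.+?)>", message,
                                 re.MULTILINE)
            if not any(s != author_email for s in signers):
                print(f"rejected {sha[:8]} on {ref}: needs sign-off from "
                      "someone other than the author", file=sys.stderr)
                sys.exit(1)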


There will be measures in place to prevent the accidental insertion of problematic code, aka bugs. Some of these will also help against the deliberate insertion of problematic code.

  • When a developer wants to commit code to the repository, another developer has to examine this merge request. Perhaps the second developer will be required to explain to the first developer what the new code does. That means going over every line.
  • If the code looks confusing, it might be rejected as bad style and not maintainable.
  • Code has automated unit and integration tests. When there is no test for a certain line of code, people wonder. So the attacker would either have to write a test exercising the backdoor or obfuscate it somehow.
  • When a new version of the software is built, developers and quality assurance check which commits are part of the build and why. Each commit has to have a documented purpose. (A simple automated check for this is sketched after this list.)
  • Software is built and deployed using automated scripts. That is not just for security but also to simplify the workload.
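
To sketch the "documented purpose" check from the list above: a build script can refuse to proceed when any commit since the last release doesn't reference a ticket. The tag name and the JIRA-style ID pattern are assumptions:

    # Sketch: fail the build if any commit since the last release tag lacks a
    # ticket reference. "last-release" and the PROJ-123 pattern are assumptions.
    import re
    import subprocess
    import sys

    TICKET = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")  # e.g. JIRA-style keys

    log = subprocess.run(
        ["git", "log", "--format=%h %s", "last-release..HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout

    undocumented = [l for l in log.splitlines() if not TICKET.search(l)]
    if undocumented:
        print("Commits without a documented purpose (no ticket reference):")
        print("\n".join(undocumented))
        sys.exit(1)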

Of course such measures rely on the honesty and good will of all participants. Someone with admin access to the build server or repository could wreak a lot of havoc. On the other hand, ordinary programmers don't need this kind of access.