GDPR is big and scary and getting closer by the day. Part of me wants to believe that we’ll all be ready by the May 25th deadline, but more of me believes that the real push to comply will happen AFTER that date. If you’re not clear on what GDPR really is or if it matters to you, take a look at my last blog for a larger overview. In a nutshell, it means you have to let users know what data you’re collecting, how you’ll use it, do your best to protect it, be transparent when breaches occur, and remove any user data completely if they ask. And oh yeah, if you don’t comply, there are big, big fines.
In theory, GDPR applies only to the personal data of people in the EU, but the global reach of most commerce these days requires diligence in complying with the regulation across the globe. This leaves a choice between treating all users in a secure, private manner vs. having a completely segmented data flow for EU and non-EU customers, for example (likely a more expensive proposition). In this blog, I’ll explain how you can leverage static code analysis to help improve data protection and privacy.
When you think about GDPR, data protection, and other associated data regulations like PCI-DSS (Payment Card Industry Data Security Standard) or HIPAA (Health Insurance Portability and Accountability Act), the immediate thought is the need for increased testing, dynamic analysis, and penetration testing. While necessary and important, these testing technologies lessen the chance of shipping insecure software without actually making software more secure or ensuring privacy in the first place. But security and privacy can’t be “tested into” software any more than quality or performance can. So GDPR requires concepts called “Security by Design” and “Privacy by Design” (PbD), which mean building software better in the first place.
“The Privacy by Design approach is characterized by proactive rather than reactive measures. It anticipates and prevents privacy invasive events before they happen. PbD does not wait for privacy risks to materialize, nor does it offer remedies for resolving privacy infractions once they have occurred – it aims to prevent them from occurring. In short, Privacy by Design comes before-the-fact, not after”
A. Cavoukian. Privacy by Design – The 7 Foundational Principles, January 2011.
I bring these two concepts up because they are the next step after the normal application security activities take place (firewalls, penetration testing, red teams, DAST, etc.). The “by design” part can also be read as “build it in.” This is the idea that rather than poke at your application and fix where the holes are found, you build an application without the holes in the first place… by design, as it were. For example, SQL injection (SQLi) continues to be one of the most common exploits.
Many tools exist to try and either force an injection through the UI (penetration testing), or simulate the flow of data in a program without running it to see if tainted data can make it through to a database query (flow analysis). A “by design” approach means wrapping any input (from database, user, or anywhere) inside of a validation function at the moment the input is acquired. This reduces the number of paths by which data can bypass validation to zero. You still need to run the penetration tests to make sure you built your software right, but the difference is that if a pentest succeeds you don’t simply fix the one weakness you found. Instead you look back, find out WHY the pentest succeeded, and build your software so that it won’t succeed again.
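To make the “validate at the moment of acquisition” idea concrete, here is a minimal Java sketch. The class names (Username, UserQueries), the allowed-character policy, and the users table are all hypothetical and only for illustration; your own validation rules will differ. The point is that the query layer accepts only the validated type, so a raw string has no path to the database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.regex.Pattern;

/** Hypothetical wrapper type: the only way to obtain a Username is through validation. */
final class Username {
    // Assumed policy for illustration: 3 to 32 letters, digits, or underscores.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_]{3,32}");

    private final String value;

    private Username(String value) {
        this.value = value;
    }

    /** Validates raw input as soon as it is acquired and rejects everything else. */
    static Username parse(String raw) {
        if (raw == null || !ALLOWED.matcher(raw).matches()) {
            // The raw input is deliberately not echoed back in the message.
            throw new IllegalArgumentException("invalid username");
        }
        return new Username(raw);
    }

    String value() {
        return value;
    }
}

/** Hypothetical query layer: it accepts only the validated type, never a raw String. */
final class UserQueries {
    static PreparedStatement findByName(Connection connection, Username user) throws SQLException {
        // Table and column names are assumptions made for this sketch.
        PreparedStatement statement =
                connection.prepareStatement("SELECT id FROM users WHERE name = ?");
        statement.setString(1, user.value()); // bound parameter, not string concatenation
        return statement;
    }
}
```

Something like Username.parse(request.getParameter("name")) at the edge of the application then becomes the single place where the rule is enforced, and the parameterized query is a second layer rather than the only defense.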
If pentests are finding lots of security flaws in your software, then you are not building secure software “by design.” Similarly, for Privacy by Design, we watch who/what/where we share, and we presume that all data is important unless told otherwise. Again, programmers commonly assume that data ISN’T important unless specially flagged. You see this in decisions such as whether data is stored in plain form or encrypted. Encrypting everything is a way of doing privacy by design. One of many, granted, but that’s the basic idea. If you encrypt everything, you never have to worry that you didn’t encrypt something that you should have.
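As a rough illustration of “encrypt everything,” the sketch below puts encryption inside the only write path of a store, so callers cannot forget it. EncryptedStore and its in-memory map are hypothetical stand-ins for a real persistence layer, and key management (where the key lives, how it is rotated) is assumed away here. It uses the standard javax.crypto API with AES in GCM mode.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical store whose only write path encrypts, so nothing is persisted in plain form. */
final class EncryptedStore {
    private static final int IV_BYTES = 12;   // recommended IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();
    private final Map<String, byte[]> rows = new HashMap<>(); // stand-in for a real database

    EncryptedStore(SecretKey key) {
        this.key = key;
    }

    /** Generates a fresh AES key; a real system would load it from a key manager instead. */
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    /** Encrypts the value with a fresh random IV and stores IV followed by ciphertext. */
    void put(String id, String plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        random.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));

        byte[] stored = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, stored, 0, iv.length);
        System.arraycopy(ciphertext, 0, stored, iv.length, ciphertext.length);
        rows.put(id, stored);
    }

    /** Reads the IV back from the stored bytes and decrypts the remainder. */
    String get(String id) throws GeneralSecurityException {
        byte[] stored = rows.get(id);
        if (stored == null) {
            throw new IllegalArgumentException("unknown id");
        }
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
        byte[] plaintext = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
        return new String(plaintext, StandardCharsets.UTF_8);
    }

    /** Exposes what is actually persisted, for inspection and testing. */
    byte[] rawBytes(String id) {
        return rows.get(id).clone();
    }
}
```

Because there is no unencrypted put, the question “did we remember to encrypt this field?” never comes up, which is the whole appeal of doing it by design.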
The role of static analysis isn’t to tell us that our software is vulnerable (that’s the job of testing). The role of static analysis is to help ensure that the software is strong in the first place… by design. While flow analysis has become popular in the last 10 years as a security testing technique, it’s still a way of testing the software rather than a way of hardening the software, or building security in, or doing it “by design.”
Static analysis can be uniquely positioned to act as a real preventative technique if it’s used properly. In addition to the flow analysis security rules, i.e. looking for tainted data, we also enable rules that ensure that the software is built in a secure manner. Considering the two cases above, when doing privacy by design, I can have static analysis rules that flag when data is stored without being encrypted first, or when an outdated, weak encryption method is used instead of strong encryption, or when users are trying to access data that is inappropriate for their expected permissions.
Here’s a brief description of a sample rule that enforces logging when sensitive methods are invoked. This static analysis rule won’t find bugs, but it will help you make software that logs what’s going on so that it’s more secure in production. This rule is a perfect fit for PCI-DSS as well as GDPR.
Ensure all sensitive method invocations are logged [SECURITY.BV.ENFL]
DESCRIPTION: This rule identifies code that does not log sensitive method invocations. An error is reported if sensitive method invocations (for instance, ‘login’ and ‘logout’ from ‘javax.security.auth.login.LoginContext’) are not logged when used.
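One way to write code in the spirit of this rule is to route every sensitive call through a small helper that always logs it. The sketch below is my own illustration, not the rule’s implementation: AuditedCall, SensitiveAction, and the “security.audit” logger name are hypothetical, and I’m assuming userId is an internal identifier that is acceptable to write to a log. Whether a given tool recognizes this wrapper as “logged” depends on how the rule is configured.

```java
import java.util.logging.Logger;

/** Hypothetical helper that makes sure a sensitive operation is always logged. */
final class AuditedCall {
    // Logger name is an assumption for this sketch.
    private static final Logger AUDIT = Logger.getLogger("security.audit");

    /** A sensitive operation that may throw a checked exception, such as LoginException. */
    @FunctionalInterface
    interface SensitiveAction {
        void run() throws Exception;
    }

    /** Logs the request, runs the action, then logs the outcome before returning or rethrowing. */
    static void run(String operation, String userId, SensitiveAction action) throws Exception {
        AUDIT.info(operation + " requested for user " + userId);
        try {
            action.run();
        } catch (Exception e) {
            // Only the exception type is logged, not its message, to avoid leaking details.
            AUDIT.warning(operation + " failed for user " + userId + ": " + e.getClass().getSimpleName());
            throw e;
        }
        AUDIT.info(operation + " succeeded for user " + userId);
    }
}
```

With a javax.security.auth.login.LoginContext in hand, a call would look like AuditedCall.run("login", userId, loginContext::login), and likewise loginContext::logout for the other method the rule description mentions.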
Another example of privacy by design is this rule that helps prevent you from unintentionally leaking personal or important information when an error does occur in your software:
Do not pass exception messages into output in order to prevent the application from leaking sensitive information [SECURITY.ESD.PEO]
DESCRIPTION: This rule identifies code that passes exception messages into output. An error is reported when a catch clause calls an output method and the exception being caught in the catch clause appears in the list of parameters or is used as the calling object.
This rule covers OWASP Top 10, CWE, PCI-DSS, and GDPR – meaning it’s a really good idea no matter why you’re trying to do security.
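Here is a small sketch of what compliant code might look like for that rule. SafeErrors, its method names, and the “app.errors” logger are hypothetical. I’m also assuming that the server-side log is access controlled and is not treated as user-facing output; if your tool or policy treats logging as output too, you would log less detail than this.

```java
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper that keeps exception details out of anything shown to the user. */
final class SafeErrors {
    // Logger name is an assumption; the log is assumed to be access controlled.
    private static final Logger LOG = Logger.getLogger("app.errors");

    /** Records the details server side and returns a generic message with a reference id. */
    static String toUserMessage(Exception e) {
        String incidentId = UUID.randomUUID().toString();
        LOG.log(Level.SEVERE, "incident " + incidentId, e);
        // The exception and its message are never part of the returned text.
        return "Something went wrong. Reference: " + incidentId;
    }

    /** Runs the work and converts any failure into a safe, generic response. */
    static String handle(Callable<String> work) {
        try {
            return work.call();
        } catch (Exception e) {
            return toUserMessage(e);
        }
    }
}
```

The reference id lets support staff find the full stack trace in the log without the user ever seeing a table name, a file path, or someone’s personal data in an error page.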
Because GDPR isn’t a coding standard, there is no simple static analysis configuration that will cover it. Often the best starting point is to find static analysis rules that directly relate to the issues that you’re currently finding in testing, such as XSS, or SQLi issues. Such issues generally have some static analysis rules that act as bug-finders, and will provide early detection for these issues before they make it to testing. Even more important, there will also be associated rules, in this case around input validation, that help you ensure that SQLi simply cannot happen as I mentioned above.
Chasing data from user input through storage is hard. Programming so that validation always happens is easy. Programming so that encryption always happens is easy to do and easy to test for. Why do it the hard way?
Once you’ve found and turned on rules for issues that you’re finding during testing, you’ll want to go even further. I’d suggest borrowing ideas from other standards that already cover data privacy and protection. Some good choices are OWASP, HIPAA, and PCI-DSS. If you turn on the rules in your static analysis tool that relate to those standards, you’re going to be doing a good job for GDPR. In fact, if you’re already PCI-DSS compliant, you’ll find that at least this part of GDPR should be relatively easy to prepare for.
If you already have other security requirements like CWE or CERT, you can make sure that you’re following them as well, and expand your configuration to cover specific GDPR data protection as necessary by finding any items in those standards related to data privacy, data protection, and encryption.
Parasoft can help you get your code secure and private by design in a couple of ways. First, all of our static analysis engines have configurations for OWASP, CWE, CERT, PCI-DSS, HIPAA, etc. You can turn on the exact set of security rules that are a good fit for your organization, and then enforce them automatically.
Additionally, when you integrate Parasoft DTP with static analysis, you have full audit capability, automating the process of documenting what rules were run on what code and when. You can prove that you’re doing testing or even secure by design based on which rules you’ve selected.
Parasoft DTP also has some very special reports, and if you choose to base your security efforts on CWE, the Parasoft CWE dashboard gives you great SAST reports, such as issues by severity, location, type, history, etc. We’ve gone one step further and implemented the technical impact data in CWE. Technical Impact (TI) is research done at MITRE as part of the Common Weakness Risk Analysis Framework (CWRAF), and it helps you classify SAST findings based on the problems they can cause. So instead of a message that says you have a buffer overflow, which some might not recognize as a security problem, TI tells you that the buffer overflow could lead to denial of service.
Each CWE finding tells you what kinds of problems can happen, and there are special graphs that help you navigate your static analysis issues based on the problem areas most important to you, not just on severity levels. This groundbreaking technique helps you get a handle on what can often become an overwhelming number of vulnerabilities, especially if you’re working on a legacy code base. Focus first on the issues that scare you the most.
And of course, while I was focusing on static analysis today as a way of doing security-by-design, don’t forget that Parasoft also has penetration testing tools, API testing, and service virtualization, all of which are an important part of a comprehensive secure software development strategy.
GDPR looks scary and it certainly can be, but getting static analysis set up properly with the right tool and the right rules will help you secure your software, prove that you’re doing the right thing for auditors, and show that you’re following the principles of secure-by-design and privacy-by-design. This is something that penetration testing alone cannot do. The extra benefit is that you’ll find that approaching security from the “by-design” perspective is far more effective than trying to test your way to secure software between QA and release. To dive deeper, read the guide: Using Static Analysis for Secure-By-Design GDPR Data Security and Privacy.
Arthur has been involved in software security and test automation at Parasoft for over 25 years, helping research new methods and techniques (including 5 patents) while helping clients improve their software practices.