Test Smarter Not Harder: Shift Testing Left and Right with Test Impact Analysis
By Mark Lambert
August 28, 2019
6 min read
Test impact analysis means focusing testing specifically on changes made during each iteration, and testing exactly what needs to be tested, automatically. Teams leveraging this technology can optimize their in-development testing effort with instant feedback about what needs to be done.
As industry surveys and reports frequently confirm, software testing is still a bottleneck, even after the implementation of modern development processes like Agile, DevOps, and Continuous Integration/Deployment. In some cases, software teams aren’t testing nearly enough and have to deal with bugs and security vulnerabilities at the later stages of the development cycle, which creates the false impression that these new processes can’t deliver on their promise. One solution to certain classes of issues is shift-right testing, which relies on monitoring the application in a production environment, but it requires a rock-solid infrastructure to roll back new changes if a critical defect arises.
As a result, organizations are still missing deadlines, and quality and security are suffering. But there’s a better way! To test smarter, organizations are using a technology called test impact analysis to understand exactly what to test. This data-driven approach supports both shift-left testing and shift-right testing.
Agile and DevOps and the Testing Bottleneck
Testing in any iterative process is a compromise on how much testing can be done in a limited cycle time. In most projects, it’s impossible to do a full regression on each iteration. Instead, a limited set of tests is performed, and exactly what to test is based on best guesses. Testing is also back-loaded in the cycle, since there usually aren’t enough completed new features to test early on. The resulting effort vs. time graph ends up like a saw tooth, as shown below in Figure 1. In each cycle, only a limited set of tests is executed, until the final cycle, where full regression testing is performed.
Figure 1: Agile processes result in a “saw tooth” of testing activity. Only the full regression cycle is able to do a “complete” test.
Unfortunately, no project reaches the final cycle with zero bugs and zero security vulnerabilities. Finding defects at this stage adds delays as bugs are fixed and retested. Even with those delays, many bugs still make their way into the deployed product, as illustrated below.
Figure 2: Integration and full regression testing are never error-free. Late-stage defects cause schedule and cost overruns.
This situation has resulted in the adoption of what has been coined “shift-right testing,” in which organizations continue to test their application into the deployment phase. The intention of shift-right testing is to augment and extend testing efforts with testing that is best suited to the deployment phase, such as API monitoring, toggling features in production, and retrieving feedback from real-life operation.
What is Shift-Right Testing?
The difficulties in reproducing realistic test environments and using real data and traffic in testing led teams to use production environments to monitor and test their applications. There are benefits to this. For example, testing applications with live production traffic supports fault tolerance and performance improvements. A common use case is the so-called canary release, in which a new version of the software is released to a small subset of customers first, and then rolled out to an increasingly larger group as bugs are reported and fixed. Roku, for example, does this for updating their device firmware.
Shift-right testing relies on a development infrastructure that can roll back a release in the event of critical defects. For example, a severe security vulnerability in a canary release means rolling back the release until a new updated release is ready, as you can see in the illustration here:
Figure 3: Shift-right testing relies on solid development operations infrastructure to roll back releases in the face of critical defects.
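The canary pattern described above can be sketched in a few lines. This is a minimal illustration only: the function name `run_canary_rollout`, the stage percentages, and the `has_critical_defect` callback are hypothetical stand-ins for whatever monitoring and deployment tooling a team actually uses.

```python
from typing import Callable, Sequence


def run_canary_rollout(
    stages: Sequence[int],
    has_critical_defect: Callable[[int], bool],
) -> int:
    """Walk through rollout stages (percent of customers on the new release).

    Returns the percentage of customers left on the new release:
    the last stage if every stage was healthy, or 0 if a critical
    defect forced a rollback.
    """
    for percent in stages:
        # Hypothetical health check, e.g. backed by production monitoring.
        if has_critical_defect(percent):
            return 0  # roll everyone back to the previous release
    return stages[-1] if stages else 0


if __name__ == "__main__":
    # Assumed scenario: a defect only shows up once half the customers are exposed.
    print(run_canary_rollout([1, 10, 50, 100], lambda percent: percent >= 50))
```

The point of the sketch is the dependency it makes visible: the rollout is only safe if the rollback branch is reliable, which is why shift-right testing leans so heavily on operations infrastructure.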
But there are risks to using production environments to monitor and test software, and of course, the intention of shift-right testing was never to replace unit, API, and UI testing practices before deployment! Shift-right testing is a complementary practice that extends the philosophy of continuous testing into production. Despite this, organizations can easily abuse the concept to justify doing even less unit and API testing during development. To prevent this, we need to make testing during development easier and more productive, so that it produces better-quality software.
Testing Smarter, Not Harder, by Focusing Your Testing
Most software isn’t fully tested, and the decision of what to test is essentially based on developers’ best guesses about what is critical functionality. During a Scrum sprint, or an iteration in other processes, it’s difficult to determine what to test, because, of course, “test everything” isn’t an option. Since timelines are short, only the parts of the software that were updated by the latest functionality can be tested, but exactly what code is impacted is usually unknown. Test automation helps, but without understanding exactly where and what to test, test coverage is inadequate.
Test Impact Analysis
These shortcomings can be overcome by using test impact analysis, which is a multivariate analysis of test coverage, code changes, and dependencies that pinpoints exactly what code needs to be tested. In addition, these exact tests can be scheduled and executed automatically.
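To make the core idea concrete, here is a minimal sketch of test selection from coverage data. The coverage map, file names, and the function `select_impacted_tests` are all hypothetical; a real tool would record coverage per test automatically and typically work at a finer granularity than whole files.

```python
from typing import Dict, Set

# Hypothetical coverage map: test name -> source files it exercised on its last run.
COVERAGE: Dict[str, Set[str]] = {
    "test_login": {"auth.py", "session.py"},
    "test_checkout": {"cart.py", "payment.py"},
    "test_profile": {"auth.py", "profile.py"},
}


def select_impacted_tests(coverage: Dict[str, Set[str]], changed: Set[str]) -> Set[str]:
    """Return the tests whose recorded coverage overlaps the changed files."""
    return {test for test, files in coverage.items() if files & changed}


if __name__ == "__main__":
    # A change to auth.py selects only the two tests that exercise it.
    print(sorted(select_impacted_tests(COVERAGE, {"auth.py"})))
    # prints ['test_login', 'test_profile']
```

In this toy example, `test_checkout` is skipped because nothing it exercises has changed, which is where the time savings come from.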
At the developer level, test impact analysis works within the IDE. It collects information about which code is exercised by which tests and applies that information as the developer changes code, so the developer can easily identify and execute the specific tests needed to verify that the changed code doesn’t break anything. It also keeps track of which affected tests have been run, which have passed, and which have failed, making it easy to determine which tests still need to be run and which failures need to be addressed. Once all affected tests have been run and are passing, the developer knows that it’s safe to commit their code and move on.
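The bookkeeping described above (which affected tests are still pending, which passed, which failed) can be sketched as a small tracker. The class and method names here are hypothetical and do not reflect any particular IDE plugin's API.

```python
from enum import Enum
from typing import Dict, Iterable, List


class Status(Enum):
    PENDING = "pending"
    PASSED = "passed"
    FAILED = "failed"


class ImpactedTestTracker:
    """Tracks the state of the tests affected by the developer's current changes."""

    def __init__(self, impacted: Iterable[str]) -> None:
        # Every affected test starts out as not yet run.
        self._status: Dict[str, Status] = {test: Status.PENDING for test in impacted}

    def record(self, test: str, passed: bool) -> None:
        """Store the outcome of running one affected test."""
        if test not in self._status:
            raise KeyError(f"{test} is not in the impacted set")
        self._status[test] = Status.PASSED if passed else Status.FAILED

    def with_status(self, status: Status) -> List[str]:
        """List the affected tests currently in the given state."""
        return sorted(t for t, s in self._status.items() if s is status)

    def safe_to_commit(self) -> bool:
        """True only when every affected test has been run and has passed."""
        return all(s is Status.PASSED for s in self._status.values())


if __name__ == "__main__":
    tracker = ImpactedTestTracker(["test_login", "test_profile"])
    tracker.record("test_login", passed=True)
    print(tracker.with_status(Status.PENDING), tracker.safe_to_commit())
```

Note that a pending test blocks the commit just as a failed one does: an affected test that has not been run says nothing about whether the change is safe.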
Within a CI/CD process, test impact analysis integrates into a project’s build system, such as Maven or Gradle, to give immediate feedback on changes. It identifies which code has changed since the baseline build (for example, the last nightly build), determines which tests need to be run to exercise that code, and then runs just that subset of tests. This workflow enables teams to set up CI jobs that only run tests based on the most recent code changes, which can shrink the time it takes to run a CI job from hours to minutes.
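As a rough illustration of the CI workflow, the sketch below derives the changed files from `git diff --name-only` against a baseline revision and plans the test run from them. The function names and the coverage map are hypothetical, and the fallback rule (run the full suite when a changed file has no coverage data) is an assumption chosen here for safety, not something prescribed by any specific tool.

```python
import subprocess
from typing import Dict, List, Optional, Set


def parse_changed_files(diff_output: str) -> Set[str]:
    """Turn the output of `git diff --name-only` into a set of file paths."""
    return {line.strip() for line in diff_output.splitlines() if line.strip()}


def changed_files_since(baseline: str) -> Set[str]:
    """Ask git which files differ between the baseline revision and HEAD."""
    result = subprocess.run(
        ["git", "diff", "--name-only", baseline, "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_changed_files(result.stdout)


def plan_ci_run(coverage: Dict[str, Set[str]], changed: Set[str]) -> Optional[List[str]]:
    """Return the subset of tests to run, or None to signal "run the full suite".

    Assumption: if any changed file is unknown to the coverage data (for
    example, a brand-new module), selection cannot be trusted, so fall back.
    """
    covered_files: Set[str] = set().union(*coverage.values())
    if changed - covered_files:
        return None
    return sorted(test for test, files in coverage.items() if files & changed)


if __name__ == "__main__":
    coverage = {"test_login": {"auth.py"}, "test_cart": {"cart.py"}}
    print(plan_ci_run(coverage, parse_changed_files("auth.py\n")))
```

A CI job would call `changed_files_since` with the tag or commit of the baseline build and pass the result to `plan_ci_run`; the nightly build then reruns everything and refreshes the coverage data.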
Test impact analysis provides the following key benefits:
- Understand what each test covers: By automatically correlating test execution data with test coverage data, test impact analysis provides a mechanism to identify which tests need to be run, based on the code currently being developed. Users save time without having to run unnecessary tests, and teams benefit from immediate feedback during development and after code check-in.
- Understand what has changed: Developers often don’t know which tests to run to validate code changes, so they either (a) check their code in without running any tests (a very bad practice), (b) run only one or two tests that they know about (which can easily miss relevant tests), or (c) run all of their tests (which wastes time). Test impact analysis solves this by immediately identifying which tests are related to which code changes, and takes it a step further by automatically executing them. Checked-in code becomes more stable since it has been thoroughly tested prior to check-in.
- Focus on tests that validate changes and impacted dependencies: Identifying and running just the set of tests needed to verify all of the code changes and affected dependencies committed since the last baseline build (usually the nightly build) significantly decreases the amount of time it takes to run CI. This allows teams to benefit from a true CI process.
- Immediate and ongoing feedback: By identifying not just direct dependencies between tests and code but indirect dependencies as well, test impact analysis helps teams learn, as soon as possible after code is checked in, whether the change broke any tests.
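The indirect dependencies mentioned in the last benefit can be illustrated with a small graph traversal. This sketch assumes a hypothetical module-level dependency map (`depends_on`); it expands a set of changed modules to everything that depends on them, directly or transitively, so that tests covering any of those modules can be selected.

```python
from collections import deque
from typing import Dict, Set


def impacted_modules(depends_on: Dict[str, Set[str]], changed: Set[str]) -> Set[str]:
    """Return the changed modules plus every module that depends on them,
    directly or indirectly."""
    # Invert the graph: module -> modules that depend on it.
    dependents: Dict[str, Set[str]] = {}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    impacted = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:  # also guards against dependency cycles
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


if __name__ == "__main__":
    # Hypothetical graph: checkout.py uses pricing.py, which uses tax.py.
    graph = {
        "checkout.py": {"pricing.py"},
        "pricing.py": {"tax.py"},
        "profile.py": {"auth.py"},
    }
    print(sorted(impacted_modules(graph, {"tax.py"})))
    # prints ['checkout.py', 'pricing.py', 'tax.py']
```

Here a change to `tax.py` also flags `checkout.py`, even though `checkout.py` never references `tax.py` directly, while the unrelated `profile.py` is left out.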
To greatly decrease the testing bottleneck in development, and improve the efficiency of the “saw tooth” effort that testers put into every iteration, development teams can benefit from test impact analysis technology. Test automation with test impact analysis means focusing testing specifically on changes made during each iteration, and testing exactly what needs to be tested, automatically. These teams optimize their in-development testing effort with instant feedback on what needs to be done, what code fails testing, and what other code is impacted by new changes.