An Intellyx BrainBlog for Parasoft by Jason English
Have you ever had a friend who was OCD about the whiteness of their teeth? They go to a cosmetic dentist for a laser whitening, then complain that it’s not white enough? Eventually they reach an asymptotic point where the surface of the tooth can’t reflect more than 99.9% of the light, so a treatment that delivers 99.999% white teeth would be prohibitively costly, if it were possible at all.
Test engineers, performance testers, and SREs are obsessed with achieving a five nines (or sometimes, even higher) level of reliability at the last mile of software delivery, where it impacts customers the most.
If you, too, have OCD about application reliability, you have already worked out the math and realized that 5 9’s equates to about 5.26 minutes of downtime per year, total. And you also realize that the test engineering cost and effort of pushing past 99.9% to guarantee 99.999% uptime is monumental.
Getting production from 3 9’s to 5 9’s will eat up way more IT budget than getting to 3 9’s in most cases!
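The downtime arithmetic behind those nines is easy to check. Here is a minimal sketch in Python, assuming a 365-day year; the function name is just an illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Allowed downtime in minutes per year for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% uptime -> {minutes:.2f} minutes of downtime per year")
```

Three nines leaves room for roughly 525.6 minutes (about 8.76 hours) of downtime a year, while five nines leaves about 5.26 minutes. Each extra nine cuts the budget by a factor of ten.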
It seems all the preproduction performance testing in the world can never clear up all of the unknown conditions that might cause that last 0.001% of downtime. No matter how much we spend. Out, damned spot!
Ah… but let’s not worry about that one thing you can’t improve, all the way over on the right side of software delivery. No, no, no… Issues cost a lot more to resolve here, anyway, when the application is fully baked. Can’t obsess over it.
Just a tiny flare up. Nobody’s perfect. Relax.
Why not leave production alone for a while and go hang out with the DevTest teams at the beginning of the software lifecycle instead?
Over here on the left side of software delivery, they’re drawing maps, writing code, and picking components. Not a care in the world. No fears about what unknown unknowns might happen in deployment.
They’re practicing test-first agile development, running lots of structural code checks with each build, and running automated unit and regression test suites. They have already shifted left on those kinds of testing.
They’re also not terribly worried about the impact of decisions they’re making. When diverse services, software, and infrastructure components are integrated behind a working application, they can generate an infinite variety of non-functional issues when interacting under load.
Early design and development selections can provide bad news later, but non-functional testing (NFT) and performance testing traditionally happen closer to production.
Wait, what if you could bring NFT over to design and development? Could that provide enough early warning to prevent the costliest mistakes?
Non-functional problems are hard to detect early in the software lifecycle because real-world conditions are by nature hard to reproduce in the lab.
You may run Selenium functional test suites on an app interface or user authorization process, or run a set of API service test calls and datasets to verify responses, or run a bunch of JUnit or NUnit tests on code. But all these methods can only test for the problems you expect to find at this early stage.
To get closer to real world conditions, there are three options open to the team.
1. Insert functional tests for benchmarking. If you insert Selenium functional tests in the CI/CD software pipeline and automate them with a companion tool like Parasoft Selenic to monitor execution times with each build, you can capture a pretty good benchmark for any application, component, or service.
If the response time deviates from that benchmark, for whatever reason, the pipeline can notify the build platform or the developer that some exception (likely non-functional) caused a particular component to slow down.
2. Reuse tests and repeat under load. Why wait until the end and use tools like LoadRunner, which need near-finished UIs to invoke, demand specialized testing skills, and carry high per-seat costs?
Instead, what if you took the Selenium tests, and some security checks and service and integration tests from a tool like Parasoft SOAtest, then started cycling and re-running them, perhaps with some variability of data or timing, at higher frequencies? You would get an early-stage performance test from shift left testing.
From there, you can mix web UI calls with non-UI calls to APIs or event queues to build a hybrid performance test that exercises multiple layers of the application’s target ecosystem, without waiting for a finished app.
3. Isolate against environmental dependencies. The last hurdle to shifting NFT to the left is the environment where applications actually live. A modern application functions in a world full of data and service dependencies.
Here’s where environment-based testing with a service virtualization solution makes sense: it lets you instrument all of the upstream and downstream dependencies around the live app. This allows you to simulate things like a banking partner’s system, or a national weather or air traffic system that would never be under your control or available for import into your own DevTest lab.
Dependencies can be listened to and captured as virtual services — components that can respond “better than the real thing” as far as software testing is concerned.
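The first two options above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Parasoft Selenic or SOAtest work internally: `sample_functional_test` is a hypothetical stand-in for an existing Selenium or API test, and the baseline and tolerance values are made up.

```python
import random
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def sample_functional_test(payload: int) -> None:
    """Hypothetical stand-in for an existing functional or API test."""
    time.sleep(0.001 + (payload % 3) * 0.001)  # simulated work that varies with the data


def timed_run(test, payload: int) -> float:
    """Run one test and return its execution time in seconds."""
    start = time.perf_counter()
    test(payload)
    return time.perf_counter() - start


def run_under_load(test, repetitions: int, workers: int, seed: int = 0) -> list[float]:
    """Re-run the same test many times in parallel with varied input data."""
    rng = random.Random(seed)
    payloads = [rng.randrange(1000) for _ in range(repetitions)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: timed_run(test, p), payloads))


def exceeds_benchmark(durations: list[float], baseline_median: float,
                      tolerance: float = 0.25) -> bool:
    """True when the median run time is more than `tolerance` slower than the baseline."""
    return statistics.median(durations) > baseline_median * (1 + tolerance)


if __name__ == "__main__":
    durations = run_under_load(sample_functional_test, repetitions=50, workers=10)
    print(f"median run time: {statistics.median(durations) * 1000:.1f} ms")
    # The baseline would normally come from earlier builds; 2 ms is an assumed value.
    print("slower than benchmark:", exceeds_benchmark(durations, baseline_median=0.002))
```

In a pipeline, the baseline would be recorded from earlier builds, and a `True` result would be the signal to notify the developer. Comparing medians rather than single runs keeps one noisy execution from failing the build.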
One Canadian bank used a combination of all three techniques. They automated functional tests to capture a benchmark for a loan application component, then re-ran the tests with some other tests for data queries and API calls to a virtual service “mock” of a third party credit service.
They were worried: if the virtual API service responded too slowly, what would happen to the component under test? It responded as expected, but then the testing team sped up the virtual service of that third party response to the shortest possible time. A “race condition” emerged that caused the component transaction to fail when getting a response that was too fast!
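To show how a response that is too fast can break a component, here is a toy sketch in Python. Everything in it is hypothetical: the article does not say what actually caused the bank’s race condition, so the ordering bug below is invented for illustration, and `VirtualCreditService` is a hand-rolled stub rather than a real service virtualization product.

```python
import threading
import time


class VirtualCreditService:
    """Hypothetical virtual service: returns a canned score after a configurable delay."""

    def __init__(self, delay_seconds: float):
        self.delay_seconds = delay_seconds

    def request_score(self, callback) -> None:
        # Deliver the canned response on another thread once the delay has elapsed.
        threading.Timer(self.delay_seconds, callback, args=(720,)).start()


class LoanComponent:
    """Toy component under test, with a deliberate (invented) ordering bug."""

    def __init__(self, service: VirtualCreditService):
        self.service = service
        self.awaiting_response = False
        self.score = None
        self.finished = threading.Event()

    def _on_response(self, score: int) -> None:
        if self.awaiting_response:  # a response that arrives "unexpectedly" is dropped
            self.score = score
        self.finished.set()

    def process(self) -> bool:
        self.service.request_score(self._on_response)
        time.sleep(0.05)               # bookkeeping happens after the request is sent
        self.awaiting_response = True  # too late if the response has already arrived
        self.finished.wait(timeout=2)
        return self.score is not None


def sweep(delays):
    """Run the component once against each virtual service delay."""
    return {d: LoanComponent(VirtualCreditService(d)).process() for d in delays}


if __name__ == "__main__":
    for delay, ok in sweep([0.5, 0.2, 0.0]).items():
        print(f"virtual service delay {delay:.1f}s -> {'ok' if ok else 'FAILED'}")
```

With the slower delays the component is ready by the time the response lands. With a zero delay the response arrives before the component has marked itself as waiting, and the transaction fails. A live third-party service would rarely answer that quickly in a lab, which is why being able to dial the virtual service’s delay down matters.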
In software, there’s only one definition of perfection, but failure is infinite.
No matter how much we brush, we’ll never have 100% white teeth. No matter how much we test, we’ll never be 100% clear of defects making it to production, where errors are difficult to isolate and costly to remediate.
But with proper test-left hygiene that includes non-functional and performance testing, we can get an early warning system that will nip many of these nascent issues in the bud, before they have a chance to emerge in the performance lab or fail in front of customers.
©2020 Intellyx, LLC. Intellyx retains editorial control over the content of this document. At the time of writing, Parasoft is an Intellyx customer. None of the other vendors mentioned here are Intellyx clients. Image sources: Steve Snodgrass, Kristopher Volkman, John Queen, Gauthier DELECROIX – 郭天, flickr open source.
Jason English is Principal Analyst and CMO at Intellyx, where he advises leading technology solution providers and software startups as they navigate digital transformation. His background includes customer experience and interactive design, enterprise software dev/test lifecycle, virtualization, Cloud, and blockchain.