Is A.I. Disrupting Our QA World?

Ingenuity Festival Cleveland, Ohio, U.S.A. | Photography by Raphael Awoseyin

Automate Like It’s 2009

2009 was a very important year for test automation. It signalled the dawn of a new era of seriousness on the subject. Eight years after Cem Kaner, James Bach, and Bret Pettichord published Lessons Learned in Software Testing, a landmark work that helped define the QA role as we know it, QA experts turned to automation and its proper integration into test strategy. Along with this awakening came several key publications that are still regarded today as major contributions to the test automation and QA world. It would be an understatement to say the flow of new publications was largely influenced by the rise of test automation as a standard QA practice adopted by major companies. 2009 marked the transition from the test pioneers who advocated for software testing to be recognised as a legitimate skill set and trade, to the advocates of the proper use of testing tools in an ever-changing digital world. And somehow, close to the same number of years later, I feel we are at yet another turning point, this time with artificial intelligence.

What is A.I.? According to Wikipedia, “artificial intelligence can be defined as intelligence demonstrated by machines”. Merriam-Webster defines it as “the capability of a machine to imitate intelligent human behaviour”. This is great, but we have not yet defined what intelligence means. Indeed, intelligence is a tricky word to define, even in a human context, as we recognise different types of intelligence. Going back to Merriam-Webster, this time for the definition of intelligence:

(1): the ability to learn or understand or to deal with new or trying situations 

(2): the ability to apply knowledge to manipulate one's environment or to think abstractly as measured by objective criteria (such as tests)

No, that wasn’t me. Yes, Merriam-Webster indeed even specifically mentions tests in its definition of intelligence!  I’d say in the test automation world, intelligence would involve appropriately modifying test scenarios in such a way as to adapt to changes in the software being tested without human intervention. But with the rise of A.I., is our world of testing principles and best practices about to be turned upside-down?

TEST CHECK 1, 2, 3

People are often hesitant about becoming a tester. This is because there appear to be a lot of misconceptions about what a tester actually does. Is it just about repeatedly verifying that software works as it should? That sounds really repetitive, doesn’t it? But then again, many testers will tell you there’s actually a whole science behind it. You have to analyse, investigate, and conceptualise. It’s a door into the software development world, completely immersing you in the finicky details of the products we know and love. And just when you may start to get bored, automation shows up: another perspective on testing that reduces monotony while increasing efficiency. We harness the computational powers that be by putting the machine to work in our pro-robotics world. But this benefit does not come without a hefty bit of work on our part.

In 2009, world-renowned testing experts Michael Bolton and James Bach wrote Testing vs. Checking. In this article, Bolton distinguishes between checking, which is what machines do in automation, and testing, which is what human testers do:

Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modelling, observation, inference, etc.

Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

This spurred conversations in certain circles about automation being, in fact, just checking that stuff works, and not a substitute for the nuances of testing. Automation code, much like robots, only tests what it has been programmed to test, and will be completely oblivious to everything else. In contrast, a human tester can pick up on subtle inconsistencies and issues beyond the brick-and-mortar scope of a given test scenario. In some ways, what automation lacks is eyes, and an abstract brain to make sense of the product presented to it. This problem is not limited to test automation; it is a global obstacle to humanity’s push for further robot autonomy. So, in software testing, the gap between the abstract human brain and algorithmic robotic computation is marked by this very concept of checking versus testing.
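
To make the distinction concrete, here is a minimal, purely hypothetical sketch of what a “check” looks like in code: an algorithmic decision rule applied to one specific observation of the product, and nothing more. The function names are invented for illustration.

```python
# A minimal sketch of a "check" in Bolton and Bach's sense: an algorithmic
# decision rule applied to one specific observation of the product.
# Both function names here are hypothetical, purely for illustration.

def fetch_login_page_title() -> str:
    # Stand-in for whatever mechanism actually observes the product
    # (a Selenium driver, an HTTP client, etc.).
    return "Log in - ExampleApp"

def check_login_page_title() -> bool:
    # The decision rule: the observed title must equal the expected string.
    observed = fetch_login_page_title()
    return observed == "Log in - ExampleApp"

# The check passes or fails on exactly this one observation. A broken layout,
# a misspelled button label, or a glaring usability problem on the same page
# would sail straight past it; that is the gap testing, the human activity,
# is meant to cover.
print("check passed" if check_login_page_title() else "check failed")
```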

One of the promises of artificial intelligence is to reduce this very gap by adding that QA intelligence to otherwise robotic test scripts. If we could give our automation scripts eyes, or the ability to process images, we could visually validate software much the way human testers do: by first having a visual reference point. This is one of the principles behind A.I.-based testing tools such as Applitools. Giving automation scripts that abstract brain, however, would prove to be far more of a challenge. And this is why, even with all the advances in QA technology, we still need to test manually on some level.
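
As a rough illustration of the idea (and not Applitools’ actual approach or API), here is a crude baseline-comparison sketch using Pillow. Real A.I.-based tools add perceptual diffing and learn which differences matter; this only shows the basic notion of giving a script “eyes”: a reference image plus a comparison rule. The file names are hypothetical.

```python
# A crude sketch of baseline-driven visual validation using Pillow.
# This is not how commercial A.I. tools work internally; it only illustrates
# the underlying idea of a visual reference point. File names are hypothetical.

from PIL import Image, ImageChops

def images_match(baseline_path: str, current_path: str, tolerance: int = 0) -> bool:
    baseline = Image.open(baseline_path).convert("RGB")
    current = Image.open(current_path).convert("RGB")
    if baseline.size != current.size:
        return False
    # Pixel-wise difference; getbbox() is None when the images are identical.
    diff = ImageChops.difference(baseline, current)
    if tolerance == 0:
        return diff.getbbox() is None
    # Otherwise allow small per-channel deviations (anti-aliasing, font rendering).
    extrema = diff.getextrema()  # one (min, max) pair per colour channel
    return all(channel_max <= tolerance for _, channel_max in extrema)

# Typical flow: the UI test saves a screenshot (e.g. via Selenium's
# driver.save_screenshot("current_login.png")) and compares it to an approved baseline.
if not images_match("baseline_login.png", "current_login.png", tolerance=5):
    raise AssertionError("Visual difference detected against the approved baseline")
```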

THE GREAT PYRAMID

In the test automation world, one of the most contentious topics is determining the scope of automation, and more specifically, determining what to automate. Some companies, on discovering how efficiently automated test scripts can be run compared to manual testing, opt immediately to move towards a 100% automation model (it should be noted that 100% automation is not equivalent to automating all possible scenarios, but rather automating all the tests we would otherwise have executed manually). This may work in a purely backend (APIs, web services) context, but as soon as you get into human-computer interaction, the inability of automation to find some major bugs becomes glaringly apparent. Even more apparent is the enormous maintenance work that comes with increased test automation. By maintenance, we don’t just mean updating scripts to adapt to the ever-changing specifications of today’s agile software development world (more on this later), but also the general instability of many automated UI tests and the mitigation of false positives.

Cohn’s Test Automation Pyramid

Most of us with test automation training have at some point heard of the test automation pyramid. This is a concept that was presented to the world by Mike Cohn in his highly acclaimed 2009 book, Succeeding with Agile: Software Development Using Scrum. Cohn further elaborates on the subject in his blog post The Forgotten Layer of the Test Automation Pyramid. Cohn calls for a test automation strategy that looks at three layers structured in the form of a pyramid: unit tests at the base, making up the bulk of automated tests, followed by web services/API tests, and finally UI tests, with the least amount of automation at the top.
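
To keep the layers straight, here is a hedged sketch with one test per layer of the pyramid. The URLs, selectors, and the apply_discount() function are invented for illustration; the point is the relative cost: many cheap unit tests, fewer service tests, and only a handful of UI tests.

```python
# A hedged sketch of one test per pyramid layer. Everything named here is
# hypothetical; only the shape of the pyramid is the point.

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

def apply_discount(price: float, percent: float) -> float:
    return round(price * (1 - percent / 100), 2)

# Base of the pyramid: fast, isolated unit test.
def test_apply_discount_unit():
    assert apply_discount(100.0, 20) == 80.0

# Middle layer: service/API test against a running backend.
def test_discount_service():
    response = requests.get("https://example.test/api/discount",
                            params={"price": 100, "percent": 20})
    assert response.status_code == 200
    assert response.json()["discounted"] == 80.0

# Tip of the pyramid: slow, brittle end-to-end UI test.
def test_discount_ui():
    driver = webdriver.Chrome()
    try:
        driver.get("https://example.test/checkout")
        driver.find_element(By.ID, "promo-code").send_keys("SAVE20")
        driver.find_element(By.ID, "apply").click()
        assert "80.00" in driver.find_element(By.ID, "total").text
    finally:
        driver.quit()
```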

The idea behind this model is to do relatively less automation at the UI level, as it tends to be unstable and high maintenance. But why are UI tests so brittle? For starters, there tend to be many more variables: browsers, operating systems, screen resolutions, machine configurations, and so on. UI tests also tend to be much slower than tests at the other layers, as we rely heavily on the computational power of third-party machines and on the further rendering of code through the product’s delivery channel, be it a browser or otherwise. And as we saw earlier, automation is generally not as capable as a human at finding bugs in the UI. But by far the biggest reason automation engineers dread UI automation is maintenance. In a fast-paced incremental development context, the UI is constantly changing, while services tend to be relatively stable. The hours QA engineers spend updating UI element locators and test scenario flows could easily lead to doubts about the ROI of automation as a whole.

But what if we could drastically reduce our maintenance woes and make UI automation as stable, or almost as stable, as automation on the service layer? What would this mean for the test automation pyramid? Being able to achieve this would certainly cause us to revisit the pyramid model (many QA circles are already doing this, but not necessarily because of A.I.). The slow speed of UI test automation is already being tackled by parallel testing, i.e. making UI tests shorter and running them in parallel. By removing the grunt work of maintenance, as well as the stability issues, we would be free to pile on the UI tests and let the self-healing power of A.I. take out the garbage. But should we?
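
What the marketing calls “self-healing” can be pictured, very naively, as locator fallback. The sketch below is a hand-rolled stand-in for what commercial A.I. tools infer automatically from DOM history and attribute similarity; the candidate locators are hypothetical and hand-listed.

```python
# A naive sketch of the "self-healing" idea: when the primary locator breaks,
# fall back to alternates instead of failing the run outright. Real A.I. tools
# infer these alternates automatically; here they are simply hand-listed.

from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

def find_with_healing(driver, candidate_locators):
    """Try each (by, value) locator in turn and report which one worked."""
    for by, value in candidate_locators:
        try:
            element = driver.find_element(by, value)
            print(f"Located element via {by}='{value}'")
            return element
        except NoSuchElementException:
            continue
    raise NoSuchElementException(f"No candidate locator matched: {candidate_locators}")

# Usage: the id changed in the last release, so the CSS and XPath fallbacks
# keep the script alive until the primary locator is updated.
# submit_button = find_with_healing(driver, [
#     (By.ID, "submit-order"),
#     (By.CSS_SELECTOR, "button[data-test='submit']"),
#     (By.XPATH, "//button[normalize-space()='Place order']"),
# ])
```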

THE NATURAL SELECTION OF AGILE CREATURES

I don’t know if there are still many companies using the traditional waterfall model, but they really should consider moving to another methodology, such as Agile. Throwing “finished” software over the fence for hungry testers to feast upon is so savage. We all know bugs found early on are undoubtedly cheaper to fix than those found near the end of the development lifecycle. Agile won the war (if there ever really was one) of software development methodologies.

Software testing has since adapted to the Agile world by integrating QA into the development process as early as possible. This was the subject of the 2009 book Agile Testing by Lisa Crispin and Janet Gregory, which has been seen by some as the modern-day bible for software test processes in an agile context.

With Agile methodologies like Scrum and Kanban came Continuous Integration (CI) and Continuous Delivery (CD), both of which would be extremely risky without some sort of test automation structure. We need automated scripts that can run automatically, and that are reliable enough to give fast feedback on the quality of our code. We can even take it one step further and say we need test automation to know when risky business is happening on our product and then run the appropriate tests to shield us from disaster. But this comes with a level of exigence that’s relatively new to our software testing world.

A.I. to the rescue? This is a formidable challenge that continues to plague software development efforts. Unlike some of the previous challenges, this may actually be one where the machine has the upper hand. Monitoring code check-ins and deployments is already done by machines. Tags in our automation code can link tests to specific features. Putting it all together is just a matter of synchronisation. A.I. could monitor trends and historical failures to figure out what kinds of tests to run on each change, as sketched below.
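
As a rough sketch of what that synchronisation could look like (with an entirely invented feature map and failure history), tags tie tests to features, and past failures flag the extra tests worth running on every change. A real system would mine this data from version control and past CI runs.

```python
# A hedged sketch of change-based test selection: tags link tests to features,
# and historical failure rates decide what else is worth running on a given
# change. All data below is invented for illustration.

FEATURE_TAGS = {
    "checkout/": "checkout",
    "auth/": "login",
    "search/": "search",
}

TESTS_BY_TAG = {
    "checkout": ["test_cart_totals", "test_payment_flow"],
    "login": ["test_login_success", "test_password_reset"],
    "search": ["test_search_results"],
}

# Fraction of past runs in which each test failed after unrelated changes.
HISTORICAL_FAILURE_RATE = {
    "test_payment_flow": 0.18,
    "test_search_results": 0.02,
    "test_login_success": 0.01,
}

def select_tests(changed_files, risk_threshold=0.10):
    selected = set()
    # 1. Directly impacted tests, via feature tags.
    for path in changed_files:
        for prefix, tag in FEATURE_TAGS.items():
            if path.startswith(prefix):
                selected.update(TESTS_BY_TAG[tag])
    # 2. Historically flaky or high-risk tests, regardless of the change.
    selected.update(t for t, rate in HISTORICAL_FAILURE_RATE.items() if rate >= risk_threshold)
    return sorted(selected)

print(select_tests(["auth/session.py", "auth/tokens.py"]))
# ['test_login_success', 'test_password_reset', 'test_payment_flow']
```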

We’ve seen that the maintenance work on test automation scripts is often further intensified in the Agile world. We write short-lived scripts that quickly become obsolete within a few sprints. Why are we writing automation code for features that are constantly changing? Isn’t that a lot of rework? This is where Cohn might refer us back to the test automation pyramid and say this is why we should focus more on Unit and Service test automation, rather than those ever so brittle UI tests. So, what if we attack these very layers with A.I.? Unit tests are fairly straightforward and focus more on what the code is doing. And what the code is doing is right at the computational machine brain level. Ripe for A.I.

TDD, or Test-Driven Development, has also seen wide adoption with Agile. But should A.I. be used to generate unit tests? If we say the UI layer is the nemesis of automation, then unit tests should be the soulmate of an automation playground. Imagine a tool capable of credibly analysing code and understanding what it’s doing; a tool capable of using machine learning to build a mental map from historical data on the evolution of the software at the code level, then using that map to dynamically generate powerful unit tests while killing off old ones. We already have tools capable of determining unit test coverage. There’s even mutation testing, which analyses code, makes small modifications, and then reruns the relevant unit tests to ensure they indeed fail on the corrupted code. Is the future a completely autonomous A.I.-driven testing bot? Would this mean developers would finally be free from the shackles of testing? Probably not. But it’s a thought.
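
To make mutation testing concrete, here is a hand-rolled miniature of the idea, with an invented function under test: corrupt the code slightly (a “mutant”), rerun the unit test, and expect it to fail. Real tools such as mutmut (Python) or PIT (Java) automate this across an entire codebase.

```python
# A hand-rolled miniature of mutation testing. If the test still passes against
# the mutant, the mutant "survives" and the test suite has a blind spot.
# The function under test is invented for illustration.

ORIGINAL_SOURCE = """
def is_adult(age):
    return age >= 18
"""

# The mutation: replace '>=' with '>' (a classic off-by-one boundary mutant).
MUTANT_SOURCE = ORIGINAL_SOURCE.replace(">=", ">")

def run_unit_test(source: str) -> bool:
    """Exec the given source and run the boundary unit test against it."""
    namespace = {}
    exec(source, namespace)
    try:
        assert namespace["is_adult"](18) is True   # the boundary case
        assert namespace["is_adult"](17) is False
        return True   # test passed
    except AssertionError:
        return False  # test failed

assert run_unit_test(ORIGINAL_SOURCE), "test should pass on the original code"
if run_unit_test(MUTANT_SOURCE):
    print("Mutant survived: the test suite missed the boundary condition")
else:
    print("Mutant killed: the test suite caught the corrupted code")
```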

The Magic Black Box

Beyond redefining what it means to test, the possible implosion of our test automation pyramid, and the reshaping of Agile testing methodologies, there is still one big elephant in the room when it comes to A.I. in the testing world. In some ways, A.I. is a return to the black box of many years ago: the record-and-playback tools that promised to take care of all our automation coding woes. Only this time, the pitch is not just that it can do it better, but that it can do more than we ever dreamed… just trust it and give up control. A.I. is the black box that offers to virtually eliminate maintenance worries and to validate some things we would have thought only human testers could do.

But sometimes it may seem like offering help to someone who doesn’t want it. Some proud programmers scoff at closed systems that propose to make things easier by hiding the code. If we really come to trust A.I. to help us with our testing, does less control eventually mean less programming? Are we relying more heavily on the programming prowess of the tool and less on our own? Are QA automation engineers, who are often pseudo-programmers, ready to give up this power?

The answer to many of these questions seems to be a yes, with much reservation. Making automation once again accessible to the non-programmer is certainly a good thing. However, some automation engineers still see themselves as distinct from manual testers. Taking away this distinction may cause an identity crisis that the industry would have to deal with. If A.I. brings automation to the manual testers’ front door, are we going to push our automation engineers further left into the code and down into the glass box, thereby establishing a neo-QA equilibrium?

CONCLUSION

The way I see it, A.I. is about getting testers back to testing and away from coding. Making automation code easier to write or maintain should not change the core guiding principles behind how we test software. Testing is an art that has, somewhat paradoxically, been polluted by programming. Framework building and maintenance take up a big chunk of our time, so how much time do we spend actually testing and coming up with test scenarios? Why did we get here? Because we know testing can be repetitive, and that’s exactly why we invented automation in the first place. So why not leave the pyramids to the robots and focus on building civilisation?