Lessons Learned: Boosting Resilience with Automation

by Nick Shah
August 01, 2024

The CrowdStrike-Microsoft story is all over the news, so I won’t restate the facts here.

People all over the world were at best inconvenienced, and at worst suffered real hardship, as computer systems were knocked out and entire processes halted. Ironically, the very security software we look to for protection was the source of this failure (by accident, of course), exposing a truly stunning level of concentration in our system dependence.

It’s hard to find positives in events like this, and yet as leaders we must see this one as the wake-up call it is: a demonstration of the need for far greater digital resilience.

No single point of failure should be able to paralyze some 8.5 million Windows machines globally. To prevent this, we must have greater diversity in the systems in use, more redundancies and fallback capacities, and better crisis preparedness.

Security, to be fair, is a balancing act in most every way, weighing constant threats and an urgent need for speed against the time required for proper testing, diligence, and documentation.  

To my mind, the flaw most exposed by the CrowdStrike incident is how such an error could slip through all checks and propagate throughout the world’s systems unseen, until it was too late.

Today I want to consider automation in software testing in this light: focusing on its capacity to improve our resilience. Current advances make this not only possible, I believe, but essential.  

Automation Innovations in Software Testing  

Testing automation has long been a fundamental part of software development.  

So it should be no surprise that AI and ML adoption has also been swift by most measures, with about three-quarters of surveyed companies at least experimenting with AI automation in testing, according to several surveys. (Examples include LambdaTest, reporting 78%, and Test Guild, reporting 76%.)

Given the natural fit, this number is certain to rise.  

But the question is less how many are using AI in some capacity than how they are using it, and whether it is helping to increase coverage and efficiency.

Generative AI solutions are not yet reliable enough to replace humans as gatekeepers for quality, and must be implemented wisely, but they can already do numerous things which can help to this end, including taking on testing-adjacent tasks like:  

  • Automatic test case generation 
  • Smarter test data that’s more realistic (a classic bottleneck) 
  • Failure prediction, using historical models 
  • Prioritizing testing areas by risk, history, and activity 
  • Optimizing the testing workflow 
  • Analyzing root causes and patterns 

This kind of AI-powered QA testing assistance is possible right now. 
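
As a rough illustration of the risk-based prioritization item in the list above, here is a minimal Python sketch. The `Area` fields, the weights, and the churn cap are all assumptions invented for this example, not taken from any particular tool. An AI-assisted system would more likely learn such a score from historical data than rely on fixed weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """One testable area of the product (hypothetical fields)."""
    name: str
    recent_failures: int  # failed runs in the recent window (history)
    runs: int             # total runs in that window
    changed_lines: int    # recent code churn (activity)
    criticality: float    # 0.0 to 1.0, assigned by the team (risk)


def risk_score(area, weights=(0.5, 0.3, 0.2), churn_cap=500):
    """Blend failure history, activity, and criticality into one score."""
    failure_rate = area.recent_failures / area.runs if area.runs else 0.0
    # Cap churn so one huge change cannot dominate the score.
    churn = min(area.changed_lines, churn_cap) / churn_cap
    w_fail, w_churn, w_crit = weights
    return w_fail * failure_rate + w_churn * churn + w_crit * area.criticality


def prioritize(areas):
    """Return areas ordered from highest to lowest risk score."""
    return sorted(areas, key=risk_score, reverse=True)


if __name__ == "__main__":
    areas = [
        Area("login", recent_failures=1, runs=100, changed_lines=10, criticality=0.9),
        Area("driver_update", recent_failures=5, runs=50, changed_lines=400, criticality=1.0),
        Area("docs", recent_failures=0, runs=100, changed_lines=5, criticality=0.1),
    ]
    for area in prioritize(areas):
        print(f"{area.name}: {risk_score(area):.3f}")
```

With scores like these, the areas that fail often, change often, and matter most get tested first, which is where scarce human attention is best spent.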

Gartner predicts that by 2025, AI innovations in testing will reduce the time required by as much as 70%, and this must mean both increased coverage in what gets tested (with more continuous testing, and earlier in-cycle), as well as a shift in the nature of the QA role overall. 

As AI testing continues to improve, it will move increasingly earlier in the process, with a capacity to run tirelessly, around the clock, and in parallel, bolstering the DevOps goal of true CI/CD.   

Looking into the future of QA testing with AI, it’s easy to see today’s testers focusing more on strategy and becoming more involved throughout development, improving both the quality and the efficiency of testing as a result.

Automated solutions should be addressing bottlenecks in human testing, and providing additional levels of verification, but they cannot replace human testers. In fact, human oversight in this area will persist for some time, as the stakes, evidenced in the CrowdStrike event, are simply too high.  

I’ve written about AIOps, which can translate your operations data into actionable insights with improved detection and response, and these advances can also provide greater AI-driven resilience in your testing, with more clarity on your entire workflow. 

What Happened to Our Digital Resilience? 

The ability to stand up to, and bounce back from, adverse events, be it fluctuations in traffic, cyber attacks, or hardware and software failures, is absolutely critical in 2024.  

We only continue to see increases in cybercrime alongside a growing dependence on our systems, making failures ever more dangerous. 

With airports, banks, hospitals, and more all paralyzed by a single flawed update, with many also unable to recover quickly, we see what digital resilience is not: tightly coupled and utterly dependent on a single software (and/or hardware) solution. 

So how did we get to this place?  

AI is a tremendous tool which I believe will transform work as we know it, but it is also still costly and resource intensive, and with a handful of companies working as the prime arbiters of AI quality, we see a potentially dangerous concentration which can increase dependence and decrease our resilience.

In addition to diversifying our solutions where possible to provide redundancies, we should also be testing our emergency responses to these events, before they happen.  

AI is used in cybersecurity testing in this way, for example, but these techniques can be employed more broadly, to help organizations prepare for randomized outages of all scales, so they are not left scrambling when such a failure occurs. 
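
To make the idea of rehearsing randomized outages concrete, here is a minimal Python sketch of a failure drill against redundant providers. Everything in it (`Provider`, `call_with_fallback`, `run_outage_drill`) is a hypothetical name invented for illustration. It is not how any particular chaos engineering tool works, and a real drill would target actual infrastructure under careful controls.

```python
import random


class ServiceDown(Exception):
    """Raised when a provider cannot serve a request."""


class Provider:
    """A stand-in for one vendor or system (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ServiceDown(self.name)
        return f"{self.name} handled {request}"


def call_with_fallback(providers, request):
    """Try each provider in order, falling back when one is down."""
    for provider in providers:
        try:
            return provider.handle(request)
        except ServiceDown:
            continue
    raise ServiceDown("all providers failed")


def run_outage_drill(providers, rounds, seed=0):
    """Knock out one random provider per round; count requests that survive."""
    rng = random.Random(seed)  # seeded so the drill is repeatable
    survived = 0
    for i in range(rounds):
        victim = rng.choice(providers)
        victim.healthy = False
        try:
            call_with_fallback(providers, f"req-{i}")
            survived += 1
        except ServiceDown:
            pass
        finally:
            victim.healthy = True  # restore before the next round
    return survived


if __name__ == "__main__":
    redundant = [Provider("primary"), Provider("backup")]
    single = [Provider("only")]
    print("redundant setup survived:", run_outage_drill(redundant, 20), "of 20")
    print("single setup survived:", run_outage_drill(single, 20), "of 20")
```

The contrast between the two setups is the point: with a fallback in place every simulated outage is absorbed, while the single-provider setup fails every time.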

[Check out our PTP Report on chaos engineering for an example of how Netflix is using AI in security testing to improve resilience.] 

Conclusion 

Automation has long been a part of testing, and the QA teams at many organizations are already experimenting with AI-powered testing innovations.   

But if there is one thing to take away from the CrowdStrike Windows calamity, it’s that increased automation must be part of our renewed emphasis on digital resilience, especially in software development. 

Mistakes happen, and that will not end with AI (in fact, AI coding, which is faster than ever, generates ever more code that must still be checked as carefully as ever).  

What this technology already provides, through AI-enhanced QA practices, is time savings, and these must increasingly be applied for greater coverage, improved sandboxing and testing conditions, and testing earlier in workflows. 

To this end, I recommend organizations: 

  • Invest in automated testing now, employing AI tools to improve testing and thereby enhance overall product quality
  • Use automation to make your testing ever more continuous, earlier in your DevOps pipeline  
  • Let AI help you improve, by showing bottlenecks, risks, and repeated problems 
  • Make your security and disaster preparedness more proactive by attacking your own vulnerabilities first 

These are not distant dreams, but attainable objectives, and they increase our digital resilience by providing a stronger foundation that helps businesses remain standing even in an ever-shifting landscape.

References 

CrowdStrike update that caused global outage likely skipped checks, experts say, Reuters 

CrowdStrike—How Microsoft Will Protect 8.5 Million Windows Machines, Forbes 

12 Data and Analytics Trends to Keep on Your Radar, Gartner 
