
Design checklists to “Do, Sync & Act”

This article is the second one on checklists, the first being "The power of checklist".

Checklists seem to defend everyone against failure, a kind of cognitive net designed to catch flaws of memory and attention.

Atul Gawande, in his book "The Checklist Manifesto", states that there are three kinds of problems in the world:

  1. SIMPLE: Individualistic in nature, solvable by the application of simple techniques, e.g. baking a cake.
  2. COMPLICATED: Can be broken into a set of simple problems, but requires multiple people, often multiple specialized teams; timing and coordination become serious concerns, e.g. launching a rocket.
  3. COMPLEX: Problems where the same solution applied to two similar problems may not produce the same outcomes, e.g. raising a child. Expertise is valuable, but most certainly not sufficient.

He goes on to say that, for simple problems, checklists can protect against the elementary errors we are all prone to. A simple activity/task checklist accomplishes this.

In the case of complex problems that require multiple specialists to coordinate and stay in sync, a simple activity/task checklist won't do. What is needed is a checklist with communication tasks, to ensure that the experts discuss the matter jointly and take appropriate action.

"Man is fallible, but maybe men are less so." This is belief in the wisdom of the group: the assurance that multiple pairs of eyes are on the problem, and letting the watchers decide what to do.

So how can checklists help in solving simple/complex problems? Use simple activity task checklists to ensure simple steps are not missed or skipped, and checklists with communication tasks to ensure that everyone talks through and resolves the hard and unexpected problems.

Building tall buildings is a complex problem, and the success rate of the construction industry's checklist process has been astonishing: the building failure rate is less than 0.00002%, where building failure is defined as the partial or full collapse of a functioning structure (from a sample size of a few million buildings!).

Now let us turn our attention to complex problems. How does one deal with complex, non-routine problems that are fundamentally difficult, potentially dangerous and unanticipated? In these situations the knowledge required exceeds that of any individual, and unpredictability reigns. To solve these, it is necessary to push the power of decision making from the central authority out to the periphery, allowing people to make decisions and take responsibility.

So the checklist needs to allow judgment to be used in the tasks rather than enforce compliance, so that actions may be taken responsibly. It needs a set of checks to ensure the stupid-but-critical stuff is not overlooked, and a set of checks to ensure coordination, enabling responsible action without having to ask for authority. There must be room for judgment, but judgment aided, and even enhanced, by procedure. Note that in COMPLEX situations checklists do not merely help; they are *required* for success.

So, how does one make checklists that work? The aviation industry thrives on checklists, in normal times and in emergencies alike. The advice from Boorman, of Boeing's "checklist factory", for making checklists that work:

  1. Good checklists are precise and easy to use even in the most difficult situations. They do not spell out everything; they provide reminders of only the critical and important steps, the ones that even highly skilled professionals could miss.
  2. Bad checklists are too long, hard to use and impractical. They treat people as dumb and try to spell out every step. They turn people's brains off rather than turning them on.

“The power of checklists is limited, they help experts remember how to manage a complex process or machine, make priorities clearer and prompt people to function well as a team. By themselves, however, checklists cannot make anyone follow them.” (Boorman, Boeing)

So what should a good checklist look like?

  1. Keep the length of the checklist between five and nine items, the limit of human working memory.
  2. Decide whether you need a DO-CONFIRM checklist or a READ-DO checklist. With a DO-CONFIRM checklist, individuals perform their jobs from memory and experience, then pause to CONFIRM that everything that was supposed to be done was done. With a READ-DO checklist, individuals carry out the tasks as they tick them off, like following a recipe. So choose the right type for the situation: DO-CONFIRM gives people greater flexibility in performing the tasks while still requiring them to stop and confirm at key points.
  3. Define clear pause points at which the checklist should be used.
  4. The look of the checklist matters: it should be free of clutter and fit on a single page.
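
To make these guidelines concrete, here is a minimal sketch in Python of a checklist object that encodes the DO-CONFIRM/READ-DO distinction, a pause point and the five-to-nine item limit. The names and example items are hypothetical illustrations, not from Gawande or Boeing.

```python
from dataclasses import dataclass, field
from enum import Enum

class ChecklistType(Enum):
    DO_CONFIRM = "do-confirm"   # work from memory, then pause to confirm
    READ_DO = "read-do"         # read each item and do it, like a recipe

@dataclass
class Checklist:
    title: str
    kind: ChecklistType
    pause_point: str            # when it is run, e.g. "before tagging a release"
    items: list[str] = field(default_factory=list)

    def design_warnings(self) -> list[str]:
        """Check the checklist against the design guidelines above."""
        warnings = []
        if not 5 <= len(self.items) <= 9:
            warnings.append("keep between five and nine items (limit of working memory)")
        if any(len(item.split()) > 12 for item in self.items):
            warnings.append("items should be short reminders, not full procedures")
        return warnings

release = Checklist(
    title="Pre-release sanity",
    kind=ChecklistType.DO_CONFIRM,
    pause_point="before tagging a release",
    items=["tests green", "changelog updated", "version bumped",
           "migration scripts reviewed", "rollback plan agreed"],
)
assert release.design_warnings() == []
```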

In summary

  • Ticking boxes is not the ultimate goal.
  • A checklist is not a formula; it enables one to be as smart as possible.
  • It improves outcomes with no increase in skill.
  • Checklists aid efficient working.

As smart individuals, we don’t like checklists. It somehow feels beneath us to use a checklist, an embarrassment. The fear is that checklists are about mindless adherence to protocol.

Hey, the checklist should get the dumb stuff out of the way so as to let you focus on the hard stuff. We are all plagued by failures – by missed subtleties, overlooked knowledge, and outright errors. Just working harder won't cut it. Accept the fallibilities. Recognise the simplicity and power of the checklist.

Try a checklist. It works.




The power of checklist

Recently I read the book “The Checklist Manifesto” by Atul Gawande. 

"An essential primer on complexity in medicine" is what the New York Times says about the book, while The Hindu calls it "an unusual exploration of the power of the to-do list".

As an individual committed to perfection, in constant search of scientific and smart ways to test and prevent defects, and as an architect of Hypothesis Based Testing, I was spellbound by this brilliantly written book, which makes the lowly checklist the kingpin for tackling complexity and establishing a standard of higher baseline performance.

The problem of extreme complexity

The field of medicine has become the art of managing extreme complexity. It is a test of whether such complexity can be humanly mastered: 13,000+ diseases, syndromes and types of injury (13,000 ways a body can fail), 6,000 drugs, and 4,000 medical and surgical procedures, each with different requirements, risks and considerations. Phew, a lot to get right.

So what has been done to handle this? Knowledge has been split into various specializations; in fact, we have super-specialization today. But it is not just the breadth and quantity of knowledge that makes medicine complicated, it is also the execution. In an ICU, the average patient requires 178 individual interactions per day!

So to save a desperately sick patient it is necessary to: (1) get the knowledge right, and (2) do the 178 daily tasks right.

Let us look at some facts: 50 million operations a year; 150,000 deaths following surgery each year (three times the number of road fatalities); at least half of these avoidable. The knowledge exists, in supremely specialized doctors, but mistakes still occur.

So what do you do when specialists fail? Well, the answer comes from an unexpected source, one that has nothing to do with medicine.

The answer is: THE CHECKLIST

On Oct 30, 1935, a massive plane that could carry five times more bombs than the army had requested roared down the runway in Dayton, Ohio, lifted off and then crashed. The reason cited was "pilot error". A newspaper reported that this was "too much airplane for one man to fly". Boeing, the maker of the plane, nearly went bankrupt.

So, how did they fix this? By creating a pilot's checklist, as flying this new plane was too complicated to be left to the memory of any one person, however expert. The result: 1.8 million miles flown without one accident!

In a complex environment, experts face two main difficulties: (1) the fallibility of human memory, especially for mundane, routine matters that are easily overlooked when you are strained by other pressing matters at hand; and (2) skipping steps even when we remember them, because we "know" that certain steps in a complex process don't always matter.

Checklists seem to protect against such failures and instill a kind of discipline of higher performance.

Peter Pronovost decided in 2001 to give a doctor's checklist a try, to tackle central-line infections in the ICU. The result after one year of usage? The checklist prevented 43 infections and 8 deaths, and saved USD 2M! In another experiment, the proportion of patients not receiving the recommended care dropped from 70% to 4%, pneumonia fell by a quarter, and 21 fewer patients died.

In a bigger implementation titled the "Keystone Initiative" (2006), involving more hospitals over an 18-month duration, the results were stunning: USD 175M and 1,500+ lives saved!

ALL BECAUSE OF A STUPID CHECKLIST!

So where am I heading? As a test practitioner, I am always amazed at how we behave like cowboys and miss simple issues, causing great consternation to customers and users. Here again it is not about lack of knowledge; more often it is carelessness. Some of the issues are so silly that they could be prevented by the developer while coding, and certainly do not demand testing by a professional. This is where a checklist turns out to be very useful.

In an engagement with a product company, I noticed that one of the products had a backlog of ~1000 issues, found both internally and by the customer. Doing a HyBIST level-wise analysis, we found that ~50% of the issues could have been caught or prevented by the developer, avoiding the vicious cycle of fix and re-test. A simple checklist used in a disciplined manner can fix this.
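
As a rough illustration of the kind of checklist meant here, the Python sketch below models a hypothetical developer pre-check-in checklist of the READ-DO kind: each item is read and ticked off before a build is handed over to test. The items are assumptions for illustration, not from the engagement described.

```python
# A hypothetical developer pre-check-in checklist (READ-DO style):
# each item is read and ticked off before the build goes to test.
PRE_CHECKIN_CHECKLIST = [
    "code compiles with no new warnings",
    "unit tests for the changed code pass",
    "boundary values exercised for new inputs",
    "error paths return meaningful messages",
    "no debug logging or dead code left behind",
]

def run_checklist(items: list[str]) -> bool:
    """Walk the list item by item; any unticked item blocks the check-in."""
    for item in items:
        answer = input(f"{item}? [y/n] ").strip().lower()
        if answer != "y":
            print(f"Blocked: '{item}' not done.")
            return False
    return True

if __name__ == "__main__":
    print("Check-in allowed." if run_checklist(PRE_CHECKIN_CHECKLIST) else "Fix and retry.")
```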

So how did checklists help in the fields of medicine and aviation? They aided memory recall of the clearly set out minimum necessary steps of the process. They established a standard for higher baseline performance.

Yes! HIGHER BASELINE PERFORMANCE. Yes, this is what a STUPID CHECKLIST CAN DO!

So how can test practitioners become smarter and deliver more with less? One way is to instill discipline and deliver baseline performance. I am sure we all use some checklist or other, but still find the results falling a little short.

So how can I make an effective checklist and see higher performance, especially in this rapid Agile software world?

That will be the focus of the second part of this article, to follow. Checklists can be used in many areas of software testing; in my next article I will focus on how to prevent the simple issues that plague developers and make the tester a sacrificial goat for customer ire, by using a simple, shall we say, "unit testing checklist".

Related article: Design checklists to “Do, Sync & Act”



HyBIST enables agility in understanding

A Fortune 100 healthcare company building applications for the next generation of body scanners uses many tools: OSes, compilers, web servers, diagnostic tools, editors, SDKs, databases, networking tools, browsers, device drivers, project management tools and development libraries. The healthcare domain means compliance with various government regulations, including the FDA's. One such regulation states that every tool used in production should be validated for 'fitness of use'. This meant as many as 30 tools. How could one possibly validate this entire range of tools before use? And considering their diversity, how could a single team do it?

STAG was the chosen partner, not because we had expertise in healthcare applications, but because HyBIST enables test teams to turn around rapidly. For this job, STAG put together a team with sound knowledge of HyBIST.

The team relied on one of the most important stages of the six-staged HyBIST, "Understand Expectations": a scientific approach to the act of understanding intentions and expectations. It works by identifying the key elements in a requirement/specification and setting up a rapid personal process, powered by scientific concepts, to quickly understand the intentions and identify missing information. We look at each requirement, partition it into functional and non-functional aspects, and probe the key attributes to be satisfied. We use a core concept, Landscaping, to understand the marketplace, end users, business flows, architecture and other information elements.
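
The "Understand Expectations" stage is described above only in outline. As a rough sketch of what it produces, each requirement could be captured as below (Python; all field names and example values are hypothetical illustrations, not the actual HyBIST templates):

```python
# A sketch of the "Understand Expectations" outputs described above.
# All field names and example values are hypothetical illustrations.
from dataclasses import dataclass, field

@dataclass
class Landscape:
    marketplace: str            # where/how the product is sold and used
    end_users: list[str]        # who operates it
    business_flows: list[str]   # key flows the product must support
    architecture: str           # high-level technical shape

@dataclass
class Requirement:
    text: str
    functional_aspects: list[str] = field(default_factory=list)
    non_functional_aspects: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)  # missing information to chase

req = Requirement(
    text="Operators can export scan reports as PDF",
    functional_aspects=["generate report", "render to PDF"],
    non_functional_aspects=["export completes in under 10 s", "audit trail kept for compliance"],
    open_questions=["maximum report size?", "which PDF profile is mandated?"],
)
```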

Once a tool is identified, the team gathers more information from the public domain. This ensures that the customer's demo (of around 45 minutes) is easily absorbed. During the demo the customer also shares the key features they intend to use; this information eventually morphs into requirements. The team then explores the application for around 2 days, coming up with a list of good questions, clarifying the missing elements and understanding the intended behavior. Thus the effort spent to understand and learn an application is as little as 16 hours.

HyBIST implementation benefits more than just testing

We have been talking on various platforms, advocating how the scientific approach of HyBIST, and its method STEM, delivers key business value propositions. This time we thought it would be prudent to share our experience of implementing it in projects, and to convey the results as well as the interesting benefits it brings while delivering clean software to our customers.

HyBIST was applied to various projects executed on diverse technologies, in a variety of domains, across different phases of the product life cycle. The people involved ranged from no experience to 5 years of experience.

We observed that HyBIST can be plugged into any stage or situation of a project for a specific need, and one can quickly see results and obtain the benefits desired at that stage.

Our experience and analysis showed varied benefits:

  • rapid reduction in ramp-up time
  • creation of assets for learning
  • a consistent way of delivering quality results even with less experienced team members
  • increased test coverage
  • scientific estimation of test effort
  • optimization of regression test effort/time/cost
  • selection of the right, minimal test cases for automation, and hence reduction in automation development effort/time/cost

The following are the key metrics and the results/benefits achieved in some of the projects where HyBIST was implemented.

Project 1:

Domain – SaaS / Bidding software

Technology – Java, Apache (web server), JBoss (app server), Oracle 9i with cluster server on Linux

Lines of Code – 308,786

Project Duration – 4 months

Due to time constraints, D1, D2 and D4 were done almost in parallel for this application, which was developed from scratch.

  • D1 – Business Value Understanding (Total effort of 180 person hours)
    • 3 persons with 3+ yrs experience were involved (none had prior experience in this particular domain).
    • 4 main modules with 25 features listed.
    • Landscaping, Viewpoints, Use cases, Interaction matrix (IM) were done.
    • D1 evolved and developed by asking a lot of questions of the design/dev team.
  • D2 – Defect Hypothesis (Total effort of 48 person hours)
    • 3 persons with 3+yrs experience were involved.
    • 255 potential defects were listed.
  • D4 – Test Design (Total effort of 1080 person hours)
    • 3 persons with 3+yrs experience were involved.
    • Applied decision tables (DT) for designing test scenarios (see the decision-table sketch after this list).
    • In total, 10,750 test cases were designed and documented.
    • Of these, 7,468 (69%) were positive test cases and 3,282 (31%) were negative test cases.
    • Requirement Traceability Matrix (RTM) and Fault Traceability Matrix (FTM) were prepared.
  • D8 – Test Execution (Total effort of 3240 person hours)
    • 9 persons were involved in test execution and bug reporting/bug fixes verification (3 persons with 3+ yrs experience and 6 persons with 2+ yrs experience).
    • 12 builds were tested in 3 iterations and 4 cycles.
    • In total, 2,500 bugs were logged, of which 500 were of high severity.
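
The decision tables used in D4 are not reproduced here. As a purely hypothetical illustration of the technique in a bidding domain, the Python sketch below shows how enumerating every combination of a few conditions yields test scenarios systematically; the conditions and actions are assumptions for illustration, not from the project.

```python
from itertools import product

# Hypothetical conditions for a bid-submission decision table.
conditions = {
    "bidder_registered": [True, False],
    "auction_open": [True, False],
    "bid_above_minimum": [True, False],
}

def expected_action(bidder_registered, auction_open, bid_above_minimum):
    # Rules are evaluated in priority order, as a decision table would list them.
    if not bidder_registered:
        return "reject: unregistered bidder"
    if not auction_open:
        return "reject: auction closed"
    if not bid_above_minimum:
        return "reject: bid below minimum"
    return "accept bid"

# Each combination (one row of the decision table) becomes one test scenario.
for row in product(*conditions.values()):
    scenario = dict(zip(conditions.keys(), row))
    print(scenario, "->", expected_action(**scenario))
```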

Key benefits:

  • No bugs were found in UAT.
  • All change requests raised by the QA team were accepted by the customer and dev team.
  • The interaction matrix was very useful for selecting test cases for regression testing, and also for selecting the right, minimal test cases for automating sanity testing.
  • Regression testing windows were short, around 2 to 3 days; the interaction matrix was quite useful for optimal and effective regression testing within them.
  • The method, structure and templates (IM, DT, RTM, FTM, test case, reporting) used and developed in this project are being used as a reference model for other projects at this customer.

Project 2:

A web service with 5 features that had frequent enhancements and bug fixes (Maintenance)

Technology – Java, Apache Web Server

Project Duration – 4 weeks

  • D1 – Business Value Understanding (Effort of 6 hours)

Mind-mapped the product, including the impact of other services and their usage on this service

  • D2 – Defect Hypothesis (Effort of 5 hours)

Listed 118 potential defects

Key Benefits:

  • Preparation of the D1 document brought the ramp-up time for new members (developers/testers) to understand the product down from 16 hours to 4 hours.
  • Any member added to the team was productive from day one and could start testing in any regression cycle for enhancements and bug fixes.
  • Listing potential defects enabled adding test cases that were missing from the existing test case set.

Project 3:

Domain – E-learning

Technology – ASP.Net, IIS, SQL Server, Windows

Validation of a new feature added to the product

Duration – 2 weeks

  • D1 – Business Value Understanding (Effort of 5 hours)

Understood the feature by asking questions and interacting with the development team over emails/conf calls

  • D2 – Defect Hypothesis (Effort of 2 hours)

Listed 130 potential defects by thinking from various perspectives

  • D4 – Test Design (Effort of 16 hours)

Designed and documented 129 test cases

  • D8 – Test Execution (Effort for test execution – 626 person hours; effort for bug reporting/bug-fix verification – 144 person hours)

Executed test cases by performing 2 cycles of testing and 2 regression cycles

8 new test cases were added while executing the test cases

31 bugs were found in test execution, of which 23 were of high severity. 29 of the bugs could be linked to potential defects visualized and listed beforehand. 2 of the bugs found were not linked to any documented test case.

Key Benefits:

  • Arrived at a consistent way of understanding a feature and designing test cases for new features, irrespective of the experience of the team member involved

Project 4:

Domain – Video Streaming

Technology – C++, PHP, Apache, MySQL, Linux

An evolving new product in very initial cycles of development/testing

Duration – 4 weeks

People Involved – 2 fresh test engineers (no previous work experience, but trained in HyBIST/STEM)

  • D1 – Business Value Understanding (Effort of 32 hours)

Built understanding of the product by questioning: listing features/sub-features, Landscaping, critical quality attributes, and usage environment/use cases

  • D2 – Defect Hypothesis (Effort of 40 hours)

Listed over 150 potential defects

Key Benefits:

  • The 2 fresh engineers could understand the product features, the business flow and its usage in a scientific manner, and document it. They could also visualize possible defects, enabling them to come up with the test cases needed to identify and eliminate those defects.
  • The process of doing D1 and D2 generated a lot of useful questions from the fresh engineers, which gave the senior engineers better thinking, understanding and different perspectives on the product's behavior. This helped them design more interesting test cases to capture defects during test execution.
  • The assets created in D1 and D2 are helping the other members of the team quickly ramp up on the product features and gain a detailed understanding in 50% less time.

Project 5:

Domain – Telecom protocol

3GPP TS 25.322 V9.1.0 Standards

Estimate the effort for the complete test design of the RLC protocol by going through the existing very high level test specifications, and design test cases for 2 sample functions

Duration – 3 weeks

None of the persons involved had any previous experience in testing protocol stacks.

  • D1 – Business Value Understanding (Effort of 40 person hours)

Went through the generic RLC standard and specifically understood 2 functions: sequence-number check, and single-side re-establishment in AM mode

Prepared flow charts with data/message flows between different layers

Prepared a box model illustrating the various inputs, actions, outputs and external parameters

  • D2 – Defect Hypothesis (Effort of 16 person hours)

Listed 28 generic potential defects and the defect types

  • D4 – Test Design (Effort of 48 person hours)

Prepared 2 input tables and 2 decision tables

Designed 26 test scenarios (6 positive, 20 negative) and 44 test cases (6 positive, 38 negative)

  • Performed gap analysis of missing test cases in the customer’s test specification document for the 2 functions (Effort of 12 person hours)
  • Estimated time and effort for the complete RLC test case design, based on the above data (Effort of 4 person hours)

Key benefits:

  • Performed gap analysis in the existing high level test specs
  • 20 times more test cases were designed for the 2 functions than in the existing specs
  • 86% of the test cases added were of the negative type
  • Test cases developed were detailed and covered various combinations of inputs, parameters and intended/unintended behaviors
  • Test cases developed were suitable for easy conversion to test scripts using any tool
  • Performed estimation for RLC test design covering 22 functions
  • 256 test scenarios and 1,056 test cases to be designed, with an effort of 446 person hours, for the complete RLC test design

Project 6:

Domain – Retail

Validate railway booking software on a point-of-sale device

Technology – Java

Duration – 4 weeks

  • D1 – Business value understanding (Effort of 16 person hours)

Documented software overview, features list, use cases list, features interaction matrix, value prioritization and cleanliness criteria

  • D2 – Defect Hypothesis (Effort of 18 person hours)

Listed 20 potential defects by applying negative thinking, and 54 potential defects by applying the Error-Fault-Failure model. The potential defects were categorized into 46 defect types and mapped to the features listed.

  • D3 – Test Strategy (Effort of 6 person hours)

Based on the listed potential defect types, arrived at the test types, levels of quality, and test design techniques needed as part of test strategy/planning: quality level 1 (input validation and GUI validation), quality level 2 (feature correctness), quality level 3 (stated quality attributes) and quality level 4 (use case correctness).

  • D4 – Test Design (Effort of 24 person hours)

Designed and documented 30 test scenarios (15 positive, 15 negative) and 268 test cases (197 positive, 71 negative) for quality level 1; 70 test scenarios (21 positive, 49 negative) and 123 test cases (55 positive, 68 negative) for quality level 2; and 8 test scenarios for quality level 4. Created box models and decision tables to arrive at the test scenarios.

Prepared requirement traceability and fault traceability matrices

  • D8 – Test Execution (Effort of 32 person hours)

Out of 293 test cases designed, 271 were executed and 22 could not be executed.

52 defects (27 high, 12 medium, 13 low) and 8 suggestions were logged.

Quality level 1 – 23 defects (2 high, 11 medium, 10 low) and 2 suggestions

Quality level 2 – 27 defects (25 high, 2 low) and 6 suggestions

Quality level 4 – 2 defects (2 medium)

Key Benefits:

  • Complete validation of the product was performed successfully by one senior engineer guiding 2 fresh test engineers with no previous work experience, none of them having experience in this particular domain
  • All the suggestions logged were accepted and valued

Project 7:

Domain – Mobile gaming

Technology – Java, Symbian

Duration – 3 weeks

  • D1 – Business Value Understanding (Effort of 16 person hours)

Achieved product understanding by documenting software overview, technology, environment of usage, features list, use cases list, mapping of use cases to features, features interaction matrix, value prioritization and cleanliness criteria.

  • D2 – Defect Hypothesis (Effort of 16 person hours)

Listed 96 potential defects by categorizing issues related to installation, download, invoking the application, connectivity, input validation, search, subscription, authorization, configuration, control, dependency, pause/resumption, performance and memory.

Mapped the features to the potential defects

  • D3 – Test Strategy (Effort of 4 hours)

Based on the listed potential defect types, arrived at 4 levels of quality (game initialization & invoking correctness, game subscription correctness, game download correctness, dependency correctness). The different test types were mapped to the 4 quality levels.

  • D4 – Test Design (Effort of 20 person hours)

Box models and decision tables were created

Designed and documented 37 test scenarios and 66 test cases

Key Benefits:

  • Complete validation of the product was performed successfully by one senior engineer guiding 2 fresh test engineers with no previous work experience, none of them having experience in this particular domain
  • The assets created here became a useful reference for understanding and validating other mobile gaming software projects

Quality injection – Scientific validation of requirements

Validating early-stage, pre-code artifacts such as the requirement document is challenging. This is typically done by rigorous inspection and requires deep domain knowledge. One of our Japanese customers threw us a challenge: "How can you use HyBIST/STEM to scientifically validate requirements without knowing the domain deeply?"

The core aspect of HyBIST is to hypothesize potential defect types and then prove that they do not exist. These are identified by keeping in mind the end users and the technology used to construct the system. So how do you apply this to validate a pre-code artifact?

We commenced by identifying the various stakeholders of the requirement document, and then identified the key cleanliness attributes: attributes which, if met, would imply that the requirement was indeed clean. We were excited by this. We then identified the potential defect types that would impede these cleanliness attributes/criteria.

Lo and behold, the problem was cracked. We then identified the various defect types and the corresponding evaluation scenarios for validating the requirements/architecture document. We came up with THIRTY+ defect types that required 10+ types of tests, conducted over TEN quality levels, with a total of SIXTY-FIVE major requirement evaluation scenarios to validate a requirement.

What we came up with is not yet another inspection process dependent on domain knowledge, but a simple, scientific approach consisting of a set of requirement evaluation scenarios that can be applied with low domain skill, so that the requirement/architecture can indeed be validated rapidly and effectively. These scenarios ensure that the requirement document is useful to the various stakeholders over the software life cycle and does indeed satisfy the intended application/product attributes.
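
The thirty-plus defect types and sixty-five evaluation scenarios themselves are not listed here. As a minimal sketch of the shape of such an approach, the Python below pairs each cleanliness attribute with the defect type that threatens it and the question an evaluator asks; the attributes and checks shown are hypothetical illustrations, not the actual HyBIST catalogue.

```python
from dataclasses import dataclass

@dataclass
class EvaluationScenario:
    cleanliness_attribute: str   # what "clean" means for this check
    defect_type: str             # the defect type that would impede it
    check: str                   # the question an evaluator asks

# Hypothetical examples; the real catalogue would hold 65 such scenarios.
SCENARIOS = [
    EvaluationScenario("unambiguous", "vague quantifier",
                       "Are words like 'fast' or 'user-friendly' backed by numbers?"),
    EvaluationScenario("testable", "unverifiable statement",
                       "Can a pass/fail test be written directly from this statement?"),
    EvaluationScenario("complete", "missing error behaviour",
                       "Is the behaviour on invalid input specified?"),
]

def checks_for(requirement_text: str) -> list[str]:
    """List the questions to walk through for one requirement; the scenarios
    carry the expertise, so deep domain knowledge is not required."""
    return [f"{requirement_text!r}: [{s.cleanliness_attribute}] {s.check}"
            for s in SCENARIOS]

for line in checks_for("The system shall respond quickly."):
    print(line)
```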