Analyzing Tabletop Exercise Effectiveness

To assess how the effectiveness of no-fault exercises is currently evaluated and to offer alternatives, several individuals with extensive experience participating in and planning tabletop exercises were interviewed.  This cadre includes former senior representatives of the Department of State, a Senior Policy Advisor to the White House and Department of Energy, a Deputy Under Secretary of the National Nuclear Security Administration, an Exercise Director with the Federal Bureau of Investigation, a former Navy SEAL, an Assistant Administrator for Protective Services and Security with the National Aeronautics and Space Administration, a hospital Emergency Manager, and a Senior Exercise Planner.  These interviews yielded a broad and detailed view of how tabletop exercises are currently evaluated and how that evaluation could improve.  In addition, over 100 articles, governmental committee and subcommittee notes and reports, journals, after-action reports, and theses were reviewed to understand how tabletop exercises are evaluated and whether they are effective.

Are qualitative assessments good enough?

Although qualitative assessments and responses may not produce a numeric rating, post-exercise evaluations and critiques from exercise participants can provide solid qualitative evidence that an exercise was effective.  Although such evidence is subjective, the individuals interviewed gave examples of how one can determine that an exercise was effective solely through qualitative response.  Common qualitative examples discussed in the interviews, as well as actual results from tabletop experience, include:

  • Team-building and familiarity among response assets and leadership – The tabletop exercise provides an opportunity for first responders and follow-on responders to meet one another, sometimes for the first time, and to build trust with one another. “Almost impossible to measure, but a tabletop exercise is invaluable because it is the relationships built between responders in an emergency.  [A tabletop exercise] builds a trust between response.”
  • Knowledge gained of roles, responsibilities, and assets among responding parties – Deputy Under Secretary for the National Nuclear Security Administration Dr. Steve Aoki stated that taking part in tabletop exercises helped him provide response guidance for Fukushima.  The exercises he participated in gave him knowledge of the various response assets available and of which assets could be called upon during a crisis/disaster.  Additionally, the tabletop provided him with a venue to work through various scenarios.  Furthermore, the Assistant Administrator at NASA, Mr. Mahaley, stated that he took part in a tabletop exercise that involved him contacting the White House during a disaster.  This experience prepared him for an actual call that was needed while he was acting Director of Security for Energy during the blackout of 2003 that impacted much of the Northeast United States.  He stated that the exercise gave him the opportunity to work through how a phone call to the White House would take place and to understand who needed to be included in the call.
  • Post-exercise lessons learned – things learned that were not known prior to the exercise.  “For the after action review to be effective, the opportunity to incorporate recommended changes to site response plans and procedures should be a goal.”  A gap in a current security plan or procedure, or a lack of understanding of a particular substance/organism, is often identified through the course of a tabletop exercise.  This often results in a change to a plan/procedure or in a group of people being more comfortable with response.
  • Knowledge of a particular threat – i.e. group, source/material, attack.  As stated in the conclusion of Maryland’s pandemic influenza preparedness exercise: It [the tabletop exercise] served to engage the emergency response community and address the issues of incident command and how pandemic planning fits with the “all hazards” approach.  The exercise also educated key partners and stakeholders, through an experiential approach, about the potential severe consequences of pandemic influenza, and it provided a forum to “drill down” beyond the current state plan and identify additional critical local planning activities that are needed.  Instructive insights and lessons were gained from the exercise that should bolster further planning efforts in Maryland, not only for pandemic influenza, but also for bioterrorism and other public health disasters.
  • Exercising plans in place – a tabletop exercise provides a venue for responders to practice the plans and procedures in place and to ensure they fully understand those plans, procedures, and/or the response to a disaster. Mr. Mahaley stated, “You do what you are trained to do.  In real life, you are going to react how you are trained.  In my 40 years of experience, tabletop exercises provide the most effective form of training.”

This qualitative information is vital and shows that a tabletop exercise is effective, but it is not always easy to gather.  In almost every interview conducted, a common theme was that the best way to obtain candid qualitative response is for the exercise to remain no-fault, or non-attributional, allowing an open, honest environment.  The reasons stated included that assets are more likely to admit faults, vulnerabilities, a lack of understanding, or a shortfall in a plan/procedure if they are not worried about their jobs.  This makes sense considering that participants may feel more comfortable speaking if they are not being graded.  So, if a no-fault tabletop exercise yields the best qualitative responses, can it also provide quantitative results to determine effectiveness?

Quantitative assessments

Raw, underdeveloped quantitative assessments currently exist in the tabletop community, but they are not well known and there is no standard.  It would be useful to have standardized quantitative assessments to help public and private organizations determine whether the money they spend is going to good use and whether the tabletop exercise is worth attending.  Given the lack of available quantitative metrics, it is prudent to look at ways to quantify the results of a tabletop exercise to complement the qualitative data.  Furthermore, because the interviews suggested that exercises may be more effective when non-attributional, this paper considers no-fault tabletop exercises as the type being evaluated.

Suggested Metrics

Three forms of quantitative assessments should be considered to assist government agencies and private organizations with determining effectiveness for a no-fault tabletop exercise.  These include a pre/post test combination to help identify the percentage of improvement, a numeric count of observations during and post exercise, and a rubric as an assessment tool.

Conducting a pre- and post-test among players and observers (observers typically consist of other invited responders not sitting at the players’ table) is a way to gauge improvement in understanding, knowledge, and collaboration.  Participants would take a test to indicate their understanding of response to a disaster, their level of knowledge of the particular threat, and how well they know who would be responding to or in charge of a particular incident.  Then, following the tabletop exercise, the participants would take the same test, and the two sets of results would be compared.  The difference would show a level of improvement, providing a quantitative gauge of exercise effectiveness.
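As a minimal sketch of the comparison described above, the percentage of improvement could be computed from the mean pre-test and post-test scores.  The function name, the 10-point test scale, and the sample scores are all hypothetical; the paper does not prescribe a specific formula, and percent change in the group mean is only one reasonable choice.

```python
from statistics import mean

def percent_improvement(pre_scores, post_scores):
    """Return (mean pre score, mean post score, percent change in the mean).

    Assumes each participant took the same test before and after the
    exercise, so both lists have one entry per participant.
    """
    if len(pre_scores) != len(post_scores):
        raise ValueError("each participant needs both a pre and a post score")
    pre_avg = mean(pre_scores)
    post_avg = mean(post_scores)
    if pre_avg == 0:
        raise ValueError("percent change is undefined when the pre-test mean is 0")
    change = (post_avg - pre_avg) / pre_avg * 100
    return pre_avg, post_avg, change

# Hypothetical scores (out of 10) for five participants.
pre = [4, 5, 6, 5, 5]
post = [7, 8, 8, 7, 7.5]
pre_avg, post_avg, change = percent_improvement(pre, post)
print(f"pre={pre_avg:.1f} post={post_avg:.1f} improvement={change:.0f}%")
```

Because the tests can remain anonymous, only the group means are compared here, which is consistent with a no-fault format.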

Another potential way to gather quantitative data from a tabletop exercise is to count observations, either during or after the exercise.  During a no-fault exercise, an unbiased observer could be included, not to grade or place fault, but to count the number of times a participant learned something, a vulnerability was identified, a gap in a plan or procedure was identified, or an agency stated it was unaware of a particular response or response asset.  The observations could then be sorted and tallied.  A similar count could be made post-exercise of the number of changes made to plans, policies, or procedures, as well as anything else that may have resulted from the exercise experience.
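The sorting and tallying step above could look something like the following sketch.  The category labels are hypothetical shorthand for the four kinds of observation named in the paragraph, and the observer's log is invented for illustration.  No participant names are recorded, in keeping with the no-fault format.

```python
from collections import Counter

# Hypothetical labels for the observation types named in the text.
CATEGORIES = ("lesson_learned", "vulnerability", "plan_gap", "unaware_of_asset")

def tally_observations(observations):
    """Count observations per category, reporting 0 for unused categories."""
    counts = Counter(observations)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unrecognized categories: {sorted(unknown)}")
    return {category: counts[category] for category in CATEGORIES}

# Hypothetical observer log: one entry per observation, no attribution.
log = [
    "plan_gap", "lesson_learned", "lesson_learned",
    "unaware_of_asset", "plan_gap", "lesson_learned",
]
totals = tally_observations(log)
for category, count in totals.items():
    print(f"{category}: {count}")
print("total observations:", sum(totals.values()))
```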

The final suggestion is a rubric, an original contribution from the research for this paper.  The rubric was designed with the great number of tasks and objectives that may be included in a tabletop exercise in mind.  Table 3 is shown in full below.

Tabletop Exercise Effectiveness

Table 3

Scoring Effectiveness

To use the rubric, exercise evaluators would read the tasks listed on the left and use the descriptions listed beside them to determine a score for each task or question.  Each task would receive a 1 (lowest quality), 2 (average quality), or 3 (highest quality) based on how well the tabletop exercise fulfilled the task.  Taken together, the tasks/questions listed in the rubric should fulfill the purpose and goals of the exercise.  Hence, a score toward the higher end of the scoring range should indicate exercise effectiveness.  In the example in Appendix 1, the minimum score would be 20 and the maximum 60.  Considering the range, a score of at least 30 may indicate an effective exercise (the midpoint of the range is 40), but it would be up to the designer of the rubric to identify the threshold based on the number of tasks/questions listed and their respective values.  One further refinement to the rubric might be assigning a greater weighted value range to tasks/questions of greater importance.
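The scoring arithmetic described above can be sketched as follows.  This is an illustration under stated assumptions, not the rubric itself: it assumes 20 tasks (inferred from the stated minimum of 20 and maximum of 60 on a 1 to 3 scale), the sample scores are invented, and the optional weights are one hypothetical way to implement the weighting suggestion.

```python
def rubric_score(scores, weights=None):
    """Return (total, minimum possible, maximum possible) for one rubric.

    scores:  one value per task, each 1 (lowest), 2 (average), or 3 (highest).
    weights: optional multiplier per task; defaults to 1 for every task.
    """
    if weights is None:
        weights = [1] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per task")
    if any(score not in (1, 2, 3) for score in scores):
        raise ValueError("each task score must be 1, 2, or 3")
    total = sum(score * weight for score, weight in zip(scores, weights))
    minimum = sum(weights)       # every task scored 1
    maximum = 3 * sum(weights)   # every task scored 3
    return total, minimum, maximum

# Hypothetical completed rubric with 20 equally weighted tasks.
scores = [2] * 10 + [3] * 6 + [1] * 4
total, low, high = rubric_score(scores)
print(f"score {total} on a range of {low} to {high}")

# Same scores, with the first task counted double (hypothetical weighting).
weighted = rubric_score(scores, weights=[2] + [1] * 19)
print("weighted (total, min, max):", weighted)
```

Note that weighting changes the minimum and maximum as well as the total, so any effectiveness threshold would need to be restated against the new range.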

Who should complete the rubric?

For this assessment, it would make sense to have two groups fill out the rubric following the exercise.  The first group is the players, who should be considered the primary respondents to the rubric.  They are the main focus, and it is their response to the exercise scenario that will be gauged.  The rubric could remain anonymous since the exercise is no-fault and individual identity will not be a factor in gauging the results.  Additionally, keeping the tabletop exercise no-fault may foster more honest responses from the players.  The second group that would complete the rubric consists of site agents who have expert knowledge of and experience with tabletop exercises.  They may be individuals who assist in the setup and reality design of the exercise, but who are not part of the exercise planning/facilitation team and have no stake in how well the exercise performs.  Ideally this would be someone who can observe the exercise but is not a player or providing response during the exercise.  Lastly, a combination of the two may be the best model.  By obtaining a score from both the players and the site, one could compare the averages between the two sets to see any deviation in perceived effectiveness.
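As a rough sketch of the comparison between the two groups, the average rubric total for each group and the gap between them could be computed as below.  The function name and the rubric totals are hypothetical, and a simple difference of means is an assumption; the paper does not specify how the deviation should be measured.

```python
from statistics import mean

def compare_groups(player_totals, site_totals):
    """Return (player average, site average, player minus site)."""
    player_avg = mean(player_totals)
    site_avg = mean(site_totals)
    return player_avg, site_avg, player_avg - site_avg

# Hypothetical anonymous rubric totals on the 20 to 60 range.
players = [44, 48, 41, 47]
site_agents = [40, 42]
player_avg, site_avg, gap = compare_groups(players, site_agents)
print(f"players={player_avg:.1f} site={site_avg:.1f} gap={gap:+.1f}")
```

A positive gap would suggest the players rated the exercise as more effective than the site agents did, and a negative gap the reverse.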

Considering the potential alternatives to the current quantitative metrics available, the rubric may provide the most value to gain a quantitative insight from a no-fault exercise.
