By Connor Walsh, CISSP
It is Monday morning, and the first biomedical equipment support specialist (BESS) arrives on site and starts checking email. Almost immediately, the phone rings: local office of information and technology (OIT) staff report that the server room had an unexpected power outage over the weekend, and various clinics say their systems are down. The BESS heads down to the server room and finds that the facility’s lab information system (LIS) and outpatient pharmacy automation servers are off, and every attempt to power them back on fails. The BESS starts to panic: Which system should I bring up first? How much time do I have? How much data will be lost if I can’t get the system back up?
A structured disaster recovery (DR) plan helps any HTM department prepare for the situation above and answer those questions before they become urgent. Many free templates are available online to help start the conversation. The most important first step in developing your DR plan is evaluating the systems at your site; a good start is ranking them by their impact on patient care and hospital operations. In the example above, losing either system would have a large negative impact, but comparing the two, the LIS has the greater impact on facility-wide operations, so the BESS should focus initial efforts on it. When assessing each system, also consider how likely it is to go down due to natural, technological and/or human-caused disasters, and record any other potential risks.
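The ranking described above can be sketched in a few lines of Python. This is only an illustration: the 1-to-5 scales, the scoring formula and the example scores are assumptions made for the sketch, not values from any standard or from a real facility, and your own assessment may weigh these factors differently.

```python
from dataclasses import dataclass


@dataclass
class MedicalSystem:
    name: str
    patient_care_impact: int  # assumed scale: 1 (low) to 5 (severe)
    operations_impact: int    # assumed scale: 1 (low) to 5 (severe)
    outage_likelihood: int    # assumed scale: 1 (rare) to 5 (frequent)

    @property
    def priority_score(self) -> int:
        # Hypothetical formula: combined impact weighted by likelihood.
        return (self.patient_care_impact + self.operations_impact) * self.outage_likelihood


# Example scores are invented for illustration only.
systems = [
    MedicalSystem("Outpatient pharmacy automation", 4, 3, 2),
    MedicalSystem("Lab information system (LIS)", 5, 5, 2),
]

# Highest score first: this is the order in which recovery effort is spent.
ranked = sorted(systems, key=lambda s: s.priority_score, reverse=True)

for system in ranked:
    print(f"{system.priority_score:>3}  {system.name}")
```

With these made-up scores the LIS sorts ahead of the pharmacy system, which matches the prioritization in the scenario. The value of writing the scores down is less the arithmetic than having the ranking agreed on before an outage.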
When developing a DR plan, there are two important terms to consider for each system you have identified. Recovery time objective (RTO) is how long a system can be down before the downtime becomes unacceptable (e.g., patient appointments need to be cancelled). For a large-facility LIS like the one above, we will assume no downtime is acceptable; during procurement we would budget for a “hot spare” so that if the production server fails or becomes corrupted, operations fail over to the spare with little to no downtime. Recovery point objective (RPO) is how much data can be lost before the loss becomes unacceptable (e.g., because of regulatory requirements for record retention), and it is another critical value to determine when planning backups. RPO often varies from system to system, and the backup approach that satisfies it could range from real-time mirroring to weekly backups.
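To make RTO and RPO concrete, the following sketch checks a recovery approach against both targets. It assumes that the worst-case data loss equals the interval between backups and that the restore time can be estimated in advance. All names and numbers here are hypothetical, including the pharmacy targets and the zero RPO for the LIS.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryTargets:
    rto: timedelta  # maximum tolerable downtime
    rpo: timedelta  # maximum tolerable window of lost data


@dataclass
class RecoveryApproach:
    name: str
    backup_interval: timedelta         # assumed worst-case data loss
    estimated_restore_time: timedelta  # assumed time to return to service


def meets_targets(approach: RecoveryApproach, targets: RecoveryTargets) -> dict:
    """Report whether an approach satisfies each objective."""
    return {
        "rto_met": approach.estimated_restore_time <= targets.rto,
        "rpo_met": approach.backup_interval <= targets.rpo,
    }


# Hypothetical targets; real values come from your own assessment.
lis_targets = RecoveryTargets(rto=timedelta(0), rpo=timedelta(0))
pharmacy_targets = RecoveryTargets(rto=timedelta(hours=4), rpo=timedelta(hours=24))

hot_spare = RecoveryApproach("Hot spare with real-time mirroring",
                             timedelta(0), timedelta(0))
nightly = RecoveryApproach("Nightly backup, manual restore",
                           timedelta(hours=24), timedelta(hours=6))

lis_result = meets_targets(hot_spare, lis_targets)
pharmacy_result = meets_targets(nightly, pharmacy_targets)
print(lis_result)
print(pharmacy_result)
```

In this example the nightly backup satisfies the pharmacy system's RPO but not its RTO, which is the kind of gap a DR plan should surface before an outage rather than during one.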
There are also various “defense in depth” controls that you can apply to your critical assets as part of your DR plan. These include preventive controls (such as redundant power), detective controls (such as network monitoring) and corrective controls (such as a robust backup system). Once each medical system at your facility has been evaluated, you can select and tailor controls for it, and each one helps keep downtime in an emergency to a minimal or otherwise acceptable level.
DR plans are often developed for each critical system and include step-by-step instructions for restoring it if disaster strikes. Practicing a simulated disaster is a great tabletop exercise for healthcare technology management departments, so that everyone is prepared when the real thing happens. Consider setting up an annual or semi-annual department mock exercise in which staff walk through restoring a critical medical system. Ultimately, a well-defined DR plan will allow anyone in your department to start and facilitate the recovery process, and it will give you peace of mind knowing that your critical systems are protected.
Connor Walsh, CISSP, is a supervisory clinical engineer for the VA Boston Healthcare System.
The views expressed here are those of the author and do not necessarily represent or reflect the views of TechNation or MD Publishing.
