This month, we’ll explore the first two aspects of the AWWA’s guidelines for best practices: Governance & Risk Management and Business Continuity & Disaster Recovery.
The Big Picture for Security
In cybersecurity for utilities, Governance and Risk Management is the macro view. It identifies and defines the aspects of security that could affect the agency’s operations. According to the AWWA: “This category is concerned with the management and executive control of the security systems of the organization; it is associated with defining organizational boundaries and establishing a framework of security policies, procedures, and systems to manage the confidentiality, integrity, and availability (CIA) of the organization.”
Maintaining an accurate inventory of Process Control Systems (PCS) is the primary focus of governance and risk management for water utilities. Such an inventory includes applications, data, servers, workstations, field devices, and communications/network equipment. Limiting the number of system components is a wise practice, since it makes management easier.
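For readers who track assets in software, the inventory described above could be sketched roughly as follows. This is only an illustration, not an AWWA-specified format: the asset categories come from the list above, while the field names, the ID scheme, and the sample entries are assumptions made up for the example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class AssetType(Enum):
    # Categories taken from the inventory description in the article.
    APPLICATION = "application"
    DATA = "data"
    SERVER = "server"
    WORKSTATION = "workstation"
    FIELD_DEVICE = "field device"
    NETWORK = "communications/network equipment"


@dataclass(frozen=True)
class Asset:
    # All field names below are hypothetical; adapt them to local practice.
    asset_id: str
    asset_type: AssetType
    location: str
    owner: str


def count_by_type(inventory):
    """Return how many assets of each type the inventory holds."""
    return Counter(asset.asset_type for asset in inventory)


def find_duplicate_ids(inventory):
    """Return asset IDs listed more than once, a sign the inventory needs cleanup."""
    counts = Counter(asset.asset_id for asset in inventory)
    return sorted(asset_id for asset_id, n in counts.items() if n > 1)


# Made-up sample entries for illustration only.
inventory = [
    Asset("SRV-01", AssetType.SERVER, "Main plant", "IT"),
    Asset("PLC-07", AssetType.FIELD_DEVICE, "Pump station 3", "Operations"),
    Asset("PLC-07", AssetType.FIELD_DEVICE, "Pump station 4", "Operations"),
]
```

A count per category shows how many components have to be managed, and the duplicate check flags entries that would make the inventory unreliable.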
However, despite its position at the top of the list, Governance is not the most urgent priority. The AWWA recommends first addressing immediate areas of significant risk and taking corrective action right away. Once the obvious gaping holes in cybersecurity have been plugged, it’s time to put Governance and Risk Management in place, making an ongoing commitment to security part of the organizational culture.
Getting Back to Business
Business Continuity focuses on ensuring availability even when there are faults in the system, avoiding the domino effect in which one simple error or accident shuts down the entire process. Planning for this aspect of security starts with identifying what might go wrong and how it could affect the agency’s ability to deliver services. It is important to have physical and virtual systems that are designed to fail over to a backup instead of crashing. On the cybersecurity side, virtualization that redistributes computing loads to nearby nodes is an example of smart system design. In addition, the ability to manually override systems during an outage or cyberattack is essential.
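The fail-over and manual-override ideas above can be illustrated with a minimal sketch. It assumes a hypothetical setup in which each data source is a plain function that either returns a reading or raises an error; the function names and the sample value are invented for the example and do not come from any real control system.

```python
def read_with_failover(sources, manual_override=None):
    """Return (source_name, value) from the first source that responds.

    sources: ordered list of (name, read_function) pairs, primary first.
    manual_override: an operator-supplied value that, if given, takes
    precedence over every automated source.
    """
    if manual_override is not None:
        return ("manual", manual_override)
    for name, read in sources:
        try:
            return (name, read())
        except (ConnectionError, TimeoutError):
            # This source is unavailable; fall through to the next one.
            continue
    raise RuntimeError("all sources failed; operator action required")


# Hypothetical sources for illustration only.
def primary_controller():
    raise ConnectionError("primary controller unreachable")


def backup_controller():
    return 4.2


sources = [("primary", primary_controller), ("backup", backup_controller)]
```

Here a failure of the primary source does not stop the process: the backup answers instead, and an operator-supplied value always wins when one is provided.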
Disaster Recovery deals with worst-case scenarios that involve catastrophic failure due to natural or man-made causes. On-site and off-site database backups are vital to safeguard against prolonged shutdowns. An alternative method of communication should also be identified. Proper implementation includes putting together a crisis management team and testing the recovery system on a regular basis to ensure the plan will work in the real world.
Coming up next month, we’ll take a look at Server & Workstation Hardening and Access Control.