
Three Simple Steps for Minimizing Data Protection Failure While Ensuring Regulatory Compliance

By Jim McDonald, CTO, WysDM Software

One of the primary requirements for effective compliance management is an audit trail of all data protection activities that take place. These activities include traditional backups and restores from all of the backup servers within the environment. To ensure regulatory compliance, companies must retain information on these activities and make it available for inspection if required. Failure to produce data for a specific subset of protected systems over a given time period can lead to heavy fines, and unrecoverable data can have major consequences for the company, its shareholders, and its customers.

The risk of data protection failure is far higher than commonly suspected, because purely technical approaches to data protection management are insufficient for managing this risk. Companies often rely on information supplied by backup applications to provide a list of data protection failures; however, while these applications catch the most obvious problems, they miss a large number of failures that are known as policy failures. Not backing up a system at all, not backing up a system at the right time, and not retaining the backed up data for as long as required are all policy failures. This article outlines an approach that addresses these policy failures and ensures your data is protected in the manner required by business policies and regulations.
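To make the three policy failures concrete, here is a minimal Python sketch that flags each of them for a single system. The BackupRecord fields, the two-hour scheduling tolerance, and the failure labels are illustrative assumptions, not the schema or behavior of any particular backup product.

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class BackupRecord:
    system: str
    last_backup: Optional[datetime]   # None means the system was never backed up
    scheduled_for: datetime           # when policy says the backup should have run
    retention_days: int               # how long the backup copy is kept

def policy_failures(rec, required_retention_days, window_hours=2):
    """Flag the three policy failures described above."""
    failures = []
    if rec.last_backup is None:
        failures.append("system never backed up")                 # unprotected
    elif abs(rec.last_backup - rec.scheduled_for) > timedelta(hours=window_hours):
        failures.append("backup ran outside its scheduled time")  # wrong time
    if rec.retention_days < required_retention_days:
        failures.append("not retained as long as required")       # retention
    return failures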

Over the past decade, data protection requirements have shifted from a purely technical concern to one that is far more business-oriented; however, data protection technologies have not kept pace with this transition. Today there is a large gap between the capabilities of most data protection products and the requirements companies have for data protection. Because it approaches data protection from the business point of view, data protection management (DPM) is fast becoming the way to ensure that data is sufficiently protected.

This article outlines a number of areas of data protection risk and provides a simple three-step process that enables businesses to attain and maintain a minimal risk profile.

Risks in Data Protection
Successful data protection entails managing a number of risks. At a high level these risks fall into three categories:
• Unprotected data – data that is never protected
• Useless data – data that appears to be protected but is not
• Unrecoverable data – data that is protected but can’t be recovered

Each of these categories of risk must be addressed to ensure that data is truly protected. In addition, clear and concise information detailing what was unprotected, what steps were taken to protect it, and whether it is now protected must be available to all relevant internal and external personnel.

Unprotected Data
It is very easy to leave data unprotected and the business at risk. The gulf between the technical operation of backups and the business impact of exposure means that unprotected data must be identified and acted upon from a business point of view.

Business View of Data Protection
The first step in understanding the impact of unprotected data is to provide a business view of the data being protected. Knowing that a backup failure has occurred on server xyz123 is one thing; knowing that a backup failure has occurred on an email or HR server is very much another. An important point to note here is that information about such failures should come from somewhere other than the backup application, since configuration within the backup application is not authoritative from a business point of view. Details about the business use of data and its criticality provide a basis for prioritization when resolving issues involving unprotected data.
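As a sketch of what such a business view might look like, the mapping below ties hypothetical host names to business services and criticality. In practice this information would come from a CMDB or asset database rather than the backup application; the names and tiers here are invented for illustration.

BUSINESS_VIEW = {
    "xyz123": {"service": "Email",    "criticality": "high"},
    "abc456": {"service": "HR",       "criticality": "high"},
    "dev789": {"service": "Test lab", "criticality": "low"},
}

def prioritize_failures(failed_hosts):
    """Order failed backups so the most business-critical are handled first."""
    rank = {"high": 0, "medium": 1, "low": 2}
    return sorted(
        failed_hosts,
        key=lambda h: rank.get(BUSINESS_VIEW.get(h, {}).get("criticality", "low"), 2),
    )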

Missed Backups
With the information about business assets at hand, it is important to highlight gaps in the data protection schedule. This requires knowledge of what exists (and hence should be protected); what should run given the existing schedules; and what has run. Once you have this knowledge, you can determine what has been missed, either through misconfiguration or mismanagement of backup schedules. This is especially true of backups that run on an ad hoc basis, such as backups that run as the result of a batch scheduler or other event outside of the backup application itself.
WysDM for Backups obtains detailed configuration and schedule information from leading backup applications and combines this with information about which data protection activities have actually taken place to ensure that missed backups are flagged accurately. It also obtains information from external systems such as asset management databases to ensure that all assets are protected.
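Conceptually, this kind of missed-backup detection reduces to a set difference between what should exist and what actually ran. The sketch below illustrates the idea with assumed inputs (sets of asset names); it is not WysDM's implementation.

def missed_backups(known_assets, scheduled, actually_ran):
    """Split protection gaps into the two causes mentioned above:
    misconfiguration and mismanagement of backup schedules."""
    return {
        "never_scheduled": known_assets - scheduled,        # misconfiguration
        "scheduled_but_missed": scheduled - actually_ran,   # mismanagement
    }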

Failure Auditing
Inevitably, some backups will fail. When failures occur, it is important that details about them are recorded and retained so that you can determine the root cause of each failure and produce an audit trail showing what failed, when, and why.

Reconciliation
A failed backup is not the end of the story; several types of reconciliation can take place after a backup fails. In any large enterprise, the actions that follow a failed backup are important to ensure that risk remains at an acceptable level. Such actions include contacting the user responsible for the asset that was not backed up, fixing the underlying issue, running a test to confirm that the next backup will work correctly, and ensuring that the relevant business user is aware of the level of risk that has been incurred. All of these actions should be logged at the time they occur and made available for future reference if required.
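A minimal sketch of such a reconciliation log follows. The field names and example actions are assumptions; the key point is that each action is timestamped and appended to a durable record that can be produced at audit time.

import json
from datetime import datetime, timezone

def log_reconciliation(logfile, asset, action, actor):
    """Append one timestamped reconciliation action to a durable log."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "asset": asset,       # e.g. "hr-db-01" (hypothetical name)
        "action": action,     # e.g. "owner notified", "test backup succeeded"
        "actor": actor,
    }
    with open(logfile, "a") as f:
        f.write(json.dumps(entry) + "\n")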

Long-term Trending
The people involved with backup failures are often very much “in the trenches” of data protection and operate on the rhythm of the daily data protection cycle; longer-term trending of failures is not one of their priorities. But trend information is critical to pinpoint the leading causes of backup failure and to focus resources on fixing whatever is causing the most data protection risk.
By tracking metrics such as the most common cause of backup failure and the assets that have the highest failure rate, businesses can take appropriate action to reduce risk in their data protection environment.
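A simple illustration of this kind of trending, assuming failure history has been retained as (asset, cause) records:

from collections import Counter

def failure_trends(history):
    """history is a list of (asset, cause) tuples from retained failure records."""
    by_cause = Counter(cause for _, cause in history)
    by_asset = Counter(asset for asset, _ in history)
    return by_cause.most_common(5), by_asset.most_common(5)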

Useless Data
Data protection products are very good at providing information about the technical results of data protection; however, technical success is not an indicator of business protection.

Business Protection Policies
A technically successful backup does not provide all of the information required to ensure that business assets are adequately protected. For example, a backup of a trading application that takes place in the middle of the trading day may succeed, but it will not provide a suitable basis for data protection because regulations require end-of-day information. Similar requirements about the maximum time between full backups or the length of time backups must be retained mean that technically successful backups can still be invalid as far as the business or regulatory agencies are concerned.
To ensure that backups meet business requirements rather than merely technical ones, a DPM solution should let you define business policies that include information such as the timing of the backup window, the maximum time between full backups, retention periods, and so on.
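The sketch below shows what checking a backup against such a business policy might look like. The policy fields and violation messages are illustrative assumptions, not a DPM product's actual policy model.

from dataclasses import dataclass
from datetime import datetime, time, timedelta

@dataclass
class BusinessPolicy:
    window_start: time            # e.g. time(21, 0) for 9PM
    window_end: time              # e.g. time(7, 0) for 7AM the next morning
    max_days_between_fulls: int
    min_retention_days: int

def policy_violations(policy, started, last_full, retention_days):
    problems = []
    t = started.time()
    if policy.window_start > policy.window_end:       # window spans midnight
        in_window = t >= policy.window_start or t <= policy.window_end
    else:
        in_window = policy.window_start <= t <= policy.window_end
    if not in_window:
        problems.append("backup ran outside the backup window")
    if started - last_full > timedelta(days=policy.max_days_between_fulls):
        problems.append("maximum time between full backups exceeded")
    if retention_days < policy.min_retention_days:
        problems.append("retention period shorter than policy requires")
    return problems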

Breakdown of Protection Tiers
Data protection metrics are often presented in aggregate. This is understandable: if you have 10,000 backups running per night, a full list would be of very little use. However, an aggregate number such as 99% success is useless, since it does not identify which backups failed. What is needed is a breakdown of data protection success by criticality, or tier. This breakdown provides information about the success rate of critical and non-critical assets, and gives a much clearer picture as to the true success of the data protection activities. Detailed information about the critical assets that failed data protection is also required, to ensure that the backup administrator can address these failures before fixing less critical failures.
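For illustration, the following sketch computes a success rate per tier from assumed (tier, succeeded) result records, such as ("critical", True):

from collections import defaultdict

def success_by_tier(results):
    """Break aggregate backup results down by criticality tier."""
    totals, successes = defaultdict(int), defaultdict(int)
    for tier, ok in results:
        totals[tier] += 1
        successes[tier] += 1 if ok else 0
    return {tier: 100.0 * successes[tier] / totals[tier] for tier in totals}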

Unrecoverable Data
Even when data is fully protected according to technical and business requirements, the data may be unrecoverable. Although there is no easy way to eliminate this risk entirely, it can be alleviated by carrying out a number of checks and creating a recoverability policy.

Statistical Recoverability
The most common way to understand the recoverability of data is to look at how many times the data has been backed up and how many times it has been successfully restored, and to use that information to estimate the likelihood that the data can be successfully recovered.
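A naive version of this calculation might look like the following. Treating the historical restore success rate as a probability is a simplification, and the inputs are assumed to come from retained backup and restore records.

def estimated_recoverability(restores_attempted, restores_succeeded):
    """Estimate the chance a recovery will succeed from past restore results."""
    if restores_attempted == 0:
        return 0.0   # no restore has ever been tested; treat as unproven
    return restores_succeeded / restores_attempted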

Recoverability Policy
Recoverability involves two factors: (1) is the data available to be recovered, and (2) how long would it take to recover the data. While the concept of a recovery time objective (RTO) is well established, the ability to understand if a backup or set of backups falls within a given RTO is not. Conversely, there are also situations in which a company does not want data to be recoverable, the two most common being data that has passed its retention period and data that the company no longer controls (for example, a lost tape).
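As an illustration of the second factor, the sketch below estimates recovery time from data volume and an assumed restore throughput and compares it against the objective. Real recovery times depend on many more variables; the overhead allowance here is a placeholder.

from datetime import timedelta

def within_rto(total_bytes, restore_bytes_per_sec, rto, overhead=timedelta(minutes=30)):
    """Estimate recovery time and compare it against the recovery time objective."""
    estimated = timedelta(seconds=total_bytes / restore_bytes_per_sec) + overhead
    return estimated <= rto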

Data Protection Risk Management Process
It is important to understand that none of the above risks can be mitigated without advanced reporting and powerful analysis of the data protection environment. In addition to minimizing these risks, however, it is critical that the results of the actions taken to minimize risk are clearly visible. This visibility ensures that compliance audits are successful and that day-to-day confidence in data protection risk management remains high. At the same time, this visibility should not come at the cost of significant manual effort on a day-to-day basis.

The following three-step process can help you balance these requirements and ensure that effort is placed where needed but not expended unnecessarily.

Step 1: Definition
The first step is to define the issues that must be addressed. Does your company have specific retention requirements? Does it need to produce specific reports or generate alerts when certain conditions are met in order to ensure compliance? Are there specific recovery point and recovery time objectives for individual applications? Do applications have backup windows, and if so, what are they? These issues may be driven by internal policies, external regulations, or both, depending on the particular business areas within the company.
To streamline this step, some DPM products provide predefined best practices for common regulatory requirements such as Sarbanes Oxley, HIPAA, FDA and SEC rules. In addition, a number of generic best practices for identifying data protection risks are available.
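For illustration, step 1 might produce policy definitions along these lines. The structure, key names, application names, and values are assumptions for the sketch, not any product's actual policy format.

POLICIES = {
    "trading-app": {
        "backup_window": ("21:00", "07:00"),
        "max_days_between_fulls": 7,
        "retention_days": 2555,     # roughly seven years
        "rto_hours": 4,
        "driver": "SEC recordkeeping rules",
    },
    "hr-system": {
        "backup_window": ("22:00", "06:00"),
        "max_days_between_fulls": 7,
        "retention_days": 365,
        "rto_hours": 24,
        "driver": "internal policy",
    },
}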

Step 2: Iterative Reporting and Process Improvement
The second step is to use the information from step 1 to improve the data protection process. Say a company states that its backup window is 9PM to 7AM each day. Generating a report that shows which backups ran outside this window, where within the window the other backups ran, and how much of the window went unused allows backup administrators to reschedule or reroute the out-of-window backups so that they subsequently run within the window. The next day or the next week, another report can be run to show the results of the changes and to generate a new list of backups that did not meet their window requirements. This process continues until all backups are running within their required window.
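A sketch of the window report just described, with assumed (backup name, start time) records; note that the default 9PM-7AM window spans midnight.

from datetime import datetime, time

def outside_window(runs, start=time(21, 0), end=time(7, 0)):
    """runs is a list of (backup name, start datetime) pairs; returns the
    names of backups that began outside the window."""
    def in_window(t):
        return t >= start or t <= end   # window spans midnight
    return [name for name, began in runs if not in_window(began.time())]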

Step 3: Automated Monitoring
The third and final step is to replace the iterative reports with alerts. Once the situation has changed from one with problems (for example, backups running outside the window) to one without (all backups running inside their required window), the need to generate reports is reduced or eliminated. Instead, the business policy requirements should be monitored in an automated fashion so that administrators are made aware of policy breaches as they happen. Monitoring can also flag potential policy breaches before they occur, such as a situation in which ongoing data growth or changes in throughput will cause a policy breach in the near future.
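As a sketch of such predictive monitoring, the function below extrapolates from recent growth in backup duration to estimate when the window will be breached. The linear-growth assumption is purely illustrative.

def weeks_until_window_breach(current_hours, window_hours, growth_per_week):
    """Extrapolate how many weeks remain before the backup outgrows its window.
    Returns None if the backup duration is flat or shrinking; a result of zero
    or less means the window is already breached."""
    if growth_per_week <= 0:
        return None
    return (window_hours - current_hours) / growth_per_week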

Conclusion
Data protection risk management should meet the needs of the business as well as satisfy external regulations and their audit requirements. Regardless of the size and complexity of the environment, it is possible to both attain and maintain a minimal risk profile by following the simple three-step process described above.



Jim McDonald is co-founder and chief technology officer of WysDM Software.



