Let's talk about data retention, data protection, and the policies and strategies that can help your organization navigate them.
Why You Should Use Intelligent Retention for Data Retention Management
In this blog, we’ll give an overview of data retention, explain the policies needed to implement a retention plan, and show why you should invest in a better alternative: intelligent retention.
What is Data Retention?
Data retention management is the process of storing information for a specified period. It is relevant to all businesses that produce and retain data to service their customers and comply with government or industry regulations and legal requirements.
Without a data retention strategy, companies might store too much information or keep it longer than needed, which leads to operational inefficiencies, increased costs, and legal and security risks. Regulatory compliance is a must to avoid fines and streamline workflows.
To formalize data retention best practices, every organization needs a data retention policy that codifies the requirements and gives clear guidance across the organization: what data must be retained and how it should be archived.
Why You Need a Data Retention Policy
A data retention policy clarifies what data should be stored or archived, where that should happen and for how long. Essentially, it outlines data retention best practices your business should have in place.
It defines how the organization will store and manage data for compliance or regulatory reasons and the disposition once the data is no longer required.
Even a simple data retention policy should clarify how records and data should be formatted, how long they must be kept, and what storage system or devices are used to retain them.
Once a data set exceeds its retention time, the policy should specify whether the data can be deleted or moved to secondary (cold) data storage, depending on business requirements. These decisions will typically be based on local laws and regulations (GDPR) or rules of the industry’s regulatory body.
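To make this concrete, a single policy rule can be sketched in a few lines of Python. This is only an illustration: the names RetentionRule, Disposition and action_for, as well as the ten-year invoice example, are assumptions made for this sketch and are neither legal guidance nor part of any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Disposition(Enum):
    """What happens to a record once its retention period is over."""
    DELETE = "delete"
    MOVE_TO_COLD_STORAGE = "move_to_cold_storage"


@dataclass(frozen=True)
class RetentionRule:
    category: str             # hypothetical record category name
    retention_days: int       # how long the record must be kept
    disposition: Disposition  # action once the period has passed


def action_for(rule: RetentionRule, created: date, today: date) -> str:
    """Return 'retain' while the retention period runs, else the disposition."""
    expires_on = created + timedelta(days=rule.retention_days)
    if today < expires_on:
        return "retain"
    return rule.disposition.value


# Illustrative values only: roughly ten years, then cold storage.
invoice_rule = RetentionRule("invoice", 10 * 365, Disposition.MOVE_TO_COLD_STORAGE)
```

A real policy would hold many such rules, one per record category, with periods taken from the regulations that apply to your organization.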
When you create a data retention policy, you will begin to explore data retention management software that can help achieve your goals.
Why Not Keep Everything Forever?
While it is important to retain historical data for business use, most data retention best practices really exist to fulfill regulatory requirements.
Data retention policies for electronic communications are often the crux of data management. There is a long tradition of filing paper documentation as a form of record retention, but the large streams of information created by electronic communications cannot be stored or cataloged as easily in traditional filing systems, and much of that information does not constitute a business record.
Nearly 50% of all business communications are “noise”: anything from airline newsletters and legitimate advertising to IT notifications or daily updates from social networks such as LinkedIn.
But telling which half of the information is irrelevant has historically been difficult.
In the past, many records management experts therefore translated the legal or tax-focused retention requirements into blanket statements, such as
“Keep all emails in a ten-year retention category”.
Even worse, this was often done without explicitly requiring the corresponding expiry mechanisms to automatically prune those emails once they are past their due date.
Today, organizations still struggle to enforce this deletion process for various reasons:
- Ongoing cases / legal holds
- Data across different departments and geographies with conflicting retention requirements
- Potential long-term records such as contracts (life insurance) or IP records (patents)
- And above all: finding a senior executive willing to sign off on the permanent deletion of data
On the other hand, most organizations do not have the financial ability to retain all data forever, nor is that even a desirable goal given the emergence of harsher data privacy requirements (such as GDPR) which might require data to have a much shorter retention period.
For instance, personnel records and sensitive financial or medical records may all have different (shorter) retention periods. The complications with data retention lead many businesses to turn to data retention management services for assistance.
Covering The Basics With an Industry-Specific Template
Due to the complex nature of the regulatory frameworks, organizations often use an industry-specific data retention management template that includes several different retention puzzle pieces:
- Publicly-traded United States companies must establish a Sarbanes-Oxley Act (SOX) data retention policy.
- Organizations that accept credit card payments must establish Payment Card Industry Data Security Standard (PCI DSS) data retention policies.
- Healthcare organizations must develop data retention policies that adhere to the Health Insurance Portability and Accountability Act (HIPAA).
- Businesses that process or store personal information about individuals in the EU must comply with the General Data Protection Regulation (GDPR), even if they operate from a non-member state.
Using a simple template is just a starting point, which should be refined to your organization’s needs.
A Major Challenge: Retaining Personal Data
After many years of discussion, the European Union has unified the data privacy approaches of its member states in a single, ambitious compliance framework: the General Data Protection Regulation – GDPR.
It grants data subjects various rights, which in turn means that organizations should not treat data from European customers as their own but rather as data under their stewardship. One way to deal with personal data from a Data Retention Management perspective is to create a separate retention policy just for personal data.
When considering a personal data retention policy, you must carefully audit all data collected to be sure your policy considers all personal data your organization stores. Data stored in databases, documents, email, financial data, images, production data, system state information, and videos might all be important for your personal data retention policy.
Think about this from a customer, as well as from an employee perspective.
The Next Step: Intelligent Retention
The example above shows clearly that in a world of conflicting regulations and endless streams of irrelevant information, the classic “capture everything and store forever” approaches to data retention no longer work.
Fortunately, data retention management software is now available that will dramatically change how we look at Data Retention Management. It builds on Deep Learning, Natural Language Processing and AI-based content recognition.
These transformative technologies pick up the age-old idea of filing the data on a “per record” basis but transfer the decision-making from the employee to an IT algorithm. Of course, this is not perfect, and a big chunk of the data will still be labeled as “miscellaneous”. Still, for many record types, the confidence levels of the AI are so trustworthy that it opens a glimpse into the future of Data Retention Management Services.
This is especially true as the data volumes have become unmanageable not only from a storage but, more importantly, from an eDiscovery or Data subject request perspective. It simply takes too long to sift manually through the irrelevant noise, especially for a highly paid legal expert.
In fact, it is now cheaper to use existing CPU capacity to classify 1,000 documents than to have a manual reviewer read a single document to the end.
What Are The Foundations of an Intelligent Data Retention Approach?
To outline the core principles of Intelligent Retention, we have created a list of 6 steps that will transform how organizations will retain, use and expire data in the future.
1: Classify Your Data
Not all created data is equal, which is why not all data has the same retention periods. Start by identifying what data your organization stores and then classify the data to determine which data needs to be archived and for how long.
If the data is not critical or a business record, it should be expired aggressively.
At the same time, there might be documents that the company wants to keep for a very long time: Contracts, design plans or anything that is patent related, which will likely include vast amounts of research data.
If any of that data also serves the current needs of the business, you might want to move it to the correct repository for your employees to find it, instead of keeping it hidden in an archived mailbox of someone who left the company years ago.
The right smart data retention management software can help your business effortlessly sift through data to determine if it’s critical or not.
2: Remove The Noise
Many organizations can reduce their communications archive by 25% by simply identifying and expiring newsletters and notifications older than two years.
Classifying your legacy data also allows for removing irrelevant and duplicated files. Aggressively expiring this data expedites searches, avoids confusion, and enhances the user experience.
One problem worth mentioning is “over-collecting” data during a legal investigation, which results in spiraling costs for the review. How can that happen? Many newsletters and advertising emails contain disclaimers with several paragraphs of legal terminology. A search for “Luxemburg, contract, governing law, liability” produces thousands of Amazon newsletters.
Removing those “false positives” has an immediate positive effect on the legal team.
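As a rough illustration of noise removal, the sketch below flags bulk mail older than two years using only Python's standard library. It relies on a simple heuristic (the List-Unsubscribe and Precedence headers that newsletters commonly carry) rather than the AI-based classification discussed in this article, and the function names is_bulk_mail and can_expire are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

NOISE_MAX_AGE = timedelta(days=2 * 365)  # "older than two years", approximated


def is_bulk_mail(msg) -> bool:
    """Heuristic: newsletters usually carry List-Unsubscribe or Precedence: bulk."""
    if msg["List-Unsubscribe"] is not None:
        return True
    return (msg["Precedence"] or "").strip().lower() in {"bulk", "list"}


def can_expire(raw_message: str, now: datetime) -> bool:
    """True if the message looks like bulk mail and is older than two years.

    `now` must be timezone-aware.
    """
    msg = message_from_string(raw_message)
    if msg["Date"] is None:
        return False  # no reliable age, so keep the message
    sent = parsedate_to_datetime(msg["Date"])
    if sent.tzinfo is None:
        # Assumption: treat dates without an offset as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return is_bulk_mail(msg) and now - sent > NOISE_MAX_AGE
```

A header check like this will miss some noise and should never be the only safeguard before deletion; it simply shows how cheaply the most obvious candidates can be found.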
3: Understand The (Legal) Risk
As you classify data, ensure your data retention policies align with compliance and legal restrictions. You may also have to consider pre-existing contractual needs shaping data retention schedules.
Understand your compliance verticals, whether it’s HIPAA, the Sarbanes-Oxley Act or GDPR. Know the law to determine what data must be kept and for how long. Likewise, keep data that you might need if legal action should arise. Keeping track of your data retention requirements helps with compliance in the long run.
By tagging health records as “HIPAA” or financial statements as “SOX”, a modern, intelligent retention platform helps deal with different and conflicting regulations.
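One way to picture how tags help with conflicting regulations is to treat each tag as a minimum and/or maximum retention period and then combine them. The sketch below is hypothetical: the Requirement class, the resolve function and any numbers you pass in are assumptions for illustration, not statements about what HIPAA, SOX or GDPR actually require.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Requirement:
    tag: str                                    # e.g. "SOX" or "HIPAA"
    keep_at_least_years: Optional[int] = None   # minimum retention, if any
    keep_at_most_years: Optional[int] = None    # maximum retention, if any


def resolve(requirements: list[Requirement]) -> dict:
    """Combine tags: the longest minimum and the shortest maximum win.

    If the minimum exceeds the maximum, no single period satisfies
    every tag, so the document is flagged for manual review.
    """
    minimums = [r.keep_at_least_years for r in requirements
                if r.keep_at_least_years is not None]
    maximums = [r.keep_at_most_years for r in requirements
                if r.keep_at_most_years is not None]
    keep_at_least = max(minimums, default=0)
    keep_at_most = min(maximums, default=None)
    conflict = keep_at_most is not None and keep_at_least > keep_at_most
    return {
        "keep_at_least_years": keep_at_least,
        "keep_at_most_years": keep_at_most,
        "needs_review": conflict,
    }
```

The useful part is the needs_review flag: a document that carries both a long financial retention tag and a short personal data tag is surfaced explicitly instead of silently falling under one rule.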
4: Identify Personal Data
Identifying personal data starts with the question:
What kinds of documents create the biggest risk in an archiving system?
Usually, those are documents that contain personal data by design, such as CVs, resumes or HR records. On top of that, health data is often protected and cannot be retained forever in most geographies. For a good reason: A sick note stating depression or mental health problems after a divorce five years ago should play no role in promoting a top performer today.
The amount of personal data in question depends on the vertical industry and whether collecting information about the customer’s personal situation is part of the business process (insurance) or whether the health status needs to be documented regularly (healthcare).
You also need to consider the location of the data subject. In some cases, data located in different places may require unique data retention policies. A customer in California is now protected under the CCPA, while an employee in Europe has privacy rights that prevent the unsolicited recording or monitoring of certain communications.
Let us take a real-life example: There is hardly any document that contains more personal data than a CV or resume. Under GDPR's storage limitation principle, the CV of a rejected candidate cannot be kept indefinitely. A common benchmark is to delete it no later than six months after the job application has been received, although the exact period depends on national guidance.
How do you deal with informal job applications sent to the hiring manager via email?
Without an AI-based classification, your data retention policy will likely struggle to cater for these cases.
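To show what such a case involves, here is a deliberately naive sketch that spots likely job applications by keyword and computes a deletion deadline. The keyword pattern is a hypothetical stand-in for a trained classifier, the function names are assumptions, and the six-month window (approximated as 183 days) is simply the period used in the example above.

```python
import re
from datetime import date, timedelta

# Hypothetical keyword pattern; a trained classifier would replace this.
APPLICATION_HINTS = re.compile(
    r"\b(cv|resume|résumé|curriculum vitae|job application)\b",
    re.IGNORECASE,
)
SIX_MONTHS = timedelta(days=183)  # approximation of the six-month window


def looks_like_job_application(subject: str, attachment_names: list[str]) -> bool:
    """Check the subject and attachment file names for application keywords."""
    text = " ".join([subject, *attachment_names])
    # Turn separators such as "_" and "." into spaces so that a file name
    # like "Jane_Doe_CV.pdf" exposes "CV" as a separate word.
    text = re.sub(r"[_\-.]+", " ", text)
    return APPLICATION_HINTS.search(text) is not None


def delete_by(received: date) -> date:
    """Latest deletion date for the CV of a rejected candidate."""
    return received + SIX_MONTHS
```

Keyword matching produces both false positives and misses, which is exactly why the informal application sent straight to a hiring manager is hard to catch without content-based classification.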
5: Transfer Relevant Data Into The Right Tools And Repositories
Identifying and capturing accounting records, customer correspondence, electronic communications, financial data, sales data, and other mission-critical digital business data does more than ensure compliance. It also helps in re-introducing certain records into the actively used CRM or HR systems, such as Salesforce or Workday.
Emails that were sent during a patent application should be copied to the ECM system of choice, where the paper forms and laboratory records are stored, so that you do not accidentally lose this critical information after a standard 10-year retention period, given that patents are typically granted for a term of 20 years.
An intelligent retention platform should allow you to re-insert data into other platforms if a corresponding tag has been confidently applied.
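The idea of re-inserting only confidently tagged data can be sketched as a simple routing rule. Everything here is hypothetical: the tag names, the target system labels and the 0.9 threshold are assumptions chosen for illustration, and a real platform would call the target system's own import interface instead of returning a name.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.9  # hypothetical cut-off for "confidently applied"

# Hypothetical mapping from classification tag to target repository.
TARGETS = {
    "customer_correspondence": "CRM",
    "hr_record": "HR system",
    "patent_related": "ECM",
}


@dataclass(frozen=True)
class TaggedDocument:
    doc_id: str
    tag: str
    confidence: float  # classifier confidence between 0.0 and 1.0


def route(doc: TaggedDocument) -> Optional[str]:
    """Return the target repository, or None to leave the document in the archive."""
    if doc.confidence < CONFIDENCE_THRESHOLD:
        return None  # not confident enough to move anything
    return TARGETS.get(doc.tag)  # None for tags without a target
```

Keeping low-confidence documents in the archive is the conservative choice: a wrongly routed record pollutes the CRM or HR system, while an unrouted one can still be picked up later.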
6: Delete Data Post The Retention Period
While this should be obvious, many businesses have second thoughts and hold on to data longer than required in hopes they might need it in the unforeseeable future.
Intelligent retention frees up storage for new data and files by expiring outdated content and eliminating duplicates. Overall, the process saves time and money through lower storage costs and faster searches.
And you do not run into the danger of destroying important records or keeping personal data years longer than allowed.
Finally, it is much more likely to find an executive to sign off on the deletion of content that has been confidently tagged as “non-relevant” than to get the buy-in for a blanket deletion of data that might still contain important records.
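A sign-off of this kind is easier when the executive sees a summary by tag rather than a raw list of files. The following sketch, using hypothetical names (Record, deletion_candidates, sign_off_summary), selects expired records, leaves out anything under legal hold, and counts the remaining candidates per tag.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    record_id: str
    tag: str                     # classification tag, e.g. "newsletter"
    expires_on: date             # end of the retention period
    on_legal_hold: bool = False  # records on hold must never be deleted


def deletion_candidates(records: list[Record], today: date) -> list[Record]:
    """Expired records that are not blocked by a legal hold."""
    return [r for r in records
            if r.expires_on <= today and not r.on_legal_hold]


def sign_off_summary(records: list[Record], today: date) -> dict[str, int]:
    """Number of deletion candidates per tag, for an executive to approve."""
    return dict(Counter(r.tag for r in deletion_candidates(records, today)))
```

For example, a summary such as 40,000 newsletters and 12,000 system notifications is a far easier decision than approving the deletion of an unlabeled mailbox archive.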
How To Get Started With Intelligent Retention
The good news is that finally, the archiving and data retention community is waking up to this new challenge. While basic data classification has been available for a while, the trend is to provide ready-made Deep Learning capabilities within the Data Retention platform itself.
At Cloudficient, we have designed our Expireon solution as a true “Intelligent Retention” platform. It is designed with open interfaces so that different data classification plug-ins can be used, depending on the requirements of your organization.
The data retention management software can identify newsletters, system messages and many other content types. But the true power lies in how the learning algorithms can be trained to recognize industry-specific content and get more accurate over time. A document that might only score a confidence level of 50% will likely be significantly upgraded or downgraded when revisited after a few months.
While there is a trade-off between the required compute power and the saved storage capacity (due to the cost of CPU calculations), Intelligent Retention saves a significant amount of storage cost, with further savings during the e-Discovery process.
Add the fact that Expireon uses industry-standard S3 storage and has a much lower subscription cost than competing products, and you end up with a cost-saving strategy that will fit even into tight budgets.
So How Does This Work In Reality?
We recommend making Intelligent Retention capabilities a cornerstone of your future investments in the Data Retention Management space. Without the intelligence of an AI-powered solution, you will likely hit a brick wall when dealing with conflicting regulations, such as SOX and GDPR. By identifying personal data and distinguishing it from financial records, Intelligent Retention will give you the agility to define your Data Retention policy in a future-proof way.
Now the good news: Expireon has migration capabilities built right into the platform so that onboarding data from legacy systems is a breeze. And there is no better time to classify your information than during a technology refresh or cloud migration.
If you want to learn more about Expireon and Intelligent Retention, visit our website.
With unmatched next generation migration technology, Cloudficient is revolutionizing the way businesses retire legacy systems and transform their organization into the cloud. Our business constantly remains focused on client needs and creating product offerings that match them. We provide affordable services that are scalable, fast and seamless, including for data retention management.