A challenge that some of the largest organizations face is what to do with petabytes of archived email data.
Introduction to Intelligent Retention
In today's blog, I want to go through some thoughts regarding the legal challenges of managing unstructured data during a cloud migration.
Several legal fields need to be considered when defining a data management strategy for unstructured and communications data, such as Mail, File and Chats.
Off the top of my head, the most important ones would be:
- Vertical Regulations (Financial Services, Health & Pharma, Public Sector)
- Data Privacy and Data Protection
- Intellectual Property
- International Issues
- Risk Management and Liability Issues
- Security issues, including Security of Privacy
Let’s look at some of them in more detail.
For the US, the most important regulation for archiving and retention of electronic communication is the Dodd-Frank Act. It is a US federal law passed in 2010 as a reaction to the 2007-2008 financial crisis, intended to enhance the stability of the US financial market by improving accountability and transparency. In simple terms, it obliges FinServ companies to retain records of all business activities, including a complete audit trail, for a minimum of 5 years. Most technical details, like the acceptable format, are defined by the responsible supervisory authorities such as FINRA and the SEC.
Following the US, the European Union has responded over the last few years with a variety of directives and regulations that directly affect the recording and archiving of customer interactions in the financial sector.
The most prominent one is MiFID II, the revised European Markets in Financial Instruments Directive, which is implemented in the law of the member states. It has not only tightened the regulations of MiFID I but also added new rules to strengthen investor protection and to increase the transparency of financial markets. Let’s look at the exact wording from the German regulator BaFin:
"Record-keeping obligation - MiFID II also requires investment services firms to record electronic communications with their clients and make them available to them and to national supervisory authorities on request. This applies not only to telephone calls, but to any type of electronic communication. Communications between the company and the customer that relate to the acceptance, transmission and execution of customer orders must be recorded. The recording obligation is intended on the one hand to make it easier for customers to prove misconduct on the part of their investment service provider; on the other hand, it makes it easier for national supervisory authorities to prosecute market abuse. Companies must inform their customers at the beginning of the business relationship that communications will be recorded. They must keep the records for at least five years."
But it is not just banking: In many countries and regions, similar regulation has been in place for pharmaceutical companies, health insurance and energy trading. In fact, most publicly listed companies have obligations to keep communications records for specific departments and make them available to the relevant authorities on demand.
That can also be true for manufacturing companies, which are facing more scrutiny regarding export sanctions and product liability cases, such as the so-called “Dieselgate” scandal.
Data Privacy & Data Protection
To some people, this might sometimes seem like an old and worn-out topic, especially since the hype around GDPR a couple of years ago, but we are only getting started – this time the enforcement is real, and the fines are happening!
Have a look at: https://www.enforcementtracker.com/
(And in case you are not based in Europe and think “Does not affect me!”: similar privacy regulation is in force in the US (CCPA), Canada (PIPEDA), Australia (Privacy Act) and many other countries around the world. Also remember that GDPR applies to any organization doing business with European customers, not just those incorporated in Europe.)
It has already been broadly discussed that GDPR has wide-reaching consequences. It must be applied to all data that does not fall under a vertical regulation or tax-law related retention requirement.
An Alien Concept: Short-term Retention
GDPR is very clear about the retention of personal data: You may not store it longer than needed, and you may not use it for any purpose other than the one it was collected for, for example the purpose explicitly covered by the subject’s consent.
Therefore, retention periods are mandated … but how can that possibly work when you have a repository that contains both business records and personally identifiable information (PII)?
Legacy Data Management
But there is another obvious problem there: Legacy data and systems need to comply in the same way as data that is created today, even if the data predates GDPR by a significant amount of time.
This is especially difficult, as the original idea was to use long-term retention, often 7 or 10 years. However, according to GDPR, this is not allowed:
“Many of us never delete emails. There are plenty of good reasons: We may need to refer to them someday as a record of our activities or even for possible litigation. But the more data you keep, the greater your liability if there’s a data breach. Moreover, the erasure of unneeded personal data is now required under European law. Because of the GDPR, you should periodically review your organization’s email retention policy with the goal of reducing the amount of data your employees store in their mailboxes. The regulation requires you to be able to show that you have a policy in place that balances your legitimate business interests against your data protection obligations under the GDPR.”
Migration with Intelligent Retention as a Way Out
Migrations are rare events that require you to move large volumes of data and process the information on a newly designed target platform. The simplest form of processing is to create a searchable index.
But for most projects, there is little desire to change the retention policies and to revisit the value of the old data. We believe this is unfortunate. And we believe that there is a genuine opportunity for many organizations to change that approach:
“Use your archive migration event for re-evaluating your retention policies.”
The latest advancements in machine learning and database technology have given us new options to look at the existing legacy data:
Tag and Keep the Business Records
For an energy company, the commodities traders, and possibly anyone involved with CO2 certificates and emissions trading, are the employees most likely to be investigated.
For those departments, it makes sense to update the retention time to the current vertical regulation and keep a complete trail of their communications records. In these cases, the business interests outweigh the privacy interests of the employees.
For all the other non-regulated employees, it makes sense to create policies based on metadata, such as emails sent to the top 100 customers, suppliers and competitors. These are most likely to be relevant and concern specific projects or transactions. On top of that, modern classification technology can identify contracts, invoices and other commercial agreements with very high confidence.
Tagging this kind of information has, of course, very positive effects on any e-Discovery process, as you can suddenly target certain document classes instead of creating long lists of keywords.
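To make the tagging rules above more concrete, here is a minimal sketch in Python. It is not how any particular product works: the department names, the counterparty domains and the 7-year and 1-year periods are hypothetical placeholders, and only the 5-year figure is taken from the regulations discussed earlier. Your real values have to come from your compliance team.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical example values; replace with your own lists and periods.
REGULATED_DEPARTMENTS = {"commodities-trading", "emissions-trading"}
KEY_COUNTERPARTY_DOMAINS = {"bigcustomer.example", "keysupplier.example"}

# Retention in years per tag. Only the 5-year minimum comes from the
# regulations quoted in this post; the other two numbers are assumptions.
RETENTION_YEARS = {
    "regulated": 5,
    "business-record": 7,
    "unclassified": 1,
}


@dataclass(frozen=True)
class EmailMeta:
    sender_department: str
    recipient_domains: frozenset
    sent_on: date


def tag_email(meta: EmailMeta) -> str:
    """Assign a retention tag based on metadata only."""
    if meta.sender_department in REGULATED_DEPARTMENTS:
        return "regulated"
    if meta.recipient_domains & KEY_COUNTERPARTY_DOMAINS:
        return "business-record"
    return "unclassified"


def expiry_date(meta: EmailMeta) -> date:
    """Return the date on which the email may be expired."""
    years = RETENTION_YEARS[tag_email(meta)]
    try:
        return meta.sent_on.replace(year=meta.sent_on.year + years)
    except ValueError:
        # Sent on 29 February and the target year is not a leap year.
        return meta.sent_on.replace(year=meta.sent_on.year + years, day=28)
```

The regulated-department check comes first on purpose: a trader's email keeps the regulated tag even if it also goes to a key customer. Content-based classification of contracts and invoices would sit on top of rules like these and is not shown here.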
Expire Any Personal Information
The email system is a bad place to collect personal information (PII) of your partners and customers, especially as the user mailbox is protected by privacy law in some countries, which makes it hard to comply with GDPR Articles 12, 15 and 17 (including the right of access and the right to be forgotten). It should therefore not be treated as a valid business application like Salesforce or Workday.
Therefore, my recommendation is to encourage your workforce to “promote” any relevant email to the CRM, HR or other suitable business application, where the information is centrally managed and attached to a distinct customer or partner account.
Remove The Noise
Grey data such as newsletters, system notifications and social media updates “poisons your well” when it comes to e-Discovery. Newsletters tend to contain lengthy disclaimers with paragraphs of legal terminology. Social media updates might bring abusive language and inappropriate content into the work environment.
As this grey data is typically short-lived information that is unrelated to any business process and does not affect the balance sheet, it can be safely removed based on the sender, metadata and a superficial analysis of the content.
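As an illustration of sender- and metadata-based noise detection, here is a small sketch using only the Python standard library. The list of sender names is a hypothetical starting point, and the rules are deliberately simple; a real filter would add the superficial content analysis mentioned above and would need to be reviewed before anything is deleted.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sender local parts that usually indicate automated mail.
NOISE_LOCAL_PARTS = {"noreply", "no-reply", "newsletter", "notifications"}


def is_noise(raw_message: str) -> bool:
    """Guess from headers alone whether a message is bulk or automated mail."""
    msg = message_from_string(raw_message)

    # Mailing lists and newsletters normally carry a List-Unsubscribe header.
    if msg.get("List-Unsubscribe"):
        return True

    # "Precedence: bulk" or "list" is a widely used hint for mass mailings.
    if (msg.get("Precedence") or "").strip().lower() in {"bulk", "list"}:
        return True

    # Fall back to the local part of the sender address.
    _, address = parseaddr(msg.get("From", ""))
    local_part = address.partition("@")[0].lower()
    return local_part in NOISE_LOCAL_PARTS
```

A message flagged this way is only a candidate for removal; in practice you would first confirm that the sender is not covered by one of the retention tags described earlier.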
“enable re-classification of certain records and document classes.”
Prepare For The Next Migration
You may discover that your new cloud solution does not work as effectively as expected initially. There may be data residency issues or security issues, cost or other challenges.
While it is our experience that detailed planning helps firms avoid these situations, even after a thorough review, you still may run into issues that require an exit strategy.
In this case, your best chance is that the data is stored in its original, open format and accessible through standard APIs. On top of that, there should be an agreement to get access to the metadata through a database export so that you can rebuild the catalogue in a different environment if needed.
And that includes Office 365 as well. It makes perfect sense to hedge for problems on the Microsoft platform – security, data sovereignty or reduced access to the platform – by creating a safety copy on an open system under your control.
This is precisely what Expireon provides on top of all the “intelligence” – a simple mechanism of future-proofing your environment.
If you'd like to find out more about bringing cloudficiency to your project, reach out to us.
Hopefully, I was able to point out the conflicting nature of data retention and data protection requirements, and how the old approaches to storage and expiry cannot deal with this conflict. Only evaluating the content and classifying it by privacy risk and business value can solve the puzzle of, let’s say, tax law vs. privacy regulation.
Luckily, the latest advancements in document classification are helping us move full steam in that direction and provide a risk-optimized, privacy-aware platform for intelligent retention.
We call this platform “Expireon”, and you will be able to get much more information about this solution on the Cloudficient website.
It is open, scalable, intelligent and cloud-ready, and it comes in at a fraction of the per-terabyte price that classic archiving solutions charge. It is also designed to work hand in hand with different e-Discovery, forensic and cloud archive solutions by providing a “Direct-delivery” mechanism into those upstream tools.
But before you go and check it out, have another look at the key takeaways.
- Vertical regulation and data retention requirements are getting stricter and finally being audited and enforced across all major regions.
- GDPR and CCPA are putting a spanner in the works for anyone considering “archiving forever” or even a 10-year retention.
- The cost of not producing evidence can be dramatic: billions of dollars in fines or patent infringement claims.
- Wading through terabytes of email is also costly and time-consuming, and it ties up legal resources.
But there is hope!
Cloudficient Expireon defines a new solution category, combining migration with open, intelligent data retention, while integrating with the market-leading e-discovery and forensic tools. It deals with conflicting regulation and retention requirements, helps with the tagging of important data while expiring the unwanted content. And on top of all that, it can create a safety copy outside your primary system so that your next migration will be cheaper, faster, and more feasible.
I would love to hear your thoughts on the legal issues I have discussed!
Talking about legal issues: Of course, there are more angles around the retention of communications data, such as the suitability of cloud storage from a legal perspective and the difficulty of cross-border data transfer.
Stay tuned for my next blog!