How To Validate Data: Ensuring Accuracy and Integrity
Knowing how to validate data is essential for accuracy and integrity. Find out how Cloudficient approaches data validation in preparation for cloud...
Not that long ago, CIOs were talking with their boardroom colleagues about moving to the cloud. They used the singular, because back then it seemed like a binary choice: run your own IT, or let Amazon, Microsoft or Google run all the infrastructure for you.
There was this compelling notion that all you needed was a single provider who would give you everything you need, managed from a single pane of glass, and send you a single invoice afterwards.
Today, we know that this simplistic view of the future has not materialized for most customers – for most enterprises there is no such thing as a single public cloud deployment.
Many organizations run Microsoft’s cloud for productivity apps, AWS for mobile app and web infrastructure, and Google or DigitalOcean for software development. And there is even more complexity. Different geographies might have adopted the cloud on their own terms. Mergers and acquisitions have created separate tenants. And one should never forget about shadow IT, where a department head might have created his own AWS kingdom on his corporate credit card.
A friend of mine said lately:
“We are now beyond multi-cloud and have firmly entered the age of the messy cloud”
Many organizations are now asking themselves how they get out of this situation. Here are a few strategic thoughts and tips to get back on track.
This seems like an obvious approach, but it is easier said than done. In the world of communications archiving and record retention, the migration and consolidation of existing platforms has always been an expensive nightmare. The same is true for applications and databases.
For on-premises systems, you were lucky if you could find a migration company at all. I have witnessed firsthand how people had to pay a “ransom” to have their data extracted by a retired ex-employee of the insolvent developer, or had to run an old Notes/Domino server without a network connection, because neither export capabilities nor the required security patches were available.
Most cloud archives are not much better.
Their export features are crippled by design, preventing mass export of the data without involving the vendor’s expensive professional services team. This makes data extraction extremely expensive and therefore creates a negative business case for cross-cloud consolidation.
All of that could have been easily prevented. Any major new application should be built on open protocols, use published standards and non-proprietary file formats, and have a set of public APIs that are designed to extract the data at scale. To me, a cloud archive provider who does not have an extraction API is just a hijacker in a pin-striped suit. (Hang on, I might write another blog about this!)
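To make "extract the data at scale" a little more concrete, here is a minimal sketch of what a bulk export loop could look like against an archive that offers cursor-based paging. Everything in it is an assumption for illustration: `Page`, `export_all`, `make_fake_archive` and the cursor scheme are hypothetical names, not any vendor's actual API, and the archive is an in-memory stand-in.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class Page:
    """One page of exported items plus the cursor for the next page (hypothetical shape)."""
    items: List[bytes]
    next_cursor: Optional[str]


def export_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[bytes]:
    """Walk every page of an archive and yield each stored item exactly once."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.items
        if page.next_cursor is None:
            return  # no more pages: the export is complete
        cursor = page.next_cursor


def make_fake_archive(messages: List[bytes], page_size: int) -> Callable[[Optional[str]], Page]:
    """In-memory stand-in for a real export endpoint; the cursor is just a list offset."""
    def fetch_page(cursor: Optional[str]) -> Page:
        start = int(cursor) if cursor else 0
        end = start + page_size
        next_cursor = str(end) if end < len(messages) else None
        return Page(messages[start:end], next_cursor)
    return fetch_page


if __name__ == "__main__":
    archive = make_fake_archive([b"msg-1", b"msg-2", b"msg-3"], page_size=2)
    for item in export_all(archive):
        print(item)
```

The point of the design is that a customer can walk the entire archive page by page without opening a support ticket. A real API would add authentication, retries and parallelism on top of the same loop.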
And if your application does not fall into this category, then you should consider migrating it sooner rather than later, because the more data you have in it, the more expensive the consolidation project will be.
Read this again: Your already negative business case for migration will get even worse over time.
So, what are you waiting for!? You’re waiting for a real multi-cloud solution.
When I worked in business consulting a few years ago, I was regularly involved in product and architecture decisions regarding multi-cloud adoption, mainly for data retention and e-discovery projects.
One of my favorite lines was always:
“What are the 3 most important questions when choosing an archive platform?”
The ‘answers’:
This was an effective way to highlight the strategic, long-term implications of a decision where the life cycle of the data will likely outlive the underlying platform. Even the product I started with more than 20 years ago, Enterprise Vault, is on the tail end of its life cycle, and people still find out the hard way how expensive and cumbersome it is to extract data from a closed-source application with a proprietary file format.
But things have become much easier. Very capable foundational technologies like Kubernetes, S3 storage and Elasticsearch are open source and are used across the globe in the most demanding workloads imaginable, all connected through microservices and communication layers like Kafka.
And those core components are available across all cloud types and all providers.
At Cloudficient, we have been using these technologies so that our customers can deploy our solutions anywhere. We use open source wherever feasible and promote open interfaces and file formats.
Our approach is:
Give us your data and we store it in the most open way possible, in whatever cloud you need it.
And that’s the level of abstraction that you should expect from any strategic provider these days.
The fastest growing supermarket brands in the Western world are Aldi (Trader Joe’s) and Lidl. One of the key factors behind their success is the limited choice they offer in return for lower prices derived from streamlined logistics and economies of scale.
Having to choose among only 3 types of toilet paper might turn off some customers, but most shoppers will happily live with a limited choice of great value products.
To me, there are not enough IT leaders who think this way. In my previous international jobs, I had to deal with different local marketing platforms, some integrated with the corporate CRM, some kept separate. Then we had multiple intranet, wiki and collaboration platforms: the salespeople kept using SharePoint, the product managers had an open-source wiki, while the engineers were using Confluence. Even worse, files were stored all over the place.
And here is the rub: while there were somewhat valid reasons for each group to have its own system (need feature ‘this’, need feature ‘that’), the end result was so much worse than living with a corporate-wide solution that would cover only 90% of a certain group’s needs. What brought the whole thing down was “too much choice”, combined with very few restrictions and little end-user guidance. You simply could not find what you needed, or found it only after checking 3 or more separate systems.
So here is the strategic tip from our side: As an IT leader, you should be prepared to promote binary choices. If you want to promote the use of Slack, the best option is to disable Microsoft Teams.
If your legal team uses Nuix and Relativity to review all ESI, why would you invest in a full-featured communications archive?
By making sure that you right-size the applications to what the business really needs and promote sustainability over excessive feature lists, you can streamline the end-user experience and avoid the trap of overlapping, redundant functionality that frustrates everyone.
Be like Aldi: provide a limited number of options, and enjoy the economies of scale and the savings that come from a laser-sharp focus on a better user experience in fewer tools.
If you'd like to find out more about bringing cloudficiency to your project, reach out to us.
The next thought might seem a bit counter-intuitive: encryption can make a bad application decision even worse. I remember reading the “new features” of a just-released version of a well-known legacy archive. It said, “Now with Amazon S3 support” and “Optional encryption for stored messages.” My first thought was that they had effectively tripled the ransom and protected their market share by making it almost impossible to leave that platform before the last bit of information has gracefully expired in a decade or two or three.
It is quite simple. If your data is in the cloud, but you need a Windows box in the data center to decrypt and extract each message, your export performance can go down by as much as 80%. At that rate, your 6-month migration project takes five times as long: about two and a half years. The best advice I can give is to watch very carefully what gets written to your S3 bucket. Is it the original message (EML/MSG)? Then you can probably move it across clouds and technologies. If it is a proprietary format, then it might be better to store it on a fast local device. And only consider encryption when it is done outside the application and uses open algorithms.
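Both points can be sketched in a few lines of Python. The first function does the arithmetic: if throughput drops by 80%, the same volume takes five times as long, so 6 months becomes roughly 30. The second is a rough check of what an object in your bucket looks like, based on its first bytes. MSG files are stored in the OLE2 compound file container, which starts with a well-known 8-byte signature, while EML files are plain text that begins with a header line such as `Received:` or `From:`. The function names and the heuristic are my own assumptions for illustration. Other Office formats share the OLE2 signature, so treat a match as a hint, not proof.

```python
import re

# Signature at the start of every OLE2 compound file (the container format used by MSG).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# An RFC 5322 header field name: printable ASCII without ":" followed by a colon.
HEADER_FIELD = re.compile(rb"[!-9;-~]+:")


def stretched_duration_months(baseline_months: float, throughput_drop: float) -> float:
    """Duration of the same migration when throughput falls by the given fraction."""
    if not 0 <= throughput_drop < 1:
        raise ValueError("throughput_drop must be in [0, 1)")
    return baseline_months / (1 - throughput_drop)


def sniff_message_format(head: bytes) -> str:
    """Guess the format of a stored object from its first bytes (heuristic only)."""
    if head.startswith(OLE2_MAGIC):
        return "ole2"      # possibly an MSG file
    if HEADER_FIELD.match(head):
        return "rfc5322"   # possibly an EML file
    return "unknown"       # possibly a proprietary or encrypted container


if __name__ == "__main__":
    print(round(stretched_duration_months(6, 0.8)), "months")
    print(sniff_message_format(b"Received: from mail.example.com\r\n"))
```

In practice you would read only the first few bytes of a sample of objects in the bucket and count how many come back as `unknown`; a high share suggests the application is writing its own container format.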
With our S3Complete platform, a cloud-based storage service, we keep the encryption independent from the application that is writing the data. And that independence can save you months of migration time in the not-so-distant future.
And that is not just true for this particular use case: Consider encryption all the time, across all your applications and cloud providers. Don’t give the keys to the kingdom to someone who made a fortune by locking away your data in a virtual Hotel California, where you can check out any time you like, but you can never leave!
You might expect this blog to talk more about security, management, and analytics. I hear you, but there are 3 main reasons why these topics are only subsumed in this section.
First, all the points made above directly impact the security and manageability of your environment. Fewer platforms, open technologies, less overlap and flexible encryption have a direct effect on how sustainable your multi-cloud platform really is.
Reducing the number of your legacy applications has many advantages. Legacy, closed-source apps tend to have more security vulnerabilities. And because they require expertise in the old world, the security staff are spread too thin and have less time to focus on the target environments, so simple configuration errors remain undetected.
Having Windows machines with 10-year-old Microsoft Management Console (MMC) interfaces in a modern environment makes management and monitoring so much harder than running a “made-for-the-cloud” Kubernetes cluster.
And a modern Logstash-based analytics engine is obviously much more insightful than trying to extract something valuable from Windows event logs.
Secondly, there are many blogs and articles that talk about other aspects of multi-cloud environments, be it the development side, the database world, or containers and virtualization. Feel free to check out the tsunami of very technical and sometimes quite opinionated points of view. There is so much valuable information out there that will complement the bigger picture and answer the remaining questions you have. But you would probably not be here if your focus was on Cloud Access Security Brokers.
Which brings me to the third and final reason: Only talk about your area of expertise. Mine is in long-term information management and data retention. Some people live in the here and now of mobile app development, others have the luxury of thinking about the bright future of their v.Next cloud strategy completely disconnected from what is deployed today.
In my world, people have captured information before anyone knew what multi-cloud means or that Amazon would be used for anything other than buying books - and the FDA or another regulator still wants them to retain these records.
For those of us who worry not just about where data is processed (be it private, hybrid or public cloud) but about the data itself, its integrity, accessibility, and retention, this has hopefully given some ideas and a level of reassurance. We have made a lot of progress lately toward our goal of open, flexible, and non-proprietary platforms that ensure our data management strategies can outlive the applications we use today. I am optimistic that the worst times of bulky, slow and error-prone systems are behind us, and that there is a bright future with the new generation of cloud applications just coming to market.
The future is bright. If you want to know more about how to get there, please get in touch and we can show you a way: From migration to retention to preservation and beyond!
With unmatched next generation migration technology, Cloudficient is revolutionizing the way businesses retire legacy systems and transform their organization into the cloud. Our business constantly remains focused on client needs and creating product offerings that match them. We provide affordable services that are scalable, fast, and seamless.
If you would like to learn more about how to bring Cloudficiency to your migration project, visit our website, or contact us.