HEMA, a multinational variety retailer chain headquartered in Amsterdam, has partnered with Xomnia to replace its legacy, on-premise data processing and service system with a modern, reliable, and efficient system that can store, process, and export business data in the AWS Cloud.
The new platform, known as the Cloud Data Analytics Platform (CDAP), is a scalable, efficient, and more adaptable platform that can meet almost all the business needs across multiple departments within the retailer chain. It is almost fully built in Kubernetes using open-source and AWS PaaS (Platform as a Service) tools. It also incorporates components that are built, installed, and configured from scratch by its data engineering team.
To guarantee a swift transition from the legacy system, Xomnia helped the dedicated team at HEMA migrate the necessary reports, code, and data from the company’s legacy system into its new system.
"HEMA wants to improve the daily lives of its customers. To support decision making by people and processes, our team democratizes data and insights available in our cloud data and analytics platform. Xomnia plays a crucial role in this migration." - Bas Karsemeijer, Head of Data & Analytics at HEMA
As data and AI have become increasingly central to HEMA’s operations, the company has relied on a system that collects and processes data for a range of tasks. It is an on-premise cloud infrastructure where data is collected and reports are processed for use by the sales, stock, and managerial teams, among others.
This legacy data system, which performs tasks such as tracking revenue from online and offline sales and performing stock and promotion forecasting, is no longer able to scale up to HEMA’s needs and different data sources. For this reason, the retailer chain created a dedicated team to build a new platform that is more adequate, cost-efficient, and able to accommodate data coming from multiple sources.
Xomnia’s data engineer Daniel Galea is part of the team working to build, install, and configure the new system. Besides building the new system, Daniel’s role is also to migrate the existing data, code, and technology from the legacy system to the new one. He also acts as an intermediary between the development team and the business team, clearly communicating the interests of the business, such as the specific reports and datasets that it requires.
The solution, called the Cloud Data Analytics Platform (CDAP), can meet almost all the business needs at HEMA, such as ingesting, processing, aggregating/combining, and exposing/serving different datasets. Because migrating the data and data sources of the several departments that have used the legacy system for years is a time-consuming process, the solution is expected to be fully built by the end of 2021.
On the technical side, the CDAP is built in the AWS cloud using the open-source system Kubernetes. In Kubernetes, the data engineers are installing tools like Spark for processing, Airflow for orchestration, and Prometheus/Grafana for monitoring. The data engineering team is building, installing, and configuring these tools.
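To make the orchestration role more concrete, here is a minimal sketch in plain Python of how an orchestrator decides the order in which pipeline stages run. It is an illustration only: it does not use Airflow's API and is not HEMA's code. The stage names are hypothetical, borrowed from the responsibilities the platform is described as covering, and the dependency resolution relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical stage names for illustration. Each key maps to the set of
# stages that must finish before that stage is allowed to start.
PIPELINE = {
    "ingest": set(),
    "process": {"ingest"},
    "aggregate": {"process"},
    "serve": {"aggregate"},
}


def run_order(dependencies):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for stage in run_order(PIPELINE):
        print(f"running stage: {stage}")
```

An orchestration tool adds scheduling, retries, and monitoring on top of this basic idea of running tasks in dependency order.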
The new system is already partially in use. When fully operational, it will enable HEMA to run analytical dashboards and machine learning features, and to perform tasks that need to run around the clock. The system will also efficiently accommodate new sources of data, and accept APIs for multiple purposes.
By developing this data processing and service system, which runs almost entirely on AWS-managed Kubernetes, HEMA aims to achieve four impacts:
- enhancing their customer experience on a day-to-day basis, for instance by enabling their staff to make informed decisions in real time, personalizing omnichannel marketing communications via data science models, and optimizing the assortments of their local stores based on holistic analytical insights and full operational processes, like forecasting and replenishment,
- saving costs, since it is less expensive to run the current system than to use all the PaaS tools from AWS,
- being able to easily and quickly migrate to any other cloud platform in the future, should the company wish to do so, as a result of installing open source tools on Kubernetes, and making all the data processing happen on Spark, without coupling the data ingestion process to PaaS tools from AWS, and
- achieving a more scalable system, i.e. capable of keeping up with bigger or more datasets without taking a longer time to run.
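
The portability point in the list above comes down to keeping processing logic free of provider-specific calls. The sketch below illustrates that idea in plain Python rather than Spark, and it is not HEMA's implementation. The `Storage` interface, the `InMemoryStorage` backend, and the `revenue_per_store` step are all hypothetical names introduced for this example.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class Storage(ABC):
    """Hypothetical storage interface: the only layer that knows the provider."""

    @abstractmethod
    def read(self, key):
        """Return the rows stored under key."""

    @abstractmethod
    def write(self, key, rows):
        """Store rows under key."""


class InMemoryStorage(Storage):
    """Stand-in backend for this sketch; a cloud-specific backend would
    implement the same two methods."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return list(self._data[key])

    def write(self, key, rows):
        self._data[key] = list(rows)


def revenue_per_store(storage, source_key, target_key):
    """Provider-agnostic step: sum revenue per store and write the result."""
    totals = defaultdict(float)
    for row in storage.read(source_key):
        totals[row["store"]] += row["revenue"]
    result = [{"store": s, "revenue": t} for s, t in sorted(totals.items())]
    storage.write(target_key, result)
    return result


if __name__ == "__main__":
    storage = InMemoryStorage()
    storage.write("sales", [
        {"store": "Amsterdam", "revenue": 10.0},
        {"store": "Utrecht", "revenue": 7.0},
        {"store": "Amsterdam", "revenue": 5.0},
    ])
    print(revenue_per_store(storage, "sales", "revenue_by_store"))
```

Because `revenue_per_store` only talks to the `Storage` interface, moving to another provider would mean writing one new backend class while the processing step stays unchanged.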