Randstad matches vacancies with the best candidates using a robust and scalable data pipeline

Randstad, a Dutch multinational human resource consulting firm, has teamed up with Xomnia to optimize and scale its lead generation proof of concept and turn it into a robust and scalable data pipeline that can handle Randstad’s customer base.

Using this data pipeline, Randstad will be able to create suitable matches between vacancies and candidates based on a wide range of criteria, more reliably and at scale.

"We had a very successful experiment on lead generating solution, but the still experimental technique held us down from further scaling the solution. Together with the help of Xomnia we were able to refactor the solution in a high scalable modular solution that is scaled all over the Dutch branches." - Anne Reuver, Principal ICT Manager

Challenge

Randstad works on both sides of the recruitment funnel, combining data from vacancies that various client organizations publish with data from its pool of available candidates. A dedicated team, operating within the marketing intelligence department at Randstad, worked on a solution that creates the most suitable matches between vacancies and candidates, based on a variety of criteria such as work experience, home-to-work distance, educational requirements and many more.

Having completed a very successful Proof of Concept (POC) that focused on automatically emailing a small range of clients with the ‘hottest’ matches, the team was ready for the next level: transforming the POC into a robust, scalable data pipeline that can handle Randstad’s customer base with an easy-to-extend modular approach. For this reason, Randstad teamed up with Xomnia to make use of our data engineering expertise.

Solution

The data pipeline processes vacancy and candidate data to create the most suitable matches between vacancies and candidates based on a variety of criteria. These criteria include work experience, home-to-work distance, educational requirements and many more.
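As a rough illustration of matching on these criteria, the sketch below scores candidates against a vacancy. It is not Randstad's matching logic: the `Candidate` and `Vacancy` fields, the ordinal education scale, the hard cut-offs, and the equal weights are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Candidate:
    name: str
    years_experience: float
    education_level: int  # hypothetical ordinal scale, higher = more education
    lat: float
    lon: float


@dataclass
class Vacancy:
    title: str
    min_years_experience: float
    min_education_level: int
    lat: float
    lon: float
    max_distance_km: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def match_score(candidate, vacancy):
    """Return a score in [0, 1]; 0 means a hard requirement is not met."""
    distance = haversine_km(candidate.lat, candidate.lon, vacancy.lat, vacancy.lon)
    # Hard requirements (assumed): commute limit and minimum education.
    if distance > vacancy.max_distance_km:
        return 0.0
    if candidate.education_level < vacancy.min_education_level:
        return 0.0
    # Experience earns full marks once the minimum is reached (assumed rule).
    if vacancy.min_years_experience > 0:
        experience = min(candidate.years_experience / vacancy.min_years_experience, 1.0)
    else:
        experience = 1.0
    # Closer candidates score higher; guard against a zero commute limit.
    if vacancy.max_distance_km > 0:
        proximity = 1.0 - distance / vacancy.max_distance_km
    else:
        proximity = 1.0
    # Equal weights are an arbitrary illustrative choice.
    return 0.5 * experience + 0.5 * proximity


def rank_candidates(candidates, vacancy):
    """Return candidates with a non-zero score, best match first."""
    scored = [(match_score(c, vacancy), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score > 0]
```

Calling `rank_candidates(candidates, vacancy)` returns only the candidates that meet the hard requirements, ordered from the strongest match to the weakest.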

Technically, the pipeline consists of several components that are split into modules (data ingestion, preprocessing, matching, an ML model that chooses which client contact to email, and so on). Each module runs in its own Docker container that is deployed on AWS Batch, a service that spins up managed compute clusters based on the requested capacity and scales them down to zero when no resources are needed.
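To make the "one container per module" idea concrete, here is a minimal sketch of how a single module run could be submitted to AWS Batch with boto3's `submit_job` call. The queue name, job definition names, and environment variables are hypothetical; the actual naming used in the Randstad pipeline is not described in this case study.

```python
def build_batch_job_request(module, run_date, environment="development"):
    """Build the keyword arguments for an AWS Batch SubmitJob call."""
    return {
        "jobName": f"{module}-{run_date}",
        # Hypothetical naming scheme: one queue and one job definition
        # per module and per environment.
        "jobQueue": f"matching-pipeline-{environment}",
        "jobDefinition": f"{module}-{environment}",
        "containerOverrides": {
            # Values passed into the module's container at run time.
            "environment": [
                {"name": "RUN_DATE", "value": run_date},
                {"name": "PIPELINE_ENV", "value": environment},
            ]
        },
    }


def submit_module(module, run_date, environment="development"):
    """Submit one pipeline module as an AWS Batch job and return its job ID."""
    # Imported lazily so the request builder can be used without boto3 installed.
    import boto3

    client = boto3.client("batch")
    response = client.submit_job(**build_batch_job_request(module, run_date, environment))
    return response["jobId"]
```

Keeping the request builder separate from the submission call means the job parameters can be checked in unit tests without AWS credentials.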

Airflow is used as the orchestration system, ingesting data from Randstad and external sources and triggering the various AWS Batch jobs. Finally, a lot of work went into CI/CD and environment isolation (development/acceptance/production) for the various databases and AWS components in use.
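The core job of an orchestrator is to trigger each step only after the steps it depends on have run. The sketch below shows that idea with Python's standard-library `graphlib` rather than Airflow's own API. The step names follow the modules mentioned above, while the dependency edges and the `trigger` callback are assumptions for illustration.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
# These edges are assumed; the real pipeline's graph is not described here.
DEPENDENCIES = {
    "ingestion": set(),
    "preprocessing": {"ingestion"},
    "matching": {"preprocessing"},
    "contact_selection": {"matching"},
}


def run_pipeline(trigger, dependencies=DEPENDENCIES):
    """Call trigger(step) for every step, upstream steps first."""
    executed = []
    for step in TopologicalSorter(dependencies).static_order():
        trigger(step)  # e.g. submit the step's container job and wait for it
        executed.append(step)
    return executed
```

For example, `run_pipeline(print)` prints the step names in dependency order. In a real deployment, the orchestrator also handles scheduling, retries, and monitoring, which this sketch leaves out.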

Impact

With the aid of the software engineering practices for development, deployment and monitoring provided by Xomnia, in addition to the proper use of AWS services, Randstad was able to turn the POC into an optimized data pipeline. The recruitment company is ready to scale the pipeline across all its Dutch branches and create more matches between employers and job-seeking candidates, with confidence that everything runs smoothly.

Moreover, the modular approach, the CI/CD pipelines, and the proper environment isolation make it possible for the data scientists and engineers to easily add and test new features, forming a development flow with increased speed and confidence. In addition, the containerized approach together with AWS Batch gives the development team the chance to combine flexibility and scalability with minimum maintenance, while keeping the costs as low as possible.

This way, Randstad is ready to scale to its full customer base and can extend the product with new components with more confidence in the future. The pipeline has been in production for some time now, giving the team the chance to shift focus to new projects. Stay tuned!