BACK_TO_ARCHIVE
Manufacturing & Retail/Premium Food Manufacturer

Helping A Food Manufacturer And Retailer Deliver A Real Time Data Warehouse Using ClickHouse

Migrating 20 years of sales data from a legacy Oracle data warehouse to ClickHouse, resolving performance and memory issues while accelerating business intelligence.

The Challenge

Our client is a premium food manufacturer who sell their products through tens of thousands of retail outlets across the United States.

They are migrating from a legacy Oracle data warehouse and replacing it with a real-time data warehouse based on ClickHouse to support use cases across their business, including sales analysis, logistics and manufacturing.

They have over 20 years worth of sales data in their legacy warehouse and are looking to migrate and analyze this in more effective ways for insights and business intelligence.

Their key challenge at the outset related to the data volumes and the complexity of the schema. This was leading to slow performance and out of memory exceptions in their analytics work. After resolving this, the emphasis was on accelerating the migration and expanding use cases in the new warehouse.

Customer Pain Points

The customer were experiencing the following challenges prior to the engagement:

  • Queries were taking a long time to run and experiencing out of memory exceptions due to the amount of data involved and the complexity of joins and aggregations.
  • Datasets needed to be better structured to allow downstream data analysts to access them in a more convenient format.
  • There was a desire to accelerate and scale the use of the ClickHouse data warehouse to allow them to decommission legacy technology. However, they were struggling with a lack of resource and expertise.

Our Technical Approach

We took the following approach to this project:

  • Identified the root causes of slow queries and out of memory exceptions.
  • Implemented a range of fixes including schema optimisation, query settings, and query optimisation.
  • Developed a range of datasets to accelerate the business intelligence and analytics programmes. These were built with ClickHouse best practices in mind for performance and memory usage.
  • Worked closely with client employees to upskill them on ClickHouse internals and approaches to query efficiency.
  • Implemented a solution based on refreshable materialised views to better encode dependencies between views to increase the reliability of ETL processes.

Outcomes

Key outcomes of the project included:

  • Performance and out of memory issues resolved.
  • More predictable and reliable ETL jobs over 300+ million row datasets.
  • Accelerated the business intelligence programme into new areas, helping to accelerate the date when the legacy warehouse can be fully decommissioned.
  • Upskilled customer data analysts on working with ClickHouse.
CASE_ID: manufacturing-retailRETURN_TO_INDEX