Capital Markets/Financial Services Software Provider

Supporting A Trade Compliance Vendor In Their Migration From Postgres To ClickHouse

Helping a trade compliance and surveillance software vendor migrate from PostgreSQL to ClickHouse to handle multi-billion-row datasets and petabyte-scale workloads.

The Challenge

Our client is a specialist financial-services software provider focused on compliance reporting, best-execution, and trade surveillance.

The company builds and delivers a comprehensive suite of products that enable broker-dealers, exchanges, hedge funds, and money managers to meet complex regulatory obligations and gain insight into trading effectiveness.

To cope with growing customer demand and increasingly stringent regulatory requirements, they are migrating from their PostgreSQL-centric data stack to ClickHouse for their most demanding workloads.

They will eventually use ClickHouse to process billions of rows per day in a petabyte-scale environment, and will increasingly need intraday rather than batch analytics.

Customer Pain Points

The customer was experiencing the following challenges prior to the engagement:

  • Scalability concerns about whether PostgreSQL could support growing data volumes and increasingly complex query use cases.
  • Emerging but limited skills and bandwidth for operating and scaling an open source ClickHouse cluster.
  • An outstanding need to validate ClickHouse performance and scalability with multi-billion-row datasets.
  • The need to tune and optimise the cluster for performance when ingesting and querying large datasets hosted on AWS S3.
  • The need to review their cluster against best practices for performance, scalability and resilience.

Our Technical Approach

We took the following approach to this project:

  • Reviewed the customer's setup against best practices spanning security, scalability, performance and operability, making over 40 tuning recommendations as a result.
  • Carried out performance and load testing against the client's ClickHouse cluster. This involved generating representative billion-row datasets for key tables, then load testing with representative queries.
  • Identified slow or memory-intensive queries and optimised the solution, rewriting queries for performance and introducing projections and targeted secondary indexes.
  • Supported the customer in implementing our recommendations and benchmarking the results.
  • Upskilled client DBAs on the usage and operation of ClickHouse.
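The kind of optimisation described above can be sketched in ClickHouse SQL. The table, column and projection names below are illustrative assumptions, not the client's actual schema:

```sql
-- Hypothetical trades table; all names are illustrative.
CREATE TABLE trades
(
    trade_date Date,
    trade_time DateTime64(3),
    symbol     LowCardinality(String),
    account_id UInt64,
    price      Decimal64(4),
    quantity   UInt32,
    -- Data-skipping (secondary) index to accelerate filters on account_id,
    -- which is not part of the primary sort key below.
    INDEX idx_account account_id TYPE bloom_filter GRANULARITY 4
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(trade_date)
ORDER BY (symbol, trade_time);

-- Projection storing the same rows pre-sorted by account, so surveillance
-- queries that filter on account rather than symbol avoid full scans.
ALTER TABLE trades
    ADD PROJECTION by_account
    (SELECT * ORDER BY (account_id, trade_time));

-- Build the projection for data already in the table.
ALTER TABLE trades MATERIALIZE PROJECTION by_account;
```

ClickHouse then chooses automatically, per query, whether the base ordering, the projection or the skipping index gives the cheapest read path.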

Outcomes

Key outcomes of the project included:

  • Made over 40 recommendations for tuning and optimising their ClickHouse environment.
  • Built confidence in ClickHouse's ability to scale to multi-billion-row and petabyte-scale workloads.
  • Enhanced the ClickHouse schema, including the use of projections, materialised views and secondary indexes.
  • Optimised performance at both ingestion and query time, including rewriting a number of slow or memory-intensive queries.
  • Upskilled customer engineers on ClickHouse operations, schema design and usage.
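One way the materialised views mentioned above support intraday analytics is by rolling raw events up into pre-aggregated tables at insert time. The sketch below assumes a raw table named trades with symbol, trade_time and quantity columns; all names are illustrative:

```sql
-- Hypothetical target table holding one partially-aggregated
-- row per symbol per minute.
CREATE TABLE trade_stats_1m
(
    minute      DateTime,
    symbol      LowCardinality(String),
    trade_count AggregateFunction(count),
    total_qty   AggregateFunction(sum, UInt32)
)
ENGINE = AggregatingMergeTree
ORDER BY (symbol, minute);

-- Materialised view that populates the aggregate table
-- incrementally as new rows are inserted into trades.
CREATE MATERIALIZED VIEW trade_stats_1m_mv TO trade_stats_1m AS
SELECT
    toStartOfMinute(trade_time) AS minute,
    symbol,
    countState()                AS trade_count,
    sumState(quantity)          AS total_qty
FROM trades
GROUP BY minute, symbol;
```

Intraday dashboards can then query the small aggregate table with countMerge and sumMerge instead of scanning the raw billion-row table on every refresh.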
CASE_ID: capital-markets-compliance