How To Evaluate The TCO of ClickHouse

Benjamin Wootton
2026-06-23
5 min read

As more enterprises look to implement ClickHouse, there is an increasing number of questions about total cost of ownership. People want to understand how much the platform will cost to run today, but also how that cost will grow over time as we retain more data and use it for more use cases in the business.

Estimating the total cost of a ClickHouse cluster is, however, a fairly complex affair involving lots of variables. If you run it as an open source service, there are a number of components to model: you have to think about compute, storage and manpower.

Open Source vs Cloud vs BYOC

One of the strengths of ClickHouse is that it's an open source product. This means that it can be downloaded and run for free in your own data centre or hosting environment.

However, all open source is free "as in puppies", in that it requires work to host, to manage and to upgrade. This requires real engineering time from people with specialist skills.

Compute

The first thing we are going to need is some servers to run our ClickHouse environment. This includes the ClickHouse server nodes, but also some of the supporting cast, such as the ClickHouse Keeper nodes, which should ideally run on different hardware than the servers, and potentially a load balancer in front to route queries.

It's important to model all of your environments. You may have dev, test and production environments, and developers may also want their own environments to work in.
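One way to keep track of this is to list every environment and its node counts explicitly. The sketch below is illustrative only: the environment sizes and the monthly price per node type are hypothetical placeholders, not sizing recommendations or real quotes.

```python
from dataclasses import dataclass

# Hypothetical monthly prices per node type (placeholders, not real quotes).
SERVER_NODE_MONTHLY = 800.0
KEEPER_NODE_MONTHLY = 100.0
LOAD_BALANCER_MONTHLY = 50.0


@dataclass
class Environment:
    name: str
    server_nodes: int
    keeper_nodes: int = 0
    load_balancers: int = 0

    def monthly_compute_cost(self) -> float:
        """Sum the cost of every node type in this environment."""
        return (
            self.server_nodes * SERVER_NODE_MONTHLY
            + self.keeper_nodes * KEEPER_NODE_MONTHLY
            + self.load_balancers * LOAD_BALANCER_MONTHLY
        )


# Assumed sizes for illustration; replace with your own topology.
environments = [
    Environment("dev", server_nodes=1),
    Environment("test", server_nodes=2, keeper_nodes=3),
    Environment("production", server_nodes=4, keeper_nodes=3, load_balancers=1),
]

total_monthly_compute = sum(env.monthly_compute_cost() for env in environments)

for env in environments:
    print(f"{env.name}: {env.monthly_compute_cost():.2f} per month")
print(f"total: {total_monthly_compute:.2f} per month")
```

Listing the Keeper nodes and load balancer separately makes it harder to forget the supporting cast when the non-production environments are added up.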

The big delta between open source and ClickHouse Cloud or BYOC is that open source is not as dynamic in its ability to scale up and down. You typically design your cluster and it stays at a relatively fixed size. If I want to remove shards or add new shards, that is a manual activity where I create the instance, move a subset of the data over, and so on. In the cloud, we can autoscale much more effectively. If ClickHouse isn't in use, it can scale down to zero, and if we need more capacity, we can easily scale horizontally or automatically scale up vertically.

A challenge from a TCO modeling perspective is that we don't quite know the utilization figure in the cloud. If the system is used 40% of the time, that implies one cost, whereas if it's used 80% of the time, the cost could be significantly different, and utilization can change over the course of a week. We could take a naive average, but we might find that we have peaks at certain points of the day, and that the system could potentially be turned off overnight or over a weekend. So we have to model that usage profile accurately. ClickHouse open source, on the other hand, is static. We don't really have that switch-off or scale-down capability at all, and we're effectively paying for the service 24/7.
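To make the usage profile concrete, it can be modeled hour by hour across a week rather than as a single average. The sketch below is a simplification under stated assumptions: the hourly rate is a hypothetical placeholder, the weekday pattern is invented for illustration, and billing is assumed to scale linearly with the capacity in use and to cost nothing when scaled to zero, which may not match how a given cloud service actually bills.

```python
HOURS_PER_WEEK = 24 * 7

# Hypothetical hourly price of the cluster at full size (placeholder).
PEAK_HOURLY_RATE = 2.0


def build_weekly_profile() -> list[float]:
    """Return 168 hourly values: fraction of peak capacity billed in each hour.

    Assumed pattern: weekdays run at full size 09:00-17:00, at half size for
    two hours either side, and are scaled to zero otherwise. Weekends are off.
    """
    profile = []
    for day in range(7):
        for hour in range(24):
            if day >= 5:  # Saturday and Sunday
                profile.append(0.0)
            elif 9 <= hour < 17:  # core working hours
                profile.append(1.0)
            elif 7 <= hour < 9 or 17 <= hour < 19:  # shoulder hours
                profile.append(0.5)
            else:  # overnight
                profile.append(0.0)
    return profile


def weekly_cost(profile: list[float], peak_hourly_rate: float) -> float:
    """Cost if billing scales linearly with the capacity in use each hour."""
    return sum(fraction * peak_hourly_rate for fraction in profile)


profile = build_weekly_profile()
autoscaled = weekly_cost(profile, PEAK_HOURLY_RATE)
# A static open source cluster is modeled as full size for every hour.
always_on = weekly_cost([1.0] * HOURS_PER_WEEK, PEAK_HOURLY_RATE)

print(f"autoscaled weekly cost: {autoscaled:.2f}")
print(f"always-on weekly cost:  {always_on:.2f}")
print(f"average utilization:    {sum(profile) / HOURS_PER_WEEK:.0%}")
```

Keeping the profile as an explicit list of hours makes it easy to swap in measured query activity later, instead of the invented pattern used here.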

Storage

Next up we need to estimate our storage volumes.

Firstly, ClickHouse compresses data really well. I frequently see compression ratios of more than 70% on stored data. The post-compression data sizes are the ones which should be plugged into the TCO calculation.
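As a rough sketch of that adjustment, the function below converts a raw data volume into the compressed volume to cost. It assumes that "70% compression" means the stored size is 70% smaller than the raw size; the 10 TB input is a made-up example, and the real ratio depends heavily on your schema and data.

```python
def compressed_size_gb(raw_gb: float, reduction_pct: float) -> float:
    """Estimate on-disk size after compression.

    reduction_pct is the percentage by which compression shrinks the data,
    e.g. 70 means the stored data is 30% of its raw size (an assumption
    about how the ratio is quoted).
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in the range [0, 100)")
    return raw_gb * (100 - reduction_pct) / 100


raw_gb = 10_000  # hypothetical raw volume of 10 TB
stored_gb = compressed_size_gb(raw_gb, reduction_pct=70)
print(f"{raw_gb} GB raw is roughly {stored_gb:.0f} GB stored")
```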

Secondly, we may store data in different places. In ClickHouse Cloud, most of our data will be stored on object storage and charged at a flat rate. In open source, we could make use of tiered storage, keeping our hot data on SSDs and our cold data on hard disks, while our long tail of data could be pushed out to an S3 object store.
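The difference between a flat object storage rate and a tiered layout can be sketched as a weighted sum. Every number below is a hypothetical placeholder: the prices per GB per month, the share of data on each tier, and the 3,000 GB compressed volume are chosen only to show the arithmetic.

```python
# Hypothetical prices per GB per month for each tier (placeholders).
TIER_PRICES = {"ssd": 0.10, "hdd": 0.04, "object": 0.02}


def tiered_monthly_cost(
    compressed_gb: float,
    tier_share: dict[str, float],
    tier_prices: dict[str, float],
) -> float:
    """Cost of storing data split across tiers; shares must sum to 1."""
    if abs(sum(tier_share.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    return sum(
        compressed_gb * share * tier_prices[tier]
        for tier, share in tier_share.items()
    )


compressed_gb = 3000  # hypothetical post-compression volume

# Assumed split: 10% hot on SSD, 30% cold on hard disk, 60% long tail on S3.
tiered = tiered_monthly_cost(
    compressed_gb, {"ssd": 0.1, "hdd": 0.3, "object": 0.6}, TIER_PRICES
)
# Flat rate: everything on object storage.
flat = tiered_monthly_cost(compressed_gb, {"object": 1.0}, TIER_PRICES)

print(f"tiered: {tiered:.2f} per month")
print(f"flat:   {flat:.2f} per month")
```

This only covers the storage line item. In a self-hosted tiered setup, the disks also need to be provisioned and managed, which feeds back into the compute and manpower estimates.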

Retention

Data retention is an important lever. Many companies with a lot of data try to minimize the amount that is retained. What you'll often find is that more recent data is much more valuable, while very old data is barely used or barely consulted. So we have to decide where to cut off: do we only keep 6 months of data, 12 months, or 24? This can have a big impact on storage costs.
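The effect of the cut-off can be estimated by multiplying daily ingest by the retention window. This is a minimal sketch assuming a steady ingest rate, 30-day months, and a single flat storage price; the 10 GB per day and the price per GB are hypothetical placeholders.

```python
# Hypothetical flat storage price per GB per month (placeholder).
PRICE_PER_GB_MONTH = 0.02


def retained_storage_gb(
    daily_compressed_gb: float, retention_months: int, days_per_month: int = 30
) -> float:
    """Steady-state stored volume once the retention window has filled up."""
    return daily_compressed_gb * retention_months * days_per_month


def monthly_storage_cost(
    daily_compressed_gb: float, retention_months: int, price_per_gb_month: float
) -> float:
    """Monthly storage bill at steady state for a given retention window."""
    return retained_storage_gb(daily_compressed_gb, retention_months) * price_per_gb_month


daily_gb = 10  # hypothetical compressed ingest per day

for months in (6, 12, 24):
    stored = retained_storage_gb(daily_gb, months)
    cost = monthly_storage_cost(daily_gb, months, PRICE_PER_GB_MONTH)
    print(f"{months:>2} months retained: {stored:.0f} GB, {cost:.2f} per month")
```

Under these assumptions the storage bill scales linearly with the retention window, so moving from 6 to 24 months quadruples it.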

Supporting Tooling

ClickHouse doesn't exist in a vacuum, and there are likely to be other tools which need to interact with it. If we are using ClickHouse open source, we may, for instance, want some monitoring and logging to ensure that the environment remains healthy. If we are using ClickHouse Cloud, then some of that is included: there are embedded dashboards, query insights and logs within the instance. There may also be tooling for ETL, and some of that might have a cost associated. If we're loading data from disk, from APIs or from Kafka, then we need that additional tooling. In ClickHouse Cloud, some of that is included through ClickPipes.

Written by

Benjamin Wootton

Independent Consultant - ClickHouse

Benjamin Wootton is an independent ClickHouse consultant. He helps businesses deploy ClickHouse open source and ClickHouse Cloud, build solutions on top of ClickHouse for real-time analytics, observability and AI, and resolve performance and reliability issues with their existing deployments.
