ClickHouse is designed to use system resources aggressively. When it runs a large select over a lot of data, or a large batch insert, it will allocate the memory it needs without hesitation. If a query consumes too much memory, either at the query level or at the system level, it will be cancelled.
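For reference, the per-query ceiling comes from the max_memory_usage setting (max_server_memory_usage plays the equivalent role at the system level). As a minimal illustration, this caps a single query at 1 GB; if it needs more, ClickHouse cancels it with a MEMORY_LIMIT_EXCEEDED error:

SELECT count()
FROM uk.uk_price_paid
SETTINGS max_memory_usage = 1000000000 -- limit in bytes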
Two of the most important settings for putting the brakes on memory usage are max_threads and max_insert_threads, which apply to selects and inserts respectively. They can be applied to a query using SETTINGS like this:
INSERT INTO target_table -- placeholder table names
SELECT *
FROM source_table
SETTINGS max_threads = 1, max_insert_threads = 1
max_threads will limit the number of concurrent threads on a select query, whilst max_insert_threads will do the same on an insert query.
max_threads
First, let's consider the impact of max_threads. When ClickHouse executes a query, it can do a lot of work in parallel, such as reading from disk or filtering the data. This makes the query return results faster, but it costs more memory, because more blocks are loaded and held in memory at the same time. There is therefore a tradeoff between speed and memory usage.
If we run the following aggregation query with max_threads = 1, it takes 1.6 seconds with a peak memory usage of 389 MiB (use_query_condition_cache = 0 and min_bytes_to_use_direct_io = 1 are set so that each run bypasses caches and reads directly from disk, keeping the runs comparable):
SELECT
postcode1,
postcode2,
type,
avg(price) AS avgPrice
FROM uk.uk_price_paid
GROUP BY
postcode1,
postcode2,
type
ORDER BY avgPrice DESC
LIMIT 5
SETTINGS max_threads = 1, use_query_condition_cache = 0, min_bytes_to_use_direct_io = 1
Query id: 1102297e-66ac-41bd-8c79-6d9ef3f28df6
┌─postcode1─┬─postcode2─┬─type──┬──avgPrice─┐
1. │ TN23 │ 7HE │ other │ 900000000 │
2. │ BB11 │ 2EG │ other │ 558170052 │
3. │ SE1 │ 9AN │ other │ 448500000 │
4. │ E1 │ 8EP │ other │ 421364142 │
5. │ WS1 │ 1RY │ other │ 415000000 │
└───────────┴───────────┴───────┴───────────┘
5 rows in set. Elapsed: 1.604 sec. Processed 30.37 million rows, 269.92 MB (18.94 million rows/s., 168.32 MB/s.)
Peak memory usage: 389.83 MiB.
If we run the same query but increase max_threads to 2, it takes approximately half the time but uses more memory:
SELECT
postcode1,
postcode2,
type,
avg(price) AS avgPrice
FROM uk.uk_price_paid
GROUP BY
postcode1,
postcode2,
type
ORDER BY avgPrice DESC
LIMIT 5
SETTINGS max_threads = 2, use_query_condition_cache = 0, min_bytes_to_use_direct_io = 1
Query id: 4c8d3c72-c61b-43c4-9a56-ee0ee50dbdc4
┌─postcode1─┬─postcode2─┬─type──┬──avgPrice─┐
1. │ TN23 │ 7HE │ other │ 900000000 │
2. │ BB11 │ 2EG │ other │ 558170052 │
3. │ SE1 │ 9AN │ other │ 448500000 │
4. │ E1 │ 8EP │ other │ 421364142 │
5. │ WS1 │ 1RY │ other │ 415000000 │
└───────────┴───────────┴───────┴───────────┘
5 rows in set. Elapsed: 0.848 sec. Processed 30.37 million rows, 269.92 MB (35.82 million rows/s., 318.44 MB/s.)
Peak memory usage: 454.63 MiB.
Increasing to 4 threads gets us below half a second, but memory usage climbs to 710 MiB:
SELECT
postcode1,
postcode2,
type,
avg(price) AS avgPrice
FROM uk.uk_price_paid
GROUP BY
postcode1,
postcode2,
type
ORDER BY avgPrice DESC
LIMIT 5
SETTINGS max_threads = 4, use_query_condition_cache = 0, min_bytes_to_use_direct_io = 1
Query id: 77bb7f51-94b6-4fd4-83a0-570d9a4b6d87
┌─postcode1─┬─postcode2─┬─type──┬──avgPrice─┐
1. │ TN23 │ 7HE │ other │ 900000000 │
2. │ BB11 │ 2EG │ other │ 558170052 │
3. │ SE1 │ 9AN │ other │ 448500000 │
4. │ E1 │ 8EP │ other │ 421364142 │
5. │ WS1 │ 1RY │ other │ 415000000 │
└───────────┴───────────┴───────┴───────────┘
5 rows in set. Elapsed: 0.494 sec. Processed 30.37 million rows, 269.92 MB (61.42 million rows/s., 545.91 MB/s.)
Peak memory usage: 710.47 MiB.
By 12 threads the query needs more than 1 GB to execute, but the benefits of parallelism for query performance have tailed off due to coordination overhead and physical hardware limits. This shows that there is a “sweet spot” for max_threads where we get the most performance for the least memory.
SELECT
postcode1,
postcode2,
type,
avg(price) AS avgPrice
FROM uk.uk_price_paid
GROUP BY
postcode1,
postcode2,
type
ORDER BY avgPrice DESC
LIMIT 5
SETTINGS max_threads = 12, use_query_condition_cache = 0, min_bytes_to_use_direct_io = 1
Query id: d5c3ab65-ec7e-493d-80ef-f9143f0fb1cd
┌─postcode1─┬─postcode2─┬─type──┬──avgPrice─┐
1. │ TN23 │ 7HE │ other │ 900000000 │
2. │ BB11 │ 2EG │ other │ 558170052 │
3. │ SE1 │ 9AN │ other │ 448500000 │
4. │ E1 │ 8EP │ other │ 421364142 │
5. │ WS1 │ 1RY │ other │ 415000000 │
└───────────┴───────────┴───────┴───────────┘
5 rows in set. Elapsed: 0.436 sec. Processed 30.37 million rows, 269.95 MB (69.58 million rows/s., 618.53 MB/s.)
Peak memory usage: 1005.82 MiB.
Given that most people use ClickHouse without specifying this setting, they could be using far more memory than needed for little performance benefit.
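One way to find the sweet spot for your own workload is to rerun a query at different max_threads values and compare the durations and peak memory that ClickHouse records in system.query_log. A minimal sketch, assuming query logging is enabled (the default):

SELECT
    Settings['max_threads'] AS threads,
    query_duration_ms,
    formatReadableSize(memory_usage) AS peak_memory
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query ILIKE '%uk_price_paid%' -- filter to the query being tuned
ORDER BY event_time DESC
LIMIT 10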
max_insert_threads
max_insert_threads has similar behaviour, but on the insert side. When we perform an insert, we can choose how much of the work to carry out in parallel. A higher max_insert_threads means more blocks are built in memory and written out concurrently.
Here we will start with a single thread writing 20 million rows:
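A sketch of such an insert, assuming a hypothetical target table uk.uk_price_paid_copy with the same schema as the source:

INSERT INTO uk.uk_price_paid_copy -- hypothetical target table
SELECT *
FROM uk.uk_price_paid
LIMIT 20000000
SETTINGS max_insert_threads = 1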
