{"id":3185,"date":"2021-02-21T22:23:44","date_gmt":"2021-02-21T21:23:44","guid":{"rendered":"https:\/\/kudzia.eu\/b\/?p=3185"},"modified":"2026-03-26T09:10:44","modified_gmt":"2026-03-26T08:10:44","slug":"clickhouse","status":"publish","type":"post","link":"https:\/\/kudzia.eu\/b\/2021\/02\/clickhouse\/","title":{"rendered":"clickhouse"},"content":{"rendered":"<p><a href=\"https:\/\/clickhouse.tech\/\">clickhouse<\/a> is a column oriented OLAP database. i&#8217;ve started using  it about half a year ago. i&#8217;m impressed. earlier i&#8217;ve read about it on <a href=\"https:\/\/www.percona.com\/blog\/\">percona&#8217;s blog<\/a> but did not fully grasp how performant it was. i&#8217;ve tried it when i wrestling with MariaDB&#8217;s query planner on table with ~100M rows got me tired and each query took minutes to finish. clickhosue still feels like magic &#8211; the data size went down from tens of GB to few GB, query execution time &#8211; from minutes to low seconds.<\/p>\n<p>some useful queries:<\/p>\n<p>get size of each of tables&#8217; columns:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\nSELECT name,type,formatReadableSize(data_compressed_bytes) AS size_compressed, formatReadableSize(data_uncompressed_bytes) AS size_raw, floor(data_uncompressed_bytes\/data_compressed_bytes,2) as ratio, compression_codec from system.columns where table =&#039;t&#039;;\r\n<\/pre>\n<p>get on-disk size of all tables [<a href=\"https:\/\/zaiste.net\/databases\/clickhouse\/howtos\/howto-get-table-size-clickhouse\/\">src<\/a>]:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\nselect concat(database, &#039;.&#039;, table)                         as table,\r\n       formatReadableSize(sum(bytes))                       as size,\r\n       sum(rows)                                            as rows,\r\n       max(modification_time)                               as latest_modification,\r\n       sum(bytes)                                           as bytes_size,\r\n       any(engine)                                          as engine,\r\n       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size\r\nfrom system.parts\r\nwhere active\r\ngroup by database, table\r\norder by bytes_size desc;\r\n<\/pre>\n<p>variation of above, including information about TTL, where set:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\nwith details as ( select concat(database, &#039;.&#039;, table)                         as table,\r\n       formatReadableSize(sum(bytes))                       as size,\r\n       sum(rows)                                            as rows,\r\n       max(modification_time)                               as latest_modification,\r\n       sum(bytes)                                           as bytes_size,\r\n       any(engine)                                          as engine,\r\n       formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size, formatReadableSize(sum(bytes_on_disk)) as bytes_on_disk, formatReadableSize(sum(data_uncompressed_bytes)) as data_uncompressed_bytes\r\nfrom system.parts\r\nwhere active and database!=&#039;system&#039;\r\ngroup by database, table\r\norder by bytes_size desc) select details.*,extract(tables.engine_full,&#039;.*(TTL.*)SETTINGS.*&#039; ) as ttl_cleanup from details join system.tables on concat(tables.database,&#039;.&#039;,tables.name)=details.table;\r\n<\/pre>\n<p>another take [ <a href=\"https:\/\/gist.github.com\/sanchezzzhak\/511fd140e8809857f8f1d84ddb937015#gistcomment-3489966\">src<\/a> ]<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\nselect parts.*,\r\n       columns.compressed_size,\r\n       columns.uncompressed_size,\r\n       columns.ratio\r\nfrom (\r\n         select table,\r\n                formatReadableSize(sum(data_uncompressed_bytes))          AS uncompressed_size,\r\n                formatReadableSize(sum(data_compressed_bytes))            AS compressed_size,\r\n                sum(data_compressed_bytes) \/ sum(data_uncompressed_bytes) AS ratio\r\n         from system.columns\r\n         where database = currentDatabase()\r\n         group by table\r\n         ) columns\r\n         right join (\r\n    select table,\r\n           sum(rows)                                            as rows,\r\n           max(modification_time)                               as latest_modification,\r\n           formatReadableSize(sum(bytes))                       as disk_size,\r\n           formatReadableSize(sum(primary_key_bytes_in_memory)) as primary_keys_size,\r\n           any(engine)                                          as engine,\r\n           sum(bytes)                                           as bytes_size\r\n    from system.parts\r\n    where active and database = currentDatabase()\r\n    group by database, table\r\n    ) parts on columns.table = parts.table\r\norder by parts.bytes_size desc;\r\n<\/pre>\n<p>things to adjust after the installation, based on https:\/\/kb.altinity.com\/altinity-kb-setup-and-maintenance\/altinity-kb-settings-to-adjust\/ , https:\/\/kb.altinity.com\/altinity-kb-setup-and-maintenance\/altinity-kb-system-tables-eat-my-disk\/<\/p>\n<p>prevent never-ending growth of few stats\/logs tables:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\nALTER TABLE system.asynchronous_metric_log MODIFY TTL  event_date + INTERVAL 2 week;\r\nALTER TABLE system.query_thread_log MODIFY TTL event_date + INTERVAL 2 week;\r\nALTER TABLE system.trace_log MODIFY TTL event_date + INTERVAL 1 week;\r\nALTER TABLE system.metric_log MODIFY TTL event_date + INTERVAL 1 week;\r\nALTER TABLE system.query_log MODIFY TTL event_date + INTERVAL 1 week;\r\n<\/pre>\n<p>looks like above has to be done after each clickhouse-server upgrade. also note &#8211; disk space is not released immediately after cleanup, you have to wait ~ 8 minutes for that to happen &#8211; see <a href=\"https:\/\/github.com\/ClickHouse\/ClickHouse\/issues\/5486\">here<\/a>.<\/p>\n<p>dumping and restoring a table:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n-- at the source\r\nSELECT * FROM tree_path INTO OUTFILE &#039;\/var\/lib\/clickhouse\/user_files\/tree_path.native&#039; FORMAT Native;\r\n\r\n-- at the destination\r\nCREATE TABLE tree_path .... ;\r\n\r\nINSERT INTO tree_path FROM INFILE &#039;\/var\/lib\/clickhouse\/user_files\/tree_path.native&#039; FORMAT Native;\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>clickhouse is a column oriented OLAP database. i&#8217;ve started using it about half a year ago. i&#8217;m impressed. earlier i&#8217;ve read about it on percona&#8217;s blog but did not fully grasp how performant it was. i&#8217;ve tried it when i wrestling with MariaDB&#8217;s query planner on table with ~100M rows got me tired and each [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1,51],"tags":[112],"class_list":["post-3185","post","type-post","status-publish","format-standard","hentry","category-uncategorized","category-unimportant","tag-clickhouse"],"_links":{"self":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3185","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/comments?post=3185"}],"version-history":[{"count":12,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3185\/revisions"}],"predecessor-version":[{"id":3890,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3185\/revisions\/3890"}],"wp:attachment":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/media?parent=3185"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/categories?post=3185"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/tags?post=3185"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}