ClickHouse 删除数据或分区后为什么还能查询到?

在删除清理clickhouse系统日志数据时,执行完清理操作后,数据并没有立即删除。

清理操作如下:

1
2
3
4
5
alter table system.query_log_0 drop partition '202302';

alter table system.query_log_0 delete where event_date < '2023-03-01';

alter table system.trace_log_0 modify ttl event_date + toIntervalDay(30);

操作完成后,查询日志大小和分区,日志大小并没有减少,如下:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
SELECT sum(rows) AS `总行数`,
               formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`,
               formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`,
               round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率`,
               `table` AS `表名`
               FROM system.parts
               where database = 'system'
               group by `table`
               order by table asc;

┌────总行数─┬─原始大小───┬─压缩大小───┬─压缩率─┬─表名──────────────────────┐
  61410609  879.14 MiB  48.44 MiB        6  asynchronous_metric_log   
 484067810  6.76 GiB    307.19 MiB       4  asynchronous_metric_log_0 
     65816  253.45 MiB  18.01 MiB        7  metric_log                
  10888256  40.95 GiB   1.61 GiB         4  metric_log_0              
    454367  108.79 MiB  19.95 MiB       18  part_log                  
  52910875  12.43 GiB   2.29 GiB        18  part_log_0                
    283345  351.34 MiB  37.40 MiB       11  query_log                 
 347623369  406.07 GiB  39.68 GiB       10  query_log_0               
  46233632  59.73 GiB   1.87 GiB         3  query_views_log           
   1179869  342.91 MiB  32.41 MiB        9  trace_log                 
  86408095  25.96 GiB   2.37 GiB         9  trace_log_0               
└───────────┴────────────┴────────────┴────────┴───────────────────────────┘

查询分区发现被删除的分区依然可以查询到,如下:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
SELECT
    partition,
    name
FROM system.parts
WHERE (database = 'system') AND (table = 'query_log_0')


┌─partition─┬─name───────────────────────────┐
 202302     202302_1_78215_256             
 202302     202302_1_78215_258_837509      
 202302     202302_78216_144657_34         
 202302     202302_78216_144657_36_837509  
 202302     202302_144658_145552_14_837509 
 202302     202302_145553_145560_3_837509  
 202303     202303_145561_304204_62        
 202303     202303_145561_304204_63_837509 
 202303     202303_304205_304322_12_837509 
 202304     202304_304323_362055_28        
 202304     202304_304323_362055_29_837509 
 202304     202304_362056_408477_31        
 202304     202304_362056_408477_32_837509 
 202304     202304_408478_408816_13_837509 
 202304     202304_408817_408817_0_837509  
 202305     202305_408818_488988_34        
 202305     202305_408818_488988_36_837509 
 202305     202305_488989_601130_61        
 202305     202305_601131_647947_40        
 202305     202305_601131_647947_42_837509 
 202305     202305_647948_676146_39        
 202305     202305_647948_676146_41_837509 
 202305     202305_676147_679991_38_837509 
 202305     202305_679992_683622_30_837509 
 202305     202305_683623_683714_21_837509 
 202305     202305_683715_683730_5_837509  
 202306     202306_683731_771288_33        
 202306     202306_683731_771288_33_837509 
 202306     202306_771289_778574_31_837509 
 202306     202306_778575_793185_27_837509 
 202306     202306_793186_820273_33        
 202306     202306_793186_820273_33_837509 
 202306     202306_820274_837106_33        
 202306     202306_820274_837106_33_837509 
 202306     202306_837107_837303_20_837509 
 202306     202306_837304_837456_18_837509 
 202306     202306_837457_837506_10_837509 
 202306     202306_837507_837507_0_837509  
 202306     202306_837508_837508_0_837509  
└───────────┴────────────────────────────────┘

这是因为clickhouse ALTER 操作,对于不可复制的表,所有 ALTER 操作都是同步执行的。对于可复制的表,ALTER操作会将指令添加到ZooKeeper中,然后会尽快的执行它们。

异步操作被称为Mutations操作,Mutations是一类允许对表的行记录进行删除或更新的ALTER操作。相较于标准的 UPDATE 和 DELETE 用于少量行操作而言,

Mutations用来对表的很多行进行重量级的操作。

mutation操作在提交后(对于可复制表,添加到Zookeeper,对于不可复制表,添加到文件系统)立即返回。mutation操作本身是根据系统的配置参数异步执行的。

要跟踪mutation的进度,可以使用系统表 system.mutations。已经成功提交的mutation操作在服务重启后仍会继续执行。

一旦mutation完成提交,就不能回退了,但是如果因为某种原因操作被卡住了,可以通过 KILL MUTATION操作来取消它的执行。

查询system.mutations表:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
SELECT
    database,
    table,
    mutation_id,
    command,
    is_done
FROM system.mutations
ORDER BY create_time DESC

┌─database─┬─table─────────────────────┬─mutation_id──────────┬─command────────────────────────────────┬─is_done─┐
 system    query_log_0                mutation_837512.txt   DELETE WHERE event_date < '2023-05-01'        1 
 system    query_log_0                mutation_837511.txt   DELETE WHERE event_date < '2023-04-01'        1 
 system    query_log_0                mutation_837510.txt   DELETE WHERE event_date < '2023-03-01'        1 
 system    part_log_0                 mutation_583121.txt   MATERIALIZE TTL                               1 
 system    asynchronous_metric_log_0  mutation_1554547.txt  MATERIALIZE TTL                               1 
 system    trace_log_0                mutation_1335667.txt  MATERIALIZE TTL                               1 
└──────────┴───────────────────────────┴──────────────────────┴────────────────────────────────────────┴─────────┘

等待mutation操作都完成后,再查询就可以看到数据已经被清理。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
SELECT
    sum(rows) AS `总行数`,
    formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`,
    formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`,
    round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率`,
    table AS `表名`
FROM system.parts
WHERE database = 'system'
GROUP BY table
ORDER BY table ASC

┌────总行数─┬─原始大小───┬─压缩大小───┬─压缩率─┬─表名──────────────────────┐
  91808506  1.28 GiB    69.41 MiB        5  asynchronous_metric_log   
 484067810  6.76 GiB    307.19 MiB       4  asynchronous_metric_log_0 
     76882  296.07 MiB  19.33 MiB        7  metric_log                
  10888256  40.95 GiB   1.61 GiB         4  metric_log_0              
    216286  52.18 MiB   9.64 MiB        18  part_log                  
  52910875  12.43 GiB   2.29 GiB        18  part_log_0                
    398906  485.02 MiB  48.71 MiB       10  query_log                 
  41266563  48.78 GiB   4.80 GiB        10  query_log_0               
  46233632  59.73 GiB   1.87 GiB         3  query_views_log           
    975688  302.25 MiB  25.59 MiB        8  trace_log                 
  86408095  25.96 GiB   2.37 GiB         9  trace_log_0               
└───────────┴────────────┴────────────┴────────┴───────────────────────────┘

查询分区,被删除的分区已经无法查询。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
SELECT
    partition,
    name
FROM system.parts
WHERE (database = 'system') AND (table = 'query_log_0')

┌─partition─┬─name───────────────────────────┐
 202306     202306_683731_771288_33_837512 
 202306     202306_771289_778574_31_837512 
 202306     202306_778575_793185_27_837512 
 202306     202306_793186_820273_33_837512 
 202306     202306_820274_837106_33_837512 
 202306     202306_837107_837303_20_837512 
 202306     202306_837304_837456_18_837512 
 202306     202306_837457_837506_10_837512 
 202306     202306_837507_837507_0_837512  
 202306     202306_837508_837508_0_837512  
└───────────┴────────────────────────────────┘

参考