探索ClickHouse——使用MaterializedView存储kafka传递的数据

《探索ClickHouse------连接Kafka和Clickhouse》中,我们讲解了如何使用kafka engin连接kafka,并读取topic中的数据。但是遇到了一个问题,就是数据只能读取一次,即使后面还有新数据发送到该topic,该表也读不出来。

为了解决这个问题,我们引入MaterializedView。

创建表

该表结构直接借用了《探索ClickHouse------使用Projection加速查询》中的表结构。

bash 复制代码
CREATE TABLE materialized_uk_price_paid_from_kafka ( price UInt32, date Date, postcode1 LowCardinality(String), postcode2 LowCardinality(String), type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0), is_new UInt8, duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0), addr1 String, addr2 String, street LowCardinality(String), locality LowCardinality(String), town LowCardinality(String), district LowCardinality(String), county LowCardinality(String) ) ENGINE = MergeTree ORDER BY (postcode1, postcode2, addr1, addr2);

CREATE TABLE materialized_uk_price_paid_from_kafka

(
price UInt32,
date Date,
postcode1 LowCardinality(String),
postcode2 LowCardinality(String),
type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0),
is_new UInt8,
duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0),
addr1 String,
addr2 String,
street LowCardinality(String),
locality LowCardinality(String),
town LowCardinality(String),
district LowCardinality(String),
county LowCardinality(String)

)

ENGINE = MergeTree

ORDER BY (postcode1, postcode2, addr1, addr2)

Query id: 55b16049-a865-4d54-9333-d661c6280a09

Ok.

0 rows in set. Elapsed: 0.005 sec.

创建MaterializedView

bash 复制代码
CREATE MATERIALIZED VIEW uk_price_paid_from_kafka_consumer_view TO materialized_uk_price_paid_from_kafka AS SELECT splitByChar(' ', postcode) AS p, toUInt32(price_string) AS price, parseDateTimeBestEffortUS(time) AS date, p[1] AS postcode1, p[2] AS postcode2, transform(a, ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type, b = 'Y' AS is_new, transform(c, ['F', 'L', 'U'], ['freehold', 'leasehold', 'unknown']) AS duration, addr1, addr2, street, locality, town, district, county FROM uk_price_paid_from_kafka;

这样kafka topic中的数据被清洗到materialized_uk_price_paid_from_kafka表中。

查询

bash 复制代码
select * from materialized_uk_price_paid_from_kafka;

我们在给topic发送下面的内容

"{5FA8692E-537B-4278-8C67-5A060540506D}","19500","1995-01-27 00:00","SK10 2QW","T","N","L","38","","GARDEN STREET","MACCLESFIELD","MACCLESFIELD","MACCLESFIELD","CHESHIRE","A","A"

再查询表

bash 复制代码
select * from materialized_uk_price_paid_from_kafka;
相关推荐
码河漫步1 小时前
win11安装mysql社区版数据库
数据库·mysql
Wang's Blog1 小时前
MySQL: 存储引擎深度解析:Memory与Federated的特性与应用场景
数据库·mysql
学习中的程序媛~1 小时前
Spring 事务(@Transactional)与异步(@Async / CompletableFuture)结合的陷阱与最佳实践
java·数据库·sql
❀͜͡傀儡师2 小时前
docker搭建Elasticsearch+Kafka+Logstash+Filebeat日志分析系统
elasticsearch·docker·kafka
老葱头蒸鸡2 小时前
(4)Kafka消费者分区策略、Rebalance、Offset存储机制
sql·kafka·linq
员大头硬花生2 小时前
九、InnoDB引擎-MVCC
数据库·sql·mysql
一条闲鱼_mytube2 小时前
mvcc 简介
数据库
稻香味秋天2 小时前
单元测试指南
数据库·sqlserver·单元测试
JosieBook2 小时前
【数据库】Apache IoTDB数据库在大数据场景下的时序数据模型与建模方案
数据库·apache·iotdb
全栈胖叔叔-瓜州2 小时前
关于微软最新数据库引擎sqlserver2025 关于向量距离函数调用的问题
数据库·microsoft