clickhouse数据字典

前言:

clickhouse的安装见

https://blog.csdn.net/weixin_41463944/article/details/142534145?spm=1001.2014.3001.5501

1、需要在路径/etc/clickhouse-serve下定义*_dictionary.xml结尾的配置文件

2、需要准备organization.csv和sales.csv本地测试文件(注意不是windows下的xlsx表格,是linux下的文件)

2.1 organization.csv文件

1,"a0001","研发部"
2,"a0002","产品部"
3,"a0003","数据部"
4,"a0004","测试部"
5,"a0005","运维部"
6,"a0006","规划部",
7,"a0007","市场部"

2.2 sales.csv文件

1,2016-01-01,2017-01-10,100

2,2016-05-01,2017-07-01,200

3,2014-03-05,2018-01-20,300

4,2018-08-01,2019-10-01,400

5,2017-03-01,2017-06-01,500

6,2017-04-09,2018-05-30,600

7,2018-06-01,2019-01-25,700

8,2019-08-01,2019-12-12,800

3、关于每个xml字典文件的格式

<?xml version="1.0"?>
<dictionaries>
  <dictionary>
    <name>字典名称(全局唯一)</name>

    <source>
      数据源配置
    </source>

    <layout>
      字典数据在内存中的格式
    </layout>

    <structure>
      字典的数据结构

    </structure>

    <lifetime>
    字典的自动更新频率
      <min>300</min>
      <max>360</max>
    </lifetime>

  </dictionary>
</dictionaries>

4、关于扩展字典的几种类型配置

一共7种类型,其中flat、hashed、sparse_hashed、range_hashed、cache的主键为数值型UInt64类型的,其他类型如complex_key_hashed、complex_key_cache为复合类型的key

4.1)flat类型的字典,在/etc/clickhouse-serve目录下新建test_flat_dictionary.xml

<?xml version="1.0"?>
<dictionaries>
  <dictionary>
    <name>test_flat_dict</name>

    <source>
      <file>
        <path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
        <format>CSV</format>
      </file>
    </source>

    <layout>
      <flat/>
    </layout>

    <structure>
      <id>
        <name>id</name>
      </id>

      <attribute>
        <name>code</name>
        <type>String</type>
        <null_value></null_value>
      </attribute>

      <attribute>
        <name>name</name>
        <type>String</type>
        <null_value></null_value>
      </attribute>

    </structure>

    <lifetime>
      <min>300</min>
      <max>360</max>
    </lifetime>

  </dictionary>
</dictionaries>

4.2)hashed类型的字典,在/etc/clickhouse-serve目录下新建test_hashed_dictionary.xml

<?xml version="1.0"?>
<dictionaries>
  <dictionary>
    <name>test_hashed_dict</name>

    <source>
      <file>
        <path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
        <format>CSV</format>
      </file>
    </source>

    <layout>
      <hashed/>
    </layout>

    <structure>
      <id>
        <name>id</name>
      </id>

      <attribute>
        <name>code</name>
        <type>String</type>
        <null_value></null_value>
      </attribute>

      <attribute>
        <name>name</name>
        <type>String</type>
        <null_value></null_value>
      </attribute>

    </structure>

    <lifetime>
      <min>300</min>
      <max>360</max>
    </lifetime>

  </dictionary>
</dictionaries>

4.3)complex_key_hashed类型的字典,在/etc/clickhouse-serve目录下新建complex_key_hashed_dictionary.xml

<?xml version="1.0"?>
<dictionaries>
  <dictionary>
    <name>test_complex_key_hashed_dict</name>

    <source>
      <file>
        <path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
        <format>CSV</format>
      </file>
    </source>

    <layout>
      <complex_key_hashed/>
    </layout>

    <structure>
      <key>
        <attribute>
          <name>id</name>
          <type>UInt64</type>
        </attribute>
        <attribute>
          <name>code</name>
          <type>String</type>
        </attribute>
      </key>

      <attribute>
        <name>name</name>
        <type>String</type>
        <null_value></null_value>
      </attribute>

    </structure>

    <lifetime>
      <min>300</min>
      <max>360</max>
    </lifetime>

  </dictionary>
</dictionaries>

4.4)range_hashed类型的字典,在/etc/clickhouse-serve目录下新建test_range_hashed_dictionary.xml

<?xml version="1.0"?>
<dictionaries>
  <dictionary>
    <name>test_range_hashed_dict</name>

    <source>
      <file>
        <path>/home/clickhouse-server/etc/clickhouse-server/dictionary/sales.csv</path>
        <format>CSV</format>
      </file>
    </source>

    <layout>
      <range_hashed/>
    </layout>

    <structure>
      <id>
        <name>id</name>
      </id>

      <range_min>
        <name>start</name>
      </range_min>

      <range_max>
        <name>end</name>
      </range_max>

      <attribute>
        <name>price</name>
        <type>Float32</type>
        <null_value></null_value>
      </attribute>

    </structure>

    <lifetime>
      <min>300</min>
      <max>360</max>
    </lifetime>

  </dictionary>
</dictionaries>

5、关于字典的数据源

5.1 文件类型,格式通常为CSV或者TabSeparated

1)本地文件

<source>
      <file>
        <path>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</path>
        <format>CSV</format>
      </file>
    </source>

2)可执行文件

<source>
      <executable>
        <command>/home/clickhouse-server/etc/clickhouse-server/dictionary/organization.csv</command>
        <format>CSV</format>
      </executable>
    </source>

3)远程文件

<source>
      <http>
        <url>xxxxx.csv</url>
        <format>CSV</format>
      </http>
    </source>

5.2 数据库类型的数据源
1)mysql

<source>
      <mysql>
        <port>3306</port>
        <user>xxxx</user>
        <password>xxxx</password>
        <replica>
          <host>localhost</host>
          <priority>1</priority>
        </replica>
        <db>db_mybatis</db>
        <table>userList</table>
      </mysql>
    </source>

其中<mysql> 标签的下一级可添加 <where>is_deleted=0</where>做为筛选条件

2)clickhouse

 <source>
      <clickhouse>
        <host>localhost</host>
        <port>9000</port>
        <user>xxxx</user>
        <password>xxxxx</password>
        <db>my_database</db>
        <table>test_table</table>
      </clickhouse>
    </source>

6、关于DDL语句

dictGet('字典名称','字典的具体字段','key键值'),如果字典为复合key的话,则在字典函数中的key用元组tuple('a','b','c')表示

7、关于手动刷新数据字典

7.1 )全局字典刷新

system reload dictionaries

7.2)刷新单个字典(字典名称不加单引号)

system reload dictionary test_mysql_dict
相关推荐
武子康5 天前
大数据-154 Apache Druid 架构与原理详解 基础架构、架构演进
java·大数据·clickhouse·hdfs·架构·flink·apache
武子康6 天前
大数据-152 Apache Druid 集群模式 配置启动【下篇】 超详细!
java·大数据·clickhouse·flink·apache
AAEllisonPang6 天前
ClickHouse 的 MergeTree 引擎有哪些性能优势?
大数据·数据库·clickhouse
SelectDB技术团队6 天前
快手:从 Clickhouse 到 Apache Doris,实现湖仓分离向湖仓一体架构升级
数据仓库·clickhouse·doris·快手·lakehouse
武子康6 天前
大数据-149 Apache Druid 基本介绍 技术特点 应用场景
大数据·hadoop·clickhouse·hdfs·架构·apache
武子康6 天前
大数据-155 Apache Druid 架构与原理详解 数据存储 索引服务 压缩机制
java·大数据·clickhouse·架构·flink·系统架构·apache
AAEllisonPang6 天前
ClickHouse 引擎的选择
大数据·数据库·clickhouse
云观秋毫6 天前
APO v0.5.0 发布:可视化配置告警规则;优化时间筛选器;支持自建的ClickHouse和VictoriaMetrics
运维·clickhouse
Biturd6 天前
docker-compose 快速部署clickhouse集群
clickhouse·docker·容器