Doris SQL 特技

group_concat

description

Syntax

VARCHAR GROUP_CONCAT([DISTINCT] VARCHAR str[, VARCHAR sep] [ORDER BY { col_name | expr} [ASC | DESC])
该函数是类似于 sum() 的聚合函数,group_concat 将结果集中的多行结果连接成一个字符串。第二个参数 sep 为字符串之间的连接符号,该参数可以省略。该函数通常需要和 group by 语句一起使用。

sql 复制代码
mysql> select value from test;
+-------+
| value |
+-------+
| a     |
| b     |
| c     |
| c     |
+-------+

mysql> select GROUP_CONCAT(value) from test;
+-----------------------+
| GROUP_CONCAT(`value`) |
+-----------------------+
| a, b, c, c               |
+-----------------------+

mysql> select GROUP_CONCAT(DISTINCT value) from test;
+-----------------------+
| GROUP_CONCAT(`value`) |
+-----------------------+
| a, b, c               |
+-----------------------+

mysql> select GROUP_CONCAT(value, " ") from test;
+----------------------------+
| GROUP_CONCAT(`value`, ' ') |
+----------------------------+
| a b c c                    |
+----------------------------+

mysql> select GROUP_CONCAT(value, NULL) from test;
+----------------------------+
| GROUP_CONCAT(`value`, NULL)|
+----------------------------+
| NULL                       |
+----------------------------+

SELECT abs(k3), group_concat(distinct cast(abs(k2) as varchar) order by abs(k1), k5) FROM bigtable group by abs(k3) order by abs(k3);     +------------+-------------------------------------------------------------------------------+
| abs(`k3`)  | group_concat(DISTINCT CAST(abs(`k2`) AS CHARACTER), ORDER BY abs(`k1`), `k5`) |
+------------+-------------------------------------------------------------------------------+
|        103 | 255                                                                           |
|       1001 | 1989, 1986                                                                    |
|       1002 | 1989, 32767                                                                   |
|       3021 | 1991, 32767, 1992                                                             |
|       5014 | 1985, 1991                                                                    |
|      25699 | 1989                                                                          |
| 2147483647 | 255, 1991, 32767, 32767                                                       |
+------------+-------------------------------------------------------------------------------+

row_number

description

为每个 Partition 的每一行返回一个从1开始连续递增的整数。与 RANK() 和 DENSE_RANK() 不同的是,ROW_NUMBER() 返回的值不会重复也不会出现空缺,是连续递增的。

sql 复制代码
ROW_NUMBER() OVER(partition_by_clause order_by_clause)
sql 复制代码
select x, y, row_number() over(partition by x order by y) as rank from int_t;

| x | y    | rank     |
|---|------|----------|
| 1 | 1    | 1        |
| 1 | 2    | 2        |
| 1 | 2    | 3        |
| 2 | 1    | 1        |
| 2 | 2    | 2        |
| 2 | 3    | 3        |
| 3 | 1    | 1        |
| 3 | 1    | 2        |
| 3 | 2    | 3        |

timestampdiff

description

Syntax

INT TIMESTAMPDIFF(unit, DATETIME datetime_expr1, DATETIME datetime_expr2)

返回datetime_expr2−datetime_expr1,其中datetime_expr1和datetime_expr2是日期或日期时间表达式。

结果(整数)的单位由unit参数给出。interval的单位由unit参数给出,它应该是下列值之一:

SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, or YEAR。

sql 复制代码
MySQL> SELECT TIMESTAMPDIFF(MONTH,'2003-02-01','2003-05-01');
+--------------------------------------------------------------------+
| timestampdiff(MONTH, '2003-02-01 00:00:00', '2003-05-01 00:00:00') |
+--------------------------------------------------------------------+
|                                                                  3 |
+--------------------------------------------------------------------+

MySQL> SELECT TIMESTAMPDIFF(YEAR,'2002-05-01','2001-01-01');
+-------------------------------------------------------------------+
| timestampdiff(YEAR, '2002-05-01 00:00:00', '2001-01-01 00:00:00') |
+-------------------------------------------------------------------+
|                                                                -1 |
+-------------------------------------------------------------------+


MySQL> SELECT TIMESTAMPDIFF(MINUTE,'2003-02-01','2003-05-01 12:05:55');
+---------------------------------------------------------------------+
| timestampdiff(MINUTE, '2003-02-01 00:00:00', '2003-05-01 12:05:55') |
+---------------------------------------------------------------------+
|                                                              128885 |
+---------------------------------------------------------------------+

array_distinct

description

Syntax

ARRAY<T> array_distinct(ARRAY<T> arr)

返回去除了重复元素的数组,如果输入数组为NULL,则返回NULL。

notice

仅支持向量化引擎中使用

sql 复制代码
mysql> select k1, k2, array_distinct(k2) from array_test;
+------+-----------------------------+---------------------------+
| k1   | k2                          | array_distinct(k2)        |
+------+-----------------------------+---------------------------+
| 1    | [1, 2, 3, 4, 5]             | [1, 2, 3, 4, 5]           |
| 2    | [6, 7, 8]                   | [6, 7, 8]                 |
| 3    | []                          | []                        |
| 4    | NULL                        | NULL                      |
| 5    | [1, 2, 3, 4, 5, 4, 3, 2, 1] | [1, 2, 3, 4, 5]           |
| 6    | [1, 2, 3, NULL]             | [1, 2, 3, NULL]           |
| 7    | [1, 2, 3, NULL, NULL]       | [1, 2, 3, NULL]     |
+------+-----------------------------+---------------------------+

mysql> select k1, k2, array_distinct(k2) from array_test01;
+------+------------------------------------------+---------------------------+
| k1   | k2                                       | array_distinct(`k2`)      |
+------+------------------------------------------+---------------------------+
| 1    | ['a', 'b', 'c', 'd', 'e']                | ['a', 'b', 'c', 'd', 'e'] |
| 2    | ['f', 'g', 'h']                          | ['f', 'g', 'h']           |
| 3    | ['']                                     | ['']                      |
| 3    | [NULL]                                   | [NULL]                    |
| 5    | ['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c'] | ['a', 'b', 'c', 'd', 'e'] |
| 6    | NULL                                     | NULL                      |
| 7    | ['a', 'b', NULL]                         | ['a', 'b', NULL]          |
| 8    | ['a', 'b', NULL, NULL]                   | ['a', 'b', NULL]    |
+------+------------------------------------------+---------------------------+

split_by_string

description

Syntax

ARRAY<STRING> split_by_string(STRING s, STRING separator) 将字符串拆分为由字符串分隔的子字符串。它使用多个字符的常量字符串分隔符作为分隔符。如果字符串分隔符为空,它将字符串拆分为单个字符数组。

Arguments

separator --- 分隔符是一个字符串,是用来分割的标志字符. 类型: String

s --- 需要分割的字符串. 类型: String

Returned value(s)

返回一个包含子字符串的数组. 以下情况会返回空的子字符串:

需要分割的字符串的首尾是分隔符;

多个分隔符连续出现;

需要分割的字符串为空,而分隔符不为空.

Type: Array(String)

notice

仅支持向量化引擎中使用

sql 复制代码
select split_by_string('a1b1c1d','1');
+---------------------------------+
| split_by_string('a1b1c1d', '1') |
+---------------------------------+
| ['a', 'b', 'c', 'd']            |
+---------------------------------+

select split_by_string(',,a,b,c,',',');
+----------------------------------+
| split_by_string(',,a,b,c,', ',') |
+----------------------------------+
| ['', '', 'a', 'b', 'c', '']      |
+----------------------------------+

SELECT split_by_string(NULL,',');
+----------------------------+
| split_by_string(NULL, ',') |
+----------------------------+
| NULL                       |
+----------------------------+

select split_by_string('a,b,c,abcde',',');
+-------------------------------------+
| split_by_string('a,b,c,abcde', ',') |
+-------------------------------------+
| ['a', 'b', 'c', 'abcde']            |
+-------------------------------------+

select split_by_string('1,,2,3,,4,5,,abcde', ',,');
+---------------------------------------------+
| split_by_string('1,,2,3,,4,5,,abcde', ',,') |
+---------------------------------------------+
| ['1', '2,3', '4,5', 'abcde']                |
+---------------------------------------------+

select split_by_string(',,,,',',,');
+-------------------------------+
| split_by_string(',,,,', ',,') |
+-------------------------------+
| ['', '', '']                  |
+-------------------------------+

select split_by_string(',,a,,b,,c,,',',,');
+--------------------------------------+
| split_by_string(',,a,,b,,c,,', ',,') |
+--------------------------------------+
| ['', 'a', 'b', 'c', '']              |
+--------------------------------------+
相关推荐
村口蹲点的阿三1 分钟前
Spark SQL 中对 Map 类型的操作函数
javascript·数据库·hive·sql·spark
暮湫1 小时前
MySQL(1)概述
数据库·mysql
唯余木叶下弦声1 小时前
PySpark之金融数据分析(Spark RDD、SQL练习题)
大数据·python·sql·数据分析·spark·pyspark
fajianchen1 小时前
记一次线上SQL死锁事故:如何避免死锁?
数据库·sql
chengpei1471 小时前
实现一个自己的spring-boot-starter,基于SQL生成HTTP接口
java·数据库·spring boot·sql·http
中东大鹅3 小时前
MongoDB的索引与聚合
数据库·hadoop·分布式·mongodb
天天向上杰4 小时前
简识Redis 持久化相关的 “Everysec“ 策略
数据库·redis·缓存
Leaf吧4 小时前
springboot 配置多数据源以及动态切换数据源
java·数据库·spring boot·后端
狮歌~资深攻城狮5 小时前
TiDB出现后,大数据技术的未来方向
数据库·数据仓库·分布式·数据分析·tidb
狮歌~资深攻城狮5 小时前
TiDB 和信创:如何推动国产化数据库的发展?
数据库·数据仓库·分布式·数据分析·tidb