| No. | Type | Address |
|---|---|---|
| 1 | Spark functions | 1. Spark functions: symbols |
| 2 | Spark functions | 2. Spark functions: a/b/c |
| 3 | Spark functions | 3. Spark functions: d/e/f/g/h/i/j/k/l |
| 4 | Spark functions | 4. Spark functions: m/n/o/p/q/r |
| 5 | Spark functions | 5. Spark functions: s/t |
| 6 | Spark functions | 6. Spark functions: u/v/w/x/y/z |
Table of contents
- 1. A
- abs
- acos
- acosh
- add_months
- aes_decrypt
- aes_encrypt
- aggregate
- and
- any
- any_value
- approx_count_distinct
- approx_percentile
- array_agg
- array_compact
- array_contains
- array_distinct
- array_except
- array_insert
- array_intersect
- array_join
- array_max
- array_min
- array_position
- array_prepend
- array_remove
- array_repeat
- array_size
- array_sort
- array_union
- arrays_overlap
- arrays_zip
- ascii
- asin
- asinh
- assert_true
- atan
- atan2
- atanh
- avg
- 2. B
- 3. C
- cardinality
- case
- cast
- cbrt
- ceil
- ceiling
- char
- char_length
- character_length
- chr
- coalesce
- collate
- collation
- collations
- collect_list
- collect_set
- concat
- concat_ws
- contains
- conv
- convert_timezone
- corr
- cos
- cosh
- cot
- count
- count_if
- count_min_sketch
- covar_pop
- covar_samp
- crc32
- csc
- cume_dist
- curdate
- current_catalog
- current_database
- current_date
- current_schema
- current_timestamp
- current_timezone
- current_user
1. A
abs
abs(expr) - Returns the absolute value of the numeric or interval value.
Examples:
sql
> SELECT abs(-1);
1
> SELECT abs(INTERVAL -'1-1' YEAR TO MONTH);
1-1
Since: 1.2.0
acos
acos(expr) - Returns the inverse cosine (a.k.a. arc cosine) of expr, as if computed by java.lang.Math.acos.
Examples:
sql
> SELECT acos(1);
0.0
> SELECT acos(2);
NaN
Since: 1.4.0
acosh
acosh(expr) - Returns the inverse hyperbolic cosine of expr.
Examples:
sql
> SELECT acosh(1);
0.0
> SELECT acosh(0);
NaN
Since: 3.0.0
add_months
add_months(start_date, num_months) - Returns the date that is num_months after start_date.
Examples:
sql
> SELECT add_months('2016-08-31', 1);
2016-09-30
Since: 1.5.0
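The example above suggests that when the target month is shorter, the day is clamped to the last day of that month. A minimal Python sketch of that behavior, assuming plain day clamping (the helper name `add_months` simply mirrors the SQL function and is not Spark's implementation):

```python
import calendar
from datetime import date


def add_months(start: date, num_months: int) -> date:
    # Work in "months since year 0" so negative offsets are handled too.
    total = start.year * 12 + (start.month - 1) + num_months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day to the last valid day of the target month (assumption).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


print(add_months(date(2016, 8, 31), 1))  # 2016-09-30
```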
aes_decrypt
aes_decrypt(expr, key[, mode[, padding[, aad]]]) - Returns a decrypted value of expr using AES in mode with padding. Key lengths of 16, 24 and 32 bytes (128, 192 and 256 bits) are supported. Supported combinations of (mode, padding) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM.
Arguments:
- expr - The binary value to decrypt.
- key - The passphrase to use to decrypt the data.
- mode - Specifies which block cipher mode should be used to decrypt messages. Valid modes: ECB, GCM, CBC.
- padding - Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
- aad - Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
Examples:
sql
> SELECT aes_decrypt(unhex('83F16B2AA704794132802D248E6BFD4E380078182D1544813898AC97E709B28A94'), '0000111122223333');
Spark
> SELECT aes_decrypt(unhex('6E7CA17BBB468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210'), '0000111122223333', 'GCM');
Spark SQL
> SELECT aes_decrypt(unbase64('3lmwu+Mw0H3fi5NDvcu9lg=='), '1234567890abcdef', 'ECB', 'PKCS');
Spark SQL
> SELECT aes_decrypt(unbase64('2NYmDCjgXTbbxGA3/SnJEfFC/JQ7olk2VQWReIAAFKo='), '1234567890abcdef', 'CBC');
Apache Spark
> SELECT aes_decrypt(unbase64('AAAAAAAAAAAAAAAAAAAAAPSd4mWyMZ5mhvjiAPQJnfg='), 'abcdefghijklmnop12345678ABCDEFGH', 'CBC', 'DEFAULT');
Spark
> SELECT aes_decrypt(unbase64('AAAAAAAAAAAAAAAAQiYi+sTLm7KD9UcZ2nlRdYDe/PX4'), 'abcdefghijklmnop12345678ABCDEFGH', 'GCM', 'DEFAULT', 'This is an AAD mixed into the input');
Spark
Since: 3.3.0
aes_encrypt
aes_encrypt(expr, key[, mode[, padding[, iv[, aad]]]]) - Returns an encrypted value of expr using AES in the given mode with the specified padding. Key lengths of 16, 24 and 32 bytes (128, 192 and 256 bits) are supported. Supported combinations of (mode, padding) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional initialization vectors (IVs) are only supported for CBC and GCM modes. These must be 16 bytes for CBC and 12 bytes for GCM. If not provided, a random vector will be generated and prepended to the output. Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM.
Arguments:
- expr - The binary value to encrypt.
- key - The passphrase to use to encrypt the data.
- mode - Specifies which block cipher mode should be used to encrypt messages. Valid modes: ECB, GCM, CBC.
- padding - Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
- iv - Optional initialization vector. Only supported for CBC and GCM modes. Valid values: None or ''. 16-byte array for CBC mode. 12-byte array for GCM mode.
- aad - Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
Examples:
sql
> SELECT hex(aes_encrypt('Spark', '0000111122223333'));
83F16B2AA704794132802D248E6BFD4E380078182D1544813898AC97E709B28A94
> SELECT hex(aes_encrypt('Spark SQL', '0000111122223333', 'GCM'));
6E7CA17BBB468D3084B5744BCA729FB7B2B7BCB8E4472847D02670489D95FA97DBBA7D3210
> SELECT base64(aes_encrypt('Spark SQL', '1234567890abcdef', 'ECB', 'PKCS'));
3lmwu+Mw0H3fi5NDvcu9lg==
> SELECT base64(aes_encrypt('Apache Spark', '1234567890abcdef', 'CBC', 'DEFAULT'));
2NYmDCjgXTbbxGA3/SnJEfFC/JQ7olk2VQWReIAAFKo=
> SELECT base64(aes_encrypt('Spark', 'abcdefghijklmnop12345678ABCDEFGH', 'CBC', 'DEFAULT', unhex('00000000000000000000000000000000')));
AAAAAAAAAAAAAAAAAAAAAPSd4mWyMZ5mhvjiAPQJnfg=
> SELECT base64(aes_encrypt('Spark', 'abcdefghijklmnop12345678ABCDEFGH', 'GCM', 'DEFAULT', unhex('000000000000000000000000'), 'This is an AAD mixed into the input'));
AAAAAAAAAAAAAAAAQiYi+sTLm7KD9UcZ2nlRdYDe/PX4
Since: 3.3.0
aggregate
aggregate(expr, start, merge, finish) - Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function.
Examples:
sql
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x);
6
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10);
60
Since: 2.4.0
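The fold-then-finish behavior of aggregate can be mirrored in Python with `functools.reduce`. This is only an illustrative sketch of the semantics; the Python function name and the identity default for `finish` are assumptions made for the example.

```python
from functools import reduce


def aggregate(arr, start, merge, finish=lambda acc: acc):
    # Fold the array into a single state, then post-process that state.
    return finish(reduce(merge, arr, start))


print(aggregate([1, 2, 3], 0, lambda acc, x: acc + x))                         # 6
print(aggregate([1, 2, 3], 0, lambda acc, x: acc + x, lambda acc: acc * 10))   # 60
```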
and
expr1 and expr2 - Logical AND.
Examples:
sql
> SELECT true and true;
true
> SELECT true and false;
false
> SELECT true and NULL;
NULL
> SELECT false and NULL;
false
Since: 1.0.0
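The four results above follow SQL three-valued logic: a false operand decides the result even when the other operand is NULL. A small Python sketch, using `None` for NULL (the name `sql_and` is hypothetical):

```python
def sql_and(a, b):
    # A single False settles the answer, regardless of NULLs.
    if a is False or b is False:
        return False
    # Otherwise any NULL makes the result unknown.
    if a is None or b is None:
        return None
    return True


for pair in [(True, True), (True, False), (True, None), (False, None)]:
    print(pair, "->", sql_and(*pair))
```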
any
any(expr) - Returns true if at least one value of expr is true.
Examples:
sql
> SELECT any(col) FROM VALUES (true), (false), (false) AS tab(col);
true
> SELECT any(col) FROM VALUES (NULL), (true), (false) AS tab(col);
true
> SELECT any(col) FROM VALUES (false), (false), (NULL) AS tab(col);
false
Since: 3.0.0
any_value
any_value(expr[, isIgnoreNull]) - Returns some value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
Examples:
sql
> SELECT any_value(col) FROM VALUES (10), (5), (20) AS tab(col);
10
> SELECT any_value(col) FROM VALUES (NULL), (5), (20) AS tab(col);
NULL
> SELECT any_value(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
5
Note:
The function is non-deterministic.
Since: 3.4.0
approx_count_distinct
approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++. relativeSD defines the maximum relative standard deviation allowed.
Examples:
sql
> SELECT approx_count_distinct(col1) FROM VALUES (1), (1), (2), (2), (3) tab(col1);
3
Since: 1.6.0
approx_percentile
approx_percentile(col, percentage [, accuracy]) - Returns the approximate percentile of the numeric or ANSI interval column col, which is the smallest value in the ordered col values (sorted from least to greatest) such that no more than percentage of col values is less than the value or equal to that value. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, the function returns the approximate percentile array of column col at the given percentage array.
Examples:
```sql
> SELECT approx_percentile(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
[1,1,0]
> SELECT approx_percentile(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
7
> SELECT approx_percentile(col, 0.5, 100) FROM VALUES (INTERVAL '0' MONTH), (INTERVAL '1' MONTH), (INTERVAL '2' MONTH), (INTERVAL '10' MONTH) AS tab(col);
0-1
> SELECT approx_percentile(col, array(0.5, 0.7), 100) FROM VALUES (INTERVAL '0' SECOND), (INTERVAL '1' SECOND), (INTERVAL '2' SECOND), (INTERVAL '10' SECOND) AS tab(col);
[0 00:00:01.000000000,0 00:00:02.000000000]
```
**Since:** 2.1.0
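For small inputs, the numeric results of approx_percentile shown above coincide with an exact nearest-rank percentile. The sketch below is that exact computation, not Spark's approximate algorithm; the name `exact_percentile` and the nearest-rank rule are assumptions chosen because they reproduce the documented outputs. On large data sets the approximate result may differ within the stated relative error.

```python
import math


def exact_percentile(values, percentage):
    # Nearest-rank rule: take the ceil(p * n)-th smallest value (1-based).
    ordered = sorted(values)
    rank = max(1, math.ceil(percentage * len(ordered)))
    return ordered[rank - 1]


print([exact_percentile([0, 1, 2, 10], p) for p in (0.5, 0.4, 0.1)])  # [1, 1, 0]
print(exact_percentile([0, 6, 7, 9, 10], 0.5))                        # 7
```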
---
### [array](https://spark.apache.org/docs/latest/api/sql/#array)
array(expr, ...) - Returns an array with the given elements.
**Examples:**
```sql
> SELECT array(1, 2, 3);
[1,2,3]
```
Since: 1.1.0
array_agg
array_agg(expr) - Collects and returns a list of non-unique elements.
Examples:
```sql
> SELECT array_agg(col) FROM VALUES (1), (2), (1) AS tab(col);
[1,2,1]
```
**Note:**
The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
**Since:** 3.3.0
---
### [array_append](https://spark.apache.org/docs/latest/api/sql/#array_append)
array_append(array, element) - Adds the element at the end of the array passed as the first argument. The type of element should be similar to the type of the elements of the array. A null element is also appended to the array. If the array passed is NULL, the output is NULL.
**Examples:**
```sql
> SELECT array_append(array('b', 'd', 'c', 'a'), 'd');
["b","d","c","a","d"]
> SELECT array_append(array(1, 2, 3, null), null);
[1,2,3,null,null]
> SELECT array_append(CAST(null as Array<Int>), 2);
NULL
```
Since: 3.4.0
array_compact
array_compact(array) - Removes null values from the array.
Examples:
sql
> SELECT array_compact(array(1, 2, 3, null));
[1,2,3]
> SELECT array_compact(array("a", "b", "c"));
["a","b","c"]
Since: 3.4.0
array_contains
array_contains(array, value) - Returns true if the array contains the value.
Examples:
sql
> SELECT array_contains(array(1, 2, 3), 2);
true
Since: 1.5.0
array_distinct
array_distinct(array) - Removes duplicate values from the array.
Examples:
sql
> SELECT array_distinct(array(1, 2, 3, null, 3));
[1,2,3,null]
Since: 2.4.0
array_except
array_except(array1, array2) - Returns an array of the elements in array1 but not in array2, without duplicates.
Examples:
sql
> SELECT array_except(array(1, 2, 3), array(1, 3, 5));
[2]
Since: 2.4.0
array_insert
array_insert(x, pos, val) - Places val into index pos of array x. Array indices start at 1. The maximum negative index is -1, for which the function inserts the new element after the current last element. An index beyond the array size pads the array with null elements, appending them for a positive index or prepending them for a negative index.
Examples:
sql
> SELECT array_insert(array(1, 2, 3, 4), 5, 5);
[1,2,3,4,5]
> SELECT array_insert(array(5, 4, 3, 2), -1, 1);
[5,4,3,2,1]
> SELECT array_insert(array(5, 3, 2, 1), -4, 4);
[5,4,3,2,1]
Since: 3.4.0
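The index rules of array_insert are easy to misread, so here is a Python sketch that reproduces the three examples above. It is an illustration under assumptions, not Spark's code: the zero-index error and the exact number of nulls added for an out-of-range negative index are my reading of the description and should be checked against your Spark version.

```python
def array_insert(arr, pos, val):
    if pos == 0:
        # Indices are 1-based, so 0 is treated as invalid here (assumption).
        raise ValueError("array indices start at 1")
    n = len(arr)
    if pos > 0:
        idx = pos - 1
        if idx <= n:
            return arr[:idx] + [val] + arr[idx:]
        # Position beyond the end: pad with nulls, then place val.
        return arr + [None] * (idx - n) + [val]
    # Negative positions count from the end; -1 means "after the last element".
    idx = n + pos + 1
    if idx >= 0:
        return arr[:idx] + [val] + arr[idx:]
    # Position before the start: val first, then null padding, then the array.
    return [val] + [None] * (-idx) + arr


print(array_insert([1, 2, 3, 4], 5, 5))    # [1, 2, 3, 4, 5]
print(array_insert([5, 4, 3, 2], -1, 1))   # [5, 4, 3, 2, 1]
print(array_insert([5, 3, 2, 1], -4, 4))   # [5, 4, 3, 2, 1]
```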
array_intersect
array_intersect(array1, array2) - Returns an array of the elements in the intersection of array1 and array2, without duplicates.
Examples:
sql
> SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
[1,3]
Since: 2.4.0
array_join
array_join(array, delimiter[, nullReplacement]) - Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. If no value is set for nullReplacement, any null value is filtered.
Examples:
sql
> SELECT array_join(array('hello', 'world'), ' ');
hello world
> SELECT array_join(array('hello', null ,'world'), ' ');
hello world
> SELECT array_join(array('hello', null ,'world'), ' ', ',');
hello , world
Since: 2.4.0
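The null handling of array_join can be sketched in Python as follows, with `None` standing in for NULL. The function is a hypothetical stand-in that assumes string elements.

```python
def array_join(arr, delimiter, null_replacement=None):
    if null_replacement is None:
        # Without a replacement, null elements are dropped.
        items = [x for x in arr if x is not None]
    else:
        # With a replacement, every null is substituted before joining.
        items = [null_replacement if x is None else x for x in arr]
    return delimiter.join(items)


print(array_join(["hello", None, "world"], " "))       # hello world
print(array_join(["hello", None, "world"], " ", ","))  # hello , world
```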
array_max
array_max(array) - Returns the maximum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped.
Examples:
sql
> SELECT array_max(array(1, 20, null, 3));
20
Since: 2.4.0
array_min
array_min(array) - Returns the minimum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped.
Examples:
sql
> SELECT array_min(array(1, 20, null, 3));
1
Since: 2.4.0
array_position
array_position(array, element) - Returns the (1-based) index of the first matching element of the array as long, or 0 if no match is found.
Examples:
sql
> SELECT array_position(array(312, 773, 708, 708), 708);
3
> SELECT array_position(array(312, 773, 708, 708), 414);
0
Since: 2.4.0
array_prepend
array_prepend(array, element) - Adds the element at the beginning of the array passed as the first argument. The type of element should be the same as the type of the elements of the array. A null element is also prepended to the array. If the array passed is NULL, the output is NULL.
Examples:
sql
> SELECT array_prepend(array('b', 'd', 'c', 'a'), 'd');
["d","b","d","c","a"]
> SELECT array_prepend(array(1, 2, 3, null), null);
[null,1,2,3,null]
> SELECT array_prepend(CAST(null as Array<Int>), 2);
NULL
Since: 3.5.0
array_remove
array_remove(array, element) - Removes all elements that are equal to element from array.
Examples:
sql
> SELECT array_remove(array(1, 2, 3, null, 3), 3);
[1,2,null]
Since: 2.4.0
array_repeat
array_repeat(element, count) - Returns the array containing element count times.
Examples:
sql
> SELECT array_repeat('123', 2);
["123","123"]
Since: 2.4.0
array_size
array_size(expr) - Returns the size of an array. The function returns null for null input.
Examples:
sql
> SELECT array_size(array('b', 'd', 'c', 'a'));
4
Since: 3.3.0
array_sort
array_sort(expr, func) - Sorts the input array. If func is omitted, sort in ascending order. The elements of the input array must be orderable. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the end of the returned array. Since 3.0.0 this function also sorts and returns the array based on the given comparator function. The comparator will take two arguments representing two elements of the array. It returns a negative integer, 0, or a positive integer as the first element is less than, equal to, or greater than the second element. If the comparator function returns null, the function will fail and raise an error.
Examples:
sql
> SELECT array_sort(array(5, 6, 1), (left, right) -> case when left < right then -1 when left > right then 1 else 0 end);
[1,5,6]
> SELECT array_sort(array('bc', 'ab', 'dc'), (left, right) -> case when left is null and right is null then 0 when left is null then -1 when right is null then 1 when left < right then 1 when left > right then -1 else 0 end);
["dc","bc","ab"]
> SELECT array_sort(array('b', 'd', null, 'c', 'a'));
["a","b","c","d",null]
Since: 2.4.0
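The comparator contract described for array_sort (negative, zero, or positive result) matches Python's `functools.cmp_to_key`. The sketch below mirrors the three examples; the default branch assumes nulls are placed last, as the description states, and the helper names are hypothetical.

```python
from functools import cmp_to_key


def array_sort(arr, func=None):
    if func is None:
        # Default: ascending order with nulls moved to the end.
        non_null = sorted(x for x in arr if x is not None)
        return non_null + [None] * (len(arr) - len(non_null))
    return sorted(arr, key=cmp_to_key(func))


def ascending(left, right):
    return -1 if left < right else (1 if left > right else 0)


def descending_nulls_first(left, right):
    if left is None and right is None:
        return 0
    if left is None:
        return -1
    if right is None:
        return 1
    return 1 if left < right else (-1 if left > right else 0)


print(array_sort([5, 6, 1], ascending))                        # [1, 5, 6]
print(array_sort(["bc", "ab", "dc"], descending_nulls_first))  # ['dc', 'bc', 'ab']
print(array_sort(["b", "d", None, "c", "a"]))                  # ['a', 'b', 'c', 'd', None]
```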
array_union
array_union(array1, array2) - Returns an array of the elements in the union of array1 and array2, without duplicates.
Examples:
sql
> SELECT array_union(array(1, 2, 3), array(1, 3, 5));
[1,2,3,5]
Since: 2.4.0
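The set-style array functions (array_distinct, array_except, array_intersect, array_union) all return results without duplicates. A Python sketch of those semantics is shown below; keeping the order of first appearance is an assumption that happens to match the documented outputs, and the functions are illustrative stand-ins only.

```python
def array_distinct(arr):
    # dict.fromkeys keeps the first occurrence of each element, in order.
    return list(dict.fromkeys(arr))


def array_except(a1, a2):
    exclude = set(a2)
    return [x for x in dict.fromkeys(a1) if x not in exclude]


def array_intersect(a1, a2):
    keep = set(a2)
    return [x for x in dict.fromkeys(a1) if x in keep]


def array_union(a1, a2):
    return list(dict.fromkeys(a1 + a2))


print(array_distinct([1, 2, 3, None, 3]))     # [1, 2, 3, None]
print(array_except([1, 2, 3], [1, 3, 5]))     # [2]
print(array_intersect([1, 2, 3], [1, 3, 5]))  # [1, 3]
print(array_union([1, 2, 3], [1, 3, 5]))      # [1, 2, 3, 5]
```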
arrays_overlap
arrays_overlap(a1, a2) - Returns true if a1 contains at least one non-null element that is also present in a2. If the arrays have no common element, both are non-empty, and either of them contains a null element, null is returned. Otherwise, false is returned.
Examples:
sql
> SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
true
Since: 2.4.0
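The three possible outcomes of arrays_overlap (true, null, false) can be written out as a short Python sketch, with `None` standing in for NULL. It follows the description above literally and is not taken from Spark's source.

```python
def arrays_overlap(a1, a2):
    common = {x for x in a1 if x is not None} & {x for x in a2 if x is not None}
    if common:
        return True
    # No shared element: the answer is unknown if a null could have matched.
    if a1 and a2 and (None in a1 or None in a2):
        return None
    return False


print(arrays_overlap([1, 2, 3], [3, 4, 5]))  # True
print(arrays_overlap([1, None], [2]))        # None
print(arrays_overlap([1], [2]))              # False
```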
arrays_zip
arrays_zip(a1, a2, ...) - Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
Examples:
sql
> SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
[{"0":1,"1":2},{"0":2,"1":3},{"0":3,"1":4}]
> SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
[{"0":1,"1":2,"2":3},{"0":2,"1":3,"2":4}]
Since: 2.4.0
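A Python sketch of arrays_zip, representing each struct as a dict keyed by the positional field names "0", "1", and so on, as in the output above. Padding shorter arrays with nulls is an assumption here, since the examples only use arrays of equal length.

```python
from itertools import zip_longest


def arrays_zip(*arrays):
    # The N-th dict holds the N-th value of every input array.
    return [
        {str(i): value for i, value in enumerate(row)}
        for row in zip_longest(*arrays)  # missing values become None
    ]


print(arrays_zip([1, 2, 3], [2, 3, 4]))
print(arrays_zip([1, 2], [2, 3], [3, 4]))
```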
ascii
ascii(str) - Returns the numeric value of the first character of str.
Examples:
sql
> SELECT ascii('222');
50
> SELECT ascii(2);
50
Since: 1.5.0
asin
asin(expr) - Returns the inverse sine (a.k.a. arc sine) of expr, as if computed by java.lang.Math.asin.
Examples:
sql
> SELECT asin(0);
0.0
> SELECT asin(2);
NaN
Since: 1.4.0
asinh
asinh(expr) - Returns the inverse hyperbolic sine of expr.
Examples:
sql
> SELECT asinh(0);
0.0
Since: 3.0.0
assert_true
assert_true(expr [, message]) - Throws an exception if expr is not true.
Examples:
sql
> SELECT assert_true(0 < 1);
NULL
Since: 2.0.0
atan
atan(expr) - Returns the inverse tangent (a.k.a. arc tangent) of expr, as if computed by java.lang.Math.atan.
Examples:
sql
> SELECT atan(0);
0.0
Since: 1.4.0
atan2
atan2(exprY, exprX) - Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates (exprX, exprY), as if computed by java.lang.Math.atan2.
Arguments:
- exprY - coordinate on y-axis
- exprX - coordinate on x-axis
Examples:
sql
> SELECT atan2(0, 0);
0.0
Since: 1.4.0
atanh
atanh(expr) - Returns the inverse hyperbolic tangent of expr.
Examples:
sql
> SELECT atanh(0);
0.0
> SELECT atanh(2);
NaN
Since: 3.0.0
avg
avg(expr) - Returns the mean calculated from values of a group.
Examples:
sql
> SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col);
2.0
> SELECT avg(col) FROM VALUES (1), (2), (NULL) AS tab(col);
1.5
Since: 1.0.0
2. B
base64
base64(bin) - Converts the argument from a binary bin to a base 64 string.
Examples:
sql
> SELECT base64('Spark SQL');
U3BhcmsgU1FM
> SELECT base64(x'537061726b2053514c');
U3BhcmsgU1FM
Since: 1.5.0
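Both examples return the same string because the hex literal is simply the UTF-8 bytes of 'Spark SQL'. This can be cross-checked with Python's standard `base64` module (an illustration outside Spark):

```python
import base64

text_bytes = "Spark SQL".encode("utf-8")
hex_bytes = bytes.fromhex("537061726b2053514c")

# The string and the hex literal are the same nine bytes.
print(text_bytes == hex_bytes)                        # True
print(base64.b64encode(text_bytes).decode("ascii"))   # U3BhcmsgU1FM
```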
between
input [NOT] between lower AND upper - Evaluates whether input is [not] between lower and upper.
Arguments:
- input - An expression that is being compared with lower and upper bound.
- lower - Lower bound of the between check.
- upper - Upper bound of the between check.
Examples:
sql
> SELECT 0.5 between 0.1 AND 1.0;
true
Since: 1.0.0
bigint
bigint(expr) - Casts the value expr to the target data type bigint.
Since: 2.0.1
bin
bin(expr) - Returns the string representation of the long value expr represented in binary.
Examples:
sql
> SELECT bin(13);
1101
> SELECT bin(-13);
1111111111111111111111111111111111111111111111111111111111110011
> SELECT bin(13.3);
1101
Since: 1.5.0
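The outputs above are consistent with treating the argument as a 64-bit two's complement long, after dropping any fractional part. A Python sketch under that assumption (the name `spark_bin` is hypothetical):

```python
def spark_bin(value):
    # Truncate toward zero, as a cast to long would.
    as_long = int(value)
    # Mask to 64 bits so negatives show their two's complement form.
    return format(as_long & 0xFFFFFFFFFFFFFFFF, "b")


print(spark_bin(13))    # 1101
print(spark_bin(-13))   # 64 digits, ending in ...110011
print(spark_bin(13.3))  # 1101
```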
binary
binary(expr) - Casts the value expr to the target data type binary.
Since: 2.0.1
bit_and
bit_and(expr) - Returns the bitwise AND of all non-null input values, or null if none.
Examples:
sql
> SELECT bit_and(col) FROM VALUES (3), (5) AS tab(col);
1
Since: 3.0.0
bit_count
bit_count(expr) - Returns the number of bits that are set in the argument expr as an unsigned 64-bit integer, or NULL if the argument is NULL.
Examples:
sql
> SELECT bit_count(0);
0
Since: 3.0.0
bit_get
bit_get(expr, pos) - Returns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative.
Examples:
sql
> SELECT bit_get(11, 0);
1
> SELECT bit_get(11, 2);
0
Since: 3.2.0
bit_length
bit_length(expr) - Returns the bit length of string data or number of bits of binary data.
Examples:
sql
> SELECT bit_length('Spark SQL');
72
> SELECT bit_length(x'537061726b2053514c');
72
Since: 2.3.0
bit_or
bit_or(expr) - Returns the bitwise OR of all non-null input values, or null if none.
Examples:
sql
> SELECT bit_or(col) FROM VALUES (3), (5) AS tab(col);
7
Since: 3.0.0
bit_xor
bit_xor(expr) - Returns the bitwise XOR of all non-null input values, or null if none.
Examples:
sql
> SELECT bit_xor(col) FROM VALUES (3), (5) AS tab(col);
6
Since: 3.0.0
bitmap_bit_position
bitmap_bit_position(child) - Returns the bit position for the given input child expression.
Examples:
sql
> SELECT bitmap_bit_position(1);
0
> SELECT bitmap_bit_position(123);
122
Since: 3.5.0
bitmap_bucket_number
bitmap_bucket_number(child) - Returns the bucket number for the given input child expression.
Examples:
sql
> SELECT bitmap_bucket_number(123);
1
> SELECT bitmap_bucket_number(0);
0
Since: 3.5.0
bitmap_construct_agg
bitmap_construct_agg(child) - Returns a bitmap with the positions of the bits set from all the values from the child expression. The child expression will most likely be bitmap_bit_position().
Examples:
sql
> SELECT substring(hex(bitmap_construct_agg(bitmap_bit_position(col))), 0, 6) FROM VALUES (1), (2), (3) AS tab(col);
070000
> SELECT substring(hex(bitmap_construct_agg(bitmap_bit_position(col))), 0, 6) FROM VALUES (1), (1), (1) AS tab(col);
010000
Since: 3.5.0
bitmap_count
bitmap_count(child) - Returns the number of set bits in the child bitmap.
Examples:
sql
> SELECT bitmap_count(X '1010');
2
> SELECT bitmap_count(X 'FFFF');
16
> SELECT bitmap_count(X '0');
0
Since: 3.5.0
bitmap_or_agg
bitmap_or_agg(child) - Returns a bitmap that is the bitwise OR of all of the bitmaps from the child expression. The input should be bitmaps created from bitmap_construct_agg().
Examples:
sql
> SELECT substring(hex(bitmap_or_agg(col)), 0, 6) FROM VALUES (X '10'), (X '20'), (X '40') AS tab(col);
700000
> SELECT substring(hex(bitmap_or_agg(col)), 0, 6) FROM VALUES (X '10'), (X '10'), (X '10') AS tab(col);
100000
Since: 3.5.0
bool_and
bool_and(expr) - Returns true if all values of expr are true.
Examples:
sql
> SELECT bool_and(col) FROM VALUES (true), (true), (true) AS tab(col);
true
> SELECT bool_and(col) FROM VALUES (NULL), (true), (true) AS tab(col);
true
> SELECT bool_and(col) FROM VALUES (true), (false), (true) AS tab(col);
false
Since: 3.0.0
bool_or
bool_or(expr) - Returns true if at least one value of expr is true.
Examples:
sql
> SELECT bool_or(col) FROM VALUES (true), (false), (false) AS tab(col);
true
> SELECT bool_or(col) FROM VALUES (NULL), (true), (false) AS tab(col);
true
> SELECT bool_or(col) FROM VALUES (false), (false), (NULL) AS tab(col);
false
Since: 3.0.0
boolean
boolean(expr) - Casts the value expr to the target data type boolean.
Since: 2.0.1
bround
bround(expr, d) - Returns expr rounded to d decimal places using HALF_EVEN rounding mode.
Examples:
sql
> SELECT bround(2.5, 0);
2
> SELECT bround(25, -1);
20
Since: 2.0.0
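HALF_EVEN (banker's rounding) sends a tie to the nearest even digit, which is why 2.5 becomes 2 and 25 rounded to tens becomes 20. Python's `decimal` module provides the same rounding mode, so the behavior can be sketched as follows (an illustration, not Spark's implementation):

```python
from decimal import Decimal, ROUND_HALF_EVEN


def bround(value, d=0):
    # 10**(-d) as a Decimal: d=0 -> 1, d=2 -> 0.01, d=-1 -> 1E+1.
    quantum = Decimal(1).scaleb(-d)
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)


print(bround(2.5, 0))        # 2
print(bround(3.5, 0))        # 4
print(int(bround(25, -1)))   # 20
```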
btrim
btrim(str) - Removes the leading and trailing space characters from str.
btrim(str, trimStr) - Remove the leading and trailing trimStr characters from str.
Arguments:
- str - a string expression
- trimStr - the trim string characters to trim, the default value is a single space
Examples:
sql
> SELECT btrim(' SparkSQL ');
SparkSQL
> SELECT btrim(encode(' SparkSQL ', 'utf-8'));
SparkSQL
> SELECT btrim('SSparkSQLS', 'SL');
parkSQ
> SELECT btrim(encode('SSparkSQLS', 'utf-8'), encode('SL', 'utf-8'));
parkSQ
Since: 3.2.0
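Note that trimStr is treated as a set of characters rather than a substring, which is why 'SL' removes the leading "SS" and the trailing "LS". Python's `str.strip` behaves the same way, so the string examples can be sketched as follows (the default of a single space follows the argument description above):

```python
def btrim(s, trim_chars=" "):
    # strip() removes any leading/trailing characters found in trim_chars.
    return s.strip(trim_chars)


print(btrim(" SparkSQL "))        # SparkSQL
print(btrim("SSparkSQLS", "SL"))  # parkSQ
```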
3. C
cardinality
cardinality(expr) - Returns the size of an array or a map. This function returns -1 for null input only if spark.sql.ansi.enabled is false and spark.sql.legacy.sizeOfNull is true. Otherwise, it returns null for null input. With the default settings, the function returns null for null input.
Examples:
sql
> SELECT cardinality(array('b', 'd', 'c', 'a'));
4
> SELECT cardinality(map('a', 1, 'b', 2));
2
Since: 2.4.0
case
CASE expr1 WHEN expr2 THEN expr3 [WHEN expr4 THEN expr5]* [ELSE expr6] END - When expr1 = expr2, returns expr3; when expr1 = expr4, returns expr5; otherwise returns expr6.
Arguments:
- expr1 - the expression which is one operand of comparison.
- expr2, expr4 - the expressions each of which is the other operand of comparison.
- expr3, expr5, expr6 - the branch value expressions and else value expression should all be same type or coercible to a common type.
Examples:
sql
> SELECT CASE col1 WHEN 1 THEN 'one' WHEN 2 THEN 'two' ELSE '?' END FROM VALUES 1, 2, 3;
one
two
?
> SELECT CASE col1 WHEN 1 THEN 'one' WHEN 2 THEN 'two' END FROM VALUES 1, 2, 3;
one
two
NULL
Since: 1.0.1
cast
cast(expr AS type) - Casts the value expr to the target data type type. expr :: type alternative casting syntax is also supported.
Examples:
sql
> SELECT cast('10' as int);
10
> SELECT '10' :: int;
10
Since: 1.0.0
cbrt
cbrt(expr) - Returns the cube root of expr.
Examples:
sql
> SELECT cbrt(27.0);
3.0
Since: 1.4.0
ceil
ceil(expr[, scale]) - Returns the smallest number after rounding up that is not smaller than expr. An optional scale parameter can be specified to control the rounding behavior.
Examples:
sql
> SELECT ceil(-0.1);
0
> SELECT ceil(5);
5
> SELECT ceil(3.1411, 3);
3.142
> SELECT ceil(3.1411, -3);
1000
Since: 3.3.0
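The scale argument works like a number of decimal places: positive values keep digits after the decimal point, negative values round up to tens, hundreds, and so on. A Python sketch using `decimal` with ROUND_CEILING reproduces the examples above; it is an illustration of the rounding rule only, and the helper name `ceil_scale` is hypothetical.

```python
from decimal import Decimal, ROUND_CEILING


def ceil_scale(value, scale=0):
    # 10**(-scale) as a Decimal: scale=3 -> 0.001, scale=-3 -> 1E+3.
    quantum = Decimal(1).scaleb(-scale)
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_CEILING)


print(int(ceil_scale(-0.1)))        # 0
print(ceil_scale(3.1411, 3))        # 3.142
print(int(ceil_scale(3.1411, -3)))  # 1000
```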
ceiling
ceiling(expr[, scale]) - Returns the smallest number after rounding up that is not smaller than expr. An optional scale parameter can be specified to control the rounding behavior.
Examples:
sql
> SELECT ceiling(-0.1);
0
> SELECT ceiling(5);
5
> SELECT ceiling(3.1411, 3);
3.142
> SELECT ceiling(3.1411, -3);
1000
Since: 3.3.0
char
char(expr) - Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to chr(expr % 256).
Examples:
sql
> SELECT char(65);
A
Since: 2.3.0
char_length
char_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
Examples:
sql
> SELECT char_length('Spark SQL ');
10
> SELECT char_length(x'537061726b2053514c');
9
> SELECT CHAR_LENGTH('Spark SQL ');
10
> SELECT CHARACTER_LENGTH('Spark SQL ');
10
Since: 2.3.0
character_length
character_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
Examples:
sql
> SELECT character_length('Spark SQL ');
10
> SELECT character_length(x'537061726b2053514c');
9
> SELECT CHAR_LENGTH('Spark SQL ');
10
> SELECT CHARACTER_LENGTH('Spark SQL ');
10
Since: 2.3.0
chr
chr(expr) - Returns the ASCII character having the binary equivalent to expr. If expr is larger than 256, the result is equivalent to chr(expr % 256).
Examples:
sql
> SELECT chr(65);
A
Since: 2.3.0
coalesce
coalesce(expr1, expr2, ...) - Returns the first non-null argument if exists. Otherwise, null.
Examples:
sql
> SELECT coalesce(NULL, 1, NULL);
1
Since: 1.0.0
collate
collate(expr, collationName) - Marks a given expression with the specified collation.
Arguments:
- expr - String expression to perform collation on.
- collationName - Foldable string expression that specifies the collation name.
Examples:
sql
> SELECT COLLATION('Spark SQL' collate UTF8_LCASE);
SYSTEM.BUILTIN.UTF8_LCASE
Since: 4.0.0
collation
collation(expr) - Returns the collation name of a given expression.
Arguments:
- expr - String expression to perform collation on.
Examples:
sql
> SELECT collation('Spark SQL');
SYSTEM.BUILTIN.UTF8_BINARY
Since: 4.0.0
collations
collations() - Gets all of the Spark SQL string collations.
Examples:
sql
> SELECT * FROM collations() WHERE NAME = 'UTF8_BINARY';
SYSTEM BUILTIN UTF8_BINARY NULL NULL ACCENT_SENSITIVE CASE_SENSITIVE NO_PAD NULL
Since: 4.0.0
collect_list
collect_list(expr) - Collects and returns a list of non-unique elements.
Examples:
sql
> SELECT collect_list(col) FROM VALUES (1), (2), (1) AS tab(col);
[1,2,1]
Note:
The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
Since: 2.0.0
collect_set
collect_set(expr) - Collects and returns a set of unique elements.
Examples:
sql
> SELECT collect_set(col) FROM VALUES (1), (2), (1) AS tab(col);
[1,2]
Note:
The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
Since: 2.0.0
concat
concat(col1, col2, ..., colN) - Returns the concatenation of col1, col2, ..., colN.
Examples:
sql
> SELECT concat('Spark', 'SQL');
SparkSQL
> SELECT concat(array(1, 2, 3), array(4, 5), array(6));
[1,2,3,4,5,6]
Note:
Concat logic for arrays is available since 2.4.0.
Since: 1.5.0
concat_ws
concat_ws(sep[, str | array(str)]+) - Returns the concatenation of the strings separated by sep, skipping null values.
Examples:
sql
> SELECT concat_ws(' ', 'Spark', 'SQL');
Spark SQL
> SELECT concat_ws('s');
> SELECT concat_ws('/', 'foo', null, 'bar');
foo/bar
> SELECT concat_ws(null, 'Spark', 'SQL');
NULL
Since: 1.5.0
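The examples above cover three edge cases: no values at all (the result is an empty string, so nothing is printed), null values (skipped), and a null separator (the result is NULL). A Python sketch of those rules, with `None` standing in for NULL and lists standing in for array(str) arguments:

```python
def concat_ws(sep, *args):
    if sep is None:
        # A null separator makes the whole result null.
        return None
    parts = []
    for arg in args:
        if arg is None:
            continue  # null values are skipped
        if isinstance(arg, (list, tuple)):
            parts.extend(x for x in arg if x is not None)
        else:
            parts.append(arg)
    return sep.join(parts)


print(concat_ws(" ", "Spark", "SQL"))       # Spark SQL
print(repr(concat_ws("s")))                 # ''
print(concat_ws("/", "foo", None, "bar"))   # foo/bar
print(concat_ws(None, "Spark", "SQL"))      # None
```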
contains
contains(left, right) - Returns a boolean. The value is True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left and right must be of STRING or BINARY type.
Examples:
sql
> SELECT contains('Spark SQL', 'Spark');
true
> SELECT contains('Spark SQL', 'SPARK');
false
> SELECT contains('Spark SQL', null);
NULL
> SELECT contains(x'537061726b2053514c', x'537061726b');
true
Since: 3.3.0
conv
conv(num, from_base, to_base) - Convert num from from_base to to_base.
Examples:
sql
> SELECT conv('100', 2, 10);
4
> SELECT conv(-10, 16, -10);
-16
Since: 1.5.0
convert_timezone
convert_timezone([sourceTz, ]targetTz, sourceTs) - Converts the timestamp without time zone sourceTs from the sourceTz time zone to targetTz.
Arguments:
- sourceTz - the time zone for the input timestamp. If it is omitted, the current session time zone is used as the source time zone.
- targetTz - the time zone to which the input timestamp should be converted
- sourceTs - a timestamp without time zone
Examples:
sql
> SELECT convert_timezone('Europe/Brussels', 'America/Los_Angeles', timestamp_ntz'2021-12-06 00:00:00');
2021-12-05 15:00:00
> SELECT convert_timezone('Europe/Brussels', timestamp_ntz'2021-12-05 15:00:00');
2021-12-06 00:00:00
Since: 3.4.0
corr
corr(expr1, expr2) - Returns Pearson coefficient of correlation between a set of number pairs.
Examples:
sql
> SELECT corr(c1, c2) FROM VALUES (3, 2), (3, 3), (6, 4) as tab(c1, c2);
0.8660254037844387
Since: 1.6.0
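For reference, the Pearson coefficient is the covariance of the pairs divided by the product of their standard deviations. The plain Python sketch below applies that textbook formula to the example data; it is a worked check, not Spark's aggregation code.

```python
import math


def corr(pairs):
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


print(corr([(3, 2), (3, 3), (6, 4)]))  # approximately 0.8660254
```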
cos
cos(expr) - Returns the cosine of expr, as if computed by java.lang.Math.cos.
Arguments:
- expr - angle in radians
Examples:
sql
> SELECT cos(0);
1.0
Since: 1.4.0
cosh
cosh(expr) - Returns the hyperbolic cosine of expr, as if computed by java.lang.Math.cosh.
Arguments:
- expr - hyperbolic angle
Examples:
sql
> SELECT cosh(0);
1.0
Since: 1.4.0
cot
cot(expr) - Returns the cotangent of expr, as if computed by 1/java.lang.Math.tan.
Arguments:
- expr - angle in radians
Examples:
sql
> SELECT cot(1);
0.6420926159343306
Since: 2.3.0
count
count(*) - Returns the total number of retrieved rows, including rows containing null.
count(expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are all non-null.
count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null.
Examples:
sql
> SELECT count(*) FROM VALUES (NULL), (5), (5), (20) AS tab(col);
4
> SELECT count(col) FROM VALUES (NULL), (5), (5), (20) AS tab(col);
3
> SELECT count(DISTINCT col) FROM VALUES (NULL), (5), (5), (10) AS tab(col);
2
Since: 1.0.0
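The three forms of count differ only in how they treat nulls and duplicates. A Python sketch over a single column, with `None` standing in for NULL (the helper names are hypothetical):

```python
def count_star(rows):
    # count(*): every row counts, including nulls.
    return len(rows)


def count_col(rows):
    # count(col): only non-null values count.
    return sum(1 for v in rows if v is not None)


def count_distinct(rows):
    # count(DISTINCT col): unique non-null values.
    return len({v for v in rows if v is not None})


print(count_star([None, 5, 5, 20]))      # 4
print(count_col([None, 5, 5, 20]))       # 3
print(count_distinct([None, 5, 5, 10]))  # 2
```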
count_if
count_if(expr) - Returns the number of TRUE values for the expression.
Examples:
sql
> SELECT count_if(col % 2 = 0) FROM VALUES (NULL), (0), (1), (2), (3) AS tab(col);
2
> SELECT count_if(col IS NULL) FROM VALUES (NULL), (0), (1), (2), (3) AS tab(col);
1
Since: 3.0.0
count_min_sketch
count_min_sketch(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given eps, confidence and seed. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for frequency estimation using sub-linear space.
Examples:
sql
> SELECT hex(count_min_sketch(col, 0.5d, 0.5d, 1)) FROM VALUES (1), (2), (1) AS tab(col);
0000000100000000000000030000000100000004000000005D8D6AB90000000000000000000000000000000200000000000000010000000000000000
Since: 2.2.0
covar_pop
covar_pop(expr1, expr2) - Returns the population covariance of a set of number pairs.
Examples:
sql
> SELECT covar_pop(c1, c2) FROM VALUES (1,1), (2,2), (3,3) AS tab(c1, c2);
0.6666666666666666
Since: 2.0.0
covar_samp
covar_samp(expr1, expr2) - Returns the sample covariance of a set of number pairs.
Examples:
sql
> SELECT covar_samp(c1, c2) FROM VALUES (1,1), (2,2), (3,3) AS tab(c1, c2);
1.0
Since: 2.0.0
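covar_pop and covar_samp differ only in the divisor: n for the population covariance and n - 1 for the sample covariance. A plain Python check of the two example results, using the textbook formulas rather than Spark's code:

```python
def _sum_of_products(pairs):
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs)


def covar_pop(pairs):
    return _sum_of_products(pairs) / len(pairs)


def covar_samp(pairs):
    return _sum_of_products(pairs) / (len(pairs) - 1)


data = [(1, 1), (2, 2), (3, 3)]
print(covar_pop(data))   # approximately 0.6667
print(covar_samp(data))  # 1.0
```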
crc32
crc32(expr) - Returns a cyclic redundancy check value of the expr as a bigint.
Examples:
sql
> SELECT crc32('Spark');
1557323817
Since: 1.5.0
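The value above can be cross-checked outside Spark with Python's `zlib.crc32`, assuming the function uses the standard CRC-32 checksum over the UTF-8 bytes of the string:

```python
import zlib

# Standard CRC-32 of the UTF-8 bytes, as an unsigned integer.
checksum = zlib.crc32("Spark".encode("utf-8"))
print(checksum)  # 1557323817
```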
csc
csc(expr) - Returns the cosecant of expr, as if computed by 1/java.lang.Math.sin.
Arguments:
- expr - angle in radians
Examples:
sql
> SELECT csc(1);
1.1883951057781212
Since: 3.3.0
cume_dist
cume_dist() - Computes the position of a value relative to all values in the partition.
Examples:
sql
> SELECT a, b, cume_dist() OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
A1 1 0.6666666666666666
A1 1 0.6666666666666666
A1 2 1.0
A2 3 1.0
Since: 2.0.0
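Within one partition, the results above equal the number of rows whose ordering value is less than or equal to the current row's value, divided by the partition size. A Python sketch for a single partition (here the 'A1' rows, ordered by b):

```python
def cume_dist(values):
    n = len(values)
    # Fraction of rows in the partition that sort at or before each row.
    return [sum(1 for other in values if other <= v) / n for v in values]


print(cume_dist([1, 1, 2]))  # two values of about 0.667, then 1.0
print(cume_dist([3]))        # [1.0]
```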
curdate
curdate() - Returns the current date at the start of query evaluation. All calls of curdate within the same query return the same value.
Examples:
sql
> SELECT curdate();
2022-09-06
Since: 3.4.0
current_catalog
current_catalog() - Returns the current catalog.
Examples:
sql
> SELECT current_catalog();
spark_catalog
Since: 3.1.0
current_database
current_database() - Returns the current database.
Examples:
sql
> SELECT current_database();
default
Since: 1.6.0
current_date
current_date() - Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value.
current_date - Returns the current date at the start of query evaluation.
Examples:
sql
> SELECT current_date();
2020-04-25
> SELECT current_date;
2020-04-25
Note:
The syntax without parentheses has been supported since 2.0.1.
Since: 1.5.0
current_schema
current_schema() - Returns the current database.
Examples:
sql
> SELECT current_schema();
default
Since: 3.4.0
current_timestamp
current_timestamp() - Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value.
current_timestamp - Returns the current timestamp at the start of query evaluation.
Examples:
sql
> SELECT current_timestamp();
2020-04-25 15:49:11.914
> SELECT current_timestamp;
2020-04-25 15:49:11.914
Note:
The syntax without parentheses has been supported since 2.0.1.
Since: 1.5.0
current_timezone
current_timezone() - Returns the current session local timezone.
Examples:
sql
> SELECT current_timezone();
Asia/Shanghai
Since: 3.1.0
current_user
current_user() - Returns the user name of the current execution context.
Examples:
sql
> SELECT current_user();
mockingjay
Since: 3.2.0