3. Spark Functions: d/e/f/g/h/i/j/k/l


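No.  Type             Article
1    Spark functions  1. Spark Functions: Symbols
2    Spark functions  2. Spark Functions: a/b/c
3    Spark functions  3. Spark Functions: d/e/f/g/h/i/j/k/l
4    Spark functions  4. Spark Functions: m/n/o/p/q/r
5    Spark functions  5. Spark Functions: s/t
6    Spark functions  6. Spark Functions: u/v/w/x/y/z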



4. D


date

date(expr) - Casts the value expr to the target data type date.

Since: 2.0.1


date_add

date_add(start_date, num_days) - Returns the date that is num_days after start_date.

Examples:

> SELECT date_add('2016-07-30', 1);
 2016-07-31

Since: 1.5.0


date_diff

date_diff(endDate, startDate) - Returns the number of days from startDate to endDate.

Examples:

> SELECT date_diff('2009-07-31', '2009-07-30');
 1

> SELECT date_diff('2009-07-30', '2009-07-31');
 -1

Since: 3.4.0
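
The day arithmetic behind date_add and date_diff can be mirrored with Python's standard library. This is only an illustrative sketch of the semantics described above, not Spark's implementation; the Python function names are chosen to match the SQL names, and inputs are assumed to be ISO yyyy-MM-dd strings.

```python
from datetime import date, timedelta


def date_add(start_date: str, num_days: int) -> str:
    """Return the date that is num_days after start_date."""
    return (date.fromisoformat(start_date) + timedelta(days=num_days)).isoformat()


def date_diff(end_date: str, start_date: str) -> int:
    """Return the number of days from start_date to end_date.

    The result is negative when end_date is earlier than start_date.
    """
    return (date.fromisoformat(end_date) - date.fromisoformat(start_date)).days


print(date_add("2016-07-30", 1))              # 2016-07-31
print(date_diff("2009-07-31", "2009-07-30"))  # 1
print(date_diff("2009-07-30", "2009-07-31"))  # -1
```

Note the argument order of date_diff: the end date comes first, so swapping the arguments flips the sign.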


date_format

date_format(timestamp, fmt) - Converts timestamp to a value of string in the format specified by the date format fmt.

Arguments:

  • timestamp - A date/timestamp or string to be converted to the given format.
  • fmt - Date/time format pattern to follow. See Datetime Patterns for valid date and time format patterns.

Examples:

> SELECT date_format('2016-04-08', 'y');
 2016

Since: 1.5.0


date_from_unix_date

date_from_unix_date(days) - Create date from the number of days since 1970-01-01.

Examples:

> SELECT date_from_unix_date(1);
 1970-01-02

Since: 3.1.0


date_part

date_part(field, source) - Extracts a part of the date/timestamp or interval source.

Arguments:

  • field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function EXTRACT.
  • source - a date/timestamp or interval column from where field should be extracted

Examples:

> SELECT date_part('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
 2019
> SELECT date_part('week', timestamp'2019-08-12 01:00:00.123456');
 33
> SELECT date_part('doy', DATE'2019-08-12');
 224
> SELECT date_part('SECONDS', timestamp'2019-10-01 00:00:01.000001');
 1.000001
> SELECT date_part('days', interval 5 days 3 hours 7 minutes);
 5
> SELECT date_part('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
 30.001001
> SELECT date_part('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
 11
> SELECT date_part('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
 55

Note:

The date_part function is equivalent to the SQL-standard function EXTRACT(field FROM source)

Since: 3.0.0


date_sub

date_sub(start_date, num_days) - Returns the date that is num_days before start_date.

Examples:

> SELECT date_sub('2016-07-30', 1);
 2016-07-29

Since: 1.5.0


date_trunc

date_trunc(fmt, ts) - Returns timestamp ts truncated to the unit specified by the format model fmt.

Arguments:

  • fmt - the format representing the unit to be truncated to
    • "YEAR", "YYYY", "YY" - truncate to the first date of the year that ts falls in; the time part is zeroed out
    • "QUARTER" - truncate to the first date of the quarter that ts falls in; the time part is zeroed out
    • "MONTH", "MM", "MON" - truncate to the first date of the month that ts falls in; the time part is zeroed out
    • "WEEK" - truncate to the Monday of the week that ts falls in; the time part is zeroed out
    • "DAY", "DD" - zero out the time part
    • "HOUR" - zero out the minutes and seconds, including the fractional part
    • "MINUTE" - zero out the seconds, including the fractional part
    • "SECOND" - zero out the fractional part of the seconds
    • "MILLISECOND" - zero out the microseconds
    • "MICROSECOND" - everything remains
  • ts - datetime value or valid timestamp string

Examples:

> SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359');
 2015-01-01 00:00:00
> SELECT date_trunc('MM', '2015-03-05T09:32:05.359');
 2015-03-01 00:00:00
> SELECT date_trunc('DD', '2015-03-05T09:32:05.359');
 2015-03-05 00:00:00
> SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');
 2015-03-05 09:00:00
> SELECT date_trunc('MILLISECOND', '2015-03-05T09:32:05.123456');
 2015-03-05 09:32:05.123

Since: 2.3.0
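
The truncation rules listed for date_trunc can be sketched in plain Python. This is a hedged illustration of the unit table above, not Spark's code: the function name mirrors the SQL name, the input is assumed to be an ISO timestamp string, and raising ValueError for an unknown unit is a choice made only for this sketch.

```python
from datetime import datetime, timedelta


def date_trunc(fmt: str, ts: str) -> datetime:
    """Truncate the ISO timestamp string ts to the unit named by fmt."""
    t = datetime.fromisoformat(ts)
    unit = fmt.upper()
    midnight = t.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit in ("YEAR", "YYYY", "YY"):
        return midnight.replace(month=1, day=1)
    if unit == "QUARTER":
        # First month of the quarter: 1, 4, 7 or 10.
        return midnight.replace(month=3 * ((t.month - 1) // 3) + 1, day=1)
    if unit in ("MONTH", "MM", "MON"):
        return midnight.replace(day=1)
    if unit == "WEEK":
        # weekday() is 0 for Monday, so this steps back to the Monday.
        return midnight - timedelta(days=t.weekday())
    if unit in ("DAY", "DD"):
        return midnight
    if unit == "HOUR":
        return t.replace(minute=0, second=0, microsecond=0)
    if unit == "MINUTE":
        return t.replace(second=0, microsecond=0)
    if unit == "SECOND":
        return t.replace(microsecond=0)
    if unit == "MILLISECOND":
        return t.replace(microsecond=t.microsecond // 1000 * 1000)
    if unit == "MICROSECOND":
        return t
    raise ValueError(f"unsupported unit in this sketch: {fmt}")


print(date_trunc("YEAR", "2015-03-05T09:32:05.359"))  # 2015-01-01 00:00:00
print(date_trunc("HOUR", "2015-03-05T09:32:05.359"))  # 2015-03-05 09:00:00
```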


dateadd

dateadd(start_date, num_days) - Returns the date that is num_days after start_date.

Examples:

> SELECT dateadd('2016-07-30', 1);
 2016-07-31

Since: 3.4.0


datediff

datediff(endDate, startDate) - Returns the number of days from startDate to endDate.

Examples:

> SELECT datediff('2009-07-31', '2009-07-30');
 1

> SELECT datediff('2009-07-30', '2009-07-31');
 -1

Since: 1.5.0


datepart

datepart(field, source) - Extracts a part of the date/timestamp or interval source.

Arguments:

  • field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function EXTRACT.
  • source - a date/timestamp or interval column from where field should be extracted

Examples:

> SELECT datepart('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
 2019
> SELECT datepart('week', timestamp'2019-08-12 01:00:00.123456');
 33
> SELECT datepart('doy', DATE'2019-08-12');
 224
> SELECT datepart('SECONDS', timestamp'2019-10-01 00:00:01.000001');
 1.000001
> SELECT datepart('days', interval 5 days 3 hours 7 minutes);
 5
> SELECT datepart('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
 30.001001
> SELECT datepart('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
 11
> SELECT datepart('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
 55

Note:

The datepart function is equivalent to the SQL-standard function EXTRACT(field FROM source)

Since: 3.4.0


day

day(date) - Returns the day of month of the date/timestamp.

Examples:

> SELECT day('2009-07-30');
 30

Since: 1.5.0


dayname

dayname(date) - Returns the three-letter abbreviated day name from the given date.

Examples:

> SELECT dayname(DATE('2008-02-20'));
 Wed

Since: 4.0.0


dayofmonth

dayofmonth(date) - Returns the day of month of the date/timestamp.

Examples:

> SELECT dayofmonth('2009-07-30');
 30

Since: 1.5.0


dayofweek

dayofweek(date) - Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday).

Examples:

> SELECT dayofweek('2009-07-30');
 5

Since: 2.3.0
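
The Sunday-based numbering of dayofweek differs from the ISO numbering (Monday = 1) that many libraries use. The following Python sketch shows one way to convert between the two; the function name mirrors the SQL name and the input is assumed to be an ISO yyyy-MM-dd string.

```python
from datetime import date


def dayofweek(d: str) -> int:
    """Return 1 for Sunday, 2 for Monday, ..., 7 for Saturday."""
    # isoweekday() gives Monday=1 ... Sunday=7; rotate so Sunday becomes 1.
    return date.fromisoformat(d).isoweekday() % 7 + 1


print(dayofweek("2009-07-30"))  # 5, a Thursday
```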


dayofyear

dayofyear(date) - Returns the day of year of the date/timestamp.

Examples:

> SELECT dayofyear('2016-04-09');
 100

Since: 1.5.0


decimal

decimal(expr) - Casts the value expr to the target data type decimal.

Since: 2.0.1


decode

decode(bin, charset) - Decodes the first argument using the second argument character set. If either argument is null, the result will also be null.

decode(expr, search, result [, search, result ] ... [, default]) - Compares expr to each search value in order. If expr is equal to a search value, decode returns the corresponding result. If no match is found, then it returns default. If default is omitted, it returns null.

Arguments:

  • bin - a binary expression to decode
  • charset - one of the charsets 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32' to decode bin into a STRING. It is case insensitive.

Examples:

> SELECT decode(encode('abc', 'utf-8'), 'utf-8');
 abc
> SELECT decode(2, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
 San Francisco
> SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
 Non domestic
> SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle');
 NULL
> SELECT decode(null, 6, 'Spark', NULL, 'SQL', 4, 'rocks');
 SQL

Note:

decode(expr, search, result [, search, result ] ... [, default]) is supported since 3.2.0

Since: 1.5.0
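
The search/result form of decode behaves like a chain of comparisons with an optional trailing default. A minimal Python sketch of that matching rule follows; it is not Spark's implementation, None stands in for SQL NULL, and the examples above suggest that a NULL expr matches a NULL search value, which Python's `None == None` happens to reproduce.

```python
def decode(expr, *args):
    """Return the result paired with the first search value equal to expr."""
    # An odd number of trailing arguments means the last one is the default.
    if len(args) % 2 == 1:
        pairs, default = args[:-1], args[-1]
    else:
        pairs, default = args, None
    for search, result in zip(pairs[0::2], pairs[1::2]):
        if expr == search:  # None == None is True, so NULL matches NULL
            return result
    return default


cities = (1, "Southlake", 2, "San Francisco", 3, "New Jersey", 4, "Seattle")
print(decode(2, *cities, "Non domestic"))  # San Francisco
print(decode(6, *cities, "Non domestic"))  # Non domestic
print(decode(6, *cities))                  # None
```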


degrees

degrees(expr) - Converts radians to degrees.

Arguments:

  • expr - angle in radians

Examples:

> SELECT degrees(3.141592653589793);
 180.0

Since: 1.4.0


dense_rank

dense_rank() - Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps in the ranking sequence.

Arguments:

  • children - the expressions on which the rank is based; a change in the value of one of the children triggers a change in rank. This is an internal parameter and is assigned by the Analyser.

Examples:

> SELECT a, b, dense_rank(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
 A1 1   1
 A1 1   1
 A1 2   2
 A2 3   1

Since: 2.0.0
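
The "one plus the previously assigned rank" rule can be sketched in Python as follows. This is an illustration only: rows are assumed to be (partition key, ordering value) tuples, and sorting the tuples stands in for PARTITION BY a ORDER BY b.

```python
def dense_rank(rows):
    """Return (a, b, rank) triples, ranking b within each partition a."""
    ranked = []
    for a, b in sorted(rows):
        if ranked and ranked[-1][0] == a:
            prev_b, prev_rank = ranked[-1][1], ranked[-1][2]
            # Ties keep the rank; a new value takes the next rank, no gaps.
            rank = prev_rank if b == prev_b else prev_rank + 1
        else:
            rank = 1  # first row of a new partition
        ranked.append((a, b, rank))
    return ranked


for row in dense_rank([("A1", 2), ("A1", 1), ("A2", 3), ("A1", 1)]):
    print(row)
```

With the plain rank function, the third row of partition A1 would get rank 3 instead of 2, because the two tied rows consume two positions.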


div

expr1 div expr2 - Divides expr1 by expr2. It returns NULL if an operand is NULL or expr2 is 0. The result is cast to long.

Examples:

> SELECT 3 div 2;
 1
> SELECT INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH;
 -13

Since: 3.0.0


double

double(expr) - Casts the value expr to the target data type double.

Since: 2.0.1

5. E


e

e() - Returns Euler's number, e.

Examples:

> SELECT e();
 2.718281828459045

Since: 1.5.0


element_at

element_at(array, index) - Returns the element of array at the given (1-based) index. If index is 0, Spark throws an error. If index < 0, elements are accessed from the last to the first. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices.

element_at(map, key) - Returns value for given key. The function returns NULL if the key is not contained in the map.

Examples:

> SELECT element_at(array(1, 2, 3), 2);
 2
> SELECT element_at(map(1, 'a', 2, 'b'), 2);
 b

Since: 2.4.0
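
The indexing rules of the array form of element_at can be sketched in Python. This is a hedged model of the description above: the `ansi` flag is a hypothetical stand-in for spark.sql.ansi.enabled, and the Python exception types are chosen only for the sketch.

```python
def element_at(arr, index, ansi=False):
    """Look up arr with a 1-based index; negative indices count from the end."""
    if index == 0:
        raise ValueError("index 0 is invalid: element_at is 1-based")
    if abs(index) > len(arr):
        if ansi:
            raise IndexError(f"index {index} is out of bounds")
        return None  # stands in for SQL NULL
    return arr[index - 1] if index > 0 else arr[index]


print(element_at([1, 2, 3], 2))   # 2
print(element_at([1, 2, 3], -1))  # 3
print(element_at([1, 2, 3], 4))   # None
```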


elt

elt(n, input1, input2, ...) - Returns the n-th input, e.g., returns input2 when n is 2. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices.

Examples:

> SELECT elt(1, 'scala', 'java');
 scala
> SELECT elt(2, 'a', 1);
 1

Since: 2.0.0


encode

encode(str, charset) - Encodes the first argument using the second argument character set. If either argument is null, the result will also be null.

Arguments:

  • str - a string expression
  • charset - one of the charsets 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32' to encode str into a BINARY. It is case insensitive.

Examples:

> SELECT encode('abc', 'utf-8');
 abc

Since: 1.5.0


endswith

endswith(left, right) - Returns a boolean. The value is True if left ends with right. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left and right must be of STRING or BINARY type.

Examples:

> SELECT endswith('Spark SQL', 'SQL');
 true
> SELECT endswith('Spark SQL', 'Spark');
 false
> SELECT endswith('Spark SQL', null);
 NULL
> SELECT endswith(x'537061726b2053514c', x'537061726b');
 false
> SELECT endswith(x'537061726b2053514c', x'53514c');
 true

Since: 3.3.0


equal_null

equal_null(expr1, expr2) - Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both are null and false if only one of them is null.

Arguments:

  • expr1, expr2 - the two expressions must be of the same type or castable to a common type, and must be of a type that can be used in equality comparison. Map type is not supported. For complex types such as array/struct, the data types of the fields must be orderable.

Examples:

> SELECT equal_null(3, 3);
 true
> SELECT equal_null(1, '11');
 false
> SELECT equal_null(true, NULL);
 false
> SELECT equal_null(NULL, 'abc');
 false
> SELECT equal_null(NULL, NULL);
 true

Since: 3.4.0
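
The difference between equal_null and the ordinary = operator only shows up when a NULL is involved. A small Python sketch of both, with None standing in for NULL, makes the contrast explicit; `sql_equal` is a hypothetical helper name used here for the three-valued = operator, and type coercion is not modelled.

```python
def sql_equal(a, b):
    """Three-valued '=': the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def equal_null(a, b):
    """Null-safe equality: always returns True or False, never None."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


print(sql_equal(None, None), equal_null(None, None))  # None True
print(sql_equal(True, None), equal_null(True, None))  # None False
print(sql_equal(3, 3), equal_null(3, 3))              # True True
```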


every

every(expr) - Returns true if all values of expr are true.

Examples:

> SELECT every(col) FROM VALUES (true), (true), (true) AS tab(col);
 true
> SELECT every(col) FROM VALUES (NULL), (true), (true) AS tab(col);
 true
> SELECT every(col) FROM VALUES (true), (false), (true) AS tab(col);
 false

Since: 3.0.0


exists

exists(expr, pred) - Tests whether a predicate holds for one or more elements in the array.

Examples:

> SELECT exists(array(1, 2, 3), x -> x % 2 == 0);
 true
> SELECT exists(array(1, 2, 3), x -> x % 2 == 10);
 false
> SELECT exists(array(1, null, 3), x -> x % 2 == 0);
 NULL
> SELECT exists(array(0, null, 2, 3, null), x -> x IS NULL);
 true
> SELECT exists(array(1, 2, 3), x -> x IS NULL);
 false

Since: 2.4.0
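
The NULL result in the third example above follows from three-valued logic: no element is known to satisfy the predicate, but one comparison is unknown. A Python sketch of that rule, with None standing in for NULL and `is_even` as a hypothetical predicate that propagates NULL:

```python
def exists(arr, pred):
    """True if pred holds for any element; None if undecided because of NULLs."""
    saw_unknown = False
    for x in arr:
        result = pred(x)
        if result is True:
            return True
        if result is None:
            saw_unknown = True
    return None if saw_unknown else False


def is_even(x):
    # Comparisons with NULL yield NULL, modelled here as None.
    return None if x is None else x % 2 == 0


print(exists([1, 2, 3], is_even))     # True
print(exists([1, None, 3], is_even))  # None
print(exists([1, 2, 3], lambda x: x is None))  # False
```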


exp

exp(expr) - Returns e to the power of expr.

Examples:

> SELECT exp(0);
 1.0

Since: 1.4.0


explode

explode(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Unless specified otherwise, uses the default column name col for elements of the array or key and value for the elements of the map.

Examples:

> SELECT explode(array(10, 20));
 10
 20
> SELECT explode(collection => array(10, 20));
 10
 20

Since: 1.0.0


explode_outer

explode_outer(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Unlike explode, it produces a row of nulls when the array or map is null or empty. Unless specified otherwise, uses the default column name col for elements of the array or key and value for the elements of the map.

Examples:

> SELECT explode_outer(array(10, 20));
 10
 20
> SELECT explode_outer(collection => array(10, 20));
 10
 20

Since: 1.0.0


expm1

expm1(expr) - Returns exp(expr) - 1.

Examples:

> SELECT expm1(0);
 0.0

Since: 1.4.0


extract

extract(field FROM source) - Extracts a part of the date/timestamp or interval source.

Arguments:

  • field - selects which part of the source should be extracted
    • Supported string values of field for dates and timestamps are (case insensitive):
      • "YEAR", ("Y", "YEARS", "YR", "YRS") - the year field
      • "YEAROFWEEK" - the ISO 8601 week-numbering year that the datetime falls in. For example, 2005-01-02 is part of the 53rd week of year 2004, so the result is 2004
      • "QUARTER", ("QTR") - the quarter (1 - 4) of the year that the datetime falls in
      • "MONTH", ("MON", "MONS", "MONTHS") - the month field (1 - 12)
      • "WEEK", ("W", "WEEKS") - the number of the ISO 8601 week-of-week-based-year. A week is considered to start on a Monday and week 1 is the first week with >3 days. In the ISO week-numbering system, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-02 is part of the 53rd week of year 2004, while 2012-12-31 is part of the first week of 2013
      • "DAY", ("D", "DAYS") - the day of the month field (1 - 31)
      • "DAYOFWEEK", ("DOW") - the day of the week for datetime as Sunday(1) to Saturday(7)
      • "DAYOFWEEK_ISO", ("DOW_ISO") - ISO 8601 based day of the week for datetime as Monday(1) to Sunday(7)
      • "DOY" - the day of the year (1 - 365/366)
      • "HOUR", ("H", "HOURS", "HR", "HRS") - The hour field (0 - 23)
      • "MINUTE", ("M", "MIN", "MINS", "MINUTES") - the minutes field (0 - 59)
      • "SECOND", ("S", "SEC", "SECONDS", "SECS") - the seconds field, including fractional parts
    • Supported string values of field for interval (which consists of months, days, microseconds) are (case insensitive):
      • "YEAR", ("Y", "YEARS", "YR", "YRS") - the total months / 12
      • "MONTH", ("MON", "MONS", "MONTHS") - the total months % 12
      • "DAY", ("D", "DAYS") - the days part of interval
      • "HOUR", ("H", "HOURS", "HR", "HRS") - how many hours the microseconds contains
      • "MINUTE", ("M", "MIN", "MINS", "MINUTES") - how many minutes left after taking hours from microseconds
      • "SECOND", ("S", "SEC", "SECONDS", "SECS") - how many seconds, with fractions, are left after taking hours and minutes from microseconds
  • source - a date/timestamp or interval column from where field should be extracted

Examples:

> SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456');
 2019
> SELECT extract(week FROM timestamp'2019-08-12 01:00:00.123456');
 33
> SELECT extract(doy FROM DATE'2019-08-12');
 224
> SELECT extract(SECONDS FROM timestamp'2019-10-01 00:00:01.000001');
 1.000001
> SELECT extract(days FROM interval 5 days 3 hours 7 minutes);
 5
> SELECT extract(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
 30.001001
> SELECT extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH);
 11
> SELECT extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND);
 55

Note:

The extract function is equivalent to date_part(field, source).

Since: 3.0.0
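
The ISO 8601 week fields described above ("WEEK", "YEAROFWEEK", "DAYOFWEEK_ISO") correspond to what Python's `date.isocalendar()` returns, which makes the edge cases around New Year easy to check. The sketch below covers only a few date fields and is not Spark's implementation; the function name and the ValueError for other fields are choices made for this illustration.

```python
from datetime import date


def extract(field: str, d: date) -> int:
    """Extract a few of the date fields listed above from a date."""
    name = field.upper()
    iso_year, iso_week, iso_weekday = d.isocalendar()
    if name == "YEAROFWEEK":
        return iso_year
    if name in ("WEEK", "W", "WEEKS"):
        return iso_week
    if name in ("DAYOFWEEK_ISO", "DOW_ISO"):
        return iso_weekday               # Monday(1) .. Sunday(7)
    if name in ("DAYOFWEEK", "DOW"):
        return iso_weekday % 7 + 1       # Sunday(1) .. Saturday(7)
    if name == "DOY":
        return d.timetuple().tm_yday
    if name in ("QUARTER", "QTR"):
        return (d.month - 1) // 3 + 1
    raise ValueError(f"field not covered by this sketch: {field}")


print(extract("week", date(2005, 1, 2)), extract("yearofweek", date(2005, 1, 2)))  # 53 2004
print(extract("week", date(2012, 12, 31)))  # 1
print(extract("doy", date(2019, 8, 12)))    # 224
```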

6. F


factorial

factorial(expr) - Returns the factorial of expr. expr must be in the range [0..20]; otherwise, the result is null.

Examples:

> SELECT factorial(5);
 120

Since: 1.5.0
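
The upper bound of 20 coincides with the largest factorial that fits in a signed 64-bit integer, which is presumably why larger inputs yield null. A short Python sketch of the range rule, with None standing in for NULL:

```python
import math

INT64_MAX = 2**63 - 1


def factorial(n):
    """Return n! for 0 <= n <= 20, otherwise None."""
    if n is None or not 0 <= n <= 20:
        return None
    return math.factorial(n)


print(factorial(5))   # 120
print(factorial(21))  # None
# 20! still fits in a signed 64-bit integer, 21! does not.
print(math.factorial(20) <= INT64_MAX < math.factorial(21))  # True
```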


filter

filter(expr, func) - Filters the input array using the given predicate.

Examples:

> SELECT filter(array(1, 2, 3), x -> x % 2 == 1);
 [1,3]
> SELECT filter(array(0, 2, 3), (x, i) -> x > i);
 [2,3]
> SELECT filter(array(0, null, 2, 3, null), x -> x IS NOT NULL);
 [0,2,3]

Note:

The inner function may use the index argument since 3.0.0.

Since: 2.4.0


find_in_set

find_in_set(str, str_array) - Returns the index (1-based) of the given string (str) in the comma-delimited list (str_array). Returns 0, if the string was not found or if the given string (str) contains a comma.

Examples:

> SELECT find_in_set('ab','abc,b,ab,c,def');
 3

Since: 1.5.0
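
The lookup rule of find_in_set, including the special case for a search string that itself contains a comma, can be sketched in a few lines of Python. This is an illustration of the description above rather than Spark's code.

```python
def find_in_set(s: str, str_array: str) -> int:
    """Return the 1-based position of s in a comma-delimited list, or 0."""
    if "," in s:
        return 0  # a string containing a comma can never be a list item
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0


print(find_in_set("ab", "abc,b,ab,c,def"))  # 3
print(find_in_set("x", "abc,b,ab,c,def"))   # 0
```

Matching is on whole items, so 'ab' does not match the first item 'abc'.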


first

first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.

Examples:

> SELECT first(col) FROM VALUES (10), (5), (20) AS tab(col);
 10
> SELECT first(col) FROM VALUES (NULL), (5), (20) AS tab(col);
 NULL
> SELECT first(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
 5

Note:

The function is non-deterministic because its results depend on the order of the rows, which may be non-deterministic after a shuffle.

Since: 2.0.0


first_value

first_value(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.

Examples:

> SELECT first_value(col) FROM VALUES (10), (5), (20) AS tab(col);
 10
> SELECT first_value(col) FROM VALUES (NULL), (5), (20) AS tab(col);
 NULL
> SELECT first_value(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
 5

Note:

The function is non-deterministic because its results depend on the order of the rows, which may be non-deterministic after a shuffle.

Since: 2.0.0


flatten

flatten(arrayOfArrays) - Transforms an array of arrays into a single array.

Examples:

> SELECT flatten(array(array(1, 2), array(3, 4)));
 [1,2,3,4]

Since: 2.4.0


float

float(expr) - Casts the value expr to the target data type float.

Since: 2.0.1


floor

floor(expr[, scale]) - Returns the largest number after rounding down that is not greater than expr. An optional scale parameter can be specified to control the rounding behavior.

Examples:

> SELECT floor(-0.1);
 -1
> SELECT floor(5);
 5
> SELECT floor(3.1411, 3);
 3.141
> SELECT floor(3.1411, -3);
 0

Since: 3.3.0
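
The effect of the optional scale argument can be reproduced with Python's decimal module: a positive scale keeps that many digits after the decimal point, and a negative scale rounds down to a multiple of a power of ten. This is a sketch of the behavior shown in the examples, not Spark's implementation; converting the input through `str` is an assumption made to avoid binary floating-point noise.

```python
from decimal import Decimal, ROUND_FLOOR


def floor(expr, scale: int = 0) -> Decimal:
    """Round expr down (toward negative infinity) at the given scale."""
    quantum = Decimal(1).scaleb(-scale)  # 10 ** -scale, e.g. 0.001 for scale 3
    return Decimal(str(expr)).quantize(quantum, rounding=ROUND_FLOOR)


print(floor(-0.1))       # -1
print(floor(3.1411, 3))  # 3.141
print(floor(3.1411, -3) == 0)  # True
```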


forall

forall(expr, pred) - Tests whether a predicate holds for all elements in the array.

Examples:

> SELECT forall(array(1, 2, 3), x -> x % 2 == 0);
 false
> SELECT forall(array(2, 4, 8), x -> x % 2 == 0);
 true
> SELECT forall(array(1, null, 3), x -> x % 2 == 0);
 false
> SELECT forall(array(2, null, 8), x -> x % 2 == 0);
 NULL

Since: 3.0.0
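
The last two examples above differ only in whether a definite false is found before the NULL settles the outcome. A Python sketch of that three-valued rule, with None standing in for NULL and `is_even` as a hypothetical NULL-propagating predicate:

```python
def forall(arr, pred):
    """False if pred fails for any element; None if undecided because of NULLs."""
    saw_unknown = False
    for x in arr:
        result = pred(x)
        if result is False:
            return False  # one counterexample decides the answer
        if result is None:
            saw_unknown = True
    return None if saw_unknown else True


def is_even(x):
    # Comparisons with NULL yield NULL, modelled here as None.
    return None if x is None else x % 2 == 0


print(forall([2, 4, 8], is_even))     # True
print(forall([1, None, 3], is_even))  # False
print(forall([2, None, 8], is_even))  # None
```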


format_number

format_number(expr1, expr2) - Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. expr2 also accepts a user-specified format. This is supposed to function like MySQL's FORMAT.

Examples:

> SELECT format_number(12332.123456, 4);
 12,332.1235
> SELECT format_number(12332.123456, '##################.###');
 12332.123

Since: 1.5.0


format_string

format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.

Examples:

> SELECT format_string("Hello World %d %s", 100, "days");
 Hello World 100 days

Since: 1.5.0


from_avro

from_avro(child, jsonFormatSchema, options) - Converts a binary Avro value into a Catalyst value.

Examples:

> SELECT from_avro(s, '{"type": "record", "name": "struct", "fields": [{ "name": "u", "type": ["int","string"] }]}', map()) IS NULL AS result FROM (SELECT NAMED_STRUCT('u', NAMED_STRUCT('member0', member0, 'member1', member1)) AS s FROM VALUES (1, NULL), (NULL,  'a') tab(member0, member1));
 [false]

Note:

The specified schema must match the actual schema of the read data; otherwise the behavior is undefined: it may fail or return an arbitrary result. To deserialize the data with a compatible and evolved schema, the expected Avro schema can be set via the corresponding option.

Since: 4.0.0


from_csv

from_csv(csvStr, schema[, options]) - Returns a struct value with the given csvStr and schema.

Examples:

> SELECT from_csv('1, 0.8', 'a INT, b DOUBLE');
 {"a":1,"b":0.8}
> SELECT from_csv('26/08/2015', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
 {"time":2015-08-26 00:00:00}

Since: 3.0.0


from_json

from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema.

Examples:

> SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
 {"a":1,"b":0.8}
> SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
 {"time":2015-08-26 00:00:00}
> SELECT from_json('{"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]}', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
 {"teacher":"Alice","student":[{"name":"Bob","rank":1},{"name":"Charlie","rank":2}]}

Since: 2.2.0


from_protobuf

from_protobuf(data, messageName, descFilePath, options) - Converts a binary Protobuf value into a Catalyst value.

Examples:

> SELECT from_protobuf(s, 'Person', '/path/to/descriptor.desc', map()) IS NULL AS result FROM (SELECT NAMED_STRUCT('name', name, 'id', id) AS s FROM VALUES ('John Doe', 1), (NULL,  2) tab(name, id));
 [false]

Note:

The specified Protobuf schema must match the actual schema of the read data; otherwise the behavior is undefined: it may fail or return an arbitrary result. To deserialize the data with a compatible and evolved schema, the expected Protobuf schema can be set via the corresponding option.

Since: 4.0.0


from_unixtime

from_unixtime(unix_time[, fmt]) - Returns unix_time in the specified fmt.

Arguments:

  • unix_time - UNIX Timestamp to be converted to the provided format.
  • fmt - Date/time format pattern to follow. See Datetime Patterns for valid date and time format patterns. The 'yyyy-MM-dd HH:mm:ss' pattern is used if omitted.

Examples:

> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
 1969-12-31 16:00:00

> SELECT from_unixtime(0);
 1969-12-31 16:00:00

Since: 1.5.0


from_utc_timestamp

from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

Examples:

> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
 2016-08-31 09:00:00

Since: 1.5.0


from_xml

from_xml(xmlStr, schema[, options]) - Returns a struct value with the given xmlStr and schema.

Examples:

> SELECT from_xml('<p><a>1</a><b>0.8</b></p>', 'a INT, b DOUBLE');
 {"a":1,"b":0.8}
> SELECT from_xml('<p><time>26/08/2015</time></p>', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
 {"time":2015-08-26 00:00:00}
> SELECT from_xml('<p><teacher>Alice</teacher><student><name>Bob</name><rank>1</rank></student><student><name>Charlie</name><rank>2</rank></student></p>', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
 {"teacher":"Alice","student":[{"name":"Bob","rank":1},{"name":"Charlie","rank":2}]}

Since: 4.0.0

7. G


get

get(array, index) - Returns element of array at given (0-based) index. If the index points outside of the array boundaries, then this function returns NULL.

Examples:

> SELECT get(array(1, 2, 3), 0);
 1
> SELECT get(array(1, 2, 3), 3);
 NULL
> SELECT get(array(1, 2, 3), -1);
 NULL

Since: 3.4.0
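
Unlike Python list indexing, get does not wrap negative indices around to the end of the array; any index outside the bounds yields NULL. A minimal Python sketch of that rule, with None standing in for NULL:

```python
def get(arr, index):
    """Return arr[index] for a 0-based index inside the array, else None."""
    if 0 <= index < len(arr):
        return arr[index]
    return None  # out of bounds, including every negative index


print(get([1, 2, 3], 0))   # 1
print(get([1, 2, 3], 3))   # None
print(get([1, 2, 3], -1))  # None
```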


get_json_object

get_json_object(json_txt, path) - Extracts a json object from path.

Examples:

> SELECT get_json_object('{"a":"b"}', '$.a');
 b

Since: 1.5.0


getbit

getbit(expr, pos) - Returns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative.

Examples:

> SELECT getbit(11, 0);
 1
> SELECT getbit(11, 2);
 0

Since: 3.2.0
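
Reading a bit by position amounts to a right shift followed by a mask. The Python sketch below illustrates this for the examples above (11 is 1011 in binary); raising ValueError for a negative position is a choice made for this sketch.

```python
def getbit(expr: int, pos: int) -> int:
    """Return the bit of expr at position pos, counting from the right at 0."""
    if pos < 0:
        raise ValueError("the position argument cannot be negative")
    return (expr >> pos) & 1


print(bin(11))        # 0b1011
print(getbit(11, 0))  # 1
print(getbit(11, 2))  # 0
```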


greatest

greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values.

Examples:

> SELECT greatest(10, 9, 2, 4, 3);
 10

Since: 1.5.0


grouping

grouping(col) - Indicates whether a specified column in a GROUP BY is aggregated or not; returns 1 for aggregated or 0 for not aggregated in the result set.

Examples:

> SELECT name, grouping(name), sum(age) FROM VALUES (2, 'Alice'), (5, 'Bob') people(age, name) GROUP BY cube(name);
  Alice 0   2
  Bob   0   5
  NULL  1   7

Since: 2.0.0


grouping_id

grouping_id([col1[, col2 ...]]) - Returns the level of grouping, equal to (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn).

Examples:

> SELECT name, grouping_id(), sum(age), avg(height) FROM VALUES (2, 'Alice', 165), (5, 'Bob', 180) people(age, name, height) GROUP BY cube(name, height);
  Alice 0   2   165.0
  Alice 1   2   165.0
  NULL  3   7   172.5
  Bob   0   5   180.0
  Bob   1   5   180.0
  NULL  2   2   165.0
  NULL  2   5   180.0

Note:

Input columns should match with grouping columns exactly, or empty (means all the grouping columns).

Since: 2.0.0
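
The grouping_id formula packs the per-column grouping() flags into one integer, with the first grouping column in the most significant bit. A Python sketch of the formula, where the input is assumed to be the list of grouping() values (1 = aggregated) in grouping-column order:

```python
def grouping_id(grouping_flags):
    """Combine grouping() flags into (g1 << (n-1)) + ... + gn."""
    n = len(grouping_flags)
    return sum(flag << (n - 1 - i) for i, flag in enumerate(grouping_flags))


# For GROUP BY cube(name, height):
print(grouping_id([0, 0]))  # 0: neither column aggregated
print(grouping_id([0, 1]))  # 1: height aggregated
print(grouping_id([1, 0]))  # 2: name aggregated
print(grouping_id([1, 1]))  # 3: both aggregated (grand total)
```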

8. H


hash

hash(expr1, expr2, ...) - Returns a hash value of the arguments.

Examples:

> SELECT hash('Spark', array(123), 2);
 -1321691492

Since: 2.0.0


hex

hex(expr) - Converts expr to hexadecimal.

Examples:

> SELECT hex(17);
 11
> SELECT hex('Spark SQL');
 537061726B2053514C

Since: 1.5.0
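
The two examples above treat the input differently: a number is written in base 16, while a string is converted byte by byte. The Python sketch below reproduces both cases; it assumes UTF-8 encoding for strings and covers only non-negative integers.

```python
def hex_(expr) -> str:
    """Return the uppercase hexadecimal form of a non-negative int or a string."""
    if isinstance(expr, int):
        if expr < 0:
            raise ValueError("negative integers are not covered by this sketch")
        return format(expr, "X")
    # Strings are converted byte by byte (UTF-8 assumed here).
    return expr.encode("utf-8").hex().upper()


print(hex_(17))           # 11
print(hex_("Spark SQL"))  # 537061726B2053514C
```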


histogram_numeric

histogram_numeric(expr, nb) - Computes a histogram on numeric 'expr' using nb bins. The return value is an array of (x,y) pairs representing the centers of the histogram's bins. As the value of 'nb' is increased, the histogram approximation gets finer-grained, but may yield artifacts around outliers. In practice, 20-40 histogram bins appear to work well, with more bins being required for skewed or smaller datasets. Note that this function creates a histogram with non-uniform bin widths. It offers no guarantees in terms of the mean-squared-error of the histogram, but in practice is comparable to the histograms produced by the R/S-Plus statistical computing packages. Note: the output type of the 'x' field in the return value is propagated from the input value consumed in the aggregate function.

Examples:

> SELECT histogram_numeric(col, 5) FROM VALUES (0), (1), (2), (10) AS tab(col);
 [{"x":0,"y":1.0},{"x":1,"y":1.0},{"x":2,"y":1.0},{"x":10,"y":1.0}]

Since: 3.3.0


hll_sketch_agg

hll_sketch_agg(expr, lgConfigK) - Returns the HllSketch's updatable binary representation. lgConfigK (optional) is the log-base-2 of K, where K is the number of buckets or slots for the HllSketch.

Examples:

> SELECT hll_sketch_estimate(hll_sketch_agg(col, 12)) FROM VALUES (1), (1), (2), (2), (3) tab(col);
 3

Since: 3.5.0


hll_sketch_estimate

hll_sketch_estimate(expr) - Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch.

Examples:

> SELECT hll_sketch_estimate(hll_sketch_agg(col)) FROM VALUES (1), (1), (2), (2), (3) tab(col);
 3

Since: 3.5.0


hll_union

hll_union(first, second, allowDifferentLgConfigK) - Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Set allowDifferentLgConfigK to true to allow unions of sketches with different lgConfigK values (defaults to false).

Examples:

> SELECT hll_sketch_estimate(hll_union(hll_sketch_agg(col1), hll_sketch_agg(col2))) FROM VALUES (1, 4), (1, 4), (2, 5), (2, 5), (3, 6) tab(col1, col2);
 6

Since: 3.5.0


hll_union_agg

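hll_union_agg(expr, allowDifferentLgConfigK) - Merges the input HllSketch binary representations and returns the merged sketch as a binary value; wrap the result in hll_sketch_estimate, as in the example below, to obtain the estimated number of unique values. allowDifferentLgConfigK (optional) allows sketches with different lgConfigK values to be unioned (defaults to false).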

Examples:

> SELECT hll_sketch_estimate(hll_union_agg(sketch, true)) FROM (SELECT hll_sketch_agg(col) as sketch FROM VALUES (1) tab(col) UNION ALL SELECT hll_sketch_agg(col, 20) as sketch FROM VALUES (1) tab(col));
 1

Since: 3.5.0


hour

hour(timestamp) - Returns the hour component of the string/timestamp.

Examples:

> SELECT hour('2009-07-30 12:58:59');
 12

Since: 1.5.0


hypot

hypot(expr1, expr2) - Returns sqrt(expr1² + expr2²).

Examples:

> SELECT hypot(3, 4);
 5.0

Since: 1.4.0

9. I


if

if(expr1, expr2, expr3) - If expr1 evaluates to true, then returns expr2; otherwise returns expr3.

Examples:

> SELECT if(1 < 2, 'a', 'b');
 a

Since: 1.0.0


ifnull

ifnull(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.

Examples:

> SELECT ifnull(NULL, array('2'));
 ["2"]

Since: 2.0.0


ilike

str ilike pattern[ ESCAPE escape] - Returns true if str matches pattern with escape case-insensitively, null if any arguments are null, false otherwise.

Arguments:

  • str - a string expression

  • pattern - a string expression. The pattern is a string which is matched literally and case-insensitively, with exception to the following special symbols:

    _ matches any one character in the input (similar to . in posix regular expressions)

    % matches zero or more characters in the input (similar to .* in posix regular expressions)

    Since Spark 2.0, string literals are unescaped in our SQL parser, see the unescaping rules at String Literal. For example, in order to match "\abc", the pattern should be "\\abc".

    When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back to Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the pattern to match "\abc" should be "\abc".

    It's recommended to use a raw string literal (with the r prefix) to avoid escaping special characters in the pattern string if exists.

  • escape - a character added since Spark 3.0. The default escape character is '\'. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.

Examples:

> SELECT ilike('Spark', '_Park');
true
> SELECT '\\abc' AS S, S ilike r'\\abc', S ilike '\\\\abc';
\abc    true    true
> SET spark.sql.parser.escapedStringLiterals=true;
spark.sql.parser.escapedStringLiterals  true
> SELECT '%SystemDrive%\Users\John' ilike '\%SystemDrive\%\\users%';
true
> SET spark.sql.parser.escapedStringLiterals=false;
spark.sql.parser.escapedStringLiterals  false
> SELECT '%SystemDrive%\\USERS\\John' ilike r'%SystemDrive%\\Users%';
true
> SELECT '%SystemDrive%/Users/John' ilike '/%SYSTEMDrive/%//Users%' ESCAPE '/';
true

Note:

Use RLIKE to match with standard regular expressions.

Since: 3.3.0
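
The pattern rules above (_ for one character, % for any run of characters, and an escape character that makes the next symbol literal) can be sketched by translating the pattern into a regular expression. This is an illustrative Python model, not Spark's implementation: it works on the already-unescaped pattern string, uses re.IGNORECASE for case-insensitivity, and uses None for NULL.

```python
import re


def ilike(s, pattern, escape="\\"):
    """Case-insensitive LIKE match; returns None if an argument is None."""
    if s is None or pattern is None:
        return None
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            i += 1
            # Only _, % and the escape character itself may be escaped.
            if i >= len(pattern) or pattern[i] not in ("_", "%", escape):
                raise ValueError("invalid escape sequence in pattern")
            parts.append(re.escape(pattern[i]))
        elif ch == "_":
            parts.append(".")    # exactly one character
        elif ch == "%":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))
        i += 1
    regex = "".join(parts)
    return re.fullmatch(regex, s, flags=re.IGNORECASE | re.DOTALL) is not None


print(ilike("Spark", "_Park"))  # True
print(ilike("%SystemDrive%/Users/John", "/%SYSTEMDrive/%//Users%", escape="/"))  # True
```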


in

expr1 in(expr2, expr3, ...) - Returns true if expr1 equals any of the listed values expr2, expr3, and so on.

Arguments:

  • expr1, expr2, expr3, ... - the arguments must be of the same type.

Examples:

> SELECT 1 in(1, 2, 3);
 true
> SELECT 1 in(2, 3, 4);
 false
> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3));
 false
> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3));
 true

Since: 1.0.0


initcap

initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.

Examples:

sql
> SELECT initcap('sPark sql');
 Spark Sql

Since: 1.5.0


inline

inline(expr) - Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise.

Examples:

sql
> SELECT * FROM inline(array(struct(1, 'a'), struct(2, 'b')));
 1  a
 2  b
> SELECT * FROM inline(input => array(struct(1, 'a'), struct(2, 'b')));
 1  a
 2  b

Since: 3.4.0


inline_outer

inline_outer(expr) - Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise.

Examples:

sql
> SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')));
 1  a
 2  b
> SELECT inline_outer(input => array(struct(1, 'a'), struct(2, 'b')));
 1  a
 2  b

Since: 2.0.0


input_file_block_length

input_file_block_length() - Returns the length of the block being read, or -1 if not available.

Examples:

sql
> SELECT input_file_block_length();
 -1

Since: 2.2.0


input_file_block_start

input_file_block_start() - Returns the start offset of the block being read, or -1 if not available.

Examples:

sql
> SELECT input_file_block_start();
 -1

Since: 2.2.0


input_file_name

input_file_name() - Returns the name of the file being read, or empty string if not available.

Examples:

sql
> SELECT input_file_name();

Since: 1.5.0


instr

instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.

Examples:

sql
> SELECT instr('SparkSQL', 'SQL');
 6

Since: 1.5.0


int

int(expr) - Casts the value expr to the target data type int.

Since: 2.0.1


is_valid_utf8

is_valid_utf8(str) - Returns true if str is a valid UTF-8 string, otherwise returns false.

Arguments:

  • str - a string expression

Examples:

sql
> SELECT is_valid_utf8('Spark');
 true
> SELECT is_valid_utf8(x'61');
 true
> SELECT is_valid_utf8(x'80');
 false
> SELECT is_valid_utf8(x'61C262');
 false

Since: 4.0.0
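
The check can be approximated outside Spark with a strict UTF-8 decode. The following is a minimal Python sketch, not Spark's implementation; it assumes Python's strict decoder rejects the same byte sequences, which holds for the examples above but is not guaranteed for every edge case.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")  # strict mode raises on malformed sequences
        return True
    except UnicodeDecodeError:
        return False
```

Here x'61C262' is rejected because the lead byte C2 must be followed by a continuation byte in the range 80 to BF, and 62 is not one.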


is_variant_null

is_variant_null(expr) - Check if a variant value is a variant null. Returns true if and only if the input is a variant null and false otherwise (including in the case of SQL NULL).

Examples:

sql
> SELECT is_variant_null(parse_json('null'));
 true
> SELECT is_variant_null(parse_json('"null"'));
 false
> SELECT is_variant_null(parse_json('13'));
 false
> SELECT is_variant_null(parse_json(null));
 false
> SELECT is_variant_null(variant_get(parse_json('{"a":null, "b":"spark"}'), "$.c"));
 false
> SELECT is_variant_null(variant_get(parse_json('{"a":null, "b":"spark"}'), "$.a"));
 true

Since: 4.0.0


isnan

isnan(expr) - Returns true if expr is NaN, or false otherwise.

Examples:

sql
> SELECT isnan(cast('NaN' as double));
 true

Since: 1.5.0


isnotnull

isnotnull(expr) - Returns true if expr is not null, or false otherwise.

Examples:

sql
> SELECT isnotnull(1);
 true

Since: 1.0.0


isnull

isnull(expr) - Returns true if expr is null, or false otherwise.

Examples:

sql
> SELECT isnull(1);
 false

Since: 1.0.0

10、J


java_method

java_method(class, method[, arg1[, arg2 ...]]) - Calls a method with reflection.

Examples:

sql
> SELECT java_method('java.util.UUID', 'randomUUID');
 c33fb387-8500-4bfa-81d2-6e0e3e930df2
> SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
 a5cf6c42-0c85-418f-af6c-3e4e5b1328f2

Since: 2.0.0


json_array_length

json_array_length(jsonArray) - Returns the number of elements in the outermost JSON array.

Arguments:

  • jsonArray - A JSON array. NULL is returned in case of any other valid JSON string, NULL or an invalid JSON.

Examples:

sql
> SELECT json_array_length('[1,2,3,4]');
  4
> SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
  5
> SELECT json_array_length('[1,2');
  NULL

Since: 3.1.0
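
The NULL rules for json_array_length can be sketched in plain Python. This is an illustration of the documented behavior, not Spark's implementation; Python's json module is assumed to be close enough to Spark's JSON parser for well-formed input.

```python
import json

def json_array_length(json_array):
    """Length of the outermost JSON array, or None (SQL NULL) otherwise."""
    if json_array is None:
        return None
    try:
        value = json.loads(json_array)
    except json.JSONDecodeError:
        return None  # invalid JSON
    # Any valid JSON value that is not an array also yields NULL.
    return len(value) if isinstance(value, list) else None
```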


json_object_keys

json_object_keys(json_object) - Returns all the keys of the outermost JSON object as an array.

Arguments:

  • json_object - A JSON object. If a valid JSON object is given, all the keys of the outermost object will be returned as an array. If it is any other valid JSON string, an invalid JSON string or an empty string, the function returns null.

Examples:

sql
> SELECT json_object_keys('{}');
  []
> SELECT json_object_keys('{"key": "value"}');
  ["key"]
> SELECT json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
  ["f1","f2"]

Since: 3.1.0
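
A minimal Python sketch of the json_object_keys rules, for illustration only: it is not Spark's implementation, and it assumes Python's json module parses the input the same way. Only the keys of the outermost object are returned; nested objects are not descended into.

```python
import json

def json_object_keys(json_object):
    """Keys of the outermost JSON object as a list, or None (SQL NULL)."""
    if json_object is None:
        return None
    try:
        value = json.loads(json_object)
    except json.JSONDecodeError:
        return None  # invalid JSON, including the empty string
    # Arrays, numbers, strings, etc. are valid JSON but not objects.
    return list(value.keys()) if isinstance(value, dict) else None
```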


json_tuple

json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.

Examples:

sql
> SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
 1  2

Since: 1.6.0

11、K


kurtosis

kurtosis(expr) - Returns the kurtosis value calculated from values of a group.

Examples:

sql
> SELECT kurtosis(col) FROM VALUES (-10), (-20), (100), (1000) AS tab(col);
 -0.7014368047529627
> SELECT kurtosis(col) FROM VALUES (1), (10), (100), (10), (1) as tab(col);
 0.19432323191699075

Since: 1.6.0
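
The example values above are consistent with the excess kurtosis computed from population moments, m4 / m2^2 - 3. The Python sketch below reproduces them under that assumption; it is not Spark's implementation (Spark aggregates incrementally), and returning None for a zero-variance group is an assumption of this sketch.

```python
def kurtosis(values):
    """Excess kurtosis from population central moments: m4 / m2**2 - 3."""
    xs = [x for x in values if x is not None]  # aggregates ignore NULLs
    n = len(xs)
    if n == 0:
        return None
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    if m2 == 0:
        return None  # assumption: undefined when all values are equal
    return m4 / (m2 * m2) - 3.0
```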

12、L


lag

lag(input[, offset[, default]]) - Returns the value of input at the offset-th row before the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset-th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default is returned.

Arguments:

  • input - a string expression to evaluate offset rows before the current row.
  • offset - an int expression which is rows to jump back in the partition.
  • default - a string expression which is to use when the offset row does not exist.

Examples:

sql
> SELECT a, b, lag(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
 A1 1   NULL
 A1 1   1
 A1 2   1
 A2 3   NULL

Since: 2.0.0


last

last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.

Examples:

sql
> SELECT last(col) FROM VALUES (10), (5), (20) AS tab(col);
 20
> SELECT last(col) FROM VALUES (10), (5), (NULL) AS tab(col);
 NULL
> SELECT last(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
 5

Note:

The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.

Since: 2.0.0
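
The effect of isIgnoreNull can be sketched in Python as a single pass over the rows in the order they arrive. This is only an illustration of the documented semantics; the list order stands in for whatever row order Spark happens to produce.

```python
def last(values, is_ignore_null=False):
    """Last value seen; with is_ignore_null=True, the last non-null value."""
    result = None
    for v in values:
        # Without the flag every row overwrites the result, including NULLs.
        if v is not None or not is_ignore_null:
            result = v
    return result
```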


last_day

last_day(date) - Returns the last day of the month which the date belongs to.

Examples:

sql
> SELECT last_day('2009-01-12');
 2009-01-31

Since: 1.5.0
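
For comparison, the same calculation in plain Python uses calendar.monthrange, which accounts for leap years. A minimal sketch, not Spark's implementation:

```python
import calendar
from datetime import date

def last_day(d: date) -> date:
    """Last day of the month that `d` belongs to."""
    # monthrange returns (weekday of first day, number of days in month).
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)
```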


last_value

last_value(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.

Examples:

sql
> SELECT last_value(col) FROM VALUES (10), (5), (20) AS tab(col);
 20
> SELECT last_value(col) FROM VALUES (10), (5), (NULL) AS tab(col);
 NULL
> SELECT last_value(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
 5

Note:

The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.

Since: 2.0.0


lcase

lcase(str) - Returns str with all characters changed to lowercase.

Examples:

sql
> SELECT lcase('SparkSql');
 sparksql

Since: 1.0.1


lead

lead(input[, offset[, default]]) - Returns the value of input at the offset-th row after the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offset-th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default is returned.

Arguments:

  • input - a string expression to evaluate offset rows after the current row.
  • offset - an int expression which is rows to jump ahead in the partition.
  • default - a string expression which is to use when the offset is larger than the window. The default value is null.

Examples:

sql
> SELECT a, b, lead(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
 A1 1   1
 A1 1   2
 A1 2   NULL
 A2 3   NULL

Since: 2.0.0
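
lag and lead are mirror images: both look up a row at a fixed distance from the current one and fall back to default when that row does not exist. The Python sketch below illustrates this for a single partition whose values are already in window order; the helper name shift is hypothetical, and this is not how Spark evaluates window functions.

```python
def shift(values, offset, default=None):
    """Value at position i + offset for each i, or `default` if out of range."""
    n = len(values)
    return [values[i + offset] if 0 <= i + offset < n else default
            for i in range(n)]

def lag(values, offset=1, default=None):
    """Look `offset` rows back within one ordered partition."""
    return shift(values, -offset, default)

def lead(values, offset=1, default=None):
    """Look `offset` rows ahead within one ordered partition."""
    return shift(values, offset, default)
```

Applied to the partition A1 from the examples, whose ordered b values are 1, 1, 2, lag gives NULL, 1, 1 and lead gives 1, 2, NULL.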


least

least(expr, ...) - Returns the least value of all parameters, skipping null values.

Examples:

sql
> SELECT least(10, 9, 2, 4, 3);
 2

Since: 1.5.0
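
"Skipping null values" means a NULL argument never wins the comparison; the result is NULL only when every argument is NULL. A minimal Python sketch of that rule, for illustration only:

```python
def least(*exprs):
    """Smallest non-null argument, or None if all arguments are null."""
    non_null = [e for e in exprs if e is not None]
    return min(non_null) if non_null else None
```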


left

left(str, len) - Returns the leftmost len characters from the string str (len can be of string type). If len is less than or equal to 0, the result is an empty string.

Examples:

sql
> SELECT left('Spark SQL', 3);
 Spa
> SELECT left(encode('Spark SQL', 'utf-8'), 3);
 Spa

Since: 2.3.0


len

len(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Examples:

sql
> SELECT len('Spark SQL ');
 10
> SELECT len(x'537061726b2053514c');
 9
> SELECT CHAR_LENGTH('Spark SQL ');
 10
> SELECT CHARACTER_LENGTH('Spark SQL ');
 10

Since: 3.4.0


length

length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Examples:

sql
> SELECT length('Spark SQL ');
 10
> SELECT length(x'537061726b2053514c');
 9
> SELECT CHAR_LENGTH('Spark SQL ');
 10
> SELECT CHARACTER_LENGTH('Spark SQL ');
 10

Since: 1.5.0


levenshtein

levenshtein(str1, str2[, threshold]) - Returns the Levenshtein distance between the two given strings. If threshold is set and the distance exceeds it, returns -1.

Examples:

sql
> SELECT levenshtein('kitten', 'sitting');
 3
> SELECT levenshtein('kitten', 'sitting', 2);
 -1

Since: 1.5.0
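
The distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. The Python sketch below uses the classic two-row dynamic programming formulation and then applies the threshold rule; it illustrates the semantics and is not Spark's implementation.

```python
def levenshtein(str1, str2, threshold=None):
    """Levenshtein distance, or -1 if it exceeds `threshold` (when given)."""
    if str1 is None or str2 is None:
        return None
    # prev[j] is the distance between the processed prefix of str1 and str2[:j].
    prev = list(range(len(str2) + 1))
    for i, c1 in enumerate(str1, start=1):
        curr = [i]
        for j, c2 in enumerate(str2, start=1):
            cost = 0 if c1 == c2 else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    distance = prev[-1]
    if threshold is not None and distance > threshold:
        return -1
    return distance
```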


like

str like pattern[ ESCAPE escape] - Returns true if str matches pattern with escape, null if any arguments are null, false otherwise.

Arguments:

  • str - a string expression

  • pattern - a string expression. The pattern is a string which is matched literally, with exception to the following special symbols:

    _ matches any one character in the input (similar to . in posix regular expressions)

    % matches zero or more characters in the input (similar to .* in posix regular expressions)

    Since Spark 2.0, string literals are unescaped in our SQL parser, see the unescaping rules at String Literal. For example, in order to match "\abc", the pattern should be "\\abc".

    When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back to Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the pattern to match "\abc" should be "\abc".

    It's recommended to use a raw string literal (with the r prefix) to avoid escaping special characters in the pattern string, if any exist.

  • escape - a character added since Spark 3.0. The default escape character is '\'. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.

Examples:

sql
> SELECT like('Spark', '_park');
true
> SELECT '\\abc' AS S, S like r'\\abc', S like '\\\\abc';
\abc    true    true
> SET spark.sql.parser.escapedStringLiterals=true;
spark.sql.parser.escapedStringLiterals  true
> SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%';
true
> SET spark.sql.parser.escapedStringLiterals=false;
spark.sql.parser.escapedStringLiterals  false
> SELECT '%SystemDrive%\\Users\\John' like r'%SystemDrive%\\Users%';
true
> SELECT '%SystemDrive%/Users/John' like '/%SystemDrive/%//Users%' ESCAPE '/';
true

Note:

Use RLIKE to match with standard regular expressions.

Since: 1.0.0
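
The matching rules can be made concrete by translating a LIKE pattern into a regular expression. The Python sketch below is an illustration, not Spark's implementation. It takes the pattern as the matcher sees it, that is, after the SQL parser has already unescaped the string literal. The helper name like_to_regex is hypothetical, and letting _ and % match newlines (re.DOTALL) is an assumption.

```python
import re

def like_to_regex(pattern, escape="\\"):
    """Translate a LIKE pattern into a Python regex string."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape:
            # Only '_', '%' and the escape character itself may be escaped.
            if i + 1 >= len(pattern) or pattern[i + 1] not in ("_", "%", escape):
                raise ValueError(f"invalid escape sequence at position {i}")
            out.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "_":
            out.append(".")    # any one character
            i += 1
        elif ch == "%":
            out.append(".*")   # zero or more characters
            i += 1
        else:
            out.append(re.escape(ch))  # everything else matches literally
            i += 1
    return "".join(out)

def like(s, pattern, escape="\\"):
    """True/False for a full-string match, None if an argument is null."""
    if s is None or pattern is None:
        return None
    regex = like_to_regex(pattern, escape)
    return re.fullmatch(regex, s, flags=re.DOTALL) is not None
```

With ESCAPE '/', the pattern /%SystemDrive/%//Users% reads as a literal %, the text SystemDrive, a literal %, a literal /, the text Users, and then any remainder.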


listagg

listagg(expr[, delimiter])[ WITHIN GROUP (ORDER BY key [ASC | DESC] [,...])] - Returns the concatenation of non-NULL input values, separated by the delimiter ordered by key. If all values are NULL, NULL is returned.

Arguments:

  • expr - a string or binary expression to be concatenated.
  • delimiter - an optional string or binary foldable expression used to separate the input values. If NULL, the concatenation will be performed without a delimiter. Default is NULL.
  • key - an optional expression for ordering the input values. Multiple keys can be specified. If none are specified, the order of the rows in the result is non-deterministic.

Examples:

sql
> SELECT listagg(col) FROM VALUES ('a'), ('b'), ('c') AS tab(col);
 abc
> SELECT listagg(col) WITHIN GROUP (ORDER BY col DESC) FROM VALUES ('a'), ('b'), ('c') AS tab(col);
 cba
> SELECT listagg(col) FROM VALUES ('a'), (NULL), ('b') AS tab(col);
 ab
> SELECT listagg(col) FROM VALUES ('a'), ('a') AS tab(col);
 aa
> SELECT listagg(DISTINCT col) FROM VALUES ('a'), ('a'), ('b') AS tab(col);
 ab
> SELECT listagg(col, ', ') FROM VALUES ('a'), ('b'), ('c') AS tab(col);
 a, b, c
> SELECT listagg(col) FROM VALUES (NULL), (NULL) AS tab(col);
 NULL

Note:

  • If the order is not specified, the function is non-deterministic because the order of the rows may be non-deterministic after a shuffle.
  • If DISTINCT is specified, then expr and key must be the same expression.

Since: 4.0.0
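
The aggregation rules (drop NULLs, optionally deduplicate, optionally order, then join) can be sketched in Python. This is a simplified illustration, not Spark's implementation: the hypothetical order parameter sorts by the values themselves, whereas WITHIN GROUP can order by arbitrary keys.

```python
def listagg(values, delimiter=None, order=None, distinct=False):
    """Concatenate non-null strings; None (SQL NULL) if there are none."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    if distinct:
        non_null = list(dict.fromkeys(non_null))  # dedupe, keep first-seen order
    if order is not None:
        non_null = sorted(non_null, reverse=(order == "desc"))
    # A NULL delimiter means plain concatenation.
    return (delimiter or "").join(non_null)
```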


ln

ln(expr) - Returns the natural logarithm (base e) of expr.

Examples:

sql
> SELECT ln(1);
 0.0

Since: 1.4.0


localtimestamp

localtimestamp() - Returns the current timestamp without time zone at the start of query evaluation. All calls of localtimestamp within the same query return the same value.

localtimestamp - Returns the current local date-time at the session time zone at the start of query evaluation.

Examples:

sql
> SELECT localtimestamp();
 2020-04-25 15:49:11.914

Since: 3.4.0


locate

locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.

Examples:

sql
> SELECT locate('bar', 'foobarbar');
 4
> SELECT locate('bar', 'foobarbar', 5);
 7
> SELECT POSITION('bar' IN 'foobarbar');
 4

Since: 1.5.0
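
The 1-based convention maps onto Python's 0-based str.find as shown in the sketch below. It is an illustration only; returning 0 when substr is not found follows from the 1-based result convention, and returning 0 for pos less than 1 is an assumption of this sketch.

```python
def locate(substr, s, pos=1):
    """1-based position of the first occurrence of substr at or after pos."""
    if substr is None or s is None:
        return None
    if pos < 1:
        return 0  # assumption: positions before the start never match
    # str.find is 0-based and returns -1 when absent, so +1 yields 0.
    return s.find(substr, pos - 1) + 1
```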


log

log(base, expr) - Returns the logarithm of expr with base.

Examples:

sql
> SELECT log(10, 100);
 2.0

Since: 1.5.0


log10

log10(expr) - Returns the logarithm of expr with base 10.

Examples:

sql
> SELECT log10(10);
 1.0

Since: 1.4.0


log1p

log1p(expr) - Returns log(1 + expr).

Examples:

sql
> SELECT log1p(0);
 0.0

Since: 1.4.0


log2

log2(expr) - Returns the logarithm of expr with base 2.

Examples:

sql
> SELECT log2(2);
 1.0

Since: 1.4.0


lower

lower(str) - Returns str with all characters changed to lowercase.

Examples:

sql
> SELECT lower('SparkSql');
 sparksql

Since: 1.0.1


lpad

lpad(str, len[, pad]) - Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters or bytes. If pad is not specified, str will be padded to the left with space characters if it is a character string, and with zeros if it is a byte sequence.

Examples:

sql
> SELECT lpad('hi', 5, '??');
 ???hi
> SELECT lpad('hi', 1, '??');
 h
> SELECT lpad('hi', 5);
    hi
> SELECT hex(lpad(unhex('aabb'), 5));
 000000AABB
> SELECT hex(lpad(unhex('aabb'), 5, unhex('1122')));
 112211AABB

Since: 1.5.0
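
The truncation and padding rules can be sketched in Python for both character strings and byte sequences. This illustrates the documented behavior and is not Spark's implementation; returning the input unchanged for an empty pad is an assumption.

```python
def lpad(s, length, pad=None):
    """Left-pad `s` (str or bytes) with `pad` to exactly `length` units."""
    if pad is None:
        # Default pad: a space for strings, a zero byte for binary data.
        pad = b"\x00" if isinstance(s, bytes) else " "
    if length <= len(s):
        return s[:max(length, 0)]  # too long: shorten to `length`
    if len(pad) == 0:
        return s  # assumption: nothing to pad with
    fill = length - len(s)
    # Repeat the pad enough times, then cut it to the exact fill width.
    padding = (pad * (fill // len(pad) + 1))[:fill]
    return padding + s
```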


ltrim

ltrim(str) - Removes the leading space characters from str.

Arguments:

  • str - a string expression
  • trimStr - the trim string characters to trim, the default value is a single space

Examples:

sql
> SELECT ltrim('    SparkSQL   ');
 SparkSQL

Since: 1.5.0


luhn_check

luhn_check(str) - Checks that a string of digits is valid according to the Luhn algorithm. This checksum function is widely applied on credit card numbers and government identification numbers to distinguish valid numbers from mistyped, incorrect numbers.

Examples:

sql
> SELECT luhn_check('8112189876');
 true
> SELECT luhn_check('79927398713');
 true
> SELECT luhn_check('79927398714');
 false

Since: 3.5.0
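
The Luhn algorithm walks the digits from right to left, doubles every second digit, subtracts 9 from any doubled value above 9, and accepts the number if the total is divisible by 10. A minimal Python sketch of the algorithm follows; it is not Spark's implementation, and treating an empty or non-digit string as invalid is an assumption.

```python
def luhn_check(s):
    """True if the digit string `s` passes the Luhn checksum."""
    if s is None:
        return None
    if not s or not all("0" <= c <= "9" for c in s):
        return False  # assumption: empty or non-digit input is invalid
    total = 0
    for i, c in enumerate(reversed(s)):
        d = int(c)
        if i % 2 == 1:      # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9      # same as summing the two digits of d
        total += d
    return total % 10 == 0
```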
