| No. | Type | Link |
|---|---|---|
| 1 | Spark functions | 1. Spark Functions: Symbols |
| 2 | Spark functions | 2. Spark Functions: a/b/c |
| 3 | Spark functions | 3. Spark Functions: d/e/f/g/h/i/j/k/l |
| 4 | Spark functions | 4. Spark Functions: m/n/o/p/q/r |
| 5 | Spark functions | 5. Spark Functions: s/t |
| 6 | Spark functions | 6. Spark Functions: u/v/w/x/y/z |
4. D
date
date(expr) - Casts the value expr to the target data type date.
Since: 2.0.1
date_add
date_add(start_date, num_days) - Returns the date that is num_days after start_date.
Examples:
sql
> SELECT date_add('2016-07-30', 1);
2016-07-31
Since: 1.5.0
date_diff
date_diff(endDate, startDate) - Returns the number of days from startDate to endDate.
Examples:
sql
> SELECT date_diff('2009-07-31', '2009-07-30');
1
> SELECT date_diff('2009-07-30', '2009-07-31');
-1
Since: 3.4.0
date_format
date_format(timestamp, fmt) - Converts timestamp to a value of string in the format specified by the date format fmt.
Arguments:
- timestamp - A date/timestamp or string to be converted to the given format.
- fmt - Date/time format pattern to follow. See Datetime Patterns for valid date and time format patterns.
Examples:
sql
> SELECT date_format('2016-04-08', 'y');
2016
Since: 1.5.0
date_from_unix_date
date_from_unix_date(days) - Create date from the number of days since 1970-01-01.
Examples:
sql
> SELECT date_from_unix_date(1);
1970-01-02
Since: 3.1.0
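The mapping is a plain offset from the Unix epoch. As a hedged illustration outside Spark, the same arithmetic in plain Python:

```python
from datetime import date, timedelta

def date_from_unix_date(days):
    # Days since the Unix epoch, 1970-01-01
    return date(1970, 1, 1) + timedelta(days=days)

print(date_from_unix_date(1))  # 1970-01-02
```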
date_part
date_part(field, source) - Extracts a part of the date/timestamp or interval source.
Arguments:
- field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function EXTRACT
- source - a date/timestamp or interval column from which field should be extracted
Examples:
sql
> SELECT date_part('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
2019
> SELECT date_part('week', timestamp'2019-08-12 01:00:00.123456');
33
> SELECT date_part('doy', DATE'2019-08-12');
224
> SELECT date_part('SECONDS', timestamp'2019-10-01 00:00:01.000001');
1.000001
> SELECT date_part('days', interval 5 days 3 hours 7 minutes);
5
> SELECT date_part('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
30.001001
> SELECT date_part('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
11
> SELECT date_part('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
55
Note:
The date_part function is equivalent to the SQL-standard function EXTRACT(field FROM source)
Since: 3.0.0
date_sub
date_sub(start_date, num_days) - Returns the date that is num_days before start_date.
Examples:
sql
> SELECT date_sub('2016-07-30', 1);
2016-07-29
Since: 1.5.0
date_trunc
date_trunc(fmt, ts) - Returns timestamp ts truncated to the unit specified by the format model fmt.
Arguments:
- fmt - the format representing the unit to be truncated to
  - "YEAR", "YYYY", "YY" - truncate to the first date of the year that the ts falls in, the time part will be zeroed out
  - "QUARTER" - truncate to the first date of the quarter that the ts falls in, the time part will be zeroed out
  - "MONTH", "MM", "MON" - truncate to the first date of the month that the ts falls in, the time part will be zeroed out
  - "WEEK" - truncate to the Monday of the week that the ts falls in, the time part will be zeroed out
  - "DAY", "DD" - zero out the time part
  - "HOUR" - zero out the minute and second with fraction part
  - "MINUTE" - zero out the second with fraction part
  - "SECOND" - zero out the second fraction part
  - "MILLISECOND" - zero out the microseconds
  - "MICROSECOND" - everything remains
- ts - datetime value or valid timestamp string
Examples:
sql
> SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359');
2015-01-01 00:00:00
> SELECT date_trunc('MM', '2015-03-05T09:32:05.359');
2015-03-01 00:00:00
> SELECT date_trunc('DD', '2015-03-05T09:32:05.359');
2015-03-05 00:00:00
> SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');
2015-03-05 09:00:00
> SELECT date_trunc('MILLISECOND', '2015-03-05T09:32:05.123456');
2015-03-05 09:32:05.123
Since: 2.3.0
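The unit rules above can be sketched in plain Python for a subset of units (a hedged approximation of the semantics, not Spark itself):

```python
from datetime import datetime

def date_trunc(fmt, ts):
    # Truncate ts down to the unit named by fmt (subset of the units above)
    f = fmt.upper()
    if f in ("YEAR", "YYYY", "YY"):
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if f in ("MONTH", "MM", "MON"):
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if f in ("DAY", "DD"):
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if f == "HOUR":
        return ts.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {fmt}")

print(date_trunc("MM", datetime(2015, 3, 5, 9, 32, 5)))  # 2015-03-01 00:00:00
```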
dateadd
dateadd(start_date, num_days) - Returns the date that is num_days after start_date.
Examples:
sql
> SELECT dateadd('2016-07-30', 1);
2016-07-31
Since: 3.4.0
datediff
datediff(endDate, startDate) - Returns the number of days from startDate to endDate.
Examples:
sql
> SELECT datediff('2009-07-31', '2009-07-30');
1
> SELECT datediff('2009-07-30', '2009-07-31');
-1
Since: 1.5.0
datepart
datepart(field, source) - Extracts a part of the date/timestamp or interval source.
Arguments:
- field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function EXTRACT
- source - a date/timestamp or interval column from which field should be extracted
Examples:
sql
> SELECT datepart('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
2019
> SELECT datepart('week', timestamp'2019-08-12 01:00:00.123456');
33
> SELECT datepart('doy', DATE'2019-08-12');
224
> SELECT datepart('SECONDS', timestamp'2019-10-01 00:00:01.000001');
1.000001
> SELECT datepart('days', interval 5 days 3 hours 7 minutes);
5
> SELECT datepart('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
30.001001
> SELECT datepart('MONTH', INTERVAL '2021-11' YEAR TO MONTH);
11
> SELECT datepart('MINUTE', INTERVAL '123 23:55:59.002001' DAY TO SECOND);
55
Note:
The datepart function is equivalent to the SQL-standard function EXTRACT(field FROM source)
Since: 3.4.0
day
day(date) - Returns the day of month of the date/timestamp.
Examples:
sql
> SELECT day('2009-07-30');
30
Since: 1.5.0
dayname
dayname(date) - Returns the three-letter abbreviated day name from the given date.
Examples:
sql
> SELECT dayname(DATE('2008-02-20'));
Wed
Since: 4.0.0
dayofmonth
dayofmonth(date) - Returns the day of month of the date/timestamp.
Examples:
sql
> SELECT dayofmonth('2009-07-30');
30
Since: 1.5.0
dayofweek
dayofweek(date) - Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday).
Examples:
sql
> SELECT dayofweek('2009-07-30');
5
Since: 2.3.0
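The Sunday-based numbering differs from Python's Monday-based isoweekday(); as a hedged cross-check outside Spark:

```python
from datetime import date

def dayofweek(d):
    # Spark: 1 = Sunday ... 7 = Saturday; isoweekday(): 1 = Monday ... 7 = Sunday
    return d.isoweekday() % 7 + 1

print(dayofweek(date(2009, 7, 30)))  # 5 (a Thursday)
```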
dayofyear
dayofyear(date) - Returns the day of year of the date/timestamp.
Examples:
sql
> SELECT dayofyear('2016-04-09');
100
Since: 1.5.0
decimal
decimal(expr) - Casts the value expr to the target data type decimal.
Since: 2.0.1
decode
decode(bin, charset) - Decodes the first argument using the second argument character set. If either argument is null, the result will also be null.
decode(expr, search, result [, search, result ] ... [, default]) - Compares expr to each search value in order. If expr is equal to a search value, decode returns the corresponding result. If no match is found, then it returns default. If default is omitted, it returns null.
Arguments:
- bin - a binary expression to decode
- charset - one of the charsets 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32' to decode bin into a STRING. It is case insensitive.
Examples:
sql
> SELECT decode(encode('abc', 'utf-8'), 'utf-8');
abc
> SELECT decode(2, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
San Francisco
> SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle', 'Non domestic');
Non domestic
> SELECT decode(6, 1, 'Southlake', 2, 'San Francisco', 3, 'New Jersey', 4, 'Seattle');
NULL
> SELECT decode(null, 6, 'Spark', NULL, 'SQL', 4, 'rocks');
SQL
Note:
decode(expr, search, result [, search, result ] ... [, default]) is supported since 3.2.0
Since: 1.5.0
degrees
degrees(expr) - Converts radians to degrees.
Arguments:
- expr - angle in radians
Examples:
sql
> SELECT degrees(3.141592653589793);
180.0
Since: 1.4.0
dense_rank
dense_rank() - Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps in the ranking sequence.
Arguments:
- children - this is to base the rank on; a change in the value of one of the children will trigger a change in rank. This is an internal parameter and will be assigned by the Analyser.
Examples:
sql
> SELECT a, b, dense_rank(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
A1 1 1
A1 1 1
A1 2 2
A2 3 1
Since: 2.0.0
div
expr1 div expr2 - Divide expr1 by expr2. It returns NULL if an operand is NULL or expr2 is 0. The result is cast to long.
Examples:
sql
> SELECT 3 div 2;
1
> SELECT INTERVAL '1-1' YEAR TO MONTH div INTERVAL '-1' MONTH;
-13
Since: 3.0.0
double
double(expr) - Casts the value expr to the target data type double.
Since: 2.0.1
5. E
e
e() - Returns Euler's number, e.
Examples:
sql
> SELECT e();
2.718281828459045
Since: 1.5.0
element_at
element_at(array, index) - Returns the element of array at the given (1-based) index. If index is 0, Spark throws an error. If index < 0, accesses elements from the last to the first. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices.
element_at(map, key) - Returns value for given key. The function returns NULL if the key is not contained in the map.
Examples:
sql
> SELECT element_at(array(1, 2, 3), 2);
2
> SELECT element_at(map(1, 'a', 2, 'b'), 2);
b
Since: 2.4.0
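A hedged plain-Python sketch of the array form's indexing rules (assuming the spark.sql.ansi.enabled=false behavior, where out-of-range returns NULL):

```python
def element_at(arr, index):
    # 1-based from the front; negative indices count from the end;
    # out of range returns None (SQL NULL); index 0 is an error.
    if index == 0:
        raise ValueError("index must not be 0")
    i = index - 1 if index > 0 else index
    if -len(arr) <= i < len(arr):
        return arr[i]
    return None

print(element_at([1, 2, 3], 2))   # 2
print(element_at([1, 2, 3], -1))  # 3
print(element_at([1, 2, 3], 5))   # None
```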
elt
elt(n, input1, input2, ...) - Returns the n-th input, e.g., returns input2 when n is 2. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices.
Examples:
sql
> SELECT elt(1, 'scala', 'java');
scala
> SELECT elt(2, 'a', 1);
1
Since: 2.0.0
encode
encode(str, charset) - Encodes the first argument using the second argument character set. If either argument is null, the result will also be null.
Arguments:
- str - a string expression
- charset - one of the charsets 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32' to encode str into a BINARY. It is case insensitive.
Examples:
sql
> SELECT encode('abc', 'utf-8');
abc
Since: 1.5.0
endswith
endswith(left, right) - Returns a boolean. The value is True if left ends with right. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left and right must be of STRING or BINARY type.
Examples:
sql
> SELECT endswith('Spark SQL', 'SQL');
true
> SELECT endswith('Spark SQL', 'Spark');
false
> SELECT endswith('Spark SQL', null);
NULL
> SELECT endswith(x'537061726b2053514c', x'537061726b');
false
> SELECT endswith(x'537061726b2053514c', x'53514c');
true
Since: 3.3.0
equal_null
equal_null(expr1, expr2) - Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of them is null.
Arguments:
- expr1, expr2 - the two expressions must be of the same type or be castable to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such as array/struct, the data types of fields must be orderable.
Examples:
sql
> SELECT equal_null(3, 3);
true
> SELECT equal_null(1, '11');
false
> SELECT equal_null(true, NULL);
false
> SELECT equal_null(NULL, 'abc');
false
> SELECT equal_null(NULL, NULL);
true
Since: 3.4.0
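Null-safe equality can be sketched in plain Python (Spark's implicit type casts, as in the 1 vs '11' example, are not modeled here):

```python
def equal_null(a, b):
    # NULL == NULL is true; NULL vs non-NULL is false; otherwise plain equality
    if a is None or b is None:
        return a is None and b is None
    return a == b

print(equal_null(3, 3))         # True
print(equal_null(None, None))   # True
print(equal_null(None, "abc"))  # False
```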
every
every(expr) - Returns true if all values of expr are true.
Examples:
sql
> SELECT every(col) FROM VALUES (true), (true), (true) AS tab(col);
true
> SELECT every(col) FROM VALUES (NULL), (true), (true) AS tab(col);
true
> SELECT every(col) FROM VALUES (true), (false), (true) AS tab(col);
false
Since: 3.0.0
exists
exists(expr, pred) - Tests whether a predicate holds for one or more elements in the array.
Examples:
sql
> SELECT exists(array(1, 2, 3), x -> x % 2 == 0);
true
> SELECT exists(array(1, 2, 3), x -> x % 2 == 10);
false
> SELECT exists(array(1, null, 3), x -> x % 2 == 0);
NULL
> SELECT exists(array(0, null, 2, 3, null), x -> x IS NULL);
true
> SELECT exists(array(1, 2, 3), x -> x IS NULL);
false
Since: 2.4.0
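The NULL cases above follow SQL's three-valued logic: any true wins; otherwise a NULL predicate result makes the whole answer NULL. A hedged plain-Python sketch (None stands in for SQL NULL):

```python
def sql_exists(arr, pred):
    # True if pred is true for any element; if no element is true but
    # some predicate result was None (SQL NULL), the result is None;
    # otherwise False.
    saw_null = False
    for x in arr:
        r = pred(x)
        if r is True:
            return True
        if r is None:
            saw_null = True
    return None if saw_null else False

is_even = lambda x: None if x is None else x % 2 == 0
print(sql_exists([1, 2, 3], is_even))     # True
print(sql_exists([1, None, 3], is_even))  # None
print(sql_exists([1, 3], is_even))        # False
```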
exp
exp(expr) - Returns e to the power of expr.
Examples:
sql
> SELECT exp(0);
1.0
Since: 1.4.0
explode
explode(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Unless specified otherwise, uses the default column name col for elements of the array or key and value for the elements of the map.
Examples:
sql
> SELECT explode(array(10, 20));
10
20
> SELECT explode(collection => array(10, 20));
10
20
Since: 1.0.0
explode_outer
explode_outer(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns. Unless specified otherwise, uses the default column name col for elements of the array or key and value for the elements of the map.
Examples:
sql
> SELECT explode_outer(array(10, 20));
10
20
> SELECT explode_outer(collection => array(10, 20));
10
20
Since: 1.0.0
expm1
expm1(expr) - Returns exp(expr) - 1.
Examples:
sql
> SELECT expm1(0);
0.0
Since: 1.4.0
extract
extract(field FROM source) - Extracts a part of the date/timestamp or interval source.
Arguments:
- field - selects which part of the source should be extracted
  - Supported string values of field for dates and timestamps are (case insensitive):
    - "YEAR", ("Y", "YEARS", "YR", "YRS") - the year field
    - "YEAROFWEEK" - the ISO 8601 week-numbering year that the datetime falls in. For example, 2005-01-02 is part of the 53rd week of year 2004, so the result is 2004
    - "QUARTER", ("QTR") - the quarter (1 - 4) of the year that the datetime falls in
    - "MONTH", ("MON", "MONS", "MONTHS") - the month field (1 - 12)
    - "WEEK", ("W", "WEEKS") - the number of the ISO 8601 week-of-week-based-year. A week is considered to start on a Monday and week 1 is the first week with >3 days. In the ISO week-numbering system, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-02 is part of the 53rd week of year 2004, while 2012-12-31 is part of the first week of 2013
    - "DAY", ("D", "DAYS") - the day of the month field (1 - 31)
    - "DAYOFWEEK", ("DOW") - the day of the week for datetime as Sunday (1) to Saturday (7)
    - "DAYOFWEEK_ISO", ("DOW_ISO") - ISO 8601 based day of the week for datetime as Monday (1) to Sunday (7)
    - "DOY" - the day of the year (1 - 365/366)
    - "HOUR", ("H", "HOURS", "HR", "HRS") - the hour field (0 - 23)
    - "MINUTE", ("M", "MIN", "MINS", "MINUTES") - the minutes field (0 - 59)
    - "SECOND", ("S", "SEC", "SECONDS", "SECS") - the seconds field, including fractional parts
  - Supported string values of field for intervals (which consist of months, days, microseconds) are (case insensitive):
    - "YEAR", ("Y", "YEARS", "YR", "YRS") - the total months / 12
    - "MONTH", ("MON", "MONS", "MONTHS") - the total months % 12
    - "DAY", ("D", "DAYS") - the days part of the interval
    - "HOUR", ("H", "HOURS", "HR", "HRS") - how many hours the microseconds contains
    - "MINUTE", ("M", "MIN", "MINS", "MINUTES") - how many minutes left after taking hours from microseconds
    - "SECOND", ("S", "SEC", "SECONDS", "SECS") - how many seconds with fractions left after taking hours and minutes from microseconds
- source - a date/timestamp or interval column from which field should be extracted
Examples:
sql
> SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456');
2019
> SELECT extract(week FROM timestamp'2019-08-12 01:00:00.123456');
33
> SELECT extract(doy FROM DATE'2019-08-12');
224
> SELECT extract(SECONDS FROM timestamp'2019-10-01 00:00:01.000001');
1.000001
> SELECT extract(days FROM interval 5 days 3 hours 7 minutes);
5
> SELECT extract(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
30.001001
> SELECT extract(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH);
11
> SELECT extract(MINUTE FROM INTERVAL '123 23:55:59.002001' DAY TO SECOND);
55
Note:
The extract function is equivalent to date_part(field, source).
Since: 3.0.0
6. F
factorial
factorial(expr) - Returns the factorial of expr. expr must be in the range [0..20]; otherwise, the result is null.
Examples:
sql
> SELECT factorial(5);
120
Since: 1.5.0
filter
filter(expr, func) - Filters the input array using the given predicate.
Examples:
sql
> SELECT filter(array(1, 2, 3), x -> x % 2 == 1);
[1,3]
> SELECT filter(array(0, 2, 3), (x, i) -> x > i);
[2,3]
> SELECT filter(array(0, null, 2, 3, null), x -> x IS NOT NULL);
[0,2,3]
Note:
The inner function may use the index argument since 3.0.0.
Since: 2.4.0
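The two-argument lambda form (element plus index, available since 3.0.0) can be sketched in plain Python:

```python
def spark_filter(arr, func):
    # func receives (element, index), mirroring the (x, i) -> ... form
    return [x for i, x in enumerate(arr) if func(x, i)]

print(spark_filter([0, 2, 3], lambda x, i: x > i))  # [2, 3]
```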
find_in_set
find_in_set(str, str_array) - Returns the index (1-based) of the given string (str) in the comma-delimited list (str_array). Returns 0, if the string was not found or if the given string (str) contains a comma.
Examples:
sql
> SELECT find_in_set('ab','abc,b,ab,c,def');
3
Since: 1.5.0
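A hedged plain-Python sketch of the lookup (SQL NULL handling is not modeled):

```python
def find_in_set(s, str_array):
    # 1-based position of s in the comma-delimited list; 0 if s is
    # absent or itself contains a comma.
    if "," in s:
        return 0
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0

print(find_in_set("ab", "abc,b,ab,c,def"))  # 3
```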
first
first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
Examples:
sql
> SELECT first(col) FROM VALUES (10), (5), (20) AS tab(col);
10
> SELECT first(col) FROM VALUES (NULL), (5), (20) AS tab(col);
NULL
> SELECT first(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
5
Note:
The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
Since: 2.0.0
first_value
first_value(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
Examples:
sql
> SELECT first_value(col) FROM VALUES (10), (5), (20) AS tab(col);
10
> SELECT first_value(col) FROM VALUES (NULL), (5), (20) AS tab(col);
NULL
> SELECT first_value(col, true) FROM VALUES (NULL), (5), (20) AS tab(col);
5
Note:
The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
Since: 2.0.0
flatten
flatten(arrayOfArrays) - Transforms an array of arrays into a single array.
Examples:
sql
> SELECT flatten(array(array(1, 2), array(3, 4)));
[1,2,3,4]
Since: 2.4.0
float
float(expr) - Casts the value expr to the target data type float.
Since: 2.0.1
floor
floor(expr[, scale]) - Returns the largest number after rounding down that is not greater than expr. An optional scale parameter can be specified to control the rounding behavior.
Examples:
sql
> SELECT floor(-0.1);
-1
> SELECT floor(5);
5
> SELECT floor(3.1411, 3);
3.141
> SELECT floor(3.1411, -3);
0
Since: 3.3.0
forall
forall(expr, pred) - Tests whether a predicate holds for all elements in the array.
Examples:
sql
> SELECT forall(array(1, 2, 3), x -> x % 2 == 0);
false
> SELECT forall(array(2, 4, 8), x -> x % 2 == 0);
true
> SELECT forall(array(1, null, 3), x -> x % 2 == 0);
false
> SELECT forall(array(2, null, 8), x -> x % 2 == 0);
NULL
Since: 3.0.0
format_number
format_number(expr1, expr2) - Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. expr2 also accepts a user-specified format string. This is supposed to function like MySQL's FORMAT.
Examples:
sql
> SELECT format_number(12332.123456, 4);
12,332.1235
> SELECT format_number(12332.123456, '##################.###');
12332.123
Since: 1.5.0
format_string
format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.
Examples:
sql
> SELECT format_string("Hello World %d %s", 100, "days");
Hello World 100 days
Since: 1.5.0
from_avro
from_avro(child, jsonFormatSchema, options) - Converts a binary Avro value into a Catalyst value.
Examples:
sql
> SELECT from_avro(s, '{"type": "record", "name": "struct", "fields": [{ "name": "u", "type": ["int","string"] }]}', map()) IS NULL AS result FROM (SELECT NAMED_STRUCT('u', NAMED_STRUCT('member0', member0, 'member1', member1)) AS s FROM VALUES (1, NULL), (NULL, 'a') tab(member0, member1));
[false]
Note:
The specified schema must match actual schema of the read data, otherwise the behavior is undefined: it may fail or return arbitrary result. To deserialize the data with a compatible and evolved schema, the expected Avro schema can be set via the corresponding option.
Since: 4.0.0
from_csv
from_csv(csvStr, schema[, options]) - Returns a struct value with the given csvStr and schema.
Examples:
sql
> SELECT from_csv('1, 0.8', 'a INT, b DOUBLE');
{"a":1,"b":0.8}
> SELECT from_csv('26/08/2015', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
{"time":2015-08-26 00:00:00}
Since: 3.0.0
from_json
from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema.
Examples:
sql
> SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
{"a":1,"b":0.8}
> SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
{"time":2015-08-26 00:00:00}
> SELECT from_json('{"teacher": "Alice", "student": [{"name": "Bob", "rank": 1}, {"name": "Charlie", "rank": 2}]}', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
{"teacher":"Alice","student":[{"name":"Bob","rank":1},{"name":"Charlie","rank":2}]}
Since: 2.2.0
from_protobuf
from_protobuf(data, messageName, descFilePath, options) - Converts a binary Protobuf value into a Catalyst value.
Examples:
sql
> SELECT from_protobuf(s, 'Person', '/path/to/descriptor.desc', map()) IS NULL AS result FROM (SELECT NAMED_STRUCT('name', name, 'id', id) AS s FROM VALUES ('John Doe', 1), (NULL, 2) tab(name, id));
[false]
Note:
The specified Protobuf schema must match actual schema of the read data, otherwise the behavior is undefined: it may fail or return arbitrary result. To deserialize the data with a compatible and evolved schema, the expected Protobuf schema can be set via the corresponding option.
Since: 4.0.0
from_unixtime
from_unixtime(unix_time[, fmt]) - Returns unix_time in the specified fmt.
Arguments:
- unix_time - UNIX Timestamp to be converted to the provided format.
- fmt - Date/time format pattern to follow. See Datetime Patterns for valid date and time format patterns. The 'yyyy-MM-dd HH:mm:ss' pattern is used if omitted.
Examples:
sql
> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
1969-12-31 16:00:00
> SELECT from_unixtime(0);
1969-12-31 16:00:00
Since: 1.5.0
from_utc_timestamp
from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.
Examples:
sql
> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
2016-08-31 09:00:00
Since: 1.5.0
from_xml
from_xml(xmlStr, schema[, options]) - Returns a struct value with the given xmlStr and schema.
Examples:
sql
> SELECT from_xml('<p><a>1</a><b>0.8</b></p>', 'a INT, b DOUBLE');
{"a":1,"b":0.8}
> SELECT from_xml('<p><time>26/08/2015</time></p>', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
{"time":2015-08-26 00:00:00}
> SELECT from_xml('<p><teacher>Alice</teacher><student><name>Bob</name><rank>1</rank></student><student><name>Charlie</name><rank>2</rank></student></p>', 'STRUCT<teacher: STRING, student: ARRAY<STRUCT<name: STRING, rank: INT>>>');
{"teacher":"Alice","student":[{"name":"Bob","rank":1},{"name":"Charlie","rank":2}]}
Since: 4.0.0
7. G
get
get(array, index) - Returns element of array at given (0-based) index. If the index points outside of the array boundaries, then this function returns NULL.
Examples:
sql
> SELECT get(array(1, 2, 3), 0);
1
> SELECT get(array(1, 2, 3), 3);
NULL
> SELECT get(array(1, 2, 3), -1);
NULL
Since: 3.4.0
get_json_object
get_json_object(json_txt, path) - Extracts a json object from path.
Examples:
sql
> SELECT get_json_object('{"a":"b"}', '$.a');
b
Since: 1.5.0
getbit
getbit(expr, pos) - Returns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative.
Examples:
sql
> SELECT getbit(11, 0);
1
> SELECT getbit(11, 2);
0
Since: 3.2.0
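The bit lookup is a shift and mask; a hedged plain-Python equivalent:

```python
def getbit(expr, pos):
    # Bit at position pos, counted from the right starting at zero
    if pos < 0:
        raise ValueError("position must not be negative")
    return (expr >> pos) & 1

print(getbit(11, 0))  # 1  (11 is 0b1011)
print(getbit(11, 2))  # 0
```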
greatest
greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values.
Examples:
sql
> SELECT greatest(10, 9, 2, 4, 3);
10
Since: 1.5.0
grouping
grouping(col) - Indicates whether a specified column in a GROUP BY is aggregated or not; returns 1 for aggregated or 0 for not aggregated in the result set.
Examples:
sql
> SELECT name, grouping(name), sum(age) FROM VALUES (2, 'Alice'), (5, 'Bob') people(age, name) GROUP BY cube(name);
Alice 0 2
Bob 0 5
NULL 1 7
Since: 2.0.0
grouping_id
grouping_id([col1[, col2 ...]]) - Returns the level of grouping, equal to (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn).
Examples:
sql
> SELECT name, grouping_id(), sum(age), avg(height) FROM VALUES (2, 'Alice', 165), (5, 'Bob', 180) people(age, name, height) GROUP BY cube(name, height);
Alice 0 2 165.0
Alice 1 2 165.0
NULL 3 7 172.5
Bob 0 5 180.0
Bob 1 5 180.0
NULL 2 2 165.0
NULL 2 5 180.0
Note:
Input columns should match with grouping columns exactly, or empty (means all the grouping columns).
Since: 2.0.0
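The formula above packs one grouping() flag per column, most significant first. A hedged plain-Python sketch, where the bits list is a hypothetical stand-in for grouping(c1)..grouping(cn):

```python
def grouping_id(bits):
    # bits[i] = grouping(ci): 1 if column i is rolled up, else 0
    gid = 0
    for b in bits:
        gid = (gid << 1) | b
    return gid

# For GROUP BY cube(name, height): both rolled up -> 3, only height -> 1
print(grouping_id([1, 1]))  # 3
print(grouping_id([0, 1]))  # 1
```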
8. H
hash
hash(expr1, expr2, ...) - Returns a hash value of the arguments.
Examples:
sql
> SELECT hash('Spark', array(123), 2);
-1321691492
Since: 2.0.0
hex
hex(expr) - Converts expr to hexadecimal.
Examples:
sql
> SELECT hex(17);
11
> SELECT hex('Spark SQL');
537061726B2053514C
Since: 1.5.0
histogram_numeric
histogram_numeric(expr, nb) - Computes a histogram on numeric 'expr' using nb bins. The return value is an array of (x,y) pairs representing the centers of the histogram's bins. As the value of 'nb' is increased, the histogram approximation gets finer-grained, but may yield artifacts around outliers. In practice, 20-40 histogram bins appear to work well, with more bins being required for skewed or smaller datasets. Note that this function creates a histogram with non-uniform bin widths. It offers no guarantees in terms of the mean-squared-error of the histogram, but in practice is comparable to the histograms produced by the R/S-Plus statistical computing packages. Note: the output type of the 'x' field in the return value is propagated from the input value consumed in the aggregate function.
Examples:
sql
> SELECT histogram_numeric(col, 5) FROM VALUES (0), (1), (2), (10) AS tab(col);
[{"x":0,"y":1.0},{"x":1,"y":1.0},{"x":2,"y":1.0},{"x":10,"y":1.0}]
Since: 3.3.0
hll_sketch_agg
hll_sketch_agg(expr, lgConfigK) - Returns the HllSketch's updatable binary representation. lgConfigK (optional) the log-base-2 of K, with K is the number of buckets or slots for the HllSketch.
Examples:
sql
> SELECT hll_sketch_estimate(hll_sketch_agg(col, 12)) FROM VALUES (1), (1), (2), (2), (3) tab(col);
3
Since: 3.5.0
hll_sketch_estimate
hll_sketch_estimate(expr) - Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch.
Examples:
sql
> SELECT hll_sketch_estimate(hll_sketch_agg(col)) FROM VALUES (1), (1), (2), (2), (3) tab(col);
3
Since: 3.5.0
hll_union
hll_union(first, second, allowDifferentLgConfigK) - Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Set allowDifferentLgConfigK to true to allow unions of sketches with different lgConfigK values (defaults to false).
Examples:
sql
> SELECT hll_sketch_estimate(hll_union(hll_sketch_agg(col1), hll_sketch_agg(col2))) FROM VALUES (1, 4), (1, 4), (2, 5), (2, 5), (3, 6) tab(col1, col2);
6
Since: 3.5.0
hll_union_agg
hll_union_agg(expr, allowDifferentLgConfigK) - Returns the estimated number of unique values. allowDifferentLgConfigK (optional) Allow sketches with different lgConfigK values to be unioned (defaults to false).
Examples:
sql
> SELECT hll_sketch_estimate(hll_union_agg(sketch, true)) FROM (SELECT hll_sketch_agg(col) as sketch FROM VALUES (1) tab(col) UNION ALL SELECT hll_sketch_agg(col, 20) as sketch FROM VALUES (1) tab(col));
1
Since: 3.5.0
hour
hour(timestamp) - Returns the hour component of the string/timestamp.
Examples:
sql
> SELECT hour('2009-07-30 12:58:59');
12
Since: 1.5.0
hypot
hypot(expr1, expr2) - Returns sqrt(expr1² + expr2²).
Examples:
sql
> SELECT hypot(3, 4);
5.0
Since: 1.4.0
9. I
if
if(expr1, expr2, expr3) - If expr1 evaluates to true, then returns expr2; otherwise returns expr3.
Examples:
sql
> SELECT if(1 < 2, 'a', 'b');
a
Since: 1.0.0
ifnull
ifnull(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.
Examples:
sql
> SELECT ifnull(NULL, array('2'));
["2"]
Since: 2.0.0
ilike
str ilike pattern[ ESCAPE escape] - Returns true if str matches pattern with escape case-insensitively, null if any arguments are null, false otherwise.
Arguments:
- str - a string expression
- pattern - a string expression. The pattern is a string which is matched literally and case-insensitively, with exception to the following special symbols:
  - _ matches any one character in the input (similar to . in posix regular expressions)
  - % matches zero or more characters in the input (similar to .* in posix regular expressions)

  Since Spark 2.0, string literals are unescaped in our SQL parser, see the unescaping rules at String Literal. For example, in order to match "\abc", the pattern should be "\\abc".
  When the SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back to Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the pattern to match "\abc" should be "\abc".
  It is recommended to use a raw string literal (with the r prefix) to avoid escaping special characters in the pattern string, if any.
- escape - a character added since Spark 3.0. The default escape character is '\'. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.
Examples:
sql
> SELECT ilike('Spark', '_Park');
true
> SELECT '\\abc' AS S, S ilike r'\\abc', S ilike '\\\\abc';
\abc true true
> SET spark.sql.parser.escapedStringLiterals=true;
spark.sql.parser.escapedStringLiterals true
> SELECT '%SystemDrive%\Users\John' ilike '\%SystemDrive\%\\users%';
true
> SET spark.sql.parser.escapedStringLiterals=false;
spark.sql.parser.escapedStringLiterals false
> SELECT '%SystemDrive%\\USERS\\John' ilike r'%SystemDrive%\\Users%';
true
> SELECT '%SystemDrive%/Users/John' ilike '/%SYSTEMDrive/%//Users%' ESCAPE '/';
true
Note:
Use RLIKE to match with standard regular expressions.
Since: 3.3.0
in
expr1 in(expr2, expr3, ...) - Returns true if expr1 equals any of exprN.
Arguments:
- expr1, expr2, expr3, ... - the arguments must be of the same type.
Examples:
sql
> SELECT 1 in(1, 2, 3);
true
> SELECT 1 in(2, 3, 4);
false
> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3));
false
> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3));
true
Since: 1.0.0
initcap
initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.
Examples:
sql
> SELECT initcap('sPark sql');
Spark Sql
Since: 1.5.0
inline
inline(expr) - Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise.
Examples:
sql
> SELECT * FROM inline(array(struct(1, 'a'), struct(2, 'b')));
1 a
2 b
> SELECT * FROM inline(input => array(struct(1, 'a'), struct(2, 'b')));
1 a
2 b
Since: 3.4.0
inline_outer
inline_outer(expr) - Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise.
Examples:
sql
> SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')));
1 a
2 b
> SELECT inline_outer(input => array(struct(1, 'a'), struct(2, 'b')));
1 a
2 b
Since: 2.0.0
input_file_block_length
input_file_block_length() - Returns the length of the block being read, or -1 if not available.
Examples:
sql
> SELECT input_file_block_length();
-1
Since: 2.2.0
input_file_block_start
input_file_block_start() - Returns the start offset of the block being read, or -1 if not available.
Examples:
sql
> SELECT input_file_block_start();
-1
Since: 2.2.0
input_file_name
input_file_name() - Returns the name of the file being read, or empty string if not available.
Examples:
sql
> SELECT input_file_name();
Since: 1.5.0
instr
instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.
Examples:
sql
> SELECT instr('SparkSQL', 'SQL');
6
Since: 1.5.0
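The 1-based convention (with 0 for "not found") maps directly onto Python's 0-based `str.find`; a minimal sketch:

```python
def instr(s: str, sub: str) -> int:
    # str.find is 0-based and returns -1 when absent;
    # adding 1 yields Spark's 1-based index and 0 for no match.
    return s.find(sub) + 1

print(instr('SparkSQL', 'SQL'))  # -> 6
```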
int
int(expr) - Casts the value expr to the target data type int.
Since: 2.0.1
is_valid_utf8
is_valid_utf8(str) - Returns true if str is a valid UTF-8 string, otherwise returns false.
Arguments:
- str - a string expression
Examples:
sql
> SELECT is_valid_utf8('Spark');
true
> SELECT is_valid_utf8(x'61');
true
> SELECT is_valid_utf8(x'80');
false
> SELECT is_valid_utf8(x'61C262');
false
Since: 4.0.0
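The check amounts to asking whether strict UTF-8 decoding succeeds; a Python sketch of the same semantics:

```python
def is_valid_utf8(data: bytes) -> bool:
    # A byte sequence is valid UTF-8 iff strict decoding raises no error.
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(b'\x61'))       # x'61'     -> True
print(is_valid_utf8(b'\x80'))       # x'80'     -> False (stray continuation byte)
print(is_valid_utf8(b'\x61\xc2\x62'))  # x'61C262' -> False (truncated sequence)
```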
is_variant_null
is_variant_null(expr) - Check if a variant value is a variant null. Returns true if and only if the input is a variant null and false otherwise (including in the case of SQL NULL).
Examples:
sql
> SELECT is_variant_null(parse_json('null'));
true
> SELECT is_variant_null(parse_json('"null"'));
false
> SELECT is_variant_null(parse_json('13'));
false
> SELECT is_variant_null(parse_json(null));
false
> SELECT is_variant_null(variant_get(parse_json('{"a":null, "b":"spark"}'), "$.c"));
false
> SELECT is_variant_null(variant_get(parse_json('{"a":null, "b":"spark"}'), "$.a"));
true
Since: 4.0.0
isnan
isnan(expr) - Returns true if expr is NaN, or false otherwise.
Examples:
sql
> SELECT isnan(cast('NaN' as double));
true
Since: 1.5.0
isnotnull
isnotnull(expr) - Returns true if expr is not null, or false otherwise.
Examples:
sql
> SELECT isnotnull(1);
true
Since: 1.0.0
isnull
isnull(expr) - Returns true if expr is null, or false otherwise.
Examples:
sql
> SELECT isnull(1);
false
Since: 1.0.0
10、J
java_method
java_method(class, method[, arg1[, arg2 ...]]) - Calls a method with reflection.
Examples:
sql
> SELECT java_method('java.util.UUID', 'randomUUID');
c33fb387-8500-4bfa-81d2-6e0e3e930df2
> SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
a5cf6c42-0c85-418f-af6c-3e4e5b1328f2
Since: 2.0.0
json_array_length
json_array_length(jsonArray) - Returns the number of elements in the outermost JSON array.
Arguments:
- jsonArray - A JSON array.
NULL is returned in case of any other valid JSON string, NULL or an invalid JSON.
Examples:
sql
> SELECT json_array_length('[1,2,3,4]');
4
> SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
5
> SELECT json_array_length('[1,2');
NULL
Since: 3.1.0
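The behavior can be sketched with the Python standard-library `json` module (whose parser is not byte-for-byte identical to Spark's, but agrees on these cases); `None` stands in for SQL NULL:

```python
import json

def json_array_length(json_str):
    # Count elements of the outermost JSON array; any other valid JSON
    # value, invalid JSON, or NULL input yields None (SQL NULL).
    try:
        value = json.loads(json_str)
    except (ValueError, TypeError):
        return None
    return len(value) if isinstance(value, list) else None

print(json_array_length('[1,2,3,4]'))                    # -> 4
print(json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]'))  # -> 5
print(json_array_length('[1,2'))                          # -> None
```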
json_object_keys
json_object_keys(json_object) - Returns all the keys of the outermost JSON object as an array.
Arguments:
- json_object - A JSON object. If a valid JSON object is given, all the keys of the outermost object will be returned as an array. If it is any other valid JSON string, an invalid JSON string or an empty string, the function returns null.
Examples:
sql
> SELECT json_object_keys('{}');
[]
> SELECT json_object_keys('{"key": "value"}');
["key"]
> SELECT json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}');
["f1","f2"]
Since: 3.1.0
json_tuple
json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.
Examples:
sql
> SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
1 2
Since: 1.6.0
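A rough Python sketch of the extraction: each requested top-level field is rendered as a string (here, non-string values are re-serialized as JSON text, an assumption that approximates Spark's output for the scalar cases shown above):

```python
import json

def json_tuple(json_str, *names):
    # Extract each named top-level field as a string; missing fields,
    # JSON null, or invalid input yield None (SQL NULL).
    try:
        obj = json.loads(json_str)
    except (ValueError, TypeError):
        obj = None
    def render(v):
        if v is None:
            return None
        return v if isinstance(v, str) else json.dumps(v)
    if not isinstance(obj, dict):
        return tuple(None for _ in names)
    return tuple(render(obj.get(n)) for n in names)

print(json_tuple('{"a":1, "b":2}', 'a', 'b'))  # -> ('1', '2')
```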
11、K
kurtosis
kurtosis(expr) - Returns the kurtosis value calculated from values of a group.
Examples:
sql
> SELECT kurtosis(col) FROM VALUES (-10), (-20), (100), (1000) AS tab(col);
-0.7014368047529627
> SELECT kurtosis(col) FROM VALUES (1), (10), (100), (10), (1) as tab(col);
0.19432323191699075
Since: 1.6.0
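Spark reports the population excess kurtosis: the fourth central moment divided by the squared second central moment, minus 3. The documented values can be reproduced with a short sketch:

```python
def kurtosis(values):
    # Population excess kurtosis: m4 / m2^2 - 3,
    # where m_k is the k-th central moment over n values.
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3

print(kurtosis([-10, -20, 100, 1000]))   # -> -0.7014368047529627
print(kurtosis([1, 10, 100, 10, 1]))     # ->  0.19432323191699075
```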
12、L
lag
lag(input[, offset[, default]]) - Returns the value of input at the offsetth row before the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offsetth row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default is returned.
Arguments:
- input - a string expression to evaluate offset rows before the current row.
- offset - an int expression which is the number of rows to jump back in the partition.
- default - a string expression to use when the offset row does not exist.
Examples:
sql
> SELECT a, b, lag(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
A1 1 NULL
A1 1 1
A1 2 1
A2 3 NULL
Since: 2.0.0
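The window semantics above can be emulated outside Spark with a small hypothetical helper (`lag_over` is an illustrative name, not a Spark API): sort within each partition by the ordering key, then look back `offset` rows.

```python
from itertools import groupby

def lag_over(rows, offset=1, default=None):
    # Emulates: lag(b, offset, default) OVER (PARTITION BY a ORDER BY b)
    # for rows given as (a, b) tuples.
    rows = sorted(rows)  # partition key first, then ordering key
    out = []
    for _, group in groupby(rows, key=lambda r: r[0]):
        g = list(group)
        for i, (a, b) in enumerate(g):
            prev = g[i - offset][1] if i - offset >= 0 else default
            out.append((a, b, prev))
    return out

# Matches the example output: None plays the role of SQL NULL.
print(lag_over([('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1)]))
```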
last
last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
Examples:
sql
> SELECT last(col) FROM VALUES (10), (5), (20) AS tab(col);
20
> SELECT last(col) FROM VALUES (10), (5), (NULL) AS tab(col);
NULL
> SELECT last(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
5
Note:
The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
Since: 2.0.0
last_day
last_day(date) - Returns the last day of the month which the date belongs to.
Examples:
sql
> SELECT last_day('2009-01-12');
2009-01-31
Since: 1.5.0
last_value
last_value(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only non-null values.
Examples:
sql
> SELECT last_value(col) FROM VALUES (10), (5), (20) AS tab(col);
20
> SELECT last_value(col) FROM VALUES (10), (5), (NULL) AS tab(col);
NULL
> SELECT last_value(col, true) FROM VALUES (10), (5), (NULL) AS tab(col);
5
Note:
The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
Since: 2.0.0
lcase
lcase(str) - Returns str with all characters changed to lowercase.
Examples:
sql
> SELECT lcase('SparkSql');
sparksql
Since: 1.0.1
lead
lead(input[, offset[, default]]) - Returns the value of input at the offsetth row after the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offsetth row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default is returned.
Arguments:
- input - a string expression to evaluate offset rows after the current row.
- offset - an int expression which is the number of rows to jump ahead in the partition.
- default - a string expression to use when the offset is larger than the window. The default value is null.
Examples:
sql
> SELECT a, b, lead(b) OVER (PARTITION BY a ORDER BY b) FROM VALUES ('A1', 2), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);
A1 1 1
A1 1 2
A1 2 NULL
A2 3 NULL
Since: 2.0.0
least
least(expr, ...) - Returns the least value of all parameters, skipping null values.
Examples:
sql
> SELECT least(10, 9, 2, 4, 3);
2
Since: 1.5.0
left
left(str, len) - Returns the leftmost len characters from the string str (len can be string type); if len is less than or equal to 0, the result is an empty string.
Examples:
sql
> SELECT left('Spark SQL', 3);
Spa
> SELECT left(encode('Spark SQL', 'utf-8'), 3);
Spa
Since: 2.3.0
len
len(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
Examples:
sql
> SELECT len('Spark SQL ');
10
> SELECT len(x'537061726b2053514c');
9
> SELECT CHAR_LENGTH('Spark SQL ');
10
> SELECT CHARACTER_LENGTH('Spark SQL ');
10
Since: 3.4.0
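The character-vs-byte distinction in the examples above can be checked directly in Python, where `len` on `str` counts characters and `len` on `bytes` counts bytes:

```python
s = 'Spark SQL '                          # trailing space counts toward the length
print(len(s))                             # character length -> 10

b = bytes.fromhex('537061726b2053514c')   # the binary literal x'537061726b2053514c'
print(b)                                  # b'Spark SQL'
print(len(b))                             # byte length -> 9
```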
length
length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.
Examples:
sql
> SELECT length('Spark SQL ');
10
> SELECT length(x'537061726b2053514c');
9
> SELECT CHAR_LENGTH('Spark SQL ');
10
> SELECT CHARACTER_LENGTH('Spark SQL ');
10
Since: 1.5.0
levenshtein
levenshtein(str1, str2[, threshold]) - Returns the Levenshtein distance between the two given strings. If threshold is set and the distance is greater than it, returns -1.
Examples:
sql
> SELECT levenshtein('kitten', 'sitting');
3
> SELECT levenshtein('kitten', 'sitting', 2);
-1
Since: 1.5.0
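The distance is the classic dynamic-programming edit distance; a minimal sketch including the threshold behavior (Spark's actual implementation prunes the computation when a threshold is given, which this sketch does not):

```python
def levenshtein(s1, s2, threshold=None):
    # Row-by-row DP: prev[j] holds the distance between s1[:i-1] and s2[:j].
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        cur = [i]
        for j, c2 in enumerate(s2, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (c1 != c2)))   # substitution
        prev = cur
    dist = prev[-1]
    return -1 if threshold is not None and dist > threshold else dist

print(levenshtein('kitten', 'sitting'))      # -> 3
print(levenshtein('kitten', 'sitting', 2))   # -> -1
```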
like
str like pattern[ ESCAPE escape] - Returns true if str matches pattern with escape, null if any arguments are null, false otherwise.
Arguments:
- str - a string expression
- pattern - a string expression. The pattern is a string which is matched literally, with exception to the following special symbols:
  - _ matches any one character in the input (similar to . in posix regular expressions)
  - % matches zero or more characters in the input (similar to .* in posix regular expressions)
  Since Spark 2.0, string literals are unescaped in our SQL parser, see the unescaping rules at String Literal. For example, in order to match "\abc", the pattern should be "\\abc".
  When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back to Spark 1.6 behavior regarding string literal parsing. For example, if the config is enabled, the pattern to match "\abc" should be "\abc".
  It's recommended to use a raw string literal (with the r prefix) to avoid escaping special characters in the pattern string, if any exist.
- escape - a character added since Spark 3.0. The default escape character is '\'. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.
Examples:
sql
> SELECT like('Spark', '_park');
true
> SELECT '\\abc' AS S, S like r'\\abc', S like '\\\\abc';
\abc true true
> SET spark.sql.parser.escapedStringLiterals=true;
spark.sql.parser.escapedStringLiterals true
> SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%';
true
> SET spark.sql.parser.escapedStringLiterals=false;
spark.sql.parser.escapedStringLiterals false
> SELECT '%SystemDrive%\\Users\\John' like r'%SystemDrive%\\Users%';
true
> SELECT '%SystemDrive%/Users/John' like '/%SystemDrive/%//Users%' ESCAPE '/';
true
Note:
Use RLIKE to match with standard regular expressions.
Since: 1.0.0
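The pattern semantics can be sketched as a LIKE-to-regex translator. This is a simplification: unlike Spark, it silently tolerates an escape before an ordinary character rather than raising an error.

```python
import re

def sql_like(s, pattern, escape='\\'):
    # Translate a SQL LIKE pattern: '%' -> '.*', '_' -> '.',
    # and the escape character makes the following character literal.
    regex, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            regex.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == '%':
            regex.append('.*')
        elif ch == '_':
            regex.append('.')
        else:
            regex.append(re.escape(ch))
        i += 1
    return re.fullmatch(''.join(regex), s, flags=re.DOTALL) is not None

print(sql_like('Spark', '_park'))  # -> True
# Mirrors the ESCAPE '/' example above:
print(sql_like('%SystemDrive%/Users/John', '/%SystemDrive/%//Users%', escape='/'))  # -> True
```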
listagg
listagg(expr[, delimiter])[ WITHIN GROUP (ORDER BY key [ASC | DESC] [,...])] - Returns the concatenation of non-NULL input values, separated by the delimiter ordered by key. If all values are NULL, NULL is returned.
Arguments:
- expr - a string or binary expression to be concatenated.
- delimiter - an optional string or binary foldable expression used to separate the input values. If NULL, the concatenation will be performed without a delimiter. Default is NULL.
- key - an optional expression for ordering the input values. Multiple keys can be specified. If none are specified, the order of the rows in the result is non-deterministic.
Examples:
sql
> SELECT listagg(col) FROM VALUES ('a'), ('b'), ('c') AS tab(col);
abc
> SELECT listagg(col) WITHIN GROUP (ORDER BY col DESC) FROM VALUES ('a'), ('b'), ('c') AS tab(col);
cba
> SELECT listagg(col) FROM VALUES ('a'), (NULL), ('b') AS tab(col);
ab
> SELECT listagg(col) FROM VALUES ('a'), ('a') AS tab(col);
aa
> SELECT listagg(DISTINCT col) FROM VALUES ('a'), ('a'), ('b') AS tab(col);
ab
> SELECT listagg(col, ', ') FROM VALUES ('a'), ('b'), ('c') AS tab(col);
a, b, c
> SELECT listagg(col) FROM VALUES (NULL), (NULL) AS tab(col);
NULL
Note:
- If the order is not specified, the function is non-deterministic because the order of the rows may be non-deterministic after a shuffle.
- If DISTINCT is specified, then expr and key must be the same expression.
Since: 4.0.0
ln
ln(expr) - Returns the natural logarithm (base e) of expr.
Examples:
sql
> SELECT ln(1);
0.0
Since: 1.4.0
localtimestamp
localtimestamp() - Returns the current timestamp without time zone at the start of query evaluation. All calls of localtimestamp within the same query return the same value.
localtimestamp - Returns the current local date-time at the session time zone at the start of query evaluation.
Examples:
sql
> SELECT localtimestamp();
2020-04-25 15:49:11.914
Since: 3.4.0
locate
locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos. The given pos and return value are 1-based.
Examples:
sql
> SELECT locate('bar', 'foobarbar');
4
> SELECT locate('bar', 'foobarbar', 5);
7
> SELECT POSITION('bar' IN 'foobarbar');
4
Since: 1.5.0
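A plain-Python sketch of the 1-based search (the guard returning 0 for a non-positive pos mirrors what I understand Spark to do for pos = 0, an assumption worth verifying against your Spark version):

```python
def locate(sub, s, pos=1):
    # Search from 1-based position pos; return the 1-based match
    # position, or 0 when absent (or when pos is non-positive).
    if pos < 1:
        return 0
    return s.find(sub, pos - 1) + 1

print(locate('bar', 'foobarbar'))     # -> 4
print(locate('bar', 'foobarbar', 5))  # -> 7
```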
log
log(base, expr) - Returns the logarithm of expr with base.
Examples:
sql
> SELECT log(10, 100);
2.0
Since: 1.5.0
log10
log10(expr) - Returns the logarithm of expr with base 10.
Examples:
sql
> SELECT log10(10);
1.0
Since: 1.4.0
log1p
log1p(expr) - Returns log(1 + expr).
Examples:
sql
> SELECT log1p(0);
0.0
Since: 1.4.0
log2
log2(expr) - Returns the logarithm of expr with base 2.
Examples:
sql
> SELECT log2(2);
1.0
Since: 1.4.0
lower
lower(str) - Returns str with all characters changed to lowercase.
Examples:
sql
> SELECT lower('SparkSql');
sparksql
Since: 1.0.1
lpad
lpad(str, len[, pad]) - Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters or bytes. If pad is not specified, str will be padded to the left with space characters if it is a character string, and with zeros if it is a byte sequence.
Examples:
sql
> SELECT lpad('hi', 5, '??');
???hi
> SELECT lpad('hi', 1, '??');
h
> SELECT lpad('hi', 5);
hi
> SELECT hex(lpad(unhex('aabb'), 5));
000000AABB
> SELECT hex(lpad(unhex('aabb'), 5, unhex('1122')));
112211AABB
Since: 1.5.0
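For character strings, the padding-and-truncation rule can be sketched as follows (Spark's zero-byte default padding for binary input is not covered here; this sketch handles the string case with a space default):

```python
def lpad(s: str, length: int, pad: str = ' ') -> str:
    # Already long enough: truncate to `length` characters.
    if len(s) >= length:
        return s[:length]
    # Otherwise repeat `pad` on the left and cut to the exact fill width.
    fill = length - len(s)
    return (pad * fill)[:fill] + s

print(lpad('hi', 5, '??'))  # -> ???hi
print(lpad('hi', 1, '??'))  # -> h
print(lpad('hi', 5))        # -> '   hi'
```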
ltrim
ltrim(str) - Removes the leading space characters from str.
Arguments:
- str - a string expression
Examples:
sql
> SELECT ltrim(' SparkSQL ');
SparkSQL
Since: 1.5.0
luhn_check
luhn_check(str) - Checks that a string of digits is valid according to the Luhn algorithm. This checksum function is widely applied on credit card numbers and government identification numbers to distinguish valid numbers from mistyped, incorrect numbers.
Examples:
sql
> SELECT luhn_check('8112189876');
true
> SELECT luhn_check('79927398713');
true
> SELECT luhn_check('79927398714');
false
Since: 3.5.0
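The Luhn algorithm itself is short enough to sketch directly: walking the digits right to left, every second digit is doubled (subtracting 9 when the double exceeds 9), and the total must be divisible by 10.

```python
def luhn_check(s: str) -> bool:
    # Non-digit input fails the check outright.
    if not s.isdigit():
        return False
    total = 0
    for i, ch in enumerate(reversed(s)):
        d = int(ch)
        if i % 2 == 1:       # every second digit from the right
            d *= 2
            if d > 9:        # 2-digit doubles contribute their digit sum
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_check('8112189876'))   # -> True
print(luhn_check('79927398713'))  # -> True
print(luhn_check('79927398714'))  # -> False
```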