一行命令搞定json数据导出到csv

临近年终,经常遇到把接口数据导出到csv,再进一步做成图表放入PPT中的诉求,毕竟PPT才是最好的语言!

每次导出数据都要写一堆代码,未免太浪费时间,送你一串神奇的命令行指令,让你快速导出json到csv中,事半功倍!

处理json,肯定绕不过jq这个命令。之前的文章:《教你在命令行操作JSON(jq命令入门)》介绍了jq基础的用法。本篇文章就借着导出数据这个实际的需求,再介绍下jq的高级用法

热身

先来个简单版本的,接口响应内容如下,我们只想导出其中的codename字段到scv

json 复制代码
[
    {"code": "NSW", "name": "New South Wales", "level":"state", "country": "AU"},
    {"code": "AB", "name": "Alberta", "level":"province", "country": "CA"},
    {"code": "ABD", "name": "Aberdeenshire", "level":"council area", "country": "GB"},
    {"code": "AK", "name": "Alaska", "level":"state", "country": "US"}
]
shell 复制代码
$ cat j.json | jq -r '. | ["name", "code"], map([.name, .code])[] | @csv'
"name","code"
"New South Wales","NSW"
"Alberta","AB"
"Aberdeenshire","ABD"
"Alaska","AK"

可以先尝试自行理解上面jq的使用,下面我们加大难度,自动提取数据的全部字段,并添加表头

进阶

先看下进阶版本的全貌,为了换行更加清晰的展示,这里把jqfilter单独放入了一个文件,在执行jq的时候只需要指定-f file即可。效果和在命令行中一样。

shell 复制代码
# filters 文件内容
(map(keys) | add | unique) as $header
    | map(. as $row | $header | map($row[.])) as $rows
    | $header, $rows[]
    | @csv
shell 复制代码
# j.json文件内容
# [
#     {"code": "NSW", "name": "New South Wales", "level":"state", "country": "AU"},
#     {"code": "AB", "name": "Alberta", "level":"province", "country": "CA"},
#     {"code": "ABD", "name": "Aberdeenshire", "level":"council area", "country": "GB"},
#     {"code": "AK", "name": "Alaska", "level":"state", "country": "US"}
# ]
$ cat j.json | jq -r -f filters | tee j.csv
"code","country","level","name"
"NSW","AU","state","New South Wales"
"AB","CA","province","Alberta"
"ABD","GB","council area","Aberdeenshire"
"AK","US","state","Alaska"

提取csv的表头

(map(keys) | add | unique) as $header 用来提取csv第一行需要的表头。逐个命令看下

keys 对象所有key组成的数组

shell 复制代码
$ echo '{"code": "NSW", "name": "New South Wales", "level":"state", "country": "AU"}' | jq 'keys'
[
  "code",
  "country",
  "level",
  "name"
]

map(f) 可以对数组的每一项进行f操作,然后合并结果

shell 复制代码
$ echo '[{"name": "foo"},{"name": "bar"},{"name": "foobar"}]' | jq 'map(.name)'
[
  "foo",
  "bar",
  "foobar"
]

f可以是更复杂的函数,例如length可以获取字符串或数组的长度,把length放到map中,得到数组每一个元素的长度

shell 复制代码
$ echo '["foo", "bar", "foobar"]' | jq 'map(length)'
[
  3,
  3,
  6
]

所以map(keys)对于下面这段json来说。对数组中每一个元素执行keys,即对象所有key组成的数组

json 复制代码
# j.json
[
    {"code": "NSW", "name": "New South Wales", "level":"state", "country": "AU"},
    {"code": "AB", "name": "Alberta", "level":"province", "country": "CA"},
    {"code": "ABD", "name": "Aberdeenshire", "level":"council area", "country": "GB"},
    {"code": "AK", "name": "Alaska", "level":"state", "country": "US"}
]
json 复制代码
$ cat j.json | jq 'map(keys)'
[
  [
    "code",
    "country",
    "level",
    "name"
  ],
  [
    "code",
    "country",
    "level",
    "name"
  ],
  [
    "code",
    "country",
    "level",
    "name"
  ],
  [
    "code",
    "country",
    "level",
    "name"
  ]
]

add | unique 顾名思义,首先将数组合并,然后再去重

json 复制代码
$ cat j.json | jq 'map(keys) | add | unique'
[
  "code",
  "country",
  "level",
  "name"
]

(map(keys) | add | unique) as $header 总结就是遍历要转换成csv的每一条数据,取每一条数据的所有key,合并去重。相比于取数据的第一条作为表头,这种方式获取了所有数据的字段,避免第一条后面数据的字段多于第一条的情况

生成表格数据

map(. as $row | $header | map($row[.])) as $rows就是生成表格内容的主要命令

最外层的map遍历处理每一行数据,我们看看如何对每一行进行处理

🌲 . as $row相当于给当前行命名成$row

🌲$header | map($row[.]) 此时上下文已经变成了$header

🌲🌲 map($row[.])遍历表头的每一个字段,从$row中获取对应的值。类似$row["code"]$row["country"]$row["level"]这样

对每一行处理完后,就得到了多行的表格的内容区域

shell 复制代码
$ cat j.json | jq -r '(map(keys) | add | unique) as $header | map(. as $row | $header | map($row[.])) as $rows | $header, $rows[] '
[
  "code",
  "country",
  "level",
  "name"
]
[
  "NSW",
  "AU",
  "state",
  "New South Wales"
]
[
  "AB",
  "CA",
  "province",
  "Alberta"
]
[
  "ABD",
  "GB",
  "council area",
  "Aberdeenshire"
]
[
  "AK",
  "US",
  "state",
  "Alaska"
]

输出成csv

@csv指令能很好的完成把数组转换成csv的工作。

最终完成的效果如下,说简单也简单,说复杂也复杂。命令有点长,往后滑👉

shell 复制代码
$ cat j.json | jq -r '(map(keys) | add | unique) as $header | map(. as $row | $header | map($row[.])) as $rows | $header, $rows[] | @csv'
"code","country","level","name"
"NSW","AU","state","New South Wales"
"AB","CA","province","Alberta"
"ABD","GB","council area","Aberdeenshire"
"AK","US","state","Alaska"

考考你

有时候接口返回的数据可能会是如下结构,思考下如何利用jq完成csv的转换吧

json 复制代码
{
    "headers": [
        "code",
        "name",
        "level",
        "country"
    ],
    "data": [
        ["NSW", "New South Wales", "state", "AU"],
        ["AB", "Alberta", "province", "CA"],
        ["ABD", "Aberdeenshire", "council area", "GB"],
        ["AK", "Alaska", "state", "US"]
    ]
}

✨ 微信公众号【凉凉的知识库】同步更新,欢迎关注获取最新最有用的后端知识 ✨

相关推荐
布列瑟农的星空几秒前
WebAssembly入门(一)——Emscripten
前端·后端
小突突突1 小时前
Spring框架中的单例bean是线程安全的吗?
java·后端·spring
iso少年1 小时前
Go 语言并发编程核心与用法
开发语言·后端·golang
掘金码甲哥2 小时前
云原生算力平台的架构解读
后端
码事漫谈2 小时前
智谱AI从清华实验室到“全球大模型第一股”的六年征程
后端
码事漫谈2 小时前
现代软件开发中常用架构的系统梳理与实践指南
后端
Mr.Entropy2 小时前
JdbcTemplate 性能好,但 Hibernate 生产力高。 如何选择?
java·后端·hibernate
麦聪聊数据2 小时前
MySQL 性能调优:从EXPLAIN到JSON索引优化
数据库·sql·mysql·安全·json
YDS8292 小时前
SpringCloud —— MQ的可靠性保障和延迟消息
后端·spring·spring cloud·rabbitmq
无限大63 小时前
为什么"区块链"不只是比特币?——从加密货币到分布式应用
后端