Importing the Dgraph example dataset

Preparing the data

Download location: github.com/hypermodein...

Download 1million.rdf.gz and 1million.schema.

Dropping old data

Since this is for learning purposes, drop all previously loaded data first:

```shell
curl --location 'localhost:8080/alter' \
--header 'Content-Type: application/json' \
--data '{"drop_all": true}'
```
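The same drop can be issued programmatically. A minimal Python sketch using only the standard library, mirroring the curl call above (it assumes the default alpha HTTP port, localhost:8080):

```python
import json
import urllib.request

def build_drop_all_request(host="localhost:8080"):
    """Build the POST /alter request that drops all data from the cluster."""
    body = json.dumps({"drop_all": True}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}/alter",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    # Only attempt the call when an alpha is actually running locally.
    with urllib.request.urlopen(build_drop_all_request()) as resp:
        print(resp.read().decode("utf-8"))
```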

Importing the data

Download the Dgraph source code and build it.

Live import

```shell
dgraph live -f 1million.rdf.gz --schema 1million.schema --alpha localhost:9080 --zero localhost:5080
```

Offline (bulk) import

Run:

```shell
.\live.exe bulk -f .\1million.rdf.gz --schema 1million.schema --zero localhost:5080
```

Result

Output:
I0704 11:11:44.465439    8408 init.go:68] 

Dgraph version   : dev
Dgraph codename  :
Dgraph SHA-256   : 3634731c41fb274ea640f9477985f0e79a1ddfd5ca28a7707b6694c1e7000c7c
Commit SHA-1     :
Commit timestamp :
Branch           :
Go version       : go1.24.4
jemalloc enabled : false

For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph     , visit https://discuss.dgraph.io.
For fully-managed Dgraph Cloud   , visit https://dgraph.io/cloud.

Licensed under the Apache Public License 2.0.
© Hypermode Inc.


Encrypted input: false; Encrypted output: false
{
        "DataFiles": ".\\1million.rdf.gz",
        "DataFormat": "",
        "SchemaFile": "1million.schema",
        "GqlSchemaFile": "",
        "OutDir": "./out",
        "ReplaceOutDir": false,
        "TmpDir": "tmp",
        "NumGoroutines": 4,
        "MapBufSize": 2147483648,
        "PartitionBufSize": 4194304,
        "SkipMapPhase": false,
        "CleanupTmp": true,
        "NumReducers": 1,
        "Version": false,
        "StoreXids": false,
        "ZeroAddr": "localhost:5080",
        "ConnStr": "",
        "HttpAddr": "localhost:8080",
        "IgnoreErrors": false,
        "CustomTokenizers": "",
        "NewUids": false,
        "ClientDir": "",
        "Encrypted": false,
        "EncryptedOut": false,
        "MapShards": 1,
        "ReduceShards": 1,
        "Namespace": 18446744073709551615,
        "EncryptionKey": null,
        "Badger": {
                "Dir": "",
                "ValueDir": "",
                "SyncWrites": false,
                "NumVersionsToKeep": 1,
                "ReadOnly": false,
                "Logger": {},
                "Compression": 1,
                "InMemory": false,
                "MetricsEnabled": true,
                "NumGoroutines": 8,
                "MemTableSize": 67108864,
                "BaseTableSize": 2097152,
                "BaseLevelSize": 10485760,
                "LevelSizeMultiplier": 10,
                "TableSizeMultiplier": 2,
                "MaxLevels": 7,
                "VLogPercentile": 0,
                "ValueThreshold": 1048576,
                "NumMemtables": 5,
                "BlockSize": 4096,
                "BloomFalsePositive": 0.01,
                "BlockCacheSize": 20132659,
                "IndexCacheSize": 46976204,
                "NumLevelZeroTables": 5,
                "NumLevelZeroTablesStall": 15,
                "ValueLogFileSize": 1073741823,
                "ValueLogMaxEntries": 1000000,
                "NumCompactors": 4,
                "CompactL0OnClose": false,
                "LmaxCompaction": false,
                "ZSTDCompressionLevel": 0,
                "VerifyValueChecksum": false,
                "EncryptionKey": "",
                "EncryptionKeyRotationDuration": 864000000000000,
                "BypassLockGuard": false,
                "ChecksumVerificationMode": 0,
                "DetectConflicts": true,
                "NamespaceOffset": -1,
                "ExternalMagicVersion": 0
        }
}

The bulk loader needs to open many files at once. This number depends on the size of the data set loaded, the map file output size, and the level of indexing. 100,000 is adequate for most data set sizes. See `man ulimit` for details of how to change the limit.
Nonfatal error: max open file limit could not be detected: Cannot detect max open files on this platform

Connecting to zero at localhost:5080
Using Go memory
Processing file (1 out of 1): .\1million.rdf.gz
[11:11:45+0800] MAP 01s nquad_count:651.3k err_count:0.000 nquad_speed:507.9k/sec edge_count:3.380M edge_speed:2.635M/sec jemalloc: 0 B 
[11:11:46+0800] MAP 02s nquad_count:1.042M err_count:0.000 nquad_speed:456.4k/sec edge_count:4.719M edge_speed:2.068M/sec jemalloc: 0 B 
Shard tmp\map_output\000 -> Reduce tmp\shards\shard_0\000
badger 2025/07/04 11:11:46 INFO: All 0 tables opened in 0s
badger 2025/07/04 11:11:46 INFO: Discard stats nextEmptySlot: 0
badger 2025/07/04 11:11:46 INFO: Set nextTxnTs to 0
badger 2025/07/04 11:11:46 INFO: All 0 tables opened in 0s
badger 2025/07/04 11:11:46 INFO: Discard stats nextEmptySlot: 0
badger 2025/07/04 11:11:46 INFO: Set nextTxnTs to 0
badger 2025/07/04 11:11:46 INFO: DropAll called. Blocking writes...
badger 2025/07/04 11:11:46 INFO: Writes flushed. Stopping compactions now...
badger 2025/07/04 11:11:46 INFO: Deleted 0 SSTables. Now deleting value logs...
badger 2025/07/04 11:11:46 INFO: Value logs deleted. Creating value log file: 1
badger 2025/07/04 11:11:46 INFO: Deleted 1 value log files. DropAll done.
Num Encoders: 4
Final Histogram of buffer sizes: 
 -- Histogram:
Min value: 241739340
Max value: 241739340
Count: 1
50p: 65536.00
75p: 65536.00
90p: 65536.00
[134217728, 268435456) 1 100.00% 100.00%
 --

[11:11:47+0800] REDUCE 03s 0.00% edge_count:0.000 edge_speed:0.000/sec plist_count:0.000 plist_speed:0.000/sec. Num Encoding MBs: 230. jemalloc: 0 B 
[11:11:48+0800] REDUCE 04s 55.45% edge_count:2.617M edge_speed:2.617M/sec plist_count:354.1k plist_speed:354.1k/sec. Num Encoding MBs: 230. jemalloc: 0 B 
[11:11:49+0800] REDUCE 05s 100.00% edge_count:4.719M edge_speed:2.360M/sec plist_count:1.179M plist_speed:589.6k/sec. Num Encoding MBs: 230. jemalloc: 0 B 
Finishing stream id: 1
Finishing stream id: 2
Finishing stream id: 3
badger 2025/07/04 11:11:50 INFO: Table created: 2 at level: 6 for stream: 2. Size: 604 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 3 at level: 6 for stream: 3. Size: 473 KiB
Finishing stream id: 4
badger 2025/07/04 11:11:50 INFO: Table created: 4 at level: 6 for stream: 4. Size: 2.0 MiB
badger 2025/07/04 11:11:50 INFO: Table created: 1 at level: 6 for stream: 1. Size: 40 MiB
Finishing stream id: 5
Finishing stream id: 6
badger 2025/07/04 11:11:50 INFO: Table created: 6 at level: 6 for stream: 6. Size: 355 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 5 at level: 6 for stream: 5. Size: 3.5 MiB
Finishing stream id: 7
badger 2025/07/04 11:11:50 INFO: Table created: 7 at level: 6 for stream: 7. Size: 2.5 MiB
Finishing stream id: 8
Finishing stream id: 9
badger 2025/07/04 11:11:50 INFO: Table created: 9 at level: 6 for stream: 9. Size: 258 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 8 at level: 6 for stream: 8. Size: 3.1 MiB
Writing count index for "0-starring" rev=false
Writing count index for "0-genre" rev=false
Writing count index for "0-director.film" rev=false
Writing count index for "0-actor.film" rev=false
Writing split lists back to the main DB now
badger 2025/07/04 11:11:50 INFO: Number of ranges found: 2
badger 2025/07/04 11:11:50 INFO: Sent range 0 for iteration: [, 040000000000000000000b6467726170682e747970650202506572736f6e0000000000000001ffffffffffffd875) of size: 0 B
badger 2025/07/04 11:11:50 INFO: copying split keys to main DB Streaming about 0 B of uncompressed data (0 B on disk)
badger 2025/07/04 11:11:50 INFO: Sent range 1 for iteration: [040000000000000000000b6467726170682e747970650202506572736f6e0000000000000001ffffffffffffd875, ) of size: 0 B
badger 2025/07/04 11:11:50 INFO: copying split keys to main DB Sent data of size 1.5 MiB
badger 2025/07/04 11:11:50 INFO: Table created: 15 at level: 6 for stream: 12. Size: 84 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 14 at level: 6 for stream: 17. Size: 199 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 11 at level: 6 for stream: 11. Size: 83 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 10 at level: 6 for stream: 13. Size: 218 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 16 at level: 6 for stream: 14. Size: 97 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 13 at level: 6 for stream: 16. Size: 598 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 12 at level: 6 for stream: 10. Size: 2.3 MiB
badger 2025/07/04 11:11:50 INFO: Resuming writes
badger 2025/07/04 11:11:50 INFO: Lifetime L0 stalled for: 0s
badger 2025/07/04 11:11:50 INFO:
Level 0 [ ]: NumTables: 01. Size: 1000 B of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [ ]: NumTables: 16. Size: 56 MiB of 56 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 4.0 MiB
Level Done
badger 2025/07/04 11:11:50 INFO: Lifetime L0 stalled for: 0s
badger 2025/07/04 11:11:50 INFO:
Level 0 [ ]: NumTables: 01. Size: 1.5 MiB of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level Done
[11:11:50+0800] REDUCE 06s 100.00% edge_count:4.719M edge_speed:1.735M/sec plist_count:1.179M plist_speed:433.4k/sec. Num Encoding MBs: 0. jemalloc: 0 B
Total: 06s

This indicates the import completed successfully.

Loading the data into Alpha

`dgraph bulk` does not write into a running alpha node; instead it builds the data directories offline. The built data is not placed in the default data directory but in the directory given by --out when running the bulk load (default ./out). I ran the loader under {$workspace}/build, so the import writes the data into the ./build/out/0/p directory.

Accordingly, the alpha startup arguments need to point at those directories:

```shell
dgraph alpha --trace "jaeger=http://localhost:14268; ratio=0.99;" --security "whitelist=0.0.0.0/0;" --postings ./build/out/0/p --wal ./build/out/0/w
```
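Once alpha is back up on the bulk-loaded directories, a quick sanity check is a GET against its /health endpoint. A small Python sketch; it assumes the default HTTP port and the usual response shape (a JSON array of node objects, each carrying a "status" field):

```python
import json
import urllib.request

HEALTH_URL = "http://localhost:8080/health"  # default alpha HTTP port

def is_healthy(payload):
    """True when every instance in a /health response reports status 'healthy'."""
    nodes = json.loads(payload)
    return bool(nodes) and all(n.get("status") == "healthy" for n in nodes)

if __name__ == "__main__":
    # Only attempt the call when an alpha is actually running locally.
    with urllib.request.urlopen(HEALTH_URL) as resp:
        print(is_healthy(resp.read().decode("utf-8")))
```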

Querying the data

e.g.

```shell
curl --location 'http://localhost:8080/query' \
--header 'Content-Type: application/dql' \
--data '
{
  actors(func: has(starring), first: 10) {
    uid
    name
    starring {
      name
    }
  }
}
'
```
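The same query can be run from Python. A standard-library sketch that POSTs DQL to /query and pulls the actor names out of the usual {"data": ...} response envelope (port and response shape as above are assumptions based on the defaults):

```python
import json
import urllib.request

DQL = """
{
  actors(func: has(starring), first: 10) {
    uid
    name
    starring { name }
  }
}
"""

def build_query_request(query, host="localhost:8080"):
    """Build a POST /query request with the application/dql content type."""
    return urllib.request.Request(
        f"http://{host}/query",
        data=query.encode("utf-8"),
        headers={"Content-Type": "application/dql"},
        method="POST",
    )

def actor_names(response_text):
    """Extract the name of each actor from a /query response body."""
    actors = json.loads(response_text)["data"]["actors"]
    return [a["name"] for a in actors if "name" in a]

if __name__ == "__main__":
    # Only attempt the call when an alpha is actually running locally.
    with urllib.request.urlopen(build_query_request(DQL)) as resp:
        print(actor_names(resp.read().decode("utf-8")))
```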