dgraph example数据导入

数据准备

下载地址: github.com/hypermodein...

下载1million.rdf.gz, 下载1million.schema

数据清除

学习目的,清除所有之前旧的数据

css 复制代码
curl --location 'localhost:8080/alter' \
--header 'Content-Type: application/json' \
--data '{"drop_all": true}'

数据导入

下载dgraph 源代码, build

实时导入

css 复制代码
dgraph live -f 1million.rdf.gz --schema 1million.schema --alpha localhost:9080 --zero localhost:5080

离线导入

执行:

cmd 复制代码
.\live.exe bulk -f .\1million.rdf.gz --schema 1million.schema --zero localhost:508

结果

result 复制代码
I0704 11:11:44.465439    8408 init.go:68] 

Dgraph version   : dev
Dgraph codename  :
Dgraph SHA-256   : 3634731c41fb274ea640f9477985f0e79a1ddfd5ca28a7707b6694c1e7000c7c
Commit SHA-1     :
Commit timestamp :
Branch           :
Go version       : go1.24.4
jemalloc enabled : false

For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph     , visit https://discuss.dgraph.io.
For fully-managed Dgraph Cloud   , visit https://dgraph.io/cloud.

Licensed under the Apache Public License 2.0.
© Hypermode Inc.


Encrypted input: false; Encrypted output: false
{
        "DataFiles": ".\\1million.rdf.gz",
        "DataFormat": "",
        "SchemaFile": "1million.schema",
        "GqlSchemaFile": "",
        "OutDir": "./out",
        "ReplaceOutDir": false,
        "TmpDir": "tmp",
        "NumGoroutines": 4,
        "MapBufSize": 2147483648,
        "PartitionBufSize": 4194304,
        "SkipMapPhase": false,
        "CleanupTmp": true,
        "NumReducers": 1,
        "Version": false,
        "StoreXids": false,
        "ZeroAddr": "localhost:5080",
        "ConnStr": "",
        "HttpAddr": "localhost:8080",
        "IgnoreErrors": false,
        "CustomTokenizers": "",
        "NewUids": false,
        "ClientDir": "",
        "Encrypted": false,
        "EncryptedOut": false,
        "MapShards": 1,
        "ReduceShards": 1,
        "Namespace": 18446744073709551615,
        "EncryptionKey": null,
        "Badger": {
                "Dir": "",
                "ValueDir": "",
                "SyncWrites": false,
                "NumVersionsToKeep": 1,
                "ReadOnly": false,
                "Logger": {},
                "Compression": 1,
                "InMemory": false,
                "MetricsEnabled": true,
                "NumGoroutines": 8,
                "MemTableSize": 67108864,
                "BaseTableSize": 2097152,
                "BaseLevelSize": 10485760,
                "LevelSizeMultiplier": 10,
                "TableSizeMultiplier": 2,
                "MaxLevels": 7,
                "VLogPercentile": 0,
                "ValueThreshold": 1048576,
                "NumMemtables": 5,
                "BlockSize": 4096,
                "BloomFalsePositive": 0.01,
                "BlockCacheSize": 20132659,
                "IndexCacheSize": 46976204,
                "NumLevelZeroTables": 5,
                "NumLevelZeroTablesStall": 15,
                "ValueLogFileSize": 1073741823,
                "ValueLogMaxEntries": 1000000,
                "NumCompactors": 4,
                "CompactL0OnClose": false,
                "LmaxCompaction": false,
                "ZSTDCompressionLevel": 0,
                "VerifyValueChecksum": false,
                "EncryptionKey": "",
                "EncryptionKeyRotationDuration": 864000000000000,
                "BypassLockGuard": false,
                "ChecksumVerificationMode": 0,
                "DetectConflicts": true,
                "NamespaceOffset": -1,
                "ExternalMagicVersion": 0
        }
}

The bulk loader needs to open many files at once. This number depends on the size of the data set loaded, the map file output size, and the level of indexing. 100,000 is adequate for most data set sizes. See `man ulimit` for details of how to change the limit.
Nonfatal error: max open file limit could not be detected: Cannot detect max open files on this platform

Connecting to zero at localhost:5080
Using Go memory
Processing file (1 out of 1): .\1million.rdf.gz
[11:11:45+0800] MAP 01s nquad_count:651.3k err_count:0.000 nquad_speed:507.9k/sec edge_count:3.380M edge_speed:2.635M/sec jemalloc: 0 B 
[11:11:46+0800] MAP 02s nquad_count:1.042M err_count:0.000 nquad_speed:456.4k/sec edge_count:4.719M edge_speed:2.068M/sec jemalloc: 0 B 
Shard tmp\map_output\000 -> Reduce tmp\shards\shard_0\000
badger 2025/07/04 11:11:46 INFO: All 0 tables opened in 0s
badger 2025/07/04 11:11:46 INFO: Discard stats nextEmptySlot: 0
badger 2025/07/04 11:11:46 INFO: Set nextTxnTs to 0
badger 2025/07/04 11:11:46 INFO: All 0 tables opened in 0s
badger 2025/07/04 11:11:46 INFO: Discard stats nextEmptySlot: 0
badger 2025/07/04 11:11:46 INFO: Set nextTxnTs to 0
badger 2025/07/04 11:11:46 INFO: DropAll called. Blocking writes...
badger 2025/07/04 11:11:46 INFO: Writes flushed. Stopping compactions now...
badger 2025/07/04 11:11:46 INFO: Deleted 0 SSTables. Now deleting value logs...
badger 2025/07/04 11:11:46 INFO: Value logs deleted. Creating value log file: 1
badger 2025/07/04 11:11:46 INFO: Deleted 1 value log files. DropAll done.
Num Encoders: 4
Final Histogram of buffer sizes: 
 -- Histogram:
Min value: 241739340
Max value: 241739340
Count: 1
50p: 65536.00
75p: 65536.00
90p: 65536.00
[134217728, 268435456) 1 100.00% 100.00%
 --

[11:11:47+0800] REDUCE 03s 0.00% edge_count:0.000 edge_speed:0.000/sec plist_count:0.000 plist_speed:0.000/sec. Num Encoding MBs: 230. jemalloc: 0 B 
[11:11:48+0800] REDUCE 04s 55.45% edge_count:2.617M edge_speed:2.617M/sec plist_count:354.1k plist_speed:354.1k/sec. Num Encoding MBs: 230. jemalloc: 0 B 
[11:11:49+0800] REDUCE 05s 100.00% edge_count:4.719M edge_speed:2.360M/sec plist_count:1.179M plist_speed:589.6k/sec. Num Encoding MBs: 230. jemalloc: 0 B 
Finishing stream id: 1
Finishing stream id: 2
Finishing stream id: 3
badger 2025/07/04 11:11:50 INFO: Table created: 2 at level: 6 for stream: 2. Size: 604 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 3 at level: 6 for stream: 3. Size: 473 KiB
Finishing stream id: 4
badger 2025/07/04 11:11:50 INFO: Table created: 4 at level: 6 for stream: 4. Size: 2.0 MiB
badger 2025/07/04 11:11:50 INFO: Table created: 1 at level: 6 for stream: 1. Size: 40 MiB
Finishing stream id: 5
Finishing stream id: 6
badger 2025/07/04 11:11:50 INFO: Table created: 6 at level: 6 for stream: 6. Size: 355 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 5 at level: 6 for stream: 5. Size: 3.5 MiB
Finishing stream id: 7
badger 2025/07/04 11:11:50 INFO: Table created: 7 at level: 6 for stream: 7. Size: 2.5 MiB
Finishing stream id: 8
Finishing stream id: 9
badger 2025/07/04 11:11:50 INFO: Table created: 9 at level: 6 for stream: 9. Size: 258 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 8 at level: 6 for stream: 8. Size: 3.1 MiB
Writing count index for "0-starring" rev=false
Writing count index for "0-genre" rev=false
Writing count index for "0-director.film" rev=false
Writing count index for "0-actor.film" rev=false
Writing split lists back to the main DB now
badger 2025/07/04 11:11:50 INFO: Number of ranges found: 2
badger 2025/07/04 11:11:50 INFO: Sent range 0 for iteration: [, 040000000000000000000b6467726170682e747970650202506572736f6e0000000000000001ffffffffffffd875) of size: 0 B
badger 2025/07/04 11:11:50 INFO: copying split keys to main DB Streaming about 0 B of uncompressed data (0 B on disk)
badger 2025/07/04 11:11:50 INFO: Sent range 1 for iteration: [040000000000000000000b6467726170682e747970650202506572736f6e0000000000000001ffffffffffffd875, ) of size: 0 B
badger 2025/07/04 11:11:50 INFO: copying split keys to main DB Sent data of size 1.5 MiB
badger 2025/07/04 11:11:50 INFO: Table created: 15 at level: 6 for stream: 12. Size: 84 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 14 at level: 6 for stream: 17. Size: 199 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 11 at level: 6 for stream: 11. Size: 83 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 10 at level: 6 for stream: 13. Size: 218 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 16 at level: 6 for stream: 14. Size: 97 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 13 at level: 6 for stream: 16. Size: 598 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 12 at level: 6 for stream: 10. Size: 2.3 MiB
badger 2025/07/04 11:11:50 INFO: Resuming writes
badger 2025/07/04 11:11:50 INFO: Lifetime L0 stalled for: 0s
badger 2025/07/04 11:11:50 INFO:
Level 0 [ ]: NumTables: 01. Size: 1000 B of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [ ]: NumTables: 16. Size: 56 MiB of 56 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 4.0 MiB
Level Done
badger 2025/07/04 11:11:50 INFO: Lifetime L0 stalled for: 0s
badger 2025/07/04 11:11:50 INFO:
Level 0 [ ]: NumTables: 01. Size: 1.5 MiB of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level Done
[11:11:50+0800] REDUCE 06s 100.00% edge_count:4.719M edge_speed:1.735M/sec plist_count:1.179M plist_speed:433.4k/sec. Num Encoding MBs: 0. jemalloc: 0 B
Total: 06s

说明已经成功导入了。

数据加载

因为dgraph bulk 并不会写入正在运行的alpha 节点, 二十离线构建的数据目录。 这些构建好的数据并不在默认的data目录中,而是在buil 时指定的--out目录,默认是(./out)。目前我放在了 {$workspace}/build 目录下, 所以这个导入命令就会把数据写入到 ./build/out/0/p 目录下

所以配置alpha 启动参数时候,也需要重新调整arguments

bash 复制代码
alpha --trace "jaeger=http://localhost:14268; ratio=0.99;" --security "whitelist=0.0.0.0/0;" --postings ./build/out/0/p --wal ./build/out/0/w

数据查询

e.g.

css 复制代码
curl --location 'http://localhost:8080/query' \
--header 'Content-Type: application/dql' \
--data '
{
  actors(func: has(starring), first: 10) {
    uid
    name
    starring {
      name
    }
  }
}
'
相关推荐
Aurora_NeAr5 分钟前
大数据之路:阿里巴巴大数据实践——大数据领域建模综述
大数据·后端
Olrookie6 分钟前
若依前后端分离版学习笔记(一)——本地部署
笔记·后端·开源
魏振东9 分钟前
COLA
后端
凉冰不加冰12 分钟前
Spring Boot自动配置原理深度解析
java·spring boot·后端
非优秀程序员18 分钟前
8 个提升开发者效率的小众 AI 项目
前端·人工智能·后端
一枚小小程序员哈20 分钟前
springboot基于Java与MySQL库的健身俱乐部管理系统设计与实现
数据库·spring boot·mysql·spring·java-ee·intellij-idea
Antonio91524 分钟前
【Redis】 Redis 基础命令和原理
数据库·redis·缓存
非优秀程序员25 分钟前
未来的编程将会是什么样子?从面向对象转为面向业务数据!!
数据库·架构
rzl0237 分钟前
SpringBoot总结
spring boot·后端·firefox
老华带你飞1 小时前
口腔助手|口腔挂号预约小程序|基于微信小程序的口腔门诊预约系统的设计与实现(源码+数据库+文档)
java·数据库·微信小程序·小程序·论文·毕设·口腔小程序