Importing the Dgraph example dataset

Data preparation

Download location: github.com/hypermodein...

Download 1million.rdf.gz and 1million.schema.
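
Before importing, it can help to peek at the downloaded file and confirm that it is a readable gzip archive of RDF triples. The following is a minimal Python sketch, not part of the original workflow: the helper name `peek_rdf` is hypothetical, and it assumes the file is gzip-compressed text with one triple per line.

```python
import gzip
from itertools import islice


def peek_rdf(path, n=3):
    """Return the first n non-empty lines of a gzip-compressed RDF file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        # Skip blank lines, strip trailing newlines, stop after n lines.
        lines = (line.rstrip("\n") for line in fh if line.strip())
        return list(islice(lines, n))
```

Calling `peek_rdf("1million.rdf.gz")` raises an error if the file is not gzip data. Because it only reads the first few lines, it does not detect a download that was cut off later in the file.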

Clearing data

Since this is only for learning, first clear out all existing data.

curl --location 'localhost:8080/alter' \
--header 'Content-Type: application/json' \
--data '{"drop_all": true}'
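
The same request can be built from a script. This is a rough Python sketch using only the standard library; the function name `build_drop_all_request` and its `alpha_http` parameter are illustrative, and the address assumes Alpha's HTTP port is 8080 as in the curl command above. The function only builds the request, so nothing is deleted until it is actually sent.

```python
import json
import urllib.request


def build_drop_all_request(alpha_http="http://localhost:8080"):
    """Build (but do not send) the POST /alter request that drops all data."""
    body = json.dumps({"drop_all": True}).encode("utf-8")
    return urllib.request.Request(
        f"{alpha_http}/alter",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send it (this wipes the database, so only use a throwaway instance):
#   with urllib.request.urlopen(build_drop_all_request()) as resp:
#       print(resp.read().decode())
```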

Data import

Download the Dgraph source code and build it.

Live import

dgraph live -f 1million.rdf.gz --schema 1million.schema --alpha localhost:9080 --zero localhost:5080

Offline (bulk) import

Run:

.\live.exe bulk -f .\1million.rdf.gz --schema 1million.schema --zero localhost:5080

Result

I0704 11:11:44.465439    8408 init.go:68] 

Dgraph version   : dev
Dgraph codename  :
Dgraph SHA-256   : 3634731c41fb274ea640f9477985f0e79a1ddfd5ca28a7707b6694c1e7000c7c
Commit SHA-1     :
Commit timestamp :
Branch           :
Go version       : go1.24.4
jemalloc enabled : false

For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph     , visit https://discuss.dgraph.io.
For fully-managed Dgraph Cloud   , visit https://dgraph.io/cloud.

Licensed under the Apache Public License 2.0.
© Hypermode Inc.


Encrypted input: false; Encrypted output: false
{
        "DataFiles": ".\\1million.rdf.gz",
        "DataFormat": "",
        "SchemaFile": "1million.schema",
        "GqlSchemaFile": "",
        "OutDir": "./out",
        "ReplaceOutDir": false,
        "TmpDir": "tmp",
        "NumGoroutines": 4,
        "MapBufSize": 2147483648,
        "PartitionBufSize": 4194304,
        "SkipMapPhase": false,
        "CleanupTmp": true,
        "NumReducers": 1,
        "Version": false,
        "StoreXids": false,
        "ZeroAddr": "localhost:5080",
        "ConnStr": "",
        "HttpAddr": "localhost:8080",
        "IgnoreErrors": false,
        "CustomTokenizers": "",
        "NewUids": false,
        "ClientDir": "",
        "Encrypted": false,
        "EncryptedOut": false,
        "MapShards": 1,
        "ReduceShards": 1,
        "Namespace": 18446744073709551615,
        "EncryptionKey": null,
        "Badger": {
                "Dir": "",
                "ValueDir": "",
                "SyncWrites": false,
                "NumVersionsToKeep": 1,
                "ReadOnly": false,
                "Logger": {},
                "Compression": 1,
                "InMemory": false,
                "MetricsEnabled": true,
                "NumGoroutines": 8,
                "MemTableSize": 67108864,
                "BaseTableSize": 2097152,
                "BaseLevelSize": 10485760,
                "LevelSizeMultiplier": 10,
                "TableSizeMultiplier": 2,
                "MaxLevels": 7,
                "VLogPercentile": 0,
                "ValueThreshold": 1048576,
                "NumMemtables": 5,
                "BlockSize": 4096,
                "BloomFalsePositive": 0.01,
                "BlockCacheSize": 20132659,
                "IndexCacheSize": 46976204,
                "NumLevelZeroTables": 5,
                "NumLevelZeroTablesStall": 15,
                "ValueLogFileSize": 1073741823,
                "ValueLogMaxEntries": 1000000,
                "NumCompactors": 4,
                "CompactL0OnClose": false,
                "LmaxCompaction": false,
                "ZSTDCompressionLevel": 0,
                "VerifyValueChecksum": false,
                "EncryptionKey": "",
                "EncryptionKeyRotationDuration": 864000000000000,
                "BypassLockGuard": false,
                "ChecksumVerificationMode": 0,
                "DetectConflicts": true,
                "NamespaceOffset": -1,
                "ExternalMagicVersion": 0
        }
}

The bulk loader needs to open many files at once. This number depends on the size of the data set loaded, the map file output size, and the level of indexing. 100,000 is adequate for most data set sizes. See `man ulimit` for details of how to change the limit.
Nonfatal error: max open file limit could not be detected: Cannot detect max open files on this platform

Connecting to zero at localhost:5080
Using Go memory
Processing file (1 out of 1): .\1million.rdf.gz
[11:11:45+0800] MAP 01s nquad_count:651.3k err_count:0.000 nquad_speed:507.9k/sec edge_count:3.380M edge_speed:2.635M/sec jemalloc: 0 B 
[11:11:46+0800] MAP 02s nquad_count:1.042M err_count:0.000 nquad_speed:456.4k/sec edge_count:4.719M edge_speed:2.068M/sec jemalloc: 0 B 
Shard tmp\map_output\000 -> Reduce tmp\shards\shard_0\000
badger 2025/07/04 11:11:46 INFO: All 0 tables opened in 0s
badger 2025/07/04 11:11:46 INFO: Discard stats nextEmptySlot: 0
badger 2025/07/04 11:11:46 INFO: Set nextTxnTs to 0
badger 2025/07/04 11:11:46 INFO: All 0 tables opened in 0s
badger 2025/07/04 11:11:46 INFO: Discard stats nextEmptySlot: 0
badger 2025/07/04 11:11:46 INFO: Set nextTxnTs to 0
badger 2025/07/04 11:11:46 INFO: DropAll called. Blocking writes...
badger 2025/07/04 11:11:46 INFO: Writes flushed. Stopping compactions now...
badger 2025/07/04 11:11:46 INFO: Deleted 0 SSTables. Now deleting value logs...
badger 2025/07/04 11:11:46 INFO: Value logs deleted. Creating value log file: 1
badger 2025/07/04 11:11:46 INFO: Deleted 1 value log files. DropAll done.
Num Encoders: 4
Final Histogram of buffer sizes: 
 -- Histogram:
Min value: 241739340
Max value: 241739340
Count: 1
50p: 65536.00
75p: 65536.00
90p: 65536.00
[134217728, 268435456) 1 100.00% 100.00%
 --

[11:11:47+0800] REDUCE 03s 0.00% edge_count:0.000 edge_speed:0.000/sec plist_count:0.000 plist_speed:0.000/sec. Num Encoding MBs: 230. jemalloc: 0 B 
[11:11:48+0800] REDUCE 04s 55.45% edge_count:2.617M edge_speed:2.617M/sec plist_count:354.1k plist_speed:354.1k/sec. Num Encoding MBs: 230. jemalloc: 0 B 
[11:11:49+0800] REDUCE 05s 100.00% edge_count:4.719M edge_speed:2.360M/sec plist_count:1.179M plist_speed:589.6k/sec. Num Encoding MBs: 230. jemalloc: 0 B 
Finishing stream id: 1
Finishing stream id: 2
Finishing stream id: 3
badger 2025/07/04 11:11:50 INFO: Table created: 2 at level: 6 for stream: 2. Size: 604 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 3 at level: 6 for stream: 3. Size: 473 KiB
Finishing stream id: 4
badger 2025/07/04 11:11:50 INFO: Table created: 4 at level: 6 for stream: 4. Size: 2.0 MiB
badger 2025/07/04 11:11:50 INFO: Table created: 1 at level: 6 for stream: 1. Size: 40 MiB
Finishing stream id: 5
Finishing stream id: 6
badger 2025/07/04 11:11:50 INFO: Table created: 6 at level: 6 for stream: 6. Size: 355 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 5 at level: 6 for stream: 5. Size: 3.5 MiB
Finishing stream id: 7
badger 2025/07/04 11:11:50 INFO: Table created: 7 at level: 6 for stream: 7. Size: 2.5 MiB
Finishing stream id: 8
Finishing stream id: 9
badger 2025/07/04 11:11:50 INFO: Table created: 9 at level: 6 for stream: 9. Size: 258 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 8 at level: 6 for stream: 8. Size: 3.1 MiB
Writing count index for "0-starring" rev=false
Writing count index for "0-genre" rev=false
Writing count index for "0-director.film" rev=false
Writing count index for "0-actor.film" rev=false
Writing split lists back to the main DB now
badger 2025/07/04 11:11:50 INFO: Number of ranges found: 2
badger 2025/07/04 11:11:50 INFO: Sent range 0 for iteration: [, 040000000000000000000b6467726170682e747970650202506572736f6e0000000000000001ffffffffffffd875) of size: 0 B
badger 2025/07/04 11:11:50 INFO: copying split keys to main DB Streaming about 0 B of uncompressed data (0 B on disk)
badger 2025/07/04 11:11:50 INFO: Sent range 1 for iteration: [040000000000000000000b6467726170682e747970650202506572736f6e0000000000000001ffffffffffffd875, ) of size: 0 B
badger 2025/07/04 11:11:50 INFO: copying split keys to main DB Sent data of size 1.5 MiB
badger 2025/07/04 11:11:50 INFO: Table created: 15 at level: 6 for stream: 12. Size: 84 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 14 at level: 6 for stream: 17. Size: 199 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 11 at level: 6 for stream: 11. Size: 83 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 10 at level: 6 for stream: 13. Size: 218 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 16 at level: 6 for stream: 14. Size: 97 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 13 at level: 6 for stream: 16. Size: 598 KiB
badger 2025/07/04 11:11:50 INFO: Table created: 12 at level: 6 for stream: 10. Size: 2.3 MiB
badger 2025/07/04 11:11:50 INFO: Resuming writes
badger 2025/07/04 11:11:50 INFO: Lifetime L0 stalled for: 0s
badger 2025/07/04 11:11:50 INFO:
Level 0 [ ]: NumTables: 01. Size: 1000 B of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [ ]: NumTables: 16. Size: 56 MiB of 56 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 4.0 MiB
Level Done
badger 2025/07/04 11:11:50 INFO: Lifetime L0 stalled for: 0s
badger 2025/07/04 11:11:50 INFO:
Level 0 [ ]: NumTables: 01. Size: 1.5 MiB of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level Done
[11:11:50+0800] REDUCE 06s 100.00% edge_count:4.719M edge_speed:1.735M/sec plist_count:1.179M plist_speed:433.4k/sec. Num Encoding MBs: 0. jemalloc: 0 B
Total: 06s

This shows that the import completed successfully.
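
For repeated runs, the same check can be done on a saved copy of the log. The sketch below is a rough heuristic, not an official success criterion: it assumes the MAP/REDUCE progress lines look like the ones in the output above (the format may differ between Dgraph versions), and the names `to_number` and `summarize` are made up for illustration.

```python
import re

# Matches lines such as "[11:11:45+0800] MAP 01s nquad_count:651.3k ..."
PROGRESS = re.compile(r"^\[[^\]]+\]\s+(MAP|REDUCE)\s+\S+\s+(.*)$")
FIELD = re.compile(r"(\w+):(\S+)")
SUFFIX = {"k": 1e3, "M": 1e6, "G": 1e9}


def to_number(text):
    """Convert values like '651.3k', '4.719M/sec' or '0.000' to a float."""
    text = text.split("/")[0].rstrip(".")  # drop '/sec' and a trailing period
    if text and text[-1] in SUFFIX:
        return float(text[:-1]) * SUFFIX[text[-1]]
    return float(text)


def summarize(log_text):
    """Report the largest MAP err_count and whether REDUCE reached 100%."""
    map_errors, reduce_done, edges = 0.0, False, None
    for line in log_text.splitlines():
        m = PROGRESS.match(line.strip())
        if not m:
            continue  # not a progress line (badger output, config dump, ...)
        phase, rest = m.groups()
        fields = {k: to_number(v) for k, v in FIELD.findall(rest)}
        if phase == "MAP":
            map_errors = max(map_errors, fields.get("err_count", 0.0))
        elif rest.startswith("100.00%"):
            reduce_done = True
            edges = fields.get("edge_count", edges)
    return {"map_errors": map_errors, "reduce_done": reduce_done, "edge_count": edges}
```

A result with `map_errors` equal to zero and `reduce_done` set to true matches what the log above shows by eye.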

Loading the data

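dgraph bulk does not write to a running Alpha node; instead, it builds a data directory offline. The resulting data does not end up in the default data directory but in the directory given by --out when running bulk (default: ./out). I ran the command from {$workspace}/build, so the import wrote the data to ./build/out/0/p.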

The Alpha startup arguments therefore need to be adjusted to point at that directory:

alpha --trace "jaeger=http://localhost:14268; ratio=0.99;" --security "whitelist=0.0.0.0/0;" --postings ./build/out/0/p --wal ./build/out/0/w
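
If the output directory changes between runs, the path-related arguments can be derived instead of typed by hand. A small Python sketch under stated assumptions: the helper `alpha_args` is hypothetical, it only covers the --postings and --wal flags from the command above (the --trace and --security options are left out), and it assumes the `<out>/<group>/p` layout described earlier.

```python
from pathlib import PurePosixPath


def alpha_args(out_dir="./out", group=0):
    """Build Alpha arguments pointing at one group directory of bulk output."""
    base = PurePosixPath(out_dir) / str(group)
    return [
        "alpha",
        "--postings", str(base / "p"),  # data written by the bulk loader
        "--wal", str(base / "w"),       # WAL directory passed alongside it
    ]
```

For example, `alpha_args("./build/out")` yields the same two paths as the command above, without the leading `./`.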

Querying the data

For example:

curl --location 'http://localhost:8080/query' \
--header 'Content-Type: application/dql' \
--data '
{
  actors(func: has(starring), first: 10) {
    uid
    name
    starring {
      name
    }
  }
}
'
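
The query above can also be issued from a script. The following Python sketch uses only the standard library; `build_query_request` and `actor_names` are illustrative names, and the parsing assumes the reply is JSON with the results under `data.actors`, matching the block name used in the query.

```python
import json
import urllib.request

QUERY = """
{
  actors(func: has(starring), first: 10) {
    uid
    name
    starring {
      name
    }
  }
}
"""


def build_query_request(dql=QUERY, alpha_http="http://localhost:8080"):
    """Build (but do not send) the POST /query request carrying a DQL body."""
    return urllib.request.Request(
        f"{alpha_http}/query",
        data=dql.encode("utf-8"),
        headers={"Content-Type": "application/dql"},
        method="POST",
    )


def actor_names(response_text):
    """Collect the `name` of each entry in the `actors` block of the reply."""
    payload = json.loads(response_text)
    actors = (payload.get("data") or {}).get("actors", [])
    return [a["name"] for a in actors if "name" in a]

# To run it against a live Alpha:
#   with urllib.request.urlopen(build_query_request()) as resp:
#       print(actor_names(resp.read().decode()))
```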