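Preface
I was researching a requirement: the customer wants to extract metadata from the DDL statements of different databases, such as table name, table comment, column name, column type, primary-key flag, and column comment. The databases to support are MySQL, PG, Oracle, SQL Server, and Databricks. As a reference, this post looks at how ChartDB imports DDL into its model.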
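Overview
- Input: DDL SQL in each database dialect (PostgreSQL, MySQL/MariaDB, SQL Server, SQLite)
- Flow: dialect detection → dialect parser → unified intermediate structure → type mapping and relationship binding → Diagram generation
- Output: ChartDB's internal Diagram model (tables, fields, indexes, relationships, custom enums)
Entry point and flow
- UI trigger: the import database dialog (DDL mode)
- Entry function: sqlImportToDiagram (auto-detects the database type, dispatches to the matching parser, then converts the result into a common form)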
```ts
// In chartdb-main\src\dialogs\import-database-dialog\import-database-dialog.tsx
if (importMethod === 'ddl') {
    diagram = await sqlImportToDiagram({
        sqlContent: scriptResult,
        sourceDatabaseType: databaseType,
        targetDatabaseType: databaseType,
    });
}
```
Parser dispatch at the import entry point:
```ts
// In chartdb-main\src\lib\data\sql-import\index.ts
export async function sqlImportToDiagram({
    sqlContent,
    sourceDatabaseType,
    targetDatabaseType = DatabaseType.GENERIC,
}: {
    sqlContent: string;
    sourceDatabaseType: DatabaseType;
    targetDatabaseType: DatabaseType;
}): Promise<Diagram> {
    // When the caller does not know the dialect, detect it (PostgreSQL is the default)
    if (sourceDatabaseType === DatabaseType.GENERIC) {
        const detectedType = detectDatabaseType(sqlContent);
        sourceDatabaseType = detectedType ?? DatabaseType.POSTGRESQL;
    }
    let parserResult: SQLParserResult;
    switch (sourceDatabaseType) {
        case DatabaseType.POSTGRESQL:
            if (isPgDumpFormat(sqlContent)) {
                parserResult = await fromPostgresDump(sqlContent);
            } else {
                parserResult = await fromPostgres(sqlContent);
            }
            break;
        case DatabaseType.MYSQL:
        case DatabaseType.MARIADB:
            parserResult = await fromMySQL(sqlContent);
            break;
        case DatabaseType.SQL_SERVER:
            parserResult = await fromSQLServer(sqlContent);
            break;
        case DatabaseType.SQLITE:
            parserResult = await fromSQLite(sqlContent);
            break;
        default:
            throw new Error(`Unsupported database type: ${sourceDatabaseType}`);
    }
    // Convert to Diagram...
}
```
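The excerpt calls `detectDatabaseType` without showing its body. To give a rough idea of what heuristic dialect detection can look like, here is a minimal sketch. The `Dialect` type, the `detectDialect` function, and the marker patterns are all assumptions made for illustration; they are not ChartDB's actual implementation.

```typescript
// Hypothetical sketch: names and heuristics are illustrative, not ChartDB's code.
type Dialect = 'postgresql' | 'mysql' | 'sql_server' | 'sqlite';

// A few syntax markers per dialect that rarely show up in the others.
const DIALECT_MARKERS: Record<Dialect, RegExp[]> = {
    postgresql: [/\bSERIAL\b/i, /\bCREATE\s+TYPE\b[\s\S]*?\bAS\s+ENUM\b/i, /::\w+/],
    mysql: [/`\w+`/, /\bAUTO_INCREMENT\b/i, /\bENGINE\s*=/i],
    sql_server: [/\[\w+\]/, /^\s*GO\s*$/im, /\bIDENTITY\s*\(/i, /\bNVARCHAR\b/i],
    sqlite: [/\bAUTOINCREMENT\b/i, /\bWITHOUT\s+ROWID\b/i, /\bPRAGMA\b/i],
};

// Score each dialect by how many of its markers match; return null when nothing matches.
// On a tie, the dialect listed first wins because only a strictly higher score replaces it.
function detectDialect(sql: string): Dialect | null {
    let best: Dialect | null = null;
    let bestScore = 0;
    for (const dialect of Object.keys(DIALECT_MARKERS) as Dialect[]) {
        const score = DIALECT_MARKERS[dialect].filter((re) => re.test(sql)).length;
        if (score > bestScore) {
            best = dialect;
            bestScore = score;
        }
    }
    return best;
}
```

Returning `null` for "no signal" lets the caller pick its own default, which is how the excerpt falls back to PostgreSQL.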
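Dialect-specific parsers
Every parser outputs the same intermediate structure, SQLParserResult:
- tables: SQLTable[] (tables, columns, indexes)
- relationships: SQLForeignKey[] (foreign keys)
- enums?: SQLEnumType[] (dialect enums)
- warnings?: string[] (non-fatal warnings)
PostgreSQL: fromPostgres
- Parses the AST with node-sql-parser
- Fallback on failure: extracts column and foreign-key information with regular expressions
- Recognizes CREATE TYPE ... AS ENUM and emits enums
- Supports ALTER TABLE foreign keys and schema matching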
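To make the shape concrete, the intermediate structure might look roughly like the sketch below. Field names that appear in the excerpts in this post (`columns`, `typeArgs`, `sourceTable`, `targetColumn`, `sourceTableId`, and so on) are reused; everything else, including `SQLColumn`, `SQLIndex`, and the exact optionality of each field, is an assumption rather than ChartDB's real type definitions.

```typescript
// Sketch only: fields not visible in the excerpts are assumptions.
interface SQLColumn {
    name: string;
    type: string;
    typeArgs?: string | number[]; // e.g. 'max' or [10, 2]
    nullable: boolean;
    primaryKey: boolean;
    unique: boolean;
    comment?: string;
}

interface SQLIndex {
    name: string;
    columns: string[];
    unique: boolean;
}

interface SQLTable {
    id: string;
    name: string;
    schema?: string;
    columns: SQLColumn[];
    indexes: SQLIndex[];
}

interface SQLForeignKey {
    name: string;
    sourceTable: string;
    sourceSchema?: string;
    sourceColumn: string;
    targetTable: string;
    targetSchema?: string;
    targetColumn: string;
    sourceTableId: string;
    targetTableId: string;
    sourceCardinality?: 'one' | 'many';
    targetCardinality?: 'one' | 'many';
}

interface SQLEnumType {
    name: string;
    values: string[];
}

interface SQLParserResult {
    tables: SQLTable[];
    relationships: SQLForeignKey[];
    enums?: SQLEnumType[];
    warnings?: string[];
}

// What a parser could return for: CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(50));
const example: SQLParserResult = {
    tables: [
        {
            id: 't1',
            name: 'users',
            columns: [
                { name: 'id', type: 'int', nullable: false, primaryKey: true, unique: true },
                { name: 'name', type: 'varchar', typeArgs: [50], nullable: true, primaryKey: false, unique: false },
            ],
            indexes: [],
        },
    ],
    relationships: [],
};
```

For the original requirement (table name, column name, type, primary-key flag, comments), a structure like this is already most of the answer; the Diagram conversion only matters if you also need a visual model.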
```ts
// In chartdb-main\src\lib\data\sql-import\dialect-importers\postgresql\postgresql.ts
export async function fromPostgres(
    sqlContent: string
): Promise<SQLParserResult & { warnings?: string[] }> {
    const tables: SQLTable[] = [];
    const relationships: SQLForeignKey[] = [];
    const tableMap: Record<string, string> = {};
    const processedStatements: string[] = [];
    const enumTypes: SQLEnumType[] = [];
    const { statements, warnings } = preprocessSQL(sqlContent);
    // node-sql-parser is loaded lazily via dynamic import
    const { Parser } = await import('node-sql-parser');
    const parser = new Parser();
    // ...
```
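The enum handling is a good example of where a regex is enough. Below is a minimal sketch of pulling `CREATE TYPE ... AS ENUM` definitions out of a script. `EnumType` and `extractEnums` are hypothetical names, and the regex is my own simplification rather than the one ChartDB uses; it does not handle enum values that contain a closing parenthesis.

```typescript
// Hypothetical sketch, not ChartDB's code.
interface EnumType {
    name: string;
    schema?: string;
    values: string[];
}

function extractEnums(sql: string): EnumType[] {
    const enums: EnumType[] = [];
    // Matches: CREATE TYPE [schema.]name AS ENUM ('a', 'b', ...)
    const typeRe = /CREATE\s+TYPE\s+(?:"?(\w+)"?\.)?"?(\w+)"?\s+AS\s+ENUM\s*\(([^)]*)\)/gi;
    let typeMatch: RegExpExecArray | null;
    while ((typeMatch = typeRe.exec(sql)) !== null) {
        const values: string[] = [];
        // Single-quoted literals; a doubled quote ('') is an escaped quote.
        const valueRe = /'((?:[^']|'')*)'/g;
        let valueMatch: RegExpExecArray | null;
        while ((valueMatch = valueRe.exec(typeMatch[3])) !== null) {
            values.push(valueMatch[1].replace(/''/g, "'"));
        }
        enums.push({ name: typeMatch[2], schema: typeMatch[1], values });
    }
    return enums;
}
```

Calling `extractEnums` on a script that contains `CREATE TYPE public.mood AS ENUM ('sad', 'ok', 'happy');` yields one entry named `mood` in schema `public` with three values.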
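MySQL / MariaDB: fromMySQL
- Handles backticks, AUTO_INCREMENT, and ENGINE/CHARSET
- Processes foreign keys in two passes: in-table constraints plus ALTER TABLE, with a pending-FK mechanism
- Fallback on failure: regex parsing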
```ts
// In chartdb-main\src\lib\data\sql-import\dialect-importers\mysql\mysql.ts
export async function fromMySQL(sqlContent: string): Promise<SQLParserResult> {
    // Reject inline REFERENCES up front, with a message that points at the offending line
    const { found, line } = detectInlineReferences(sqlContent);
    if (found) {
        throw new Error(
            `MySQL does not support inline REFERENCES in column definitions (line ${line}). Please use FOREIGN KEY constraints instead:\n\nCREATE TABLE ...`
        );
    }
    const tables: SQLTable[] = [];
    const relationships: SQLForeignKey[] = [];
    const tableMap: Record<string, string> = {};
    // ...
```
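The pending-FK idea is worth spelling out: a foreign key can reference a table that is defined later in the script, so it cannot always be resolved at the moment it is seen. The following is a minimal sketch of that two-pass pattern. `ParsedTable`, `ForeignKey`, and `resolveForeignKeys` are hypothetical names, and the logic is my reading of the approach, not ChartDB's code.

```typescript
// Hypothetical sketch of a two-pass foreign-key resolution with a pending queue.
interface ForeignKey {
    sourceTable: string;
    sourceColumn: string;
    targetTable: string;
    targetColumn: string;
}

interface ParsedTable {
    name: string;
    foreignKeys: ForeignKey[];
}

function resolveForeignKeys(tables: ParsedTable[]): { resolved: ForeignKey[]; warnings: string[] } {
    const tableMap: Record<string, boolean> = {};
    const resolved: ForeignKey[] = [];
    const pending: ForeignKey[] = [];

    // Pass 1: register tables in script order; defer FKs whose target is not known yet.
    for (const table of tables) {
        tableMap[table.name] = true;
        for (const fk of table.foreignKeys) {
            if (tableMap[fk.targetTable]) {
                resolved.push(fk);
            } else {
                pending.push(fk);
            }
        }
    }

    // Pass 2: every table is registered now, so retry the deferred FKs.
    const warnings: string[] = [];
    for (const fk of pending) {
        if (tableMap[fk.targetTable]) {
            resolved.push(fk);
        } else {
            warnings.push(`Foreign key ${fk.sourceTable}.${fk.sourceColumn} references unknown table ${fk.targetTable}`);
        }
    }
    return { resolved, warnings };
}
```

Unresolvable keys end up as warnings instead of errors, which matches the non-fatal `warnings` field of the parser result.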
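SQL Server: fromSQLServer
- Preprocesses the script: removes USE, GO, [] brackets, WITH(...), and other syntax that interferes with parsing
- Hand-parses CREATE TABLE and ALTER TABLE ... ADD CONSTRAINT (foreign keys)
- Then uses node-sql-parser to fill in indexes and constraints
- Normalizes types (for example uniqueidentifier, datetime2, nvarchar(max))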
```ts
// In chartdb-main\src\lib\data\sql-import\dialect-importers\sqlserver\sqlserver.ts
export async function fromSQLServer(
    sqlContent: string
): Promise<SQLParserResult> {
    const tables: SQLTable[] = [];
    const relationships: SQLForeignKey[] = [];
    const tableMap: Record<string, string> = {};
    // Split into statements on a trailing GO or a trailing semicolon
    const statements = sqlContent
        .split(/(?:GO\s*$|;\s*$)/im)
        .filter((stmt) => stmt.trim().length > 0);
    // ...
```
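As an illustration of the preprocessing step, here is a minimal sketch that strips the T-SQL constructs listed above before splitting into statements. `preprocessSqlServer` is a hypothetical helper with simplified regexes of my own; it only unwraps bracketed identifiers without spaces, does not handle nested parentheses inside `WITH (...)`, and splits naively on semicolons, so string literals containing `;` would break it.

```typescript
// Hypothetical helper, not ChartDB's code: a simplified T-SQL cleanup pass.
function preprocessSqlServer(sql: string): string[] {
    const cleaned = sql
        // Drop "USE <database>" lines.
        .replace(/^\s*USE\s+[^\r\n;]+;?\s*$/gim, '')
        // Turn the GO batch separator into a statement terminator.
        .replace(/^\s*GO\s*$/gim, ';')
        // Unwrap simple bracketed identifiers: [dbo].[Users] -> dbo.Users
        .replace(/\[(\w+)\]/g, '$1')
        // Strip option lists such as ") WITH (PAD_INDEX = OFF)".
        .replace(/\)\s*WITH\s*\([^)]*\)/gi, ')');
    return cleaned
        .split(';')
        .map((stmt) => stmt.trim())
        .filter((stmt) => stmt.length > 0);
}
```

The cleaned statements are plain enough that either a hand-written parser or a generic SQL parser has a much better chance of handling them.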
SQLite: fromSQLite
- Adapts to SQLite's dynamic type system
- Handles the auto-increment semantics of INTEGER PRIMARY KEY
```ts
// In chartdb-main\src\lib\data\sql-import\dialect-importers\sqlite\sqlite.ts
export async function fromSQLite(sqlContent: string): Promise<SQLParserResult> {
    const tables: SQLTable[] = [];
    const relationships: SQLForeignKey[] = [];
    const tableMap: Record<string, string> = {};
    // ...
```
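For context on the "dynamic type system" point: SQLite accepts almost any declared type name and derives a type affinity from it by substring rules. The sketch below follows the affinity rules described in the SQLite documentation; the function names `sqliteAffinity` and `isRowidAlias` are my own, and this is not how ChartDB's importer is necessarily written.

```typescript
// Sketch of SQLite's documented affinity rules; function names are hypothetical.
type Affinity = 'INTEGER' | 'TEXT' | 'BLOB' | 'REAL' | 'NUMERIC';

// Rules are checked in this order; the first matching rule wins.
function sqliteAffinity(declaredType: string): Affinity {
    const t = declaredType.toUpperCase();
    if (/INT/.test(t)) return 'INTEGER';
    if (/CHAR|CLOB|TEXT/.test(t)) return 'TEXT';
    if (/BLOB/.test(t) || t.trim() === '') return 'BLOB';
    if (/REAL|FLOA|DOUB/.test(t)) return 'REAL';
    return 'NUMERIC';
}

// A single-column primary key declared exactly as INTEGER (any letter case)
// becomes an alias for the rowid, which is what gives it auto-increment behavior.
function isRowidAlias(declaredType: string, isSoleColumnPrimaryKey: boolean): boolean {
    return isSoleColumnPrimaryKey && declaredType.trim().toUpperCase() === 'INTEGER';
}
```

One consequence of the rule order is that a type declared as `FLOATING POINT` gets INTEGER affinity, because it contains the substring `INT`.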
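Unified conversion to Diagram
The SQLParserResult produced by each parser is converted into ChartDB's internal Diagram model:
- Column types are mapped to a generic DataType
- DBTable, DBField, and DBIndex objects are generated
- Foreign keys → DBRelationship (cardinality comes from the unique/primary-key flags, or from the parser when it provides one)
- Dialect enums → DBCustomType (kind=enum)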
```ts
// Conversion function, in chartdb-main\src\lib\data\sql-import\common.ts
export function convertToChartDBDiagram(
    parserResult: SQLParserResult,
    sourceDatabaseType: DatabaseType,
    targetDatabaseType: DatabaseType
): Diagram {
    // Maps parser-side table ids to the newly generated diagram ids
    const tableIdMapping = new Map<string, string>();
    const tables: DBTable[] = parserResult.tables.map((table, index) => {
        // Lay tables out on a grid, four per row
        const row = Math.floor(index / 4);
        const col = index % 4;
        const tableSpacing = 300;
        const newId = generateId();
        tableIdMapping.set(table.id, newId);
        const fields: DBField[] = table.columns.map((column) => {
            // ...
```
```ts
// Relationship generation, in chartdb-main\src\lib\data\sql-import\common.ts
const relationships: DBRelationship[] = [];
parserResult.relationships.forEach((rel) => {
    // Match by name and schema first, then fall back to name only
    let sourceTable = tables.find(
        (t) => t.name === rel.sourceTable && rel.sourceSchema === t.schema
    );
    if (!sourceTable) {
        sourceTable = tables.find((t) => t.name === rel.sourceTable);
    }
    let targetTable = tables.find(
        (t) => t.name === rel.targetTable && rel.targetSchema === t.schema
    );
    if (!targetTable) {
        targetTable = tables.find((t) => t.name === rel.targetTable);
    }
    if (!sourceTable || !targetTable) return;
    const sourceTableId = tableIdMapping.get(rel.sourceTableId);
    const targetTableId = tableIdMapping.get(rel.targetTableId);
    if (!sourceTableId || !targetTableId) return;
    const sourceField = sourceTable.fields.find((f) => f.name === rel.sourceColumn);
    const targetField = targetTable.fields.find((f) => f.name === rel.targetColumn);
    if (!sourceField || !targetField) return;
    // Prefer the parser's cardinality; otherwise infer it from unique/primary-key flags
    const sourceCardinality =
        rel.sourceCardinality ||
        (sourceField.unique || sourceField.primaryKey ? 'one' : 'many');
    const targetCardinality =
        rel.targetCardinality ||
        (targetField.unique || targetField.primaryKey ? 'one' : 'many');
    relationships.push({
        id: generateId(),
        name: rel.name,
        sourceSchema: sourceTable.schema,
        targetSchema: targetTable.schema,
        sourceTableId: sourceTableId,
        targetTableId: targetTableId,
        sourceFieldId: sourceField.id,
        targetFieldId: targetField.id,
        sourceCardinality,
        targetCardinality,
        createdAt: Date.now(),
    });
});
```
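The cardinality rule in the excerpt is small enough to isolate. Here it is as a standalone function with a worked case; `FieldLike` and `inferCardinality` are names I made up for the illustration, and only the rule itself comes from the excerpt.

```typescript
// The rule from the excerpt, extracted into a hypothetical helper.
type Cardinality = 'one' | 'many';

interface FieldLike {
    unique: boolean;
    primaryKey: boolean;
}

// An explicit value from the parser wins; otherwise a unique or primary-key
// column is the "one" side and anything else is the "many" side.
function inferCardinality(field: FieldLike, explicit?: Cardinality): Cardinality {
    return explicit || (field.unique || field.primaryKey ? 'one' : 'many');
}

// orders.customer_id (plain column) references customers.id (primary key)
const ordersCustomerId: FieldLike = { unique: false, primaryKey: false };
const customersId: FieldLike = { unique: false, primaryKey: true };
const sourceSide = inferCardinality(ordersCustomerId);
const targetSide = inferCardinality(customersId);
```

In this case `sourceSide` is `'many'` and `targetSide` is `'one'`, the usual many-to-one shape of a foreign key. A unique foreign-key column would make it one-to-one.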
Type mapping strategy
Core function: mapSQLTypeToGenericType(sqlType, databaseType?)
- PostgreSQL: the serial family maps to integer base types; custom enums are preserved as customTypes
- MySQL/MariaDB: distinguishes tinyint/smallint/int/mediumint/bigint; recognizes auto_increment
- SQL Server: when the target is also SQL Server, keeps nvarchar/nchar/ntext/uniqueidentifier/datetime2/datetimeoffset/money/smallmoney/bit/xml/hierarchyid/geography/geometry where possible
- SQLite: maps by integer/real/blob/text affinity
```ts
// In chartdb-main\src\lib\data\sql-import\common.ts
export function mapSQLTypeToGenericType(
    sqlType: string,
    databaseType?: DatabaseType
): DataType {
    // A missing type defaults to text
    if (!sqlType) {
        return genericDataTypes.find((t) => t.id === 'text')!;
    }
    const normalizedSqlType = sqlType.toLowerCase();
    // ...
```
```ts
// Passing type arguments through to the field
if (column.typeArgs) {
    if (typeof column.typeArgs === 'string') {
        // String form: only varchar(max) / nvarchar(max) is handled
        if (
            (field.type.id === 'varchar' || field.type.id === 'nvarchar') &&
            column.typeArgs === 'max'
        ) {
            field.characterMaximumLength = 'max';
        }
    } else if (Array.isArray(column.typeArgs) && column.typeArgs.length > 0) {
        if (
            field.type.id === 'varchar' ||
            field.type.id === 'nvarchar' ||
            field.type.id === 'char' ||
            field.type.id === 'nchar'
        ) {
            // Character types: the first argument is the length
            field.characterMaximumLength = column.typeArgs[0].toString();
        } else if (
            (field.type.id === 'numeric' || field.type.id === 'decimal') &&
            column.typeArgs.length >= 2
        ) {
            // Numeric types: precision and scale
            field.precision = column.typeArgs[0];
            field.scale = column.typeArgs[1];
        }
    }
}
```
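The excerpt consumes `column.typeArgs` as either the string `'max'` or an array of numbers, but does not show where those come from. A minimal sketch of splitting a raw type string into a name and arguments in that shape could look like this. `ParsedType` and `parseSqlType` are hypothetical, and in ChartDB the arguments may well come from the parser's AST instead of a regex.

```typescript
// Hypothetical sketch: split "decimal(10, 2)" into a type name and its arguments.
interface ParsedType {
    name: string;
    typeArgs?: string | number[];
}

function parseSqlType(raw: string): ParsedType {
    // Type name (may contain spaces, e.g. "double precision"), then optional "(args)".
    const m = /^\s*([A-Za-z_][\w ]*?)\s*(?:\(\s*([^)]*?)\s*\))?\s*$/.exec(raw);
    if (!m) return { name: raw.trim().toLowerCase() };
    const name = m[1].toLowerCase();
    const args = m[2];
    if (args === undefined || args === '') return { name };
    // SQL Server's varchar(max) / nvarchar(max) keeps the string form.
    if (args.toLowerCase() === 'max') return { name, typeArgs: 'max' };
    const nums = args.split(',').map((a) => Number(a.trim()));
    // Non-numeric arguments (for example enum values) are dropped in this sketch.
    return nums.some((n) => isNaN(n)) ? { name } : { name, typeArgs: nums };
}
```

With this, `parseSqlType('DECIMAL(10, 2)')` gives the name `decimal` with `typeArgs` `[10, 2]`, which the excerpt above would turn into `precision = 10` and `scale = 2`.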
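Summary
ChartDB uses a hybrid "AST first, manual fallback" strategy, and the balance between the two differs by dialect:
- PostgreSQL: mostly AST parsing, falling back to regex/string parsing on failure
  - File: chartdb-main/src/lib/data/sql-import/dialect-importers/postgresql/postgresql.ts
  - Notes: uses node-sql-parser for tables, indexes, and ALTER foreign keys; falls back to regex for complex columns and foreign keys, and extracts ENUMs.
- MySQL/MariaDB: mostly AST parsing, with extensive fallbacks and two-pass foreign-key handling
  - File: chartdb-main/src/lib/data/sql-import/dialect-importers/mysql/mysql.ts
  - Notes: parses CREATE TABLE and constraints via the AST first; uses regex to extract columns where the parser is unreliable; handles foreign keys both inside CREATE TABLE and in ALTER TABLE, plus a pending-FK mechanism.
- SQL Server: mostly manual parsing, with a small amount of AST support
  - File: chartdb-main/src/lib/data/sql-import/dialect-importers/sqlserver/sqlserver.ts
  - Notes: preprocesses the script first (removing USE/GO/[]/WITH... and so on), hand-parses CREATE TABLE and ALTER foreign keys, then uses node-sql-parser to fill in indexes and some constraints.
- SQLite: primarily AST parsing (with dialect adaptations)
  - File: chartdb-main/src/lib/data/sql-import/dialect-importers/sqlite/sqlite.ts
Unified conversion layer (turns parser results into a Diagram):
- File: chartdb-main/src/lib/data/sql-import/common.ts
- Functions: convertToChartDBDiagram, mapSQLTypeToGenericType
In short: ChartDB prefers AST parsing with node-sql-parser; when a dialect is incompatible or the syntax is too complex, it falls back to regex or manual parsing. SQL Server relies on manual parsing the most.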
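The overall pattern can be boiled down to a small wrapper: try the strict parser, and if it throws, try a looser extractor and record a warning. The sketch below shows that shape with toy parsers standing in for node-sql-parser and the regex fallback. Every name here (`parseWithFallback`, `strictTableName`, `looseTableName`) is hypothetical, and this is my summary of the strategy, not code from ChartDB.

```typescript
// Hypothetical sketch of the "strict parser first, loose fallback second" pattern.
interface ParseOutcome<T> {
    result: T;
    warnings: string[];
    usedFallback: boolean;
}

function parseWithFallback<T>(
    statement: string,
    primary: (sql: string) => T,
    fallback: (sql: string) => T | null
): ParseOutcome<T> | null {
    try {
        return { result: primary(statement), warnings: [], usedFallback: false };
    } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        const recovered = fallback(statement);
        if (recovered === null) return null; // neither strategy understood the statement
        return {
            result: recovered,
            warnings: [`Strict parse failed (${reason}); used fallback`],
            usedFallback: true,
        };
    }
}

// Toy stand-in for an AST parser: rejects anything after the closing parenthesis.
const strictTableName = (sql: string): string => {
    const m = /^\s*CREATE\s+TABLE\s+(\w+)\s*\([\s\S]*\)\s*;?\s*$/i.exec(sql);
    if (!m) throw new Error('unsupported syntax');
    return m[1];
};

// Toy stand-in for a regex fallback: only grabs the table name.
const looseTableName = (sql: string): string | null => {
    const m = /CREATE\s+TABLE\s+[`"\[]?(\w+)/i.exec(sql);
    return m ? m[1] : null;
};
```

For a project that needs Oracle or Databricks as well, the same wrapper leaves room to plug in a different primary parser per dialect while keeping the fallback and warning handling in one place.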