Implementing Your Own JSON Parser in C# (LALR(1) + miniDFA)
JSON is a widely used data format with a simple grammar. This article shows how to use bitParser (a generator, written in C#, that produces LALR(1) syntax parsers and miniDFA lexical analyzers) to quickly implement a simple and efficient JSON parser.
The complete code can be viewed and downloaded at (https://gitee.com/bitzhuwei/bitParser-demos/tree/master/bitzhuwei.JsonFormat.TestConsole).
The JSON grammar
The detailed specification of the JSON format can be found at (https://ecma-international.org/wp-content/uploads/ECMA-404_2nd_edition_december_2017.pdf). From it, the following grammar can be derived:
// Json grammar according to ECMA-404 2nd Edition / December 2017
Json = Object | Array ;
Object = '{' '}' | '{' Members '}' ;
Array = '[' ']' | '[' Elements ']' ;
Members = Members ',' Member | Member ;
Elements = Elements ',' Element | Element ;
Member = 'string' ':' Value ;
Element = Value ;
Value = 'null' | 'true' | 'false' | 'number' | 'string'
| Object | Array ;
%%"([^"\\\u0000-\u001F]|\\["\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"%% 'string'
%%[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?%% 'number'
In fact, I first drafted this grammar with the help of an AI and then cleaned it up by hand.
This grammar states:
- A Json is either an Object or an Array.
- An Object contains zero or more key-value pairs ("key" : value), enclosed in { }.
- An Array contains zero or more values, enclosed in [ ].
- A value is one of the following: null, true, false, number, string, Object, Array.
Here, null, true, and false mean exactly what they say, so their token definitions can be omitted. Written out explicitly in the grammar, they would be:
%%null%% 'null'
%%true%% 'true'
%%false%% 'false'
Likewise, {, }, [, ], ,, and : are literal, so their token definitions can also be omitted. Written out explicitly, they would be:
%%\{%% '{'
%%}%% '}'
%%\[%% '['
%%]%% ']'
%%,%% ','
%%:%% ':'
number can be described by the following diagram:

The diagram shows that the regular expression for the number token consists of four consecutive parts:
[-]? (0|[1-9][0-9]*) ([.][0-9]+)? ([eE][+-]?[0-9]+)?
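As a quick sanity check, the four parts can be joined and tried out with .NET's built-in regex engine. This is a standalone sketch, independent of the generated lexer:

```csharp
using System;
using System.Text.RegularExpressions;

// The four parts of the JSON number token, anchored end to end.
var jsonNumber = new Regex(@"^[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?$");

Console.WriteLine(jsonNumber.IsMatch("-12.5e+3")); // True
Console.WriteLine(jsonNumber.IsMatch("012"));      // False: JSON forbids leading zeros
Console.WriteLine(jsonNumber.IsMatch("1e"));       // False: exponent needs digits
```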
string can be described by the following diagram:

The diagram shows that the regular expression for the string token is a sequence of ordinary or escaped characters wrapped in double quotes:
" ( [^"\\\u0000-\u001F] | \\["\\/bfnrt] | \\u[0-9A-Fa-f]{4} )* "
/*
The alternatives mean:
any char except ", \ and the control characters (\u0000-\u001F)
the escapes \", \\, \/, \b, \f, \n, \r, \t
\uNNNN
*/
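The same regex can be checked against .NET's regex engine as well (again a standalone sketch, not the generated lexer):

```csharp
using System;
using System.Text.RegularExpressions;

// JSON string token: quote-wrapped plain chars, short escapes, or \uXXXX.
var jsonString = new Regex(
    @"^""([^""\\\u0000-\u001F]|\\[""\\/bfnrt]|\\u[0-9A-Fa-f]{4})*""$");

Console.WriteLine(jsonString.IsMatch("\"a\\u00e9b\""));    // True: \uNNNN escape
Console.WriteLine(jsonString.IsMatch("\"tab\\there\""));   // True: \t escape
Console.WriteLine(jsonString.IsMatch("\"bad\\xescape\"")); // False: \x is not allowed
```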
The last two alternatives of the Value rule,
Value = ... | Object | Array ;
indicate that JSON data can be nested.
Feed this grammar to bitParser and it generates, in one step, the JSON parser code and documentation described in the sections below.
The generated lexer

DFA

The DFA folder contains all the lexical states of the lexer, generated directly from the deterministic-finite-automaton construction.
The initial state lexicalState0:
using System;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
private static readonly Action<LexicalContext, char, CurrentStateWrap> lexicalState0 =
static (context, c, wrap) => {
if (false) { /* for simpler code generation purpose. */ }
/* user-input condition code */
/* [1-9] */
else if (/* possible Vt : 'number' */
/* no possible signal */
/* [xxx] scope */
'1'/*'\u0031'(49)*/ <= c && c <= '9'/*'\u0039'(57)*/) {
BeginToken(context);
ExtendToken(context, st.@number);
wrap.currentState = lexicalState1;
}
/* user-input condition code */
/* 0 */
else if (/* possible Vt : 'number' */
/* no possible signal */
/* single char */
c == '0'/*'\u0030'(48)*/) {
BeginToken(context);
ExtendToken(context, st.@number);
wrap.currentState = lexicalState2;
}
/* user-input condition code */
/* [-] */
else if (/* possible Vt : 'number' */
/* no possible signal */
/* [xxx] scope */
c == '-'/*'\u002D'(45)*/) {
BeginToken(context);
wrap.currentState = lexicalState3;
}
/* user-input condition code */
/* " */
else if (/* possible Vt : 'string' */
/* no possible signal */
/* single char */
c == '"'/*'\u0022'(34)*/) {
BeginToken(context);
wrap.currentState = lexicalState4;
}
/* user-input condition code */
/* f */
else if (/* possible Vt : 'false' */
/* no possible signal */
/* single char */
c == 'f'/*'\u0066'(102)*/) {
BeginToken(context);
wrap.currentState = lexicalState5;
}
/* user-input condition code */
/* t */
else if (/* possible Vt : 'true' */
/* no possible signal */
/* single char */
c == 't'/*'\u0074'(116)*/) {
BeginToken(context);
wrap.currentState = lexicalState6;
}
/* user-input condition code */
/* n */
else if (/* possible Vt : 'null' */
/* no possible signal */
/* single char */
c == 'n'/*'\u006E'(110)*/) {
BeginToken(context);
wrap.currentState = lexicalState7;
}
/* user-input condition code */
/* : */
else if (/* possible Vt : ':' */
/* no possible signal */
/* single char */
c == ':'/*'\u003A'(58)*/) {
BeginToken(context);
ExtendToken(context, st.@Colon符);
wrap.currentState = lexicalState8;
}
/* user-input condition code */
/* , */
else if (/* possible Vt : ',' */
/* no possible signal */
/* single char */
c == ','/*'\u002C'(44)*/) {
BeginToken(context);
ExtendToken(context, st.@Comma符);
wrap.currentState = lexicalState9;
}
/* user-input condition code */
/* ] */
else if (/* possible Vt : ']' */
/* no possible signal */
/* single char */
c == ']'/*'\u005D'(93)*/) {
BeginToken(context);
ExtendToken(context, st.@RightBracket符);
wrap.currentState = lexicalState10;
}
/* user-input condition code */
/* \[ */
else if (/* possible Vt : '[' */
/* no possible signal */
/* single char */
c == '['/*'\u005B'(91)*/) {
BeginToken(context);
ExtendToken(context, st.@LeftBracket符);
wrap.currentState = lexicalState11;
}
/* user-input condition code */
/* } */
else if (/* possible Vt : '}' */
/* no possible signal */
/* single char */
c == '}'/*'\u007D'(125)*/) {
BeginToken(context);
ExtendToken(context, st.@RightBrace符);
wrap.currentState = lexicalState12;
}
/* user-input condition code */
/* \{ */
else if (/* possible Vt : '{' */
/* no possible signal */
/* single char */
c == '{'/*'\u007B'(123)*/) {
BeginToken(context);
ExtendToken(context, st.@LeftBrace符);
wrap.currentState = lexicalState13;
}
/* deal with everything else. */
else if (c == ' ' || c == '\r' || c == '\n' || c == '\t' || c == '\0') {
wrap.currentState = lexicalState0; // skip them.
}
else { // unexpected char.
BeginToken(context);
ExtendToken(context);
AcceptToken(context, st.Error错);
wrap.currentState = lexicalState0;
}
};
}
}
The implementation in the DFA folder is the first and most straightforward one. It has since been superseded by more efficient implementations and is now kept for reference only, so I changed the extension of its C# files from cs to cs_ to keep them out of the build.
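The function-per-state idea behind lexicalState0, lexicalState1, ... can be sketched with a tiny digits-only automaton. This driver is my illustration only; the generated LexicalContext/CurrentStateWrap machinery carries much more:

```csharp
using System;

// Each state is a delegate that inspects one char and selects the next state,
// mirroring how each lexicalStateN assigns wrap.currentState.
Func<char, bool> current = null!;
Func<char, bool> state0 = null!, state1 = null!;
state0 = c => { if ('0' <= c && c <= '9') { current = state1; return true; } return false; };
state1 = c => '0' <= c && c <= '9'; // stays accepting while digits keep coming

bool Matches(string s) {
    current = state0;
    foreach (var ch in s) if (!current(ch)) return false;
    return current == state1; // accepted iff we ended in the accepting state
}
Console.WriteLine(Matches("2024")); // True
Console.WriteLine(Matches("20a4")); // False
```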
miniDFA

The miniDFA folder contains all the lexical states of the minimized automaton produced by Hopcroft's algorithm. It differs from DFA only in that the number of lexical states may be smaller.
This is the second implementation; it too has been superseded by a more efficient one and is kept for reference only, so its C# files also use the cs_ extension to keep them out of the build.
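bitParser uses Hopcroft's algorithm; the simpler Moore-style partition refinement below illustrates the underlying idea (my sketch, not bitParser's code): start from the accepting/non-accepting split and keep splitting groups whose members transition into different groups, until stable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy DFA over alphabet {a}: states 0..2, accepting {1,2}. States 1 and 2
// behave identically, so minimization merges them: 3 states -> 2 groups.
int[][] delta = {
    new[] { 1 }, // state 0 --a--> 1
    new[] { 2 }, // state 1 --a--> 2
    new[] { 2 }, // state 2 --a--> 2
};
var accepting = new HashSet<int> { 1, 2 };

int n = delta.Length;
// Initial partition: accepting vs non-accepting, identified by group id.
var group = Enumerable.Range(0, n).Select(s => accepting.Contains(s) ? 1 : 0).ToArray();
bool changed = true;
while (changed) {
    changed = false;
    // Signature of a state: its own group plus the groups its transitions reach.
    var sig = new Dictionary<string, int>();
    var next = new int[n];
    for (int s = 0; s < n; s++) {
        var key = group[s] + ":" + string.Join(",", delta[s].Select(t => group[t]));
        if (!sig.TryGetValue(key, out var id)) { id = sig.Count; sig[key] = id; }
        next[s] = id;
    }
    if (!next.SequenceEqual(group)) { group = next; changed = true; }
}
Console.WriteLine(group.Distinct().Count()); // 2
```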
tableDFA

The tableDFA folder stores the miniDFA as a two-dimensional array (ElseIf[][]). It encodes the same content as miniDFA; the difference is that miniDFA represents each lexical state as a function (Action<LexicalContext, char, CurrentStateWrap>), whereas tableDFA represents each state as an array (ElseIf[]). This reduces memory usage.
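Matching one input character against a state's row then amounts to finding the segment whose char range covers it; since the segments are sorted and non-overlapping, a binary search works. A minimal sketch, with tuples standing in for the generated ElseIf type:

```csharp
using System;

// Row resembling lexiStates[13] above: three hex-digit ranges, all leading to
// state 14. Each (Min, Max, Next) tuple is a hypothetical stand-in for ElseIf.
var row13 = new (char Min, char Max, int Next)[] {
    ('0', '9', 14), ('A', 'F', 14), ('a', 'f', 14),
};
Console.WriteLine(Lookup(row13, 'B')); // 14
Console.WriteLine(Lookup(row13, 'g')); // -1 (no segment covers 'g')

// Binary search over segments sorted, non-overlapping, by char range.
static int Lookup((char Min, char Max, int Next)[] row, char c) {
    int lo = 0, hi = row.Length - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (c < row[mid].Min) hi = mid - 1;
        else if (c > row[mid].Max) lo = mid + 1;
        else return row[mid].Next;
    }
    return -1;
}
```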
The miniDFA in two-dimensional-array form:
using System;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
private static readonly ElseIf[] omitChars = new ElseIf[] {
new('\u0000'/*(0)*/, nextStateId: 0, Acts.None),
new('\t'/*'\u0009'(9)*/, '\n'/*'\u000A'(10)*/, nextStateId: 0, Acts.None),
new('\r'/*'\u000D'(13)*/, nextStateId: 0, Acts.None),
new(' '/*'\u0020'(32)*/, nextStateId: 0, Acts.None),
};
private static readonly ElseIf[][] lexiStates = new ElseIf[47][];
static void InitializeLexiTable() {
ElseIf segment_48_48_25_3_ints_number = new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number);//refered 2 times
ElseIf segment_49_57_24_3_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number);//refered 2 times
ElseIf segment_48_57_37_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number);//refered 3 times
ElseIf segment_48_57_38_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number);//refered 2 times
ElseIf segment_48_57_44_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number);//refered 3 times
ElseIf segment_48_57_45_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number);//refered 2 times
ElseIf segment_46_46_8_0 = new('.'/*'\u002E'(46)*/, 8, Acts.None);//refered 9 times
ElseIf segment_48_48_33_2_ints_number = new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number);//refered 2 times
ElseIf segment_49_57_32_2_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number);//refered 2 times
ElseIf segment_69_69_7_0 = new('E'/*'\u0045'(69)*/, 7, Acts.None);//refered 11 times
ElseIf segment_101_101_7_0 = new('e'/*'\u0065'(101)*/, 7, Acts.None);//refered 11 times
ElseIf segment_0_65535_0_4_ints_number = new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number);//refered 13 times
ElseIf segment_48_48_40_2_ints_number = new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number);//refered 3 times
ElseIf segment_49_57_39_2_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number);//refered 3 times
ElseIf segment_48_57_41_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number);//refered 2 times
lexiStates[0] = new ElseIf[] {
// possible Vt: 'string'
/*0*/new('"'/*'\u0022'(34)*/, 2, Acts.Begin),
// possible Vt: ','
/*1*/new(','/*'\u002C'(44)*/, 27, Acts.Begin | Acts.Extend, st.@Comma符),
// possible Vt: 'number'
/*2*/new('-'/*'\u002D'(45)*/, 1, Acts.Begin),
// possible Vt: 'number'
/*3*///new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number),
/*3*/segment_48_48_25_3_ints_number,
// possible Vt: 'number'
/*4*///new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number),
/*4*/segment_49_57_24_3_ints_number,
// possible Vt: ':'
/*5*/new(':'/*'\u003A'(58)*/, 26, Acts.Begin | Acts.Extend, st.@Colon符),
// possible Vt: '['
/*6*/new('['/*'\u005B'(91)*/, 29, Acts.Begin | Acts.Extend, st.@LeftBracket符),
// possible Vt: ']'
/*7*/new(']'/*'\u005D'(93)*/, 28, Acts.Begin | Acts.Extend, st.@RightBracket符),
// possible Vt: 'false'
/*8*/new('f'/*'\u0066'(102)*/, 3, Acts.Begin),
// possible Vt: 'null'
/*9*/new('n'/*'\u006E'(110)*/, 5, Acts.Begin),
// possible Vt: 'true'
/*10*/new('t'/*'\u0074'(116)*/, 4, Acts.Begin),
// possible Vt: '{'
/*11*/new('{'/*'\u007B'(123)*/, 31, Acts.Begin | Acts.Extend, st.@LeftBrace符),
// possible Vt: '}'
/*12*/new('}'/*'\u007D'(125)*/, 30, Acts.Begin | Acts.Extend, st.@RightBrace符),
};
lexiStates[1] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number),
segment_48_48_25_3_ints_number,
// possible Vt: 'number'
//new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number),
segment_49_57_24_3_ints_number,
};
lexiStates[2] = new ElseIf[] {
// possible Vt: 'string'
new(' '/*'\u0020'(32)*/, '!'/*'\u0021'(33)*/, 2, Acts.None),
// possible Vt: 'string'
new('"'/*'\u0022'(34)*/, 36, Acts.Extend, st.@string),
// possible Vt: 'string'
new('#'/*'\u0023'(35)*/, '['/*'\u005B'(91)*/, 2, Acts.None),
// possible Vt: 'string'
new('\\'/*'\u005C'(92)*/, 9, Acts.None),
// possible Vt: 'string'
new(']'/*'\u005D'(93)*/, '\uFFFF'/*�(65535)*/, 2, Acts.None),
};
lexiStates[3] = new ElseIf[] {
// possible Vt: 'false'
new('a'/*'\u0061'(97)*/, 10, Acts.None),
};
lexiStates[4] = new ElseIf[] {
// possible Vt: 'true'
new('r'/*'\u0072'(114)*/, 6, Acts.None),
};
lexiStates[5] = new ElseIf[] {
// possible Vt: 'null'
new('u'/*'\u0075'(117)*/, 11, Acts.None),
};
lexiStates[6] = new ElseIf[] {
// possible Vt: 'true'
new('u'/*'\u0075'(117)*/, 18, Acts.None),
};
lexiStates[7] = new ElseIf[] {
// possible Vt: 'number'
new('+'/*'\u002B'(43)*/, 12, Acts.None),
// possible Vt: 'number'
new('-'/*'\u002D'(45)*/, 12, Acts.None),
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number),
segment_48_57_37_2_ints_number,
};
lexiStates[8] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number),
segment_48_57_38_2_ints_number,
};
lexiStates[9] = new ElseIf[] {
// possible Vt: 'string'
new('"'/*'\u0022'(34)*/, 2, Acts.None),
// possible Vt: 'string'
new('/'/*'\u002F'(47)*/, 2, Acts.None),
// possible Vt: 'string'
new('\\'/*'\u005C'(92)*/, 2, Acts.None),
// possible Vt: 'string'
new('b'/*'\u0062'(98)*/, 2, Acts.None),
// possible Vt: 'string'
new('f'/*'\u0066'(102)*/, 2, Acts.None),
// possible Vt: 'string'
new('n'/*'\u006E'(110)*/, 2, Acts.None),
// possible Vt: 'string'
new('r'/*'\u0072'(114)*/, 2, Acts.None),
// possible Vt: 'string'
new('t'/*'\u0074'(116)*/, 2, Acts.None),
// possible Vt: 'string'
new('u'/*'\u0075'(117)*/, 13, Acts.None),
};
lexiStates[10] = new ElseIf[] {
// possible Vt: 'false'
new('l'/*'\u006C'(108)*/, 17, Acts.None),
};
lexiStates[11] = new ElseIf[] {
// possible Vt: 'null'
new('l'/*'\u006C'(108)*/, 19, Acts.None),
};
lexiStates[12] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number),
segment_48_57_37_2_ints_number,
};
lexiStates[13] = new ElseIf[] {
// possible Vt: 'string'
new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 14, Acts.None),
// possible Vt: 'string'
new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 14, Acts.None),
// possible Vt: 'string'
new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 14, Acts.None),
};
lexiStates[14] = new ElseIf[] {
// possible Vt: 'string'
new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 15, Acts.None),
// possible Vt: 'string'
new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 15, Acts.None),
// possible Vt: 'string'
new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 15, Acts.None),
};
lexiStates[15] = new ElseIf[] {
// possible Vt: 'string'
new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 16, Acts.None),
// possible Vt: 'string'
new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 16, Acts.None),
// possible Vt: 'string'
new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 16, Acts.None),
};
lexiStates[16] = new ElseIf[] {
// possible Vt: 'string'
new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 2, Acts.None),
// possible Vt: 'string'
new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 2, Acts.None),
// possible Vt: 'string'
new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 2, Acts.None),
};
lexiStates[17] = new ElseIf[] {
// possible Vt: 'false'
new('s'/*'\u0073'(115)*/, 22, Acts.None),
};
lexiStates[18] = new ElseIf[] {
// possible Vt: 'true'
new('e'/*'\u0065'(101)*/, 42, Acts.Extend, st.@true),
};
lexiStates[19] = new ElseIf[] {
// possible Vt: 'null'
new('l'/*'\u006C'(108)*/, 43, Acts.Extend, st.@null),
};
lexiStates[20] = new ElseIf[] {
// possible Vt: 'number'
new('+'/*'\u002B'(43)*/, 23, Acts.None),
// possible Vt: 'number'
new('-'/*'\u002D'(45)*/, 23, Acts.None),
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number),
segment_48_57_44_2_ints_number,
};
lexiStates[21] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number),
segment_48_57_45_2_ints_number,
};
lexiStates[22] = new ElseIf[] {
// possible Vt: 'false'
new('e'/*'\u0065'(101)*/, 46, Acts.Extend, st.@false),
};
lexiStates[23] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number),
segment_48_57_44_2_ints_number,
};
lexiStates[24] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number),
segment_48_48_33_2_ints_number,
// possible Vt: 'number'
//new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number),
segment_49_57_32_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[25] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
new('0'/*'\u0030'(48)*/, 35, Acts.Extend, st.@number),
// possible Vt: 'number'
new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 34, Acts.Extend, st.@number),
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[26] = new ElseIf[] {
// possible Vt: ':'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Colon符),
};
lexiStates[27] = new ElseIf[] {
// possible Vt: ','
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Comma符),
};
lexiStates[28] = new ElseIf[] {
// possible Vt: ']'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBracket符),
};
lexiStates[29] = new ElseIf[] {
// possible Vt: '['
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBracket符),
};
lexiStates[30] = new ElseIf[] {
// possible Vt: '}'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBrace符),
};
lexiStates[31] = new ElseIf[] {
// possible Vt: '{'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBrace符),
};
lexiStates[32] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number),
segment_48_48_40_2_ints_number,
// possible Vt: 'number'
//new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number),
segment_49_57_39_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[33] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number),
segment_48_48_33_2_ints_number,
// possible Vt: 'number'
//new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number),
segment_49_57_32_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[34] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number),
segment_48_57_41_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[35] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[36] = new ElseIf[] {
// possible Vt: 'string'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@string),
};
lexiStates[37] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number),
segment_48_57_37_2_ints_number,
// possible Vt: 'number'
new('E'/*'\u0045'(69)*/, 20, Acts.None),
// possible Vt: 'number'
new('e'/*'\u0065'(101)*/, 20, Acts.None),
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[38] = new ElseIf[] {
// possible Vt: 'number'
new('.'/*'\u002E'(46)*/, 21, Acts.None),
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number),
segment_48_57_38_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[39] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number),
segment_48_48_40_2_ints_number,
// possible Vt: 'number'
//new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number),
segment_49_57_39_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[40] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number),
segment_48_48_40_2_ints_number,
// possible Vt: 'number'
//new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number),
segment_49_57_39_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[41] = new ElseIf[] {
// possible Vt: 'number'
//new('.'/*'\u002E'(46)*/, 8, Acts.None),
segment_46_46_8_0,
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number),
segment_48_57_41_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[42] = new ElseIf[] {
// possible Vt: 'true'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@true),
};
lexiStates[43] = new ElseIf[] {
// possible Vt: 'null'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@null),
};
lexiStates[44] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number),
segment_48_57_44_2_ints_number,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[45] = new ElseIf[] {
// possible Vt: 'number'
//new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number),
segment_48_57_45_2_ints_number,
// possible Vt: 'number'
//new('E'/*'\u0045'(69)*/, 7, Acts.None),
segment_69_69_7_0,
// possible Vt: 'number'
//new('e'/*'\u0065'(101)*/, 7, Acts.None),
segment_101_101_7_0,
// possible Vt: 'number'
//new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
segment_0_65535_0_4_ints_number,
};
lexiStates[46] = new ElseIf[] {
// possible Vt: 'false'
new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@false),
};
}
}
}
This is the third implementation; it too has been superseded by a more efficient one and is kept for reference only, so its C# files also use the cs_ extension.
Json.LexiTable.gen.bin

This binary file stores the two-dimensional-array form (ElseIf[][]) of the miniDFA. When the JSON parser loads, it reads this file to reconstruct the ElseIf[][] table, so the whole table no longer has to be hard-coded into the source, which reduces memory usage further.
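The round-trip idea can be sketched with BinaryWriter/BinaryReader. The layout below is my illustration only; the actual Json.LexiTable.gen.bin format is defined by bitParser and carries more information:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Write a small table of (min, max, nextState) segments, then read it back.
var segments = new (char Min, char Max, int Next)[] {
    ('0', '9', 14), ('A', 'F', 14), ('a', 'f', 14),
};
using var ms = new MemoryStream();
using (var w = new BinaryWriter(ms, System.Text.Encoding.UTF8, leaveOpen: true)) {
    w.Write(segments.Length);
    foreach (var s in segments) { w.Write(s.Min); w.Write(s.Max); w.Write(s.Next); }
}
ms.Position = 0;
using var r = new BinaryReader(ms);
int count = r.ReadInt32();
var readBack = new List<(char Min, char Max, int Next)>();
for (int i = 0; i < count; i++) {
    readBack.Add((r.ReadChar(), r.ReadChar(), r.ReadInt32()));
}
Console.WriteLine(readBack.Count); // 3
```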
For easier debugging and reference, I also generate a corresponding text version:
Json.LexiTable.gen.txt
ElseIf
4 omit chars:
0('\u0000'/*(0)*/->'\u0000'/*(0)*/)=>None,0
0('\t'/*'\u0009'(9)*/->'\n'/*'\u000A'(10)*/)=>None,0
0('\r'/*'\u000D'(13)*/->'\r'/*'\u000D'(13)*/)=>None,0
0(' '/*'\u0020'(32)*/->' '/*'\u0020'(32)*/)=>None,0
0 re-used int[] Vts:
0 re-used IfVt ifVt:
0 re-used IfVt[] ifVts:
15 re-used ElseIf2 segment:
25('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Begin, Extend,11
24('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Begin, Extend,11
37('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
38('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
44('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
45('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
8('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,0
33('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
32('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
7('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,0
7('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,0
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,11
40('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
39('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
41('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
47 ElseIf2[] row:
LexiTable.Rows[0] has 13 segments:
2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Begin,0
27(','/*'\u002C'(44)*/->','/*'\u002C'(44)*/)=>Begin, Extend,5
1('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>Begin,0
-1
-2
26(':'/*'\u003A'(58)*/->':'/*'\u003A'(58)*/)=>Begin, Extend,7
29('['/*'\u005B'(91)*/->'['/*'\u005B'(91)*/)=>Begin, Extend,3
28(']'/*'\u005D'(93)*/->']'/*'\u005D'(93)*/)=>Begin, Extend,4
3('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>Begin,0
5('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>Begin,0
4('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>Begin,0
31('{'/*'\u007B'(123)*/->'{'/*'\u007B'(123)*/)=>Begin, Extend,1
30('}'/*'\u007D'(125)*/->'}'/*'\u007D'(125)*/)=>Begin, Extend,2
LexiTable.Rows[1] has 2 segments:
-1
-2
LexiTable.Rows[2] has 5 segments:
2(' '/*'\u0020'(32)*/->'!'/*'\u0021'(33)*/)=>None,0
36('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Extend,6
2('#'/*'\u0023'(35)*/->'['/*'\u005B'(91)*/)=>None,0
9('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,0
2(']'/*'\u005D'(93)*/->'\uFFFF'/*�(65535)*/)=>None,0
LexiTable.Rows[3] has 1 segments:
10('a'/*'\u0061'(97)*/->'a'/*'\u0061'(97)*/)=>None,0
LexiTable.Rows[4] has 1 segments:
6('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,0
LexiTable.Rows[5] has 1 segments:
11('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0
LexiTable.Rows[6] has 1 segments:
18('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0
LexiTable.Rows[7] has 3 segments:
12('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,0
12('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0
-3
LexiTable.Rows[8] has 1 segments:
-4
LexiTable.Rows[9] has 9 segments:
2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>None,0
2('/'/*'\u002F'(47)*/->'/'/*'\u002F'(47)*/)=>None,0
2('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,0
2('b'/*'\u0062'(98)*/->'b'/*'\u0062'(98)*/)=>None,0
2('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>None,0
2('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>None,0
2('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,0
2('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>None,0
13('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0
LexiTable.Rows[10] has 1 segments:
17('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0
LexiTable.Rows[11] has 1 segments:
19('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0
LexiTable.Rows[12] has 1 segments:
-3
LexiTable.Rows[13] has 3 segments:
14('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
14('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
14('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[14] has 3 segments:
15('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
15('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
15('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[15] has 3 segments:
16('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
16('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
16('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[16] has 3 segments:
2('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
2('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
2('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0
LexiTable.Rows[17] has 1 segments:
22('s'/*'\u0073'(115)*/->'s'/*'\u0073'(115)*/)=>None,0
LexiTable.Rows[18] has 1 segments:
42('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,9
LexiTable.Rows[19] has 1 segments:
43('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>Extend,8
LexiTable.Rows[20] has 3 segments:
23('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,0
23('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0
-5
LexiTable.Rows[21] has 1 segments:
-6
LexiTable.Rows[22] has 1 segments:
46('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,10
LexiTable.Rows[23] has 1 segments:
-5
LexiTable.Rows[24] has 6 segments:
-7
-8
-9
-10
-11
-12
LexiTable.Rows[25] has 6 segments:
-7
35('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
34('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
-10
-11
-12
LexiTable.Rows[26] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,7
LexiTable.Rows[27] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,5
LexiTable.Rows[28] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,4
LexiTable.Rows[29] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,3
LexiTable.Rows[30] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,2
LexiTable.Rows[31] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,1
LexiTable.Rows[32] has 6 segments:
-7
-13
-14
-10
-11
-12
LexiTable.Rows[33] has 6 segments:
-7
-8
-9
-10
-11
-12
LexiTable.Rows[34] has 5 segments:
-7
-15
-10
-11
-12
LexiTable.Rows[35] has 4 segments:
-7
-10
-11
-12
LexiTable.Rows[36] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,6
LexiTable.Rows[37] has 4 segments:
-3
20('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,0
20('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,0
-12
LexiTable.Rows[38] has 5 segments:
21('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,0
-4
-10
-11
-12
LexiTable.Rows[39] has 6 segments:
-7
-13
-14
-10
-11
-12
LexiTable.Rows[40] has 6 segments:
-7
-13
-14
-10
-11
-12
LexiTable.Rows[41] has 5 segments:
-7
-15
-10
-11
-12
LexiTable.Rows[42] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,9
LexiTable.Rows[43] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,8
LexiTable.Rows[44] has 2 segments:
-5
-12
LexiTable.Rows[45] has 4 segments:
-6
-10
-11
-12
LexiTable.Rows[46] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,10
This is the fourth implementation and the one currently in use. To simplify the load path, I moved it from the Json.gen\LexicalAnalyzer folder up to the Json.gen folder.
Json.LexicalScripts.gen.cs
This file contains the helper functions that any lexical state may call. They fall into three groups: Begin, Extend, and Accept. Together they record a token's start position (Begin) and end position (Extend), set its type, line number, column number, and other attributes, and append it to the List<Token> tokens collection (Accept).
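This Begin/Extend/Accept interplay implements longest-match scanning: Begin marks a candidate start, Extend keeps advancing the last position where some token type still matches, and Accept cuts the token at the last extended position, rewinding any extra look-ahead. A minimal self-contained sketch of the pattern for a single token type (my illustration, not the generated code):

```csharp
using System;
using System.Collections.Generic;

// Longest-match scanning with Begin/Extend/Accept markers, for integer tokens.
// The generated lexer applies the same idea per terminal (Vt).
var tokens = new List<string>();
string src = "12,345 6";
int i = 0;
while (i < src.Length) {
    char c = src[i];
    if (c >= '0' && c <= '9') {
        int begin = i; // Begin: candidate token starts here
        int end = i;   // Extend: last position that still matches
        while (i < src.Length && src[i] >= '0' && src[i] <= '9') { end = i; i++; }
        tokens.Add(src.Substring(begin, end - begin + 1)); // Accept: cut the token
    }
    else i++; // skip separators
}
Console.WriteLine(string.Join("|", tokens)); // 12|345|6
```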
Json.LexicalScripts.gen.cs
using System;
using System.Collections.Generic;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
// this is where new <see cref="Token"/> starts.
private static void BeginToken(LexicalContext context) {
if (context.analyzingToken.type != AnalyzingToken.NotYet) {
context.analyzingToken.Reset(index: context.result.Count, start: context.cursor);
}
}
// extend value of current token(<see cref="LexicalContext.analyzingToken"/>)
private static void ExtendToken(LexicalContext context, int Vt) {
context.analyzingToken.ends[Vt] = context.cursor;
}
private static void ExtendToken2(LexicalContext context, params int[] Vts) {
for (int i = 0; i < Vts.Length; i++) {
var Vt = Vts[i];
context.analyzingToken.ends[Vt] = context.cursor;
}
}
private static void ExtendToken3(LexicalContext context, params IfVt[] ifVts) {
for (int i = 0; i < ifVts.Length; i++) {
var Vt = ifVts[i].Vt;
context.analyzingToken.ends[Vt] = context.cursor;
}
}
// accept current Token
// set Token.type and neutralize the last LexicalContext.MoveForward()
private static void AcceptToken(LexicalContext context, int Vt) {
var startIndex = context.analyzingToken.start.index;
var end = context.analyzingToken.ends[Vt];
context.analyzingToken.value = context.sourceCode.Substring(
startIndex, end.index - startIndex + 1);
context.analyzingToken.type = Vt;
// cancel forward steps for post-regex
var backStep = context.cursor.index - end.index;
if (backStep > 0) { context.MoveBack(backStep); }
// next operation: LexicalContext.MoveForward();
var token = context.analyzingToken.Dump(
#if DEBUG
context.stArray,
#endif
end);
context.result.Add(token);
// no comment tokens to skip
context.lastSyntaxValidToken = token;
if (token.type == st.Error错) {
context.result.token2ErrorInfo.Add(token,
new TokenErrorInfo(token, "token type unrecognized!"));
}
}
private static void AcceptToken2(LexicalContext context, params int[] Vts) {
AcceptToken(context, Vts[0]);
}
private static void AcceptToken3(LexicalContext context, params IfVt[] ifVts) {
var typeSet = false;
int lastType = st.@终;
if (context.lastSyntaxValidToken != null) {
lastType = context.lastSyntaxValidToken.type;
}
for (var i = 0; i < ifVts.Length; i++) {
var ifVt = ifVts[i];
if (ifVt.signalCondition == context.signalCondition
// if preVt is the default (st.@终), use the first matching type.
// otherwise, preVt must equal lastType.
&& (ifVt.preVt == st.@终 // default preVt
|| ifVt.preVt == lastType)) { // <'Vt'>
context.analyzingToken.type = ifVt.Vt;
if (ifVt.nextSignal != null) { context.signalCondition = ifVt.nextSignal; }
typeSet = true;
break;
}
}
if (!typeSet) {
for (var i = 0; i < ifVts.Length; i++) {
var ifVt = ifVts[i];
if (// ignore the signal condition and try to assign a type.
// if preVt is the default (st.@终), use the first matching type.
// otherwise, preVt must equal lastType.
(ifVt.preVt == st.@终 // default preVt
|| ifVt.preVt == lastType)) { // <'Vt'>
context.analyzingToken.type = ifVt.Vt;
context.signalCondition = LexicalContext.defaultSignal;
typeSet = true;
break;
}
}
}
var startIndex = context.analyzingToken.start.index;
var end = context.analyzingToken.start;
if (!typeSet) {
// we failed to assign type according to lexi statements.
// this indicates token error in source code or inappropriate lexi statements.
//throw new Exception("Algorithm error: token type not set!");
context.analyzingToken.type = st.Error错;
context.signalCondition = LexicalContext.defaultSignal;
// choose longest value
for (int i = 0; i < context.analyzingToken.ends.Length; i++) {
var item = context.analyzingToken.ends[i];
if (end.index < item.index) { end = item; }
}
}
else { end = context.analyzingToken.ends[context.analyzingToken.type]; }
context.analyzingToken.value = context.sourceCode.Substring(startIndex, end.index - startIndex + 1);
// cancel forward steps for post-regex
var backStep = context.cursor.index - end.index;
if (backStep > 0) { context.MoveBack(backStep); }
// next operation: context.MoveForward();
var token = context.analyzingToken.Dump(
#if DEBUG
context.stArray,
#endif
end);
context.result.Add(token);
// no comment to skip
context.lastSyntaxValidToken = token;
if (token.type == st.Error错) {
context.result.token2ErrorInfo.Add(token,
new TokenErrorInfo(token, "token type unrecognized!"));
}
}
}
}
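The `BeginToken`/`ExtendToken`/`AcceptToken` trio above implements maximal munch: every time the DFA passes through an accepting state, the current position is recorded per token type; when no further transition applies, the cursor is moved back to the longest recorded end (`context.MoveBack(backStep)`). The following is a minimal sketch of that idea using illustrative names, not bitParser's actual API:

```csharp
using System;

// Illustrative sketch (not bitParser's actual API) of the maximal-munch
// strategy used by the generated lexer: each time the DFA passes through an
// accepting state we record the position ("ExtendToken"); when no further
// transition applies we back the cursor up to the last recorded end
// ("AcceptToken" + MoveBack) and emit the longest valid token.
static class MaximalMunch
{
    // Scans a JSON number (integer part plus optional fraction only,
    // to keep the sketch short) starting at 'start'.
    public static (string value, int next) ScanNumber(string src, int start)
    {
        int i = start, lastAccept = -1;
        if (i < src.Length && src[i] == '-') i++;
        // integer part: 0 | [1-9][0-9]*
        if (i < src.Length && char.IsDigit(src[i]))
        {
            if (src[i] == '0') i++;
            else while (i < src.Length && char.IsDigit(src[i])) i++;
            lastAccept = i;                              // "ExtendToken"
            // optional fraction: [.][0-9]+
            if (i < src.Length && src[i] == '.')
            {
                int j = i + 1;
                while (j < src.Length && char.IsDigit(src[j])) j++;
                if (j > i + 1) lastAccept = j;           // fraction accepted
            }
        }
        if (lastAccept < 0) throw new FormatException("not a number");
        // "MoveBack": characters read past the last accept are discarded.
        return (src.Substring(start, lastAccept - start), lastAccept);
    }
}
```

For input `"12.x"` this returns `("12", 2)`: the scanner reads the `.` optimistically, finds no digit after it, and backtracks, exactly what `context.MoveBack(backStep)` does inside `AcceptToken`.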
Json.LexicalReservedWords.gen.cs
这里记录了Json文法的全部保留字(任何编程语言中的keyword),也就是{
、}
、[
、]
、,
、:
、null
、true
、false
这些。显然这是辅助的东西,不必在意。
Json.LexicalReservedWords.gen.cs
using System;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
public static class reservedWord {
/// <summary>
/// {
/// </summary>
public const string @LeftBrace符 = "{";
/// <summary>
/// }
/// </summary>
public const string @RightBrace符 = "}";
/// <summary>
/// [
/// </summary>
public const string @LeftBracket符 = "[";
/// <summary>
/// ]
/// </summary>
public const string @RightBracket符 = "]";
/// <summary>
/// ,
/// </summary>
public const string @Comma符 = ",";
/// <summary>
/// :
/// </summary>
public const string @Colon符 = ":";
/// <summary>
/// null
/// </summary>
public const string @null = "null";
/// <summary>
/// true
/// </summary>
public const string @true = "true";
/// <summary>
/// false
/// </summary>
public const string @false = "false";
}
/// <summary>
/// if <paramref name="token"/> is a reserved word, assign correspond type and return true.
/// <para>otherwise, return false.</para>
/// </summary>
/// <param name="token"></param>
/// <returns></returns>
private static bool CheckReservedWord(AnalyzingToken token) {
bool isReservedWord = true;
switch (token.value) {
case reservedWord.@LeftBrace符: token.type = st.@LeftBrace符; break;
case reservedWord.@RightBrace符: token.type = st.@RightBrace符; break;
case reservedWord.@LeftBracket符: token.type = st.@LeftBracket符; break;
case reservedWord.@RightBracket符: token.type = st.@RightBracket符; break;
case reservedWord.@Comma符: token.type = st.@Comma符; break;
case reservedWord.@Colon符: token.type = st.@Colon符; break;
case reservedWord.@null: token.type = st.@null; break;
case reservedWord.@true: token.type = st.@true; break;
case reservedWord.@false: token.type = st.@false; break;
default: isReservedWord = false; break;
}
return isReservedWord;
}
}
}
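`CheckReservedWord` simply matches the finished lexeme's text against the fixed reserved-word list and, on a hit, overwrites the token type. The same classification can be expressed with a lookup table; in this sketch the integer codes are made up for illustration and do not match the generated `st.*` constants:

```csharp
using System;
using System.Collections.Generic;

// Table-driven equivalent of CheckReservedWord. The integer codes below are
// hypothetical, chosen only for this sketch.
static class ReservedWords
{
    static readonly Dictionary<string, int> map = new()
    {
        ["{"] = 1, ["}"] = 2, ["["] = 3, ["]"] = 4, [","] = 5, [":"] = 6,
        ["null"] = 8, ["true"] = 9, ["false"] = 10,
    };

    // Returns true (and the token-type code) when text is a reserved word.
    public static bool TryClassify(string text, out int type) =>
        map.TryGetValue(text, out type);
}
```

For nine fixed strings, a `switch` (as generated) and a dictionary perform comparably; the generator emits the `switch` because it is allocation-free and easy to read.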
README.gen.md
This is the generated documentation for the lexer. It uses mermaid to draw the state machine of each token as well as the overall state machine of the whole grammar, as shown below.

I know you cannot read this at that size; neither can I. Open the README.gen.md file on a big screen instead.
The generated syntax parser

Dictionary<int, LRParseAction>
Json.Dict.LALR(1).gen.cs_ is the LALR(1) parsing state machine: each syntax state is a `Dictionary<int, LRParseAction>`
object.
Json.Dict.LALR(1).gen.cs_
using System;
using System.Collections.Generic;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
private static Dictionary<int, LRParseAction>[] InitializeSyntaxStates() {
const int syntaxStateCount = 29;
var states = new Dictionary<int, LRParseAction>[syntaxStateCount];
// 102 actions
// conflicts(0)=not solved(0)+solved(0)(0 warnings)
#region create objects of syntax states
states[0] = new(capacity: 5);
states[1] = new(capacity: 1);
states[2] = new(capacity: 1);
states[3] = new(capacity: 1);
states[4] = new(capacity: 4);
states[5] = new(capacity: 13);
states[6] = new(capacity: 4);
states[7] = new(capacity: 2);
states[8] = new(capacity: 2);
states[9] = new(capacity: 1);
states[10] = new(capacity: 4);
states[11] = new(capacity: 2);
states[12] = new(capacity: 2);
states[13] = new(capacity: 2);
states[14] = new(capacity: 3);
states[15] = new(capacity: 3);
states[16] = new(capacity: 3);
states[17] = new(capacity: 3);
states[18] = new(capacity: 3);
states[19] = new(capacity: 3);
states[20] = new(capacity: 3);
states[21] = new(capacity: 4);
states[22] = new(capacity: 2);
states[23] = new(capacity: 10);
states[24] = new(capacity: 4);
states[25] = new(capacity: 11);
states[26] = new(capacity: 2);
states[27] = new(capacity: 2);
states[28] = new(capacity: 2);
#endregion create objects of syntax states
#region re-used actions
LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// referred 4 times
LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// referred 4 times
LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// referred 2 times
LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// referred 2 times
LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// referred 3 times
LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// referred 3 times
LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// referred 3 times
LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// referred 3 times
LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// referred 3 times
LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// referred 3 times
LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// referred 3 times
LRParseAction aReduce2 = new(regulations[2]);// referred 4 times
LRParseAction aReduce7 = new(regulations[7]);// referred 2 times
LRParseAction aReduce4 = new(regulations[4]);// referred 4 times
LRParseAction aReduce9 = new(regulations[9]);// referred 2 times
LRParseAction aReduce11 = new(regulations[11]);// referred 2 times
LRParseAction aReduce12 = new(regulations[12]);// referred 3 times
LRParseAction aReduce13 = new(regulations[13]);// referred 3 times
LRParseAction aReduce14 = new(regulations[14]);// referred 3 times
LRParseAction aReduce15 = new(regulations[15]);// referred 3 times
LRParseAction aReduce16 = new(regulations[16]);// referred 3 times
LRParseAction aReduce17 = new(regulations[17]);// referred 3 times
LRParseAction aReduce18 = new(regulations[18]);// referred 3 times
LRParseAction aReduce3 = new(regulations[3]);// referred 4 times
LRParseAction aReduce5 = new(regulations[5]);// referred 4 times
LRParseAction aReduce6 = new(regulations[6]);// referred 2 times
LRParseAction aReduce10 = new(regulations[10]);// referred 2 times
LRParseAction aReduce8 = new(regulations[8]);// referred 2 times
#endregion re-used actions
// 102 actions
// conflicts(0)=not solved(0)+solved(0)(0 warnings)
#region init actions of syntax states
// syntaxStates[0]:
// [-1] Json' : ⏳ Json ;☕ '¥'
// [0] Json : ⏳ Object ;☕ '¥'
// [1] Json : ⏳ Array ;☕ '¥'
// [2] Object : ⏳ '{' '}' ;☕ '¥'
// [3] Object : ⏳ '{' Members '}' ;☕ '¥'
// [4] Array : ⏳ '[' ']' ;☕ '¥'
// [5] Array : ⏳ '[' Elements ']' ;☕ '¥'
/*0*/states[0].Add(st.Json枝, new(LRParseAction.Kind.Goto, states[1]));
/*1*/states[0].Add(st.Object枝, new(LRParseAction.Kind.Goto, states[2]));
/*2*/states[0].Add(st.Array枝, new(LRParseAction.Kind.Goto, states[3]));
/*3*/states[0].Add(st.@LeftBrace符, aShift4);
/*4*/states[0].Add(st.@LeftBracket符, aShift5);
// syntaxStates[1]:
// [-1] Json' : Json ⏳ ;☕ '¥'
/*5*/states[1].Add(st.@终, LRParseAction.accept);
// syntaxStates[2]:
// [0] Json : Object ⏳ ;☕ '¥'
/*6*/states[2].Add(st.@终, new(regulations[0]));
// syntaxStates[3]:
// [1] Json : Array ⏳ ;☕ '¥'
/*7*/states[3].Add(st.@终, new(regulations[1]));
// syntaxStates[4]:
// [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥'
// [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥'
// [6] Members : ⏳ Members ',' Member ;☕ ',' '}'
// [7] Members : ⏳ Member ;☕ ',' '}'
// [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'
/*8*/states[4].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[6]));
/*9*/states[4].Add(st.Members枝, new(LRParseAction.Kind.Goto, states[7]));
/*10*/states[4].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[8]));
/*11*/states[4].Add(st.@string, aShift9);
// syntaxStates[5]:
// [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥'
// [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥'
// [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']'
// [9] Elements : ⏳ Element ;☕ ',' ']'
// [11] Element : ⏳ Value ;☕ ',' ']'
// [12] Value : ⏳ 'null' ;☕ ',' ']'
// [13] Value : ⏳ 'true' ;☕ ',' ']'
// [14] Value : ⏳ 'false' ;☕ ',' ']'
// [15] Value : ⏳ 'number' ;☕ ',' ']'
// [16] Value : ⏳ 'string' ;☕ ',' ']'
// [17] Value : ⏳ Object ;☕ ',' ']'
// [18] Value : ⏳ Array ;☕ ',' ']'
// [2] Object : ⏳ '{' '}' ;☕ ',' ']'
// [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'
// [4] Array : ⏳ '[' ']' ;☕ ',' ']'
// [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'
/*12*/states[5].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[10]));
/*13*/states[5].Add(st.Elements枝, new(LRParseAction.Kind.Goto, states[11]));
/*14*/states[5].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[12]));
/*15*/states[5].Add(st.Value枝, aGoto13);
/*16*/states[5].Add(st.@null, aShift14);
/*17*/states[5].Add(st.@true, aShift15);
/*18*/states[5].Add(st.@false, aShift16);
/*19*/states[5].Add(st.@number, aShift17);
/*20*/states[5].Add(st.@string, aShift18);
/*21*/states[5].Add(st.Object枝, aGoto19);
/*22*/states[5].Add(st.Array枝, aGoto20);
/*23*/states[5].Add(st.@LeftBrace符, aShift4);
/*24*/states[5].Add(st.@LeftBracket符, aShift5);
// syntaxStates[6]:
// [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥'
/*25*/states[6].Add(st.@Comma符, aReduce2);
/*26*/states[6].Add(st.@RightBracket符, aReduce2);
/*27*/states[6].Add(st.@RightBrace符, aReduce2);
/*28*/states[6].Add(st.@终, aReduce2);
// syntaxStates[7]:
// [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥'
// [6] Members : Members ⏳ ',' Member ;☕ ',' '}'
/*29*/states[7].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[21]));
/*30*/states[7].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[22]));
// syntaxStates[8]:
// [7] Members : Member ⏳ ;☕ ',' '}'
/*31*/states[8].Add(st.@Comma符, aReduce7);
/*32*/states[8].Add(st.@RightBrace符, aReduce7);
// syntaxStates[9]:
// [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}'
/*33*/states[9].Add(st.@Colon符, new(LRParseAction.Kind.Shift, states[23]));
// syntaxStates[10]:
// [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥'
/*34*/states[10].Add(st.@Comma符, aReduce4);
/*35*/states[10].Add(st.@RightBracket符, aReduce4);
/*36*/states[10].Add(st.@RightBrace符, aReduce4);
/*37*/states[10].Add(st.@终, aReduce4);
// syntaxStates[11]:
// [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥'
// [8] Elements : Elements ⏳ ',' Element ;☕ ',' ']'
/*38*/states[11].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[24]));
/*39*/states[11].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[25]));
// syntaxStates[12]:
// [9] Elements : Element ⏳ ;☕ ',' ']'
/*40*/states[12].Add(st.@Comma符, aReduce9);
/*41*/states[12].Add(st.@RightBracket符, aReduce9);
// syntaxStates[13]:
// [11] Element : Value ⏳ ;☕ ',' ']'
/*42*/states[13].Add(st.@Comma符, aReduce11);
/*43*/states[13].Add(st.@RightBracket符, aReduce11);
// syntaxStates[14]:
// [12] Value : 'null' ⏳ ;☕ ',' ']' '}'
/*44*/states[14].Add(st.@Comma符, aReduce12);
/*45*/states[14].Add(st.@RightBracket符, aReduce12);
/*46*/states[14].Add(st.@RightBrace符, aReduce12);
// syntaxStates[15]:
// [13] Value : 'true' ⏳ ;☕ ',' ']' '}'
/*47*/states[15].Add(st.@Comma符, aReduce13);
/*48*/states[15].Add(st.@RightBracket符, aReduce13);
/*49*/states[15].Add(st.@RightBrace符, aReduce13);
// syntaxStates[16]:
// [14] Value : 'false' ⏳ ;☕ ',' ']' '}'
/*50*/states[16].Add(st.@Comma符, aReduce14);
/*51*/states[16].Add(st.@RightBracket符, aReduce14);
/*52*/states[16].Add(st.@RightBrace符, aReduce14);
// syntaxStates[17]:
// [15] Value : 'number' ⏳ ;☕ ',' ']' '}'
/*53*/states[17].Add(st.@Comma符, aReduce15);
/*54*/states[17].Add(st.@RightBracket符, aReduce15);
/*55*/states[17].Add(st.@RightBrace符, aReduce15);
// syntaxStates[18]:
// [16] Value : 'string' ⏳ ;☕ ',' ']' '}'
/*56*/states[18].Add(st.@Comma符, aReduce16);
/*57*/states[18].Add(st.@RightBracket符, aReduce16);
/*58*/states[18].Add(st.@RightBrace符, aReduce16);
// syntaxStates[19]:
// [17] Value : Object ⏳ ;☕ ',' ']' '}'
/*59*/states[19].Add(st.@Comma符, aReduce17);
/*60*/states[19].Add(st.@RightBracket符, aReduce17);
/*61*/states[19].Add(st.@RightBrace符, aReduce17);
// syntaxStates[20]:
// [18] Value : Array ⏳ ;☕ ',' ']' '}'
/*62*/states[20].Add(st.@Comma符, aReduce18);
/*63*/states[20].Add(st.@RightBracket符, aReduce18);
/*64*/states[20].Add(st.@RightBrace符, aReduce18);
// syntaxStates[21]:
// [3] Object : '{' Members '}' ⏳ ;☕ ',' ']' '}' '¥'
/*65*/states[21].Add(st.@Comma符, aReduce3);
/*66*/states[21].Add(st.@RightBracket符, aReduce3);
/*67*/states[21].Add(st.@RightBrace符, aReduce3);
/*68*/states[21].Add(st.@终, aReduce3);
// syntaxStates[22]:
// [6] Members : Members ',' ⏳ Member ;☕ ',' '}'
// [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'
/*69*/states[22].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[26]));
/*70*/states[22].Add(st.@string, aShift9);
// syntaxStates[23]:
// [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}'
// [12] Value : ⏳ 'null' ;☕ ',' '}'
// [13] Value : ⏳ 'true' ;☕ ',' '}'
// [14] Value : ⏳ 'false' ;☕ ',' '}'
// [15] Value : ⏳ 'number' ;☕ ',' '}'
// [16] Value : ⏳ 'string' ;☕ ',' '}'
// [17] Value : ⏳ Object ;☕ ',' '}'
// [18] Value : ⏳ Array ;☕ ',' '}'
// [2] Object : ⏳ '{' '}' ;☕ ',' '}'
// [3] Object : ⏳ '{' Members '}' ;☕ ',' '}'
// [4] Array : ⏳ '[' ']' ;☕ ',' '}'
// [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}'
/*71*/states[23].Add(st.Value枝, new(LRParseAction.Kind.Goto, states[27]));
/*72*/states[23].Add(st.@null, aShift14);
/*73*/states[23].Add(st.@true, aShift15);
/*74*/states[23].Add(st.@false, aShift16);
/*75*/states[23].Add(st.@number, aShift17);
/*76*/states[23].Add(st.@string, aShift18);
/*77*/states[23].Add(st.Object枝, aGoto19);
/*78*/states[23].Add(st.Array枝, aGoto20);
/*79*/states[23].Add(st.@LeftBrace符, aShift4);
/*80*/states[23].Add(st.@LeftBracket符, aShift5);
// syntaxStates[24]:
// [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥'
/*81*/states[24].Add(st.@Comma符, aReduce5);
/*82*/states[24].Add(st.@RightBracket符, aReduce5);
/*83*/states[24].Add(st.@RightBrace符, aReduce5);
/*84*/states[24].Add(st.@终, aReduce5);
// syntaxStates[25]:
// [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']'
// [11] Element : ⏳ Value ;☕ ',' ']'
// [12] Value : ⏳ 'null' ;☕ ',' ']'
// [13] Value : ⏳ 'true' ;☕ ',' ']'
// [14] Value : ⏳ 'false' ;☕ ',' ']'
// [15] Value : ⏳ 'number' ;☕ ',' ']'
// [16] Value : ⏳ 'string' ;☕ ',' ']'
// [17] Value : ⏳ Object ;☕ ',' ']'
// [18] Value : ⏳ Array ;☕ ',' ']'
// [2] Object : ⏳ '{' '}' ;☕ ',' ']'
// [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'
// [4] Array : ⏳ '[' ']' ;☕ ',' ']'
// [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'
/*85*/states[25].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[28]));
/*86*/states[25].Add(st.Value枝, aGoto13);
/*87*/states[25].Add(st.@null, aShift14);
/*88*/states[25].Add(st.@true, aShift15);
/*89*/states[25].Add(st.@false, aShift16);
/*90*/states[25].Add(st.@number, aShift17);
/*91*/states[25].Add(st.@string, aShift18);
/*92*/states[25].Add(st.Object枝, aGoto19);
/*93*/states[25].Add(st.Array枝, aGoto20);
/*94*/states[25].Add(st.@LeftBrace符, aShift4);
/*95*/states[25].Add(st.@LeftBracket符, aShift5);
// syntaxStates[26]:
// [6] Members : Members ',' Member ⏳ ;☕ ',' '}'
/*96*/states[26].Add(st.@Comma符, aReduce6);
/*97*/states[26].Add(st.@RightBrace符, aReduce6);
// syntaxStates[27]:
// [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}'
/*98*/states[27].Add(st.@Comma符, aReduce10);
/*99*/states[27].Add(st.@RightBrace符, aReduce10);
// syntaxStates[28]:
// [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']'
/*100*/states[28].Add(st.@Comma符, aReduce8);
/*101*/states[28].Add(st.@RightBracket符, aReduce8);
#endregion init actions of syntax states
return states;
}
}
}
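Tables like the above are consumed by a standard LR driver: look up the current state's action for the lookahead symbol, then shift, reduce (pop |rhs| states and follow the Goto entry for the rule's left-hand side), or accept. The sketch below uses simplified stand-ins for `LRParseAction` and exercises the loop on a hypothetical toy grammar `S -> '(' S ')' | 'x'`, not the generated Json tables:

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for the generated LRParseAction / state types.
enum ActKind { Shift, Reduce, Goto, Accept }
sealed record Act(ActKind Kind, int Target = 0, int Lhs = 0, int Len = 0);

static class LrDriver
{
    // states[s] maps a symbol code (terminal or nonterminal) to an action,
    // just like the generated Dictionary<int, LRParseAction> tables.
    public static bool Parse(Dictionary<int, Act>[] states, IReadOnlyList<int> tokens)
    {
        var stack = new Stack<int>();
        stack.Push(0);
        int pos = 0;
        while (true)
        {
            if (!states[stack.Peek()].TryGetValue(tokens[pos], out var act)) return false;
            switch (act.Kind)
            {
                case ActKind.Shift:                    // consume token, push state
                    stack.Push(act.Target); pos++; break;
                case ActKind.Reduce:                   // pop |rhs| states, then Goto on lhs
                    for (int i = 0; i < act.Len; i++) stack.Pop();
                    stack.Push(states[stack.Peek()][act.Lhs].Target);
                    break;
                case ActKind.Accept:
                    return true;
                default:
                    return false;
            }
        }
    }

    // Hand-built tables for the toy grammar S -> '(' S ')' | 'x'
    // (symbol codes: EOF=0, '('=1, ')'=2, 'x'=3, S=10).
    public static Dictionary<int, Act>[] ToyTables()
    {
        var reduceX = new Act(ActKind.Reduce, Lhs: 10, Len: 1);     // S -> 'x'
        var reduceParen = new Act(ActKind.Reduce, Lhs: 10, Len: 3); // S -> '(' S ')'
        return new[]
        {
            new Dictionary<int, Act> { [1] = new(ActKind.Shift, 2), [3] = new(ActKind.Shift, 3), [10] = new(ActKind.Goto, 1) },
            new Dictionary<int, Act> { [0] = new(ActKind.Accept) },
            new Dictionary<int, Act> { [1] = new(ActKind.Shift, 2), [3] = new(ActKind.Shift, 3), [10] = new(ActKind.Goto, 4) },
            new Dictionary<int, Act> { [0] = reduceX, [2] = reduceX },
            new Dictionary<int, Act> { [2] = new(ActKind.Shift, 5) },
            new Dictionary<int, Act> { [0] = reduceParen, [2] = reduceParen },
        };
    }
}
```

With these tables, the token stream for `(x)` (codes `[1, 3, 2, 0]`) is accepted, while the unbalanced `(x` (codes `[1, 3, 0]`) is rejected. Note how reusing one `Act` instance for several table cells mirrors the `aReduce*`/`aShift*` sharing in the generated code.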
The other three Json.Dict.*.gen.cs_ files are the LR(0), SLR(1) and LR(1) parsing state machines respectively; they follow the same pattern, so I will not go over them again.
This Dictionary-based version is the earliest and most intuitive implementation, but it has since been superseded by a more efficient one. The folder is now kept only for study and reference, which is why I changed the files' extension from cs to cs_ so that they are not compiled.
int[]+LRParseAction[]
Json.Table.LALR(1).gen.cs_ is the same LALR(1) parsing state machine, except that each syntax state is now an object holding an `int[]`
and an `LRParseAction[]`
. Each pair `int[t]`
/`LRParseAction[t]`
replaces one key/value entry of the `Dictionary<int, LRParseAction>`
above, which reduces memory usage and slightly improves runtime performance.
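Because the `nodes` array of each state is sorted by symbol code (as the ascending `(1)`, `(3)`, `(12)`… comments in the listing below show), a lookup can be a binary search over a tiny array instead of a hash probe. A minimal sketch of that state shape, with `TAction` standing in for the generated `LRParseAction`:

```csharp
using System;

// Sketch of the Table variant: two parallel arrays per state instead of a
// Dictionary. nodes is sorted by symbol code, so Array.BinarySearch finds
// the matching index, and actions[index] is the corresponding action.
sealed class TableState<TAction>
{
    public int[] nodes = Array.Empty<int>();
    public TAction[] actions = Array.Empty<TAction>();

    public bool TryGetAction(int symbol, out TAction action)
    {
        int index = Array.BinarySearch(nodes, symbol);
        if (index >= 0) { action = actions[index]; return true; }
        action = default!;
        return false;
    }
}
```

For states with a handful of entries (at most 13 here), two small arrays are more compact and cache-friendly than a `Dictionary`, which is where the memory and speed gains come from.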
Json.Table.LALR(1).gen.cs_
using System;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
private static LRParseState[] InitializeSyntaxStates() {
const int syntaxStateCount = 29;
var states = new LRParseState[syntaxStateCount];
// 102 actions
// conflicts(0)=not solved(0)+solved(0)(0 warnings)
for (var i = 0; i < syntaxStateCount; i++) { states[i] = new(); }
#region re-used actions
LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// referred 4 times
LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// referred 4 times
LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// referred 2 times
LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// referred 2 times
LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// referred 3 times
LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// referred 3 times
LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// referred 3 times
LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// referred 3 times
LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// referred 3 times
LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// referred 3 times
LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// referred 3 times
LRParseAction aReduce2 = new(regulations[2]);// referred 4 times
LRParseAction aReduce7 = new(regulations[7]);// referred 2 times
LRParseAction aReduce4 = new(regulations[4]);// referred 4 times
LRParseAction aReduce9 = new(regulations[9]);// referred 2 times
LRParseAction aReduce11 = new(regulations[11]);// referred 2 times
LRParseAction aReduce12 = new(regulations[12]);// referred 3 times
LRParseAction aReduce13 = new(regulations[13]);// referred 3 times
LRParseAction aReduce14 = new(regulations[14]);// referred 3 times
LRParseAction aReduce15 = new(regulations[15]);// referred 3 times
LRParseAction aReduce16 = new(regulations[16]);// referred 3 times
LRParseAction aReduce17 = new(regulations[17]);// referred 3 times
LRParseAction aReduce18 = new(regulations[18]);// referred 3 times
LRParseAction aReduce3 = new(regulations[3]);// referred 4 times
LRParseAction aReduce5 = new(regulations[5]);// referred 4 times
LRParseAction aReduce6 = new(regulations[6]);// referred 2 times
LRParseAction aReduce10 = new(regulations[10]);// referred 2 times
LRParseAction aReduce8 = new(regulations[8]);// referred 2 times
#endregion re-used actions
// 102 actions
// conflicts(0)=not solved(0)+solved(0)(0 warnings)
#region init actions of syntax states
// syntaxStates[0]:
// [-1] Json' : ⏳ Json ;☕ '¥'
// [0] Json : ⏳ Object ;☕ '¥'
// [1] Json : ⏳ Array ;☕ '¥'
// [2] Object : ⏳ '{' '}' ;☕ '¥'
// [3] Object : ⏳ '{' Members '}' ;☕ '¥'
// [4] Array : ⏳ '[' ']' ;☕ '¥'
// [5] Array : ⏳ '[' Elements ']' ;☕ '¥'
states[0].nodes = new int[] {
/*0*/st.@LeftBrace符, // (1) -> aShift4
/*1*/st.@LeftBracket符, // (3) -> aShift5
/*2*/st.Json枝, // (12) -> new(LRParseAction.Kind.Goto, states[1])
/*3*/st.Object枝, // (13) -> new(LRParseAction.Kind.Goto, states[2])
/*4*/st.Array枝, // (14) -> new(LRParseAction.Kind.Goto, states[3])
};
states[0].actions = new LRParseAction[] {
/*0*//* st.@LeftBrace符(1), */aShift4,
/*1*//* st.@LeftBracket符(3), */aShift5,
/*2*//* st.Json枝(12), */new(LRParseAction.Kind.Goto, states[1]),
/*3*//* st.Object枝(13), */new(LRParseAction.Kind.Goto, states[2]),
/*4*//* st.Array枝(14), */new(LRParseAction.Kind.Goto, states[3]),
};
// syntaxStates[1]:
// [-1] Json' : Json ⏳ ;☕ '¥'
states[1].nodes = new int[] {
/*5*/st.@终, // (0) -> LRParseAction.accept
};
states[1].actions = new LRParseAction[] {
/*5*//* st.@终(0), */LRParseAction.accept,
};
// syntaxStates[2]:
// [0] Json : Object ⏳ ;☕ '¥'
states[2].nodes = new int[] {
/*6*/st.@终, // (0) -> new(regulations[0])
};
states[2].actions = new LRParseAction[] {
/*6*//* st.@终(0), */new(regulations[0]),
};
// syntaxStates[3]:
// [1] Json : Array ⏳ ;☕ '¥'
states[3].nodes = new int[] {
/*7*/st.@终, // (0) -> new(regulations[1])
};
states[3].actions = new LRParseAction[] {
/*7*//* st.@终(0), */new(regulations[1]),
};
// syntaxStates[4]:
// [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥'
// [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥'
// [6] Members : ⏳ Members ',' Member ;☕ ',' '}'
// [7] Members : ⏳ Member ;☕ ',' '}'
// [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'
states[4].nodes = new int[] {
/*8*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[6])
/*9*/st.@string, // (6) -> aShift9
/*10*/st.Members枝, // (15) -> new(LRParseAction.Kind.Goto, states[7])
/*11*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[8])
};
states[4].actions = new LRParseAction[] {
/*8*//* st.@RightBrace符(2), */new(LRParseAction.Kind.Shift, states[6]),
/*9*//* st.@string(6), */aShift9,
/*10*//* st.Members枝(15), */new(LRParseAction.Kind.Goto, states[7]),
/*11*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[8]),
};
// syntaxStates[5]:
// [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥'
// [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥'
// [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']'
// [9] Elements : ⏳ Element ;☕ ',' ']'
// [11] Element : ⏳ Value ;☕ ',' ']'
// [12] Value : ⏳ 'null' ;☕ ',' ']'
// [13] Value : ⏳ 'true' ;☕ ',' ']'
// [14] Value : ⏳ 'false' ;☕ ',' ']'
// [15] Value : ⏳ 'number' ;☕ ',' ']'
// [16] Value : ⏳ 'string' ;☕ ',' ']'
// [17] Value : ⏳ Object ;☕ ',' ']'
// [18] Value : ⏳ Array ;☕ ',' ']'
// [2] Object : ⏳ '{' '}' ;☕ ',' ']'
// [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'
// [4] Array : ⏳ '[' ']' ;☕ ',' ']'
// [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'
states[5].nodes = new int[] {
/*12*/st.@LeftBrace符, // (1) -> aShift4
/*13*/st.@LeftBracket符, // (3) -> aShift5
/*14*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[10])
/*15*/st.@string, // (6) -> aShift18
/*16*/st.@null, // (8) -> aShift14
/*17*/st.@true, // (9) -> aShift15
/*18*/st.@false, // (10) -> aShift16
/*19*/st.@number, // (11) -> aShift17
/*20*/st.Object枝, // (13) -> aGoto19
/*21*/st.Array枝, // (14) -> aGoto20
/*22*/st.Elements枝, // (16) -> new(LRParseAction.Kind.Goto, states[11])
/*23*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[12])
/*24*/st.Value枝, // (19) -> aGoto13
};
states[5].actions = new LRParseAction[] {
/*12*//* st.@LeftBrace符(1), */aShift4,
/*13*//* st.@LeftBracket符(3), */aShift5,
/*14*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[10]),
/*15*//* st.@string(6), */aShift18,
/*16*//* st.@null(8), */aShift14,
/*17*//* st.@true(9), */aShift15,
/*18*//* st.@false(10), */aShift16,
/*19*//* st.@number(11), */aShift17,
/*20*//* st.Object枝(13), */aGoto19,
/*21*//* st.Array枝(14), */aGoto20,
/*22*//* st.Elements枝(16), */new(LRParseAction.Kind.Goto, states[11]),
/*23*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[12]),
/*24*//* st.Value枝(19), */aGoto13,
};
// syntaxStates[6]:
// [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥'
states[6].nodes = new int[] {
/*25*/st.@终, // (0) -> aReduce2
/*26*/st.@RightBrace符, // (2) -> aReduce2
/*27*/st.@RightBracket符, // (4) -> aReduce2
/*28*/st.@Comma符, // (5) -> aReduce2
};
states[6].actions = new LRParseAction[] {
/*25*//* st.@终(0), */aReduce2,
/*26*//* st.@RightBrace符(2), */aReduce2,
/*27*//* st.@RightBracket符(4), */aReduce2,
/*28*//* st.@Comma符(5), */aReduce2,
};
// syntaxStates[7]:
// [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥'
// [6] Members : Members ⏳ ',' Member ;☕ ',' '}'
states[7].nodes = new int[] {
/*29*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[21])
/*30*/st.@Comma符, // (5) -> new(LRParseAction.Kind.Shift, states[22])
};
states[7].actions = new LRParseAction[] {
/*29*//* st.@RightBrace符(2), */new(LRParseAction.Kind.Shift, states[21]),
/*30*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[22]),
};
// syntaxStates[8]:
// [7] Members : Member ⏳ ;☕ ',' '}'
states[8].nodes = new int[] {
/*31*/st.@RightBrace符, // (2) -> aReduce7
/*32*/st.@Comma符, // (5) -> aReduce7
};
states[8].actions = new LRParseAction[] {
/*31*//* st.@RightBrace符(2), */aReduce7,
/*32*//* st.@Comma符(5), */aReduce7,
};
// syntaxStates[9]:
// [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}'
states[9].nodes = new int[] {
/*33*/st.@Colon符, // (7) -> new(LRParseAction.Kind.Shift, states[23])
};
states[9].actions = new LRParseAction[] {
/*33*//* st.@Colon符(7), */new(LRParseAction.Kind.Shift, states[23]),
};
// syntaxStates[10]:
// [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥'
states[10].nodes = new int[] {
/*34*/st.@终, // (0) -> aReduce4
/*35*/st.@RightBrace符, // (2) -> aReduce4
/*36*/st.@RightBracket符, // (4) -> aReduce4
/*37*/st.@Comma符, // (5) -> aReduce4
};
states[10].actions = new LRParseAction[] {
/*34*//* st.@终(0), */aReduce4,
/*35*//* st.@RightBrace符(2), */aReduce4,
/*36*//* st.@RightBracket符(4), */aReduce4,
/*37*//* st.@Comma符(5), */aReduce4,
};
// syntaxStates[11]:
// [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥'
// [8] Elements : Elements ⏳ ',' Element ;☕ ',' ']'
states[11].nodes = new int[] {
/*38*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[24])
/*39*/st.@Comma符, // (5) -> new(LRParseAction.Kind.Shift, states[25])
};
states[11].actions = new LRParseAction[] {
/*38*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[24]),
/*39*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[25]),
};
// syntaxStates[12]:
// [9] Elements : Element ⏳ ;☕ ',' ']'
states[12].nodes = new int[] {
/*40*/st.@RightBracket符, // (4) -> aReduce9
/*41*/st.@Comma符, // (5) -> aReduce9
};
states[12].actions = new LRParseAction[] {
/*40*//* st.@RightBracket符(4), */aReduce9,
/*41*//* st.@Comma符(5), */aReduce9,
};
// syntaxStates[13]:
// [11] Element : Value ⏳ ;☕ ',' ']'
states[13].nodes = new int[] {
/*42*/st.@RightBracket符, // (4) -> aReduce11
/*43*/st.@Comma符, // (5) -> aReduce11
};
states[13].actions = new LRParseAction[] {
/*42*//* st.@RightBracket符(4), */aReduce11,
/*43*//* st.@Comma符(5), */aReduce11,
};
// syntaxStates[14]:
// [12] Value : 'null' ⏳ ;☕ ',' ']' '}'
states[14].nodes = new int[] {
/*44*/st.@RightBrace符, // (2) -> aReduce12
/*45*/st.@RightBracket符, // (4) -> aReduce12
/*46*/st.@Comma符, // (5) -> aReduce12
};
states[14].actions = new LRParseAction[] {
/*44*//* st.@RightBrace符(2), */aReduce12,
/*45*//* st.@RightBracket符(4), */aReduce12,
/*46*//* st.@Comma符(5), */aReduce12,
};
// syntaxStates[15]:
// [13] Value : 'true' ⏳ ;☕ ',' ']' '}'
states[15].nodes = new int[] {
/*47*/st.@RightBrace符, // (2) -> aReduce13
/*48*/st.@RightBracket符, // (4) -> aReduce13
/*49*/st.@Comma符, // (5) -> aReduce13
};
states[15].actions = new LRParseAction[] {
/*47*//* st.@RightBrace符(2), */aReduce13,
/*48*//* st.@RightBracket符(4), */aReduce13,
/*49*//* st.@Comma符(5), */aReduce13,
};
// syntaxStates[16]:
// [14] Value : 'false' ⏳ ;☕ ',' ']' '}'
states[16].nodes = new int[] {
/*50*/st.@RightBrace符, // (2) -> aReduce14
/*51*/st.@RightBracket符, // (4) -> aReduce14
/*52*/st.@Comma符, // (5) -> aReduce14
};
states[16].actions = new LRParseAction[] {
/*50*//* st.@RightBrace符(2), */aReduce14,
/*51*//* st.@RightBracket符(4), */aReduce14,
/*52*//* st.@Comma符(5), */aReduce14,
};
// syntaxStates[17]:
// [15] Value : 'number' ⏳ ;☕ ',' ']' '}'
states[17].nodes = new int[] {
/*53*/st.@RightBrace符, // (2) -> aReduce15
/*54*/st.@RightBracket符, // (4) -> aReduce15
/*55*/st.@Comma符, // (5) -> aReduce15
};
states[17].actions = new LRParseAction[] {
/*53*//* st.@RightBrace符(2), */aReduce15,
/*54*//* st.@RightBracket符(4), */aReduce15,
/*55*//* st.@Comma符(5), */aReduce15,
};
// syntaxStates[18]:
// [16] Value : 'string' ⏳ ;☕ ',' ']' '}'
states[18].nodes = new int[] {
/*56*/st.@RightBrace符, // (2) -> aReduce16
/*57*/st.@RightBracket符, // (4) -> aReduce16
/*58*/st.@Comma符, // (5) -> aReduce16
};
states[18].actions = new LRParseAction[] {
/*56*//* st.@RightBrace符(2), */aReduce16,
/*57*//* st.@RightBracket符(4), */aReduce16,
/*58*//* st.@Comma符(5), */aReduce16,
};
// syntaxStates[19]:
// [17] Value : Object ⏳ ;☕ ',' ']' '}'
states[19].nodes = new int[] {
/*59*/st.@RightBrace符, // (2) -> aReduce17
/*60*/st.@RightBracket符, // (4) -> aReduce17
/*61*/st.@Comma符, // (5) -> aReduce17
};
states[19].actions = new LRParseAction[] {
/*59*//* st.@RightBrace符(2), */aReduce17,
/*60*//* st.@RightBracket符(4), */aReduce17,
/*61*//* st.@Comma符(5), */aReduce17,
};
// syntaxStates[20]:
// [18] Value : Array ⏳ ;☕ ',' ']' '}'
states[20].nodes = new int[] {
/*62*/st.@RightBrace符, // (2) -> aReduce18
/*63*/st.@RightBracket符, // (4) -> aReduce18
/*64*/st.@Comma符, // (5) -> aReduce18
};
states[20].actions = new LRParseAction[] {
/*62*//* st.@RightBrace符(2), */aReduce18,
/*63*//* st.@RightBracket符(4), */aReduce18,
/*64*//* st.@Comma符(5), */aReduce18,
};
// syntaxStates[21]:
// [3] Object : '{' Members '}' ⏳ ;☕ ',' ']' '}' '¥'
states[21].nodes = new int[] {
/*65*/st.@终, // (0) -> aReduce3
/*66*/st.@RightBrace符, // (2) -> aReduce3
/*67*/st.@RightBracket符, // (4) -> aReduce3
/*68*/st.@Comma符, // (5) -> aReduce3
};
states[21].actions = new LRParseAction[] {
/*65*//* st.@终(0), */aReduce3,
/*66*//* st.@RightBrace符(2), */aReduce3,
/*67*//* st.@RightBracket符(4), */aReduce3,
/*68*//* st.@Comma符(5), */aReduce3,
};
// syntaxStates[22]:
// [6] Members : Members ',' ⏳ Member ;☕ ',' '}'
// [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'
states[22].nodes = new int[] {
/*69*/st.@string, // (6) -> aShift9
/*70*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[26])
};
states[22].actions = new LRParseAction[] {
/*69*//* st.@string(6), */aShift9,
/*70*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[26]),
};
// syntaxStates[23]:
// [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}'
// [12] Value : ⏳ 'null' ;☕ ',' '}'
// [13] Value : ⏳ 'true' ;☕ ',' '}'
// [14] Value : ⏳ 'false' ;☕ ',' '}'
// [15] Value : ⏳ 'number' ;☕ ',' '}'
// [16] Value : ⏳ 'string' ;☕ ',' '}'
// [17] Value : ⏳ Object ;☕ ',' '}'
// [18] Value : ⏳ Array ;☕ ',' '}'
// [2] Object : ⏳ '{' '}' ;☕ ',' '}'
// [3] Object : ⏳ '{' Members '}' ;☕ ',' '}'
// [4] Array : ⏳ '[' ']' ;☕ ',' '}'
// [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}'
states[23].nodes = new int[] {
/*71*/st.@LeftBrace符, // (1) -> aShift4
/*72*/st.@LeftBracket符, // (3) -> aShift5
/*73*/st.@string, // (6) -> aShift18
/*74*/st.@null, // (8) -> aShift14
/*75*/st.@true, // (9) -> aShift15
/*76*/st.@false, // (10) -> aShift16
/*77*/st.@number, // (11) -> aShift17
/*78*/st.Object枝, // (13) -> aGoto19
/*79*/st.Array枝, // (14) -> aGoto20
/*80*/st.Value枝, // (19) -> new(LRParseAction.Kind.Goto, states[27])
};
states[23].actions = new LRParseAction[] {
/*71*//* st.@LeftBrace符(1), */aShift4,
/*72*//* st.@LeftBracket符(3), */aShift5,
/*73*//* st.@string(6), */aShift18,
/*74*//* st.@null(8), */aShift14,
/*75*//* st.@true(9), */aShift15,
/*76*//* st.@false(10), */aShift16,
/*77*//* st.@number(11), */aShift17,
/*78*//* st.Object枝(13), */aGoto19,
/*79*//* st.Array枝(14), */aGoto20,
/*80*//* st.Value枝(19), */new(LRParseAction.Kind.Goto, states[27]),
};
// syntaxStates[24]:
// [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥'
states[24].nodes = new int[] {
/*81*/st.@终, // (0) -> aReduce5
/*82*/st.@RightBrace符, // (2) -> aReduce5
/*83*/st.@RightBracket符, // (4) -> aReduce5
/*84*/st.@Comma符, // (5) -> aReduce5
};
states[24].actions = new LRParseAction[] {
/*81*//* st.@终(0), */aReduce5,
/*82*//* st.@RightBrace符(2), */aReduce5,
/*83*//* st.@RightBracket符(4), */aReduce5,
/*84*//* st.@Comma符(5), */aReduce5,
};
// syntaxStates[25]:
// [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']'
// [11] Element : ⏳ Value ;☕ ',' ']'
// [12] Value : ⏳ 'null' ;☕ ',' ']'
// [13] Value : ⏳ 'true' ;☕ ',' ']'
// [14] Value : ⏳ 'false' ;☕ ',' ']'
// [15] Value : ⏳ 'number' ;☕ ',' ']'
// [16] Value : ⏳ 'string' ;☕ ',' ']'
// [17] Value : ⏳ Object ;☕ ',' ']'
// [18] Value : ⏳ Array ;☕ ',' ']'
// [2] Object : ⏳ '{' '}' ;☕ ',' ']'
// [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'
// [4] Array : ⏳ '[' ']' ;☕ ',' ']'
// [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'
states[25].nodes = new int[] {
/*85*/st.@LeftBrace符, // (1) -> aShift4
/*86*/st.@LeftBracket符, // (3) -> aShift5
/*87*/st.@string, // (6) -> aShift18
/*88*/st.@null, // (8) -> aShift14
/*89*/st.@true, // (9) -> aShift15
/*90*/st.@false, // (10) -> aShift16
/*91*/st.@number, // (11) -> aShift17
/*92*/st.Object枝, // (13) -> aGoto19
/*93*/st.Array枝, // (14) -> aGoto20
/*94*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[28])
/*95*/st.Value枝, // (19) -> aGoto13
};
states[25].actions = new LRParseAction[] {
/*85*//* st.@LeftBrace符(1), */aShift4,
/*86*//* st.@LeftBracket符(3), */aShift5,
/*87*//* st.@string(6), */aShift18,
/*88*//* st.@null(8), */aShift14,
/*89*//* st.@true(9), */aShift15,
/*90*//* st.@false(10), */aShift16,
/*91*//* st.@number(11), */aShift17,
/*92*//* st.Object枝(13), */aGoto19,
/*93*//* st.Array枝(14), */aGoto20,
/*94*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[28]),
/*95*//* st.Value枝(19), */aGoto13,
};
// syntaxStates[26]:
// [6] Members : Members ',' Member ⏳ ;☕ ',' '}'
states[26].nodes = new int[] {
/*96*/st.@RightBrace符, // (2) -> aReduce6
/*97*/st.@Comma符, // (5) -> aReduce6
};
states[26].actions = new LRParseAction[] {
/*96*//* st.@RightBrace符(2), */aReduce6,
/*97*//* st.@Comma符(5), */aReduce6,
};
// syntaxStates[27]:
// [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}'
states[27].nodes = new int[] {
/*98*/st.@RightBrace符, // (2) -> aReduce10
/*99*/st.@Comma符, // (5) -> aReduce10
};
states[27].actions = new LRParseAction[] {
/*98*//* st.@RightBrace符(2), */aReduce10,
/*99*//* st.@Comma符(5), */aReduce10,
};
// syntaxStates[28]:
// [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']'
states[28].nodes = new int[] {
/*100*/st.@RightBracket符, // (4) -> aReduce8
/*101*/st.@Comma符, // (5) -> aReduce8
};
states[28].actions = new LRParseAction[] {
/*100*//* st.@RightBracket符(4), */aReduce8,
/*101*//* st.@Comma符(5), */aReduce8,
};
#endregion init actions of syntax states
return states;
}
}
}
The other 4 files, Json.Dict.*.gen.cs_
These are the syntax-parsing state machines for LL(1), LR(0), SLR(1), and LR(1) respectively; they need no further discussion here.
This was the second implementation, and it has since been superseded by a more efficient approach. The folder is now kept for study and reference only, which is why I changed the C# file extension from cs to cs_, so that these files are not compiled.
Json.Table.*.gen.bin
As with the lexer, this file stores the array-form (int[]+LRParseAction[]
) syntax-parse table in a binary file. When the Json parser is loaded, reading this file yields the same array-form (int[]+LRParseAction[]
) parse table. This removes the need to hard-code the entire parse table into the source, further reducing memory usage.
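The save/load round-trip for an array-form table can be sketched as follows. This is a minimal illustration with a hypothetical little-endian layout; it is not bitParser's actual .bin format, which is internal to the generator:

```python
import io
import struct

def save_table(stream, nodes, actions):
    # Header: entry count, then each (node, action) pair as two little-endian int32s.
    stream.write(struct.pack("<i", len(nodes)))
    for node, action in zip(nodes, actions):
        stream.write(struct.pack("<ii", node, action))

def load_table(stream):
    # Read the count, then reconstruct the two parallel arrays.
    (count,) = struct.unpack("<i", stream.read(4))
    nodes, actions = [], []
    for _ in range(count):
        node, action = struct.unpack("<ii", stream.read(8))
        nodes.append(node)
        actions.append(action)
    return nodes, actions

buf = io.BytesIO()
save_table(buf, [1, 3, 12], [4, 5, 1])  # e.g. states[0]: Shift[4] Shift[5] Goto[1]
buf.seek(0)
print(load_table(buf))  # ([1, 3, 12], [4, 5, 1])
```

Loading raw integer arrays like this is cheap at startup and avoids baking thousands of object initializers into the compiled assembly.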
For easier debugging and reference, I also generate a corresponding text format, e.g. the LALR(1) parse table:
Json.Table.LALR(1).gen.txt
conflicts(0)=not solved(0)+solved(0)(0 warnings)
29 states.
28 re-used actions
[0]:Shift[4] [1]:Shift[5] [2]:Shift[9] [3]:Goto[13]
[4]:Shift[14] [5]:Shift[15] [6]:Shift[16] [7]:Shift[17] [8]:Shift[18]
[9]:Goto[19] [10]:Goto[20] [11]:Reduce[2] [12]:Reduce[7] [13]:Reduce[4]
[14]:Reduce[9] [15]:Reduce[11] [16]:Reduce[12] [17]:Reduce[13] [18]:Reduce[14]
[19]:Reduce[15] [20]:Reduce[16] [21]:Reduce[17] [22]:Reduce[18] [23]:Reduce[3]
[24]:Reduce[5] [25]:Reduce[6] [26]:Reduce[10] [27]:Reduce[8]
states[0].nodes[5]:
1 3 12 13 14
states[0].actions[5]:
-4(0)Shift[4] -4(1)Shift[5] Goto[1] Goto[2] Goto[3]
states[1].nodes[1]:
0
states[1].actions[1]:
Accept[0]
states[2].nodes[1]:
0
states[2].actions[1]:
Reduce[0]
states[3].nodes[1]:
0
states[3].actions[1]:
Reduce[1]
states[4].nodes[4]:
2 6 15 17
states[4].actions[4]:
Shift[6] -2(2)Shift[9] Goto[7] Goto[8]
states[5].nodes[13]:
1 3 4 6 8 9 10 11 13 14 16 18 19
states[5].actions[13]:
-4(0)Shift[4] -4(1)Shift[5] Shift[10] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[11] Goto[12] -2(3)Goto[13]
states[6].nodes[4]:
0 2 4 5
states[6].actions[4]:
-4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2]
states[7].nodes[2]:
2 5
states[7].actions[2]:
Shift[21] Shift[22]
states[8].nodes[2]:
2 5
states[8].actions[2]:
-2(12)Reduce[7] -2(12)Reduce[7]
states[9].nodes[1]:
7
states[9].actions[1]:
Shift[23]
states[10].nodes[4]:
0 2 4 5
states[10].actions[4]:
-4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4]
states[11].nodes[2]:
4 5
states[11].actions[2]:
Shift[24] Shift[25]
states[12].nodes[2]:
4 5
states[12].actions[2]:
-2(14)Reduce[9] -2(14)Reduce[9]
states[13].nodes[2]:
4 5
states[13].actions[2]:
-2(15)Reduce[11] -2(15)Reduce[11]
states[14].nodes[3]:
2 4 5
states[14].actions[3]:
-3(16)Reduce[12] -3(16)Reduce[12] -3(16)Reduce[12]
states[15].nodes[3]:
2 4 5
states[15].actions[3]:
-3(17)Reduce[13] -3(17)Reduce[13] -3(17)Reduce[13]
states[16].nodes[3]:
2 4 5
states[16].actions[3]:
-3(18)Reduce[14] -3(18)Reduce[14] -3(18)Reduce[14]
states[17].nodes[3]:
2 4 5
states[17].actions[3]:
-3(19)Reduce[15] -3(19)Reduce[15] -3(19)Reduce[15]
states[18].nodes[3]:
2 4 5
states[18].actions[3]:
-3(20)Reduce[16] -3(20)Reduce[16] -3(20)Reduce[16]
states[19].nodes[3]:
2 4 5
states[19].actions[3]:
-3(21)Reduce[17] -3(21)Reduce[17] -3(21)Reduce[17]
states[20].nodes[3]:
2 4 5
states[20].actions[3]:
-3(22)Reduce[18] -3(22)Reduce[18] -3(22)Reduce[18]
states[21].nodes[4]:
0 2 4 5
states[21].actions[4]:
-4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3]
states[22].nodes[2]:
6 17
states[22].actions[2]:
-2(2)Shift[9] Goto[26]
states[23].nodes[10]:
1 3 6 8 9 10 11 13 14 19
states[23].actions[10]:
-4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[27]
states[24].nodes[4]:
0 2 4 5
states[24].actions[4]:
-4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5]
states[25].nodes[11]:
1 3 6 8 9 10 11 13 14 18 19
states[25].actions[11]:
-4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[28] -2(3)Goto[13]
states[26].nodes[2]:
2 5
states[26].actions[2]:
-2(25)Reduce[6] -2(25)Reduce[6]
states[27].nodes[2]:
2 5
states[27].actions[2]:
-2(26)Reduce[10] -2(26)Reduce[10]
states[28].nodes[2]:
4 5
states[28].actions[2]:
-2(27)Reduce[8] -2(27)Reduce[8]
This is the third implementation, and it is the one currently in use. For convenience of the load path, I moved it from the Json.gen\SyntaxParser
folder up into the Json.gen
folder.
Json.Regulations.gen.cs_
This is an array recording every production rule of the Json grammar:
Json.Regulations.gen.cs_
using System;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
public static readonly IReadOnlyList<Regulation> regulations = new Regulation[] {
// [0] Json = Object ;
new(0, st.Json枝, st.Object枝),
// [1] Json = Array ;
new(1, st.Json枝, st.Array枝),
// [2] Object = '{' '}' ;
new(2, st.Object枝, st.@LeftBrace符, st.@RightBrace符),
// [3] Object = '{' Members '}' ;
new(3, st.Object枝, st.@LeftBrace符, st.Members枝, st.@RightBrace符),
// [4] Array = '[' ']' ;
new(4, st.Array枝, st.@LeftBracket符, st.@RightBracket符),
// [5] Array = '[' Elements ']' ;
new(5, st.Array枝, st.@LeftBracket符, st.Elements枝, st.@RightBracket符),
// [6] Members = Members ',' Member ;
new(6, st.Members枝, st.Members枝, st.@Comma符, st.Member枝),
// [7] Members = Member ;
new(7, st.Members枝, st.Member枝),
// [8] Elements = Elements ',' Element ;
new(8, st.Elements枝, st.Elements枝, st.@Comma符, st.Element枝),
// [9] Elements = Element ;
new(9, st.Elements枝, st.Element枝),
// [10] Member = 'string' ':' Value ;
new(10, st.Member枝, st.@string, st.@Colon符, st.Value枝),
// [11] Element = Value ;
new(11, st.Element枝, st.Value枝),
// [12] Value = 'null' ;
new(12, st.Value枝, st.@null),
// [13] Value = 'true' ;
new(13, st.Value枝, st.@true),
// [14] Value = 'false' ;
new(14, st.Value枝, st.@false),
// [15] Value = 'number' ;
new(15, st.Value枝, st.@number),
// [16] Value = 'string' ;
new(16, st.Value枝, st.@string),
// [17] Value = Object ;
new(17, st.Value枝, st.Object枝),
// [18] Value = Array ;
new(18, st.Value枝, st.Array枝),
};
}
}
To reduce memory usage, this hard-coded implementation has likewise been replaced by a binary file (Json.Regulations.gen.bin). The folder is now kept for study and reference only, which is why I changed the C# file extension from cs to cs_, so that the file is not compiled.
The text format corresponding to Json.Regulations.gen.bin
19
12 = 1 (13)
12 = 1 (14)
13 = 2 (1 | 2)
13 = 3 (1 | 15 | 2)
14 = 2 (3 | 4)
14 = 3 (3 | 16 | 4)
15 = 3 (15 | 5 | 17)
15 = 1 (17)
16 = 3 (16 | 5 | 18)
16 = 1 (18)
17 = 3 (6 | 7 | 19)
18 = 1 (19)
19 = 1 (8)
19 = 1 (9)
19 = 1 (10)
19 = 1 (11)
19 = 1 (6)
19 = 1 (13)
19 = 1 (14)
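This dump is easy to decode: the first line is the rule count, and each following line reads "lhs-symbol = rhs-length (rhs symbol indices)". For instance, "12 = 1 (13)" is rule [0] Json = Object (symbol 12 is Json, 13 is Object). A quick Python sketch decoding a few lines in this format:

```python
def parse_regulations(text):
    # First line: rule count; each further line: "lhs = length (s0 | s1 | ...)".
    lines = text.strip().splitlines()
    count = int(lines[0])
    rules = []
    for line in lines[1:]:
        lhs, rest = line.split("=")
        length, rhs = rest.split("(")
        symbols = [int(s) for s in rhs.rstrip(")").split("|")]
        assert int(length) == len(symbols)  # the stated length must match
        rules.append((int(lhs), symbols))
    assert len(rules) == count
    return rules

dump = """3
12 = 1 (13)
13 = 3 (1 | 15 | 2)
17 = 3 (6 | 7 | 19)"""
for lhs, rhs in parse_regulations(dump):
    print(lhs, "->", rhs)
```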
In summary:

The generated extractor
Extraction means visiting the nodes of the syntax tree in post-order and, on each visit, extracting semantic information.
For example, the syntax tree of { "a": 0.3, "b": true, "a": "again" }
looks like this:
R[0] Json = Object ;⛪T[0->12]
└─R[3] Object = '{' Members '}' ;⛪T[0->12]
├─T[0]='{' {
├─R[6] Members = Members ',' Member ;⛪T[1->11]
│ ├─R[6] Members = Members ',' Member ;⛪T[1->7]
│ │ ├─R[7] Members = Member ;⛪T[1->3]
│ │ │ └─R[10] Member = 'string' ':' Value ;⛪T[1->3]
│ │ │ ├─T[1]='string' "a"
│ │ │ ├─T[2]=':' :
│ │ │ └─R[15] Value = 'number' ;⛪T[3]
│ │ │ └─T[3]='number' 0.3
│ │ ├─T[4]=',' ,
│ │ └─R[10] Member = 'string' ':' Value ;⛪T[5->7]
│ │ ├─T[5]='string' "b"
│ │ ├─T[6]=':' :
│ │ └─R[13] Value = 'true' ;⛪T[7]
│ │ └─T[7]='true' true
│ ├─T[8]=',' ,
│ └─R[10] Member = 'string' ':' Value ;⛪T[9->11]
│ ├─T[9]='string' "a"
│ ├─T[10]=':' :
│ └─R[16] Value = 'string' ;⛪T[11]
│ └─T[11]='string' "again"
└─T[12]='}' }
In post-order, the extractor first visits T[0]
, T[1]
, T[2]
, and T[3]
and pushes them onto the stack; then it visits R[15] Value = 'number' ;⛪T[3]
, at which point it should do:
// [15] Value = 'number' ;
var r0 = (Token)context.rightStack.Pop();// pop T[3]
var left = new JsonValue(JsonValue.Kind.Number, r0.value);
context.rightStack.Push(left);// push the Value
Next it visits R[10] Member = 'string' ':' Value ;⛪T[1->3]
, at which point it should do:
// [10] Member = 'string' ':' Value ;
var r0 = (JsonValue)context.rightStack.Pop();// pop the Value
var r1 = (Token)context.rightStack.Pop();// pop T[2]
var r2 = (Token)context.rightStack.Pop();// pop T[1]
var left = new JsonMember(key: r2.value, value: r0);
context.rightStack.Push(left);// push the Member
Proceeding step by step, it eventually reaches the root node R[0] Json = Object ;⛪T[0->12]
, at which point it should do:
var r0 = (List<JsonMember>)context.rightStack.Pop();// pop the list of Members
var left = new Json(r0);
context.rightStack.Push(left);// push the Json
At this point the whole syntax tree has been visited, and the stack context.rightStack
contains exactly one object: the final Json
. The last step is:
// [-1] Json' = Json ;
context.result = (Json)context.rightStack.Pop();
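The stack discipline described above can be simulated in a few lines. This is an illustrative Python sketch of the same two reductions for { "a": 0.3 }; the function names are mine, not bitParser's API:

```python
# Post-order extraction with an explicit stack, mirroring context.rightStack.
stack = []

def shift(token):
    # Every shifted token is pushed as-is.
    stack.append(token)

def reduce_value_number():
    # [15] Value = 'number' ; -- pop the token, push the semantic value.
    number = stack.pop()
    stack.append(("Number", number))

def reduce_member():
    # [10] Member = 'string' ':' Value ; -- pop 3 symbols, push one Member.
    value = stack.pop()   # the Value
    colon = stack.pop()   # ':'
    key = stack.pop()     # the 'string' key
    stack.append({key: value})

shift('"a"'); shift(":"); shift("0.3")
reduce_value_number()
reduce_member()
print(stack)  # [{'"a"': ('Number', '0.3')}]
```

Each reduction pops exactly as many values as the rule's right-hand side has symbols, then pushes one value for the left-hand side, so the stack always mirrors the frontier of the tree walk.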
The complete extractor code, InitializeExtractorItems
using System;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
/// <summary>
/// <see cref="LRNode.type"/> -> <see cref="Action{LRNode, TContext{Json}}"/>
/// </summary>
private static readonly Action<LRNode, TContext<Json>>?[]
@jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];
/// <summary>
/// initialize dict for extractor.
/// </summary>
private static void InitializeExtractorItems() {
var extractorItems = @jsonExtractorItems;
#region obsolete
//extractorDict.Add(st.NotYet,
//(node, context) => {
// not needed.
//});
//extractorDict.Add(st.Error,
//(node, context) => {
// nothing to do.
//});
//extractorDict.Add(st.blockComment,
//(node, context) => {
// not needed.
//});
//extractorDict.Add(st.inlineComment,
//(node, context) => {
// not needed.
//});
#endregion obsolete
extractorItems[st.@终/*0*/] = static (node, context) => {
// [-1] Json' = Json ;
// dumped by user-defined extractor
context.result = (Json)context.rightStack.Pop();
}; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... };
const int lexiVtCount = 11;
extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 0: { // [0] Json = Object ;
// dumped by user-defined extractor
var r0 = (List<JsonMember>)context.rightStack.Pop();
var left = new Json(r0);
context.rightStack.Push(left);
}
break;
case 1: { // [1] Json = Array ;
// dumped by user-defined extractor
var r0 = (List<JsonValue>)context.rightStack.Pop();
var left = new Json(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 2: { // [2] Object = '{' '}' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new List<JsonMember>();
context.rightStack.Push(left);
}
break;
case 3: { // [3] Object = '{' Members '}' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (List<JsonMember>)context.rightStack.Pop();
var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = r1;
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 4: { // [4] Array = '[' ']' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new List<JsonValue>();
context.rightStack.Push(left);
}
break;
case 5: { // [5] Array = '[' Elements ']' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (List<JsonValue>)context.rightStack.Pop();
var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = r1;
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 6: { // [6] Members = Members ',' Member ;
// dumped by user-defined extractor
var r0 = (JsonMember)context.rightStack.Pop();
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var r2 = (List<JsonMember>)context.rightStack.Pop();
var left = r2;
left.Add(r0);
context.rightStack.Push(left);
}
break;
case 7: { // [7] Members = Member ;
// dumped by user-defined extractor
var r0 = (JsonMember)context.rightStack.Pop();
var left = new List<JsonMember>();
left.Add(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 8: { // [8] Elements = Elements ',' Element ;
// dumped by user-defined extractor
var r0 = (JsonValue)context.rightStack.Pop();
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var r2 = (List<JsonValue>)context.rightStack.Pop();
var left = r2;
left.Add(r0);
context.rightStack.Push(left);
}
break;
case 9: { // [9] Elements = Element ;
// dumped by user-defined extractor
var r0 = (JsonValue)context.rightStack.Pop();
var left = new List<JsonValue>();
left.Add(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 10: { // [10] Member = 'string' ':' Value ;
// dumped by user-defined extractor
var r0 = (JsonValue)context.rightStack.Pop();
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var r2 = (Token)context.rightStack.Pop();
var left = new JsonMember(key: r2.value, value: r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... };
/*
extractorItems[st.Element枝(18) - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 11: { // [11] Element = Value ;
// dumped by DefaultExtractor
// var r0 = (VnValue)context.rightStack.Pop();
// var left = new VnElement(r0);
// context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Element枝(18) - lexiVtCount] = (node, context) => { ... };
*/
extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 12: { // [12] Value = 'null' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();
var left = new JsonValue(JsonValue.Kind.Null, r0.value);
context.rightStack.Push(left);
}
break;
case 13: { // [13] Value = 'true' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();
var left = new JsonValue(JsonValue.Kind.True, r0.value);
context.rightStack.Push(left);
}
break;
case 14: { // [14] Value = 'false' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();
var left = new JsonValue(JsonValue.Kind.False, r0.value);
context.rightStack.Push(left);
}
break;
case 15: { // [15] Value = 'number' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();
var left = new JsonValue(JsonValue.Kind.Number, r0.value);
context.rightStack.Push(left);
}
break;
case 16: { // [16] Value = 'string' ;
// dumped by user-defined extractor
var r0 = (Token)context.rightStack.Pop();
var left = new JsonValue(JsonValue.Kind.String, r0.value);
context.rightStack.Push(left);
}
break;
case 17: { // [17] Value = Object ;
// dumped by user-defined extractor
var r0 = (List<JsonMember>)context.rightStack.Pop();
var left = new JsonValue(r0);
context.rightStack.Push(left);
}
break;
case 18: { // [18] Value = Array ;
// dumped by user-defined extractor
var r0 = (List<JsonValue>)context.rightStack.Pop();
var left = new JsonValue(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };
}
}
}
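A detail worth noting in the extractor above is why the list rules (Members, Elements) are left-recursive: with a stack-based extractor, each reduction of Members = Members ',' Member appends the new Member to the list already on the stack, building the list in source order with O(1) work per reduction. A minimal Python sketch of this pattern (illustrative names, assuming the Member value is passed in directly):

```python
stack = []

def reduce_members_single(member):
    # [7] Members = Member ; -- start a fresh list.
    stack.append([member])

def reduce_members_append(member):
    # [6] Members = Members ',' Member ; -- reuse the list built so far.
    members = stack.pop()
    members.append(member)   # no copying, just append
    stack.append(members)

reduce_members_single("a")
reduce_members_append("b")
reduce_members_append("c")
result = stack.pop()
print(result)  # ['a', 'b', 'c']
```

A right-recursive formulation would instead finish the innermost Member first and require prepending (or a final reverse) to restore source order.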
Different applications need different semantic information, so the one-click-generated extractor code does not look like the above; instead, it merely flattens the syntax tree while preserving as much source information as possible, as shown below:
The one-click-generated extractor code
using System;
using bitzhuwei.Compiler;
namespace bitzhuwei.JsonFormat {
partial class CompilerJson {
/// <summary>
/// <see cref="LRNode.type"/> -> <see cref="Action{LRNode, TContext{Json}}"/>
/// </summary>
private static readonly Action<LRNode, TContext<Json>>?[]
@jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];
/// <summary>
/// initialize dict for extractor.
/// </summary>
private static void InitializeExtractorItems() {
var extractorItems = @jsonExtractorItems;
#region obsolete
//extractorDict.Add(st.NotYet,
//(node, context) => {
// not needed.
//});
//extractorDict.Add(st.Error,
//(node, context) => {
// nothing to do.
//});
//extractorDict.Add(st.blockComment,
//(node, context) => {
// not needed.
//});
//extractorDict.Add(st.inlineComment,
//(node, context) => {
// not needed.
//});
#endregion obsolete
extractorItems[st.@终/*0*/] = static (node, context) => {
// [-1] Json' = Json ;
// dumped by ExternalExtractor
var @final = (VnJson)context.rightStack.Pop();
var left = new Json(@final);
context.result = left; // final step, no need to push into stack.
}; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... };
const int lexiVtCount = 11;
extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 0: { // [0] Json = Object ;
// dumped by InheritExtractor
// class VnObject : VnJson
var r0 = (VnObject)context.rightStack.Pop();
var left = r0;
context.rightStack.Push(left);
}
break;
case 1: { // [1] Json = Array ;
// dumped by InheritExtractor
// class VnArray : VnJson
var r0 = (VnArray)context.rightStack.Pop();
var left = r0;
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 2: { // [2] Object = '{' '}' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new VnObject(r1, r0);
context.rightStack.Push(left);
}
break;
case 3: { // [3] Object = '{' Members '}' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (VnMembers)context.rightStack.Pop();
var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new VnObject(r2, r1, r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 4: { // [4] Array = '[' ']' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new VnArray(r1, r0);
context.rightStack.Push(left);
}
break;
case 5: { // [5] Array = '[' Elements ']' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var r1 = (VnElements)context.rightStack.Pop();
var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new VnArray(r2, r1, r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 6: { // [6] Members = Members ',' Member ;
// dumped by ListExtractor 2
var r0 = (VnMember)context.rightStack.Pop();
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var r2 = (VnMembers)context.rightStack.Pop();
var left = r2;
left.Add(r1, r0);
context.rightStack.Push(left);
}
break;
case 7: { // [7] Members = Member ;
// dumped by ListExtractor 1
var r0 = (VnMember)context.rightStack.Pop();
var left = new VnMembers(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 8: { // [8] Elements = Elements ',' Element ;
// dumped by ListExtractor 2
var r0 = (VnElement)context.rightStack.Pop();
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var r2 = (VnElements)context.rightStack.Pop();
var left = r2;
left.Add(r1, r0);
context.rightStack.Push(left);
}
break;
case 9: { // [9] Elements = Element ;
// dumped by ListExtractor 1
var r0 = (VnElement)context.rightStack.Pop();
var left = new VnElements(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 10: { // [10] Member = 'string' ':' Value ;
// dumped by DefaultExtractor
var r0 = (VnValue)context.rightStack.Pop();
var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
var r2 = (Token)context.rightStack.Pop();
var left = new VnMember(r2, r1, r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Element枝/*18*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 11: { // [11] Element = Value ;
// dumped by DefaultExtractor
var r0 = (VnValue)context.rightStack.Pop();
var left = new VnElement(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Element枝/*18*/ - lexiVtCount] = (node, context) => { ... };
extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => {
switch (node.regulation.index) {
case 12: { // [12] Value = 'null' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new VnValue(r0);
context.rightStack.Push(left);
}
break;
case 13: { // [13] Value = 'true' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new VnValue(r0);
context.rightStack.Push(left);
}
break;
case 14: { // [14] Value = 'false' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
var left = new VnValue(r0);
context.rightStack.Push(left);
}
break;
case 15: { // [15] Value = 'number' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();
var left = new VnValue(r0);
context.rightStack.Push(left);
}
break;
case 16: { // [16] Value = 'string' ;
// dumped by DefaultExtractor
var r0 = (Token)context.rightStack.Pop();
var left = new VnValue(r0);
context.rightStack.Push(left);
}
break;
case 17: { // [17] Value = Object ;
// dumped by DefaultExtractor
var r0 = (VnObject)context.rightStack.Pop();
var left = new VnValue(r0);
context.rightStack.Push(left);
}
break;
case 18: { // [18] Value = Array ;
// dumped by DefaultExtractor
var r0 = (VnArray)context.rightStack.Pop();
var left = new VnValue(r0);
context.rightStack.Push(left);
}
break;
default: throw new NotImplementedException();
}
}; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };
}
}
}
This is the most conservative, smallest-step code: a programmer can build on it, or write their own extraction actions for each node type from scratch. Since the goal in this application is to parse Json text files as efficiently as possible, the extraction actions for each node type were written entirely by hand.
Testing
Test case 0
{}
Test case 1
[]
Test case 2
{ "a": 0.3 }
Test case 3
{
"a": 0.3,
"b": true
}
Test case 4
{
"a": 0.3,
"b": true,
"a": "again"
}
Test case 5
{
"a": 0.3,
"b": true,
"a": "again",
"array": [
1,
true,
null,
"str",
{
"t": 100,
"array2": [ false, 3.14, "tmp" ]
}
]
}
All of the above test cases are parsed correctly by the Json parser; they can also be verified at (https://jsonlint.com/).
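As an additional sanity check, the 'number' and 'string' token regexes from the grammar can be exercised directly. The patterns below copy the grammar's regexes verbatim; Python's re module happens to accept the same character-class and \u-escape syntax:

```python
import re

# Token regexes as written in the grammar section.
NUMBER = re.compile(r'[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?')
STRING = re.compile(r'"([^"\\\u0000-\u001F]|\\["\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"')

print(bool(NUMBER.fullmatch("-12.5e+3")))    # True
print(bool(NUMBER.fullmatch("01")))          # False: no leading zeros
print(bool(STRING.fullmatch(r'"a\u0041"')))  # True: \uNNNN escape
print(bool(STRING.fullmatch('"bad\\x"')))    # False: \x is not a JSON escape
```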
The code to invoke the Json parser is as follows:
var compiler = new bitzhuwei.JsonFormat.CompilerJson();
var sourceCode = File.ReadAllText("xxx.json");
var tokens = compiler.Analyze(sourceCode);
var syntaxTree = compiler.Parse(tokens);
var json = compiler.Extract(syntaxTree.root, tokens, sourceCode);
// use json ...
Article reposted from: BIT祝威