DIY PHP Interpreter Techniques: Building an Interpreter from Scratch, Refactoring the Code

extern Token g_currentToken; // current token
extern int g_nPosition;      // index of the current character
extern char g_currentChar;   // current character

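Previously, get_next_char() returned the character currently pointed to and changed the index at the same time. We found that any time we wanted to look at the current character we always had to change the index, so in some cases we had to think about rolling the index back.
For example, when integer parsing finishes, the current character already points at the next character, but calling get_next_char() while parsing the following symbol increments the index one time too many.
This situation comes up often, so a global variable now holds the current character, and the index is incremented only when it actually needs to be.
In addition, we do not want upper layers to manipulate the index directly, so the lowest-level Token module provides a function named advance() that increments the index by one and fetches the character at the new position.
It is defined as follows: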

void advance()
{
    g_nPosition++;
    // Once the index reaches the end of the string, the current character becomes '\0'
    if ((size_t)g_nPosition >= strlen(g_pszUserBuf)) {
        g_currentChar = '\0';
    } else {
        g_currentChar = g_pszUserBuf[g_nPosition];
    }
}
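To make the benefit concrete, here is a minimal sketch of scanning an integer with this "current character plus advance()" scheme. The article does not show its number-parsing routine, so `scan_integer` and `scanner_init` are hypothetical names, and the globals are local stand-ins for the article's (with `g_nPosition` typed as `size_t` for simplicity).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the article's globals (assumed types). */
static const char *g_pszUserBuf = "";
static size_t g_nPosition = 0;
static char g_currentChar = '\0';

/* Hypothetical helper: point the scanner at a new input string. */
static void scanner_init(const char *text)
{
    g_pszUserBuf = text;
    g_nPosition = 0;
    g_currentChar = text[0];
}

/* Same idea as advance() in the article: step once, then refresh g_currentChar. */
static void advance(void)
{
    g_nPosition++;
    if (g_nPosition >= strlen(g_pszUserBuf)) {
        g_currentChar = '\0';
    } else {
        g_currentChar = g_pszUserBuf[g_nPosition];
    }
}

/* Hypothetical number scanner: consumes digits only. When it returns,
   g_currentChar is the first non-digit, so no index rollback is needed. */
static int scan_integer(void)
{
    int value = 0;
    while (isdigit((unsigned char)g_currentChar)) {
        value = value * 10 + (g_currentChar - '0');
        advance();
    }
    return value;
}
```

After `scanner_init("123+4")`, a call to `scan_integer()` returns 123 and leaves `g_currentChar` on `'+'`, which the next stage can inspect without touching the index.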

Wherever the current character is needed, we therefore no longer call get_next_char() and instead read the global variable g_currentChar. For example, skip_whitespace is now defined as follows:

void skip_whitespace()
{
    while (is_space(g_currentChar)) {
        advance();
    }
}

This way, when fetching the next token, the index is incremented only when necessary.

The lex module

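Since tracking the current character is now handled by the low-level Token module, this module is mainly responsible for lexical analysis, that is, tagging each part of the input. To use the interface that the Token module now provides, get_next_token has to be modified.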

bool get_next_token()
{
    dyncstring_reset(&g_currentToken.value);
    while (g_currentChar != '\0') {
        if (is_digit(g_currentChar)) {
            g_currentToken.type = CINT;
            parser_number(&g_currentToken.value);
            return true;
        } else if (is_space(g_currentChar)) {
            skip_whitespace();
        } else {
            switch (g_currentChar) {
            case '+':
                g_currentToken.type = PLUS;
                dyncstring_catch(&g_currentToken.value, '+');
                advance();
                break;
            case '-':
                g_currentToken.type = MINUS;
                dyncstring_catch(&g_currentToken.value, '-');
                advance();
                break;
            case '*':
                g_currentToken.type = MUL;
                dyncstring_catch(&g_currentToken.value, '*');
                advance();
                break;
            case '/':
                g_currentToken.type = DIV;
                dyncstring_catch(&g_currentToken.value, '/');
                advance();
                break;
            case '(':
                g_currentToken.type = LPAREN;
                dyncstring_catch(&g_currentToken.value, '(');
                advance();
                break;
            case ')':
                g_currentToken.type = RPAREN;
                dyncstring_catch(&g_currentToken.value, ')');
                advance();
                break;
            default:
                // Unrecognized character
                return false;
            }
            return true;
        }
    }
    // The loop only exits at the end of the input, so mark the end here
    // (a case '\0' inside the switch could never be reached).
    g_currentToken.type = END_OF_FILE;
    return true;
}

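This function no longer returns the current token through an output parameter; it modifies the global variable directly.
It also no longer calls get_next_char to get the current character, and reads the global variable instead.
advance is called at the appropriate moments to move the index forward.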
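The flow described above can be tried out with a small self-contained sketch. It is an illustration under assumptions, not the article's code: a fixed-size `value` buffer stands in for the dynamic string, `lexer_init` is a hypothetical helper, and the standard `isdigit`/`isspace` replace the article's `is_digit`/`is_space`.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { CINT, PLUS, MINUS, MUL, DIV, LPAREN, RPAREN, END_OF_FILE } ETokenType;

typedef struct {
    ETokenType type;
    char value[32]; /* assumed fixed-size stand-in for the article's dynamic string */
} Token;

static const char *g_pszUserBuf = "";
static size_t g_nPosition = 0;
static char g_currentChar = '\0';
static Token g_currentToken;

/* Hypothetical helper: start lexing a new input string. */
static void lexer_init(const char *text)
{
    g_pszUserBuf = text;
    g_nPosition = 0;
    g_currentChar = text[0];
}

/* Variant of advance() that never steps past the terminating '\0'. */
static void advance(void)
{
    if (g_currentChar != '\0') {
        g_currentChar = g_pszUserBuf[++g_nPosition];
    }
}

/* Writes the next token into g_currentToken; returns false on bad input. */
static bool get_next_token(void)
{
    size_t len = 0;
    g_currentToken.value[0] = '\0';

    while (isspace((unsigned char)g_currentChar)) {
        advance();
    }
    if (g_currentChar == '\0') {
        g_currentToken.type = END_OF_FILE;
        return true;
    }
    if (isdigit((unsigned char)g_currentChar)) {
        g_currentToken.type = CINT;
        while (isdigit((unsigned char)g_currentChar) &&
               len + 1 < sizeof g_currentToken.value) {
            g_currentToken.value[len++] = g_currentChar;
            advance();
        }
        g_currentToken.value[len] = '\0';
        /* A literal longer than the buffer is rejected in this sketch. */
        return !isdigit((unsigned char)g_currentChar);
    }
    switch (g_currentChar) {
    case '+': g_currentToken.type = PLUS;   break;
    case '-': g_currentToken.type = MINUS;  break;
    case '*': g_currentToken.type = MUL;    break;
    case '/': g_currentToken.type = DIV;    break;
    case '(': g_currentToken.type = LPAREN; break;
    case ')': g_currentToken.type = RPAREN; break;
    default:  return false; /* unrecognized character */
    }
    g_currentToken.value[0] = g_currentChar;
    g_currentToken.value[1] = '\0';
    advance();
    return true;
}
```

Calling `lexer_init("12 + (3*4)")` and then `get_next_token()` repeatedly yields CINT, PLUS, LPAREN, CINT, MUL, CINT, RPAREN, and finally END_OF_FILE. Skipping whitespace before the dispatch, rather than inside a loop, is a design choice of this sketch that keeps the end-of-input case in one place.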

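In the upper layers we read g_currentToken directly to get the current token, and call the newly added eat() function at the appropriate moments to move on to the next token.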

bool eat(LPTOKEN pToken, ETokenType eType)
{
    if (pToken->type == eType) {
        get_next_token();
        return true;
    }
    return false;
}

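The function takes two parameters: the first is the current token, and the second is the type we expect the current token to have.
If the current token's type does not match the expected type, the function reports failure by returning false and leaves the token unchanged; otherwise it advances to the next token.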
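The behaviour of eat() on a match and on a mismatch can be seen in isolation with a small sketch. To keep it short, a hypothetical pre-lexed token array (`g_stream`, `g_nNext`) stands in for the real lexer, and the `Token` struct is reduced to its type field; unlike the article's version, this variant also passes on the return value of `get_next_token`.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { CINT, PLUS, LPAREN, RPAREN, END_OF_FILE } ETokenType;
typedef struct { ETokenType type; } Token, *LPTOKEN;

/* Hypothetical pre-lexed input standing in for the lexer: ( 1 + 2 ) */
static const ETokenType g_stream[] = { LPAREN, CINT, PLUS, CINT, RPAREN, END_OF_FILE };
static size_t g_nNext = 0;
static Token g_currentToken;

/* Stand-in lexer: copies the next type from g_stream into g_currentToken
   and stays on END_OF_FILE once the stream is exhausted. */
static bool get_next_token(void)
{
    g_currentToken.type = g_stream[g_nNext];
    if (g_stream[g_nNext] != END_OF_FILE) {
        g_nNext++;
    }
    return true;
}

/* Consume the current token only if it has the expected type. */
static bool eat(LPTOKEN pToken, ETokenType eType)
{
    if (pToken->type == eType) {
        return get_next_token();
    }
    return false; /* mismatch: the current token is left untouched */
}
```

After one priming call to `get_next_token()`, the current token is LPAREN: `eat(&g_currentToken, CINT)` returns false and changes nothing, while `eat(&g_currentToken, LPAREN)` returns true and moves on to the CINT token.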

The interpreter module

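This module is mainly responsible for parsing the input and carrying out the computation according to the BNF grammar given earlier.
It provides three functions: get_factor, get_term, and expr.
Their behaviour has not changed; only their implementation now relies on what the lex module provides.
The main idea is to read the global variable g_currentToken to get the current token, and to call eat() to move on to the next one.
Here we take get_factor() as an example: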

int get_factor(bool* pRet)
{
    int value = 0;
    if (g_currentToken.type == CINT) {
        value = atoi(g_currentToken.value.pszBuf);
        *pRet = eat(&g_currentToken, CINT);
    } else if (g_currentToken.type == LPAREN) {
        bool bValid = eat(&g_currentToken, LPAREN);
        bool bExpr = true;
        value = expr(&bExpr);
        // '(', the sub-expression, and ')' must all be accepted
        *pRet = bValid && bExpr && eat(&g_currentToken, RPAREN);
    } else {
        // Neither an integer nor '('
        *pRet = false;
    }
    return value;
}

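As analysed earlier, this function is responsible for reading an integer or computing the value of a parenthesized sub-expression.
After an integer or a parenthesized sub-expression has been parsed, eat must be called to skip over the corresponding tokens.
The only difference is that once a parenthesis is recognized, both the left and the right parenthesis have to be skipped.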
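The article does not list get_term and expr, so the following is only a sketch of how the three functions might fit together on top of the current-token/eat() pattern. It assumes the usual grammar (expr: term ((+|-) term)*, term: factor ((*|/) factor)*, factor: INTEGER | ( expr )). The compact lexer, the one-argument `eat`, the `INVALID` token type, and the `evaluate` entry point are all assumptions made for illustration.

```c
#include <ctype.h>
#include <stdbool.h>

typedef enum { CINT, PLUS, MINUS, MUL, DIV, LPAREN, RPAREN, END_OF_FILE, INVALID } ETokenType;
typedef struct { ETokenType type; int number; } Token;

static const char *g_pszCursor = ""; /* assumed: points at the current character */
static Token g_currentToken;

static int expr(bool *pRet);

/* Compact stand-in lexer: fills g_currentToken from the input. */
static void get_next_token(void)
{
    while (isspace((unsigned char)*g_pszCursor)) g_pszCursor++;
    g_currentToken.number = 0;
    if (isdigit((unsigned char)*g_pszCursor)) {
        g_currentToken.type = CINT;
        while (isdigit((unsigned char)*g_pszCursor)) {
            g_currentToken.number = g_currentToken.number * 10 + (*g_pszCursor - '0');
            g_pszCursor++;
        }
        return;
    }
    switch (*g_pszCursor) {
    case '\0': g_currentToken.type = END_OF_FILE; return; /* stay on the terminator */
    case '+': g_currentToken.type = PLUS;    break;
    case '-': g_currentToken.type = MINUS;   break;
    case '*': g_currentToken.type = MUL;     break;
    case '/': g_currentToken.type = DIV;     break;
    case '(': g_currentToken.type = LPAREN;  break;
    case ')': g_currentToken.type = RPAREN;  break;
    default:  g_currentToken.type = INVALID; break;
    }
    g_pszCursor++;
}

/* Advance only if the current token has the expected type. */
static bool eat(ETokenType eType)
{
    if (g_currentToken.type != eType) return false;
    get_next_token();
    return true;
}

/* factor: INTEGER | '(' expr ')' */
static int get_factor(bool *pRet)
{
    int value = 0;
    if (g_currentToken.type == CINT) {
        value = g_currentToken.number;
        *pRet = eat(CINT);
    } else if (g_currentToken.type == LPAREN) {
        bool bOpen = eat(LPAREN);
        bool bInner = true;
        value = expr(&bInner);
        *pRet = bOpen && bInner && eat(RPAREN);
    } else {
        *pRet = false;
    }
    return value;
}

/* term: factor (('*' | '/') factor)* */
static int get_term(bool *pRet)
{
    int value = get_factor(pRet);
    while (*pRet && (g_currentToken.type == MUL || g_currentToken.type == DIV)) {
        ETokenType op = g_currentToken.type;
        eat(op);
        int rhs = get_factor(pRet);
        if (!*pRet) break;
        if (op == MUL) value *= rhs;
        else if (rhs != 0) value /= rhs;
        else *pRet = false; /* division by zero is reported as an error */
    }
    return value;
}

/* expr: term (('+' | '-') term)* */
static int expr(bool *pRet)
{
    int value = get_term(pRet);
    while (*pRet && (g_currentToken.type == PLUS || g_currentToken.type == MINUS)) {
        ETokenType op = g_currentToken.type;
        eat(op);
        int rhs = get_term(pRet);
        if (!*pRet) break;
        value = (op == PLUS) ? value + rhs : value - rhs;
    }
    return value;
}

/* Hypothetical entry point: evaluates text and requires all input to be consumed. */
static int evaluate(const char *text, bool *pRet)
{
    g_pszCursor = text;
    get_next_token();
    int value = expr(pRet);
    if (g_currentToken.type != END_OF_FILE) *pRet = false;
    return value;
}
```

With this sketch, `evaluate("2 + 3 * (4 - 1)", &ok)` returns 11 with `ok` set to true, while an input such as `"(1 + 2"` leaves `ok` false because the closing parenthesis cannot be eaten. None of the parsing functions touches the character index; only the lexer does.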

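This completes the layering: each layer is responsible only for its own job.
The upper layers no longer have to think about adjusting the index, the structure is clearer, and adding features later will be easier.
The remaining functions are not listed here; interested readers can look up the code in the corresponding GitHub repository.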