
expression: number
          | expression '*' expression
          | expression '/' expression
          | expression '+' expression
          | expression '-' expression

number:  T_INTLIT

function expression() {
  Scan and check the first token is a number. Error if it's not
  Get the next token
  If we have reached the end of the input, return, i.e. base case

  Otherwise, call expression()
}

Let's simulate a run of this function on the input 2 + 3 - 5 T_EOF, where T_EOF is a token that marks the end of the input.

expression0:
  Scan in the 2, it's a number
  Get next token, +, which isn't T_EOF
  Call expression()

    expression1:
      Scan in the 3, it's a number
      Get next token, -, which isn't T_EOF
      Call expression()

        expression2:
          Scan in the 5, it's a number
          Get next token, T_EOF, so return from expression2

      return from expression1

  return from expression0
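The recursion traced above can be sketched in C as a recogniser that only checks syntax. This is a minimal sketch, not the article's code: it assumes the tokens have already been scanned into an array that ends with T_EOF, and the token names other than T_INTLIT and T_EOF, the helper is_operator() and the wrapper recognise() are hypothetical. It also adds an explicit operator check that the pseudocode leaves implicit.

```c
#include <stdio.h>

// Token kinds assumed for this sketch
enum { T_EOF, T_PLUS, T_MINUS, T_STAR, T_SLASH, T_INTLIT };

// Hypothetical helper: is this token one of the four operators?
static int is_operator(int tok) {
  return tok == T_PLUS || tok == T_MINUS || tok == T_STAR || tok == T_SLASH;
}

// Return 1 if the tokens from toks[*pos] onwards form
// "number (operator number)* T_EOF", otherwise 0.
int expression(const int *toks, int *pos) {
  // Scan and check the first token is a number
  if (toks[*pos] != T_INTLIT)
    return 0;

  // Get the next token
  (*pos)++;

  // Base case: we have reached the end of the input
  if (toks[*pos] == T_EOF)
    return 1;

  // Otherwise we expect an operator, then another expression
  if (!is_operator(toks[*pos]))
    return 0;
  (*pos)++;
  return expression(toks, pos);
}

// Hypothetical wrapper: recognise a whole T_EOF-terminated token array
int recognise(const int *toks) {
  int pos = 0;
  return expression(toks, &pos);
}
```

Calling recognise() on the tokens for 2 + 3 - 5 T_EOF follows the same three nested calls as the trace and returns 1. An input that ends right after an operator, such as 2 + T_EOF, returns 0.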


// defs.h
// AST node types
enum {
  A_ADD, A_SUBTRACT, A_MULTIPLY, A_DIVIDE, A_INTLIT
};

// Abstract Syntax Tree structure
struct ASTnode {
  int op;                       // "Operation" to be performed on this tree
  struct ASTnode *left;         // Left and right child trees
  struct ASTnode *right;
  int intvalue;                 // For A_INTLIT, the integer value
};

The code in tree.c provides the functions that build ASTs. The function mkastnode() creates a node and returns a pointer to it:

// tree.c
// Build and return a generic AST node
struct ASTnode *mkastnode(int op, struct ASTnode *left,
                          struct ASTnode *right, int intvalue) {
  struct ASTnode *n;

  // Malloc a new ASTnode
  n = (struct ASTnode *) malloc(sizeof(struct ASTnode));
  if (n == NULL) {
    fprintf(stderr, "Unable to malloc in mkastnode()\n");
    exit(1);
  }

  // Copy in the field values and return it
  n->op = op;
  n->left = left;
  n->right = right;
  n->intvalue = intvalue;
  return (n);
}

// Make an AST leaf node
struct ASTnode *mkastleaf(int op, int intvalue) {
  return (mkastnode(op, NULL, NULL, intvalue));
}

// Make a unary AST node: only one child
struct ASTnode *mkastunary(int op, struct ASTnode *left, int intvalue) {
  return (mkastnode(op, left, NULL, intvalue));
}

We will use an AST to store each expression that we recognise, so that we can later traverse it recursively to calculate the final value of the expression. We do want to deal with the precedence of the maths operators. Here is an example. Consider the expression 2 * 3 + 4 * 5. Multiplication has higher precedence than addition. Therefore, we want to bind the multiplication operands together and perform those operations before we do the addition.

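Suppose we generated an AST that looks like this:

          +
         / \
        /   \
       /     \
      *       *
     / \     / \
    2   3   4   5

Then, when traversing the tree, we would perform 2 * 3 first, then 4 * 5. Once we have these results, we can pass them up to the root of the tree to perform the addition.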
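As a rough illustration of that traversal order, the sketch below builds this tree by hand and evaluates it children-first. It repeats the node layout from defs.h so that it stands alone. The helpers build_example() and eval() are hypothetical names introduced here and are not part of the article's source files.

```c
#include <stdio.h>
#include <stdlib.h>

enum { A_ADD, A_SUBTRACT, A_MULTIPLY, A_DIVIDE, A_INTLIT };

struct ASTnode {
  int op;                       // Operation to be performed on this tree
  struct ASTnode *left;         // Left and right child trees
  struct ASTnode *right;
  int intvalue;                 // For A_INTLIT, the integer value
};

// Build and return a generic AST node
struct ASTnode *mkastnode(int op, struct ASTnode *left,
                          struct ASTnode *right, int intvalue) {
  struct ASTnode *n = malloc(sizeof(struct ASTnode));
  if (n == NULL) {
    fprintf(stderr, "Unable to malloc in mkastnode()\n");
    exit(1);
  }
  n->op = op;
  n->left = left;
  n->right = right;
  n->intvalue = intvalue;
  return n;
}

// Hypothetical helper: hand-build the tree for 2 * 3 + 4 * 5
struct ASTnode *build_example(void) {
  struct ASTnode *mul1 = mkastnode(A_MULTIPLY,
                                   mkastnode(A_INTLIT, NULL, NULL, 2),
                                   mkastnode(A_INTLIT, NULL, NULL, 3), 0);
  struct ASTnode *mul2 = mkastnode(A_MULTIPLY,
                                   mkastnode(A_INTLIT, NULL, NULL, 4),
                                   mkastnode(A_INTLIT, NULL, NULL, 5), 0);
  return mkastnode(A_ADD, mul1, mul2, 0);
}

// Hypothetical helper: evaluate both children, then the node itself
int eval(const struct ASTnode *n) {
  if (n->op == A_INTLIT)
    return n->intvalue;
  int l = eval(n->left);
  int r = eval(n->right);
  switch (n->op) {
    case A_ADD:      return l + r;
    case A_SUBTRACT: return l - r;
    case A_MULTIPLY: return l * r;
    default:         return l / r;   // A_DIVIDE
  }
}
```

Here eval(build_example()) computes 2 * 3 = 6 and 4 * 5 = 20 in the two sub-trees before the root adds them to give 26. The sketch never frees its nodes and does not guard against division by zero, which is acceptable only because the example tree contains no division.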


// expr.c
// Convert a token into an AST operation.
int arithop(int tok) {
  switch (tok) {
    case T_PLUS:
      return (A_ADD);
    case T_MINUS:
      return (A_SUBTRACT);
    case T_STAR:
      return (A_MULTIPLY);
    case T_SLASH:
      return (A_DIVIDE);
    default:
      fprintf(stderr, "unknown token in arithop() on line %d\n", Line);
      exit(1);
  }
}

We need a function that checks that the next token is an integer literal and builds an AST node to hold the literal value. Here it is:

// Parse a primary factor and return an
// AST node representing it.
static struct ASTnode *primary(void) {
  struct ASTnode *n;

  // For an INTLIT token, make a leaf AST node for it
  // and scan in the next token. Otherwise, a syntax error
  // for any other token type.
  switch (Token.token) {
    case T_INTLIT:
      n = mkastleaf(A_INTLIT, Token.intvalue);
      scan(&Token);
      return (n);
    default:
      fprintf(stderr, "syntax error on line %d\n", Line);
      exit(1);
  }
}


// Return an AST tree whose root is a binary operator
struct ASTnode *binexpr(void) {
  struct ASTnode *n, *left, *right;
  int nodetype;

  // Get the integer literal on the left.
  // Fetch the next token at the same time.
  left = primary();

  // If no tokens left, return just the left node
  if (Token.token == T_EOF)
    return (left);

  // Convert the token into a node type
  nodetype = arithop(Token.token);

  // Get the next token in
  scan(&Token);

  // Recursively get the right-hand tree
  right = binexpr();

  // Now build a tree with both sub-trees
  n = mkastnode(nodetype, left, right, 0);
  return (n);
}

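Because binexpr() treats every operator as having the same precedence, parsing 2 * 3 + 4 * 5 with it builds this tree:

     *
    / \
   2   +
      / \
     3   *
        / \
       4   5

The tree that respects operator precedence is this one:

          +
         / \
        /   \
       /     \
      *       *
     / \     / \
    2   3   4   5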
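One way to see the difference between the two shapes is to print each tree in fully parenthesised form. The following is a sketch under assumptions: the helpers mk(), naive_tree(), correct_tree() and show() are hypothetical names introduced for illustration, and the node layout is repeated from defs.h so the block stands alone.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { A_ADD, A_SUBTRACT, A_MULTIPLY, A_DIVIDE, A_INTLIT };

struct ASTnode {
  int op;
  struct ASTnode *left;
  struct ASTnode *right;
  int intvalue;
};

// Hypothetical shorthand for building a node
static struct ASTnode *mk(int op, struct ASTnode *l, struct ASTnode *r, int v) {
  struct ASTnode *n = malloc(sizeof(struct ASTnode));
  if (n == NULL) {
    fprintf(stderr, "out of memory\n");
    exit(1);
  }
  n->op = op; n->left = l; n->right = r; n->intvalue = v;
  return n;
}

// The right-leaning shape: 2 * (3 + (4 * 5))
struct ASTnode *naive_tree(void) {
  struct ASTnode *inner = mk(A_MULTIPLY, mk(A_INTLIT, NULL, NULL, 4),
                             mk(A_INTLIT, NULL, NULL, 5), 0);
  struct ASTnode *sum = mk(A_ADD, mk(A_INTLIT, NULL, NULL, 3), inner, 0);
  return mk(A_MULTIPLY, mk(A_INTLIT, NULL, NULL, 2), sum, 0);
}

// The precedence-respecting shape: (2 * 3) + (4 * 5)
struct ASTnode *correct_tree(void) {
  struct ASTnode *l = mk(A_MULTIPLY, mk(A_INTLIT, NULL, NULL, 2),
                         mk(A_INTLIT, NULL, NULL, 3), 0);
  struct ASTnode *r = mk(A_MULTIPLY, mk(A_INTLIT, NULL, NULL, 4),
                         mk(A_INTLIT, NULL, NULL, 5), 0);
  return mk(A_ADD, l, r, 0);
}

// Append the parenthesised form of the tree to the string in buf.
// buf must already hold a NUL-terminated string (possibly empty).
void show(const struct ASTnode *n, char *buf, size_t size) {
  static const char *ops[] = { "+", "-", "*", "/" };  // indexed by A_ADD..A_DIVIDE
  size_t len = strlen(buf);

  if (n->op == A_INTLIT) {
    snprintf(buf + len, size - len, "%d", n->intvalue);
    return;
  }
  snprintf(buf + len, size - len, "(");
  show(n->left, buf, size);
  len = strlen(buf);
  snprintf(buf + len, size - len, " %s ", ops[n->op]);
  show(n->right, buf, size);
  len = strlen(buf);
  snprintf(buf + len, size - len, ")");
}
```

Starting from an empty buffer, show(naive_tree(), ...) produces (2 * (3 + (4 * 5))) while show(correct_tree(), ...) produces ((2 * 3) + (4 * 5)), which makes the grouping of each tree explicit.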



interpretTree:
  First, interpret the left-hand sub-tree and get its value
  Then, interpret the right-hand sub-tree and get its value
  Perform the operation in the node at the root of our tree
  on the two sub-tree values, and return this value

interpretTree0(tree with +):
  Call interpretTree1(left tree with *):
     Call interpretTree2(tree with 2):
       No maths operation, just return 2
     Call interpretTree3(tree with 3):
       No maths operation, just return 3
     Perform 2 * 3, return 6

  Call interpretTree1(right tree with *):
     Call interpretTree2(tree with 4):
       No maths operation, just return 4
     Call interpretTree3(tree with 5):
       No maths operation, just return 5
     Perform 4 * 5, return 20

  Perform 6 + 20, return 26

Here is the function in interp.c, written to follow the pseudocode above:

// Given an AST, interpret the
// operators in it and return
// a final value.
int interpretAST(struct ASTnode *n) {
  int leftval, rightval;

  // Get the left and right sub-tree values
  if (n->left)
    leftval = interpretAST(n->left);
  if (n->right)
    rightval = interpretAST(n->right);

  switch (n->op) {
    case A_ADD:
      return (leftval + rightval);
    case A_SUBTRACT:
      return (leftval - rightval);
    case A_MULTIPLY:
      return (leftval * rightval);
    case A_DIVIDE:
      return (leftval / rightval);
    case A_INTLIT:
      return (n->intvalue);
    default:
      fprintf(stderr, "Unknown AST operator %d\n", n->op);
      exit(1);
  }
}

There is some other code here and there, such as the call to the interpreter in main():

scan(&Token); // Get the first token from the input
n = binexpr(); // Parse the expression in the file
printf("%d\n", interpretAST(n)); // Calculate the final result
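To show how the three calls above fit together, here is a condensed sketch that scans, parses and interprets an expression held in a string. It makes several assumptions: the real scanner reads from a file, so the string-based scan(), the Input pointer, the token names other than T_INTLIT and T_EOF, and the wrapper evaluate() are all stand-ins introduced for illustration. It also folds primary() into binexpr() for brevity and adds a division-by-zero check.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

enum { T_EOF, T_PLUS, T_MINUS, T_STAR, T_SLASH, T_INTLIT };
enum { A_ADD, A_SUBTRACT, A_MULTIPLY, A_DIVIDE, A_INTLIT };

struct token { int token; int intvalue; };
struct ASTnode { int op; struct ASTnode *left, *right; int intvalue; };

static const char *Input;       // Assumed: remaining input text
static struct token Token;      // Most recently scanned token

static void die(const char *msg) {
  fprintf(stderr, "%s\n", msg);
  exit(1);
}

// Assumed string-based scanner: fetch the next token into t
static void scan(struct token *t) {
  char *end;
  while (isspace((unsigned char) *Input))
    Input++;
  switch (*Input) {
    case '\0': t->token = T_EOF;            return;
    case '+':  t->token = T_PLUS;  Input++; return;
    case '-':  t->token = T_MINUS; Input++; return;
    case '*':  t->token = T_STAR;  Input++; return;
    case '/':  t->token = T_SLASH; Input++; return;
  }
  if (!isdigit((unsigned char) *Input))
    die("unrecognised character");
  t->token = T_INTLIT;
  t->intvalue = (int) strtol(Input, &end, 10);
  Input = end;
}

static struct ASTnode *mkastnode(int op, struct ASTnode *l,
                                 struct ASTnode *r, int v) {
  struct ASTnode *n = malloc(sizeof(struct ASTnode));
  if (n == NULL)
    die("Unable to malloc in mkastnode()");
  n->op = op; n->left = l; n->right = r; n->intvalue = v;
  return n;
}

static int arithop(int tok) {
  switch (tok) {
    case T_PLUS:  return A_ADD;
    case T_MINUS: return A_SUBTRACT;
    case T_STAR:  return A_MULTIPLY;
    case T_SLASH: return A_DIVIDE;
  }
  die("unknown token in arithop()");
  return -1;                    // Not reached
}

// Naive right-recursive parser: no operator precedence
static struct ASTnode *binexpr(void) {
  struct ASTnode *left, *right;
  int nodetype;

  if (Token.token != T_INTLIT)
    die("syntax error");
  left = mkastnode(A_INTLIT, NULL, NULL, Token.intvalue);
  scan(&Token);
  if (Token.token == T_EOF)
    return left;
  nodetype = arithop(Token.token);
  scan(&Token);
  right = binexpr();
  return mkastnode(nodetype, left, right, 0);
}

static int interpretAST(const struct ASTnode *n) {
  if (n->op == A_INTLIT)
    return n->intvalue;
  int l = interpretAST(n->left);
  int r = interpretAST(n->right);
  switch (n->op) {
    case A_ADD:      return l + r;
    case A_SUBTRACT: return l - r;
    case A_MULTIPLY: return l * r;
  }
  if (r == 0)
    die("division by zero");
  return l / r;                 // A_DIVIDE
}

// Assumed wrapper: scan, parse and interpret one expression.
// The AST nodes are not freed in this sketch.
int evaluate(const char *text) {
  Input = text;
  scan(&Token);
  return interpretAST(binexpr());
}
```

With this sketch, evaluate("2 * 3 + 4 * 5") gives 46 rather than 26, because the naive parser groups the input as 2 * (3 + (4 * 5)). Likewise evaluate("10 - 3 - 2") gives 9, since the right-recursive tree computes 10 - (3 - 2).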
