|
|
|
|
Abstract Syntax TreesDescriptionVarious programs (such as compilers and calculators), use a data structure called an Abstract Syntax Tree (AST) to represent syntax that has been recognized by the program.
Note:
An Abstract Syntax Tree is not the same
as a Binary Search Tree.
In a Binary Search Tree, both interior and leaf nodes can contain data, and the nodes are ordered in such a way that the data can be found by searching the tree in a prescribed order. In an Abstract Syntax Tree, the interior nodes contain operator information, while the leaf nodes contain operand information. The order of the nodes within a Binary Search Tree is determined by the contents of each node. The order of the nodes in an AST is determined by the expression that the AST represents, and by the precedence of the expression's components. For example, the arithmetic expression: -5+12*4 would be represented in Abstract Syntax Tree form as:
Note: Conventionally, trees in Computer Science are drawn upside down when compared to trees in the real world. Thus, the root of the tree is at the top, not the bottom. An arithmetic expression uses a convention to distinguish between the different possible interpretations. For example, the expression: -5+12*4 could be interpreted as either: (-5+12)*4 or: -5+(12*4) The latter is the conventionally correct interpretation, because the multiplication operator conventionally has higher precedence than the addition operator. However, the Abstract Syntax Tree representation is inherently unambiguous, which is one of the reasons it is so popular. In the above AST diagram, each element in the tree is connected to other items in the tree by lines. An element in an Abstract Syntax Tree is usually called a Node. Nodes can either be interior nodes or leaf nodes. Interior nodes are those that have one or more lines connecting them with other nodes further down the tree. Leaf nodes have no such connections; they are the lowest nodes in that particular branch of the tree. We say that interior nodes have child nodes, while leaf nodes do not. In this example, interior nodes are used to represent the arithmetic
operators ( A node below an operand in the tree can represent a value such as 5 or 12 (in the above diagram, all values are integer values). If the node represents a simple value, then it must be a leaf node. If it represents an expression (consisting of some combination of operators and operands), then the node must represent an operator, and therefore must be an interior node. For example, imagine you wish to modify the original expression to look like this: -(5/23)+12*(4-2) The Abstract Syntax Tree for this new expression looks like:
Note that we have replaced the original leaf nodes for 5 and 4 by subtrees which represent the new subexpressions. The top node in each of these subexpressions represents an operator; which operator depends on the relative precedence of the operators in the subexpression. A program can evaluate such a tree by walking the tree in a prescribed order: at each interior node it determines the value from each subtree, and applies the operator to those values to produce a value for the tree represented by that interior node. A leaf node provides a value directly. A program can also walk the tree in the same fashion to produce a printout of the equivalent arithmetic expression (although it would have to insert parentheses appropriately to maintain the operator precedence). The AssignmentYour job is to write a set of classes which will allow you to construct such an abstract syntax tree, and then walk it, and evaluate it. Here's an abstract Node class to start with:
This class is to be used as a base class for a family of node classes which can be used to construct an AST and then evaluate and walk it. Cut it from this web page and paste it into your editor. The
The
-5 while calling
12 * 4 Calling walk() for the node at the top of the tree will print out the entire expression: -5 + 12 * 4 We will break the assignment into three parts:
|
|
This page was last modified on 02 October, 2007 |