Abstract syntax tree
From Wikipedia, the free encyclopedia
In computer science, an abstract syntax tree (AST), or just syntax tree is a finite, labeled, directed tree, where each interior node represents a programming language construct and the children of that node represent meaningful components of the construct. We devise operators to name these programming language constructs. Internal nodes are labeled by these operators, and the leaf nodes represent the operands of the operators. Thus, the leaf nodes are nullary operators and only represent variables or constants. An AST differs from a parse tree (also known as a concrete syntax tree) by omitting nodes and edges for syntax rules that do not affect the semantics of the program. Only significant programming language constructs are included. The classic example of such an omission is grouping parentheses, since in an AST the grouping of operands is implicit in the tree structure.
The AST is used in a parser as an intermediate between a parse tree and a data structure, the latter of which is often used as a compiler or interpreter's internal representation of a computer program while it is being optimized and from which code generation is performed. The range of all possible such structures is described by the abstract syntax. Creating an AST in a parser for a language described by a context free grammar, as nearly all programming languages are, is straightforward. Most rules in the grammar create a new node with the node's edges being the symbols in the rule. Rules that do not contribute to the AST, such as grouping rules, merely pass through the node for one of their symbols. Alternatively, a parser can create a full parse tree, and a post-pass over the parse tree can convert it to an AST by removing the nodes and edges not used in the abstract syntax.
[edit] See also
- Abstract semantic graph (ASG)
- Code generation syntax tree (CST)
- Datrix
- Document Object Model (DOM)
- Interpretation syntax tree (IST)
- Semantic resolution tree (RST)
- Shunting yard algorithm
- Symbol table
- TreeCC
- TreeDL
[edit] References
This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.
[edit] External links
- AST View, an Eclipse plugin to visualize a Java abstract syntax tree
- Good information about the Eclipse AST and Java Code Manipulation
- Paper "Abstract Syntax Tree Implementation Idioms" by Joel Jones (overview of AST implementation in various language families)
- Paper "Abstract Syntax Tree Design" by Nicola Howarth (note that this merely presents the design of *one particular* project's AST, and is not generally informative)
- Paper "Understanding source code evolution using abstract syntax tree matching" by Iulian Neamtiu, Jeffrey S. Foster and Michael Hicks
- Paper "Change Distilling: Tree Differencing for Fine-Grained Source Code Change Extraction" by Beat Fluri, Michael Würsch, Martin Pinzger, and Harald C. Gall.
- Diploma thesis "Improving Abstract Syntax Tree based Source Code Change Detection" by Michael Würsch
- Article "Thoughts on the Visual C++ Abstract Syntax Tree (AST)" by Jason Lucas
- Tutorial "Abstract Syntax Tree Metamodel Standard"
- PMD uses AST representation to control code source quality
- CAST representation
- Abstract Syntax Tree Unparsing

