I was terribly interested in getting my hands on this book since I'm taking a formal course on Compilers and Interpreters at university and I really wanted to know: What's the difference between what we (as computer scientists) are taught in a compilers' course and the more practical approach presented in the book?
As it turns out there's a big difference. If you want to be the ultimate guru of compilers (eg. contributing an even more efficient compiling technique for language X or creating a language that forces us all to reconsider what we know about compilers) you need both, the theory the practice (because without the theory you wouldn't know how to improve or make obsolete an existing technique, and without the practice you wouldn't be able to put that knowledge to work inside a language compiler). Now if you just want to be able to deal with your DSL (domain specific language), create data readers, code generators, source-to-source translators, source analyzers, etc. you'll love the hands on information presented in this book.
Let's be honest, how many of us developers are required or willing to create a language from scratch together with its compiler or interpreter versus the ones that just need to parse an XML file, process a DSL or create a configuration file reader? I would say that there are much more developers in the later group. But fortunately we all (or almost all) share one thing in common: we know software patterns! This is how the author structured the book, offering patterns (ala Gamma et al) that you can use when creating your language processors (an excellent approach in my opinion since each pattern focuses on different stages of language processing which helps the developer modularize the solution and understand how the different parts of the "machine" work without loosing sight of the big picture).
So, in case you're wondering "what is this guy talking about?". A compiler is a program that transforms code created in one language into another (eg. C source code into executable code). Normally the transformation goes from a higher level language to a lower level language (eg. to machine code)(if it's the other way round we have a "decompiler"). When the transformation happens between languanges on the same level we're dealing with a language translator or converter (normally called "source-to-source translator")(eg. Sharpen, an open source framework created by the db4o team that converts Java code to C#). A compiler is likely to perform several operations such as lexical analysis, preprocessing, parsing, semantic analysis, code generation, and code optimization (which are directly or indirectly covered in the patterns offered in the book). The ultimate tool for developers interested in building compilers is a compiler-compiler or parser generator which, as you might have already guessed, provides a high level description language to help you build your own compiler (this usually involves the creation of a lexer and a parser).
However, I feel I should mention that there's a whole lot of complexity in handling and maintaining all the intermediate information when you're creating your own compiler for your own language which is covered only indirectly in this book. There's also no detailed explanation of the final steps of a compiler implementation such as machine code generation and optimization, register allocation, etc. Overall this is an excellent book for day-to-day language applications (involving parsing, translations, etc).
Now I find pretty important to mention who the author is. Terence Parr created the ANTLR parser generator (antlr.org) and the StringTemplate engine (stringtemplate.org). He's a professor of computer science that's no theorist (this guys has real practical experience!). He has so much experience that he started to see these patterns when developing language processors coming again and again. The end result is this book that presents a compilation of those patterns.
The structure of the book is pretty straight forward. Four general parts:
* Getting Started with Parsing: where you'll learn about the general architecture of language applications and review the patterns that involve parsing.
* Analyzing Languages: where you'll see how to use the parsing techniques described in the previous section to build trees that hold language constructs in memory, how to walk those trees to track and identify various symbols (such as variables and functions) and how to determine the type of the expressions. Overall you'll learn how to check whether an input stream makes sense.
* Building Interpreters: four interpreter patterns are presented that vary in terms of implementation difficulty and run-time efficiency. In the two previous parts the focus was on patterns that verify the syntax of an input sentence and make sure that it follows a set of semantic rules. In this part the focus is on patterns for processing the input sequences (not just validating them).
* Translating and Generating Languages: here you'll learn how to translate one language to another and how to generate text using the StringTemplate engine.
The patterns are laid in the order you'd follow to implement a language (section 1.2, "A Tour of Patterns" describes how all the patterns fit together). There are 31 patterns in the book, each with four parts: purpose, discussion, implementation and related patterns. The implementation section provides illustrative code in Java (but it's not meant to serve as a library). You don't need a background in language theory or compilers to understand the book but you have to have a solid programming background and be confortable with the concept of recursion in algorithms.
Overall, if you're a developer that has to deal with any of the use cases described in this review this book *must* be in your bookshelf (but if you're really into compilers you should also have Aho's Dragon book next to it =)