Grammar and Parsers


This page gives an overview of the origins of the grammar pages below, which are used to build the parsers contained in Browse and Doc It.

The expert and parsing code grew out of two distinct problems. First, I was abysmal at documenting my code (this was a long, long time ago and I think I’ve improved). Second, Delphi 5’s Explorer used to lock up regularly, stopping me from browsing my code. So I thought, how hard can it be to parse code and build an IDE plug-in that lets you browse code and document it? Answer: a lot harder than I thought.

The parsers currently in Browse and Doc It are the third generation of parsers. My first two attempts were dreadful. I built them on what I thought the grammar behind Object Pascal was, which, to my disappointment, turned out to be very different from the real grammar. I also tried to work out how to parse code by myself, without researching the subject on the internet. By the time I got round to the third generation parser for Object Pascal I had stumbled on recursive descent parsing by accident. This came from finding a formal grammar in the back of the Delphi 7 Object Pascal Language reference. The structure of the grammar led me to the recursive solution. I’ve since read more about how you are supposed to parse code. One of the documents I found was the Let’s Build a Compiler tutorial, which I strongly suggest you read if you are interested. It dates back to Turbo Pascal 4 but I think it’s still relevant.
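To illustrate how the structure of a formal grammar leads to a recursive solution, here is a minimal sketch in Python. The toy arithmetic grammar, the `Parser` class and the `parse` function are all hypothetical and are not taken from the Object Pascal grammar or from the Browse and Doc It source. The point is only that each grammar rule becomes one method, and a rule that refers to another rule becomes a call to that rule's method.

```python
# Hypothetical toy grammar (NOT the Object Pascal grammar):
#   Expression ::= Term { ('+' | '-') Term }
#   Term       ::= Factor { ('*' | '/') Factor }
#   Factor     ::= Number | '(' Expression ')'


class Parser:
    """Recursive descent parser that evaluates the toy grammar above."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # Skip whitespace, then return the current character ('' at end).
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def expression(self):
        # Expression ::= Term { ('+' | '-') Term }
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.text[self.pos]
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # Term ::= Factor { ('*' | '/') Factor }
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.text[self.pos]
            self.pos += 1
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # Factor ::= Number | '(' Expression ')'
        if self.peek() == "(":
            self.pos += 1
            value = self.expression()  # the recursion mirrors the grammar
            self.expect(")")
            return value
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos].isdigit():
            self.pos += 1
        if start == self.pos:
            raise SyntaxError(f"expected a number at position {self.pos}")
        return int(self.text[start:self.pos])


def parse(text):
    parser = Parser(text)
    value = parser.expression()
    if parser.peek() != "":
        raise SyntaxError(f"unexpected text at position {parser.pos}")
    return value


print(parse("2 + 3 * (4 - 1)"))  # prints 11
```

Operator precedence falls out of the grammar's layering: `Term` binds tighter than `Expression` simply because `expression` calls `term`, which calls `factor`.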

The code is provided for people to learn from, including my mistakes. The major mistake I’ve made with these parsers is tokenising the whole text stream first and then passing those tokens to the grammar parsers. In hindsight, I think that makes things more complicated.
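For readers unfamiliar with the term, a tokenise-first design looks roughly like the sketch below, written in Python. The token categories, `TOKEN_PATTERN` and `tokenise` are hypothetical and far simpler than anything in Browse and Doc It. A first pass turns the raw text into a flat list of tokens, and only then would a grammar parser walk that list.

```python
import re

# Hypothetical token categories for a tiny Pascal-like fragment.
TOKEN_PATTERN = re.compile(
    r"""
    (?P<number>\d+)
  | (?P<identifier>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<symbol>:=|[-+*/();])
  | (?P<space>\s+)
    """,
    re.VERBOSE,
)


def tokenise(text):
    """First pass: convert the whole text into (kind, value) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at position {pos}"
            )
        if match.lastgroup != "space":  # whitespace is discarded
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenise("x := 10 + y;"))
# prints [('identifier', 'x'), ('symbol', ':='), ('number', '10'),
#         ('symbol', '+'), ('identifier', 'y'), ('symbol', ';')]
```

The second stage then has to manage its own position in the token list, on top of the position bookkeeping already done here, which gives a sense of where the extra complexity can creep in.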

The pages below cover just some of the grammar files behind the parsers in Browse and Doc It: the ones I think people will be interested in.