Mimir:Draft1 Chapter12

From Openitware
Jump to: navigation, search

Back to Table of Contents


Chapter 12: Lex and Yacc

"Make everything as simple as possible, but not simpler" - Albert Eistein


Lex Yacc.JPG This is usually a paradigm, and you would put an introduction here.

Lex & Yacc


Lex and yacc are tools designed for writers of compilers and interpreters, although they are also useful for many applications that will interest the noncompiler writer. Any application that looks for patterns in its input, or has an input or command language is a good candidate for lex and yacc. Furthermore, they allow for rapid application prototyping, easy modification, and simple maintenance of programs.

Lex and yacc were both developed at Bell Laboratories in the 1970s. Yacc was the first of the two, developed by Stephen C. Johnson. Lex was designed by Mike Lesk and Eric Schmidt to work with yacc. Both lex and yacc have been standard UNIX utilities since 7th Edition UNIX. System V and older versions of BSD use the original AT&T versions, while the newest version of BSD uses flex (see below) and Berkeley yacc. The articles written by the developers remain the primary source of information on lex and yacc.

Lex is officially known as a "Lexical Analyser". Its main job is to break up an input stream into more usable elements. It is used to identify the "interesting bits" in a text file. More to come....

Yacc takes a grammar that you specify and writes a parser that recognizes valid “sentences” in that grammar.

Overview - What is Lex and Yacc (or Flex and Bison)

Lexical analyzers: A lexer takes an arbitrary input stream and tokenizes it, i.e., divides it up into lexical tokens. This tokenized output can then be processed further, usually by yacc, or it can be the “end product.”

When you write a lex specification, you create a set of patterns which lex matches against the input. Each time one of the patterns matches, the lex program invokes C code (or any language you used to created your program) that you provide which does something with the matched text. In this way a lex program divides the input into strings which we call tokens.

In Summary, lex recognizes regular expressions, yacc recognizes entire grammars. Lex divides the input stream into pieces (tokens) and then yacc takes these pieces and groups them together logically.

Yacc takes a grammar that you specify and writes a parser that recognizes valid “sentences” in that grammar. The term “sentence” here is used in a fairly general way.

A grammar is a series of rules that the parser uses to recognize syntactically valid input.

Lex (Flex): A lexical analyzer generator

Lexical rules

Regular Expressions


Parsing a line of code

Lex Implementation Example

Yacc (Bison): Yet Another Compiler-Compiler

Yacc rules (Context Free Grammars)

Yacc BNF format description

Yacc Code example


A quick summary of the chapter should go here

Key Terms







Regular expression:


Problem Sets

A list of practice problems