CSc 453 : Programming Assignment 2 (Lexical and Syntax Analysis)
Start Date: Thu Sept 17, 2015
Due Date: 11:59 PM, Mon Oct 5, 2015
1. General
This assignment involves implementing a scanner and parser for C--,
using a scanner generator such as lex or flex and a parser
generator such as yacc or bison.
For this assignment, your code should deal only with the lexical and
syntax rules of C--.
In other words, anything that requires semantic information (i.e.,
information involving declarations) should be ignored.
At this point, your program will act simply as a syntax checker:
syntactically correct input will be accepted silently, while syntax errors
will give rise to error messages that will be reported to stderr.
2. The Scanner
2.1. General
The scanner should be implemented as a function that returns, each time
it is called, either a positive integer indicating what kind of token was
found on the input stream, or the value 0 ("end of file") indicating that
no further input is available.
Note that keywords cannot be used as identifiers.
The values of different kinds of tokens should be defined as macros to
simplify the interface between the scanner and parser. For this purpose,
it is simplest to define single-character tokens such as ( and ;
to have the value of the corresponding character constant, e.g., the
value of a "left-parenthesis" token will be that of the character
constant '('. (The simplest way to define the macros for the remaining
tokens is to use yacc -d to generate a file y.tab.h that contains the
macro definitions, then #include this file in the scanner. Your make
file will have to be set up carefully so that y.tab.h is generated
before the scanner is compiled.)
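The token-code convention above can be illustrated with a small debugging helper. This is only a sketch: the macros ID, INTCON and WHILE and their values are hypothetical stand-ins for whatever yacc -d writes to y.tab.h for your own %token declarations, and token_name is an illustrative name, not part of lex or yacc.

```c
#include <stdio.h>

/* Hypothetical stand-ins for macros that "yacc -d" would place in y.tab.h.
 * The real names and values come from your own %token declarations; values
 * above 255 cannot clash with single-character token codes. */
#define ID      258
#define INTCON  259
#define WHILE   260

/* Writes a printable name for token code tok into buf and returns buf. */
const char *token_name(int tok, char *buf, size_t size)
{
    switch (tok) {
    case 0:      snprintf(buf, size, "end of file"); break;
    case ID:     snprintf(buf, size, "ID");          break;
    case INTCON: snprintf(buf, size, "INTCON");      break;
    case WHILE:  snprintf(buf, size, "WHILE");       break;
    default:
        if (tok > 0 && tok < 256)
            /* Single-character token: its code is the character itself. */
            snprintf(buf, size, "'%c'", tok);
        else
            snprintf(buf, size, "unknown token %d", tok);
        break;
    }
    return buf;
}
```

A helper of this kind can be called on each value the scanner returns while you are testing the scanner on its own, before the parser is connected.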
2.2. Comments and Whitespace
Comments and whitespace are to
be skipped silently. It is an error to encounter an end-of-file
inside a comment.
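One possible way to skip a comment and detect an end-of-file inside it is sketched below. It assumes that C-- comments are delimited as in C (slash-star to star-slash), which this handout does not state, and that the opening delimiter has already been consumed. The function name skip_comment and its line-counting parameter are hypothetical; inside a flex action the same loop would read characters with input() rather than getc().

```c
#include <stdio.h>

/* Skips the body of a comment whose opening delimiter has already been
 * read (assumption: C-style delimiters). Increments *line for every
 * newline seen. Returns 0 once the closing star-slash has been consumed,
 * or -1 if end-of-file is reached first, which the caller should report
 * as an error. */
int skip_comment(FILE *in, int *line)
{
    int prev = 0;   /* previous character read inside the comment */
    int c;

    while ((c = getc(in)) != EOF) {
        if (c == '\n')
            ++*line;
        if (prev == '*' && c == '/')
            return 0;
        prev = c;
    }
    return -1;
}
```

Counting newlines while skipping keeps the line numbers in later error messages accurate even when a comment spans several lines.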
2.3. Errors
The simplest way to deal with lexical errors is to let the parser worry
about them. This can be done by simply returning the value of any
unrecognized character to the parser.
3. The Parser
3.1. General
You should transform your grammar, as necessary, to eliminate
conflicts. The "dangling else" shift/reduce conflict will be tolerated,
as will shift/reduce conflicts between error productions, but you will be
penalized for any other conflicts. If you encounter conflicts, you may
consult the file y.output generated by yacc
(invoked with the -v option) for more information.
3.2. Errors
You are to implement error handling and recovery for syntax errors. This does
not include errors involving semantic checking (i.e., anything
that demands information from declarations), which will be dealt
with in the next assignment.
Your program will be expected to deal with errors in a
"reasonable" way. Error messages should be specific and should contain
enough information (with at least a line number) to allow the user to locate
syntax problems. Error recovery should allow your parser to recover
gracefully and continue processing the input even after syntax errors are
encountered.
3.3. Exit Status
The exit status of your program should be 0 if no errors are encountered during processing,
and 1 if any errors (including syntax errors) are encountered at any point.
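The line-number requirement of Section 3.2 and the exit status rule above can be combined by counting errors as they are reported. The following is a minimal sketch under stated assumptions: error_count, report_error and exit_status are hypothetical names, and the message format is only one reasonable choice.

```c
#include <stdio.h>

/* Hypothetical module-level counter; not part of any yacc or bison API. */
static int error_count = 0;

/* Reports one error as "line N: message near 'text'" and counts it. */
void report_error(FILE *out, int line, const char *msg, const char *near_text)
{
    fprintf(out, "line %d: %s near '%s'\n", line, msg, near_text);
    ++error_count;
}

/* Exit status required by the assignment: 0 if no errors were reported,
 * 1 otherwise. */
int exit_status(void)
{
    return error_count > 0 ? 1 : 0;
}
```

In a yacc or bison parser, yyerror(const char *s) could forward to report_error(stderr, yylineno, s, yytext), assuming a flex scanner built with %option yylineno, and main could return exit_status() after yyparse() finishes.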
4. Invoking Your Program
Your program will be called compile. It will read from stdin
and send all normal output to stdout. Error messages will be sent to stderr.
E.g.:
cat foo.c | ./compile
or
./compile < foo.c
5. Turnin
You should turn in the sources to your code on lectura. These should
include:
- The sources and headers for your scanner and parser, in particular the
  input specifications to the tools flex, bison, etc.
- A main routine that calls your parser.
- A make file called Makefile that should support at least the following
  targets:
  - compile: The command "make compile" should build your scanner and
    parser from scratch, by invoking the appropriate tools (flex, bison,
    etc.) on their input specifications, and should result in the creation
    of an executable file compile that implements the functionality
    specified above for this assignment.
  - clean: The command "make clean" should delete (at least) all
    software-generated files in the current directory, including files
    created by flex/bison (e.g., y.tab.h, lex.yy.c, y.output), object
    files (*.o), and the executable file compile.
- Any additional material you wish to turn in. Any documentation or comments
  may be included in a file named README.
To turn in your files, use the command
turnin cs453f15-assg2 file1 file2 ... filen
Turn in the files you want to submit just as they are: don't zip them up
or turn in a directory containing your files.
For more information on the turnin command, see man turnin.
Note: The turnin command copies the files submitted
into another directory. Because of this, programs that use
relative path names in include files and make files (e.g.,
#include "../../foo/bar/baz.h") may not
compile and execute correctly once they are
turned in. Please avoid using relative pathnames.
The output of your program will be compared with the "expected" output using
the diff utility (see diff(1)).
With the exception of error
messages (where the requirements are given above), your output must follow
the specification exactly.
For this reason it is recommended that you follow
the specification, and instructions for turnin, closely.