University of Arizona, Department of Computer Science

CSc 453: μC Language Specification

Lexical rules | Syntax rules | Semantic rules | Operational characteristics

Notation

The lexical and syntax rules given below use a notation called Extended BNF Notation. BNF notation characters are written in magenta.

For more information on Extended BNF, see Wikipedia.

Text in blue italics below are explanatory comments.



1. Lexical Rules

letter ::= a | b | ... | z | A | B | ... | Z
digit ::= 0 | 1 | ... | 9
id ::= letter { letter | digit | _ }
intcon ::= digit { digit }
charcon ::= 'ch' | '\n' | '\0', where ch denotes any printable ASCII character, as specified by isprint(), other than \ (backslash) and ' (single quote).
Comments   Comments are as in C, i.e. a sequence of characters preceded by /* and followed by */, and not containing any occurrence of */.


[ Back to top ]

2. Syntax Rules

Nonterminals are shown in italics; terminals are shown in boldface, and sometimes enclosed within quotes for clarity.

2.1 Grammar Productions

prog : { dcl  |  func }
 
dcl : type var_decl { ',' var_decl } ';'
 | [ extern ] type id '(' parm_types ')' { ',' id '(' parm_types ')' } ';'
 | [ extern ] void id '(' parm_types ')' { ',' id '(' parm_types ')' } ';'
 
var_decl : id
 
type : char
 |int
 
parm_types : void
 |type id { ',' type id }
 
func : type id '(' parm_types ')' '{' { type var_decl { ',' var_decl } ';' } { stmt } '}'
  | void id '(' parm_types ')' '{' { type var_decl { ',' var_decl } ';' } { stmt } '}'
 
stmt : if '(' expr ')' stmt [ else stmt ]
 | loop_stmt
 | update_stmt ';'
 | callret_stmt ';'
 | '{' { stmt } '}'
 | ';'
 
loop_stmt : while '(' expr ')' stmt
 | for '(' [ update_stmt ] ';' [ expr ] ';' [ update_stmt ] ')' stmt
 
update_stmt : id '=' expr
  | incdec_expr
 
callret_stmt : id '(' [expr { ',' expr } ] ')'
 | return [ expr ]
 
expr : '' expr
 | '!' expr
 | expr arithop expr
 | expr relop expr
 | expr logical_op expr
 | incdec_expr
 | id
 | id '(' [expr { ',' expr } ] ')'
 | '(' expr ')'
 | intcon
  | charcon
 
incdec_expr : id ++
 | id --
 
arithop : +
 |
 | *
 | /
 
relop : ==
 | !=
 | <=
 | <
 | >=
 | >
 
logical_op : &&
 | ||

2.2. Operator Associativities and Precedences

The following table gives the associativities of various operators and their relative precedences. An operator with a higher precedence binds "tighter" than one with lower precedence. Precedences decrease as we go down the table.

Operator Associativity
++, --none
!, – (unary)right to left
*, /left to right
+, – (binary)left to right
<, <=, >, >=left to right
==, !=left to right
&&left to right
||left to right


[ Back to top ]

3. Semantic Rules

3.1. Declarations

The following rules guide the processing of declarations. Here, the definition of a function refers to the specification of its formals, locals, and its body.
  1. An identifier may be declared at most once as a global, and at most once as a local in any particular function; however, an identifier may appear as a local in many different functions.

  2. A function may have at most one prototype; a function may be defined at most once.

  3. If a function has a prototype, then the types of the formals at its definition must match (i.e., be the same), in number and order, the types of the argument in its prototype; and the type of the return value at its definition must match the type of the return value at its prototype.
    The prototype, if present, must precede the definition of the function.

  4. An identifier can occur at most once in the list of formal parameters in a function definition.

  5. The formal parameters of a function have scope local to that function.

  6. If a function takes no parameters, its prototype must indicate this by using the keyword void in place of the formal parameters.

  7. A function whose prototype is preceded by the keyword extern must not be defined in the program being processed.


3.2. Type Consistency Requirements

The notion of type compatibility is defined as follows: (i) int is compatible with int, and char is compatible with char; (ii) int is compatible with char, and vice versa; and (iii) any pair of types not covered by (i) and (ii) compatible. In essence, the notion of type compatibility states that bools (see below) and function types cannot take part in arithmetic.

3.2.1. Symbol Lookups

The following rules guide symbol lookups for identifiers in expressions and statements.

  1. Variables must be declared before they are used.

  2. Functions must have their argument types and return value specified (either via a prototype or via a definition) before they are called.

  3. A use of an identifier must be consistent with the declaration of that identifier in the most deeply nested scope enclosing that use. In other words:

3.2.2. Function Definitions

  1. Any function called from within an expression must not have return type void. Any function call that is a statement must have return type void.

  2. A function whose return type is void cannot return a value, i.e., it cannot contain a statement of the form "return expr;"
    A function whose return type is not void cannot contain a statement of the form "return;" Such functions must contain at least one statement of the form "return expr;" (Note that it is still possible for such functions to fail to return a value by "falling off the end".)

3.2.3. Expressions

The type of an an expression e is given by the following:
  1. If e is an integer constant, then its type is int.

  2. If e is an identifier, then the type of e is the type of that identifier.

  3. If e is a function call, then the type of e is the return type for that function.

  4. If e is an expression of the form e1 + e2, e1 e2, e1 * e2, e1 / e2, e1, id++, or id--, then the type of e is int.

  5. If e is an expression of the form e1 >= e2, e1 <= e2, e1 > e2, e1 < e2, e1 == e2, or e1 != e2 then the type of e is bool.

  6. if e is an expression of the form e1 && e2, e1 || e2, or !e1, then the type of e is bool.

The rules for type checking expressions are given by the following:
  1. Each actual parameter of a function call must be compatible with the corresponding formal parameter.

  2. The subexpressions associated with the operators +, -, *, /, <=, >=, <, >, ==, and != must be compatible with int. The subexpressions associated with the operators &&, ||, and ! must have types compatible with bool.

  3. In expressions of the form id++ and id--, the identifier id must have type compatible with int.

Comment: The type bool can arise only out of expressions involving relational and logical operators. The language does not allow identifiers to be declared to have type bool.

3.2.4. Statements

  1. An identifier occurring in an update statement (see the grammer rules for the non-terminal update_stmt above) must be of type char or int.

  2. The type of the right hand side of an assignment must be compatible with int or char.

  3. The type of the expression in a return statement in a function must be compatible with the return type of that function.

  4. The type of the conditional in if, for, and while statements must have type bool.

  5. Each actual parameter of a function call must be compatible with the corresponding formal parameter.



[ Back to top ]

4. Operational Characteristics

The μC language has the execution characteristics expected of a C-like block-structured language. The description below mentions only a few specific points that are likely to be of interest. For points not mentioned explicitly, you should consider the behavior of μC to be as for C.

4.1. Data

  1. An object of type int occupies 32 bits. an object of type char occupies 8 bits.

  2. Values of type char are signed quantities. Converting a char to an int requires sign extension.

4.2. Expressions

4.2.1. Order of Evaluation

Arithmetic Expressions
  1. The operands of an expression must be evaluated before the expression is evaluated.

  2. When there is more than one operand, the order in which they are evaluated is left unspecified.

Boolean Expressions
  1. The order of evaluation of the operands of comparison operators (>=, >, <=, <, ==, !=) is left unspecified.

  2. Expressions involving the logical operators && and || must be evaluated using short circuit evaluation.

4.2.2. Type Conversion

If an object of type char is part of an expression, its value must be converted (sign extended) to a value of type int before the expression is evaluated.

4.3. Assignment Statements

A value of type char is converted (sign extended) to a 32-bit quantity before it is assigned to an object of type int.

A value of type int is converted (truncated) to an 8-bit quantity, by discarding the top 24 bits, before it is assigned to an object of type char.

4.4. Functions

4.4.1. Evaluation of Actuals

The order in which the actual parameters in a function call are evaluated is unspecified.

4.4.2. Parameter Passing

Arguments are passed by value.

An argument of type char is converted (sign extended) to a 32-bit quantity before it is passed as an actual parameter to a function.

Since a function that has a formal parameter of type char will, in any case, be passed a 32-bit quantity as an actual, it must convert (truncate) the actual to an 8-bit quantity before using it.

4.4.3. Return from a Function

Execution returns from a function if either an explicit return statement is executed, or if execution "falls off" the end of the function body. In the latter case, no value is returned.

4.5. Program Execution

Execution begins at a procedure named main().



[ Back to top ]