Preprocessing
- most preprocessors are hand-crafted and written in a ad-hoc manner
- most don't completely parse the source language
- they often place restrictions on program layout
- few are completely correct and robust
- most are difficult to maintain and modify
- rarely do preprocessors handle syntax errors in the program being processed
IMT -- A Case Study
- the IMT (Icon Meta-Translator) system was developed to simplify the
production of correct and robust preprocessors for Icon
- the idea behind this system is to use the same grammar for building
preprocessors as is used by the Icon compiler itself
- this assures that the parsing is correct and that syntactic errors
are handled
- the technique is to use macros in the actions for the Yacc grammar
that is used to generate the parser for the Icon compiler
- different sets of macro definitions are used for the Icon compiler
and for the preprocessing system
- the macro definitions are, of course, written in C
- for building preprocessors, the macros are defined to generate Icon
source code that corresponds to the input (up to physical layout); this
is called an identity translation
- the translation can be changed by changing these macro definitions
- a specification system simplifies the creation of macro definitions
for preprocessing
- going one step further, to "meta-translation", the macros
are defined to write Icon code consisting of procedure calls
- procedure declarations are provided to produce an identity translation
- changing these procedures can be used to change the translation
- the advantage of this approach is that the translation can be written
in Icon instead of C, and hence is easier and accessible to Icon programmers
who aren't proficient in C
Yacc Grammar for Icon
...
expr4 : expr5 ;
| expr4 SEQ expr5 {Bseq($1,$2,$3);} ;
| expr4 SGE expr5 {Bsge($1,$2,$3);} ;
| expr4 SGT expr5 {Bsgt($1,$2,$3);} ;
| expr4 SLE expr5 {Bsle($1,$2,$3);} ;
| expr4 SLT expr5 {Bslt($1,$2,$3);} ;
| expr4 SNE expr5 {Bsne($1,$2,$3);} ;
| expr4 NMEQ expr5 {Beq($1,$2,$3);} ;
| expr4 NMGE expr5 {Bge($1,$2,$3);} ;
| expr4 NMGT expr5 {Bgt($1,$2,$3);} ;
| expr4 NMLE expr5 {Ble($1,$2,$3);} ;
| expr4 NMLT expr5 {Blt($1,$2,$3);} ;
| expr4 NMNE expr5 {Bne($1,$2,$3);} ;
| expr4 EQUIV expr5 {Beqv($1,$2,$3);} ;
| expr4 NEQUIV expr5 {Bneqv($1,$2,$3);} ;
expr5 : expr6 ;
| expr5 CONCAT expr6 {Bcat($1,$2,$3);} ;
| expr5 LCONCAT expr6 {Blcat($1,$2,$3);} ;
expr6 : expr7 ;
| expr6 PLUS expr7 {Bplus($1,$2,$3);} ;
| expr6 DIFF expr7 {Bdiff($1,$2,$3);} ;
| expr6 UNION expr7 {Bunion($1,$2,$3);} ;
| expr6 MINUS expr7 {Bminus($1,$2,$3);} ;
expr7 : expr8 ;
| expr7 STAR expr8 {Bstar($1,$2,$3);} ;
| expr7 INTER expr8 {Binter($1,$2,$3);} ;
| expr7 SLASH expr8 {Bslash($1,$2,$3);} ;
| expr7 MOD expr8 {Bmod($1,$2,$3);} ;
...
Macro Definitions for Icon Compiler
...
#define Bact(x1,x2,x3) $$ = tree5(N_Activat,x2,x2,x3,x1)
#define Bamper(x1,x2,x3) $$ = tree5(N_Conj,x2,x2,x1,x3)
#define Bassgn(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bcaret(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bcat(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bdiff(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Beq(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Beqv(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bge(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bgt(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Binter(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Blcat(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Ble(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Blt(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bminus(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bmod(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bne(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bneqv(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bplus(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Brassgn(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Brswap(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bseq(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bsge(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bsgt(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bslash(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bsle(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bslt(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bsne(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bstar(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bswap(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
#define Bunion(x1,x2,x3) $$ = tree5(N_Binop,x2,x2,x1,x3)
...
Macro Definitions for Preprocessing
...
#define Bact(x,y,z) $$ = cat(5,q("Binop(\"@\","),x,q(","),z,q(")"))
#define Bamper(x,y,z) $$ = cat(5,q("Binop(\"&\","),x,q(","),z,q(")"))
#define Bassgn(x,y,z) $$ = cat(5,q("Binop(\":=\","),x,q(","),z,q(")"))
#define Bcaret(x,y,z) $$ = cat(5,q("Binop(\"^\","),x,q(","),z,q(")"))
#define Bcat(x,y,z) $$ = cat(5,q("Binop(\"||\","),x,q(","),z,q(")"))
#define Bdiff(x,y,z) $$ = cat(5,q("Binop(\"-\","),x,q(","),z,q(")"))
#define Beq(x,y,z) $$ = cat(5,q("Binop(\"=\","),x,q(","),z,q(")"))
#define Beqv(x,y,z) $$ = cat(5,q("Binop(\"===\","),x,q(","),z,q(")"))
#define Bge(x,y,z) $$ = cat(5,q("Binop(\">=\","),x,q(","),z,q(")"))
#define Bgt(x,y,z) $$ = cat(5,q("Binop(\">\","),x,q(","),z,q(")"))
#define Binter(x,y,z) $$ = cat(5,q("Binop(\"**\","),x,q(","),z,q(")"))
#define Blcat(x,y,z) $$ = cat(5,q("Binop(\"|||\","),x,q(","),z,q(")"))
#define Ble(x,y,z) $$ = cat(5,q("Binop(\"<=\","),x,q(","),z,q(")"))
#define Blt(x,y,z) $$ = cat(5,q("Binop(\"<\","),x,q(","),z,q(")"))
#define Bminus(x,y,z) $$ = cat(5,q("Binop(\"-\","),x,q(","),z,q(")"))
#define Bmod(x,y,z) $$ = cat(5,q("Binop(\"%\","),x,q(","),z,q(")"))
#define Bne(x,y,z) $$ = cat(5,q("Binop(\"~=\","),x,q(","),z,q(")"))
#define Bneqv(x,y,z) $$ = cat(5,q("Binop(\"~===\","),x,q(","),z,q(")"))
#define Bplus(x,y,z) $$ = cat(5,q("Binop(\"+\","),x,q(","),z,q(")"))
#define Brassgn(x,y,z) $$ = cat(5,q("Binop(\"<-\","),x,q(","),z,q(")"))
#define Brswap(x,y,z) $$ = cat(5,q("Binop(\"<->\","),x,q(","),z,q(")"))
#define Bseq(x,y,z) $$ = cat(5,q("Binop(\"==\","),x,q(","),z,q(")"))
#define Bsge(x,y,z) $$ = cat(5,q("Binop(\">>=\","),x,q(","),z,q(")"))
#define Bsgt(x,y,z) $$ = cat(5,q("Binop(\">>\","),x,q(","),z,q(")"))
#define Bslash(x,y,z) $$ = cat(5,q("Binop(\"/\","),x,q(","),z,q(")"))
#define Bsle(x,y,z) $$ = cat(5,q("Binop(\"<<=\","),x,q(","),z,q(")"))
#define Bslt(x,y,z) $$ = cat(5,q("Binop(\"<<\","),x,q(","),z,q(")"))
#define Bsne(x,y,z) $$ = cat(5,q("Binop(\"~==\","),x,q(","),z,q(")"))
#define Bstar(x,y,z) $$ = cat(5,q("Binop(\"*\","),x,q(","),z,q(")"))
#define Bswap(x,y,z) $$ = cat(5,q("Binop(\":=:\","),x,q(","),z,q(")"))
#define Bunion(x,y,z) $$ = cat(5,q("Binop(\"++\","),x,q(","),z,q(")"))
...
Specifications for Identity Translation
...
<bop>(x,y,z) x " " y " " z
<uop>(x,y) x y
...
Brace(x,y,z) x "\n" y "\n" z
Brack(x,y,z) x y z
Compound(x,y,z) x y "\n" z
Field(x,y,z) x y z
...
Specifications for Identity Meta-Translation
...
<bop>(x,y,z) "Binop(\"<bop>\"," x " , " z ")"
<uop>(x,y) "Unop(\"<uop>\"," y ")"
...
Brace(x,y,z) "Compound(" y ")"
Brack(x,y,z) "List(" y ")"
Compound(x,y,z) x "," z
Field(x,y,z) "Field(" x ",\"" z "\")"
...
Identity Meta-Translation Procedures
procedure main()
Mp() # call translation
end
...
procedure Binop(op, e1, e2) # e1 op e2
return ("(" || e1 || " " || op || " " || e2 || ")")
end
procedure Body(es[]) # procedure body
every write(!es)
return
end
procedure Break(e) # break e
return ("break " || e)
end
...
procedure Create(e) # create e
return ("create " || e)
end
...
procedure End() # end
write("end")
return
end
procedure Every(e) # every e
return ("every " || e)
end
procedure EveryDo(e1, e2) # every e1 do e2
return ("every " || e1 || " do " || e2)
end
...
procedure Invoke(e, es[] # e(e1, e2, ...)
local result
if *es = 0 then return (e ||"()")
result := ""
every result ||:= !es || ", "
return (e || "(" || result[1:-2] || ")")
end
procedure Key(s) # &s
return ("&" || s)
end
...
procedure Unop(op, e) # op e
return ("(" || op || "(" || e || "))")
end
...
procedure While(e) # while e
return ("while " || e)
end
procedure WhileDo(e1, e2) # while e1 do e2
return ("while " || e1 || " do " || e2)
end
Overview of the Meta-Translation Process
data:image/s3,"s3://crabby-images/c3bf6/c3bf6fec6c02c2b5748c0625d759bc7fe8390251" alt=""
Example of Identity Meta-Translation
procedure main()
1 & 2
x := a(1,2)
x.b := 3
write(x.b || x.c)
while 1 do break
write(while 1 do break "hello")
write(&time)
fail
end
- output of meta-translation:
procedure Mp()
Proc("main",)
Binop("&",Ilit(1),Ilit(2)),
Binop(":=",Var("x"),Invoke(Var("a"),Ilit(1),Ilit(2))),
Binop(":=",Field(Var("x"),"b"),Ilit(3)),
Invoke(Var("write"),Binop("||",Field(Var("x"),"b"),Field(Var("x"),"c"))),
WhileDo(Ilit(1),Break(Null())),
Invoke(Var("write"),WhileDo(Ilit(1),Break(Slit("hello")))),
Invoke(Var("write"),Key("time")),
Fail(),
);
End();
end
- result of compiling identity meta-translation procedures with
Mp()
and executing the result:
procedure main()
(1 & 2)
(x := a(1, 2))
((x.b) := 3)
write(((x.b) || (x.c)))
while 1 do break
write(while 1 do break "hello")
write(&time)
fail
end
Creating Other Meta-Translations
- in order to produce other kinds of translations, it only is necessary
to modify code-generation procedures
- for example, although string scanning is a control structure, not an
operation, and hence cannot be modeled by a procedure (it evaluates its
first argument and sets up a scanning environment before its second argument
is evaluated), it can be modeled by two procedures:
expr1 ? expr2 -> Escan(Bscan(expr1), expr2)
- such a translation can be used to better understand how string scanning
maintains scanning environments
- the identity code generation procedure for scanning is
procedure Scan(e1, e2) # e1 ? e2
return ("(" || e1 || " ? " || e2 || ")")
end
- the translation above can be accomplished by changing this procedure
to
procedure Scan(e1, e2)
return ("Escan(Bscan(" || e1 || ")," || e2 || ")")
end
Meta-Translators -- Comments and Critique
- creating the Icon meta-translator system was a major effort
- it works very well and has been used extensively, especially for program
monitoring
- the mechanism is intricate and somewhat difficult to understand, but
the details can be automated and hidden from the user
- the intermediate meta-translation is large -- typically 2 to 3 times
the size of the program being translated
- how easy it is to produce different translators depends heavily on
what is being done
- a person producing a meta-translator must thoroughly understand the
syntax and sometimes semantics of Icon
- since it is a meta-system, the syntax for code generation may be tedious
- the concept of meta-translation is largely language independent
- for example, it would be relatively easy to modify IMT to produce C
functions instead of Icon procedures to generate Icon code
- similarly, meta-translation systems can be written for other programming
languages
- to follow the model used by Icon, a formal grammar in terms of a suitable
parser-generator system (such as Yacc) is needed
- writing such a translator requires expert knowledge of the source-language
syntax and how parser-generators work