Knowledge Representation System Generator
Description

Dave Kuhlman
Reify
dkuhlman@netcom.com
1-415-368-8450

Topics:
-- Preliminary -- Rationale
-- What it does: (1) Functions and code generated; (2) support code; (3) embedded language interface
-- What it does NOT do
-- System requirements
-- Rights
-- How to use KR-SG
-- Possible uses
-- Current status
-- Future directions
-- Problems
-- Miscellaneous comments
-- Addresses and Sources for Support Code

Preliminary -- Rationale

Suppose you want to develop an application that enables users to manipulate and browse/edit a complex set of objects. Your approach might be to define a set of C++ classes, one for each type of object to be manipulated, to implement these C++ classes, and to write a graphical browser/editor. In doing so, you will produce three categories of code: (1) application specific code, (2) generic object manipulation code, (3) GUI code implementing the graphic interaction between the user and the objects.

KR-SG enables you to generate (2), the generic object manipulation code.

What It Does

KR-SG enables the user/developer to define a set of types. For each type, the developer specifies the contents of the type, i.e. the fields or member data items in the type and the type of each of these fields, and, optionally, a base type or superclass for the type.

For each specified type, KR-SG generates a C++ class. The data members are protected members of the C++ class, so that they will be inherited by subclasses of the type. KR-SG generates member functions for each class, as described below, plus some support code, e.g. for compiling objects of the described types.

KR-SG generates for each specified type a canonical C++ class containing the following functions:
-- Default ctor
-- Copy ctor
-- Virtual dtor
-- Assignment operator

KR-SG generates for each specified type a set of member functions:
-- A Copy function which does a deep copy.
-- A Save function that will save the object into a database.
-- A Load function that will load an object from a database.
-- A decompile function that will convert the internal, binary form of an object into an external, text (ASCII) form.
-- Code which, when processed by a parser generator (PCCTS), will convert the external, text (ASCII) form of an object into the internal, binary form. This code can also be used as the front end for a developer's parser or language processor.
-- A function that will, when called, generate code for IPFC (the Information Presentation Facility Compiler) from the contents of specified string fields.

For each member data item of each type (class), KR-SG generates set and get public member functions. If the name of a member data item is M and its type is T, then the prototypes of the get and set functions are:

    T Get_M();
    void Set_M(T newValue);

KR-SG comes with a sample application. It is a C++ program that (1) constructs some objects, (2) saves the objects into a database, (3) loads the objects from the database, (4) decompiles the objects to produce an ASCII file. This application also contains a REXX interface, i.e. it will run and execute the REXX code in a REXX file which contains calls to a sub-command handler which is implemented in the sample application.

KR-SG contains a working skeleton for a command-line app that is a compiler and decompiler.
-- When executed with the '-c' operation, it creates a new database.
-- When executed with the '-a' operation, it compiles an ASCII file and adds a new object to a database.
-- When executed with the '-x' operation, it loads an object from a database and decompiles it to an ASCII file.
-- When executed with the '-d' operation, it deletes an object from a database, making the space available for reuse.

The sample/test program distributed with KR-SG contains a REXX interface.
It shows how to call a user/developer's REXX file and execute the REXX code in that file. It contains a REXX subcommand handler which will process commands in the REXX file that are unknown to REXX or are in quotes (a signal to the REXX interpreter that the command is to be executed in an external environment). This sample subcommand handler supports operations to get and set member data items (fields in a type), create and delete objects, as well as operations to create, delete, and manipulate cursors used to point to and maintain position among objects.

KR-SG provides a string class KrString. It is a subclass of the string class offered by the compiler vendor (class String for Borland C++ for OS/2; class string for WATCOM C++ for OS/2).

KR-SG provides support for lists of objects. For each type specified, KR-SG generates a type-specific list class. This means that adding and accessing objects in lists is type-safe. If the name of a user's type is X, then the generated list type is a class named XList. The developer can inspect the generated code and the sample app for code samples showing how to create, add to, and access data in lists. There is also a string list class (KrStringList).

Comments on decompiling and compiling -- Why you might want to be able to do it.

If you feel that a decompiler and compiler are of no use, you can either (1) turn off the flag that causes KR-SG to generate them or (2) read the following reasons and uses:
-- A decompiler and compiler provide a bootstrap mechanism for producing data during early development, e.g. before you have implemented a graphical front end of your own. For example, you can use the Set_x() member functions to create a small set of objects, then decompile the objects, then use your text editor to copy the objects, then re-compile.
-- A decompiler can be used to produce a backup. This may provide some assurance, if you do not trust the binary database or worry that the database can be corrupted.
-- A decompiler and compiler can be used to move data from one platform to another. A text file may port across platforms where the binary database might not. You can decompile to text on one platform, move the text form to the other platform, then recompile.
-- A compiler can provide a mechanism for importing data from other applications. If you can add code or macros to the other application that produce a text file that satisfies the syntax of a decompiled file, then you can import that data by using the compiler.
-- You can use the compiler to produce programs that transform data while they compile it. The compiler is composed of a front-end (an Antlr grammar; Antlr is a part of PCCTS; it creates the parse tree) and a back-end (a Sorcerer tree grammar; it walks the parse tree and creates the objects). You can add C code to the actions in the rules in the back-end, to perform transformations on those objects.
-- You can use the generated grammars to produce a compiler of your own, possibly by keeping the front-end grammar, but then replacing the actions in the back-end grammar.
-- The ability to use a decompiler to produce a text form of your data may enable you to use tools that you could not otherwise use. For example, you can use grep as well as the search commands in your text editor to find things. You may be able to use the search and replace facility in your text editor to make rapid global changes. You may be able to use REXX or some other text processing language to write filters and transformers. And the code that KR-SG generates for the compiler can be used as the basis for pattern matching operations. (Remember that the grammars generated by KR-SG contain a rule that matches each object and field in your data.)

What It Does Not Do

KR-SG does not generate a GUI (graphical user interface) or code to support one.
If you get any ideas on what KR-SG could do to support such a need, please let me (Dave Kuhlman, dkuhlman@netcom.com, 415-368-8450) know about them.

The code produced by KR-SG does no memory management. This is C++, remember; you asked for a low level language; now you live with it. You need to delete anything that you create (with new) and a lot of things that the generated code creates, too.

System Requirements

In order to use KR-SG you will need the following:
-- ANSI C/C++ compiler. KR-SG has been tested with Borland C++ for OS/2, [WATCOM C++ for OS/2, IBM CSet++, and MetaWare C++ for OS/2].
-- Softfocus BTree package, a BTree and record manager distributed in C source code form. KR-SG contains a small file of glue functions which provide the interface between KR-SG generated code and the Softfocus BTree package. The developer can substitute a different BTree package by replacing/rewriting the functions in this file.
-- PCCTS -- the Purdue Compiler Construction Toolset. This parser generator is in the public domain and is distributed in source code form. I supply the needed support/header files and executables for OS/2.
-- The Icon programming language -- I supply executables for OS/2. These are sufficient for using KR-SG. If you make serious use of the Icon programming language, then you should send money to the Icon Project, so that they can track users, so that you can help support the development and distribution of Icon, and so that you will be sent the Icon newsletter. (See below for the address of the Icon Project.) Icon is a high level language that contains powerful built-in data types (strings, sets, lists, tables/dictionaries, records), generators, co-expressions, backtracking, garbage collection, etc.

Rights

I have placed KR-SG in the public domain. You can use the generated code, the code generator, and support code in any way you wish. KR-SG is provided "as-is" without warranty or liability.
I ask only the following: (1) if you pass on the code I provide, that you leave my name and address in place, and (2) if you create an application, that you place the following notice at the beginning of your documentation and in the help file in a window labeled "Credits" in the top level of the Contents list:

    This application produced with the help of the
    Knowledge Representation System Generator
    developed by Dave Kuhlman
    Internet: dkuhlman@netcom.com
    Phone: 415-368-8450

Please send suggestions and bug reports to me at the above E-mail address.

How to use KR-SG

This section is an overview of how to use KR-SG. More details can be found in the KR-SG User Guide.

A developer uses KR-SG to produce an application by performing the following steps. Some of these steps will be unnecessary for some applications and uses:
-- Unzip the distribution file to produce the KR directory structure on the developer's hard disk.
-- Describe the objects/types.
-- Create a control file.
-- Run MSG (the system generator) to produce source code.
-- Review and make any desired changes to the generated code.
-- Compile the generated code to produce a library.
-- Compile supporting code (DB, SUP).
-- Compile KRC (the Knowledge Representation Compiler).
-- Make changes to the sample application.
-- Compile the sample application.
-- Embed KR-SG into your application.
-- Modify the generator to customize generated code.
-- Integrate the generated REXX interface support.
-- Integrate with VX-REXX.
-- Integrate generated template insertion code into your text editor (EPM or Preditor/2).
Unzip the distribution file to produce the KR directory structure on the developer's hard disk -- Doing so will produce the following directory tree:

    kr ----+---- h
           |
           +---- sup   (support code for KrStrings, KrStreams, etc)
           |
           +---- db    (database support code, BTree glue functions, etc)
           |
           +---- gen   (code and support for the generator itself)
           |
           +---- kr    (the generated code goes here)
           |
           +---- krc   (the KR command line compiler/decompiler)
           |
           +---- test  (a sample program)
           |
           +---- libx  (x: b=Borland, i=IBM CSet++, w=WATCOM)

Describe the objects/types -- Produce a knowledge representation specification file (called, say, myapp.krs). This file contains a description of each type that you intend to define and manipulate. For each type, you specify the fields in that type. Each field must be one of the following types: (1) integer, (2) float, (3) string (a pointer to a KrString), (4) boolean, (5) user defined type (a pointer to a user defined type), (6) a list of one of the above. In addition, you can optionally specify a superclass for the type.

Create a control file -- This file contains command line flags used by the generator. These flags specify, for example, whether to generate specific types of code (C++ code, parser code, VX-REXX interface code, REXX interface code, etc), the input specification file, whether to generate comments in the code, etc.

Run MSG (the system generator) to produce source code -- MSG reads the type specification file and generates the various code output files. Copy these files to the /kr/kr directory. A batch file in the /kr/kr directory can be used to do this copy.

Review and make any desired changes to the generated code -- This may not be necessary, and you as a developer should try to avoid modifying the generated code (for one thing, because you will then not be able to change your type specification file and re-generate the code without losing your changes). Still, let's look at where some of the different types of code go.
MSG generates the following files:

myapp.hpp -- C++ header file for the type definitions. Contains C++ class definitions and record definitions for saving each type into the database. Also contains a list class for each type.

myapp.cpp -- C++ code for the member functions for each class/type. For each type, the following functions are generated: (1) default constructor, (2) copy constructor, (3) virtual destructor, (4) Load function to load the object from a database, (5) Save function to save the object into a database, (6) Decompile function to translate the object into a text form, (7) GenIpf function that produces code for the IPFC help compiler. Also contains definitions for the member functions in the list classes.

myappg.g -- Parser grammar used to produce a compiler for the types specified.

myappg.sor -- Tree parser grammar used to produce a compiler for the types specified.

myapp*.h -- Miscellaneous header files.

Compile the generated code to produce a library. The /kr/kr directory contains make files for compiling the generated code.

Compile supporting code in the following directories: /kr/db, /kr/sup. These contain make files for compiling the supporting code.

Compile KRC (the Knowledge Representation Compiler). The /kr/krc directory contains make files.

Make changes to the sample application. The calls to the Set_xxx() functions will have to be changed.

Compile the sample application. Run it. It should produce a database, a decompiled file, and a .IPF file (that can be compiled with IPFC.EXE).

Embed KR-SG into your own application. This should be reasonably easy if you follow the sample code in the sample application, the command line compiler, and the generated code itself (/kr/test/test.cpp, /kr/krc/krc.cpp, and /kr/kr/kr.cpp respectively).

Modify the generator to customize generated code. This step is optional. Do this if, for example, you need to embed some kinds of hooks or function calls into the generated code.
You can help yourself here by going to the /kr/gen directory, uncommenting the line at the beginning of GEN.ICN that defines msg_debug, then running MSG.EXE using the '-C' flag for comments (either in the MSG.CFG file or on the command line). This will generate comments in your .CPP file containing the file and line numbers of the generator code that produced the generated code. This will help you find the location where you need to make changes. Hopefully, you will be able to add a line of code that calls your function to a list of strings (lines) that become the generated code.

Integrate the generated REXX interface support. The sample application (/test/test.cpp) contains a sample REXX interface. In addition, you will need to provide the C code which implements procedures that you want your users to be able to call from REXX code. KR-SG helps you with this; it generates code that provides data access procedures that can be called from REXX to get and set the contents of member data items in your defined data types. This code is in the file krrex.inc. You will need to #include this file in your (sample/test/real) application.

Integrate with VX-REXX. [I haven't figured this one out yet.]

Integrate generated template insertion code into your text editor (EPM or Preditor/2). KR-SG generates two types of text editor template code: (1) ASSIST code for EPM and (2) .PEL code for Preditor/2. If you understand either of these, I believe you will be able to figure out where and how to integrate this code.

Possible Uses

A bibliography and references database -- The sample application is a starter set for such an application. This application contains types: Bibliography, Author, Documentation, Publisher.
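To make the shape of such a type concrete, here is a rough, hand-written sketch of what a class for the Author type might look like when it follows the conventions described earlier (protected data members, the canonical constructors and destructor, and Get_M/Set_M accessors). It is not actual KR-SG output. The field names Name and BirthYear and the helper MakeSampleAuthor are invented for illustration, std::string stands in for a pointer to a KrString, and the Save, Load, Decompile, and GenIpf functions are left out.

```cpp
// Hypothetical sketch only; NOT actual KR-SG output.
// Field names (Name, BirthYear) are invented, and std::string stands in
// for the KrString pointer that real generated code would use.
#include <cassert>
#include <string>

class Author {
public:
    // Canonical class members: default ctor, copy ctor, virtual dtor,
    // assignment operator.
    Author() : Name(), BirthYear(0) {}
    Author(const Author& other)
        : Name(other.Name), BirthYear(other.BirthYear) {}
    virtual ~Author() {}
    Author& operator=(const Author& other) {
        if (this != &other) {
            Name = other.Name;
            BirthYear = other.BirthYear;
        }
        return *this;
    }

    // One Get_M/Set_M pair per member data item M of type T.
    std::string Get_Name() { return Name; }
    void Set_Name(std::string newValue) { Name = newValue; }
    int Get_BirthYear() { return BirthYear; }
    void Set_BirthYear(int newValue) { BirthYear = newValue; }

protected:
    // Protected, so that subclasses of the type inherit the fields.
    std::string Name;
    int BirthYear;
};

// Usage sketch: build an object through its Set_ functions, then copy it.
Author MakeSampleAuthor() {
    Author a;
    a.Set_Name("A. Author");
    a.Set_BirthYear(1950);
    Author copy(a);  // the copy ctor duplicates every field
    assert(copy.Get_Name() == a.Get_Name());
    return copy;
}
```

Because every class has this same layout, code that works with one type (the REXX get/set operations, for example) can be written once by pattern and repeated for each type.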
A GUI description language -- Actually, my work for several years on such a language, and the wish I had to be able to generate a system to process such a language without all the hand coding I had to do, was the possible motivation for the invention and implementation of KR-SG.

Plant/factory and work station description and modeling -- This application might contain types Plant, WorkStation, WorkStationQueue, WorkPiece. If each WorkStation has a list of Events, each of which contains a list of Actions to be performed when that Event is received by the WorkStation, and if these Events were broadcast on (simulated) clock ticks, this could be a discrete event simulation (DES) system.

Political party control application -- This application might track the activities of the units and people working within a political party. This application might contain the types Party, LocalCommittee, Activist, RadicalActivist, etc.

A system for tracking fruit shipments -- This application might contain the types Truck, Railcar, Grower, RailwayTerminal, Buyer. Railcars are filled with packed fruit from Growers, then sent to a RailwayTerminal. Buyers request that Railcars containing specific types of fruit be routed to destination RailwayTerminals where they are available for unloading and distribution to SuperMarkets.

An action language system -- Most likely, this would be a part of a larger application. This action language might deal with Objects, Events (messages sent to sets of objects), and Actions. An object in this application might contain a list of Events; each Event would contain a list of Actions. The application executes an event loop that takes Events from a queue and sends them to objects. If the object contains that Event, then it executes each of the Actions in that Event.

A Dungeons and Dragons or role playing game -- In this application Players each have a current Role. A Player's current Role enables that Player to respond to certain events in specific ways (i.e.
the Role contains a list of Events, each of which contains a list of Actions (powers) that can be performed).

Current Status

Currently implemented and tested:
-- Database (BTree) support code
-- Misc support code -- KrString, KrOstream
-- C++ code generation, C++ class generation -- member functions: default constructor, copy constructor, virtual destructor, Save, Load, Decompile, GenIpf; type safe list class; minor miscellaneous support functions.
-- Compiler grammar generation -- Antlr grammar generation; Sorcerer grammar generation.
-- KRC (Knowledge Representation Compiler) -- create database, compile, decompile, delete object.
-- Sample application.

The following work is "in progress" and not yet completed:
-- Generation of REXX support code
-- Generation of VX-REXX support code.

KR-SG has been tested with:
-- Icon version:
       Version: Icon Interpreter Version 9.0. May 26, 1994
       Features: OS/2, interpreted, ASCII, co-expressions, direct execution, environment variables, external functions, fixed regions, keyboard functions, large integers, multiple regions, pipes, string invocation, system function
-- PCCTS version 1.23 compiled for OS/2 with WATCOM C++ v. 10.
-- C/C++: (1) Borland C++ for OS/2 v. 1.5; (2) WATCOM C++ v. 10 running under OS/2; (3) IBM CSet++ v. 2.01.
-- Softfocus BTree v. 3.1.

Future Directions

Here is a list of things from which I'll most likely select future work on KR-SG:

SQL/relational database support -- It would be nice to be able to store objects into and load objects from a relational database. And it would be useful to be able to search for objects using SQL.

GUI support -- It would be beneficial to be able to implement graphic front ends and graphical editor/browsers for applications produced with the aid of KR-SG and to do so in a platform independent way. How? I don't know. Perhaps when Tcl and Tk are supported on multiple platforms...
Some of us feel that a language which is higher level than REXX is needed. The problem with REXX is that its only data type is the string, and that is not very appropriate for a system (KR-SG) whose intent is to enable developers and users to manipulate objects and lists of objects. Icon (the language used to implement the code generator for KR-SG) is a possibility. Python, an interpreted, object-oriented language, is also a possibility. Both Icon and Python handle lists of objects well. The difficulties are: (1) embedding the language into the application and (2) writing the interface code that maps KR-SG objects into Icon/Python objects and back.

CEnvi/Cmm support -- Cmm is a C-like language that could serve as a more platform independent macro language than REXX.

An Oberon-2 connection -- C++ is a language for suicidal masochists. I'd like to see developers able to write their applications in Oberon-2 and still be able to access the classes, member data items, and member functions generated by KR-SG. One large step in this direction would be an Oberon-2 compiler that translates Oberon-2 code into C++ code and provides a good foreign function interface to Oberon-2 function calls. What KR-SG should do to support this is to generate Oberon-2 (interface) modules that define the type/record equivalent to the generated C++ classes. This needs design work. Since Oberon-2 does garbage collection, Oberon-2 support from KR-SG should take advantage of garbage collection so that developers do not have to worry about when to delete objects.

Problems

Code bulk -- KR-SG makes it easy to generate code. So, applications bulk up very quickly. None of the current C/C++ linkers strip un-needed functions from .OBJ files containing at least one needed function, or un-needed member functions from needed C++ classes. Also note that C++ templates generate code when instantiated. If C++ templates are good, then I suppose that generating lots of code is good.
Limited and awkward database connections -- KR-SG does not currently provide any specific support for conventional relational or SQL databases. We need a grouping facility in the database, possibly a hierarchical directory structure. The design of the generated code requires that an entire data object be loaded from the database into memory at once. There is no support for reading individual sub-objects one by one. KR-SG is most likely not appropriate for an application that must process a large database of hundreds of thousands of records, especially where transaction processing and transaction rollback support must be provided by the database.

KR-SG is currently tied to a specific BTree package (Softfocus) for file support. However, note that the access functions are all in a single file (/kr/db/bt.cpp), which should make it reasonably easy to substitute a different BTree package, or possibly even a relational DB package if it supports BLOBs (variable size binary blocks), as, for example, WATCOM's SQL package does.

The code generator itself is written in a high level language with good support for manipulating text strings (the Icon programming language), but is still relatively difficult to modify and extend. We need a better code generator implementation strategy/technology. I don't have a solution to this problem; we need exploration and research to find better code generator implementation technologies. Somebody needs to make my work and my life easier.

KR-SG is tied to C++ (it generates C++ code and the supporting code is written in C++). C++ is a terrible language. It is low level and dangerous. Although, as a target language for code generation, it's not so bad.

KR-SG has weak collection classes. KR-SG generates only lists. On the positive side, there is no dependence on a compiler vendor specific collection class library.
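A type-specific list class of the kind described earlier (a class named XList for a user type X) needs nothing beyond plain C++, which is how the vendor dependence is avoided. The following is a minimal hand-written sketch, not actual KR-SG output. The Document type, its Title field, and the member function names Append, Count, and At are all assumptions made for the example. The sketch also assumes that the list does not own the objects it points to, so the caller remains responsible for deleting them.

```cpp
// Hypothetical sketch only; NOT actual KR-SG output.
// Document, Title, Append, Count, and At are invented names.
#include <cassert>
#include <cstddef>
#include <string>

class Document {
public:
    Document() : Title() {}
    virtual ~Document() {}
    std::string Get_Title() { return Title; }
    void Set_Title(std::string newValue) { Title = newValue; }
protected:
    std::string Title;
};

// A list that accepts only Document pointers. No templates and no
// vendor collection library are involved.
class DocumentList {
private:
    struct Node {
        Document* item;
        Node* next;
    };

public:
    DocumentList() : head(0), tail(0), count(0) {}

    // Frees the list nodes only; the Documents belong to the caller.
    ~DocumentList() {
        Node* n = head;
        while (n != 0) {
            Node* next = n->next;
            delete n;
            n = next;
        }
    }

    // Adds a Document to the end of the list.
    void Append(Document* item) {
        Node* n = new Node;
        n->item = item;
        n->next = 0;
        if (tail != 0) {
            tail->next = n;
        } else {
            head = n;
        }
        tail = n;
        ++count;
    }

    std::size_t Count() const { return count; }

    // Returns the Document at a zero-based position.
    Document* At(std::size_t index) const {
        assert(index < count);
        Node* n = head;
        while (index-- > 0) {
            n = n->next;
        }
        return n->item;
    }

private:
    // Copying is left out of this sketch, so it is disabled.
    DocumentList(const DocumentList&);
    DocumentList& operator=(const DocumentList&);

    Node* head;
    Node* tail;
    std::size_t count;
};
```

Type safety comes from the signatures: Append accepts only a Document pointer and At returns one, so no casts are needed. The cost is one such class per type, which adds to the code bulk discussed above.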
KR-SG provides no way for the developer to generate code, modify the code, and then re-generate code without losing her/his modifications. The developer can modify the code generator so that it inserts hooks and callbacks in the generated code, but even that is awkward.

KR-SG generates static class/type definitions. KR-SG provides no support for the development of applications that enable end users to define new types and to modify existing types at runtime. Most of us respond to this criticism by saying, "Who cares?" However, this is a severe restriction for someone who is seeking a means for doing artificial intelligence style programming.

KR-SG does not permit cycles in data structures. Data structures must be acyclic. For example, in the case of a Bibliography which contains a list of Authors each of which contains a list of Documents, you cannot put a pointer to an Author in a Document.

Miscellaneous Comments

Populist programming -- I want to expand the set of people who can produce applications. That's why the REXX interface is an essential, not incidental, part of KR-SG. I'm hoping that KR-SG will enable developers to produce application platforms, so that users who are programmers but not systems programmers can create and extend applications.

Why I'm putting KR-SG in the public domain:
-- KR-SG is not all that packagable. I don't really believe it could be successfully sold.
-- I hope to make money from KR-SG related work, perhaps by using KR-SG to create applications for pay.
-- I want to support populist programming. I want (1) to help people to create applications much more easily and (2) to help make programming available to more people. I believe that KR-SG is a valuable tool to achieve this.

Questions for Contemplation and Research

Question: Is the technology behind KR-SG (code generation) the solution to all our future software development problems? Not likely.
KR-SG seems a reasonably natural way to produce code for applications needing definitions and code that do a lot of the same kind of things for different objects.

What about app frameworks? Are these an alternative to code generators? Are they complementary? Are code generators and app frameworks suitable for different situations? What are the characteristics of these different situations?

Arguments for Using Code Generators

You get lots of code. You get it quickly. You save lots of work.

Once the code generator has been debugged, there is a range of bugs that do not get into your (generated) code. For example, typos, errors in copying code, etc.

You don't have to figure out how to do some of the things that the generated code does. Examples: save to and load from a database; implement a parser/compiler; implement type-safe list classes.

Code generators generate regular, uniform code. In the case of KR-SG, once you have looked at one class generated by KR-SG, you have seen them all. The next time you generate a class with KR-SG, you will know where to look and what to look for in those classes. This regularity and consistency has benefits for those who must maintain and modify a system.

Addresses and Sources for Support Code

Icon Programming Language:
    The Icon Project
    Department of Computer Science
    Gould-Simpson Building
    University of Arizona
    Tucson, Arizona 85721
    1-602-621-8448
    Internet: icon-project@cs.arizona.edu

Softfocus BTree Library:
    Softfocus
    1343 Stanbury Rd.
    Oakville, Ontario, Canada L6L 2J5
    1-905-825-0903

PCCTS (Purdue Compiler Construction Toolset):
    Internet anonymous ftp: everest.ee.umn.edu:
        /pub/pccts
        /pub/pccts/sorcerer
    Also see Internet newsgroup: comp.compilers.tools.pccts

References

[Gris 1990] Griswold, Ralph E. and Madge T. Griswold; The Icon Programming Language; Prentice Hall, 1990.