From tenaglia@fps.mcw.edu  Wed Jan  3 08:16:12 1990
Received: from RUTGERS.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA16236; Wed, 3 Jan 90 08:16:12 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA02559; Wed, 3 Jan 90 10:15:21 EST
Received: by uwm.edu; id AA02111; Wed, 3 Jan 90 09:06:14 -0600
Message-Id: <9001031506.AA02111@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Wed,  3 Jan 90 08:31:28 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Wed,  3 Jan 90 08:08:07 CDT
Date: Wed,  3 Jan 90 08:08:07 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@arizona.edu
Subject: Handy icon procedure make hex dumps of strings
Status: O

I have another handy but small procedure. It's called dump(str). It's
chiefly a debugging tool. It has to be linked with radcon, however. dump(str)
converts a string of unknown bytes into a list of hexadecimal formatted
ascii. The string "Hello" becomes the list ["48","65","6C","6C","6F"]. This
can be output nicely with the expression : every writes(!dump(str),"  ")

-------------------------------------------------------------------------

##################################################################
#                                                                #
# THIS PROCEDURE CONVERTS A MYSTERY STRING TO A HEX DUMP LIST    #
#                                                                #
##################################################################
procedure dump(Str)             # REQUIRES LINK RADCON !
  Buffer := []
  every put(Buffer,right(map(radcon(ord(!Str),10,16),&lcase,&ucase),2,"0"))
  return Buffer
  end

---------------------------------------------------------------------------

Perhaps it could have been done better as a generator or coexpression.
Any ideas for improvements?

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu
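
On the generator question above: a minimal sketch, assuming a hypothetical
procedure name hexbytes (not part of the original post or of the Icon program
library). It suspends one two-digit hex string per byte instead of building
a list, and it indexes a digit table rather than calling radcon, so no link
is needed.

```icon
# hexbytes is an illustrative name, not an IPL procedure.
procedure hexbytes(s)
   local n
   static hexdigits
   initial hexdigits := "0123456789ABCDEF"
   every n := ord(!s) do
      # high nibble, then low nibble; string positions are 1-based
      suspend hexdigits[n / 16 + 1] || hexdigits[n % 16 + 1]
end

procedure main()
   every writes(hexbytes("Hello"), "  ")
   write()
end
```

Because hexbytes is a generator, the caller drops the ! that dump() needs:
every writes(hexbytes(str),"  ")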


From gmt  Wed Jan  3 10:29:59 1990
Date: Wed, 3 Jan 90 10:29:59 MST
From: "Gregg Townsend" <gmt>
Message-Id: <9001031729.AA22739@megaron.arizona.edu>
Received: by megaron.arizona.edu (5.59-1.7/15)
	id AA22739; Wed, 3 Jan 90 10:29:59 MST
To: icon-group
Subject: The Icon project has moved
Status: O

The Icon project's home machine, formerly "arizona.edu", has changed its
Internet domain name to "cs.arizona.edu".  The uucp sitename of "arizona"
has not changed.

FTP files from:			cs.arizona.edu
				(128.196.128.118 or 192.12.69.1)

Send questions to:		icon-project@cs.arizona.edu
				uunet!arizona!icon-project

Mailing list contributions:	icon-group@cs.arizona.edu		
				uunet!arizona!icon-group

Changes of address:		icon-group-request@cs.arizona.edu
				uunet!arizona!icon-group-request

From tenaglia@fps.mcw.edu  Fri Jan  5 09:16:56 1990
Received: from RUTGERS.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA13014; Fri, 5 Jan 90 09:16:56 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA06126; Fri, 5 Jan 90 11:16:00 EST
Received: by uwm.edu; id AA27525; Fri, 5 Jan 90 09:32:05 -0600
Message-Id: <9001051532.AA27525@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Fri,  5 Jan 90 08:50:10 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Fri,  5 Jan 90 08:07:21 CDT
Date: Fri,  5 Jan 90 08:07:21 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: Procedures for packed decimal conversions
Status: O


Dear Icon-Group,

Here are two other interesting procedures. pack() and unpack() deal
with translating numbers to and from a packed decimal format. Packed
format is used to de/compress decimal numbers. The packed numbers use
just over 1/2 the space of ascii formatted numbers. I use the format
for experiments with encryption. Sometimes databases may use them.
unpack() requires radcon() from the icon program library. Perhaps
someone would care to improve on my code either in performance or
elegance? pack(num,width) packs a number into a packed decimal string.
num is the number and width is the size of the target packed string.
unpack(val,width) unpacks a packed decimal number into an integer.
width is the width of the returned integer.
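
To make the format concrete, here is a small sketch of the nibble layout,
assuming the sign convention the code below uses (a final nibble of hex C
for positive, D for negative). The procedure name layout is hypothetical,
and it returns the layout as readable hex text rather than the packed bytes
themselves.

```icon
# layout is an illustrative helper, not part of pack() or unpack().
procedure layout(n)
   local s
   s := abs(n) || (if n < 0 then "D" else "C")   # digits, then sign nibble
   if *s % 2 ~= 0 then s := "0" || s             # pad to a whole number of bytes
   return s
end

procedure main()
   local n
   every n := (1234 | -1234 | 5) do
      write(right(n, 6), " -> ", layout(n), " (", *layout(n) / 2, " byte(s))")
end
```

Each pair of hex characters stands for one packed byte, so 1234 occupies
three bytes (01 23 4C) where the ASCII digits alone would take four.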

##################################################################
#                                                                #
# THIS PROCEDURE PACKS AN INTEGER IN TO PACKED DECIMAL STRING.   #
#                                                                #
##################################################################
procedure pack(num,width)           # 5p
  local int,sign,prep,packed,i,word,calc
  (int := integer(num)) | fail                
  if int < 0 then sign := "=" else sign := "<"
  prep := abs(int) || sign ; packed := ""   # abs() keeps "-" out of the digits
  if (*prep % 2) ~= 0 then prep := "0" || prep
  every i := 1 to *prep by 2 do
    {
    word := prep[i+:2]
    if word[-1] == ("=" | "<") then
      {
      calc     := word[1]*16 + ord(word[2])-48
      packed ||:= char(calc)
      next
      }
    calc     := word[1]*16 + word[2]
    packed ||:= char(calc)
    }
  /width := *packed
  return right(packed,width,"\0")
  end                                   

##################################################################
#                                                                #
# THIS PROCEDURE UNPACKS A VALUE INTO AN INTEGER.                #
#                                                                #
##################################################################
procedure unpack(val,width)         # 6p     REQUIRES LINK RADCON !
  local tmp,number,hex,sign
  tmp := "" ; sign := 1
  every number := ord(!val) do
    {
    # pad to two digits so a byte such as 05 keeps its leading zero nibble
    hex := right(map(radcon(number,10,16),&lcase,&ucase),2,"0")
    tmp ||:= hex
    }
  if tmp[-1] == ("B" | "D") then sign := -1
  tmp[-1] := "" ; tmp *:= sign ; /width := *tmp
  return right(tmp,width)
  end
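
For comparison, a decoding sketch that does not need radcon. It assumes the
same layout as above (two digits per byte, with the last nibble carrying the
sign, B or D meaning negative). The name unpacked is hypothetical, and the
width handling is left out to keep the example short.

```icon
# unpacked is an illustrative name; it returns an integer, not a padded string.
procedure unpacked(val)
   local digits, n, sign
   static hex
   initial hex := "0123456789ABCDEF"
   digits := ""
   every n := ord(!val) do
      digits ||:= hex[n / 16 + 1] || hex[n % 16 + 1]   # always two nibbles per byte
   sign := (if digits[-1] == ("B" | "D") then -1 else 1)
   return sign * integer(digits[1:-1])                 # drop the sign nibble
end

procedure main()
   write(unpacked("\x01\x23\x4C"))   # bytes 01 23 4C
   write(unpacked("\x01\x00\x5D"))   # bytes 01 00 5D
end
```

The two calls write 1234 and -1005. The second case is the one to watch in
any decoder: the 00 byte has to contribute two zero digits.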

Have Fun !

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From SHAFIE@UCBEH.SAN.UC.EDU  Thu Jan 11 13:29:03 1990
Received: from ucbeh.san.uc.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA07629; Thu, 11 Jan 90 13:29:03 MST
Date: Thu, 11 Jan 90 14:13 EST
From: Amin Shafie - Univ of Cincinnati Comp Ctr <SHAFIE@UCBEH.SAN.UC.EDU>
Subject: SIGUCCS CALL for PARTICIPATION
To: 386USERS@TWG.COM, 9370-L%HEARN.BITNET@MITVMA.MIT.EDU,
        AAI@ST-LOUIS-EMH2.ARMY.MIL, ADA-SW@WSMR-SIMTEL20.ARMY.MIL,
        ADVISE-L%CANADA01.BITNET@CUNYVM.CUNY.EDU, ADVSYS@EDDIE.MIT.EDU,
        AG-EXP-L%NDSUVM1.BITNET@CUNYVM.CUNY.EDU, AI-ED@SUMEX-AIM.STANFORD.EDU,
        AIDSNEWS%RUTVM1.BITNET@CUNYVM.CUNY.EDU, AIList@AI.AI.MIT.EDU,
        AIX-L%BUACCA.BITNET@MITVMA.MIT.EDU, ALLIN1-L@CCVM.SUNYSB.EDU,
        AMETHYST-USERS@WSMR-SIMTEL20.ARMY.MIL, AMIGA-RELAY@UDEL.EDU,
        ANDREW-DEMOS@ANDREW.CMU.EDU, ANTHRO-L%UBVM.BITNET@CUNYVM.CUNY.EDU,
        apollo@UMIX.CC.UMICH.EDU, ARMS-D@XX.LCS.MIT.EDU,
        ARPANET-BBOARDS@MC.LCS.MIT.EDU, ASM370%UCF1VM.BITNET@CUNYVM.CUNY.EDU,
        AVIATION@MC.LCS.MIT.EDU, AVIATION-THEORY@MC.LCS.MIT.EDU,
        bicycles@BBN.COM, BIG-LAN@SUVM.ACS.SYR.EDU, BIG-LAN@SUVM.BITNET,
        BIOTECH%UMDC.BITNET@CUNYVM.CUNY.EDU, BIOTECH@UMDC.UMD.EDU,
        BITNEWS%BITNIC.BITNET@CUNYVM.CUNY.EDU,
        BMDP-L%MCGILL1.BITNET@CORNELLC.CCS.CORNELL.EDU,
        bug-1100@SUMEX-AIM.STANFORD.EDU, CA@THINK.COM,
        CADinterest^.es@XEROX.COM, CAN-INET@MC.LCS.MIT.EDU,
        cisco@SPOT.COLORADO.EDU
Message-Id: <F5FA22744FDF00D81C@UCBEH.SAN.UC.EDU>
X-Envelope-To: Icon-Group@ARIZONA.EDU
X-Vms-To: @LISTS.DIS
X-Vms-Cc: SHAFIE
Status: O

<--------------------------------------------------------------------
< 
<                 SIGUCCS User Services Conference XVIII
<                        Call For Participation
<
<                  New Centerings in Computing Services
< 
<                  September 30 through October 3, 1990
<
<                           Westin Hotel
<                         Cincinnati, Ohio
< 
<
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<< 
<<Attention Directors, Managers, Analysts, Consultants, Programmers,
<<Technical Writers, Trainers, and Librarians!
<< 
<<The higher education computing scene in the 1990s will present exciting
<<challenges.  To accommodate users' needs, computing service organizations
<<are now visibly transforming in function and structure.  The widespread
<<adoption of personal computing by all disciplines, the increasing demand
<<for desktop access to shared resources, the growth in demand for
<<supercomputing capabilities, and the proliferation of powerful desktop
<<workstations exert irresistible forces on central computing services.
<<In response, the central site grows exponentially in staff and machinery
<<at one academic institution; at another, the computing center is disbanded
<<to provide distributed computing!  At some sites increasing specialization
<<is urged; at others, generalization is required.  Regardless of the
<<transforming strategy adopted by an individual institution, one fact
<<seems clear:  the user is the center toward which all computing services
<<are directed.
<< 
<<SIGUCCS '90 invites you to participate in the examination and discussion
<<of the myriad challenges facing user services professionals as we enter a
<<new decade and of the new centerings computing service organizations are
<<discovering to meet them.  Please join us!
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<<You can Participate
<< 
<<	Presentations
<< 
<<	Papers
<< 
<<	Panel Discussions
<< 
<<	Quick Workshops
<< 
<<	Educational Materials Competition
<< 
<<	Newsletter Competition
<< 
<<	Technical Writing Competition
<< 
<<	Documentation Display
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<< 
<< 
<<Important Dates
<< 
<<	March 1, 1990		Presentation proposals due
<<	April 1, 1990		Notification of proposal acceptance
<<	May 1, 1990		Final Papers due
<<	June 1, 1990		Newsletter entries due
<<	June 1, 1990		Technical writing entries due
<<	June 15, 1990		Notification of paper/panel acceptance
<<	September 1, 1990	Deadline for materials for
<<				documentation display
<< 
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<<Presentation Topic Areas
<< 
<< 
<<Information Exchange Technology
<< 
<<Information exchange may well be the most important computing
<<activity of the 1990s. The infrastructure for information delivery, the
<<National Research and Education Network (NREN), is presently being developed.
<<How do we meet the challenges of a world where the
<<facilitation of information delivery may be a principal user services
<<responsibility?  Topics of particular interest include:
<< 
<<	new approaches to information exchange
<< 
<<	campus activity in implementing information exchange
<<	facilities that comply with emerging international standards
<< 
<<	research and development of computer-mediated information
<<	exchange methods
<< 
<< 
<<Distributed Services
<< 
<<As the role of user services shifts to providing distributed support,
<<we must create new ways of providing traditional services as well as
<<designing new services.  Topics of particular interest include:
<< 
<<	providing support staff in departments and colleges
<< 
<<	funding issues
<< 
<<	if and how to charge back for services
<< 
<<	human networking of distributed support staff
<< 
<<	nonlabor-intensive support strategies
<< 
<<	cooperative efforts with other departments
<< 
<< 
<< 
<<Management Strategies
<< 
<<How do user services managers cooperate with other administrative and
<<academic units that use or provide computing resources?  How do they
<<meet the many and diverse demands?  Topics of particular interest include:
<< 
<<	reorganization
<< 
<<	interaction with faculty advisory groups
<< 
<<	delegating and distributing responsibility
<< 
<<	coordinating university computing resources
<< 
<<	staff professional development
<< 
<< 
<<Marketing your Services
<< 
<<Changing roles may require changing your services and, often, your image on
<<campus as you provide new services to new users.  Topics of particular in-
<<terest include:
<< 
<<	promotional strategies
<< 
<<	conducting market research
<< 
<<	designing services for unique or special audiences
<< 
<< 
<< 
<<Strategies for Small Schools
<< 
<<How can a small liberal arts college have distributed user services and
<<centralized user services?  How do distributed and centralized services work
<<together to provide computing services beyond word processing?  The
<<sciences have become computer literate; now, how do we reach out  from the
<<center to the humanities and fine arts?  Are we getting out of the
<<office and into the trenches?  Are we making too many "house calls"?
<<Should we make them at all?
<< 
<< 
<<Security and Ethics
<< 
<<As electronic mail and conferencing become more popular, computing
<<systems are widely accessible to more users.  How secure should academic
<<computing resources be?  What are the ethical guidelines provided for users
<<of electronic mail and conferencing systems?  Topics of particular interest
<<include:
<< 
<<	promoting responsible and ethical use of computing resources
<< 
<<	security strategies
<< 
<<	adopting an ethics policy
<< 
<< 
<<Serving New Audiences
<< 
<<People from the humanities, the arts, and other traditionally nontechnical
<<disciplines are discovering that computers can help in areas other than
<<word processing.  In an increasingly proactive stance in the central
<<computing facility, what do we do to attract and support these new audi-
<<ences?  Topics of interest include:
<< 
<<	providing information about off-the-shelf specialized
<<	programs for music, fine arts, and the humanities
<< 
<<	facilitating technical support of nontraditional areas
<< 
<<	serving the computing beginner who wants to do
<<	sophisticated tasks
<< 
<< 
<<Consulting, Training, and Documentation
<< 
<<Supporting those who use the computing resources that we provide re-
<<mains an important responsibility of user services organizations.  Topics
<<of particular interest include:
<< 
<<	new approaches to training
<< 
<<	providing distributed consulting
<< 
<<	documentation distribution services
<< 
<< 
<<and/or other topics that would be of interest to your national
<<and international colleagues
<< 
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<<Submitting Proposals
<< 
<< 
<<Submit proposals via electronic mail to:
<< 
<<	SIGPAPER@OHSTVMA.BITNET or
<< 
<<	SIGPAPER@OHSTVMA.IRCC.OHIO-STATE.EDU
<< 
<<If you do not have access to electronic mail, send a printed copy to:
<< 
<<		Susan Jenkins Saari
<<		Instruction and Research
<<		Computer Center
<<		The Ohio State University
<<		1971 Neil Avenue
<<		Columbus, OH 43210
<< 
<<		phone:      (614) 292-4843
<<		fax:      (614) 292-7081
<< 
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<<Accepted Proposals
<< 
<< 
<<Proposals must be received by March 1, 1990.  Any submission received
<<after this date will not be considered by the Program Committee.  You will
<<be notified of the Program Committee's decision by April 1, 1990.  If your
<<proposal is accepted, you will be asked to submit a full paper by May 1,
<<1990.  Any papers received after this date will not be considered.  You will
<<be notified of the Program Committee's decision by June 15, 1990.
<< 
<<If your presentation is accepted, SIGUCCS is depending on you.  If you are
<<ker to make your presentation (not a substitute presentation).
<< 
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<< 
<<How to Participate
<< 
<< 
<<Proposals
<< 
<<For each proposal, include your name, title, affiliation, mailing ad-
<<type of  proposal (presentation or panel discussion) and its topic area.
<<In addition, you must enclose the proper materials from the following
<<requirements list:
<< 
<<Description
<< 
<<Papers		Papers will be presented in 20-minute intervals, with
<<		three papers scheduled per 90-minute session. Speakers'
<<		papers will be published in the conference proceedings.
<< 
<<Panels		Panels will be in-depth treatments of a single topic by
<<		two to four speakers from at least two different schools,
<<		coordinated by a moderator.  Allow ample time for audience
<<		discussion.  Abstracts for panels should be submitted
<<		as a unit by the person who wishes to act as a moderator.
<<		Panelists' papers will be published in the conference
<<		proceedings.
<< 
<<Quick Workshops	Quick workshops provide 90-minute overviews of new technolo-
<<		gies, innovative applications, and creative strategies
<<		for addressing the needs of computer users on campus.
<< 
<< 
<<Requirements
<< 
<<Papers		A 250- to 300-word abstract of the paper.  Acceptance of
<<		a proposal does not automatically ensure acceptance
<<		of a paper for presentation; you must submit a full
<<		paper to be considered for the conference program.
<< 
<<Panels		A 250- to 300-word description of the panel, including
<<		each panelist's name, title, affiliation, and presentation
<<		topic.  Acceptance of a panel description does not
<<		automatically ensure acceptance of the panel for
<<		presentation; each panelist must submit a full paper
<<		to be considered for the conference program.
<< 
<<Quick Workshops	A one- to two-page outline of the presentation and a
<<		10-minute videotape excerpt from the proposed presentation.
<<		Acceptance of a proposal does not automatically ensure
<<		acceptance of a workshop for presentation; you must
<<		submit a full paper to be considered for the conference
<<		program.  Only three or four presentations will be
<<		accepted in this category because it is highly competitive.
<< 
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<< 
<<Other Ways to Participate
<< 
<<Education and Training Materials Competition
<< 
<<Interest in and the importance of user education and training have grown
<<with each SIGUCCS conference.  The 1990 SIGUCCS Conference offers,
<<for the first time, competition for user education and training materials for
<<colleges and universities.*  We invite you to submit no more than two
<<entries in any or all of the following categories: curriculum catalog, class-
<<room printed materials, or self-contained printed tutorials.  Although the
<<first year of this competition includes only printed materials, we would like
<<to know if there is an interest in expanding our future competitions to
<<include video, audio, and computer-based tutorials.  Deadline for entry is
<<June 1, 1990.  For more details and an entry form, or to address the issue
<<of future competition categories, contact:
<< 
<<		Diane Jung-Gribble
<<		Indiana University
<<		750 North State Road 46 Bypass
<<		Bloomington, IN  47405
<< 
<<		(812) 855-0962
<< 
<< 
<<		JUNG@IUBACS.BITNET
<<		JUNG@JADE.BACS.INDIANA.EDU
<< 
<<*NOTE:  this competition is not open to commercial materials
<< 
<<Newsletter Competition
<< 
<<Winning an award in the SIGUCCS Newsletter Competition is a mark of
<<distinction for your institution, and for your editors, writers, artists, and
<<designers.  You will be asked to submit two consecutive issues published
<<between June 1989 and May 1990.  Deadline for entry is June 1, 1990.
<<For more details and an entry form, contact:
<< 
<<		Jess Anderson
<<		Madison Academic Computing Center
<<		University of Wisconsin-Madison
<<		1210 West Dayton Street
<<		Madison, WI   53706
<< 
<<		(608) 263-6988
<< 
<<		ANDERSON@MACC.WISC.EDU
<<		ANDERSON@WISCMACC.BITNET
<< 
<< 
<<Technical Writing Competition
<< 
<<If you have written or published a particularly good article in a computing
<<newsletter, enter it in the Technical Writing Competition.  Each computing
<<center may enter one article.  Deadline for entry is June 1, 1990.  To obtain
<<entry forms and more details, contact:
<< 
<<		Donald J. Montabana
<<		University of Pennsylvania
<<		Computing Resources Center
<<		1202 Blockley Hall
<<		Philadelphia, PA  19104-6021
<< 
<<		(215) 898-9085
<< 
<<		MONTABANA@A1.RELAY.UPENN.EDU
<< 
<< 
<< 
<<Documentation Display
<< 
<<The documentation room will feature an online system for submitted
<<documentation.  Conference attendees who have BITNET or INTERNET
<<access will be able to email documentation to their university or college.
<<Documentation may be submitted electronically to DOCUMENT@MIAMIU,
<<by hardcopy, or diskette (IBM or Mac formatted) and must be received
<<before September 1, 1990.  Direct inquiries to:
<< 
<<		Al Kaled
<<		Academic Computing Services
<<		Miami University
<<		Oxford, OH  45056
<< 
<<		(513) 529-6226
<< 
<<		AK75STAF@MIAMIU
<< 
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<< 
<<More Information
<< 
<< 
<<General Information
<< 
<<Amin Shafie, Conference Chair
<<University of Cincinnati
<< 
<< 
<<		e-mail:		SHAFIE@UCBEH.BITNET
<< 
<<		phone:		(513) 556-9001
<< 
<<		fax:		(513) 556-0035
<< 
<< 
<<Call for Participation
<<Susan Jenkins Saari, Program Chair
<<The Ohio State University
<< 
<<		e-mail:		SIGPAPER@OHSTVMA.BITNET
<< 
<<		phone:		(614) 292-4843
<< 
<<		fax:		(614) 292-7081
<< 
<< 
<<Registration
<<Ken Maccarone, Registration Chair
<<University of Cincinnati
<< 
<<		e-mail:		MACCARON@UCBEH.BITNET
<< 
<< 
<<		phone:		(513) 556-9098
<<		fax:		(513) 556-0035
<< 
<< 
<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<< 
<< 
<<ACM SIGUCCS
<< 
<<The Association of Computing Machinery's (ACM) Special Interest Group
<<for University and College Computing (SIGUCCS) is one of ACM's
<<organizational units devoted to the technical activities of its members.
<<SIGUCCS provides a link for guidance and the interchange of ideas among
<<computing professionals in the full range of small to large institutions.
<<Its newsletter, annual conferences, and workshops promote the discussion
<<of mutual problems. networks, user services, and computer center management.
<<This SIGUCCS conference emphasizes practical ways to improve services for
<<those who use university and college computing centers.


Amin Shafie
Assistant Director
Academic Computing Services                UCBEH::SHAFIE
University of Cincinnati                   SHAFIE@UCBEH.SAN.UC.EDU
ML 088                                     SHAFIE@UCBEH.BITNET
Cincinnati, Ohio  45221
(513) 556-9022

From ASRRM@asuacad.bitnet  Fri Jan 12 13:47:46 1990
Message-Id: <9001122047.AA20569@megaron>
Received: from rvax.ccit.arizona.edu by megaron (5.59-1.7/15) via SMTP
	id AA20569; Fri, 12 Jan 90 13:47:46 MST
Received: from ASUACAD.BITNET by rvax.ccit.arizona.edu; Fri, 12 Jan 90 13:43 MST
Received: by ASUACAD (Mailer R2.05) id 9455; Fri, 12 Jan 90 13:37:55 MST
Date: Fri, 12 Jan 90 13:37:11 MST
From: mannem ravinder reddy <ASRRM@asuacad.bitnet>
Subject: unsubscribe
To: icon-group <icon-group@cs.arizona.edu>
Status: O

unsubscribe icon-group

From PRONK@HROEUR5.BITNET  Thu Jan 18 16:01:25 1990
Message-Id: <9001182301.AA18292@megaron.arizona.edu>
Received: from rvax.ccit.arizona.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA18292; Thu, 18 Jan 90 16:01:25 MST
Received: from HROEUR5.BITNET by rvax.ccit.arizona.edu; Thu, 18 Jan 90 15:44 MST
Date: Thu, 18 Jan 90 13:06 N
From: PRONK@HROEUR5.BITNET
Subject: unsubscribe
To: icon-group@cs.arizona.edu
X-Original-To:  icon-group@cs.arizona.edu, PRONK
Status: O

unsubscribe

From icon-group-request@arizona.edu  Fri Jan 19 06:50:55 1990
Received: from ucbvax.Berkeley.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA01143; Fri, 19 Jan 90 06:50:55 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41)
	id AA29135; Fri, 19 Jan 90 05:40:50 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews
	for icon-group@arizona.edu (icon-group@arizona.edu)
	(contact usenet@ucbvax.Berkeley.EDU if you have questions)
Date: 18 Jan 90 22:00:00 GMT
From: amdahl!ntmtv!hildum@apple.com  (Eric Hildum)
Organization: Northern Telecom (Mountain View, CA)
Subject: Installing Icon 7.5 on Sun 4
Message-Id: <679@ntmtv.UUCP>
References: <678@ntmtv.UUCP>
Sender: icon-group-request@arizona.edu
To: icon-group@arizona.edu
Status: O


I have installed Icon 7.5 on a Sun 4 workstation, and run into some
problems. The operating system is SunOS 4.0.3c, the installation was
done by Bill Mitchell on November 22, 1988. 

After the installation, I ran the full test suite, and the gc2 and
checking tests apparently did not pass.  In addition, this port does
not support overflow checking or co-expressions.

Are these known problems, and is there a more recent port to the Sun 4
which supports overflow checking and co-expressions?

			Thanks,
				Eric

replies to:

			ntmtv!hildum@ames.com
			hildum@iris.ucdavis.edu

From icon-group-request@arizona.edu  Fri Jan 19 06:51:00 1990
Received: from ucbvax.Berkeley.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA01158; Fri, 19 Jan 90 06:51:00 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41)
	id AA29153; Fri, 19 Jan 90 05:41:00 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews
	for icon-group@arizona.edu (icon-group@arizona.edu)
	(contact usenet@ucbvax.Berkeley.EDU if you have questions)
Date: 18 Jan 90 21:53:30 GMT
From: amdahl!ntmtv!hildum@apple.com  (Eric Hildum)
Organization: Northern Telecom (Mountain View, CA)
Subject: Installing Icon on Sun3 with SunOS 4.0.3
Message-Id: <678@ntmtv.UUCP>
Sender: icon-group-request@arizona.edu
To: icon-group@arizona.edu
Status: O


I have just installed Icon 7.5 on a Sun 3/60 with SunOS 4.0.3 and have a
couple of issues.

First, I have to change the -m68020 switch to -sun3 to get correct
compilation.  This seems to work just fine; is this change going to
be made to the installation available on arizona.edu?

The new default for the cc compiler is to use software floating point,
rather than the switch option.  Would it be reasonable to change the
default sun3 installation to include the -fswitch option?

The Header file now requires almost 12000 bytes. Is this reasonable?


Other than these issues, everything went well, and all the tests
passed.

From cargo@tardis.cray.com  Fri Jan 19 07:39:42 1990
Received: from uc.msc.umn.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA02802; Fri, 19 Jan 90 07:39:42 MST
Received: from hall.cray.com by uc.msc.umn.edu (5.59/1.14)
	id AA19081; Fri, 19 Jan 90 08:37:55 CST
Received: from zk.cray.com by hall.cray.com
	id AA04437; 3.2/CRI-3.12; Fri, 19 Jan 90 08:39:39 CST
Received: by zk.cray.com
	id AA00765; 3.2/CRI-3.12; Fri, 19 Jan 90 08:39:35 CST
Date: Fri, 19 Jan 90 08:39:35 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9001191439.AA00765@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: concat with blank and a question
Status: O

I recently purchased a product called HyperPAD for the PC.  It is intended
to be a HyperCard-like product for the PC.  What I found interesting was
something in its list of operators.  To concatenate two strings there is
the concatenation operator &.  However, there is an operator to concatenate
two strings with a space in between them, the && operator.  I realized that
this one little feature was a nice convenience.  I know I have often used
something like  a || " " || b because I needed to put a space between two
strings I was combining.  Maybe from an overall language viewpoint this isn't
a significant improvement, but I thought it was an interesting addition to
a language with string processing.

The question I have is:  What is the best way to eliminate spaces (or, to
generalize, members of a particular cset) from a string while still preserving
the order of the remaining characters?  I'm going to be performing some comparisons
where certain characters are very likely to not be significant.  I'm looking for
an efficient way of doing preprocessing to remove the insignificant characters.

                         o       o
                          \_____/
                          /-o-o-\     _______
DDDD      SSSS   CCCC    /   ^   \   /\\\\\\\\
D   D    S      C        \ \___/ /  /\   ___  \
D   D     SSS   C         \_   _/  /\   /\\\\  \
D   D        S  C           \  \__/\   /\ @_/  /
DDDDavid SSSS.   CCCCargo    \____\____\______/ CARGO@TARDIS.CRAY.COM

From goer@sophist.uchicago.edu  Fri Jan 19 09:17:22 1990
Received: from tank.uchicago.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA09664; Fri, 19 Jan 90 09:17:22 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 19 Jan 90 10:17:26 CST
Return-Path: <goer@sophist.uchicago.edu>
Received:  by sophist.uchicago.edu (3.2/UofC3.0)
	id AA22197; Fri, 19 Jan 90 10:13:19 CST
Date: Fri, 19 Jan 90 10:13:19 CST
From: Richard Goerwitz <goer@sophist.uchicago.edu>
Message-Id: <9001191613.AA22197@sophist.uchicago.edu>
To: icon-group@arizona.edu
Subject: strip?
Status: O

A snail asked:

The question I have is:  What is the best way to eliminate spaces (or, to
generalize, members of a particular cset) from a string while still preserving
the order of the remaining characters?

I wonder if you are referring to a stripping routine?

procedure Strip(s,c)
  s2 := ""
  s ? {
    while s2 ||:= tab(upto(c))
    do tab(many(c))
    s2 ||:= tab(0)
    }
  return s2
end

This will work with strings, and I suppose that type conversion
will make it work with csets, too.  For operations specifically
having to do with csets, you can of course say

     c1 --:= c2

where c1 is the cset you are trying to strip down, and c2 is the
cset containing the characters to be removed from it.  The trouble
here, though, is that, unlike strings, csets are not an ordered
sequence of characters (you did say something about "original or-
der," didn't you?).
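
A short illustration of that ordering point, using made-up sample values:
after the difference, converting the cset back to a string yields the
surviving characters in collating order, with duplicates gone, so the
original sequence cannot be recovered from it.

```icon
procedure main()
   local c1
   c1 := 'hello world'      # a cset: the set of characters {h,e,l,o,' ',w,r,d}
   c1 --:= ' lo'            # remove blank, l and o from the set
   write(string(c1))        # writes "dehrw", not "hewrd"
end
```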

I guess I'm confused.  If the original order is important, use 
Strip(s,c), and feed it strings.  Does this help?

                                        -Richard L. Goerwitz
                                        goer@sophist.uchicago.edu
                                        goer%sophist@uchicago.bitnet
                                        rutgers!oddjob!gide!sophist!goer

From tenaglia@fps.mcw.edu  Fri Jan 19 10:20:36 1990
Received: from RUTGERS.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA13085; Fri, 19 Jan 90 10:20:36 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA25922; Fri, 19 Jan 90 12:18:26 EST
Received: by uwm.edu; id AA26075; Fri, 19 Jan 90 11:17:22 -0600
Message-Id: <9001191717.AA26075@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Fri, 19 Jan 90 11:15:03 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Fri, 19 Jan 90 11:08:11 CDT
Date: Fri, 19 Jan 90 11:08:11 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: RE: concat with blank and a question
X-Vms-Mail-To: UUCP%"cargo@tardis.cray.com"
Status: O

In reply to the concat,...

If you are running ICON under unix, and are adventurous, you can build
in your own concat with the 'Personal Interpreter' or 'Variant Translator'.
I've built a personal code library of icon software chips which I just
include using the editor when I use them.

Having an operator $ for example may accomplish :

  both := first $ second   (rather than both := first || " " || second)

But it is still not as flexible as a procedure :

  procedure cat(s1,s2,s3)
    /s3 := " "             # where an optional string may or may not appear.
    return s1 || s3 || s2
    end

---------------------------------------------------

Concerning Character Elimination in a String

This is best done with the string scanning feature of ICON. I'll present
a little procedure. Others may approach the problem differently.

  procedure strip(str,chrs)
    local text,chars
    /chrs := ' '
    chars := &cset -- chrs
    text  := ""
    str ? while tab(upto(chars)) do text ||:= tab(many(chars))
    return text
    end
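
A short usage sketch for strip; the sample strings are illustrative
assumptions, not taken from any real application.  With no second
argument, blanks are removed; otherwise every character in chrs is.

```icon
procedure strip(str, chrs)
    local text, chars
    /chrs := ' '
    chars := &cset -- chrs     # the characters to keep
    text  := ""
    str ? while tab(upto(chars)) do text ||:= tab(many(chars))
    return text
end

procedure main()
    write(strip("a b  c"))                # writes "abc"
    write(strip("1-800-555", '-'))        # writes "1800555"
    write(strip("Hello, World", &lcase))  # writes "H, W"
end
```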

Yours truly,

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From cargo@tardis.cray.com  Fri Jan 19 16:03:37 1990
Received: from uc.msc.umn.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA09993; Fri, 19 Jan 90 16:03:37 MST
Received: from hall.cray.com by uc.msc.umn.edu (5.59/1.14)
	id AA01870; Fri, 19 Jan 90 17:01:37 CST
Received: from zk.cray.com by hall.cray.com
	id AA16178; 3.2/CRI-3.12; Fri, 19 Jan 90 17:03:22 CST
Received: by zk.cray.com
	id AA01299; 3.2/CRI-3.12; Fri, 19 Jan 90 17:03:16 CST
Date: Fri, 19 Jan 90 17:03:16 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9001192303.AA01299@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: deleting characters
Status: O

After sending out my mail about deleting or stripping characters out of a
string I received a couple of responses from kindly Iconists.  I later was
showing off Icon to a friend when I noticed a "delete" function in
strutil.icn.  Upon closer examination I found that delete also did what I
wanted.  The next step for me was to write a little benchmarking program so
I could see how the different methods compared speedwise.  It turns out
that the IPL delete routine was the slowest, although the fastest was only
15 percent faster.

Here is the test program:

procedure main()
    test_string :=
        "    str ? while tab(upto(chars)) do text ||:= tab(many(chars))"
    remove := &cset -- &lcase
    time1 := &time
    limit := 1000
    every 1 to limit do result1 := delete(test_string, remove)
    time2 := &time
    every 1 to limit do result2 := strip1(test_string, remove)
    time3 := &time
    every 1 to limit do result3 := strip2(test_string, remove)
    time4 := &time
    write(time1)
    write(time2-time1, " ", result1)
    write(time3-time2, " ", result2)
    write(time4-time3, " ", result3)
    return
end

#  from IPL strutil.icn
#  delete characters
#
procedure delete(s,c)
    local i
    while i := upto(c,s) do
        s[i:many(c,s,i)] := ""
    return s
end

#From: Richard Goerwitz <goer@sophist.uchicago.edu>
procedure strip1(s,c)
    s2 := ""
    s ? {
        while s2 ||:= tab(upto(c)) do
            tab(many(c))
        s2 ||:= tab(0)
        }
    return s2
end


#From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
procedure strip2(str,chrs)
    local text,chars
#    /chrs := ' '
#   (I commented this out because the others don't do such checks.)
    chars := &cset -- chrs
    text  := ""
    str ? while tab(upto(chars)) do text ||:= tab(many(chars))
    return text
end

And the output (first column is time, second column is the result string
which is always supposed to be the same).  Initial time is 0, other times
are in milliseconds.

0
21033 strwhiletabuptocharsdotexttabmanychars
20350 strwhiletabuptocharsdotexttabmanychars
17717 strwhiletabuptocharsdotexttabmanychars

All three are reasonable, but the differences in approach are educational.
                         o       o
                          \_____/
                          /-o-o-\     _______
DDDD      SSSS   CCCC    /   ^   \   /\\\\\\\\
D   D    S      C        \ \___/ /  /\   ___  \
D   D     SSS   C         \_   _/  /\   /\\\\  \
D   D        S  C           \  \__/\   /\ @_/  /
DDDDavid SSSS.   CCCCargo    \____\____\______/ CARGO@TARDIS.CRAY.COM

From flee@shire.cs.psu.edu  Fri Jan 19 17:07:13 1990
Received: from shire.cs.psu.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA15092; Fri, 19 Jan 90 17:07:13 MST
Received: from localhost by shire.cs.psu.edu with SMTP 
	(5.61/PSUCS-1.0) id AA02947; Fri, 19 Jan 90 19:07:42 -0500
Message-Id: <9001200007.AA02947@shire.cs.psu.edu>
To: cargo@tardis.cray.com (David S. Cargo)
Cc: icon-group@cs.arizona.edu
Subject: Re: deleting characters 
In-Reply-To: Your message of Fri, 19 Jan 90 17:03:16 CST.
             <9001192303.AA01299@zk.cray.com> 
Date: Fri, 19 Jan 90 19:07:40 EST
From: Felix Lee <flee@shire.cs.psu.edu>
Status: O

> It turns out that the IPL delete routine was the slowest, although
> the fastest was only 15 percent faster.

On pathological cases, the delete routine can be much slower.  Try
removing spaces from repl("a   ", 1000).

The delete routine is quadratic wrt the length of the source string,
while the strip routines are quadratic wrt the result.  This is due
to the terrible amount of copying involved in manipulating Icon
strings: delete has to copy 3997 + 3994 + ... + 1000 characters,
while the other procedures need only copy 1 + 2 + ... + 1000.

You can get linear performance if you do it in C.
--
Felix Lee	flee@shire.cs.psu.edu	*!psuvax1!flee

From gmt  Fri Jan 19 17:48:05 1990
Date: Fri, 19 Jan 90 17:48:05 MST
From: "Gregg Townsend" <gmt>
Message-Id: <9001200048.AA16501@megaron.arizona.edu>
Received: by megaron.arizona.edu (5.59-1.7/15)
	id AA16501; Fri, 19 Jan 90 17:48:05 MST
In-Reply-To: <9001200007.AA02947@shire.cs.psu.edu>
To: icon-group
Subject: Re: deleting characters
Cc: cargo@tardis.cray.com, flee@shire.cs.psu.edu
Status: O

    Felix Lee (flee@shire.cs.psu.edu) writes:

    On pathological cases, the delete routine can be much slower.

True.
    
    ...You can get linear performance if you do it in C.

You get it in Icon, too.  Both Goerwitz's and Tenaglia's procedures are linear.

Building strings by successive concatenation is sufficiently common that it was
worth optimizing the implementation.  If no other concurrent activity disrupts
things, only the new characters (those added at the end) are copied.
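
One way to check this empirically is to time both approaches on the
pathological input suggested above.  This is only a rough sketch: the
input sizes are arbitrary choices, delete and strip1 are copied from
the earlier benchmark, and the timings depend on the machine and the
Icon version, so no expected output is given.

```icon
procedure delete(s, c)
    local i
    while i := upto(c, s) do
        s[i:many(c, s, i)] := ""     # each assignment builds a new string
    return s
end

procedure strip1(s, c)
    local s2
    s2 := ""
    s ? {
        while s2 ||:= tab(upto(c)) do
            tab(many(c))
        s2 ||:= tab(0)
    }
    return s2
end

procedure main()
    local n, s, t0, t1, t2, r1, r2, same
    every n := 1000 | 2000 | 4000 do {
        s := repl("a   ", n)
        t0 := &time
        r1 := delete(s, ' ')
        t1 := &time
        r2 := strip1(s, ' ')
        t2 := &time
        same := if r1 == r2 then "yes" else "no"
        write(n, ": delete ", t1 - t0, " ms, strip1 ", t2 - t1,
              " ms, same result: ", same)
    }
end
```

If the delete time grows roughly fourfold each time n doubles while the
strip1 time only doubles, that is consistent with quadratic versus
linear behaviour.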

    Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
    +1 602 621 4325     gmt@cs.arizona.edu     110 57 16 W / 32 13 45 N / +758m

From wgg@cs.washington.edu  Fri Jan 19 18:13:02 1990
Received: from june.cs.washington.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA17870; Fri, 19 Jan 90 18:13:02 MST
Received: by june.cs.washington.edu (5.61/7.0jh)
	id AA16921; Fri, 19 Jan 90 17:11:14 -0800
Date: Fri, 19 Jan 90 17:11:14 -0800
From: wgg@cs.washington.edu (William Griswold)
Return-Path: <wgg@cs.washington.edu>
Message-Id: <9001200111.AA16921@june.cs.washington.edu>
To: cargo@tardis.cray.com, flee@shire.cs.psu.edu
Subject: Re: deleting characters
Cc: icon-group@cs.arizona.edu
Status: O


> The delete routine is quadratic wrt the length of the source string,
> while the strip routines are quadratic wrt the result.

I may be wrong, but I believe you will find that in modern implementations 
of Icon the strip routines are linear in time.  Icon is smart enough 
to know that a string is located at the end of the string memory region 
(in this case the value of the variable holding the accumulating result 
string), and can just add to the end of it to concatenate.  Any other 
*modification* of a string requires copying--substring creation does not
require copying, since it is implemented as a pointer and an index. 

> You can get linear performance if you do it in C.

Many common operations in Icon require *more* time to perform in C--using
available abstractions--such as computing the length of a string.  Also
note that string concatenation in C in the standard way (using strcat)
takes linear time.  It also requires knowing the destination string is
long enough to hold the longer result.  Thus making strip as ``fast'' as
Icon's requires a little effort. 

					Bill Griswold

From flee@shire.cs.psu.edu  Fri Jan 19 19:50:23 1990
Received: from shire.cs.psu.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP
	id AA21917; Fri, 19 Jan 90 19:50:23 MST
Received: from localhost by shire.cs.psu.edu with SMTP 
	(5.61/PSUCS-1.0) id AA04238; Fri, 19 Jan 90 21:51:28 -0500
Message-Id: <9001200251.AA04238@shire.cs.psu.edu>
To: "Gregg Townsend" <gmt@cs.arizona.edu>
Cc: icon-group@cs.arizona.edu
Subject: Re: deleting characters 
In-Reply-To: Your message of Fri, 19 Jan 90 17:48:05 MST.
             <9001200048.AA16501@megaron.arizona.edu> 
Date: Fri, 19 Jan 90 21:51:27 EST
From: Felix Lee <flee@shire.cs.psu.edu>
Status: O

> If no other concurrent activity disrupts things, only the new characters
> (those added at the end) are copied.

Ah, I forgot about that optimization.
--
Felix

From icon-group-request@arizona.edu  Sun Jan 21 20:52:21 1990
Received: from ucbvax.Berkeley.EDU by megaron (5.59-1.7/15) via SMTP
	id AA12560; Sun, 21 Jan 90 20:52:21 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41)
	id AA06366; Sun, 21 Jan 90 19:47:56 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews
	for icon-group@arizona.edu (icon-group@arizona.edu)
	(contact usenet@ucbvax.Berkeley.EDU if you have questions)
Date: 22 Jan 90 03:24:46 GMT
From: aramis.rutgers.edu!paul.rutgers.edu!jac@rutgers.edu  (Jonathan A. Chandross)
Organization: Rutgers Univ., New Brunswick, N.J.
Subject: Re: deleting characters
Message-Id: <Jan.21.22.24.44.1990.9958@paul.rutgers.edu>
References: <9001200111.AA16921@june.cs.washington.edu>
Sender: icon-group-request@arizona.edu
To: icon-group@arizona.edu
Status: O


wgg@CS.WASHINGTON.EDU (William Griswold)
> Many common operations in Icon require *more* time to perform in C--using
> available abstractions--such as computing the length of a string.  Also
> note that string concatenation in C in the standard way (using strcat)
> takes linear time.  It also requires knowing the destination string is
> long enough to hold the longer result.  Thus making strip as ``fast'' as
> Icon's requires a little effort. 

I don't know if your statement is totally fair.  There is nothing to
prevent one from using BCPL style strings (i.e. also store a length
with the string) in a C program.

In fact, this is done.  The MESA language (XEROX) generates C code
which is then compiled normally.  Strings in MESA are stored with
a length, and are word aligned.  This allows strcpy, strcmp, et al
to work on word quantities,  producing much faster string routines. 
I see no reason (aside from inertia) for why this has not been done
to C.  (Well, one would have to write routines to convert from the
library function's notion of a character string to the new one with
a length.)

A while back I needed to derive the name from a file pointer.  Since
stdio does not support this I had to write a piece of code like:
	struct N_FILE {
		char	*name;
		FILE	*file;
		};
and the associated front-end routines for stdio.  This was not hard
to do, and did not take all that much time.

Of course, one could say that the pattern matching, associative
table features, etc. that make Icon so popular could also be added
to C using the argument I give above.  I won't (and can't) defend
such a statement.  My point is that condemning C for a shortcoming
in the library routines is not really fair.  Especially when that
problem could be fixed in a few days hacking.


Jonathan A. Chandross
Internet: jac@paul.rutgers.edu
UUCP: rutgers!paul.rutgers.edu!jac

From goer@sophist.uchicago.edu  Sun Jan 21 22:18:57 1990
Received: from tank.uchicago.edu by megaron (5.59-1.7/15) via SMTP
	id AA16449; Sun, 21 Jan 90 22:18:57 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Sun, 21 Jan 90 23:19:04 CST
Return-Path: <goer@sophist.uchicago.edu>
Received:  by sophist.uchicago.edu (3.2/UofC3.0)
	id AA24968; Sun, 21 Jan 90 23:14:57 CST
Date: Sun, 21 Jan 90 23:14:57 CST
From: Richard Goerwitz <goer@sophist.uchicago.edu>
Message-Id: <9001220514.AA24968@sophist.uchicago.edu>
To: icon-group@arizona.edu
Subject: condemning
Status: O

Recent point:

> My point is that condemning C for a shortcoming in the library
> routines is not really fair.  Especially when the problem could
> be fixed in a few days hacking.

I never got the impression that anyone was condemning C.  Aren't
you overreacting a bit?  Whether or not the poster was correct in
this instance, it does seem that making C behave like Icon often
does result in very poor performance.  Granted, you can often go
down some by-way, and come up with a new Iconish library routine
that will outperform Icon itself.  But this is the very sort of in-
convenience that Icon was intended to help us avoid.  It's a trade-
off.  Name your poison.
-Richard

From wgg@cs.washington.edu  Mon Jan 22 01:57:26 1990
Received: from june.cs.washington.edu by megaron (5.59-1.7/15) via SMTP
	id AA00782; Mon, 22 Jan 90 01:57:26 MST
Received: by june.cs.washington.edu (5.61/7.0jh)
	id AA15833; Sun, 21 Jan 90 23:59:00 -0800
Date: Sun, 21 Jan 90 23:59:00 -0800
From: wgg@cs.washington.edu (William Griswold)
Return-Path: <wgg>
Message-Id: <9001220759.AA15833@june.cs.washington.edu>
To: @rutgers.edu:paul.rutgers.edu!jac@aramis.rutgers.edu
Subject: Re: deleting characters
Cc: icon-group@cs.arizona.edu
Status: O

>Date: 22 Jan 90 03:24:46 GMT
>From: aramis.rutgers.edu!paul.rutgers.edu!jac@rutgers.edu  (Jonathan A. Chandross)
>Organization: Rutgers Univ., New Brunswick, N.J.
>Subject: Re: deleting characters
>To: icon-group@arizona.edu
>
>
>wgg@CS.WASHINGTON.EDU (William Griswold)
>> Many common operations in Icon require *more* time to perform in C--using
>> available abstractions--such as computing the length of a string.  Also
>> note that string concatenation in C in the standard way (using strcat)
>> takes linear time.  It also requires knowing the destination string is
>> long enough to hold the longer result.  Thus making strip as ``fast'' as
>> Icon's requires a little effort. 
>
>I don't know if your statement is totally fair.  There is nothing to
>prevent one from using BCPL style strings (i.e. also store a length
>with the string) in a C program.
>
>In fact, this is done.  The MESA language (XEROX) generates C code
>which is then compiled normally.  Strings in MESA are stored with
>a length, and are word aligned.  This allows strcpy, strcmp, et al
>to work on word quantities,  producing much faster string routines. 
>I see no reason (aside from inertia) for why this has not been done
>to C.  (Well, one would have to write routines to convert from the
>library function's notion of a character string to the new one with
>a length.)
>
...
>
>Of course, one could say that the pattern matching, associative
>table features, etc. that make Icon so popular could also be added
>to C using the argument I give above.  I won't (and can't) defend
>such a statement.  My point is that condemning C for a shortcoming
>in the library routines is not really fair.  Especially when that
>problem could be fixed in a few days hacking.
>
>
>Jonathan A. Chandross
>Internet: jac@paul.rutgers.edu
>UUCP: rutgers!paul.rutgers.edu!jac
>

Looks like I've got my foot stuck in the Turing Tar Pit.  I'm aware that I 
can do (almost) anything I want in any programming language at (close to) 
the performance the theoreticians tell me.  As indicated at the end of 
your message, it is not possibility, but reality that counts.  The reality 
is that C translates a string literal into a fixed-sized null-terminated 
character array.  A programmer who reimplements strings has to work
pretty hard (the folks at Xerox and Arizona are good examples),
particularly if that includes dynamically sized string management.  Only
with the encapsulation provided by C++ can we get close to what you claim,
which, with care, can even handle C string literals correctly. 

I will readily confess that there are many problems that I would rather
code in C than Icon--each is suited to a special set of tasks.

					Bill Griswold

From tenaglia@fps.mcw.edu  Tue Jan 23 15:54:03 1990
Received: from RUTGERS.EDU by megaron (5.59-1.7/15) via SMTP
	id AA19233; Tue, 23 Jan 90 15:54:03 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA28358; Tue, 23 Jan 90 17:52:43 EST
Received: by uwm.edu; id AA03553; Tue, 23 Jan 90 16:28:55 -0600
Message-Id: <9001232228.AA03553@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Tue, 23 Jan 90 16:12:14 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Tue, 23 Jan 90 13:56:10 CDT
Date: Tue, 23 Jan 90 13:56:10 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: Correction to packed decimal converter
Status: O


Several weeks ago I posted procedures for converting integers to and
from packed format. After perfecting an application using them, a flaw
became apparent in the procedure unpack(). Below is the corrected version.

##################################################################
#                                                                #
# THIS PROCEDURE UNPACKS A VALUE INTO AN INTEGER.                #
#                                                                #
##################################################################
procedure unpack(val,width)         # REQUIRES LINK RADCON !
  local tmp,number,tens,ones,sign
  tmp  := ""
  sign := 1
  every number := ord(!val) do
    tmp ||:= right(map(radcon(number,10,16),&lcase,&ucase),2,"0") #this line changed
  if tmp[-1] == ("B" | "D") then sign := -1
  tmp[-1] := ""
  tmp    *:= sign
  /width  := *tmp
  return right(tmp,width)
  end

Yours truly,

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From cargo@tardis.cray.com  Wed Jan 24 14:31:32 1990
Received: from uc.msc.umn.edu by megaron (5.59-1.7/15) via SMTP
	id AA18152; Wed, 24 Jan 90 14:31:32 MST
Received: from hall.cray.com by uc.msc.umn.edu (5.59/1.14)
	id AA23656; Wed, 24 Jan 90 15:29:38 CST
Received: from zk.cray.com by hall.cray.com
	id AA12949; 3.2/CRI-3.12; Wed, 24 Jan 90 15:31:27 CST
Received: by zk.cray.com
	id AA05119; 3.2/CRI-3.12; Wed, 24 Jan 90 15:31:21 CST
Date: Wed, 24 Jan 90 15:31:21 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9001242131.AA05119@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: questions about records
Status: O

I was looking at an application for rsg.icn from the IPL when, at the
beginning of the program, I saw:

record nonterm(name)
record charset(chars)
record query(name)

I observed that two records had the same field name ("name").  This
prompted a couple of questions that I couldn't find answers to in any
of the Icon programming language documentation I looked at (including
the book).

What are the restrictions on reusing field names from record declarations?
For example, you can clearly use the same field name in two different
record declarations.  You can also use the same field name in two different
ordinal locations in two different record declarations.  The field names
can also be the same as names of local variables (surprise!).  That was
not what I would have expected.

What seems most confusing is from page 222 of the Icon book:

record-declaration:
       record identifier ( field-list )

where the field-list is subscripted with "opt" (meaning optional).
The syntax says you can have a declaration like

record weird()

and I tried that in a test program and it translated without
complaint.  But what can you use it for?

A very puzzled snail,    o       o
                          \_____/
                          /-o-o-\     _______
DDDD      SSSS   CCCC    /   ^   \   /\\\\\\\\
D   D    S      C        \ \___/ /  /\   ___  \
D   D     SSS   C         \_   _/  /\   /\\\\  \
D   D        S  C           \  \__/\   /\ @_/  /
DDDDavid SSSS.   CCCCargo    \____\____\______/ cargo@tardis.cray.com

From ralph  Wed Jan 24 15:13:41 1990
Date: Wed, 24 Jan 90 15:13:41 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9001242213.AA21470@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA21470; Wed, 24 Jan 90 15:13:41 MST
To: cargo@tardis.cray.com, icon-group@cs.arizona.edu
Subject: Re:  questions about records
In-Reply-To: <9001242131.AA05119@zk.cray.com>
Status: O

Yes, you can have the same field name in different records, and the
positions need not be the same, as in

	record foo(a,b,c)
	record baz(c,b,a)

Icon will handle this properly.

Also, as you've observed, the "name spaces" for identifiers and field
names are disjoint, so you can, for example, have a local identifier
named b and do something like

	b := foo()
	   .
	   .
	   .
	b.b := 1

(Not recommended, of course.)

And a record need not have any fields, as in

	record nil()

Useful if you need objects of an identifiable type, but the objects have
no attributes.
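
A small sketch pulling these points together.  The field values and the
variable names x, y and marker are illustrative assumptions; the record
declarations are the ones shown above.

```icon
record foo(a, b, c)
record baz(c, b, a)
record nil()

procedure main()
    local x, y, marker
    x := foo(1, 2, 3)
    y := baz(1, 2, 3)
    # field a is the first field of foo but the last field of baz
    write(x.a, " ", y.a)        # writes "1 3"
    marker := nil()
    write(type(marker))         # writes "nil"
    write(*marker)              # writes 0: the record has no fields
end
```

The type() call is what makes a field-less record usable as a tag: the
value carries no data, but it can still be told apart from everything
else.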

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From @um.cc.umich.EDU:Paul_Abrahams@Wayne-MTS  Wed Jan 24 17:17:07 1990
Resent-From: @um.cc.umich.EDU:Paul_Abrahams@Wayne-MTS
Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP
	id AA01355; Wed, 24 Jan 90 17:17:07 MST
Received: from megaron (megaron.cs.arizona.edu) by Arizona.EDU; Wed, 24 Jan 90
 02:22 MST
Received: from sharkey.cc.umich.edu by megaron (5.59-1.7/15) via SMTP id
 AA04923; Wed, 24 Jan 90 02:25:00 MST
Received: from ummts.cc.umich.edu by sharkey.cc.umich.edu (5.61/1123-1.0) id
 AA14554; Wed, 24 Jan 90 04:21:59 -0500
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Wed, 24 Jan 90
 04:23:14 EST
Resent-Date: Wed, 24 Jan 90 02:23 MST
Date: Tue, 23 Jan 90 23:55:38 EST
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.EDU
Subject: Strings in C
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <EC2682F3055F203C1C@Arizona.EDU>
Message-Id: <195450@Wayne-MTS>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@arizona.edu
Status: O

This forum isn't about C, but anyway---
 
The problem with C is not just that it's a high-level machine language--
it's a high-level machine language for the PDP11.  But even given that,
the decision to null-terminate strings was a dreadful mistake (see my
article on the subject in SIGPLAN Notices, Oct 88 I think).
 
Paul Abrahams

From icon-group-request@arizona.edu  Thu Jan 25 11:08:40 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP
	id AA27922; Thu, 25 Jan 90 11:08:40 MST
Received: from megaron (megaron.cs.arizona.edu) by Arizona.EDU; Thu, 25 Jan 90
 11:03 MST
Received: from ucbvax.Berkeley.EDU by megaron (5.59-1.7/15) via SMTP id
 AA27812; Thu, 25 Jan 90 11:06:29 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA05008; Thu, 25 Jan 90
 09:55:56 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 25 Jan 90 11:04 MST
Date: 25 Jan 90 06:01:11 GMT
From: tellab5!wheaton!johnh@uunet.uu.NET
Subject: installing icon 5.7 on ultrix 3.0
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Organization: Wheaton College, Wheaton Il
Resent-Message-Id: <EB149124BC3F20405B@Arizona.EDU>
Message-Id: <1786@wheaton.UUCP>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@arizona.edu
Status: O


This may have been discussed before but:

I have (I think) installed icon 5.7 on ultrix 3.0.  I could not do it
without using the BSD4.2 sources.  There were three problems.

1) Apparently in BSD4.2 a C program allocates a fixed number of slots
   for files inside of the user's area (below &end).  The Icon
   compiler/translator uses that fact along with setbuf calls to
   have complete control of memory allocation above &end.
   Apparently in ultrix 3.0 space for files is allocated at runtime
   above the &end symbol.  Since the Icon compiler/interpreter assumes
   that the limit of memory starts at &end, the memory initialization
   routines walk over the file slots.
   The result is that the distributed (unsupported) binaries will not run on
   ultrix 3.0.  (In our case, compiling and interpreting both hit EOF
   immediately.)
   Changes to the sources from BSD4.2 were minimal (3 files: fgrep for brk
   and &end, and replace code using sbrk).

2) The execve manual page includes a description of "interpreter" files,
   where the first line of a file starts with #! interpreter.
   I could not get this to work on ultrix 3.0.  I remember a problem with
   pascal pi object files not working with some ultrix releases.  The
   pi object files on ultrix 3.0 do not use this facility.  There
   is a reasonable discussion of the issue in doc/install and the solution
   is to not use the -directex flag when running icon-setup.

3) There seemed to be a small problem in the Makefile for icont.  It
   referred to an object mon.o.  There is no mon.c but there was a mon.o
   file in one of the libraries included in the link step.  I removed
   the mon.o entry in the Makefile.

   johnh...
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
UUCP: (obdient spl1)!wheaton!johnh           telephone: (312) 260-3871 (office)
Mail: John Hayward Math/Computer Science Dept. Wheaton College Wheaton Il 60187
       Act justly, love mercy and walk humbly with your God. Micah 6:8b

From cargo@tardis.cray.com  Mon Jan 29 14:39:51 1990
Received: from timbuk.cray.com by megaron (5.59-1.7/15) via SMTP
	id AA26026; Mon, 29 Jan 90 14:39:51 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA05890; Mon, 29 Jan 90 15:39:40 CST
Received: from zk.cray.com by hall.cray.com
	id AA01356; 3.2/CRI-3.12; Mon, 29 Jan 90 15:39:37 CST
Received: by zk.cray.com
	id AA00300; 3.2/CRI-3.12; Mon, 29 Jan 90 15:39:34 CST
Date: Mon, 29 Jan 90 15:39:34 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9001292139.AA00300@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: comparing csets
Status: O

I got my hardcopy Icon news in the mail and saw my name mentioned.
I thought I'd furnish an update on the way I eventually solved my
cset comparison problem.  I eventually started using the following
procedure:

procedure overlap(c1, c2)
    return '' ~=== c1 ** c2
end

The main feature that I exploit with this procedure is that when I have
tracing turned on, I can see the result of the comparison. Approaches
like

0 ~= *(c1 ** c2)

are probably more efficient (though I haven't checked), but not as
informative when I'm testing.
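
A brief sketch of the procedure in use with tracing switched on.  The
csets passed in are arbitrary examples; the point is that the trace
lines show either the returned intersection or the failure.

```icon
procedure overlap(c1, c2)
    return '' ~=== c1 ** c2      # succeeds with the intersection if non-empty
end

procedure main()
    &trace := -1                 # trace every procedure call and return
    if overlap(&digits, 'abc123') then write("overlap")
    if not overlap(&digits, &letters) then write("no overlap")
end
```

The first call returns the cset '123', so the program writes "overlap";
the second call fails because the intersection is empty, so it writes
"no overlap".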

                         o       o
                          \_____/
                          /-o-o-\     _______
DDDD      SSSS   CCCC    /   ^   \   /\\\\\\\\
D   D    S      C        \ \___/ /  /\   ___  \
D   D     SSS   C         \_   _/  /\   /\\\\  \
D   D        S  C           \  \__/\   /\ @_/  /
DDDDavid SSSS.   CCCCargo    \____\____\______/ cargo@tardis.cray.com

From cargo@tardis.cray.com  Thu Feb  1 13:52:05 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA17936; Thu, 1 Feb 90 13:52:05 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA12869; Thu, 1 Feb 90 14:52:00 CST
Received: from zk.cray.com by hall.cray.com
	id AA10468; 3.2/CRI-3.12; Thu, 1 Feb 90 14:51:57 CST
Received: by zk.cray.com
	id AA03270; 3.2/CRI-3.12; Thu, 1 Feb 90 14:52:14 CST
Date: Thu, 1 Feb 90 14:52:14 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9002012052.AA03270@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: table initialization
Status: O

I was looking at implementing some Icon code to initialize font width
tables.  Naturally I thought about using tables to do this.  I then
realized that while most other structures can be initialized with
constants, there doesn't seem to be a convenient way to do this with
tables.  Or is there something I'm missing?

What I will probably wind up doing is either using lists indexed by
character values (using ord(s)) or using two constant lists to
initialize a table, char_val and char_width are the two lists:

every i := 1 to n do width[char_val[i]] := char_width[i]

Eventually I'll probably want to know which is faster to use, a
character width table or an array indexed using ord(s).

string_width := 0 # for either case
every string_width +:= char_width[ord(!s)] # for a list
every string_width +:= char_width[!s] # for a table

It would seem to be a trade between ease of access and creation
of temporaries.

Has anybody tried anything like this already and found an answer
to which is better speed-wise?
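
For concreteness, a minimal sketch of the two-list initialization
described above.  The characters and widths are made-up sample values,
and the default of 0 for characters without an entry is an assumption.

```icon
procedure main()
    local char_val, char_width, width, i, s, string_width
    char_val   := ["a", "b", "m", "z"]      # sample characters
    char_width := [5, 5, 8, 6]              # sample widths, same order
    width := table(0)                       # unknown characters count as 0
    every i := 1 to *char_val do
        width[char_val[i]] := char_width[i]

    s := "bam"
    string_width := 0
    every string_width +:= width[!s]
    write(s, " has width ", string_width)   # 5 + 5 + 8: writes "bam has width 18"
end
```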

From wgg@cs.washington.edu  Thu Feb  1 15:27:59 1990
Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA27918; Thu, 1 Feb 90 15:27:59 MST
Received: by june.cs.washington.edu (5.61/7.0jh)
	id AA27239; Thu, 1 Feb 90 14:26:50 -0800
Date: Thu, 1 Feb 90 14:26:50 -0800
From: wgg@cs.washington.edu (William Griswold)
Return-Path: <wgg@cs.washington.edu>
Message-Id: <9002012226.AA27239@june.cs.washington.edu>
To: cargo@tardis.cray.com, icon-group@cs.arizona.edu
Subject: Re:  table initialization
Status: O

>From: cargo@tardis.cray.com (David S. Cargo)
>To: icon-group@cs.arizona.edu
>Subject: table initialization
>
>I was looking at implementing some Icon code to initialize font width
>tables.  Naturally I thought about using tables to do this.  I then
>realized that while most other structures can be initialized with
>constants, there doesn't seem to be a convenient way to do this with
>tables.  Or is there something I'm missing?
>
>What I will probably wind up doing is either using lists indexed by
>character values (using ord(s)) or using two constant lists to
>initialize a table, char_val and char_width are the two lists:
>
>every i := 1 to n do width[char_val[i]] := char_width[i]
>
>Eventually I'll probably want to know which is faster to use, a
>character width table or an array indexed using ord(s).
>
>string_width := 0 # for either case
>every string_width +:= char_width[ord(!s)] # for a list
>every string_width +:= char_width[!s] # for a table
>
>It would seem to be a trade between ease of access and creation
>of temporaries.
>
>Has anybody tried anything like this already and found an answer
>to which is better speed-wise?
>

You might want to think about storing your character set and font widths
in an external file, so that if the values (i.e., font) change, you won't
have to change your program.  Then the code for tables vs. lists is not
so different:

Your input:

a	5
b	5
...
m       8
...
z       6

the processing code:

# e.g., for tables (font_file is the already opened input file)
width := table()
while line := read(font_file) do
    line ? width[move(1)] := integer((tab(many(' \t')) & tab(0)))


As for performance, there are several things you can try.  It is likely
that the ord(s) function will be faster, since the hashing and chaining
used to implement tables will be avoided.

If you want to hide which type you are using, use a procedure.  Procedure
call is pretty cheap, so you don't have to worry much about the cost: 

# for tables
procedure width(char)
static width_table
initial width_table := table()

    return width_table[char]
end


# for lists
procedure width(char)
static width_list
initial width_list := list(256)

    # ord() yields 0..255, but list elements are numbered from 1
    return width_list[ord(char) + 1]
end

Note that since I didn't dereference the return variable, they can be
assigned to by the input function:

    while line := read(font_file) do
        line ? width(move(1)) := integer((tab(many(' \t')) & tab(0)))

Thus even the font width reader doesn't need to know the representation
you are using.
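
A self-contained sketch of the table variant, to show the assignment
through the procedure call actually working.  The three input lines
stand in for the font file and are made-up values; identifiers use
underscores because Icon identifiers cannot contain hyphens.

```icon
procedure width(char)
    static width_table
    initial width_table := table()
    return width_table[char]     # returns the element itself, so it is assignable
end

procedure main()
    local line
    # stand-in for reading lines from the font file
    every line := "a\t5" | "m\t8" | "z\t6" do
        line ? width(move(1)) := integer((tab(many(' \t')) & tab(0)))
    write(width("m"))                  # writes 8
    write(width("a") + width("z"))     # writes 11
end
```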

One thing that just occurred to me is that with the table representation
(or some hybrid encapsulated in a procedure) you could look up word
widths as well as character widths, probably at little extra effort (say,
if you stored the widths of the 2000 most commonly used words) and some
performance benefit.  You could even use a history scheme, in which you
remember the widths of words already computed in the current document.  It 
seems like overkill, but it gives you some idea of the flexibility of Icon. 

					Bill Griswold


From cargo@tardis.cray.com  Thu Feb  1 15:47:09 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA29364; Thu, 1 Feb 90 15:47:09 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA14604; Thu, 1 Feb 90 16:47:02 CST
Received: from zk.cray.com by hall.cray.com
	id AA12124; 3.2/CRI-3.12; Thu, 1 Feb 90 16:46:59 CST
Received: by zk.cray.com
	id AA03421; 3.2/CRI-3.12; Thu, 1 Feb 90 16:47:18 CST
Date: Thu, 1 Feb 90 16:47:18 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9002012247.AA03421@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: Re:  table initialization
Status: O

"You might want to think about storing your character set and font widths
in an external file, so that if the values (i.e., font) change, you won't
have to change your program."

As a matter of fact, I will start by reading the Adobe Font Metrics file
to get the initial values.  I could either use them directly, or have
the program write initialization code to be used by another program.
It's a matter of early or late binding in effect.  If you think that
ord(s) is fast relative to hashing, I'll probably go that way.

david snail

From @mirsa.inria.fr:ol@cerisi.cerisi.Fr  Mon Feb  5 13:28:51 1990
Received: from mirsa.inria.fr by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02205; Mon, 5 Feb 90 13:28:51 MST
Received: from cerisi.cerisi.fr by mirsa.inria.fr with SMTP
	(5.59++/IDA-1.2.8) id AA23439; Mon, 5 Feb 90 21:26:06 +0100
Message-Id: <9002052026.AA23439@mirsa.inria.fr>
Date: Mon, 5 Feb 90 21:26:48 -0100
Posted-Date: Mon, 5 Feb 90 21:26:48 -0100
From: Lecarme Olivier <ol@cerisi.cerisi.Fr>
To: icon-group@cs.arizona.edu
Subject: Icon on RISC machines
Status: O

Our laboratory and computing center are just buying brand new RISC
machines made by DEC (or more precisely, sold by DEC). On these
machines, Unix or something like this is supposed to work, but most
programming languages have been forgotten: after all, C is enough for
everything, or is it not enough after all? Thus, I'm missing Pascal,
Modula-2... and Icon!

Being naturally optimistic, I tried to pretend to the Icon installation
that this machine is in fact a Vax with Ultrix. Something went wrong
during "make Icon", in program rlocal.c of src/iconx. I could try to
figure out what happened, but it's somewhat late for working, and I
preferred to ask the Icon community whether anybody has already done the
job. Maybe I made a mistake when copying the whole Icon distribution, or
maybe something more serious is happening.

Can anybody help me?


			    Olivier Lecarme

From cargo@tardis.cray.com  Wed Feb  7 08:58:08 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA11797; Wed, 7 Feb 90 08:58:08 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA09246; Wed, 7 Feb 90 09:57:51 CST
Received: from zk.cray.com by hall.cray.com
	id AA29474; 3.2/CRI-3.12; Wed, 7 Feb 90 09:57:48 CST
Received: by zk.cray.com
	id AA01397; 3.2/CRI-3.12; Wed, 7 Feb 90 09:57:47 CST
Date: Wed, 7 Feb 90 09:57:47 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9002071557.AA01397@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: generation of procedures
Status: O

I was looking at the code for RSG (part of the Icon Program Library),
noticing how one part of the code takes advantage of the order of a
list of procedure variables to successively try evaluating a line of
input until one of the procedures succeeds.   My first reaction was
that this reminded me of searching an object hierarchy looking for a
handler for a particular type of message.  If a particular object
can't handle the message, it fails and lets the object next higher
in the hierarchy have a crack at it.  It sort of reminded me of
message passing, but with a distinctively Icon flavor to it.

My real question is that I can't figure out how to decide what the
semantics of the operation really are.  Normally I would expect to
say the !plist comes as element generation, which should succeed
when the first element is generated.  But then it is followed by
a parameter list and surrounded by parentheses.  This seems to
combine to make it an alternation expression equivalent to:

   (define | generate | grammar | source | comment | prompter | error)(line)


   plist := [define,generate,grammar,source,comment,prompter,error]
   :
   :
   while in := pop(ifile) do {		# process all files
      repeat {
         writes(\prompt)
         line := read(in) | break
         while line[-1] == "\\" do line := line[1:-1] || read(in) | break
         (!plist)(line)
# the line above is the interesting one!
         }
      close(in)
      }


I can't seem to find anything in the Icon book that spells out what is really
happening here.  It looked at first like !plist wasn't in a context that required
generation of all the list elements, but clearly that is not the case.

The confused snail,      o       o
                          \_____/
                          /-o-o-\     _______
DDDD      SSSS   CCCC    /   ^   \   /\\\\\\\\
D   D    S      C        \ \___/ /  /\   ___  \
D   D     SSS   C         \_   _/  /\   /\\\\  \
D   D        S  C           \  \__/\   /\ @_/  /
DDDDavid SSSS.   CCCCargo    \____\____\______/ cargo@tardis.cray.com

From ralph  Wed Feb  7 09:11:17 1990
Date: Wed, 7 Feb 90 09:11:17 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9002071611.AA13262@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA13262; Wed, 7 Feb 90 09:11:17 MST
To: cargo@tardis.cray.com, icon-group@cs.arizona.edu
Subject: Re:  generation of procedures
In-Reply-To: <9002071557.AA01397@zk.cray.com>
Status: O

The procedures are generated by !plist as you surmised.  The first one
is then applied to the argument list, resulting in a procedure call.
If that call fails, !plist is resumed to produce another procedure.

Think of a procedure call as

	e0(e1, e2, ..., en)

The order of evaluation is e0, e1, e2, ..., en.  If all succeed, the value of e0
is applied to the values of e1, e2, ..., en. If the resulting procedure call
fails, en, ..., e2, e1, e0 are resumed in that order (assuming they suspended).
If any produces a new result, evaluation starts to the right again. In the
case you cite, only e0 is a generator, so failure of the procedure call
causes e0 to produce another procedure, which is then applied to the
arguments.  If any of the procedure calls succeeds, the process stops, since
there is nothing to drive further generation.  The effect is to apply the
procedures in plist until one succeeds.
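
A small self-contained sketch of this behavior follows.  The three handler
procedures are hypothetical stand-ins, not the RSG procedures; each one
fails unless the line is its kind of input, and error accepts anything.

```icon
# Hypothetical handlers: a procedure that flows off its end fails,
# which causes !plist to be resumed for the next procedure.

procedure comment(line)
    if line[1] == "#" then return write("comment: ", line)
end

procedure number(line)
    if integer(line) then return write("number: ", line)
end

procedure error(line)
    return write("unrecognized: ", line)
end

procedure main()
    local plist, line
    plist := [comment, number, error]
    every line := "# note" | "42" | "hello" do
        (!plist)(line)          # try each procedure until one succeeds
end
```

Each line is handled by exactly one procedure: comment takes "# note",
number takes "42" after comment fails, and error is reached for "hello"
only because both earlier calls failed.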

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From wgg@cs.washington.edu  Wed Feb  7 09:50:43 1990
Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA17139; Wed, 7 Feb 90 09:50:43 MST
Received: by june.cs.washington.edu (5.61/7.0jh)
	id AA00868; Wed, 7 Feb 90 08:49:37 -0800
Date: Wed, 7 Feb 90 08:49:37 -0800
From: wgg@cs.washington.edu (William Griswold)
Return-Path: <wgg@cs.washington.edu>
Message-Id: <9002071649.AA00868@june.cs.washington.edu>
To: cargo@tardis.cray.com, icon-group@cs.arizona.edu
Subject: Re:  generation of procedures
Status: O

>Date: Wed, 7 Feb 90 09:57:47 CST
>From: cargo@tardis.cray.com (David S. Cargo)
>To: icon-group@cs.arizona.edu
>Subject: generation of procedures
>Errors-To: icon-group-errors@cs.arizona.edu
>
>I was looking at the code for RSG (part of the Icon Program Library),
>noticing how one part of the code takes advantage of the order of a
>list of procedure variables to successively try evaluating a line of
>input until one of the procedures succeeds.   My first reaction was
>that this reminded me of searching an object hierarchy looking for a
>handler for a particular type of message.  If a particular object
>can't handle the message, it fails and lets the object next higher
>in the hierarchy have a crack at it.  It sort of reminded me of
>message passing, but with a distinctively Icon flavor to it.
>

  I like your analogy to searching for a handler in an object hierarchy--
  it looks a lot like delegation.  I'll think about this one some more.
  Perhaps one could use PDCOs or a Variant Translator to make such a
  scheme syntactically palatable.

>My real question is that I can't figure out how to decide what the
>semantics of the operation really are.  Normally I would expect to
>say the !plist comes as element generation, which should succeed
>when the first element is generated.  But then it is followed by
>a parameter list and surrounded by parentheses.  This seems to
>combine to make it an alternation expression equivalent to:
>
>   (define | generate | grammar | source | comment | prompter | error)(line)
>
>
>   plist := [define,generate,grammar,source,comment,prompter,error]
>   :
>   :
>   while in := pop(ifile) do {		# process all files
>      repeat {
>         writes(\prompt)
>         line := read(in) | break
>         while line[-1] == "\\" do line := line[1:-1] || read(in) | break
>         (!plist)(line)
># the line above is the interesting one!
>         }
>      close(in)
>      }
>
>
>I can't seem to find anything in the Icon book that spells out what is really
>happening here.  It looked at first like !plist wasn't in a context that required
>generation of all the list elements, but clearly that is not the case.
>


  Element generation in normal expression context evaluates the expression 
  in an attempt to produce a single result.  So this code will try list
  elements until one invocation succeeds.  Thus error(line) gets executed
  only if none of the other alternatives succeed.  It is trying to parse the 
  line, using each of the parsing procedures as a possible syntactic 
  alternative.  As is usual with parsing, you want only one result, and in
  this case we take the first one that comes, assuming that it is preferred
  or unique.


					Bill Griswold


From icon-group-request@arizona.edu  Thu Feb  8 09:53:28 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13028; Thu, 8 Feb 90 09:53:28 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 8 Feb 90 09:53 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA06835; Thu, 8 Feb 90 08:47:22
 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 8 Feb 90 09:55 MST
Date: 8 Feb 90 16:43:27 GMT
From: tank!iitmax!chien@handies.ucar.EDU
Subject: icon source wanted
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <E01DF8A10AFF401F39@Arizona.EDU>
Message-Id: <3348@iitmax.IIT.EDU>
Organization: Illinois Institute of Technology, Chicago
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Is icon source for UN*X machines available for ftp?  Thanks for the info.

Greg Chien
Manager, Design Processes Laboratory
Institute of Design
Illinois Institute of Technology
Internet: chien@iitmax.iit.edu

From icon-group-request@arizona.edu  Mon Feb 12 23:42:25 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA14475; Mon, 12 Feb 90 23:42:25 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Mon, 12 Feb 90 23:40 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA12563; Mon, 12 Feb 90
 22:23:00 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Mon, 12 Feb 90 23:42 MST
Date: 12 Feb 90 14:42:17 GMT
From: van-bc!ubc-cs!alberta!myrias!dragos!wally@ucbvax.Berkeley.EDU
Subject: compiler, compiler, where art thou?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <DC85C6C540FFC01060@Arizona.EDU>
Message-Id: <1990Feb12.144217.17097@dragos.uucp>
Organization: Orbital Mind Control Lasers, Inc.
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


 
   So, the question of the day is:
 
   Where does one find a compiler for our wonderful Icon that runs
 on the Atari ST? 
 
  We`ve found icont, and iconx, but is there to be found an iconc?


-- 
  O o          Wallace Harshaw
 (   )         somewhere around here...
 "] ["   <you don't have to understand it, just look at it...>

From goer@sophist.uchicago.EDU  Tue Feb 13 01:31:14 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA18977; Tue, 13 Feb 90 01:31:14 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Tue, 13 Feb 90 01:30 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 13 Feb 90
 02:28:40 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA29016; Tue, 13 Feb 90
 02:24:10 CST
Resent-Date: Tue, 13 Feb 90 01:32 MST
Date: Tue, 13 Feb 90 02:24:10 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: compiler, compiler
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <DC7674F83C5FC010CA@Arizona.EDU>
Message-Id: <9002130824.AA29016@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

	  We`ve found icont, and iconx, but is there to be found an iconc?

Just about everyone would like to see a compiler, but it's a whole dif-
ferent ball game than an interpreter.  The most recent Icon newsletter
mentioned that research in this area was going on.  With all the work the
icon-project members put in right now, though, it might not be fitting
for us to press them on this subject.

The "Icon book," as everyone affectionately calls it, speaks of a com-
piler (version 5 is it?).  Compilers haven't been implemented since
then, though you can save an image of the executing program under Unix,
chmod it so that it is an executable file, and then pretend it is a
compiled program.  You're working on an Atari, though, so you're out
of luck.  Sorry.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu  Tue Feb 13 02:54:27 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA21988; Tue, 13 Feb 90 02:54:27 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 13 Feb 90 02:55 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA25510; Tue, 13 Feb 90
 01:46:31 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 13 Feb 90 02:55 MST
Date: 13 Feb 90 04:32:21 GMT
From: ssbell!mcmi!unocss!dent@uunet.uu.NET
Subject: Anyone using Idol..?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <DC6AD3378C5FC00B17@Arizona.EDU>
Message-Id: <1996@unocss..unl.edu>
Organization: U. of Nebraska at Omaha
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Is anyone out there using Idol for anything?  I've got it running in tandem
with Icon 7.5 on a UNIX (DYNIX really) machine here, and I think it's very
interesting, but I have no background in "object orientedness" at all, so
I was curious to see if there were some slightly more complex examples of
Idol use around.

Thanks for any pointers you might want to give.. :-)

-/ Dave Caplinger /---------------------------------------------------------
 Microcomputer Specialist,   Campus Computing,   Univ. of Nebraska at Omaha
 dent@zeus.unl.edu         ...!uunet!unocss!dent                DENT@UNOMA1

From ralph  Tue Feb 13 06:27:50 1990
Date: Tue, 13 Feb 90 06:27:50 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9002131327.AA00802@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA00802; Tue, 13 Feb 90 06:27:50 MST
To: van-bc!ubc-cs!alberta!myrias!dragos!wally@ucbvax.Berkeley.EDU
Subject: Re:  compiler, compiler, where art thou?
Cc: icon-group
In-Reply-To: <1990Feb12.144217.17097@dragos.uucp>
Status: O

The so-called Icon compiler, iconc, has not been supported for any
version of Icon for many years, and there never has been one for the
Atari ST.

The term compiler in this context is somewhat misleading.  The iconc
you refer to compiled Icon mostly into subroutine calls and was only
5-10% faster than the interpreter.  Granted there are other advantages
to iconc, like being able to link C functions.

However, iconc was not portable and we did not have the resources to
maintain it as a separate program.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From icon-group-request@arizona.edu  Wed Feb 14 22:53:35 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA26458; Wed, 14 Feb 90 22:53:35 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 14 Feb 90 22:48 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA00866; Wed, 14 Feb 90
 21:36:58 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 14 Feb 90 22:54 MST
Date: 14 Feb 90 21:06:57 GMT
From: esquire!yost@nyu.EDU
Subject: a reads() bug
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <DAFA10C2BD5FC01C2F@Arizona.EDU>
Message-Id: <1785@esquire.UUCP>
Organization: DP&W, New York, NY
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

#!/bin/sh

# Demonstrate Icon reads() bug on Sun4
# Reading more characters than available on a pipe can cause trouble
# Don't know if it is a system bug or an Icon bug
# May also be a problem if reading from a device.
# Works OK on the Pyramid
# Icon version 7.5
# Moral of the story, reads(,4096) at a time, max

# 2/14/90 Dave Yost, DP&W <yost@DPW.COM>

pipebufsize=4096
just_big_enough_for_trouble1=`expr $pipebufsize + 1`
just_big_enough_for_trouble2=`expr $pipebufsize + 1`

tmp=xxx
bigfile=/etc/termcap

dd ibs=$just_big_enough_for_trouble1 count=1 < $bigfile > $tmp

cat << END > pipebug.icn

procedure
main (args)
    while writes(reads(&input, $just_big_enough_for_trouble2))
    return
end

END

icont pipebug.icn

echo ">> Both of these commands should succeed, but for the bug
"
echo ">> ./pipebug < $tmp | cmp - $tmp"
	 ./pipebug < $tmp | cmp - $tmp

echo ">> cat $tmp | ./pipebug | cmp - $tmp"
	 cat $tmp | ./pipebug | cmp - $tmp

rm -f pipebug.icn $tmp

ality of the underlying UNIX system call.

 --dave yost
   yost@dpw.com or uunet!esquire!yost

From kelvin@astro.cs.iastate.EDU  Thu Feb 15 08:31:44 1990
Resent-From: kelvin@astro.cs.iastate.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA20024; Thu, 15 Feb 90 08:31:44 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Thu, 15 Feb 90 08:31 MST
Received: from atanasoff.cs.iastate.edu by megaron.cs.arizona.edu (5.59-1.7/15)
 via SMTP id AA19794; Thu, 15 Feb 90 08:28:47 MST
Received: from astro.cs.iastate.edu by atanasoff (99.99) id AA10782; Thu, 15
 Feb 90 09:26:21 -0600
Received: by astro.cs.iastate.edu (3.24) id AA28956; Thu, 15 Feb 90 09:27:11 CST
Resent-Date: Thu, 15 Feb 90 08:33 MST
Date: Thu, 15 Feb 90 09:27:11 CST
From: kelvin@astro.cs.iastate.EDU
Subject: reads() considered weak
Resent-To: icon-group@cs.arizona.edu
To: esquire!yost@nyu.EDU
Cc: icon-group@arizona.edu
Resent-Message-Id: <DAA94B00E15FC02730@Arizona.EDU>
Message-Id: <9002151527.AA28956@astro.cs.iastate.edu>
In-Reply-To: esquire!yost@nyu.EDU's message of 14 Feb 90 22:00:14 GMT
 <1786@esquire.UUCP>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: esquire!yost@nyu.EDU
X-Vms-Cc: icon-group@Arizona.edu
Status: O


you might want to look at:

  A Stream Data Type that Supports Goal-Directed Pattern Matching on 
    Unbounded Sequences of Values, in Computer Languages, Vol. 15, No. 1
    (jan 1990) by Kelvin Nilsen

this describes one proposed solution to the sort of problems you mentioned.

unfortunately, i haven't yet gathered enough external funding and/or 
 academic rank to spend much time on development and distribution of
 my real-time derivative of Icon, Conicon.


Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA  50011 
 (515) 294-2259   kelvin@atanasoff.cs.iastate.edu  uunet!atanasoff!kelvin


From goer@sophist.uchicago.EDU  Thu Feb 15 09:47:27 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA27427; Thu, 15 Feb 90 09:47:27 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 15 Feb 90 09:45 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 15 Feb 90
 10:44:22 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA01848; Thu, 15 Feb 90
 10:39:46 CST
Resent-Date: Thu, 15 Feb 90 09:49 MST
Date: Thu, 15 Feb 90 10:39:46 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: Conicon - What?!!
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <DA9EA6FE623FC026D5@Arizona.EDU>
Message-Id: <9002151639.AA01848@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

> unfortunately, i haven't yet gathered enough external funding and/or 
> academic rank to spend much time on development and distribution of
> my real-time derivative of Icon, Conicon.
                                    ^^^^^

Did you really think you could toss this one off without being asked
for more information? :-)  What is Conicon?

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From kelvin@astro.cs.iastate.EDU  Thu Feb 15 13:15:02 1990
Resent-From: kelvin@astro.cs.iastate.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA14373; Thu, 15 Feb 90 13:15:02 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Thu, 15 Feb 90 13:09 MST
Received: from atanasoff.cs.iastate.edu by megaron.cs.arizona.edu (5.59-1.7/15)
 via SMTP id AA13788; Thu, 15 Feb 90 13:06:45 MST
Received: from astro.cs.iastate.edu by atanasoff (99.99) id AA15938; Thu, 15
 Feb 90 14:04:15 -0600
Received: by astro.cs.iastate.edu (3.24) id AA29215; Thu, 15 Feb 90 14:05:04 CST
Resent-Date: Thu, 15 Feb 90 13:14 MST
Date: Thu, 15 Feb 90 14:05:04 CST
From: kelvin@astro.cs.iastate.EDU
Subject: Conicon - What?!!
Resent-To: icon-group@cs.arizona.edu
To: goer@sophist.uchicago.EDU
Cc: icon-group@arizona.edu
Resent-Message-Id: <DA81EBA7C51FC029B3@Arizona.EDU>
Message-Id: <9002152005.AA29215@astro.cs.iastate.edu>
In-Reply-To: Richard Goerwitz's message of Thu, 15 Feb 90 10:39:46 CST
 <9002151639.AA01848@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: goer@sophist.uchicago.EDU
X-Vms-Cc: icon-group@Arizona.edu
Status: O


Conicon is a contraction for concurrent Icon.  Conicon is designed to
 provide the high-level power of Icon to real-time programmers.  The
 implementation of Conicon differs somewhat from that of Icon.  In particular,
 we use a special real-time garbage collection algorithm designed in part
 by me, and a different virtual machine encoding which allows real-time
 response to interrupts (certain machine instructions in Icon's virtual
 machine represent potentially unbounded amounts of computation.  Since
 it is not possible to switch contexts in the middle of executing a particular
 instruction, the worst-case time required to execute a virtual machine
 instruction represents a lower bound on the time required to respond to
 a high-priority interrupt.)

Also, Conicon provides several new (and different) programming paradigms:

  1) The stream data type represents an unbounded sequence of values.
     Generally, you can treat this like a pipe from a concurrent process,
     or as an I/O connection to the outside world (to A/D converters,
     keyboards, terminals, modems, etc...).  In Conicon, string scanning
     is replaced with stream scanning.  The integration is, I think, fairly
     clean and natural.  Streams are described more thoroughly in the
     paper mentioned in my earlier mail:

        A Stream Data Type that Supports Goal-Directed Pattern Matching on
          Unbounded Sequences of Values - Kelvin Nilsen
           Computer Languages, Vol. 15, No. 1, Jan. 90.

     I can provide reprints to anyone who is interested in this.

  2) Conicon supports concurrent processes.  These processes are
     spawned in one of two ways.  First, Icon's create operator serves
     in Conicon to create a concurrent process instead of creating
     a coexpression.  A stream which represents the sequence of values
     generated by the spawned expression is automatically created when
     the process is spawned.  Second, Conicon introduces yet another
     operator: binary !, which is interpreted as "concurrent alternation."
     For example,

	every write(1 to 3 ! 5 to 7)

     might output the sequence:

	5, 6, 1, 2, 3, 7

     There are a variety of useful programming techniques that can be
     based on the concurrent alternation operator.  These techniques,
     and other aspects of concurrency in Conicon are discussed more
     thoroughly in a paper submitted to Software -- Practice & Experience.
     We have not yet heard back from the referees.  If anyone would like
     to see a draft, please send me mail...





From root@fergvax.unl.edu  Fri Feb 16 07:22:53 1990
Received: from fergvax.unl.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA10598; Fri, 16 Feb 90 07:22:53 MST
Received: by fergvax.unl.edu (5.57/Ultrix2.4-C)
	id AA09632; Fri, 16 Feb 90 08:20:06 CST
Date: Fri, 16 Feb 90 08:20:06 CST
From: root@fergvax.unl.edu (System PRIVILEGED Account)
Message-Id: <9002161420.AA09632@fergvax.unl.edu>
To: icon-group@cs.arizona.edu
Subject: unocss.unl.edu
Cc: mkb@fergvax.unl.edu
Status: O

Dear Icon-Group List Manager:

The network that host unocss is currently on is in the process of being changed
over from a class C network to a class B network.  No mail can get through to 
it.  Messages that are currently being routed through fergvax.unl.edu are unable
to get through and are spooling up here. 

I would appreciate it if you would remove them from the mailing list until they
get their network problems straightened out.  When they do, you will need
to send all icon-group mail to unocss directly.  You will need to change the 
mailing address of 
		payne%unocss.unl.edu@fergvax.unl.edu
to be		payne@unocss.unomaha.edu

I do not know what the IP# will be for unocss.  But if you are running
named, you should not need it.

Thank you.

Sincerely,

FERGVAX System Manager

P.S.	For your information:

		Mail Queue (6 requests)
--QID-- --Size-- -----Q-Time----- ------------Sender/Recipient------------
AA27250     2562 Thu Feb 15 14:42 <icon-group-sender@cs.arizona.edu>
		 (Deferred: Connection timed out during user open with unocss.)
				  <payne%unocss.unl.edu@fergvax.unl.edu>
AA24126      481 Thu Feb 15 11:26 <icon-group-sender@cs.arizona.edu>
		 (Deferred: Connection timed out during user open with unocss.)
				  <payne%unocss.unl.edu@fergvax.unl.edu>
AA23032      620 Thu Feb 15 10:31 <icon-group-sender@cs.arizona.edu>
		 (Deferred: Connection timed out during user open with unocss.)
				  <payne%unocss.unl.edu@fergvax.unl.edu>
AA19377      969 Thu Feb 15 00:34 <icon-group-sender@cs.arizona.edu>
		 (Deferred: Connection timed out during user open with unocss.)
				  <payne%unocss.unl.edu@fergvax.unl.edu>
AA19373     1054 Thu Feb 15 00:33 <icon-group-sender@cs.arizona.edu>
		 (Deferred: Connection timed out during user open with unocss.)
				  <payne%unocss.unl.edu@fergvax.unl.edu>
AA17854      652 Tue Feb 13 10:12 <icon-group-sender@cs.arizona.edu>
		 (Deferred: Connection timed out during user open with unocss.)
				  <payne%unocss.unl.edu@fergvax.unl.edu>

From goer@sophist.uchicago.EDU  Thu Feb 22 20:06:32 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA18569; Thu, 22 Feb 90 20:06:32 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 22 Feb 90 20:08 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 22 Feb 90
 21:06:43 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03453; Thu, 22 Feb 90
 21:02:03 CST
Resent-Date: Thu, 22 Feb 90 20:08 MST
Date: Thu, 22 Feb 90 21:02:03 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: BSD -> SYSV filename mapper (reformats entire tar archive)
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <D4C7FAE1B7DFE01F25@Arizona.EDU>
Message-Id: <9002230302.AA03453@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


Recently I had occasion to install a number of BSD archives on my
home (SysV) machine, and I got fed up with having to rename all the
directories, and altering all the source to recognize the new and
shorter names.  It seemed better to create a filter that would take
a tar archive and map everything all at once.

I started writing the program in C, but I soon realized that doing
the job in C would take a couple of evenings.  The Icon program took
just part of one evening.  It's not meant to be pretty, but since it
is probably something others would find useful, I'm posting it.  It
works here at my site.  Naturally, I don't guarantee that it will
work anywhere else.

----------------------------------------------------------------------
global filenametbl, chunkset, short_chunkset        # see procedure mappiece(s)
record hblock(name,junk,size,mtime,chksum,
              linkflag,linkname,therest)            # see readtarhdr(s)


procedure main(a)

    usage := "usage:  maptarfile inputfile      # output goes to stdout"
    0 < *a < 2 | stop("Bad arg count.\n",usage)
    intext := open(a[1],"r") |
	stop("maptarfile:  can't open ",a[1])

    # run through all the headers in the input file, filling
    # (global) filenametbl with the names of overlong files;
    # make_table_of_filenames fails if there are no such files
    make_table_of_filenames(intext) |
	stop("maptarfile:  no overlong path names to map")
  
    # now that a table of overlong filenames exists, go back
    # through the text, remapping all occurrences of these names
    # to new, 14-char values; also, reset header checksums, and
    # reformat text into correctly padded 512-byte blocks
    seek(intext,1)
    output_mapped_headers_and_texts(intext) |
	stop("maptarfile:  error reformatting text")

    close(intext)
    write_report()
    exit(0)
    
end



procedure make_table_of_filenames(intext)

    # global chunkset (set of overlong filenames)
    local header

    # read headers for overlong filenames; for now
    # ignore everything else
    while header := readtarhdr(reads(intext,512)) do {
	tab_nxt_hdr(intext,trim_str(header.size))
	fixpath(trim_str(header.name))
    }
    *chunkset = 0 & fail
    return &null

end



procedure output_mapped_headers_and_texts(intext)

    # remember that filenametbl, chunkset, and short_chunkset
    # (which are used by various procedures below) are GLOBAL
    local header, newtext, full_block 

    # read in headers, one at a time
    while header := readtarhdr(reads(intext,512)) do {

	# replace overlong filenames with shorter ones, according to
	# the conversions specified in the global hash table filenametbl
      	header.name := left(map_filenams(header.name),100,"\x00")
	header.linkname := left(map_filenams(header.linkname),100,"\x00")

	# use header.size field to read in and map the subsequent text
	newtext := trim(
	    map_filenams(tab_nxt_hdr(intext,trim_str(header.size))),'\x00'
	    )

	# now, find the length of newtext, and insert it into the size field
	header.size := right(exbase10(*newtext,8) || " ",12," ")

	# calculate the checksum of the newly retouched header
	header.chksum := right(exbase10(get_checksum(header),8)||"\x00 ",8," ")

	# finally, join all the header fields into a new block and write it out
	full_block := ""; every full_block ||:= !header
	writes(left(full_block,512,"\x00"))

	# now we're ready to write out the text, padding the final block
	# out to an even 512 bytes if necessary; the next header must start
	# right at the beginning of a 512 byte block
	newtext ? {
	    while writes(move(512))
	    if not pos(0)
	    then writes(left(tab(0),512,"\x00")) | fail
	}
    }
    writes(repl("\x00",512))
    return &null

end



procedure trim_str(s)
    # knock out spaces, nulls
    return s ? {
	(tab(many(' ')) | &null) &
	    trim(tab(find("\x00")|0))
    } \ 1
end 



procedure tab_nxt_hdr(f,size_str)

    hs := integer("8r" || size_str)
    next_header_offset := (hs / 512) * 512
    hs % 512 ~= 0 & next_header_offset +:= 512
    if 0 = next_header_offset then return ""
    return reads(f,next_header_offset) |
	stop("maptarfile:  error reading in ",
	     string(next_header_offset)," bytes.")

end



procedure fixpath(s)

    # fixpath is a misnomer of sorts, since it is used on
    # the first pass only, and merely examines each filename
    # in a path, using the procedure mappiece to record any
    # overlong ones in the global table filenametbl and in
    # the global sets chunkset and short_chunkset; no fixing
    # is actually done here
    s2 := ""
    s ? {
	while piece := tab(find("/")+1)
	do s2 ||:= mappiece(piece) 
	s2 ||:= mappiece(tab(0))
    }
    return s2

end



procedure mappiece(s)
    
    # global filenametbl, chunkset short_chunkset
    initial {
	filenametbl := table()
	chunkset := set()
	short_chunkset := set()
    }
    
    chunk := trim(s,'/')
    if *chunk > 14 then {
	i := 0
	repeat {
	# if the file has already been looked at, reuse its short name
	    if lchunk := \filenametbl[chunk] then break
	# else find a new unique 14-character name for it
	    lchunk := chunk[1:12] || right(string(i+:=1),3,"0")
	    if lchunk == !filenametbl
	    then next else break
	}
	# record filename in various global sets and tables
	filenametbl[chunk] := lchunk
	insert(chunkset,chunk)
	insert(short_chunkset,chunk[1:16])
    }
    else lchunk := chunk

    lchunk ||:= (s[-1] == "/")
    return lchunk

end



procedure readtarhdr(s)
    this_block := hblock()
    s ? {
	this_block.name     := move(100)    # <- to be looked at later
	this_block.junk     := move(8+8+8)  # skip the permissions, uid, etc.
	this_block.size     := move(12)     # <- to be looked at later
	this_block.mtime    := move(12)
	this_block.chksum   := move(8)      # <- to be looked at later
	this_block.linkflag := move(1)
	this_block.linkname := move(100)    # <- to be looked at later
	this_block.therest  := tab(0)
    }
    integer(this_block.size) | fail
    return this_block
end



procedure map_filenams(s)

    # chunkset is global, and contains all the overlong filenames
    # found in the first pass through the input file; here the aim
    # is to map the filenames to the shortened variants as stored
    # in filenametbl (which happens to be GLOBAL)

    local s2

    s2 := ""
    s ? {
	until pos(0) do {
	    # first narrow the possibilities, then try to map;
	    # short_chunkset, chunkset & filenametbl are global
	    if member(short_chunkset,&subject[&pos:&pos+15])
	    then s2 ||:= (filenametbl[=!chunkset] | move(1))
	    else s2 ||:= move(1)
	}
    }
    return s2

end



#  Author:  Ralph E. Griswold
#  Date:  June 10, 1988
#  exbase10(i,j) convert base-10 integer i to base j
#  The maximum base allowed is 36.

procedure exbase10(i,j)
   static digits
   local s, d, sign
   initial digits := &digits || &lcase
   if i = 0 then return 0
   if i < 0 then {
      sign := "-"
      i := -i
      }
   else sign := ""
   s := ""
   while i > 0 do {
      d := i % j
      if d > 9 then d := digits[d + 1]
      s := d || s
      i /:= j
      }
   return sign || s
end



procedure get_checksum(r)
    sum := 0
    r.chksum := "        "
    every field := !r
    do every sum +:= ord(!field)
    return sum
end



procedure write_report()

    # this procedure writes out a list of filenames which were
    # remapped (because they exceeded the SysV 14-char limit)

    local outtext, stbl, i

    outtext := open(fname := "mapping.report","w") |
	open(fname := "/tmp/mapping.report","w") |
	     stop("maptarfile:  Can't find a place to put mapping.report!")
    stbl := sort(filenametbl,3)
    every i := 1 to *stbl -1 by 2 do {
	write(outtext,left(stbl[i],35," ")," ",stbl[i+1])
    }
    write(&errout,"maptarfile:  ",fname," contains the list of changes")
    close(outtext)
    return &null

end
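
The block arithmetic in tab_nxt_hdr can be tried in isolation. Here is a
minimal sketch, assuming a hypothetical helper padded_size (not part of
the program above) that rounds a byte count up to a whole number of
512-byte blocks:

```icon
# padded_size is a hypothetical helper, not part of maptarfile.
# It rounds n up to a multiple of 512, the same result tab_nxt_hdr
# reaches with its / and % steps.
procedure padded_size(n)
    return ((n + 511) / 512) * 512
end

procedure main()
    local hs
    hs := integer("8r" || "2424")      # an octal size field, as in a header
    # writes: 1300 bytes of text fill 1536 bytes of blocks
    write(hs," bytes of text fill ",padded_size(hs)," bytes of blocks")
end
```

Integer division truncates in Icon, which is why adding 511 first gives
the round-up.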

From CELEX@HNYMPI52.BITNET  Sat Feb 24 19:16:52 1990
Resent-From: CELEX@HNYMPI52.BITNET
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA00779; Sat, 24 Feb 90 19:16:52 MST
Received: from rvax.ccit.arizona.edu by Arizona.EDU; Sat, 24 Feb 90 19:16 MST
Received: from HNYMPI52.BITNET by rvax.ccit.arizona.edu; Sat, 24 Feb 90 19:12
 MST
Resent-Date: Sat, 24 Feb 90 19:18 MST
Date: Sat, 24 Feb 90 20:03 N
From: CELEX@HNYMPI52.BITNET
Subject: strip
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <D33CB1465D1FE01ACE@Arizona.EDU>
Message-Id: <D33D74193DDF2053C7@rvax.ccit.arizona.edu>
X-Original-To:  icon-group@arizona.edu, CELEX
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

 
To remove unwanted characters from a string, we use the following
method:
 
(unwanchar is the cset of the unwanted characters, s is the string)
 
while s[upto(unwanchar,s)+:1] := ""
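 
The same effect can be had without changing the string in place. Here is
a minimal sketch, assuming a hypothetical procedure strip (the name and
parameters are illustrative only), which copies the wanted runs with
string scanning instead of deleting one character per pass:

```icon
# strip is a hypothetical name; unwanted is a cset of characters to drop.
procedure strip(s, unwanted)
    local wanted, result
    wanted := &cset -- unwanted
    result := ""
    # skip to the next wanted character, then copy the whole run
    s ? while tab(upto(wanted)) do
	result ||:= tab(many(wanted))
    return result
end

procedure main()
    write(strip("a-b--c", '-'))     # writes abc
end
```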
 
                                            Hope this helps,
                                            Marcel Bingley
 
                                                        C
 
                                                              C
   --   C E L E X   --                            C        C      C
                                                       C
 University of Nijmegen                                          C    CCCCCC
 Wundtlaan 1                                                C     CCCCCCCCCCCCC
 6525 XD  NIJMEGEN                       C           C    C     CCCCCCCCCCCCCCCC
 The Netherlands                                            CCCCCCCCCC        CC
                                                      C    CCCCCCCC
                                                          CCCCCCCC
 Tel: (+31) (0)80 - 512117                                CCCCCCCC
                  - 515797                               CCCCCCCC
                                                         CCCCCCCC
 EARN/BITNET:   celex@hnympi52                           CCCCCCCC
 Internet:    celexmail@celex.surfnet.nl                  CCCCCCCC
 SURFNET:  celex::celexmail                               CCCCCCCC
 JANET:  celex%hnympi52@earn-relay                         CCCCCCCCC
 PSI:  020418802007380::celexmail                            CCCCCCCCCCC

From goer@sophist.uchicago.EDU  Wed Feb 28 02:40:39 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA01623; Wed, 28 Feb 90 02:40:39 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 28 Feb 90 02:38 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 28 Feb 90
 03:37:08 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA10175; Wed, 28 Feb 90
 03:32:21 CST
Resent-Date: Wed, 28 Feb 90 02:40 MST
Date: Wed, 28 Feb 90 03:32:21 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: in situ filename truncator for tar files
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <D0A356C4AEFF400C10@Arizona.EDU>
Message-Id: <9002280932.AA10175@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Having received a number of requests for this thing from a number
of places, it seemed prudent to clean it up, comment it, fix the
bugs, and repost.

I had no idea that anyone would really _use_ it.


#-------------------------------------------------------------------
#
#	MAPTARFILE 
#	
#	Map 15+ char. filenames in a tar archive to 14 chars.
#	Handles both the header blocks and the source itself.
#	Obviates the need for renaming files and directories
#	by hand, and for altering source and docs to refer to
#	the new file and directory names.
#
#	Richard L. Goerwitz, III
#
#	Last modified 2/27/90
#
#-------------------------------------------------------------------


global filenametbl, chunkset, short_chunkset   # see procedure mappiece(s)
record hblock(name,junk,size,mtime,chksum,
              linkflag,linkname,therest)       # see readtarhdr(s)


procedure main(a)

    usage := "usage:  maptarfile inputfile      # output goes to stdout"
    0 < *a < 2 | stop("Bad arg count.\n",usage)
    intext := open(a[1],"r") |
	stop("maptarfile:  can't open ",a[1])

    # Run through all the headers in the input file, filling
    # (global) filenametbl with the names of overlong files;
    # make_table_of_filenames fails if there are no such files.
    make_table_of_filenames(intext) |
	stop("maptarfile:  no overlong path names to map")
  
    # Now that a table of overlong filenames exists, go back
    # through the text, remapping all occurrences of these names
    # to new, 14-char values; also, reset header checksums, and
    # reformat text into correctly padded 512-byte blocks.  Ter-
    # minate output with 512 nulls.
    seek(intext,1)
    every writes(output_mapped_headers_and_texts(intext))

    close(intext)
    write_report()   # Record mapped file and dir names for future ref.
    exit(0)
    
end



procedure make_table_of_filenames(intext)

    local header # chunkset is global

    # search headers for overlong filenames; for now
    # ignore everything else
    while header := readtarhdr(reads(intext,512)) do {
	# tab upto the next header block
	tab_nxt_hdr(intext,trim_str(header.size),1)
	# record overlong filenames in several global tables, sets
	fixpath(trim_str(header.name))
    }
    *chunkset = 0 & fail
    return &null

end



procedure output_mapped_headers_and_texts(intext)

    # Remember that filenametbl, chunkset, and short_chunkset
    # (which are used by various procedures below) are global.
    local header, newtext, full_block, block, lastblock

    # Read in headers, one at a time.
    while header := readtarhdr(reads(intext,512)) do {

	# Replace overlong filenames with shorter ones, according to
	# the conversions specified in the global hash table filenametbl
	# (which were generated by fixpath() on the first pass).
      	header.name := left(map_filenams(header.name),100,"\x00")
	header.linkname := left(map_filenams(header.linkname),100,"\x00")

	# Use header.size field to determine the size of the subsequent text.
	# Read in the text as one string.  Map overlong filenames found in it
	# to shorter names as specified in the global hash table filenametbl.
	newtext := map_filenams(tab_nxt_hdr(intext,trim_str(header.size)))

	# Now, find the length of newtext, and insert it into the size field.
	header.size := right(exbase10(*newtext,8) || " ",12," ")

	# Calculate the checksum of the newly retouched header.
	header.chksum := right(exbase10(get_checksum(header),8)||"\x00 ",8," ")

	# Finally, join all the header fields into a new block and write it out
	full_block := ""; every full_block ||:= !header
	suspend left(full_block,512,"\x00")

	# Now we're ready to write out the text, padding the final block
	# out to an even 512 bytes if necessary; the next header must start
	# right at the beginning of a 512-byte block.
	newtext ? {
	    while block := move(512)
	    do suspend block
	    pos(0) & next
            lastblock := left(tab(0),512,"\x00")
	    suspend lastblock
	}
    }
    # Write out a final null-filled block.  Some tar programs will write
    # out 1024 nulls at the end.  Dunno why.
    return repl("\x00",512)

end



procedure trim_str(s)

    # Knock out spaces, nulls from those crazy tar header
    # block fields (some of which end in a space and a null,
    # some just a space, and some just a null [anyone know
    # why?]).
    return s ? {
	(tab(many(' ')) | &null) &
	    trim(tab(find("\x00")|0))
    } \ 1

end 



procedure tab_nxt_hdr(f,size_str,firstpass)

    # Tab upto the next header block.  Return the bypassed text
    # as a string (this value is not always used).

    local hs, next_header_offset

    hs := integer("8r" || size_str)
    next_header_offset := (hs / 512) * 512
    hs % 512 ~= 0 & next_header_offset +:= 512
    if 0 = next_header_offset then return ""
    else {
	# if this is pass no. 1 don't bother returning a value; we're
	# just collecting long filenames;
	if \firstpass then {
	    seek(f,where(f)+next_header_offset)
	    return
	}
	else {
	    return reads(f,next_header_offset)[1:hs+1] |
		stop("maptarfile:  error reading in ",
		     string(next_header_offset)," bytes.")
	}
    }

end



procedure fixpath(s)

    # Fixpath is a misnomer of sorts, since it is used on
    # the first pass only, and merely examines each filename
    # in a path, using the procedure mappiece to record any
    # overlong ones in the global table filenametbl and in
    # the global sets chunkset and short_chunkset; no fixing
    # is actually done here.

    s2 := ""
    s ? {
	while piece := tab(find("/")+1)
	do s2 ||:= mappiece(piece) 
	s2 ||:= mappiece(tab(0))
    }
    return s2

end



procedure mappiece(s)

    # Check s (the name of a file or dir as recorded in the tar header
    # being examined) to see if it is over 14 chars long.  If so,
    # generate a unique 14-char version of the name, and store
    # both values in the global hashtable filenametbl.  Also store
    # the original (overlong) file name in chunkset.  Store the
    # first fifteen chars of the original file name in short_chunkset.
    # Sorry about all of the tables and sets.  It actually makes for
    # a reasonably efficient program.  Doing away with both sets,
    # while possible, causes a tenfold drop in execution speed!
    
    # global filenametbl, chunkset, short_chunkset
    local j, ending

    initial {
	filenametbl := table()
	chunkset := set()
	short_chunkset := set()
    }
    
    chunk := trim(s,'/')
    if *chunk > 14 then {
	i := 0
	repeat {
	# if the file has already been looked at, reuse its short name
	    if lchunk := \filenametbl[chunk] then break
	# else find a new unique 14-character name for it
	# preserve important suffixes like ".Z," ".c," etc.
	    if chunk ?
	       (tab(find(".")), ending := move(1) || tab(any(&ascii)), pos(0))
	    then lchunk := chunk[1:11] || right(string(i+:=1),2,"0") || ending
	    else lchunk := chunk[1:12] || right(string(i+:=1),3,"0")
	    if lchunk == !filenametbl
	    then next else break
	}
	# record filename in various global sets and tables
	filenametbl[chunk] := lchunk
	insert(chunkset,chunk)
	insert(short_chunkset,chunk[1:16])
    }
    else lchunk := chunk

    lchunk ||:= (s[-1] == "/")
    return lchunk

end



procedure readtarhdr(s)

    # Read the silly tar header into a record.  Note that, as was
    # complained about above, some of the fields end in a null, some
    # in a space, and some in a space and a null.  The procedure
    # trim_str() may be (and in fact often _is_) used to remove this
    # extra garbage.

    this_block := hblock()
    s ? {
	this_block.name     := move(100)    # <- to be looked at later
	this_block.junk     := move(8+8+8)  # skip the permissions, uid, etc.
	this_block.size     := move(12)     # <- to be looked at later
	this_block.mtime    := move(12)
	this_block.chksum   := move(8)      # <- to be looked at later
	this_block.linkflag := move(1)
	this_block.linkname := move(100)    # <- to be looked at later
	this_block.therest  := tab(0)
    }
    integer(this_block.size) | fail  # If it's not an integer, we've hit
                                     # the final (null-filled) block.
    return this_block

end



procedure map_filenams(s)

    # Chunkset is global, and contains all the overlong filenames
    # found in the first pass through the input file; here the aim
    # is to map these filenames to the shortened variants as stored
    # in filenametbl (GLOBAL).

    local s2

    s2 := ""
    s ? {
	until pos(0) do {
	    # first narrow the possibilities, using short_chunkset
	    if member(short_chunkset,&subject[&pos:&pos+15])
            # then try to map from a long to a shorter 14-char filename
	    then s2 ||:= (filenametbl[=!chunkset] | move(1))
	    else s2 ||:= move(1)
	}
    }
    return s2

end


#  From the IPL.  Thanks, Ralph -
#  Author:  Ralph E. Griswold
#  Date:  June 10, 1988
#  exbase10(i,j) convert base-10 integer i to base j
#  The maximum base allowed is 36.

procedure exbase10(i,j)

   static digits
   local s, d, sign
   initial digits := &digits || &lcase
   if i = 0 then return 0
   if i < 0 then {
      sign := "-"
      i := -i
      }
   else sign := ""
   s := ""
   while i > 0 do {
      d := i % j
      if d > 9 then d := digits[d + 1]
      s := d || s
      i /:= j
      }
   return sign || s

end

# end IPL material


procedure get_checksum(r)
 
    # Calculates the new value of the checksum field for the
    # current header block.  Note that the specification says
    # that, when calculating this value, the chksum field must
    # be blank-filled.

    sum := 0
    r.chksum := "        "
    every field := !r
    do every sum +:= ord(!field)
    return sum

end



procedure write_report()

    # This procedure writes out a list of filenames which were
    # remapped (because they exceeded the SysV 14-char limit),
    # and then notifies the user of the existence of this file.

    local outtext, stbl, i

    outtext := open(fname := "mapping.report","w") |
	open(fname := "/tmp/mapping.report","w") |
	     stop("maptarfile:  Can't find a place to put mapping.report!")
    stbl := sort(filenametbl,3)
    every i := 1 to *stbl -1 by 2 do {
	write(outtext,left(stbl[i],35," ")," ",stbl[i+1])
    }
    write(&errout,"maptarfile:  ",fname," contains the list of changes")
    close(outtext)
    return &null

end
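
The checksum rule that get_checksum follows can be tried on a small
string. This is only a sketch: byte_sum is a hypothetical helper and the
three-byte "header" is made up, whereas a real header is a 512-byte block
whose chksum field has been blanked first:

```icon
# byte_sum is a hypothetical helper: it adds up the character codes
# of s, which is what get_checksum does across all the header fields.
procedure byte_sum(s)
    local sum
    sum := 0
    every sum +:= ord(!s)
    return sum
end

procedure main()
    # "tar" is 116 + 97 + 114
    write(byte_sum("tar"))            # writes 327
    # the eight blanks standing in for the chksum field add 8 * 32
    write(byte_sum(repl(" ",8)))      # writes 256
end
```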

From tenaglia@fps.mcw.edu  Wed Feb 28 08:20:02 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA12588; Wed, 28 Feb 90 08:20:02 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA19229; Wed, 28 Feb 90 10:17:25 EST
Received: by uwm.edu; id AA23161; Wed, 28 Feb 90 09:08:35 -0600
Message-Id: <9002281508.AA23161@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Wed, 28 Feb 90 08:29:52 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Wed, 28 Feb 90 08:29:37 CDT
Date: Wed, 28 Feb 90 08:29:37 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: And now something useless
Status: O


Icon is a great language for recreational programming as well. I recently
read a Scientific American where someone described a program that takes a
known text and scrambles it in a most bizarre fashion. The output is not
unlike a Max Headroom monolog. They also interfaced it with a rhyming
engine to generate bizarre poetry. Well, I thought it would be fun to try
doing it in Icon. Below is the scrambler. I guess I was lazy not to do the
rhyming engine. The scrambler chooses each subsequent word based on the
likelihood of its occurring after the current word. I find it rather
amusing what it does to my own documentation. Perhaps someone has a more
clever method, or perhaps someone would want to post a rhyming engine?
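
The idea of picking a successor according to how often it follows the
current word can be sketched on a tiny word list. The names followers,
learn, and next_word are hypothetical and the word list is made up; this
sketch stores the successor words themselves rather than positions:

```icon
# followers, learn, and next_word are hypothetical names.
global followers

# Record, for every word, the list of words seen right after it.
# A word that follows often appears often, so ? picks it more often.
procedure learn(words)
    local i
    followers := table()
    every i := 1 to *words - 1 do {
	/followers[words[i]] := []
	put(followers[words[i]], words[i+1])
    }
    return
end

# Pick a successor of word at random; fail if word never had one.
procedure next_word(word)
    return ?(\followers[word])
end

procedure main()
    local w
    learn(["the","cat","saw","the","dog","saw","the","cat"])
    w := "the"
    every 1 to 6 do {
	writes(w," ")
	w := next_word(w) | break
    }
    write()
end
```

Here "the" is followed by "cat" twice and "dog" once, so "cat" should
turn up after "the" about twice as often.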

##################### 80 lines follow ############################
#                                                                #
# Poet.Icn               02/28/90          BY TENAGLIA           #
#                                                                #
# THIS PROGRAM TAKES A DOCUMENT AND RE-OUTPUTS IT IN A CLEVERLY  #
# SCRAMBLED FASHION. IT USES THE NEXT TWO MOST LIKELY WORDS TO   #
# FOLLOW. USAGE : ICONX POET INPUT_FILE [OUTPUT_FILE]            #
# IF NO OUTPUT FILE IS SPECIFIED, THE GIBBERISH IS SENT TO TTY   #
# THE CONCEPT WAS FOUND IN A RECENT SCIENTIFIC AMERICAN AND ICON #
# SEEMED TO OFFER THE BEST IMPLEMENTATION.                       #
#                                                                #
##################################################################
global vocab,index
procedure main(param)
  source := param[1]        | input("_Source:")
  target := param[2]        | "tt:"
  (in  := open(source))     | stop("Can't open ",source)
  (out := open(target,"w")) | stop("Can't open ",target)
  vocab:= []
  index:= table([])
  write("Loading vocabulary")
  while line := read(in) do
    {
    vocab |||:= parse(line,' ')
    writes(".")
    }
  close(in)

  write("\nindexing...\n")
  every i := 1 to *vocab-2 do index[vocab[i]] |||:= [i]
  index[vocab[-2]] |||:= [-2]    # wrap end to front in order to
  index[vocab[-1]] |||:= [-1]    # prevent stuck loop if last word chosen

  n := -1 ; &random := map(&clock,":","0") ; line := ""
  write("\n")
  every 1 to *vocab/2 do
    {
    (n > 1) | (n := ?(*vocab-2))
    word    := vocab[n]
    follows := vocab[(?(index[word]))+1]
    n       := (?(index[follows])) + 1
    if (*line + *word + *follows + 2) > 80 then
      {
      write(out,line)
      line := ""
      }
    line ||:= word || " " || follows || " "
    }
  write(out,line,".")
  close(out)
  end

##################################################################
#                                                                #
# THIS PROCEDURE PULLS ALL THE ELEMENTS (TOKENS) OUT OF A LINE   #
# BUFFER AND RETURNS THEM IN A LIST. A VARIABLE NAMED 'CHARS'    #
# CAN BE STATICALLY DEFINED HERE OR GLOBAL. IT IS A CSET THAT    #
# CONTAINS THE VALID CHARACTERS THAT CAN COMPOSE THE ELEMENTS    #
# ONE WISHES TO EXTRACT.                                         #
#                                                                #
##################################################################
procedure parse(line,delims)
  static chars
  chars  := &cset -- delims
  tokens := []
  line ? while tab(upto(chars)) do put(tokens,tab(many(chars)))
  return tokens
  end

##################################################################
#                                                                #
# THIS PROCEDURE IS TERRIBLY HANDY IN PROMPTING AND GETTING      #
# AN INPUT STRING                                                #
#                                                                #
##################################################################
procedure input(prompt)
  writes(prompt)
  return read()
  end

Yours truly,

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu

From icon-group-request@arizona.edu  Wed Feb 28 18:13:45 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA27832; Wed, 28 Feb 90 18:13:45 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 28 Feb 90 18:13 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AB00291; Tue, 27 Feb 90
 20:19:27 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 28 Feb 90 18:14 MST
Date: 27 Feb 90 18:20:53 GMT
From: esquire!yost@nyu.EDU
Subject: Icon instead of shell scripts or C code
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <D020FA1A957F401151@Arizona.EDU>
Message-Id: <1806@esquire.UUCP>
Organization: DP&W, New York, NY
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

The world is divided into two kinds of programs:
  1.  Simple data manipulation batch programs (e.g. grep)
  2.  All other programs.

Unfortunately, Icon doesn't reach much beyond Class 1.
It would be great to be able to use Icon for all programs.

There are many reasons to write programs on unix in Icon instead
of the shell or C, and I'm sure we all know what those are.
Unfortunately, those reasons don't stand a chance against
the reason why you can't use Icon:

   The vast majority of unix system calls are not supported.

As a result, Icon has these deficiencies, among others:

   Lack of sophistication in the running of subprocesses:
	Keyboard interrupt during a system() command is ignored
	No way to run a unix command and capture its output in a variable
	You can't run a program in the background and get its process (group)
	     id for a later kill
	No fork, exec, wait, etc.
   No trapping of signals, and therefore no cleanup on forced exit,
	no timeouts

Has anyone implemented more of the unix system calls?
Would you please tell us about it?

Icon is so nice.  It's a shame it can't be used for more things.

 --dave yost
   yost@dpw.com or uunet!esquire!yost
   Please ignore the From or Reply-To fields above, if different.

P.S.

Here is a routine that adds a little to Icon's capability to replace
shell scripts:

# Run a command with the contents of an Icon string variable as input.
# Note: If the string is not newline-terminated, it will appear to the
# command as if it were.  There are workarounds for this.
procedure tosystem (inputstring, command)

    # The if expression is parenthesized so that its else branch does
    # not swallow the closing "**END**\n" line.
    return system ("<<'**END**' " || command || "\n" || inputstring ||
	       (if inputstring[-1] ~== "\n" then "\n" else "") ||
	       "**END**\n")
end
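
A usage sketch, assuming a Unix shell with sort on the search path. The
word list is made up, and the procedure is repeated so the example stands
alone. An unparenthesized if ... else inside a concatenation lets the
else branch absorb everything to its right, so this copy parenthesizes
it:

```icon
# Copy of tosystem with the if expression parenthesized; assumes that
# system() hands the string to a Bourne-style shell.
procedure tosystem(inputstring, command)
    return system("<<'**END**' " || command || "\n" || inputstring ||
	(if inputstring[-1] ~== "\n" then "\n" else "") ||
	"**END**\n")
end

procedure main()
    local words
    words := "pear\napple\nfig"        # no trailing newline, on purpose
    tosystem(words, "sort")            # sort writes its result to stdout
end
```

The sorted lines go straight to standard output; the Icon program only
sees the value returned by system().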

From jeffc@osf.ORG  Mon Mar  5 09:14:07 1990
Resent-From: jeffc@osf.ORG
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA25244; Mon, 5 Mar 90 09:14:07 MST
Received: from osf.osf.org by Arizona.EDU; Mon, 5 Mar 90 09:06 MST
Received: from soba.osf.org by osf.osf.org (5.61/OSF 0.9) id AA16654; Mon, 5
 Mar 90 11:02:30 -0500
Resent-Date: Mon, 5 Mar 90 09:11 MST
Date: Mon, 05 Mar 90 11:02:29 -0500
From: Jeff Carter <jeffc@osf.ORG>
Subject: Porting Icon 7.5 to a new and Unique UNIX machine
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <CC7EE79266DF4033BE@Arizona.EDU>
Message-Id: <9533.636652949@soba>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I recently began porting Icon (a reasonably simple task) to a 
DECstation 3100 (MIPS RISC chipset, runs Ultrix, aka PMAX). I have run
into a couple of things that hopefully someone else out there has a 
solution for, or suggestions on how to debug:

(1) There are numerous calls to fopen() with "rb" and "wb" modes that
are _not_ surrounded by OS-specific #ifdef's. This leads me to believe
(unfortunately) that maybe the particular version I have hasn't been
run on a wide variety of UNIX machines. Any comments? Are other versions
of UNIX more forgiving than ULTRIX 3.0?

(2) Floating-point conversions. I get numerous failures of the "eval"
and "fncs" tests that seem to stem from a problem with the conversion of
real numbers to their string representations. For example, from the
"eval" test:

3c3
< 2.0 === +2.0 ----> 9.018482111602407e-O4
---
> 2.0 === +2.0 ----> 2.0

And from the "fncs" test:

3c3
< copy(1.0) ----> 9.017964046223754e-O4
---
> copy(1.0) ----> 1.0

There are numerous other examples, but almost all of the reported errors 
are similar to these.

(3) Memory allocation. Early in startup, the executable calls fopen() in 
order to get the code file. This, unfortunately, causes fopen() to call
malloc(), which immediately fails because initalloc() hasn't been 
called. And initalloc doesn't get called until after the header is read
from the code file. This forces me to use the static allocation versions
of the memory management routines. The first application that I want to use
this for wants to use a _lot_ of string space. I keep getting "out
of space in string region" errors, and having to restart with larger and
larger values. This is a royal pain. Has anyone looked at making the code
region be allocated out of the static memory region, or some other technique
that would let me initialize the memory allocation routine earlier? 
Is there a particular reason why this _won't_ work (so I don't waste my
time on it, only to discover the fatal flaw)?

	jeff carter
	jeffc@osf.org

From ralph  Mon Mar  5 09:29:18 1990
Resent-From: "Ralph Griswold" <ralph>
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA26537; Mon, 5 Mar 90 09:29:18 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Mon, 5 Mar 90 09:26 MST
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA26252; Mon, 5 Mar 90
 09:26:10 MST
Resent-Date: Mon, 5 Mar 90 09:27 MST
Date: Mon, 5 Mar 90 09:26:10 MST
From: Ralph Griswold <ralph@cs.arizona.edu>
Subject: RE:  Porting Icon 7.5 to a new and Unique UNIX machine
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu, jeffc@osf.ORG
Resent-Message-Id: <CC7CC173687F4035D1@Arizona.EDU>
Message-Id: <9003051626.AA26252@megaron.cs.arizona.edu>
In-Reply-To: <9533.636652949@soba>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu, jeffc@osf.ORG
Status: O

Version 8 of Icon, to be out shortly, will support the DECstation and
several other newer workstations, including the Sun SPARCstation and
the NeXT machine.  (All of the problems noted in earlier mail are
corrected in Version 8.)

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From tenaglia@fps.mcw.edu  Wed Mar 14 16:18:58 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA26299; Wed, 14 Mar 90 16:18:58 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA18368; Wed, 14 Mar 90 18:17:58 EST
Received: by uwm.edu; id AA22806; Wed, 14 Mar 90 16:49:13 -0600
Message-Id: <9003142249.AA22806@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Wed, 14 Mar 90 16:02:34 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Wed, 14 Mar 90 15:05:15 CDT
Date: Wed, 14 Mar 90 15:05:15 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: Handy Icon Procedure for Reports
Status: O

Dear Icon Group :

I am including a handy procedure that can reformat strings. It's
fairly intuitive as far as usage is concerned. My implementation
is pretty plain. Perhaps someone has a more elegant expression
that makes use of string scanning or co-expressions? Enjoy!

########################################################################
#                                                                      #
# THIS PROCEDURE IS A HANDY STRING REMAPPER/FORMATTER AND IT'S         #
# VERY HANDY FOR REPORT GENERATION. USAGE PATCH(VARIABLE,MASK)         #
# WHERE VARIABLE IS A STRING AND MASK IS A STRING. MASK CONTAINS       #
# A SEQUENCE THAT TRANSFORMS VARIABLE. THE # CHARACTER MEANS TO        #
# COPY THE CHARACTER AT THAT POSITION. THE $ CHARACTER MEANS TO        #
# DELETE THE CURRENT CHARACTER AT THAT POSITION. ANY OTHER BYTES       #
# GET INSERTED INTO THE VARIABLE AT THEIR RESPECTIVE POSITIONS.        #
# EXAMPLES : patch("12/03/89","##$##$##") returns 120389               #
#            patch("120389","##/##/19##") returns 12/03/1989           #
#            patch("12/03/1989","##$")    returns 12                   #
#                                                                      #
########################################################################
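procedure patch(var,mask)
  local text, i, mark
  text := ""
  i    := 0
  every mark := !mask do
    {
    case mark of
      {
      "#" : text ||:= var[i +:= 1]   # copy the next character of var
      "$" : i +:= 1                  # skip the next character of var
  default : text ||:= mark           # insert the mask character itself
      }
    }
  return text
  end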

#############################################################
Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu



From wgg@cs.washington.edu  Wed Mar 14 17:54:59 1990
Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02804; Wed, 14 Mar 90 17:54:59 MST
Received: by june.cs.washington.edu (5.61/7.0jh)
	id AA20769; Wed, 14 Mar 90 16:54:34 -0800
Date: Wed, 14 Mar 90 16:54:34 -0800
From: wgg@cs.washington.edu (William Griswold)
Return-Path: <wgg@cs.washington.edu>
Message-Id: <9003150054.AA20769@june.cs.washington.edu>
To: icon-group@cs.arizona.edu, tenaglia@mis.mcw.edu
Subject: Re:  Handy Icon Procedure for Reports
Status: O

An Icon programmer writes...

>Date: Wed, 14 Mar 90 15:05:15 CDT
>From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
>To: icon-group@cs.arizona.edu
>Subject: Handy Icon Procedure for Reports
>Errors-To: icon-group-errors@cs.arizona.edu
>Status: R
>
>Dear Icon Group :
>
>I am including a handy procedure that can reformat strings. It's
>fairly intuitive as far as usage is concerned. My implementation
>is pretty plain. Perhaps someone has a more elegant expression
>that makes use of string scanning or co-expressions? Enjoy!
>

Although the paradigm is a little different, there is a whole class of
problems like this that can be cleverly implemented in one line with the
map() function.  In one of the later chapters of the Icon book there are
several examples.  Perhaps someone would like to submit some....
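
As a minimal sketch of the map() idiom mentioned above (not taken from
the Icon book; the label strings are made up for illustration):
map(s1,s2,s3) replaces each character of s1 that appears in s2 with the
character at the same position in s3.  So s2 can label the positions of
the input s3, and s1 can lay out the result.  The labels must be
distinct characters that do not occur in the literal text being
inserted, and s2 and s3 must be the same length.

```icon
procedure main()
   # Insert slashes and a century: labels a..f stand for the six
   # input characters, "/" and "19" are copied as they are.
   write(map("ab/cd/19ef", "abcdef", "120389"))     # writes 12/03/1989

   # Delete the slashes: labels c and f (the "/" positions) are
   # simply left out of the layout string.
   write(map("abdegh", "abcdefgh", "12/03/89"))     # writes 120389
end
```

Unlike patch(), this form needs an input of known, fixed length.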

					Bill Griswold


From goer@sophist.uchicago.EDU  Wed Mar 14 18:04:23 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA03325; Wed, 14 Mar 90 18:04:23 MST
Return-Path: goer@Arizona.edu
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 14 Mar 90 17:55 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 14 Mar 90
 18:54:59 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03952; Wed, 14 Mar 90
 18:26:35 CST
Resent-Date: Wed, 14 Mar 90 18:02 MST
Date: Wed, 14 Mar 90 18:26:35 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: patch; using string scanning
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <C52255ED54FFC05ACC@Arizona.EDU>
Message-Id: <9003150026.AA03952@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


I liked the previous posting, and I don't think there was anything
wrong with it.  String scanning just seems a bit clearer to me than
the i/j stuff.  This is how I would have done it:

procedure patch(var,mask)
  text := ""
  var ? {
    every chr := !mask do {
      case chr of {
        "#" : text ||:= move(1)
        "$" : move(1)
        default : text ||:= chr
        }
      }
    }
  return text
end

Warning, warning:  This code fragment has not been tested (though
with Icon it's pretty hard to screw up something of this sort).
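
The three examples from the comment block of the original posting make
a quick check of this version.  The sketch below repeats the procedure
(with local declarations added) so that it can be run on its own;
main() is only a hypothetical test driver, not part of either posting.

```icon
procedure patch(var, mask)
   local text, chr
   text := ""
   var ? {
      every chr := !mask do {
         case chr of {
            "#" : text ||:= move(1)   # copy one character of var
            "$" : move(1)             # skip one character of var
            default : text ||:= chr   # insert the mask character
            }
         }
      }
   return text
end

procedure main()
   write(patch("12/03/89", "##$##$##"))     # writes 120389
   write(patch("120389", "##/##/19##"))     # writes 12/03/1989
   write(patch("12/03/1989", "##$"))        # writes 12
end
```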

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu  Fri Mar 16 15:06:10 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA01622; Fri, 16 Mar 90 15:06:10 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 16 Mar 90 13:52 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA29091; Fri, 16 Mar 90
 12:43:34 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 16 Mar 90 14:06 MST
Date: 16 Mar 90 15:58:20 GMT
From: mcsun!ukc!dcl-cs!se@uunet.uu.NET
Subject: icon on a PC
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <C3B0EE8559BFE00341@Arizona.EDU>
Message-Id: <891@dcl-vitus.comp.lancs.ac.uk>
Organization: Department of Computing at Lancaster University, UK.
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I recently helped to install Icon on the University's Sequent Symmetry S81
and then put on some workshops to 'educate the masses'.
I'm now being inundated with questions from people who were so impressed
they got a copy to run on their PCs. They're now coming to me with questions
about the implementation that I cannot answer. Anyone care to help?

1) Is there a PC version of icon which creates an executable file
   instead of having to run the ICONCX program every time?

2) Text files which I want to process using icon involve home-made
   fonts created in Pascal. What is the possibility of processing such
   fonts in icon?

3) I keep getting an error message 'inadequate space in block region'.
   Is there an environment variable that can be set to stop this? This 
   happens with long files.

Thanks in advance for any light shed on these problems.

Steve



-- 
NAME:	Steve Elliott			WORK PHONE: +44 524 65201 ext 3783
EMAIL:	se@uk.ac.lancs.comp
POST:	University of Lancaster, Department of Computing,
	Engineering Building, Bailrigg, Lancaster, LA1 4YR, UK.

From nowlin@iwtqg.att.COM  Fri Mar 16 17:23:41 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA12143; Fri, 16 Mar 90 17:23:41 MST
Received: from att-in.att.com by Arizona.EDU; Fri, 16 Mar 90 15:47 MST
Resent-Date: Fri, 16 Mar 90 16:38 MST
Date: Fri, 16 Mar 90 16:41 CST
From: nowlin@iwtqg.att.COM
Subject: RE: icon on a PC
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <C39BA908DB1FE003CD@Arizona.EDU>
Message-Id: <C3A2DF4FF35FE003B5@Arizona.EDU>
>From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Status: O

> 1) Is there a PC version of icon which creates an executable file
>    instead of having to run the ICONCX program every time?

No.

> 2) Text files which I want to process using icon involve home-made
>    fonts created in Pascal. What is the possibility of processing such
>    fonts in icon?

That's up to you to implement but Icon should be up to it.

> 3) I keep getting an error message 'inadequate space in block region'.
>    Is there an environment variable that can be set to stop this? This 
>    happens with long files.

The third one I'm familiar with on a number of systems.  Define HEAPSIZE to
be larger than the default for your system to get rid of that problem.  The
default on the system I use (3B2) is 51,200 so I use 100,000 when I start
to get the block region warning.

Jerry Nowlin

From cjeffery  Fri Mar 16 17:41:50 1990
Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13199; Fri, 16 Mar 90 17:41:50 MST
Date: Fri, 16 Mar 90 17:41:48 mst
From: "Clinton Jeffery" <cjeffery>
Message-Id: <9003170041.AA14968@caslon>
Received: by caslon; Fri, 16 Mar 90 17:41:48 mst
To: icon-group
In-Reply-To: nowlin@iwtqg.att.COM's message of Fri, 16 Mar 90 16:41 CST <C3A2DF4FF35FE003B5@Arizona.EDU>
Subject: icon on a PC
Status: O

>> 3) I keep getting an error message 'inadequate space in block region'.
>>    Is there an environment variable that can be set to stop this? This 
>>    happens with long files.

>The third one I'm familiar with on a number of systems.  Define HEAPSIZE to
>be larger than the default for your system to get rid of that problem.  The
>default on the system I use (3B2) is 51,200 so I use 100,000 when I start
>to get the block region warning.

This is the correct answer.  Unfortunately, I have my doubts as to whether
most MS-DOS Icon implementations can support HEAPSIZE values larger than
64K due to the segmentation of the 8086 architecture.  Large Icon programs
have to be designed well in order to run under MS-DOS.  Version 8.0 of
Icon is more space-efficient in its use of the block region.

From icon-group-request@arizona.edu  Tue Mar 20 05:56:18 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA05939; Tue, 20 Mar 90 05:56:18 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 20 Mar 90 05:46 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20684; Tue, 20 Mar 90
 04:33:00 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 20 Mar 90 05:48 MST
Date: 18 Mar 90 22:36:53 GMT
From: cs.utexas.edu!news-server.csri.toronto.edu!qucdn!walmslec@tut.cis.ohio-state.EDU
Subject: RE: icon on a PC
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <C0D1CFD280BFE00F37@Arizona.EDU>
Message-Id: <90077.173653WALMSLEC@QUCDN.BITNET>
Organization: Queen's University at Kingston
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <C3A2DF4FF35FE003B5@Arizona.EDU>, <9003170041.AA14968@caslon>
Status: O

Regarding Icon version 8.0.

Is it available yet, if not then when, and what new features, fixes will it
provide?

thanks
chris
-------
|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~|
| Christopher J. M. Walmsley                 |  Queen's University     |
| BITNET:  WALMSLEC@QUCDN                    |  Kingston, Ontario      |
| X.400:   Christopher.Walmsley@QueensU.CA   |  Canada                 |
|                                            |                         |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From ralph  Tue Mar 20 06:26:57 1990
Resent-From: "Ralph Griswold" <ralph>
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA07059; Tue, 20 Mar 90 06:26:57 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Tue, 20 Mar 90 06:07 MST
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA06718; Tue, 20 Mar 90
 06:06:58 MST
Resent-Date: Tue, 20 Mar 90 06:18 MST
Date: Tue, 20 Mar 90 06:06:58 MST
From: Ralph Griswold <ralph@cs.arizona.edu>
Subject: RE: icon on a PC
Resent-To: icon-group@cs.arizona.edu
To: cs.utexas.edu!news-server.csri.toronto.edu!qucdn!walmslec@tut.cis.ohio-state.EDU,
        icon-group@arizona.edu
Resent-Message-Id: <C0CDA5E3797FE00A45@Arizona.EDU>
Message-Id: <9003201306.AA06718@megaron.cs.arizona.edu>
In-Reply-To: <90077.173653WALMSLEC@QUCDN.BITNET>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: 
 cs.utexas.edu!news-server.csri.toronto.edu!qucdn!walmslec@tut.cis.ohio-state.EDU,
 icon-group@Arizona.edu
Status: O

Version 8 of Icon will be released on a system-by-system basis as we
get individual implementations and documentation done.

We expect to release Version 8 for UNIX and VMS in a few weeks.  Others
will follow as time and resources permit.

We'll provide a summary of new features and other relevant information
when the first release is announced.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From tenaglia@fps.mcw.edu  Wed Mar 21 11:26:58 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA24446; Wed, 21 Mar 90 11:26:58 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA14178; Wed, 21 Mar 90 13:26:47 EST
Received: by uwm.edu; id AA06632; Wed, 21 Mar 90 12:25:18 -0600
Message-Id: <9003211825.AA06632@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Wed, 21 Mar 90 11:54:37 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Wed, 21 Mar 90 11:54:14 CDT
Date: Wed, 21 Mar 90 11:54:14 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: Icon Ideas ?
Status: O


I've noticed in the Icon Newsletter discussion of an object oriented version
of Icon (IDOL?). It makes use of the $ character. Somehow I can never quite
seem to comprehend this 'object oriented' stuff. It looks like kloojed
terminology.

But back to the $.

I wonder about the use of the $ to create user-defined operators. For example:

operation("$+",lst)
  case *lst of
    {
    1 : return &null                    # unary form not defined
    2 : return lst[1] || " " || lst[2]  # binary form ok
    }
  end

Later ...

a := b $+ c     # 'a' is 'b' appended with a space and then 'c'
d := $+e        # unary form should fail or be &null
x $+:= z        # augmented form appends blank and 'z' to 'x'

Is this a desirable 'feature' for Icon 8.2 or 9.0? Or would it be impractical?

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From icon-group-request@arizona.edu  Thu Mar 22 22:36:35 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA09424; Thu, 22 Mar 90 22:36:35 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 22 Mar 90 22:31 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA18719; Thu, 22 Mar 90
 21:27:49 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 22 Mar 90 22:34 MST
Date: 23 Mar 90 05:27:19 GMT
From: zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@tut.cis.ohio-state.EDU
Subject: RE: icon on a PC
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <BEB2F93CF23FE01986@Arizona.EDU>
Message-Id: <3028@uwm.edu>
Organization: University of Wisconsin-Milwaukee
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <891@dcl-vitus.comp.lancs.ac.uk>
Status: O



I use custom made characters on my Zenith by loading a table of
chars 128-255 into memory, then going to the graphics screen with
writes("\e[=6h")
I first install the ANSI.SYS terminal driver by including the
relevant command in the CONFIG.SYS file.
--
Alan D. Corre
Department of Hebrew Studies
University of Wisconsin-Milwaukee                     (414) 229-4245
PO Box 413, Milwaukee, WI 53201               corre@csd4.csd.uwm.edu

From icon-group-request@arizona.edu  Mon Mar 26 09:07:23 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA01386; Mon, 26 Mar 90 09:07:23 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Mon, 26 Mar 90 09:02 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA07097; Mon, 26 Mar 90
 07:53:56 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Mon, 26 Mar 90 09:05 MST
Date: 26 Mar 90 15:39:54 GMT
From: consp22@bingvaxu.cc.binghamton.EDU
Subject: Need general Info
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <BBFF54C3DDFFE03131@Arizona.EDU>
Message-Id: <3210@bingvaxu.cc.binghamton.edu>
Organization: SUNY Binghamton POD consultants
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


	I was asked to do some 'poking' around to see what I could
find out about icon.  We are looking into using it to pre-process some
information before moving the data down to a micro.  Would somebody
be so kind as to give me, or direct me to, information on the language?

						Thank you,

-------------------------------------------------------------------------------
|  Consp22@Bingsuns.pod.binghamton.edu  |  SUNY-B Computer Consultants -      |
|  Consp22@Bingvaxu.cc.binghamton.edu   |  Trying to keep the world safe from |
|---------------------------------------|  the SUNY-B Computer users.         |
|  Consultant/Techie - World Computers  |-------------------------------------|
|  Computer Cons. - SUNY Binghamton     |     Darren `Mac Hack' Handler       |
|-----------------------------------------------------------------------------|
I don't know if I am going to heaven or hell, I just hope God grades on a curve

From tenaglia@fps.mcw.edu  Mon Mar 26 18:29:53 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA03233; Mon, 26 Mar 90 18:29:53 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA16891; Mon, 26 Mar 90 19:19:10 EST
Received: by uwm.edu; id AA06342; Mon, 26 Mar 90 14:43:35 -0600
Message-Id: <9003262043.AA06342@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Mon, 26 Mar 90 14:04:18 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Mon, 26 Mar 90 13:31:24 CDT
Date: Mon, 26 Mar 90 13:31:24 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: Re: Icon Ideas ?
X-Vms-Mail-To: UUCP%"langley@DG-RTP.DG.COM"
Status: O

In response to a response to my posting,....

> > But back to the $.
> >
> > I wonder about the use of the $ to create user-defined operators. For example:
> >
> > operation("$+",lst)
> >   case *lst of
> >     {
> >     1 : return &null                    # unary form not defined
> >     2 : return lst[1] || " " || lst[2]  # binary form ok
> >     }
> >   end
> >
> > Later ...
> >
> > a := b $+ c     # 'a' is 'b' appended with a space and then 'c'
> > d := $+e        # unary form should fail or be &null
> > x $+:= z        # augmented form appends blank and 'z' to 'x'
> >
> > Is this a desirable 'feature' for Icon 8.2 or 9.0? Or would it be impractical?
>
> Wouldn't general operator overloading in Icon be better?

Yes, I gave some thought to overloading some of the existing Icon operators.

I have had some classes in the Ada language, which permits this. However, when
a group of programmers (40 of us) discussed it, the thought that + could be
* or - made us nervous. A language such as Ada is very strict about DATA TYPES,
and for it NOT to be strict with the OPERATORS seemed sort of inconsistent.

My icon background gives me the philosophy of typed operators as well as
(loosely) typed data/procedures. It seems more natural to keep the user
defined objects (operators, procedures, variables) separated from the built
in ones. This is only my opinion, and it may not line up with the goals
of the Icon project in the long run.

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From nowlin@iwtqg.att.COM  Tue Mar 27 00:16:50 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA21961; Tue, 27 Mar 90 00:16:50 MST
Received: from att-in.att.com by Arizona.EDU; Tue, 27 Mar 90 00:05 MST
Resent-Date: Tue, 27 Mar 90 00:14 MST
Date: Mon, 26 Mar 90 23:15 CST
From: nowlin@iwtqg.att.COM
Subject: RE: Icon Ideas? (operators)
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <BB805174307FE036D9@Arizona.EDU>
Message-Id: <BB818785EF9FE037D8@Arizona.EDU>
>From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Status: O

> In response to a response to my posting,....
> 
> > > But back to the $.
> > >
> > > I wonder about the use of the $ to create user-defined operators.
> >
> > Wouldn't general operator overloading in Icon be better?
> 
> Yes, I gave some thought to overloading some of the existing Icon operators.
> 
> ...
>
> My icon background gives me the philosophy of typed operators as well as
> (loosely) typed data/procedures. It seems more natural to keep the user
> defined objects (operators, procedures, variables) separated from the built
> in ones. This is only my opinion, and it may not line up with the goals
> of the Icon project in the long run.
> 
> Chris Tenaglia (System Manager)

I can't remember if Bill's object oriented Icon has operator and function
overloading or not.  This is my two cents worth and if I've got it all
wrong I trust someone will let me know.

The language that has overloaded operators that I'm most familiar with is
C++.  It discriminates between overloaded operators (functions) by
enforcing strict typing of operands (arguments).  This is how the compiler
determines which operation to perform on the operands.  Operators can
appear to take on almost any type for the programmer working with well
defined C++ classes.

Icon, on the other hand, has operands or variables that can be any type.
To discriminate between different types of operands Icon uses fairly
strongly typed operators (and functions).  There are exceptions
(assignment) but for the most part the operators in Icon are type specific.
I know this because of all the run time errors I generate.  You get a great
deal of automatic type conversion in Icon but it's driven by the operators
more than the types of the operands.

You can add two strings of digits in Icon with the "+" operator but you get
a numeric result, not a longer string of digits.  You can also concatenate
two numbers into a string of digits with the "||" operator.  To allow
overloaded operators would violate this scheme.  How would Icon know
whether to do automatic type conversion or try for another version of the
operator that was a better fit to the given operands?  Someone with
experience in the implementation could shed more light on this.
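
A small illustration of the point above, that the operator rather than
the operand decides which conversion takes place.  This is only a
sketch using the built-in "+", "||" and type():

```icon
procedure main()
   write("12" + "34")          # strings converted to integers: writes 46
   write(12 || 34)             # integers converted to strings: writes 1234
   write(type("12" + "34"))    # writes integer
   write(type(12 || 34))       # writes string
end
```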

User defined operators that are distinguished from built-in operators by an
explicit syntax are the best compromise but there are an awful lot of
operators in Icon already.  Procedure names can be very descriptive. (hint)

Jerry Nowlin
(...!att!iwtqg!nowlin)

From kwalker  Tue Mar 27 09:45:16 1990
Date: Tue, 27 Mar 90 09:45:16 MST
From: "Kenneth Walker" <kwalker>
Message-Id: <9003271645.AA17945@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA17945; Tue, 27 Mar 90 09:45:16 MST
In-Reply-To: <BB818785EF9FE037D8@Arizona.EDU>
To: icon-group
Subject: RE: Icon Ideas? (operators)
Status: O

> Date: Mon, 26 Mar 90 23:15 CST
> From: nowlin@iwtqg.att.COM
> 
> You can add two strings of digits in Icon with the "+" operator but you get
> a numeric result, not a longer string of digits.  You can also concatenate
> two numbers into a string of digits with the "||" operator.  To allow
> overloaded operators would violate this scheme.  How would Icon know
> whether to do automatic type conversion or try for another version of the
> operator that was a better fit to the given operands?  Someone with
> experience in the implementation could shed more light on this.

"+" does give a numeric result, but numeric is either integer or real.
The decision about whether to do integer arithmetic or real arithmetic
is made at run-time. If you could replace "+" with your own operation
and somehow invoke the old "+" within your operation, you could get the
effect of overloading. Assuming the function old_op() gets you the built-in
version, the following operation would enhance "+" to do pair wise addition
of lists.

operator +(a,b)
   if type(a) == type(b) == "list" then {
      r := []
      every i := 1 to *a do
         put(r, old_op(a[i], b[i]))
      return r
      }
   else
      return old_op(a,b)
end
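
Neither the operator declaration nor old_op() exists in Icon, so the
fragment above cannot be run as written.  The same effect can be
sketched with an ordinary procedure.  The name add() is hypothetical,
and the built-in "+" stands in for old_op():

```icon
procedure add(a, b)
   local r, i
   if type(a) == type(b) == "list" then {
      r := []
      # pair-wise addition; stops quietly at the end of the shorter list
      every i := 1 to *a do
         put(r, a[i] + b[i])
      return r
      }
   else
      return a + b              # ordinary built-in addition
end

procedure main()
   every writes(!add([1, 2, 3], [10, 20, 30]), " ")   # writes 11 22 33
   write()
   write(add(2, 3))                                   # writes 5
end
```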
      
I don't necessarily think being able to arbitrarily redefine operators
is a good idea. It leaves you with too few features in the language
whose meaning you can "trust" while reading a program. The idea of
being able to add new operators does not have this problem. However,
no one has brought up the problems of precedence and associativity.
Does a $- b - c mean (a $- b) - c or a $- (b - c)? You need something
in your definition of an operator to deal with this.

Currently, the organization of icont does not allow you to add new
operators. With the tools we use to make icont, it is possible to
organize a translator so that adding new operators can be done, but
you must decide on a fixed set of precedences. If you decide "+"
is at precedence 12 and "*" is at precedence 13, you will not be
able to add operators with intermediate precedences. If you fix
them at 12 and 15, you are limited to 2 levels of precedence between
them. To get around these problems (which are not particularly
serious), you would need a different kind of parser within icont.

  Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 2858  kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker

From tenaglia@fps.mcw.edu  Tue Mar 27 10:22:43 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA20477; Tue, 27 Mar 90 10:22:43 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA23876; Tue, 27 Mar 90 11:18:50 EST
Received: by uwm.edu; id AA23916; Tue, 27 Mar 90 09:22:38 -0600
Message-Id: <9003271522.AA23916@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Tue, 27 Mar 90 09:13:57 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Tue, 27 Mar 90 08:46:42 CDT
Date: Tue, 27 Mar 90 08:46:42 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: VMS Icon 7.0 process cleanup
Status: O

I'm running Icon 7.0 on a VAX with VMS 5.2. I've noticed a little problem.

With files this code fragment works as expected :

          (inf := open(file)) | stop("Can't open ",file)
          while line := read(inf) do
            if find(target,line) then break
          close(inf)
          return line

But with processes something goes wrong

          (inf := open(cmd,"pr")) | stop("Can't run ",cmd)
          while line := read(inf) do
            if find(target,line) then break
          close(inf)
          return line

The close(inf) doesn't work here. The process stays open. Shouldn't it just
be killed? Eventually one's process quota is reached and the open fails if
the fragment is in a loop. Is this fixed in Icon 8? I think the unix versions
work properly (do they?).

          (inf := open(cmd,"pr")) | stop("Can't run ",cmd)
          while line := read(inf) do
            if find(target,line) then temp := line
          close(inf)
          return temp

is my current work-around. But if it generates thousands of lines of output,
and I'm only interested in the first 10, it's rather wasteful.

Thanx,

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From gudeman  Tue Mar 27 12:19:21 1990
Date: Tue, 27 Mar 90 12:19:21 MST
From: "David Gudeman" <gudeman>
Message-Id: <9003271919.AA00259@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA00259; Tue, 27 Mar 90 12:19:21 MST
To: icon-group
In-Reply-To: "Kenneth Walker"'s message of Tue, 27 Mar 90 09:45:16 MST <9003271645.AA17945@megaron.cs.arizona.edu>
Subject: Icon Ideas? (operators)
Status: O

Date: Tue, 27 Mar 90 09:45:16 MST
From: "Kenneth Walker" <kwalker>

]I don't necessarily think being able to arbitrarily redefine operators
]is a good idea. It leaves you with too few features in the language
]whose meaning you can "trust" while reading a program.

Ken is doing research that involves optimization of Icon programs by
doing static analysis of the programs.  Obviously this gets harder as
more things become dependent on run-time conditions.  Perhaps this is
slightly affecting his opinion? ;-)

There is an important advantage to overloading operators, though.
Suppose you write a calculator program.  Of course, inside this
program you use mathematical operators.  Now suppose you decide to
upgrade the program to use complex numbers.  You can't just define a
new type and write functions to operate on complex numbers, you have
to go through the entire program and replace every arithmetic
expression such as ``a + b'' with ``add(a,b)'' (if it is in a position
to take complex values for ``a'' and/or ``b'').  Ick.
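
A sketch of what that rewrite involves without operator overloading.
The record name complex, its fields re and im, and the helper cx() are
assumptions made up for illustration; only add() is named in the text
above.

```icon
record complex(re, im)

# Promote a plain number to a complex value; leave complex values alone.
procedure cx(x)
   return if type(x) == "complex" then x else complex(x, 0)
end

# Replacement for "a + b" wherever a or b may be complex.
procedure add(a, b)
   a := cx(a)
   b := cx(b)
   return complex(a.re + b.re, a.im + b.im)
end

procedure main()
   local z
   z := add(complex(1, 2), 3)
   write(z.re, " ", z.im)       # writes 4 2
end
```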

If you could overload Icon's built-in operators, all you would have to
do is overload the arithmetic operators so that they understood
complex numbers.  I can think of similar examples for non-numeric
applications.

Someone objected that this lets you do strange things like define + to
do subtraction.  True enough.  But honestly, if a programmer is that
determined to write an unreadable program, he can do it just as
easily without operator overloading.  There is nothing the language
designer can do about obtuseness of programmers, and it seems
pointless to try.

From icon-group-request@arizona.edu  Tue Mar 27 20:15:21 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA03298; Tue, 27 Mar 90 20:15:21 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 27 Mar 90 20:17 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20957; Tue, 27 Mar 90
 19:03:35 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 27 Mar 90 20:17 MST
Date: 28 Mar 90 02:30:09 GMT
From: shelby!csli!poser@decwrl.dec.COM
Subject: RE: Icon Ideas? (operators)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <BAD84AEEF71FE032CF@Arizona.EDU>
Message-Id: <12860@csli.Stanford.EDU>
Organization: Center for the Study of Language and Information, Stanford U.
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9003271645.AA17945@megaron.cs.arizona.edu>,
 <9003271919.AA00259@megaron.cs.arizona.edu>
Status: O

In article <9003271919.AA00259@megaron.cs.arizona.edu> gudeman@CS.ARIZONA.EDU ("David Gudeman") writes:
>
>If you could overload Icon's built-in operators, all you would have to
>do is overload the arithmetic operators so that they understood
>complex numbers.  I can think of similar examples for non-numeric
>applications.
>
>Someone objected that this lets you do strange things like define + to
>do subtraction.

There is an intermediate approach available in object-oriented languages
as well as in languages like ML that provide disjunctive procedure
definitions. Implement operator overloading as ADDITION of methods for
new data types, but don't allow pre-defined methods (i.e. the built-in
operators) to be removed. This guarantees that an operator will have
the expected semantics when applied to built-in data types and reduces the
uncertainty to derived types.

From goer@sophist.uchicago.EDU  Wed Mar 28 11:36:57 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA24195; Wed, 28 Mar 90 11:36:57 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 28 Mar 90 11:38 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 28 Mar 90
 12:38:40 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA18812; Wed, 28 Mar 90
 12:33:15 CST
Resent-Date: Wed, 28 Mar 90 11:38 MST
Date: Wed, 28 Mar 90 12:33:15 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: icon & prolog
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <BA578B75D7DFE0384F@Arizona.EDU>
Message-Id: <9003281833.AA18812@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Just an idle question:  Has anyone thought of implementing
Prolog in Icon, either as a Prolog -> Icon translator, or
as a Prolog interpreter written in Icon?  I'm not a Prolog
expert, but it occurs to me that Icon might offer facili-
ties to make such a project much easier than it might be
for most other languages.

Just curious.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From gudeman  Wed Mar 28 13:15:06 1990
Resent-From: "David Gudeman" <gudeman>
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA01233; Wed, 28 Mar 90 13:15:06 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Wed, 28 Mar 90 13:15 MST
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA01095; Wed, 28 Mar 90
 13:12:49 MST
Resent-Date: Wed, 28 Mar 90 13:16 MST
Date: Wed, 28 Mar 90 13:12:49 MST
From: David Gudeman <gudeman@cs.arizona.edu>
Subject: icon & prolog
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <BA49E0F5B17FE0433E@Arizona.EDU>
Message-Id: <9003282012.AA01095@megaron.cs.arizona.edu>
In-Reply-To: Richard Goerwitz's message of Wed, 28 Mar 90 12:33:15 CST
 <9003281833.AA18812@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

   From: Richard Goerwitz <goer@sophist.uchicago.EDU>

   Just an idle question:  Has anyone thought of implementing
   Prolog in Icon, either as a Prolog -> Icon translator, or
   as a Prolog interpreter written in Icon?  I'm not a Prolog
   expert, but it occurs to me that Icon might offer facili-
   ties to make such a project much easier than it might be
   for most other languages.

I wrote an interpreter for a small logic language in Icon, not much
like Prolog, but it did do goal-directed unification with backtracking
like Prolog does.  Your intuition is correct that Icon makes this
easy, at least for an interpreter.  I was able to use Icon's
goal-directed evaluation to do all the goal-directed evaluation of the
logic language, so I didn't have to keep track of states or anything
like that.
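
As an illustration of the point (this is not the interpreter described above,
only a minimal sketch), a conjunction of goals can be written as nested
generators and Icon does the backtracking.  The fact base and the procedure
names child and grandchild are hypothetical.

```icon
global facts                      # hypothetical fact base: [parent, child] pairs

procedure child(p)                # generates every child of p
   local f
   every f := !facts do
      if f[1] == p then suspend f[2]
end

procedure grandchild(g)           # two goals in sequence; a failing inner
   suspend child(child(g))        # goal resumes the outer one automatically
end

procedure main()
   facts := [["tom", "bob"], ["bob", "ann"], ["bob", "pat"], ["tom", "liz"]]
   every write(grandchild("tom")) # writes ann and pat
end
```

No explicit state is kept: when child("liz") produces nothing, Icon simply
resumes child("tom") for its next result.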

I just looked for the code I wrote, and it seems to have disappeared.
Oh well.

From icon-group-request@arizona.edu  Thu Mar 29 04:47:01 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA01727; Thu, 29 Mar 90 04:47:01 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 04:46 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20318; Thu, 29 Mar 90
 03:42:50 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 04:47 MST
Date: 29 Mar 90 10:54:12 GMT
From: zaphod.mps.ohio-state.edu!usc!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: RE: icon & prolog
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B9C7EB4F5BDFE04741@Arizona.EDU>
Message-Id: <1996@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9003281833.AA18812@sophist.uchicago.edu>
Status: O

In article <9003281833.AA18812@sophist.uchicago.edu>, goer@SOPHIST.UCHICAGO.EDU (Richard Goerwitz) writes:
> Just an idle question:  Has anyone thought of implementing
> Prolog in Icon, either as a Prolog -> Icon translator, or
> as a Prolog interpreter written in Icon?  I'm not a Prolog ...

I wrote a Prolog interpreter in Icon some time ago.  I never got
around to doing anything with it (i.e. publishing wise).  I was
in the process of writing the converse (Icon interpreter in Prolog)
when I got sidetracked.  I will post the sources and documentation.
There were four versions in increasing order of complexity.  I only got
around to documentation for versions 1 and 2.  The versions appear to
have implemented the following incrementally:
	1. Basic pure Prolog with negation by failure,
	2. List notation added (syntactic sugar),
	3. Assert and Retract,
	4. Cut.

The program documentation files *.doc are in troff format.  They're still
readable however.  The user guides *.usr are just plain text.  The source
is copyright in the sense that it can be used anywhere for any purpose 
provided the copyright is maintained and I get credit for my work.

I would be interested in any comments about the code.  I was trying to get
as succinct a source file as possible without sacrificing clarity (but it's
always tempting to save a line here and there!).

From icon-group-request@arizona.edu  Thu Mar 29 05:02:47 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02405; Thu, 29 Mar 90 05:02:47 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:02 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20510; Thu, 29 Mar 90
 03:47:58 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:10:30 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: Prolog in Icon (version 1)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B9C573061FDFE04C55@Arizona.EDU>
Message-Id: <1997@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


########################## global variables and types ######################

record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list        \ types of 
record var(name)          # string                         / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query

################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input] # empty dbase; standard input
   while \query | *consult>0 do { # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry) # print first or next solution to qry 
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do { 
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end

########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl]
   local sub,q                     # in given context, returns updated context
   if hd>tl then return ctext
   if (q:=qry[hd]).name=="~" then # negation by failure
      {if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)}
   else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do
           suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
end

procedure tryclause(term,cls,sub) # resolves term using given clause or fails
   local ctext                    # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then 
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end

procedure scanpred(prd,ctext) # converts predicate to structure 
   local args; args:=[] 
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args) 
end

######################## primitive domain operations ########################

procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub 
   local v,i,num           # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)       # forms a new context by extending sub
   local env; env:=table(&null)  # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end
   
procedure trmstr(str,sub) # converts a term to a string suitable for output
   local s; s:=""
   case type(str) of {
      "integer" : return trmstr(sub[str],sub)
      "struct" : {every s:=s||trmstr(!str.args,sub)||","
                  return str.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null" : return "undefined"}
end

############################## Prolog parser ###############################

procedure prog() # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause() 
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause() # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\p|!b) # list of variable identifiers
   if (\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end
 
procedure body() # list of predicates
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure pred() # ~pred | name(body) | uc_name | lc_name()
   local name,args; args:=[]
   if ="~" then return fun("~",[pred()])
   if not (name:=tab(many(&ucase++&lcase++'0123456789._'))) then fail
   if any(&ucase,name) then return var(name)
   if ="(" & args:=body() then # arguments parsed
      if  not =")" then write("syntax error: \")\" expected before ",tab(0))
   return fun(name,args)
end

procedure extractids(ids,pred)
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return
end

From icon-group-request@arizona.edu  Thu Mar 29 05:02:53 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02430; Thu, 29 Mar 90 05:02:53 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20775; Thu, 29 Mar 90
 03:53:51 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:20:55 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: prolog2.usr
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B9C577C4B95FE04C55@Arizona.EDU>
Message-Id: <2000@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


                          Prolog in Icon (version 2)
                          --------------------------
                        Alan Finlay, Monash University.

The Prolog interpreter associated with this file is written in Icon and handles
a very simple version of pure Prolog.  Cuts, arithmetic, assert and retract are
not implemented.  There is only one non-logical primitive, "consult", which is
used as if it were a fact with one argument.  The argument to consult is used as
a file name, and the file is consulted for clauses and queries.  Negation is
implemented as negation by failure.  Version two includes list notation.

Acceptable Prolog programs consist of a list of clauses and queries, one per
line.  The clauses are either facts or rules:

        predicate.
        predicate:-predicate,...,predicate.

A fact is a predicate followed by a full stop.  The syntax of a predicate will
be described later.  A rule has a "head" on the left of the turnstile ":-",
and a "body" on the right.  The body is a list of one or more predicates
separated by commas and terminated by a full stop.  The predicates are simple
identifiers, identifiers parameterised by one or more arguments, or negated
predicates:

        identifier
        identifier(argument,...,argument)
        ~predicate

An identifier is a string of letters, digits, underscores and full stops.  An
identifier which starts with an upper case letter and has no arguments
associated with it is interpreted as a variable identifier.  The arguments are
syntactically identical to predicates but interpreted differently.  Those
arguments with arguments of their own are called structures; those without are
called constants if they do not start with an upper case letter and variables
if they do.

A query is like a rule without a head:

        :-predicate,...,predicate.
        ?-predicate,...,predicate.

The first form causes the interpreter to find all solutions to the query.  The
second form only asks for one solution.  The interpreter reports the values
assigned to free variables in the query, and this is how answers to questions
more complex than yes/no are obtained.
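
The difference between the two query forms corresponds to resuming a generator
until it is exhausted versus taking only its first result.  A minimal sketch in
Icon, where solutions() is a hypothetical stand-in for the resolver:

```icon
procedure solutions()             # hypothetical stand-in for the resolver
   suspend "X=a" | "X=b" | "X=c"
end

procedure main()
   # ":-" form: resume the generator until it is exhausted
   every write("yes, ", solutions())
   write("no more solutions")
   # "?-" form: accept the first result and never resume
   if not write("yes, ", solutions()) then write("no")
end
```

The first loop reports all three bindings; the second statement reports only
X=a, and would write "no" if solutions() produced nothing at all.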

As a simple example, here is a traditional inference test program:

        mortal(X):-man(X).
        man(X):-greek(X).
        greek(socrates).
        ?-mortal(socrates).

Try typing in this example after starting the interpreter with the command

        iconx prolog

and the response is

        yes

After this the interpreter will be waiting for another clause or query to be
entered.  Try entering another query, for example

        ?-mortal(Socrates).

which produces the strange response

        yes, Socrates=socrates

This is because the upper case S indicates a free variable.  To experiment
with negation try

        being(X):-man(X).
        being(X):-god(X).
        god(apollo).
        :-being(X),~mortal(X).

which produces the response

        yes, X=apollo
        no more solutions

Note that negation should only be applied to ground terms (terms with all the
variables bound) or strange behaviour may result.  For example, the query

        :-~mortal(X),being(X).

fails with no solutions: X is still unbound when ~mortal(X) is tried, so
mortal(X) succeeds and the negation fails.
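
The effect of goal order on negation by failure can be imitated directly in
Icon.  This is only a sketch: being() and mortal() are hypothetical procedures
standing in for the database above, and a null argument plays the part of an
unbound variable.

```icon
procedure being()                 # generates every known being
   suspend "socrates" | "apollo"
end

procedure mortal(x)               # x null stands for an unbound variable
   if /x | x == "socrates" then return "socrates"
end

procedure main()
   local x
   # :-being(X),~mortal(X).  X is bound before the negation is tried
   every x := being() do
      if not mortal(x) then write("yes, X=", x)
   # :-~mortal(X),being(X).  X is unbound, so mortal(X) succeeds
   x := &null
   if mortal(x) then write("no solutions")
   else every write("yes, X=", being())
end
```

The first query reports apollo; the second reports no solutions, because the
negated goal is tested before X has a value.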

Finally, send an end of file to finish interpreting Prolog commands.  More
examples can be found in the files "test1.plg", "test2.plg", and so on.  To run
the first of these enter the fact

        consult(test1.plg).

after starting the interpreter; the other files are consulted the same way.

The list notation is simply a set of convenient abbreviations.  Lists are
assumed to be represented by using the binary dot constructor "." and "nil".  The
dot constructor should only be applied to (element,list) pairs and "nil"
represents an empty list.  An infix version of the dot constructor "|" is also
provided and is useful for pattern matching.  Some examples follow.

        abbreviation                    represents
        []                              nil
        [1]                             .(1,nil)
        1|nil                           .(1,nil)
        [1,2]                           .(1,.(2,nil))
        [1,2,3,4]                       .(1,.(2,.(3,.(4,nil))))
        1|2|3|4|nil                     .(1,.(2,.(3,.(4,nil))))
        [[1,2],[3,4]]                   .(.(1,.(2,nil)),.(.(3,.(4,nil)),nil))
        1|2                             .(1,2)

The last example is not a proper list since the second argument of the dot
constructor is not a list.  The list [A,B,C,D] must have exactly four elements
whereas the list A|B|C|D has three or more depending upon the length of the list
bound to D.  The following two versions of naive reverse are exactly equivalent.

reverse([],[]).
reverse(X|Y,Z):-reverse(Y,W),append(W,[X],Z).
append(X,[],X).
append([],X,X).
append(X|Y,Z,X|W):-append(Y,Z,W).

reverse(nil,nil).
reverse(.(X,Y),Z):-reverse(Y,W),append(W,.(X,nil),Z).
append(X,nil,X).
append(nil,X,X).
append(.(X,Y),Z,.(X,W)):-append(Y,Z,W).
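
The bracket abbreviation described above expands mechanically into nested dot
constructors.  As a sketch only, the hypothetical Icon procedure dotted() below
builds the expanded text for a flat Icon list of constants; it is not part of
the interpreter.

```icon
procedure dotted(items, i)        # items: Icon list of constants; i: start index
   /i := 1
   if i > *items then return "nil"
   return ".(" || items[i] || "," || dotted(items, i + 1) || ")"
end

procedure main()
   write(dotted([]))              # nil
   write(dotted([1]))             # .(1,nil)
   write(dotted([1, 2, 3, 4]))    # .(1,.(2,.(3,.(4,nil))))
end
```

Nested lists such as [[1,2],[3,4]] would need the elements expanded
recursively as well, which this sketch does not attempt.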

From icon-group-request@arizona.edu  Thu Mar 29 05:02:57 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02452; Thu, 29 Mar 90 05:02:57 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20539; Thu, 29 Mar 90
 03:48:27 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:14:23 GMT
From: cs.utexas.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: Prolog in Icon (version 2)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B9C57CFA579FE04C55@Arizona.EDU>
Message-Id: <1998@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


# Prolog in Icon, version 2, (C) Alan Finlay, Monash University.
########################## global variables and types ######################

record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list        \ types of 
record var(name)          # string                         / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query

################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input] # empty dbase; standard input
   while \query | *consult>0 do { # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry) # print first or next solution to qry 
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do { 
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end

########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl]
   local sub,q                     # in given context, returns updated context
   if hd>tl then return ctext
   if (q:=qry[hd]).name=="~" then # negation by failure
      {if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)}
   else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do
           suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
end

procedure tryclause(term,cls,sub) # resolves term using given clause or fails
   local ctext                    # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then 
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end

procedure scanpred(prd,ctext) # converts predicate to structure 
   local args; args:=[] 
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args) 
end

######################## primitive domain operations ########################

procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub 
   local v,i,num           # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)       # forms a new context by extending sub
   local env; env:=table(&null)  # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end
   
procedure trmstr(trm,sub) # converts a term to a string suitable for output
   local s; s:=""
   case type(trm) of {
      "integer" : return trmstr(sub[trm],sub)
      "struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list 
                 else {every s:=s||trmstr(!trm.args,sub)||","
                      return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null" : return "undefined"}
end

procedure lstr(l,sub) # succeeds if l is a proper non-empty list and
   local hd,tl        # converts l to string suitable for output
   if l.name=="." & *l.args=2 then {
      hd:=trmstr(l.args[1],sub); tl:=l.args[2]
      while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl
      case type(tl) of {
         "struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil
                     return hd||","||lstr(tl,sub)}                 # cons
         "null" : return "undefined"}}
end

############################## Prolog parser ###############################

procedure prog() # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause() 
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause() # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\p|!b) # list of variable identifiers
   if (\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end
 
procedure body() # list of predicates (may be empty)
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure dots() # converts non-empty body of list to cons cells
   local p
   if p:=pred() then if ="," then return fun(".",[p,dots()])
                     else return fun (".",[p,fun("nil",[])])
end

procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred
   local name,args,d,p,pp; args:=[]
   if ="~" then p:=fun("~",[pred()])
   else if name:=tab(many(&ucase++&lcase++'0123456789._')) then {
      if any(&ucase,name) then p:=var(name)
      else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}}
   else if ="[]" then p:=fun("nil",[]) # empty list abbreviation
   else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation
   if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons
                else write("syntax error: missing second argument to \"|\"")
   return \p # n.b. fails if predicate invalid
end

procedure check(s) # report error if s not present or skip over it
   if not =s then write("syntax error: ",s," expected before ",tab(0))
end

procedure extractids(ids,pred) # build the set of variable identifiers 
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return # the identifiers have been appended to reference parameter ids
end

From icon-group-request@arizona.edu  Thu Mar 29 05:03:05 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02469; Thu, 29 Mar 90 05:03:05 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20789; Thu, 29 Mar 90
 03:54:04 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:22:14 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: prolog3.icn
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B9C583A4465FE04C55@Arizona.EDU>
Message-Id: <2001@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O




# Prolog in Icon, version 3, (C) Alan Finlay, Monash University.
########################## global variables and types ######################

record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list        \ types of 
record var(name)          # string                         / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query

################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input] # empty dbase; standard input
   while \query | *consult>0 do { # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry) # print first or next solution to qry 
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do { 
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end

########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl]
   local sub,q,cls,r,goal          # and returns updated context
   if hd>tl then return ctext
   case (q:=qry[hd]).name of {
      "assert"  : {r:=rule([],q.args[1],q.args[2:0])
                   every extractids(r.ids,!q.args)
                   dbase[q.args[1].name]:=dbase[q.args[1].name]|||[r]
                   suspend resolve(qry,hd+1,tl,ctext)} # always succeeds
      "retract" : suspend retract(q.args[1],ctext) & resolve(qry,hd+1,tl,ctext)
      "~"       : {if not resolve(q.args,1,1,ctext) then 
                      suspend resolve(qry,hd+1,tl,ctext)} # negation by failure
      default   : {goal:=scanpred(q,ctext)
                   every cls := !dbase[q.name] do
                      every sub:=tryclause(goal,cls,ctext.subst) do
                         suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))}
      }
end

procedure retract(pred,ctext) # removes a clause matching pred from dbase 
   local cand,goal,entry,i; i:=1 # fails when no more matching clauses to remove
   goal:=scanpred(pred,ctext)
   every entry:=!dbase[goal.name] do { # check for matching clause
      cand:=scanpred(entry.head,newctxt(entry.ids,ctext.subst))
      if unify(goal,cand,copy(ctext.subst)) then { # found one so remove it
         dbase[goal.name]:=extract(dbase[goal.name],i) 
         suspend} # on backtracking more retractions can occur
      else i+:=1 # i keeps track of the entry number even with extractions
      }
end

procedure tryclause(term,cls,sub) # resolves term using given clause or fails
   local ctext                    # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then 
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end

procedure scanpred(prd,ctext) # converts predicate to structure 
   local args; args:=[] 
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args) 
end

######################## primitive domain operations ########################

procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub 
   local v,i,num           # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)       # forms a new context by extending sub
   local env; env:=table(&null)  # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end
   
procedure trmstr(trm,sub) # converts a term to a string suitable for output
   local s; s:=""
   case type(trm) of {
      "integer" : return trmstr(sub[trm],sub)
      "struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list 
                 else {every s:=s||trmstr(!trm.args,sub)||","
                      return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null" : return "undefined"}
end

procedure lstr(l,sub) # succeeds if l is a proper non-empty list and
   local hd,tl        # converts l to string suitable for output
   if l.name=="." & *l.args=2 then {
      hd:=trmstr(l.args[1],sub); tl:=l.args[2]
      while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl
      case type(tl) of {
         "struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil
                     return hd||","||lstr(tl,sub)}                 # cons
         "null" : return "undefined"}}
end

procedure extract(list,el) # extract list element in position [el:el+1]
   return list:=list[1:el]|||list[el+1:0]
end

############################## Prolog parser ###############################

procedure prog() # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause() 
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause() # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\p|!b) # list of variable identifiers
   if (\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end
 
procedure body() # list of predicates (may be empty)
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure dots() # converts non-empty body of list to cons cells
   local p
   if p:=pred() then if ="," then return fun(".",[p,dots()])
                     else return fun (".",[p,fun("nil",[])])
end

procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred
   local name,args,d,p,pp; args:=[]
   if ="~" then p:=fun("~",[pred()])
   else if name:=tab(many(&ucase++&lcase++'0123456789._')) then {
      if any(&ucase,name) then p:=var(name)
      else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}}
   else if ="[]" then p:=fun("nil",[]) # empty list abbreviation
   else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation
   if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons
                else write("syntax error: missing second argument to \"|\"")
   return \p # n.b. fails if predicate invalid
end

procedure check(s) # report error if s not present or skip over it
   if not =s then write("syntax error: ",s," expected before ",tab(0))
end

procedure extractids(ids,pred) # build the set of variable identifiers 
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return # the identifiers have been appended to reference parameter ids
end

From icon-group-request@arizona.edu  Thu Mar 29 05:03:09 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02479; Thu, 29 Mar 90 05:03:09 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20977; Thu, 29 Mar 90
 03:58:01 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:23:34 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: prolog4.icn
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B9C58C73861FE04C55@Arizona.EDU>
Message-Id: <2002@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


# Prolog in Icon, version 4, (C) Alan Finlay, Monash University.
########################## global variables and types ######################

record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list        \ types of 
record var(name)          # string                         / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query

################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input] # empty dbase; standard input
   while \query | *consult>0 do { # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry) # print first or next solution to qry 
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do {
      if (type(ans)=="string") & (ans=="cut") then fail # cut query
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end

########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl]
   local sub,q,cls,r,goal          # and returns updated context
   if hd>tl then return ctext # terminate linear recursion
   case (q:=qry[hd]).name of {
      "assert"  : {r:=rule([],q.args[1],q.args[2:0])
                   every extractids(r.ids,!q.args)
                   dbase[q.args[1].name]:=dbase[q.args[1].name]|||[r]
                   suspend resolve(qry,hd+1,tl,ctext)} # always succeeds
      "retract" : suspend retract(q.args[1],ctext) & resolve(qry,hd+1,tl,ctext)
      "~"       : {if not (sub:=resolve(q.args,1,1,ctext).subst) |
                         (type(sub)=="string" & sub=="cut") then 
                      suspend resolve(qry,hd+1,tl,ctext)} # negation by failure
      "!"       : {suspend resolve(qry,hd+1,tl,ctext)
                   return "cut"} # causes failure of parent clause
      default   : {goal:=scanpred(q,ctext)
                   every cls := !dbase[q.name] do
                      every sub:=tryclause(goal,cls,ctext.subst) do {
                         if type(sub)=="string" & sub=="cut" then fail
                         else suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))}
                  }
      }
end

procedure retract(pred,ctext) # removes a clause matching pred from dbase 
local cand,goal,entry,i; i:=1 # fails when no more matching clauses to remove
   goal:=scanpred(pred,ctext)
   every entry:=!dbase[goal.name] do { # check for matching clause
      cand:=scanpred(entry.head,newctxt(entry.ids,ctext.subst))
      if unify(goal,cand,copy(ctext.subst)) then { # found one so remove it
         dbase[goal.name]:=extract(dbase[goal.name],i) 
         suspend} # on backtracking more retractions can occur
      else i+:=1 # i keeps track of the entry number even with extractions
      } # n.b. this is a primitive retract since only the head is matched
end

procedure tryclause(term,cls,sub) # resolves term using given clause or fails
   local ctext,res                # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then 
      every res:=resolve(cls.body,1,*cls.body,ctext) do {
          if (type(res)=="string") & (res=="cut") then suspend "cut"
          else suspend res.subst} 
end

######################## primitive domain operations ########################

procedure scanpred(prd,ctext) # converts predicate to structure 
   local args; args:=[] 
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args) 
end

procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub 
   local v,i,num           # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)       # forms a new context by extending sub
   local env; env:=table(&null)  # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end
   
procedure trmstr(trm,sub) # converts a term to a string suitable for output
   local s; s:=""
   case type(trm) of {
      "integer" : return trmstr(sub[trm],sub)
      "struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list 
                 else {every s:=s||trmstr(!trm.args,sub)||","
                      return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null" : return "undefined"}
end

procedure lstr(l,sub) # succeeds if l is a proper non-empty list and
   local hd,tl        # converts l to string suitable for output
   if l.name=="." & *l.args=2 then {
      hd:=trmstr(l.args[1],sub); tl:=l.args[2]
      while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl
      case type(tl) of {
         "struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil
                     return hd||","||lstr(tl,sub)}                 # cons
         "null" : return "undefined"}}
end

procedure extract(list,el) # extract list element in position [el:el+1]
   return list:=list[1:el]|||list[el+1:0]
end

############################## Prolog parser ###############################

procedure prog() # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause() 
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause() # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\p|!b) # list of variable identifiers
   if (\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end
 
procedure body() # list of predicates (may be empty)
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure dots() # converts non-empty body of list to cons cells
   local p
   if p:=pred() then if ="," then return fun(".",[p,dots()])
                     else return fun (".",[p,fun("nil",[])])
end

procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred
   local name,args,d,p,pp; args:=[]
   if ="~" then p:=fun("~",[pred()])
   else if ="!" then p:=fun("!",[])
   else if name:=tab(many(&ucase++&lcase++'0123456789._')) then {
      if any(&ucase,name) then p:=var(name)
      else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}}
   else if ="[]" then p:=fun("nil",[]) # empty list abbreviation
   else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation
   if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons
                else write("syntax error: missing second argument to \"|\"")
   return \p # n.b. fails if predicate invalid
end

procedure check(s) # report error if s not present or skip over it
   if not =s then write("syntax error: ",s," expected before ",tab(0))
end

procedure extractids(ids,pred) # build the set of variable identifiers 
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return # the identifiers have been appended to reference parameter ids
end

From icon-group-request@arizona.edu  Thu Mar 29 05:03:45 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02493; Thu, 29 Mar 90 05:03:45 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:04 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20764; Thu, 29 Mar 90
 03:53:34 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:05 MST
Date: 29 Mar 90 11:17:51 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: prolog2.doc
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B9C5526D17BFE04BD8@Arizona.EDU>
Message-Id: <1999@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


.ce
.uh "Prolog in Icon version 2 : Program documentation"
.ce
Alan Finlay, Monash University, March 1990.

This version of Prolog is loosely based upon the various denotational
semantics for Prolog with which I have been acquainted\*[*\*]
.(f
* In particular: T. Nicholson and N. Foo. "A Denotational Semantics for Prolog",
to appear in ACM TOPLAS.
.)f
and discussion with colleagues at Monash University.
The motivation was to gain experience with Icon and to see if there was any
truth to the claim "It should only take about half a page of Icon to implement
a Prolog interpreter."

.sh1 "Data structures"

.ftCW
########################## global variables and types ######################

.nf
record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \\
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list        \\ types of 
record var(name)          # string                         / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query
.ft
.fi

Despite the lack of enforced type discipline in Icon I have used some 
self-discipline.  The record and global declarations above indicate types
using "|" for disjoint sum and shared field names to get a degree of type
inheritance.
Type "ctxt" is a context for goal resolution and consists of an environment
and a substitution.  The environment is a table which maps variable identifiers
to variables.  A variable is represented by an integer which is its position
in the substitution list.  A substitution then effectively maps each variable
either to a term or to unbound.  The null value is used to represent an
unbound variable.
A term is either another variable or a structure.
Type "struct" is used to represent structures which are either functors or
constants.  Functors have a name and a list of arguments which are terms.
Constants are represented as functors with zero arguments.
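
As a small illustration of this representation, the following sketch builds
by hand a context for two identifiers X and Y, binds Y to the constant a, and
forms the term f(X,Y).  It reuses the record declarations above; the procedure
"main" and its local names are chosen only for the example, and the
interpreter itself builds contexts with "newctxt".

```icon
record ctxt(env,subst)    # as declared above
record struct(name,args)  # as declared above

procedure main()          # illustration only, not part of the interpreter
   local c,t
   c:=ctxt(table(&null),[])               # empty environment and substitution
   c.env["X"]:=*put(c.subst,&null)        # X becomes variable 1, unbound
   c.env["Y"]:=*put(c.subst,&null)        # Y becomes variable 2, unbound
   c.subst[2]:=struct("a",[])             # bind Y to the constant a
   t:=struct("f",[c.env["X"],c.env["Y"]]) # the term f(X,Y) in this context
   write(t.name,"(",t.args[1],",",t.args[2],")") # writes f(1,2)
   write(image(c.subst[1])," ",c.subst[2].name)  # writes &null a
end
```

The term itself holds only the integers 1 and 2; what they stand for is
known only by looking them up in the substitution of the context.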

The types for clauses and predicates are used to represent the syntax of
Prolog.  I will use the words predicate and functor more or less
interchangeably since the interpreter doesn't distinguish between them
structurally.  A consequence of this is that a variable may be used as
a goal.
The use of separate domains for syntactic predicates and their run time
equivalents (called structures above) is required because the syntax is
independent of any context whereas the "terms" referred to above are only
meaningful when considered together with an appropriate context.  At various
stages in a computation a given clause or predicate will have to be used
in different contexts.  There are a few idiosyncrasies of the syntactic 
domains worth mentioning.  Clauses have the redundant field "ids" which is a
list of all the variable identifiers used in the clause.  This saves a lot
of recalculation during execution.  The types "all" and "one" represent
queries for which all solutions or only one solution is requested.
Syntactically this is signaled by using the turnstile ":-" for all solutions
and "?-" for one.
As before type "fun" can be used with no arguments to indicate constants.
Facts are represented as rules with empty bodies.  

The database is global and consists of a table of clause lists indexed
by the name of the predicate at the head.  The index is used for efficiency
reasons only.  The global "consult" is a stack of files being consulted,
used for nesting of consult commands.  The top level query is global only
to simplify the parser; it could instead have been passed back to the driver
via procedure "prog".

.ne7
.sh1 "The driver"

.ftCW
################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input] # empty dbase; standard input
   while \\query | *consult>0 do { # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \\query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry) # print first or next solution to qry 
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do { 
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end
.ft

Most of procedure "main" is concerned with providing a usable interactive
interface and the details are of little interest.  The parser is called
until a query is encountered and "printsoln" is used to resolve it and display
the answer substitution.  The separate
procedure "printsoln" is required because it can generate all solutions and
may be activated once only or resumed until it fails depending on the type
of query.
The procedure "printsoln" simply uses procedure "resolve" to generate solution
contexts from an initial context determined by the query.  This initial 
context has an environment with all and only the free variables in the query
and a substitution in which they are unbound.  This context is created by
calling procedure "newctxt" which will be described later.
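
The difference between the two kinds of query comes down to how a generator
is used.  A minimal sketch, with a hypothetical generator "solns" standing in
for "printsoln" (the strings it produces are invented for the example):

```icon
procedure solns()         # hypothetical stand-in for printsoln
   suspend "first"|"second"|"third"
end

procedure main()
   every write(solns())   # resumed until it fails, as for an "all" query
   write("no more solutions")
   if not write(solns()) then write("no") # activated once, as for a "one" query
end
```

The first loop writes all three strings followed by "no more solutions"; the
final line writes only "first", because the generator is never resumed.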

.ne7
.sh1 "The interpreter proper"

.ftCW
########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl]
   local sub,q                     # in given context, returns updated context
   if hd>tl then return ctext
   if (q:=qry[hd]).name=="~" then # negation by failure
      {if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)}
   else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do
           suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
end

procedure tryclause(term,cls,sub) # resolves term using given clause or fails
   local ctext                    # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then 
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end

procedure scanpred(prd,ctext) # converts predicate to structure 
   local args; args:=[] 
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args) 
end
.ft

The procedure "resolve" is a generator which produces all solutions to a sublist
of the supplied goal list "qry".  This sublist is from element "hd" to element
.(f
* Lisp-style lists can be implemented in Icon but lack the useful
operators the built-in lists have.  On the other hand the built-in list type
in Icon does not have a non-copying "tail" or "cdr" operation.
.)f
"tl" and is passed in this way to save copying sublists\*[*\*].  The
resolution takes place with respect to the supplied context and
an updated context is returned as the answer.  
Ignoring negation for the moment, "resolve" proceeds by resolving
the first goal in the list and for each solution uses a recursive call
to satisfy the rest of the goal list if possible and generates a result
for every case.  The first goal is resolved by resuming "tryclause" for
each clause in the database matching the goal and "tryclause" itself being
a generator can be resumed several times for each clause.  For those not
accustomed to Icon's procedure resumption conventions this is more clearly
expressed as

.ftCW
   q:=qry[hd]
   every cls:=!dbase[q.name] do
      every sub:=tryclause(scanpred(q,ctext),cls,ctext.subst) do
         suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
.ft

Notice that the updated substitution returned by "tryclause" is supplied
to "resolve" for the rest of the list, hence the effects of goal resolution
within a clause body are cumulative.  Another important point is to note that
the recursion bottoms out when the sublist of goals is empty and succeeds in
this case.  It is tempting to try to use a goal generator instead of a goal
list as in

.ftCW
   sub:=ctext.subst
   every q:=!qry do
      every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],sub)
   return ctxt(ctext.env,sub)
.ft

This appears to save an explicit test for the end of the list but suffers
from a fatal flaw.  Apart from only being able to generate one solution, this
scheme finds the first solution provided there is one but otherwise succeeds
anyway with a partial solution, since a goal that cannot be resolved merely
leaves "sub" unchanged and the loop moves on to the next goal.

Negation is handled simply as "negation as failure" by using Icon's "not"
operator, which succeeds if and only if its argument fails.  Since the
substitution cannot be updated when a negated goal succeeds, the original
substitution is passed to the remaining goals in this case.

The procedure "tryclause" first creates a context for resolving the supplied
term against the supplied clause.  This context is based upon the supplied 
substitution and all the free identifiers in the clause.  The free identifiers
and a copy of the substitution are used by "newctxt" to create a context
which has an environment for the identifiers as new variables and the
substitution extended with these new variables unbound.  Denotational
semantics for Prolog may perform this task on the fly as a clause is
interpreted and this corresponds operationally to a great deal of recomputation.
Here the syntax parser generates a list of free variable identifiers (without
repetitions) only once, and combined with "newctxt" this avoids extending
the context in a piecemeal fashion.

The substitution passed to "newctxt" is a copy since "newctxt" and "unify"
cause side effects upon their substitution parameter.  If these side effects
were eliminated it would require two copying operations to be performed on
very similar substitutions where only one is required.  The unifier returns
only success or failure of the attempted unification but in the case of
success the supplied substitution is updated as a side effect.
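
The effect of copying can be seen in a small sketch.  It unifies f(X,a) with
f(b,Y) against a copy of a two-variable substitution, using the unifier
exactly as listed in the next section; the procedure "main" and the names
"trial", "a" and "b" are introduced only for the illustration.

```icon
record struct(name,args)  # as declared earlier

procedure unify(t1,t2,sub) # copied from the listing in the next section
   local v,i,num
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure main()          # illustration only
   local sub,trial,a,b
   sub:=[&null,&null]                  # variables 1 (X) and 2 (Y), unbound
   a:=struct("f",[1,struct("a",[])])   # f(X,a)
   b:=struct("f",[struct("b",[]),2])   # f(b,Y)
   trial:=copy(sub)                    # work on a copy, as "tryclause" does
   if unify(a,b,trial) then
      write("X=",trial[1].name,", Y=",trial[2].name) # writes X=b, Y=a
   write(image(sub[1]))                # writes &null: original untouched
end
```

Only the copy receives the bindings, so a clause that later fails leaves the
caller's substitution exactly as it was.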

The procedure "scanpred" simply converts a predicate from its syntactic form
into a term according to some relevant context.  There are no side effects.

.ne7
.sh1 "Primitive domain operations and the parser"

.ftCW
######################## primitive domain operations ########################

procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub 
   local v,i,num           # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)       # forms a new context by extending sub
   local env; env:=table(&null)  # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end
   
procedure trmstr(trm,sub) # converts a term to a string suitable for output
   local s; s:=""
   case type(trm) of {
      "integer" : return trmstr(sub[trm],sub)
      "struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list 
                 else {every s:=s||trmstr(!trm.args,sub)||","
                      return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null" : return "undefined"}
end

procedure lstr(l,sub) # succeeds if l is a proper non-empty list and
   local hd,tl        # converts l to string suitable for output
   if l.name=="." & *l.args=2 then {
      hd:=trmstr(l.args[1],sub); tl:=l.args[2]
      while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl
      case type(tl) of {
         "struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil
                     return hd||","||lstr(tl,sub)}                 # cons
         "null" : return "undefined"}}
end

############################## Prolog parser ###############################

procedure prog() # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause() 
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause() # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\\p|!b) # list of variable identifiers
   if (\\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end
 
procedure body() # list of predicates (may be empty)
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure dots() # converts non-empty body of list to cons cells
   local p
   if p:=pred() then if ="," then return fun(".",[p,dots()])
                     else return fun (".",[p,fun("nil",[])])
end

procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred
   local name,args,d,p,pp; args:=[]
   if ="~" then p:=fun("~",[pred()])
   else if name:=tab(many(&ucase++&lcase++'0123456789._')) then {
      if any(&ucase,name) then p:=var(name)
      else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}}
   else if ="[]" then p:=fun("nil",[]) # empty list abbreviation
   else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation
   if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons
                else write("syntax error: missing second argument to \\"|\\"")
   return \\p # n.b. fails if predicate invalid
end

procedure check(s) # report error if s not present or skip over it
   if not =s then write("syntax error: ",s," expected before ",tab(0))
end

procedure extractids(ids,pred) # build the set of variable identifiers 
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return # the identifiers have been appended to reference parameter ids
end
.ft

The rest of the interpreter is supplied for completeness but is of little
intrinsic interest.  

The unifier updates the supplied substitution.  It is natural to consider 
using Icon's backtrackable (reversible) assignment and hence avoiding the need to copy
substitutions altogether.  A preliminary investigation indicates that this
is feasible but space inefficient.  This may be due to poor implementation
of backtrackable assignment in the particular Icon interpreter used\*[*\*].
.(f
* &version = "Icon Version 6.0.  July 7, 1986." (University of Arizona)
.)f
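
For readers unfamiliar with it, backtrackable assignment is written "<-" in
Icon, where it is called reversible assignment: the old value is restored if
the assignment is resumed during backtracking.  The following sketch is not
taken from the interpreter; it only shows the effect on a one-element
substitution list.

```icon
procedure main()          # illustration only
   local sub
   sub:=[&null]                       # one unbound variable
   if (sub[1]<-"a") & (1=2) then      # the second conjunct always fails
      write("not reached")
   write(image(sub[1]))               # writes &null: the binding was undone
   if sub[1]<-"a" then                # nothing fails after the assignment
      write(image(sub[1]))            # writes "a": the binding is kept
end
```

A unifier written this way would undo its own bindings on failure, which is
what would make the copy in "tryclause" unnecessary.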

The parser uses procedures which return parse trees except that "clause"
uses a global variable to return a top level query and fails when it
encounters one.  This causes termination of string scanning in "prog".
The behaviour of the user interface is described in the user manual.

From kwalker  Thu Mar 29 12:27:32 1990
Date: Thu, 29 Mar 90 12:27:32 MST
From: "Kenneth Walker" <kwalker>
Message-Id: <9003291927.AA00336@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA00336; Thu, 29 Mar 90 12:27:32 MST
In-Reply-To: <9003281833.AA18812@sophist.uchicago.edu>
To: icon-group
Subject: Re:  icon & prolog
Status: O

> Date: Wed, 28 Mar 90 12:33:15 CST
> From: Richard Goerwitz <goer@sophist.uchicago.EDU>
> 
> Just an idle question:  Has anyone thought of implementing
> Prolog in Icon, either as a Prolog -> Icon translator, or
> as a Prolog interpreter written in Icon?

You might want to check out "Logicon: an Integration of Prolog into
Icon" by Guy Lapalme and Suzanne Chapleau, Software Practice and
Experience, Oct 1986. They implement a Prolog interpreter in Icon
which lets you call back and forth between the two languages.

  Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 2858  kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker

From ralph  Sat Mar 31 05:07:38 1990
Date: Sat, 31 Mar 90 05:07:38 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9003311207.AA17346@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA17346; Sat, 31 Mar 90 05:07:38 MST
To: icon-group
Subject: Version 8 of Icon
Status: O

Version 8 of Icon is complete and implementations will be available
for most computer systems soon.

Version 8 has both new features and improvements to the implementation.

New language features:

	Math functions:  sin(), cos(), ..., exp(), log(), etc.

	Keyboard functions:  getch(), getche(), kbhit(); PC implementations
	   only.

	key(T) to generate the keys in table T.

	name(v) and variable(s) to produce string name of variable v and
	   vice versa.

	p!L to invoke p with arguments in list L.

	&letters, cset of all letters.

	Arbitrary-precision integer arithmetic (not supported on all PCs).

	Serial numbers for structures.

	An interface for calling C functions from Icon and vice versa.

Implementation changes:

	Smaller structures.

	Dynamic hashing for sets and tables.

	Instrumentation of storage management.

Implementations of Version 8 will be announced as they become available.

Please direct any questions to me, not to icon-group or icon-project.



  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From ralph  Sat Mar 31 05:50:35 1990
Date: Sat, 31 Mar 90 05:50:35 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9003311250.AA18283@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA18283; Sat, 31 Mar 90 05:50:35 MST
To: icon-group
Subject: Version 8 of Icon for UNIX
Status: O

Version 8 of Icon for UNIX systems is now available. This implementation
can be configured for a wide variety of UNIX systems. Configuration
information is provided for 56 different systems, including the Sun
Sparcstation, the DecStation, the NeXT, the DG AViiON, and the Cray-2.

Configurations for new systems can be added with relative ease.

The UNIX distribution includes source code, configuration files,
documentation, the Icon program library (new in Version 8), and
several auxiliary components of Icon.

Version 8 of Icon for UNIX systems can be obtained by anonymous FTP
to cs.arizona.edu. After connecting, cd /icon/v8.  Get READ.ME
there for more information.

If you do not have FTP access or prefer to obtain a magnetic tape
and printed documentation, Version 8 of Icon for UNIX can be ordered
from:

	Icon Project
	Department of Computer Science
	Gould-Simpson Building
	The University of Arizona
	Tucson, AZ   85721

	602 621-2018 (voice)
	602 621-4246 (FAX)

The price is $30, payable in US dollars with a check written on a bank
in the United States.  Orders also can be charged to MasterCard or Visa.
This price includes shipping by parcel post in the United States, Canada,
and Mexico. Add $10 for air mail delivery to other countries.

Please direct any questions to me, not to icon-project or icon-group.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From goer@sophist.uchicago.EDU  Sat Mar 31 16:32:35 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA21928; Sat, 31 Mar 90 16:32:35 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Sat, 31 Mar 90 16:33 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Sat, 31 Mar 90
 17:33:50 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA24215; Sat, 31 Mar 90
 17:28:19 CST
Resent-Date: Sat, 31 Mar 90 16:34 MST
Date: Sat, 31 Mar 90 17:28:19 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: more help w/ BSD->SysV filename conversion
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B7D2D2117EBFE0252D@Arizona.EDU>
Message-Id: <9003312328.AA24215@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Yet another version of the BSD->SysV tar file mapping aid.

The chore of renaming files is time-consuming, and any automation that
can be introduced into the process is certainly of use to me.  I hope
that this succession of mapping programs I'm posting will be useful
to others, too.  This one has been used on more platforms than the one
I posted before, and has a few more error checks.  It also prevents the
user from fooling with nested tar archives, and permits preservation of
an arbitrary number of extensions of any length (e.g. .dvi.Z, .pxl).


#-------------------------------------------------------------------
#
#	PROGNAME:  mtf (stands for "map tar file")
#	
#	PURPOSE:  Maps 15+ char. filenames in a tar archive to 14
#	chars.  Handles both header blocks and the archive itself.
#
#	USAGE:  mtf inputfile .extensions     # (writes to stdout)
#
#       Inputfile is a tar archive; "extensions" is a sequence of
#       strings which denote .extensions to be preserved in mapped
#       filenames.  One-char extensions are automatically preserved.
#       Writes a "mapped" tar archive to the stdout.
#
#       BUGS:  Mtf only maps filenames found in the main tar headers.
#       Because of this, mtf cannot accept nested tar archives.  Mtf
#       also obviously cannot know about conflicts with filenames in
#       use outside the archive.  Check before you extract!
#
#	Richard L. Goerwitz, III
#	Last modified 3/29/90
#
#-------------------------------------------------------------------


global filenametbl, chunkset, short_chunkset   # see procedure mappiece(s)
global extensions                              # ditto
record hblock(name,junk,size,mtime,chksum,
              linkflag,linkname,therest)       # see readtarhdr(s)


procedure main(a)

    usage := "usage:  mtf inputfile extensions  # output goes to stdout\n" ||
             "        useful extensions include .tar.Z .pxl .cpi, etc."
    0 < *a | stop(usage)
    intext := open(a[1],"r") |
	stop("mtf:  can't open ",a[1])
    a[1][-2:0] == ".Z" &
        stop("mtf:  sorry, can't accept compressed files")
    extensions := a[2:0]
    every i := 1 to *extensions
    do extensions[i] ?:= (=".", tab(0))

    # Run through all the headers in the input file, filling
    # (global) filenametbl with the names of overlong files;
    # make_table_of_filenames fails if there are no such files.
    make_table_of_filenames(intext) | {
	write(&errout,"mtf:  no overlong path names to map") 
	a[1] ? (tab(find(".tar")+4), pos(0)) |
	  write(&errout,"(Is ",a[1]," even a tar archive?)")
 	exit(1)
    } 

    # Now that a table of overlong filenames exists, go back
    # through the text, remapping all occurrences of these names
    # to new, 14-char values; also, reset header checksums, and
    # reformat text into correctly padded 512-byte blocks.  Ter-
    # minate output with 512 nulls.
    seek(intext,1)
    every writes(output_mapped_headers_and_texts(intext))

    close(intext)
    write_report()   # Record mapped file and dir names for future ref.
    exit(0)
    
end



procedure make_table_of_filenames(intext)

    local header # chunkset is global

    # search headers for overlong filenames; for now
    # ignore everything else
    while header := readtarhdr(reads(intext,512)) do {
	# tab upto the next header block
	tab_nxt_hdr(intext,trim_str(header.size),1)
	# record overlong filenames in several global tables, sets
	fixpath(trim_str(header.name))
    }
    *\chunkset ~= 0 | fail
    return &null

end



procedure output_mapped_headers_and_texts(intext)

    # Remember that filenametbl, chunkset, and short_chunkset
    # (which are used by various procedures below) are global.
    local header, newtext, full_block, block, lastblock

    # Read in headers, one at a time.
    while header := readtarhdr(reads(intext,512)) do {

	# Replace overlong filenames with shorter ones, according to
	# the conversions specified in the global hash table filenametbl
	# (which were generated by fixpath() on the first pass).
      	header.name := left(map_filenams(header.name),100,"\x00")
	header.linkname := left(map_filenams(header.linkname),100,"\x00")

	# Use header.size field to determine the size of the subsequent text.
	# Read in the text as one string.  Map overlong filenames found in it
	# to shorter names as specified in the global hash table filenametbl.
	newtext := map_filenams(tab_nxt_hdr(intext,trim_str(header.size)))

	# Now, find the length of newtext, and insert it into the size field.
	header.size := right(exbase10(*newtext,8) || " ",12," ")

	# Calculate the checksum of the newly retouched header.
	header.chksum := right(exbase10(get_checksum(header),8)||"\x00 ",8," ")

	# Finally, join all the header fields into a new block and write it out
	full_block := ""; every full_block ||:= !header
	suspend left(full_block,512,"\x00")

	# Now we're ready to write out the text, padding the final block
	# out to an even 512 bytes if necessary; the next header must start
	# right at the beginning of a 512-byte block.
	newtext ? {
	    while block := move(512)
	    do suspend block
	    pos(0) & next
            lastblock := left(tab(0),512,"\x00")
	    suspend lastblock
	}
    }
    # Write out a final null-filled block.  Some tar programs will write
    # out 1024 nulls at the end.  Dunno why.
    return repl("\x00",512)

end



procedure trim_str(s)

    # Knock out spaces, nulls from those crazy tar header
    # block fields (some of which end in a space and a null,
    # some just a space, and some just a null [anyone know
    # why?]).
    return s ? {
	(tab(many(' ')) | &null) &
	    trim(tab(find("\x00")|0))
    } \ 1

end 



procedure tab_nxt_hdr(f,size_str,firstpass)

    # Tab upto the next header block.  Return the bypassed text
    # as a string if not the first pass.

    local hs, next_header_offset

    hs := integer("8r" || size_str)
    next_header_offset := (hs / 512) * 512
    hs % 512 ~= 0 & next_header_offset +:= 512
    if 0 = next_header_offset then return ""
    else {
	# if this is pass no. 1 don't bother returning a value; we're
	# just collecting long filenames;
	if \firstpass then {
	    seek(f,where(f)+next_header_offset)
	    return
	}
	else {
	    return reads(f,next_header_offset)[1:hs+1] |
		stop("mtf:  error reading in ",
		     string(next_header_offset)," bytes.")
	}
    }

end



procedure fixpath(s)

    # Fixpath is a misnomer of sorts, since it is used on
    # the first pass only, and merely examines each filename
    # in a path, using the procedure mappiece to record any
    # overlong ones in the global table filenametbl and in
    # the global sets chunkset and short_chunkset; no fixing
    # is actually done here.

    s2 := ""
    s ? {
	while piece := tab(find("/")+1)
	do s2 ||:= mappiece(piece) 
	s2 ||:= mappiece(tab(0))
    }
    return s2

end



procedure mappiece(s)

    # Check s (the name of a file or dir as recorded in the tar header
    # being examined) to see if it is over 14 chars long.  If so,
    # generate a unique 14-char version of the name, and store
    # both values in the global hashtable filenametbl.  Also store
    # the original (overlong) file name in chunkset.  Store the
    # first fifteen chars of the original file name in short_chunkset.
    # Sorry about all of the tables and sets.  It actually makes for
    # a reasonably efficient program.  Doing away with both sets,
    # while possible, causes a tenfold drop in execution speed!
    
    # global filenametbl, chunkset, short_chunkset, extensions
    local j, ending

    initial {
	filenametbl := table()
	chunkset := set()
	short_chunkset := set()
    }
   
    chunk := trim(s,'/')
    if chunk ? (tab(find(".tar")+4), pos(0)) then {
	write(&errout, "mtf:  Sorry, I can't let you do this.\n",
	               "      You've nested a tar archive within\n",
	               "      another tar archive, which makes it\n",
	               "      likely I'll f your filenames ubar.")
	exit(2)
    }
    if *chunk > 14 then {
	i := 0

	if /filenametbl[chunk] then {
	# if we have not seen this file, then...
	    repeat {
		# ...find a new unique 14-character name for it;
		# preserve important suffixes like ".Z," ".c," etc.
		# First, check to see if the original filename (chunk)
		# ends in an important extension...
		if chunk ?
		    (tab(find(".")),
		     ending := move(1) || tab(match(!\extensions)|any(&ascii)),
		     pos(0)
		     )
		# ...If so, then leave the extension alone; mess with the
		# middle part of the filename (e.g. file.with.extension.c ->
		# file.with001.c).
		then {
		    j := (15 - *ending - 3)
		    lchunk:= chunk[1:j] || right(string(i+:=1),3,"0") || ending
		}
		# If no important extension is present, then reformat the
		# end of the file (e.g. too.long.file.name -> too.long.fil01).
		else lchunk := chunk[1:13] || right(string(i+:=1),2,"0")

		# If the resulting shorter file name has already been used...
		if lchunk == !filenametbl
		# ...then go back and find another (i.e. increment i & try
		# again; else break from the repeat loop, and...
		then next else break
	    }
            # ...record both the old filename (chunk) and its new,
	    # mapped name (lchunk) in filenametbl.  Also record the
	    # original name in chunkset and its first 15 characters
	    # in short_chunkset.
	    filenametbl[chunk] := lchunk
	    insert(chunkset,chunk)
	    insert(short_chunkset,chunk[1:16])
	}
    }

    # If the filename is overlong, return lchunk (the shortened
    # name), else return the original name (chunk).  If the name,
    # as passed to the current function, contained a trailing /
    # (i.e. if s[-1]=="/"), then put the / back.  This could be
    # done more elegantly.
    return (\lchunk | chunk) || ((s[-1] == "/") | "")

end



procedure readtarhdr(s)

    # Read the silly tar header into a record.  Note that, as was
    # complained about above, some of the fields end in a null, some
    # in a space, and some in a space and a null.  The procedure
    # trim_str() may be (and in fact often _is_) used to remove this
    # extra garbage.

    this_block := hblock()
    s ? {
	this_block.name     := move(100)    # <- to be looked at later
	this_block.junk     := move(8+8+8)  # skip the permissions, uid, etc.
	this_block.size     := move(12)     # <- to be looked at later
	this_block.mtime    := move(12)
	this_block.chksum   := move(8)      # <- to be looked at later
	this_block.linkflag := move(1)
	this_block.linkname := move(100)    # <- to be looked at later
	this_block.therest  := tab(0)
    }
    integer(this_block.size) | fail  # If it's not an integer, we've hit
                                     # the final (null-filled) block.
    return this_block

end



procedure map_filenams(s)

    # Chunkset is global, and contains all the overlong filenames
    # found in the first pass through the input file; here the aim
    # is to map these filenames to the shortened variants as stored
    # in filenametbl (GLOBAL).

    local s2

    s2 := ""
    s ? {
	until pos(0) do {
	    # first narrow the possibilities, using short_chunkset
	    if member(short_chunkset,&subject[&pos:&pos+15])
            # then try to map from a long to a shorter 14-char filename
	    then s2 ||:= (filenametbl[=!chunkset] | move(1))
	    else s2 ||:= move(1)
	}
    }
    return s2

end


#  From the IPL.  Thanks, Ralph -
#  Author:  Ralph E. Griswold
#  Date:  June 10, 1988
#  exbase10(i,j) convert base-10 integer i to base j
#  The maximum base allowed is 36.

procedure exbase10(i,j)

   static digits
   local s, d, sign
   initial digits := &digits || &lcase
   if i = 0 then return 0
   if i < 0 then {
      sign := "-"
      i := -i
      }
   else sign := ""
   s := ""
   while i > 0 do {
      d := i % j
      if d > 9 then d := digits[d + 1]
      s := d || s
      i /:= j
      }
   return sign || s

end

# end IPL material


procedure get_checksum(r)
 
    # Calculates the new value of the checksum field for the
    # current header block.  Note that the specification says
    # that, when calculating this value, the chksum field must
    # be blank-filled.

    sum := 0
    r.chksum := "        "
    every field := !r
    do every sum +:= ord(!field)
    return sum

end



procedure write_report()

    # This procedure writes out a list of filenames which were
    # remapped (because they exceeded the SysV 14-char limit),
    # and then notifies the user of the existence of this file.

    local outtext, stbl, i

    (outtext := open(fname := "mapping.report","w")) |
	(outtext := open(fname := "/tmp/mapping.report","w")) |
	     stop("mtf:  Can't find a place to put mapping.report!")
    stbl := sort(filenametbl,3)
    every i := 1 to *stbl -1 by 2 do {
	write(outtext,left(stbl[i],35," ")," ",stbl[i+1])
    }
    write(&errout,"mtf:  ",fname," contains the list of changes")
    close(outtext)
    return &null

end
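
The two pieces of tar arithmetic that mtf relies on (rounding text up
to whole 512-byte blocks, and summing a header with its checksum field
blanked) can be tried in isolation.  What follows is only a sketch:
padded_size() and header_sum() are hypothetical helpers, not part of
mtf, and the field offsets are taken from readtarhdr() above (100 + 24
+ 12 + 12 = 148 bytes precede the 8-byte chksum field).

```icon
# Round a byte count up to a whole number of 512-byte tar blocks.
procedure padded_size(n)
    return ((n + 511) / 512) * 512
end

# Sum the byte values of a 512-byte header block, treating the
# checksum field (characters 149 through 156) as eight blanks.
procedure header_sum(block)
    local sum
    sum := 0
    every sum +:= ord(!(block[1:149] || repl(" ", 8) || block[157:0]))
    return sum
end

procedure main()
    write(padded_size(0))       # 0
    write(padded_size(1))       # 512
    write(padded_size(512))     # 512
    write(padded_size(513))     # 1024
    # An all-null header sums to 8 blanks * 32 = 256.
    write(header_sum(repl("\x00", 512)))
end
```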

From goer@sophist.uchicago.EDU  Sat Mar 31 16:59:29 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA22736; Sat, 31 Mar 90 16:59:29 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Sat, 31 Mar 90 17:00 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Sat, 31 Mar 90
 18:00:26 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA24270; Sat, 31 Mar 90
 17:54:54 CST
Resent-Date: Sat, 31 Mar 90 17:01 MST
Date: Sat, 31 Mar 90 17:54:54 CST
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: help with MS-DOS files
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B7CF0D1F6EDFE058C0@Arizona.EDU>
Message-Id: <9003312354.AA24270@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Appended below are two short programs for converting MSDOS
text files to Unix format and vice-versa.  I use them all
the time.  I hope someone else finds them helpful.

#-------------------------------------------------------------------
#
#     PROGRAM:  nocr (stands for "no carriage return")
#
#     usage:  nocr [file1 [file2 [etc.]]]
#
#     PURPOSE:  Nocr simply removes carriage returns from the
#     files whose names are given as arguments to the program.
#     I use it to import MS-DOS files.
#
#     BUGS:  None known.
#
#     Richard L. Goerwitz, III
#     last modified 1/20/90
#
#-------------------------------------------------------------------

procedure main(a)

  local fname, infile, outfile

  *a = 0 & stop("usage:  nocr file1 file2...")
  while fname := pop(a) do {
    infile := open(fname,"r") | (er(fname,"reading"), next)
    outfile := open(fname || ".xM","w") |
      (er(fname || ".xM","writing"), close(infile), next)
    while line := !infile do {
      if line[-1] == "\x0D"
      then write(outfile,line[1:-1])
      else write(outfile,line)
      }
    close(infile) | stop("nocr:  cannot close ",fname)
    close(outfile) | stop("nocr:  cannot close ",fname || ".xM")
    remove(fname) | stop("nocr:  cannot remove ",fname)
    rename(fname || ".xM",fname)
    }

end

procedure er(fname, mode)
  # fname is local to main(), so it must be passed in explicitly.
  write(&errout,"nocr:  cannot open ",fname," for ",mode)
  return
end


#-------------------------------------------------------------------
#
#     PROGRAM:  yescr
#
#     usage:  yescr [file1 [file2 [etc.]]]
#
#     PURPOSE:  Yescr simply adds a CR before each newline in the
#     files whose names are given as arguments to the program.
#     I use it to export MS-DOS files.
#
#     BUGS:  None known.
#
#     Richard L. Goerwitz, III
#     last modified 1/20/90
#
#-------------------------------------------------------------------

procedure main(a)

  local fname, infile, outfile

  *a = 0 & stop("usage:  yescr file1 file2...")
  while fname := pop(a) do {
    infile := open(fname,"r") | (er(fname,"reading"), next)
    outfile := open(fname || ".xM","w") |
      (er(fname || ".xM","writing"), close(infile), next)
    while line := !infile do {
      if line[-1] ~== "\x0D" | line == ""
      then write(outfile,line || "\x0D")
      else write(outfile,line)
      }
    close(infile) | stop("yescr:  cannot close ",fname)
    close(outfile) | stop("yescr:  cannot close ",fname || ".xM")
    remove(fname) | stop("yescr:  cannot remove ",fname)
    rename(fname || ".xM",fname)
    }

end

procedure er(fname, mode)
  # fname is local to main(), so it must be passed in explicitly.
  write(&errout,"yescr:  cannot open ",fname," for ",mode)
  return
end


    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer
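
The per-line transformation at the heart of nocr and yescr can be
isolated and tested without touching any files.  This is a minimal
sketch; strip_cr() and add_cr() are hypothetical names, not procedures
from the programs above.

```icon
# Drop one trailing carriage return, if present.
procedure strip_cr(line)
    if line[-1] == "\x0D" then return line[1:-1]
    return line
end

# Append a carriage return unless the line already ends in one.
procedure add_cr(line)
    if line[-1] == "\x0D" then return line
    return line || "\x0D"
end

procedure main()
    write(*strip_cr("abc\x0D"))           # 3
    write(*add_cr("abc"))                 # 4
    write(*add_cr(""))                    # 1 (an empty line gets a lone CR)
    write(*add_cr(strip_cr("abc\x0D")))   # 4 (round trip)
end
```

Note that line[-1] fails on an empty string, so both procedures fall
through to their second return in that case.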

From ralph  Mon Apr  2 18:35:07 1990
Date: Mon, 2 Apr 90 18:35:07 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004030135.AA24469@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA24469; Mon, 2 Apr 90 18:35:07 MST
To: icon-group
Subject: Version 8 of Icon for VMS
Status: O

Version 8 of Icon for VAX/VMS is now available.

The VMS distribution includes source code, object code, executable binaries,
documentation, and the Icon program library (new in Version 8).

Version 8 of Icon for VMS systems can be obtained by anonymous FTP
to cs.arizona.edu. After connecting, cd /icon/v8.  Get READ.ME
there for more information.  See vmsfix.com in that directory for
information about patching up VMS BACKUP tapes after FTP.

If you do not have FTP access or prefer to obtain a magnetic tape
and printed documentation, Version 8 of Icon for VMS can be ordered
from:

	Icon Project
	Department of Computer Science
	Gould-Simpson Building
	The University of Arizona
	Tucson, AZ   85721

	602 621-2018 (voice)
	602 621-4246 (FAX)

The price is $30, payable in US dollars with a check written on a bank
in the United States.  Orders also can be charged to MasterCard or Visa.
This price includes shipping by parcel post in the United States, Canada,
and Mexico. Add $10 for air mail delivery to other countries.

Please direct any questions to me, not to icon-project or icon-group.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph


From cjeffery  Mon Apr  2 20:52:45 1990
Resent-From: "Clinton Jeffery" <cjeffery>
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA01383; Mon, 2 Apr 90 20:52:45 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Mon, 2 Apr 90 20:54 MST
Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15)
 via SMTP id AA01306; Mon, 2 Apr 90 20:52:11 MST
Received: by caslon; Mon, 2 Apr 90 20:52:10 mst
Resent-Date: Mon, 2 Apr 90 20:54 MST
Date: Mon, 2 Apr 90 20:52:10 mst
From: Clinton Jeffery <cjeffery@cs.arizona.edu>
Subject: Icon Ideas? (operator overloading)
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B61C186D9FFFE06CE1@Arizona.EDU>
Message-Id: <9004030352.AA03647@caslon>
In-Reply-To: shelby!csli!poser@decwrl.dec.COM's message of 28 Mar 90 02:30:09
 GMT <12860@csli.Stanford.EDU>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

> There is an intermediate approach...Implement operator overloading as
> ADDITION of methods for new data types, but don't allow pre-defined
> methods (i.e. the built-in operators) to be removed. This guarantees 
> that an operator will have the expected semantics when applied to 
> built-in data types and reduces the uncertainty to derived types.

I like this a lot.  For Icon, though, I am not sure operator overloading
makes sense, since Icon does not support the addition of new data types
(other than records).  It makes great sense in the object-oriented
Icon-derivatives (what a mouthful!), but none of them do operator
overloading so far as I know.  No one is willing to translate + into an
Icon procedure call.  Neither is anyone willing to make extensive
additions to the Icon interpreter to support this feature which "normal"
Icon programs couldn't use.  In the absence of type information, are there
any other alternatives?
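
To make the "translate + into an Icon procedure call" option concrete,
here is a rough sketch of what such a translation might dispatch to.
The record complex and the procedure add() are hypothetical, invented
for illustration; they are not taken from any Icon derivative.

```icon
record complex(re, im)          # hypothetical user-defined type

# Hypothetical stand-in for an overloaded "+": the new type gets its
# own method, and everything else falls through to the built-in
# operator, whose semantics are left untouched.
procedure add(a, b)
    if type(a) == "complex" & type(b) == "complex" then
        return complex(a.re + b.re, a.im + b.im)
    return a + b
end

procedure main()
    local z
    write(add(2, 3))                        # built-in semantics: 5
    z := add(complex(1, 2), complex(3, 4))
    write(z.re, " ", z.im)                  # 4 6
end
```

The cost being objected to is visible here: every addition, even of
two integers, pays for a procedure call and a type() test.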

From icon-group-request@arizona.edu  Tue Apr  3 10:44:53 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13687; Tue, 3 Apr 90 10:44:53 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 3 Apr 90 10:21 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA10428; Tue, 3 Apr 90 09:20:00
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 3 Apr 90 10:22 MST
Date: 3 Apr 90 15:02:32 GMT
From: esquire!yost@nyu.EDU
Subject: The Splash programming language
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B5AB4B25657FC000D2@Arizona.EDU>
Message-Id: <1909@esquire.UUCP>
Organization: DP&W, New York, NY
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

A friend at the IBM Watson Research Center saw a talk
last week by Paul Abrahams on a language he is designing
called Splash, which is supposed to derive to some extent
from Icon.  Does anyone know how to find out more about
this language?

Thanks

 --dave yost
   yost@dpw.com or uunet!esquire!yost
   Please ignore the From or Reply-To fields above, if different.

From icon-group-request@arizona.edu  Tue Apr  3 16:33:39 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13536; Tue, 3 Apr 90 16:33:39 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 3 Apr 90 16:33 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA10493; Tue, 3 Apr 90 16:30:24
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 3 Apr 90 16:35 MST
Date: 3 Apr 90 23:30:14 GMT
From: usenet@arizona.edu
Subject: Can Icon be ftp'd from anywhere? Any icon for the Mac?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B5771EFA9FDFC00452@Arizona.EDU>
Message-Id: <35270@ucbvax.BERKELEY.EDU>
Organization: School of Education, UC-Berkeley
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

around? For the Mac would be great. Any help much appreciated. Thanks
From: thom@dewey.soe.berkeley.edu (Thom Gillespie)
Path: dewey.soe.berkeley.edu!thom

--Thom



From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Tue Apr  3 19:28:26 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA07078; Tue, 3 Apr 90 19:28:26 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA05905; Tue, 3 Apr 90 22:28:21 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Tue, 3 Apr 90 22:28:14 EDT
Date: Tue, 3 Apr 90 20:31:14 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <214949@Wayne-MTS>
Subject: The SPLASH Programming Language
Status: O

 
Re Dave Yost's inquiry: here's the abstract of the talk that I've been
giving on SPLASH:
 
========================================================
 
SPLASH: A Systems Programming Language for Software Hackers
 
SPLASH is a programming language designed for programmers who delight in
their tools.  SPLASH supports programming at a high level of expression,
yet it enables its user to understand how the code he writes is actually
executed and to maintain precise control, where it is wanted, over what the
computer is actually doing.  Its high-level facilities include the ability
to define container types and iterators; inheritance of operations and
polymorphism in the object-oriented style; generators that provide most of
the facilities of coroutines but at far less cost; tuple types; and a great
many syntactic niceties that encourage elegance and transparency of
expression.  Although the ideas in SPLASH are not radically new, they are
combined and integrated in a new way.  Major sources of inspiration for
SPLASH are Icon, C++, and SEDL, a Software Engineering Design Language
developed at IBM that is an extension of Ada.  SPLASH is still a paper
language, but the ideas in it are of interest independently of any
particular implementation.  The talk will describe the features of SPLASH
and give some examples of its use. 
=============================================================
 
I still don't have a formal report on it, but I hope to have that by early
summer.  Meanwhile I'll be happy to answer questions about SPLASH and its
status.  There are a lot of things in it that are directly taken from
ICON (for which I give full credit).  Two major differences (which are
related): SPLASH is strongly typed, and it is designed to be compiled rather
than interpreted.  The strong typing also makes operator overloading possible
(re the recent discussions in the ICON group).
 
Paul Abrahams
Abrahams%wayne-mts@um.cc.umich.edu
214 River Road
Deerfield  MA  01342
(413) 774-5500

From R.J.Hare@EDINBURGH.AC.UK  Wed Apr  4 11:19:43 1990
Received: from rvax.ccit.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA26454; Wed, 4 Apr 90 11:19:43 MST
Received: from UKACRL.BITNET by rvax.ccit.arizona.edu; Wed, 4 Apr 90 11:18 MST
Received: from RL.IB by UKACRL.BITNET (Mailer X1.25) with BSMTP id 2221; Tue,
 03 Apr 90 10:04:13 BST
Date: 03 Apr 90  10:04:40 bst
From: R.J.Hare@EDINBURGH.AC.UK
Subject: Prolog documentation
To: icon-group@cs.arizona.edu
Message-Id: <03 Apr 90  10:04:40 bst  340539@EMAS-A>
Via:        UK.AC.ED.EMAS-A;  3 APR 90 10:04:10 BST
X-Envelope-To: icon-group@CS.ARIZONA.EDU
Status: O

Someone put a short document file on this board the other day, which contained
instructions for running the prolog interpreter. I have foolishly lost this.
Could it please be re-posted on the board?
 
Thanks.
 
Roger Hare.

From icon-group-request@arizona.edu  Wed Apr  4 19:00:49 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP
	id AA17274; Wed, 4 Apr 90 19:00:49 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 4 Apr 90 17:48 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA10800; Wed, 4 Apr 90 17:44:52
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 4 Apr 90 17:55 MST
Date: 4 Apr 90 23:23:06 GMT
From: motcid!henley@uunet.uu.NET
Subject: RE: Can Icon be ftp'd from anywhere? Any icon for the Mac?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B4A2C0FF959FC00F03@Arizona.EDU>
Message-Id: <2074@mica6.UUCP>
Organization: Motorola Inc., Cellular Infrastructure Div., Arlington Heights, IL
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <35270@ucbvax.BERKELEY.EDU>
Status: O

usenet@ucbvax.BERKELEY.EDU (USENET News Administration) writes:

>around? For the Mac would be great. Any help much appreciated. Thanks
>From: thom@dewey.soe.berkeley.edu (Thom Gillespie)
>Path: dewey.soe.berkeley.edu!thom

>--Thom

To all questions: yes!

    1) ICON is available from the University of Arizona (versions 7.5 and 7.0):

        BBS:  (602) 621-2283

        FTP:  arizona.edu (/icon)
              (128.196.128.118 or 192.12.69.1)

    2) There is a version available for the Mac!

 -------------------------------------------------
|    Aaron Henley (uunet!motcid!henley)           |
|    Motorola Cellular Infrastructure Division    |
 -------------------------------------------------

From icon-group-request@arizona.edu  Wed Apr  4 19:01:22 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP
	id AA17414; Wed, 4 Apr 90 19:01:22 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 4 Apr 90 17:33 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA09534; Wed, 4 Apr 90 17:24:54
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 4 Apr 90 17:37 MST
Date: 5 Apr 90 00:01:01 GMT
From: bullwinkle!ccsam@ucdavis.ucdavis.EDU
Subject: Icon on the IBM RS/6000
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B4A541F67F9FC00ED5@Arizona.EDU>
Message-Id: <7033@aggie.ucdavis.edu>
Organization: Computing Services, UC Davis
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

the subject says it all - has someone ported it yet?

(please respond via mail; and, if you've got a 6000 and want
 to hear what responses i get, please send me a note).

-sam

From icon-group-request@arizona.edu  Wed Apr  4 21:49:29 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP
	id AA26942; Wed, 4 Apr 90 21:49:29 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 4 Apr 90 21:50 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA23834; Wed, 4 Apr 90 21:38:24
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 4 Apr 90 21:51 MST
Date: 4 Apr 90 16:05:44 GMT
From: esquire!yost@nyu.EDU
Subject: RE: Any Icon (programming language) for the Mac?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B481DF6F09FFC01061@Arizona.EDU>
Message-Id: <1913@esquire.UUCP>
Organization: DP&W, New York, NY
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <35270@ucbvax.BERKELEY.EDU>
Status: O

In article <35270@ucbvax.BERKELEY.EDU> thom@dewey.soe.berkeley.edu.UUCP (Thom Gillespie) writes.

ProIcon for the Mac looks from its manual to be
extremely well done.  I haven't used it or seen
it used yet, but it has nicely-accessible online
language reference documentation, power assist for
writing code, some support for Mac windows, and
tracing has been extended to include calls to builtin
functions, and other goodies.

ProIcon was developed by Mark Emmer and Ralph Griswold
and is available mail order for $175 from their company
called Bright Forest at 303-539-3884.

It would be nice if someone who uses ProIcon
would tell us if it really is as good as it looks.

There is also a more Unix-like, batch-oriented version
for MPW available from icon-project@cs.arizona.edu.

 --dave yost
   yost@dpw.com or uunet!esquire!yost
   Please ignore the From or Reply-To fields above, if different.

From shafto@eos.arc.nasa.GOV  Thu Apr  5 08:20:36 1990
Resent-From: shafto@eos.arc.nasa.GOV
Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP
	id AA29876; Thu, 5 Apr 90 08:20:36 MST
Received: from eos.arc.nasa.gov by Arizona.EDU; Thu, 5 Apr 90 08:19 MST
Received: Thu, 5 Apr 90 08:16:57 PST by eos.arc.nasa.gov (5.59/1.2)
Resent-Date: Thu, 5 Apr 90 08:21 MST
Date: Thu, 5 Apr 90 08:16:57 PST
From: Michael Shafto <shafto@eos.arc.nasa.GOV>
Subject: RE: Any Icon (programming language) for the Mac?
Resent-To: icon-group@cs.arizona.edu
To: esquire!yost@nyu.EDU, icon-group@arizona.edu
Cc: shafto@EOS.ARC.NASA.GOV
Resent-Message-Id: <B429D55CA3DFC0132C@Arizona.EDU>
Message-Id: <9004051616.AA23858@eos.arc.nasa.gov>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: esquire!yost@nyu.EDU, icon-group@Arizona.edu
X-Vms-Cc: shafto@EOS.ARC.NASA.GOV
Status: O

Well, I've used ProIcon quite a bit, as well as other symbol-processing
languages on the Mac -- Allegro Common Lisp and MaxSpitbol.  I uncovered
a bug regarding coroutines, which Mark & Ralph patched up right away --
they're a lot more responsive than the MPW folks, in my experience.

I found the ProIcon environment significantly more comfortable than
the ACL environment, even though I had sort of adapted to ACL before
getting ProIcon.  As a general high-level language that gives access
to the Mac qua Mac, I strongly recommend ProIcon.

(Can't comment much on MaxSpitbol due to less experience with it,
though it looks to have many of the same good environmental
features as ProIcon.)

Mike

From goer@sophist.uchicago.EDU  Thu Apr  5 15:51:15 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA00754; Thu, 5 Apr 90 15:51:15 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 5 Apr 90 14:55 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 5 Apr 90 16:54:03
 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA00373; Thu, 5 Apr 90
 16:50:21 CDT
Resent-Date: Thu, 5 Apr 90 15:03 MST
Date: Thu, 5 Apr 90 16:50:21 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: benchmarks for v8, Xenix
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B3F1A7D705FFC014BD@Arizona.EDU>
Message-Id: <9004052150.AA00373@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


I'm just curious what sorts of benchmarks people are getting
for other machines.  For me, v8 runs just as fast as v7 (maybe
even a tad faster, but this may be due to my new compiler).

I'm running a 386 box with Xenix 2.3.2, and the 2.3.0 compiler
set (with lng085 sls = MSC 5.1).  However, I actually ended up
compiling Icon with gcc (-traditional).  My results were:

concord.out:concord elapsed time = 17166
deal.out:deal elapsed time = 21133
ipxref.out:ipxref elapsed time = 4233
queens.out: elapsed time = 22383
rsg.out: elapsed time = 24367

The MSC 5.1 compiler compiled okay, but the result was unsatisfactory
(none of the builtin functions could be found by the interpreter).
Gcc passed all the tests, except for things that
are site-specific, and a few floating point operations (for
which it gave a few digits less precision).

Incidentally, if anyone gets Icon compiled with MSC 5.1 under
Xenix, I'd appreciate hearing about it.  My real reason for
posting, though, isn't to gripe about the Xenix compiler, but
to find out what others are discovering about Icon v8 as far
as the benchmarks go.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From goer@sophist.uchicago.EDU  Thu Apr  5 17:25:06 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA10199; Thu, 5 Apr 90 17:25:06 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 5 Apr 90 17:25 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 5 Apr 90 19:23:31
 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA00503; Thu, 5 Apr 90
 19:19:50 CDT
Resent-Date: Thu, 5 Apr 90 17:27 MST
Date: Thu, 5 Apr 90 19:19:50 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: coexpressions for Xenix
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B3DD96DC7EBFC01749@Arizona.EDU>
Message-Id: <9004060019.AA00503@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


I just implemented coexpressions for my Xenix system.  It didn't
require rewriting a single line of code.  I just moved the Microport
System V file into the Xenix directory, removed the #define NoCoexp
(or whatever it is), and recompiled.  Note that I used gcc.
I'd guess it would work with MSC 5.1 if the bugs there could be
worked out.  Anyway, it passes the tests, and so I'm happy.  I just
thought someone else might like to know that it can (easily) be
done.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From cargo@tardis.cray.com  Fri Apr  6 06:23:21 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA17024; Fri, 6 Apr 90 06:23:21 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA14189; Fri, 6 Apr 90 08:23:54 CDT
Received: from zk.cray.com by hall.cray.com
	id AA10754; 3.2/CRI-3.12; Fri, 6 Apr 90 08:23:52 CDT
Received: by zk.cray.com
	id AA17193; 3.2/CRI-3.12; Fri, 6 Apr 90 08:23:50 CDT
Date: Fri, 6 Apr 90 08:23:50 CDT
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9004061323.AA17193@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: RE: Any Icon (programming language) for the Mac?
Status: O

I have used ProIcon for generating PostScript files for printing
on the Mac.  Just this last weekend, I wrote and tested two Icon
programs on MS-DOS and then moved the files to the Mac (via MS-DOS
5.25" floppy to MS-DOS 720K 3.5" floppy to Mac via Apple File Exchange
with a Mac with Superdrive).

The same Icon programs compiled and ran with ProIcon.  For me, this
was extremely useful, since my home system is MS-DOS but my target
environment was a Mac.

I have also used the ProIcon features for reading and writing files
to and from windows, but not so much since maintaining portability
was high on my list for my major programs.

dsc

From ralph  Fri Apr  6 08:26:29 1990
Date: Fri, 6 Apr 90 08:26:29 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004061526.AA24992@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA24992; Fri, 6 Apr 90 08:26:29 MST
To: icon-group
Subject: ProIcon
Status: O

The contact/telephone numbers for ProIcon in recent e-mail were
not correct.

ProIcon is marketed by Catspaw, Inc.  Their telephone number is

	719-539-3884

The telephone number for The Bright Forest Company is

	602-325-3948

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From shafto@eos.arc.nasa.gov  Fri Apr  6 10:32:01 1990
Received: from eos.arc.nasa.gov by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA05802; Fri, 6 Apr 90 10:32:01 MST
Received: Fri, 6 Apr 90 08:28:04 PST by eos.arc.nasa.gov (5.59/1.2)
Date: Fri, 6 Apr 90 08:28:04 PST
From: Michael Shafto <shafto@eos.arc.nasa.gov>
Message-Id: <9004061628.AA12692@eos.arc.nasa.gov>
To: cargo@tardis.cray.com, icon-group@cs.arizona.edu
Subject: RE: Any Icon (programming language) for the Mac?
Cc: shafto@EOS.ARC.NASA.GOV
Status: O

I have also happily ported Icon from MS-DOS to Mac (ProIcon),
which saved me a lot of time.  I've not seen a Lisp dialect
that can do the same trick, though I imagine there must be such
a path via Common Lisp.

Mike

From icon-group-request@arizona.edu  Fri Apr  6 15:26:42 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA29636; Fri, 6 Apr 90 15:26:42 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 6 Apr 90 15:26 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA23944; Fri, 6 Apr 90 15:04:16
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 6 Apr 90 15:26 MST
Date: 6 Apr 90 21:20:38 GMT
From: pacific.mps.ohio-state.edu!zaphod.mps.ohio-state.edu!usc!elroy.jpl.nasa.gov!suned1!zaft@tut.cis.ohio-state.EDU
Subject: RE: Can Icon be ftp'd from anywhere? Any icon for the Mac?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B3254B01E47FC02098@Arizona.EDU>
Message-Id: <3605@suned1.Navy.MIL>
Organization: NSWSES, Port Hueneme, CA
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <35270@ucbvax.BERKELEY.EDU>, <2074@mica6.UUCP>
Status: O

CORRECTION:  The latest Icon Newsletter says the address for U of
Arizona is changing from arizona.edu to cs.arizona.edu.

Gordon Zaft

-- 
*****************************************************************************
* 	suned1!zaft@elroy.JPL.Nasa.Gov 	zaft@suned1.nswses.navy.mil         *
*	Chairman, Ventura County ACM	Phone: (805) 982-0684               *
*    Any statements / opinions made here are mine, alone, not the Navy's.   *

From icon-group-request@arizona.edu  Mon Apr  9 16:19:18 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA07889; Mon, 9 Apr 90 16:19:18 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Mon, 9 Apr 90 10:37 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA18962; Mon, 9 Apr 90 10:26:32
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Mon, 9 Apr 90 16:03 MST
Date: 9 Apr 90 17:22:44 GMT
From: usc!zaphod.mps.ohio-state.edu!rpi!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU
Subject: Icon on the Mac
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B0C49CFE19FFC02ACA@Arizona.EDU>
Message-Id: <3344@uwm.edu>
Organization: University of Wisconsin-Milwaukee
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I have been using ProIcon (Icon for the Macintosh computer)
continuously for some three months, and am writing to share some
impressions at this point. I feel that one's response to ProIcon will
be largely determined by how one responds to the Macintosh. I am
currently preparing some instructional materials for modern Hebrew,
reviving some ideas I tried to implement some twenty years ago using
programmed learning. Many students liked the approach, but I was
bothered by the almost inevitable clumsiness and turgidity of
programmed texts, and did not pursue my ideas too far. I feel that the
computer offers real possibilities of helping students with the
difficult task of learning natural languages, especially those which
use exotic scripts. I decided to try to implement this on the
Macintosh, because my impression is that students prefer this machine,
and when they can choose between it and the other system, they vote
with their feet.

	The fact is that soaps doubtless have a much greater
following than dramatic masterpieces on public TV, so one need hardly
be surprised that a bright, cheery, sometimes gormless approach wins
out on the computer too. I hope I don't seem to be an intellectual
snob in saying this, because I realise that my tastes are often
low-brow, and I could not honestly criticize the individuals (my son
is one) who enjoy the Macintosh. But I don't care for the Macintosh.
It is bad enough that I still have to take out the trash in real life;
I don't need to do it symbolically on the screen. These trivialities
annoy me as a Macintosh user, but, more seriously, as a programmer I
feel the lack of support of a consistent and powerful operating system
which will willingly accept my written orders.

	Using ProIcon is quite pleasant; it just can't get away from
the world in which it lives. True, it adds some functions presaging
version 8, and it has a development system comparable to that of Turbo
Pascal, bringing you back to the Editor when an error is detected. But
there is a trade-off; the ProIcon Editor is an Editor and, unlike the
Editor in Apple Pascal for the II+ for example, does not have an
automatic fill mode that avoids the carriage returns when writing
straight text. Accordingly, I find that, having more or less completed
a program in ProIcon, for which I found the Editor quite satisfactory,
I am now preparing the data files which the program processes on my
Zenith using my favorite editor JOVE (Jonathan's Own Version of
Emacs), and transferring them to the Mac.

	There is one big addition which ProIcon has, and that is the
windows. You do not absolutely need to use these windows if you don't want
to. I transferred a lengthy Icon program which gives a visually equivalent
Jewish and Gregorian calendar for any year from MS-DOS and UNIX to the
Macintosh and only needed to remove the screen controls to make it work
nicely on the Macintosh using only the Interactive window.  But it is a
pity to waste this resource if you are working in the first instance on the
Macintosh, and portability is not important. It does however inject a new
element into one's programming. Suddenly one has to become to some extent a
graphic designer, and this, like programming, is an art, but an entirely
different one. The immense flexibility and power of the window functions of
ProIcon force the programmer to think about all kinds of esthetic issues
which were not really relevant previously. Perhaps someone could do for
ProIcon's windows what Leslie Lamport did for TeX -- take over the visual
design aspect and let the programmer concentrate on logical design. Lamport
points out that "with a visual design system, authors usually produce
aesthetically pleasing, but poorly designed documents." I have an
uncomfortable feeling that I may be doing the same thing with my windows.

	With this admission, I would yet suggest some tentative guidelines
for using these windows. Perhaps others will have further suggestions which
will enable us to build up a body of expertise in this area. First, plan the
windows which you will need for the entire program. They can be set up at
the beginning, their size, position and fonts can be determined, and
decisions made as to how they will be connected with disk files, if at all.
They do not have to be visible at this stage, or indeed at any stage (I'll
address this later). For example, at one stage of my program I have a setup
which looks like this:
 ------------------------------------------------------------------------
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
|        Interactive Window         |         Hebrew Window             |
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
|                                   |                                   |
 ------------------------------------------------------------------------
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                       Information Window                              |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
 ------------------------------------------------------------------------

The upper left window gives information to, and gets responses from, the
user. It derives its information from a disk file, but the window itself is
not connected to a file. The upper right window is connected to an empty
file which has been opened for writing, so this window is dynamic. Items
appear there (in Hebrew script) which have been prompted by the activity in
the adjacent window. This material can be saved permanently if desired. The
bottom window is connected to a complete, previously prepared file which has
been opened for reading. This is a static window, which the user refers to
as necessary, moving back and forth at will by manipulating the bar, arrows
and thumb on the right side of the window. This might be a help screen, or a
set of relevant information which needs to be handy throughout the exercise.
In addition there is a fourth window which the program never activates. This
is connected to a file logging the user's activity, hour by hour and day by
day, and measuring success.  Material is constantly appended to this file,
which is saved at the end of the program. The user can see it at any time
before the program ends by using the pull-down window menu, and clicking on
the Log entry, dismissing it after perusal by clicking on its close box in
its upper left hand corner.  This window just lurks in the background, and
some users might never activate it at all.

	I think this is sufficient to indicate that the permutations of the
ways in which windows can be used are vast. It is probably best to try out
the window functions in some trivial program, just to get a feel for the
manner in which they work. They really do what they are supposed to, but
often it is easier to understand what the functions do by seeing them at
work rather than trying to understand a verbal description. The
documentation does a pretty good job, and is pleasant to look at, but it 
is really hard to describe all these possibilities clearly in words.


--
Alan D. Corre
Department of Hebrew Studies
University of Wisconsin-Milwaukee                     (414) 229-4245
PO Box 413, Milwaukee, WI 53201               corre@csd4.csd.uwm.edu

From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Mon Apr  9 19:40:44 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA22670; Mon, 9 Apr 90 19:40:44 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA16310; Mon, 9 Apr 90 22:40:38 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Mon, 9 Apr 90 22:40:25 EDT
Date: Mon, 9 Apr 90 20:16:22 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <216762@Wayne-MTS>
Subject: SPLASH: A Systems Programming Language for Software Hackers
Status: O

 
In answer to a recent inquiry in this forum, here's the abstract of the talk
that I've been giving on SPLASH.  My apologies if you've received this twice;
I thought I transmitted it once before but I never got a copy back.   Many of
the ideas in SPLASH are derived from Icon.  The main differences are that
SPLASH is strongly typed and that it is designed to be compiled rather than
interpreted.  (That helps in providing operator overloading - a recent topic
of discussion in this forum.)  I'll be happy to answer any questions about
SPLASH either by email or otherwise.
 
==================================================================
 
SPLASH: A Systems Programming Language for Software Hackers
 
SPLASH is a programming language for programmers who delight in
their tools.  SPLASH supports programming at a high level of
expression, yet it enables its user to understand how the code he
writes is really executed and to maintain precise control, where
it is wanted, over what the computer is actually doing.  Its
high-level facilities include the ability to define container
types and iterators; generators that provide most of the
facilities of coroutines but at far less cost; tuple types;
inheritance of operations and polymorphism in the object-oriented
style; and a great many syntactic niceties that encourage
elegance and transparency of expression.  Although the ideas in
SPLASH are not radically new, they are combined and integrated in
a new way.  Major sources of inspiration for SPLASH are Icon,
C++, and SEDL, a Software Engineering Design Language developed
at IBM that is an extension of Ada.  Although SPLASH has not yet
been implemented, the ideas in it are of interest independently
of any particular implementation.  The talk will describe the
features of SPLASH and give some examples of its use. 
 
 
 
Paul Abrahams
214 River Road
Deerfield  MA  01342
(413) 774-5500
Abrahams%wayne-mts@um.cc.umich.edu 

From goer@sophist.uchicago.EDU  Mon Apr  9 23:20:36 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02894; Mon, 9 Apr 90 23:20:36 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 9 Apr 90 23:15 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 10 Apr 90
 01:13:45 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA06171; Tue, 10 Apr 90
 01:09:59 CDT
Resent-Date: Mon, 9 Apr 90 23:21 MST
Date: Tue, 10 Apr 90 01:09:59 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: type conversion
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <B0875C6947DFC02FBD@Arizona.EDU>
Message-Id: <9004100609.AA06171@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


I find that I don't make much use of automatic type conversion.
Even though this feature lies at the heart of the Icon implementation,
I wonder how much it lies at the heart of the language's conception.

Normally, if I think that a type conversion will occur, I make
the conversion explicit with a builtin function like string()
or cset() or integer().  I find that if conversions are occurring
where I don't know about them, they are *usually* indicative of
sloppy programming on my part.
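
As a minimal sketch of the difference (an illustration only; the
variable names are arbitrary and a standard Icon interpreter is
assumed), the first write() below relies on implicit conversion, while
the remaining lines use the explicit builtins named above:

```icon
procedure main()
   s := "3"

   # Implicit conversion: + expects numeric operands, so the string "3"
   # is converted to the integer 3 without any indication in the source.
   write(s + 4)                  # writes 7

   # Explicit conversion: integer() fails (produces no value) when its
   # argument is not a valid numeral, so the failure can be tested.
   if n := integer(s) then write(n + 4)
   else write("not a number")

   # cset() and string() make other conversions visible in the same way.
   c := cset("hello")            # the character set 'ehlo'
   write(*c)                     # writes 4
   write(string(12) || "!")      # writes 12!
end
```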

If automatic type conversion disappeared, what sorts of ramifications
would it have?  Would some aspect of the language that I haven't
considered be radically altered?  Or would it permit greater
speed and allow for fuller implementation of things like operator
overloading (which some seem to want)?

What if we had optional static typing?  Would this offer the best
of both worlds?  How would such a feature be implemented and integrated
into the rest of the language (if in fact it is desirable in the first
place)?  Would it be hard to do?

I don't claim to be a theoretician.  Any discussion or clarification
would be much appreciated.  Flames as well.  I don't take this sort
of thing too personally.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu  Wed Apr 11 04:04:09 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA14471; Wed, 11 Apr 90 04:04:09 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 11 Apr 90 04:05 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA19478; Wed, 11 Apr 90
 03:47:54 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 11 Apr 90 04:05 MST
Date: 11 Apr 90 10:22:29 GMT
From: zaphod.mps.ohio-state.edu!usc!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: RE: type conversion
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <AF968971935FC03C84@Arizona.EDU>
Message-Id: <2035@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9004100609.AA06171@sophist.uchicago.edu>
Status: O

In article <9004100609.AA06171@sophist.uchicago.edu>, goer@SOPHIST.UCHICAGO.EDU (Richard Goerwitz) writes:
> 
> I find that I don't make much use of automatic type conversion.
> Even though this feature lies at the heart of the Icon implemen-
> tation, I wonder how much it lies at the heart of the language's
> conception.

I call this kind of type conversion "type coercion" and if I can quote from
T.W. Pratt's book (2nd Edition) "Programming Languages, Design and
Implementation", p 57:

	"Coercions are an important design issue in most languages.  Two
	 opposed philosophies exist regarding the extent to which the
	 language should provide coercions between data types."

The two schools of thought Pratt refers to are:
	(i) only do essential coercions like int/real (Modula-2 goes even
	    further and does none!), 
	(ii) if there is a coercion that might make sense then do it (e.g.
	     if a string looks like it could be a numeral then convert it).

> 
> Normally, if I think that a type conversion will occur, I make
> the conversion explicit with a builtin function like string()
> or cset() or integer().  I find that if conversions are occurring
> where I don't know about them, they are *usually* indicative of
> sloppy programming on my part.

Your philosophy is (i) above.

> 
> If automatic type conversion disappeared, what sorts of ramifications
> would it have?  Would some aspect of the language that I haven't
> considered be radically altered?  Or would it permit greater
> speed and allow for fuller implementation of things like operator
> overloading (which some seem to want)?
>

From the point of view of implementation it is probably easier to do
no coercions.  There would be little difference in execution speed either way.
It would probably be easy to provide a flag to turn this feature on or off
when using Icon.  Unfortunately this would mean two kinds of source code
proliferating.  In the functional language community a similar controversy
surrounds the use of strict or lazy evaluation of procedure arguments.
Lazy evaluation is more powerful but can enable some very arcane programming
techniques.  The functional language ML chose to use strict evaluation.
Some people liked ML except for this one thing, hence LML was born.
Imagine what it would be like if every language design decision was made
an implementation variable!

> What if we had optional static typing?  Would this offer the best
> of both worlds?  How would such a feature be implemented and inte-
> grated into the rest of the language (if in fact it is desirable
> in the first place)?  Would it be hard to do?
>

Static type checking is not realistically possible for a language such as Icon.
Consider the expression:
	if x=0 then "small" else 1
The type of this expression may be impossible to determine until run time.
In Icon, besides expressions such as these, it is possible to check the type
of a value at run time and act according to this information.  I have used
this myself to obtain the behaviour of variant types.  For example:
	if type(x)=="integer" then x+1 else 0
Expressions such as this are quite foreign to languages with static type
checking.  Perhaps what you need is some kind of "lint" program like that
which the C programming language has.  For programs which can be statically
typed it could warn of any type clashes and coercions that will occur; for
other programs it could just indicate that static type checking was not
feasible.
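
The two expressions above can be wrapped in a small runnable sketch.
The procedure names describe and bump are made up for illustration; the
point is only that the result type of the first depends on a run-time
value, and that the second dispatches on the run-time type name:

```icon
procedure describe(x)
   # A string on one branch and an integer on the other, so the type of
   # the result is known only at run time.
   return if x = 0 then "small" else 1
end

procedure bump(x)
   # Variant-style dispatch on the run-time type name.
   return if type(x) == "integer" then x + 1 else 0
end

procedure main()
   every v := describe(0 | 5) do
      write(type(v), ": ", v)    # writes string: small, then integer: 1
   write(bump(41))               # writes 42
   write(bump("41"))             # writes 0; "41" is a string, not an integer
end
```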

From goer@sophist.uchicago.EDU  Wed Apr 11 06:12:07 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA18233; Wed, 11 Apr 90 06:12:07 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 11 Apr 90 06:11 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 11 Apr 90
 08:10:07 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA07853; Wed, 11 Apr 90
 08:06:19 CDT
Resent-Date: Wed, 11 Apr 90 06:13 MST
Date: Wed, 11 Apr 90 08:06:19 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: type coercion
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <AF84A9EABC9FC030C1@Arizona.EDU>
Message-Id: <9004111306.AA07853@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Consider the following:

   Static type checking is not realistically possible for a language
   such as Icon. Consider the expression:

	if x=0 then "small" else 1

   The type of this expression may be impossible to determine until
   run time.  In Icon as well as such expressions as these it is
   possible to check the type of a value at run time and act according
   to this information.  I have used this myself to obtain the
   behaviour of variant types.  For example:

	if type(x)=="integer" then x+1 else 0 

   Expressions such as this are quite foreign to languages with static
   type checking.  Perhaps what you need is some kind of "lint"
   program like that which the C programming language has.  For
   programs which can be statically typed it could warn of any type
   clashes and coercions that will occur, for other programs it could
   just indicate that static type checking was not feasible.

I don't disagree that expressions such as this are foreign to languages
with static type checking.  What I wonder is whether a thing like optional
static typing might be applied to variables, and not expressions.  In
this scenario,

	if x = 0 then "small" else 1

would be fine.
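
Icon has no such declarations, so the following is only a run-time
approximation of the idea, not static typing.  The helper typed() is
hypothetical: it checks a value once, where the variable is assigned,
and leaves the if-expression itself free to yield either type:

```icon
# Hypothetical helper: return x if its run-time type is tname,
# otherwise terminate the program with a message.
procedure typed(x, tname)
   if type(x) == tname then return x
   stop("type error: expected ", tname, ", got ", type(x))
end

procedure main()
   x := typed(0, "integer")             # x is checked to hold an integer
   r := if x = 0 then "small" else 1    # the expression stays untyped
   write(type(r))                       # writes string
end
```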

Just curious.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From ralph  Wed Apr 11 08:35:53 1990
Date: Wed, 11 Apr 90 08:35:53 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004111535.AA26352@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA26352; Wed, 11 Apr 90 08:35:53 MST
To: icon-group
Subject: Version 8 of Icon for MS-DOS
Status: O

Version 8 of Icon for MS-DOS is now available.

There are two packages -- one contains executable binary files and the
other contains source code.

The executable binaries support only the large memory model. Two versions
of the run-time system are provided: one that supports large-integer
arithmetic and a smaller one that does not.

The source code compiles under Microsoft C 5.10 (MS-DOS and OS/2), Lattice C
6.01, and Turbo C 2.0.

Version 8 of Icon for MS-DOS systems can be obtained by anonymous FTP
to cs.arizona.edu. After connecting, cd /icon/v8.  Get READ.ME
there for more information.

If you do not have FTP access or prefer to obtain diskettes and printed
documentation, Version 8 of Icon for MS-DOS can be ordered from:

	Icon Project
	Department of Computer Science
	Gould-Simpson Building
	The University of Arizona
	Tucson, AZ   85721

	602 621-2018 (voice)
	602 621-4246 (FAX)

Specify whether you want executable binaries, source code, or both and the
size of diskettes you prefer (5.25" or 3.5").

The packages are $20 each, payable in US dollars to The University of Arizona
with a check written on a bank in the United States.  Orders also can be
charged to MasterCard or Visa.  The price includes shipping by parcel post
in the United States, Canada, and Mexico. Add $5 per package for air mail
delivery to other countries.

Please direct any questions to me, not to icon-project or icon-group.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph


From M17572@mwvm.mitre.ORG  Thu Apr 12 05:53:54 1990
Resent-From: M17572@mwvm.mitre.ORG
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA18999; Thu, 12 Apr 90 05:53:54 MST
Return-Path: M17572@mwvm.mitre.ORG
Received: from mwunix.mitre.org by Arizona.EDU; Thu, 12 Apr 90 05:22 MST
Received: from mwvm.mitre.org by mwunix.mitre.org (5.61/SMI-2.2) id AA00101;
 Thu, 12 Apr 90 08:19:28 -0400
Received: from MWVM by mwvm.mitre.org (IBM VM SMTP R1.2.1) with BSMTP id 7601;
 Thu, 12 Apr 90 08:20:06 EDT
Resent-Date: Thu, 12 Apr 90 05:36 MST
Date: Thursday, 12 Apr 1990 08:20:05 EST
From: m17572@mwvm.mitre.ORG
Subject: pl/i to c translator in icon
Resent-To: icon-group@cs.arizona.edu
To: icon-group%arizona.edu@mwunix.mitre.ORG
Resent-Message-Id: <AEC0BF5CB25F60009A@Arizona.EDU>
Message-Id: <9004121219.AA00101@mwunix.mitre.org>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group%arizona.edu@mwunix.mitre.ORG
Status: O

I am looking for a translator from pl/i to c written in icon.  It doesn't
have to be perfect.  alternatively, a translator from any similar
language to c would be helpful.  thanks in advance.
*
*  John Artz jartz@mitre.org

From kwalker  Thu Apr 12 08:14:52 1990
Date: Thu, 12 Apr 90 08:14:52 MST
From: "Kenneth Walker" <kwalker>
Message-Id: <9004121514.AA25491@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA25491; Thu, 12 Apr 90 08:14:52 MST
In-Reply-To: <9004111306.AA07853@sophist.uchicago.edu>
To: icon-group
Subject: Re:  type coercion
Status: O

> Date: 11 Apr 90 10:22:29 GMT
> From: zaphod.mps.ohio-state.edu!usc!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
> 
>Static type checking is not realistically possible for a language such as Icon.
> Consider the expression:
> 	if x=0 then "small" else 1
> The type of this expression may be impossible to determine until run time.
>   ...
> Perhaps what you need is some kind of "lint" program like that
> which the C programming language has.  For programs which can be statically
> typed it could warn of any type clashes and coercions that will occur, for
> other programs it could just indicate that static type checking was not
> feasible.

It is possible to perform type inference on Icon programs. Type
inference assigns a type to each expression, but some types may be
of a form like "string or integer", as in the example. In addition
to assigning types which an expression might actually produce,
it is sometimes necessary for a type inferencing scheme to make
conservative estimates, so the assigned type may include types which
the expression would never take on in any possible execution. A
compiler can use information from type inferencing to eliminate
much of the run-time type checking from a program, improving
execution speed.

I implemented a prototype type inferencing scheme some time ago
and was able to infer unique types for about 90 percent of all
operands where type checking is normally needed. I am preparing
to implement a type inferencing scheme in an experimental
optimizing compiler for Icon. It will be interesting to see what
kind of speedups result from using the type information in the
code generator.

> Date: Wed, 11 Apr 90 08:06:19 CDT
> From: Richard Goerwitz <goer@sophist.uchicago.EDU>
> 
> What I wonder is whether a thing like optional
> static typing might be applied to variables, and not expressions. 

This seems like a nice idea. Adding optional type information to
declarations is quite easy. You could do something like

   local x:integer, y: string | record r

x could then be assigned only integers and y could be assigned only
strings or records of type r. Variables with no type information
would simply be "any type", allowing the current style of typeless
variables to still be used. This scheme would require some run-time
type checking at assignments (and type coercions?), but that is not
a serious problem, though it might take some thought as to how to
implement it efficiently. This scheme effectively moves type
checking from the uses of a variable to the assignments, but 
run-time checking at assignments is only needed when the type
of the value being assigned cannot be statically determined.
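
A minimal sketch of what such a run-time check at assignment might
amount to in present-day Icon. The helper checked() is hypothetical
(it is not part of Icon or its library), and the sketch assumes the
variable-length argument lists of Version 8:

```icon
# Hypothetical helper: return value if its type is one of the listed
# type names, otherwise stop with an error message.
procedure checked(value, types[])
   if type(value) == !types then return value
   stop("type error: ", image(value), " is not of a declared type")
end

procedure main()
   local x, y
   x := checked(42, "integer")          # emulates  local x: integer
   y := checked("abc", "string", "r")   # emulates  local y: string | record r
   write(x, " ", y)
end
```

Here the check is paid on every assignment; the proposal would let a
compiler drop it wherever the assigned type is already known.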

I would also like to be able to have a declaration like

   global x: list of integer

This means that any list assigned to x must be restricted to contain
only integers. It would be necessary to have some way of creating
type-restricted structures. Ideally the method would be a simple
extension of the current methods for creating structures.

You probably also want a way of creating named types.

   type foo: list of (integer | foo)
   global x: foo

Type equivalence would be structural. In the following example, x and
y have the same type.

   type str_set: set of string
   local x: str_set
   local y: set of string
   
   x := y

Does anyone know how this compares to other languages with flexible
type systems? Are there pitfalls I haven't anticipated?

  Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 2858  kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker

From ccc!ccc.com!clemc@uunet.UU.NET  Thu Apr 12 08:49:10 1990
Resent-From: ccc!ccc.com!clemc@uunet.UU.NET
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA27469; Thu, 12 Apr 90 08:49:10 MST
Received: from uunet.UU.NET by Arizona.EDU; Thu, 12 Apr 90 08:48 MST
Received: from ccc.UUCP by uunet.uu.net (5.61/1.14) with UUCP id AA06191; Thu,
 12 Apr 90 11:46:45 -0400
Received: from localhost by CCC.COM.ccc.CCC.COM id aa01444; 12 Apr 90 11:00 EDT
Resent-Date: Thu, 12 Apr 90 08:50 MST
Date: Thu, 12 Apr 90 10:59:58 EDT
From: clemc@ccc.COM
Subject: RE: pl/i to c translator in icon
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group@uunet.uu.NET
Resent-Message-Id: <AEA597B3C1FF6002B9@Arizona.EDU>
Message-Id: <9004121546.AA06191@uunet.uu.net>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group@uunet.uu.NET
Status: O

>
>	I am looking for a translator from pl/i to c written in icon.
>	It doesn't have to be perfect.  alternatively, a translator
>	from any similar language to c would be helpful.
>	thanks in advance.
>	*
>	*  John Artz jartz@mitre.org
You might want to look into f2c the FORTRAN 77 to C converter that is
available from NETLIB at AT&T or the Pascal to C converters available
in the UUNET net news archives.

Good Luck,
Clem Cole
------
Clement T. Cole
Cole Computer Consulting		uunet!ccc!clemc uucp
255 North Road #119			clemc@ccc.com Internet
Chelmsford, MA 01824-1402		(508) 256-6967	voice

From wgg@cs.washington.edu  Thu Apr 12 11:24:21 1990
Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA08442; Thu, 12 Apr 90 11:24:21 MST
Received: by june.cs.washington.edu (5.61/7.0jh)
	id AA15180; Thu, 12 Apr 90 11:24:22 -0700
Date: Thu, 12 Apr 90 11:24:22 -0700
From: wgg@cs.washington.edu (William Griswold)
Return-Path: <wgg@cs.washington.edu>
Message-Id: <9004121824.AA15180@june.cs.washington.edu>
To: icon-group@cs.arizona.edu, kwalker@cs.arizona.edu
Subject: Re:  type coercion
Status: O

>
>Does anyone know how this compares to other languages with flexible
>type systems? Are there pitfalls I haven't anticipated?
>

The problem that I can anticipate right off is that structural equivalence
when only some variables are typed could be expensive, unless you do some
precomputation and type inference.  For example if you have the 
declaration: 

    local L : list of list of table of (integer | real)

What will you have to do to check an assignment to L coming from an
untyped variable?  What about assignments to any of its subcomponents?
Are pointers to substructures of L a problem? 

Also, suppose you wanted to allow for typed recursive structures (there
is plenty of evidence that you want truly self-referencing structures
in Icon):

    type tree-node : list of (integer | tree-node)

Is this hard to type check?  Seems no harder than the above, but again, 
how does one check modifications to substructures with pointers running
around?  Also, what if it isn't a tree, i.e., you have a loop?  Structures
would probably have to be marked during the traversal of the check. 
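
A rough sketch of such a marking traversal, written as an ordinary
Icon procedure rather than as part of any type system. The name
is_tree_node and the use of a set to hold the marks are assumptions
made for illustration:

```icon
# Hypothetical checker for  "list of (integer | tree_node)".
# seen holds the lists already visited, so a loop terminates.
procedure is_tree_node(x, seen)
   local elem
   /seen := set([])
   type(x) == "list" | fail
   if member(seen, x) then return     # already marked: part of a loop
   insert(seen, x)
   every elem := !x do
      type(elem) == "integer" | is_tree_node(elem, seen) | fail
   return
end

procedure main()
   local t
   t := [1, [2, 3]]
   put(t, t)                          # make the structure refer to itself
   if is_tree_node(t) then write("accepted")
   if not is_tree_node([1, "two"]) then write("rejected")
end
```

The traversal visits every reachable element, which is the cost
referred to above; it also says nothing about later modifications
made through other pointers to a substructure.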

Name equivalence would make the checking easier, but it would be far too
restrictive and meaningless in Icon.

I'm not knocking the idea of this flexible typing.  I would *love* to have
it.  But I see some (interesting!) problems when only some of the objects
are typed.  I think the feature could be very useful during development,
because it allows the gradual introduction of types where needed for
testing, and they can be ``turned off'' after development if they cost too
much to check all the time. 


					Bill Griswold

From tenaglia@fps.mcw.edu  Fri Apr 13 05:36:01 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA26263; Fri, 13 Apr 90 05:36:01 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA00396; Thu, 12 Apr 90 21:13:36 EDT
Received: by uwm.edu; id AA04499; Thu, 12 Apr 90 15:09:29 -0500
Message-Id: <9004122009.AA04499@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Thu, 12 Apr 90 14:25:33 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Thu, 12 Apr 90 14:06:10 CDT
Date: Thu, 12 Apr 90 14:06:10 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: RE: pl/i to c translator in icon
X-Vms-Mail-To: UUCP%"m17572@mwvm.mitre.ORG"
Status: O

In response to :

> From:	UUCP%"m17572@mwvm.mitre.ORG" 12-APR-1990 13:24:50.57
> To:	mis!tenaglia
> Subj:	pl/i to c translator in icon
>
> I am looking for a translator from pl/i to c written in icon.  It doesn't
> have to be perfect.  alternatively, a translator from any similar
> language to c would be helpful.  thanks in advance.
> *
> *  John Artz jartz@mitre.org

I'm not sure what the pl/i language is. But if it is something like the
PL/M language offered on Intel RMX systems there may be something. Back
a few years ago at Astronautics Corp. in Milwaukee I worked on a project
to translate PL/M to C. I no longer work there, but they may still have
the system spooled on a tape. It was designed to run under Icon 6 in a
VAX/VMS environment, and was a peculiar combination of DCL script and a
long chain (21 - 27 modules) that would run sequentially and convert
good/average/moderately bad PL/M code into plain vanilla K&R C. Perhaps
if pl/i is similar, the system could be adapted. The contact would be a
Wesley Eckles, System Manager, Engineering Computer Services,
Astronautics Corp., 4115 N. Teutonia Ave, Milwaukee 53209, (414)447-8200 X450.
I can't say if they'd sell it or give it away. Just an idea.

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Sat Apr 14 13:01:47 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA27153; Sat, 14 Apr 90 13:01:47 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA26489; Sat, 14 Apr 90 16:01:38 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Sat, 14 Apr 90 16:01:32 EDT
Date: Sat, 14 Apr 90 15:43:44 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <218276@Wayne-MTS>
Subject: Overloading of operators
Status: O

 
One of the major advantages of strongly typed languages is that you can
overload the operators cleanly.  This advantage may be even more important
than what you gain in protection (perhaps not that much) or efficiency.
In Icon you not only don't need to declare the types of parameters; you can't.
Declaring the types of parameters is essential to defining overloaded operators
unless you want to do the overload resolution within the operator's definition.
 
I haven't found the dynamic type conversion in Icon to be of much use.  Some
of what you get with it is what you'd get in other languages through generics,
such as the ability to write a sort procedure that works for any type of list.
(Forget for the moment that sorting is built into Icon.)  The one example I
know of where dynamic type conversion really helps is in writing print
procedures, since the form of what you print depends on the type of the
argument.
 
To me, an important principle of language design is `To thine own self be
true'.  Overloaded operators are contrary to the spirit of Icon.  If you
really want them, you should either do run-time dispatching or be using a
different language.  I think it would be a mistake to add them to Icon. 
 
Paul Abrahams 
Abrahams%wayne-mts@um.cc.umich.edu

From sbw@naucse.cse.nau.edu  Sat Apr 14 16:00:50 1990
Received: from naucse.cse.nau.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA04295; Sat, 14 Apr 90 16:00:50 MST
Received: by naucse.cse.nau.edu (5.61/1.34)
	id AA26480; Sat, 14 Apr 90 16:00:32 -0700
Message-Id: <9004142300.AA26480@naucse.cse.nau.edu>
Date: Sat, 14 Apr 90 16:00:30 MST
X-Mailer: Mail User's Shell (6.5 4/17/89)
From: sbw@naucse.cse.nau.edu (Steve Wampler)
To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu, icon-group@cs.arizona.edu
Subject: Re: Overloading of operators
Status: O

Hmmm,  *If* one could always get one's hands on the 'original version'
of an operator, then implementing overloaded operators would be fairly
simple in Icon - though, as you say, you would have to handle the
overloading yourself.  (I would view that as consistent with Icon, since
you have to handle your own typechecking yourself - consider writing
your own version of Icon.)

When I first heard of 'string invocation' - I thought that this would
be the first step in 'overloading' operators (and builtin functions).
I assumed that "+"(1,3) and "write"("hello") would access the
'original' addition and 'write' functions.  I was wrong, of course,
but if I had been correct in my assumption, we would have overloading
now (with adding syntactic support):

	operator +(x,y)
           if type(x) == type(y) == "complex" then
	      return complex_add(x,y)
           return "+"(x,y)
        end

(ignoring mixed-mode for the moment.)

and:

	procedure write(a)	# ignoring varargs for the moment...
	   if type(a) == "table" then
              return write_table(a)
           else return "write"(a)
	end

Alas, it is not so, so it will not be.  As I said, this form, to
me, would have been in the spirit of Icon.
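
Without redefining the operator itself, the same run-time dispatch
can be sketched with an ordinary procedure. The names add and
complex below are assumptions for illustration only:

```icon
record complex(re, im)

# Dispatch on the argument types; fall back to the built-in addition.
procedure add(x, y)
   if type(x) == type(y) == "complex" then
      return complex(x.re + y.re, x.im + y.im)
   return x + y
end

procedure main()
   local z
   z := add(complex(1, 2), complex(3, 4))
   write(z.re, " ", z.im)
   write(add(1, 3))
end
```

As in the examples above, mixed-mode arguments are ignored.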

-- 
	Steve Wampler
	{....!arizona!naucse!sbw}
	{sbw@naucse.cse.nau.edu}

From sbw@naucse.cse.nau.edu  Sun Apr 15 06:50:14 1990
Received: from naucse.cse.nau.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13890; Sun, 15 Apr 90 06:50:14 MST
Received: by naucse.cse.nau.edu (5.61/1.34)
	id AA03424; Sun, 15 Apr 90 06:49:46 -0700
Message-Id: <9004151349.AA03424@naucse.cse.nau.edu>
Date: Sun, 15 Apr 90 06:49:44 MST
X-Mailer: Mail User's Shell (6.5 4/17/89)
From: sbw@naucse.cse.nau.edu (Steve Wampler)
To: icon-group@cs.arizona.edu
Subject: Sigh...
Status: O

The sentence in my last posting that read in part:  "consider writing
your own version of Icon" was SUPPOSED to say: "consider writing your
own version of 'image()'".  So much for Saturday postings.

-- 
	Steve Wampler
	{....!arizona!naucse!sbw}
	{sbw@naucse.cse.nau.edu}

From icon-group-request@arizona.edu  Tue Apr 17 18:34:24 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13815; Tue, 17 Apr 90 18:34:24 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 17 Apr 90 18:35 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA02739; Tue, 17 Apr 90
 18:24:50 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 17 Apr 90 18:35 MST
Date: 18 Apr 90 00:58:46 GMT
From: castor!ccs007@ucdavis.ucdavis.EDU
Subject: I/O help
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <AA6607BF813FA0263D@Arizona.EDU>
Message-Id: <7097@aggie.ucdavis.edu>
Organization: University of California, Davis
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O



I'm writing a mini-database and would like to know how to load 
and save a table of tables to disk.  I'm using icon on a VAX 11/785 under
Ultrix-32 V3.1 (Rev. 9). Any help would be greatly appreciated.
Everything I've tried doesn't seem to work.

Jonathan Sims
ccs007@castor.ucdavis.edu

From ralph  Tue Apr 17 18:55:01 1990
Date: Tue, 17 Apr 90 18:55:01 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004180155.AA15065@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA15065; Tue, 17 Apr 90 18:55:01 MST
To: castor!ccs007@ucdavis.ucdavis.EDU
Subject: Re:  I/O help
Cc: icon-group
In-Reply-To: <7097@aggie.ucdavis.edu>
Status: O

There are two procedures in the Icon program library for encoding arbitrary Icon
data as strings that can be written to files and then restored.

The Icon program library is available in several ways, including via
FTP.  What you want depends on the version of Icon you're running.
Version 8 is current.

If you need specific advice, let me know.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From sunquest!whm  Tue Apr 17 23:31:37 1990
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA28475; Tue, 17 Apr 90 23:31:37 MST
Received: from grissom by sunquest; Tue, 17 Apr 90 23:30:25 MST
Date: Tue, 17 Apr 90 23:30:22 MST
From: "Bill Mitchell" <sunquest!whm>
Message-Id: <9004180630.AA23253@grissom>
Received: by grissom; Tue, 17 Apr 90 23:30:22 MST
To: arizona!icon-group
Subject: Another kudo for Ralph
Status: O

I just heard this today and thought I'd pass it along for the group.  I quote:

Upon recommendation of [University of Arizona] President Henry Koffler, the
Arizona Board of Regents has named Ralph E. Griswold to the rank of Regents
Professor.  This title, the highest faculty rank at the University, is reserved
for scholars whose exceptional achievements have brought them national and
international distinction, and who have made unique contributions to the
quality of the University through distinguished accomplishments in teaching,
scholarship, and creative work.

From M13852@mwvm.mitre.ORG  Wed Apr 18 07:57:00 1990
Resent-From: M13852@mwvm.mitre.ORG
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA22496; Wed, 18 Apr 90 07:57:00 MST
Return-Path: M13852@mwvm.mitre.ORG
Received: from mwunix.mitre.org by Arizona.EDU; Wed, 18 Apr 90 07:57 MST
Received: from mwvm.mitre.org by mwunix.mitre.org (5.61/SMI-2.2) id AA18834;
 Wed, 18 Apr 90 10:55:28 -0400
Received: from MWVM by mwvm.mitre.org (IBM VM SMTP R1.2.1) with BSMTP id 9527;
 Wed, 18 Apr 90 10:56:06 EDT
Resent-Date: Wed, 18 Apr 90 07:58 MST
Date: Wednesday, 18 Apr 1990 10:56:04 EST
From: m13852@mwvm.mitre.ORG
Subject: ICON GROUP
Resent-To: icon-group@cs.arizona.edu
To: icon-group%arizona.edu@mwunix.mitre.ORG
Resent-Message-Id: <A9F5E9AE033FA02A2B@Arizona.EDU>
Message-Id: <9004181455.AA18834@mwunix.mitre.org>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group%arizona.edu@mwunix.mitre.ORG
Status: O

A couple of us here at MITRE are doing some programming in ICON.  We are
wondering how we can become a part of the ICON Group so that we will be able to
pick up the network broadcast messages.  Thanks in advance.

NJBELL@mwvm.mitre.org
*
*        Noel

From tenaglia@fps.mcw.edu  Thu Apr 19 01:19:41 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02853; Thu, 19 Apr 90 01:19:41 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP 
	id AA08638; Thu, 19 Apr 90 02:25:07 EDT
Received: by uwm.edu; id AA02347; Thu, 19 Apr 90 00:37:11 -0500
Message-Id: <9004190537.AA02347@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Wed, 18 Apr 90 18:10:03 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Wed, 18 Apr 90 16:37:54 CDT
Date: Wed, 18 Apr 90 16:37:54 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: RE: I/O help
X-Vms-Mail-To: UUCP%"castor!ccs007@ucdavis.ucdavis.EDU"
Status: O

In regards to :

> I'm writing a mini-database and would like to know how to load
> and save a table of tables to disk.  I'm using icon on a VAX 11/785 under
> Ultrix-32 V3.1 (Rev. 9). Any help would be greatly appreciated.
> Everything I've tried doesn't seem to work.
>
> Jonathan Sims
> ccs007@castor.ucdavis.edu

I also have written such a database. It's composed of lists of lists. It's
less efficient, but I can have duplicate keys. I store the data in a file
in the format of one record per line, and each field in the record is
delimited with char(255). The database was designed to be flexible so it
wouldn't have to be redone for each individual application. Each database
consists of two parts. The data file and the configuration file. The
configuration file points to the data file and describes it and certain
default characteristics. When run, this configuration is loaded, which
tells the database how to load the file, build the screens, etc. Many
of the settings can be changed on the fly, even the data file. So it is
conceivable to have several data files attached to a given application
model. I've implemented both under VMS and Unix. I can see some advantages
to storing tables of tables, but it sounds rather complicated. I suppose
my database might be thought of as more of a simple tuple editor.
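
The same one-record-per-line idea could be applied to a table of
tables. What follows is only a sketch under several assumptions: the
procedure names save_db and load_db are made up, keys and values are
strings that contain neither char(255) nor newlines, and everything
comes back as a string on loading:

```icon
# Write one line per (outer key, inner key, value) triple.
procedure save_db(db, fname)
   local f, sep, k1, k2
   sep := char(255)
   f := open(fname, "w") | fail
   every k1 := key(db) do
      every k2 := key(db[k1]) do
         write(f, k1, sep, k2, sep, db[k1][k2])
   close(f)
   return
end

# Rebuild the table of tables; malformed lines are skipped.
procedure load_db(fname)
   local f, sep, db, line, k1, k2, v
   sep := char(255)
   f := open(fname) | fail
   db := table()
   while line := read(f) do
      line ? {
         if (k1 := tab(upto(sep))) & move(1) &
            (k2 := tab(upto(sep))) & move(1) &
            (v := tab(0))
         then {
            /db[k1] := table()
            db[k1][k2] := v
            }
         }
   close(f)
   return db
end

procedure main()
   local db, db2
   db := table()
   db["smith"] := table()
   db["smith"]["phone"] := "555-1234"
   save_db(db, "db.dat") | stop("cannot write db.dat")
   db2 := load_db("db.dat") | stop("cannot read db.dat")
   write(db2["smith"]["phone"])
end
```

An inner table with no entries is not written at all, so it will not
reappear after a load.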

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From goer@sophist.uchicago.EDU  Thu Apr 19 18:39:54 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13875; Thu, 19 Apr 90 18:39:54 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 19 Apr 90 18:39 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 19 Apr 90
 20:38:52 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA21462; Thu, 19 Apr 90
 20:34:52 CDT
Resent-Date: Thu, 19 Apr 90 18:40 MST
Date: Thu, 19 Apr 90 20:34:52 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: question about dereferenced functions
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A8D304B109DFA03AAD@Arizona.EDU>
Message-Id: <9004200134.AA21462@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Something I was doing the other day got me into
a bit of trouble, and it seems obvious (at least
on one level) why.  The circumstances surrounding
my mistake, however, led me to wonder about the
underlying representation of functions - something
I'd appreciate guidance on from someone more fam-
iliar with the implementation.

If I execute a program containing the line -

	write(name(main))

I get "main" written to the standard output.  However,
if I write,

	func_lst := [main]
	write(name(func_lst[]))

I get "L[1]."  This was what I expected, given the docu-
mentation.  What I did not expect, but in retrospect
seems logical, was that when I execute

	write(name(func_lst[]))

I get the error message, "Run-time error 111, variable
expected."  Clearly, the global identifier main was de-
referenced when it was incorporated into func_lst.  So
func_lst[] is no longer a variable or identifier.  I'm
just curious what functions dereference as in Icon, as
opposed to, say, C (where "main" usually means &main).

Since I have functions and procedures on my mind right
now, I might as well ask another question that has been
interesting me.  If I sort a list of functions and pro-
cedures

	[many, any, myprocedure]

what I get is

	[any, many, myprocedure]

apparently in alphabetical order.  Is this behavior guar-
anteed?  Or is it just a consequence of the implementation,
and something I should not rely on?


    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu  Fri Apr 20 05:58:02 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA17831; Fri, 20 Apr 90 05:58:02 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 20 Apr 90 05:59 MST
Received: by ucbvax.Berkeley.EDU (5.62/1.41) id AA27838; Fri, 20 Apr 90
 05:45:26 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 20 Apr 90 05:59 MST
Date: 20 Apr 90 12:42:36 GMT
From: cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!rpi!jefu@tut.cis.ohio-state.EDU
Subject: Solving a simple problem
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A8742E1195BFA03DD9@Arizona.EDU>
Message-Id: <|{W#QJ#@rpi.edu>
Organization: The Museum of Differential Geometry
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


The other day I needed to parse a line of comma separated fields into a list -
thus "one,two,three,four,five" -> ["one" "two" "three" "four" "five" ] . 
I worked out a way to handle this, but it was inelegant.  I tried at first
to use some sort of string generating function to parse things out, but
couldn't get it to work right.  

That is, I wanted to be able to say something like :
 
res := parse(read(file))

where parse would look something like :

procedure parse (line)
    local res
    every line ? (x := something()) do put(res,x) 
end

Where something would return the next field in line each time it is resumed.

I'm sure there is a nice way to do this that I'm just missing.

So, my question is - what is the most _elegant_ way to solve this problem - 
preferably using generators of some sort?

-- 

jeff putnam (jefu@pawl.rpi.edu) 

From goer@sophist.uchicago.EDU  Fri Apr 20 08:08:04 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA27282; Fri, 20 Apr 90 08:08:04 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Fri, 20 Apr 90 08:07 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 20 Apr 90
 10:06:05 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA22198; Fri, 20 Apr 90
 10:02:05 CDT
Resent-Date: Fri, 20 Apr 90 08:08 MST
Date: Fri, 20 Apr 90 10:02:05 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: tokenizing
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A86222F0F11FA03ED9@Arizona.EDU>
Message-Id: <9004201502.AA22198@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Re: "one,two,three,four,five" -> ["one" "two" "three" "four" "five" ] . 

The method used:
 
     procedure parse (line)
         local res
         every line ? (x := something()) do put(res,x) 
     end

I don't see anything wrong with this method.  There's one thing to watch
out for, though.  If you say "every s ? move(1)", move will get called
once, and that's it.  What's more, it will reset &pos to the position it
had before it was called.  You almost always want to say "while tab(some
matching procedure) do { something else }."  Also, if you are going to
"put" something into something else, make sure that the something is not
&null (as above).  Remember to initialize the list by saying "res := []"
or "res := list(0)." I'm not sure precisely what you will be using your
tokenized lists for, but anyway here's an alternative way of doing
things (NB: *untested*):

     procedure tokenize(s)
        token_list := list()
        s ? {
          while put(token_list, 1(tab(find(",")),move(1)))
          put(token_list,tab(0))
          }
        return token_list
     end

Another option, using your format above, is:

     procedure parse (line)
         local res
         res := list()
         line ? {
            while x := tab(many(~','))
            do { put(res,x); move(1) }
            }
         return res
     end

If you wish to allow greater flexibility in your input strings,
add characters to ',' above.  Personally, I'd tend to think it
better to permit "hello,word", "hello, world", and an accidental
"hello, world,".  It might also be nice to be able to have spaces
in the tokens themselves (e.g. "hello, how, is, George Washington",
where George Washington is really a single lexical item).

     procedure parse (line)
         local res, x
         res := list()
         line ? {
            while x := tab(many(~',')) do {
              # accept a comma plus optional blanks, or end of string
              if (=",", (tab(many(' \t')) | &null)) | pos(0)
              then put(res,x) else stop("Cannot parse ",line,".")
              }
            }
         return res
     end

Again, this is untested.  What it should allow you to do is input
"hello, how, are you doing" and get back ["hello", "how", "are you doing"].

Icon is indeed a very, very good language for doing things like
tokenizing strings.  That's one of the things that got me to thinking
a while back whether Prolog had been implemented in Icon.  It would
be kinda like having one's cake and eating it, too.
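
For the generator form that the original question asked about, a
minimal sketch follows. The procedure name fields is an assumption,
and unlike the versions above it keeps empty fields:

```icon
# Generate the comma-separated fields of line, one per resumption.
procedure fields(line)
   local field
   line ? {
      while field := tab(upto(',') | 0) do {
         suspend field
         move(1) | break       # step over the comma, or stop at the end
         }
      }
end

procedure main()
   local res
   res := []
   every put(res, fields("one,two,three,four,five"))
   every write(!res)
end
```

The field is assigned to a local before it is suspended, so that
resuming the procedure does not resume tab() and undo the change to
&pos.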

From icon-group-request@arizona.edu  Fri Apr 20 19:10:03 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA22426; Fri, 20 Apr 90 19:10:03 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 20 Apr 90 19:03 MST
Received: by ucbvax.Berkeley.EDU (5.62/1.41) id AA17517; Fri, 20 Apr 90
 18:46:23 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 20 Apr 90 19:10 MST
Date: 21 Apr 90 00:38:33 GMT
From: limbo!taylor@apple.COM
Subject: What's wrong with this Mac Icon program?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A805B831D7BFA04631@Arizona.EDU>
Message-Id: <702@limbo.Intuitive.Com>
Organization: Intuitive Systems, Mountain View, CA: +1 (415) 966-1151
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I'm playing around with the Icon programming language, and am having
some very strange problems trying to get my sample program working.
The Icon I'm working with is ProIcon for the Macintosh, and the
sample program is:

==========================================================

#  Compute simple readability score of given ASCII text file
#  
#  Based on a  program presented in "Icon Programming for Humanists"
#  by Alan Corre', Prentice Hall 1990

global wordlength

procedure main()
   compute_readability(getfile("Choose file for readability score"))
end

procedure compute_readability(filename)
  # main work loop of the program
  local total_words, sentence, sentences
  
  total_words := 0
  wordlength := 0
  sentences := 0
  
  every sentence := get_sentence(filename) do  {
	total_words +:= count_sentence(sentence)
    sentences +:= 1
  }
  
  display(total_words, wordlength, sentences)
  return
end

procedure get_sentence(filename)
  # gets a sentence from the file, even if less than a line or more
  # than a single line of input
  static markers
  local fileid, sentence, line, substring
  
  initial markers := '.!?'
  
  fileid := open(filename) | { write("couldn't open file"); return fail }
  sentence := ""

  while line := read(fileid) do {
    line ? 
	{  
	  while substring := tab(upto(markers)) do {
	  
        # if marker found add it to sentence incl. marker itself
	    sentence ||:= (substring || tab(many(markers)))
	    suspend sentence
		
	    # skip blanks at beginning of next sentence
	    tab(many(' '))
	    sentence := ""
      }
	  
     # if the line is not finished, append rest to sentence
     if not pos(0) then
       sentence ||:= (line[&pos:0] || " ") 
    }
  }
  close(fileid)
end

procedure getword_from_sentence(sentence)
  # generates the words of the given sentence, one at a time
  static chars, punct
  local word
  
  initial { 
     chars := (&lcase ++ &ucase ++ '1234567890\'-')
	 punct := ' .,?";:!'
  }
  
  sentence ? 
  { 
    tab(many(' '))     # skip leading white space
    while word := tab(many(chars)) do {
      tab(many(punct))
	  suspend word
    }
  }
end

procedure count_sentence(sentence)
  # number of words in a sentence
  local total, word

  total := 0
  every word := getword_from_sentence(sentence) do {
    wordlength +:= *word
    total +:= 1
  }

  return total
end

procedure display(words, wordlength, sentences)
  local average_sentence_length, average_word_length
  
  average_sentence_length := real(words) / real(sentences)
  average_word_length := real(wordlength) / real(words)

  write("Total number of words in file: ", words)
  write("Total number of sentences in file: ", sentences)
  write("Total combined word length: ", wordlength)
  write("Average sentence length = ", average_sentence_length)
  write("Average word length = ", average_word_length)
  write()
  write("Readability score = ",  
            average_sentence_length * average_word_length)
end

==========================================================

The problem I'm having with the program is that it doesn't
return the correct results!  That is, if I test it on files
that are sufficiently small that I can count the words in
the file, it gives me expected results.  But if I try with
an 8100 word file (word count from Microsoft Word) then the
Icon program only thinks that there are about 3100 words
therein!

What's worse is that I wrote a quick C program to compute
the same values and it returns much more reasonable 
values: 8500 words for the file (the difference I assume
is based on how Word defines individual word separators).

If someone can help me track down what's wrong with the
above program, I would be ever-so-grateful!  Thanks!

						-- Dave Taylor

Intuitive Systems				Macintosh Editor
Mountain View, California		"Computer Language" Magazine

taylor@limbo.intuitive.com    or   {uunet!}{decwrl,apple}!limbo!taylor

From ralph  Sat Apr 21 06:34:19 1990
Date: Sat, 21 Apr 90 06:34:19 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004211334.AA20571@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA20571; Sat, 21 Apr 90 06:34:19 MST
To: limbo!taylor@apple.COM
Subject: Re:  What's wrong with this Mac Icon program?
Cc: icon-group
Status: O

The most obvious problem is in the loop that generates the words:
+++++++++++++++++++++++++++++++
  sentence ? 
  { 
    tab(many(' '))     # skip leading white space
    while word := tab(many(chars)) do {
      tab(many(punct))
	  suspend word
    }
+++++++++++++++++++++++++++++++
The expression tab(many(chars)) only succeeds if there is a character in
chars at the current position. While that's probably true the first time
around, it most likely won't be the second time around, so most of
the words in the sentence will not be generated.  The better method
is
+++++++++++++++++++++++++++++++
   sentence ? {
      while tab(upto(chars)) do {
         word := tab(many(chars))
         suspend word
         }
      }
+++++++++++++++++++++++++++++++
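
As a complete, runnable sketch of the upto/many idiom: the procedure name
words(), the cset, and the sample sentence below are assumptions for
illustration, not part of the original program.

```icon
procedure words(sentence)
   local chars, word
   chars := &lcase ++ &ucase ++ &digits ++ '\'-'
   sentence ? {
      # skip whatever precedes the next word, then take the word itself
      while tab(upto(chars)) do {
         word := tab(many(chars))
         suspend word
         }
      }
end

procedure main()
   # leading "(" and the ".)" between sentences are simply skipped
   every write(words("(Hi, this is an aside.)  But this is real text."))
end
```

Note that the word is assigned before it is suspended.  Writing
"suspend tab(many(chars))" directly would let tab() be resumed and
restore &pos, so the loop would find the same word again.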
       

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From goer@sophist.uchicago.EDU  Sat Apr 21 07:08:53 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA21796; Sat, 21 Apr 90 07:08:53 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Sat, 21 Apr 90 07:05 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Sat, 21 Apr 90
 09:04:21 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA23541; Sat, 21 Apr 90
 09:00:19 CDT
Resent-Date: Sat, 21 Apr 90 07:10 MST
Date: Sat, 21 Apr 90 09:00:19 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: wordcounts
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A7A1222BEC1FA0445E@Arizona.EDU>
Message-Id: <9004211400.AA23541@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

A recent poster comments how Alan Corre's readability program
does not seem to come up with correct values.  The following is
a series of suggestions as to why this might be occurring.  If
I mention things you already know, I beg your indulgence.  It's
better to be safe (say too much) than sorry (not say enough).

First, please note that the readability score program gives only
a very rough count.  Also be sure (you probably were, but I'll
mention it anyway) that the program is fed ASCII files only.  If
you have anything fancy going on with the eighth bit of your char-
acters, the program will not work correctly.  What it will do un-
der such circumstances (I glean this from a quick perusal of the
source) is get the number of sentences about right, but undercount
the number of words.  Finally, note that the program is sentence-
based, so if your input file has a lot of headers or other text-
blocks not broken up into sentences, you'll get a wrong sentence
count (the word count should not be drastically affected).

In general, if you want to know what is going on inside an Icon
program, "compile" it with the -t option, or else stick a line
"&trace := -1" in your code near the beginning of where you want
to start tracing.  You probably knew this already, but I figured
it wouldn't hurt to mention it just in case.
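
A minimal sketch of the second approach; the procedure double() is a
made-up example, and the trace messages go to standard error output:

```icon
procedure double(n)
   return 2 * n
end

procedure main()
   &trace := -1          # trace all procedure calls and returns from here on
   write(double(21))
end
```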

One apparent bug in the program, incidentally, is that in the pro-
cedure which actually slices out individual words, "punct" is not
simply defined as the inverse of &lcase ++ &ucase ++ &digits.  As
a result, it cannot parse

    (Hi, this is an aside.)  But this is real text (or at least test).

I'd just point out again that the program is meant as a simple illus-
tration, probably meant to work on a fairly restricted range of texts.
Certainly AC could have broadened it to encompass a lot more texts,
but then it would have lost its pedagogical value.

I hope very much that this helps.  Please follow up if it doesn't.


    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer


BTW:  In the posting to which I am responding, not all of the program
was copied.  In particular, getfile() seemed to be missing.  I added
it in a manner different from how the author apparently did it.  I don't
have Corre's original here, so if this seems a hopeless perversion of the
pristine getfile(), think of it as a bit of programmer's licence.  The
program will compile now, and people learning Icon can use it to fol-
low what is going on:

-------------------------------------------------------------------------

#  Compute simple readability score of given ASCII text file
#  
#  Based on a  program presented in "Icon Programming for Humanists"
#  by Alan Corre', Prentice Hall 1990

global wordlength

procedure main(a)
    every in_list := getfile(a) do {
       compute_readability(in_list[1], in_list[2])
    }
end


procedure getfile(a)
    usage := "usage:  readable file1 [file2 [file3 [etc...]]]"
    if *a = 0 then stop(usage)
    while filename := get(a) do {
        if intext := open(filename,"r")
        then suspend [filename, intext]
	else write(&errout,"readable:  Cannot open ",filename,".")
    }
end
        

procedure compute_readability(filename,file)

  # main work loop of the program
  local total_words, sentence, sentences
  
  total_words := 0
  wordlength := 0
  sentences := 0
  
  every sentence := get_sentence(file) do  {
      total_words +:= count_sentence(sentence)
      sentences +:= 1
  }
  
  display(filename,total_words, wordlength, sentences)
  return

end


procedure get_sentence(fileid)
  # gets a sentence from the file, even if less than a line or more
  # than a single line of input
  static markers
  local sentence, line, substring
  
  initial markers := '.!?'
  
  sentence := ""

  while line := read(fileid) do {
    line ? 
	{  
	  while substring := tab(upto(markers)) do {
	  
        # if marker found add it to sentence incl. marker itself
	    sentence ||:= (substring || tab(many(markers)))
	    suspend sentence
		
	    # skip blanks at beginning of next sentence
	    tab(many(' '))
	    sentence := ""
      }
	  
     # if the line is not finished, append rest to sentence
     if not pos(0) then
       sentence ||:= (line[&pos:0] || " ") 
    }
  }
  close(fileid)
end

procedure getword_from_sentence(sentence)
  # generates the words of the given sentence, one at a time
  static chars, punct
  local word
  
  initial { 
     chars := (&lcase ++ &ucase ++ '1234567890\'-')
	 punct := ' .,?";:!'  # here's the apparent bug;
                              # try punct := ~chars at first
  }
  
  sentence ? 
  { 
    tab(many(' '))     # skip leading white space
    while word := tab(many(chars)) do {
      tab(many(punct)) 
	  suspend word
    }
  }
end

procedure count_sentence(sentence)
  # number of words in a sentence
  local total, word

  total := 0
  every word := getword_from_sentence(sentence) do {
    wordlength +:= *word
    total +:= 1
  }

  return total
end

procedure display(filename,words, wordlength, sentences)
  local average_sentence_length, average_word_length
  
  average_sentence_length := real(words) / real(sentences)
  average_word_length := real(wordlength) / real(words)

  write("\nFilename:  ",filename)
  write("Total number of words in file: ", words)
  write("Total number of sentences in file: ", sentences)
  write("Total combined word length: ", wordlength)
  write("Average sentence length = ", average_sentence_length)
  write("Average word length = ", average_word_length)
  write()
  write("Readability score = ",  
            average_sentence_length * average_word_length)
end

# taylor@limbo.intuitive.com    or   {uunet!}{decwrl,apple}!limbo!taylor

From nowlin@iwtqg.att.COM  Sat Apr 21 15:51:50 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA04389; Sat, 21 Apr 90 15:51:50 MST
Received: from att-in.att.com by Arizona.EDU; Sat, 21 Apr 90 15:49 MST
Resent-Date: Sat, 21 Apr 90 15:53 MST
Date: Sat, 21 Apr 90 16:23 CDT
From: nowlin@iwtqg.att.COM
Subject: RE: What's wrong with ...
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <A7580FA8BC1FA04236@Arizona.EDU>
Message-Id: <A7588B2BFF5FA036B4@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Status: O

I'm working my way through the Humanist book.  I was intrigued by the
readability index program that Dave Taylor posted since it failed miserably
on the "read.me" file included on the disk that comes with the Humanist
book.  Even after Ralph's recommended fix.  This is no real fault of the
program.  This "read.me" file contains file names that contain embedded
periods and other characters that are normally used for punctuation.

I modified the program to work on this "read.me" file by changing the
algorithm for finding an end of sentence and by adding some characters to
the cset allowed for words.  My modified program still has some problems.
The count of words and sentences is correct for the "read.me" file but the
length of words is now wrong due to the end-of-sentence character being
counted in the length of the last word in a sentence.  I don't know.  Maybe
that's OK.

It's definitely non-trivial to parse text!  The text analysis assumptions
that work fine for plain vanilla text are mostly invalid for technical
documents.  I'd hate to have to parse text in any language besides Icon.

Jerry Nowlin (...!att!iwtqg!nowlin)

From goer@sophist.uchicago.EDU  Sun Apr 22 22:24:48 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA26298; Sun, 22 Apr 90 22:24:48 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Sun, 22 Apr 90 22:18 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 23 Apr 90
 00:17:15 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA25575; Mon, 23 Apr 90
 00:13:12 CDT
Resent-Date: Sun, 22 Apr 90 22:24 MST
Date: Mon, 23 Apr 90 00:13:12 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: regular expressions for icon
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A6582F67FD5FA04BBF@Arizona.EDU>
Message-Id: <9004230513.AA25575@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


I recall several of us discussing some time ago the fact that
Icon does not have first-class data-object patterns like Sno-
bol, and that as a second best we might like to see an egrep()-
like function added.

After having thought about this for a while, I've decided that
adding something like this would be a bit like gilding the
lily.  Besides, even in C regex is a library function, and a nasty
one at that - one which really doesn't belong in Icon's core
language definition.

Most of the times I want to use an egrep-like function are when
I want to construct recognizers at run-time from user input.
For most purposes, coexpressions suffice.  Where these don't
offer elegant solutions (or speed), I tend to use
system("egrep...") or open("egrep...","pr").  This isn't portable,
and so I wrote an Icon-ish egrep command that behaves a lot like
find().

The following is really an alpha test version of that command.
I'd appreciate input from others, especially bug reports.  The
code is not well-commented.  It's a bit redundant as well.  I
made no efforts to compress or optimize it.  I just wanted to
make it work.

One thing that is particularly nice about Icon is that it let
me handle things like position in the current string by using
string scanning.  Hash tables also served as the basic state-
recording structure.  I avoided using coexpressions by saving
operations in lists (e.g. [function, arg, resulting state])
so that, at run time, I could just say if lst[1](lst[2]) then
go to state lst[3].  By doing this, I added just a little
speed and portability (not everyone has coexpressions), and
perhaps a tad greater clarity as well.
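
To illustrate the idea in isolation, here is a toy sketch.  The
procedure run(), the two-entry table, and the test string are
assumptions for illustration only; the real state table built by the
code below is more involved.

```icon
procedure run(ops, s)
   local state, op
   state := 1
   s ? {
      # each operation is a list: [function, arg, resulting state]
      while op := (\ops[state])[1] do {
         if tab(op[1](op[2])) then state := op[3]
         else break
         }
      }
   return state
end

procedure main()
   local ops
   ops := table()
   ops[1] := [[match, "ab", 2]]    # in state 1, match "ab", go to state 2
   ops[2] := [[any, 'cd', 3]]      # in state 2, take one of c or d, go to 3
   write("final state: ", run(ops, "abc"))
end
```

Because match and any are ordinary values in Icon, they can be stored
in a list and invoked later, with no coexpression needed.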

Please, please send me comments.

########################################################################
#    
#	Name:	find_re.icn
#	
#	Title:	"Find" Regular Expression
#	
#	Author:	Richard L. Goerwitz
#
#	Date:	April 22, 1990
#
########################################################################
#    
#    DESCRIPTION:  Find_re is similar to the Icon builtin function
#    find(), except that it takes as its first argument a regular
#    expression of the sort used by the Unix egrep command.  For those
#    unfamiliar with the notion of regular expressions, they represent
#    a simple string representation of a finite state transition
#    network which can be converted into an automaton capable of
#    recognizing patterns in strings of characters.  The specific
#    symbols used, and the purposes they are used for, can be gleaned
#    from the Unix man pages for egrep and the regex library
#    functions.  In even more basic terms, regular expressions can be
#    thought of as a very flexible and powerful set of "wildcards."
#
#    DIFFERENCES between egrep and find_re:  Find_re utilizes the same
#    basic language as egrep.  The only major differences are: 1) That
#    find_re is a bit more rigid in its syntax (e.g. find_re will
#    reject vagaries like 'a*?'), and 2) that find_re utilizes the
#    intrinsic Icon data structures and escaping conventions, rather
#    than those of any particular Unix variant.
#
#    BUGS:  No attempt has been made to optimize find_re.  For work
#    that requires a quick response, you'll have to use something like
#    system("egrep...")!  Note, though, that while find_re takes a
#    while to compile a regular expression, find_re at least has enough
#    sense to store the resulting automaton for quick access in subse-
#    quent calls.
#    
########################################################################

global state_table

procedure find_re(re, s, i, j)

    static FSTN_table
    initial FSTN_table := table()

    /re & stop("find_re:  Call me with at least one argument!")

    /i := \&pos | 1
    /s := \&subject | stop("find_re:  No string.")
    /j := *s+1
    if /FSTN_table[re] then {
	tokenized_re := tokenize(re)
	MakeFSTN(tokenized_re) | er(re,2)
	/FSTN_table[re] := copy(state_table)
    }
    s ? {
	tab(x := i to j) &
	apply_FSTN(&null,FSTN_table[re]) &
	(suspend x)
    }

end



procedure apply_FSTN(ini,tbl)

    static s_tbl
    local POS, tmp

    POS := &pos
    /ini := 1 & s_tbl := tbl
    fin := 2
    if ini == 0 then {
	return 0
    }

    if tmp := !s_tbl[ini] &
       tab(tmp[1](tmp[2]))
    then {
	if tmp[3] = fin
	then return 0
	else {
	    return apply_FSTN(tmp[3]) |
		   (&pos := POS, fail)
	}
    }
    else &pos := POS

end
    


procedure tokenize(s)

    token_list := list()
    s ? {

	while chr := move(1) do {
	    if chr == "\\"
	    # it can't be a metacharacter; remove the \ and "put"
	    # the integer value of the next chr into token_list
	    then put(token_list,ord(move(1))) | er(s,2,chr)
	    else if any('*+()|?.$^',chr)
	    then put(token_list,-ord(chr))
	    else {
		case chr of {
		    "["    : {
			every next_one := find("]")
			\next_one ~= &pos | er(s,2,chr)
			put(token_list,-ord(chr))
		    }
                    "]"    : {
			if &pos = (\next_one+1)
			then put(token_list,-ord(chr)) &
			     next_one := &null
			else put(token_list,ord(chr))
		    }
		    default: put(token_list,ord(chr))
		}
	    }
	}
    }

    token_list := UnMetaBrackets(token_list)

    fixed_length_token_list := list(*token_list)
    every i := 1 to *token_list
    do fixed_length_token_list[i] := token_list[i]
    return fixed_length_token_list

end



procedure UnMetaBrackets(l)

    # Since brackets delineate a cset, it doesn't make
    # any sense to have metacharacters inside of them.
    # UnMetaBrackets makes sure there are no metacharacters
    # inside of the brackets.

    tmplst := list(); i := 0
    Lb := -ord("[")
    Rb := -ord("]")

    while (i +:= 1) <= *l do {
	if l[i] = Lb then {
	    put(tmplst,l[i])
	    until l[i +:= 1] = Rb
	    do put(tmplst,abs(l[i]))
	    put(tmplst,l[i])
	}
	else put(tmplst,l[i])
    }
    return tmplst

end



procedure MakeFSTN(l,INI,FIN)

    # MakeFSTN recursively descends through the tree structure
    # implied by the tokenized string, l, recording in (global)
    # state_table a list of operations to be performed, and the
    # initial and final states which apply to them.

    static Lp, Rp, Sl, Lb, Rb, Caret_inside, Dot, Dollar, Caret_outside
    initial {
	Lp := -ord("("); Rp := -ord(")")
	Sl := -ord("|")
	Lb := -ord("["); Rb := -ord("]"); Caret_inside := ord("^")
	Dot := -ord("."); Dollar := -ord("$"); Caret_outside := -ord("^")
    }

    /INI := NextState("new") & state_table := table()
    /FIN := NextState()

    # I haven't bothered to test for empty lists everywhere.
    if *l = 0 then {
	/state_table[INI] := []
	put(state_table[INI],[zSucceed,,FIN])
	return
    }

    # HUNT DOWN THE SLASH (ALTERNATION OPERATOR)
    ini := INI; inter := NextState()
    inter2:= NextState()
    every i := 1 to *l do {
	if l[i] = Sl & tab_bal(l,Lp,Rp) = i then {
	    if i = 1 then er(l,2,char(abs(l[i]))) else {
		MakeFSTN(l[1:i],inter2,FIN)
		MakeFSTN(l[i+1:0],inter,FIN)
		/state_table[ini] := []
		put(state_table[ini],[apply_FSTN,inter2,0])
		put(state_table[ini],[apply_FSTN,inter,0])
		return
	    }
	}
    }

    # HUNT DOWN PARENTHESES
    ini := INI; fin := FIN
    if l[1] = Lp then {
	i := tab_bal(l,Lp,Rp) | er(l,2,"(")
	inter := NextState()
	if any('*+?',char(abs(0 > l[i+1]))) then {
	    case l[i+1] of {
		-ord("*")   : {
		    /state_table[ini] := []
		    put(state_table[ini],[apply_FSTN,inter,0])
		    MakeFSTN(l[2:i],ini,ini)
		    MakeFSTN(l[i+2:0],inter,fin)
		    return
		}
		-ord("+")   : {
		    inter2 := NextState()
		    /state_table[inter2] := []
		    MakeFSTN(l[2:i],ini,inter2)
		    put(state_table[inter2],[apply_FSTN,inter,0])
		    MakeFSTN(l[2:i],inter2,inter2)
		    MakeFSTN(l[i+2:0],inter,fin)
		    return
		}
		-ord("?")   : {
		    /state_table[ini] := []
		    put(state_table[ini],[apply_FSTN,inter,0])
		    MakeFSTN(l[2:i],ini,inter)
		    MakeFSTN(l[i+2:0],inter,fin)
		    return
		}
	    }
	}
	else {
	    MakeFSTN(l[2:i],ini,inter)
	    MakeFSTN(l[i+1:0],inter,fin)
	    return
	}
    }
    else {     # I.E. l[1] NOT = Lp (left parenthesis as -ord("("))
	every i := 1 to *l do {
	    case l[i] of {
		Lp     : {
		    inter := NextState()
		    MakeFSTN(l[1:i],ini,inter)
		    MakeFSTN(l[i:0],inter,fin)
		    return
		}
		Rp     : er(l,2,")")
	    }
	}
    }

    # NOW, HUNT DOWN BRACKETS
    ini := INI; fin := FIN
    if l[1] = Lb then {
	i := tab_bal(l,Lb,Rb) | er(l,2,"[")
	inter := NextState()
	tmp := ""; every tmp ||:= char(l[2 to i-1])
	if Caret_inside = l[2]
	then tmp := ~cset(Expand(tmp[2:0]))
	else tmp :=  cset(Expand(tmp))
	if any('*+?',char(abs(0 > l[i+1]))) then {
	    case l[i+1] of {
		-ord("*")   : {
		    /state_table[ini] := []
		    put(state_table[ini],[apply_FSTN,inter,0])
		    put(state_table[ini],[any,tmp,ini])
		    MakeFSTN(l[i+2:0],inter,fin)
		    return
		}
		-ord("+")   : {
		    inter2 := NextState()
		    /state_table[ini] := []
		    put(state_table[ini],[any,tmp,inter2])
		    /state_table[inter2] := []
		    put(state_table[inter2],[apply_FSTN,inter,0])
		    put(state_table[inter2],[any,tmp,inter2])
		    MakeFSTN(l[i+2:0],inter,fin)
		    return
		}
		-ord("?")   : {
		    /state_table[ini] := []
		    put(state_table[ini],[apply_FSTN,inter,0])
		    put(state_table[ini],[any,tmp,inter])
		    MakeFSTN(l[i+2:0],inter,fin)
		    return
		}
	    }
	}
	else {
	    /state_table[ini] := []
	    put(state_table[ini],[any,tmp,inter])
	    MakeFSTN(l[i+1:0],inter,fin)
	    return
	}
    }
    else {           # I.E. l[1] not = Lb
	every i := 1 to *l do {
	    case l[i] of {
		Lb     : {
		    inter := NextState()
		    MakeFSTN(l[1:i],ini,inter)
		    MakeFSTN(l[i:0],inter,fin)
		    return
		}
		Rb     : er(l,2,"]")
	    }
	}
    }

    # FIND INITIAL SEQUENCES OF POSITIVE INTEGERS, CONCATENATE THEM
    if i := match_positive_ints(l) then {
	inter := NextState()
	tmp := Ints2String(l[1:i])
	/state_table[INI] := []
	put(state_table[INI],[match,tmp,inter])
	MakeFSTN(l[i:0],inter,FIN)
	return
    }

    # OKAY, CLEAN UP ALL THE JUNK THAT'S LEFT
    i := 0
    while (i +:= 1) <= *l do {
	case l[i] of {
	    Dot          : { Op := any;   Arg := &cset }
	    Dollar       : { Op := pos;   Arg := 0     }
	    Caret_outside: { Op := pos;   Arg := 1     }
	    default      : { Op := match; Arg := char(0 < l[i]) }
	} | er(l,2,char(abs(l[i])))
	ini := INI; fin := FIN
	inter := NextState()
	if any('*+?',char(abs(0 > l[i+1]))) then {
	    case l[i+1] of {
		-ord("*")   : {
		    /state_table[ini] := []
		    put(state_table[ini],[apply_FSTN,inter,0])
		    put(state_table[ini],[Op,Arg,ini])
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("+")   : {
		    inter2 := NextState()
		    /state_table[ini] := []
		    put(state_table[ini],[Op,Arg,inter2])
		    /state_table[inter2] := []
		    put(state_table[inter2],[apply_FSTN,inter,0])
		    put(state_table[inter2],[Op,Arg,inter2])
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("?")   : {
		    /state_table[ini] := []
		    put(state_table[ini],[apply_FSTN,inter,0])
		    put(state_table[ini],[Op,Arg,inter])
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
	    }
	}
	else {
	    /state_table[ini] := []
	    put(state_table[ini],[Op,Arg,inter])
	    MakeFSTN(l[i+1:0],inter,FIN)
	    return
	}
    }

    # WE SHOULD NOW BE DONE INSERTING EVERYTHING INTO state_table
    # IF WE GET TO HERE, WE'VE PARSED INCORRECTLY!
    er(l,4)

end



procedure NextState(new)
    static nextstate
    if \new then nextstate := 0
    return nextstate +:= 1
end



procedure er(x,i,elem)
    writes(&errout,"Error number ",i," parsing ",image(x)," at ")
    if \elem 
    then write(&errout,image(elem),".")
    else write(&errout,"(?).")
    exit(i)
end



procedure zSucceed()
    return .&pos
end



procedure Expand(s)

    s2 := ""
    s ? {
	s2 ||:= ="^"
	s2 ||:= ="-"
	while s2 ||:= tab(find("-")-1) do {
	    if (c1 := move(1), ="-",
		c2 := move(1),
		c1 << c2)
	    then every s2 ||:= char(ord(c1) to ord(c2))
	    else s2 ||:= 1(move(2), not(pos(0))) | er(s,2,"-")
	}
	s2 ||:= tab(0)
    }
    return s2

end



procedure tab_bal(l,i1,i2)
    i := 0
    i1_count := 0; i2_count := 0
    while (i +:= 1) <= *l do {
	case l[i] of {
	    i1  : i1_count +:= 1
	    i2  : i2_count +:= 1
	}
	if i1_count = i2_count
	then suspend i
    }
end


procedure match_positive_ints(l)
    
    # Matches the longest sequence of positive integers in l,
    # beginning at l[1], which neither contains, nor is fol-
    # lowed by a negative integer.  Returns the first position
    # after the match.  Hence, given [55, 55, 55, -42, 55],
    # match_positive_ints will return 3.  [55, -42] will cause
    # it to fail rather than return 1 (NOTE WELL!).

    every i := 1 to *l do {
	if l[i] < 0
	then return (3 < i) - 1
    }

end


procedure Ints2String(l)
    tmp := ""
    every tmp ||:= char(!l)
    return tmp
end

From nowlin@iwtqg.att.COM  Mon Apr 23 07:05:37 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA12657; Mon, 23 Apr 90 07:05:37 MST
Received: from att-in.att.com by Arizona.EDU; Mon, 23 Apr 90 07:04 MST
Resent-Date: Mon, 23 Apr 90 07:05 MST
Date: Mon, 23 Apr 90 08:45 CDT
From: nowlin@iwtqg.att.COM
Subject: RE: regular expressions for icon
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <A60F7C62E87FA04C35@Arizona.EDU>
Message-Id: <A60F96D22E5FA03DB4@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Status: O

It just so happens I have a set of tests for regular expression matching
since I had an Icon version of grep to test a few years ago.  I wrote a
main program to test the version of find_re() posted to this group and it
did OK.  Of the 100 tests I have, it only matched two patterns it shouldn't
have and only missed 17 patterns that it should have matched.  I've
included the main program that uses the find_re() procedure to simulate a
grep command here.  I get really bugged by partial postings that people
have to hack a front end onto before they can try them.

procedure main(a)

	# the usage message
	usage := "Usage: RGgrep pattern [ file ... ]"

	# the first program argument must be the pattern
	pattern := get(a) | stop("I at least need a pattern\n",usage)

	# trick the program into using standard input if no files were passed
	if *a = 0 then a := [&null]

	# the rest of the arguments are files to search through
	every f := !a do {

		# if the file isn't null try to open it
		if \f then in := open(f) | stop("I can't open '",f,"'")

		# otherwise use standard input
		else in := &input

		# if there is only one file skip printing the file name
		if *a = 1 then f := ""

		# otherwise tack on a colon
		else f ||:= ":"

		# read all the lines
		every l := !in do {

			# scan the line for the pattern
			l ? {

##### BELOW IS THE CALL TO the find_re() procedure posted earlier #####

				# if the pattern is found print the line
				if find_re(pattern) then write(f,l)
			}
		}

		# close the input file if one was opened
		if in ~=== &input then close(in)
	}
end

I'll post the tests in a separate message since there's an Icon program to
run the tests (naturally) along with the file of tests and they total more
than 100 lines.  The comments in the test program should be enough to use
the program and the included tests.

Jerry Nowlin (...!att!iwtqg!nowlin)

From nowlin@iwtqg.att.COM  Mon Apr 23 07:18:33 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA14809; Mon, 23 Apr 90 07:18:33 MST
Received: from att-in.att.com by Arizona.EDU; Mon, 23 Apr 90 07:15 MST
Resent-Date: Mon, 23 Apr 90 07:19 MST
Date: Mon, 23 Apr 90 08:58 CDT
From: nowlin@iwtqg.att.COM
Subject: regular expression tests
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <A60D72A6265FA04848@Arizona.EDU>
Message-Id: <A60E14D3C9FFA04943@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Status: O

Below is a program that will test any grep-like command and a file of the
tests that it uses.  I've run these tests with the standard UNIX grep and
it only fails on the tests that use the '+' character to match at least one
and any subsequent number of characters.  The grep on my system doesn't
like that particular expression.  The Icon grep I have does like it and
that's why these tests are still in this suite.  If anyone finds bugs in
these tests let me know.  Your software is only as good as your tests.

Jerry Nowlin (..!att!iwtqg!nowlin)

-------------------------- test program follows --------------------------

# This program is to test commands that work like the standard UNIX grep.
# That is, they can read from standard input and their first argument is a
# regular expression that is searched for in the input.  The first argument
# to the program is the name of a command to test.  The second argument to
# the program is a file containing tests to be executed.  For example, the
# following is a valid invocation of this program:
#
#	tst grep test.pats
#
# Each line in the file of tests must contain a string followed by a
# regular expression followed by a comment.  I use the comment field to
# indicate whether or not the test is supposed to succeed or fail but
# anything you want can be included in the comment field.  The token
# separating the three fields is a pipe symbol.  This was completely
# arbitrary.  You can change it to anything you want.  The following is a
# valid test line:
#
#	abbbbc|ab*c|YES
#
# This test should succeed.  If it doesn't there's a problem with the
# program being tested.
#
# I use this program to test a version of grep written in Icon.  Since I
# can test the standard versions of grep with this program too, I can
# compare the Icon version to the standard.

procedure main(args)

	# this program requires two arguments
	if *args ~= 2 then stop("Usage: tst cmd file")

	# the command to be tested is the first argument
	cmd := get(args)
	write("Testing the ",cmd," command")

	# the file of tests is the second argument
	file := get(args)
	in := open(file) | stop("I can't open '",file,"'")
	
	# each line should contain a test
	every !in ? {
		str := tab(upto('|'))
		move(1)
		pat := tab(upto('|'))
		move(1)
		com := tab(0)
	
		write(com,": searching '",str,"' for pattern '",pat,"'")
	
		# quote the string
		if not find("\"",str) then str := "\"" || str || "\""
		else if not find("'",str) then str := "'" || str || "'"
		else stop("Bad string: ",str)
	
		# quote the pattern
		if not find("\"",pat) then pat := "\"" || pat || "\""
		else if not find("'",pat) then pat := "'" || pat || "'"
		else stop("Bad pattern: ",pat)
	
		# invoke the command
		system("echo " || str || " | " || cmd || " " || pat)
	}
	
	# close the test file
	close(in)
end

-------------------------- test file follows --------------------------

ac|ab|NO
ac|abc|NO
ac|ab*c|YES
ac|ab+c|NO
ac|a.*c|YES
ac|a.+c|NO
abc|ab|YES
abc|abc|YES
abc|ab*c|YES
abc|ab+c|YES
abc|a.*c|YES
abc|a.+c|YES
abbbbc|ab|YES
abbbbc|abc|NO
abbbbc|ab*c|YES
abbbbc|ab+c|YES
abbbbc|a.*c|YES
abbbbc|a.+c|YES
akc|ab|NO
akc|abc|NO
akc|ab*c|NO
akc|ab+c|NO
akc|a.*c|YES
akc|a.+c|YES
akjhgfc|ab|NO
akjhgfc|abc|NO
akjhgfc|ab*c|NO
akjhgfc|ab+c|NO
akjhgfc|a.*c|YES
akjhgfc|a.+c|YES
this is it|^this|YES
this is it|^his|NO
this is it|his|YES
this is it|his$|NO
this is it|it$|YES
match carat|.*^|NO
match (^) carat|.*^|YES
(^) match carat|.^|YES
(^) match carat|(^|YES
match dollar|$.*|NO
match ($) dollar|$.*|YES
($) match dollar|$.|YES
($) match dollar|$)|YES
no stars|^**|YES
no stars|^*+|NO
*#$%&@!_+=:|^**|YES
*#$%&@!_+=:|^*+|YES
no stars|**|YES
no stars|*+|NO
*#$%&@!_+=:|**|YES
*#$%&@!_+=:|*+|YES
no pluses|^+*|YES
no pluses|^++|NO
+#$%&@!_*=:|^+*|YES
+#$%&@!_*=:|^++|YES
no pluses|+*|YES
no pluses|++|NO
+#$%&@!_*=:|+*|YES
+#$%&@!_*=:|++|YES
ABCabcdefDEF|^[a-z]|NO
ABCabcdefDEF|[a-z]$|NO
ABCabcdefDEF|^[A-Z]|YES
ABCabcdefDEF|[A-Z]$|YES
abcABCDEFdef|^[a-z]|YES
abcABCDEFdef|[a-z]$|YES
abcABCDEFdef|^[A-Z]|NO
abcABCDEFdef|[A-Z]$|NO
ABCabcdefDEF|^[acbfed]|NO
ABCabcdefDEF|[acbfed]$|NO
ABCabcdefDEF|^[FA]|YES
ABCabcdefDEF|[FA]$|YES
abcABCDEFdef|^[acbfed]|YES
abcABCDEFdef|[acbfed]$|YES
abcABCDEFdef|^[FEADCB]|NO
abcABCDEFdef|[FEADCB]$|NO
ABCabcdefDEF|^[FE0DCB]|NO
ABCabcdefDEF|[9EADCB]$|NO
abcABCDEFdef|^[9cbfed]|NO
abcABCDEFdef|[acb0ed]$|NO
ABCabcdefDEF|[a-cd-f]D|YES
ABCabcdefDEF|C[fa]|YES
abcABCDEFdef|c[^a-z]|YES
abcABCDEFdef|[^0-9]A|YES
this is a more complicated test| is .*test$|YES
this is a more complicated test| is .*test|YES
this is a more complicated test| is *test$|NO
this is a more complicated test.| is .*test$|NO
this is a more complicated test|is.*test|YES
this istest may be weird|is.*test|YES
this may be a more complicated test| is .*test$|NO
this may be a more complicated test|is .*test$|YES
this may be a more complicated test| is .*test|NO
this may be a more complicated test| is .*test$|NO
this may be a more complicated test.|is .*test$|NO
this may be a more complicated test|is.*test|YES
test ranges 5198402 ablkseimnfaKJLDLD|[-D]|YES
test ranges 5198402 ablkseimnfaKJLDLD|[-Z]|NO
test ranges 5198402 ablkseimnfaKJLDLD|[A-Z]|YES
test ranges 5198402 ablkseimnfaKJLDLD|[A-]|NO
test ranges 5198402 ablkseimnfaKJLDLD|[a-]|YES

From goer@sophist.uchicago.EDU  Mon Apr 23 13:50:42 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA28441; Mon, 23 Apr 90 13:50:42 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 23 Apr 90 10:44 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 23 Apr 90
 11:49:40 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA28106; Mon, 23 Apr 90
 11:45:10 CDT
Resent-Date: Mon, 23 Apr 90 13:48 MST
Date: Mon, 23 Apr 90 11:45:10 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: find_re and egrep
Resent-To: icon-group@cs.arizona.edu
To: nowlin@iwtqg.att.COM
Cc: icon-group@arizona.edu
Resent-Message-Id: <A5D72FD7177FA05340@Arizona.EDU>
Message-Id: <9004231645.AA28106@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: nowlin@iwtqg.att.COM
X-Vms-Cc: icon-group@Arizona.edu
Status: O


JN expressed some annoyance that I did not provide find_re() in a form
that would make it immediately testable.  I am sorry that I annoyed anyone.
My intent was to help!  My reason for posting this way was that find_re is
NOT just an egrep without "procedure main."  If people wanna put wrappers
around it, and then run it through egrep-like tests they can do this.  The
comments prepended to it, however, state very clearly that it will not pass
all tests geared for the Unix egrep system command.  In particular (as Jerry
Nowlin's tests confirm), find_re will reject constructs like '.*?'.  If
I ever chance to write '.*?', it always means one of two things:  1) I am
being very sloppy, or 2) I am not thinking about what I am doing, and am
in fact making an error.  Note also that find_re utilizes Icon's escaping
conventions, and does not attempt to accommodate itself to any particular
Unix variant.  Again, I am sorry if the way I posted find_re annoyed any-
one else.  My aim was not to make testing difficult.  In fact, I have al-
ready put it through a large battery of egrep tests.  Differences that ex-
ist between egrep and find_re are there because I want them there (or else
because egrep is not consistent from operating system [version] to operating
system [version]).  What I had hoped is that people might test it within
Icon as a "find" variant with added functionality (but a lot slower).  One
thing that might be done with it, in fact, is to place a test right at the
outset that checks for input strings without metacharacters and then calls
find() if none are found.  Another thing to do might be to add a fifth argu-
ment, which if it is nonnull, frees up all the space allocated for stored
automata.  I have no idea whether this would be worth it (probably not).
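
The first idea above (skip the automaton when the pattern has no
metacharacters) might be sketched as follows.  This is only an
illustration: the wrapper name fast_find and the exact metacharacter
cset are assumptions, not part of the posted find_re, and the find_re
below is a stand-in stub so that the sketch links by itself.

```icon
# Sketch only: fast_find and the cset "meta" are assumptions,
# not part of the posted find_re.
procedure fast_find(re, s, i, j)
    static meta
    initial meta := '\\*+?|()[].$^'
    # No metacharacters: the pattern is a plain string, so use find().
    if not upto(meta, re) then suspend find(re, s, i, j)
    # Otherwise hand the pattern to the regular-expression matcher.
    else suspend find_re(re, s, i, j)
end

# Stand-in so the sketch links on its own; replace with the real find_re.
procedure find_re(re, s, i, j)
    stop("find_re stub called with ", image(re))
end

procedure main()
    # "is" contains no metacharacters, so find() does the work
    every write(fast_find("is", "this is it"))
end
```

Because null arguments are passed straight through, the defaults for
s, i and j behave as they do for find() itself.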

Naturally, I'll work on speeding it up.

If people do in fact want to test find_re as an egrep program, I don't
object.  I just want to be sure everyone realizes that in the marginal kinds
of cases that standard tests tend to work on, find_re will show certain sys-
tematic differences from egrep.  In actual usage, these differences will
only show up once in a blue moon, and should always consist in an error mes-
sage flagging a pattern like '.*+' or '$)' (the last of which most egrep
commands will flag as an error as well).  Like all egrep commands I have
access to, find_re will not construe $ and ^ as literals in contexts like
'.*^' or '$?', even though it might make better sense to do so.  Because
these sequences lie in one of those gray areas, maybe I should consider
flagging them as errors?

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu  Tue Apr 24 11:07:41 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA06257; Tue, 24 Apr 90 11:07:41 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 24 Apr 90 11:08 MST
Received: by ucbvax.Berkeley.EDU (5.62/1.41) id AA23588; Tue, 24 Apr 90
 11:03:16 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 24 Apr 90 11:09 MST
Date: 24 Apr 90 18:02:36 GMT
From: sdd.hp.com!zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU
Subject: Word Counts
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A5243E38369FA05A45@Arizona.EDU>
Message-Id: <3594@uwm.edu>
Organization: University of Wisconsin-Milwaukee
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I think the relevant points concerning the program in my book have
been made by others with the skill that one is accustomed to seeing on
this newsgroup, but I should like to add some meditations on this
theme.
Leslie Lamport writes: "TeX ... has trouble deciding which periods end
sentences." The statement is not strictly accurate. The author of TeX
apparently decided that while it was worth while to teach TeX to
hyphenate virtually every word in the English language (with the
possible exception of "gnomonly"), it was not worth while to allow for
the etc.'s and viz.'s and I + II.'s. The task of making determinations
about the period is left in human hands. Actually it would be quite
possible to write an Icon preprocessor to take care of that gap, but
one likewise has to decide if it is worth it. The fact is that our
notation both of natural and mathematical language is saddled with
centuries of accrued accidents. Consider:
Capitalization: Semitic languages manage just fine without it. So did
computers for a while. FORTRAN is not written like that because it is
an acronym, but because the limited character set forced it. what a
chance was missed to dump our two-tier system, the main function of
which is to make third graders miserable! but QWERTY reared its head,
and we had to squeeze computers into our foolishness. if
Capitalization had any Meaning we might be capitalizing our Nouns like
german and norwegian.
Functions: The fact that "plus" is written in between its arguments,
"square" is written after and superscript, and "square root" is written
before obscures the essential similarity of these items which
establish relationships. Yet we really can't get used to Polish type
notation.
Intonation: A tremendously significant force which has a wretched
representation in punctuation and underlining. The possibilities of
recording saying a simple "yes" fearfully, enthusiastically,
grudgingly have not even been explored. If we must have complicated
representation systems, why leave this out?
Spelling: enuff said.
We need not think that these anomalies do not take their toll. When
Joshua was pursuing some kings and wanted to know their names, a
simple shepherd boy was able to write them down for him. The new
alphabetic system was devoid of frills, could be written on a piece of
broken pot, and learned in short order. Our clumsy system leaves us
with vast numbers of intelligent illiterates trying to master a
fundamentally rong sistem. G.B. Shaw left his entire estate to
bankroll a reform of English spelling, but it didn't help. The toll in
human deprivation is quite substantial. Of course, we'd have to find
something for the teachers to do if the representation of English
could be learned in five easy lessons... I suspect that the ancient
Akkadian scribes who had to master an incredibly complicated writing
system liked it that way, because it made them indispensable. Their
monopoly was smashed by the inventors of the alphabet, but we have
managed to reestablish the old, turf-bound order to a certain degree.
Computers bring us head to head with all the inconsistencies that we
cope with daily. Instead of fitting their simple logic, we massage or
bludgeon them into accepting our outworn habits.
Now on the pedagogical issue. My definition of "word" is then
simplistic, if indeed we could ever agree on what a word is. I define
it as a string of alphabetic characters, or something marked off by
markers such as space and period. If you consider the following
sentence which is a bit Elizabethan but nonetheless valid
	'Tis the boys'.
(= modern English, "It belongs to the boys") you will see the
difficulty of teaching the machine that this is not a pair of single
quotes but two apostrophes. I do hint on page forty that the
apostrophe is troublesome, but deem it better to let the reader find
his or her own problems that belong in the realm of the way we
represent things in general rather than in Icon or the hardware. Let
me give another example. On p. 36 there is a little program which
simply puts on the screen the entire ascii character set. It had
worked fine when I originally tried it with version 5 of Icon, but
when I tried it on version 7 it failed. Control-Z would not allow
itself to be written to the screen. Since I could not solve this
problem, I applied to Ralph who ascertained that the problem is a
feature (or bug, depending how you look at it) of one of the C
compilers, and this had been differently implemented in v5. I opted to
include in the program an instruction to omit Control-Z if the program
failed, but did not explain the details to the reader. I figured that
a student at that point would really not want to be bothered by the
vagaries of C compilers and would probably prefer to remain in
blissful ignorance (as, in general, I do myself on such matters.) This
is the kind of paternalistic decision which teachers (like parents)
are continually called upon to make. Reference manuals should be
exhaustive, and exhaustiveness should have priority over clarity if a
choice has to be made. But books meant to teach have to be clear, and
this means leaving a great deal out. When I started to learn Hebrew, I
used a grammar (by the Scotsman Davidson) which started with a vast
amount of theoretical knowledge of Hebrew's complex vocalization
system. It was a chicken and egg situation; you needed the theory to
understand the rest of the book, but you couldn't understand the
theory until you had read the rest of the book. That may be the real
reason that I have small classes, so maybe I shouldn't complain.
Seriously though, it is difficult to determine how much detail a
student can handle---and one can't write a book to fit the needs of
every individual. As the judge said in the Ulysses case, you just have
to consider the reader whose degree of sensuality is average. Read
sense for sensuality in this case.
What is all of this doing in an Icon discussion? It is relevant I
believe. The development of algorithmic thinking is a valuable asset
which has a distinct humanistic value. Not to say that there is no
room for sentiment, opinion, taste. But one has to know the
difference, and the great thing about programming is that ideas can be
tested by a reliable arbiter. As a child I was told that learning
Latin would help me "think logically", and for seven long years I had
Caesar, Virgil and Lucretius shoved down my throat. I disagreed with
my teachers, although I rarely expressed it because it could result in
a sore bottom. The logic of Latin grammar seemed to me a myth to which
I was forced to give lip service. (I was delighted when I found later
that the Classical Arabic verb "to be" takes an accusative---which my
Latin teachers declared to be a sin against logic.) Computer languages
really are logical, and I believe they really do make a difference in
the way one approaches day to day problems. So programming has a
humanistic value in its own right. And from this point of view, I
believe that Icon is the best language to study. It gets across the
fundamental point of algorithmic thinking without burdening one with
endless struggles with data types, significant though they may be. It
gives fair treatment to text and to math. It doesn't pretend to be
"English-like" on the one hand, or use impenetrable abbreviations on
the other. And if I never write a compiler in it, I won't be heart
broken.
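
The simplistic definition of "word" discussed above (a string of
alphabetic characters) can be sketched in Icon roughly as below.  The
procedure name words is made up for illustration and is not the
program from the book; note how the apostrophes in the Elizabethan
example are simply treated as separators.

```icon
# Sketch only: a "word" is a maximal run of letters, and words()
# is a hypothetical name, not the program from the book.
procedure words(line)
    line ? {
        # skip to the next letter, then take the whole run of letters
        while tab(upto(&letters)) do
            suspend tab(many(&letters))
    }
end

procedure main()
    # The apostrophes and the period act as separators here.
    every write(words("'Tis the boys'."))
end
```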
--
Alan D. Corre
Department of Hebrew Studies
University of Wisconsin-Milwaukee                     (414) 229-4245
PO Box 413, Milwaukee, WI 53201               corre@csd4.csd.uwm.edu

From icon-group-request@arizona.edu  Tue Apr 24 14:52:48 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA00443; Tue, 24 Apr 90 14:52:48 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 24 Apr 90 14:52 MST
Received: by ucbvax.Berkeley.EDU (5.62/1.41) id AA07280; Tue, 24 Apr 90
 14:47:34 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 24 Apr 90 14:53 MST
Date: 20 Apr 90 06:26:42 GMT
From: helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU
Subject: How to obtain IDOL?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A504D7ECDE5FA05A46@Arizona.EDU>
Message-Id: <00464@sarek.UUCP>
Organization: Programmers in Exile
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

As in the subject line, how can I obtain a copy of IDOL?  Preferably non-ftp
since I don't have access myself.  Does the Icon project have a mail server
or would anyone there be able to mail it, or possibly better, is it online on
any other system that does have a mail-archive-server?  

Thanks.

From cjeffery  Tue Apr 24 15:15:33 1990
Resent-From: "Clinton Jeffery" <cjeffery>
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02404; Tue, 24 Apr 90 15:15:33 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Tue, 24 Apr 90 15:14 MST
Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15)
 via SMTP id AA02162; Tue, 24 Apr 90 15:13:01 MST
Received: by caslon; Tue, 24 Apr 90 15:13:01 mst
Resent-Date: Tue, 24 Apr 90 15:17 MST
Date: Tue, 24 Apr 90 15:13:01 mst
From: Clinton Jeffery <cjeffery@cs.arizona.edu>
Subject: How to obtain IDOL?
Resent-To: icon-group@cs.arizona.edu
To: helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU
Cc: icon-group@arizona.edu
Resent-Message-Id: <A5019BB4F15FA05D2C@Arizona.EDU>
Message-Id: <9004242213.AA14596@caslon>
In-Reply-To: 
 helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU's
 message of 20 Apr 90 06:26:42 GMT <00464@sarek.UUCP>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: 
 helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU
X-Vms-Cc: icon-group@Arizona.edu
Status: O

You asked if Idol is available electronically to people without ftp access.
cs.arizona.edu does not have a mail server that I know of.

I have automated the process of e-mailing out copies of Idol in the
form of UNIX shell-archive files for people who can send me a working
e-mail address.  I guess that makes me a mail server.

Idol is also distributed with various systems' Version 8 of the
Icon Program Library, which can be ordered from the Icon Project.
--
| Clint Jeffery, U. of Arizona Dept. of Computer Science
| cjeffery@cs.arizona.edu -or- {noao allegra}!arizona!cjeffery
--

From icon-group-request@arizona.edu  Wed Apr 25 11:37:48 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA16446; Wed, 25 Apr 90 11:37:48 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 25 Apr 90 11:38 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA17908; Wed, 25 Apr 90
 11:30:47 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 25 Apr 90 11:39 MST
Date: 25 Apr 90 18:30:08 GMT
From: usc!cs.utexas.edu!jnino@ucsd.EDU
Subject: what is IDOL
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A456DFF45F5FA064DA@Arizona.EDU>
Message-Id: <1254@gorath.cs.utexas.edu>
Organization: U. Texas CS Dept., Austin, Texas
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I have very recently become interested in Icon, and I'm new to this news
group. I read a message of someone inquiring about IDOL. Could anybody 
drop me a hint as to what that is...just wondering.

Thank you.

Jaime

From goer@sophist.uchicago.EDU  Wed Apr 25 12:24:02 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA21179; Wed, 25 Apr 90 12:24:02 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 25 Apr 90 12:25 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 25 Apr 90
 14:23:09 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA02170; Wed, 25 Apr 90
 14:18:59 CDT
Resent-Date: Wed, 25 Apr 90 12:25 MST
Date: Wed, 25 Apr 90 14:18:59 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: great set of wildcards - improved
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A4506B3E31FFA06363@Arizona.EDU>
Message-Id: <9004251918.AA02170@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

(It has come to my attention that many reading this newsgroup do not know
about egrep.  In nontechnical terms, egrep is a great little pattern-
finding program that uses a powerful wildcard system.  These wildcards
are far, far superior to anything you might find in a wordprocessor, or
say, in MS-DOS's ? and * [these symbols are used, but they mean something
different to egrep].  Icon's pattern-matching facilities are, in fact,
much more powerful than egrep's.  However, you can't generally access
them at run-time as elegantly as with, say, Snobol4.  I.e. you can't
easily tell Icon what to scan for while the program is running.  Unless
you want to write a compiler yourself, use a lot of coexpressions, or
use a variant translator like the one Ken Walker constructed, you will
be up a creek - UNLESS you have find_re() [which essentially is a lit-
tle compiler].  I hope this helps everyone understand the following
posting.)

Okay, okay, I've been chided by several people now privately about not
just biting the bullet and making find_re fully egrep-compatible.  Here's
a new version WITH a wrapper (courtesy of Jerry Nowlin).  It fails only
one of the tests included with the gnu egrep distribution, and this be-
cause of differences in escaping conventions that I don't think would be
wise to change.

In essence, find_re() is now compatible both with find() (as to its syn-
tax), and with egrep (as to its input language and exit codes).  I hope
that it will provide people with distinctly Iconish access to the ori-
ginally non-Iconish egrep-style pattern-matching language.

Note that the program posted here runs from two to twenty-five times as
fast as the previous version.  Jerry Nowlin reminded me that he had earlier
posted a grep-like utility.  I found it in my archives, ran it, found it
to be MUCH faster than my command.  Though JN's grep did not implement
things like () and |, find_re hardly seemed worth its added functionality
if it was so slow.  I used some of the ideas in JN's grep, added a few
easy optimizations, ran it through all the tests again, and presto!

Please tell me if any bugs are found.  Note that the syntax is like that
of *e*grep - not grep.  Note also that I am not in any way directly con-
nected with computer science as a field.  I've never read the egrep
source code, and I've never written a compiler.  This is just a utility
I wrote because I myself needed it.  It would *certainly* be possible to
write it faster.  I invite (in fact, let me challenge) the real computer
science people around here to do it *right* (= faster, better).  No
cheating, either (i.e. adding C calls to the Icon source, as one can do
with version 8).  It's gotta have full egrep functionality as well, and
pass all the main tests one might subject an egrep-like command to!

--------------------------------RGegrep.icn----------------------------------
# wrapper by Jerry Nowlin

procedure main(a)

	# the usage message
	usage := "Usage: RGegrep pattern [ file ... ]"

	# the first program argument must be the pattern
	# Perhaps:  *a = 0 & stop("more args!"); while pattern := get(a) do {
	pattern := get(a) | stop("I at least need a pattern\n",usage)

	# trick the program into using standard input if no files were passed
	if *a = 0 then a := [&null]

	# the rest of the arguments are files to search through
	every f := !a do {

		# if the file isn't null try to open it
		if \f then in := open(f) | stop("I can't open '",f,"'")

		# otherwise use standard input
		else in := &input

		# if there is only one file skip printing the file name
		if *a = 1 then f := ""

		# otherwise tack on a colon
		else f ||:= ":"

		# read all the lines
		every l := !in do {

			# scan the line for the pattern
			l ? {

##### BELOW IS THE CALL TO the find_re() procedure listed below #####

				# if the pattern is found print the line
				if find_re(pattern)
                                then exit_status := 0 & write(f,l)
			}
		}

		# close the input file if one was opened
		if in ~=== &input then close(in)
	}
   
    # If set in a while loop, put another brace around the above -
    #   }

	exit(\exit_status | 1)

end

########################################################################
#    
#	Name:	find_re.icn
#	
#	Title:	"Find" Regular Expression
#	
#	Author:	Richard L. Goerwitz
#
#	Date:	April 25, 1990 (version 0.9)
#
########################################################################
#    
#    DESCRIPTION:  Find_re is similar to the Icon builtin function
#    find(), except that it takes as its first argument a regular
#    expression of the sort used by the Unix egrep command.  For those
#    unfamiliar with the notion of regular expressions, they represent
#    a simple string representation of a finite state transition
#    network which can be converted into an automaton capable of
#    recognizing patterns in strings of characters.  The specific
#    symbols used, and the purposes they are used for, can be gleaned
#    from the Unix man pages for egrep and the regex library
#    functions.  In nontechnical terms, regular expressions are a
#    great set of wildcards.
#
#    DIFFERENCES between find and find_re:  Find_re is backwards com-
#    patible with find().  Aside from permitting regular expressions,
#    the only difference between find_re() and find() is that find_re
#    sets the global variable __endpoint to the first position after
#    any given match occurs.  Use this variable with great care, pre-
#    ferably assigning its value to some other variable immediately
#    after the match (e.g. find_re("hello[. ?!]*",s) & tmp :=
#    __endpoint).  Otherwise, you will certainly run into trouble!
#
#    DIFFERENCES between egrep and find_re:  Find_re utilizes the same
#    basic language as egrep.  The only big diff. is that find_re uses
#    intrinsic Icon data structures and escaping conventions rather
#    than those of any particular Unix variant.  Be careful!  If you
#    put find_re("\(hello\)",s) into your source file, find_re will
#    treat it just like find_re("(hello)",s).  If, however, you enter
#    \(hello\) at run-time (due to find_re(!&input,s)), what Icon
#    receives will depend on your operating system (most likely, a
#    trace will show "\\(hello\\)").
#
#    BUGS:  Little attempt has been made to optimize find_re.  For work
#    that requires a quick response, you'll have to use something like
#    system("egrep...")!
#    
########################################################################


global state_table, biggest_nonmeta_str, __endpoint
record o_a_s(op,arg,state)

procedure find_re(re, s, i, j)

    static FSTN_table, STRING_table
    initial {
	FSTN_table := table()
	STRING_table := table()
    }

    /re & stop("find_re:  Call me with at least one argument!")

    /i := \&pos | 1
    /s := \&subject | stop("find_re:  No string.")
    /j := *s+1
    if /FSTN_table[re] then {
	if \STRING_table[re] then {
	    every p := find(STRING_table[re],s,i,j)
	    # the stored literal can be shorter than re once escapes are stripped
	    do { __endpoint := p + *STRING_table[re]; suspend p }
	    fail
	}
	else {
	    tokenized_re := tokenize(re)
	    if 0 > !tokenized_re then {
		MakeFSTN(tokenized_re) | er(re,2)
		/FSTN_table[re] := copy(state_table)
	    }
	    else {
		tmp := ""; every tmp ||:= char(!tokenized_re)
		insert(STRING_table,re,tmp)
		every p := find(STRING_table[re],s,i,j)
		do { __endpoint := p + *STRING_table[re]; suspend p }
		fail
	    }
	}
    }


    s ? {
	tab(x := i to j) &
	(find(biggest_nonmeta_str) | fail) \ 1 &
	apply_FSTN(&null,FSTN_table[re]) &
	(__endpoint := .&pos - 1, suspend x)
    }

end



procedure apply_FSTN(ini,tbl)

    static s_tbl
    local POS, tmp, fin

    /ini := 1 & s_tbl := tbl
    if ini = 0 then {
	return .&pos
    }
    POS := &pos
    fin := 0

    if tmp := !s_tbl[ini] &
       tab(tmp.op(tmp.arg))
    then {
	if tmp.state = fin
	then return .&pos
	else {
	    return apply_FSTN(tmp.state) |
		   (&pos := POS, fail)
	}
    }
    else &pos := POS

end
    


procedure tokenize(s)

    local chr, tmp

    token_list := list()
    s ? {
	tab(many('*+?|'))
	while chr := move(1) do {
	    if chr == "\\"
	    # it can't be a metacharacter; remove the \ and "put"
	    # the integer value of the next chr into token_list
	    then put(token_list,ord(move(1))) | er(s,2,chr)
	    else if any('*+()|?.$^',chr)
	    then {
		# Yuck!  Egrep compatibility stuff.
		case chr of {
		    "*"    : {
			tab(many('*+?'))
			put(token_list,-ord("*"))
		    }
		    "+"    : {
			tmp := tab(many('*?+')) | &null
			if upto('*?',\tmp)
			then put(token_list,-ord("*"))
			else put(token_list,-ord("+"))
		    }
		    "?"    : {
			tmp := tab(many('*?+')) | &null
			if upto('*+',\tmp)
			then put(token_list,-ord("*"))
			else put(token_list,-ord("?"))
		    }
		    "("    : {
			tab(many('*+?'))
			put(token_list,-ord("("))
		    }
		    default: put(token_list,-ord(chr))
		}
	    }
	    else {
		case chr of {
		    # More egrep compatibility stuff.
		    "["    : {
			every next_one := find("]")
			\next_one ~= &pos | er(s,2,chr)
			put(token_list,-ord(chr))
		    }
                    "]"    : {
			if &pos = (\next_one+1)
			then put(token_list,-ord(chr)) &
			     next_one := &null
			else put(token_list,ord(chr))
		    }
		    default: put(token_list,ord(chr))
		}
	    }
	}
    }

    token_list := UnMetaBrackets(token_list)

    fixed_length_token_list := list(*token_list)
    every i := 1 to *token_list
    do fixed_length_token_list[i] := token_list[i]
    return fixed_length_token_list

end



procedure UnMetaBrackets(l)

    # Since brackets delineate a cset, it doesn't make
    # any sense to have metacharacters inside of them.
    # UnMetaBrackets makes sure there are no metacharacters
    # inside of the brackets.

    local tmplst, i, Lb, Rb

    tmplst := list(); i := 0
    Lb := -ord("[")
    Rb := -ord("]")

    while (i +:= 1) <= *l do {
	if l[i] = Lb then {
	    put(tmplst,l[i])
	    until l[i +:= 1] = Rb
	    do put(tmplst,abs(l[i]))
	    put(tmplst,l[i])
	}
	else put(tmplst,l[i])
    }
    return tmplst

end



procedure MakeFSTN(l,INI,FIN)

    # MakeFSTN recursively descends through the tree structure
    # implied by the tokenized string, l, recording in (global)
    # fstn_table a list of operations to be performed, and the
    # initial and final states which apply to them.

    static Lp, Rp, Sl, Lb, Rb, Caret_inside, Dot, Dollar, Caret_outside
    local i, inter, inter2, tmp
    initial {
	Lp := -ord("("); Rp := -ord(")")
	Sl := -ord("|")
	Lb := -ord("["); Rb := -ord("]"); Caret_inside := ord("^")
	Dot := -ord("."); Dollar := -ord("$"); Caret_outside := -ord("^")
	biggest_nonmeta_str := ""
    }

    /INI := 1 & state_table := table() & NextState("new")
    /FIN := 0

    # I haven't bothered to test for empty lists everywhere.
    if *l = 0 then {
	/state_table[INI] := []
	put(state_table[INI],o_a_s(zSucceed,&null,FIN))
	return
    }

    # HUNT DOWN THE SLASH (ALTERNATION OPERATOR)
    every i := 1 to *l do {
	if l[i] = Sl & tab_bal(l,Lp,Rp) = i then {
	    if i = 1 then er(l,2,char(abs(l[i]))) else {
		inter := NextState()
		inter2:= NextState()
		MakeFSTN(l[1:i],inter2,FIN)
		MakeFSTN(l[i+1:0],inter,FIN)
		/state_table[INI] := []
		put(state_table[INI],o_a_s(apply_FSTN,inter2,0))
		put(state_table[INI],o_a_s(apply_FSTN,inter,0))
		return
	    }
	}
    }

    # HUNT DOWN PARENTHESES
    if l[1] = Lp then {
	i := tab_bal(l,Lp,Rp) | er(l,2,"(")
	inter := NextState()
	if any('*+?',char(abs(0 > l[i+1]))) then {
	    case l[i+1] of {
		-ord("*")   : {
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
		    MakeFSTN(l[2:i],INI,INI)
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("+")   : {
		    inter2 := NextState()
		    /state_table[inter2] := []
		    MakeFSTN(l[2:i],INI,inter2)
		    put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
		    MakeFSTN(l[2:i],inter2,inter2)
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("?")   : {
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
		    MakeFSTN(l[2:i],INI,inter)
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
	    }
	}
	else {
	    MakeFSTN(l[2:i],INI,inter)
	    MakeFSTN(l[i+1:0],inter,FIN)
	    return
	}
    }
    else {     # I.E. l[1] NOT = Lp (left parenthesis as -ord("("))
	every i := 1 to *l do {
	    case l[i] of {
		Lp     : {
		    inter := NextState()
		    MakeFSTN(l[1:i],INI,inter)
		    MakeFSTN(l[i:0],inter,FIN)
		    return
		}
		Rp     : er(l,2,")")
	    }
	}
    }

    # NOW, HUNT DOWN BRACKETS
    if l[1] = Lb then {
	i := tab_bal(l,Lb,Rb) | er(l,2,"[")
	inter := NextState()
	tmp := ""; every tmp ||:= char(l[2 to i-1])
	if Caret_inside = l[2]
	then tmp := ~cset(Expand(tmp[2:0]))
	else tmp :=  cset(Expand(tmp))
	if any('*+?',char(abs(0 > l[i+1]))) then {
	    case l[i+1] of {
		-ord("*")   : {
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
		    put(state_table[INI],o_a_s(any,tmp,INI))
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("+")   : {
		    inter2 := NextState()
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(any,tmp,inter2))
		    /state_table[inter2] := []
		    put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
		    put(state_table[inter2],o_a_s(any,tmp,inter2))
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("?")   : {
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
		    put(state_table[INI],o_a_s(any,tmp,inter))
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
	    }
	}
	else {
	    /state_table[INI] := []
	    put(state_table[INI],o_a_s(any,tmp,inter))
	    MakeFSTN(l[i+1:0],inter,FIN)
	    return
	}
    }
    else {           # I.E. l[1] not = Lb
	every i := 1 to *l do {
	    case l[i] of {
		Lb     : {
		    inter := NextState()
		    MakeFSTN(l[1:i],INI,inter)
		    MakeFSTN(l[i:0],inter,FIN)
		    return
		}
		Rb     : er(l,2,"]")
	    }
	}
    }

    # FIND INITIAL SEQUENCES OF POSITIVE INTEGERS, CONCATENATE THEM
    if i := match_positive_ints(l) then {
	inter := NextState()
	tmp := Ints2String(l[1:i])
	if *tmp > *biggest_nonmeta_str
	then biggest_nonmeta_str := tmp
	/state_table[INI] := []
	put(state_table[INI],o_a_s(match,tmp,inter))
	MakeFSTN(l[i:0],inter,FIN)
	return
    }

    # OKAY, CLEAN UP ALL THE JUNK THAT'S LEFT
    i := 0
    while (i +:= 1) <= *l do {
	case l[i] of {
	    Dot          : { Op := any;   Arg := &cset }
	    Dollar       : { Op := pos;   Arg := 0     }
	    Caret_outside: { Op := pos;   Arg := 1     }
	    default      : { Op := match; Arg := char(0 < l[i]) }
	} | er(l,2,char(abs(l[i])))
	inter := NextState()
	if any('*+?',char(abs(0 > l[i+1]))) then {
	    case l[i+1] of {
		-ord("*")   : {
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
		    put(state_table[INI],o_a_s(Op,Arg,INI))
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("+")   : {
		    inter2 := NextState()
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(Op,Arg,inter2))
		    /state_table[inter2] := []
		    put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
		    put(state_table[inter2],o_a_s(Op,Arg,inter2))
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
		-ord("?")   : {
		    /state_table[INI] := []
		    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
		    put(state_table[INI],o_a_s(Op,Arg,inter))
		    MakeFSTN(l[i+2:0],inter,FIN)
		    return
		}
	    }
	}
	else {
	    /state_table[INI] := []
	    put(state_table[INI],o_a_s(Op,Arg,inter))
	    MakeFSTN(l[i+1:0],inter,FIN)
	    return
	}
    }

    # WE SHOULD NOW BE DONE INSERTING EVERYTHING INTO state_table
    # IF WE GET TO HERE, WE'VE PARSED INCORRECTLY!
    er(l,4)

end



procedure NextState(new)
    static nextstate
    if \new then nextstate := 1
    else nextstate +:= 1
    return nextstate
end



procedure er(x,i,elem)
    writes(&errout,"Error number ",i," parsing ",image(x)," at ")
    if \elem 
    then write(&errout,image(elem),".")
    else write(&errout,"(?).")
    exit(i)
end



procedure zSucceed()
    return .&pos
end



procedure Expand(s)

    s2 := ""
    s ? {
	s2 ||:= ="^"
	s2 ||:= ="-"
	while s2 ||:= tab(find("-")-1) do {
	    if (c1 := move(1), ="-",
		c2 := move(1),
		c1 << c2)
	    then every s2 ||:= char(ord(c1) to ord(c2))
	    else s2 ||:= 1(move(2), not(pos(0))) | er(s,2,"-")
	}
	s2 ||:= tab(0)
    }
    return s2

end



procedure tab_bal(l,i1,i2)
    i := 0
    i1_count := 0; i2_count := 0
    while (i +:= 1) <= *l do {
	case l[i] of {
	    i1  : i1_count +:= 1
	    i2  : i2_count +:= 1
	}
	if i1_count = i2_count
	then suspend i
    }
end


procedure match_positive_ints(l)
    
    # Matches the longest sequence of positive integers in l,
    # beginning at l[1], which neither contains, nor is fol-
    # lowed by a negative integer.  Returns the first position
    # after the match.  Hence, given [55, 55, 55, -42, 55],
    # match_positive_ints will return 3.  [55, -42] will cause
    # it to fail rather than return 1 (NOTE WELL!).

    every i := 1 to *l do {
	if l[i] < 0
	then return (3 < i) - 1
    }

end


procedure Ints2String(l)
    tmp := ""
    every tmp ||:= char(!l)
    return tmp
end


procedure StripChar(s,s2)
    if find(s2,s) then {
	tmp := ""
	s ? {
	    while tmp ||:= tab(find(s2))
	    do tab(many(cset(s2)))
	    tmp ||:= tab(0)
	}
    }
    return \tmp | s
end

From @RELAY.CS.NET:Adalbert.Kerber@uni-bayreuth.dbp.de  Thu Apr 26 02:46:52 1990
Received: from relay.cs.net by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA19701; Thu, 26 Apr 90 02:46:52 MST
Received: from relay2.cs.net by RELAY.CS.NET id aa13696; 26 Apr 90 5:46 EDT
Received: from zix.gmd.dbp.de by RELAY.CS.NET id ae06551; 26 Apr 90 5:38 EDT
Received: from zix.gmd.dbp.de by .zix.gmd.dbp.de id a001775; 26 Apr 90 8:41 MET
Date: 26 Apr 90 07:31 GMT
From: Adalbert.Kerber%uni-bayreuth.dbp.de@RELAY.CS.NET
To: ICON-GROUP@cs.arizona.edu
Message-Id: <94138062400991/13537 X400>
Status: O

please stop subscription for btm203@dbthrz5.bitnet

From goer@sophist.uchicago.EDU  Fri Apr 27 08:05:57 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA16824; Fri, 27 Apr 90 08:05:57 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Fri, 27 Apr 90 08:07 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 27 Apr 90
 08:44:41 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA04266; Fri, 27 Apr 90
 08:40:31 CDT
Resent-Date: Fri, 27 Apr 90 08:07 MST
Date: Fri, 27 Apr 90 08:40:31 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: coexpressions, questions
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A2E22055697FA06BB8@Arizona.EDU>
Message-Id: <9004271340.AA04266@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I've always wondered why coexpressions were introduced.  Was it because
people wanted coroutines?  Or did the idea of delaying evaluation prompt
their inclusion (a holdover from SNOBOL's *)?  Final question:  How is it
that the designers came to recognize that coroutines and delayed evaluation
(or really, controlled evaluation) could be united under a single
syntactic rubric?


    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From goer@sophist.uchicago.EDU  Fri Apr 27 08:06:08 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA16848; Fri, 27 Apr 90 08:06:08 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Fri, 27 Apr 90 08:06 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 27 Apr 90
 09:13:06 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA04306; Fri, 27 Apr 90
 09:08:52 CDT
Resent-Date: Fri, 27 Apr 90 08:07 MST
Date: Fri, 27 Apr 90 09:08:52 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: grammars, questions
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A2E22ABA8B7FA07336@Arizona.EDU>
Message-Id: <9004271408.AA04306@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

My last questions were centered on coexpressions.  These ones concern
representation of various types of grammars in Icon.  They seemed to
belong in separate postings.

I've long wondered what sorts of grammars Icon is capable of representing,
using string scanning.  It appears that Icon has no trouble representing
context free languages, with the restriction that there can be no left-
recursion in the grammar.  For representing natural languages, this can
be a serious drawback.  Lots of natural language constructs cannot be rep-
resented without left recursion (e.g. the possessive 's construct).

Prolog has a thing called indexed grammars, which on the infamous "Chomsky
hierarchy" seem to fall between context free phrase structure grammars and
context sensitive phrase structure grammars.  These grammars are like con-
text free grammars except that while recognizing sequences the various nodes
are given labels which can be correlated and compared with labels for other
nodes.  The result is that you can have the NP node labeled as plural, and
the VP node as well, and tell the grammar that if the two do not match the
sentence must be rejected.  As far as I can see, this could be done in Icon
simply by having matching procedures return values.  I dunno.  Has anyone
looked into this?
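
A minimal sketch of the idea, assuming a toy lexicon: each matching
procedure produces a number feature ("sg" or "pl") as its result, and the
sentence procedure compares the two.  The names np, vp and sentence, and
the phrases they match, are made up for illustration only.

```icon
# Toy noun phrase: matches a phrase and produces its number feature.
procedure np()
    suspend (=("the dog" | "a cat") & "sg") |
            (=("the dogs" | "two cats") & "pl")
end

# Toy verb phrase, same convention.
procedure vp()
    suspend (=("barks" | "sleeps") & "sg") |
            (=("bark" | "sleep") & "pl")
end

# Succeeds only if the whole string parses and the two features agree.
procedure sentence(str)
    local n, v
    return str ? ((n := np()) & =" " & (v := vp()) & pos(0) & (n == v))
end

procedure main()
    local str
    every str := "the dog barks" | "the dogs bark" | "the dog bark" do
        write(str, ": ", if sentence(str) then "accepted" else "rejected")
end
```

The first two strings should be accepted and the third rejected.  Because
np and vp use suspend, a wrong first guess (such as "the dog" matched
against "the dogs bark") is undone by ordinary backtracking.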

It seems to me that Icon has the distinct advantage of allowing the user to
skip the tokenizing stage in many cases.  You can just parse the string
directly.  I like this.  But what do we do in cases where we must deal with
a backslash?  Most solutions I've seen are pretty ugly.  What I did in my
find_re procedure posted a few days ago was to convert input strings to lists,
and then convert metacharacters to negative integers, leaving nonmetas as
positive integers.  This worked well, since Icon has ord() and char().  Has
anyone developed an elegant solution to the \ problem using string scanning?
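
The list-of-integers approach just described can be sketched in a few
lines.  This is a simplified illustration, not the find_re tokenizer
itself; the procedure name tokenize and its metas parameter are
assumptions.

```icon
# Turn a pattern into a list of integers: metacharacters become
# negative ord() values, everything else (including escaped
# metacharacters) stays positive.
procedure tokenize(s, metas)
    local tokens, c
    tokens := []
    s ? {
        while c := move(1) do {
            if c == "\\" then {
                # escaped: take the next character literally;
                # a trailing backslash makes the procedure fail
                put(tokens, ord(move(1))) | fail
            }
            else if any(metas, c) then put(tokens, -ord(c))
            else put(tokens, ord(c))
        }
    }
    return tokens
end

procedure main()
    # the Icon literal "a\\*b*" is the five-character pattern a\*b*
    every writes(!tokenize("a\\*b*", '*+?'), " ")
    write()
end
```

Here the escaped * comes out as 42 and the bare * as -42, so later
stages can tell them apart with a simple sign test.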

This posting has wandered a bit, so let me summarize.  I am curious, first
of all, about what sorts of grammars can easily be represented using Icon's
special string-processing facilities.  Secondly, I'm curious whether anyone
has done research into indexed grammars in Icon.  Thirdly, I'd like to know
whether there exist elegant solutions to the \ problem.

I hope that these questions are relevant, interesting, etc., and not just
a waste of bandwidth.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From ralph  Fri Apr 27 08:26:26 1990
Resent-From: "Ralph Griswold" <ralph>
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA19579; Fri, 27 Apr 90 08:26:26 MST
Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Fri, 27 Apr 90 08:27 MST
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA19537; Fri, 27 Apr 90
 08:25:52 MST
Resent-Date: Fri, 27 Apr 90 08:27 MST
Date: Fri, 27 Apr 90 08:25:52 MST
From: Ralph Griswold <ralph@cs.arizona.edu>
Subject: RE:  coexpressions, questions
Resent-To: icon-group@cs.arizona.edu
To: goer@sophist.uchicago.EDU, icon-group@arizona.edu
Resent-Message-Id: <A2DF439D1E3FA078DD@Arizona.EDU>
Message-Id: <9004271525.AA19537@megaron.cs.arizona.edu>
In-Reply-To: <9004271340.AA04266@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: goer@sophist.uchicago.EDU, icon-group@Arizona.edu
Status: RO

The motivation for adding co-expressions to Icon was to be able to
control when and where the results of a generator are produced.

Without co-expressions, the results of a generator are produced in
a last-in-first-out fashion at the lexical site in the program at
which the generator appears, as demanded by the enclosing expression.
Parallel production of the results of two generators, for example,
is impossible without co-expressions.
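
As an illustration, here is a minimal sketch of stepping two generators
in lockstep.  The procedure name lockstep is an assumption; create and @
are the standard co-expression operations.

```icon
# Activate two co-expressions alternately, pairing up their results
# until either one is exhausted.
procedure lockstep(c1, c2)
    local a, b
    while (a := @c1) & (b := @c2) do
        suspend a || " " || b
end

procedure main()
    local letters, numbers
    letters := create !"abc"      # results "a", "b", "c"
    numbers := create (1 to 3)    # results 1, 2, 3
    every write(lockstep(letters, numbers))
end
```

This writes "a 1", "b 2" and "c 3".  By contrast, the plain expression
every write(!"abc", " ", 1 to 3) writes all nine combinations, because
the rightmost generator is always resumed first.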

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From icon-group-request@arizona.edu  Fri Apr 27 10:57:05 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA05153; Fri, 27 Apr 90 10:57:05 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 27 Apr 90 10:58 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA15563; Thu, 26 Apr 90
 12:41:59 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 27 Apr 90 10:58 MST
Date: 26 Apr 90 18:50:05 GMT
From: nic!hri!sparc9!rolandi@bbn.COM
Subject: ftp new sparc station icon sources
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A2CA3D27445FA07E57@Arizona.EDU>
Message-Id: <1990Apr26.185005.19973@hri.com>
Organization: Horizon Research
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: RO

What does one need to do in order to obtain the new SPARC 
station Icon source code?  



              ***************************************
              *          Walter G. Rolandi          *
              *        Horizon Research, Inc.       *
              *           1432 Main Street          *
              *       Waltham, MA  02154  USA       *
              *            (617) 466 8339           *
              *                                     *
              *           rolandi@hri.com           *
              ***************************************

From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Fri Apr 27 11:07:25 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA06381; Fri, 27 Apr 90 11:07:25 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA05405; Fri, 27 Apr 90 14:07:15 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Fri, 27 Apr 90 14:06:43 EDT
Date: Fri, 27 Apr 90 12:59:36 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <221535@Wayne-MTS>
Subject: Why co-expressions?
Status: RO

Ralph's explanation of the motivation for co-expressions is right on.
But Ralph--why do they have to be symmetric?  An asymmetric version,
which is what I have in SPLASH, is conceptually simpler (rather like
Unix piping, but with branching) and seems to provide all the functionality
I've ever needed.
 
Paul Abrahams
abrahams%wayne-mts@um.cc.umich.edu

From ralph  Fri Apr 27 11:14:42 1990
Date: Fri, 27 Apr 90 11:14:42 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004271814.AA06892@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA06892; Fri, 27 Apr 90 11:14:42 MST
To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
Subject: Re:  Why co-expressions?
Cc: icon-group
In-Reply-To: <221535@Wayne-MTS>
Status: O

   Ralph's explanation of the motivation for co-expressions is right on.
   But Ralph--why do they have to be symmetric?  An asymmetric version,
   which is what I have in SPLASH, is conceptually simpler (rather like
   Unix piping, but with branching) and seems to provide all the functionality
   I've ever needed.
    
   Paul Abrahams
   abrahams%wayne-mts@um.cc.umich.edu

This is a long-standing question.  In fact, it's been posed as a challenge --
produce a program that really needs the full coroutine capabilities of
co-expressions.

Perhaps Steve Wampler, who designed and implemented co-expressions, will
respond.

It's worth noting that symmetry usually is viewed as an aesthetic virtue.

I guess my personal view is that I can ignore the coroutine aspects of
co-expressions.  Except when I have to document them or teach about them.


   

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From kwalker  Fri Apr 27 11:19:06 1990
Date: Fri, 27 Apr 90 11:19:06 MST
From: "Kenneth Walker" <kwalker>
Message-Id: <9004271819.AA07299@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA07299; Fri, 27 Apr 90 11:19:06 MST
In-Reply-To: <9004271408.AA04306@sophist.uchicago.edu>
To: icon-group
Subject: Re:  grammars, questions
Status: O

> Date: Fri, 27 Apr 90 09:08:52 CDT
> From: Richard Goerwitz <goer@sophist.uchicago.EDU>
> 
> It appears that Icon has no trouble representing
> context free languages, with the restriction that there can be no left-
> recursion in the grammar.

Certain special cases of left-recursion can be converted into looping.
Unfortunately, this bounds backtracking, so you have to know that
backtracking won't find other solutions.

For example, the production

s ::=  t  |  s "+" t

can be parsed with a procedure something like

procedure s()
   x := []
   push(x, t()) | fail
   while ="+" do
      push(x, t()) | stop("syntax error")
   return x
end

(I haven't tested this code; in any event it needs to be a little more
sophisticated.) This pattern of left recursion comes up a lot in
programming languages. Is it common in natural languages?
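
A slightly fuller sketch of the same loop, under some assumptions: t is
taken to match a run of digits (a stand-in, since the production above
leaves t unspecified), put is used so that the terms come out in source
order, and a missing term after "+" makes s fail rather than stop.

```icon
# Hypothetical terminal: t matches a run of digits.
procedure t()
    return tab(many(&digits))
end

# s ::= t | s "+" t, with the left recursion rewritten as a loop.
procedure s()
    local x
    x := [t()] | fail
    while ="+" do
        put(x, t()) | fail
    return x
end

procedure main()
    local terms
    if terms := ("1+22+333" ? s()) then
        every write(!terms)
    else write("no parse")
end
```

This writes 1, 22 and 333 on separate lines.  Once the loop has consumed
a "+" it never gives it back, which is the bounded backtracking mentioned
above.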

  Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
  +1 602 621-4324  kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker

From cargo@tardis.cray.com  Fri Apr 27 12:06:16 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA10501; Fri, 27 Apr 90 12:06:16 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA18506; Fri, 27 Apr 90 14:06:25 CDT
Received: from zk.cray.com by hall.cray.com
	id AA15158; 4.1/CRI-3.12; Fri, 27 Apr 90 14:06:24 CDT
Received: by zk.cray.com
	id AA04961; 3.2/CRI-3.12; Fri, 27 Apr 90 14:06:45 CDT
Date: Fri, 27 Apr 90 14:06:45 CDT
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9004271906.AA04961@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: parsers
Status: O

I am interested in using Icon to write parsers for what I have heard
called "braced languages."  "The braced languages are deterministic
and context-free languages that explicitly identify and mark the
beginning and end of each piece of information comprising the data
object." [from The automatic generation of software for data exchange
in the graphics domain, Sandra A. Mamrak, et al., The Ohio State
University]

The languages in general are SGML documents (with the document type
description defining the particular structure of the data objects).

I haven't had time to do much aside from research some of the work
that other people have done.

dsc

From wunder@hpsdel.sde.hp.com  Fri Apr 27 12:53:27 1990
Received: from hp-sde.sde.hp.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13751; Fri, 27 Apr 90 12:53:27 MST
Received: from orac.sde.hp.com by hp-sde.sde.hp.com with SMTP
	(16.2A/15.5+IOS 3.13) id AA22723; Fri, 27 Apr 90 12:53:01 -0700
Received: by hpsdel.sde.hp.com
        (15.7/SES42.42) id AA25100; Fri, 27 Apr 90 12:52:42 pdt
Date: Fri, 27 Apr 90 12:52:42 pdt
From: Walter Underwood <wunder@hpsdel.sde.hp.com>
Message-Id: <9004271952.AA25100@hpsdel.sde.hp.com>
To: cargo@tardis.cray.com
Cc: icon-group@cs.arizona.edu
In-Reply-To: David S. Cargo's message of Fri, 27 Apr 90 14:06:45 CDT <9004271906.AA04961@zk.cray.com>
Subject: parsers
Status: O

   The languages in general are SGML documents (with the document type
   description defining the particular structure of the data objects).

Check out this paper:

  The implementation of the Amsterdam SGML parser
  J Warmer & S Van Egmond
  Electronic Publishing, vol 2 no 2, page 65
  July 1989

They talk about why SGML is not LL(1), and about implementing a parser
with the Amsterdam Compiler Kit.

wunder


From sbw@naucse.cse.nau.edu  Fri Apr 27 12:55:05 1990
Received: from naucse.cse.nau.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA13884; Fri, 27 Apr 90 12:55:05 MST
Received: by naucse.cse.nau.edu (5.61/1.34)
	id AA05834; Fri, 27 Apr 90 12:54:21 -0700
Message-Id: <9004271954.AA05834@naucse.cse.nau.edu>
Date: Fri, 27 Apr 90 12:54:12 MST
X-Mailer: Mail User's Shell (6.5 4/17/89)
From: sbw@naucse.cse.nau.edu (Steve Wampler)
To: "Ralph Griswold" <ralph@cs.arizona.edu>,
        um.cc.umich.edu!Wayne-MTS!Paul_Abrahams@cs.arizona.edu
Subject: Re:  Why co-expressions?
Cc: icon-group@cs.arizona.edu
Status: O

On Apr 27 at 11:18, "Ralph Griswold" writes:
} 
}    Ralph's explanation of the motivation for co-expressions is right on.
}    But Ralph--why do they have to be symmetric?  An asymmetric version,
}    which is what I have in SPLASH, is conceptually simpler (rather like
}    Unix piping, but with branching) and seems to provide all the functionality
}    I've ever needed.
}     
}    Paul Abrahams
}    abrahams%wayne-mts@um.cc.umich.edu
} 
} This is a long-standing question.  In fact, it's been posed as a challenge --
} produce a program that really needs the full coroutine capabilities of
} co-expressions.
} 
} Perhaps Steve Wampler, who designed and implemented co-expressions, will
} respond.

Well, let's see, just what do I want my memory of back then to be...

Oh yes.

One thing to keep in mind is that, as a PhD student, I was interested in
'research' topics, not just implementation.  The nice thing about a
symmetric view of co-expressions is that they are full of interesting
potentials - for example, since they effectively provide a heap-based
calling structure (instead of the conventional stack-based model), they
provide all sorts of fun graph-based programming strategies.  And, of
course, lend themselves reasonably well to exploring certain multi-
process programming strategies (could do better at this one, though).

I think I can claim (heck, I can claim anything - wonder if I'm right?)
that asymmetric co-expressions provide lazy evaluation of a tree-based
calling structure - which is interesting, but not as general (from a
research point of view, remember).

And, I know it seems odd, but I find the symmetric view very straight-forward,
there isn't much special-casing going on.  I think this is reflected in
the original implementation, which was flat-out trivial on a PDP (to activate
a co-expression, you simply changed the sp register to point into the stack
for the co-expression, saved the pc and reset it to the saved pc for
the co-expression.  Only about 4 instructions.  Returning was, well,
symmetric.)  I'd bet that an implementation of the asymmetric model
would be no easier, and possibly more complex (since you may need to
worry about preventing cycles).  Since all that changes is the CPU state,
this seems natural (to my warped mind) as a simple multi-processor model.

Of course, the original model was flawed if one really wanted coroutines,
but there are other things (such as the heap-based call graph mentioned
above) that are nice *from a research point of view*.

Sigh, I wish someone would throw me a couple of graduate students and
some time to really explore these things.

Paul, I'm anxious to learn more about SPLASH!  I'd like to play with
asymmetric co-expressions as well as some of the other features you've
tempted us with!

-- 
	Steve Wampler
	{....!arizona!naucse!sbw}
	{sbw@naucse.cse.nau.edu}

From goer@sophist.uchicago.EDU  Fri Apr 27 12:57:36 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA14211; Fri, 27 Apr 90 12:57:36 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Fri, 27 Apr 90 12:58 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 27 Apr 90
 14:57:23 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA04801; Fri, 27 Apr 90
 14:53:10 CDT
Resent-Date: Fri, 27 Apr 90 12:59 MST
Date: Fri, 27 Apr 90 14:53:10 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: left recursion in natural languages
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A2B963D86B3FA07326@Arizona.EDU>
Message-Id: <9004271953.AA04801@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Ken Walker kindly responds to my query about parsing strategies (or at
least one of the many queries):

	For example, the production
	
	s ::=  t  |  s "+" t

(For those who want to get in this game, quick read Griswold & Griswold,
chapter 15.)
	
	can be parsed with a procedure something like
	
	procedure s()
	   x := []
	   push(x, t()) | fail
	   while ="+" do
	      push(x, t()) | stop("syntax error")
	   return x
	end
	
	(I haven't tested this code; in any event it needs to be a little more
	sophisticated.) This pattern of left recursion comes up a lot in
	programming languages. Is it common in natural languages?

Yes, it's fairly common.  One case in point is the 's construction.
You can define a noun phrase as a combination of articles, simple
nouns, adjectives, relative clauses, etc.  This works fairly well
using a simple phrase structure grammar.  However, as soon as you try
to include 's, you have to define a noun phrase as a noun phrase fol-
lowed by an 's.  E.g. -

	the queen of England
	the queen of England's throne

The phrase "the queen of England" is a noun phrase, and so is "the queen
of England's throne."  You can see immediately how there's left recursion
here.  Any time you get a postposition (which is what 's really is -
it's not a "case" or an affix of any kind) you'll have this problem of
left recursion.  Sumerian, which is one language I've studied, is all
postpositions - no prepositions at all.  In fact, you can often do a
mirror-image lexical calque of a phrase and have it come out as acceptable
English.

The problem with natural language parsing strategies that don't permit
elegant left-recursion is that they don't mirror the apparent ability
of people to handle both sorts of recursion in any given language.
Normally a language will use one or the other predominantly, but often
there is mixing (as in English).

One way out is to permit X levels of left recursion, with X representing
the number of levels beyond which people get confused, and don't really
talk that way.  The problem with this is that people might perhaps really
be using an internal grammar that permits infinite recursion, but that
they just can't fully realize the grammar.  This seems a bit silly to
me.

I have to admit that my main interest is in the sounds - the phonemes,
or systematic pronunciation units - of ancient Semitic languages.
Someone who is more into the syntax of natural languages, please jump
in here --->

From @s.ms.uky.edu:mtbb95@ms.uky.edu  Fri Apr 27 14:29:16 1990
Received: from e.ms.uky.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA21281; Fri, 27 Apr 90 14:29:16 MST
Received: from s.ms.uky.edu by g.ms.uky.edu id aa24729; 27 Apr 90 16:48 EDT
From: Bob Maras <mtbb95@ms.uky.edu>
Date: Fri, 27 Apr 90 16:47:52 EDT
X-Mailer: Mail User's Shell (6.4 2/14/89)
To: icon-group@cs.arizona.edu, mtbb95@ms.uky.edu
Subject: Removal
Message-Id:  <9004271647.aa22801@s.s.ms.uky.edu>
Status: O

Please remove my name from your Icon mailing list of users.  I appreciate the fine effort you are making and wish each of you the very best.  I have enjoyed
your information very much.

Robert Maras

-- 
                          _      _
                         ( ) __ ( )
                          | O  O |         B O B   M A R A S
                          /  __  \      /
                         (   \/   )  __/
                          \ \__/ / 
                           \____/ 
                           |_/\_|     H A P P Y    C O M P U T I N G    !!!
        

From icon-group-request@arizona.edu  Fri Apr 27 17:54:28 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA06635; Fri, 27 Apr 90 17:54:28 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 27 Apr 90 17:54 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA27061; Fri, 27 Apr 90
 17:39:16 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 27 Apr 90 17:54 MST
Date: 27 Apr 90 18:22:00 GMT
From: swrinde!zaphod.mps.ohio-state.edu!uwm.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!daniel@ucsd.EDU
Subject: Icon v8 port for Convex ?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A2900CB3721FA084E1@Arizona.EDU>
Message-Id: <6900001@ux1.cso.uiuc.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


I desire to bring up Icon version 8 on a Convex 220.  Has anyone done
the port yet?  I see that version 7 was ported.

-- Daniel Pommert

email.internet:	pommert@uiuc.edu
email.bitnet:	daniel@uiucvmd

phone:	(217) 333-8629

post:	DCL Rm, 150
	1304 W. Springfield
	Urbana, IL  61801-2987

where:	40  6 47 N Latitude
	88 13 36 W Longitude

From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Fri Apr 27 19:20:34 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA11705; Fri, 27 Apr 90 19:20:34 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA27095; Fri, 27 Apr 90 22:20:30 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Fri, 27 Apr 90 22:20:12 EDT
Date: Fri, 27 Apr 90 22:14:38 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <221656@Wayne-MTS>
Subject: Deletions from the Icon mailing list
Status: O

Would all you people who (unwisely) want to get off the Icon mailing list
please send notice of that to icon-PROJECT, not to icon-GROUP.
(If I'm leading the flocks astray, dear icon-group-person, please let
us know.) - Paul Abrahams

From ralph  Fri Apr 27 19:43:06 1990
Date: Fri, 27 Apr 90 19:43:06 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004280243.AA12512@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA12512; Fri, 27 Apr 90 19:43:06 MST
To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu, icon-group@cs.arizona.edu
Subject: Re:  Deletions from the Icon mailing list
Status: O

The correct e-mail address to use to get on/off the icon-group
mailing list is icon-group-request.  The name is common protocol
used for all such groups.

The problem is that folks can't be expected to know/remember that.

A good alternative is icon-project; we'll handle icon-group changes,
and, at least, mail to icon-project doesn't get resent to hundreds of
addresses all over the world.

In general, if you have something that's not suitable for broadcasting,
send it to icon-project.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

p.s.  Thanks, Paul.

From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Sun Apr 29 13:01:50 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA14826; Sun, 29 Apr 90 13:01:50 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA13598; Sun, 29 Apr 90 16:01:45 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Sun, 29 Apr 90 16:01:27 EDT
Date: Sun, 29 Apr 90 15:45:40 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <221847@Wayne-MTS>
Subject: Co-expressions, symmetric and otherwise
Status: O

 
Rich Goerwitz asked me what the difference was between symmetric and
asymmetric co-expressions.  The answer, in terms of Icon, is simple:
if you disallow the expressions @&source, @&main, and the binary form
e1@e2 (see p. 138 of the Icon book), then you have asymmetric co-expressions.
In other words, you can create a co-expression e, which can then pass
back values at various points of call with suspend (or return--just once).
These values can be picked up by evaluating @e.  So the co-expression
sends values (via suspend), and various callers can retrieve them
(via @).  But there's no way for a caller to pass a value to e once it's been
created except, of course, through globals or other such devices.
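
A minimal sketch of the asymmetric use described above, assuming an
Icon implementation that supports co-expressions.  The procedure names
evens and show are made up for illustration.  The co-expression only
sends values out with suspend, and two different callers pick them up
with @; nothing is ever transmitted to it.

```icon
# Generator procedure: hands back one value per resumption via suspend.
procedure evens(limit)
   suspend 2 to limit by 2
end

# A second caller that drains whatever the co-expression has left.
procedure show(label, e)
   local v
   # @e fails once the co-expression has produced its last value.
   while v := @e do
      write(label, v)
end

procedure main()
   local e
   e := create evens(10)

   write("main: ", @e)      # picks up 2
   write("main: ", @e)      # picks up 4
   show("show: ", e)        # another caller retrieves 6, 8, 10
end
```

The symmetric forms (@&source, @&main, and e1 @ e2) are not needed
anywhere in this sketch.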
 
In my experience the asymmetric form is extremely useful, and it's all that
I've ever needed.  To tell the truth, I never fully fathomed the program on
p. 139.  (How many of you out there have really understood it?)  My suspicion
is that if the facilities of Sec. 13.4.1 were dropped from Icon, no useful
programs would be broken.  To answer Steve Wampler's point about restrictions,
the only restriction needed to limit Icon to asymmetric coexpressions would
be to eliminate those facilities--so it would be pretty simple if the Icon
project wanted to do it.
 
Historically, I think that the symmetric form of coroutine arose because there
was no natural way to make coroutines asymmetric.  The suspend notation
provides such a way.
 
In terms of implementation, the hard part is not so much passing control back
and forth but the storage allocation.  A coexpression requires either heap
allocation or a stack of its own, and if you choose the stack, you either have
to limit its size or make it relocatable from one place to another.  Hence the
references in the literature to cactus stacks, which are what you need to
implement coroutines.  With asymmetric coroutines, there are some interesting
optimization possibilities if the optimizer can discover that a coexpression
only uses a bounded amount of storage.  (This is how it works in SPLASH.)
I don't know whether symmetric co-expressions make this much harder to do.
Coroutines were originally designed for Fortran-like (or assembly) languages
in environments without recursion, so the storage allocation problem was
essentially trivial.
 
Paul Abrahams
abrahams%wayne-mts@um.cc.umich.edu

From goer@sophist.uchicago.EDU  Sun Apr 29 15:45:51 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA23817; Sun, 29 Apr 90 15:45:51 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Sun, 29 Apr 90 15:46 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Sun, 29 Apr 90
 17:45:37 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA02634; Sun, 29 Apr 90
 17:41:26 CDT
Resent-Date: Sun, 29 Apr 90 15:47 MST
Date: Sun, 29 Apr 90 17:41:26 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: hell
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A10F8C79D93FA08D2B@Arizona.EDU>
Message-Id: <9004292241.AA02634@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


    In my experience the asymmetric form is extremely useful, and it's
    all that I've ever needed.  To tell the truth, I never fully
    fathomed the program on p. 139.  (How many of you out there have
    really understood it?)  My suspicion is that if the facilities of
    Sec. 13.4.1 were dropped from Icon, no useful programs would be
    broken.  To answer Steve Wampler's point about restrictions, the
    only restriction needed to limit Icon to asymmetric coexpressions
    would be to eliminate those facilities--so it would be pretty
    simple if the Icon project wanted to do it.

I guess my experience was this:  I learned Icon on my own, and I figured
that anything in "the book" was there because it was important for a
fundamental understanding of the language.  As a result, I fooled with the
example of symmetric coexpressions until I understood what they were used
for.  I'd say that for about two years at least half the programs I wrote
used these symmetric coexpressions.  I've written some very extensive
conversion programs using methods like those outlined on the infamous
p. 139.

I recall being a bit surprised several years ago when David Gudeman ex-
pressed a dislike for coroutines.  Jerry Nowlin also used to jump in and
redo most programs that were posted using coexpressions (still less co-
routines) so that these were not necessary, mumbling something about
speed :-).  My problem is that, although I like using them for some things,
they usually obfuscate my code unless I am careful.

I guess I've gotten away from coroutines, but I'd note that, at least here
on my home machine, they do not cause much of a speed decrease over other
methods.

I would be very, very interested in seeing a short, clean example of a
situation where symmetric coroutines provide at least the most elegant,
if not the only possible, way of doing something.


Another subject (yes, this should have been placed in another posting,
but I post here often enough as it is):  What are Backus-Naur Forms?
I was reading about SNOBOL the other day - a language I am only very
superficially familiar with.  The author of the article I was perusing
stated that BNFs could be translated directly into SNOBOL4 patterns.
First of all, like so many others who use Icon, I am not a computer
scientist.  I do linguistics and text processing mainly.  I have no
idea what BNFs are.  I'd like to know what they are.  Secondly, I'd
enjoy knowing whether the same translation as is possible for SNOBOL
is possible for Icon.

If someone knows the answers to these questions, could he or she
perhaps chime in?

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From ralph  Sun Apr 29 16:08:52 1990
Date: Sun, 29 Apr 90 16:08:52 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004292308.AA25231@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA25231; Sun, 29 Apr 90 16:08:52 MST
To: goer@sophist.uchicago.EDU
Subject: Re:  hell
Cc: icon-group
In-Reply-To: <9004292241.AA02634@sophist.uchicago.edu>
Status: O

Co-expressions are space-intensive, but not time-intensive.  In fact,
activating a co-expression is a bit faster than calling a procedure.

The biggest problem is that co-expressions are not supported in all versions
of Icon, so code that uses them is not portable.

Symmetry/non-symmetry aside, co-expressions as they are in Icon are not
going to be changed.  I can't really see what all the fuss is about;
the symmetry costs relatively little code, it's been done, and you can
ignore the symmetric uses if you don't need them.

As to BNF:  It's a notational system for writing context-free production
grammars.  It was first used in the design of Algol 60 (or possibly
Algol 58).  It's nothing special, but it can be found in many older
programming language texts, used for describing syntax.  There are examples
of BNF grammars in the Icon language book, although they may not be
labeled as such.

There's a fairly simple mapping from BNF (and other CF production grammar
systems) to patterns in SNOBOL4.  There's a similar mapping for Icon,
provided matching procedures are used for the nonterminal symbols.  The
result is a recursive-descent parser with backtracking.  Aside from
efficiency considerations, the main problem is that left-recursion in
productions translates into left recursion in matching.  SNOBOL4 avoids
this by using a length-shortening heuristic.

The mapping is described in the Icon language book.
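
A small sketch of that mapping, under the assumption of a toy grammar
invented here for illustration (it is not taken from the book):

    <expr> ::= <term> + <expr> | <term>
    <term> ::= <digit> | ( <expr> )

Each nonterminal becomes a matching procedure that suspends, so that
backtracking can restore the scanning position.  The productions are
written right-recursively; writing <expr> ::= <expr> + <term> directly
would make expr() call itself without consuming anything.

```icon
# <expr> ::= <term> + <expr> | <term>
procedure expr()
   suspend (term() || ="+" || expr()) | term()
end

# <term> ::= <digit> | ( <expr> )
procedure term()
   suspend tab(any(&digits)) | (="(" || expr() || =")")
end

procedure main()
   local s
   every s := "1+2" | "(1+2)+3" | "1+" do
      # A string is accepted only if some parse consumes all of it.
      if s ? (expr() & pos(0)) then
         write(s, ": accepted")
      else
         write(s, ": rejected")
end
```

The result is a recursive-descent recognizer with backtracking: "1+2"
and "(1+2)+3" are accepted, and "1+" is rejected.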

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From cjeffery  Sun Apr 29 16:19:29 1990
Resent-From: "Clinton Jeffery" <cjeffery>
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA25760; Sun, 29 Apr 90 16:19:29 MST
Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Sun, 29 Apr 90 16:18 MST
Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15)
 via SMTP id AA25635; Sun, 29 Apr 90 16:17:09 MST
Received: by caslon; Sun, 29 Apr 90 16:17:09 mst
Resent-Date: Sun, 29 Apr 90 16:20 MST
Date: Sun, 29 Apr 90 16:17:09 mst
From: Clinton Jeffery <cjeffery@cs.arizona.edu>
Subject: hell
Resent-To: icon-group@cs.arizona.edu
To: goer@sophist.uchicago.EDU
Cc: icon-group@arizona.edu
Resent-Message-Id: <A10AE048433FA08FC7@Arizona.EDU>
Message-Id: <9004292317.AA02545@caslon>
In-Reply-To: Richard Goerwitz's message of Sun, 29 Apr 90 17:41:26 CDT
 <9004292241.AA02634@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: goer@sophist.uchicago.EDU
X-Vms-Cc: icon-group@Arizona.edu
Status: O

Backus Naur Forms (or Backus Normal Forms, and including
"Extended Backus Naur Forms") are one notation for expressing
"Phrase Structure Grammars"--I'm sure the main reason we call
them BNF is to avoid giving linguists the credit where credit
is due.  Well, I am just kidding here folks; computer scientists
use it for historical reasons, but BNF's are essentially PSG's.
Most BNF grammars would translate easily into Icon, as has
been discussed in the past two weeks.

I am going to stay out of the co-routine squabble for the moment.
I think that symmetric co-expressions can defend themselves.
--
| Clint Jeffery, U. of Arizona Dept. of Computer Science
| cjeffery@cs.arizona.edu -or- {noao allegra}!arizona!cjeffery
--

From gudeman  Sun Apr 29 21:36:32 1990
Resent-From: "David Gudeman" <gudeman>
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA12514; Sun, 29 Apr 90 21:36:32 MST
Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Sun, 29 Apr 90 21:37 MST
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA12501; Sun, 29 Apr 90
 21:36:06 MST
Resent-Date: Sun, 29 Apr 90 21:38 MST
Date: Sun, 29 Apr 90 21:36:06 MST
From: David Gudeman <gudeman@cs.arizona.edu>
Subject: hell
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A0DE8DB29D9FA09240@Arizona.EDU>
Message-Id: <9004300436.AA12501@megaron.cs.arizona.edu>
In-Reply-To: Richard Goerwitz's message of Sun, 29 Apr 90 17:41:26 CDT
 <9004292241.AA02634@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

   From: Richard Goerwitz <goer@sophist.uchicago.EDU>

   I recall being a bit surprised several years ago when David Gudeman ex-
   pressed a dislike for coroutines.

Well..., more accurately I _prefer_ to use recursive procedures.  On
the other hand, there are certainly some problems that are more
conveniently solved with coroutines.  My preference for recursive
procedures is largely based on the fact that _I_ find them easier to
understand than I do coroutines.  I can rationalize this rather
selfish attitude by noting that this is probably typical: it is likely
that most people find procedures easier to understand.  So if your
code is going to be read by others, it is probably best not to go
looking for places to use coroutines.

People generally learn to use recursive procedures long before they
ever hear of coexpressions, and (like me) they are too lazy to spend a
lot of time with a new construction when most problems can be
adequately solved without it.  In some ways this is similar to the
resistance people have against learning a new programming language.

We are probably poorer for our specialization, and given your unusual
experience in learning Icon, you surely have some unique perspectives
to contribute.  In particular, it is interesting to see how coroutines
are used by someone who never developed a strong prejudice in favor of
procedures.

From @mirsa.inria.fr:ol@cerisi.cerisi.Fr  Sun Apr 29 23:05:03 1990
Received: from mirsa.inria.fr by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA17108; Sun, 29 Apr 90 23:05:03 MST
Received: from cerisi.cerisi.fr by mirsa.inria.fr with SMTP
	(5.59++/IDA-1.2.8) id AA08681; Mon, 30 Apr 90 08:05:11 +0200
Message-Id: <9004300605.AA08681@mirsa.inria.fr>
Date: Mon, 30 Apr 90 08:03:44 -0100
Posted-Date: Mon, 30 Apr 90 08:03:44 -0100
From: Lecarme Olivier <ol@cerisi.cerisi.Fr>
To: icon-group@cs.arizona.edu
In-Reply-To: "Ralph Griswold"'s message of Sun, 29 Apr 90 16:08:52 MST <9004292308.AA25231@megaron.cs.arizona.edu>
Subject:  hell
Status: O

A point of history: BNF was first used in the description of Algol 60
(Algol 58, named IAL (for International Algorithmic Language) when it
was first designed, used a notation similar to that used for Fortran).
It's called Backus Normal Form because it was first used in a draft
document by John Backus. It is also known as Backus-Naur Form, because
its final form, used in the "Report on the Algorithmic Language
ALGOL 60", was designed by Peter Naur.

Original BNF had flaws: for example, terminal symbols were not specially
quoted, unlike non-terminal symbols; there was no delimiter
between rules; meta-symbols of the notation could not be used in the
language being described. The most popular notation presently is
probably EBNF (Extended Backus-Naur Form), designed by Niklaus Wirth,
which corrects these flaws and adds some more meta-operators for
expressing frequent cases (optional parts, repetitions, and so on).


			    Olivier Lecarme

From goer@sophist.uchicago.EDU  Mon Apr 30 06:59:28 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA06121; Mon, 30 Apr 90 06:59:28 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 30 Apr 90 07:00 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 30 Apr 90
 08:59:54 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03346; Mon, 30 Apr 90
 08:55:43 CDT
Resent-Date: Mon, 30 Apr 90 07:01 MST
Date: Mon, 30 Apr 90 08:55:43 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: BNFs
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A08FE708BD1FA091AF@Arizona.EDU>
Message-Id: <9004301355.AA03346@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

It really is a shame that there has to be this sort of terminological
fragmentation.  I see now that Backus Naur Forms (and EBNFs) are really
just a notational device for expressing context free grammars.  Good-
ness.  What a fuss :-)!  As one poster pointed out, linguists have been
using their own practical notation for context free grammars for some
time now.  Prolog is one language that implements this notation in the
form of its definite clause grammars (which in reality are slightly more
powerful than context free grammars).  I suppose it's all a matter of
which tradition you "grew up" in.

In a previous posting, I showed how left recursion occurs in English.
There are also lots of context-dependent rules, such as verb-noun agree-
ment.  You can't use a context free phrase structure grammar, for in-
stance, to recognize a sentence that goes -

           The woman loves me.
           The woman I married loves me.
           The woman I married, who won't love me if my son and I forget
             Mother's Day, loves me.

There are two problems here:  1) We have to get the verb phrase - something
that must be defined separately from the noun phrase - to know about what
is going on in the noun phrase, namely that it is singular (hence "loves" and
not "love").  The other problem (2) is what I mentioned before, namely left
recursion in the grammar.  Another thing to consider is the arbitrary complexity
of the noun phrase, and the theoretically unlimited distance that separates
the main noun of that phrase (above = "woman") from the verb phrase ("loves
me").

Phrase structure grammars, BNFs, regular expressions, definite clause grammars
without indexing - whatever you happen to call them (they are all context free)
- are, to the linguist, not of most central interest.

As I hinted at above, the problem of the verb knowing what the noun before it
is doing number-wise can be solved.  In Prolog you do it using indexed gram-
mars.  In Icon, the solution looks pretty straightforward, even elegant.
Just have your matching procedures return a value.  This has the effect of
allowing nodes to have labels.  I'd tend to want to create a list, and then
have each node, if it wishes, simply return a value, which is then "put" into
a list and returned (or put into another list by the calling node, or whatever).
Anyway, all you would have to do is make sure that neither the noun phrase
nor the verb phrase conflict as to number (in this case, I'd just check to
be sure that, if the one is marked "singular," the other is as well).
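
A minimal sketch of matching procedures that return a label, assuming a
tiny vocabulary and the hypothetical procedure names word, noun, verb,
np, vp, and sentence.  It is simpler than the list-building scheme
described above: each procedure just produces the string "singular" or
"plural", and the sentence rule compares the two labels.

```icon
# Match the word s, followed by a blank or the end of the subject.
procedure word(s)
   suspend =s & (=" " | pos(0))
end

# Lexical rules: the value produced is the number label, not the text.
procedure noun()
   suspend (word("woman") & "singular") | (word("husbands") & "plural")
end

procedure verb()
   suspend (word("loves") & "singular") | (word("love") & "plural")
end

# <np> ::= the <noun>; the phrase takes the number of its noun.
procedure np()
   suspend word("the") & noun()
end

# <vp> ::= <verb> <np>; the phrase takes the number of its verb,
# and the object noun phrase may have either number.
procedure vp()
   local n
   suspend (n := verb()) & np() & n
end

# <s> ::= <np> <vp>, with the extra check that the two labels agree.
procedure sentence()
   suspend np() == vp()
end

procedure main()
   local s
   every s := "the woman loves the husbands" |
              "the woman love the husbands" |
              "the husbands love the woman" do
      if s ? (sentence() & pos(0)) then
         write(s, ": ok")
      else
         write(s, ": number conflict or no parse")
end
```

Only the second sentence is reported as a conflict, because np()
produces "singular" while vp() produces "plural".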

Again, I'd really like to see some comments by someone who has studied the
syntax of natural languages in more detail than I.  It just seemed useful
to provide a little input from linguistics.  It will be useful in the long
run, I think, if I keep reminding computer scientists that their work has
implications in closely related fields.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From @RELAY.CS.NET,@dg-rtp.rtp.dg.com:langley@DG-RTP.DG.COM  Mon Apr 30 08:05:02 1990
Received: from relay.cs.net by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA11625; Mon, 30 Apr 90 08:05:02 MST
Received: from dg-rtp.rtp.dg.com by RELAY.CS.NET id aa23732; 30 Apr 90 11:01 EDT
Received: from bigbird.rtp.dg.com by dg-rtp.dg.com (4.20/4.7)
	id AA17366; Mon, 30 Apr 90 10:59:37 edt via SMTP
Received: by bigbird.rtp.dg.com (4.20/rtp-s01)
	id AA06810; Mon, 30 Apr 90 11:00:37 edt
Date: Mon, 30 Apr 90 11:00:37 edt
From: Mark L Langley <langley@DG-RTP.DG.COM>
Message-Id: <9004301500.AA06810@bigbird.rtp.dg.com>
Return-Receipt-To: langley@dg-rtp.dg.com
To: icon-group@cs.arizona.edu
Subject: linguism
Status: O

First, I just wanted to say that I have really enjoyed the linguistic-
related postings.

This group is quite a melting pot of Computer Scientists, Researchers,
and thinkers-in-general, centered around the unlikely vehicle of Icon.
(Now back to the show....)

Richard Goerwitz remarked
> 
> It really is a shame that there has to be this sort of terminological
> fragmentation.  I see now that Backus Naur Forms (and EBNFs) are really
> just a notational device for expressing context free grammars.  Good-
> ness.  What a fuss :-)!  As one poster pointed out, linguists have been
> using their own practical notation for context free grammars for some
> time now.  Prolog is one language that implements this notation in the
> form of its definite clause grammars (which in reality are slightly more
> powerful than context free grammars).  I suppose it's all a matter of
> which tradition you "grew up" in.

Can you say a little about the linguistic convention for describing
cfg-s?  What are its advantages? Does it submit more readily to being
manipulated by a program?  I once wrote Icon (what else?) programs to 
manipulate a cfg, by putting it through its paces/transformations between 
normal forms, factoring left recursion, et al.

I would be interested in seeing an alternate representation than BNF
that might offer conceptual improvements.

> 
>            The woman loves me.
>            The woman I married loves me.
>            The woman I married, who won't love me if my son and I forget
>              Mother's Day, loves me.
> 
> There are two problems here:  1) We have to get the verb phrase - something
> that must be defined separately from the noun phrase - to know about what
> is going on in the noun phrase, namely that it is singular (hence "loves" and
> not "love."  The other problem (2) is what I mentioned before, namely left re-
> cursion in the grammar.  Another thing to consider is the arbitrary complexity
> of the noun phrase, and the theoretically unlimited distance that separates
> the main noun of that phrase (above = "woman") from the verb phrase ("loves
> me").
> 
> Phrase structure grammars, BNFs, regular expressions, definite clause grammars
> without indexing - whatever you happen to call them (they are all context free)
> - are, to the linguist, not of most central interest.
> 
> As I hinted at above, the problem of the verb knowing what the noun before it
> is doing number-wise can be solved.  In Prolog you do it using indexed gram-
> mars.  In Icon, the solution looks pretty straightforward, even elegant.
> Just have your matching procedures return a value.  This has the effect of
> allowing nodes to have labels.  I'd tend to want to create a list, and then
> have each node, if it wishes, simply return a value, which is then "put" into
> a list and returned (or put into another list by the calling node, or whatever).
> Anyway, all you would have to do is make sure that neither the noun phrase
> nor the verb phrase conflict as to number (in this case, I'd just check to
> be sure that, if the one is marked "singular," the other is as well).

This looks like what we compiler wonks do when we check non-syntactic
things (like whether something is declared or not...) during the translation
process.  Is there a better (more fully encompassing) formalism here?  That
is, syntax-with-ad-hoc-checking is highly impure...

> 
> Again, I'd really like to see some comments by someone who has studied the
> syntax of natural languages in more detail than I.  It just seemed useful
> to provide a little input from linguistics.  It will be useful in the long
> run, I think, if I keep reminding computer scientists that their work has
> implications in closely related fields.

So would I...

Has anyone successfully parsed
	"The policeman raised his hand and stopped the car"
		(Courtesy of R. Schank.)
	
>     -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
>     goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer
> 
> 

Mark
langley@dg-rtp.dg.com

From sboisen@BBN.COM  Mon Apr 30 09:17:00 1990
Message-Id: <9004301617.AA16804@megaron.cs.arizona.edu>
Received: from RIGEL.BBN.COM by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA16804; Mon, 30 Apr 90 09:17:00 MST
To: langley@dg-rtp.dg.com
Cc: icon-group@cs.arizona.edu
In-Reply-To: Mark L Langley's message of Mon, 30 Apr 90 11:00:37 edt <9004301500.AA06810@bigbird.rtp.dg.com>
Subject: linguism
From: Sean Boisen <sboisen@BBN.COM>
Sender: sboisen@BBN.COM
Reply-To: sboisen@BBN.COM
Date: Mon, 30 Apr 90 12:15:00 EDT
Status: O

> Can you say a little about the linguistic convention for describing
> cfg-s?  What are its advantages? Does it submit more readily to being
> manipulated by a program?  I once wrote Icon (what else?) programs to 
> manipulate a cfg, by putting it through its paces/transformations between 
> normal forms, factoring left recursion, et al.
> 
> I would be interested in seeing an alternate representation than BNF
> that might offer conceptual improvements.
> 

As someone working in Natural Language Processing (NLP), I'll jump
into the fray:

There really is no single linguistic convention for representing CFGs,
and it's not at all clear that CFGs are sufficiently powerful for
representing NL (one classic case for this argument is a construction
like "John, Bill, and Fred love Mary, Sally, and Julie,
respectively"). Most current linguistic theories use grammars whose
power is somewhere between context-free and context-sensitive,
inclusive. In addition to strictly parsing, there are also the
problems of building semantic representations (since it usually
doesn't do much good to simply represent the structure of a sentence:
you want to know what it *means*). 

Note to Richard Goerwitz: left-recursion is only a problem if you are
parsing top-down, and that's not a foregone conclusion. In fact, many
very good NLP systems use bottom-up parsing for independent reasons. 

> This looks like what we compiler wonks do when we check non-syntactic
> things (like whether something is declared or not...) during the translation
> process.  Is there a better (more fully encompassing) formalism here?  That
> is, syntax-with-ad-hoc-checking is highly impure...
> 

One popular approach these days is unification-based formalisms, where
the agreement checking is at least not quite so ad hoc. DCGs under
Prolog are a well-known instance, although one can also do unification
in many other languages (we use a unification-based formalism in
Lisp). 

> Has anyone successfully parsed
> 	"The policeman raised his hand and stopped the car"
> 		(Courtesy of R. Schank.)

This isn't really a parsing problem (this sentence is pretty clearly a
conjoined verb phrase with a single subject noun phrase), but a
problem of semantics: the raised hand "primes" one to think that he
stopped it by contacting the car, but the pragmatics of this don't
work. We don't work on any traffic domains :-) but this doesn't seem
all that problematic to the generally-naive semantics of most of
today's NLP systems. 

Hope this is helpful.

........................................
Sean Boisen -- sboisen@bbn.com
BBN Systems and Technologies Corporation, Cambridge MA
Disclaimer: these opinions void where prohibited by lawyers.

From goer@sophist.uchicago.EDU  Mon Apr 30 09:32:04 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA17729; Mon, 30 Apr 90 09:32:04 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 30 Apr 90 09:33 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 30 Apr 90
 11:32:14 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03522; Mon, 30 Apr 90
 11:28:01 CDT
Resent-Date: Mon, 30 Apr 90 09:33 MST
Date: Mon, 30 Apr 90 11:28:01 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: encompassing formalism (stealing from Prolog)
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A07A95FC09FFA0900F@Arizona.EDU>
Message-Id: <9004301628.AA03522@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

	Can you say a little about the linguistic convention for describing
	cfg-s?  What are its advantages? Does it submit more readily to being
	manipulated by a program?

You know, I should throw in here that linguists are pretty bad about notation
as well.  One of the tremendous annoyances a researcher in a specific language
field runs into is the proliferation of formalisms to represent grammars of
various kinds.  The most prominent methods are two.  The first is a notation
for context-sensitive grammars.  For instance, in Akkadian, you have a rule
which deletes short vowels in open syllables if preceded by another syllable
with a short vowel and followed by another syllable (below, V: is a long vowel,
while V is short):

     V -> 0 / CVC_CV

I.e. a short vowel, V, goes to nut'n (zero) in the context consonant, vowel,
consonant, ___, consonant, vowel (where the blank is our vowel).  Linguists
try to keep one symbol on the right, thus preventing the notation from lean-
ing towards an unrestricted rewrite system.
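
A rough sketch of how the rule V -> 0 / CVC_CV could be applied to a
string in Icon.  Everything beyond the rule itself is an assumption
made for illustration: that a, e, i, u stand for the short vowels, that
every other lowercase letter is a consonant, that long vowels are not
present in the input, and that the rule is applied in one left-to-right
pass against the original string.  The names deletable and syncope and
the test word are likewise only examples.

```icon
# Succeeds if the character at position i of w is a short vowel
# standing in the context CVC_CV.
procedure deletable(w, i, v, c)
   return i > 3 & i + 2 <= *w &
          any(c, w, i - 3) & any(v, w, i - 2) & any(c, w, i - 1) &
          any(v, w, i) &
          any(c, w, i + 1) & any(v, w, i + 2)
end

# Copy w, leaving out every vowel that the rule deletes.
procedure syncope(w)
   local v, c, out, i
   v := 'aeiu'              # assumption: these letters are the short vowels
   c := &lcase -- v         # assumption: all other lowercase letters are consonants
   out := ""
   every i := 1 to *w do
      if not deletable(w, i, v, c) then
         out ||:= w[i]
   return out
end

procedure main()
   write(syncope("damiqum"))    # the i stands in dam_qu, so this writes damqum
   write(syncope("dam"))        # too short for the context; written unchanged
end
```

The context test looks only at the original string, so a deletion never
feeds or blocks a later one in this sketch.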

The other formalism is more process-oriented (hence it is ironic that Prolog
is the main language which implements it).  You say -

      S -> NP VP

and what not (i.e. "a sentence may be broken down into a noun phrase and a
verb phrase," or, depending on your model, "a sentence consists of" [more
like a definition]).  Anyway, it's pretty much like the Backus-Naur
formalism, though it is less well-defined, and tends to get modified ad hoc
by everyone who uses it.

Particularly interesting for us here is the way Prolog implements this for-
malism.  Assuming the Prolog you use implements definite clause grammar no-
tation, you can say,

      s --> np(Number), vp(Number).
      np(Number) --> det, n(Number).
      vp(Number) --> v(Number), np(_).
      det --> [the].
      n(singular) --> [woman].
      n(plural) --> [husbands].
      v(singular) --> [loves].

etc.  You get the idea.  I don't see any reason that this couldn't be
implemented EASILY in Icon.  And Icon has some neat advantages over Prolog,
such as very good string handling.  There needs to be some research on just
how far these indexed grammars can represent natural languages.  They are
kind of an intermediate creature on the hierarchy of grammar types -
something that has not really been studied a great deal.

Recently a formalism called PATR has been developed.  This formalism has been
implemented in Prolog and Lisp.  PATR is an extension, so far as I can see,
of this definite clause notation we see in Prolog.  I have toyed with doing
it in Icon.  The question is whether to create a PATR -> Icon translator
that outputs code that must be translated and linked, or whether it should
permit the user to "consult" a database at run-time.

In my find_re posting (which, incidentally had yet a few small bugs, since
after all it *was* version 0.9; it is very usable, though, and I hope that
people test it) I adopted the run-time approach.  Basically I just did what
regex does - it compiles a string representation of a finite state transi-
tion network into an automaton, stored in a small table (and which can be
eval()'d any time).  I dunno what would be best with PATR.  Both facilities
would be nice.  Icon is a very good language for natural language proces-
sing, and I would like very much to see it gain greater popularity in a
field now dominated by (lots (of (and (extremely annoying) (unintuitive))
parentheses)).





  I once wrote Icon (what else? programs to 
	manipulate a cfg, by putting it through its paces/transformations between 
	normal forms, factoring left recursion, et al.
	
	I would be interested in seeing an alternate representation...
	
	> 
	>            The woman loves me.
	>            The woman I married loves me.
	>            The woman I married, who won't love me if my son and I forget
	>              Mother's Day, loves me.
	> 
	> There are two problems here:  1) We have to get the verb phrase - something
	> that must be defined separately from the noun phrase - to know about what
	> is going on in the noun phrase, namely that it is singular (hence "loves" and
	> not "love").  The other problem (2) is what I mentioned before, namely left
	> recursion in the grammar.  Another thing to consider is the arbitrary complexity
	> of the noun phrase, and the theoretically unlimited distance that separates
	> the main noun of that phrase (above = "woman") from the verb phrase ("loves
	> me").
	> 
	> Phrase structure grammars, BNFs, regular expressions, definite clause grammars
	> without indexing - whatever you happen to call them (they are all context free)
	> - are, to the linguist, not of most central interest.
	> 
	> As I hinted at above, the problem of the verb knowing what the noun before it
	> is doing number-wise can be solved.  In Prolog you do it using indexed gram-
	> mars.  In Icon, the solution looks pretty straightforward, even elegant.
	> Just have your matching procedures return a value.  This has the effect of
	> allowing nodes to have labels.  I'd tend to want to create a list, and then
	> have each node, if it wishes, simply return a value, which is then "put" into
	> a list and returned (or put into another list by the calling node, or whatever).
	> Anyway, all you would have to do is make sure that neither the noun phrase
	> nor the verb phrase conflict as to number (in this case, I'd just check to
	> be sure that, if the one is marked "singular," the other is as well).
	
	This looks like what we compiler wonks do when we check non-syntactic
	things (like whether something is declared or not...) during the translation
	process.  Is there a better (more fully encompassing) formalism here?  That
	is, syntax-with-ad-hoc-checking is highly impure...
	
	> 
	> Again, I'd really like to see some comments by someone who has studied the
	> syntax of natural languages in more detail than I.  It just seemed useful
	> to provide a little input from linguistics.  It will be useful in the long
	> run, I think, if I keep reminding computer scientists that their work has
	> implications in closely related fields.
	
	So would I...
	
	Has anyone successfully parsed
		"The policeman raised his hand and stopped the car"
			(Courtesy of R. Schank.)
		
	>     -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
	>     goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer
	> 
	> 
	
	Mark
	

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From ralph  Mon Apr 30 09:48:30 1990
Date: Mon, 30 Apr 90 09:48:30 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9004301648.AA18532@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA18532; Mon, 30 Apr 90 09:48:30 MST
To: icon-group
Subject: Version 8 of Icon for personal computers
Status: O

Version 8 of Icon for personal computers is now available.

In addition to Version 8 for MS-DOS, announced earlier, there are
now implementations for the Amiga, the Atari ST, and the Macintosh (under MPW).

There are two packages for each computer -- one contains executable binary
files and the other contains source code. 1MB of RAM is about the minimum for
successful use.

Version 8 of Icon for these computers can be obtained by anonymous FTP
to cs.arizona.edu. After connecting, cd /icon/v8. Get READ.ME there for more
information.

If you do not have FTP access or prefer to obtain diskettes and printed
documentation, Version 8 of Icon for the computers listed above can be
ordered from:

	Icon Project
	Department of Computer Science
	Gould-Simpson Building
	The University of Arizona
	Tucson, AZ   85721

	602 621-2018 (voice)
	602 621-4246 (FAX)

Specify whether you want executable binaries, source code, or both.

The packages are $15 each, payable in US dollars to The University of Arizona
with a check written on a bank in the United States.  Orders also can be
charged to MasterCard or Visa.  The price includes shipping by parcel post
in the United States, Canada, and Mexico. Add $5 per package for air mail
delivery to other countries.

Please direct any questions to me, not to icon-project or icon-group.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph


From reid@ctc.contel.COM  Mon Apr 30 10:47:52 1990
Resent-From: reid@ctc.contel.COM
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA23971; Mon, 30 Apr 90 10:47:52 MST
Received: from ctc.contel.com (turing.ctc.contel.com) by Arizona.EDU; Mon, 30
 Apr 90 10:49 MST
Received: from demo360.ctc.contel.com by ctc.contel.com (4.0/SMI-4.0) id
 AA04565; Mon, 30 Apr 90 13:47:19 EDT
Resent-Date: Mon, 30 Apr 90 10:49 MST
Date: Mon, 30 Apr 90 13:47:19 EDT
From: reid@ctc.contel.COM
Subject: RE:  grammars, questions
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A06FFFB52ADFA095D1@Arizona.EDU>
Message-Id: <9004301747.AA04565@ctc.contel.com>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

> > It appears that Icon has no trouble representing
> > context free languages, with the restriction that there can be no left-
> > recursion in the grammar.
> 
> Certain special cases of left-recursion can be converted into looping.
> Unfortunately, this bounds backtracking so you have to know that
> backtracking won't find other solutions.
> 
> For example, the production
> 
> s ::=  t  |  s "+" t
> 
> can be parsed with a procedure something like
> 
> procedure s()
>    x := []
>    push(x, t()) | fail
>    while ="+" do
>       push(x, t()) | stop("syntax error")
>    return x
> end
> 

Look at converting your LL1-style BNF to extended BNF (EBNF).  Three nice
things happen:

1) your grammar is shorter and much more readable, and no left recursion
   is needed;

2) the implementing procedure for a nonterminal is quite
   straightforward; and

3) adding attributes and semantic actions is much easier.
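
As a rough sketch of point 2 (not a definitive implementation): the
left-recursive production s ::= t | s "+" t can be written in EBNF as
s ::= t { "+" t }, and the repetition braces map directly onto a while
loop.  The names parse_s and parse_t, and the choice of digit strings
for t, are assumptions made only for this illustration.

```icon
procedure main()
   local line, terms
   every line := "1+22+3" | "7" | "1+" do
      if terms := (line ? parse_s()) then {
         writes(line, " ->")
         every writes(" ", !terms)
         write()
      }
      else write(line, " -> no parse")
end

# s ::= t { "+" t }
# The braces in the EBNF become the while loop; no left recursion is needed.
procedure parse_s()
   local terms
   terms := [parse_t()] | fail
   while ="+" do
      put(terms, parse_t()) | fail
   if pos(0) then return terms
   # flowing off the end of a procedure fails, which rejects trailing junk
end

# t ::= digit { digit }    (hypothetical terminal, chosen for brevity)
procedure parse_t()
   return tab(many(&digits))
end
```

The list built by put() is where an attribute or semantic action would
go; summing the terms instead of collecting them is a one-line change.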

Tom.

Thomas F. Reid, Ph. D.                   (703)818-4505 (work)
Contel Technology Center                 (703)742-8720 (home)
15000 Conference Center Drive            Net: reid@ctc.contel.com
P.O. Box 10814  
Chantilly, Va.  22021-3808

From goer@sophist.uchicago.EDU  Mon Apr 30 12:28:33 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA02825; Mon, 30 Apr 90 12:28:33 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 30 Apr 90 12:29 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 30 Apr 90
 14:28:40 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03997; Mon, 30 Apr 90
 14:24:28 CDT
Resent-Date: Mon, 30 Apr 90 12:30 MST
Date: Mon, 30 Apr 90 14:24:28 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: clarification requested
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <A061ED019A3FA087A6@Arizona.EDU>
Message-Id: <9004301924.AA03997@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


# It has been asked of me that I clarify what I mean by implementing
# indexed grammars like Prolog's in Icon.  What follows here is a
# very simple example.  Note how Prolog has the advantage of either
# producing or recognizing.  Icon, however, has the advantage of
# providing an immediate interface with the outside world (no need
# for tokenizing or for giving input like :- s([the husband...],[])).
# Please don't anyone gripe about how incomplete the grammars are,
# or about the fact that the Prolog and Icon code are not exactly
# equivalent!  Maybe someone will kindly offer us a DCG -> Icon con-
# verter.
#
# Naturally, the Prolog is much shorter.  This is the sort of thing
# it was designed to do.
#
#   s --> np(Number), vp(Number).
#   np(Number) --> det, n(Number).
#   vp(Number) --> v(Number), np(_).
# 
# % I get so tired of "man and wife"; let's try "woman and husband" :-)
#
#   n(singular) --> [woman].
#   n(plural) --> [husbands].
#   n(plural) --> [women].
#   n(singular) --> [husband].
#   v(singular) --> [loves].
#   v(singular) --> [hates].
#   v(plural) --> [hate].
#   v(plural) --> [love].
#   det --> [the].

procedure main()
    while input_line := trim(map(!&input),',.?!')
    do write(input_line ? S())
end

procedure S()
    NP() == VP() &
    pos(0) &
    (return "yes")
    return "no"
end

procedure NP()
    DET() &
    tag := N() &
    (suspend tag)
end

procedure VP()
    tag := V() &
    NP() | &null &
    (suspend tag)
end

procedure DET()
    ="the" &
    =" " | &null &
    suspend
end

procedure N()
    suspend Nsing() | Nplur()
end

procedure Nsing()
    wordlst := ["husband","woman"]
    =!wordlst &
    =" " | &null &
    (suspend "singular")
end

procedure Nplur()
    wordlst := ["husbands","women"]
    =!wordlst &
    =" " | &null &
    (suspend "plural")
end

procedure V()
    suspend Vsing() | Vplur()
end

procedure Vsing()
    wordlst := ["loves","hates"]
    =!wordlst &
    =" " | &null &
    (suspend "singular")
end

procedure Vplur()
    wordlst := ["love","hate"]
    =!wordlst &
    =" " | &null &
    (suspend "plural")
end

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From reid@ctc.contel.com  Mon Apr 30 14:16:46 1990
Received: from turing.ctc.contel.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA11520; Mon, 30 Apr 90 14:16:46 MST
Received: from demo360.ctc.contel.com by ctc.contel.com (4.0/SMI-4.0)
	id AA05049; Mon, 30 Apr 90 17:16:55 EDT
Date: Mon, 30 Apr 90 17:16:55 EDT
From: reid@ctc.contel.com (Tom Reid  x4505)
Message-Id: <9004302116.AA05049@ctc.contel.com>
To: goer@sophist.uchicago.edu, icon-group@cs.arizona.edu
Subject: RE:  grammars, questions
Cc: reid@ctc.contel.com
Status: O

> From goer@sophist.uchicago.edu Mon Apr 30 14:58:46 1990
> From: Richard Goerwitz <goer@sophist.uchicago.edu>
> To: reid@ctc.contel.com
> Subject: RE:  grammars, questions
> 
> What are LL1-style BNFs?  It's the LL1 that I don't understand.
> You don't need to post this to the group, unless you want to
> approach this as an "in case not everyone knows what we're talk-
> ing about, here's some background" type posting.
> 
>     -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
>     goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer
> 

Richard (and others):

Sorry about that.  LL1 grammars are a subset of context-free grammars (cfgs)
that arose in (automatic) parser construction.  There are two kinds of
classical parsers: bottom-up and top-down.  Bottom-up spawned the LR, LALR,
etc. cfg-subset grammars as the largest languages recognized by 
particular kinds of pushdown automata (PDA) which recognized in linear time 
and space.  

In order to design a top-down predictive, recursive descent parser, you 
needed to restrict cfgs to what was termed LL1 grammars.  The two basic 
restrictions were that no production could have left recursion and that you 
could not have common prefixes.  The reason for both is in the following 
example.

Assume that 
	A ::= q1
	A ::= q2
	   ...
	A ::= qn
are A's productions in a grammar G.  In order for G to be LL1 (and thus have
an unambiguous, non-backtracking recursive descent parser), the FIRST
sets of q1, q2, ..., qn must be disjoint (i.e., in order to avoid
backtracking, the PDA must be able to uniquely choose which A-production
to apply by looking at just the next token).  Unless the language is
trivial, the FIRST set of the left recursive production A ::= A ...
would contain all the other FIRST symbols.

Oh yes, the FIRST set for a production is the set of all terminal symbols
which can begin a string derived from that production's right-hand side.
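
The disjointness test can be sketched in Icon roughly as follows.  This
is only an illustration under stated assumptions: the grammar is held in
a table mapping each nonterminal to a list of productions, there are no
empty productions, and the names first_sets and first_of are made up for
the example.  The sample grammar is the left-recursive s ::= t | s "+" t
from earlier in this thread.

```icon
procedure main()
   local g, first, nt, alt, f, seen, clash
   g := table()
   g["s"] := [["t"], ["s", "+", "t"]]        # s ::= t | s "+" t
   g["t"] := [["id"], ["(", "s", ")"]]       # t ::= "id" | "(" s ")"
   first := first_sets(g)
   every nt := key(g) do {
      seen := set()
      clash := &null
      every alt := !g[nt] do {
         f := first_of(alt[1], g, first)
         if *(seen ** f) > 0 then clash := 1   # shared FIRST symbol
         seen ++:= f
      }
      if \clash then write(nt, ": FIRST sets overlap, so not LL1")
      else write(nt, ": FIRST sets of the alternatives are disjoint")
   }
end

# Compute FIRST for every nonterminal by iterating until nothing changes.
# A fixed point is used so that left recursion cannot cause endless recursion.
procedure first_sets(g)
   local first, nt, alt, before, changed
   first := table()
   every first[key(g)] := set()
   changed := 1
   while \changed do {
      changed := &null
      every nt := key(g) do
         every alt := !g[nt] do {
            before := *first[nt]
            first[nt] ++:= first_of(alt[1], g, first)
            if *first[nt] > before then changed := 1
         }
   }
   return first
end

# FIRST of a single symbol: a terminal is its own FIRST set.
procedure first_of(sym, g, first)
   if \g[sym] then return first[sym]
   return set([sym])
end
```

For this grammar the two alternatives of s both begin with whatever can
begin t, so s is reported as overlapping, while the alternatives of t
are disjoint.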


From @RELAY.CS.NET,@dg-rtp.rtp.dg.com:langley@DG-RTP.DG.COM  Tue May  1 05:51:44 1990
Received: from relay.cs.net by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA09247; Tue, 1 May 90 05:51:44 MST
Received: from dg-rtp.rtp.dg.com by RELAY.CS.NET id aa16747; 1 May 90 8:51 EDT
Received: from bigbird.rtp.dg.com by dg-rtp.dg.com (4.20/4.7)
	id AA28301; Tue, 1 May 90 08:49:18 edt via SMTP
Received: by bigbird.rtp.dg.com (4.20/rtp-s01)
	id AA19283; Tue, 1 May 90 08:50:20 edt
Date: Tue, 1 May 90 08:50:20 edt
From: Mark L Langley <langley@DG-RTP.DG.COM>
Message-Id: <9005011250.AA19283@bigbird.rtp.dg.com>
Return-Receipt-To: langley@dg-rtp.dg.com
To: icon-group@cs.arizona.edu
Subject: Challenge! & more on grammars, questions
Status: O

<the <Challenge> is at the end, skip ahead if you like...>

Richard Goerwitz asks
> > 
> > What are LL1-style BNFs?  
> 
> Richard (and others):
To which Tom Reid replied
> 
> Sorry about that.  LL1 grammars are a subset of context-free grammars (cfgs)
> that arose in (automatic) parser construction.  There are two kinds of
> classical parsers: bottom-up and top-down.  Bottom-up spawned the LR, LALR,
> etc. cfg-subset grammars as the largest languages recognized by 
> particular kinds of pushdown automata (PDA) which recognized in linear time 
> and space.  
> 
If I may add a little,...

LL(1) refers to LEFT-scan-of-input (i.e. reading left to right), producing
a LEFTmost derivation, using at most ONE token of lookahead.  Thus LR(1)
means a left-to-right scan producing a rightmost derivation (in reverse).
LALR(1) means Look-ahead LR(1), which is a technique for reducing LR
parsers, which are linear (though huge), to something a lot smaller.  (The
Icon parser is written in YACC, which is an LALR parser generator.)  LALR(1)
is theoretically less powerful than LR(1), but I have never found a grammar
I couldn't rewrite.

LL parsing is the same thing as recursive descent parsing.  It is generally 
thought to be more intuitive -- you can think about an LL parser as always
making forward progress by consuming one token per state.  LR parsing detects
errors as soon as is possible.  (i.e. the fewest number of tokens that
can't be something legitimate are flagged, whereas LL parsers may kick
around and not report an error right away.)

While LL parsers can't handle left recursion, LR parsers, by contrast,
don't like right recursion.  It tends to overflow the internal pushdown
stack.  For example, matching parentheses using right recursion in LR
is bad.  Therefore in LR and LALR you should rewrite your rules to be 
left recursive.  This is usually a mechanical process, but not always.
(Consider what happens if you are expecting some action to take place
at the same time a production is matched.)

<Challenge>

There is a well-known theorem (I couldn't find it) that states that any
LL(k) grammar can be rewritten as an LL(1) grammar.  This is easy to see
because you can keep "left-factoring" productions.

Can you rewrite an arbitrary LR(k) grammar as an LR(1) grammar?

I have yet to find an LR(k) grammar that I couldn't rewrite, but I haven't
successfully proven the theorem either...  But I'm not a bright 
theoretician...

Anybody?
Mark

From @mirsa.inria.fr:ol@cerisi.cerisi.Fr  Tue May  1 11:04:23 1990
Received: from mirsa.inria.fr by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA28571; Tue, 1 May 90 11:04:23 MST
Received: from cerisi.cerisi.fr by mirsa.inria.fr with SMTP
	(5.59++/IDA-1.2.8) id AA04105; Tue, 1 May 90 20:05:37 +0200
Message-Id: <9005011805.AA04105@mirsa.inria.fr>
Date: Tue, 1 May 90 20:03:23 -0100
Posted-Date: Tue, 1 May 90 20:03:23 -0100
From: Lecarme Olivier <ol@cerisi.cerisi.Fr>
To: langley@DG-RTP.DG.COM
Cc: icon-group@cs.arizona.edu
In-Reply-To: Mark L Langley's message of Tue, 1 May 90 08:50:20 edt <9005011250.AA19283@bigbird.rtp.dg.com>
Subject: Challenge! & more on grammars, questions
Status: O

The theorem that for every LR(k) grammar with k>1 there exists an
equivalent LR(1) grammar is only stated by Waite & Goos (Compiler
construction, Springer-Verlag 1984), but it is demonstrated by Aho &
Ullman (The theory of parsing, translation, and compiling, Prentice-Hall
1972). Unfortunately, as explained by Waite & Goos, "the transformation
underlying the proof of this theorem is unsuitable for practical
purposes".


			    Olivier Lecarme

From icon-group-request@arizona.edu  Tue May  1 13:57:01 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA12403; Tue, 1 May 90 13:57:01 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 1 May 90 13:53 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA17432; Tue, 1 May 90 13:15:57
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 1 May 90 13:57 MST
Date: 1 May 90 20:00:17 GMT
From: tank!sophist!goer@handies.ucar.EDU
Subject: RE: linguism
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9F8C8DA101FFA097D5@Arizona.EDU>
Message-Id: <9062@tank.uchicago.edu>
Organization: University of Chicago
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9004301500.AA06810@bigbird.rtp.dg.com>,
 <9004301617.AA16804@megaron.cs.arizona.edu>
Status: O

In article <9004301617.AA16804@megaron.cs.arizona.edu> sboisen@BBN.COM writes:
>> Can you say a little about the linguistic convention for describing
>...it's not at all clear that CFGs are sufficiently powerful for
>representing NL (one classic case for this argument is a construction
>like "John, Bill, and Fred love Mary, Sally, and Julie,
>respectively").

I'd think that the real problem here is that the noun phrases must be
recursively defined (as someone pointed out, it's not a problem if we
are using a bottom-up parser).

I think the argument here is that the two noun phrases, "John,
Bill, and Fred" and "Mary, Sally, and Julie" must be of equal length.
I dunno.  We should probably extend this statement even further.  We
should probably be saying that the two noun phrases have to have equal
length AND that each member in the first set should plausibly be able
to "love" the corresponding member in the second set.  Both of these
criteria, the length of the noun phrases, and the applicability of
the action "love" to each of the respective members are really se-
mantic considerations.

Let me explain this another way.  From the standpoint of the grammar,
there is absolutely nothing wrong with saying, "John, Bill, and Fred
love Mary, Sally, and Julie."  You can, if you want, tack on an ad-
verb, "J, B, and F love M, S, and J very much."  Whether that adverb
is "respectively" or "very much" is not important to the grammar.
The consideration that the length of the noun phrases must "make
sense" (i.e. be of the same length, and have members that can love
and be loved) is extraneous to the basic grammar.  Perhaps we should
be integrating syntax and semantics from the start.  Still, looking at
this sentence in the terms we in this group are currently discussing
parsing problems, the word "respectively" cannot be said to impose
any extraordinary new organization on a sentence.  It is just an
adverb which the speaker may or may not add, depending on what his/her
meaning is, and whether the word makes sense in the context of what
is being said.

Perhaps irrelevant side note:  How often do you really hear people
use the term, respectively, in the context you mentioned?  Just cu-
rious.  To me it is primarily an affectation of the educated, and
I rarely hear even them using it in contexts where more than two
things are being respectively-ed.  This is due to the fact that
people don't naturally think about how many members are in the
noun phrases they are using, and it's pretty easy to forget, and,
say, put four nouns in the first set, and five in the second.  The
very fact that we have to strain at this construction tells me that
it is not really a fundamental part of the grammar in the same sense
as is the fact that most sentences consist of a noun and a verb
phrase.

Point:  Many such examples where natural languages are said to
require exotic parsing mechanisms in fact may not.  What they
require is a way of integrating semantics more closely into syntax.
We also have to keep our eyes peeled for cases where marginal or
literary usage is thrust into the core of the grammar.  In most
such cases there is indeed an important process at work.  However,
this process rarely belongs in the basic structural mechanisms.
In the case of "respectively," I believe the correct interpretation
resides in the interactions of syntax and semantics.

I'd appreciate argument on this point, especially if it is ac-
companied by Icon code!
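
To make the position concrete, here is a minimal sketch, under stated
assumptions, of a matcher that keeps the grammar plain and applies the
"equal length" condition as a separate check.  The procedure names
(sentence, names, name), the lowercase-only input, and the single verb
"love" are all assumptions made for the illustration, not a proposal
for a real grammar.

```icon
procedure main()
   local line
   every line := "john, bill, and fred love mary, sally, and julie, respectively" |
                 "john and bill love mary, sally, and julie, respectively" |
                 "john and bill love mary, sally, and julie" do
      write(line, " -> ", (line ? sentence()) | "rejected")
end

# sentence ::= names " love " names [ ", respectively" ]
procedure sentence()
   local subj, obj
   subj := names() | fail
   =" love " | fail
   obj := names() | fail
   # The check on list lengths applies only when the adverb is present;
   # the phrase structure itself is the same either way.
   if =", respectively" then
      if *subj ~= *obj then fail
   if pos(0) then return "accepted"
end

# names ::= name { separator name }; returns the list of names matched
procedure names()
   local lst
   lst := [name()] | fail
   while put(lst, ((=", and " | =", " | =" and ") & name()))
   return lst
end

# A name is any lowercase word other than "respectively".  If the
# comparison fails, tab() is resumed and restores the scanning position.
procedure name()
   return "respectively" ~== tab(many(&lcase))
end
```

The first and third sentences are accepted and the second is rejected,
and the only thing that distinguishes them is the length comparison in
sentence(), which sits outside the phrase-structure rules proper.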

   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu  Tue May  1 18:02:30 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA29079; Tue, 1 May 90 18:02:30 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 1 May 90 18:02 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA04828; Tue, 1 May 90 17:54:00
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 1 May 90 18:04 MST
Date: 1 May 90 15:12:40 GMT
From: ntvax!leff@tut.cis.ohio-state.EDU
Subject: Reversible Assignment Problem
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9F6A1EE99FFFA09F0F@Arizona.EDU>
Message-Id: <1990May1.151240.11020@dept.csci.unt.edu>
Organization: University of North Texas
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


Reversible assignments to global variables inside procedures are
not being reversed. 

The program below prints out three.  It should print out one.  Obviously,
the reversible assignment to z is not being reversed when test does not 
lead to an eventual success.

Why does the reversing of the assignment not take place, and what would
make it do so?

The Icon Programming Language, Chapter 11, section 11.8.2 did not shed
any light on these issues.

global z
procedure test(i)
z<-z+1
if i~=3 then fail
if i=3 then return 1
end

procedure main()
z:=0
every i:=(1 to 10) do if test(i) then write("test succeeded ",i," ",z)
end
 

From goer@sophist.uchicago.EDU  Tue May  1 19:28:53 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA04110; Tue, 1 May 90 19:28:53 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Tue, 1 May 90 19:28 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 1 May 90 21:27:23
 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA06119; Tue, 1 May 90
 21:23:09 CDT
Resent-Date: Tue, 1 May 90 19:28 MST
Date: Tue, 1 May 90 21:23:09 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: reversible assignment
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9F5E450635BFA0A633@Arizona.EDU>
Message-Id: <9005020223.AA06119@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O


# Reversible assignments to global variables inside procedures are
# not being reversed.
#
# The program below prints out three.  It should print out one.  Obviously,
# the reversible assignment to z is not being reversed when test does not 
# lead to an eventual success.
#
# Why does the reversing of the assignment not take place, and what would
# make it do so?
#
# The Icon Programming Language, Chapter 11, section 11.8.2 did not shed
# any light on these issues.


global z

procedure test(i)

  z<-z+1
  if i~=3 then fail
  if i=3 then return 1

end



procedure main()

  z:=0        # needed: z is &null until assigned, and &null+1 is an error
  every i:= 1 to 10
  do if test(i)
     then write("test succeeded ",i," ",z)

end



Okay, it appears that you are misunderstanding the purpose of reversible
assignment, and, because of this, are not quite getting the idea of how
and when to use it (or how it behaves when you do).  Don't worry.  You
aren't the only one who has had trouble with this....

Basically, Icon is set up for control backtracking.  Hence if you say

	i := 0
	i +:= |1 & i = 5

the i = 5 comparison will fail at first.  So the expression will back-
track to the i +:= |1 expression to see if it can produce another re-
sult.  The assignment itself doesn't produce another result; however
the expression |1 can (basically | before something makes Icon do it
repeatedly).  Hence it produces another 1, and that 1 is then added
(+:=) to i to make i one bigger.  This assignment operation succeeds,
and so control passes to the i = 5 comparison.  This fails, and so the
process of backtracking and incrementing is repeated once again.

Note that throughout this process, the i, when it is assigned a value,
keeps that value, even if the expression i +:= |1 is being resumed.
That is, if you add 1 to i, then go on to test whether i is equal to
5, and then resume to increment i again - if you do this, the i will
not get reset to whatever value it had before you started.  It will
keep the last assigned value.  This is what makes it get bigger every
time i +:= |1 is resumed, rather than going back to the original value
it had.  Eventually it will reach the value of 5, and the expression
as a whole will succeed.

Sometimes this feature isn't wanted.  Sometimes you don't want the
i to keep the value you gave it if it is resumed.  Let me offer a
silly example:

	str := "string of nonsensical strings"
	59 < (position <- find("string",str))
	if \position
	then write("the word \"string\" occurs after position 59")
	else write("the word \"string\" occurs at or before position 59")

Essentially, position will be assigned the value of the find ex-
pression, and then its value will be compared with 59.  If it is
less than or equal to 59 (which it will be every time the com-
parison takes place), then the expression (position <- find(...))
will get resumed.  When it is resumed, the former assignment of
position will get undone.  Then it will be assigned a new value,
and the comparison will be made again.  On the next resumption, find
will fail.  There are only two places where "string" occurs in str.
Because we included the reversible assignment operator, position
will be returned to its former value (namely &null), and control
will move to the next line.

I know that this example is silly, but I wanted to illustrate the
point without having to get into string scanning (the place where
reversible assignment seems most handy).  You'll eventually get to
the chapter in Griswold & Griswold on complex string processing.
The Arb() procedure is a nice little example of where you really
have to have reversible assignment.  Put in general terms, re-
versible assignment makes backtracking undo assignments.  Normally
backtracking doesn't do this.

Now, to your sample program.  I don't know exactly what you would
be using this for, but it doesn't matter.  If I say

	a <- 1

a will be assigned the value of 1.  Nothing will change this be-
cause the expression a <- 1 will not be resumed.  I have heard
the term "bounded" applied to this situation.  Whatever you call
it, it means it's done and that's it.  Even if the procedure in
which it occurs is resumed, you won't see any change.  Only if
you set it in a context where the expression itself will be
resumed will you see any effects.  You might write, for example,

	a <- 1 & open("inputfile","r")

If the open() function fails to open "inputfile," then the ex-
pression a <- 1 will be resumed.  Since there are no generators
there, it will not produce another result, and a will be returned
to the value it had before the assignment was made.

I hope that this long-winded discursus helps you.  Basically,
I'd stay away from reversible assignment until you have gotten
past generators, and into string scanning far enough to understand
say, the Arb() program.  Your basic misconception is that if
a procedure is called in which reversible assignment occurs
the assignment will be undone.  This isn't the case.  It is 
only if the expression in which it occurs causes control to
backtrack through the assignment that it will be undone.
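
Here is one way the sample program might be rearranged so that the
reversal does happen.  This is a sketch of the idea rather than the
only possible fix: the assignment and the comparison are placed in a
single expression, so a failed comparison backtracks through the
assignment.

```icon
global z

# The assignment and the test share one expression.  When i = 3 fails,
# control backtracks into z <- z + 1, which restores the old value of z.
procedure test(i)
   if (z <- z + 1) & (i = 3) then return 1
   fail
end

procedure main()
   local i
   z := 0
   every i := 1 to 10 do
      if test(i) then write("test succeeded ", i, " ", z)
end
```

With this arrangement z is incremented only on the call where i is 3,
so the program writes "test succeeded 3 1".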

It's nice to see questions like this on this newsgroup.  The
surveys tell us that most Icon users call themselves beginners
or, perhaps, intermediate-ers.  I sometimes wonder whether the
fact that discussion here is dominated by people who have been
doing Icon for some time intimidates these people, or whether
they feel they are wasting bandwidth.  It's not a waste at all!
Don't be intimidated!

-Richard

From wgg@cs.washington.EDU  Tue May  1 21:01:22 1990
Resent-From: wgg@cs.washington.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA09716; Tue, 1 May 90 21:01:22 MST
Return-Path: wgg@cs.washington.EDU
Received: from june.cs.washington.edu by Arizona.EDU; Tue, 1 May 90 21:01 MST
Received: by june.cs.washington.edu (5.61/7.0jh) id AA04237; Tue, 1 May 90
 20:59:24 -0700
Resent-Date: Tue, 1 May 90 21:02 MST
Date: Tue, 1 May 90 20:59:24 -0700
From: wgg@cs.washington.EDU
Subject: RE:  reversible assignment
Resent-To: icon-group@cs.arizona.edu
To: goer@sophist.uchicago.EDU, icon-group@arizona.edu
Resent-Message-Id: <9F513282723FA0A08F@Arizona.EDU>
Message-Id: <9005020359.AA04237@june.cs.washington.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: goer@sophist.uchicago.EDU, icon-group@Arizona.edu
Status: O

Richard Goerwitz's answer is correct:  data backtracking is possible only
if the control backtracks to the expression involved. 

In the expressions

	z <- 1
	z = 10

each has an implicit semi-colon at the end:

	z <- 1;
	z = 10;

The first semicolon delimits the assignment from the comparison, and prevents
backtracking back into the assignment from the comparison if it fails.  
Thus the assignment expression is ``bounded'' by the semicolon, and cannot
be resumed once it has yielded a result. 

On the other hand, in the expression

	(z <- 1) & (z = 10)

The assignment itself is not bounded, and it is possible to backtrack from the
comparison into the assignment, if the comparison fails. 

In most cases the semantics of traditional-appearing control structures in
Icon is to bound an expression so that it produces only one result.  This
prevents ``surprises'', and also avoids the overhead of often unneeded
backtracking.  Hence the control structures if-then and while-do bound their
control expressions (but not their bodies!).   Of course, it is easy to
phrase ``backtracking'' versions of these control structures:

	basic				backtracking
-------------------------------------------------------
if X then Y else Z			(X & Y) | Z

while X do Y				every X do Y

X;Y					X & Y

return X				suspend X

One could easily argue that I've chosen the wrong analogues.  (Suppose 
that the analogue for while-do resumes X only if Y fails, otherwise X
just starts over.  Consider, too, its behavior when Y contains a break.)
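
A small sketch of the first and third rows of the table, assuming
nothing beyond the expressions already shown above:

```icon
procedure main()
   local z

   # X;Y  -- the semicolon bounds z <- 1, so the failure of z = 10
   # cannot reach back and undo the assignment.
   z := 0
   z <- 1; z = 10
   write("X;Y    leaves z = ", z)

   # X & Y  -- the failure of z = 10 backtracks into z <- 1,
   # which restores the previous value.
   z := 0
   (z <- 1) & (z = 10)
   write("X & Y  leaves z = ", z)

   # if X then Y else Z evaluates Z only when X fails ...
   if 1 = 1 then 2 = 3 else write("if-then-else: Z evaluated")
   # ... whereas (X & Y) | Z also evaluates Z when Y fails.
   (1 = 1 & 2 = 3) | write("(X & Y) | Z:  Z evaluated")
end
```

The bounded form leaves z at 1 and the conjunction leaves it at 0; of
the last two lines, only the alternation version writes anything.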

					Bill Griswold

From ralph  Wed May  2 09:54:57 1990
Date: Wed, 2 May 90 09:54:57 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9005021654.AA23250@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA23250; Wed, 2 May 90 09:54:57 MST
To: icon-group
Subject: icon-group mail
Status: O

There's been a lot of interesting mail to icon-group recently.  We're
encouraged to see this activity and hope it will keep up.

Please bear in mind when you're sending e-mail to icon-group that long
messages sometimes cause problems.  The main difficulty is when one
person responds to icon-group mail and includes most or all of the
text of the message toward which the response is directed.  This sometimes
makes such responses very bulky.  While this is no problem for most
persons on icon-group, it is for some.  Persons who get icon-group mail
via modem connections may have difficulty receiving long messages and
it's also expensive for them. Long messages also may be refused by
electronic gateways.  This, for example, can prevent icon-group mail from
getting to electronic news distribution.

Please take a little extra time when composing lengthy messages to be sure
you only include relevant information.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From gudeman  Wed May  2 12:44:25 1990
Resent-From: "David Gudeman" <gudeman>
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA05435; Wed, 2 May 90 12:44:25 MST
Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Wed, 2 May 90 12:40 MST
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA05223; Wed, 2 May 90
 12:38:12 MST
Resent-Date: Wed, 2 May 90 12:41 MST
Date: Wed, 2 May 90 12:38:12 MST
From: David Gudeman <gudeman@cs.arizona.edu>
Subject: backtracking rules (was: Reversible Assignment Problem)
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9ECDF6D2891FA09E8B@Arizona.EDU>
Message-Id: <9005021938.AA05223@megaron.cs.arizona.edu>
In-Reply-To: ntvax!leff@tut.cis.ohio-state.EDU's message of 1 May 90 15:12:40
 GMT <1990May1.151240.11020@dept.csci.unt.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

This problem with reversible assignment is part of a larger problem
that a lot of people seem to have with Icon.  That is the problem of
understanding where backtracking "goes".  In particular, the
reversible assignment problem seems to be caused by the following
misunderstanding: 

  global x
  procedure foo()
    x <- 1
    suspend x
  end

  procedure main()
    every foo()
  end

where people think that when foo gets resumed, every expression in foo
that suspended in the past gets resumed too.  But once you pass the
semi-colon (an implicit one in this case) the expression before the
semicolon is no longer suspended, it is finished.  Here is a good test
for what expressions in a procedure can be resumed after a suspend:
imagine what would happen if you replaced

  suspend EXPRESSION

with 

  every write(image(EXPRESSION))

basically, the sequence that would be written is the sequence that a
calling procedure would see.  Also, any backtracking that would be
done between written values gets done between real suspensions.  If
you wrote

  procedure foo()
    x <- 1
    every write(image(x))
  end

from above, would the assignment ever get reversed?
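
For contrast, here is a small sketch, not taken from the program under
discussion, in which the assignment is part of the suspended expression
itself.  It assumes x starts out at 0:

  global x

  procedure foo()
    suspend (x <- 1) & x    # the assignment can still be resumed
  end

  procedure main()
    x := 0
    every write("inside: ", foo())   # writes "inside: 1"
    write("after: ", x)              # writes "after: 0"
  end

Here resuming foo does backtrack into the assignment, because no
semicolon separates it from the suspend.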

From utah-cs!boulder!ncar.UCAR.EDU!oddjob!sophist.uchicago.edu.richard!zenu!  Thu May  3 09:55:47 1990
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA07804; Thu, 3 May 90 09:55:47 MST
Received: from boulder.UUCP by cs.utah.edu (5.61/utah-2.10-cs)
	id AA13526; Thu, 3 May 90 10:54:27 -0600
Received: by boulder.Colorado.EDU (cu-hub.890824)
Received: by ncar.ucar.EDU (5.61/ NCAR Central Post Office 04/10/90)
	id AA07950; Thu, 3 May 90 10:53:42 MDT
Received: from tank.uchicago.edu by oddjob.uchicago.edu Thu, 3 May 90 11:16:11 CDT
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 3 May 90 11:17:09 CDT
Return-Path: <richard@zenu.uucp>
Received:  by sophist.uchicago.edu (3.2/UofC3.0)
	id AA08195; Thu, 3 May 90 11:12:52 CDT
Received: by sophist.uchicago.edu (smail2.5)
	id AA00229; 3 May 90 10:31:14 CDT (Thu)
Subject: lifetime of variables
To: icon-group@arizona.edu
Date: Thu, 3 May 90 10:31:13 CDT
X-Mailer: ELM [version 2.2 PL0]
Message-Id: <9005031031.AA00229@sophist.uchicago.edu>
From: utah-cs!boulder!sophist.uchicago.edu!richard (Richard L. Goerwitz III)
Status: O


Why is it that a procedure like

  procedure return_table()
    tbl := table()
    return tbl
  end

work?  I guess I never really thought about it before (I don't
mentally transfer Icon into equivalent constructions in other
languages).  If I had no familiarity with Icon, I'd probably say
"make tbl static or global, 'cause it'll disappear when return_
table() returns, and all you'll be left with is a pointer aiming
into the great void."

From icon-group-request@arizona.edu  Thu May  3 13:34:18 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron (5.59-1.7/15) via SMTP
	id AA22791; Thu, 3 May 90 13:34:18 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 3 May 90 13:34 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA04254; Thu, 3 May 90 13:18:55
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 3 May 90 13:34 MST
Date: 3 May 90 20:18:39 GMT
From: swrinde!cs.utexas.edu!jnino@ucsd.EDU
Subject: Differences between version 5 and version 7
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9DFD6CB34E7FA05491@Arizona.EDU>
Message-Id: <1280@gorath.cs.utexas.edu>
Organization: U. Texas CS Dept., Austin, Texas
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I am getting started with this language. I got a book written by R. Griswold
and published in 1983 titled "The Icon Programming Language". In the preface
it is indicated that Version 5 is to be used in the book. I'd like to know
how different is Version 5 from version 7 or even version 8. Is it advisable
to go ahead and get an intro to Icon using this book?

Thanks

Jaime Nino

From goer@sophist.uchicago.EDU  Thu May  3 16:53:20 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron (5.59-1.7/15) via SMTP
	id AA05262; Thu, 3 May 90 16:53:20 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 3 May 90 16:54 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 3 May 90 18:53:45
 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA08853; Thu, 3 May 90
 18:49:25 CDT
Resent-Date: Thu, 3 May 90 16:54 MST
Date: Thu, 3 May 90 18:49:25 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: go ahead with Griswold & Griswold
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9DE172D9C61FA0B255@Arizona.EDU>
Message-Id: <9005032349.AA08853@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Go ahead and use Griswold & Griswold.  Icon is mostly backwards
compatible with its former incarnations.  You'll notice that co-
routines came and went and came back again from version 5 to 7.
String scanning no longer operates with simple global variables
&pos and &source.  These variables still exist.  However, their
scope is a bit different.  No need to worry about the specifics.
There are a few nice features, like faster and cleaner options
3 and 4 for sort.  We now have math functions that used to be
part of the library (sin, etc.).

In general, don't worry about the differences.  If something seems
awry - which is unlikely - post.  You certainly won't be the only
one who's had questions :-).  The version of Icon you are using
will certainly have documentation to go with it.  When you feel
comfortable enough with the language, just browse through it.
It is well-written and pretty concise.  You'll quickly get caught
up on the additions that have been made to the language since
version 5.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From nevin1@ihlpb.att.com  Thu May  3 19:41:49 1990
Message-Id: <9005040241.AA13224@megaron>
Received: from att-in.att.com by megaron (5.59-1.7/15) via SMTP
	id AA13224; Thu, 3 May 90 19:41:49 MST
From: nevin1@ihlpb.att.com
Date: Thu, 3 May 90 19:41 CDT
Original-From: ihlpb!nevin1 (Nevin J Liber +1 708 979 4751)
To: att!cs.arizona.edu!icon-group
Subject: Re: lifetime of variables
Status: O

>Why is it that a procedure like
>
>  procedure return_table()
>    tbl := table()
>    return tbl
>  end
>
>works. [...]
>If I had no familiarity with Icon, I'd probably say
>"make tbl static or global, 'cause it'll disappear when return_
>table() returns, and all you'll be left with is a pointer aiming
>into the great void."

[Side note:  the above is a good explanation of a very common C
programming error.]

The table sticks around because it is stored in that
area of memory commonly referred to as the "heap".  (This is the same
type of memory that C's malloc() function returns pointers into.)

[Note:  there are other ways of implementing call-return mechanisms
(eg: copy the object before returning), but they have other problems
associated with them.]

One purpose of a heap is to have objects survive procedure calls and
returns.  Like static variables, it has limited visibility.  However,
it differs from statics in that each call to a function like your
return_table() returns a DIFFERENT table each time.  (I don't mean to
say that if tbl were declared static that the return_table() would
return the same table each time; its behavior would not change.  What I mean
is that in the framework of a language like C, if you return a pointer to
a static you will always get the same address, while if you return a
pointer to something malloc()ed you will get a different address.)
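
A short sketch of the Icon side of this, assuming nothing beyond the
return_table() procedure quoted above (main and the names t1 and t2 are
made up for the illustration):

	procedure return_table()
	   tbl := table()
	   return tbl
	end

	procedure main()
	   t1 := return_table()
	   t2 := return_table()
	   t1["a"] := 1
	   write(*t1, " ", *t2)      # writes "1 0"
	   write(if t1 === t2 then "same" else "different")   # writes "different"
	end

Both tables outlive the calls that created them, and each call produces
a new one.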

The other purpose to having a heap is to create objects of arbitrary
size or of sizes unknown at compile time.


I hope I haven't rambled too long.  It's been a long day. :-)

	NEVIN ":-)" LIBER  nevin1@ihlpb.ATT.COM  (708) 831-FLYS

From icon-group-request@arizona.edu  Thu May  3 19:48:11 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron (5.59-1.7/15) via SMTP
	id AA13499; Thu, 3 May 90 19:48:11 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 3 May 90 19:49 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA27182; Thu, 3 May 90 19:35:29
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 3 May 90 19:49 MST
Date: 4 May 90 01:45:12 GMT
From: uupsi!sunic!sics.se!sics!soder@rice.EDU
Subject: Icon on Sun386i
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9DC90397F2DFA022A0@Arizona.EDU>
Message-Id: <SODER.90May4034512@basf.nmpcad.se>
Organization: nmp
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I have run Icon 7.5 successfully on a Sun386i for a long
time. But without co-expressions. I recently fetched Icon
8.0. There is a configuration called "sun386i" in the
distribution. I copied "rswitch.c" from the "i386_sysv"
configuration and ended up with the following "define.h":

#define SysTime <sys/time.h>
#define GetHost
#define MaxHdr 5000

#define UNIX 1

No other modifications to the "sun386i" configuration.  My
knowledge of assembler and co-expression implementation is
identical to zero, but this installation passed the
co-expression tests and seems to work fine.

Well, there is one Sun386i-specific glitch. (It was there in
7.5 too.)  Icon programs generate a random (?) return code,
except "stop" that reliably returns '1'.  Not even "exit(0)"
helps.  I've noticed that C programs that just flow out of
"main" also return "random" codes.  Can this be the problem
in "iconx"? I can't easily find out. Anyway, a C program
must end by executing "exit" to be portable.
--
----------------------------------------------------
Hakan Soderstrom             Phone: +46 (8) 752 1138
NMP-CAD                      Fax:   +46 (8) 750 8056
P.O. Box 1193                E-mail: soder@nmpcad.se
S-164 22 Kista, Sweden

From @RELAY.CS.NET,@dg-rtp.rtp.dg.com:langley@DG-RTP.DG.COM  Fri May  4 06:55:47 1990
Received: from relay.cs.net by megaron (5.59-1.7/15) via SMTP
	id AA16623; Fri, 4 May 90 06:55:47 MST
Received: from dg-rtp.rtp.dg.com by RELAY.CS.NET id aa18447; 4 May 90 9:55 EDT
Received: from bigbird.rtp.dg.com by dg-rtp.dg.com (4.20/4.7)
	id AA03537; Fri, 4 May 90 09:53:38 edt via SMTP
Received: by bigbird.rtp.dg.com (4.20/rtp-s01)
	id AA17493; Fri, 4 May 90 09:54:50 edt
Date: Fri, 4 May 90 09:54:50 edt
From: Mark L Langley <langley@DG-RTP.DG.COM>
Message-Id: <9005041354.AA17493@bigbird.rtp.dg.com>
Return-Receipt-To: langley@dg-rtp.dg.com
To: icon-group@cs.arizona.edu
Subject: Re: lifetime of variables
Status: O

Richard Goerwitz III asks
> 
> Why is it that a procedure like
> 
>   procedure return_table()
>     tbl := table()
>     return tbl
>   end
> 
> works.  I guess I never really thought about it before (I don't
> mentally transfer Icon into equivalent constructions in other
> languages).  If I had no familiarity with Icon, I'd probably say
> "make tbl static or global, 'cause it'll disappear when return_
> table() returns, and all you'll be left with is a pointer aiming
> into the great void."
> 

Ah, this is one of the great things about Icon -- Memory management
is done for you.  Dynamic storage allocation is the trick.  Imagine
two ways of using your office, playroom, or kitchen counter.

Static Storage Allocation: 
	Take something out, put it back, take it out, put it back...
Dynamic Storage Allocation:
	Take things out, put them back when you need the space.

To make a long story short, the Icon garbage collector is responsible 
for collecting things that you no longer need.  The interpreter doles 
out memory as needed.  When it runs out, it finds all the objects that 
could still be referenced and moves them together.  This writes over all 
the objects that cannot be reached anymore, leaving space at the end.  
Between this and saying "Mother may I have some more?" to the operating 
system, it usually avoids running out of memory.

Mark

From utah-cs!boulder!ncar.UCAR.EDU!oddjob!sophist.uchicago.edu.goer!zenu!  Fri May  4 13:48:45 1990
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA24398; Fri, 4 May 90 13:48:45 MST
Received: from boulder.UUCP by cs.utah.edu (5.61/utah-2.10-cs)
	id AA06666; Fri, 4 May 90 14:48:24 -0600
Received: by boulder.Colorado.EDU (cu-hub.890824)
Received: by ncar.ucar.EDU (5.61/ NCAR Central Post Office 04/10/90)
	id AA06967; Fri, 4 May 90 14:47:06 MDT
Received: from tank.uchicago.edu by oddjob.uchicago.edu Fri, 4 May 90 15:17:42 CDT
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 4 May 90 15:18:02 CDT
Return-Path: <goer@zenu.uucp>
Received:  by sophist.uchicago.edu (3.2/UofC3.0)
	id AA10051; Fri, 4 May 90 15:13:36 CDT
Received: by sophist.uchicago.edu (smail2.5)
	id AA00157; 4 May 90 11:18:15 PDT (Fri)
Subject: frequently-asked questions
To: icon-group@arizona.edu
Date: Fri, 4 May 90 11:18:14 PDT
X-Mailer: ELM [version 2.2 PL0]
Message-Id: <9005041118.AA00157@sophist.uchicago.edu>
From: utah-cs!boulder!sophist.uchicago.edu!goer (Richard qua goer.)
Status: O



I wonder if we might do well to accumulate a "frequently-asked
questions" list, to make things easier for people starting to learn
Icon (presumably a large portion of the Icon-group's membership).
I'll just post an entry, and if anyone wants to add to it, I'll simply
append their additions to my list.  I find that people starting to
learn Icon tend to make similar mistakes, and I end up answering the
same questions over and over.  Not that answering them is difficult or
tedious.  I just hate to see people have to find out, after hours of
debugging, that they have run into a problem that might easily have
been avoided through the use of such a list.  Take five minutes out,
and add to the list!


Problem:  Why do I get unexpected results when I initialize a table
    like this:  tbl := table([])?  What I want is to make all the keys
    in tbl have empty lists as their initial values.

Answer:  Tables, sets, and lists in Icon are handled differently than,
    say, strings, csets, and integers.  When you "dereference" a
    variable whose value is a string, cset, or integer, you get a
    string, cset or integer (nothing complicated here).  In other
    words, if you say

	i := 1
	j := i

    j will end up with a value of 1.  When the i is dereferenced, it
    produces the integer 1, and *that* is what gets assigned to j.
    With structures like lists, however, dereferencing them produces a
    "pointer" to the structure in question.  It does not produce a
    copy of the structure (for that, you have to use copy()).  This is
    why, if you say

	l1 := ["hello"]
	l2 := ["hello"]
	if l1 === l2
	then write("the same")
	else write("different")

    you will see "different" written to the screen.  In effect, you
    have created two lists which, although they bear a structural
    similarity, reside in different places in memory, and therefore
    are *different lists*.

    What is the point here?  The point is that, if you say

	tbl := table([])

    you are actually setting up tbl so that each time you look up a
    key that has not yet been assigned a value, you get the value
    produced by [].  If you had said "tbl := table(1)" this would be
    fine.  "1" produces the integer 1.  Remember, however, that []
    creates a specific structure (an empty list) and produces a
    pointer to that list.  What you'll end up with, therefore, is a
    table whose keys all default to pointers to the one list
    structure!  What this does to your program is make it so that if
    you modify any key's value in place (e.g. put(tbl[key1],"hello")),
    you will find, suddenly, that *all* of the keys' values have been
    modified.

    To make a long story short, you have to initialize the table
    using &null,

	tbl := table()    # the same as tbl := table(&null)

    and then, each time you add a key, do this:

	/tbl[key] := []   # or /tbl[key] := list()

    The above expression first checks to see whether key has been
    inserted into tbl yet, and if not, makes its value the empty list
    (the forward slash tests for the null value, and so if key is
    already present in the table, and has been assigned a value,
    tbl[key] := [] will not take place).  You can then go about
    inserting things into this list as expected.
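
    A short sketch of both behaviors follows.  The variable names
    (shared, tbl, key) and the keys "a" and "b" are made up for the
    illustration:

	procedure main()
	   shared := table([])         # one list is the default for every key
	   put(shared["a"], "hello")   # modifies that single default list
	   write(*shared["b"])         # writes 1:  "b" sees the same list
	   write(*shared)              # writes 0:  no key was ever inserted

	   tbl := table()
	   every key := ("a" | "b") do
	      /tbl[key] := []          # a fresh list for each new key
	   put(tbl["a"], "hello")
	   write(*tbl["a"], " ", *tbl["b"])   # writes "1 0"
	end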

From ralph  Sat May  5 07:19:59 1990
Date: Sat, 5 May 90 07:19:59 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9005051419.AA14654@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA14654; Sat, 5 May 90 07:19:59 MST
To: icon-group
Subject: Second Edition of the Icon programming language book
Status: O

The second edition of The Icon Programming Language is now available.

The second edition describes Version 8 of Icon (the first edition, published in
1983, describes Version 5.9).

In addition to describing Version 8, the second edition is completely revised.
Important concepts such as generators and string scanning are presented first,
allowing subsequent material to be presented in ways more natural to Icon.

New material includes a chapter on the details of running Icon programs,
more (and harder) exercises, several large sample programs, and an
expanded "mini-reference" to Icon's functions and operations.

Here's the publication information:

	The Icon Programming Language, second edition. Ralph E.  Griswold and
	Madge T. Griswold, Prentice Hall, 1990.  367 pages. $29.95.
	ISBN 0-13-447889-4.

The book can be ordered from any full-service bookstore or from the Icon
Project.  The Icon Project pays postage in the United States, Canada, and
Mexico.  There is a $13 charge for shipping to other countries, which is
by air mail.  

Orders placed with the Icon Project must be in US dollars to The University of
Arizona with a check written on a bank in the United States.  Orders also can be
charged to MasterCard or Visa.

	Icon Project
	Department of Computer Science
	Gould-Simpson Building
	The University of Arizona
	Tucson, AZ   85721

	602 621-2018 (voice)
	602 621-4246 (FAX)

Please address any questions to me, not icon-project or icon-group.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From icon-group-request@arizona.edu  Wed May  9 01:49:45 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA01904; Wed, 9 May 90 01:49:45 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 9 May 90 01:43 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA24236; Wed, 9 May 90 01:09:52
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 9 May 90 01:50 MST
Date: 9 May 90 08:08:06 GMT
From: zaphod.mps.ohio-state.edu!sdd.hp.com!uakari.primate.wisc.edu!samsung!munnari.oz.au!mudla!ok@tut.cis.ohio-state.EDU
Subject: RE: encompassing formalism (stealing from Prolog)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <99A8C815087FA04FA2@Arizona.EDU>
Message-Id: <3948@munnari.oz.au>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9004301628.AA03522@sophist.uchicago.edu>

In article <9004301628.AA03522@sophist.uchicago.edu>, goer@SOPHIST.UCHICAGO.EDU (Richard Goerwitz) writes:
> The other formalism is more process-oriented (hence it is ironic that
> Prolog is the main language which implements it).  You say -
>       S -> NP VP

*process*-oriented?  It's just a declarative constraint on a node in a
 =======             constituent structure:  a node labelled S may 
dominate two daughters, one labelled NP and one labelled VP, and the
one labelled NP must precede the one labelled VP.  Rules like this can
be used as easily for generation as for parsing.

> Particularly interesting for us here is the way Prolog implements this
> formalism.  Assuming the Prolog you use implements definite clause
> grammar notation, you can say,

Since there is a *freely* distributable DCG->Prolog rule-at-a-time translator
which has been broadcast over the net, I think it's safe to say that any
Prolog system which _hasn't_ got DCG rules isn't trying.  The really
exciting thing about grammar rules in Prolog is that (if you avoid cuts
and non-logical features of Prolog) you have a non-directional declarative
formalism which can be processed in a variety of ways:  one and the same
set of rules may be loaded directly as Prolog code, or parsed with a left-
corner parser, or given to a chart parser, or (and this _happens_) used
for generation.  In each case one "compiles" rather easily from rules to
Prolog.

> I don't see any reason that this couldn't be implemented EASILY in Icon.
> And Icon has some neat advantages over Prolog,
> such as very good string handling.

I think Icon is a _wonderful_ language.  But it isn't supposed to be a
declarative language.  Most of the time when I write grammar rules in
Prolog I am using them to _generate_ lists.  Some of the rest of the
time I don't know whether the code I write will generate or parse, and
have no reason to care.  In the case of PATR-II, people are writing
large grammars where one and the same grammar is used for both parsing
and generation, basically by switching control strategies in a kind of
interpreter.

Yes, Icon has very good string handling.  Anyone with substantial
string-handling problems would be crazy not to use Icon if they had
the chance.  But what has that to do with parsing?  I think that the
most important lesson I ever learned about SNOBOL was when I enthused
about it to an anthropologist, who said "it can parse sequences of
characters?  Great!  Can it parse sequences of words?  No?  Then it's
no use to me!"  That is one of (several) respects in which Icon
improves dramatically on SNOBOL:  you _can_ parse a sequence of words
in Icon using the same basic mechanisms that you use for string
scanning.  I imagine that someone writing a parser for English (or
Akkadian!) in Icon would represent a sentence as a list of (pointers
to) dictionary entries, where a dictionary entry might be a record
or quite possibly a set of "senses".

> [still talking about grammar rules in Prolog]
> There needs to be some research on just
> how far these indexed grammars can represent natural languages.

Prolog grammar rules have the full power of Turing machines,
because the additional arguments may be arbitrarily complex.
(So may the attribute/value matrices used in several current formalisms.)

> Recently a formalism called PATR has been developed.

PATR is based on the idea of "complex categories".
The label on a node of the constituent structure is taken to be,
not a simple name as in BNF, but an attribute/value matrix in which
the traditional category label itself, if there is such a thing at
all, is merely one of the attributes.  For example, instead of the
simple categories S(entence), V(erb)P(hrase), V(erb), it is common
to talk about [cat=v,bar=2], [cat=v,bar=1], [cat=v,bar=0] in order
to capture certain regularities.  For example, there is something
called the Head Feature Convention in GPSG, which basically says
that in a meaningful rule X0 -> X1 ... Xn there is a distinguished
daughter Xi called the "head" of the phrase and X0 and Xi have
certain features in common (such as 'cat' but not 'bar').

Information is passed around in PATR by a method similar to unification.
Icon can certainly implement this, but so can Pascal...  The point is
that it isn't directional.  In one use of a rule, an attribute may be
in effect copied from the parent to its head daughter; in another use
of the same rule in the same parse, the same attribute may be in
effect copied from the daughter to the parent.

The fact that PATR-II has been implemented in Lisp as well as Prolog
shows that backtracking built into the implementation language is
not necessary.  Icon may well make a good base for such parsers and
generators, but don't expect it to have any advantage over Lisp
(other than size, and of course price...).

From cargo@tardis.cray.com  Wed May  9 08:23:13 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA19476; Wed, 9 May 90 08:23:13 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA22200; Wed, 9 May 90 08:54:35 CDT
Received: from zk.cray.com by hall.cray.com
	id AA20018; 4.1/CRI-3.12; Wed, 9 May 90 08:54:33 CDT
Received: by zk.cray.com
	id AA06993; 3.2/CRI-3.12; Wed, 9 May 90 08:54:42 CDT
Date: Wed, 9 May 90 08:54:42 CDT
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9005091354.AA06993@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: Icon 8.0 MS-DOS performance

A user of some Icon programs for MS-DOS written in Icon 7.0 was
asking me if V8.0 had any performance differences over V7.0.

Anybody know?

dsc

From ralph  Wed May  9 08:38:08 1990
Date: Wed, 9 May 90 08:38:08 MST
From: "Ralph Griswold" <ralph>
Message-Id: <9005091538.AA20409@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA20409; Wed, 9 May 90 08:38:08 MST
To: cargo@tardis.cray.com
Subject: Re:  Icon 8.0 MS-DOS performance
Cc: icon-group
In-Reply-To: <9005091354.AA06993@zk.cray.com>

Version 8 is faster especially with large sets and tables.  It also
has somewhat smaller structures for lists, tables, sets, and records.

Anyone using 7.0 under MS-DOS should upgrade to 8.0 for two reasons:
8.0 fixes several bugs and if you need help from the Icon Project, you'll
have to be running 8.0.

  Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
  +1 602 621 6609   ralph@cs.arizona.edu  uunet!arizona!ralph

From icon-group-request@arizona.edu  Wed May  9 18:01:30 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA07060; Wed, 9 May 90 18:01:30 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 9 May 90 18:01 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA21791; Wed, 9 May 90 17:51:41
 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 9 May 90 18:03 MST
Date: 8 May 90 17:24:21 GMT
From: hpfcso!hpldola!schreck@hplabs.hp.COM
Subject: RE: Reversible Assignment Problem
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9920E9BE77FFA0E30F@Arizona.EDU>
Message-Id: <1130001@hpldola.HP.COM>
Organization: HP Elec. Design Div. -ColoSpgs
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <1990May1.151240.11020@dept.csci.unt.edu>

/ hpldola:comp.lang.icon / leff@dept.csci.unt.edu (Dr. Laurence L. Leff) /  9:12 am  May  1, 1990 /

> Reversible assignments to global variables inside procedures are
> not being reversed. 

> The program below prints out three.  It should print out one.  Obviously,
> the reversible assignment to z is not being reversed when test does not 
> lead to an eventual success.

> Why does the reversing of the assignment not take place, and what would
> make it do so?

> The Icon Programming Language, Chapter 11, section 11.8.2 did not shed
> any light on these issues.

> global z
> procedure test(i)
> z<-z+1
> if i~=3 then fail
> if i=3 then return 1
> end
> 
> procedure main()
> z:=0
> every i:=(1 to 10) do if test(i) then write("test succeeded ",i," ",z)
> end

There is no reason for reversing the assignment, because the choice point
created by "<-" is never resumed.  The expression z <- z+1 succeeds.  To get
the effect you're looking for, you could substitute the following for the body
of test:

    return (z <- z+1, i = 3, 1)

When the i = 3 expression fails, the assignment statement will be resumed
and z will be restored to its original value.  Backtracking is initiated,
in this case, by a failure.
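
As a complete program, the suggested change might look like the sketch
below.  It keeps the original poster's test and main and only replaces
the body of test with the single conjunction given above.

```icon
global z

procedure test(i)
   # If i = 3 fails, the conjunction backtracks into z <- z + 1,
   # which restores the old value of z before the procedure fails.
   return (z <- z + 1, i = 3, 1)
end

procedure main()
   z := 0
   every i := (1 to 10) do
      if test(i) then write("test succeeded ", i, " ", z)
end
```

Only the call with i = 3 keeps its increment, so the one line written
is "test succeeded 3 1".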

From icon-group-request@arizona.edu  Fri May 11 07:50:47 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA19955; Fri, 11 May 90 07:50:47 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 11 May 90 07:51 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA25339; Fri, 11 May 90
 07:43:49 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 11 May 90 07:52 MST
Date: 11 May 90 14:43:35 GMT
From: usc!zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU
Subject: Boolean
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <97E3E4092D7FA0EDC9@Arizona.EDU>
Message-Id: <3919@uwm.edu>
Organization: University of Wisconsin-Milwaukee
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu


	Iconists don't seem to use the word "boolean" very much. As an
  emigrant from Pascal-land I guess I notice this. I suppose the
  reason is partly that boolean concepts are written into the fabric
  of the language by the succeed/fail mechanism and partly that Icon
  allows constructs that belie strict structuring but are sensible and
  useful. I have in mind such things as break and stop() which render
  unnecessary the typical WHILE.....AND NOT FINISHED of Pascal.
  Occasionally though, I find it desirable to establish a global
  variable which toggles between having a value and having the null
  value. For example, I have a program which enables printed output in
  mixed Hebrew and English, the input file being in normal English and
  transcribed Hebrew. An arbitrary symbol (I used tilde) tells the
  program that a change has taken place, and this can be recorded by
                          roman := 1         or
                          roman := &null
  In this way the program always "knows" what mode it is in, as it can
  always check \roman or /roman and maybe change it while it is about
  it:
                          if (\roman := &null)
  Pascal has a rather neat
                         ROMAN := TRUE
                           ...
                         ROMAN := NOT ROMAN
  which toggles a boolean variable. I have represented this in Icon by
                         roman := 1
                         ....
                         roman :=: other
  (where other was previously undefined.)
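
  A minimal sketch of the swap-based toggle described above.  The
  names roman and other come from the text; the loop and the strings
  written are only for illustration.

```icon
procedure main()
   local roman, other             # other starts out null
   roman := 1
   every 1 to 3 do {
      # \roman succeeds only while roman holds a non-null value
      write(if \roman then "roman" else "hebrew")
      roman :=: other             # exchange values: flips the mode
   }
end
```

  The three lines written are roman, hebrew, roman.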

	Maybe some of you have evolved better ways of handling such
  issues.
--
Alan D. Corre
Department of Hebrew Studies
University of Wisconsin-Milwaukee                     (414) 229-4245
PO Box 413, Milwaukee, WI 53201               corre@csd4.csd.uwm.edu

From utah-cs!boulder!ncar.UCAR.EDU!oddjob!sophist.uchicago.edu.goer!zenu!  Sun May 13 20:49:14 1990
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA03057; Sun, 13 May 90 20:49:14 MST
Received: from boulder.UUCP by cs.utah.edu (5.61/utah-2.11-cs)
	id AA02629; Sun, 13 May 90 21:48:54 -0600
Received: by boulder.Colorado.EDU (cu-hub.890824)
Received: by ncar.ucar.EDU (5.61/ NCAR Central Post Office 04/10/90)
	id AA10385; Sun, 13 May 90 21:48:28 MDT
Received: from tank.uchicago.edu by oddjob.uchicago.edu Sun, 13 May 90 22:26:29 CDT
Received: from sophist.uchicago.edu by tank.uchicago.edu Sun, 13 May 90 22:27:35 CDT
Return-Path: <goer@zenu.uucp>
Received:  by sophist.uchicago.edu (3.2/UofC3.0)
	id AA21993; Sun, 13 May 90 22:23:05 CDT
Received: by sophist.uchicago.edu (smail2.5)
	id AA00361; 13 May 90 20:06:09 PDT (Sun)
Subject: determinism - how?
To: icon-group@arizona.edu
Date: Sun, 13 May 90 20:06:09 PDT
X-Mailer: ELM [version 2.2 PL0]
Message-Id: <9005132006.AA00361@sophist.uchicago.edu>
From: utah-cs!boulder!sophist.uchicago.edu!goer (Richard Goerwitz qua goer.)


This question is not strictly related to Icon, but since many of those
reading this group are interested in parsing strategies (whether for
natural or "artificial" languages) I felt it reasonable to seek some
guidance here.  Let me add that I'd enjoy seeing Icon code as part of
any response that might appear.

Let's say we have a regular expression like

	a*aab

(I use plain ol' regular expressions because a previous discussion has
shown me that people utilize different notational conventions,
depending on whether their training is primarily computational or
linguistic).  I'd figure that the above regular expression would
translate into a transition network having an initial state (call it
zero), with two arcs leading from it, the one labeled "a" (leading
back to itself) and the other labeled "aa" (leading to state 1).  From
state one would be another arc leading to the final state (state 2).
This arc would be labeled "b."

Problem: The resulting transition network will not convert into a
deterministic finite state automaton.  In more concrete terms, if you
were to turn a*aab loose on a string beginning with "aa," you wouldn't
know that the arc labeled "aa" led up a "false path" until the
automaton reached the next state (1), and attempted to cross over to
state 2 (via "b").

Normally, when I am confronted with this sort of situation, I just
laugh and use a pushdown automaton of some sort.  Clearly, though, it
is possible to make this into a deterministic automaton.  All you
gotta do is turn a*aab into aaa*b.  I'd just rearrange everything I
run into in this manner were it not for the fact that things get
considerably nastier when you get involved in things like
(a*|b)(aa|b|c).

Is there some conversion method I am overlooking?
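
One standard technique for this is the subset construction: rather
than guessing which arc to take, keep track of the set of all states
the nondeterministic network could be in after each character.  The
sketch below simulates that idea for a*aab in Icon.  The state
numbering (0 to 3, one character per arc) and the names delta and
accepts are assumptions made for the illustration, not part of the
network described above.

```icon
procedure main()
   local delta, s
   # NFA for a*aab: 0 -a-> 0, 0 -a-> 1, 1 -a-> 2, 2 -b-> 3 (3 is final).
   # Keys are state number followed by input character.
   delta := table()
   delta["0a"] := [0, 1]
   delta["1a"] := [2]
   delta["2b"] := [3]
   every s := "aab" | "aaaab" | "ab" | "aabb" do
      write(s, ": ", if accepts(delta, s, 3) then "accepted" else "rejected")
end

procedure accepts(delta, s, final)
   local current, succ, c, q
   current := set([0])                  # start in state 0 only
   every c := !s do {
      succ := set([])
      every q := !current do            # follow every arc labeled c
         every insert(succ, !\delta[q || c])
      current := succ                   # the new "deterministic" state
   }
   return member(current, final)        # fails if final is unreachable
end
```

Each distinct set of states that can arise plays the role of one state
in the equivalent deterministic automaton, so no backtracking is needed.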

NB:  I'm coming at this from the standpoint of a student of the
humanities, and so if I am given references to computational journals,
chances are that I'll have more difficulty using them than a bit of
sample Icon (or, for that matter, C, Prolog, or even Lisp) code.  I
admit that I prefer to read Icon code, though (hence my posting to
this group).  Beggars can't be choosers, though, I guess, so I will
gladly accept any suggestions, references, or even flames that come my
way.


   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From tenaglia@fps.mcw.edu  Mon May 14 10:23:53 1990
From: tenaglia@fps.mcw.edu
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA11961; Mon, 14 May 90 10:23:53 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.06) with UUCP 
	id AA21961; Mon, 14 May 90 12:16:59 EDT
Received: by uwm.edu; id AA02910; Mon, 14 May 90 10:58:24 -0500
Date: Mon, 14 May 90 10:58:24 -0500
Message-Id: <9005141558.AA02910@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Mon, 14 May 90 10:58:58 CDT
Apparently-To: rutgers!uwvax!cs.arizona.edu!icon-group

> 	Iconists don't seem to use the word "boolean" very much. As an
>   emigrant from Pascal-land I guess I notice this. I suppose the
>   reason is partly that boolean concepts are written into the fabric
>   of the language by the succeed/fail mechanism and partly that Icon
>   allows constructs that belie strict structuring but are sensible and
>   useful.

I have found 2 useful approaches to it. The quick and dirty method I use
for throw away one shot deals works like this.

  global toggle
  ...
  toggle := 1
  ...
  if change(state) then toggle := -toggle
  (toggle = 1) | map(item,&lcase,&ucase)

This fragment is handy for things like converting source to all upper case
except for quoted strings.

To accomplish the same thing in a more permanent program, one might use
named variables for readability.

  global true,false,condition
  ...
  true  := 1
  false := -1
  ...
  condition := true
  ...
  if change(state) then condition := -condition
  (condition = true)  | map(item,&lcase,&ucase)
  ... or ...
  case condition of
    {
    true : condition := false
    false: condition := true
  default: stop("Logic has ceased to function!")
    }

After looking at these, it becomes obvious that they are the same thing.
If one wanted to get very tricky with bit masks and 'exclusive or', that
might be a way to handle large numbers of booleans. I haven't gotten the latest
Icon book yet, so I don't know if the bit operations include a bitest()
procedure which helps process binary bit data. Here's how it might work.

procedure bitest(bitpat,boolnum)
  local i,count
  count := 0
  # Pad the shorter string with leading zeros so both have the same length.
  (*bitpat >= *boolnum) | (bitpat  := right(bitpat,*boolnum,"0"))
  (*boolnum >= *bitpat) | (boolnum := right(boolnum,*bitpat,"0"))
  every i := 1 to *bitpat do
    if bitpat[i] == boolnum[i] then count +:= 1
  if count = 0 then return "none"  
  if count = *bitpat then return "full"
  return "some"
  end

This returns the degree of bit match: 'full', 'none', or 'some'. Or else
maybe it could return a list containing the position numbers of matches? Or
maybe 0 - 0 matches should not be counted? Any other nifty variations?
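
For packing many booleans into one integer, here is a minimal sketch
that assumes a version of Icon providing the bit functions iand(),
ior(), and ixor().  The flag names QUOTED and UPPER are hypothetical
and chosen only for the illustration.

```icon
procedure main()
   local QUOTED, UPPER, flags
   QUOTED := 1                       # bit 0 (hypothetical flag)
   UPPER  := 2                       # bit 1 (hypothetical flag)
   flags := 0
   flags := ior(flags, UPPER)        # set UPPER
   flags := ixor(flags, QUOTED)      # toggle QUOTED on
   flags := ixor(flags, QUOTED)      # toggle QUOTED off again
   if iand(flags, UPPER) ~= 0 then write("UPPER is set")
   if iand(flags, QUOTED) = 0 then write("QUOTED is clear")
end
```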
  
Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From nowlin@iwtqg.att.COM  Mon May 14 12:59:15 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA24666; Mon, 14 May 90 12:59:15 MST
Received: from att-in.att.com by Arizona.EDU; Mon, 14 May 90 12:55 MST
Resent-Date: Mon, 14 May 90 12:57 MST
Date: Mon, 14 May 90 13:57 CDT
From: nowlin@iwtqg.att.COM
Subject: RE: boolean
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <955DBB548ABFA108D4@Arizona.EDU>
Message-Id: <955E203F329FA1034C@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu

> > 	Iconists don't seem to use the word "boolean" very much. As an
> >   emigrant from Pascal-land I guess I notice this. I suppose the
> >   reason is partly that boolean concepts are written into the fabric
> >   of the language by the succeed/fail mechanism and partly that Icon
> >   allows constructs that belie strict structuring but are sensible and
> >   useful.
> 
> I have found 2 useful approaches to it. The quick and dirty method I use
> for throw away one shot deals works like this.
> 
>   global toggle
>   ...
>   toggle := 1
>   ...
>   if change(state) then toggle := -toggle
>   (toggle = 1) | map(item,&lcase,&ucase)
> 
> This fragment is handy for things like converting source to all upper case
> except for quoted strings.

Icon doesn't need booleans.  Unset versus set or &null versus some value
handle the problem.  I know I saw someone say this before but what the hay.
A COMPLETE and SIMPLISTIC example for converting everything not enclosed in
double quotes to upper case follows:

procedure main()
	chgcase := 1
	tmp := &null

	while inline := read() do {
		outline := ""
		inline ? {
			while part := tab(upto('"')) do {
				if \chgcase then
					outline ||:= map(part,&lcase,&ucase)
				else	outline ||:= part

				outline ||:= move(1)

				chgcase :=: tmp
			}
			if \chgcase then
				outline ||:= map(tab(0),&lcase,&ucase)
			else	outline ||:= tab(0)
		}
		write(outline)
	}
end

This example is to illustrate set and unset used as boolean and is not a
complete solution.  Notice that this program fails when used to print
itself.  There are other design problems too.  Fix it?

> If one wanted to get very tricky with bit masks and 'exclusive or', that
> might be way to handle large amounts of booleans. 
> 
> Returns the degree of bitmatch. Whether 'full', 'none', or 'some'. Or else
> maybe it could return a list containing the position numbers of matches? Or
> maybe 0 - 0 matches might not be included? Any other nifty variations?

Not like any boolean I ever saw.  I thought boolean implied on or off?

Jerry Nowlin (...!att!iwtqg!nowlin)

From gudeman  Mon May 14 14:08:20 1990
Date: Mon, 14 May 90 14:08:20 MST
From: "David Gudeman" <gudeman>
Message-Id: <9005142108.AA01957@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA01957; Mon, 14 May 90 14:08:20 MST
To: icon-group@cs.arizona.edu
In-Reply-To: nowlin@iwtqg.att.COM's message of Mon, 14 May 90 13:57 CDT <955E203F329FA1034C@Arizona.EDU>
Subject: boolean

How about using csets for booleans?  After all, they _do_ form a
boolean algebra.  Just substitute

  &cset for TRUE
  '' for FALSE
  ** for AND
  ++ for OR
  ~ for NOT

Of course, this isn't very efficient...
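
A small sketch of the cset encoding, taking ** (intersection) as AND,
++ (union) as OR, and ~ (complement) as NOT.  The variable names T and
F are only for the illustration; the size operator * is used to show
whether a result is the empty cset or all of &cset.

```icon
procedure main()
   local T, F
   T := &cset                 # TRUE: every character
   F := ''                    # FALSE: the empty cset
   write(*(T ** F))           # TRUE AND FALSE: empty cset, size 0
   write(*(T ++ F))           # TRUE OR FALSE: all of &cset, size 256
   write(*~T)                 # NOT TRUE: empty cset, size 0
end
```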

More seriously, Pascal boolean values and operations represent an
inadequate attempt to force predicates to be functions.  This is
because Pascal does not support true predicates.  Icon doesn't support
true predicates either, but Icon's succeed/fail is closer to the pure
concept of valid/invalid than are Pascal's booleans.  There _are_ some
applications for having true/false as values, but these applications
are fairly rare, and the paradigm is easily simulated by other types
(as has been pointed out before).

From cargo@tardis.cray.com  Mon May 14 14:41:08 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA04417; Mon, 14 May 90 14:41:08 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34)
	id AA25102; Mon, 14 May 90 16:40:50 CDT
Received: from zk.cray.com by hall.cray.com
	id AA16667; 4.1/CRI-3.12; Mon, 14 May 90 16:40:47 CDT
Received: by zk.cray.com
	id AA11027; 3.2/CRI-3.12; Mon, 14 May 90 16:41:01 CDT
Date: Mon, 14 May 90 16:41:01 CDT
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9005142141.AA11027@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: boolean

I have found that I use one of two methods of dealing with "boolean"
operations in Icon.  One is the aforementioned use of null values
in a variable.  (I spent a long time trying to memorize what operation
the / and \ operators performed.  I knew that they were for testing
for null and nonnull values, but I could never remember which did
what.  I finally developed this mnemonic device.  / slopes Up and
tests for Undefined; \ slopes Down and tests for Defined.  I
recognize that defined and undefined are not the exact Icon concepts,
but at least I can remember the operations now.)

The other way I deal with boolean operations is to use null records:

record true()
record false()
...
flag := true()
...
if type(flag) == "true"
then ...

I have also seen someone learning to program in Icon use

global false, true
...
false := "false"
true := "true"
...
etc.

dsc

From icon-group-request@arizona.edu  Tue May 15 10:01:48 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA11683; Tue, 15 May 90 10:01:48 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 15 May 90 09:58 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA06788; Tue, 15 May 90
 09:54:45 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 15 May 90 10:02 MST
Date: 15 May 90 16:54:40 GMT
From: usc!zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU
Subject: RE: boolean
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <94AD16B4707FA1085A@Arizona.EDU>
Message-Id: <3979@uwm.edu>
Organization: University of Wisconsin-Milwaukee
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9005142141.AA11027@zk.cray.com>

In article <9005142141.AA11027@zk.cray.com> cargo@TARDIS.CRAY.COM (David S. Cargo) writes:
>I have found that I use one of two methods of dealing with "boolean"
>operations in Icon.  One is the aforementioned use of null values
>in a variable.  (I spent a long time trying to memorize what operation
>the / and \ operators performed.  I knew that they were for testing
>for null and nonnull values, but I could never remember which did
>what.  

I had the same problem. I decided that the "natural" state of a variable
is null, and that the "natural" slash therefore succeeds on it. (Who ever
saw a backslash before they saw a computer?)
--
Alan D. Corre
Department of Hebrew Studies
University of Wisconsin-Milwaukee                     (414) 229-4245
PO Box 413, Milwaukee, WI 53201               corre@csd4.csd.uwm.edu

From gmt  Tue May 15 10:18:06 1990
Date: Tue, 15 May 90 10:18:06 MST
From: "Gregg Townsend" <gmt>
Message-Id: <9005151718.AA13161@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA13161; Tue, 15 May 90 10:18:06 MST
To: icon-group
Subject: mnemonics for / and \

I remember the meanings of / and \ by the slant of the first consonant in:

	/	iZnull
	\	Notnull

I read that first in this group, but I don't know who to credit.
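
A short sketch of the two operators in use.  The variable x and the
strings are only for the illustration.

```icon
procedure main()
   local x                               # locals start out null
   if /x then write("x is null")         # / succeeds on a null value
   /x := "default"                       # assign only if x is still null
   if \x then write("x is ", x)          # \ succeeds on a non-null value
   write(\x | "unset")                   # non-null value, else a fallback
end
```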

    Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
    +1 602 621 4325     gmt@cs.arizona.edu     110 57 16 W / 32 13 45 N / +758m

From goer@sophist.uchicago.EDU  Wed May 16 10:30:28 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA10601; Wed, 16 May 90 10:30:28 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 16 May 90 10:28 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 16 May 90
 12:27:00 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA25307; Wed, 16 May 90
 12:22:22 CDT
Resent-Date: Wed, 16 May 90 10:28 MST
Date: Wed, 16 May 90 12:22:22 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: problem:  Use records, tables, or lists?
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <93E0453CE8FFA0871E@Arizona.EDU>
Message-Id: <9005161722.AA25307@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

I have to make a very basic decision about how I'm going to implement
a lexicon in some language processing I'm doing.  It's one of
those cases where many solutions present themselves, and the problem
is which solution will really prove fastest and most flexible in the
long run.  It'll take me a moment to explain, so bear with me....

Let's say that I have a program which creates a little database.
Let's say that, at run time, I'm reading in a key, and maybe between
five and ten fields of information for that key:

    language:  part_of_speech, noun
               number, singular
               gender,
               person, 3
               etc.

Now I don't want to use a record, because the record is fixed in
terms of the number of fields it has.  Though access is very fast,
and I'd LIKE to use a record, I can't rely on knowing just what
and how many fields will be filled at run-time.  I also might need
to augment the record at run-time.  The natural idea of creating a
table with the keys being lexical items (e.g. "language"), and the
values being a record, won't work.

How about we keep the table, but instead of using a record as each
key's value, use instead another table?  I dunno.  What if the lexicon
is a couple of thousand words long?  Will thousands of tables
with just five or ten elements work out?  Maybe someone familiar
with the internals of Icon heap allocation and memory management
will offer a guess as to whether this will ultimately prove a productive
method.  Like records, tables at least offer easy access via
fields or keys.  Unlike records, though, they are easily manipulated
at run-time.  This is their big advantage.
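
A minimal sketch of the table-of-tables layout, using the sample entry
from above.  The variable names lexicon and entry are assumptions made
for the illustration; nothing here measures the memory cost in question.

```icon
procedure main()
   local lexicon, entry
   lexicon := table()
   # Each lexical item maps to its own small table of features.
   lexicon["language"] := entry := table()
   entry["part_of_speech"] := "noun"
   entry["number"] := "singular"
   entry["person"] := 3
   # New features can be added at run time without any declaration.
   entry["case"] := "nominative"
   write(lexicon["language"]["number"])
   # A feature that was never stored comes back as the null value.
   write(\lexicon["language"]["gender"] | "(no gender recorded)")
end
```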

Another possibility is to use a list to store the various fields'
values (["part_of_speech.verb","number.singular"], or the like).  It
would be fairly easy to extract the values (untested!!!):

    procedure Get_Value(key)
        return !List ? (tab(find("."))==key,move(1),tab(0))
    end

The lists would be fairly small, but changing the values of fields
would become non-trivial, and perhaps a bit slow.  So would simply
accessing them.  The above procedure is going to take some time every
time it's called.  From previous experience with Icon, I'd say
it'd be less than a twentieth the speed of a simple record access.

Anyway, my question is which of these various solutions might in
the long run prove best.  The record solution isn't workable.  The
tables and lists are fine.  I don't know which will prove better in
terms of memory/speed.  Nor do I know whether there are other solutions I
might use (other than, say, using a set rather than a list, so that
insertions are not duplicating already existing material - but then
how much overhead is there for sets, over and above what I would
expect for a list?).

Any suggestions would be welcome.  Please feel free to write me or
post.  I'm not *only* interested in high-power comments about the
nature of the underlying implementation.  I'm sure that there are
lots of things I haven't thought of.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From cjeffery  Wed May 16 12:24:47 1990
Resent-From: "Clinton Jeffery" <cjeffery>
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA19678; Wed, 16 May 90 12:24:47 MST
Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Wed, 16 May 90 12:26 MST
Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15)
 via SMTP id AA19652; Wed, 16 May 90 12:24:08 MST
Received: by caslon; Wed, 16 May 90 12:24:07 mst
Resent-Date: Wed, 16 May 90 12:26 MST
Date: Wed, 16 May 90 12:24:07 mst
From: Clinton Jeffery <cjeffery@cs.arizona.edu>
Subject: problem:  Use records, tables, or lists?
Resent-To: icon-group@cs.arizona.edu
To: goer@sophist.uchicago.EDU
Cc: icon-group@arizona.edu
Resent-Message-Id: <93CFC6EF2E5FA11CD4@Arizona.EDU>
Message-Id: <9005161924.AA08082@caslon>
In-Reply-To: Richard Goerwitz's message of Wed, 16 May 90 12:22:22 CDT
 <9005161722.AA25307@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: goer@sophist.uchicago.EDU
X-Vms-Cc: icon-group@Arizona.edu

Richard Goerwitz asks: what is the best Icon representation for lexicon
entries which consist of a variable number of name-attribute pairs?  Records
are Icon's fastest group data type.  Unfortunately, records do not work for
everything.  A table of tables would provide acceptable speed but it would
be Really Really space-consuming.  Use it only if your lexical entries have
a LOT of fields.  A table of lists would use way less space, but run slower.

If your lexicon entries have a largely common set of field names,
I would suggest a hybrid approach.  Declare a record like

    record lexentry( part_of_speech, number, gender, person , other )

And allocate a list for the "other" field only for those entries which
have exotic fields.  Records use way less space than lists, so
you might declare ALL of the common fields, and use the "other" field
only for really weird words.

You can hide the hybrid approach with a few procedures similar to the one
you suggested, or better yet write an Idol class, and share it with me!
Here's a start:

procedure Get_Value(rec,key)
  return case key of {
  "part_of_speech": rec.part_of_speech
  "number":         rec.number
  "gender":         rec.gender
  "person":         rec.person
  default:  (!(\(rec.other)) ? (tab(find("."))==key,move(1),tab(0)))
  }
end

This is still a linear search through a bunch of strings, but when you
know you are accessing one of the builtin fields, you can just access
the field directly by name, e.g. rec.part_of_speech
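
A usage sketch for the hybrid record, repeating the declaration and
Get_Value from above so that it runs on its own.  The sample entry and
the "name.value" strings in the other list are invented for the
illustration.

```icon
record lexentry(part_of_speech, number, gender, person, other)

procedure Get_Value(rec, key)
   return case key of {
      "part_of_speech": rec.part_of_speech
      "number":         rec.number
      "gender":         rec.gender
      "person":         rec.person
      # Exotic fields: search the "name.value" strings, if any exist.
      default: (!(\(rec.other)) ? (tab(find(".")) == key, move(1), tab(0)))
   }
end

procedure main()
   local e
   e := lexentry("noun", "singular")            # remaining fields are null
   e.other := ["case.nominative", "root.lng"]   # hypothetical extras
   write(Get_Value(e, "number"))                # declared field
   write(Get_Value(e, "case"))                  # found in the other list
   write(Get_Value(e, "mood") | "(not recorded)")
end
```

Get_Value fails when an exotic field is absent, so the caller can
supply a fallback with alternation as in the last line.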

Hope this helps,
Clint

From nowlin@iwtqg.att.COM  Wed May 16 13:17:17 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA24036; Wed, 16 May 90 13:17:17 MST
Received: from att-in.att.com by Arizona.EDU; Wed, 16 May 90 13:12 MST
Resent-Date: Wed, 16 May 90 13:15 MST
Date: Wed, 16 May 90 14:17 CDT
From: nowlin@iwtqg.att.COM
Subject: RE: problem: Use records, tables, or lists?
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <93C8E778833FA11D54@Arizona.EDU>
Message-Id: <93C9678E1FDFA11951@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu

> Let's say that I have a program which creates a little database.
> Let's say that, at run time, I'm reading in a key, and maybe between
> five and ten fields of information for that key:
> 
>     language:  part_of_speech, noun
>                number, singular
>                gender,
>                person, 3
>                etc.
> 
> Now I don't want to use a record, because the record is fixed in
> terms of the number of fields it has.  Though access is very fast,
> and I'd LIKE to use a record, I can't ...
> ...
> Anyway, my question is which of these various solutions might in
> the long run prove best.  The record solution isn't workable.  The
> tables and lists are fine.  I don't know which will prove better in
> terms of memory/speed.  ...

I think the first rule of Icon programming should be:

	1) Don't worry about efficiency until you get it to work.

One of the beauties of Icon is that you can get this kind of application
going in a flash.  If you find efficiency in speed or memory a problem then
the real decision should be if you want to leave it in Icon so it's easy to
maintain (or if Icon is your only choice) or if you want to convert it to
something like C and really make it efficient.

This application sounds like a real good use for objects with inheritance. 
A base class of words could have attributes like length and frequency of
use.  The sub-class noun could have attributes like plural or singular and
the sub-class verb could have attributes like tense.  You get the idea.

I've done some pseudo class stuff like this with records of records in
Icon.  You can include functions as fields in records that use other fields
in the record as arguments.

	record word(word,length,freq,details)
	record noun(number,gender,...)
	record verb(tense,object,...)

With this data layout a noun or verb record could be assigned to the
details member of a word record and another word.noun record could be
assigned to the object member of a verb record.  This example assumes I
know something about English grammar which could suffer the fate of most
assumptions.
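
A runnable sketch of that layout.  The record declarations are trimmed
to the fields named above (the "..." fields are left out), and the
sample words are invented for the illustration.

```icon
record word(word, length, freq, details)
record noun(number, gender)
record verb(tense, object)

procedure main()
   local dog, sees
   # A word record whose details field holds a noun record.
   dog := word("dog", 3, 0, noun("singular", "neuter"))
   # A verb's object field can hold another word record.
   sees := word("sees", 4, 0, verb("present", dog))
   # type() reports which kind of details record is attached.
   write(sees.details.object.word, " is a ", type(sees.details.object.details))
end
```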

Maybe the Idol language (which I've heard about but haven't looked into
yet...I'm busy) actually has this kind of feature built in.  Does it?

Jerry

From tenaglia@fps.mcw.edu  Wed May 16 14:09:46 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA29152; Wed, 16 May 90 14:09:46 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.06) with UUCP 
	id AA09621; Wed, 16 May 90 16:45:02 EDT
Received: by uwm.edu; id AA08551; Wed, 16 May 90 15:38:13 -0500
Message-Id: <9005162038.AA08551@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail);
          Wed, 16 May 90 14:59:07 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail);
          Wed, 16 May 90 14:41:06 CDT
Date: Wed, 16 May 90 14:41:06 CDT
From: Chris Tenaglia - 257-8765 <tenaglia@mis.mcw.edu>
To: icon-group@cs.arizona.edu
Subject: Lexicon Database


Concerning the generation of a lexicon, and what are the best structures,...

I think I have played with similar concepts in a different application a long
time ago. I think a table is nice. The word in question is the entry. The
assigned value is a long delimited string with a fixed rule tree for each part
of the language.

For example :   vocab := table()
                vocab["car"] := "noun,singular,neuter,A motorized vehicle"
                vocab["cows"]:= "noun,plural,female,Female Cattle"
                vocab["red"] := "adjective,Color"
                vocab["a"]   := "article,singular,Indefinite article for one"
                vocab["any"] := "article,plural,Indefinite article for many"
                vocab["paint"]:="verb,transitive,Apply a colored fluid"
                vocab["jump"]:= "verb,intransitive,Hop over"

Depending on the eventual application, this may or may not work. One travesty
generator I wrote used separate lists loaded from files for each part of the
language. It was pretty random and useless. The structure above is more
flexible, and easy to parse.
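
A minimal sketch of parsing one of these entries with string scanning.
The helper name split is hypothetical, and it assumes that no field
(including the description) contains a comma.

```icon
procedure main()
   local vocab, fields
   vocab := table()
   vocab["car"] := "noun,singular,neuter,A motorized vehicle"
   fields := split(vocab["car"])
   write("part of speech: ", fields[1])
   write("description:    ", fields[*fields])
end

# Break a comma-delimited string into a list of its fields.
procedure split(s)
   local fields
   fields := []
   s ? {
      while put(fields, tab(find(","))) do move(1)   # field, then skip ","
      put(fields, tab(0))                            # last field
   }
   return fields
end
```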

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu


From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Wed May 16 19:28:36 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA19823; Wed, 16 May 90 19:28:36 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA18730; Wed, 16 May 90 22:28:12 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Wed, 16 May 90 22:27:03 EDT
Date: Wed, 16 May 90 18:36:08 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <225766@Wayne-MTS>
Subject: Booleans

 
Booleans and success/failure play complementary roles in a programming
language.  You can think of a boolean as capturing and preserving the result
of success/failure.  It's too bad that Icon doesn't have natural boolean
operators.  Of course there are umpteen ways to fake them, but none of those
ways are quite as nice as the standard boolean operators (where I include
short-circuit "and" and "or" among the standard boolean operators, as they are
in C).  But I'd much rather have a language like Icon that has success/failure
but lacks true booleans than a language like Pascal that lacks success/failure
but has true booleans.
 
My approach in SPLASH is to provide both.  The ? operator converts
success/failure to true/false, while the "is" operator converts true/false to
success/failure.  Here's an example of a SPLASH generator that illustrates
these operators.  The generator merges the output of a sequence of other
generators.
 
merge: generic(t) process(stream(*): generator(t)) yield t is
	declare
		found: boolean
	in do { % get one element from each stream
			found := false
			for i in stream'range do
				found |:= ?yield *stream(i) % yield is like suspend
			}
		until is ~found
end merge
 
Paul Abrahams

From markc%essex.ac.uk@NSFnet-Relay.AC.UK  Thu May 17 00:26:16 1990
Received: from nsfnet-relay.ac.uk by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA04423; Thu, 17 May 90 00:26:16 MST
Received: from sun.nsfnet-relay.ac.uk by vax.NSFnet-Relay.AC.UK 
           via Janet with NIFTP  id aa01072; 17 May 90 8:03 BST
Received: from ese by servax0.sx.ac.uk   SMTP/TCP  id aa15922;
          17 May 90 8:20 WET DST
From: Clark Mark <markc%essex.ac.uk@NSFnet-Relay.AC.UK>
Date: Thu, 17 May 90 08:20:24 +0100
Message-Id: <794.9005170720@ese.essex.ac.uk>
To: icon-group@cs.arizona.edu
Subject: Withdrawl

Please remove my name from your icon-group, thanks.

From nowlin@iwtqg.att.COM  Thu May 17 05:50:28 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA18791; Thu, 17 May 90 05:50:28 MST
Received: from att-in.att.com by Arizona.EDU; Thu, 17 May 90 05:49 MST
Resent-Date: Thu, 17 May 90 05:51 MST
Date: Thu, 17 May 90 07:37 CDT
From: nowlin@iwtqg.att.COM
Subject: RE: problem: Use records, tables, or lists?
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <933DBD555FFFA12404@Arizona.EDU>
Message-Id: <933E0EF7CFFFA1195A@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu

Regarding records and their use in holding features such as
number, gender, word-class:

I've looked over the solutions offered, and have found them
tempting.  Whatever I end up doing, I'll post the code.  It'll
end up a left corner bottom-up parser (that way I can avoid
the problem of left recursion which so often comes up in
natural languages).  I've decided, at least provisionally,
against the "record" solution.  Let me explain why.  The reason
might not be obvious to people working primarily with
regular languages or with programming languages that can
be handled with a deterministic pushdown automaton.

If I use records, I'll have to do things like decide what the
most often used categories will be, and then enforce a constant
spelling across all files (i.e. no sing, s, singular -
just one of them).  What's terrible about this is that, even
if I do remember all the naming conventions, I'll be saddled
with lots of extraneous record fields.  Arabic and Ugaritic,
say, will use a dual category.  This will be superfluous
for English, German, Hebrew, etc. (but not, say, classical
Greek).  Likewise, gender will be important for French,
German, Latin, Arabic, less so for Dutch, and hardly at all
for English.  What I'm attempting to illustrate is that, if
the system is to achieve some theoretical elegance (and a
nice, clean look, too), it's not going to be desirable to
spend any effort trying to predict what categories will be
most often used.  Even if we were talking about a single-
language system, many categories would not come into play
for a given range of constructions.

It is true that, at some point, especially when dealing with
a specific set of problems within a specific language, I
*might* find it sensible to introduce records.  However, as
Jerry Nowlin pointed out, at this stage it is important to
make it work rather than start worrying too much about
speed.

This doesn't mean that speed is not a consideration.  It is
important that I not lose sight of it.  Memory requirements
are also important (which is why I can't just use tables and
forget it).  My tendency right now is to use lists or sets.
But I dunno.  I do know that records will take me way out of
line with what the system itself needs in order to operate
cleanly and elegantly.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From esquire!info8!yost@cmcl2.NYU.EDU  Thu May 17 08:02:51 1990
Received: from NYU.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA25200; Thu, 17 May 90 08:02:51 MST
Received: by cmcl2.NYU.EDU (5.61/1.34)
	id AA07024; Thu, 17 May 90 11:03:40 -0400
Received: from info8 by ESQUIRE.DPW. id aa21254; 17 May 90 10:57 EDT
Received: from localhost by info8. (4.0/SMI-4.0)
	id AA02151; Thu, 17 May 90 10:59:56 EDT
Message-Id: <9005171459.AA02151@info8.>
From: yost@DPW.COM (Dave Yost)
Reply-To: yost@DPW.COM (Dave Yost)
To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
Cc: icon-group@cs.arizona.edu, yost@cmcl2.NYU.EDU
Subject: Re: Booleans 
In-Reply-To: Your message of Wed, 16 May 90 18:36:08 EDT.
             <225766@Wayne-MTS> 
Phone: +1 212-266-0796 (Voice Direct Line)
Fax: +1 212-266-0790
Organization: Davis Polk & Wardwell
	      1 Chase Manhattan Plaza
	      New York, NY  10005
Date: Thu, 17 May 90 10:59:53 -0400
Sender: yost@info8.NYU.EDU

> Booleans and success/failure play complementary roles in a programming
> language.  You can think of a boolean as capturing and preserving the result
> of success/failure.  It's too bad that Icon doesn't have natural boolean
> operators.  Of course there are umpteen ways to fake them, but none of those
> ways are quite as nice as the standard boolean operators (where I include
> short-circuit "and" and "or" among the standard boolean operators, as they are
> in C).  But I'd much rather have a language like Icon that has success/failure
> but lacks true booleans than a language like Pascal that lacks success/failure
> but has true booleans.

I agree!  Is there even a convention in Icon on how to store true/false
state?  I've used (\x), (x = 1), and (x ~= 0) as true/false indicators.
The last two (last one preferred) I have found better because I get the
extra benefit of a runtime error if I try to test x before it is set --
which can also be *not* what you want sometimes.  Mostly I would rather
have a syntax that says what I mean than to use a nonstandardized fake.
Someone reading the code (x ~= 0) has to look around to see if x can take
on more than two values.
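
A small sketch of the two conventions mentioned above, for comparison.
The variable names done and count are hypothetical, chosen only for
illustration:

```icon
procedure main()
   local done, count

   # Convention 1: &null is false, any non-null value is true.
   # Testing a variable that was never set is legal and simply fails.
   if \done then write("done") else write("not done")
   done := 1
   if \done then write("done")

   # Convention 2: the integers 0 and 1.
   # count must be assigned first; comparing &null with ~= is a
   # run-time error, which is the "extra benefit" described above.
   count := 0
   if count ~= 0 then write("true") else write("false")
   count := 1
   if count ~= 0 then write("true") else write("false")
end
```

Neither form tells a reader that only two values are intended, which is
the complaint made above.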

 --dave yost
   yost@dpw.com or uunet!esquire!yost
   Please ignore the From or Reply-To fields above, if different.

From utah-cs!cs.utexas.edu!yale!LRW.COM!lrw!leichter  Thu May 17 08:39:49 1990
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA28306; Thu, 17 May 90 08:39:49 MST
Received: from cs.utexas.edu by cs.utah.edu (5.61/utah-2.11-cs)
	id AA17130; Thu, 17 May 90 09:39:21 -0600
Posted-Date: Thu, 17 May 90 11:08:45 EDT
Received: from ames.arc.nasa.gov by cs.utexas.edu (5.61/1.62)
	id AA16008; Thu, 17 May 90 10:35:48 -0500
Received: from harvard.harvard.edu by ames.arc.nasa.gov (5.61/1.2); Thu, 17 May 90 08:35:30 -0700
Received: by harvard.harvard.edu (5.54/a0.25)
	(for cs.utexas.edu!utah-cs!arizona!CS.Arizona.EDU!icon-group) id AA28867; Thu, 17 May 90 11:35:11 EDT
Received: from lrw.UUCP by BULLDOG.CS.YALE.EDU via UUCP; Thu, 17 May 90 11:08:10 EDT
Message-Id: <9005171508.AA01693@BULLDOG.CS.YALE.EDU>
Received: by lrw.UUCP (DECUS UUCP w/Smail);
          Thu, 17 May 90 11:08:45 EDT
Date: Thu, 17 May 90 11:08:45 EDT
From: <utah-cs!cs.utexas.edu!yale!LRW.COM!leichter>
To: CS.Arizona.EDU!icon-group@cs.utexas.edu
Subject: RE: problem: Use records, tables, or lists?
X-Vms-Mail-To: IN::"icon-group@CS.Arizona.EDU"

There's actually a standard technique to deal with this kind of problem.
Stated more generally:  You have a set of pairs of names and (attribute,value)
pairs; for example:

	{("green",{("part","adjective"),("plural","none")}),
		("dog",{("part","noun")})}

You are viewing this as a two-level hierarchy:  First you map from the name
("green") to the set of pairs {(part,adjective),(plural,none)}; then within
that set of pairs, you map an attribute ("part") to a value ("adjective").
The problem with this approach, as you've noted, is that the lower level of
the hierarchy is difficult to implement efficiently:  It consists of many
small collections, and you are forced to pay the overhead of a data structure
per collection.

So, the trick is to amortize the cost by collapsing the hierarchy.  Logically,
this involves changing the collection above to the following:

	{(("green","part"),"adjective"),(("green","plural"),"none"),
		(("dog","part"),"noun")}

That is, where you previously had two functions of one argument,
word-to-attribute-list and attribute-to-value, you now have a single function,
word-and-attribute-to-value.

In Icon terms, this is very simple:  Make a single table whose keys consist
of name-attribute pairs, which could be records or simply strings (e.g.,
"green:part"), and whose values are the values you want associated.

The advantage of this approach is that you pay the cost of table maintenance
only once.  As long as large tables are efficiently implemented, this method
will work very well.
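
As a minimal sketch of the collapsed table, assuming string keys of the
form "word:attribute" (the ":" separator is just the one used in the
example above, and the variable name values is hypothetical):

```icon
procedure main()
   local values

   values := table()                  # one table for every (word, attribute) pair
   values["green:part"]   := "adjective"
   values["green:plural"] := "none"
   values["dog:part"]     := "noun"

   write(values["green:part"])        # writes: adjective

   # A pair that was never stored yields the table's default value, &null.
   if /values["dog:plural"] then
      write("dog has no plural entry")
end
```

The separator has to be a character that cannot occur in a word or an
attribute name, or two different pairs could produce the same key.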

What you lose if you use this approach is the ability to pick up all the
attribute-value pairs associated with a single word:  There is no efficient
way to extract from a table everything whose key is of the form
"green:<something>".  Depending on your application, this may not be an issue
at all, or it may be one that you can work around.  For example, you can
maintain a separate table in which the keys are words and the values are
lists of attributes.  This will work well unless you need to look up all
the attributes very often, or you change the attribute lists associated with
a given word frequently.
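
One way the side table might be kept in step with the main one is
sketched below.  The procedure name setattr and the table names values
and attrs are hypothetical; the sketch assumes each (word, attribute)
pair is stored only once:

```icon
# Store one value in the flat table and record the attribute name
# in the per-word index.
procedure setattr(values, attrs, word, attr, value)
   values[word || ":" || attr] := value
   /attrs[word] := []                 # first attribute seen for this word
   put(attrs[word], attr)
end

procedure main()
   local values, attrs, a

   values := table()
   attrs := table()
   setattr(values, attrs, "green", "part", "adjective")
   setattr(values, attrs, "green", "plural", "none")
   setattr(values, attrs, "dog", "part", "noun")

   # List everything known about "green" by walking its attribute list.
   every a := !attrs["green"] do
      write(a, " = ", values["green:" || a])
end
```

Storing the same pair twice would add the attribute name to the list
twice, so a real version would need a membership check or a set.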

There are, of course, many possible optimizations.  For example, if there is
a list of common attributes which almost all words have values for, it is
probably better to store those in a separate record, and only store the extras
in the table.  Optimizing the common case often gives you most of the
performance advantage with very little of the extra cost.
							-- Jerry

From gmt  Thu May 17 10:07:02 1990
Date: Thu, 17 May 90 10:07:02 MST
From: "Gregg Townsend" <gmt>
Message-Id: <9005171707.AA04752@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15)
	id AA04752; Thu, 17 May 90 10:07:02 MST
To: icon-group
Subject: memory requirements of tables

Small tables in Icon v8 use *much* less space than in version 7.  This doesn't
necessarily make them the best approach (small records are still cheaper), but
don't make any decisions based on the old version's behavior.

    Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
    +1 602 621 4325     gmt@cs.arizona.edu     110 57 16 W / 32 13 45 N / +758m

From goer@sophist.uchicago.EDU  Fri May 18 13:49:44 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA00357; Fri, 18 May 90 13:49:44 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Fri, 18 May 90 13:48 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 18 May 90
 15:21:45 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA28592; Fri, 18 May 90
 15:14:27 CDT
Resent-Date: Fri, 18 May 90 13:48 MST
Date: Fri, 18 May 90 15:14:27 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: deterministic automata
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <923202F42DFFA12DCC@Arizona.EDU>
Message-Id: <9005182014.AA28592@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

Thanks to everyone who responded regarding converting nondeterministic
finite state automata to deterministic ones.

I read and thought about the responses for a couple of days.  The dfsa
would be too big, I gathered - at least in some instances.  And rarely
would all of it get used.  So I just built a program that makes the
nfsa, and then builds the dfsa as it goes, removing epsilon moves, and
collapsing states together on the fly.

It is true that this system works out conceptually very cleanly.  I
just found that it is slow, slow, slow.

The reason is that, in order for it to work, I could not play games that
I use with fsa's and pushdown automata, namely utilize Icon's existing
functions like find, any, match, etc.  I had to break each step down
into a node and arcs labeled with a single character.  Only in this
manner is it easy to find out, after removing epsilon moves and multiplying
out the states, whether two or more arcs with the same label diverge from
a single state(-set).  I also have not found a good, clean way of using
Icon to make references to state-sets clean.  It's not a matter of making
a table which stores simple integer-labeled nodes.  We're not talking about
states that can be labeled with simple integers anymore, but rather sets
of integers (corresponding to states in the nfsa).

Anyway, although converting an nfsa to a dfsa proved pretty easy, I have
yet to make it really efficient within Icon.

Having said this, let me ask if anyone else has played with dfsa's in
Icon.  Has anyone found a clean way of referring to (and checking for
the previous existence of) sets of states?  I don't even have a good way
of, say, collecting final state-sets together and storing them.  I have 
to convert them to some other data type (usually a string), and then
store them in this form.  Lotsa space.  I also have no way of easily
getting back to using any() for what corresponds to regular expressions
such as [a-z].  I'm sure I could somehow collect the arcs, check to
see whether they point to the same state-sets (with the attendant
problems noted above regarding uniqueness of sets), and then make them all
point to a string equivalent to the set which they all lead to, and
then enter that string equivalent in a table....

My mind's beginning to wander as I try to fathom the overhead.  Is this
really a job I should reasonably only be doing in C, or am I just
missing some shortcuts?

I have, by the way, made the nfsa work all by itself pretty nicely and
efficiently in Icon (small, too - just a couple of table entries).  It's
the dfsa that's killing me.

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From nevin1@ihlpb.att.com  Fri May 18 16:18:47 1990
Message-Id: <9005182318.AA10442@megaron.cs.arizona.edu>
Received: from att-in.att.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
	id AA10442; Fri, 18 May 90 16:18:47 MST
From: nevin1@ihlpb.att.com
Date: Fri, 18 May 90 17:53 CDT
Original-From: ihlpb!nevin1 (Nevin J Liber +1 708 979 4751)
To: att!cs.arizona.edu!icon-group
Cc: Richard Goerwitz <att!sophist.uchicago.EDU!goer>
Subject: Re: deterministic automata

Richard Goerwitz <goer@sophist.uchicago.EDU> writes:
>We're not talking about
>states that can be labeled with simple integers anymore, but rather sets
>of integers (corresponding to states in the nfsa).

But you CAN still use integers (especially since Icon now has
arbitrarily large integers); all you need to do is encode the states.

You could use a binary encoding scheme, where each position corresponds
to a state in the NFA (non-deterministic finite automaton).

Suppose that your DFA (deterministic finite automaton) state is [0,2,3].
Converting this to a binary string, you get (from most significant bit
to least significant bit) "1101" (bits 3, 2, and 0 are "on").
This is easily converted to 13, an integer (in Icon, try converting
with integer("2r" || "1101"), or the "uglier" 2^3 + 2^2 +2^0), or
you can keep it in binary string form if that is more convenient.
[Note:  for the string form, it may or may not prove useful to pad it
out with 0's on the left so all the strings have the same length, which
would be equal to the number of states in the NFA.]

Once in this form, the problem you had with the inefficiency of having
different sets of the exact same elements goes away, and it is still
relatively easy to check and see if a given NFA state is part of a
certain DFA state (in binary string form, for example, use
dfaState[-1 - nfaState]).
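
The encoding described above can be sketched as follows.  The
procedure name encode and the table name seen are hypothetical, and the
sketch assumes NFA states are numbered from 0 to nstates - 1:

```icon
# Encode a list of NFA state numbers as a fixed-width bit string,
# with state 0 in the rightmost position.
procedure encode(states, nstates)
   local bits, s
   bits := repl("0", nstates)
   every s := !states do
      bits[-1 - s] := "1"
   return bits
end

procedure main()
   local key, seen

   key := encode([0, 2, 3], 4)
   write(key)                         # writes: 1101
   write(integer("2r" || key))        # writes: 13

   # Strings are compared by value, so the same set of states always
   # produces the same table key, whatever order it was built in.
   seen := table()
   seen[key] := 1
   if \seen[encode([3, 0, 2], 4)] then
      write("state-set already seen")

   # Membership test for NFA state 2.
   if key[-1 - 2] == "1" then
      write("NFA state 2 is in this DFA state")
end
```

A state number outside the range is silently ignored here, because the
subscript fails; a real version would probably want to report it.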

>Anyway, although converting an nfsa to a dfsa proved pretty easy, I have
>yet to make it really efficient within Icon.

But how often are you going to do the conversion?  Unless you are
building these state machines on the fly, this part of the project
probably isn't worth making more efficient.

	NEVIN ":-)" LIBER  nevin1@ihlpb.ATT.COM  (708) 831-FLYS

From icon-group-request@arizona.edu  Wed May 23 12:41:08 1990
Resent-From: icon-group-request@arizona.edu
Received: from Arizona.EDU (Maggie.Telcom.Arizona.EDU) by megaron (5.61/15) via SMTP
	id AA19800; Wed, 23 May 90 12:41:08 -0700
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 23 May 90 12:40 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA08014; Wed, 23 May 90
 12:32:25 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for
 icon-group@arizona.edu (icon-group@arizona.edu) (contact
 usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 23 May 90 12:41 MST
Date: 23 May 90 18:27:14 GMT
From: usc!snorkelwacker!ai-lab!idsardi@ucsd.EDU
Subject: graphics
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <8E4D882E88FF2012D2@Arizona.EDU>
Message-Id: <8683@rice-chex.ai.mit.edu>
Organization: MIT Artificial Intelligence Laboratory
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

I'm looking to draw some trees and graphs, especially
on the Mac.  I have ProIcon and was wondering whether
there's a hack that would allow quickdraw calls to
be made on ProIcon windows.  Barring that, does anyone
have character-oriented graphics, stuff that would
approximate line drawing?

Thanks,

Bill Idsardi

From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS  Thu May 24 07:34:19 1990
Received: from umich.edu by megaron (5.61/15) via SMTP
	id AA10074; Thu, 24 May 90 07:34:19 -0700
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0)
	id AA03230; Thu, 24 May 90 10:34:18 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Thu, 24 May 90 10:33:00 EDT
Date: Wed, 23 May 90 22:45:42 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <227438@Wayne-MTS>
Subject: Coexpressions revisited

 
In some earlier messages I advanced the view that there really wasn't much use
in Icon for coexpressions other than those that returned values via `suspend'.
In other words, coexpressions should be limited to the generalization of
generators that would enable them to be called from different places or from
the same place at different times.  Such coexpressions seemed capable of doing
all the "interesting" examples, such as interleaving the output of two
generators. 
 
Since then I've thought of another example that shows why the more general form
of coexpressions might be needed after all.  This example is not theoretical
at all.
 
Suppose we want to write an error-listing routine `errlist' that receives
error messages, each consisting of a page number and a string.  We want the
output of this routine to indicate all the errors on a page, listing the page
number *just once*.  Easy enough with an ordinary procedure, it seems.  But
now let's impose one restriction: `errlist' cannot use any static or
global variables to save its state from one call to another.  This restriction
is one often encountered in writing recursive procedures.  Now `errlist' has a
problem: how to know if an error message has the same page number as its
predecessor. 
 
Here's how `errlist' might be written using coexpressions (I haven't tested it):
 
record errmessage(page, text)
 
procedure errlist()
   local prevpage, err
   repeat {
      # messages on the same page as the previous one: text only
      while (err := @&source).page = \prevpage do 
         write(repl(" ", 10), err.text)
      # first message on a new page: page number, then text
      write(right(err.page, 8), "  ", err.text)
      prevpage := err.page
   }
end
 
If `errlist' produced values instead of receiving them, we could make it into
a single coexpression that suspended the values in turn.  But there's no
`accept' expression corresponding to `suspend', so  `errlist' needs to use @
to pick up the message pairs, and the various callers need to use @ to
transmit those pairs.  There is a kludgy way around this: `errlist' suspends a
record object, and the caller fills in the object.  When `errlist' gets
control back, it processes whatever is in the object.  But this kludge is not
at all satisfying; not only is it awkward to use, but it requires special
handling to initiate and terminate `errlist'.
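
As a sketch of how a caller might drive such a coexpression, here is a
self-contained variant with a small main procedure.  It assumes the
procedure is named errlist, reads the text through err.text, and
assigns prevpage with :=.  The name lister, the priming activation and
the sample messages are illustrative assumptions:

```icon
record errmessage(page, text)

procedure errlist()
   local prevpage, err
   repeat {
      # Messages on the same page as the previous one: text only.
      while (err := @&source).page = \prevpage do
         write(repl(" ", 10), err.text)
      # First message on a new page: page number, then text.
      write(right(err.page, 8), "  ", err.text)
      prevpage := err.page
   }
end

procedure main()
   local lister

   lister := create errlist()
   @lister          # prime it: run errlist up to its first @&source

   # Each activation transmits one message; errlist hands control
   # back when it asks &source for the next one.
   errmessage(3, "missing end") @ lister
   errmessage(3, "undeclared identifier") @ lister
   errmessage(7, "unclosed string") @ lister
end
```

The priming activation is needed because a value transmitted on the
first activation of a coexpression is not delivered to it.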
 
A more intuitive explanation of what's going on here is that connecting
coexpressions via `suspend' provides a form of input piping, rather like Unix
but generalized to allow several inputs to one filter.  Input piping usually
suffices, but in cases such as 'errlist', output piping is needed as well.
 
Paul Abrahams
abrahams%wayne-mts@um.cc.umich.edu

From pax@ihcup.att.com  Thu May 24 15:48:42 1990
Date: Thu, 24 May 90 15:48:42 -0700
From: pax@ihcup.att.com
Message-Id: <9005242248.AA14248@megaron.cs.arizona.edu>
Received: from att.UUCP by megaron.cs.arizona.edu (5.61/15) via UUCP
	id AA14248; Thu, 24 May 90 15:48:42 -0700
To: icon-group@arizona.att.com
Subject: Icon cross ref

I seem to remember some time ago that someone posted to this group
the Icon source for an Icon Cross Reference program.  At the time I
did not save the source but would now like to have such a program.

I need to cross reference Icon V8 procedures and global variables.

I would appreciate it very much if the provider(s) of cross reference
program(s) would send me the source to:

	att!ihcup!pax

Thanks

Joe T. Hall
AT&T Bell Laboratories
200 Park Plaza, Room IHP 2B-524
Naperville, Illinois 60566-7050
USA
att!ihcup!pax
tel: +1 708 713-7285
fax: +1 708 713-7480
tlx:157294384(JTHALL)

From buchs@Mayo.edu  Tue May 29 12:57:36 1990
Received: from fermat.Mayo.edu by megaron (5.61/15) via SMTP
	id AA04505; Tue, 29 May 90 12:57:36 -0700
Received: from FALCON.DECnet MAIL11D_V3 by fermat.Mayo.edu (5.57/Ultrix2.4-C)
	id AA16575; Tue, 29 May 90 14:48:07 CDT
Date: Tue, 29 May 90 14:48:06 CDT
Message-Id: <9005291948.AA16575@fermat.Mayo.edu>
From: buchs@Mayo.edu
To: :"icon-group@cs.arizona.edu"@FERMAT
Cc: BUCHS@fermat.Mayo.edu
Subject: beginner help


I have just started with Icon.  It looks like I really need to
get "The Icon Programming Language" to get very far.  What have
others done?  I have found a bit of info in some of the TR
documents and I could probably read through some of the library
programs to get ideas.

I am trying to parse a file with lines of backslash delimited
fields, with no trailing delimiter:

  field1\field2\field3

I thought I was on to an elegant way with the string scanning
operator:

  procedure main()
    while line := read() do {
      line ? while write(tab(find("\\")))
        do move(1)
    }
  end

But I cannot get the last field.  Any ideas?
-------------------------------------------------------------
Kevin Buchs          Internet: buchs@mayo.edu
Mayo Foundation              Is this my life or is it just an
Rochester, MN 55905          incredible, high-speed, simulation?
(507) 284-0009                         -S. R. Cleaves
-------------------------------------------------------------


From goer@sophist.uchicago.EDU  Tue May 29 16:37:23 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Arizona.EDU (Maggie.Telcom.Arizona.EDU) by megaron (5.61/15) via SMTP
	id AA21065; Tue, 29 May 90 16:37:23 -0700
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Tue, 29 May 90 16:27 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 29 May 90
 18:26:43 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA13953; Tue, 29 May 90
 18:22:07 CDT
Resent-Date: Tue, 29 May 90 16:29 MST
Date: Tue, 29 May 90 18:22:07 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <8976B8E3B55F20A047@Arizona.EDU>
Message-Id: <9005292322.AA13953@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

  I am trying to parse a file with lines of backslash delimited
  fields, with no trailing delimiter:

    field1\field2\field3

  I thought I was on to an elegant way with the string scanning
  operator:

    procedure main()
      while line := read() do {
        line ? while write(tab(find("\\")))
          do move(1)
      }
    end

It looks to me as though, for a beginner, you have gotten pretty
far into string scanning.  Nice work.

The problem is easy to see - once, that is, you "get it."  Find()
will tell you the position in line (the subject of the string-
scanning operation) at which a backslash occurs.  Tab() will then
move you there.  This works fine for a while.  But what happens
when you've come to the last \, have move()'d past it?  You have
no backslashes left on the line to find(), and so naturally find()
fails, the loop fails, the scanning expression fails, and you
are sent back for another input string.

Try doing something like this:

while line := read() do {
  line ? {
    while write(tab(find("\\")|0))
    do move(1) | break
  }
}

The expression a|b yields all results in a, then those for b.  In
the context above, the a side is all that will normally get evaluated.
In other words, when you say, tab(find(x)), find will only
look for one result.  Icon only starts backtracking and doing all
those fancy result-sequences in contexts like

    every i := tab(find(x))
    do ...

Right here we are looking for only one result.  What I've done
above is to show you how to exploit Icon's desire to find just
one result.  find() in tab(find(x)) will return an integer as
long as there is a backslash ahead of it in the current subject.
When it fails, that's it.  However, if we say

    tab(find(x)|0)

when find() fails, Icon has another expression it can try to
see if it produces another result, namely the 0.  Tab(0) means,
"go to the end of the line."  You'll note above that I put in
the expression "move(1) | break" as the do-clause associated
with tab(find("\\")|0).  All this does is make sure that if
you can't move one character (i.e. you've hit line's end), the
loop will exit.  "move(1) | break" means, at least in this context,
"try to move(1), and if you can't, try to do whatever is
on the other side of the bar (namely break out of the loop)."

This should do what you want, though I confess that I haven't
tested the code.

You might want to use the fields of your backslashed strings
in various ways, but I'd guess your best bet is to place them
in a list:

tmp_lst := []
line ? {
  while put(tmp_lst,tab(find("\\")|0))
  do move(1) | break
  }
Now do something with tmp_lst, like save it in a bigger list,
or permute it, or just print it...
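
Wrapped up as a complete program, the list-building loop might look
like this.  The procedure name fields and the sample line are
hypothetical:

```icon
# Split a backslash-delimited line into a list of its fields.
procedure fields(line)
   local lst
   lst := []
   line ? {
      while put(lst, tab(find("\\") | 0)) do
         move(1) | break              # step over the backslash, or stop at line's end
   }
   return lst
end

procedure main()
   # Writes field1, field2 and field3 on separate lines.
   every write(!fields("field1\\field2\\field3"))
end
```

In real use main would call fields(read()) inside a while loop.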

Get it?  Am I being too pedantic, or is this okay?

Nice job picking up string scanning so fast.


From nevin1@ihlpb.att.com  Tue May 29 17:17:59 1990
Message-Id: <9005300017.AA24779@megaron>
Received: from att-in.att.com by megaron (5.61/15) via SMTP
	id AA24779; Tue, 29 May 90 17:17:59 -0700
From: nevin1@ihlpb.att.com
Date: Tue, 29 May 90 19:16 CDT
Original-From: ihlpb!nevin1 (Nevin J Liber +1 708 979 4751)
To: att!cs.arizona.edu!icon-group
Cc: att!Mayo.edu!buchs
Subject: Re: beginner help

>I am trying to parse a file with lines of backslash delimited
>fields, with no trailing delimiter:

>  field1\field2\field3

>I thought I was on to an elegant way with the string scanning
>operator:
>
>  procedure main()
>    while line := read() do {
>      line ? while write(tab(find("\\")))
>        do move(1)
>    }
>  end
>
>But I cannot get the last field.  Any ideas?

The reason that you don't get the last field is that your expression
fails just before that point.  What happens is find() doesn't see a
backslash so it fails, and since failure is inherited, tab() fails,
write() fails, the inner while clause fails, the string scanning fails,
the do part of the outer while loop is done, and control is passed back
to the outer while clause (to read in another line).  You need to
specify what happens when find() can't find a backslash.


Here is how I would have coded it (late at night after work :-)):

	procedure main()

	local	line

	while line := read() do
		line ? while write(tab(upto('\\') | 0)) & move(1)

	end

Note:  you could use find("\\") in place of the upto('\\'), but I
prefer using upto() for three reasons:

1.  It emphasizes that you are looking for a delimiter of length 1.
2.  It allows you to look for more than one delimiter at the same time.
3.  If I recall correctly, it is the idiom found in the Icon documentation.

Anyway, here is an explanation of what goes on at the last field:  the
upto('\\') fails, so by alternation (the | operator) a write(tab(0)) is
performed, printing out the last field.  Next move(1) fails (since the
current position is at the end of the string, it cannot move over one
position to the right), the inner while clause fails, the string
scanning fails, the do part of the outer while loop is done, and
control is passed back to the outer while loop (to read in another
line).
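
To illustrate reason 2 above, here is a sketch in which either a
backslash or a colon ends a field.  The colon and the sample line are
assumptions made up for the example:

```icon
procedure main()
   local line

   line := "field1\\field2:field3"
   # The cset '\\:' holds two delimiter characters; upto() stops at
   # whichever comes first.
   line ? {
      while write(tab(upto('\\:') | 0)) & move(1)
   }
end
```

This writes field1, field2 and field3 on separate lines.  With find()
the same effect would need an alternation of two find() calls.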

	NEVIN ":-)" LIBER  ..!gargoyle!igloo!nevin  (708) 831-FLYS

From CELEX@HNYMPI52.BITNET  Wed May 30 03:19:39 1990
Received: from rvax.ccit.arizona.edu by megaron (5.61/15) via SMTP
	id AA22964; Wed, 30 May 90 03:19:39 -0700
Received: from HNYMPI52.BITNET by rvax.ccit.arizona.edu; Wed, 30 May 90 03:18
 MST
Date: Wed, 30 May 90 11:14 N
From: CELEX@HNYMPI52.BITNET
Subject: splitting up lines.
To: icon-group@cs.arizona.edu
Message-Id: <891BFBCD971F602E5A@rvax.ccit.arizona.edu>
X-Original-To:  icon-group@cs.arizona.edu, CELEX
X-Envelope-To: icon-group@cs.arizona.edu

I found the recent discussion on dividing a line into its fields very
interesting. A lot of our files are organized in this way, so we have to
use these techniques pretty often.
For finding the n-th field of a line we use the following procedure:
 
procedure field(line,n)
 
if n = 1 then return line[1:find("\\",line)]
every x := 1 + find("\\",line) \ (n - 1)
return line[x:find("\\",line,x)]
 
end
 
It does the job, but I wonder if this can be done quicker or more
elegantly. Any comments?
 
Marcel Bingley
 
CELEX
University of Nijmegen
Nijmegen - The Netherlands

From buchs@Mayo.edu  Wed May 30 08:14:53 1990
Received: from fermat.Mayo.edu by megaron (5.61/15) via SMTP
	id AA03402; Wed, 30 May 90 08:14:53 -0700
Received: from FALCON.DECnet MAIL11D_V3 by fermat.Mayo.edu (5.57/Ultrix2.4-C)
	id AA18237; Wed, 30 May 90 10:09:25 CDT
Date: Wed, 30 May 90 10:09:24 CDT
Message-Id: <9005301509.AA18237@fermat.Mayo.edu>
From: buchs@Mayo.edu
To: :"icon-group@cs.arizona.edu"@FERMAT
Cc: BUCHS@fermat.Mayo.edu
Subject: Re: beginner help

Thanks to everyone who helped me over my first hurdle.
-------------------------------------------------------------
Kevin Buchs          Internet: buchs@mayo.edu
Mayo Foundation              Is this my life or is it just an
Rochester, MN 55905          incredible, high-speed, simulation?
(507) 284-0009                         -S. R. Cleaves
-------------------------------------------------------------


From goer@sophist.uchicago.EDU  Wed May 30 12:23:29 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Arizona.EDU (Maggie.Telcom.Arizona.EDU) by megaron (5.61/15) via SMTP
	id AA21030; Wed, 30 May 90 12:23:29 -0700
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 30 May 90 12:09 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 30 May 90
 14:08:04 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA15300; Wed, 30 May 90
 14:03:28 CDT
Resent-Date: Wed, 30 May 90 12:15 MST
Date: Wed, 30 May 90 14:03:28 CDT
From: Richard Goerwitz <goer@sophist.uchicago.EDU>
Subject: splitting up lines
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <88D10B1CC47F20B3CB@Arizona.EDU>
Message-Id: <9005301903.AA15300@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

    I found the recent discussion on dividing a line into its fields very
    interesting. A lot of our files are organized in this way, so we have to
    use these techniques pretty often.
    For finding the n-th field of a line we use the following procedure:

    procedure field(line,n)

    if n = 1 then return line[1:find("\\",line)]
    every x := 1 + find("\\",line) \ (n - 1)
    return line[x:find("\\",line,x)]

    end

Icon is pretty good about letting us dispense with what I call
disparagingly the "ij stuff" (this isn't a joke about Dutch, by
the way :-), but rather a reference to the usual variable names
used in explicit subscripting operations).

procedure find_field(line,sep,n)

    x := 0
    line ? {
        until (x +:= 1) = n
        do tab(find(sep)+*sep) | fail
        target := tab(find(sep)|0)
     }

    return target

end

Note that sep defines the field separator.  It has to be a string.  It
would be pretty easy to have it be itself a pattern.  A few weeks ago
I posted a program called find_re that works like find above, except
that it takes an egrep-style regular expression.  I have a new version
around if anyone wants it.  I see no reason to keep on posting code,
when the old code works (it has some minor bugs) - at least until I
can prod people into trying it out and letting me know if it works as
it should.  I can only be just so imaginative on my own :-).
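
For anyone who wants to try it, here is a self-contained sketch with
the same logic as find_field above, local declarations added, and a
few sample calls.  The sample strings are made up for illustration:

```icon
procedure find_field(line, sep, n)
   local x, target
   x := 0
   line ? {
      # Skip n - 1 fields, failing if the line runs out first.
      until (x +:= 1) = n do
         tab(find(sep) + *sep) | fail
      target := tab(find(sep) | 0)
   }
   return target
end

procedure main()
   write(find_field("field1\\field2\\field3", "\\", 2))   # writes: field2
   write(find_field("a::b::c", "::", 3))                   # writes: c
   if not find_field("a::b::c", "::", 4) then
      write("no fourth field")
end
```

Because the procedure fails when the field does not exist, it can be
used directly as the condition of an if or while.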

Is this more elegant than what you posted?  There is no disputing
matters of taste.  Take your pick.  Probably I'd use a completely
different approach, like making your field-finder a matching procedure
used in string scanning expressions.  What sort of context would you
use this in?

    -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
    goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer