From tenaglia@fps.mcw.edu Wed Jan 3 08:16:12 1990
Received: from RUTGERS.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA16236; Wed, 3 Jan 90 08:16:12 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA02559; Wed, 3 Jan 90 10:15:21 EST
Received: by uwm.edu; id AA02111; Wed, 3 Jan 90 09:06:14 -0600
Message-Id: <9001031506.AA02111@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Wed, 3 Jan 90 08:31:28 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail); Wed, 3 Jan 90 08:08:07 CDT
Date: Wed, 3 Jan 90 08:08:07 CDT
From: Chris Tenaglia - 257-8765
To: icon-group@arizona.edu
Subject: Handy icon procedure make hex dumps of strings
Status: O

I have another handy but small procedure. It's called dump(str). It's
chiefly a debugging tool. It has to be linked with radcon, however.
dump(str) converts a string of unknown bytes into a list of hexadecimal
formatted ascii. The string "Hello" becomes the list
["48","65","6C","6C","6F"]. This can be output nicely with the expression:

        every writes(!dump(str)," ")

-------------------------------------------------------------------------
##################################################################
#                                                                #
# THIS PROCEDURE CONVERTS A MYSTERY STRING TO A HEX DUMP LIST    #
#                                                                #
##################################################################
procedure dump(Str)                     # REQUIRES LINK RADCON !
  Buffer := []
  every put(Buffer,right(map(radcon(ord(!Str),10,16),&lcase,&ucase),2,"0"))
  return Buffer
  end
---------------------------------------------------------------------------

Perhaps it could have been done better as a generator or coexpression.
Any ideas for improvements?

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu

From gmt Wed Jan 3 10:29:59 1990
Date: Wed, 3 Jan 90 10:29:59 MST
From: "Gregg Townsend"
Message-Id: <9001031729.AA22739@megaron.arizona.edu>
Received: by megaron.arizona.edu (5.59-1.7/15) id AA22739; Wed, 3 Jan 90 10:29:59 MST
To: icon-group
Subject: The Icon project has moved
Status: O

The Icon project's home machine, formerly "arizona.edu", has changed its
Internet domain name to "cs.arizona.edu". The uucp sitename of "arizona"
has not changed.

FTP files from:
        cs.arizona.edu (128.196.128.118 or 192.12.69.1)
Send questions to:
        icon-project@cs.arizona.edu
        uunet!arizona!icon-project
Mailing list contributions:
        icon-group@cs.arizona.edu
        uunet!arizona!icon-group
Changes of address:
        icon-group-request@cs.arizona.edu
        uunet!arizona!icon-group-request

From tenaglia@fps.mcw.edu Fri Jan 5 09:16:56 1990
Received: from RUTGERS.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA13014; Fri, 5 Jan 90 09:16:56 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA06126; Fri, 5 Jan 90 11:16:00 EST
Received: by uwm.edu; id AA27525; Fri, 5 Jan 90 09:32:05 -0600
Message-Id: <9001051532.AA27525@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Fri, 5 Jan 90 08:50:10 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail); Fri, 5 Jan 90 08:07:21 CDT
Date: Fri, 5 Jan 90 08:07:21 CDT
From: Chris Tenaglia - 257-8765
To: icon-group@cs.arizona.edu
Subject: Procedures for packed decimal conversions
Status: O

Dear Icon-Group,

Here are two other interesting procedures. pack() and unpack() deal with
translating numbers to and from a packed decimal format. Packed format is
used to de/compress decimal numbers. The packed numbers use just over 1/2
the space of ascii formatted numbers. I use the format for experiments with
encryption. Sometimes databases may use them. unpack() requires radcon()
from the icon program library.
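
A minimal sketch of the layout may help here. It assumes the usual packed
decimal convention that pack() appears to follow: one decimal digit per
4-bit nibble, with the final nibble holding a sign code (hex C for
positive, hex D for negative). The helper name nibbles() is hypothetical
and is not part of the posted code; it only shows the hex digits that the
packed bytes would contain.

```icon
# Hypothetical helper (not from the original post): returns the hex
# nibbles of the packed decimal form of n, assuming sign codes C and D.
procedure nibbles(n)
   local sign, digits
   (n := integer(n)) | fail
   sign := if n < 0 then "D" else "C"      # the sign nibble goes last
   digits := abs(n) || sign
   # pad with a leading zero nibble so the result fills whole bytes
   if *digits % 2 ~= 0 then digits := "0" || digits
   return digits
end

procedure main()
   local n, h
   every n := 123 | -123 | 1234 do {
      h := nibbles(n)
      write(right(n, 6), " -> ", h, "  (", *h / 2, " bytes)")
   }
end
```

Under these assumptions 123 maps to the nibbles 123C, -123 to 123D, and
1234 to 01234C, so an n-digit number needs (n+1)/2 bytes, rounded up.
That is where the "just over 1/2" figure comes from.
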
Perhaps someone would care to improve on my code either in performance or
elegance?

pack(num,width) packs a number into a packed decimal string. num is the
number and width is the size of the target packed string.

unpack(val,width) unpacks a packed decimal number into an integer. width
is the width of the returned integer.

##################################################################
#                                                                #
# THIS PROCEDURE PACKS AN INTEGER INTO A PACKED DECIMAL STRING.  #
#                                                                #
##################################################################
procedure pack(num,width)               # 5p
  local int,sign,prep,packed,i,word,calc
  (int := integer(num)) | fail
  if int < 0 then sign := "=" else sign := "<"
  prep   := abs(int) || sign            # abs() keeps "-" out of the digit string
  packed := ""
  if (*prep % 2) ~= 0 then prep := "0" || prep
  every i := 1 to *prep by 2 do
    {
    word := prep[i+:2]
    if word[-1] == ("=" | "<") then
      {
      calc := word[1]*16 + ord(word[2])-48
      packed ||:= char(calc)
      next
      }
    calc := word[1]*16 + word[2]
    packed ||:= char(calc)
    }
  /width := *packed
  return right(packed,width,"\0")
  end

##################################################################
#                                                                #
# THIS PROCEDURE UNPACKS A VALUE INTO AN INTEGER.                #
#                                                                #
##################################################################
procedure unpack(val,width)             # 6p REQUIRES LINK RADCON !
  local tmp,number,tens,ones,sign
  tmp := "" ; sign := 1
  every number := ord(!val) do
    {
    hex := map(radcon(number,10,16),&lcase,&ucase)
    tmp ||:= hex
    }
  if tmp[-1] == ("B" | "D") then sign := -1
  tmp[-1] := "" ; tmp *:= sign ; /width := *tmp
  return right(tmp,width)
  end

Have Fun !

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu

From SHAFIE@UCBEH.SAN.UC.EDU Thu Jan 11 13:29:03 1990
Received: from ucbeh.san.uc.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA07629; Thu, 11 Jan 90 13:29:03 MST
Date: Thu, 11 Jan 90 14:13 EST
From: Amin Shafie - Univ of Cincinnati Comp Ctr
Subject: SIGUCCS CALL for PARTICIPATION
To: 386USERS@TWG.COM, 9370-L%HEARN.BITNET@MITVMA.MIT.EDU,
    AAI@ST-LOUIS-EMH2.ARMY.MIL, ADA-SW@WSMR-SIMTEL20.ARMY.MIL,
    ADVISE-L%CANADA01.BITNET@CUNYVM.CUNY.EDU, ADVSYS@EDDIE.MIT.EDU,
    AG-EXP-L%NDSUVM1.BITNET@CUNYVM.CUNY.EDU, AI-ED@SUMEX-AIM.STANFORD.EDU,
    AIDSNEWS%RUTVM1.BITNET@CUNYVM.CUNY.EDU, AIList@AI.AI.MIT.EDU,
    AIX-L%BUACCA.BITNET@MITVMA.MIT.EDU, ALLIN1-L@CCVM.SUNYSB.EDU,
    AMETHYST-USERS@WSMR-SIMTEL20.ARMY.MIL, AMIGA-RELAY@UDEL.EDU,
    ANDREW-DEMOS@ANDREW.CMU.EDU, ANTHRO-L%UBVM.BITNET@CUNYVM.CUNY.EDU,
    apollo@UMIX.CC.UMICH.EDU, ARMS-D@XX.LCS.MIT.EDU,
    ARPANET-BBOARDS@MC.LCS.MIT.EDU, ASM370%UCF1VM.BITNET@CUNYVM.CUNY.EDU,
    AVIATION@MC.LCS.MIT.EDU, AVIATION-THEORY@MC.LCS.MIT.EDU,
    bicycles@BBN.COM, BIG-LAN@SUVM.ACS.SYR.EDU, BIG-LAN@SUVM.BITNET,
    BIOTECH%UMDC.BITNET@CUNYVM.CUNY.EDU, BIOTECH@UMDC.UMD.EDU,
    BITNEWS%BITNIC.BITNET@CUNYVM.CUNY.EDU,
    BMDP-L%MCGILL1.BITNET@CORNELLC.CCS.CORNELL.EDU,
    bug-1100@SUMEX-AIM.STANFORD.EDU, CA@THINK.COM,
    CADinterest^.es@XEROX.COM, CAN-INET@MC.LCS.MIT.EDU,
    cisco@SPOT.COLORADO.EDU
Message-Id:
X-Envelope-To: Icon-Group@ARIZONA.EDU
X-Vms-To: @LISTS.DIS
X-Vms-Cc: SHAFIE
Status: O

              SIGUCCS User Services Conference XVIII
                     Call For Participation

               New Centerings in Computing Services

               September 30 through October 3, 1990

                          Westin Hotel
                        Cincinnati, Ohio

Received: from rvax.ccit.arizona.edu by megaron (5.59-1.7/15) via SMTP id AA20569; Fri, 12 Jan 90 13:47:46 MST
Received: from ASUACAD.BITNET by rvax.ccit.arizona.edu; Fri, 12 Jan 90 13:43 MST
Received: by ASUACAD (Mailer R2.05) id 9455; Fri, 12 Jan 90 13:37:55 MST
Date: Fri, 12 Jan 90 13:37:11 MST
From: mannem ravinder reddy
Subject: unsubscribe
To: icon-group
Status: O

unsubscribe icon-group

From PRONK@HROEUR5.BITNET Thu Jan 18 16:01:25 1990
Message-Id: <9001182301.AA18292@megaron.arizona.edu>
Received: from rvax.ccit.arizona.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA18292; Thu, 18 Jan 90 16:01:25 MST
Received: from HROEUR5.BITNET by rvax.ccit.arizona.edu; Thu, 18 Jan 90 15:44 MST
Date: Thu, 18 Jan 90 13:06 N
From: PRONK@HROEUR5.BITNET
Subject: unsubscribe
To: icon-group@cs.arizona.edu
X-Original-To: icon-group@cs.arizona.edu, PRONK
Status: O

unsubscribe

From icon-group-request@arizona.edu Fri Jan 19 06:50:55 1990
Received: from ucbvax.Berkeley.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA01143; Fri, 19 Jan 90 06:50:55 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA29135; Fri, 19 Jan 90 05:40:50 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Date: 18 Jan 90 22:00:00 GMT
From: amdahl!ntmtv!hildum@apple.com (Eric Hildum)
Organization: Northern Telecom (Mountain View, CA)
Subject: Installing Icon 7.5 on Sun 4
Message-Id: <679@ntmtv.UUCP>
References: <678@ntmtv.UUCP>
Sender: icon-group-request@arizona.edu
To: icon-group@arizona.edu
Status: O

I have installed Icon 7.5 on a Sun 4 workstation, and have run into some
problems. The operating system is SunOS 4.0.3c; the installation was done
by Bill Mitchell on November 22, 1988. After the installation, I ran the
full test suite, and the gc2 and checking tests apparently did not pass.
In addition, this port does not support overflow checking or
co-expressions.

Are these known problems, and is there a more recent port to the Sun 4
which supports overflow checking and co-expressions?

Thanks,
Eric

replies to:
        ntmtv!hildum@ames.com
        hildum@iris.ucdavis.edu

From icon-group-request@arizona.edu Fri Jan 19 06:51:00 1990
Received: from ucbvax.Berkeley.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA01158; Fri, 19 Jan 90 06:51:00 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA29153; Fri, 19 Jan 90 05:41:00 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Date: 18 Jan 90 21:53:30 GMT
From: amdahl!ntmtv!hildum@apple.com (Eric Hildum)
Organization: Northern Telecom (Mountain View, CA)
Subject: Installing Icon on Sun3 with SunOS 4.0.3
Message-Id: <678@ntmtv.UUCP>
Sender: icon-group-request@arizona.edu
To: icon-group@arizona.edu
Status: O

I have just installed Icon 7.5 on a Sun 3/60 with SunOS 4.0.3 and have a
couple of issues.

First, I had to change the -m68020 switch to -sun3 to get correct
compilation. This seems to work just fine; is this change going to be made
to the installation available on arizona.edu?

The new default for the cc compiler is to use software floating point,
rather than the switch option. Would it be reasonable to change the
default sun3 installation to include the -fswitch option?

The Header file now requires almost 12000 bytes. Is this reasonable?

Other than these issues, everything went well, and all the tests passed.

From cargo@tardis.cray.com Fri Jan 19 07:39:42 1990
Received: from uc.msc.umn.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA02802; Fri, 19 Jan 90 07:39:42 MST
Received: from hall.cray.com by uc.msc.umn.edu (5.59/1.14) id AA19081; Fri, 19 Jan 90 08:37:55 CST
Received: from zk.cray.com by hall.cray.com id AA04437; 3.2/CRI-3.12; Fri, 19 Jan 90 08:39:39 CST
Received: by zk.cray.com id AA00765; 3.2/CRI-3.12; Fri, 19 Jan 90 08:39:35 CST
Date: Fri, 19 Jan 90 08:39:35 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9001191439.AA00765@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: concat with blank and a question
Status: O

I recently purchased a product called HyperPAD for the PC. It is intended
to be a HyperCard-like product for the PC. What I found interesting was
something in its list of operators. To concatenate two strings there is
the concatenation operator &. However, there is also an operator to
concatenate two strings with a space in between them, the && operator.

I realized that this one little feature was a nice convenience. I know I
have often used something like a || " " || b because I needed to put a
space between two strings I was combining. Maybe from an overall language
viewpoint this isn't a significant improvement, but I thought it was an
interesting addition to a language with string processing.

The question I have is: What is the best way to eliminate spaces (or, to
generalize, members of a particular cset) from a string while still
preserving the order of the remaining characters? I'm going to be
performing some comparisons where certain characters are very likely to
not be significant. I'm looking for an efficient way of doing
preprocessing to remove the insignificant characters.

     o       o
      \_____/
      /-o-o-\     _______       DDDD      SSSS   CCCC
     /   ^   \   /\\\\\\\\      D   D    S      C
     \ \___/ /  /\   ___  \     D   D     SSS   C
      \_   _/  /\   /\\\\  \    D   D        S  C
        \ \__/\   /\ @_/   /    DDDDavid SSSS.  CCCCargo
         \____\____\______/     CARGO@TARDIS.CRAY.COM

From goer@sophist.uchicago.edu Fri Jan 19 09:17:22 1990
Received: from tank.uchicago.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA09664; Fri, 19 Jan 90 09:17:22 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 19 Jan 90 10:17:26 CST
Return-Path:
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA22197; Fri, 19 Jan 90 10:13:19 CST
Date: Fri, 19 Jan 90 10:13:19 CST
From: Richard Goerwitz
Message-Id: <9001191613.AA22197@sophist.uchicago.edu>
To: icon-group@arizona.edu
Subject: strip?
Status: O

A snail asked:

    The question I have is: What is the best way to eliminate spaces
    (or, to generalize, members of a particular cset) from a string
    while still preserving the order of the remaining characters?

I wonder if you are referring to a stripping routine?

    procedure Strip(s,c)
        s2 := ""
        s ? {
            while s2 ||:= tab(upto(c))
            do tab(many(c))
            s2 ||:= tab(0)
        }
        return s2
    end

This will work with strings, and I suppose that type conversion will make
it work with csets, too. For operations specifically having to do with
csets, you can of course say

    c1 --:= c2

where c1 is the cset you are trying to strip down, and c2 is the cset
containing the characters to be removed from it. The trouble here, though,
is that, unlike strings, csets are not an ordered sequence of characters
(you did say something about "original order," didn't you?).

I guess I'm confused. If the original order is important, use Strip(s,c),
and feed it strings. Does this help?

-Richard L. Goerwitz
goer@sophist.uchicago.edu
goer%sophist@uchicago.bitnet
rutgers!oddjob!gide!sophist!goer

From tenaglia@fps.mcw.edu Fri Jan 19 10:20:36 1990
Received: from RUTGERS.EDU by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA13085; Fri, 19 Jan 90 10:20:36 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA25922; Fri, 19 Jan 90 12:18:26 EST
Received: by uwm.edu; id AA26075; Fri, 19 Jan 90 11:17:22 -0600
Message-Id: <9001191717.AA26075@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Fri, 19 Jan 90 11:15:03 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail); Fri, 19 Jan 90 11:08:11 CDT
Date: Fri, 19 Jan 90 11:08:11 CDT
From: Chris Tenaglia - 257-8765
To: icon-group@cs.arizona.edu
Subject: RE: concat with blank and a question
X-Vms-Mail-To: UUCP%"cargo@tardis.cray.com"
Status: O

In reply to the concat,...

If you are running ICON under unix, and are adventurous, you can build in
your own concat with the 'Personal Interpreter' or 'Variant Translator'.
I've built a personal code library of icon software chips which I just
include using the editor when I use them. Having an operator $ for example
may accomplish:

    both := first $ second    (rather than both := first || " " || second)

But it's still not as flexible as a procedure:

procedure cat(s1,s2,s3)
  /s3 := " "           # where an optional string may or may not appear.
  return s1 || s3 || s2
  end

---------------------------------------------------
Concerning Character Elimination in a String

This is best done with the string scanning feature of ICON. I'll present a
little procedure. Others may approach the problem differently.

procedure strip(str,chrs)
  local text,chars
  /chrs := ' '
  chars := &cset -- chrs
  text  := ""
  str ? while tab(upto(chars)) do text ||:= tab(many(chars))
  return text
  end

Yours truly,

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu

From cargo@tardis.cray.com Fri Jan 19 16:03:37 1990
Received: from uc.msc.umn.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA09993; Fri, 19 Jan 90 16:03:37 MST
Received: from hall.cray.com by uc.msc.umn.edu (5.59/1.14) id AA01870; Fri, 19 Jan 90 17:01:37 CST
Received: from zk.cray.com by hall.cray.com id AA16178; 3.2/CRI-3.12; Fri, 19 Jan 90 17:03:22 CST
Received: by zk.cray.com id AA01299; 3.2/CRI-3.12; Fri, 19 Jan 90 17:03:16 CST
Date: Fri, 19 Jan 90 17:03:16 CST
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9001192303.AA01299@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: deleting characters
Status: O

After sending out my mail about deleting or stripping characters out of a
string I received a couple of responses from kindly Iconists. I later was
showing off Icon to a friend when I noticed a "delete" function in
strutil.icn. Upon closer examination I found that delete also did what I
wanted. The next step for me was to write a little benchmarking program so
I could see how the different methods compared speedwise. It turns out
that the IPL delete routine was the slowest, although the fastest was only
15 percent faster.

Here is the test program:

procedure main()
  test_string := " str ? while tab(upto(chars)) do text ||:= tab(many(chars))"
  remove := &cset -- &lcase
  time1 := &time
  limit := 1000
  every 1 to limit do result1 := delete(test_string, remove)
  time2 := &time
  every 1 to limit do result2 := strip1(test_string, remove)
  time3 := &time
  every 1 to limit do result3 := strip2(test_string, remove)
  time4 := &time
  write(time1)
  write(time2-time1, " ", result1)
  write(time3-time2, " ", result2)
  write(time4-time3, " ", result3)
  return
end

# from IPL strutil.icn
# delete characters
#
procedure delete(s,c)
   local i
   while i := upto(c,s) do
      s[i:many(c,s,i)] := ""
   return s
end

#From: Richard Goerwitz
procedure strip1(s,c)
    s2 := ""
    s ? {
        while s2 ||:= tab(upto(c))
        do tab(many(c))
        s2 ||:= tab(0)
    }
    return s2
end

#From: Chris Tenaglia - 257-8765
procedure strip2(str,chrs)
  local text,chars
# /chrs := ' '
# (I commented this out because the others don't do such checks.)
  chars := &cset -- chrs
  text  := ""
  str ? while tab(upto(chars)) do text ||:= tab(many(chars))
  return text
  end

And the output (first column is time, second column is the result string,
which is always supposed to be the same). Initial time is 0, other times
are in milliseconds.

0
21033 strwhiletabuptocharsdotexttabmanychars
20350 strwhiletabuptocharsdotexttabmanychars
17717 strwhiletabuptocharsdotexttabmanychars

All three are reasonable, but the differences in approach are educational.

     o       o
      \_____/
      /-o-o-\     _______       DDDD      SSSS   CCCC
     /   ^   \   /\\\\\\\\      D   D    S      C
     \ \___/ /  /\   ___  \     D   D     SSS   C
      \_   _/  /\   /\\\\  \    D   D        S  C
        \ \__/\   /\ @_/   /    DDDDavid SSSS.  CCCCargo
         \____\____\______/     CARGO@TARDIS.CRAY.COM

From flee@shire.cs.psu.edu Fri Jan 19 17:07:13 1990
Received: from shire.cs.psu.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA15092; Fri, 19 Jan 90 17:07:13 MST
Received: from localhost by shire.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA02947; Fri, 19 Jan 90 19:07:42 -0500
Message-Id: <9001200007.AA02947@shire.cs.psu.edu>
To: cargo@tardis.cray.com (David S. Cargo)
Cc: icon-group@cs.arizona.edu
Subject: Re: deleting characters
In-Reply-To: Your message of Fri, 19 Jan 90 17:03:16 CST. <9001192303.AA01299@zk.cray.com>
Date: Fri, 19 Jan 90 19:07:40 EST
From: Felix Lee
Status: O

> It turns out that the IPL delete routine was the slowest, although
> the fastest was only 15 percent faster.

On pathological cases, the delete routine can be much slower. Try removing
spaces from repl("a ", 1000).

The delete routine is quadratic wrt the length of the source string,
while the strip routines are quadratic wrt the result.

This is due to the terrible amount of copying involved in manipulating
Icon strings: delete has to copy 3997 + 3994 + ... + 1000 characters,
while the other procedures need only copy 1 + 2 + ... + 1000.

You can get linear performance if you do it in C.
--
Felix Lee       flee@shire.cs.psu.edu   *!psuvax1!flee

From gmt Fri Jan 19 17:48:05 1990
Date: Fri, 19 Jan 90 17:48:05 MST
From: "Gregg Townsend"
Message-Id: <9001200048.AA16501@megaron.arizona.edu>
Received: by megaron.arizona.edu (5.59-1.7/15) id AA16501; Fri, 19 Jan 90 17:48:05 MST
In-Reply-To: <9001200007.AA02947@shire.cs.psu.edu>
To: icon-group
Subject: Re: deleting characters
Cc: cargo@tardis.cray.com, flee@shire.cs.psu.edu
Status: O

Felix Lee (flee@shire.cs.psu.edu) writes:

    On pathological cases, the delete routine can be much slower.

True.

    ...You can get linear performance if you do it in C.

You get it in Icon, too. Both Goerwitz's and Tenaglia's procedures are
linear. Building strings by successive concatenation is sufficiently
common that it was worth optimizing the implementation. If no other
concurrent activity disrupts things, only the new characters (those added
at the end) are copied.

    Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
    +1 602 621 4325     gmt@cs.arizona.edu     110 57 16 W / 32 13 45 N / +758m

From wgg@cs.washington.edu Fri Jan 19 18:13:02 1990
Received: from june.cs.washington.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA17870; Fri, 19 Jan 90 18:13:02 MST
Received: by june.cs.washington.edu (5.61/7.0jh) id AA16921; Fri, 19 Jan 90 17:11:14 -0800
Date: Fri, 19 Jan 90 17:11:14 -0800
From: wgg@cs.washington.edu (William Griswold)
Return-Path:
Message-Id: <9001200111.AA16921@june.cs.washington.edu>
To: cargo@tardis.cray.com, flee@shire.cs.psu.edu
Subject: Re: deleting characters
Cc: icon-group@cs.arizona.edu
Status: O

> The delete routine is quadratic wrt the length of the source string,
> while the strip routines are quadratic wrt the result.

I may be wrong, but I believe you will find that in modern implementations
of Icon the strip routines are linear in time. Icon is smart enough to
know that a string is located at the end of the string memory region (in
this case the value of the variable holding the accumulating result
string), and can just add to the end of it to concatenate. Any other
*modification* of a string requires copying--substring creation does not
require copying, since it is implemented as a pointer and an index.

> You can get linear performance if you do it in C.

Many common operations in Icon require *more* time to perform in C--using
available abstractions--such as computing the length of a string. Also
note that string concatenation in C in the standard way (using strcat)
takes linear time. It also requires knowing the destination string is
long enough to hold the longer result. Thus making strip as ``fast'' as
Icon's requires a little effort.

                                        Bill Griswold

From flee@shire.cs.psu.edu Fri Jan 19 19:50:23 1990
Received: from shire.cs.psu.edu by megaron.arizona.edu (5.59-1.7/15) via SMTP id AA21917; Fri, 19 Jan 90 19:50:23 MST
Received: from localhost by shire.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA04238; Fri, 19 Jan 90 21:51:28 -0500
Message-Id: <9001200251.AA04238@shire.cs.psu.edu>
To: "Gregg Townsend"
Cc: icon-group@cs.arizona.edu
Subject: Re: deleting characters
In-Reply-To: Your message of Fri, 19 Jan 90 17:48:05 MST. <9001200048.AA16501@megaron.arizona.edu>
Date: Fri, 19 Jan 90 21:51:27 EST
From: Felix Lee
Status: O

> If no other concurrent activity disrupts things, only the new characters
> (those added at the end) are copied.

Ah, I forgot about that optimization.
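
The pathological case raised in this thread, repl("a ", 1000), could be
timed with a small sketch along the following lines. The names deletec
and stripc are placeholders chosen so as not to clash with anything
built in; their bodies follow the IPL delete routine and the Strip
routine quoted earlier. The repetition count of 100 is arbitrary, and no
timings are claimed here, since they depend on the Icon version and the
machine.

```icon
# Placeholder name; body follows the IPL delete routine quoted above.
procedure deletec(s, c)
   local i
   while i := upto(c, s) do
      s[i:many(c, s, i)] := ""       # each assignment rebuilds the string
   return s
end

# Placeholder name; body follows the Strip routine quoted above.
procedure stripc(s, c)
   local s2
   s2 := ""
   s ? {
      while s2 ||:= tab(upto(c)) do  # append the text before each run of c
         tab(many(c))                # skip the run itself
      s2 ||:= tab(0)                 # append whatever is left
   }
   return s2
end

procedure main()
   local s, t0, t1, t2
   s := repl("a ", 1000)
   t0 := &time
   every 1 to 100 do deletec(s, ' ')
   t1 := &time
   every 1 to 100 do stripc(s, ' ')
   t2 := &time
   write("deletec: ", t1 - t0, " ms")
   write("stripc:  ", t2 - t1, " ms")
   if deletec(s, ' ') == stripc(s, ' ') then write("results agree")
end
```

Changing the argument of repl() (say 1000, 2000, 4000) and watching how
each time grows is one way to see whether a routine behaves linearly or
quadratically on a given implementation.
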
--
Felix

From icon-group-request@arizona.edu Sun Jan 21 20:52:21 1990
Received: from ucbvax.Berkeley.EDU by megaron (5.59-1.7/15) via SMTP id AA12560; Sun, 21 Jan 90 20:52:21 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA06366; Sun, 21 Jan 90 19:47:56 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Date: 22 Jan 90 03:24:46 GMT
From: aramis.rutgers.edu!paul.rutgers.edu!jac@rutgers.edu (Jonathan A. Chandross)
Organization: Rutgers Univ., New Brunswick, N.J.
Subject: Re: deleting characters
Message-Id:
References: <9001200111.AA16921@june.cs.washington.edu>
Sender: icon-group-request@arizona.edu
To: icon-group@arizona.edu
Status: O

wgg@CS.WASHINGTON.EDU (William Griswold)
> Many common operations in Icon require *more* time to perform in C--using
> available abstractions--such as computing the length of a string. Also
> note that string concatenation in C in the standard way (using strcat)
> takes linear time. It also requires knowing the destination string is
> long enough to hold the longer result. Thus making strip as ``fast'' as
> Icon's requires a little effort.

I don't know if your statement is totally fair. There is nothing to
prevent one from using BCPL style strings (i.e. also store a length
with the string) in a C program.

In fact, this is done. The MESA language (XEROX) generates C code
which is then compiled normally. Strings in MESA are stored with
a length, and are word aligned. This allows strcpy, strcmp, et al
to work on word quantities, producing much faster string routines.
I see no reason (aside from inertia) for why this has not been done
to C. (Well, one would have to write routines to convert from the
library function's notion of a character string to the new one with
a length.)

A while back I needed to derive the name from a file pointer. Since
stdio does not support this I had to write a piece of code like:

        struct N_FILE {
                char    *name;
                FILE    *file;
        };

and the associated front-end routines for stdio. This was not hard
to do, and did not take all that much time.

Of course, one could say that the pattern matching, associative
table features, etc. that make Icon so popular could also be added
to C using the argument I give above. I won't (and can't) defend
such a statement. My point is that condemning C for a shortcoming
in the library routines is not really fair. Especially when that
problem could be fixed in a few days hacking.

Jonathan A. Chandross
Internet: jac@paul.rutgers.edu
UUCP: rutgers!paul.rutgers.edu!jac

From goer@sophist.uchicago.edu Sun Jan 21 22:18:57 1990
Received: from tank.uchicago.edu by megaron (5.59-1.7/15) via SMTP id AA16449; Sun, 21 Jan 90 22:18:57 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Sun, 21 Jan 90 23:19:04 CST
Return-Path:
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA24968; Sun, 21 Jan 90 23:14:57 CST
Date: Sun, 21 Jan 90 23:14:57 CST
From: Richard Goerwitz
Message-Id: <9001220514.AA24968@sophist.uchicago.edu>
To: icon-group@arizona.edu
Subject: condemning
Status: O

Recent point:

> My point is that condemning C for a shortcoming in the library
> routines is not really fair. Especially when the problem could
> be fixed in a few days hacking.

I never got the impression that anyone was condemning C. Aren't you
overreacting a bit? Whether or not the poster was correct in this
instance, it does seem that making C behave like Icon often does
result in very poor performance. Granted, you can often go down some
by-way, and come up with a new Iconish library routine that will
outperform Icon itself. But this is the very sort of inconvenience
that Icon was intended to help us avoid. It's a trade-off. Name your
poison.

-Richard

From wgg@cs.washington.edu Mon Jan 22 01:57:26 1990
Received: from june.cs.washington.edu by megaron (5.59-1.7/15) via SMTP id AA00782; Mon, 22 Jan 90 01:57:26 MST
Received: by june.cs.washington.edu (5.61/7.0jh) id AA15833; Sun, 21 Jan 90 23:59:00 -0800
Date: Sun, 21 Jan 90 23:59:00 -0800
From: wgg@cs.washington.edu (William Griswold)
Return-Path:
Message-Id: <9001220759.AA15833@june.cs.washington.edu>
To: @rutgers.edu:paul.rutgers.edu!jac@aramis.rutgers.edu
Subject: Re: deleting characters
Cc: icon-group@cs.arizona.edu
Status: O

>Date: 22 Jan 90 03:24:46 GMT
>From: aramis.rutgers.edu!paul.rutgers.edu!jac@rutgers.edu (Jonathan A. Chandross)
>Organization: Rutgers Univ., New Brunswick, N.J.
>Subject: Re: deleting characters
>To: icon-group@arizona.edu
>
>
>wgg@CS.WASHINGTON.EDU (William Griswold)
>> Many common operations in Icon require *more* time to perform in C--using
>> available abstractions--such as computing the length of a string. Also
>> note that string concatenation in C in the standard way (using strcat)
>> takes linear time. It also requires knowing the destination string is
>> long enough to hold the longer result. Thus making strip as ``fast'' as
>> Icon's requires a little effort.
>
>I don't know if your statement is totally fair. There is nothing to
>prevent one from using BCPL style strings (i.e. also store a length
>with the string) in a C program.
>
>In fact, this is done. The MESA language (XEROX) generates C code
>which is then compiled normally. Strings in MESA are stored with
>a length, and are word aligned. This allows strcpy, strcmp, et al
>to work on word quantities, producing much faster string routines.
>I see no reason (aside from inertia) for why this has not been done
>to C. (Well, one would have to write routines to convert from the
>library function's notion of a character string to the new one with
>a length.)
>
  ...
>
>Of course, one could say that the pattern matching, associative
>table features, etc. that make Icon so popular could also be added
>to C using the argument I give above. I won't (and can't) defend
>such a statement. My point is that condemning C for a shortcoming
>in the library routines is not really fair. Especially when that
>problem could be fixed in a few days hacking.
>
>
>Jonathan A. Chandross
>Internet: jac@paul.rutgers.edu
>UUCP: rutgers!paul.rutgers.edu!jac
>

Looks like I've got my foot stuck in the Turing Tar Pit. I'm aware that I
can do (almost) anything I want in any programming language at (close to)
the performance the theoreticians tell me. As indicated at the end of
your message, it is not possibility, but reality that counts. The reality
is that C translates a string literal into a fixed-sized null-terminated
character array. A programmer who wants to reimplement strings has to
work pretty hard (the folks at Xerox and Arizona are good examples),
particularly if that includes dynamically sized string management. Only
with the encapsulation provided by C++ can we get close to what you
claim, which, with care, can even handle C string literals correctly.

I will readily confess that there are many problems that I would rather
code in C than Icon--each is suited to a special set of tasks.

                                        Bill Griswold

From tenaglia@fps.mcw.edu Tue Jan 23 15:54:03 1990
Received: from RUTGERS.EDU by megaron (5.59-1.7/15) via SMTP id AA19233; Tue, 23 Jan 90 15:54:03 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA28358; Tue, 23 Jan 90 17:52:43 EST
Received: by uwm.edu; id AA03553; Tue, 23 Jan 90 16:28:55 -0600
Message-Id: <9001232228.AA03553@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Tue, 23 Jan 90 16:12:14 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail); Tue, 23 Jan 90 13:56:10 CDT
Date: Tue, 23 Jan 90 13:56:10 CDT
From: Chris Tenaglia - 257-8765
To: icon-group@cs.arizona.edu
Subject: Correction to packed decimal converter
Status: O

Several weeks ago I posted procedures for converting integers to and from
packed format. After perfecting an application using them, a flaw became
apparent in the procedure unpack(). Below is the corrected version.

##################################################################
#                                                                #
# THIS PROCEDURE UNPACKS A VALUE INTO AN INTEGER.                #
#                                                                #
##################################################################
procedure unpack(val,width)             # REQUIRES LINK RADCON !
  local tmp,number,tens,ones,sign
  tmp  := ""
  sign := 1
  every number := ord(!val) do
    tmp ||:= right(map(radcon(number,10,16),&lcase,&ucase),2,"0") #this line changed
  if tmp[-1] == ("B" | "D") then sign := -1
  tmp[-1] := ""
  tmp    *:= sign
  /width  := *tmp
  return right(tmp,width)
  end

Yours truly,

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226 (414)257-8765 tenaglia@mis.mcw.edu

From cargo@tardis.cray.com Wed Jan 24 14:31:32 1990 Received: from uc.msc.umn.edu by megaron (5.59-1.7/15) via SMTP id AA18152; Wed, 24 Jan 90 14:31:32 MST Received: from hall.cray.com by uc.msc.umn.edu (5.59/1.14) id AA23656; Wed, 24 Jan 90 15:29:38 CST Received: from zk.cray.com by hall.cray.com id AA12949; 3.2/CRI-3.12; Wed, 24 Jan 90 15:31:27 CST Received: by zk.cray.com id AA05119; 3.2/CRI-3.12; Wed, 24 Jan 90 15:31:21 CST Date: Wed, 24 Jan 90 15:31:21 CST From: cargo@tardis.cray.com (David S. Cargo) Message-Id: <9001242131.AA05119@zk.cray.com> To: icon-group@cs.arizona.edu Subject: questions about records Status: O

I was looking at an application for rsg.icn from the IPL when I noticed this at the beginning of the program:

    record nonterm(name)
    record charset(chars)
    record query(name)

I observed that two records had the same field name ("name"). This prompted a couple of questions that I couldn't find answers to in any of the Icon programming language documentation I looked at (including the book).

What are the restrictions on reusing field names from record declarations? For example, you can clearly use the same field name in two different record declarations. You can also use the same field name in two different ordinal locations in two different record declarations. The field names can also be the same as names of local variables (surprise!). That was not what I would have expected.

What seems most confusing is from page 222 of the Icon book:

    record-declaration:
        record identifier ( field-list )

where the field-list is subscripted with "opt" (meaning optional). The syntax says you can have a declaration like

    record weird()

and I tried that in a test program and it translated without complaint. But what can you use it for?
A very puzzled snail,

                                         o o
                                       \_____/
                                       /-o-o-\    _______
   DDDD      SSSS      CCCC           /   ^   \  /\\\\\\\\
   D   D    S         C               \ \___/ / /\   ___  \
   D   D     SSS      C                \_   _/ /\   /\\\\  \
   D   D        S     C                  \  \__/\   /\ @_/ /
   DDDDavid SSSS.     CCCCargo            \____\____\______/
                                          cargo@tardis.cray.com

From ralph Wed Jan 24 15:13:41 1990 Date: Wed, 24 Jan 90 15:13:41 MST From: "Ralph Griswold" Message-Id: <9001242213.AA21470@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA21470; Wed, 24 Jan 90 15:13:41 MST To: cargo@tardis.cray.com, icon-group@cs.arizona.edu Subject: Re: questions about records In-Reply-To: <9001242131.AA05119@zk.cray.com> Status: O

Yes, you can have the same field name in different records, and the positions need not be the same, as in

    record foo(a,b,c)
    record baz(c,b,a)

Icon will handle this properly.

Also, as you've observed, the "name spaces" for identifiers and field names are disjoint, so you can, for example, have a local identifier named b and do something like

    b := foo()
       .
       .
       .
    b.b := 1

(Not recommended, of course.)

And a record need not have any fields, as in

    record nil()

Useful if you need objects of an identifiable type, but the objects have no attributes.
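
A minimal sketch of the three points above (shared field names in different positions, disjoint name spaces for identifiers and fields, and a record with no fields). It reuses the foo, baz, and nil declarations from the message; the main procedure, the variable names z and marker, and the values are assumptions added for illustration.

```icon
# Two record types that reuse field names in different positions,
# plus a record type with no fields at all.
record foo(a, b, c)
record baz(c, b, a)
record nil()

procedure main()
   local b, z, marker

   b := foo(1, 2, 3)       # a local identifier named b ...
   b.b := 20               # ... does not clash with the field named b
   z := baz(1, 2, 3)       # same field names, declared in reverse order

   write(b.a, " ", b.b, " ", b.c)    # foo's fields in declaration order
   write(z.a, " ", z.b, " ", z.c)    # z.a is baz's third field, z.c its first

   marker := nil()
   write(type(marker))     # a field-less record still has its own type name
end
```

Field references are resolved per record type, so b.a and z.a pick different positions even though the field name is the same.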
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From @um.cc.umich.EDU:Paul_Abrahams@Wayne-MTS Wed Jan 24 17:17:07 1990 Resent-From: @um.cc.umich.EDU:Paul_Abrahams@Wayne-MTS Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP id AA01355; Wed, 24 Jan 90 17:17:07 MST Received: from megaron (megaron.cs.arizona.edu) by Arizona.EDU; Wed, 24 Jan 90 02:22 MST Received: from sharkey.cc.umich.edu by megaron (5.59-1.7/15) via SMTP id AA04923; Wed, 24 Jan 90 02:25:00 MST Received: from ummts.cc.umich.edu by sharkey.cc.umich.edu (5.61/1123-1.0) id AA14554; Wed, 24 Jan 90 04:21:59 -0500 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Wed, 24 Jan 90 04:23:14 EST Resent-Date: Wed, 24 Jan 90 02:23 MST Date: Tue, 23 Jan 90 23:55:38 EST From: Paul_Abrahams%Wayne-MTS@um.cc.umich.EDU Subject: Strings in C Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <195450@Wayne-MTS> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@arizona.edu Status: O This forum isn't about C, but anyway--- The problem with C is not just that it's a high-level machine language-- it's a high-level machine language for the PDP11. But even given that, the decision to null-terminate strings was a dreadful mistake (see my article on the subject in SIGPLAN Notices, Oct 88 I think). 
Paul Abrahams

From icon-group-request@arizona.edu Thu Jan 25 11:08:40 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP id AA27922; Thu, 25 Jan 90 11:08:40 MST Received: from megaron (megaron.cs.arizona.edu) by Arizona.EDU; Thu, 25 Jan 90 11:03 MST Received: from ucbvax.Berkeley.EDU by megaron (5.59-1.7/15) via SMTP id AA27812; Thu, 25 Jan 90 11:06:29 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA05008; Thu, 25 Jan 90 09:55:56 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Thu, 25 Jan 90 11:04 MST Date: 25 Jan 90 06:01:11 GMT From: tellab5!wheaton!johnh@uunet.uu.NET Subject: installing icon 5.7 on ultrix 3.0 Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Organization: Wheaton College, Wheaton Il Resent-Message-Id: Message-Id: <1786@wheaton.UUCP> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@arizona.edu Status: O

This may have been discussed before but: I have (I think) installed icon 5.7 on ultrix 3.0. I could not do it without using the BSD4.2 sources. There were three problems.

1) Apparently in BSD4.2 a C program allocates a fixed number of slots for files inside of the user's area (below &end). The Icon compiler/translator uses that fact along with setbuf calls to have complete control of memory allocation above &end. Apparently in ultrix 3.0 space for files is allocated at runtime above the &end symbol. Since the Icon compiler/interpreter assumes that the limit of memory starts at &end, the memory initialization routines walk over the file slots. The result is that the distributed (unsupported) binaries will not run on ultrix 3.0. (In our case, compiling and interpreting hit EOF immediately.)
Changes to the sources from BSD4.2 were minimal (3 files - fgrep for brk and &end, and replace that code using sbrk).

2) The manual page for execve includes a description of "interpreter" files, where the first line of a file starts with "#! interpreter". I could not get this to work on ultrix 3.0. I remember a problem with pascal pi object files not working with some ultrix releases. The pi object files on ultrix 3.0 do not use this facility. There is a reasonable discussion of the issue in doc/install and the solution is to not use the -directex flag when running icon-setup.

3) There seemed to be a small problem in the Makefile for icont. It referred to an object mon.o. There is no mon.c but there was a mon.o file in one of the libraries included in the link step. I removed the mon.o entry in the Makefile.

johnh...
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
UUCP: (obdient spl1)!wheaton!johnh telephone: (312) 260-3871 (office) Mail: John Hayward Math/Computer Science Dept. Wheaton College Wheaton Il 60187 Act justly, love mercy and walk humbly with your God. Micah 6:8b

From cargo@tardis.cray.com Mon Jan 29 14:39:51 1990 Received: from timbuk.cray.com by megaron (5.59-1.7/15) via SMTP id AA26026; Mon, 29 Jan 90 14:39:51 MST Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA05890; Mon, 29 Jan 90 15:39:40 CST Received: from zk.cray.com by hall.cray.com id AA01356; 3.2/CRI-3.12; Mon, 29 Jan 90 15:39:37 CST Received: by zk.cray.com id AA00300; 3.2/CRI-3.12; Mon, 29 Jan 90 15:39:34 CST Date: Mon, 29 Jan 90 15:39:34 CST From: cargo@tardis.cray.com (David S. Cargo) Message-Id: <9001292139.AA00300@zk.cray.com> To: icon-group@cs.arizona.edu Subject: comparing csets Status: O

I got my hardcopy Icon news in the mail and saw my name mentioned. I thought I'd furnish an update on the way I eventually solved my cset comparison problem.
I eventually started using the following procedure:

    procedure overlap(c1, c2)
        return '' ~=== c1 ** c2
    end

The main feature that I exploit with this procedure is that when I have tracing turned on, I can see the result of the comparison. Approaches like

    0 ~= *(c1 ** c2)

are probably more efficient (though I haven't checked), but not as informative when I'm testing.

                                         o o
                                       \_____/
                                       /-o-o-\    _______
   DDDD      SSSS      CCCC           /   ^   \  /\\\\\\\\
   D   D    S         C               \ \___/ / /\   ___  \
   D   D     SSS      C                \_   _/ /\   /\\\\  \
   D   D        S     C                  \  \__/\   /\ @_/ /
   DDDDavid SSSS.     CCCCargo            \____\____\______/
                                          cargo@tardis.cray.com

From cargo@tardis.cray.com Thu Feb 1 13:52:05 1990 Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA17936; Thu, 1 Feb 90 13:52:05 MST Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA12869; Thu, 1 Feb 90 14:52:00 CST Received: from zk.cray.com by hall.cray.com id AA10468; 3.2/CRI-3.12; Thu, 1 Feb 90 14:51:57 CST Received: by zk.cray.com id AA03270; 3.2/CRI-3.12; Thu, 1 Feb 90 14:52:14 CST Date: Thu, 1 Feb 90 14:52:14 CST From: cargo@tardis.cray.com (David S. Cargo) Message-Id: <9002012052.AA03270@zk.cray.com> To: icon-group@cs.arizona.edu Subject: table initialization Status: O

I was looking at implementing some Icon code to initialize font width tables. Naturally I thought about using tables to do this. I then realized that while most other structures can be initialized with constants, there doesn't seem to be a convenient way to do this with tables. Or is there something I'm missing.
What I will probably wind up doing is either using lists indexed by character values (using ord(s)) or using two constant lists to initialize a table (char_val and char_width are the two lists):

    every i := 1 to n do width[char_val[i]] := char_width[i]

Eventually I'll probably want to know which is faster to use, a character width table or an array indexed using ord(s):

    string_width := 0                               # for either case
    every string_width +:= char_width[ord(!s)]      # for a list
    every string_width +:= char_width[!s]           # for a table

It would seem to be a trade between ease of access and creation of temporaries.

Has anybody tried anything like this already and found an answer to which is better speed-wise?

From wgg@cs.washington.edu Thu Feb 1 15:27:59 1990 Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA27918; Thu, 1 Feb 90 15:27:59 MST Received: by june.cs.washington.edu (5.61/7.0jh) id AA27239; Thu, 1 Feb 90 14:26:50 -0800 Date: Thu, 1 Feb 90 14:26:50 -0800 From: wgg@cs.washington.edu (William Griswold) Return-Path: Message-Id: <9002012226.AA27239@june.cs.washington.edu> To: cargo@tardis.cray.com, icon-group@cs.arizona.edu Subject: Re: table initialization Status: O

>From: cargo@tardis.cray.com (David S. Cargo)
>To: icon-group@cs.arizona.edu
>Subject: table initialization
>
>I was looking at implementing some Icon code to initialize font width
>tables. Naturally I thought about using tables to do this. I then
>realized that while most other structures can be initialized with
>constants, there doesn't seem to be a convenient way to do this with
>tables. Or is there something I'm missing.
>
>What I will probably wind up doing is either using lists indexed by
>character values (using ord(s)) or using two constant lists to
>initialize a table, char_val and char_width are the two lists:
>
>every i := 1 to n do width[char_val[i]] := char_width[i]
>
>Eventually I'll probably want to know which is faster to use, a
>character width table or an array indexed using ord(s))
>
>string_width := 0 # for either case
>every string_width +:= char_width[ord(!s)] # for a list
>every string_width +:= char_width[!s] # for a table
>
>It would seem to be a trade between ease of access and creation
>of temporaries.
>
>Has anybody tried anything like this already and found an answer
>to which is better speed-wise?
>

You might want to think about storing your character set and font widths in an external file, so that if the values (i.e., font) change, you won't have to change your program. Then the code for tables vs. lists is not so different:

Your input:

    a 5
    b 5
    ...
    m 8
    ...
    z 6

the processing code:

    # e.g., for tables
    width := table()
    while line := read(font_file) do
        line ? width[move(1)] := integer((tab(many(' \t')) & tab(0)))

As for performance, there are several things you can try. It is likely that the ord(s) approach will be faster, since the hashing and chaining used to implement tables will be avoided. If you want to hide which type you are using, use a procedure. Procedure call is pretty cheap, so you don't have to worry much about the cost:

    # for tables
    procedure width(char)
        static width_table
        initial width_table := table()
        return width_table[char]
    end

    # for lists
    procedure width(char)
        static width_list
        initial width_list := list(256)
        return width_list[ord(char) + 1]    # ord() starts at 0, list positions at 1
    end

Note that since I didn't dereference the returned variable, the result of either procedure can be assigned to by the input code:

    while line := read(font_file) do
        line ? width(move(1)) := integer((tab(many(' \t')) & tab(0)))

Thus even the font width reader doesn't need to know the representation you are using.
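
A runnable sketch of the accessor idea above, assuming the table representation. The three in-line sample lines stand in for the font file, and the default width of 0 for unknown characters is an assumption, not something the message specifies.

```icon
# Hypothetical accessor: callers cannot tell whether the widths
# are kept in a table or in a list.
procedure width(char)
   static width_table
   initial width_table := table(0)     # assumed default width of 0
   return width_table[char]            # returns a variable, so it is assignable
end

procedure main()
   local line, total

   # Assumed input format: one character, white space, then its width.
   every line := "a 5" | "b 5" | "m 8" do
      line ? width(move(1)) := integer((tab(many(' \t')) & tab(0)))

   # Sum the widths of the characters in a sample string.
   total := 0
   every total +:= width(!"bam")
   write(total)
end
```

Because scanning (?) binds more loosely than assignment, the whole assignment is evaluated with line as the subject, so move(1) picks off the character and the right-hand side picks off its width.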
One thing that just occurred to me is that with the table representation (or some hybrid encapsulated in a procedure) you could look up word widths as well as character widths, probably at little extra effort (say, if you stored the widths of the 2000 most commonly used words) and with some performance benefit. You could even use a history scheme, in which you remember the widths of words already computed in the current document. It seems like overkill, but it gives you some idea of the flexibility of Icon.

Bill Griswold

From cargo@tardis.cray.com Thu Feb 1 15:47:09 1990 Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA29364; Thu, 1 Feb 90 15:47:09 MST Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA14604; Thu, 1 Feb 90 16:47:02 CST Received: from zk.cray.com by hall.cray.com id AA12124; 3.2/CRI-3.12; Thu, 1 Feb 90 16:46:59 CST Received: by zk.cray.com id AA03421; 3.2/CRI-3.12; Thu, 1 Feb 90 16:47:18 CST Date: Thu, 1 Feb 90 16:47:18 CST From: cargo@tardis.cray.com (David S. Cargo) Message-Id: <9002012247.AA03421@zk.cray.com> To: icon-group@cs.arizona.edu Subject: Re: table initialization Status: O

"You might want to think about storing your character set and font widths in an external file, so that if the values (i.e., font) change, you won't have to change your program."

As a matter of fact, I will start by reading the Adobe Font Metrics file to get the initial values. I could either use them directly, or have the program write initialization code to be used by another program. It's a matter of early or late binding in effect. If you think that ord(s) is fast relative to hashing, I'll probably go that way.
david snail From @mirsa.inria.fr:ol@cerisi.cerisi.Fr Mon Feb 5 13:28:51 1990 Received: from mirsa.inria.fr by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02205; Mon, 5 Feb 90 13:28:51 MST Received: from cerisi.cerisi.fr by mirsa.inria.fr with SMTP (5.59++/IDA-1.2.8) id AA23439; Mon, 5 Feb 90 21:26:06 +0100 Message-Id: <9002052026.AA23439@mirsa.inria.fr> Date: Mon, 5 Feb 90 21:26:48 -0100 Posted-Date: Mon, 5 Feb 90 21:26:48 -0100 From: Lecarme Olivier To: icon-group@cs.arizona.edu Subject: Icon on RISC machines Status: O Our laboratory and computing center are just buying brand new RISC machines made by DEC (or more precisely, sold by DEC). On these machines, Unix or something like this is supposed to work, but most programming languages have been forgotten: after all, C is enough for everything, or is it not enough after all? Thus, I'm missing Pascal, Modula-2... and Icon! Being naturally optimistic, I tried to pretend to the Icon installation that this machine is in fact a Vax with Ultrix. Something went wrong during "make Icon", in program rlocal.c of src/iconx. I could try to figure what happened, but it's somewhat late for working, and I preferred to ask the Icon community whether anybody has already made the job. Maybe I made a mistake when copying the whole Icon distribution, or maybe something more serious is happening. Can anybody help me? Olivier Lecarme From cargo@tardis.cray.com Wed Feb 7 08:58:08 1990 Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA11797; Wed, 7 Feb 90 08:58:08 MST Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA09246; Wed, 7 Feb 90 09:57:51 CST Received: from zk.cray.com by hall.cray.com id AA29474; 3.2/CRI-3.12; Wed, 7 Feb 90 09:57:48 CST Received: by zk.cray.com id AA01397; 3.2/CRI-3.12; Wed, 7 Feb 90 09:57:47 CST Date: Wed, 7 Feb 90 09:57:47 CST From: cargo@tardis.cray.com (David S. 
Cargo) Message-Id: <9002071557.AA01397@zk.cray.com> To: icon-group@cs.arizona.edu Subject: generation of procedures Status: O

I was looking at the code for RSG (part of the Icon Program Library), noticing how one part of the code takes advantage of the order of a list of procedure variables to successively try evaluating a line of input until one of the procedures succeeds. My first reaction was that this reminded me of searching an object hierarchy looking for a handler for a particular type of message. If a particular object can't handle the message, it fails and lets the object next higher in the hierarchy have a crack at it. It sort of reminded me of message passing, but with a distinctively Icon flavor to it.

My real question is that I can't figure out how to decide what the semantics of the operation really are. Normally I would expect to say the !plist comes as element generation, which should succeed when the first element is generated. But then it is followed by a parameter list and surrounded by parentheses. This seems to combine to make it an alternation expression equivalent to:

    (define | generate | grammar | source | comment | prompter | error)(line)


    plist := [define,generate,grammar,source,comment,prompter,error]
       :
       :
    while in := pop(ifile) do {          # process all files
       repeat {
          writes(\prompt)
          line := read(in) | break
          while line[-1] == "\\" do line := line[1:-1] || read(in) | break
          (!plist)(line)
# the line above is the interesting one!
          }
       close(in)
       }

I can't seem to find anything in the Icon book that spells out what is really happening here. It looked at first like !plist wasn't in a context that required generation of all the list elements, but clearly that is not the case.

The confused snail,

                                         o o
                                       \_____/
                                       /-o-o-\    _______
   DDDD      SSSS      CCCC           /   ^   \  /\\\\\\\\
   D   D    S         C               \ \___/ / /\   ___  \
   D   D     SSS      C                \_   _/ /\   /\\\\  \
   D   D        S     C                  \  \__/\   /\ @_/ /
   DDDDavid SSSS.     CCCCargo            \____\____\______/
                                          cargo@tardis.cray.com

From ralph Wed Feb 7 09:11:17 1990 Date: Wed, 7 Feb 90 09:11:17 MST From: "Ralph Griswold" Message-Id: <9002071611.AA13262@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA13262; Wed, 7 Feb 90 09:11:17 MST To: cargo@tardis.cray.com, icon-group@cs.arizona.edu Subject: Re: generation of procedures In-Reply-To: <9002071557.AA01397@zk.cray.com> Status: O

The procedures are generated by !plist as you surmised. The first one is then applied to the argument list, resulting in a procedure call. If that call fails, !plist is resumed to produce another procedure.

Think of a procedure call as

    e0(e1, e2, ..., en)

The order of evaluation is e0, e1, e2, ..., en. If all succeed, the value of e0 is applied to the values of e1, e2, ..., en. If the resulting procedure call fails, en, ..., e2, e1, e0 are resumed in that order (assuming they suspended). If any produces a new result, evaluation starts to the right again.

In the case you cite, only e0 is a generator, so failure of the procedure call causes e0 to produce another procedure, which is then applied to the arguments. If any of the procedure calls succeeds, the process stops, since there is nothing to drive further generation. The effect is to apply the procedures in plist until one succeeds.
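
The evaluation order described above can be sketched with three hypothetical handlers. The procedures comment, number, and error here are illustrative stand-ins, not the RSG procedures; each one fails on input it does not recognize, so !plist is resumed until a call succeeds.

```icon
# Each handler succeeds only on input it recognizes.
procedure comment(line)
   if line[1] == "#" then return write("comment: ", line)
   # falling off the end of a procedure body makes the call fail
end

procedure number(line)
   # integer() fails on non-numeric strings, so the whole call fails
   return write("number: ", integer(line))
end

procedure error(line)
   return write("unrecognized: ", line)   # catch-all, always succeeds
end

procedure main()
   local plist, line

   plist := [comment, number, error]
   every line := "# note" | "42" | "hello" do
      (!plist)(line)      # try each procedure in list order until one succeeds
end
```

Putting the catch-all last in the list is what makes it a fallback: it is reached only after every earlier procedure has failed on that line.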
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph

From wgg@cs.washington.edu Wed Feb 7 09:50:43 1990 Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA17139; Wed, 7 Feb 90 09:50:43 MST Received: by june.cs.washington.edu (5.61/7.0jh) id AA00868; Wed, 7 Feb 90 08:49:37 -0800 Date: Wed, 7 Feb 90 08:49:37 -0800 From: wgg@cs.washington.edu (William Griswold) Return-Path: Message-Id: <9002071649.AA00868@june.cs.washington.edu> To: cargo@tardis.cray.com, icon-group@cs.arizona.edu Subject: Re: generation of procedures Status: O

>Date: Wed, 7 Feb 90 09:57:47 CST
>From: cargo@tardis.cray.com (David S. Cargo)
>To: icon-group@cs.arizona.edu
>Subject: generation of procedures
>Errors-To: icon-group-errors@cs.arizona.edu
>
>I was looking at the code for RSG (part of the Icon Program Library),
>noticing how one part of the code takes advantage of the order of a
>list of procedure variables to successively try evaluating a line of
>input until one of the procedures succeeds. My first reaction was
>that this reminded me of searching an object hierarchy looking for a
>handler for a particular type of message. If a particular object
>can't handle the message, it fails and lets the object next higher
>in the hierarchy have a crack at it. It sort of reminded me of
>message passing, but with a distinctively Icon flavor to it.
>

I like your analogy to searching for a handler in an object hierarchy-- it looks a lot like delegation. I'll think about this one some more. Perhaps one could use PDCOs or a Variant Translator to make such a scheme syntactically palatable.

>My real question is that I can't figure out how to decide what the
>semantics of the operation really are. Normally I would expect to
>say the !plist comes as element generation, which should succeed
>when the first element is generated.
>But then it is followed by
>a parameter list and surrounded by parentheses. This seems to
>combine to make it an alternation expression equivalent to:
>
>    (define | generate | grammar | source | comment | prompter | error)(line)
>
>
>    plist := [define,generate,grammar,source,comment,prompter,error]
>       :
>       :
>    while in := pop(ifile) do {          # process all files
>       repeat {
>          writes(\prompt)
>          line := read(in) | break
>          while line[-1] == "\\" do line := line[1:-1] || read(in) | break
>          (!plist)(line)
># the line above is the interesting one!
>          }
>       close(in)
>       }
>
>
>I can't seem to find anything in the Icon book that spells out what is really
>happening here. It looked at first like !plist wasn't in a context that required
>generation of all the list elements, but clearly that is not the case.
>

Element generation in normal expression context evaluates the expression in an attempt to produce a single result. So this code will try list elements until one invocation succeeds. Thus error(line) gets executed only if none of the other alternatives succeeds. It is trying to parse the line, using each of the parsing procedures as a possible syntactic alternative. As is usual with parsing, you want only one result, and in this case we take the first one that comes, assuming that it is preferred or unique.
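
A small sketch of the difference between an ordinary (bounded) expression, which stops at its first result, and the same expression driven by every. The procedures first, second, and third are hypothetical and only report that they were called.

```icon
procedure first(s)
   write("first sees ", s)
   fail                          # rejects its input
end

procedure second(s)
   write("second sees ", s)
   return s                      # accepts its input
end

procedure third(s)
   write("third sees ", s)
   return s                      # would also accept, if it were ever asked
end

procedure main()
   local plist

   plist := [first, second, third]

   write("-- ordinary expression: stops at the first success")
   (!plist)("x")                 # calls first, then second, and stops

   write("-- every: resumes !plist even after a success")
   every (!plist)("x")           # calls first, second, and third
end
```

In the first case third is never called, which mirrors why error(line) runs only when nothing earlier in plist succeeds.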
Bill Griswold From icon-group-request@arizona.edu Thu Feb 8 09:53:28 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13028; Thu, 8 Feb 90 09:53:28 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 8 Feb 90 09:53 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA06835; Thu, 8 Feb 90 08:47:22 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Thu, 8 Feb 90 09:55 MST Date: 8 Feb 90 16:43:27 GMT From: tank!iitmax!chien@handies.ucar.EDU Subject: icon source wanted Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <3348@iitmax.IIT.EDU> Organization: Illinois Institute of Technology, Chicago X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O Is icon source for UN*X machines available for ftp? Thanks for the info. Greg Chien Manager, Design Processes Laboratory Institute of Design Illinois Institute of Technology Internet: chien@iitmax.iit.edu From icon-group-request@arizona.edu Mon Feb 12 23:42:25 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA14475; Mon, 12 Feb 90 23:42:25 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Mon, 12 Feb 90 23:40 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA12563; Mon, 12 Feb 90 22:23:00 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Mon, 12 Feb 90 23:42 MST Date: 12 Feb 90 14:42:17 GMT From: van-bc!ubc-cs!alberta!myrias!dragos!wally@ucbvax.Berkeley.EDU Subject: compiler, compiler, where art thou? 
Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <1990Feb12.144217.17097@dragos.uucp> Organization: Orbital Mind Control Lasers, Inc. X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

So, the question of the day is: Where does one find a compiler for our wonderful Icon that runs on the Atari ST? We've found icont, and iconx, but is there to be found an iconc?

-- O o Wallace Harshaw ( ) somewhere around here... "] ["

From goer@sophist.uchicago.EDU Tue Feb 13 01:31:14 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA18977; Tue, 13 Feb 90 01:31:14 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Tue, 13 Feb 90 01:30 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 13 Feb 90 02:28:40 CST Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA29016; Tue, 13 Feb 90 02:24:10 CST Resent-Date: Tue, 13 Feb 90 01:32 MST Date: Tue, 13 Feb 90 02:24:10 CST From: Richard Goerwitz Subject: compiler, compiler Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9002130824.AA29016@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

    We've found icont, and iconx, but is there to be found an iconc?

Just about everyone would like to see a compiler, but it's a whole different ball game than an interpreter. The most recent Icon newsletter mentioned that research in this area was going on. With all the work the icon-project members put in right now, though, it might not be fitting for us to press them on this subject.

The "Icon book," as everyone affectionately calls it, speaks of a compiler (version 5 is it?).
Compilers haven't been implemented since then, though you can save an image of the executing program under Unix, chmod it so that it is an executable file, and then pretend it is a compiled program. You're working on an Atari, though, so you're out of luck. Sorry. -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From icon-group-request@arizona.edu Tue Feb 13 02:54:27 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA21988; Tue, 13 Feb 90 02:54:27 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 13 Feb 90 02:55 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA25510; Tue, 13 Feb 90 01:46:31 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Tue, 13 Feb 90 02:55 MST Date: 13 Feb 90 04:32:21 GMT From: ssbell!mcmi!unocss!dent@uunet.uu.NET Subject: Anyone using Idol..? Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <1996@unocss..unl.edu> Organization: U. of Nebraska at Omaha X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O Is anyone out there using Idol for anything? I've got it running in tandem with Icon 7.5 on a UNIX (DYNIX really) machine here, and I think it's very interesting, but I have no background in "object orientedness" at all, so I was curious to see if there were some slightly more complex examples of Idol use around. Thanks for any pointers you might want to give.. :-) -/ Dave Caplinger /--------------------------------------------------------- Microcomputer Specialist, Campus Computing, Univ. 
of Nebraska at Omaha dent@zeus.unl.edu ...!uunet!unocss!dent DENT@UNOMA1

From ralph Tue Feb 13 06:27:50 1990 Date: Tue, 13 Feb 90 06:27:50 MST From: "Ralph Griswold" Message-Id: <9002131327.AA00802@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA00802; Tue, 13 Feb 90 06:27:50 MST To: van-bc!ubc-cs!alberta!myrias!dragos!wally@ucbvax.Berkeley.EDU Subject: Re: compiler, compiler, where art thou? Cc: icon-group In-Reply-To: <1990Feb12.144217.17097@dragos.uucp> Status: O

The so-called Icon compiler, iconc, has not been supported for any version of Icon for many years, and there never has been one for the Atari ST.

The term compiler in this context is somewhat misleading. The iconc you refer to compiled Icon mostly into subroutine calls and was only 5-10% faster than the interpreter. Granted, there are other advantages to iconc, like being able to link C functions. However, iconc was not portable and we did not have the resources to maintain it as a separate program.
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph

From icon-group-request@arizona.edu Wed Feb 14 22:53:35 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA26458; Wed, 14 Feb 90 22:53:35 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 14 Feb 90 22:48 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA00866; Wed, 14 Feb 90 21:36:58 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Wed, 14 Feb 90 22:54 MST Date: 14 Feb 90 21:06:57 GMT From: esquire!yost@nyu.EDU Subject: a reads() bug Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <1785@esquire.UUCP> Organization: DP&W, New York, NY X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

#!/bin/sh
# Demonstrate Icon reads() bug on Sun4
# Reading more characters than available on a pipe can cause trouble
# Don't know if it is a system bug or an Icon bug
# May also be a problem if reading from a device.
# Works OK on the Pyramid
# Icon version 7.5
# Moral of the story, reads(,4096) at a time, max
# 2/14/90 Dave Yost, DP&W

pipebufsize=4096
just_big_enough_for_trouble1=`expr $pipebufsize + 1`
just_big_enough_for_trouble2=`expr $pipebufsize + 1`
tmp=xxx
bigfile=/etc/termcap

dd ibs=$just_big_enough_for_trouble1 count=1 < $bigfile > $tmp

cat << END > pipebug.icn
procedure main (args)
    while writes(reads(&input, $just_big_enough_for_trouble2))
    return
end
END

icont pipebug.icn

echo ">> Both of these commands should succeed, but for the bug "
echo ">> ./pipebug < $tmp | cmp - $tmp"
./pipebug < $tmp | cmp - $tmp
echo ">> cat $tmp | ./pipebug | cmp - $tmp"
cat $tmp | ./pipebug | cmp - $tmp

rm -f pipebug.icn $tmp

ality of the underlying UNIX system call.

--dave yost yost@dpw.com or uunet!esquire!yost

From kelvin@astro.cs.iastate.EDU Thu Feb 15 08:31:44 1990 Resent-From: kelvin@astro.cs.iastate.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA20024; Thu, 15 Feb 90 08:31:44 MST Received: from megaron.cs.arizona.edu by Arizona.EDU; Thu, 15 Feb 90 08:31 MST Received: from atanasoff.cs.iastate.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19794; Thu, 15 Feb 90 08:28:47 MST Received: from astro.cs.iastate.edu by atanasoff (99.99) id AA10782; Thu, 15 Feb 90 09:26:21 -0600 Received: by astro.cs.iastate.edu (3.24) id AA28956; Thu, 15 Feb 90 09:27:11 CST Resent-Date: Thu, 15 Feb 90 08:33 MST Date: Thu, 15 Feb 90 09:27:11 CST From: kelvin@astro.cs.iastate.EDU Subject: reads() considered weak Resent-To: icon-group@cs.arizona.edu To: esquire!yost@nyu.EDU Cc: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9002151527.AA28956@astro.cs.iastate.edu> In-Reply-To: esquire!yost@nyu.EDU's message of 14 Feb 90 22:00:14 GMT <1786@esquire.UUCP> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: esquire!yost@nyu.EDU X-Vms-Cc: icon-group@Arizona.edu Status: O

you might want to look at: A Stream Data Type that Supports Goal-Directed
Pattern Matching on Unbounded Sequences of Values, in Computer Languages, Vol. 15, No. 1 (jan 1990) by Kelvin Nilsen this describes one proposed solution to the sort of problems you mentioned. unfortunately, i haven't yet gathered enough external funding and/or academic rank to spend much time on development and distribution of my real-time derivative of Icon, Conicon. Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011 (515) 294-2259 kelvin@atanasoff.cs.iastate.edu uunet!atanasoff!kelvin From goer@sophist.uchicago.EDU Thu Feb 15 09:47:27 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA27427; Thu, 15 Feb 90 09:47:27 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Thu, 15 Feb 90 09:45 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 15 Feb 90 10:44:22 CST Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA01848; Thu, 15 Feb 90 10:39:46 CST Resent-Date: Thu, 15 Feb 90 09:49 MST Date: Thu, 15 Feb 90 10:39:46 CST From: Richard Goerwitz Subject: Conicon - What?!! Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9002151639.AA01848@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O > unfortunately, i haven't yet gathered enough external funding and/or > academic rank to spend much time on development and distribution of > my real-time derivative of Icon, Conicon. ^^^^^ Did you really think you could toss this one off without being asked for more information? :-) What is Conicon? -Richard L. 
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From kelvin@astro.cs.iastate.EDU Thu Feb 15 13:15:02 1990 Resent-From: kelvin@astro.cs.iastate.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA14373; Thu, 15 Feb 90 13:15:02 MST Received: from megaron.cs.arizona.edu by Arizona.EDU; Thu, 15 Feb 90 13:09 MST Received: from atanasoff.cs.iastate.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13788; Thu, 15 Feb 90 13:06:45 MST Received: from astro.cs.iastate.edu by atanasoff (99.99) id AA15938; Thu, 15 Feb 90 14:04:15 -0600 Received: by astro.cs.iastate.edu (3.24) id AA29215; Thu, 15 Feb 90 14:05:04 CST Resent-Date: Thu, 15 Feb 90 13:14 MST Date: Thu, 15 Feb 90 14:05:04 CST From: kelvin@astro.cs.iastate.EDU Subject: Conicon - What?!! Resent-To: icon-group@cs.arizona.edu To: goer@sophist.uchicago.EDU Cc: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9002152005.AA29215@astro.cs.iastate.edu> In-Reply-To: Richard Goerwitz's message of Thu, 15 Feb 90 10:39:46 CST <9002151639.AA01848@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: goer@sophist.uchicago.EDU X-Vms-Cc: icon-group@Arizona.edu Status: O Conicon is a contraction for concurrent Icon. Conicon is designed to provide the high-level power of Icon to real-time programmers. The implementation of Conicon differs somewhat from that of Icon. In particular, we use a special real-time garbage collection algorithm designed in part by me, and a different virtual machine encoding which allows real-time response to interrupts (certain machine instructions in Icon's virtual machine represent potentially unbounded amounts of computation. Since it is not possible to switch contexts in the middle of executing a particular instruction, the worst-case time required to execute a virtual machine instruction represents a lower bound on the time required to respond to a high-priority interrupt.) 
Also, Conicon provides several new (and different) programming paradigms:

 1) The stream data type represents an unbounded sequence of values.
    Generally, you can treat this like a pipe from a concurrent process,
    or as an I/O connection to the outside world (to A/D converters,
    keyboards, terminals, modems, etc...).  In Conicon, string scanning
    is replaced with stream scanning.  The integration is, I think,
    fairly clean and natural.  Streams are described more thoroughly in
    the paper mentioned in my earlier mail:

      A Stream Data Type that Supports Goal-Directed Pattern Matching on
      Unbounded Sequences of Values - Kelvin Nilsen
      Computer Languages, Vol. 15, No. 1, Jan. 90.

    I can provide reprints to anyone who is interested in this.

 2) Conicon supports concurrent processes.  These processes are spawned
    in one of two ways.  First, Icon's create operator serves in Conicon
    to create a concurrent process instead of creating a coexpression.
    A stream which represents the sequence of values generated by the
    spawned expression is automatically created when the process is
    spawned.  Second, Conicon introduces yet another operator: binary !,
    which is interpreted as "concurrent alternation."  For example,

      every write(1 to 3 ! 5 to 7)

    might output the sequence:

      5, 6, 1, 2, 3, 7

    There are a variety of useful programming techniques that can be
    based on the concurrent alternation operator.  These techniques, and
    other aspects of concurrency in Conicon are discussed more thoroughly
    in a paper submitted to Software -- Practice & Experience.  We have
    not yet heard back from the referees.  If anyone would like to see a
    draft, please send me mail...

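The binary ! example above has no direct counterpart in standard Icon,
which offers co-expressions rather than true concurrency. As a rough
illustration only, the sketch below interleaves two result sequences by
activating two co-expressions in turn. The procedure name alternate is
hypothetical (it is not part of Conicon or of the Icon library), and its
fixed round-robin order is an assumption standing in for whatever
interleaving a concurrent scheduler would actually produce.

```icon
# Hypothetical helper, not part of Conicon or the Icon library:
# generate results from co-expressions c1 and c2 alternately
# until both are exhausted.
procedure alternate(c1, c2)
    local live, c, v

    live := [c1, c2]              # co-expressions that may still produce results
    while *live > 0 do {
        c := get(live)            # take the next co-expression in line
        if v := @c then {         # activation fails once c is exhausted
            put(live, c)          # still live: send it to the back of the queue
            suspend v
        }
    }
end

procedure main()
    # Writes 1, 5, 2, 6, 3, 7 (one value per line).
    every write(alternate(create (1 to 3), create (5 to 7)))
end
```

Unlike the Conicon example, this ordering is deterministic. The sample
output quoted in the message (5, 6, 1, 2, 3, 7) depends on process
scheduling, which a co-expression sketch cannot reproduce.
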
From root@fergvax.unl.edu Fri Feb 16 07:22:53 1990
Received: from fergvax.unl.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA10598; Fri, 16 Feb 90 07:22:53 MST
Received: by fergvax.unl.edu (5.57/Ultrix2.4-C) id AA09632; Fri, 16 Feb 90 08:20:06 CST
Date: Fri, 16 Feb 90 08:20:06 CST
From: root@fergvax.unl.edu (System PRIVILEGED Account)
Message-Id: <9002161420.AA09632@fergvax.unl.edu>
To: icon-group@cs.arizona.edu
Subject: unocss.unl.edu
Cc: mkb@fergvax.unl.edu
Status: O

Dear Icon-Group List Manager:

The network that host unocss is currently on is in the process of being
changed over from a class C network to a class B network.  No mail can
get through to it.  Messages that are currently being routed through
fergvax.unl.edu are unable to get through and are spooling up here.  I
would appreciate it if you would remove them from the mailing list until
they get their network problems straightened out.  When they do, you
will need to send all icon-group mail to unocss directly.  You will need
to change the mailing address of

    payne%unocss.unl.edu@fergvax.unl.edu

to be

    payne@unocss.unomaha.edu

I do not know what the IP# will be for unocss.  But if you are running
named, you should not need it.  Thank you.

Sincerely,
FERGVAX System Manager

P.S. For your information:

                Mail Queue (6 requests)
--QID-- --Size-- -----Q-Time----- ------------Sender/Recipient------------
AA27250     2562 Thu Feb 15 14:42
        (Deferred: Connection timed out during user open with unocss.)
AA24126      481 Thu Feb 15 11:26
        (Deferred: Connection timed out during user open with unocss.)
AA23032      620 Thu Feb 15 10:31
        (Deferred: Connection timed out during user open with unocss.)
AA19377      969 Thu Feb 15 00:34
        (Deferred: Connection timed out during user open with unocss.)
AA19373     1054 Thu Feb 15 00:33
        (Deferred: Connection timed out during user open with unocss.)
AA17854      652 Tue Feb 13 10:12
        (Deferred: Connection timed out during user open with unocss.)

From goer@sophist.uchicago.EDU Thu Feb 22 20:06:32 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA18569; Thu, 22 Feb 90 20:06:32 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 22 Feb 90 20:08 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 22 Feb 90 21:06:43 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03453; Thu, 22 Feb 90 21:02:03 CST
Resent-Date: Thu, 22 Feb 90 20:08 MST
Date: Thu, 22 Feb 90 21:02:03 CST
From: Richard Goerwitz
Subject: BSD -> SYSV filename mapper (reformats entire tar archive)
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9002230302.AA03453@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Recently I had occasion to install a number of BSD archives on my home
(SysV) machine, and I got fed up with having to rename all the
directories, and altering all the source to recognize the new and
shorter names.  It seemed better to create a filter that would take a
tar archive and map everything all at once.  I started writing the
program in C, but I soon realized that doing the job in C would take a
couple of evenings.  The Icon program took just part of one evening.
It's not meant to be pretty, but since it is probably something others
would find useful, I'm posting it.

It works here at my site.  Naturally, I don't guarantee that it will
work anywhere else.

----------------------------------------------------------------------

global filenametbl, chunkset, short_chunkset   # see procedure mappiece(s)
record hblock(name,junk,size,mtime,chksum,
              linkflag,linkname,therest)       # see readtarhdr(s)

procedure main(a)

    usage := "usage: maptarfile inputfile # output goes to stdout"
    0 < *a < 2 | stop("Bad arg count.\n",usage)

    intext := open(a[1],"r") |
        stop("maptarfile: can't open ",a[1])

    # run through all the headers in the input file, filling
    # (global) filenametbl with the names of overlong files;
    # make_table_of_filenames fails if there are no such files
    make_table_of_filenames(intext) |
        stop("maptarfile: no overlong path names to map")

    # now that a table of overlong filenames exists, go back
    # through the text, remapping all occurrences of these names
    # to new, 14-char values; also, reset header checksums, and
    # reformat text into correctly padded 512-byte blocks
    seek(intext,1)
    output_mapped_headers_and_texts(intext) |
        stop("maptarfile: error reformatting text")

    close(intext)
    write_report()
    exit(0)

end


procedure make_table_of_filenames(intext)

    # global chunkset (set of overlong filenames)
    local header

    # read headers for overlong filenames; for now
    # ignore everything else
    while header := readtarhdr(reads(intext,512)) do {
        tab_nxt_hdr(intext,trim_str(header.size))
        fixpath(trim_str(header.name))
    }
    *chunkset = 0 & fail
    return &null

end


procedure output_mapped_headers_and_texts(intext)

    # remember that filenametbl, chunkset, and short_chunkset
    # (which are used by various procedures below) are GLOBAL
    local header, newtext, full_block

    # read in headers, one at a time
    while header := readtarhdr(reads(intext,512)) do {

        # replace overlong filenames with shorter ones, according to
        # the conversions specified in the global hash table filenametbl
        header.name := left(map_filenams(header.name),100,"\x00")
        header.linkname := left(map_filenams(header.linkname),100,"\x00")

        # use header.size field to read in and map the subsequent text
        newtext :=
            trim(
                map_filenams(tab_nxt_hdr(intext,trim_str(header.size))),'\x00'
            )

        # now, find the length of newtext, and insert it into the size field
        header.size := right(exbase10(*newtext,8) || " ",12," ")

        # calculate the checksum of the newly retouched header
        header.chksum := right(exbase10(get_checksum(header),8)||"\x00 ",8," ")

        # finally, join all the header fields into a new block and write it out
        full_block := ""; every full_block ||:= !header
        writes(left(full_block,512,"\x00"))

        # now we're ready to write out the text, padding the final block
        # out to an even 512 bytes if necessary; the next header must start
        # right at the beginning of a 512 byte block
        newtext ? {
            while writes(move(512))
            if not pos(0) then
                writes(left(tab(0),512,"\x00")) | fail
        }
    }
    writes(repl("\x00",512))
    return &null

end


procedure trim_str(s)

    # knock out spaces, nulls
    return s ? {
        (tab(many(' ')) | &null) &
            trim(tab(find("\x00")|0))
    } \ 1

end


procedure tab_nxt_hdr(f,size_str)

    hs := integer("8r" || size_str)
    next_header_offset := (hs / 512) * 512
    hs % 512 ~= 0 & next_header_offset +:= 512
    if 0 = next_header_offset then return ""
    return reads(f,next_header_offset) |
        stop("maptarfile: error reading in ",
             string(next_header_offset)," bytes.")

end


procedure fixpath(s)

    # fixpath is a misnomer of sorts, since it is used on
    # the first pass only, and merely examines each filename
    # in a path, using the procedure mappiece to record any
    # overlong ones in the global table filenametbl and in
    # the global sets chunkset and short_chunkset; no fixing
    # is actually done here
    s2 := ""
    s ? {
        while piece := tab(find("/")+1)
        do s2 ||:= mappiece(piece)
        s2 ||:= mappiece(tab(0))
    }
    return s2

end


procedure mappiece(s)

    # global filenametbl, chunkset, short_chunkset
    initial {
        filenametbl := table()
        chunkset := set()
        short_chunkset := set()
    }

    chunk := trim(s,'/')
    if *chunk > 14 then {
        i := 0
        repeat {
            # if the file has already been looked at, reuse its short
            # name; a bare "next" here would restart the repeat loop
            # on the same test and never terminate
            if lchunk := \filenametbl[chunk] then break
            # else find a new unique 14-character name for it
            lchunk := chunk[1:12] || right(string(i+:=1),3,"0")
            if lchunk == !filenametbl
            then next else break
        }
        # record filename in various global sets and tables
        filenametbl[chunk] := lchunk
        insert(chunkset,chunk)
        insert(short_chunkset,chunk[1:16])
    }
    else lchunk := chunk

    lchunk ||:= (s[-1] == "/")
    return lchunk

end


procedure readtarhdr(s)

    this_block := hblock()
    s ? {
        this_block.name     := move(100)    # <- to be looked at later
        this_block.junk     := move(8+8+8)  # skip the permissions, uid, etc.
        this_block.size     := move(12)     # <- to be looked at later
        this_block.mtime    := move(12)
        this_block.chksum   := move(8)      # <- to be looked at later
        this_block.linkflag := move(1)
        this_block.linkname := move(100)    # <- to be looked at later
        this_block.therest  := tab(0)
    }
    integer(this_block.size) | fail
    return this_block

end


procedure map_filenams(s)

    # chunkset is global, and contains all the overlong filenames
    # found in the first pass through the input file; here the aim
    # is to map the filenames to the shortened variants as stored
    # in filenametbl (which happens to be GLOBAL)
    local s2

    s2 := ""
    s ? {
        until pos(0) do {
            # first narrow the possibilities, then try to map;
            # short_chunkset, chunkset & filenametbl are global;
            # move(1) keeps the scan advancing when the 15-char prefix
            # matches but no full overlong name does
            if member(short_chunkset,&subject[&pos:&pos+15])
            then s2 ||:= (filenametbl[=!chunkset] | move(1))
            else s2 ||:= move(1)
        }
    }
    return s2

end


#  Author:   Ralph E. Griswold
#  Date:     June 10, 1988
#  exbase10(i,j) convert base-10 integer i to base j
#  The maximum base allowed is 36.

procedure exbase10(i,j)

   static digits
   local s, d, sign
   initial digits := &digits || &lcase
   if i = 0 then return 0
   if i < 0 then {
      sign := "-"
      i := -i
      }
   else sign := ""
   s := ""
   while i > 0 do {
      d := i % j
      if d > 9 then d := digits[d + 1]
      s := d || s
      i /:= j
      }
   return sign || s

end


procedure get_checksum(r)

    sum := 0
    r.chksum := "        "     # the 8-byte chksum field counts as blanks
    every field := !r
    do every sum +:= ord(!field)
    return sum

end


procedure write_report()

    # this procedure writes out a list of filenames which were
    # remapped (because they exceeded the SysV 14-char limit)
    local outtext, stbl, i

    # whichever open succeeds is assigned to outtext
    (outtext := (open(fname := "mapping.report","w") |
                 open(fname := "/tmp/mapping.report","w"))) |
        stop("maptarfile: Can't find a place to put mapping.report!")
    stbl := sort(filenametbl,3)
    every i := 1 to *stbl -1 by 2 do {
        write(outtext,left(stbl[i],35," ")," ",stbl[i+1])
    }
    write(&errout,"maptarfile: ",fname," contains the list of changes")
    close(outtext)
    return &null

end

From CELEX@HNYMPI52.BITNET Sat Feb 24 19:16:52 1990
Resent-From: CELEX@HNYMPI52.BITNET
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA00779; Sat, 24 Feb 90 19:16:52 MST
Received: from rvax.ccit.arizona.edu by Arizona.EDU; Sat, 24 Feb 90 19:16 MST
Received: from HNYMPI52.BITNET by rvax.ccit.arizona.edu; Sat, 24 Feb 90 19:12 MST
Resent-Date: Sat, 24 Feb 90 19:18 MST
Date: Sat, 24 Feb 90 20:03 N
From: CELEX@HNYMPI52.BITNET
Subject: strip
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id:
X-Original-To: icon-group@arizona.edu, CELEX
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

To remove unwanted characters from a string, we use the following
method (&unwanchar is the set of the unwanted characters):

    while string[upto(&unwanchar,string)+:1] := ""

Hope this helps,

Marcel Bingley

    -- C E L E X --
    University of Nijmegen
    Wundtlaan 1
    6525 XD NIJMEGEN
    The Netherlands

    Tel: (+31) (0)80 - 512117
                     - 515797

    EARN/BITNET: celex@hnympi52
    Internet:    celexmail@celex.surfnet.nl
    SURFNET:     celex::celexmail
    JANET:       celex%hnympi52@earn-relay
    PSI:         020418802007380::celexmail

From goer@sophist.uchicago.EDU Wed Feb 28 02:40:39 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01623; Wed, 28 Feb 90 02:40:39 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 28 Feb 90 02:38 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 28 Feb 90 03:37:08 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA10175; Wed, 28 Feb 90 03:32:21 CST
Resent-Date: Wed, 28 Feb 90 02:40 MST
Date: Wed, 28 Feb 90 03:32:21 CST
From: Richard Goerwitz
Subject: in situ filename truncator for tar files
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9002280932.AA10175@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Having received a number of requests for this thing from a number of
places, it seemed prudent to clean it up, comment it, fix the bugs, and
repost.  I had no idea that anyone would really _use_ it.

#-------------------------------------------------------------------
#
#  MAPTARFILE
#
#  Map 15+ char. filenames in a tar archive to 14 chars.
#  Handles both the header blocks and the source itself.
#  Obviates the need for renaming files and directories
#  by hand, and for altering source and docs to refer to
#  the new file and directory names.
#
#  Richard L. Goerwitz, III
#
#  Last modified 2/27/90
#
#-------------------------------------------------------------------

global filenametbl, chunkset, short_chunkset   # see procedure mappiece(s)
record hblock(name,junk,size,mtime,chksum,
              linkflag,linkname,therest)       # see readtarhdr(s)

procedure main(a)

    usage := "usage: maptarfile inputfile # output goes to stdout"
    0 < *a < 2 | stop("Bad arg count.\n",usage)

    intext := open(a[1],"r") |
        stop("maptarfile: can't open ",a[1])

    # Run through all the headers in the input file, filling
    # (global) filenametbl with the names of overlong files;
    # make_table_of_filenames fails if there are no such files.
    make_table_of_filenames(intext) |
        stop("maptarfile: no overlong path names to map")

    # Now that a table of overlong filenames exists, go back
    # through the text, remapping all occurrences of these names
    # to new, 14-char values; also, reset header checksums, and
    # reformat text into correctly padded 512-byte blocks.
    # Terminate output with 512 nulls.
    seek(intext,1)
    every writes(output_mapped_headers_and_texts(intext))

    close(intext)
    write_report()   # Record mapped file and dir names for future ref.
    exit(0)

end


procedure make_table_of_filenames(intext)

    local header   # chunkset is global

    # search headers for overlong filenames; for now
    # ignore everything else
    while header := readtarhdr(reads(intext,512)) do {
        # tab upto the next header block
        tab_nxt_hdr(intext,trim_str(header.size),1)
        # record overlong filenames in several global tables, sets
        fixpath(trim_str(header.name))
    }
    *chunkset = 0 & fail
    return &null

end


procedure output_mapped_headers_and_texts(intext)

    # Remember that filenametbl, chunkset, and short_chunkset
    # (which are used by various procedures below) are global.
    local header, newtext, full_block, block, lastblock

    # Read in headers, one at a time.
    while header := readtarhdr(reads(intext,512)) do {

        # Replace overlong filenames with shorter ones, according to
        # the conversions specified in the global hash table filenametbl
        # (which were generated by fixpath() on the first pass).
        header.name := left(map_filenams(header.name),100,"\x00")
        header.linkname := left(map_filenams(header.linkname),100,"\x00")

        # Use header.size field to determine the size of the subsequent text.
        # Read in the text as one string.  Map overlong filenames found in it
        # to shorter names as specified in the global hash table filenametbl.
        newtext := map_filenams(tab_nxt_hdr(intext,trim_str(header.size)))

        # Now, find the length of newtext, and insert it into the size field.
        header.size := right(exbase10(*newtext,8) || " ",12," ")

        # Calculate the checksum of the newly retouched header.
        header.chksum := right(exbase10(get_checksum(header),8)||"\x00 ",8," ")

        # Finally, join all the header fields into a new block and write it out
        full_block := ""; every full_block ||:= !header
        suspend left(full_block,512,"\x00")

        # Now we're ready to write out the text, padding the final block
        # out to an even 512 bytes if necessary; the next header must start
        # right at the beginning of a 512-byte block.
        newtext ? {
            while block := move(512)
            do suspend block
            pos(0) & next
            lastblock := left(tab(0),512,"\x00")
            suspend lastblock
        }
    }
    # Write out a final null-filled block.  Some tar programs will write
    # out 1024 nulls at the end.  Dunno why.
    return repl("\x00",512)

end


procedure trim_str(s)

    # Knock out spaces, nulls from those crazy tar header
    # block fields (some of which end in a space and a null,
    # some just a space, and some just a null [anyone know
    # why?]).
    return s ? {
        (tab(many(' ')) | &null) &
            trim(tab(find("\x00")|0))
    } \ 1

end


procedure tab_nxt_hdr(f,size_str,firstpass)

    # Tab upto the next header block.  Return the bypassed text
    # as a string (this value is not always used).
    local hs, next_header_offset

    hs := integer("8r" || size_str)
    next_header_offset := (hs / 512) * 512
    hs % 512 ~= 0 & next_header_offset +:= 512
    if 0 = next_header_offset then return ""
    else {
        # if this is pass no. 1 don't bother returning a value; we're
        # just collecting long filenames;
        if \firstpass then {
            seek(f,where(f)+next_header_offset)
            return
        }
        else {
            return reads(f,next_header_offset)[1:hs+1] |
                stop("maptarfile: error reading in ",
                     string(next_header_offset)," bytes.")
        }
    }

end


procedure fixpath(s)

    # Fixpath is a misnomer of sorts, since it is used on
    # the first pass only, and merely examines each filename
    # in a path, using the procedure mappiece to record any
    # overlong ones in the global table filenametbl and in
    # the global sets chunkset and short_chunkset; no fixing
    # is actually done here.
    s2 := ""
    s ? {
        while piece := tab(find("/")+1)
        do s2 ||:= mappiece(piece)
        s2 ||:= mappiece(tab(0))
    }
    return s2

end


procedure mappiece(s)

    # Check s (the name of a file or dir as recorded in the tar header
    # being examined) to see if it is over 14 chars long.  If so,
    # generate a unique 14-char version of the name, and store
    # both values in the global hashtable filenametbl.  Also store
    # the original (overlong) file name in chunkset.  Store the
    # first fifteen chars of the original file name in short_chunkset.
    # Sorry about all of the tables and sets.  It actually makes for
    # a reasonably efficient program.  Doing away with both sets,
    # while possible, causes a tenfold drop in execution speed!

    # global filenametbl, chunkset, short_chunkset
    local j, ending

    initial {
        filenametbl := table()
        chunkset := set()
        short_chunkset := set()
    }

    chunk := trim(s,'/')
    if *chunk > 14 then {
        i := 0
        repeat {
            # if the file has already been looked at, reuse its short
            # name; a bare "next" here would restart the repeat loop
            # on the same test and never terminate
            if lchunk := \filenametbl[chunk] then break
            # else find a new unique 14-character name for it
            # preserve important suffixes like ".Z," ".c," etc.
            if chunk ?
                (tab(find(".")),
                 ending := move(1) || tab(any(&ascii)),
                 pos(0))
            then lchunk := chunk[1:11] || right(string(i+:=1),2,"0") || ending
            else lchunk := chunk[1:12] || right(string(i+:=1),3,"0")
            if lchunk == !filenametbl
            then next else break
        }
        # record filename in various global sets and tables
        filenametbl[chunk] := lchunk
        insert(chunkset,chunk)
        insert(short_chunkset,chunk[1:16])
    }
    else lchunk := chunk

    lchunk ||:= (s[-1] == "/")
    return lchunk

end


procedure readtarhdr(s)

    # Read the silly tar header into a record.  Note that, as was
    # complained about above, some of the fields end in a null, some
    # in a space, and some in a space and a null.  The procedure
    # trim_str() may (and in fact often _is_) used to remove this
    # extra garbage.

    this_block := hblock()
    s ? {
        this_block.name     := move(100)    # <- to be looked at later
        this_block.junk     := move(8+8+8)  # skip the permissions, uid, etc.
        this_block.size     := move(12)     # <- to be looked at later
        this_block.mtime    := move(12)
        this_block.chksum   := move(8)      # <- to be looked at later
        this_block.linkflag := move(1)
        this_block.linkname := move(100)    # <- to be looked at later
        this_block.therest  := tab(0)
    }
    integer(this_block.size) | fail  # If it's not an integer, we've hit
                                     # the final (null-filled) block.
    return this_block

end


procedure map_filenams(s)

    # Chunkset is global, and contains all the overlong filenames
    # found in the first pass through the input file; here the aim
    # is to map these filenames to the shortened variants as stored
    # in filenametbl (GLOBAL).
    local s2

    s2 := ""
    s ? {
        until pos(0) do {
            # first narrow the possibilities, using short_chunkset
            if member(short_chunkset,&subject[&pos:&pos+15])
            # then try to map from a long to a shorter 14-char filename
            then s2 ||:= (filenametbl[=!chunkset] | move(1))
            else s2 ||:= move(1)
        }
    }
    return s2

end


#  From the IPL.  Thanks, Ralph -
#  Author:   Ralph E. Griswold
#  Date:     June 10, 1988
#  exbase10(i,j) convert base-10 integer i to base j
#  The maximum base allowed is 36.

procedure exbase10(i,j)

   static digits
   local s, d, sign
   initial digits := &digits || &lcase
   if i = 0 then return 0
   if i < 0 then {
      sign := "-"
      i := -i
      }
   else sign := ""
   s := ""
   while i > 0 do {
      d := i % j
      if d > 9 then d := digits[d + 1]
      s := d || s
      i /:= j
      }
   return sign || s

end

# end IPL material


procedure get_checksum(r)

    # Calculates the new value of the checksum field for the
    # current header block.  Note that the specification says
    # that, when calculating this value, the chksum field must
    # be blank-filled.

    sum := 0
    r.chksum := "        "     # the 8-byte chksum field counts as blanks
    every field := !r
    do every sum +:= ord(!field)
    return sum

end


procedure write_report()

    # This procedure writes out a list of filenames which were
    # remapped (because they exceeded the SysV 14-char limit),
    # and then notifies the user of the existence of this file.
    local outtext, stbl, i

    # whichever open succeeds is assigned to outtext
    (outtext := (open(fname := "mapping.report","w") |
                 open(fname := "/tmp/mapping.report","w"))) |
        stop("maptarfile: Can't find a place to put mapping.report!")
    stbl := sort(filenametbl,3)
    every i := 1 to *stbl -1 by 2 do {
        write(outtext,left(stbl[i],35," ")," ",stbl[i+1])
    }
    write(&errout,"maptarfile: ",fname," contains the list of changes")
    close(outtext)
    return &null

end

From tenaglia@fps.mcw.edu Wed Feb 28 08:20:02 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA12588; Wed, 28 Feb 90 08:20:02 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA19229; Wed, 28 Feb 90 10:17:25 EST
Received: by uwm.edu; id AA23161; Wed, 28 Feb 90 09:08:35 -0600
Message-Id: <9002281508.AA23161@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Wed, 28 Feb 90 08:29:52 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail); Wed, 28 Feb 90 08:29:37 CDT
Date: Wed, 28 Feb 90 08:29:37 CDT
From: Chris Tenaglia - 257-8765
To: icon-group@cs.arizona.edu
Subject: And now something useless
Status: O

Icon is a great language for recreational programming as well.
I recently read a Scientific American where someone described a program
that takes a known text and scrambles it in a most bizarre fashion.  The
output is not unlike a Max Headroom monolog.  Also they interfaced it
with a rhyming engine to generate bizarre poetry.  Well I thought it
would be fun to try doing it in icon.  Below is the scrambler.  I guess
I was lazy not to do the rhyming engine.  It chooses subsequent words
based on the likelihood of them occurring after the current word.  This
icon program accomplishes that.  I find it rather amusing what it does
to my own documentation.  Perhaps someone has a more clever method, or
perhaps someone would want to post a rhyming engine?

##################### 80 lines follow ############################
#                                                                #
# Poet.Icn                   02/28/90                BY TENAGLIA #
#                                                                #
# THIS PROGRAM TAKES A DOCUMENT AND RE-OUTPUTS IT IN A CLEVERLY  #
# SCRAMBLED FASHION. IT USES THE NEXT TWO MOST LIKELY WORDS TO   #
# FOLLOW. USAGE : ICONX POET INPUT_FILE [OUTPUT_FILE]            #
# IF NO OUTPUT FILE IS SPECIFIED, THE GIBBERISH IS SENT TO TTY   #
# THE CONCEPT WAS FOUND IN A RECENT SCIENTIFIC AMERICAN AND ICON #
# SEEMED TO OFFER THE BEST IMPLEMENTATION.                       #
#                                                                #
##################################################################
global vocab,index
procedure main(param)
  source := param[1] | input("_Source:")
  target := param[2] | "tt:"
  (in  := open(source))     | stop("Can't open ",source)
  (out := open(target,"w")) | stop("Can't open ",target)
  vocab:= []
  index:= table([])
  write("Loading vocabulary")
  while line := read(in) do
    {
    vocab |||:= parse(line,' ')
    writes(".")
    }
  close(in)
  write("\nindexing...\n")
  every i := 1 to *vocab-2 do index[vocab[i]] |||:= [i]
  index[vocab[-2]] |||:= [-2]     # wrap end to front in order to
  index[vocab[-1]] |||:= [-1]     # prevent stuck loop if last word chosen
  n := -1 ; &random := map(&clock,":","0") ; line := ""
  write("\n")
  every 1 to *vocab/2 do
    {
    (n > 1) | (n := ?(*vocab-2))
    word    := vocab[n]
    follows := vocab[(?(index[word]))+1]
    n       := (?(index[follows])) + 1
    if (*line + *word + *follows + 2) > 80 then
      {
      write(out,line)
      line := ""
      }
    line ||:= word || " " || follows || " "
    }
  write(out,line,".")
  close(out)
  end
##################################################################
#                                                                #
# THIS PROCEDURE PULLS ALL THE ELEMENTS (TOKENS) OUT OF A LINE   #
# BUFFER AND RETURNS THEM IN A LIST. A VARIABLE NAMED 'CHARS'    #
# CAN BE STATICALLY DEFINED HERE OR GLOBAL. IT IS A CSET THAT    #
# CONTAINS THE VALID CHARACTERS THAT CAN COMPOSE THE ELEMENTS    #
# ONE WISHES TO EXTRACT.                                         #
#                                                                #
##################################################################
procedure parse(line,delims)
  static chars
  chars := &cset -- delims
  tokens := []
  line ? while tab(upto(chars)) do put(tokens,tab(many(chars)))
  return tokens
  end
##################################################################
#                                                                #
# THIS PROCEDURE IS TERRIBLY HANDY IN PROMPTING AND GETTING      #
# AN INPUT STRING                                                #
#                                                                #
##################################################################
procedure input(prompt)
  writes(prompt)
  return read()
  end

Yours truly,

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu

From icon-group-request@arizona.edu Wed Feb 28 18:13:45 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA27832; Wed, 28 Feb 90 18:13:45 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 28 Feb 90 18:13 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AB00291; Tue, 27 Feb 90 20:19:27 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 28 Feb 90 18:14 MST
Date: 27 Feb 90 18:20:53 GMT
From: esquire!yost@nyu.EDU
Subject: Icon instead of shell scripts or C code
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <1806@esquire.UUCP>
Organization: DP&W, New York, NY
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

The world is divided into two kinds of programs:
  1. Simple data manipulation batch programs (e.g. grep)
  2. All other programs.
Unfortunately, Icon doesn't reach much beyond Class 1.

It would be great to be able to use Icon for all programs.

There are many reasons to write programs on unix in Icon instead of
the shell or C, and I'm sure we all know what those are.

Unfortunately, those reasons don't stand a chance against the
reason why you can't use Icon:

    The vast majority of unix system calls are not supported.

As a result, Icon has these deficiencies, among others:

    Lack of sophistication in the running of subprocesses:
        Keyboard interrupt during a system() command is ignored
        No way to run a unix command and capture its output in a variable
        You can't run a program in the background and get its
          process (group) id for a later kill
        No fork, exec, wait, etc.
No trapping of signals, and therefore no cleanup on forced exit, no timeouts Has anyone implemented more of the unix system calls? Would you please tell us about it? Icon is so nice. It's a shame it can't be used for more things. --dave yost yost@dpw.com or uunet!esquire!yost Please ignore the From or Reply-To fields above, if different. P.S. Here is a routine that adds a little to Icon's capability to replace shell scripts:
# Run a command with the contents of an Icon string variable as input
# Note: If the string is not newline-terminated, it will appear to the
# command as if it were. There are workarounds for this
procedure tosystem (inputstring, command)
   return system ("<<'**END**' " || command || "\n" || inputstring ||
                  (if inputstring[-1] ~== "\n" then "\n" else "") ||
                  "**END**\n")
end
From jeffc@osf.ORG Mon Mar 5 09:14:07 1990 Resent-From: jeffc@osf.ORG Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA25244; Mon, 5 Mar 90 09:14:07 MST Received: from osf.osf.org by Arizona.EDU; Mon, 5 Mar 90 09:06 MST Received: from soba.osf.org by osf.osf.org (5.61/OSF 0.9) id AA16654; Mon, 5 Mar 90 11:02:30 -0500 Resent-Date: Mon, 5 Mar 90 09:11 MST Date: Mon, 05 Mar 90 11:02:29 -0500 From: Jeff Carter Subject: Porting Icon 7.5 to a new and Unique UNIX machine Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9533.636652949@soba> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I recently began porting (a reasonably simple task) Icon to a DECstation 3100 (MIPS RISC chipset, runs Ultrix. aka PMAX). I have run into a couple of things that hopefully someone else out there has a solution for, or suggestions how to debug: (1) There are numerous calls to fopen() with "rb" and "wb" modes that are _not_ surrounded by OS-specific #ifdef's.
This leads me to believe (unfortunately) that maybe the particular version I have hasn't been run on a wide variety of UNIX machines. Any comments? Are other versions of UNIX more forgiving than ULTRIX 3.0? (2) Floating-point conversions. I get numerous failures of the "eval" and "fncs" tests that seem to stem from a problem with the conversion of real numbers to their string representations. For example, from the "eval" test:
3c3
< 2.0 === +2.0 ----> 9.018482111602407e-O4
---
> 2.0 === +2.0 ----> 2.0
And from the "fncs" test:
3c3
< copy(1.0) ----> 9.017964046223754e-O4
---
> copy(1.0) ----> 1.0
There are numerous other examples, but almost all of the reported errors are similar to these. (3) Memory allocation. Early in startup, the executable calls fopen() in order to get the code file. This, unfortunately, causes fopen() to call malloc(), which immediately fails because initalloc() hasn't been called. And initalloc() doesn't get called until after the header is read from the code file. This forces me to use the static allocation versions of the memory management routines. The first application that I want to use this for wants to use a _lot_ of string space. I keep getting "out of space in string region" errors, and having to restart with larger and larger values. This is a royal pain. Has anyone looked at making the code region be allocated out of the static memory region, or some other technique that would let me initialize the memory allocation routine earlier? Is there a particular reason why this _won't_ work (so I don't waste my time on it, only to discover the fatal flaw.)
jeff carter jeffc@osf.org From ralph Mon Mar 5 09:29:18 1990 Resent-From: "Ralph Griswold" Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA26537; Mon, 5 Mar 90 09:29:18 MST Received: from megaron.cs.arizona.edu by Arizona.EDU; Mon, 5 Mar 90 09:26 MST Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA26252; Mon, 5 Mar 90 09:26:10 MST Resent-Date: Mon, 5 Mar 90 09:27 MST Date: Mon, 5 Mar 90 09:26:10 MST From: Ralph Griswold Subject: RE: Porting Icon 7.5 to a new and Unique UNIX machine Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu, jeffc@osf.ORG Resent-Message-Id: Message-Id: <9003051626.AA26252@megaron.cs.arizona.edu> In-Reply-To: <9533.636652949@soba> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu, jeffc@osf.ORG Status: O Version 8 of Icon, to be out shortly, will support the DECstation and several other newer workstations, including the Sun SPARCstation and the NeXT machine. (All of the problems noted in earlier mail are corrected in Version 8.) Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From tenaglia@fps.mcw.edu Wed Mar 14 16:18:58 1990 Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA26299; Wed, 14 Mar 90 16:18:58 MST Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA18368; Wed, 14 Mar 90 18:17:58 EST Received: by uwm.edu; id AA22806; Wed, 14 Mar 90 16:49:13 -0600 Message-Id: <9003142249.AA22806@uwm.edu> Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Wed, 14 Mar 90 16:02:34 CDT Received: by mis.mcw.edu (DECUS UUCP w/Smail); Wed, 14 Mar 90 15:05:15 CDT Date: Wed, 14 Mar 90 15:05:15 CDT From: Chris Tenaglia - 257-8765 To: icon-group@cs.arizona.edu Subject: Handy Icon Procedure for Reports Status: O Dear Icon Group : I am including a handy procedure that can reformat strings. 
It's fairly intuitive as far as usage is concerned. My implementation is pretty plain. Perhaps someone has a more elegant expression that makes use of string scanning or co-expressions? Enjoy!
########################################################################
#                                                                      #
# THIS PROCEDURE IS A HANDY STRING REMAPPER/FORMATTER AND IT'S         #
# VERY HANDY FOR REPORT GENERATION. USAGE PATCH(VARIABLE,MASK)         #
# WHERE VARIABLE IS A STRING AND MASK IS A STRING. MASK CONTAINS       #
# A SEQUENCE THAT TRANSFORMS VARIABLE. THE # CHARACTER MEANS TO        #
# COPY THE CHARACTER AT THAT POSITION. THE $ CHARACTER MEANS TO        #
# DELETE THE CURRENT CHARACTER AT THAT POSITION. ANY OTHER BYTES       #
# GET INSERTED INTO THE VARIABLE AT THEIR RESPECTIVE POSITIONS.        #
# EXAMPLES : patch("12/03/89","##$##$##") returns 120389               #
#            patch("120389","##/##/19##") returns 12/03/1989           #
#            patch("12/03/1989","##$") returns 12                      #
#                                                                      #
########################################################################
procedure patch(var,mask)
   text := ""
   i := 0
   every mark := !mask do
      {
      case mark of
         {
         "#" : {
               text ||:= var[(i+:=1)]
               next
               }
         "$" : {
               i +:= 1
               next
               }
         default : text ||:= mark
         }
      }
   return text
end
#############################################################
Chris Tenaglia (System Manager) Medical College of Wisconsin 8701 W. Watertown Plank Rd. Milwaukee, WI 53226 (414)257-8765 tenaglia@mis.mcw.edu From wgg@cs.washington.edu Wed Mar 14 17:54:59 1990 Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02804; Wed, 14 Mar 90 17:54:59 MST Received: by june.cs.washington.edu (5.61/7.0jh) id AA20769; Wed, 14 Mar 90 16:54:34 -0800 Date: Wed, 14 Mar 90 16:54:34 -0800 From: wgg@cs.washington.edu (William Griswold) Return-Path: Message-Id: <9003150054.AA20769@june.cs.washington.edu> To: icon-group@cs.arizona.edu, tenaglia@mis.mcw.edu Subject: Re: Handy Icon Procedure for Reports Status: O An Icon programmer writes...
>Date: Wed, 14 Mar 90 15:05:15 CDT >From: Chris Tenaglia - 257-8765 >To: icon-group@cs.arizona.edu >Subject: Handy Icon Procedure for Reports >Errors-To: icon-group-errors@cs.arizona.edu >Status: R > >Dear Icon Group : > >I am including a handy procedure that can reformat strings. It's >fairly intuitive as far as usage is concerned. My implementation >is pretty plain. Perhaps someone has a more elegant expression >that makes use of string scanning or co-expressions? Enjoy! > Although the paradigm is a little different, there is a whole class of problems like this that can be cleverly implemented in one line with the map() function. In one of the later chapters of the Icon book there are several examples. Perhaps someone would like to submit some.... Bill Griswold From goer@sophist.uchicago.EDU Wed Mar 14 18:04:23 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA03325; Wed, 14 Mar 90 18:04:23 MST Return-Path: goer@Arizona.edu Received: from tank.uchicago.edu by Arizona.EDU; Wed, 14 Mar 90 17:55 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 14 Mar 90 18:54:59 CST Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03952; Wed, 14 Mar 90 18:26:35 CST Resent-Date: Wed, 14 Mar 90 18:02 MST Date: Wed, 14 Mar 90 18:26:35 CST From: Richard Goerwitz Subject: patch; using string scanning Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9003150026.AA03952@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I liked the previous posting, and I don't think there was anything wrong with it. String scanning just seems a bit clearer to me than the i/j stuff. This is how I would have done it:
procedure patch(var,mask)
   text := ""
   var ? {
      every chr := !mask do {
         case chr of {
            "#"     : text ||:= move(1)
            "$"     : move(1)
            default : text ||:= chr
            }
         }
      }
   return text
end
Warning, warning: This code fragment has not been tested (though with Icon it's pretty hard to screw up something of this sort). -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From icon-group-request@arizona.edu Fri Mar 16 15:06:10 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01622; Fri, 16 Mar 90 15:06:10 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 16 Mar 90 13:52 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA29091; Fri, 16 Mar 90 12:43:34 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Fri, 16 Mar 90 14:06 MST Date: 16 Mar 90 15:58:20 GMT From: mcsun!ukc!dcl-cs!se@uunet.uu.NET Subject: icon on a PC Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <891@dcl-vitus.comp.lancs.ac.uk> Organization: Department of Computing at Lancaster University, UK. X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I recently helped to install Icon on the University's Sequent Symmetry S81 and then put on some workshops to 'educate the masses'. I'm now being inundated with questions from people who were so impressed they got a copy to run on their PCs. They're now coming to me with questions about the implementation that I cannot answer. Anyone care to help? 1) Is there a PC version of icon which creates an executable file instead of having to run the ICONCX program every time? 2) Text files which I want to process using icon involve home-made fonts created in Pascal. What is the possibility of processing such fonts in icon?
3) I keep getting an error message 'inadequate space in block region'. Is there an environment variable that can be set to stop this? This happens with long files. Thanks in advance for any light shed on these problems. Steve -- NAME: Steve Elliott WORK PHONE: +44 524 65201 ext 3783 EMAIL: se@uk.ac.lancs.comp POST: University of Lancaster, Department of Computing, Engineering Building, Bailrigg, Lancaster, LA1 4YR, UK. From nowlin@iwtqg.att.COM Fri Mar 16 17:23:41 1990 Resent-From: nowlin@iwtqg.att.COM Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA12143; Fri, 16 Mar 90 17:23:41 MST Received: from att-in.att.com by Arizona.EDU; Fri, 16 Mar 90 15:47 MST Resent-Date: Fri, 16 Mar 90 16:38 MST Date: Fri, 16 Mar 90 16:41 CST From: nowlin@iwtqg.att.COM Subject: RE: icon on a PC Resent-To: icon-group@cs.arizona.edu To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu Resent-Message-Id: Message-Id: >From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268) X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu Status: O > 1) Is there a PC version of icon which creates an executable file > instead of having to run the ICONCX program every time? No. > 2) Text files which I want to process using icon involve home-made > fonts created in Pascal. What is the possibility of processing such > fonts in icon? That's up to you to implement but Icon should be up to it. > 3) I keep getting an error message 'inadequate space in block region'. > Is there an environment variable that can be set to stop this? This > happens with long files. The third one I'm familiar with on a number of systems. Define HEAPSIZE to be larger than the default for your system to get rid of that problem. The default on the system I use (3B2) is 51,200 so I use 100,000 when I start to get the block region warning. 
Jerry Nowlin From cjeffery Fri Mar 16 17:41:50 1990 Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13199; Fri, 16 Mar 90 17:41:50 MST Date: Fri, 16 Mar 90 17:41:48 mst From: "Clinton Jeffery" Message-Id: <9003170041.AA14968@caslon> Received: by caslon; Fri, 16 Mar 90 17:41:48 mst To: icon-group In-Reply-To: nowlin@iwtqg.att.COM's message of Fri, 16 Mar 90 16:41 CST Subject: icon on a PC Status: O >> 3) I keep getting an error message 'inadequate space in block region'. >> Is there an environment variable that can be set to stop this? This >> happens with long files. >The third one I'm familiar with on a number of systems. Define HEAPSIZE to >be larger than the default for your system to get rid of that problem. The >default on the system I use (3B2) is 51,200 so I use 100,000 when I start >to get the block region warning. This is the correct answer. Unfortunately, I have my doubts as to whether most MS-DOS Icon implementations can support HEAPSIZE values larger than 64K due to the segmentation of the 8086 architecture. Large Icon programs have to be designed well in order to run under MS-DOS. Version 8.0 of Icon is more space-efficient in its use of the block region. 
From icon-group-request@arizona.edu Tue Mar 20 05:56:18 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA05939; Tue, 20 Mar 90 05:56:18 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 20 Mar 90 05:46 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20684; Tue, 20 Mar 90 04:33:00 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Tue, 20 Mar 90 05:48 MST Date: 18 Mar 90 22:36:53 GMT From: cs.utexas.edu!news-server.csri.toronto.edu!qucdn!walmslec@tut.cis.ohio-state.EDU Subject: RE: icon on a PC Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <90077.173653WALMSLEC@QUCDN.BITNET> Organization: Queen's University at Kingston X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: , <9003170041.AA14968@caslon> Status: O Regarding Icon version 8.0. Is it available yet, if not then when, and what new features, fixes will it provide? thanks chris ------- |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~| | Christopher J. M. 
Walmsley | Queen's University | | BITNET: WALMSLEC@QUCDN | Kingston, Ontario | | X.400: Christopher.Walmsley@QueensU.CA | Canada | | | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From ralph Tue Mar 20 06:26:57 1990 Resent-From: "Ralph Griswold" Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA07059; Tue, 20 Mar 90 06:26:57 MST Received: from megaron.cs.arizona.edu by Arizona.EDU; Tue, 20 Mar 90 06:07 MST Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA06718; Tue, 20 Mar 90 06:06:58 MST Resent-Date: Tue, 20 Mar 90 06:18 MST Date: Tue, 20 Mar 90 06:06:58 MST From: Ralph Griswold Subject: RE: icon on a PC Resent-To: icon-group@cs.arizona.edu To: cs.utexas.edu!news-server.csri.toronto.edu!qucdn!walmslec@tut.cis.ohio-state.EDU, icon-group@arizona.edu Resent-Message-Id: Message-Id: <9003201306.AA06718@megaron.cs.arizona.edu> In-Reply-To: <90077.173653WALMSLEC@QUCDN.BITNET> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: cs.utexas.edu!news-server.csri.toronto.edu!qucdn!walmslec@tut.cis.ohio-state.EDU, icon-group@Arizona.edu Status: O Version 8 of Icon will be released on a system-by-system basis as we get individual implementations and documentation done. We expect to release Version 8 for UNIX and VMS in a few weeks. Others will follow as time and resources permit. We'll provide a summary of new features and other relevant information when the first release is announced. 
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From tenaglia@fps.mcw.edu Wed Mar 21 11:26:58 1990 Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA24446; Wed, 21 Mar 90 11:26:58 MST Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA14178; Wed, 21 Mar 90 13:26:47 EST Received: by uwm.edu; id AA06632; Wed, 21 Mar 90 12:25:18 -0600 Message-Id: <9003211825.AA06632@uwm.edu> Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Wed, 21 Mar 90 11:54:37 CDT Received: by mis.mcw.edu (DECUS UUCP w/Smail); Wed, 21 Mar 90 11:54:14 CDT Date: Wed, 21 Mar 90 11:54:14 CDT From: Chris Tenaglia - 257-8765 To: icon-group@cs.arizona.edu Subject: Icon Ideas ? Status: O I've noticed in the Icon Newsletter a discussion of an object oriented version of Icon (IDOL?). It makes use of the $ character. Somehow I can never quite seem to comprehend this 'object oriented' stuff. It looks like kludged terminology. But back to the $. I wonder about the use of the $ to create user-defined operators. For example:
operation("$+",lst)
   case *lst of
      {
      1 : return &null                      # unary form not defined
      2 : return lst[1] || " " || lst[2]    # binary form ok
      }
end
Later ...
a := b $+ c     # 'a' is 'b' appended with a space and then 'c'
d := $+e        # unary form should fail or be &null
x $+:= z        # augmented form appends blank and 'z' to 'x'
Is this a desirable 'feature' for Icon 8.2 or 9.0? Or would it be impractical? Chris Tenaglia (System Manager) Medical College of Wisconsin 8701 W. Watertown Plank Rd.
Milwaukee, WI 53226 (414)257-8765 tenaglia@mis.mcw.edu From icon-group-request@arizona.edu Thu Mar 22 22:36:35 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA09424; Thu, 22 Mar 90 22:36:35 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 22 Mar 90 22:31 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA18719; Thu, 22 Mar 90 21:27:49 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Thu, 22 Mar 90 22:34 MST Date: 23 Mar 90 05:27:19 GMT From: zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@tut.cis.ohio-state.EDU Subject: RE: icon on a PC Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <3028@uwm.edu> Organization: University of Wisconsin-Milwaukee X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: <891@dcl-vitus.comp.lancs.ac.uk> Status: O I use custom made characters on my Zenith by loading a table of chars 128-255 into memory, then going to the graphics screen with writes("\e[=6h") I first install the ANSI.SYS terminal driver by including the relevant command in the CONFIG.SYS file. -- Alan D. 
Corre Department of Hebrew Studies University of Wisconsin-Milwaukee (414) 229-4245 PO Box 413, Milwaukee, WI 53201 corre@csd4.csd.uwm.edu From icon-group-request@arizona.edu Mon Mar 26 09:07:23 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01386; Mon, 26 Mar 90 09:07:23 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Mon, 26 Mar 90 09:02 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA07097; Mon, 26 Mar 90 07:53:56 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Mon, 26 Mar 90 09:05 MST Date: 26 Mar 90 15:39:54 GMT From: consp22@bingvaxu.cc.binghamton.EDU Subject: Need general Info Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <3210@bingvaxu.cc.binghamton.edu> Organization: SUNY Binghamton POD consultants X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I was asked to do some 'poking' around to see what I could find out about icon. We are looking into using it to pre-process some information before moving the data down to a micro. If somebody would to be so kind as to give me or direct me to information on the language. Thank you, ------------------------------------------------------------------------------- | Consp22@Bingsuns.pod.binghamton.edu | SUNY-B Computer Consultants - | | Consp22@Bingvaxu.cc.binghamton.edu | Trying to keep the world safe from | |---------------------------------------| the SUNY-B Computer users. | | Consultant/Techie - World Computers |-------------------------------------| | Computer Cons. 
- SUNY Binghamton | Darren `Mac Hack' Handler | |-----------------------------------------------------------------------------| I don't know if I am going to heaven or hell, I just hope God grades on a curve From tenaglia@fps.mcw.edu Mon Mar 26 18:29:53 1990 Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA03233; Mon, 26 Mar 90 18:29:53 MST Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA16891; Mon, 26 Mar 90 19:19:10 EST Received: by uwm.edu; id AA06342; Mon, 26 Mar 90 14:43:35 -0600 Message-Id: <9003262043.AA06342@uwm.edu> Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Mon, 26 Mar 90 14:04:18 CDT Received: by mis.mcw.edu (DECUS UUCP w/Smail); Mon, 26 Mar 90 13:31:24 CDT Date: Mon, 26 Mar 90 13:31:24 CDT From: Chris Tenaglia - 257-8765 To: icon-group@cs.arizona.edu Subject: Re: Icon Ideas ? X-Vms-Mail-To: UUCP%"langley@DG-RTP.DG.COM" Status: O In response to a response to my posting,.... > > But back to the $. > > > > I wonder about the use of the $ to create user define operators. For example: > > > > operation("$+",lst) > > case *lst of > > { > > 1 : return &null # unary form not defined > > 2 : return lst[1] || " " || lst[2] # binary form ok > > } > > end > > > > Later ... > > > > a := b $+ c # 'a' is 'b' appended with a space and then 'c' > > d := $+e # unary form should fail or be &null > > x $+:= z # augmented form appends blank and 'z' to 'x' > > > > Is this a desirable 'feature' for Icon 8.2 or 9.0? Or would it be impractical? > > Wouldn't general operator overloading in Icon be better? Yes, I gave some thought to overloading some of the existing Icon operators. I had had some classes in ADA language which permits this. However, as a group of programmers (40 of us) discussed it. The thought that + could be * or - made us nervous. A language such as ADA is very strict about DATA TYPES, and for it NOT to be strict with the OPERATORS seemed sort of inconsistent. 
My icon background gives me the philosophy of typed operators as well as (loosely) typed data/procedures. It seems more natural to keep the user defined objects (operators, procedures, variables) separated from the built in ones. This is only my opinion, and it may not line up with the goals of the Icon project in the long run. Chris Tenaglia (System Manager) Medical College of Wisconsin 8701 W. Watertown Plank Rd. Milwaukee, WI 53226 (414)257-8765 tenaglia@mis.mcw.edu From nowlin@iwtqg.att.COM Tue Mar 27 00:16:50 1990 Resent-From: nowlin@iwtqg.att.COM Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA21961; Tue, 27 Mar 90 00:16:50 MST Received: from att-in.att.com by Arizona.EDU; Tue, 27 Mar 90 00:05 MST Resent-Date: Tue, 27 Mar 90 00:14 MST Date: Mon, 26 Mar 90 23:15 CST From: nowlin@iwtqg.att.COM Subject: RE: Icon Ideas? (operators) Resent-To: icon-group@cs.arizona.edu To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu Resent-Message-Id: Message-Id: >From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268) X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu Status: O > In response to a response to my posting,.... > > > > But back to the $. > > > > > > I wonder about the use of the $ to create user define operators. > > > > Wouldn't general operator overloading in Icon be better? > > Yes, I gave some thought to overloading some of the existing Icon operators. > > ... > > My icon background gives me the philosophy of typed operators as well as > (loosely) typed data/procedures. It seems more natural to keep the user > defined objects (operators, procedures, variables) separated from the built > in ones. This is only my opinion, and it may not line up with the goals > of the Icon project in the long run. > > Chris Tenaglia (System Manager) I can't remember if Bill's object oriented Icon has operator and function overloading or not. 
This is my two cents worth and if I've got it all wrong I trust someone will let me know. The language that has overloaded operators that I'm most familiar with is C++. It discriminates between overloaded operators (functions) by enforcing strict typing of operands (arguments). This is how the compiler determines which operation to perform on the operands. Operators can appear to take on almost any type for the programmer working with well defined C++ classes. Icon, on the other hand, has operands or variables that can be any type. To discriminate between different types of operands Icon uses fairly strongly typed operators (and functions). There are exceptions (assignment) but for the most part the operators in Icon are type specific. I know this because of all the run time errors I generate. You get a great deal of automatic type conversion in Icon but it's driven by the operators more than the types of the operands. You can add two strings of digits in Icon with the "+" operator but you get a numeric result, not a longer string of digits. You can also concatenate two numbers into a string of digits with the "||" operator. To allow overloaded operators would violate this scheme. How would Icon know whether to do automatic type conversion or try for another version of the operator that was a better fit to the given operands? Someone with experience in the implementation could shed more light on this. User defined operators that are distinguished from built-in operators by an explicit syntax are the best compromise but there are an awful lot of operators in Icon already. Procedure names can be very descriptive. (hint) Jerry Nowlin (...!att!iwtqg!nowlin) From kwalker Tue Mar 27 09:45:16 1990 Date: Tue, 27 Mar 90 09:45:16 MST From: "Kenneth Walker" Message-Id: <9003271645.AA17945@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA17945; Tue, 27 Mar 90 09:45:16 MST In-Reply-To: To: icon-group Subject: RE: Icon Ideas? 
(operators) Status: O > Date: Mon, 26 Mar 90 23:15 CST > From: nowlin@iwtqg.att.COM > > You can add two strings of digits in Icon with the "+" operator but you get > a numeric result, not a longer string of digits. You can also concatenate > two numbers into a string of digits with the "||" operator. To allow > overloaded operators would violate this scheme. How would Icon know > whether to do automatic type conversion or try for another version of the > operator that was a better fit to the given operands? Someone with > experience in the implementation could shed more light on this. "+" does give a numeric result, but numeric is either integer or real. The decision about whether to do integer arithmetic or real arithmetic is made at run-time. If you could replace "+" with your own operation and somehow invoke the old "+" within your operation, you could get the effect of overloading. Assuming the function old_op() gets you the built-in version, the following operation would enhance "+" to do pairwise addition of lists.
operator +(a,b)
   if type(a) == type(b) == "list" then {
      r := []
      every i := 1 to *a do
         put(r, old_op(a[i], b[i]))
      return r
      }
   else
      return old_op(a,b)
end
I don't necessarily think being able to arbitrarily redefine operators is a good idea. It leaves you with too few features in the language whose meaning you can "trust" while reading a program. The idea of being able to add new operators does not have this problem. However, no one has brought up the problems of precedence and associativity. Does a $- b - c mean (a $- b) - c or a $- (b - c)? You need something in your definition of an operator to deal with this. Currently, the organization of icont does not allow you to add new operators. With the tools we use to make icont, it is possible to organize a translator so that adding new operators can be done, but you must decide on a fixed set of precedences.
If you decide "+" is at precedence 12 and "*" is at precedence 13, you will not be able to add operators with intermediate precedences. If you fix them at 12 and 15, you are limited to 2 levels of precedence between them. To get around these problems (which are not particularly serious), you would need a different kind of parser within icont. Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721 +1 602 621 2858 kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker From tenaglia@fps.mcw.edu Tue Mar 27 10:22:43 1990 Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA20477; Tue, 27 Mar 90 10:22:43 MST Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA23876; Tue, 27 Mar 90 11:18:50 EST Received: by uwm.edu; id AA23916; Tue, 27 Mar 90 09:22:38 -0600 Message-Id: <9003271522.AA23916@uwm.edu> Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Tue, 27 Mar 90 09:13:57 CDT Received: by mis.mcw.edu (DECUS UUCP w/Smail); Tue, 27 Mar 90 08:46:42 CDT Date: Tue, 27 Mar 90 08:46:42 CDT From: Chris Tenaglia - 257-8765 To: icon-group@cs.arizona.edu Subject: VMS Icon 7.0 process cleanup Status: O I'm running Icon 7.0 on a VAX with VMS 5.2. I've noticed a little problem. With files this code fragment works as expected :
(inf := open(file)) | stop("Can't open ",file)
while line := read(inf) do
   if find(target,line) then break
close(inf)
return line
But with processes something goes wrong
(inf := open(cmd,"pr")) | stop("Can't run ",cmd)
while line := read(inf) do
   if find(target,line) then break
close(inf)
return line
The close(inf) doesn't work here. The process stays open. Shouldn't it just be killed? Eventually one's process quota is reached and the open fails if the fragment is in a loop. Is this fixed in Icon 8? I think the unix versions work properly (do they?).
(inf := open(cmd,"pr")) | stop("Can't run ",cmd)
while line := read(inf) do
   if find(target,line) then temp := line
close(inf)
return temp
is my current work-around. But if it generates thousands of lines of output, and I'm only interested in the first 10, it's rather wasteful. Thanx, Chris Tenaglia (System Manager) Medical College of Wisconsin 8701 W. Watertown Plank Rd. Milwaukee, WI 53226 (414)257-8765 tenaglia@mis.mcw.edu From gudeman Tue Mar 27 12:19:21 1990 Date: Tue, 27 Mar 90 12:19:21 MST From: "David Gudeman" Message-Id: <9003271919.AA00259@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA00259; Tue, 27 Mar 90 12:19:21 MST To: icon-group In-Reply-To: "Kenneth Walker"'s message of Tue, 27 Mar 90 09:45:16 MST <9003271645.AA17945@megaron.cs.arizona.edu> Subject: Icon Ideas? (operators) Status: O Date: Tue, 27 Mar 90 09:45:16 MST From: "Kenneth Walker" ]I don't necessarily think being able to arbitrarily redefine operators ]is a good idea. It leaves you with too few features in the language ]whose meaning you can "trust" while reading a program. Ken is doing research that involves optimization of Icon programs by doing static analysis of the programs. Obviously this gets harder as more things become dependent on run-time conditions. Perhaps this is slightly affecting his opinion? ;-) There is an important advantage to overloading operators, though. Suppose you write a calculator program. Of course, inside this program you use mathematical operators. Now suppose you decide to upgrade the program to use complex numbers. You can't just define a new type and write functions to operate on complex numbers, you have to go through the entire program and replace every arithmetic expression such as ``a + b'' with ``add(a,b)'' (if it is in a position to take complex values for ``a'' and/or ``b''). Ick.
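The rewrite described above can be sketched in Icon as it stands today. This is only an illustration under stated assumptions: the record type complex and the procedures add() and tocomplex() are hypothetical names invented for this sketch, not part of Icon or of any posted program. It shows what every ``a + b'' that might see a complex value would have to turn into.

```icon
# Hypothetical record type for complex numbers (not built into Icon).
record complex(re, im)

# Hypothetical helper: promote a plain number to a complex value.
procedure tocomplex(x)
   return if type(x) == "complex" then x else complex(x, 0)
end

# Hypothetical replacement for "a + b": falls back to the built-in "+"
# when neither operand is complex.
procedure add(a, b)
   if type(a) == "complex" | type(b) == "complex" then {
      a := tocomplex(a)
      b := tocomplex(b)
      return complex(a.re + b.re, a.im + b.im)
      }
   return a + b
end

procedure main()
   z := add(complex(1, 2), 3)      # mixed complex/integer operands
   write(z.re, " + ", z.im, "i")
   write(add(2, 5))                # ordinary numbers still use "+"
end
```

The cost is visible in main(): the caller has to know to write add(...) instead of the operator, which is exactly the editing chore the paragraph complains about.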
If you could overload Icon's built-in operators, all you would have to
do is overload the arithmetic operators so that they understood
complex numbers. I can think of similar examples for non-numeric
applications.

Someone objected that this lets you do strange things like define + to
do subtraction. True enough. But honestly, if a programmer is that
determined to write an unreadable program, he can do it just as easily
without operator overloading. There is nothing the language designer
can do about obtuseness of programmers, and it seems pointless to try.

From icon-group-request@arizona.edu Tue Mar 27 20:15:21 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA03298; Tue, 27 Mar 90 20:15:21 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 27 Mar 90 20:17 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20957; Tue, 27 Mar 90 19:03:35 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 27 Mar 90 20:17 MST
Date: 28 Mar 90 02:30:09 GMT
From: shelby!csli!poser@decwrl.dec.COM
Subject: RE: Icon Ideas? (operators)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <12860@csli.Stanford.EDU>
Organization: Center for the Study of Language and Information, Stanford U.
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9003271645.AA17945@megaron.cs.arizona.edu>, <9003271919.AA00259@megaron.cs.arizona.edu>
Status: O

In article <9003271919.AA00259@megaron.cs.arizona.edu> gudeman@CS.ARIZONA.EDU ("David Gudeman") writes:
>
>If you could overload Icon's built-in operators, all you would have to
>do is overload the arithmetic operators so that they understood
>complex numbers.  I can think of similar examples for non-numeric
>applications.
>
>Someone objected that this lets you do strange things like define + to
>do subtraction.

There is an intermediate approach available in object-oriented languages
as well as in languages like ML that provide disjunctive procedure
definitions. Implement operator overloading as ADDITION of methods for
new data types, but don't allow pre-defined methods (i.e. the built-in
operators) to be removed. This guarantees that an operator will have the
expected semantics when applied to built-in data types and reduces the
uncertainty to derived types.

From goer@sophist.uchicago.EDU Wed Mar 28 11:36:57 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA24195; Wed, 28 Mar 90 11:36:57 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 28 Mar 90 11:38 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 28 Mar 90 12:38:40 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA18812; Wed, 28 Mar 90 12:33:15 CST
Resent-Date: Wed, 28 Mar 90 11:38 MST
Date: Wed, 28 Mar 90 12:33:15 CST
From: Richard Goerwitz
Subject: icon & prolog
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9003281833.AA18812@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Just an idle question: Has anyone thought of implementing
Prolog in Icon, either as a Prolog -> Icon translator, or
as a Prolog interpreter written in Icon? I'm not a Prolog
expert, but it occurs to me that Icon might offer facilities
to make such a project much easier than it might be for
most other languages.

Just curious.

   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer

From gudeman Wed Mar 28 13:15:06 1990
Resent-From: "David Gudeman"
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01233; Wed, 28 Mar 90 13:15:06 MST
Received: from megaron.cs.arizona.edu by Arizona.EDU; Wed, 28 Mar 90 13:15 MST
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA01095; Wed, 28 Mar 90 13:12:49 MST
Resent-Date: Wed, 28 Mar 90 13:16 MST
Date: Wed, 28 Mar 90 13:12:49 MST
From: David Gudeman
Subject: icon & prolog
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9003282012.AA01095@megaron.cs.arizona.edu>
In-Reply-To: Richard Goerwitz's message of Wed, 28 Mar 90 12:33:15 CST <9003281833.AA18812@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

   From: Richard Goerwitz

   Just an idle question: Has anyone thought of implementing
   Prolog in Icon, either as a Prolog -> Icon translator, or
   as a Prolog interpreter written in Icon? I'm not a Prolog
   expert, but it occurs to me that Icon might offer facilities
   to make such a project much easier than it might be for
   most other languages.

I wrote an interpreter for a small logic language in Icon, not much
like Prolog, but it did do goal-directed unification with backtracking
like Prolog does. Your intuition is correct that Icon makes this easy,
at least for an interpreter. I was able to use Icon's goal-directed
evaluation to do all the goal-directed evaluation of the logic language,
so I didn't have to keep track of states or anything like that.

I just looked for the code I wrote, and it seems to have disappeared.
Oh well.
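
Since that code is lost, the following is only a rough sketch of the
general idea and not a reconstruction of it. It assumes a tiny
hypothetical fact base (the names fact, facts, parent and grandparent
are made up for illustration) and shows how an Icon generator can stand
in for a Prolog relation, with the backtracking over alternatives left
entirely to goal-directed evaluation:

```icon
# Illustrative only: fact, facts, parent and grandparent are hypothetical.
record fact(a, b)
global facts

# Generates every fact matching (x,y); a null argument acts as a wildcard.
procedure parent(x, y)
   local f
   every f := !facts do
      if (/x | x == f.a) & (/y | y == f.b) then suspend f
end

# Generates the grandchildren of x.  No explicit state is kept: when the
# inner parent() runs out of results, the outer parent() is resumed.
procedure grandparent(x)
   suspend parent(parent(x).b).b
end

procedure main()
   facts := [fact("abe", "homer"), fact("homer", "bart"),
             fact("homer", "lisa")]
   every write(grandparent("abe"))    # writes bart, then lisa
end
```

A real logic-language interpreter also needs unification over terms with
variables, which this sketch leaves out.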

From icon-group-request@arizona.edu Thu Mar 29 04:47:01 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01727; Thu, 29 Mar 90 04:47:01 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 04:46 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20318; Thu, 29 Mar 90 03:42:50 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 04:47 MST
Date: 29 Mar 90 10:54:12 GMT
From: zaphod.mps.ohio-state.edu!usc!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: RE: icon & prolog
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <1996@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9003281833.AA18812@sophist.uchicago.edu>
Status: O

In article <9003281833.AA18812@sophist.uchicago.edu>, goer@SOPHIST.UCHICAGO.EDU (Richard Goerwitz) writes:
> Just an idle question: Has anyone thought of implementing
> Prolog in Icon, either as a Prolog -> Icon translator, or
> as a Prolog interpreter written in Icon? I'm not a Prolog
...

I wrote a Prolog interpreter in Icon some time ago. I never got around
to doing anything with it (i.e. publishing wise). I was in the process
of writing the converse (Icon interpreter in Prolog) when I got
sidetracked.

I will post the sources and documentation. There were four versions in
increasing order of complexity. I only got around to documentation for
versions 1 and 2. The versions appear to have implemented the following
incrementally:

1. Basic pure Prolog with negation by failure,
2. List notation added (syntactic sugar),
3. Assert and Retract,
4. Cut.

The program documentation files *.doc are in troff format. They're
still readable however. The user guides *.usr are just plain text.

The source is copyright in the sense that it can be used anywhere for
any purpose provided the copyright is maintained and I get credit for
my work.

I would be interested in any comments about the code. I was trying to
get as succinct a source file as possible without sacrificing clarity
(but it's always tempting to save a line here and there!).

From icon-group-request@arizona.edu Thu Mar 29 05:02:47 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02405; Thu, 29 Mar 90 05:02:47 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:02 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20510; Thu, 29 Mar 90 03:47:58 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:10:30 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: Prolog in Icon (version 1)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <1997@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

########################## global variables and types ######################

record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list  \ types of
record var(name)          # string                   / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query

################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input]  # empty dbase; standard input
   while \query | *consult>0 do {       # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry)  # print first or next solution to qry
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do {
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end

########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext)  # generates all solutions of qry[hd:tl]
   local sub,q                      # in given context, returns updated context
   if hd>tl then return ctext
   if (q:=qry[hd]).name=="~" then   # negation by failure
      {if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)}
   else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do
      suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
end

procedure tryclause(term,cls,sub)  # resolves term using given clause or fails
   local ctext                     # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end

procedure scanpred(prd,ctext)  # converts predicate to structure
   local args; args:=[]
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args)
end

######################## primitive domain operations ########################

procedure unify(t1,t2,sub)  # (integer | struct),(integer | struct),sub
   local v,i,num            # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v  # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do
         if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)      # forms a new context by extending sub
   local env; env:=table(&null) # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end

procedure trmstr(str,sub)  # converts a term to a string suitable for output
   local s; s:=""
   case type(str) of {
      "integer" : return trmstr(sub[str],sub)
      "struct"  : {every s:=s||trmstr(!str.args,sub)||","
                   return str.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null"    : return "undefined"}
end

############################## Prolog parser ###############################

procedure prog()  # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause()
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause()  # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then  # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\p|!b)  # list of variable identifiers
   if (\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end

procedure body()  # list of predicates
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure pred()  # ~pred | name(body) | uc_name | lc_name()
   local name,args; args:=[]
   if ="~" then return fun("~",[pred()])
   if not (name:=tab(many(&ucase++&lcase++'0123456789._'))) then fail
   if any(&ucase,name) then return var(name)
   if ="(" & args:=body() then  # arguments parsed
      if not =")" then write("syntax error: \")\" expected before ",tab(0))
   return fun(name,args)
end

procedure extractids(ids,pred)
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return
end

From icon-group-request@arizona.edu Thu Mar 29 05:02:53 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02430; Thu, 29 Mar 90 05:02:53 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20775; Thu, 29 Mar 90 03:53:51 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:20:55 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject:
 prolog2.usr
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <2000@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Prolog in Icon (version 2)
--------------------------

Alan Finlay, Monash University.

The Prolog interpreter associated with this file is written in Icon for
a very simple version of pure Prolog. Cuts, arithmetic, assert and
retract are not implemented. There is only one non-logical primitive,
"consult", which is used as if it were a fact with one argument. The
argument to consult is used as a file name and the file is consulted for
clauses and queries. Negation is implemented as negation by failure.
Version two includes list notation.

Acceptable Prolog programs consist of a list of clauses and queries, one
per line. The clauses are either facts or rules:

    predicate.
    predicate:-predicate,...,predicate.

A fact is a predicate followed by a full stop. The syntax of a predicate
will be described later. A rule has a "head" on the left of the
turnstile ":-", and a "body" on the right. The body is a list of one or
more predicates separated by commas and terminated by a full stop. The
predicates are simple identifiers, identifiers parameterised by one or
more arguments, or negated predicates:

    identifier
    identifier(argument,...,argument)
    ~predicate

An identifier is a string of letters, digits, underline or full stop.
An identifier which starts with an upper case letter and has no
arguments associated with it is interpreted as a variable identifier.
The arguments are syntactically identical to predicates but interpreted
differently. Those arguments with arguments of their own are called
structures, those without are called constants if not starting with an
upper case letter and variables if they do.

A query is like a rule without a head:

    :-predicate,...,predicate.
    ?-predicate,...,predicate.

The first form causes the interpreter to find all solutions to the
query. The second form only asks for one solution. The interpreter
reports values assigned to free variables in the query and this is the
way answers to questions more complex than yes/no are obtained.

As a simple example here is a traditional inference test program

    mortal(X):-man(X).
    man(X):-greek(X).
    greek(socrates).
    ?-mortal(socrates).

Try typing in this example after starting the interpreter with the
command

    iconx prolog

and the response is

    yes

After this the interpreter will be waiting for another clause or query
to be entered. Try entering another query, for example

    ?-mortal(Socrates).

which produces the strange response

    yes, Socrates = socrates

This is because the upper case S indicates a free variable. To
experiment with negation try

    being(X):-man(X).
    being(X):-god(X).
    god(apollo).
    :-being(X),~mortal(X).

which produces the response

    yes, X = apollo
    no more solutions

Note that negation should only be applied to ground terms (terms with
all the variables bound) or strange behaviour may result. For example,
the query

    :-~mortal(X),being(X).

fails with no solutions, because ~mortal(X) is tried while X is still
unbound.

Finally, send an end of file to finish interpreting Prolog commands.
More examples can be found in files "test1.plg", "test2.plg", . . .
To run the first of these enter the fact

    consult(test?.plg).

etc, after starting the interpreter.

The list notation is simply a set of convenient abbreviations. Lists
are assumed to be represented by using the binary dot constructor "."
and "nil". The dot constructor should only be applied to (element,list)
pairs and "nil" represents an empty list. An infix version of the dot
constructor "|" is also provided and is useful for pattern matching.
Some examples follow.

    abbreviation      represents

    []                nil
    [1]               .(1,nil)
    1|nil             .(1,nil)
    [1,2]             .(1,.(2,nil))
    [1,2,3,4]         .(1,.(2,.(3,.(4,nil))))
    1|2|3|4|nil       .(1,.(2,.(3,.(4,nil))))
    [[1,2],[3,4]]     .(.(1,.(2,nil)),.(.(3,.(4,nil)),nil))
    1|2               .(1,2)

The last example is not a proper list since the second argument of the
dot constructor is not a list. The list [A,B,C,D] must have exactly four
elements whereas the list A|B|C|D has three or more depending upon the
length of the list bound to D. The following two versions of naive
reverse are exactly equivalent.

    reverse([],[]).
    reverse(X|Y,Z):-reverse(Y,W),append(W,[X],Z).
    append(X,[],X).
    append([],X,X).
    append(X|Y,Z,X|W):-append(Y,Z,W).

    reverse(nil,nil).
    reverse(.(X,Y),Z):-reverse(Y,W),append(W,.(X,nil),Z).
    append(X,nil,X).
    append(nil,X,X).
    append(.(X,Y),Z,.(X,W)):-append(Y,Z,W).

From icon-group-request@arizona.edu Thu Mar 29 05:02:57 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02452; Thu, 29 Mar 90 05:02:57 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20539; Thu, 29 Mar 90 03:48:27 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:14:23 GMT
From: cs.utexas.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: Prolog in Icon (version 2)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <1998@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

# Prolog in Icon, version 2, (C) Alan Finlay, Monash University.
########################## global variables and types ######################

record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list  \ types of
record var(name)          # string                   / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query

################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input]  # empty dbase; standard input
   while \query | *consult>0 do {       # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry)  # print first or next solution to qry
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do {
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end

########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext)  # generates all solutions of qry[hd:tl]
   local sub,q                      # in given context, returns updated context
   if hd>tl then return ctext
   if (q:=qry[hd]).name=="~" then   # negation by failure
      {if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)}
   else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do
      suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
end

procedure tryclause(term,cls,sub)  # resolves term using given clause or fails
   local ctext                     # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end

procedure scanpred(prd,ctext)  # converts predicate to structure
   local args; args:=[]
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args)
end

######################## primitive domain operations ########################

procedure unify(t1,t2,sub)  # (integer | struct),(integer | struct),sub
   local v,i,num            # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v  # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do
         if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)      # forms a new context by extending sub
   local env; env:=table(&null) # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end

procedure trmstr(trm,sub)  # converts a term to a string suitable for output
   local s; s:=""
   case type(trm) of {
      "integer" : return trmstr(sub[trm],sub)
      "struct"  : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list
                  else {every s:=s||trmstr(!trm.args,sub)||","
                        return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null"    : return "undefined"}
end

procedure lstr(l,sub)  # succeeds if l is a proper non-empty list and
   local hd,tl         # converts l to string suitable for output
   if l.name=="." & *l.args=2 then {
      hd:=trmstr(l.args[1],sub); tl:=l.args[2]
      while type(tl)=="integer" do tl:=sub[tl]  # apply sub to tl
      case type(tl) of {
         "struct" : {if tl.name=="nil" & *tl.args=0 then return hd  # nil
                     return hd||","||lstr(tl,sub)}                  # cons
         "null"   : return "undefined"}}
end

############################## Prolog parser ###############################

procedure prog()  # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause()
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause()  # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then  # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\p|!b)  # list of variable identifiers
   if (\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end

procedure body()  # list of predicates (may be empty)
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure dots()  # converts non-empty body of list to cons cells
   local p
   if p:=pred() then
      if ="," then return fun(".",[p,dots()])
      else return fun(".",[p,fun("nil",[])])
end

procedure pred()  # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred
   local name,args,d,p,pp; args:=[]
   if ="~" then p:=fun("~",[pred()])
   else if name:=tab(many(&ucase++&lcase++'0123456789._')) then {
      if any(&ucase,name) then p:=var(name)
      else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}}
   else if ="[]" then p:=fun("nil",[])        # empty list abbreviation
   else if ="[" then {p:=dots(); check("]")}  # non-empty list abbreviation
   if ="|" then
      if pp:=pred() then return fun(".",[p,pp])  # infix cons
      else write("syntax error: missing second argument to \"|\"")
   return \p  # n.b. fails if predicate invalid
end

procedure check(s)  # report error if s not present or skip over it
   if not =s then write("syntax error: ",s," expected before ",tab(0))
end

procedure extractids(ids,pred)  # build the set of variable identifiers
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return  # the identifiers have been appended to reference parameter ids
end

From icon-group-request@arizona.edu Thu Mar 29 05:03:05 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02469; Thu, 29 Mar 90 05:03:05 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20789; Thu, 29 Mar 90 03:54:04 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:22:14 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: prolog3.icn
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <2001@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

# Prolog in Icon, version 3, (C) Alan Finlay, Monash University.
########################## global variables and types ######################

record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list  \ types of
record var(name)          # string                   / predicates
global dbase              # table of clauses indexed by head name
global consult            # stack of files being consulted
global query              # top level query

################################## driver ##################################

procedure main()
   dbase:=table([]); consult:=[&input]  # empty dbase; standard input
   while \query | *consult>0 do {       # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end

procedure printsoln(qry)  # print first or next solution to qry
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do {
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end

########################### Prolog interpreter #############################

procedure resolve(qry,hd,tl,ctext)  # generates all solutions of qry[hd:tl]
   local sub,q,cls,r                # and returns updated context
   if hd>tl then return ctext
   case (q:=qry[hd]).name of {
      "assert"  : {r:=rule([],q.args[1],q.args[2:0])
                   every extractids(r.ids,!q.args)
                   dbase[q.args[1].name]:=dbase[q.args[1].name]|||[r]
                   suspend resolve(qry,hd+1,tl,ctext)}  # always succeeds
      "retract" : suspend retract(q.args[1],ctext) & resolve(qry,hd+1,tl,ctext)
      "~"       : {if not resolve(q.args,1,1,ctext) then
                      suspend resolve(qry,hd+1,tl,ctext)}  # negation by failure
      default   : {goal:=scanpred(q,ctext)
                   every cls := !dbase[q.name] do
                      every sub:=tryclause(goal,cls,ctext.subst) do
                         suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))}
      }
end

procedure retract(pred,ctext)  # removes a clause matching pred from dbase
   local cand,goal,entry,i; i:=1  # fails when no more matching clauses to remove
   goal:=scanpred(pred,ctext)
   every entry:=!dbase[goal.name] do {  # check for matching clause
      cand:=scanpred(entry.head,newctxt(entry.ids,ctext.subst))
      if unify(goal,cand,copy(ctext.subst)) then {  # found one so remove it
         dbase[goal.name]:=extract(dbase[goal.name],i)
         suspend}  # on backtracking more retractions can occur
      else i+:=1   # i keeps track of the entry number even with extractions
      }
end

procedure tryclause(term,cls,sub)  # resolves term using given clause or fails
   local ctext                     # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end

procedure scanpred(prd,ctext)  # converts predicate to structure
   local args; args:=[]
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args)
end

######################## primitive domain operations ########################

procedure unify(t1,t2,sub)  # (integer | struct),(integer | struct),sub
   local v,i,num            # side effect: sub is updated
   if type(t1)=="integer" then {
      while type(v:=sub[t1])=="integer" do t1:=v  # apply sub to t1
      return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
   if type(t2)=="integer" then return unify(t2,t1,sub)
   if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
      every i:=1 to num do
         if not unify(t1.args[i],t2.args[i],sub) then fail
      return}
end

procedure newctxt(ids,sub)      # forms a new context by extending sub
   local env; env:=table(&null) # to accommodate the unbound identifiers
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end

procedure trmstr(trm,sub)  # converts a term to a string suitable for output
   local s; s:=""
   case type(trm) of {
      "integer" : return trmstr(sub[trm],sub)
      "struct"  : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list
                  else {every s:=s||trmstr(!trm.args,sub)||","
                        return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")}
      "null"    : return "undefined"}
end

procedure lstr(l,sub)  # succeeds if l is a proper non-empty list and
   local hd,tl         # converts l to string suitable for output
   if l.name=="." & *l.args=2 then {
      hd:=trmstr(l.args[1],sub); tl:=l.args[2]
      while type(tl)=="integer" do tl:=sub[tl]  # apply sub to tl
      case type(tl) of {
         "struct" : {if tl.name=="nil" & *tl.args=0 then return hd  # nil
                     return hd||","||lstr(tl,sub)}                  # cons
         "null"   : return "undefined"}}
end

procedure extract(list,el)  # extract list element in position [el:el+1]
   return list:=list[1:el]|||list[el+1:0]
end

############################## Prolog parser ###############################

procedure prog()  # parses consult[1] until query found or end of file
   query:=&null
   while write(read(consult[1])) ? clause()
   if /query & consult[1]~===&input then close(consult[1])
end

procedure clause()  # adds a clause to the dbase or fails when query set
   local p,b,ids,t; b:=[]; ids:=[]
   if =":-" then query:=all(ids,b:=body())
   else if ="?-" then query:=one(ids,b:=body())
   else {p:=pred(); if =":-" then b:=body()}
   if (t:=trim(tab(0)))~=="." then  # syntax error
      return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
   every extractids(ids,\p|!b)  # list of variable identifiers
   if (\p).name=="consult" then every push(consult,open((!p.args).name))
   return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end

procedure body()  # list of predicates (may be empty)
   local b; b:=[]
   if put(b,pred()) then while ="," & put(b,pred())
   return b
end

procedure dots()  # converts non-empty body of list to cons cells
   local p
   if p:=pred() then
      if ="," then return fun(".",[p,dots()])
      else return fun(".",[p,fun("nil",[])])
end

procedure pred()  # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred
   local name,args,d,p,pp; args:=[]
   if ="~" then p:=fun("~",[pred()])
   else if name:=tab(many(&ucase++&lcase++'0123456789._')) then {
      if any(&ucase,name) then p:=var(name)
      else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}}
   else if ="[]" then p:=fun("nil",[])        # empty list abbreviation
   else if ="[" then {p:=dots(); check("]")}  # non-empty list abbreviation
   if ="|" then
      if pp:=pred() then return fun(".",[p,pp])  # infix cons
      else write("syntax error: missing second argument to \"|\"")
   return \p  # n.b. fails if predicate invalid
end

procedure check(s)  # report error if s not present or skip over it
   if not =s then write("syntax error: ",s," expected before ",tab(0))
end

procedure extractids(ids,pred)  # build the set of variable identifiers
   if type(pred)=="fun" then every extractids(ids,!pred.args)
   else if not (pred.name==!ids) then put(ids,pred.name)
   return  # the identifiers have been appended to reference parameter ids
end

From icon-group-request@arizona.edu Thu Mar 29 05:03:09 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02479; Thu, 29 Mar 90 05:03:09 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:03 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20977; Thu, 29 Mar 90 03:58:01 -0800
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 29 Mar 90 05:04 MST
Date: 29 Mar 90 11:23:34 GMT
From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU
Subject: prolog4.icn
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <2002@bruce.OZ>
Organization: Monash Uni. Computer Science, Australia
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

# Prolog in Icon, version 4, (C) Alan Finlay, Monash University.
########################## global variables and types ###################### record ctxt(env,subst) # integer[string] * ((integer | struct | null) list) record struct(name,args) # string * ((integer | struct) list) record rule(ids,head,body)# string list * predicate * predicate list \ record all(ids,body) # string list * predicate list } clauses record one(ids,body) # string list * predicate list / record fun(name,args) # string * predicate list \ types of record var(name) # string / predicates global dbase # table of clauses indexed by head name global consult # stack of files being consulted global query # top level query ################################## driver ################################## procedure main() dbase:=table([]); consult:=[&input] # empty dbase; standard input while \query | *consult>0 do { # more queries possible prog() # parse clauses, possibly setting query as a side effect if \query then case type(query) of { "all" : {every printsoln(query); write("no more solutions")} "one" : if not printsoln(query) then write("no")} else pop(consult)} end procedure printsoln(qry) # print first or next solution to qry local ans,v every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do { if (type(ans)=="string") & (ans=="cut") then fail # cut query writes("yes") every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst)) suspend write()} end ########################### Prolog interpreter ############################# procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl] local sub,q,cls,r # and returns updated context if hd>tl then return ctext # terminate linear recursion case (q:=qry[hd]).name of { "assert" : {r:=rule([],q.args[1],q.args[2:0]) every extractids(r.ids,!q.args) dbase[q.args[1].name]:=dbase[q.args[1].name]|||[r] suspend resolve(qry,hd+1,tl,ctext)} # always succeeds "retract" : suspend retract(q.args[1],ctext) & resolve(qry,hd+1,tl,ctext) "~" : {if not (sub:=resolve(q.args,1,1,ctext).subst) | 
(type(sub)=="string" & sub=="cut") then suspend resolve(qry,hd+1,tl,ctext)} # negation by failure "!" : {suspend resolve(qry,hd+1,tl,ctext) return "cut"} # causes failure of parent clause default : {goal:=scanpred(q,ctext) every cls := !dbase[q.name] do every sub:=tryclause(goal,cls,ctext.subst) do { if type(sub)=="string" & sub=="cut" then fail else suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))} } } end procedure retract(pred,ctext) # removes a clause matching pred from dbase local cand,goal,entry,i; i:=1 # fails when no more matching clauses to remove goal:=scanpred(pred,ctext) every entry:=!dbase[goal.name] do { # check for matching clause cand:=scanpred(entry.head,newctxt(entry.ids,ctext.subst)) if unify(goal,cand,copy(ctext.subst)) then { # found one so remove it dbase[goal.name]:=extract(dbase[goal.name],i) suspend} # on backtracking more retractions can occur else i+:=1 # i keeps track of the entry number even with extractions } # n.b. this is a primitive retract since only the head is matched end procedure tryclause(term,cls,sub) # resolves term using given clause or fails local ctext,res # a copy of sub is used so no side effects ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause if unify(term,scanpred(cls.head,ctext),ctext.subst) then every res:=resolve(cls.body,1,*cls.body,ctext) do { if (type(res)=="string") & (res=="cut") then suspend "cut" else suspend res.subst} end ######################## primitive domain operations ######################## procedure scanpred(prd,ctext) # converts predicate to structure local args; args:=[] if type(prd)=="var" then return ctext.env[prd.name] every put(args,scanpred(!prd.args,ctext)) return struct(prd.name,args) end procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub local v,i,num # side effect: sub is updated if type(t1)=="integer" then { while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1 return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2} if 
type(t2)=="integer" then return unify(t2,t1,sub) if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then { every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail return} end procedure newctxt(ids,sub) # forms a new context by extending sub local env; env:=table(&null) # to accommodate the unbound identifiers every env[!ids]:=*put(sub,&null) return ctxt(env,sub) end procedure trmstr(trm,sub) # converts a term to a string suitable for output local s; s:="" case type(trm) of { "integer" : return trmstr(sub[trm],sub) "struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list else {every s:=s||trmstr(!trm.args,sub)||"," return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")} "null" : return "undefined"} end procedure lstr(l,sub) # succeeds if l is a proper non-empty list and local hd,tl # converts l to string suitable for output if l.name=="." & *l.args=2 then { hd:=trmstr(l.args[1],sub); tl:=l.args[2] while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl case type(tl) of { "struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil return hd||","||lstr(tl,sub)} # cons "null" : return "undefined"}} end procedure extract(list,el) # extract list element in position [el:el+1] return list:=list[1:el]|||list[el+1:0] end ############################## Prolog parser ############################### procedure prog() # parses consult[1] until query found or end of file query:=&null while write(read(consult[1])) ? clause() if /query & consult[1]~===&input then close(consult[1]) end procedure clause() # adds a clause to the dbase or fails when query set local p,b,ids,t; b:=[]; ids:=[] if =":-" then query:=all(ids,b:=body()) else if ="?-" then query:=one(ids,b:=body()) else {p:=pred(); if =":-" then b:=body()} if (t:=trim(tab(0)))~=="." then # syntax error return write("syntax error: ",t,if *t=0 then "." 
else " not"," expected") every extractids(ids,\p|!b) # list of variable identifiers if (\p).name=="consult" then every push(consult,open((!p.args).name)) return dbase[(\p).name]:=dbase[p.name]|||[rule(ids,p,b)] end procedure body() # list of predicates (may be empty) local b; b:=[] if put(b,pred()) then while ="," & put(b,pred()) return b end procedure dots() # converts non-empty body of list to cons cells local p if p:=pred() then if ="," then return fun(".",[p,dots()]) else return fun (".",[p,fun("nil",[])]) end procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred local name,args,d,p,pp; args:=[] if ="~" then p:=fun("~",[pred()]) else if ="!" then p:=fun("!",[]) else if name:=tab(many(&ucase++&lcase++'0123456789._')) then { if any(&ucase,name) then p:=var(name) else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}} else if ="[]" then p:=fun("nil",[]) # empty list abbreviation else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons else write("syntax error: missing second argument to \"|\"") return \p # n.b. 
fails if predicate invalid end procedure check(s) # report error if s not present or skip over it if not =s then write("syntax error: ",s," expected before ",tab(0)) end procedure extractids(ids,pred) # build the set of variable identifiers if type(pred)=="fun" then every extractids(ids,!pred.args) else if not (pred.name==!ids) then put(ids,pred.name) return # the identifiers have been appended to reference parameter ids end From icon-group-request@arizona.edu Thu Mar 29 05:03:45 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02493; Thu, 29 Mar 90 05:03:45 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 29 Mar 90 05:04 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA20764; Thu, 29 Mar 90 03:53:34 -0800 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Thu, 29 Mar 90 05:05 MST Date: 29 Mar 90 11:17:51 GMT From: zaphod.mps.ohio-state.edu!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU Subject: prolog2.doc Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <1999@bruce.OZ> Organization: Monash Uni. Computer Science, Australia X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O .ce .uh "Prolog in Icon version 2 : Program documentation" .ce Alan Finlay, Monash University, March 1990. This version of Prolog is loosely based upon the various denotational semantics for Prolog with which I have been acquainted\*[*\*] .(f * In particular: T. Nicholson and N. Foo. "A Denotational Semantics for Prolog", to appear in ACM TOPLAS. .)f and discussion with colleagues at Monash University. 
The motivation was to gain experience with Icon and to see if there was any
truth to the claim "It should only take about half a page of Icon to implement
a Prolog interpreter."
.sh1 "Data structures"
.ftCW
########################## global variables and types ######################
.nf
record ctxt(env,subst)    # integer[string] * ((integer | struct | null) list)
record struct(name,args)  # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \\
record all(ids,body)      # string list * predicate list              } clauses
record one(ids,body)      # string list * predicate list             /
record fun(name,args)     # string * predicate list \\ types of
record var(name)          # string                  /  predicates
global dbase   # table of clauses indexed by head name
global consult # stack of files being consulted
global query   # top level query
.ft
.fi
Despite the lack of enforced type discipline in Icon I have used some self
discipline. The record and global declarations above indicate types using "|"
for disjoint sum and shared field names to get a degree of type inheritance.
Type "ctxt" is a context for goal resolution and consists of an environment
and a substitution. The environment is a table which maps variable identifiers
to variables. A variable is represented by an integer which is its position in
the substitution list. A substitution then effectively maps variables to terms
or unbound. The null value is used to represent an unbound variable. A term is
either another variable or a structure. Type "struct" is used to represent
structures which are either functors or constants. Functors have a name and a
list of arguments which are terms. Constants are represented as functors with
zero arguments. The types for clauses and predicates are used to represent the
syntax of Prolog. I will use the words predicate and functor more or less
interchangeably since the interpreter doesn't distinguish between them
structurally.
A consequence of this is that a variable may be used as a goal.

The use of separate domains for syntactic predicates and their run time
equivalents (called structures above) is required because the syntax is
independent of any context, whereas the "terms" referred to above are only
meaningful when considered together with an appropriate context. At various
stages in a computation a given clause or predicate will have to be used in
different contexts.

There are a few idiosyncrasies of the syntactic domains worth mentioning.
Clauses have the redundant field "ids", which is a list of all the variable
identifiers used in the clause. This saves a lot of recalculation during
execution. The types "all" and "one" are queries which request all solutions
or one solution respectively. Syntactically this is signaled by using the
turnstile ":-" for all solutions and "?-" for one. As before, type "fun" can
be used with no arguments to indicate constants. Facts are represented as
rules with empty bodies.

The database is global and consists of a table of clause lists indexed by the
name of the predicate at the head. The index is used for efficiency reasons
only. The global "consult" is a stack of files being consulted, used for
nesting of consult commands. The top level query is global only to simplify
the parser and could have been passed to the driver as a parameter via
procedure "prog".
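As a rough illustration of the representation described in this section, the
following sketch builds a context by hand for two identifiers X and Y. It
assumes only the "ctxt" and "struct" records declared above; the particular
bindings are made up for the example and do not come from the interpreter.

```icon
record ctxt(env,subst)     # same shape as the declarations above
record struct(name,args)

procedure main()
   local c
   # A context for two identifiers; both variables start unbound (&null).
   c := ctxt(table(), [&null, &null])
   c.env["X"] := 1          # X is variable number 1
   c.env["Y"] := 2          # Y is variable number 2
   # Bind X to the constant a, i.e. a functor with zero arguments ...
   c.subst[c.env["X"]] := struct("a", [])
   # ... and bind Y to X itself: a variable bound to another variable.
   c.subst[c.env["Y"]] := c.env["X"]
   write("X is bound to constant ", c.subst[1].name)
   write("Y is bound to variable ", c.subst[2])
end
```

The point of the sketch is that the environment never holds terms directly:
it holds integers, and everything that changes during resolution lives in the
substitution list.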
.ne7
.sh1 "The driver"
.ftCW
################################## driver ##################################
procedure main()
   dbase:=table([]); consult:=[&input] # empty dbase; standard input
   while \\query | *consult>0 do {     # more queries possible
      prog() # parse clauses, possibly setting query as a side effect
      if \\query then case type(query) of {
         "all" : {every printsoln(query); write("no more solutions")}
         "one" : if not printsoln(query) then write("no")}
      else pop(consult)}
end
procedure printsoln(qry) # print first or next solution to qry
   local ans,v
   every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do {
      writes("yes")
      every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
      suspend write()}
end
.ft
Most of procedure "main" is concerned with providing a usable interactive
interface and the details are of little interest. The parser is called until a
query is encountered and "printsoln" is used to resolve it and display the
answer substitution. The separate procedure "printsoln" is required because it
can generate all solutions and may be activated once only or resumed until it
fails depending on the type of query.

The procedure "printsoln" simply uses procedure "resolve" to generate solution
contexts from an initial context determined by the query. This initial context
has an environment with all and only the free variables in the query and a
substitution in which they are unbound. This context is created by calling
procedure "newctxt" which will be described later.
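The difference between activating a generator once and resuming it until it
fails can be seen in isolation in the sketch below. The procedure "solns" is a
hypothetical stand-in for "printsoln" that simply prints three canned answers;
nothing else from the interpreter is assumed.

```icon
procedure solns()          # hypothetical stand-in for printsoln
   # write() is re-evaluated each time the alternation is resumed
   suspend write("yes, X=" || ("a" | "b" | "c"))
end

procedure main()
   write("one solution requested:")
   if not solns() then write("no")   # bounded: only the first result is used
   write("all solutions requested:")
   every solns()                     # resumed until it fails
   write("no more solutions")
end
```

In the first use only the answer for "a" should be printed; in the second all
three are printed before the closing message. This mirrors the two arms of the
case expression in "main".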
.ne7
.sh1 "The interpreter proper"
.ftCW
########################### Prolog interpreter #############################
procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl]
   local sub,q              # in given context, returns updated context
   if hd>tl then return ctext
   if (q:=qry[hd]).name=="~" then # negation by failure
      {if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)}
   else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do
      suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
end
procedure tryclause(term,cls,sub) # resolves term using given clause or fails
   local ctext # a copy of sub is used so no side effects
   ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
   if unify(term,scanpred(cls.head,ctext),ctext.subst) then
      suspend resolve(cls.body,1,*cls.body,ctext).subst
end
procedure scanpred(prd,ctext) # converts predicate to structure
   local args; args:=[]
   if type(prd)=="var" then return ctext.env[prd.name]
   every put(args,scanpred(!prd.args,ctext))
   return struct(prd.name,args)
end
.ft
The procedure "resolve" is a generator which produces all solutions to a
sublist of the supplied goal list "qry". This sublist is from element "hd"
to element
.(f
* Lisp's variety of lists can be implemented in Icon but lacks the useful
operators the built-in lists have. On the other hand the built-in list type
in Icon does not have a non-copying "tail" or "cdr" operation.
.)f
"tl" and is passed in this way to save copying sublists\*[*\*].
The resolution takes place with respect to the supplied context and an updated
context is returned as the answer. Ignoring negation for the moment, "resolve"
proceeds by resolving the first goal in the list and, for each solution, uses
a recursive call to satisfy the rest of the goal list if possible, generating
a result for every case.
The first goal is resolved by resuming "tryclause" for each clause in the
database matching the goal; "tryclause", itself being a generator, can be
resumed several times for each clause. For those not accustomed to Icon's
procedure resumption conventions this is more clearly expressed as
.ftCW
   q:=qry[hd]
   every cls:=!dbase[q.name] do
      every sub:=tryclause(scanpred(q,ctext),cls,ctext.subst) do
         suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
.ft
Notice that the updated substitution returned by "tryclause" is supplied to
"resolve" for the rest of the list, hence the effects of goal resolution
within a clause body are cumulative. Another important point is that the
recursion bottoms out when the sublist of goals is empty, and succeeds in this
case. It is tempting to try to use a goal generator instead of a goal list,
as in
.ftCW
   sub:=ctext.subst
   every q:=!qry do
      every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],sub)
   return ctxt(ctext.env,sub)
.ft
This appears to save an explicit test for the end of the list but suffers from
a fatal flaw. Apart from only being able to generate one solution, this scheme
finds the first solution provided there is one, but otherwise succeeds anyway
with a partial solution.

Negation is handled simply as "negation as failure" by using Icon's "not"
operator, which succeeds if and only if its argument fails. Since the
substitution cannot be updated when a negated goal succeeds, the original
substitution is passed to the remaining goals in this case.

The procedure "tryclause" first creates a context for resolving the supplied
term against the supplied clause. This context is based upon the supplied
substitution and all the free identifiers in the clause. The free identifiers
and a copy of the substitution are used by "newctxt" to create a context which
has an environment for the identifiers as new variables and the substitution
extended with these new variables unbound.
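The effect of "newctxt" on a copied substitution can be checked with a small
stand-alone sketch. The procedure below is copied from the listing in this
document; the starting substitution (one variable already bound to the
constant a) and the identifier names are assumptions made for the example.

```icon
record ctxt(env,subst)
record struct(name,args)

procedure newctxt(ids,sub)   # as in the listing: extends sub in place
   local env; env:=table(&null)
   every env[!ids]:=*put(sub,&null)
   return ctxt(env,sub)
end

procedure main()
   local sub, c
   sub := [struct("a",[])]             # one variable, already bound
   c := newctxt(["X","Y"], copy(sub))  # pass a copy, as tryclause does
   write("caller's substitution has ", *sub, " entry; the copy has ", *c.subst)
   write("X is variable ", c.env["X"], ", Y is variable ", c.env["Y"])
end
```

Because a copy is passed, the caller's list keeps its single entry while the
new context's list grows to three, with X and Y taking positions 2 and 3.
Passing "sub" itself instead of "copy(sub)" would extend the caller's list as
a side effect.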
Denotational semantics for Prolog may perform this task on the fly as a clause is interpreted and this corresponds operationally to a great deal of recomputation. Here the syntax parser generates a list of free variable identifiers (without repetitions) only once and combined with the "newctxt" this avoids extending the context in a piecemeal fashion. The substitution passed to "newctxt" is a copy since "newctxt" and "unify" cause side effects upon their substitution parameter. If these side effects were eliminated it would require two copying operations to be performed on very similar substitutions where only one is required. The unifier returns only success or failure of the attempted unification but in the case of success the supplied substitution is updated as a side effect. The procedure "scanpred" simply converts a predicate from its syntactic form into a term according to some relevant context. There are no side effects. .ne7 .sh1 "Primitive domain operations and the parser" .ftCW ######################## primitive domain operations ######################## procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub local v,i,num # side effect: sub is updated if type(t1)=="integer" then { while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1 return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2} if type(t2)=="integer" then return unify(t2,t1,sub) if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then { every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail return} end procedure newctxt(ids,sub) # forms a new context by extending sub local env; env:=table(&null) # to accommodate the unbound identifiers every env[!ids]:=*put(sub,&null) return ctxt(env,sub) end procedure trmstr(trm,sub) # converts a term to a string suitable for output local s; s:="" case type(trm) of { "integer" : return trmstr(sub[trm],sub) "struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list else {every 
s:=s||trmstr(!trm.args,sub)||"," return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")} "null" : return "undefined"} end procedure lstr(l,sub) # succeeds if l is a proper non-empty list and local hd,tl # converts l to string suitable for output if l.name=="." & *l.args=2 then { hd:=trmstr(l.args[1],sub); tl:=l.args[2] while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl case type(tl) of { "struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil return hd||","||lstr(tl,sub)} # cons "null" : return "undefined"}} end ############################## Prolog parser ############################### procedure prog() # parses consult[1] until query found or end of file query:=&null while write(read(consult[1])) ? clause() if /query & consult[1]~===&input then close(consult[1]) end procedure clause() # adds a clause to the dbase or fails when query set local p,b,ids,t; b:=[]; ids:=[] if =":-" then query:=all(ids,b:=body()) else if ="?-" then query:=one(ids,b:=body()) else {p:=pred(); if =":-" then b:=body()} if (t:=trim(tab(0)))~=="." then # syntax error return write("syntax error: ",t,if *t=0 then "." 
else " not"," expected") every extractids(ids,\\p|!b) # list of variable identifiers if (\\p).name=="consult" then every push(consult,open((!p.args).name)) return dbase[(\\p).name]:=dbase[p.name]|||[rule(ids,p,b)] end procedure body() # list of predicates (may be empty) local b; b:=[] if put(b,pred()) then while ="," & put(b,pred()) return b end procedure dots() # converts non-empty body of list to cons cells local p if p:=pred() then if ="," then return fun(".",[p,dots()]) else return fun (".",[p,fun("nil",[])]) end procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred local name,args,d,p,pp; args:=[] if ="~" then p:=fun("~",[pred()]) else if name:=tab(many(&ucase++&lcase++'0123456789._')) then { if any(&ucase,name) then p:=var(name) else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}} else if ="[]" then p:=fun("nil",[]) # empty list abbreviation else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons else write("syntax error: missing second argument to \\"|\\"") return \\p # n.b. fails if predicate invalid end procedure check(s) # report error if s not present or skip over it if not =s then write("syntax error: ",s," expected before ",tab(0)) end procedure extractids(ids,pred) # build the set of variable identifiers if type(pred)=="fun" then every extractids(ids,!pred.args) else if not (pred.name==!ids) then put(ids,pred.name) return # the identifiers have been appended to reference parameter ids end .ft The rest of the interpreter is supplied for completeness but is of little intrinsic interest. The unifier updates the supplied substitution. It is natural to consider using the backtrackable assignment and hence avoid the need to copy substitutions altogether. A preliminary investigation indicates that this is feasible but space inefficient. 
This may be due to poor implementation of backtrackable assignment in the particular Icon interpreter used\*[*\*]. .(f * &version = "Icon Version 6.0. July 7, 1986." (University of Arizona) .)f The parser uses procedures which return parse trees except that "clause" uses a global variable to return a top level query and fails when it encounters one. This causes termination of string scanning in "prog". The behaviour of the user interface is described in the user manual. From kwalker Thu Mar 29 12:27:32 1990 Date: Thu, 29 Mar 90 12:27:32 MST From: "Kenneth Walker" Message-Id: <9003291927.AA00336@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA00336; Thu, 29 Mar 90 12:27:32 MST In-Reply-To: <9003281833.AA18812@sophist.uchicago.edu> To: icon-group Subject: Re: icon & prolog Status: O > Date: Wed, 28 Mar 90 12:33:15 CST > From: Richard Goerwitz > > Just an idle question: Has anyone thought of implementing > Prolog in Icon, either as a Prolog -> Icon translator, or > as a Prolog interpreter written in Icon? You might want to check out "Logicon: an Integration of Prolog into Icon" by Guy Lapalme and Suzanne Chapleau, Software Practice and Experience, Oct 1986. They implement a Prolog interpreter in Icon which lets you call back and forth between the two languages. Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721 +1 602 621 2858 kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker From ralph Sat Mar 31 05:07:38 1990 Date: Sat, 31 Mar 90 05:07:38 MST From: "Ralph Griswold" Message-Id: <9003311207.AA17346@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA17346; Sat, 31 Mar 90 05:07:38 MST To: icon-group Subject: Version 8 of Icon Status: O Version 8 of Icon is complete and implementations will be available for most computer systems soon. Version 8 has both new features and improvements to the implementation. New language features: Math functions: sin(), cos(), ..., exp(), log(), etc. 
Keyboard functions: getch(), getche(), kbhit(); PC implementations only. key(T) to generate the keys in table T. name(v) and variable(s) to produce string name of variable v and vice versa. p!L to invoke p with arguments in list L. &letters, cset of all letters. Arbitrary-precision integer arithmetic (not supported on all PCs). Serial numbers for structures. An interface for calling C functions from Icon and vice versa. Implementation changes: Smaller structures. Dynamic hashing for sets and tables. Instrumentation of storage management. Implementations of Version 8 will be announced as they become available. Please direct any questions to me, not to icon-group or icon-project. Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From ralph Sat Mar 31 05:50:35 1990 Date: Sat, 31 Mar 90 05:50:35 MST From: "Ralph Griswold" Message-Id: <9003311250.AA18283@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA18283; Sat, 31 Mar 90 05:50:35 MST To: icon-group Subject: Version 8 of Icon for UNIX Status: O Version 8 of Icon for UNIX systems is now available. This implementation can be configured for a wide variety of UNIX systems. Configuration information is provided for 56 different systems, including the Sun Sparcstation, the DecStation, the NeXT, the DG AViiON, and the Cray-2. Configurations for new systems can be added with relative ease. The UNIX distribution includes source code, configuration files, documentation, the Icon program library (new in Version 8), and several auxiliary components of Icon. Version 8 of Icon for UNIX systems can be obtained by anonymous FTP to cs.arizona.edu. After connecting, cd /icon/v8. Get READ.ME there for more information. 
If you do not have FTP access or prefer to obtain a magnetic tape and printed documentation, Version 8 of Icon for UNIX can be ordered from: Icon Project Department of Computer Science Gould-Simpson Building The University of Arizona Tucson, AZ 85721 602 621-2018 (voice) 602 621-4246 (FAX) The price is $30, payable in US dollars with a check written on a bank in the United States. Orders also can be charged to MasterCard or Visa. This price includes shipping by parcel post in the United States, Canada, and Mexico. Add $10 for air mail delivery to other countries. Please direct any questions to me, not to icon-project or icon-group. Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From goer@sophist.uchicago.EDU Sat Mar 31 16:32:35 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA21928; Sat, 31 Mar 90 16:32:35 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Sat, 31 Mar 90 16:33 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Sat, 31 Mar 90 17:33:50 CST Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA24215; Sat, 31 Mar 90 17:28:19 CST Resent-Date: Sat, 31 Mar 90 16:34 MST Date: Sat, 31 Mar 90 17:28:19 CST From: Richard Goerwitz Subject: more help w/ BSD->SysV filename conversion Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9003312328.AA24215@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O Yet another version of the BSD->SysV tar file mapping aid. The chore of renaming files is time-consuming, and any automation that can be introduced into the process is certainly of use to me. I hope that this succession of mapping programs I'm posting will be as useful to others, too. 
This one has been used on more platforms than the one I posted before, and has a few more error checks. It also prevents the user from fooling with nested tar archives, and permits preservation of an arbitrary number of extensions of any length (e.g. .dvi.Z, .pxl, etc.) #------------------------------------------------------------------- # # PROGNAME: mtf (stands for "map tar file") # # PURPOSE: Maps 15+ char. filenames in a tar archive to 14 # chars. Handles both header blocks and the archive itself. # # USAGE: mtf inputfile .extensions # (writes to stdout) # # Inputfile is a tar archive; "extensions" is a sequence of # strings which denote .extensions to be preserved in mapped # filenames. One-char extensions are automatically preserved. # Writes a "mapped" tar archive to the stdout. # # BUGS: Mtf only maps filenames found in the main tar headers. # Because of this, mtf cannot accept nested tar archives. Mtf # also obviously cannot know about conflicts with filenames in # use outside the archive. Check before you extract! # # Richard L. Goerwitz, III # Last modified 3/29/90 # #------------------------------------------------------------------- global filenametbl, chunkset, short_chunkset # see procedure mappiece(s) global extensions # ditto record hblock(name,junk,size,mtime,chksum, linkflag,linkname,therest) # see readtarhdr(s) procedure main(a) usage := "usage: mtf inputfile extensions # output goes to stdout\n" || " useful extensions include .tar.Z .pxl .cpi, etc." 0 < *a | stop(usage) intext := open(a[1],"r") | stop("mtf: can't open ",a[1]) a[1][-2:0] == ".Z" & stop("mtf: sorry, can't accept compressed files") extensions := a[2:0] every i := 1 to *extensions do extensions[i] ?:= (=".", tab(0)) # Run through all the headers in the input file, filling # (global) filenametbl with the names of overlong files; # make_table_of_filenames fails if there are no such files. make_table_of_filenames(intext) | { write(&errout,"mtf: no overlong path names to map") a[1] ? 
            (tab(find(".tar")+4), pos(0)) |
                write(&errout,"(Is ",a[1]," even a tar archive?)")
        exit(1)
    }

    # Now that a table of overlong filenames exists, go back
    # through the text, remapping all occurrences of these names
    # to new, 14-char values; also, reset header checksums, and
    # reformat text into correctly padded 512-byte blocks.
    # Terminate output with 512 nulls.
    seek(intext,1)
    every writes(output_mapped_headers_and_texts(intext))

    close(intext)
    write_report()   # Record mapped file and dir names for future ref.
    exit(0)

end


procedure make_table_of_filenames(intext)

    local header # chunkset is global

    # search headers for overlong filenames; for now
    # ignore everything else
    while header := readtarhdr(reads(intext,512)) do {
        # tab upto the next header block
        tab_nxt_hdr(intext,trim_str(header.size),1)
        # record overlong filenames in several global tables, sets
        fixpath(trim_str(header.name))
    }
    *\chunkset ~= 0 | fail
    return &null

end


procedure output_mapped_headers_and_texts(intext)

    # Remember that filenametbl, chunkset, and short_chunkset
    # (which are used by various procedures below) are global.
    local header, newtext, full_block, block, lastblock

    # Read in headers, one at a time.
    while header := readtarhdr(reads(intext,512)) do {

        # Replace overlong filenames with shorter ones, according to
        # the conversions specified in the global hash table filenametbl
        # (which were generated by fixpath() on the first pass).
        header.name := left(map_filenams(header.name),100,"\x00")
        header.linkname := left(map_filenams(header.linkname),100,"\x00")

        # Use header.size field to determine the size of the subsequent text.
        # Read in the text as one string.  Map overlong filenames found in it
        # to shorter names as specified in the global hash table filenametbl.
        newtext := map_filenams(tab_nxt_hdr(intext,trim_str(header.size)))

        # Now, find the length of newtext, and insert it into the size field.
        header.size := right(exbase10(*newtext,8) || " ",12," ")

        # Calculate the checksum of the newly retouched header.
        header.chksum := right(exbase10(get_checksum(header),8)||"\x00 ",8," ")

        # Finally, join all the header fields into a new block and write it out
        full_block := ""; every full_block ||:= !header
        suspend left(full_block,512,"\x00")

        # Now we're ready to write out the text, padding the final block
        # out to an even 512 bytes if necessary; the next header must start
        # right at the beginning of a 512-byte block.
        newtext ? {
            while block := move(512) do suspend block
            pos(0) & next
            lastblock := left(tab(0),512,"\x00")
            suspend lastblock
        }
    }
    # Write out a final null-filled block.  Some tar programs will write
    # out 1024 nulls at the end.  Dunno why.
    return repl("\x00",512)

end


procedure trim_str(s)

    # Knock out spaces, nulls from those crazy tar header
    # block fields (some of which end in a space and a null,
    # some just a space, and some just a null [anyone know
    # why?]).
    return s ? {
        (tab(many(' ')) | &null) &
            trim(tab(find("\x00")|0))
    } \ 1

end


procedure tab_nxt_hdr(f,size_str,firstpass)

    # Tab upto the next header block.  Return the bypassed text
    # as a string if not the first pass.

    local hs, next_header_offset

    hs := integer("8r" || size_str)
    next_header_offset := (hs / 512) * 512
    hs % 512 ~= 0 & next_header_offset +:= 512
    if 0 = next_header_offset then return ""
    else {
        # if this is pass no. 1 don't bother returning a value; we're
        # just collecting long filenames;
        if \firstpass then {
            seek(f,where(f)+next_header_offset)
            return
        }
        else {
            return reads(f,next_header_offset)[1:hs+1] |
                stop("mtf: error reading in ",
                     string(next_header_offset)," bytes.")
        }
    }

end


procedure fixpath(s)

    # Fixpath is a misnomer of sorts, since it is used on
    # the first pass only, and merely examines each filename
    # in a path, using the procedure mappiece to record any
    # overlong ones in the global table filenametbl and in
    # the global sets chunkset and short_chunkset; no fixing
    # is actually done here.

    s2 := ""
    s ? {
        while piece := tab(find("/")+1) do
            s2 ||:= mappiece(piece)
        s2 ||:= mappiece(tab(0))
    }
    return s2

end


procedure mappiece(s)

    # Check s (the name of a file or dir as recorded in the tar header
    # being examined) to see if it is over 14 chars long.  If so,
    # generate a unique 14-char version of the name, and store
    # both values in the global hashtable filenametbl.  Also store
    # the original (overlong) file name in chunkset.  Store the
    # first fifteen chars of the original file name in short_chunkset.
    # Sorry about all of the tables and sets.  It actually makes for
    # a reasonably efficient program.  Doing away with both sets,
    # while possible, causes a tenfold drop in execution speed!

    # global filenametbl, chunkset, short_chunkset, extensions
    local j, ending

    initial {
        filenametbl := table()
        chunkset := set()
        short_chunkset := set()
    }

    chunk := trim(s,'/')
    if chunk ? (tab(find(".tar")+4), pos(0)) then {
        write(&errout, "mtf: Sorry, I can't let you do this.\n",
                       " You've nested a tar archive within\n",
                       " another tar archive, which makes it\n",
                       " likely I'll f your filenames ubar.")
        exit(2)
    }
    if *chunk > 14 then {
        i := 0
        if /filenametbl[chunk] then {
            # if we have not seen this file, then...
            repeat {
                # ...find a new unique 14-character name for it;
                # preserve important suffixes like ".Z," ".c," etc.
                # First, check to see if the original filename (chunk)
                # ends in an important extension...
                if chunk ?
                   (tab(find(".")),
                    ending := move(1) || tab(match(!\extensions)|any(&ascii)),
                    pos(0)
                    )
                # ...If so, then leave the extension alone; mess with the
                # middle part of the filename (e.g. file.with.extension.c ->
                # file.with001.c).
                then {
                    j := (15 - *ending - 3)
                    lchunk:= chunk[1:j] || right(string(i+:=1),3,"0") || ending
                }
                # If no important extension is present, then reformat the
                # end of the file (e.g. too.long.file.name -> too.long.fi01).
                else lchunk := chunk[1:13] || right(string(i+:=1),2,"0")
                # If the resulting shorter file name has already been used...
                if lchunk == !filenametbl
                # ...then go back and find another (i.e. increment i & try
                # again); else break from the repeat loop, and...
                then next else break
            }
            # ...record both the old filename (chunk) and its new,
            # mapped name (lchunk) in filenametbl.  Also record the
            # original name in chunkset and its first fifteen chars
            # in short_chunkset.
            filenametbl[chunk] := lchunk
            insert(chunkset,chunk)
            insert(short_chunkset,chunk[1:16])
        }
    }

    # If the filename is overlong, return lchunk (the shortened
    # name), else return the original name (chunk).  If the name,
    # as passed to the current function, contained a trailing /
    # (i.e. if s[-1]=="/"), then put the / back.  This could be
    # done more elegantly.
    return (\lchunk | chunk) || ((s[-1] == "/") | "")

end


procedure readtarhdr(s)

    # Read the silly tar header into a record.  Note that, as was
    # complained about above, some of the fields end in a null, some
    # in a space, and some in a space and a null.  The procedure
    # trim_str() may be (and in fact often _is_) used to remove this
    # extra garbage.

    this_block := hblock()
    s ? {
        this_block.name     := move(100)    # <- to be looked at later
        this_block.junk     := move(8+8+8)  # skip the permissions, uid, etc.
        this_block.size     := move(12)     # <- to be looked at later
        this_block.mtime    := move(12)
        this_block.chksum   := move(8)      # <- to be looked at later
        this_block.linkflag := move(1)
        this_block.linkname := move(100)    # <- to be looked at later
        this_block.therest  := tab(0)
    }
    integer(this_block.size) | fail  # If it's not an integer, we've hit
                                     # the final (null-filled) block.
    return this_block

end


procedure map_filenams(s)

    # Chunkset is global, and contains all the overlong filenames
    # found in the first pass through the input file; here the aim
    # is to map these filenames to the shortened variants as stored
    # in filenametbl (GLOBAL).

    local s2

    s2 := ""
    s ? {
        until pos(0) do {
            # first narrow the possibilities, using short_chunkset
            if member(short_chunkset,&subject[&pos:&pos+15])
            # then try to map from a long to a shorter 14-char filename
            then s2 ||:= (filenametbl[=!chunkset] | move(1))
            else s2 ||:= move(1)
        }
    }
    return s2

end


#  From the IPL.  Thanks, Ralph -
#  Author:  Ralph E. Griswold
#  Date:  June 10, 1988
#  exbase10(i,j) convert base-10 integer i to base j
#  The maximum base allowed is 36.

procedure exbase10(i,j)

   static digits
   local s, d, sign
   initial digits := &digits || &lcase
   if i = 0 then return 0
   if i < 0 then {
      sign := "-"
      i := -i
      }
   else sign := ""
   s := ""
   while i > 0 do {
      d := i % j
      if d > 9 then d := digits[d + 1]
      s := d || s
      i /:= j
      }
   return sign || s

end

# end IPL material


procedure get_checksum(r)

    # Calculates the new value of the checksum field for the
    # current header block.  Note that the specification says
    # that, when calculating this value, the chksum field must
    # be blank-filled.

    sum := 0
    r.chksum := repl(" ",8)      # the chksum field is 8 bytes wide
    every field := !r
    do every sum +:= ord(!field)
    return sum

end


procedure write_report()

    # This procedure writes out a list of filenames which were
    # remapped (because they exceeded the SysV 14-char limit),
    # and then notifies the user of the existence of this file.
    local outtext, stbl, i

    # Try the current directory first, then /tmp; outtext must be
    # assigned in both cases.
    (outtext := open(fname := "mapping.report","w")) |
        (outtext := open(fname := "/tmp/mapping.report","w")) |
            stop("mtf: Can't find a place to put mapping.report!")
    stbl := sort(filenametbl,3)
    every i := 1 to *stbl -1 by 2 do {
        write(outtext,left(stbl[i],35," ")," ",stbl[i+1])
    }
    write(&errout,"mtf: ",fname," contains the list of changes")
    close(outtext)
    return &null

end

From goer@sophist.uchicago.EDU Sat Mar 31 16:59:29 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA22736; Sat, 31 Mar 90 16:59:29 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Sat, 31 Mar 90 17:00 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Sat, 31 Mar 90 18:00:26 CST
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA24270; Sat, 31 Mar 90 17:54:54 CST
Resent-Date: Sat, 31 Mar 90 17:01 MST
Date: Sat, 31 Mar 90 17:54:54 CST
From: Richard Goerwitz
Subject: help with MS-DOS files
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9003312354.AA24270@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Appended below are two short programs for converting MSDOS text
files to Unix format and vice-versa.  I use them all the time.  I hope
someone else finds them helpful.

#-------------------------------------------------------------------
#
#  PROGRAM:  nocr (stands for "no carriage return")
#
#  usage:  nocr [file1 [file2 [etc.]]]
#
#  PURPOSE:  Nocr simply removes carriage returns from the
#  files whose names are given as arguments to the program.
#  I use it to import MS-DOS files.
#
#  BUGS:  None known.
#
#  Richard L. Goerwitz, III
#  last modified 1/20/90
#
#-------------------------------------------------------------------

procedure main(a)

    local fname, infile, outfile, line

    *a = 0 & stop("usage: nocr file1 file2...")

    while fname := pop(a) do {
        infile := open(fname,"r") | (er(fname,"reading"), next)
        outfile := open(fname || ".xM","w") |
            (er(fname || ".xM","writing"), next)
        while line := !infile do {
            if line[-1] == "\x0D"
            then write(outfile,line[1:-1])
            else write(outfile,line)
        }
        close(infile) | stop("nocr: cannot close, ",fname)
        close(outfile)     # flush the copy before it replaces the original
        remove(fname) | stop("nocr: cannot remove ",fname)
        rename(fname || ".xM",fname)
    }

end


procedure er(name,purpose)

    # The file name is passed in explicitly; fname is local to main.
    write(&errout,"nocr: cannot open ",name," for ",purpose)
    return

end


#-------------------------------------------------------------------
#
#  PROGRAM:  yescr
#
#  usage:  yescr [file1 [file2 [etc.]]]
#
#  PURPOSE:  Yescr simply adds a CR before each newline in the
#  files whose names are given as arguments to the program.
#  I use it to export MS-DOS files.
#
#  BUGS:  None known.
#
#  Richard L. Goerwitz, III
#  last modified 1/20/90
#
#-------------------------------------------------------------------

procedure main(a)

    local fname, infile, outfile, line

    *a = 0 & stop("usage: yescr file1 file2...")

    while fname := pop(a) do {
        infile := open(fname,"r") | (er(fname,"reading"), next)
        outfile := open(fname || ".xM","w") |
            (er(fname || ".xM","writing"), next)
        while line := !infile do {
            if line[-1] ~== "\x0D" | line == ""
            then write(outfile,line || "\x0D")
            else write(outfile,line)
        }
        close(infile) | stop("yescr: cannot close, ",fname)
        close(outfile)     # flush the copy before it replaces the original
        remove(fname) | stop("yescr: cannot remove ",fname)
        rename(fname || ".xM",fname)
    }

end


procedure er(name,purpose)

    # The file name is passed in explicitly; fname is local to main.
    write(&errout,"yescr: cannot open ",name," for ",purpose)
    return

end

-Richard L.
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From ralph Mon Apr 2 18:35:07 1990 Date: Mon, 2 Apr 90 18:35:07 MST From: "Ralph Griswold" Message-Id: <9004030135.AA24469@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA24469; Mon, 2 Apr 90 18:35:07 MST To: icon-group Subject: Version 8 of Icon for VMS Status: O Version 8 of Icon for VAX/VMS is now available. The VMS distribution includes source code, object code, executable binaries, documentation, and the Icon program library (new in Version 8). Version 8 of Icon for VMS systems can be obtained by anonymous FTP to cs.arizona.edu. After connecting, cd /icon/v8. Get READ.ME there for more information. See vmsfix.com in that directory for information about patching up VMS BACKUP tapes after FTP. If you do not have FTP access or prefer to obtain a magnetic tape and printed documentation, Version 8 of Icon for VMS can be ordered from: Icon Project Department of Computer Science Gould-Simpson Building The University of Arizona Tucson, AZ 85721 602 621-2018 (voice) 602 621-4246 (FAX) The price is $30, payable in US dollars with a check written on a bank in the United States. Orders also can be charged to MasterCard or Visa. This price includes shipping by parcel post in the United States, Canada, and Mexico. Add $10 for air mail delivery to other countries. Please direct any questions to me, not to icon-project or icon-group. 
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From cjeffery Mon Apr 2 20:52:45 1990 Resent-From: "Clinton Jeffery" Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01383; Mon, 2 Apr 90 20:52:45 MST Received: from megaron.cs.arizona.edu by Arizona.EDU; Mon, 2 Apr 90 20:54 MST Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01306; Mon, 2 Apr 90 20:52:11 MST Received: by caslon; Mon, 2 Apr 90 20:52:10 mst Resent-Date: Mon, 2 Apr 90 20:54 MST Date: Mon, 2 Apr 90 20:52:10 mst From: Clinton Jeffery Subject: Icon Ideas? (operator overloading) Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004030352.AA03647@caslon> In-Reply-To: shelby!csli!poser@decwrl.dec.COM's message of 28 Mar 90 02:30:09 GMT <12860@csli.Stanford.EDU> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O > There is an intermediate approach...Implement operator overloading as > ADDITION of methods for new data types, but don't allow pre-defined > methods (i.e. the built-in operators) to be removed. This guarantees > that an operator will have the expected semantics when applied to > built-in data types and reduces the uncertainty to derived types. I like this a lot. For Icon, though, I am not sure operator overloading makes sense, since Icon does not support the addition of new data types (other than records). It makes great sense in the object-oriented Icon-derivatives (what a mouthful!), but none of them do operator overloading so far as I know. No one is willing to translate + into an Icon procedure call. Neither is anyone willing to make extensive additions to the Icon interpreter to support this feature which "normal" Icon programs couldn't use. In the absence of type information, are there any other alternatives? 
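
For readers wondering what "translating + into an Icon procedure call" would amount to, here is a minimal hand-written sketch of the intermediate approach quoted above: built-in types keep their built-in semantics, and only a new record type gets its own method. The record type complex and the procedures add() and complex_add() are hypothetical names invented for this illustration; none of them is part of Icon or of any proposal in this thread.

```icon
# Hypothetical sketch: "overloading" addition by dispatching on type().
# complex, complex_add() and add() are made-up names for illustration.

record complex(re, im)

# The user-supplied method for the new data type.
procedure complex_add(a, b)
   return complex(a.re + b.re, a.im + b.im)
end

# add(a, b) stands in for a + b.  Records of type complex are handed
# to complex_add(); everything else falls through to the built-in +,
# so the semantics for built-in types cannot be changed.
procedure add(a, b)
   if type(a) == "complex" & type(b) == "complex" then
      return complex_add(a, b)
   return a + b
end

procedure main()
   local z
   write(add(1, 2))                         # writes 3
   z := add(complex(1, 2), complex(3, 4))
   write(z.re, " ", z.im)                   # writes 4 6
end
```

The sketch also shows the cost mentioned in the message: every addition written this way pays for a procedure call and a run-time type test, since no type information is available at translation time.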
From icon-group-request@arizona.edu Tue Apr 3 10:44:53 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13687; Tue, 3 Apr 90 10:44:53 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 3 Apr 90 10:21 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA10428; Tue, 3 Apr 90 09:20:00 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 3 Apr 90 10:22 MST
Date: 3 Apr 90 15:02:32 GMT
From: esquire!yost@nyu.EDU
Subject: The Splash programming language
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <1909@esquire.UUCP>
Organization: DP&W, New York, NY
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

A friend at the IBM Watson Research Center saw a talk last week
by Paul Abrahams on a language he is designing called Splash,
which is supposed to derive to some extent from Icon.

Does anyone know how to find out more about this language?

Thanks

 --dave yost
   yost@dpw.com or uunet!esquire!yost
   Please ignore the From or Reply-To fields above, if different.
From icon-group-request@arizona.edu Tue Apr 3 16:33:39 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13536; Tue, 3 Apr 90 16:33:39 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 3 Apr 90 16:33 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA10493; Tue, 3 Apr 90 16:30:24 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Tue, 3 Apr 90 16:35 MST Date: 3 Apr 90 23:30:14 GMT From: usenet@arizona.edu Subject: Can Icon be ftp'd from anywhere? Any icon for the Mac? Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <35270@ucbvax.BERKELEY.EDU> Organization: School of Education, UC-Berkeley X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O around? For the Mac would be great. Any help much appreciated. 
Thanks From: thom@dewey.soe.berkeley.edu (Thom Gillespie) Path: dewey.soe.berkeley.edu!thom --Thom From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Tue Apr 3 19:28:26 1990 Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA07078; Tue, 3 Apr 90 19:28:26 MST Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA05905; Tue, 3 Apr 90 22:28:21 -0400 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Tue, 3 Apr 90 22:28:14 EDT Date: Tue, 3 Apr 90 20:31:14 EDT From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu To: icon-group@cs.arizona.edu Message-Id: <214949@Wayne-MTS> Subject: The SPLASH Programming Language Status: O Re Dave Yost's inquiry: here's the abstract of the talk that I've been giving on SPLASH: ======================================================== SPLASH: A Systems Programming Language for Software Hackers SPLASH is a programming language designed for programmers who delight in their tools. SPLASH supports programming at a high level of expression, yet it enables its user to understand how the code he writes is actually executed and to maintain precise control, where it is wanted, over what the computer is actually doing. Its high-level facilities include the ability to define container types and iterators; inheritance of operations and polymorphism in the object-oriented style; generators that provide most of the facilities of coroutines but at far less cost; tuple types; and a great many syntactic niceties that encourage elegance and transparency of expression. Although the ideas in SPLASH are not radically new, they are combined and integrated in a new way. Major sources of inspiration for SPLASH are Icon, C++, and SEDL, a Software Engineering Design Language developed at IBM that is an extension of Ada. SPLASH is still a paper language, but the ideas in it are of interest independently of any particular implementation. The talk will describe the features of SPLASH and give some examples of its use. 
============================================================= I still don't have a formal report on it, but I hope to have that by early summer. Meanwhile I'll be happy to answer questions about SPLASH and its status. There are a lot of things in it that are directly taken from ICON (for which I give full credit). Two major differences (which are related): SPLASH is strongly typed, and it is designed to be compiled rather than interpreted. The strong typing also makes operator overloading possible (re the recent discussions in the ICON group). Paul Abrahams Abrahams%wayne-mts@um.cc.umich.edu 214 River Road Deerfield MA 01342 (413) 774-5500 From R.J.Hare@EDINBURGH.AC.UK Wed Apr 4 11:19:43 1990 Received: from rvax.ccit.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA26454; Wed, 4 Apr 90 11:19:43 MST Received: from UKACRL.BITNET by rvax.ccit.arizona.edu; Wed, 4 Apr 90 11:18 MST Received: from RL.IB by UKACRL.BITNET (Mailer X1.25) with BSMTP id 2221; Tue, 03 Apr 90 10:04:13 BST Date: 03 Apr 90 10:04:40 bst From: R.J.Hare@EDINBURGH.AC.UK Subject: Prolog documentation To: icon-group@cs.arizona.edu Message-Id: <03 Apr 90 10:04:40 bst 340539@EMAS-A> Via: UK.AC.ED.EMAS-A; 3 APR 90 10:04:10 BST X-Envelope-To: icon-group@CS.ARIZONA.EDU Status: O Someone put a short document file on this board the other day, which contained instructions for running the prolog interpreter. I have foolishly lost this. Could it please be re-posted on the board. Thanks. Roger Hare. 
From icon-group-request@arizona.edu Wed Apr 4 19:00:49 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP id AA17274; Wed, 4 Apr 90 19:00:49 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 4 Apr 90 17:48 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA10800; Wed, 4 Apr 90 17:44:52 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Wed, 4 Apr 90 17:55 MST Date: 4 Apr 90 23:23:06 GMT From: motcid!henley@uunet.uu.NET Subject: RE: Can Icon be ftp'd from anywhere? Any icon for the Mac? Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <2074@mica6.UUCP> Organization: Motorola Inc., Cellular Infrastructure Div., Arlington Heights, IL X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: <35270@ucbvax.BERKELEY.EDU> Status: O usenet@ucbvax.BERKELEY.EDU (USENET News Administration) writes: >around? For the Mac would be great. Any help much appreciated. Thanks >From: thom@dewey.soe.berkeley.edu (Thom Gillespie) >Path: dewey.soe.berkeley.edu!thom >--Thom All questions, Yes! 1) ICON is available from the university of Arizona(version 7.5 and 7.0): BBS: (602) 621-2283 FTP: arizona.edu (/icon) (128.196.128.118 or 192.12.69.1) 2) There is a version available for the Mac! 
------------------------------------------------- | Aaron Henley (uunet!motcid!henley) | | Motorola Cellular Infrastructure Division | ------------------------------------------------- From icon-group-request@arizona.edu Wed Apr 4 19:01:22 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP id AA17414; Wed, 4 Apr 90 19:01:22 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 4 Apr 90 17:33 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA09534; Wed, 4 Apr 90 17:24:54 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Wed, 4 Apr 90 17:37 MST Date: 5 Apr 90 00:01:01 GMT From: bullwinkle!ccsam@ucdavis.ucdavis.EDU Subject: Icon on the IBM RS/6000 Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <7033@aggie.ucdavis.edu> Organization: Computing Services, UC Davis X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O the subject says it all - has someone ported it yet? (please respond via mail; and, if you've got a 6000 and want to hear what responses i get, please send me a note). 
-sam

From icon-group-request@arizona.edu Wed Apr 4 21:49:29 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP id AA26942; Wed, 4 Apr 90 21:49:29 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 4 Apr 90 21:50 MST
Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA23834; Wed, 4 Apr 90 21:38:24 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 4 Apr 90 21:51 MST
Date: 4 Apr 90 16:05:44 GMT
From: esquire!yost@nyu.EDU
Subject: RE: Any Icon (programming language) for the Mac?
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <1913@esquire.UUCP>
Organization: DP&W, New York, NY
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <35270@ucbvax.BERKELEY.EDU>
Status: O

In article <35270@ucbvax.BERKELEY.EDU> thom@dewey.soe.berkeley.edu.UUCP (Thom Gillespie) writes.

ProIcon for the Mac looks from its manual to be extremely well done.
I haven't used it or seen it used yet, but it has nicely accessible
online language reference documentation, power assist for writing code,
some support for Mac windows, tracing that has been extended to include
calls to builtin functions, and other goodies.

ProIcon was developed by Mark Emmer and Ralph Griswold and is
available mail order for $175 from their company called Bright Forest
at 303-539-3884.

It would be nice if someone who uses ProIcon would tell us
if it really is as good as it looks.

There is also a more Unix-like, batch-oriented version for MPW
available from icon-project@cs.arizona.edu.

 --dave yost
   yost@dpw.com or uunet!esquire!yost
   Please ignore the From or Reply-To fields above, if different.
From shafto@eos.arc.nasa.GOV Thu Apr 5 08:20:36 1990 Resent-From: shafto@eos.arc.nasa.GOV Received: from maggie.telcom.arizona.edu by megaron (5.59-1.7/15) via SMTP id AA29876; Thu, 5 Apr 90 08:20:36 MST Received: from eos.arc.nasa.gov by Arizona.EDU; Thu, 5 Apr 90 08:19 MST Received: Thu, 5 Apr 90 08:16:57 PST by eos.arc.nasa.gov (5.59/1.2) Resent-Date: Thu, 5 Apr 90 08:21 MST Date: Thu, 5 Apr 90 08:16:57 PST From: Michael Shafto Subject: RE: Any Icon (programming language) for the Mac? Resent-To: icon-group@cs.arizona.edu To: esquire!yost@nyu.EDU, icon-group@arizona.edu Cc: shafto@EOS.ARC.NASA.GOV Resent-Message-Id: Message-Id: <9004051616.AA23858@eos.arc.nasa.gov> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: esquire!yost@nyu.EDU, icon-group@Arizona.edu X-Vms-Cc: shafto@EOS.ARC.NASA.GOV Status: O Well, I've used ProIcon quite a bit, as well as other symbol-processing languages on the Mac -- Allegro Common Lisp and MaxSpitbol. I uncovered a bug regarding coroutines, which Mark & Ralph patched up right away -- they're a lot more responsive than the MPW folks, in my experience. I found the ProIcon environment significantly more comfortable than the ACL environment, even though I had sort of adapted to ACL before getting ProIcon. As a general high-level language that gives access to the Mac qua Mac, I strongly recommend ProIcon. (Can't comment much on MaxSpitbol due to less experience with it, though it looks to have many of the same good environmental features as ProIcon.) 
Mike

From goer@sophist.uchicago.EDU Thu Apr 5 15:51:15 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA00754; Thu, 5 Apr 90 15:51:15 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 5 Apr 90 14:55 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 5 Apr 90 16:54:03 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA00373; Thu, 5 Apr 90 16:50:21 CDT
Resent-Date: Thu, 5 Apr 90 15:03 MST
Date: Thu, 5 Apr 90 16:50:21 CDT
From: Richard Goerwitz
Subject: benchmarks for v8, Xenix
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9004052150.AA00373@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I'm just curious what sorts of benchmarks people are getting for
other machines.  For me, v8 runs just as fast as v7 (maybe even a
tad faster, but this may be due to my new compiler).

I'm running a 386 box with Xenix 2.3.2, and the 2.3.0 compiler
set (with lng085 sls = MSC 5.1).  However, I actually ended up
compiling Icon with gcc (-traditional).  My results were:

    concord.out:concord elapsed time = 17166
    deal.out:deal elapsed time = 21133
    ipxref.out:ipxref elapsed time = 4233
    queens.out: elapsed time = 22383
    rsg.out: elapsed time = 24367

The MSC 5.1 compiler compiled okay, but the result was unsatisfactory
(none of the builtin functions could be found by the interpreter).
Gcc passed all the tests, except for things that are site-specific,
and a few floating point operations (for which it gave a few digits
less precision).

Incidentally, if anyone gets Icon compiled with MSC 5.1 under Xenix,
I'd appreciate hearing about it.  My real reason for posting, though,
isn't to gripe about the Xenix compiler, but to find out what others
are discovering about Icon v8 as far as the benchmarks go.

-Richard L.
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From goer@sophist.uchicago.EDU Thu Apr 5 17:25:06 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA10199; Thu, 5 Apr 90 17:25:06 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Thu, 5 Apr 90 17:25 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 5 Apr 90 19:23:31 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA00503; Thu, 5 Apr 90 19:19:50 CDT Resent-Date: Thu, 5 Apr 90 17:27 MST Date: Thu, 5 Apr 90 19:19:50 CDT From: Richard Goerwitz Subject: coexpressions for Xenix Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004060019.AA00503@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I just implemented coexpressions for my Xenix system. It didn't require rewriting a single line of code. I just moved the Micro- port System V file into the Xenix directory, removed the #define NoCoexp (or whatever it is), and recompiled. Note that I used gcc. I'd guess it would work with MSC 5.1 if the bugs there could be worked out. Anyway, it passes the tests, and so I'm happy. I just thought someone else might like to know that it can (easily) be done. -Richard L. 
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From cargo@tardis.cray.com Fri Apr 6 06:23:21 1990 Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA17024; Fri, 6 Apr 90 06:23:21 MST Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA14189; Fri, 6 Apr 90 08:23:54 CDT Received: from zk.cray.com by hall.cray.com id AA10754; 3.2/CRI-3.12; Fri, 6 Apr 90 08:23:52 CDT Received: by zk.cray.com id AA17193; 3.2/CRI-3.12; Fri, 6 Apr 90 08:23:50 CDT Date: Fri, 6 Apr 90 08:23:50 CDT From: cargo@tardis.cray.com (David S. Cargo) Message-Id: <9004061323.AA17193@zk.cray.com> To: icon-group@cs.arizona.edu Subject: RE: Any Icon (programming language) for the Mac? Status: O I have used ProIcon for generating PostScript files for printing on the Mac. Just this last weekend, I wrote and tested two Icon programs on MS-DOS and then moved the files to the Mac (via MS-DOS 5.25" floppy to MS-DOS 720K 3.5" floppy to Mac via Apple File Exchange with a Mac with Superdrive). The same Icon programs compiled and ran with ProIcon. For me, this was extremely useful, since my home system is MS-DOS but my target environment was a Mac. I have also used the ProIcon features for reading and writing files to and from windows, but not so much since maintaining portability was high on my list for my major programs. dsc From ralph Fri Apr 6 08:26:29 1990 Date: Fri, 6 Apr 90 08:26:29 MST From: "Ralph Griswold" Message-Id: <9004061526.AA24992@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA24992; Fri, 6 Apr 90 08:26:29 MST To: icon-group Subject: ProIcon Status: O The contact/telephone numbers for ProIcon in recent e-mail were not correct. ProIcon is marketed by Catspaw, Inc. 
Their telephone number is 719-539-3884 The telephone number for The Bright Forest Company is 602-325-3948 Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From shafto@eos.arc.nasa.gov Fri Apr 6 10:32:01 1990 Received: from eos.arc.nasa.gov by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA05802; Fri, 6 Apr 90 10:32:01 MST Received: Fri, 6 Apr 90 08:28:04 PST by eos.arc.nasa.gov (5.59/1.2) Date: Fri, 6 Apr 90 08:28:04 PST From: Michael Shafto Message-Id: <9004061628.AA12692@eos.arc.nasa.gov> To: cargo@tardis.cray.com, icon-group@cs.arizona.edu Subject: RE: Any Icon (programming language) for the Mac? Cc: shafto@EOS.ARC.NASA.GOV Status: O I have also happily ported Icon from MS-DOS to Mac (ProIcon), which saved me a lot of time. I've not seen a Lisp dialect that can do the same trick, though I imagine there must be such a path via Common Lisp. Mike From icon-group-request@arizona.edu Fri Apr 6 15:26:42 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA29636; Fri, 6 Apr 90 15:26:42 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 6 Apr 90 15:26 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA23944; Fri, 6 Apr 90 15:04:16 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Fri, 6 Apr 90 15:26 MST Date: 6 Apr 90 21:20:38 GMT From: pacific.mps.ohio-state.edu!zaphod.mps.ohio-state.edu!usc!elroy.jpl.nasa.gov!suned1!zaft@tut.cis.ohio-state.EDU Subject: RE: Can Icon be ftp'd from anywhere? Any icon for the Mac? 
Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <3605@suned1.Navy.MIL> Organization: NSWSES, Port Hueneme, CA X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: <35270@ucbvax.BERKELEY.EDU>, <2074@mica6.UUCP> Status: O CORRECTION: The latest Icon Newsletter says the address for U of Arizona is changing from arizona.edu to cs.arizona.edu. Gordon Zaft -- ***************************************************************************** * suned1!zaft@elroy.JPL.Nasa.Gov zaft@suned1.nswses.navy.mil * * Chairman, Ventura County ACM Phone: (805) 982-0684 * * Any statements / opinions made here are mine, alone, not the Navy's. * From icon-group-request@arizona.edu Mon Apr 9 16:19:18 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA07889; Mon, 9 Apr 90 16:19:18 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Mon, 9 Apr 90 10:37 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA18962; Mon, 9 Apr 90 10:26:32 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Mon, 9 Apr 90 16:03 MST Date: 9 Apr 90 17:22:44 GMT From: usc!zaphod.mps.ohio-state.edu!rpi!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU Subject: Icon on the Mac Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <3344@uwm.edu> Organization: University of Wisconsin-Milwaukee X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I have been using ProIcon (Icon for the Macintosh computer) continuously for some three months, and am writing to share some impressions at this point. 
I feel that one's response to ProIcon will be largely determined by how one responds to the Macintosh. I am currently preparing some instructional materials for modern Hebrew reviving some ideas I tried to implement some twenty years ago using programmed learning. Many students liked the approach, but I was bothered by the almost inevitable clumsiness and turgidity of programmed texts, and did not pursue my ideas too far. I feel that the computer offers real possibilities of helping students with the difficult task of learning natural languages, especially those which use exotic scripts. I decided to try to implement this on the Macintosh, because my impression is that students prefer this machine, and when they can choose between it and the other system, they vote with their feet. The fact is that soaps doubtless have a much greater following than dramatic masterpieces on public tv, so one need hardly be surprised that a bright, cheery sometimes gormless approach wins out on the computer too. I hope I don't seem to be an intellectual snob in saying this, because I realise that my tastes are often low-brow, and I could not honestly criticize the individuals (my son is one) who enjoy the Macintosh. But I don't care for the Macintosh. Bad enough that I still have to take out the trash in real life, I don't need to do it symbolically on the screen. These trivialities annoy me as a Macintosh user, but, more seriously, as a programmer I feel the lack of support of a consistent and powerful operating system which will willingly accept my written orders. Using ProIcon is quite pleasant; it just can't get away from the world in which it lives. True it adds some functions presaging version 8, and it has a development system comparable to that of Turbo Pascal, bringing you back to the Editor when an error is detected. 
But there is a trade-off; the ProIcon Editor is an Editor and, unlike the Editor in Apple Pascal for the II+ for example, does not have an automatic fill mode that avoids the carriage returns when writing straight text. Accordingly I find that having more or less completed a program in ProIcon for which I found the Editor quite satisfactory, I am now preparing the data files which the program processes on my Zenith using my favorite editor JOVE (Jonathan's Own Version of Emacs), and transferring them to the Mac. There is one big addition which ProIcon has, and that is the windows. You do not need absolutely to use these windows if you don't want to. I transferred a lengthy Icon program which gives a visually equivalent Jewish and Gregorian calendar for any year from MS-DOS and UNIX to the Macintosh and only needed to remove the screen controls to make it work nicely on the Macintosh using only the Interactive window. But it is a pity to waste this resource if you are working in the first instance on the Macintosh, and portability is not important. It does however inject a new element into one's programming. Suddenly one has to become to some extent a graphic designer, and this, like programming, is an art, but an entirely different one. The immense flexibility and power of the window functions of ProIcon force the programmer to think about all kinds of esthetic issues which were not really relevant previously. Perhaps someone could do for ProIcon's windows what Leslie Lamport did for TeX -- take over the visual design aspect and let the programmer concentrate on logical design. Lamport points out that "with a visual design system, authors usually produce aesthetically pleasing, but poorly designed documents." I have an uncomfortable feeling that I may be doing the same thing with my windows. With this admission, I would yet suggest some tentative guidelines for using these windows. 
Perhaps others will have further suggestions which will enable us to build up a body of expertise in this area. First plan the windows which you will need for the entire program. They can be set up at the beginning, their size, position and fonts can be determined, and decisions made as to how they will be connected with disk files, if at all. They do not have to be visible at this stage, or indeed at any stage (I'll address this later.) For example, at one stage of my program I have a setup which looks like this:

------------------------------------------------------------------------
|                                  |                                   |
|                                  |                                   |
|        Interactive Window        |           Hebrew Window           |
|                                  |                                   |
|                                  |                                   |
------------------------------------------------------------------------
|                                                                      |
|                          Information Window                          |
|                                                                      |
------------------------------------------------------------------------

The upper left window gives information to, and gets responses from, the user. It derives its information from a disk file, but the window itself is not connected to a file. The upper right window is connected to an empty file which has been opened for writing, so this window is dynamic. Items appear there (in Hebrew script) which have been prompted by the activity in the adjacent window. This material can be saved permanently if desired. The bottom window is connected to a complete, previously prepared file which has been opened for reading. This is a static window, which the user refers to as necessary, moving back and forth at will by manipulating the bar, arrows and thumb on the right side of the window. This might be a help screen, or a set of relevant information which needs to be handy throughout the exercise. In addition there is a fourth window which the program never activates. This is connected to a file logging the user's activity, hour by hour and day by day, and measuring success. To this file material is constantly appended, and is saved at the end of the program.
The user can see it at any time before the program ends by using the pull-down window menu, and clicking on the Log entry, dismissing it after perusal by clicking on its close box in its upper left hand corner. This window just lurks in the background, and some users might never activate it at all. I think this is sufficient to indicate that the permutations of the ways in which windows can be used are vast. It is probably best to try out the window functions in some trivial program, just to get a feel for the manner in which they work. They really do what they are supposed to, but often it is easier to understand what the functions do by seeing them at work rather than trying to understand a verbal description. The documentation does a pretty good job, and is pleasant to look at, but it is really hard to describe all these possibilities clearly in words. -- Alan D. Corre Department of Hebrew Studies University of Wisconsin-Milwaukee (414) 229-4245 PO Box 413, Milwaukee, WI 53201 corre@csd4.csd.uwm.edu From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Mon Apr 9 19:40:44 1990 Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA22670; Mon, 9 Apr 90 19:40:44 MST Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA16310; Mon, 9 Apr 90 22:40:38 -0400 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Mon, 9 Apr 90 22:40:25 EDT Date: Mon, 9 Apr 90 20:16:22 EDT From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu To: icon-group@cs.arizona.edu Message-Id: <216762@Wayne-MTS> Subject: SPLASH: A Systems Programming Language for Software Hackers Status: O In answer to a recent inquiry in this forum, here's the abstract of the talk that I've been giving on SPLASH. My apologies if you've received this twice; I thought I transmitted it once before but I never got a copy back. Many of the ideas in SPLASH are derived from Icon. The main differences are that SPLASH is strongly typed and that it is designed to be compiled rather than interpreted. 
(That helps in providing operator overloading - a recent topic of discussion in this forum.) I'll be happy to answer any questions about SPLASH either by email or otherwise. ================================================================== SPLASH: A Systems Programming Language for Software Hackers SPLASH is a programming language for programmers who delight in their tools. SPLASH supports programming at a high level of expression, yet it enables its user to understand how the code he writes is really executed and to maintain precise control, where it is wanted, over what the computer is actually doing. Its high-level facilities include the ability to define container types and iterators; generators that provide most of the facilities of coroutines but at far less cost; tuple types; inheritance of operations and polymorphism in the object-oriented style; and a great many syntactic niceties that encourage elegance and transparency of expression. Although the ideas in SPLASH are not radically new, they are combined and integrated in a new way. Major sources of inspiration for SPLASH are Icon, C++, and SEDL, a Software Engineering Design Language developed at IBM that is an extension of Ada. Although SPLASH has not yet been implemented, the ideas in it are of interest independently of any particular implementation. The talk will describe the features of SPLASH and give some examples of its use. 
Paul Abrahams 214 River Road Deerfield MA 01342 (413) 774-5500 Abrahams%wayne-mts@um.cc.umich.edu From goer@sophist.uchicago.EDU Mon Apr 9 23:20:36 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02894; Mon, 9 Apr 90 23:20:36 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Mon, 9 Apr 90 23:15 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 10 Apr 90 01:13:45 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA06171; Tue, 10 Apr 90 01:09:59 CDT Resent-Date: Mon, 9 Apr 90 23:21 MST Date: Tue, 10 Apr 90 01:09:59 CDT From: Richard Goerwitz Subject: type conversion Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004100609.AA06171@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I find that I don't make much use of automatic type conversion. Even though this feature lies at the heart of the Icon implementation, I wonder how much it lies at the heart of the language's conception. Normally, if I think that a type conversion will occur, I make the conversion explicit with a builtin function like string() or cset() or integer(). I find that if conversions are occurring where I don't know about them, they are *usually* indicative of sloppy programming on my part. If automatic type conversion disappeared, what sorts of ramifications would it have? Would some aspect of the language that I haven't considered be radically altered? Or would it permit greater speed and allow for fuller implementation of things like operator overloading (which some seem to want)? What if we had optional static typing? Would this offer the best of both worlds? How would such a feature be implemented and integrated into the rest of the language (if in fact it is desirable in the first place)? Would it be hard to do?
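The contrast between implicit and explicit conversion described above can be sketched in a few lines of ordinary Icon. This is only an illustration; the values are made up, and nothing here goes beyond the built-in conversion functions already named in the message.

```icon
procedure main()
   # Implicit conversion: + coerces the string "10" to an integer.
   write("10" + 5)              # writes 15
   # The same computation with the conversion spelled out.
   write(integer("10") + 5)     # writes 15
   # Implicit conversion the other way: || coerces the integer to a string.
   write(10 || "5")             # writes 105
   write(string(10) || "5")     # writes 105
   # An explicit conversion that cannot be done simply fails,
   # so it can be tested like any other expression.
   if not integer("ten") then write("\"ten\" is not an integer")
end
```

Written explicitly, a conversion that cannot succeed shows up as failure at the point of the call, whereas "ten" + 5 would terminate the program with a run-time error.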
I don't claim to be a theoretician. Any discussion or clarification would be much appreciated. Flames as well. I don't take this sort of thing too personally. -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From icon-group-request@arizona.edu Wed Apr 11 04:04:09 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA14471; Wed, 11 Apr 90 04:04:09 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 11 Apr 90 04:05 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA19478; Wed, 11 Apr 90 03:47:54 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Wed, 11 Apr 90 04:05 MST Date: 11 Apr 90 10:22:29 GMT From: zaphod.mps.ohio-state.edu!usc!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU Subject: RE: type conversion Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <2035@bruce.OZ> Organization: Monash Uni. Computer Science, Australia X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: <9004100609.AA06171@sophist.uchicago.edu> Status: O In article <9004100609.AA06171@sophist.uchicago.edu>, goer@SOPHIST.UCHICAGO.EDU (Richard Goerwitz) writes: > > I find that I don't make much use of automatic type conversion. > Even though this feature lies at the heart of the Icon implemen- > tation, I wonder how much it lies at the heart of the language's > conception. I call this kind of type conversion "type coercion" and if I can quote from T.W. Pratt's book (2nd Edition) "Programming Languages, Design and Implementation", p 57: "Coercions are an important design issue in most languages. 
Two opposed philosophies exist regarding the extent to which the language should provide coercions between data types." The two schools of thought Pratt refers to are: (i) only do essential coercions like int/real (Modula-2 goes even further and does none!), (ii) if there is a coercion that might make sense then do it (e.g. if a string looks like it could be a numeral then convert it). > > Normally, if I think that a type conversion will occur, I make > the conversion explicit with a builtin function like string() > or cset() or integer(). I find that if conversions are occurring > where I don't know about them, they are *usually* indicative of > sloppy programming on my part. Your philosophy is (i) above. > > If automatic type conversion disappeared, what sorts of ramifications > would it have? Would some aspect of the language that I haven't > considered be radically altered? Or would it permit greater > speed and allow for fuller implementation of things like operator > overloading (which some seem to want)? > From the point of view of implementation it is probably easier to do no coercions. There would be little difference in execution speed either way. It would probably be easy to provide a flag to turn this feature on or off when using Icon. Unfortunately this would mean two kinds of source code proliferating. In the functional language community a similar controversy surrounds the use of strict or lazy evaluation of procedure arguments. Lazy evaluation is more powerful but can enable some very arcane programming techniques. The functional language ML chose to use strict evaluation. Some people liked ML except for this one thing, hence LML was born. Imagine what it would be like if every language design decision was made an implementation variable! > What if we had optional static typing? Would this offer the best > of both worlds?
How would such a feature be implemented and integrated into the rest of the language (if in fact it is desirable > in the first place)? Would it be hard to do? > Static type checking is not realistically possible for a language such as Icon. Consider the expression:

    if x=0 then "small" else 1

The type of this expression may be impossible to determine until run time. In Icon as well as such expressions as these it is possible to check the type of a value at run time and act according to this information. I have used this myself to obtain the behaviour of variant types. For example:

    if type(x)=="integer" then x+1 else 0

Expressions such as this are quite foreign to languages with static type checking. Perhaps what you need is some kind of "lint" program like that which the C programming language has. For programs which can be statically typed it could warn of any type clashes and coercions that will occur, for other programs it could just indicate that static type checking was not feasible.

From goer@sophist.uchicago.EDU Wed Apr 11 06:12:07 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA18233; Wed, 11 Apr 90 06:12:07 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Wed, 11 Apr 90 06:11 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 11 Apr 90 08:10:07 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA07853; Wed, 11 Apr 90 08:06:19 CDT Resent-Date: Wed, 11 Apr 90 06:13 MST Date: Wed, 11 Apr 90 08:06:19 CDT From: Richard Goerwitz Subject: type coercion Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004111306.AA07853@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O Consider the following: Static type checking is not realistically possible for a language such as Icon.
Consider the expression: if x=0 then "small" else 1 The type of this expression may be impossible to determine until run time. In Icon as well as such expressions as these it is possible to check the type of a value at run time and act according to this information. I have used this myself to obtain the behaviour of variant types. For example: if type(x)=="integer" then x+1 else 0 Expressions such as this are quite foreign to languages with static type checking. Perhaps what you need is some kind of "lint" program like that which the C programming language has. For programs which can be statically typed it could warn of any type clashes and coercions that will occur, for other programs it could just indicate that static type checking was not feasible. I don't disagree that expressions such as this are foreign to languages with static type checking. What I wonder is whether a thing like optional static typing might be applied to variables, and not expressions. In this scenario, if x = 0 then "small" else 1 would be fine. Just curious. -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From ralph Wed Apr 11 08:35:53 1990 Date: Wed, 11 Apr 90 08:35:53 MST From: "Ralph Griswold" Message-Id: <9004111535.AA26352@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA26352; Wed, 11 Apr 90 08:35:53 MST To: icon-group Subject: Version 8 of Icon for MS-DOS Status: O Version 8 of Icon for MS-DOS is now available. There are two packages -- one contains executable binary files and the other contains source code. The executable binaries support only the large memory model. Two versions of the run-time system are provided: one that supports large-integer arithmetic and a smaller one that does not. The source code compiles under Microsoft C 5.10 (MS-DOS and OS/2), Lattice C 6.01, and Turbo C 2.0. Version 8 of Icon for MS-DOS systems can be obtained by anonymous FTP to cs.arizona.edu. 
After connecting, cd /icon/v8. Get READ.ME there for more information. If you do not have FTP access or prefer to obtain diskettes and printed documentation, Version 8 of Icon for MS-DOS can be ordered from: Icon Project Department of Computer Science Gould-Simpson Building The University of Arizona Tucson, AZ 85721 602 621-2018 (voice) 602 621-4246 (FAX) Specify whether you want executable binaries, source code, or both and the size of diskettes you prefer (5.25" or 3.5"). The packages are $20 each, payable in US dollars to The University of Arizona with a check written on a bank in the United States. Orders also can be charged to MasterCard or Visa. The price includes shipping by parcel post in the United States, Canada, and Mexico. Add $5 per package for air mail delivery to other countries. Please direct any questions to me, not to icon-project or icon-group. Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From M17572@mwvm.mitre.ORG Thu Apr 12 05:53:54 1990 Resent-From: M17572@mwvm.mitre.ORG Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA18999; Thu, 12 Apr 90 05:53:54 MST Return-Path: M17572@mwvm.mitre.ORG Received: from mwunix.mitre.org by Arizona.EDU; Thu, 12 Apr 90 05:22 MST Received: from mwvm.mitre.org by mwunix.mitre.org (5.61/SMI-2.2) id AA00101; Thu, 12 Apr 90 08:19:28 -0400 Received: from MWVM by mwvm.mitre.org (IBM VM SMTP R1.2.1) with BSMTP id 7601; Thu, 12 Apr 90 08:20:06 EDT Resent-Date: Thu, 12 Apr 90 05:36 MST Date: Thursday, 12 Apr 1990 08:20:05 EST From: m17572@mwvm.mitre.ORG Subject: pl/i to c translator in icon Resent-To: icon-group@cs.arizona.edu To: icon-group%arizona.edu@mwunix.mitre.ORG Resent-Message-Id: Message-Id: <9004121219.AA00101@mwunix.mitre.org> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group%arizona.edu@mwunix.mitre.ORG Status: O I am looking for a translator from pl/i to c written 
in icon. It doesn't have to be perfect. alternatively, a translator from any similar language to c would be helpful. thanks in advance. * * John Artz jartz@mitre.org From kwalker Thu Apr 12 08:14:52 1990 Date: Thu, 12 Apr 90 08:14:52 MST From: "Kenneth Walker" Message-Id: <9004121514.AA25491@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA25491; Thu, 12 Apr 90 08:14:52 MST In-Reply-To: <9004111306.AA07853@sophist.uchicago.edu> To: icon-group Subject: Re: type coercion Status: O > Date: 11 Apr 90 10:22:29 GMT > From: zaphod.mps.ohio-state.edu!usc!samsung!munnari.oz.au!bruce!alanf@tut.cis.ohio-state.EDU > >Static type checking is not realistically possible for a language such as Icon. > Consider the expression: > if x=0 then "small" else 1 > The type of this expression may be impossible to determine until run time. > ... > Perhaps what you need is some kind of "lint" program like that > which the C programming language has. For programs which can be statically > typed it could warn of any type clashes and coercions that will occur, for > other programs it could just indicate that static type checking was not > feasible. It is possible to perform type inference on Icon programs. Type inference assigns a type to each expression, but some types may be of a form like "string or integer", as in the example. In addition to assigning types which an expression might actually produce, it is sometimes necessary for a type inferencing scheme to make conservative estimates, so the assigned type may include types which the expression would never take on in any possible execution. A compiler can use information from type inferencing to eliminate much of the run-time type checking from a program, improving execution speed. I implemented a prototype type inferencing scheme some time ago and was able to infer unique types for about 90 percent of all operands where type checking is normally needed. 
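The "string or integer" case can be observed directly in today's Icon. The sketch below wraps the quoted expression in a hypothetical procedure named classify (not from the original post) and prints the run-time type of each result; it shows why an inferencer can do no better than a union type here.

```icon
# classify is an illustrative name. Depending on x, the result is a
# string or an integer, so the inferred result type would be
# "string or integer".
procedure classify(x)
   return if x = 0 then "small" else 1
end

procedure main()
   local v
   every v := 0 | 7 do
      write(type(classify(v)))   # writes "string", then "integer"
end
```

By contrast, an operand such as x in x = 0 can often be given a single type when every call site passes an integer, which is the kind of case where the run-time check could be dropped.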
I am preparing to implement a type inferencing scheme in an experimental optimizing compiler for Icon. It will be interesting to see what kind of speedups result from using the type information in the code generator. > Date: Wed, 11 Apr 90 08:06:19 CDT > From: Richard Goerwitz > > What I wonder is whether a thing like optional > static typing might be applied to variables, and not expressions. This seems like a nice idea. Adding optional type information to declarations is quite easy. You could do something like

    local x:integer, y: string | record r

x could then be assigned only integers and y could be assigned only strings or records of type r. Variables with no type information would simply be "any type", allowing the current style of typeless variables to still be used. This scheme would require some run-time type checking at assignments (and type coercions?), but that is not a serious problem, though it might take some thought as to how to implement it efficiently. This scheme effectively moves type checking from the uses of a variable to the assignments, but run-time checking at assignments is only needed when the type of the value being assigned cannot be statically determined. I would also like to be able to have a declaration like

    global x: list of integer

This means that any list assigned to x must be restricted to contain only integers. It would be necessary to have some way of creating type-restricted structures. Ideally the method would be a simple extension of the current methods for creating structures. You probably also want a way of creating named types.

    type foo: list of (integer | foo)
    global x: foo

Type equivalence would be structural. In the following example, x and y have the same type.

    type str_set: set of string
    local x: str_set
    local y: set of string
    x := y

Does anyone know how this compares to other languages with flexible type systems? Are there pitfalls I haven't anticipated?
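The run-time check at assignment can be approximated in existing Icon with an ordinary procedure. This is only a sketch of the idea: checked is a hypothetical helper, not part of Icon, and the typed declarations appear only in comments because the proposed syntax does not exist.

```icon
record r(a)

# Hypothetical helper: returns v if its type is one of the named
# types, otherwise stops the program with a message.
procedure checked(v, types[])
   if type(v) == !types then return v
   stop("type check failed: got ", type(v))
end

procedure main()
   local x, y
   x := checked(42, "integer")           # stands in for  local x:integer
   y := checked("abc", "string", "r")    # stands in for  local y: string | record r
   y := checked(r(1), "string", "r")
   write(type(x), " ", type(y))          # writes: integer r
   x := checked("oops", "integer")       # stops with an error message
end
```

Note that this helper does no coercion: checked("42", "integer") is rejected, which is one possible answer to the "(and type coercions?)" question, not necessarily the intended one.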
Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721 +1 602 621 2858 kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker From ccc!ccc.com!clemc@uunet.UU.NET Thu Apr 12 08:49:10 1990 Resent-From: ccc!ccc.com!clemc@uunet.UU.NET Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA27469; Thu, 12 Apr 90 08:49:10 MST Received: from uunet.UU.NET by Arizona.EDU; Thu, 12 Apr 90 08:48 MST Received: from ccc.UUCP by uunet.uu.net (5.61/1.14) with UUCP id AA06191; Thu, 12 Apr 90 11:46:45 -0400 Received: from localhost by CCC.COM.ccc.CCC.COM id aa01444; 12 Apr 90 11:00 EDT Resent-Date: Thu, 12 Apr 90 08:50 MST Date: Thu, 12 Apr 90 10:59:58 EDT From: clemc@ccc.COM Subject: RE: pl/i to c translator in icon Resent-To: icon-group@cs.arizona.edu To: arizona.edu!icon-group@uunet.uu.NET Resent-Message-Id: Message-Id: <9004121546.AA06191@uunet.uu.net> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: arizona.edu!icon-group@uunet.uu.NET Status: O > > I am looking for a translator from pl/i to c written in icon. > It doesn't have to be perfect. alternatively, a translator > from any similar language to c would be helpful. > thanks in advance. > * > * John Artz jartz@mitre.org You might want to look into f2c the FORTRAN 77 to C converter that is available from NETLIB at AT&T or the Pascal to C converters available in the UUNET net news archives. Good Luck, Clem Cole ------ Clement T. 
Cole Cole Computer Consulting uunet!ccc!clemc uucp 255 North Road #119 clemc@ccc.com Internet Chelmsford, MA 01824-1402 (508) 256-6967 voice From wgg@cs.washington.edu Thu Apr 12 11:24:21 1990 Received: from june.cs.washington.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA08442; Thu, 12 Apr 90 11:24:21 MST Received: by june.cs.washington.edu (5.61/7.0jh) id AA15180; Thu, 12 Apr 90 11:24:22 -0700 Date: Thu, 12 Apr 90 11:24:22 -0700 From: wgg@cs.washington.edu (William Griswold) Return-Path: Message-Id: <9004121824.AA15180@june.cs.washington.edu> To: icon-group@cs.arizona.edu, kwalker@cs.arizona.edu Subject: Re: type coercion Status: O > >Does anyone know how this compares to other languages with flexible >type systems? Are there pitfalls I haven't anticipated? > The problem that I can anticipate right off is that structural equivalence when only some variables are typed could be expensive, unless you do some precomputation and type inference. For example if you have the declaration: local L : list of list of table of (integer | real) What will you have to do to check an assignment to L coming from an untyped variable. What about assignments to any of its subcomponents? Are pointers to substructures of L a problem? Also, suppose you wanted to allow for typed recursive structures (there is plenty of evidence that you want truly self-referencing structures in Icon): type tree-node : list of (integer | tree-node) Is this hard to type check? Seems no harder than the above, but again, how does one check modifications to substructures with pointers running around? Also, what if it isn't a tree, i.e., you have a loop? Structures would probably have to be marked during the traversal of the check. Name equivalence would make the checking easier, but it would be far too restrictive and meaningless in Icon. I'm not knocking the idea of this flexible typing. I would *love* to have it. But I see some (interesting!) problems when only some of the objects are typed. 
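The marking idea can be sketched in current Icon for the tree-node example. Everything here is an assumption for illustration: is_tree_node is a hypothetical checker for "list of (integer | tree-node)", and the marking is done with a set of lists already visited rather than with marks stored in the structures.

```icon
# Succeeds (returning x) if x is a list whose elements are integers
# or lists of the same shape. seen holds lists already visited, so a
# structure containing a loop still terminates.
procedure is_tree_node(x, seen)
   local e
   /seen := set([])
   type(x) == "list" | fail
   if member(seen, x) then return x     # already being checked
   insert(seen, x)
   every e := !x do
      if type(e) ~== "integer" then
         is_tree_node(e, seen) | fail
   return x
end

procedure main()
   local t
   t := [1, [2, 3], 4]
   put(t, t)                            # make the structure cyclic
   if is_tree_node(t) then write("ok")
   if not is_tree_node([1, "two"]) then write("rejected")
end
```

The check visits every reachable list once, so repeating it at each assignment is where the cost comes from; it also says nothing about later modifications made through another pointer into the structure.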
I think the feature could be very useful during development, because it allows the gradual introduction of types where needed for testing, and they can be ``turned off'' after development if they cost too much to check all the time. Bill Griswold From tenaglia@fps.mcw.edu Fri Apr 13 05:36:01 1990 Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA26263; Fri, 13 Apr 90 05:36:01 MST Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA00396; Thu, 12 Apr 90 21:13:36 EDT Received: by uwm.edu; id AA04499; Thu, 12 Apr 90 15:09:29 -0500 Message-Id: <9004122009.AA04499@uwm.edu> Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Thu, 12 Apr 90 14:25:33 CDT Received: by mis.mcw.edu (DECUS UUCP w/Smail); Thu, 12 Apr 90 14:06:10 CDT Date: Thu, 12 Apr 90 14:06:10 CDT From: Chris Tenaglia - 257-8765 To: icon-group@cs.arizona.edu Subject: RE: pl/i to c translator in icon X-Vms-Mail-To: UUCP%"m17572@mwvm.mitre.ORG" Status: O In response to : > From: UUCP%"m17572@mwvm.mitre.ORG" 12-APR-1990 13:24:50.57 > To: mis!tenaglia > Subj: pl/i to c translator in icon > > I am looking for a translator from pl/i to c written in icon. It doesn't > have to be perfect. alternatively, a translator from any similar > language to c would be helpful. thanks in advance. > * > * John Artz jartz@mitre.org I'm not sure what the pl/i language is. But if it is something like the PL/M language offered on Intel RMX systems there may be something. Back a few years ago at Astronautics Corp. in Milwaukee I worked on a project to translate PL/M to C. I no longer work there, but they may still have the system spooled on a tape. It was designed to run under Icon 6 in a VAX/VMS environment, and was a peculiar combination of DCL script and a long chain (21 - 27 modules) that would run sequentially and convert good/average/moderately bad PL/M code into plain vanilla K&R C. Perhaps if pl/i is similar, the system could be adapted. 
The contact would be a Wesley Eckles, System Manager, Engineering Computer Services, Astronautics Corp., 4115 N. Teutonia Ave, Milwaukee 53209, (414)447-8200 X450. I can't say if they'd sell it or give it away. Just an idea. Chris Tenaglia (System Manager) Medical College of Wisconsin 8701 W. Watertown Plank Rd. Milwaukee, WI 53226 (414)257-8765 tenaglia@mis.mcw.edu From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Sat Apr 14 13:01:47 1990 Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA27153; Sat, 14 Apr 90 13:01:47 MST Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA26489; Sat, 14 Apr 90 16:01:38 -0400 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Sat, 14 Apr 90 16:01:32 EDT Date: Sat, 14 Apr 90 15:43:44 EDT From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu To: icon-group@cs.arizona.edu Message-Id: <218276@Wayne-MTS> Subject: Overloading of operators Status: O One of the major advantages of strongly typed languages is that you can overload the operators cleanly. This advantage may be even more important than what you gain in protection (perhaps not that much) or efficiency. In Icon you not only don't need to declare the types of parameters; you can't. Declaring the types of parameters is essential to defining overloaded operators unless you want to do the overload resolution within the operator's definition. I haven't found the dynamic type conversion in Icon to be of much use. Some of what you get with it is what you'd get in other languages through generics, such as the ability to write a sort procedure that works for any type of list. (Forget for the moment that sorting is built into Icon.) The one example I know of where dynamic type conversion really helps is in writing print procedures, since the form of what you print depends on the type of the argument. To me, an important principle of language design is `To thine own self be true'. Overloaded operators are contrary to the spirit of Icon. 
If you really want them, you should either do run-time dispatching or be using a different language. I think it would be a mistake to add them to Icon. Paul Abrahams Abrahams%wayne-mts@um.cc.umich.edu

From sbw@naucse.cse.nau.edu Sat Apr 14 16:00:50 1990 Received: from naucse.cse.nau.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA04295; Sat, 14 Apr 90 16:00:50 MST Received: by naucse.cse.nau.edu (5.61/1.34) id AA26480; Sat, 14 Apr 90 16:00:32 -0700 Message-Id: <9004142300.AA26480@naucse.cse.nau.edu> Date: Sat, 14 Apr 90 16:00:30 MST X-Mailer: Mail User's Shell (6.5 4/17/89) From: sbw@naucse.cse.nau.edu (Steve Wampler) To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu, icon-group@cs.arizona.edu Subject: Re: Overloading of operators Status: O

Hmmm, *If* one could always get one's hands on the 'original version' of an operator, then implementing overloaded operators would be fairly simple in Icon - though, as you say, you would have to handle the overloading yourself. (I would view that as consistent with Icon, since you have to handle your own typechecking yourself - consider writing your own version of Icon.) When I first heard of 'string invocation' - I thought that this would be the first step in 'overloading' operators (and builtin functions). I assumed that "+"(1,3) and "write"("hello") would access the 'original' addition and 'write' functions. I was wrong, of course, but if I had been correct in my assumption, we would have overloading now (with adding syntactic support):

   operator +(x,y)
      if type(x) == type(y) == "complex" then return complex_add(x,y)
      return "+"(x,y)
   end

(ignoring mixed-mode for the moment.) and:

   procedure write(a)      # ignoring varargs for the moment...
      if type(a) == "table" then return write_table(a)
      else return "write"(a)
   end

Alas, it is not so, so it will not be. As I said, this form, to me, would have been in the spirit of Icon.
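
A minimal sketch of the run-time dispatching discussed in this thread, written as an ordinary procedure rather than a redefined operator. The record type complex and the procedure name add are hypothetical, chosen only for illustration, and mixed-mode arithmetic is ignored, as in the examples above (a complex added to an integer would fall through to the built-in addition and stop with a run-time error).

```icon
record complex(re, im)        # hypothetical record type, for illustration only

# Dispatch on the run-time type of the arguments; anything that is not
# a pair of complex records falls through to the built-in addition.
procedure add(x, y)
   if type(x) == type(y) == "complex" then
      return complex(x.re + y.re, x.im + y.im)
   return x + y
end

procedure main()
   local z
   write(add(1, 3))                          # built-in integer addition
   z := add(complex(1, 2), complex(3, 4))    # record case
   write(z.re, " + ", z.im, "i")
end
```

The type test relies on type() returning the record name for record values, so no separate tag field is needed.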
-- Steve Wampler {....!arizona!naucse!sbw} {sbw@naucse.cse.nau.edu} From sbw@naucse.cse.nau.edu Sun Apr 15 06:50:14 1990 Received: from naucse.cse.nau.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13890; Sun, 15 Apr 90 06:50:14 MST Received: by naucse.cse.nau.edu (5.61/1.34) id AA03424; Sun, 15 Apr 90 06:49:46 -0700 Message-Id: <9004151349.AA03424@naucse.cse.nau.edu> Date: Sun, 15 Apr 90 06:49:44 MST X-Mailer: Mail User's Shell (6.5 4/17/89) From: sbw@naucse.cse.nau.edu (Steve Wampler) To: icon-group@cs.arizona.edu Subject: Sigh... Status: O The sentence in my last posting that read in part: "consider writing your own version of Icon" was SUPPOSED to say: "consider writing your own version of 'image()'". So much for saturday postings. -- Steve Wampler {....!arizona!naucse!sbw} {sbw@naucse.cse.nau.edu} From icon-group-request@arizona.edu Tue Apr 17 18:34:24 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13815; Tue, 17 Apr 90 18:34:24 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 17 Apr 90 18:35 MST Received: by ucbvax.Berkeley.EDU (5.61/1.41) id AA02739; Tue, 17 Apr 90 18:24:50 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Tue, 17 Apr 90 18:35 MST Date: 18 Apr 90 00:58:46 GMT From: castor!ccs007@ucdavis.ucdavis.EDU Subject: I/O help Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <7097@aggie.ucdavis.edu> Organization: University of California, Davis X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I'm writing a mini-database and would like to know how to load and save a table of tables to disk. I'm using icon on a VAX 11/785 under Ultrix-32 V3.1 (Rev. 9). 
Any help would be greatly appreciated. Everything I've tried doesn't seem to work. Jonathan Sims ccs007@castor.ucdavis.edu

From ralph Tue Apr 17 18:55:01 1990 Date: Tue, 17 Apr 90 18:55:01 MST From: "Ralph Griswold" Message-Id: <9004180155.AA15065@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA15065; Tue, 17 Apr 90 18:55:01 MST To: castor!ccs007@ucdavis.ucdavis.EDU Subject: Re: I/O help Cc: icon-group In-Reply-To: <7097@aggie.ucdavis.edu> Status: O

There are two procedures in the Icon program library for encoding arbitrary Icon data as strings that can be written to files and then restored. The Icon program library is available in several ways, including via FTP. What you want depends on the version of Icon you're running. Version 8 is current. If you need specific advice, let me know. Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph

From sunquest!whm Tue Apr 17 23:31:37 1990 Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA28475; Tue, 17 Apr 90 23:31:37 MST Received: from grissom by sunquest; Tue, 17 Apr 90 23:30:25 MST Date: Tue, 17 Apr 90 23:30:22 MST From: "Bill Mitchell" Message-Id: <9004180630.AA23253@grissom> Received: by grissom; Tue, 17 Apr 90 23:30:22 MST To: arizona!icon-group Subject: Another kudo for Ralph Status: O

I just heard this today and thought I'd pass it along for the group. I quote: Upon recommendation of [University of Arizona] President Henry Koffler, the Arizona Board of Regents has named Ralph E. Griswold to the rank of Regents Professor. This title, the highest faculty rank at the University, is reserved for scholars whose exceptional achievements have brought them national and international distinction, and who have made unique contributions to the quality of the University through distinguished accomplishments in teaching, scholarship, and creative work.
From M13852@mwvm.mitre.ORG Wed Apr 18 07:57:00 1990 Resent-From: M13852@mwvm.mitre.ORG Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA22496; Wed, 18 Apr 90 07:57:00 MST Return-Path: M13852@mwvm.mitre.ORG Received: from mwunix.mitre.org by Arizona.EDU; Wed, 18 Apr 90 07:57 MST Received: from mwvm.mitre.org by mwunix.mitre.org (5.61/SMI-2.2) id AA18834; Wed, 18 Apr 90 10:55:28 -0400 Received: from MWVM by mwvm.mitre.org (IBM VM SMTP R1.2.1) with BSMTP id 9527; Wed, 18 Apr 90 10:56:06 EDT Resent-Date: Wed, 18 Apr 90 07:58 MST Date: Wednesday, 18 Apr 1990 10:56:04 EST From: m13852@mwvm.mitre.ORG Subject: ICON GROUP Resent-To: icon-group@cs.arizona.edu To: icon-group%arizona.edu@mwunix.mitre.ORG Resent-Message-Id: Message-Id: <9004181455.AA18834@mwunix.mitre.org> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group%arizona.edu@mwunix.mitre.ORG Status: O A couple of us here at MITRE are doing some programming in ICON. We are wondering how we can become a part of the ICON Group so that we will be able to pick up the network broadcast messages. Thanks in advance. 
NJBELL@mwvm.mitre.org * * Noel From tenaglia@fps.mcw.edu Thu Apr 19 01:19:41 1990 Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02853; Thu, 19 Apr 90 01:19:41 MST Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.05) with UUCP id AA08638; Thu, 19 Apr 90 02:25:07 EDT Received: by uwm.edu; id AA02347; Thu, 19 Apr 90 00:37:11 -0500 Message-Id: <9004190537.AA02347@uwm.edu> Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Wed, 18 Apr 90 18:10:03 CDT Received: by mis.mcw.edu (DECUS UUCP w/Smail); Wed, 18 Apr 90 16:37:54 CDT Date: Wed, 18 Apr 90 16:37:54 CDT From: Chris Tenaglia - 257-8765 To: icon-group@cs.arizona.edu Subject: RE: I/O help X-Vms-Mail-To: UUCP%"castor!ccs007@ucdavis.ucdavis.EDU" Status: O In regards to : > I'm writing a mini-database and would like to know how to load > and save a table of tables to disk. I'm using icon on a VAX 11/785 under > Ultrix-32 V3.1 (Rev. 9). Any help would be greatly appreciated. > Everything I've tried doesn't seem to work. > > Jonathan Sims > ccs007@castor.ucdavis.edu I also have written such a database. It's composed of lists of lists. It's less efficient, but I can have duplicate keys. I store the data in a file in the format of one record per line, and each field in the record is delimited with char(255). The database was designed to be flexible so it wouldn't have to be redone for each individual application. Each database consists of two parts. The data file and the configuration file. The configuration file points to the data file and describes it and certain default characteristics. When run this configuration is loaded, which tells the database how to load the file, build the screens, etc,... Many of the settings can be changed on the fly, even the data file. So it is conceivable to have several data files attached to a given application model. I've implemented both under VMS and Unix. 
I can see some advantages to storing tables of tables, but it sounds rather complicated. I suppose my database might be thought of as more of a simple tuple editor. Chris Tenaglia (System Manager) Medical College of Wisconsin 8701 W. Watertown Plank Rd. Milwaukee, WI 53226 (414)257-8765 tenaglia@mis.mcw.edu

From goer@sophist.uchicago.EDU Thu Apr 19 18:39:54 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13875; Thu, 19 Apr 90 18:39:54 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Thu, 19 Apr 90 18:39 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 19 Apr 90 20:38:52 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA21462; Thu, 19 Apr 90 20:34:52 CDT Resent-Date: Thu, 19 Apr 90 18:40 MST Date: Thu, 19 Apr 90 20:34:52 CDT From: Richard Goerwitz Subject: question about dereferenced functions Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004200134.AA21462@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

Something I was doing the other day got me into a bit of trouble, and it seems obvious (at least on one level) why. The circumstances surrounding my mistake, however, led me to wonder about the underlying representation of functions - something I'd appreciate guidance on from someone more familiar with the implementation. If I execute a program containing the line -

   write(name(main))

I get "main" written to the standard output. However, if I write,

   func_lst := [main]
   write(name(func_lst[]))

I get "L[1]." This was what I expected, given the documentation. What I did not expect, but in retrospect seems logical, was that when I execute

   write(name(func_lst[]))

I get the error message, "Run-time error 111, variable expected."
Clearly, the global identifier main was dereferenced when it was incorporated into func_lst. So func_lst[] is no longer a variable or identifier. I'm just curious what functions dereference as in Icon, as opposed to, say, C (where "main" usually means &main). Since I have functions and procedures on my mind right now, I might as well ask another question that has been interesting me. If I sort a list of functions and procedures

   [many, any, myprocedure]

what I get is

   [any, many, myprocedure]

apparently in alphabetical order. Is this behavior guaranteed? Or is it just a consequence of the implementation, and something I should not rely on? -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu Fri Apr 20 05:58:02 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA17831; Fri, 20 Apr 90 05:58:02 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 20 Apr 90 05:59 MST Received: by ucbvax.Berkeley.EDU (5.62/1.41) id AA27838; Fri, 20 Apr 90 05:45:26 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Fri, 20 Apr 90 05:59 MST Date: 20 Apr 90 12:42:36 GMT From: cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!rpi!jefu@tut.cis.ohio-state.EDU Subject: Solving a simple problem Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <|{W#QJ#@rpi.edu> Organization: The Museum of Differential Geometry X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

The other day I needed to parse a line of comma separated fields into a list - thus "one,two,three,four,five" -> ["one" "two" "three" "four" "five" ] .
I worked out a way to handle this, but it was inelegant. I tried at first to use some sort of string generating function to parse things out, but couldn't get it to work right. That is, I wanted to be able to say something like :

   res := parse(read(file))

where parse would look something like :

   procedure parse (line)
      local res
      every line ? (x := something()) do put(res,x)
   end

Where something would return the next field in line each time it is resumed. I'm sure there is a nice way to do this that I'm just missing. So, my question is - what is the most _elegant_ way to solve this problem - preferably using generators of some sort? -- jeff putnam (jefu@pawl.rpi.edu)

From goer@sophist.uchicago.EDU Fri Apr 20 08:08:04 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA27282; Fri, 20 Apr 90 08:08:04 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Fri, 20 Apr 90 08:07 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 20 Apr 90 10:06:05 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA22198; Fri, 20 Apr 90 10:02:05 CDT Resent-Date: Fri, 20 Apr 90 08:08 MST Date: Fri, 20 Apr 90 10:02:05 CDT From: Richard Goerwitz Subject: tokenizing Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004201502.AA22198@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

Re: "one,two,three,four,five" -> ["one" "two" "three" "four" "five" ] . The method used:

   procedure parse (line)
      local res
      every line ? (x := something()) do put(res,x)
   end

I don't see anything wrong with this method. There's one thing to watch out for, though. If you say "every s ? move(1)", move will get called once, and that's it. What's more, it will reset &pos to the position it had before it was called.
You almost always want to say "while tab(some matching procedure) do { something else }." Also, if you are going to "put" something into something else, make sure that the something is not &null (as above). Remember to initialize the list by saying "res := []" or "res := list(0)." I'm not sure precisely what you will be using your tokenized lists for, but anyway here's an alternative way of doing things (NB: *untested*):

   procedure tokenize(s)
      token_list := list()
      s ? {
         while put(token_list, 1(tab(find(",")),move(1)))
         put(token_list,tab(0))
         }
      return token_list
   end

Another option, using your format above, is:

   procedure parse (line)
      local res
      res := list()
      line ? {
         while x := tab(many(~',')) do
            { put(res,x); move(1) }
         }
      return res
   end

If you wish to allow greater flexibility in your input strings, add characters to ',' above. Personally, I'd tend to think it better to permit "hello,world", "hello, world", and an accidental "hello, world,". It might also be nice to be able to have spaces in the tokens themselves (e.g. "hello, how, is, George Washington", where George Washington is really a single lexical item).

   procedure parse (line)
      local res
      res := list()
      line ? {
         while x := tab(many(~',')) do {
            if (=",", (tab(many(' \t')) | &null)) | pos(0)
            then put(res,x)
            else stop("Cannot parse ",line,".")
            }
         }
      return res
   end

Again, this is untested. What it should allow you to do is input "hello, how, are you doing" and get back ["hello", "how", "are you doing"]. Icon is indeed a very, very good language for doing things like tokenizing strings. That's one of the things that got me to thinking a while back whether Prolog had been implemented in Icon. It would be kinda like having one's cake and eating it, too.
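
One way to get the generator-based version asked about in this thread is to let a procedure suspend each field in turn and collect the results with every. This is only a sketch: the procedure name fields is hypothetical, and it assumes that a single comma separates fields, with no surrounding white space to strip.

```icon
# Hypothetical helper: generate the comma-separated fields of s,
# one field per resumption.
procedure fields(s)
   local field
   s ? {
      while field := tab(upto(',')) do {
         suspend field        # hand back one field, wait to be resumed
         move(1)              # then step over the comma
         }
      suspend tab(0)          # whatever follows the last comma
      }
end

# Collect every generated field into a list.
procedure parse(line)
   local res
   res := []                  # initialize before using put()
   every put(res, fields(line))
   return res
end

procedure main()
   every write(!parse("one,two,three,four,five"))
end
```

Suspending a variable rather than the tab(upto(',')) expression itself keeps upto from being resumed to look for a later comma; an empty field between two adjacent commas comes back as the empty string.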
From icon-group-request@arizona.edu Fri Apr 20 19:10:03 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA22426; Fri, 20 Apr 90 19:10:03 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 20 Apr 90 19:03 MST Received: by ucbvax.Berkeley.EDU (5.62/1.41) id AA17517; Fri, 20 Apr 90 18:46:23 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Fri, 20 Apr 90 19:10 MST Date: 21 Apr 90 00:38:33 GMT From: limbo!taylor@apple.COM Subject: What's wrong with this Mac Icon program? Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <702@limbo.Intuitive.Com> Organization: Intuitive Systems, Mountain View, CA: +1 (415) 966-1151 X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I'm playing around with the Icon programming language, and am having some very strange problems trying to get my sample program working. 
The Icon I'm working with is ProIcon for the Macintosh, and the sample program is:

==========================================================
# Compute simple readability score of given ASCII text file
#
# Based on a program presented in "Icon Programming for Humanists"
# by Alan Corre', Prentice Hall 1990

global wordlength

procedure main()
   compute_readability(getfile("Choose file for readability score"))
end

procedure compute_readability(filename)
   # main work loop of the program
   local total_words, sentence, sentences

   total_words := 0
   wordlength := 0
   sentences := 0

   every sentence := get_sentence(filename) do {
      total_words +:= count_sentence(sentence)
      sentences +:= 1
      }
   display(total_words, wordlength, sentences)
   return
end

procedure get_sentence(filename)
   # gets a sentence from the file, even if less than a line or more
   # than a single line of input
   static markers
   local fileid, sentence, line, substring
   initial markers := '.!?'

   fileid := open(filename) | { write("couldn't open file"); return fail }
   sentence := ""

   while line := read(fileid) do {
      line ? {
         while substring := tab(upto(markers)) do {
            # if marker found add it to sentence incl. marker itself
            sentence ||:= (substring || tab(many(markers)))
            suspend sentence
            # skip blanks at beginning of next sentence
            tab(many(' '))
            sentence := ""
            }
         # if the line is not finished, append rest to sentence
         if not pos(0) then sentence ||:= (line[&pos:0] || " ")
         }
      }
   close(fileid)
end

procedure getword_from_sentence(sentence)
   # produces a list of words from the given sentence
   static chars, punct
   local word
   initial {
      chars := (&lcase ++ &ucase ++ '1234567890\'-')
      punct := ' .,?";:!'
      }

   sentence ? {
      tab(many(' '))            # skip leading white space
      while word := tab(many(chars)) do {
         tab(many(punct))
         suspend word
         }
      }
end

procedure count_sentence(sentence)
   # number of words in a sentence
   local total, word

   total := 0
   every word := getword_from_sentence(sentence) do {
      wordlength +:= *word
      total +:= 1
      }
   return total
end

procedure display(words, wordlength, sentences)
   local average_sentence_length, average_word_length

   average_sentence_length := real(words) / real(sentences)
   average_word_length := real(wordlength) / real(words)

   write("Total number of words in file: ", words)
   write("Total number of sentences in file: ", sentences)
   write("Total combined word length: ", wordlength)
   write("Average sentence length = ", average_sentence_length)
   write("Average word length = ", average_word_length)
   write()
   write("Readability score = ", average_sentence_length * average_word_length)
end
==========================================================

The problem I'm having with the program is that it doesn't return the correct results! That is, if I test it on files that are sufficiently small that I can count the words in the file, it gives me expected results. But if I try with an 8100 word file (word count from Microsoft Word) then the Icon program only thinks that there are about 3100 words therein! What's worse is that I wrote a quick C program to compute the same values and it returns much more reasonable values: 8500 words for the file (the difference I assume is based on how Word defines individual word separators). If someone can help me track down what's wrong with the above program, I would be ever-so-grateful! Thanks!
-- Dave Taylor Intuitive Systems Macintosh Editor Mountain View, California "Computer Language" Magazine taylor@limbo.intuitive.com or {uunet!}{decwrl,apple}!limbo!taylor

From ralph Sat Apr 21 06:34:19 1990 Date: Sat, 21 Apr 90 06:34:19 MST From: "Ralph Griswold" Message-Id: <9004211334.AA20571@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA20571; Sat, 21 Apr 90 06:34:19 MST To: limbo!taylor@apple.COM Subject: Re: What's wrong with this Mac Icon program? Cc: icon-group Status: O

The most obvious problem is in the loop that generates the words:

+++++++++++++++++++++++++++++++
   sentence ? {
      tab(many(' '))            # skip leading white space
      while word := tab(many(chars)) do {
         tab(many(punct))
         suspend word
         }
+++++++++++++++++++++++++++++++

The expression tab(many(chars)) succeeds only if there is a character in chars at the current position. While that's probably true the first time around, it most likely won't be the second time around, so most of the words in the sentence will not be generated.
The better method is

+++++++++++++++++++++++++++++++
   sentence ? {
      while tab(upto(chars)) do {
         word := tab(many(chars))
         suspend word
         }
+++++++++++++++++++++++++++++++

Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph

From goer@sophist.uchicago.EDU Sat Apr 21 07:08:53 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA21796; Sat, 21 Apr 90 07:08:53 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Sat, 21 Apr 90 07:05 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Sat, 21 Apr 90 09:04:21 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA23541; Sat, 21 Apr 90 09:00:19 CDT Resent-Date: Sat, 21 Apr 90 07:10 MST Date: Sat, 21 Apr 90 09:00:19 CDT From: Richard Goerwitz Subject: wordcounts Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004211400.AA23541@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

A recent poster comments how Alan Corre's readability program does not seem to come up with correct values. The following is a series of suggestions as to why this might be occurring. If I mention things you already know, I beg your indulgence. It's better to be safe (say too much) than sorry (not say enough). First, please note that the readability score program gives only a very rough count. Also be sure (you probably were, but I'll mention it anyway) that the program is fed ASCII files only. If you have anything fancy going on with the eighth bit of your characters, the program will not work correctly. What it will do under such circumstances (I glean this from a quick perusal of the source) is get the number of sentences about right, but undercount the number of words.
Finally, note that the program is sentence-based, so if your input file has a lot of headers or other text-blocks not broken up into sentences, you'll get a wrong sentence count (the word count should not be drastically affected). In general, if you want to know what is going on inside an Icon program, "compile" it with the -t option, or else stick a line "&trace := -1" in your code near the beginning of where you want to start tracing. You probably knew this already, but I figured it wouldn't hurt to mention it just in case. One apparent bug in the program, incidentally, is that in the procedure which actually slices out individual words, "punct" is not simply defined as the inverse of &lcase ++ &ucase ++ &digits. As a result, it cannot parse

   (Hi, this is an aside.) But this is real text (or at least test).

I'd just point out again that the program is meant as a simple illustration, probably meant to work on a fairly restricted range of texts. Certainly AC could have broadened it to encompass a lot more texts, but then it would have lost its pedagogical value. I hope very much that this helps. Please follow up if it doesn't. -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer

BTW: In the posting to which I am responding, not all of the program was copied. In particular, getfile() seemed to be missing. I added it in a manner different from how the author apparently did it. I don't have Corre's original here, so if this seems a hopeless perversion of the pristine getfile(), think of it as a bit of programmer's licence.
The program will compile now, and people learning Icon can use it to follow what is going on:

-------------------------------------------------------------------------
# Compute simple readability score of given ASCII text file
#
# Based on a program presented in "Icon Programming for Humanists"
# by Alan Corre', Prentice Hall 1990

global wordlength

procedure main(a)
   every in_list := getfile(a) do {
      compute_readability(in_list[1], in_list[2])
      }
end

procedure getfile(a)
   usage := "usage: readable file1 [file2 [file3 [etc...]]]"
   if *a = 0 then stop(usage)
   while filename := get(a) do {
      if intext := open(filename,"r")
      then suspend [filename, intext]
      else write(&errout,"readable: Cannot open ",filename,".")
      }
end

procedure compute_readability(filename,file)
   # main work loop of the program
   local total_words, sentence, sentences

   total_words := 0
   wordlength := 0
   sentences := 0

   every sentence := get_sentence(file) do {
      total_words +:= count_sentence(sentence)
      sentences +:= 1
      }
   display(filename,total_words, wordlength, sentences)
   return
end

procedure get_sentence(fileid)
   # gets a sentence from the file, even if less than a line or more
   # than a single line of input
   static markers
   local sentence, line, substring
   initial markers := '.!?'

   sentence := ""

   while line := read(fileid) do {
      line ? {
         while substring := tab(upto(markers)) do {
            # if marker found add it to sentence incl. marker itself
            sentence ||:= (substring || tab(many(markers)))
            suspend sentence
            # skip blanks at beginning of next sentence
            tab(many(' '))
            sentence := ""
            }
         # if the line is not finished, append rest to sentence
         if not pos(0) then sentence ||:= (line[&pos:0] || " ")
         }
      }
   close(fileid)
end

procedure getword_from_sentence(sentence)
   # produces a list of words from the given sentence
   static chars, punct
   local word
   initial {
      chars := (&lcase ++ &ucase ++ '1234567890\'-')
      punct := ' .,?";:!'       # here's the apparent bug;
                                # try punct := ~chars at first
      }

   sentence ? {
      tab(many(' '))            # skip leading white space
      while word := tab(many(chars)) do {
         tab(many(punct))
         suspend word
         }
      }
end

procedure count_sentence(sentence)
   # number of words in a sentence
   local total, word

   total := 0
   every word := getword_from_sentence(sentence) do {
      wordlength +:= *word
      total +:= 1
      }
   return total
end

procedure display(filename,words, wordlength, sentences)
   local average_sentence_length, average_word_length

   average_sentence_length := real(words) / real(sentences)
   average_word_length := real(wordlength) / real(words)

   write("\nFilename: ",filename)
   write("Total number of words in file: ", words)
   write("Total number of sentences in file: ", sentences)
   write("Total combined word length: ", wordlength)
   write("Average sentence length = ", average_sentence_length)
   write("Average word length = ", average_word_length)
   write()
   write("Readability score = ", average_sentence_length * average_word_length)
end

# taylor@limbo.intuitive.com or {uunet!}{decwrl,apple}!limbo!taylor

From nowlin@iwtqg.att.COM Sat Apr 21 15:51:50 1990 Resent-From: nowlin@iwtqg.att.COM Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA04389; Sat, 21 Apr 90 15:51:50 MST Received: from att-in.att.com by Arizona.EDU; Sat, 21 Apr 90 15:49 MST Resent-Date: Sat, 21 Apr 90 15:53 MST Date: Sat, 21 Apr 90 16:23 CDT From: nowlin@iwtqg.att.COM Subject: RE: What's wrong with ... Resent-To: icon-group@cs.arizona.edu To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu Resent-Message-Id: Message-Id: Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268) X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu Status: O

I'm working my way through the Humanist book. I was intrigued by the readability index program that Dave Taylor posted since it failed miserably on the "read.me" file included on the disk that comes with the Humanist book. Even after Ralph's recommended fix. This is no real fault of the program.
This "read.me" file contains file names that contain embedded periods and other characters that are normally used for punctuation. I modified the program to work on this "read.me" file by changing the algorithm for finding an end of sentence and by adding some characters to the cset allowed for words. My modified program still has some problems. The count of words and sentences is correct for the "read.me" file but the length of words is now wrong due to the end-of-sentence character being counted in the length of the last word in a sentence. I don't know. Maybe that's OK. It's definitely non-trivial to parse text! The text analysis assumptions that work fine for plain vanilla text are mostly invalid for technical documents. I'd hate to have to parse text in any language besides Icon. Jerry Nowlin (...!att!iwtqg!nowlin)

From goer@sophist.uchicago.EDU Sun Apr 22 22:24:48 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA26298; Sun, 22 Apr 90 22:24:48 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Sun, 22 Apr 90 22:18 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 23 Apr 90 00:17:15 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA25575; Mon, 23 Apr 90 00:13:12 CDT Resent-Date: Sun, 22 Apr 90 22:24 MST Date: Mon, 23 Apr 90 00:13:12 CDT From: Richard Goerwitz Subject: regular expressions for icon Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004230513.AA25575@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

I recall several of us discussing some time ago the fact that Icon does not have first-class data-object patterns like Snobol, and that as a second best we might like to see an egrep()-like function added.
After having thought about this for a while, I've decided that adding
something like this would be a bit like gilding the lily. Besides, even
in C regex is a library function, and a nasty one at that - one which
really doesn't belong in Icon's core language definition.

Most of the times I want to use an egrep-like function are when I want
to construct recognizers at run-time from user input. For most purposes,
coexpressions suffice. Where these don't offer elegant solutions (or
speed), I tend to use system("egrep...") or open("egrep...","pr"). This
isn't portable, and so I wrote an icon-ish egrep command that behaves a
lot like find().

The following is really an alpha test version of that command. I'd
appreciate input from others, especially bug reports. The code is not
well-commented. It's a bit redundant as well. I made no efforts to
compress or optimize it. I just wanted to make it work.

One thing that is particularly nice about Icon is that it let me handle
things like position in the current string by using string scanning.
Hash tables also served as the basic state-recording structure. I
avoided using coexpressions by saving operations in lists (e.g.
[function, arg, resulting state]) so that, at run time, I could just say
if lst[1](lst[2]) then go to state lst[3]. By doing this, I added just a
little speed and portability (not everyone has coexpressions), and
perhaps a tad greater clarity as well.

Please, please send me comments.

########################################################################
#
#    Name:    find_re.icn
#
#    Title:   "Find" Regular Expression
#
#    Author:  Richard L. Goerwitz
#
#    Date:    April 22, 1990
#
########################################################################
#
#  DESCRIPTION:  Find_re is similar to the Icon builtin function
#  find(), except that it takes as its first argument a regular
#  expression of the sort used by the Unix egrep command.  For those
#  unfamiliar with the notion of regular expressions, they represent
#  a simple string representation of a finite state transition
#  network which can be converted into an automaton capable of
#  recognizing patterns in strings of characters.  The specific
#  symbols used, and the purposes they are used for, can be gleaned
#  from the Unix man pages for egrep and the regex library
#  functions.  In even more basic terms, regular expressions can be
#  thought of as a very flexible and powerful set of "wildcards."
#
#  DIFFERENCES between egrep and find_re:  Find_re utilizes the same
#  basic language as egrep.  The only major differences are:  1) That
#  find_re is a bit more rigid in its syntax (e.g. find_re will
#  reject vagaries like 'a*?'), and 2) that find_re utilizes the
#  intrinsic Icon data structures and escaping conventions, rather
#  than those of any particular Unix variant.
#
#  BUGS:  No attempt has been made to optimize find_re.  For work
#  that requires a quick response, you'll have to use something like
#  system("egrep...")!  Note, though, that while find_re takes a
#  while to compile a regular expression, find_re at least has enough
#  sense to store the resulting automaton for quick access in
#  subsequent calls.
#
########################################################################

global state_table

procedure find_re(re, s, i, j)

    static FSTN_table
    initial FSTN_table := table()

    /re & stop("find_re: Call me with at least one argument!")
    /i := \&pos | 1
    /s := \&subject | stop("find_re: No string.")
    /j := *s+1

    if /FSTN_table[re] then {
        tokenized_re := tokenize(re)
        MakeFSTN(tokenized_re) | er(re,2)
        /FSTN_table[re] := copy(state_table)
    }

    s ?
    {
        tab(x := i to j) &
        apply_FSTN(&null,FSTN_table[re]) &
        (suspend x)
    }

end

procedure apply_FSTN(ini,tbl)

    static s_tbl
    local POS, tmp

    POS := &pos
    /ini := 1 & s_tbl := tbl
    fin := 2

    if ini == 0 then {
        return 0
    }

    if tmp := !s_tbl[ini] &
        tab(tmp[1](tmp[2])) then {
        if tmp[3] = fin
        then return 0
        else {
            return apply_FSTN(tmp[3]) |
                (&pos := POS, fail)
        }
    }
    else &pos := POS

end

procedure tokenize(s)

    token_list := list()
    s ? {
        while chr := move(1) do {
            if chr == "\\"
            # it can't be a metacharacter; remove the \ and "put"
            # the integer value of the next chr into token_list
            then put(token_list,ord(move(1))) | er(s,2,chr)
            else if any('*+()|?.$^',chr)
            then put(token_list,-ord(chr))
            else {
                case chr of {
                    "["    : {
                        every next_one := find("]")
                        \next_one ~= &pos | er(s,2,chr)
                        put(token_list,-ord(chr))
                    }
                    "]"    : {
                        if &pos = (\next_one+1)
                        then put(token_list,-ord(chr)) & next_one := &null
                        else put(token_list,ord(chr))
                    }
                    default: put(token_list,ord(chr))
                }
            }
        }
    }

    token_list := UnMetaBrackets(token_list)

    fixed_length_token_list := list(*token_list)
    every i := 1 to *token_list do
        fixed_length_token_list[i] := token_list[i]
    return fixed_length_token_list

end

procedure UnMetaBrackets(l)

    # Since brackets delineate a cset, it doesn't make
    # any sense to have metacharacters inside of them.
    # UnMetaBrackets makes sure there are no metacharacters
    # inside of the brackets.

    tmplst := list(); i := 0
    Lb := -ord("[")
    Rb := -ord("]")

    while (i +:= 1) <= *l do {
        if l[i] = Lb then {
            put(tmplst,l[i])
            until l[i +:= 1] = Rb do
                put(tmplst,abs(l[i]))
            put(tmplst,l[i])
        }
        else put(tmplst,l[i])
    }
    return tmplst

end

procedure MakeFSTN(l,INI,FIN)

    # MakeFSTN recursively descends through the tree structure
    # implied by the tokenized string, l, recording in (global)
    # state_table a list of operations to be performed, and the
    # initial and final states which apply to them.
    static Lp, Rp, Sl, Lb, Rb, Caret_inside, Dot, Dollar, Caret_outside
    initial {
        Lp := -ord("("); Rp := -ord(")")
        Sl := -ord("|")
        Lb := -ord("["); Rb := -ord("]"); Caret_inside := ord("^")
        Dot := -ord("."); Dollar := -ord("$"); Caret_outside := -ord("^")
    }

    /INI := NextState("new") & state_table := table()
    /FIN := NextState()

    # I haven't bothered to test for empty lists everywhere.
    if *l = 0 then {
        /state_table[INI] := []
        put(state_table[INI],[zSucceed,,FIN])
        return
    }

    # HUNT DOWN THE SLASH (ALTERNATION OPERATOR)
    ini := INI; inter := NextState()
    inter2:= NextState()
    every i := 1 to *l do {
        if l[i] = Sl & tab_bal(l,Lp,Rp) = i then {
            if i = 1 then er(l,2,char(abs(l[i]))) else {
                MakeFSTN(l[1:i],inter2,FIN)
                MakeFSTN(l[i+1:0],inter,FIN)
                /state_table[ini] := []
                put(state_table[ini],[apply_FSTN,inter2,0])
                put(state_table[ini],[apply_FSTN,inter,0])
                return
            }
        }
    }

    # HUNT DOWN PARENTHESES
    ini := INI; fin := FIN
    if l[1] = Lp then {
        i := tab_bal(l,Lp,Rp) | er(l,2,"(")
        inter := NextState()
        if any('*+?',char(abs(0 > l[i+1]))) then {
            case l[i+1] of {
                -ord("*") : {
                    /state_table[ini] := []
                    put(state_table[ini],[apply_FSTN,inter,0])
                    MakeFSTN(l[2:i],ini,ini)
                    MakeFSTN(l[i+2:0],inter,fin)
                    return
                }
                -ord("+") : {
                    inter2 := NextState()
                    /state_table[inter2] := []
                    MakeFSTN(l[2:i],ini,inter2)
                    put(state_table[inter2],[apply_FSTN,inter,0])
                    MakeFSTN(l[2:i],inter2,inter2)
                    MakeFSTN(l[i+2:0],inter,fin)
                    return
                }
                -ord("?") : {
                    /state_table[ini] := []
                    put(state_table[ini],[apply_FSTN,inter,0])
                    MakeFSTN(l[2:i],ini,inter)
                    MakeFSTN(l[i+2:0],inter,fin)
                    return
                }
            }
        }
        else {
            MakeFSTN(l[2:i],ini,inter)
            MakeFSTN(l[i+1:0],inter,fin)
            return
        }
    }
    else {    # I.E. l[1] NOT = Lp (left parenthesis as -ord("("))
        every i := 1 to *l do {
            case l[i] of {
                Lp : {
                    inter := NextState()
                    MakeFSTN(l[1:i],ini,inter)
                    MakeFSTN(l[i:0],inter,fin)
                    return
                }
                Rp : er(l,2,")")
            }
        }
    }

    # NOW, HUNT DOWN BRACKETS
    ini := INI; fin := FIN
    if l[1] = Lb then {
        i := tab_bal(l,Lb,Rb) | er(l,2,"[")
        inter := NextState()
        tmp := ""; every tmp ||:= char(l[2 to i-1])
        if Caret_inside = l[2]
        then tmp := ~cset(Expand(tmp[2:0]))
        else tmp := cset(Expand(tmp))
        if any('*+?',char(abs(0 > l[i+1]))) then {
            case l[i+1] of {
                -ord("*") : {
                    /state_table[ini] := []
                    put(state_table[ini],[apply_FSTN,inter,0])
                    put(state_table[ini],[any,tmp,ini])
                    MakeFSTN(l[i+2:0],inter,fin)
                    return
                }
                -ord("+") : {
                    inter2 := NextState()
                    /state_table[ini] := []
                    put(state_table[ini],[any,tmp,inter2])
                    /state_table[inter2] := []
                    put(state_table[inter2],[apply_FSTN,inter,0])
                    put(state_table[inter2],[any,tmp,inter2])
                    MakeFSTN(l[i+2:0],inter,fin)
                    return
                }
                -ord("?") : {
                    /state_table[ini] := []
                    put(state_table[ini],[apply_FSTN,inter,0])
                    put(state_table[ini],[any,tmp,inter])
                    MakeFSTN(l[i+2:0],inter,fin)
                    return
                }
            }
        }
        else {
            /state_table[ini] := []
            put(state_table[ini],[any,tmp,inter])
            MakeFSTN(l[i+1:0],inter,fin)
            return
        }
    }
    else {    # I.E. l[1] not = Lb
        every i := 1 to *l do {
            case l[i] of {
                Lb : {
                    inter := NextState()
                    MakeFSTN(l[1:i],ini,inter)
                    MakeFSTN(l[i:0],inter,fin)
                    return
                }
                Rb : er(l,2,"]")
            }
        }
    }

    # FIND INITIAL SEQUENCES OF POSITIVE INTEGERS, CONCATENATE THEM
    if i := match_positive_ints(l) then {
        inter := NextState()
        tmp := Ints2String(l[1:i])
        /state_table[INI] := []
        put(state_table[INI],[match,tmp,inter])
        MakeFSTN(l[i:0],inter,FIN)
        return
    }

    # OKAY, CLEAN UP ALL THE JUNK THAT'S LEFT
    i := 0
    while (i +:= 1) <= *l do {
        case l[i] of {
            Dot          : { Op := any; Arg := &cset }
            Dollar       : { Op := pos; Arg := 0 }
            Caret_outside: { Op := pos; Arg := 1 }
            default      : { Op := match; Arg := char(0 < l[i]) }
        } | er(l,2,char(abs(l[i])))
        ini := INI; fin := FIN
        inter := NextState()
        if any('*+?',char(abs(0 > l[i+1]))) then {
            case l[i+1] of {
                -ord("*") : {
                    /state_table[ini] := []
                    put(state_table[ini],[apply_FSTN,inter,0])
                    put(state_table[ini],[Op,Arg,ini])
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("+") : {
                    inter2 := NextState()
                    /state_table[ini] := []
                    put(state_table[ini],[Op,Arg,inter2])
                    /state_table[inter2] := []
                    put(state_table[inter2],[apply_FSTN,inter,0])
                    put(state_table[inter2],[Op,Arg,inter2])
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("?") : {
                    /state_table[ini] := []
                    put(state_table[ini],[apply_FSTN,inter,0])
                    put(state_table[ini],[Op,Arg,inter])
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
            }
        }
        else {
            /state_table[ini] := []
            put(state_table[ini],[Op,Arg,inter])
            MakeFSTN(l[i+1:0],inter,FIN)
            return
        }
    }

    # WE SHOULD NOW BE DONE INSERTING EVERYTHING INTO state_table
    # IF WE GET TO HERE, WE'VE PARSED INCORRECTLY!
    er(l,4)

end

procedure NextState(new)
    static nextstate
    if \new then nextstate := 0
    return nextstate +:= 1
end

procedure er(x,i,elem)
    writes(&errout,"Error number ",i," parsing ",image(x)," at ")
    if \elem
    then write(&errout,image(elem),".")
    else write(&errout,"(?).")
    exit(i)
end

procedure zSucceed()
    return .&pos
end

procedure Expand(s)

    s2 := ""
    s ?
    {
        s2 ||:= ="^"
        s2 ||:= ="-"
        while s2 ||:= tab(find("-")-1) do {
            if (c1 := move(1), ="-",
                c2 := move(1),
                c1 << c2)
            then every s2 ||:= char(ord(c1) to ord(c2))
            else s2 ||:= 1(move(2), not(pos(0))) | er(s,2,"-")
        }
        s2 ||:= tab(0)
    }
    return s2

end

procedure tab_bal(l,i1,i2)
    i := 0
    i1_count := 0; i2_count := 0
    while (i +:= 1) <= *l do {
        case l[i] of {
            i1 : i1_count +:= 1
            i2 : i2_count +:= 1
        }
        if i1_count = i2_count
        then suspend i
    }
end

procedure match_positive_ints(l)

    # Matches the longest sequence of positive integers in l,
    # beginning at l[1], which neither contains, nor is followed
    # by a negative integer.  Returns the first position
    # after the match.  Hence, given [55, 55, 55, -42, 55],
    # match_positive_ints will return 3.  [55, -42] will cause
    # it to fail rather than return 1 (NOTE WELL!).

    every i := 1 to *l do {
        if l[i] < 0
        then return (3 < i) - 1
    }

end

procedure Ints2String(l)
    tmp := ""
    every tmp ||:= char(!l)
    return tmp
end

From nowlin@iwtqg.att.COM Mon Apr 23 07:05:37 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
    id AA12657; Mon, 23 Apr 90 07:05:37 MST
Received: from att-in.att.com by Arizona.EDU; Mon, 23 Apr 90 07:04 MST
Resent-Date: Mon, 23 Apr 90 07:05 MST
Date: Mon, 23 Apr 90 08:45 CDT
From: nowlin@iwtqg.att.COM
Subject: RE: regular expressions for icon
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id:
Message-Id:
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Status: O

It just so happens I have a set of tests for regular expression matching
since I had an Icon version of grep to test a few years ago. I wrote a
main program to test the version of find_re() posted to this group and
it did OK.
Of the 100 tests I have, it only matched two patterns it shouldn't have
and only missed 17 patterns that it should have matched.

I've included the main program that uses the find_re() procedure to
simulate a grep command here. I get really bugged by partial postings
that people have to hack a front on before they can try.

procedure main(a)

    # the usage message
    usage := "Usage: RGgrep pattern [ file ... ]"

    # the first program argument must be the pattern
    pattern := get(a) | stop("I at least need a pattern\n",usage)

    # trick the program into using standard input if no files were passed
    if *a = 0 then a := [&null]

    # the rest of the arguments are files to search through
    every f := !a do {

        # if the file isn't null try to open it
        if \f then in := open(f) | stop("I can't open '",f,"'")

        # otherwise use standard input
        else in := &input

        # if there is only one file skip printing the file name
        if *a = 1 then f := ""

        # otherwise tack on a colon
        else f ||:= ":"

        # read all the lines
        every l := !in do {

            # scan the line for the pattern
            l ? {

##### BELOW IS THE CALL TO the find_re() procedure posted earlier #####

                # if the pattern is found print the line
                if find_re(pattern) then write(f,l)
            }
        }

        # close the input file if one was opened
        if in ~=== &input then close(in)
    }

end

I'll post the tests in a separate message since there's an Icon program
to run the tests (naturally) along with the file of tests and they total
more than 100 lines. The comments in the test program should be enough
to use the program and the included tests.
Jerry Nowlin
(...!att!iwtqg!nowlin)

From nowlin@iwtqg.att.COM Mon Apr 23 07:18:33 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
    id AA14809; Mon, 23 Apr 90 07:18:33 MST
Received: from att-in.att.com by Arizona.EDU; Mon, 23 Apr 90 07:15 MST
Resent-Date: Mon, 23 Apr 90 07:19 MST
Date: Mon, 23 Apr 90 08:58 CDT
From: nowlin@iwtqg.att.COM
Subject: regular expression tests
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id:
Message-Id:
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Status: O

Below is a program that will test any grep-like command and a file of
the tests that it uses. I've run these tests with the standard UNIX grep
and it only fails on the tests that use the '+' character to match at
least one and any subsequent number of characters. The grep on my system
doesn't like that particular expression. The Icon grep I have does like
it and that's why these tests are still in this suite.

If anyone finds bugs in these tests let me know. Your software is only
as good as your tests.

Jerry Nowlin
(..!att!iwtqg!nowlin)

-------------------------- test program follows --------------------------

# This program is to test commands that work like the standard UNIX grep.
# That is, they can read from standard input and their first argument is a
# regular expression that is searched for in the input.  The first argument
# to the program is the name of a command to test.  The second argument to
# the program is a file containing tests to be executed.  For example, the
# following is a valid invocation of this program:
#
#     tst grep test.pats
#
# Each line in the file of tests must contain a string followed by a
# regular expression followed by a comment.  I use the comment field to
# indicate whether or not the test is supposed to succeed or fail but
# anything you want can be included in the comment field.  The token
# separating the three fields is a pipe symbol.  This was completely
# arbitrary.  You can change it to anything you want.  The following is a
# valid test line:
#
#     abbbbc|ab*c|YES
#
# This test should succeed.  If it doesn't there's a problem with the
# program being tested.
#
# I use this program to test a version of grep written in Icon.  Since I
# can test the standard versions of grep with this program too, I can
# compare the Icon version to the standard.

procedure main(args)

    # this program requires two arguments
    if *args ~= 2 then stop("Usage: tst cmd file")

    # the command to be tested is the first argument
    cmd := get(args)
    write("Testing the ",cmd," command")

    # the file of tests is the second argument
    file := get(args)
    in := open(file) | stop("I can't open '",file,"'")

    # each line should contain a test
    every !in ? {

        str := tab(upto('|'))
        move(1)
        pat := tab(upto('|'))
        move(1)
        com := tab(0)

        write(com,": searching '",str,"' for pattern '",pat,"'")

        # quote the string
        if not find("\"",str) then
            str := "\"" || str || "\""
        else if not find("'",str) then
            str := "'" || str || "'"
        else
            stop("Bad string: ",str)

        # quote the pattern
        if not find("\"",pat) then
            pat := "\"" || pat || "\""
        else if not find("'",pat) then
            pat := "'" || pat || "'"
        else
            stop("Bad pattern: ",pat)

        # invoke the command
        system("echo " || str || " | " || cmd || " " || pat)
    }

    # close the test file
    close(in)

end

-------------------------- test file follows --------------------------
ac|ab|NO
ac|abc|NO
ac|ab*c|YES
ac|ab+c|NO
ac|a.*c|YES
ac|a.+c|NO
abc|ab|YES
abc|abc|YES
abc|ab*c|YES
abc|ab+c|YES
abc|a.*c|YES
abc|a.+c|YES
abbbbc|ab|YES
abbbbc|abc|NO
abbbbc|ab*c|YES
abbbbc|ab+c|YES
abbbbc|a.*c|YES
abbbbc|a.+c|YES
akc|ab|NO
akc|abc|NO
akc|ab*c|NO
akc|ab+c|NO
akc|a.*c|YES
akc|a.+c|YES
akjhgfc|ab|NO
akjhgfc|abc|NO
akjhgfc|ab*c|NO
akjhgfc|ab+c|NO
akjhgfc|a.*c|YES
akjhgfc|a.+c|YES
this is it|^this|YES
this is it|^his|NO
this is it|his|YES
this is it|his$|NO
this is it|it$|YES
match carat|.*^|NO
match (^) carat|.*^|YES
(^) match carat|.^|YES
(^) match carat|(^|YES
match dollar|$.*|NO
match ($) dollar|$.*|YES
($) match dollar|$.|YES
($) match dollar|$)|YES
no stars|^**|YES
no stars|^*+|NO
*#$%&@!_+=:|^**|YES
*#$%&@!_+=:|^*+|YES
no stars|**|YES
no stars|*+|NO
*#$%&@!_+=:|**|YES
*#$%&@!_+=:|*+|YES
no pluses|^+*|YES
no pluses|^++|NO
+#$%&@!_*=:|^+*|YES
+#$%&@!_*=:|^++|YES
no pluses|+*|YES
no pluses|++|NO
+#$%&@!_*=:|+*|YES
+#$%&@!_*=:|++|YES
ABCabcdefDEF|^[a-z]|NO
ABCabcdefDEF|[a-z]$|NO
ABCabcdefDEF|^[A-Z]|YES
ABCabcdefDEF|[A-Z]$|YES
abcABCDEFdef|^[a-z]|YES
abcABCDEFdef|[a-z]$|YES
abcABCDEFdef|^[A-Z]|NO
abcABCDEFdef|[A-Z]$|NO
ABCabcdefDEF|^[acbfed]|NO
ABCabcdefDEF|[acbfed]$|NO
ABCabcdefDEF|^[FA]|YES
ABCabcdefDEF|[FA]$|YES
abcABCDEFdef|^[acbfed]|YES
abcABCDEFdef|[acbfed]$|YES
abcABCDEFdef|^[FEADCB]|NO
abcABCDEFdef|[FEADCB]$|NO
ABCabcdefDEF|^[FE0DCB]|NO
ABCabcdefDEF|[9EADCB]$|NO
abcABCDEFdef|^[9cbfed]|NO
abcABCDEFdef|[acb0ed]$|NO
ABCabcdefDEF|[a-cd-f]D|YES
ABCabcdefDEF|C[fa]|YES
abcABCDEFdef|c[^a-z]|YES
abcABCDEFdef|[^0-9]A|YES
this is a more complicated test| is .*test$|YES
this is a more complicated test| is .*test|YES
this is a more complicated test| is *test$|NO
this is a more complicated test.| is .*test$|NO
this is a more complicated test|is.*test|YES
this istest may be weird|is.*test|YES
this may be a more complicated test| is .*test$|NO
this may be a more complicated test|is .*test$|YES
this may be a more complicated test| is .*test|NO
this may be a more complicated test| is .*test$|NO
this may be a more complicated test.|is .*test$|NO
this may be a more complicated test|is.*test|YES
test ranges 5198402 ablkseimnfaKJLDLD|[-D]|YES
test ranges 5198402 ablkseimnfaKJLDLD|[-Z]|NO
test ranges 5198402 ablkseimnfaKJLDLD|[A-Z]|YES
test ranges 5198402 ablkseimnfaKJLDLD|[A-]|NO
test ranges 5198402 ablkseimnfaKJLDLD|[a-]|YES

From goer@sophist.uchicago.EDU Mon Apr 23 13:50:42 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
    id AA28441; Mon, 23 Apr 90 13:50:42 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 23 Apr 90 10:44 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 23 Apr 90 11:49:40 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0)
    id AA28106; Mon, 23 Apr 90 11:45:10 CDT
Resent-Date: Mon, 23 Apr 90 13:48 MST
Date: Mon, 23 Apr 90 11:45:10 CDT
From: Richard Goerwitz
Subject: find_re and egrep
Resent-To: icon-group@cs.arizona.edu
To: nowlin@iwtqg.att.COM
Cc: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9004231645.AA28106@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: nowlin@iwtqg.att.COM
X-Vms-Cc: icon-group@Arizona.edu
Status: O

JN expressed some annoyance that I did not provide find_re() in a form
that would make it immediately testable. I am sorry that I annoyed
anyone. My intent was to help!

My reason for posting this way was that find_re is NOT just an egrep
without "procedure main." If people wanna put wrappers around it, and
then run it through egrep-like tests they can do this. The comments
prepended to it, however, state very clearly that it will not pass all
tests geared for the Unix egrep system command. In particular (as Jerry
Nowlin's tests confirm), find_re will reject constructs like '.*?'. If I
ever chance to write '.*?', it always means one of two things: 1) I am
being very sloppy, or 2) I am not thinking about what I am doing, and am
in fact making an error. Note also that find_re utilizes Icon's escaping
conventions, and does not attempt to accommodate itself to any
particular Unix variant.

Again, I am sorry if the way I posted find_re annoyed anyone else. My
aim was not to make testing difficult.
In fact, I have already put it through a large battery of egrep tests.
Differences that exist between egrep and find_re are there because I
want them there (or else because egrep is not consistent from operating
system [version] to operating system [version]).

What I had hoped is that people might test it within Icon as a "find"
variant with added functionality (but a lot slower). One thing that
might be done with it, in fact, is to place a test right at the outset
that checks for input strings without metacharacters and then calls
find() if none are found. Another thing to do might be to add a fifth
argument, which if it is nonnull, frees up all the space allocated for
stored automata. I have no idea whether this would be worth it (probably
not). Naturally, I'll work on speeding it up.

If people do in fact want to test find_re as an egrep program, I don't
object. I just want to be sure everyone realizes that in the marginal
kinds of cases that standard tests tend to work on, find_re will show
certain systematic differences from egrep. In actual usage, these
differences will only show up once in a blue moon, and should always
consist in an error message flagging a pattern like '.*+' or '$)' (the
last of which most egrep commands will flag as an error as well).

Like all egrep commands I have access to, find_re will not construe $
and ^ as literals in contexts like '.*^' or '$?', even though it might
make better sense to do so. Because these sequences lie in one of those
gray areas, maybe I should consider flagging them as errors?

-Richard L.
Goerwitz                    goer%sophist@uchicago.bitnet
goer@sophist.uchicago.edu   rutgers!oddjob!gide!sophist!goer

From icon-group-request@arizona.edu Tue Apr 24 11:07:41 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
    id AA06257; Tue, 24 Apr 90 11:07:41 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 24 Apr 90 11:08 MST
Received: by ucbvax.Berkeley.EDU (5.62/1.41)
    id AA23588; Tue, 24 Apr 90 11:03:16 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews
    for icon-group@arizona.edu (icon-group@arizona.edu)
    (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 24 Apr 90 11:09 MST
Date: 24 Apr 90 18:02:36 GMT
From: sdd.hp.com!zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU
Subject: Word Counts
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <3594@uwm.edu>
Organization: University of Wisconsin-Milwaukee
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I think the relevant points concerning the program in my book have been
made by others with the skill that one is accustomed to seeing on this
newsgroup, but I should like to add some meditations on this theme.

Leslie Lamport writes: "TeX ... has trouble deciding which periods end
sentences." The statement is not strictly accurate. The author of TeX
apparently decided that while it was worth while to teach TeX to
hyphenate virtually every word in the English language (with the
possible exception of "gnomonly"), it was not worth while to allow for
the etc.'s and viz.'s and I + II.'s. The task of making determinations
about the period is left in human hands. Actually it would be quite
possible to write an Icon preprocessor to take care of that gap, but one
likewise has to decide if it is worth it.
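
A minimal sketch of such a preprocessor, assuming a small hand-picked
abbreviation list: it inserts TeX's control space (a backslash followed
by a blank) after a listed abbreviation when the next word begins with a
lowercase letter, so that TeX sets an ordinary interword space there.
The procedure name fixperiods and the abbreviation list are illustrative
assumptions, not part of TeX or of the Icon program library, and the
lowercase test is only a heuristic.

```icon
# Hypothetical filter: copy standard input to standard output, marking
# periods that follow known abbreviations so TeX does not treat them as
# sentence ends.  The abbreviation list below is an assumption.

procedure main()
    while write(fixperiods(read()))
end

procedure fixperiods(line)
    local out, word
    static abbrevs, wordchars
    initial {
        abbrevs := set(["etc.", "viz.", "cf.", "e.g.", "i.e."])
        wordchars := &lcase ++ &ucase ++ '.'
    }

    out := ""
    line ? {
        # copy everything up to the next word, then the word itself
        while out ||:= tab(upto(wordchars)) do {
            word := tab(many(wordchars))
            out ||:= word
            # abbreviation + blank + lowercase letter: replace the blank
            # with TeX's control space
            if member(abbrevs, map(word)) &
                any(&lcase, &subject[&pos + 1]) &
                =" "
            then out ||:= "\\ "
        }
        out ||:= tab(0)       # whatever trails the last word
    }
    return out
end
```

Given the line "apples, pears, etc. and so on." the filter writes
"apples, pears, etc.\ and so on."; a line whose abbreviation is followed
by a capitalized word is passed through unchanged, which still leaves
cases such as an abbreviation before a proper noun in human hands.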

The fact is that our notation both of natural and mathematical language
is saddled with centuries of accrued accidents. Consider:

Capitalization: Semitic languages manage just fine without it. So did
computers for a while. FORTRAN is not written like that because it is an
acronym, but because the limited character set forced it. what a chance
was missed to dump our two-tier system, the main function of which is to
make third graders miserable! but QWERTY reared its head, and we had to
squeeze computers into our foolishness. if Capitalization had any
Meaning we might be capitalizing our Nouns like german and norwegian.

Functions: The fact that "plus" is written in between its arguments,
"square" is written after and superscript, and "square root" is written
before obscures the essential similarity of these items which establish
relationships. Yet we really can't get used to Polish type notation.

Intonation: A tremendously significant force which has a wretched
representation in punctuation and underlining. The possibilities of
recording saying a simple "yes" fearfully, enthusiastically, grudgingly
have not even been explored. If we must have complicated representation
systems, why leave this out?

Spelling: enuff said.

We need not think that these anomalies do not take their toll. When
Joshua was pursuing some kings and wanted to know their names, a simple
shepherd boy was able to write them down for him. The new alphabetic
system was devoid of frills, could be written on a piece of broken pot,
and learned in short order. Our clumsy system leaves us with vast
numbers of intelligent illiterates trying to master a fundamentally rong
sistem. G.B. Shaw left his entire estate to bankroll a reform of English
spelling, but it didn't help. The toll in human deprivation is quite
substantial. Of course, we'd have to find something for the teachers to
do if the representation of English could be learned in five easy
lessons...
I suspect that the ancient Akkadian scribes who had to master an
incredibly complicated writing system liked it that way, because it made
them indispensable. Their monopoly was smashed by the inventors of the
alphabet, but we have managed to reestablish the old, turf-bound order
to a certain degree. Computers bring us head to head with all the
inconsistencies that we cope with daily. Instead of fitting their simple
logic, we massage or bludgeon them into accepting our outworn habits.

Now on the pedagogical issue. My definition of "word" is then
simplistic, if indeed we could ever agree on what a word is. I define it
as a string of alphabetic characters, or something marked off by markers
such as space and period. If you consider the following sentence which
is a bit Elizabethan but nonetheless valid

    'Tis the boys'.

(= modern English, "It belongs to the boys") you will see the difficulty
of teaching the machine that this is not a pair of single quotes but two
apostrophes. I do hint on page forty that the apostrophe is troublesome,
but deem it better to let the reader find his or her own problems that
belong in the realm of the way we represent things in general rather
than in Icon or the hardware.

Let me give another example. On p. 36 there is a little program which
simply puts on the screen the entire ascii character set. It had worked
fine when I originally tried it with version 5 of Icon, but when I tried
it on version 7 it failed. Control-Z would not allow itself to be
written to the screen. Since I could not solve this problem, I applied
to Ralph who ascertained that the problem is a feature (or bug,
depending how you look at it) of one of the C compilers, and this had
been differently implemented in v5. I opted to include in the program an
instruction to omit Control-Z if the program failed, but did not explain
the details to the reader.
I figured that a student at that point would really not want to be
bothered by the vagaries of C compilers and would probably prefer to
remain in blissful ignorance (as, in general, I do myself on such
matters.) This is the kind of paternalistic decision which teachers
(like parents) are continually called upon to make.

Reference manuals should be exhaustive, and exhaustiveness should have
priority over clarity if a choice has to be made. But books meant to
teach have to be clear, and this means leaving a great deal out. When I
started to learn Hebrew, I used a grammar (by the Scotsman Davidson)
which started with a vast amount of theoretical knowledge of Hebrew's
complex vocalization system. It was a chicken and egg situation; you
needed the theory to understand the rest of the book, but you couldn't
understand the theory until you had read the rest of the book. That may
be the real reason that I have small classes, so maybe I shouldn't
complain. Seriously though, it is difficult to determine how much detail
a student can handle---and one can't write a book to fit the needs of
every individual. As the judge said in the Ulysses case, you just have
to consider the reader whose degree of sensuality is average. Read sense
for sensuality in this case.

What is all of this doing in an Icon discussion? It is relevant I
believe. The development of algorithmic thinking is a valuable asset
which has a distinct humanistic value. Not to say that there is no room
for sentiment, opinion, taste. But one has to know the difference, and
the great thing about programming is that ideas can be tested by a
reliable arbiter. As a child I was told that learning Latin would help
me "think logically", and for seven long years I had Caesar, Virgil and
Lucretius shoved down my throat. I disagreed with my teachers, although
I rarely expressed it because it could result in a sore bottom. The
logic of Latin grammar seemed to me a myth to which I was forced to give
lip service.
(I was delighted when I found later that the Classical Arabic verb "to
be" takes an accusative---which my Latin teachers declared to be a sin
against logic.) Computer languages really are logical, and I believe
they really do make a difference in the way one approaches day to day
problems. So programming has a humanistic value in its own right.

And from this point of view, I believe that Icon is the best language to
study. It gets across the fundamental point of algorithmic thinking
without burdening one with endless struggles with data types,
significant though they may be. It gives fair treatment to text and to
math. It doesn't pretend to be "English-like" on the one hand, or use
impenetrable abbreviations on the other. And if I never write a compiler
in it, I won't be heart broken.

--
Alan D. Corre
Department of Hebrew Studies
University of Wisconsin-Milwaukee    (414) 229-4245
PO Box 413, Milwaukee, WI 53201      corre@csd4.csd.uwm.edu

From icon-group-request@arizona.edu Tue Apr 24 14:52:48 1990
Resent-From: icon-group-request@arizona.edu
Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP
    id AA00443; Tue, 24 Apr 90 14:52:48 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 24 Apr 90 14:52 MST
Received: by ucbvax.Berkeley.EDU (5.62/1.41)
    id AA07280; Tue, 24 Apr 90 14:47:34 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews
    for icon-group@arizona.edu (icon-group@arizona.edu)
    (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Tue, 24 Apr 90 14:53 MST
Date: 20 Apr 90 06:26:42 GMT
From: helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU
Subject: How to obtain IDOL?
Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <00464@sarek.UUCP> Organization: Programmers in Exile X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O As in the subject line, how can I obtain a copy of IDOL? Preferably non-ftp since I don't have access myself. Does the Icon project have a mail server or would anyone there be able to mail it, or possibly better, is it online on any other system that does have a mail-archive-server? Thanks. From cjeffery Tue Apr 24 15:15:33 1990 Resent-From: "Clinton Jeffery" Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02404; Tue, 24 Apr 90 15:15:33 MST Received: from megaron.cs.arizona.edu by Arizona.EDU; Tue, 24 Apr 90 15:14 MST Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02162; Tue, 24 Apr 90 15:13:01 MST Received: by caslon; Tue, 24 Apr 90 15:13:01 mst Resent-Date: Tue, 24 Apr 90 15:17 MST Date: Tue, 24 Apr 90 15:13:01 mst From: Clinton Jeffery Subject: How to obtain IDOL? Resent-To: icon-group@cs.arizona.edu To: helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU Cc: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004242213.AA14596@caslon> In-Reply-To: helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU's message of 20 Apr 90 06:26:42 GMT <00464@sarek.UUCP> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: helios.ee.lbl.gov!hellgate.utah.edu!uplherc!wicat!sarek!gsarff@ucsd.EDU X-Vms-Cc: icon-group@Arizona.edu Status: O You asked if Idol is available electronically to people without ftp access. cs.arizona.edu does not have a mail server that I know of. I have automated the process of e-mailing out copies of Idol in the form of UNIX shell-archive files for people who can send me a working e-mail address. I guess that makes me a mail server. 
Idol is also distributed with various systems' Version 8 of the Icon Program Library, which can be ordered from the Icon Project. -- | Clint Jeffery, U. of Arizona Dept. of Computer Science | cjeffery@cs.arizona.edu -or- {noao allegra}!arizona!cjeffery -- From icon-group-request@arizona.edu Wed Apr 25 11:37:48 1990 Resent-From: icon-group-request@arizona.edu Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA16446; Wed, 25 Apr 90 11:37:48 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 25 Apr 90 11:38 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA17908; Wed, 25 Apr 90 11:30:47 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Wed, 25 Apr 90 11:39 MST Date: 25 Apr 90 18:30:08 GMT From: usc!cs.utexas.edu!jnino@ucsd.EDU Subject: what is IDOL Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <1254@gorath.cs.utexas.edu> Organization: U. Texas CS Dept., Austin, Texas X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I have very recently become interested in Icon, and I'm new to this news group. I read a message of someone inquiring about IDOL. Could anybody drop me a hint as to what that is...just wondering. Thank you. 
Jaime

From goer@sophist.uchicago.EDU Wed Apr 25 12:24:02 1990 Resent-From: goer@sophist.uchicago.EDU Received: from maggie.telcom.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA21179; Wed, 25 Apr 90 12:24:02 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Wed, 25 Apr 90 12:25 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 25 Apr 90 14:23:09 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA02170; Wed, 25 Apr 90 14:18:59 CDT Resent-Date: Wed, 25 Apr 90 12:25 MST Date: Wed, 25 Apr 90 14:18:59 CDT From: Richard Goerwitz Subject: great set of wildcards - improved Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004251918.AA02170@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

(It has come to my attention that many reading this newsgroup do not know about egrep. In nontechnical terms, egrep is a great little pattern-finding program that uses a powerful wildcard system. These wildcards are far, far superior to anything you might find in a word processor or, say, in MS-DOS's ? and * [these symbols are used, but they mean something different to egrep]. Icon's pattern-matching facilities are, in fact, much more powerful than egrep's. However, you can't generally access them at run-time as elegantly as with, say, Snobol4. I.e. you can't easily tell Icon what to scan for while the program is running. Unless you want to write a compiler yourself, use a lot of coexpressions, or use a variant translator like the one Ken Walker constructed, you will be up a creek - UNLESS you have find_re() [which essentially is a little compiler]. I hope this helps everyone understand the following posting.)

Okay, okay, I've been chided by several people now privately about not just biting the bullet and making find_re fully egrep-compatible.
Here's a new version WITH a wrapper (courtesy of Jerry Nowlin). It fails only one of the tests included with the gnu egrep distribution, and this because of differences in escaping conventions that I don't think would be wise to change.

In essence, find_re() is now compatible both with find() (as to its syntax), and with egrep (as to its input language and exit codes). I hope that it will provide people with distinctly Iconish access to the originally non-Iconish egrep-style pattern-matching language.

Note that the program posted here runs from two to twenty-five times as fast as the previous version. Jerry Nowlin reminded me that he had earlier posted a grep-like utility. I found it in my archives, ran it, found it to be MUCH faster than my command. Though JN's grep did not implement things like () and |, find_re hardly seemed worth its added functionality if it was so slow. I used some of the ideas in JN's grep, added a few easy optimizations, ran it through all the tests again, and presto!

Please tell me if any bugs are found. Note that the syntax is like that of *e*grep - not grep. Note also that I am not in any way directly connected with computer science as a field. I've never read the egrep source code, and I've never written a compiler. This is just a utility I wrote because I myself needed it. It would *certainly* be possible to write it faster. I invite (in fact, let me challenge) the real computer science people around here to do it *right* (= faster, better). No cheating, either (i.e. adding C calls to the Icon source, as one can do with version 8). It's gotta have full egrep functionality as well, and pass all the main tests one might subject an egrep-like command to!

--------------------------------RGegrep.icn----------------------------------

# wrapper by Jerry Nowlin
procedure main(a)

    # the usage message
    usage := "Usage: RGegrep pattern [ file ... ]"

    # the first program argument must be the pattern
    # Perhaps: *a = 0 & stop("more args!"); while pattern := get(a) do {
    pattern := get(a) | stop("I at least need a pattern\n",usage)

    # trick the program into using standard input if no files were passed
    if *a = 0 then a := [&null]

    # the rest of the arguments are files to search through
    every f := !a do {

        # if the file isn't null try to open it
        if \f then
            in := open(f) | stop("I can't open '",f,"'")

        # otherwise use standard input
        else in := &input

        # if there is only one file skip printing the file name
        if *a = 1 then f := ""

        # otherwise tack on a colon
        else f ||:= ":"

        # read all the lines
        every l := !in do {

            # scan the line for the pattern
            l ? {
##### BELOW IS THE CALL TO the find_re() procedure listed below #####
                # if the pattern is found print the line
                if find_re(pattern) then
                    exit_status := 0 & write(f,l)
            }
        }

        # close the input file if one was opened
        if in ~=== &input then close(in)
    }
# If set in a while loop, put another brace around the above -
#    }
    exit(\exit_status | 1)

end


########################################################################
#
#    Name:    find_re.icn
#
#    Title:   "Find" Regular Expression
#
#    Author:  Richard L. Goerwitz
#
#    Date:    April 25, 1990 (version 0.9)
#
########################################################################
#
#  DESCRIPTION: Find_re is similar to the Icon builtin function
#  find(), except that it takes as its first argument a regular
#  expression of the sort used by the Unix egrep command. For those
#  unfamiliar with the notion of regular expressions, they represent
#  a simple string representation of a finite state transition
#  network which can be converted into an automaton capable of
#  recognizing patterns in strings of characters. The specific
#  symbols used, and the purposes they are used for, can be gleaned
#  from the Unix man pages for egrep and the regex library
#  functions. In nontechnical terms, regular expressions are a
#  great set of wildcards.
#
#  DIFFERENCES between find and find_re: Find_re is backwards
#  compatible with find(). Aside from permitting regular expressions,
#  the only difference between find_re() and find() is that find_re
#  sets the global variable __endpoint to the first position after
#  any given match occurs. Use this variable with great care,
#  preferably assigning its value to some other variable immediately
#  after the match (e.g. find_re("hello[. ?!]*",s) & tmp := __endpoint).
#  Otherwise, you will certainly run into trouble!
#
#  DIFFERENCES between egrep and find_re: Find_re utilizes the same
#  basic language as egrep. The only big diff. is that find_re uses
#  intrinsic Icon data structures and escaping conventions rather
#  than those of any particular Unix variant. Be careful! If you
#  put find_re('\(hello\)',s) into your source file, find_re will
#  treat it just like find_re('(hello)',s). If, however, you enter
#  '\(hello\)' at run-time (due to find_re(!&input,s)), what Icon
#  receives will depend on your operating system (most likely, a
#  trace will show "\\(hello\\)").
#
#  BUGS: Little attempt has been made to optimize find_re. For work
#  that requires a quick response, you'll have to use something like
#  system("egrep...")!
#
########################################################################

global state_table, biggest_nonmeta_str, __endpoint
record o_a_s(op,arg,state)


procedure find_re(re, s, i, j)

    static FSTN_table, STRING_table
    initial {
        FSTN_table := table()
        STRING_table := table()
    }

    /re & stop("find_re: Call me with at least one argument!")
    /i := \&pos | 1
    /s := \&subject | stop("find_re: No string.")
    /j := *s+1

    if /FSTN_table[re] then {
        if \STRING_table[re] then {
            every p := find(STRING_table[re],s,i,j)
            do { __endpoint := p + *re; suspend p }
            fail
        }
        else {
            tokenized_re := tokenize(re)
            if 0 > !tokenized_re then {
                MakeFSTN(tokenized_re) | er(re,2)
                /FSTN_table[re] := copy(state_table)
            }
            else {
                tmp := ""; every tmp ||:= char(!tokenized_re)
                insert(STRING_table,re,tmp)
                every p := find(STRING_table[re],s,i,j)
                do { __endpoint := p + *re; suspend p }
                fail
            }
        }
    }

    s ? {
        tab(x := i to j) &
            (find(biggest_nonmeta_str) | fail) \ 1 &
            apply_FSTN(&null,FSTN_table[re]) &
            (__endpoint := .&pos - 1, suspend x)
    }

end


procedure apply_FSTN(ini,tbl)

    static s_tbl
    local POS, tmp, fin

    /ini := 1 & s_tbl := tbl
    if ini = 0 then {
        return .&pos
    }
    POS := &pos
    fin := 0

    if tmp := !s_tbl[ini] & tab(tmp.op(tmp.arg)) then {
        if tmp.state = fin
        then return .&pos
        else {
            return apply_FSTN(tmp.state) | (&pos := POS, fail)
        }
    }
    else &pos := POS

end


procedure tokenize(s)

    local chr, tmp

    token_list := list()
    s ? {
        tab(many('*+?|'))
        while chr := move(1) do {
            if chr == "\\"
            # it can't be a metacharacter; remove the \ and "put"
            # the integer value of the next chr into token_list
            then put(token_list,ord(move(1))) | er(s,2,chr)
            else if any('*+()|?.$^',chr)
            then {
                # Yuck! Egrep compatibility stuff.
                case chr of {
                    "*" : {
                        tab(many('*+?'))
                        put(token_list,-ord("*"))
                    }
                    "+" : {
                        tmp := tab(many('*?+')) | &null
                        if upto('*?',\tmp)
                        then put(token_list,-ord("*"))
                        else put(token_list,-ord("+"))
                    }
                    "?" : {
                        tmp := tab(many('*?+')) | &null
                        if upto('*+',\tmp)
                        then put(token_list,-ord("*"))
                        else put(token_list,-ord("?"))
                    }
                    "(" : {
                        tab(many('*+?'))
                        put(token_list,-ord("("))
                    }
                    default: put(token_list,-ord(chr))
                }
            }
            else {
                case chr of {
                    # More egrep compatibility stuff.
                    "[" : {
                        every next_one := find("]")
                        \next_one ~= &pos | er(s,2,chr)
                        put(token_list,-ord(chr))
                    }
                    "]" : {
                        if &pos = (\next_one+1)
                        then put(token_list,-ord(chr)) & next_one := &null
                        else put(token_list,ord(chr))
                    }
                    default: put(token_list,ord(chr))
                }
            }
        }
    }

    token_list := UnMetaBrackets(token_list)

    fixed_length_token_list := list(*token_list)
    every i := 1 to *token_list
    do fixed_length_token_list[i] := token_list[i]
    return fixed_length_token_list

end


procedure UnMetaBrackets(l)

    # Since brackets delineate a cset, it doesn't make
    # any sense to have metacharacters inside of them.
    # UnMetaBrackets makes sure there are no metacharacters
    # inside of the brackets.

    local tmplst, i, Lb, Rb

    tmplst := list(); i := 0
    Lb := -ord("[")
    Rb := -ord("]")

    while (i +:= 1) <= *l do {
        if l[i] = Lb then {
            put(tmplst,l[i])
            until l[i +:= 1] = Rb
            do put(tmplst,abs(l[i]))
            put(tmplst,l[i])
        }
        else put(tmplst,l[i])
    }
    return tmplst

end


procedure MakeFSTN(l,INI,FIN)

    # MakeFSTN recursively descends through the tree structure
    # implied by the tokenized string, l, recording in (global)
    # state_table a list of operations to be performed, and the
    # initial and final states which apply to them.

    static Lp, Rp, Sl, Lb, Rb, Caret_inside, Dot, Dollar, Caret_outside
    local i, inter, inter2, tmp
    initial {
        Lp := -ord("("); Rp := -ord(")")
        Sl := -ord("|")
        Lb := -ord("["); Rb := -ord("]"); Caret_inside := ord("^")
        Dot := -ord("."); Dollar := -ord("$"); Caret_outside := -ord("^")
        biggest_nonmeta_str := ""
    }

    /INI := 1 & state_table := table() & NextState("new")
    /FIN := 0

    # I haven't bothered to test for empty lists everywhere.
    if *l = 0 then {
        /state_table[INI] := []
        put(state_table[INI],o_a_s(zSucceed,&null,FIN))
        return
    }

    # HUNT DOWN THE SLASH (ALTERNATION OPERATOR)
    every i := 1 to *l do {
        if l[i] = Sl & tab_bal(l,Lp,Rp) = i then {
            if i = 1 then er(l,2,char(abs(l[i]))) else {
                inter := NextState()
                inter2 := NextState()
                MakeFSTN(l[1:i],inter2,FIN)
                MakeFSTN(l[i+1:0],inter,FIN)
                /state_table[INI] := []
                put(state_table[INI],o_a_s(apply_FSTN,inter2,0))
                put(state_table[INI],o_a_s(apply_FSTN,inter,0))
                return
            }
        }
    }

    # HUNT DOWN PARENTHESES
    if l[1] = Lp then {
        i := tab_bal(l,Lp,Rp) | er(l,2,"(")
        inter := NextState()
        if any('*+?',char(abs(0 > l[i+1]))) then {
            case l[i+1] of {
                -ord("*") : {
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
                    MakeFSTN(l[2:i],INI,INI)
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("+") : {
                    inter2 := NextState()
                    /state_table[inter2] := []
                    MakeFSTN(l[2:i],INI,inter2)
                    put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
                    MakeFSTN(l[2:i],inter2,inter2)
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("?") : {
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
                    MakeFSTN(l[2:i],INI,inter)
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
            }
        }
        else {
            MakeFSTN(l[2:i],INI,inter)
            MakeFSTN(l[i+1:0],inter,FIN)
            return
        }
    }
    else {    # I.E. l[1] NOT = Lp (left parenthesis as -ord("("))
        every i := 1 to *l do {
            case l[i] of {
                Lp : {
                    inter := NextState()
                    MakeFSTN(l[1:i],INI,inter)
                    MakeFSTN(l[i:0],inter,FIN)
                    return
                }
                Rp : er(l,2,")")
            }
        }
    }

    # NOW, HUNT DOWN BRACKETS
    if l[1] = Lb then {
        i := tab_bal(l,Lb,Rb) | er(l,2,"[")
        inter := NextState()
        tmp := ""; every tmp ||:= char(l[2 to i-1])
        if Caret_inside = l[2]
        then tmp := ~cset(Expand(tmp[2:0]))
        else tmp := cset(Expand(tmp))
        if any('*+?',char(abs(0 > l[i+1]))) then {
            case l[i+1] of {
                -ord("*") : {
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
                    put(state_table[INI],o_a_s(any,tmp,INI))
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("+") : {
                    inter2 := NextState()
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(any,tmp,inter2))
                    /state_table[inter2] := []
                    put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
                    put(state_table[inter2],o_a_s(any,tmp,inter2))
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("?") : {
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
                    put(state_table[INI],o_a_s(any,tmp,inter))
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
            }
        }
        else {
            /state_table[INI] := []
            put(state_table[INI],o_a_s(any,tmp,inter))
            MakeFSTN(l[i+1:0],inter,FIN)
            return
        }
    }
    else {    # I.E. l[1] not = Lb
        every i := 1 to *l do {
            case l[i] of {
                Lb : {
                    inter := NextState()
                    MakeFSTN(l[1:i],INI,inter)
                    MakeFSTN(l[i:0],inter,FIN)
                    return
                }
                Rb : er(l,2,"]")
            }
        }
    }

    # FIND INITIAL SEQUENCES OF POSITIVE INTEGERS, CONCATENATE THEM
    if i := match_positive_ints(l) then {
        inter := NextState()
        tmp := Ints2String(l[1:i])
        if *tmp > *biggest_nonmeta_str
        then biggest_nonmeta_str := tmp
        /state_table[INI] := []
        put(state_table[INI],o_a_s(match,tmp,inter))
        MakeFSTN(l[i:0],inter,FIN)
        return
    }

    # OKAY, CLEAN UP ALL THE JUNK THAT'S LEFT
    i := 0
    while (i +:= 1) <= *l do {
        case l[i] of {
            Dot          : { Op := any; Arg := &cset }
            Dollar       : { Op := pos; Arg := 0 }
            Caret_outside: { Op := pos; Arg := 1 }
            default      : { Op := match; Arg := char(0 < l[i]) }
        } | er(l,2,char(abs(l[i])))
        inter := NextState()
        if any('*+?',char(abs(0 > l[i+1]))) then {
            case l[i+1] of {
                -ord("*") : {
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
                    put(state_table[INI],o_a_s(Op,Arg,INI))
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("+") : {
                    inter2 := NextState()
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(Op,Arg,inter2))
                    /state_table[inter2] := []
                    put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
                    put(state_table[inter2],o_a_s(Op,Arg,inter2))
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
                -ord("?") : {
                    /state_table[INI] := []
                    put(state_table[INI],o_a_s(apply_FSTN,inter,0))
                    put(state_table[INI],o_a_s(Op,Arg,inter))
                    MakeFSTN(l[i+2:0],inter,FIN)
                    return
                }
            }
        }
        else {
            /state_table[INI] := []
            put(state_table[INI],o_a_s(Op,Arg,inter))
            MakeFSTN(l[i+1:0],inter,FIN)
            return
        }
    }

    # WE SHOULD NOW BE DONE INSERTING EVERYTHING INTO state_table
    # IF WE GET TO HERE, WE'VE PARSED INCORRECTLY!
    er(l,4)

end


procedure NextState(new)
    static nextstate
    if \new then nextstate := 1
    else nextstate +:= 1
    return nextstate
end


procedure er(x,i,elem)
    writes(&errout,"Error number ",i," parsing ",image(x)," at ")
    if \elem
    then write(&errout,image(elem),".")
    else write(&errout,"(?).")
    exit(i)
end


procedure zSucceed()
    return .&pos
end


procedure Expand(s)

    s2 := ""
    s ? {
        s2 ||:= ="^"
        s2 ||:= ="-"
        while s2 ||:= tab(find("-")-1) do {
            if (c1 := move(1), ="-", c2 := move(1), c1 << c2)
            then every s2 ||:= char(ord(c1) to ord(c2))
            else s2 ||:= 1(move(2), not(pos(0))) | er(s,2,"-")
        }
        s2 ||:= tab(0)
    }
    return s2

end


procedure tab_bal(l,i1,i2)
    i := 0
    i1_count := 0; i2_count := 0
    while (i +:= 1) <= *l do {
        case l[i] of {
            i1 : i1_count +:= 1
            i2 : i2_count +:= 1
        }
        if i1_count = i2_count
        then suspend i
    }
end


procedure match_positive_ints(l)

    # Matches the longest sequence of positive integers in l,
    # beginning at l[1], which neither contains, nor is followed
    # by a negative integer. Returns the first position
    # after the match. Hence, given [55, 55, 55, -42, 55],
    # match_positive_ints will return 3. [55, -42] will cause
    # it to fail rather than return 1 (NOTE WELL!).

    every i := 1 to *l do {
        if l[i] < 0
        then return (3 < i) - 1
    }
end


procedure Ints2String(l)
    tmp := ""
    every tmp ||:= char(!l)
    return tmp
end


procedure StripChar(s,s2)
    if find(s2,s) then {
        tmp := ""
        s ? {
            # tab up to each occurrence of s2 (the posted code searched
            # for the literal string "s2"), then skip over its characters
            while tmp ||:= tab(find(s2))
            do tab(many(cset(s2)))
            tmp ||:= tab(0)
        }
    }
    return \tmp | s
end

From @RELAY.CS.NET:Adalbert.Kerber@uni-bayreuth.dbp.de Thu Apr 26 02:46:52 1990 Received: from relay.cs.net by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19701; Thu, 26 Apr 90 02:46:52 MST Received: from relay2.cs.net by RELAY.CS.NET id aa13696; 26 Apr 90 5:46 EDT Received: from zix.gmd.dbp.de by RELAY.CS.NET id ae06551; 26 Apr 90 5:38 EDT Received: from zix.gmd.dbp.de by .zix.gmd.dbp.de id a001775; 26 Apr 90 8:41 MET Date: 26 Apr 90 07:31 GMT From: Adalbert.Kerber%uni-bayreuth.dbp.de@RELAY.CS.NET To: ICON-GROUP@cs.arizona.edu Message-Id: <94138062400991/13537 X400> Status: O

please stop subscription for btm203@dbthrz5.bitnet

From goer@sophist.uchicago.EDU Fri Apr 27 08:05:57 1990 Resent-From: goer@sophist.uchicago.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA16824; Fri, 27 Apr 90 08:05:57 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Fri, 27 Apr 90 08:07 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 27 Apr 90 08:44:41 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA04266; Fri, 27 Apr 90 08:40:31 CDT Resent-Date: Fri, 27 Apr 90 08:07 MST Date: Fri, 27 Apr 90 08:40:31 CDT From: Richard Goerwitz Subject: coexpressions, questions Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004271340.AA04266@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

I've always wondered why coexpressions were introduced. Was it because people wanted coroutines? Or did the idea of delaying evaluation prompt its inclusion (a holdover from SNOBOL's *)?
Final question: How is it that the designers came to recognize that coroutines and delayed evaluation (or really, controlled evaluation) could be united under the same syntactic rubric as coroutines?

-Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer

From goer@sophist.uchicago.EDU Fri Apr 27 08:06:08 1990 Resent-From: goer@sophist.uchicago.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA16848; Fri, 27 Apr 90 08:06:08 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Fri, 27 Apr 90 08:06 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 27 Apr 90 09:13:06 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA04306; Fri, 27 Apr 90 09:08:52 CDT Resent-Date: Fri, 27 Apr 90 08:07 MST Date: Fri, 27 Apr 90 09:08:52 CDT From: Richard Goerwitz Subject: grammars, questions Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004271408.AA04306@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O

My last questions were centered on coexpressions. These ones concern representation of various types of grammars in Icon. They seemed to belong in separate postings.

I've long wondered what sorts of grammars Icon is capable of representing, using string scanning. It appears that Icon has no trouble representing context free languages, with the restriction that there can be no left-recursion in the grammar. For representing natural languages, this can be a serious drawback. Lots of natural language constructs cannot be represented without left recursion (e.g. the possessive 's construct).

Prolog has a thing called indexed grammars, which on the infamous "Chomsky hierarchy" seem to fall between context free phrase structure grammars and context sensitive phrase structure grammars.
These grammars are like context free grammars except that while recognizing sequences the various nodes are given labels which can be correlated and compared with labels for other nodes. The result is that you can have the NP node labeled as plural, and the VP node as well, and tell the grammar that if the two do not match the sentence must be rejected. As far as I can see, this could be done in Icon simply by having matching procedures return values. I dunno. Has anyone looked into this?

It seems to me that Icon has the distinct advantage of allowing the user to skip the tokenizing stage in many cases. You can just parse the string directly. I like this. But what do we do in cases where we must deal with a backslash? Most solutions I've seen are pretty ugly. What I did in my find_re procedure posted a few days ago was to convert input strings to lists, and then convert metacharacters to negative integers, leaving nonmetas as positive integers. This worked well, since Icon has ord() and char(). Has anyone developed an elegant solution to the \ problem using string scanning?

This posting has wandered a bit, so let me summarize. I am curious, first of all, about what sorts of grammars can easily be represented using Icon's special string-processing facilities. Secondly, I'm curious whether anyone has done research into indexed grammars in Icon. Thirdly, I'd like to know whether there exist elegant solutions to the \ problem. I hope that these questions are relevant, interesting, etc., and not just a waste of bandwidth.

-Richard L.
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer

From ralph Fri Apr 27 08:26:26 1990 Resent-From: "Ralph Griswold" Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19579; Fri, 27 Apr 90 08:26:26 MST Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Fri, 27 Apr 90 08:27 MST Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA19537; Fri, 27 Apr 90 08:25:52 MST Resent-Date: Fri, 27 Apr 90 08:27 MST Date: Fri, 27 Apr 90 08:25:52 MST From: Ralph Griswold Subject: RE: coexpressions, questions Resent-To: icon-group@cs.arizona.edu To: goer@sophist.uchicago.EDU, icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004271525.AA19537@megaron.cs.arizona.edu> In-Reply-To: <9004271340.AA04266@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: goer@sophist.uchicago.EDU, icon-group@Arizona.edu Status: RO

The motivation for adding co-expressions to Icon was to be able to control when and where the results of a generator are produced. Without co-expressions, the results of a generator are produced in a last-in-first-out fashion at the lexical site in the program at which the generator appears, as demanded by the enclosing expression. Parallel production of the results of two generators, for example, is impossible without co-expressions.
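
A minimal sketch of the parallel case, offered as an illustration rather than as code from the original message: the two generators (1 to 3 and !"abc") and the variable names nums and chars are arbitrary choices. Each generator is wrapped in a co-expression with create, and @ asks a co-expression for its next result, so the two sequences can be stepped in lockstep.

```icon
procedure main()
    nums  := create 1 to 3      # co-expression over the integers 1, 2, 3
    chars := create !"abc"      # co-expression over the characters a, b, c

    # Activate the two co-expressions alternately.  The loop ends as soon
    # as either activation fails, i.e. when a generator is exhausted.
    while write(@nums, " ", @chars)
end
```

Run as written, this should write the lines "1 a", "2 b" and "3 c". Without co-expressions, an expression such as every write((1 to 3), " ", !"abc") would instead produce all nine combinations, because the second generator is resumed to exhaustion before the first is resumed.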
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From icon-group-request@arizona.edu Fri Apr 27 10:57:05 1990 Resent-From: icon-group-request@arizona.edu Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA05153; Fri, 27 Apr 90 10:57:05 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 27 Apr 90 10:58 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA15563; Thu, 26 Apr 90 12:41:59 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Fri, 27 Apr 90 10:58 MST Date: 26 Apr 90 18:50:05 GMT From: nic!hri!sparc9!rolandi@bbn.COM Subject: ftp new sparc station icon sources Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <1990Apr26.185005.19973@hri.com> Organization: Horizon Research X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: RO What does one need to do in order to obtain the new SPARC station Icon source code? *************************************** * Walter G. Rolandi * * Horizon Research, Inc. * * 1432 Main Street * * Waltham, MA 02154 USA * * (617) 466 8339 * * * * rolandi@hri.com * *************************************** From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Fri Apr 27 11:07:25 1990 Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA06381; Fri, 27 Apr 90 11:07:25 MST Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA05405; Fri, 27 Apr 90 14:07:15 -0400 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Fri, 27 Apr 90 14:06:43 EDT Date: Fri, 27 Apr 90 12:59:36 EDT From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu To: icon-group@cs.arizona.edu Message-Id: <221535@Wayne-MTS> Subject: Why co-expressions? 
Status: RO Ralph's explanation of the motivation for co-expressions is right on. But Ralph--why do they have to be symmetric? An asymmetric version, which is what I have in SPLASH, is conceptually simpler (rather like Unix piping, but with branching) and seems to provide all the functionality I've ever needed. Paul Abrahams abrahams%wayne-mts@um.cc.umich.edu From ralph Fri Apr 27 11:14:42 1990 Date: Fri, 27 Apr 90 11:14:42 MST From: "Ralph Griswold" Message-Id: <9004271814.AA06892@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA06892; Fri, 27 Apr 90 11:14:42 MST To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu Subject: Re: Why co-expressions? Cc: icon-group In-Reply-To: <221535@Wayne-MTS> Status: O Ralph's explanation of the motivation for co-expressions is right on. But Ralph--why do they have to be symmetric? An asymmetric version, which is what I have in SPLASH, is conceptually simpler (rather like Unix piping, but with branching) and seems to provide all the functionality I've ever needed. Paul Abrahams abrahams%wayne-mts@um.cc.umich.edu This is a long-standing question. In fact, it's been posed as a challenge -- produce a program that really needs the full coroutine capabilities of co-expressions. Perhaps Steve Wampler, who designed and implemented co-expressions, will respond. It's worth noting that symmetry usually is viewed as an aesthetic virtue. I guess my personal view is that I can ignore the coroutine aspects of co-expressions. Except when I have to document them or teach about them. 
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph

From kwalker Fri Apr 27 11:19:06 1990 Date: Fri, 27 Apr 90 11:19:06 MST From: "Kenneth Walker" Message-Id: <9004271819.AA07299@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA07299; Fri, 27 Apr 90 11:19:06 MST In-Reply-To: <9004271408.AA04306@sophist.uchicago.edu> To: icon-group Subject: Re: grammars, questions Status: O

> Date: Fri, 27 Apr 90 09:08:52 CDT
> From: Richard Goerwitz
>
> It appears that Icon has no trouble representing
> context free languages, with the restriction that there can be no left-
> recursion in the grammar.

Certain special cases of left-recursion can be converted into looping. Unfortunately, this bounds backtracking, so you have to know that backtracking won't find other solutions. For example, the production

    s ::= t | s "+" t

can be parsed with a procedure something like

    procedure s()
        x := []
        push(x, t()) | fail
        while ="+" do
            push(x, t()) | stop("syntax error")
        return x
    end

(I haven't tested this code; in any event it needs to be a little more sophisticated.)

This pattern of left recursion comes up a lot in programming languages. Is it common in natural languages?

Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721 +1 602 621-4324 kwalker@cs.arizona.edu {uunet|allegra|noao}!arizona!kwalker

From cargo@tardis.cray.com Fri Apr 27 12:06:16 1990 Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA10501; Fri, 27 Apr 90 12:06:16 MST Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA18506; Fri, 27 Apr 90 14:06:25 CDT Received: from zk.cray.com by hall.cray.com id AA15158; 4.1/CRI-3.12; Fri, 27 Apr 90 14:06:24 CDT Received: by zk.cray.com id AA04961; 3.2/CRI-3.12; Fri, 27 Apr 90 14:06:45 CDT Date: Fri, 27 Apr 90 14:06:45 CDT From: cargo@tardis.cray.com (David S.
Cargo) Message-Id: <9004271906.AA04961@zk.cray.com> To: icon-group@cs.arizona.edu Subject: parsers Status: O

I am interested in using Icon to write parsers for what I have heard called "braced languages."

"The braced languages are deterministic and context-free languages that explicitly identify and mark the beginning and end of each piece of information comprising the data object." [from The automatic generation of software for data exchange in the graphics domain, Sandra A. Mamrak, et al., The Ohio State University]

The languages in general are SGML documents (with the document type description defining the particular structure of the data objects).

I haven't had time to do much aside from research some of the work that other people have done.

dsc

From wunder@hpsdel.sde.hp.com Fri Apr 27 12:53:27 1990 Received: from hp-sde.sde.hp.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13751; Fri, 27 Apr 90 12:53:27 MST Received: from orac.sde.hp.com by hp-sde.sde.hp.com with SMTP (16.2A/15.5+IOS 3.13) id AA22723; Fri, 27 Apr 90 12:53:01 -0700 Received: by hpsdel.sde.hp.com (15.7/SES42.42) id AA25100; Fri, 27 Apr 90 12:52:42 pdt Date: Fri, 27 Apr 90 12:52:42 pdt From: Walter Underwood Message-Id: <9004271952.AA25100@hpsdel.sde.hp.com> To: cargo@tardis.cray.com Cc: icon-group@cs.arizona.edu In-Reply-To: David S. Cargo's message of Fri, 27 Apr 90 14:06:45 CDT <9004271906.AA04961@zk.cray.com> Subject: parsers Status: O

    The languages in general are SGML documents (with the document type
    description defining the particular structure of the data objects).

Check out this paper:

    The implementation of the Amsterdam SGML parser
    J Warmer & S Van Egmond
    Electronic Publishing, vol 2 no 2, page 65
    July 1989

They talk about why SGML is not LL(1), and about implementing a parser with the Amsterdam Compiler Kit.
wunder From sbw@naucse.cse.nau.edu Fri Apr 27 12:55:05 1990 Received: from naucse.cse.nau.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA13884; Fri, 27 Apr 90 12:55:05 MST Received: by naucse.cse.nau.edu (5.61/1.34) id AA05834; Fri, 27 Apr 90 12:54:21 -0700 Message-Id: <9004271954.AA05834@naucse.cse.nau.edu> Date: Fri, 27 Apr 90 12:54:12 MST X-Mailer: Mail User's Shell (6.5 4/17/89) From: sbw@naucse.cse.nau.edu (Steve Wampler) To: "Ralph Griswold" , um.cc.umich.edu!Wayne-MTS!Paul_Abrahams@cs.arizona.edu Subject: Re: Why co-expressions? Cc: icon-group@cs.arizona.edu Status: O On Apr 27 at 11:18, "Ralph Griswold" writes: } } Ralph's explanation of the motivation for co-expressions is right on. } But Ralph--why do they have to be symmetric? An asymmetric version, } which is what I have in SPLASH, is conceptually simpler (rather like } Unix piping, but with branching) and seems to provide all the functionality } I've ever needed. } } Paul Abrahams } abrahams%wayne-mts@um.cc.umich.edu } } This is a long-standing question. In fact, it's been posed as a challenge -- } produce a program that really needs the full coroutine capabilities of } co-expressions. } } Perhaps Steve Wampler, who designed and implemented co-expressions, will } respond. Well, let's see, just what do I want my memory of back then to be... Oh yes. One thing to keep in mind is that, as a PhD student, I was interested in 'research' topics, not just implementation. The nice thing about a symmetric view of co-expressions is that they are full of interesting potentials - for example, since they effectively provide a heap-based calling structure (instead of the conventional stack-based model), they provide all sorts of fun graph-based programming strategies. And, of course, lend themselves reasonably well to exploring certain multi- process programming strategies (could do better at this one, though). I think I can claim (heck, I can claim anything - wonder if I'm right?) 
that asymmetric co-expressions provide lazy evaluation of a tree-based calling structure - which is interesting, but not as general (from a research point of view, remember). And, I know it seems odd, but I find the symmetric view very straight-forward, there isn't much special-casing going on. I think this is reflected in the original implementation, which was flat-out trivial on a PDP (to activate a co-expression, you simply changed the sp register to point into the stack for the co-expression, saved the pc and reset it to the saved pc for the co-expression. Only about 4 instructions. Returning was, well, symmetric.) I'd bet that an implementation of the asymmetric model would be no easier, and possibly more complex (since you may need to worry about preventing cycles). Since all that changes is the CPU state, this seems naturally (to my warped mind) as a simple multi-processor model. Of course, the original model was flawed if one really wanted coroutines, but there are other things (such as the heap-based call graph mentioned above) that are nice *from a research point of view*. Sigh, I wish someone would throw me a couple of graduate students and some time to really explore these things. Paul, I'm anxious to learn more about SPLASH! I'd like to play with asymmetric co-expressions as well as some of the other features you've tempted us with! 
-- Steve Wampler {....!arizona!naucse!sbw} {sbw@naucse.cse.nau.edu} From goer@sophist.uchicago.EDU Fri Apr 27 12:57:36 1990 Resent-From: goer@sophist.uchicago.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA14211; Fri, 27 Apr 90 12:57:36 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Fri, 27 Apr 90 12:58 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 27 Apr 90 14:57:23 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA04801; Fri, 27 Apr 90 14:53:10 CDT Resent-Date: Fri, 27 Apr 90 12:59 MST Date: Fri, 27 Apr 90 14:53:10 CDT From: Richard Goerwitz Subject: left recursion in natural languages Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004271953.AA04801@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O Ken Walker kindly responds to my query about parsing strategies (or at least one of the many queries): For example, the production s ::= t | s "+" t (For those who want to get in this game, quick read Griswold & Griswold, chapter 15.) can be parsed with a procedure something like

   procedure s()
      x := []
      push(x, t()) | fail
      while ="+" do
         push(x, t()) | stop("syntax error")
      return x
   end

(I haven't tested this code; in any event it needs to be a little more sophisticated.) This pattern of left recursion comes up a lot in programming languages. Is it common in natural languages? Yes, it's fairly common. One case in point is the 's construction. You can define a noun phrase as a combination of articles, simple nouns, adjectives, relative clauses, etc. This works fairly well using a simple phrase structure grammar. However, as soon as you try to include 's, you have to define a noun phrase as a noun phrase followed by an 's. E.g.
- the queen of England the queen of England's throne The phrase "the queen of England" is a noun phrase, and so is "the queen of England's throne." You can see immediately how there's left recursion here. Any time you get a postposition (which is what 's really is - it's not a "case" or an affix of any kind) you'll have this problem of left recursion. Sumerian, which is one language I've studied, is all postpositions - no prepositions at all. In fact, you can often do a mirror-image lexical calque of a phrase and have it come out as acceptable English. The problem with natural language parsing strategies that don't permit elegant left-recursion is that they don't mirror the apparent ability of people to handle both sorts of recursion in any given language. Normally a language will use one or the other predominantly, but often there is mixing (as in English). One way out is to permit X levels of left recursion, with X representing the number of levels beyond which people get confused, and don't really talk that way. The problem with this is that people might perhaps really be using an internal grammar that permits infinite recursion, but that they just can't fully realize the grammar. This seems a bit silly to me. I have to admit that my main interest is in the sounds - the phonemes, or systematic pronunciation units - of ancient Semitic languages. Someone who is more into the syntax of natural languages, please jump in here ---> From @s.ms.uky.edu:mtbb95@ms.uky.edu Fri Apr 27 14:29:16 1990 Received: from e.ms.uky.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA21281; Fri, 27 Apr 90 14:29:16 MST Received: from s.ms.uky.edu by g.ms.uky.edu id aa24729; 27 Apr 90 16:48 EDT From: Bob Maras Date: Fri, 27 Apr 90 16:47:52 EDT X-Mailer: Mail User's Shell (6.4 2/14/89) To: icon-group@cs.arizona.edu, mtbb95@ms.uky.edu Subject: Removal Message-Id: <9004271647.aa22801@s.s.ms.uky.edu> Status: O Please remove my name from your Icon mailing list of users.
I appreciate the fine effort you are making and wish each of you the very best. I have enjoyed your information very much. Robert Maras -- _ _ ( ) __ ( ) | O O | B O B M A R A S / __ \ / ( \/ ) __/ \ \__/ / \____/ |_/\_| H A P P Y C O M P U T I N G !!! From icon-group-request@arizona.edu Fri Apr 27 17:54:28 1990 Resent-From: icon-group-request@arizona.edu Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA06635; Fri, 27 Apr 90 17:54:28 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 27 Apr 90 17:54 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA27061; Fri, 27 Apr 90 17:39:16 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Fri, 27 Apr 90 17:54 MST Date: 27 Apr 90 18:22:00 GMT From: swrinde!zaphod.mps.ohio-state.edu!uwm.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!daniel@ucsd.EDU Subject: Icon v8 port for Convex ? Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <6900001@ux1.cso.uiuc.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O I desire to bring up Icon version 8 on a Convex 220. Has anyone done the port yet? I see that version 7 was ported. -- Daniel Pommert email.internet: pommert@uiuc.edu email.bitnet: daniel@uiucvmd phone: (217) 333-8629 post: DCL Rm, 150 1304 W. 
Springfield Urbana, IL 61801-2987 where: 40 6 47 N Latitude 88 13 36 W Longitude From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Fri Apr 27 19:20:34 1990 Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA11705; Fri, 27 Apr 90 19:20:34 MST Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA27095; Fri, 27 Apr 90 22:20:30 -0400 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Fri, 27 Apr 90 22:20:12 EDT Date: Fri, 27 Apr 90 22:14:38 EDT From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu To: icon-group@cs.arizona.edu Message-Id: <221656@Wayne-MTS> Subject: Deletions from the Icon mailing list Status: O Would all you people who (unwisely) want to get off the Icon mailing list please send notice of that to icon-PROJECT, not to icon-GROUP. (If I'm leading the flocks astray, dear icon-group-person, please let us know.) - Paul Abrahams From ralph Fri Apr 27 19:43:06 1990 Date: Fri, 27 Apr 90 19:43:06 MST From: "Ralph Griswold" Message-Id: <9004280243.AA12512@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA12512; Fri, 27 Apr 90 19:43:06 MST To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu, icon-group@cs.arizona.edu Subject: Re: Deletions from the Icon mailing list Status: O The correct e-mail address to use to get on/off the icon-group mailing list is icon-group-request. The name is common protocol used for all such groups. The problem is that folks can't be expected to know/remember that. A good alternative is icon-project; we'll handle icon-group changes, and, at least, mail to icon-project doesn't get resent to hundreds of addresses all over the world. In general, if you have something that's not suitable for broadcasting, send it to icon-project. Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph p.s. Thanks, Paul. 
From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Sun Apr 29 13:01:50 1990 Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA14826; Sun, 29 Apr 90 13:01:50 MST Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA13598; Sun, 29 Apr 90 16:01:45 -0400 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Sun, 29 Apr 90 16:01:27 EDT Date: Sun, 29 Apr 90 15:45:40 EDT From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu To: icon-group@cs.arizona.edu Message-Id: <221847@Wayne-MTS> Subject: Co-expressions, symmetric and otherwise Status: O Rich Goerwitz asked me what the difference was between symmetric and asymmetric co-expressions. The answer, in terms of Icon, is simple: if you disallow the expressions @&source, @&main, and the binary form e1@e2 (see p. 138 of the Icon book), then you have asymmetric co-expressions. In other words, you can create a co-expression e, which can then pass back values at various points of call with suspend (or return--just once). These values can be picked up by evaluating @e. So the co-expression sends values (via suspend), and various callers can retrieve them (via @). But there's no way for a caller to pass a value to e once it's been created except, of course, through globals or other such devices. In my experience the asymmetric form is extremely useful, and it's all that I've ever needed. To tell the truth, I never fully fathomed the program on p. 139. (How many of you out there have really understood it?) My suspicion is that if the facilities of Sec. 13.4.1 were dropped from Icon, no useful programs would be broken. To answer Steve Wampler's point about restrictions, the only restriction needed to limit Icon to asymmetric coexpressions would be to eliminate those facilities--so it would be pretty simple if the Icon project wanted to do it. Historically, I think that the symmetric form of coroutine arose because there was no natural way to make coroutines asymmetric. 
The suspend notation provides such a way. In terms of implementation, the hard part is not so much passing control back and forth but the storage allocation. A coexpression requires either heap allocation or a stack of its own, and if you choose the stack, you either have to limit its size or make it relocatable from one place to another. Hence the references in the literature to cactus stacks, which are what you need to implement coroutines. With asymmetric coroutines, there are some interesting optimization possibilities if the optimizer can discover that a coexpression only uses a bounded amount of storage. (This is how it works in SPLASH.) I don't know whether symmetric co-expressions make this much harder to do. Coroutines were originally designed for Fortran-like (or assembly) languages in environments without recursion, so the storage allocation problem was essentially trivial. Paul Abrahams abrahams%wayne-mts@um.cc.umich.edu From goer@sophist.uchicago.EDU Sun Apr 29 15:45:51 1990 Resent-From: goer@sophist.uchicago.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA23817; Sun, 29 Apr 90 15:45:51 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Sun, 29 Apr 90 15:46 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Sun, 29 Apr 90 17:45:37 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA02634; Sun, 29 Apr 90 17:41:26 CDT Resent-Date: Sun, 29 Apr 90 15:47 MST Date: Sun, 29 Apr 90 17:41:26 CDT From: Richard Goerwitz Subject: hell Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004292241.AA02634@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O In my experience the asymmetric form is extremely useful, and it's all that I've ever needed. To tell the truth, I never fully fathomed the program on p. 139. 
(How many of you out there have really understood it?) My suspicion is that if the facilities of Sec. 13.4.1 were dropped from Icon, no useful programs would be broken. To answer Steve Wampler's point about restrictions, the only restriction needed to limit Icon to asymmetric coexpressions would be to eliminate those facilities--so it would be pretty simple if the Icon project wanted to do it. I guess my experience was this: I learned Icon on my own, and I figured that anything in "the book" was there because it was important for a fundamental understanding of the language. As a result, I fooled with the example of symmetric coexpressions until I understood what they were used for. I'd say that for about two years at least half the programs I wrote used these symmetric coexpressions. I've written some very extensive conversion programs using methods like those outlined on the infamous p. 139. I recall being a bit surprised several years ago when David Gudeman expressed a dislike for coroutines. Jerry Nowlin also used to jump in and redo most programs that were posted using coexpressions (still less coroutines) so that these were not necessary, mumbling something about speed :-). My problem is that, although I like using them for some things, they usually obfuscate my code unless I am careful. I guess I've gotten away from coroutines, but I'd note that, at least here on my home machine, they do not cause much of a speed decrease over other methods. I would be very, very interested in seeing a short, clean example of a situation where symmetric coroutines provide at least the most elegant, if not the only possible, way of doing something. Another subject (yes, this should have been placed in another posting, but I post here often enough as it is): What are Backus-Naur Forms? I was reading about SNOBOL the other day - a language I am only very superficially familiar with.
The author of the article I was perusing stated that BNFs could be translated directly into SNOBOL4 patterns. First of all, like so many others who use Icon, I am not a computer scientist. I do linguistics and text processing mainly. I have no idea what BNFs are. I'd like to know what they are. Secondly, I'd enjoy knowing whether the same translation as is possible for SNOBOL is possible for Icon. If someone knows the answers to these questions, could he or she perhaps chime in? -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From ralph Sun Apr 29 16:08:52 1990 Date: Sun, 29 Apr 90 16:08:52 MST From: "Ralph Griswold" Message-Id: <9004292308.AA25231@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA25231; Sun, 29 Apr 90 16:08:52 MST To: goer@sophist.uchicago.EDU Subject: Re: hell Cc: icon-group In-Reply-To: <9004292241.AA02634@sophist.uchicago.edu> Status: O Co-expressions are space-intensive, but not time-intensive. In fact, activating a co-expression is a bit faster than calling a procedure. The biggest problem is that co-expressions are not supported in all versions of Icon, so code that uses them is not portable. Symmetry/non-symmetry aside, co-expressions as they are in Icon are not going to be changed. I can't really see what all the fuss is about; the symmetry costs relatively little code, it's been done, and you can ignore the symmetric uses if you don't need them. As to BNF: It's a notational system for writing context-free production grammars. It was first used in the design of Algol 60 (or possibly Algol 58). It's nothing special, but it can be found in many older programming language texts, used for describing syntax. There are examples of BNF grammars in the Icon language book, although they may not be labeled as such. There's a fairly simple mapping from BNF (and other CF production grammar systems) to patterns in SNOBOL4.
There's a similar mapping for Icon, provided matching procedures are used for the nonterminal symbols. The result is a recursive-descent parser with backtracking. Aside from efficiency considerations, the main problem is that left-recursion in productions translates into left recursion in matching. SNOBOL4 avoids this by using a length-shortening heuristic. The mapping is described in the Icon language book. Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From cjeffery Sun Apr 29 16:19:29 1990 Resent-From: "Clinton Jeffery" Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA25760; Sun, 29 Apr 90 16:19:29 MST Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Sun, 29 Apr 90 16:18 MST Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA25635; Sun, 29 Apr 90 16:17:09 MST Received: by caslon; Sun, 29 Apr 90 16:17:09 mst Resent-Date: Sun, 29 Apr 90 16:20 MST Date: Sun, 29 Apr 90 16:17:09 mst From: Clinton Jeffery Subject: hell Resent-To: icon-group@cs.arizona.edu To: goer@sophist.uchicago.EDU Cc: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004292317.AA02545@caslon> In-Reply-To: Richard Goerwitz's message of Sun, 29 Apr 90 17:41:26 CDT <9004292241.AA02634@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: goer@sophist.uchicago.EDU X-Vms-Cc: icon-group@Arizona.edu Status: O Backus Naur Forms (or Backus Normal Forms, and including "Extended Backus Naur Forms") are one notation for expressing "Phrase Structure Grammars"--I'm sure the main reason we call them BNF is to avoid giving linguists the credit where credit is due. Well, I am just kidding here folks; computer scientists use it for historical reasons, but BNF's are essentially PSG's. Most BNF grammars would translate easily into Icon, as has been discussed in the past two weeks.
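A minimal sketch of the kind of translation meant here, assuming a toy grammar invented purely for illustration (expr ::= term "+" expr | term and term ::= "x" | "y" | "z"). The grammar is written right-recursively so that no matching procedure calls itself before consuming input. The procedure names expr and term are hypothetical, not taken from any posting or from the Icon book.

```icon
# expr ::= term "+" expr | term
# Each nonterminal becomes a matching procedure that suspends the
# matched substring, so the caller can backtrack into it.
procedure expr()
   suspend (term() || ="+" || expr()) | term()
end

# term ::= "x" | "y" | "z"
procedure term()
   suspend tab(any('xyz'))
end

procedure main()
   local s
   every s := "x+y" | "x+y+z" | "x+" | "+x" do
      # pos(0) requires that the whole subject was consumed
      if s ? (expr() & pos(0)) then
         write(image(s), " is an expr")
      else
         write(image(s), " is not an expr")
end
```

Writing the first production as expr ::= expr "+" term instead would make expr() call itself without advancing the scanning position, which is the left-recursion problem raised earlier in this thread.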
I am going to stay out of the co-routine squabble for the moment. I think that symmetric co-expressions can defend themselves. -- | Clint Jeffery, U. of Arizona Dept. of Computer Science | cjeffery@cs.arizona.edu -or- {noao allegra}!arizona!cjeffery -- From gudeman Sun Apr 29 21:36:32 1990 Resent-From: "David Gudeman" Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA12514; Sun, 29 Apr 90 21:36:32 MST Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Sun, 29 Apr 90 21:37 MST Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA12501; Sun, 29 Apr 90 21:36:06 MST Resent-Date: Sun, 29 Apr 90 21:38 MST Date: Sun, 29 Apr 90 21:36:06 MST From: David Gudeman Subject: hell Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004300436.AA12501@megaron.cs.arizona.edu> In-Reply-To: Richard Goerwitz's message of Sun, 29 Apr 90 17:41:26 CDT <9004292241.AA02634@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O From: Richard Goerwitz I recall being a bit surprised several years ago when David Gudeman expressed a dislike for coroutines. Well..., more accurately I _prefer_ to use recursive procedures. On the other hand, there are certainly some problems that are more conveniently solved with coroutines. My preference for recursive procedures is largely based on the fact that _I_ find them easier to understand than I do coroutines. I can rationalize this rather selfish attitude by noting that this is probably typical: it is likely that most people find procedures easier to understand. So if your code is going to be read by others, it is probably best not to go looking for places to use coroutines. People generally learn to use recursive procedures long before they ever hear of coexpressions, and (like me) they are too lazy to spend a lot of time with a new construction when most problems can be adequately solved without it.
In some ways this is similar to the resistance people have against learning a new programming language. We are probably poorer for our specialization, and given your unusual experience in learning Icon, you surely have some unique perspectives to contribute. In particular, it is interesting to see how coroutines are used by someone who never developed a strong prejudice in favor of procedures. From @mirsa.inria.fr:ol@cerisi.cerisi.Fr Sun Apr 29 23:05:03 1990 Received: from mirsa.inria.fr by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA17108; Sun, 29 Apr 90 23:05:03 MST Received: from cerisi.cerisi.fr by mirsa.inria.fr with SMTP (5.59++/IDA-1.2.8) id AA08681; Mon, 30 Apr 90 08:05:11 +0200 Message-Id: <9004300605.AA08681@mirsa.inria.fr> Date: Mon, 30 Apr 90 08:03:44 -0100 Posted-Date: Mon, 30 Apr 90 08:03:44 -0100 From: Lecarme Olivier To: icon-group@cs.arizona.edu In-Reply-To: "Ralph Griswold"'s message of Sun, 29 Apr 90 16:08:52 MST <9004292308.AA25231@megaron.cs.arizona.edu> Subject: hell Status: O A point of history: BNF was first used in the description of Algol 60 (Algol 58, named IAL (for International Algorithmic Language) when it was first designed, used a notation similar to that used for Fortran). It's called Backus Normal Form because it was first used in a draft document by John Backus. It is also known as Backus-Naur Form, because its final form, used in the "Report about the Algorithmic Language Algol", was designed by Peter Naur. Original BNF had flaws: for example, terminal symbols were not specially quoted, unlike non-terminal symbols; there was no delimiter between rules; meta-symbols of the notation could not be used in the language being described. The most popular notation presently is probably EBNF (Extended Backus-Naur Form), designed by Niklaus Wirth, which corrects these flaws and adds some more meta-operators for expressing frequent cases (optional parts, repetitions, and so on).
Olivier Lecarme From goer@sophist.uchicago.EDU Mon Apr 30 06:59:28 1990 Resent-From: goer@sophist.uchicago.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA06121; Mon, 30 Apr 90 06:59:28 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Mon, 30 Apr 90 07:00 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 30 Apr 90 08:59:54 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03346; Mon, 30 Apr 90 08:55:43 CDT Resent-Date: Mon, 30 Apr 90 07:01 MST Date: Mon, 30 Apr 90 08:55:43 CDT From: Richard Goerwitz Subject: BNFs Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: Message-Id: <9004301355.AA03346@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O It really is a shame that there has to be this sort of terminological fragmentation. I see now that Backus Naur Forms (and EBNFs) are really just a notational device for expressing context free grammars. Goodness. What a fuss :-)! As one poster pointed out, linguists have been using their own practical notation for context free grammars for some time now. Prolog is one language that implements this notation in the form of its definite clause grammars (which in reality are slightly more powerful than context free grammars). I suppose it's all a matter of which tradition you "grew up" in. In a previous posting, I showed how left recursion occurs in English. There are also lots of context-dependent rules, such as verb-noun agreement. You can't use a context free phrase structure grammar, for instance, to recognize a sentence that goes - The woman loves me. The woman I married loves me. The woman I married, who won't love me if my son and I forget Mother's Day, loves me.
There are two problems here: 1) We have to get the verb phrase - something that must be defined separately from the noun phrase - to know about what is going on in the noun phrase, namely that it is singular (hence "loves" and not "love"). The other problem (2) is what I mentioned before, namely left recursion in the grammar. Another thing to consider is the arbitrary complexity of the noun phrase, and the theoretically unlimited distance that separates the main noun of that phrase (above = "woman") from the verb phrase ("loves me"). Phrase structure grammars, BNFs, regular expressions, definite clause grammars without indexing - whatever you happen to call them (they are all context free) - are, to the linguist, not of most central interest. As I hinted at above, the problem of the verb knowing what the noun before it is doing number-wise can be solved. In Prolog you do it using indexed grammars. In Icon, the solution looks pretty straightforward, even elegant. Just have your matching procedures return a value. This has the effect of allowing nodes to have labels. I'd tend to want to create a list, and then have each node, if it wishes, simply return a value, which is then "put" into a list and returned (or put into another list by the calling node, or whatever). Anyway, all you would have to do is make sure that neither the noun phrase nor the verb phrase conflict as to number (in this case, I'd just check to be sure that, if the one is marked "singular," the other is as well). Again, I'd really like to see some comments by someone who has studied the syntax of natural languages in more detail than I. It just seemed useful to provide a little input from linguistics. It will be useful in the long run, I think, if I keep reminding computer scientists that their work has implications in closely related fields. -Richard L.
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From @RELAY.CS.NET,@dg-rtp.rtp.dg.com:langley@DG-RTP.DG.COM Mon Apr 30 08:05:02 1990 Received: from relay.cs.net by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA11625; Mon, 30 Apr 90 08:05:02 MST Received: from dg-rtp.rtp.dg.com by RELAY.CS.NET id aa23732; 30 Apr 90 11:01 EDT Received: from bigbird.rtp.dg.com by dg-rtp.dg.com (4.20/4.7) id AA17366; Mon, 30 Apr 90 10:59:37 edt via SMTP Received: by bigbird.rtp.dg.com (4.20/rtp-s01) id AA06810; Mon, 30 Apr 90 11:00:37 edt Date: Mon, 30 Apr 90 11:00:37 edt From: Mark L Langley Message-Id: <9004301500.AA06810@bigbird.rtp.dg.com> Return-Receipt-To: langley@dg-rtp.dg.com To: icon-group@cs.arizona.edu Subject: linguism Status: O First, I just wanted to say that I have really enjoyed the linguistic- related postings. This group is quite a melting pot of Computer Scientists, Researchers, and thinkers-in-general, centered around the unlikely vehicle of Icon. (Now back to the show....) Richard Goerwitz remarked > > It really is a shame that there has to be this sort of terminological > fragmentation. I see now that Backus Naur Forms (and EBNFs) are really > just a notational device for expressing context free grammars. Good- > ness. What a fuss :-)! As one poster pointed out, linguists have been > using their own practical notation for context free grammars for some > time now. Prolog is one language that implements this notation in the > form of its definite clause grammars (which in reality are slightly more > powerful than context free grammars). I suppose it's all a matter of > which tradition you "grew up" in. Can you say a little about the linguistic convention for describing cfg-s? What are it's advantages? Does it submit more readily to being manipulated by a program? I once wrote Icon (what else?) 
programs to manipulate a cfg, by putting it through its paces/transformations between normal forms, factoring left recursion, et al. I would be interested in seeing an alternate representation than BNF that might offer conceptual improvements. > > The woman loves me. > The woman I married loves me. > The woman I married, who won't love me if my son and I forget > Mother's Day, loves me. > > There are two problems here: 1) We have to get the verb phrase - something > that must be defined separately from the noun phrase - to know about what > is going on in the noun phrase, namely that it is singular (hence "loves" and > not "love." The other problem (2) is what I mentioned before, namely left re- > cursion in the grammar. Another thing to consider is the arbitrary complexity > of the noun phrase, and the theoretically unlimited distance that separates > the main noun of that phrase (above = "woman") from the verb phrase ("loves > me"). > > Phrase structure grammars, BNFs, regular expressions, definite clause grammars > without indexing - whatever you happen to call them (they are all context free) > - are, to the linguist, not of most central interest. > > As I hinted at above, the problem of the verb knowing what the noun before it > is doing number-wise can be solved. In Prolog you do it using indexed gram- > mars. In Icon, the solution looks pretty straightforward, even elegant. > Just have your matching procedures return a value. This has the effect of > allowing nodes to have labels. I'd tend to want to create a list, and then > have each node, if it wishes, simply return a value, which is then "put" into > a list and returned (or put into another list by the calling node, or whatever). > Anyway, all you would have to do is make sure that neither the noun phrase > nor the verb phrase conflict as to number (in this case, I'd just check to > be sure that, if the one is marked "singular," the other is as well). 
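A minimal Icon sketch of the idea in the quoted passage, where matching procedures return labeled values and the caller checks that the noun phrase and verb phrase agree in number. Everything here is an assumption made for illustration: the record type phrase, the procedures np, vp, and sentence, and the tiny fixed vocabulary are hypothetical and stand in for real phrase rules.

```icon
# A node label: the matched text plus a grammatical number.
record phrase(text, number)

# Noun phrase: returns a labeled node for whichever form matches.
procedure np()
   suspend phrase(="the woman", "singular") |
           phrase(="the women", "plural")
end

# Verb phrase: likewise labeled with its number.
procedure vp()
   suspend phrase(=" loves me", "singular") |
           phrase(=" love me", "plural")
end

# A sentence matches only if both parts match AND their numbers agree.
# The result is a list of the two labeled nodes.
procedure sentence()
   local n, v
   suspend (n := np()) & (v := vp()) & (n.number == v.number) & [n, v]
end

procedure main()
   local s
   every s := "the woman loves me" | "the women love me" |
              "the woman love me" do
      if s ? (sentence() & pos(0)) then
         write(s, ": accepted")
      else
         write(s, ": rejected")
end
```

The agreement test is an ordinary expression in the conjunction, so a number mismatch simply fails and drives backtracking into np() and vp() like any other failed match.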
This looks like what we compiler wonks do when we check non-syntactic things (like whether something is declared or not...) during the translation process. Is there a better (more fully encompassing) formalism here? That is, syntax-with-ad-hoc-checking is highly impure... > > Again, I'd really like to see some comments by someone who has studied the > syntax of natural languages in more detail than I. It just seemed useful > to provide a little input from linguistics. It will be useful in the long > run, I think, if I keep reminding computer scientists that their work has > implications in closely related fields. So would I... Has anyone successfully parsed "The policeman raised his hand and stopped the car" (Courtesy of R. Schank.) > -Richard L. Goerwitz goer%sophist@uchicago.bitnet > goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer > > Mark langley@dg-rtp.dg.com From sboisen@BBN.COM Mon Apr 30 09:17:00 1990 Message-Id: <9004301617.AA16804@megaron.cs.arizona.edu> Received: from RIGEL.BBN.COM by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA16804; Mon, 30 Apr 90 09:17:00 MST To: langley@dg-rtp.dg.com Cc: icon-group@cs.arizona.edu In-Reply-To: Mark L Langley's message of Mon, 30 Apr 90 11:00:37 edt <9004301500.AA06810@bigbird.rtp.dg.com> Subject: linguism From: Sean Boisen Sender: sboisen@BBN.COM Reply-To: sboisen@BBN.COM Date: Mon, 30 Apr 90 12:15:00 EDT Status: O > Can you say a little about the linguistic convention for describing > cfg-s? What are it's advantages? Does it submit more readily to being > manipulated by a program? I once wrote Icon (what else?) programs to > manipulate a cfg, by putting it through its paces/transformations between > normal forms, factoring left recursion, et al. > > I would be interested in seeing an alternate representation than BNF > that might offer conceptual improvements. 
>

As someone working in Natural Language Processing (NLP), I'll jump into
the fray:

There really is no single linguistic convention for representing CFGs,
and it's not at all clear that CFGs are sufficiently powerful for
representing NL (one classic case for this argument is a construction
like "John, Bill, and Fred love Mary, Sally, and Julie,
respectively"). Most current linguistic theories use grammars whose
power is somewhere between context-free and context-sensitive,
inclusive. In addition to parsing proper, there are also the problems
of building semantic representations (since it usually doesn't do much
good to simply represent the structure of a sentence: you want to know
what it *means*).

Note to Richard Goerwitz: left-recursion is only a problem if you are
parsing top-down, and that's not a foregone conclusion. In fact, many
very good NLP systems use bottom-up parsing for independent reasons.

> This looks like what we compiler wonks do when we check non-syntactic
> things (like whether something is declared or not...) during the translation
> process. Is there a better (more fully encompassing) formalism here? That
> is, syntax-with-ad-hoc-checking is highly impure...
>

One popular approach these days is unification-based formalisms, where
the agreement checking is at least not quite so ad hoc. DCGs under
Prolog are a well-known instance, although one can also do unification
in many other languages (we use a unification-based formalism in Lisp).

> Has anyone successfully parsed
> "The policeman raised his hand and stopped the car"
> (Courtesy of R. Schank.)

This isn't really a parsing problem (this sentence is pretty clearly a
conjoined verb phrase with a single subject noun phrase), but a problem
of semantics: the raised hand "primes" one to think that he stopped it
by contacting the car, but the pragmatics of this don't work.
We don't work on any traffic domains :-) but this doesn't seem all that
problematic to the generally-naive semantics of most of today's NLP
systems.

Hope this is helpful.

........................................
Sean Boisen -- sboisen@bbn.com
BBN Systems and Technologies Corporation, Cambridge MA
Disclaimer: these opinions void where prohibited by lawyers.

From goer@sophist.uchicago.EDU Mon Apr 30 09:32:04 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA17729; Mon, 30 Apr 90 09:32:04 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 30 Apr 90 09:33 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 30 Apr 90 11:32:14 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03522; Mon, 30 Apr 90 11:28:01 CDT
Resent-Date: Mon, 30 Apr 90 09:33 MST
Date: Mon, 30 Apr 90 11:28:01 CDT
From: Richard Goerwitz
Subject: encompassing formalism (stealing from Prolog)
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9004301628.AA03522@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

    Can you say a little about the linguistic convention for describing
    cfg-s? What are it's advantages? Does it submit more readily to being
    manipulated by a program?

You know, I should throw in here that linguists are pretty bad about
notation as well. One of the tremendous annoyances a researcher in a
specific language field runs into is the proliferation of formalisms to
represent grammars of various kinds.

The most prominent methods are two. The first is a notation for
context-sensitive grammars. For instance, in Akkadian, you have a rule
which deletes short vowels in open syllables if preceded by another
syllable with a short vowel and followed by another syllable (below, V:
is a long vowel, while V is short):

    V -> 0 / CVC_CV

I.e.
a short vowel, V, goes to nut'n (zero) in the context consonant, vowel,
consonant, ___, consonant, vowel (where the blank is our vowel).
Linguists try to keep one symbol on the right, thus preventing the
notation from leaning towards an unrestricted rewrite system.

The other formalism is more process-oriented (hence it is ironic that
Prolog is the main language which implements it). You say -

    S -> NP VP

and what not (i.e. "a sentence may be broken down into a noun phrase and
a verb phrase," or, depending on your model, "a sentence consists of"
[more like a definition]). Anyway, it's pretty much like the Backus-Naur
formalism, though it is less well-defined, and tends to get modified ad
hoc by everyone who uses it.

Particularly interesting for us here is the way Prolog implements this
formalism. Assuming the Prolog you use implements definite clause
grammar notation, you can say,

    s --> np(Number), vp(Number).
    np(Number) --> det, n(Number).
    vp(Number) --> v(Number), np(_).
    det --> [the].
    n(singular) --> [woman].
    n(plural) --> [husbands].
    v(singular) --> [loves].

etc. You get the idea. I don't see any reason that this couldn't be
implemented EASILY in Icon. And Icon has some neat advantages over
Prolog, such as very good string handling.

There needs to be some research on just how far these indexed grammars
can represent natural languages. They are kind of an intermediate
creature on the hierarchy of grammar types - something that has not
really been studied a great deal.

Recently a formalism called PATR has been developed. This formalism has
been implemented in Prolog and Lisp. PATR is an extension, so far as I
can see, of this definite clause notation we see in Prolog. I have toyed
with doing it in Icon. The question is whether to create a PATR -> Icon
translator that outputs code that must be translated and linked, or
whether it should permit the user to "consult" a database at run-time.
In my find_re posting (which, incidentally had yet a few small bugs, since after all it *was* version 0.9; it is very usable, though, and I hope that people test it) I adopted the run-time approach. Basically I just did what regex does - it compiles a string representation of a finite state transi- tion network into an automaton, stored in a small table (and which can be eval()'d any time). I dunno what would be best with PATR. Both facilities would be nice. Icon is a very good language for natural language proces- sing, and I would like very much to see it gain greater popularity in a field now dominated by (lots (of (and (extremely annoying) (unintuitive)) parentheses)). I once wrote Icon (what else? programs to manipulate a cfg, by putting it through its paces/transformations between normal forms, factoring left recursion, et al. I would be interested in seeing an alternate representation... > > The woman loves me. > The woman I married loves me. > The woman I married, who won't love me if my son and I forget > Mother's Day, loves me. > > There are two problems here: 1) We have to get the verb phrase - something > that must be defined separately from the noun phrase - to know about what > is going on in the noun phrase, namely that it is singular (hence "loves" and > not "love." The other problem (2) is what I mentioned before, namely left re- > cursion in the grammar. Another thing to consider is the arbitrary complexity > of the noun phrase, and the theoretically unlimited distance that separates > the main noun of that phrase (above = "woman") from the verb phrase ("loves > me"). > > Phrase structure grammars, BNFs, regular expressions, definite clause grammars > without indexing - whatever you happen to call them (they are all context free) > - are, to the linguist, not of most central interest. > > As I hinted at above, the problem of the verb knowing what the noun before it > is doing number-wise can be solved. 
In Prolog you do it using indexed gram- > mars. In Icon, the solution looks pretty straightforward, even elegant. > Just have your matching procedures return a value. This has the effect of > allowing nodes to have labels. I'd tend to want to create a list, and then > have each node, if it wishes, simply return a value, which is then "put" into > a list and returned (or put into another list by the calling node, or whatever). > Anyway, all you would have to do is make sure that neither the noun phrase > nor the verb phrase conflict as to number (in this case, I'd just check to > be sure that, if the one is marked "singular," the other is as well). This looks like what we compiler wonks do when we check non-syntactic things (like whether something is declared or not...) during the translation process. Is there a better (more fully encompassing) formalism here? That is, syntax-with-ad-hoc-checking is highly impure... > > Again, I'd really like to see some comments by someone who has studied the > syntax of natural languages in more detail than I. It just seemed useful > to provide a little input from linguistics. It will be useful in the long > run, I think, if I keep reminding computer scientists that their work has > implications in closely related fields. So would I... Has anyone successfully parsed "The policeman raised his hand and stopped the car" (Courtesy of R. Schank.) > -Richard L. Goerwitz goer%sophist@uchicago.bitnet > goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer > > Mark -Richard L. 
Goerwitz goer%sophist@uchicago.bitnet
goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer

From ralph Mon Apr 30 09:48:30 1990
Date: Mon, 30 Apr 90 09:48:30 MST
From: "Ralph Griswold"
Message-Id: <9004301648.AA18532@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA18532; Mon, 30 Apr 90 09:48:30 MST
To: icon-group
Subject: Version 8 of Icon for personal computers
Status: O

Version 8 of Icon for personal computers is now available. In addition
to Version 8 for MS-DOS, announced earlier, there are now
implementations for the Amiga, the Atari ST, and the Macintosh (under
MPW).

There are two packages for each computer -- one contains executable
binary files and the other contains source code. 1MB of RAM is about the
minimum for successful use.

Version 8 of Icon for these computers can be obtained by anonymous FTP
to cs.arizona.edu. After connecting, cd /icon/v8. Get READ.ME there for
more information.

If you do not have FTP access or prefer to obtain diskettes and printed
documentation, Version 8 of Icon for the computers listed above can be
ordered from:

    Icon Project
    Department of Computer Science
    Gould-Simpson Building
    The University of Arizona
    Tucson, AZ 85721

    602 621-2018 (voice)
    602 621-4246 (FAX)

Specify whether you want executable binaries, source code, or both. The
packages are $15 each, payable in US dollars to The University of
Arizona with a check written on a bank in the United States. Orders also
can be charged to MasterCard or Visa. The price includes shipping by
parcel post in the United States, Canada, and Mexico. Add $5 per package
for air mail delivery to other countries.

Please direct any questions to me, not to icon-project or icon-group.
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
+1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph

From reid@ctc.contel.COM Mon Apr 30 10:47:52 1990
Resent-From: reid@ctc.contel.COM
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA23971; Mon, 30 Apr 90 10:47:52 MST
Received: from ctc.contel.com (turing.ctc.contel.com) by Arizona.EDU; Mon, 30 Apr 90 10:49 MST
Received: from demo360.ctc.contel.com by ctc.contel.com (4.0/SMI-4.0) id AA04565; Mon, 30 Apr 90 13:47:19 EDT
Resent-Date: Mon, 30 Apr 90 10:49 MST
Date: Mon, 30 Apr 90 13:47:19 EDT
From: reid@ctc.contel.COM
Subject: RE: grammars, questions
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9004301747.AA04565@ctc.contel.com>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

> > It appears that Icon has no trouble representing
> > context free languages, with the restriction that there can be no
> > left-recursion in the grammar.
>
> Certain special cases of left-recursion can be converted into looping.
> Unfortunately, this bounds backtracking so to you have to know that
> backtracking won't find other solutions.
>
> For example, the production
>
>     s ::= t | s "+" t
>
> can be parsed with a procedure something like
>
>     procedure s()
>         x := []
>         push(x, t()) | fail
>         while ="+" do
>             push(x, t()) | stop("syntax error")
>         return x
>     end
>

Look at converting your LL1-style BNF to extended BNF (EBNF). Three nice
things happen:

1) your grammar is shorter, much more readable and no left recursion is
   needed
2) the implementing procedure for a nonterminal is real straightforward
   and
3) adding attributes and semantic actions is much easier.

Tom.

Thomas F. Reid, Ph. D. (703)818-4505 (work)
Contel Technology Center (703)742-8720 (home)
15000 Conference Center Drive Net: reid@ctc.contel.com
P.O. Box 10814
Chantilly, Va.
22021-3808

From goer@sophist.uchicago.EDU Mon Apr 30 12:28:33 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA02825; Mon, 30 Apr 90 12:28:33 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Mon, 30 Apr 90 12:29 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Mon, 30 Apr 90 14:28:40 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA03997; Mon, 30 Apr 90 14:24:28 CDT
Resent-Date: Mon, 30 Apr 90 12:30 MST
Date: Mon, 30 Apr 90 14:24:28 CDT
From: Richard Goerwitz
Subject: clarification requested
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id:
Message-Id: <9004301924.AA03997@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

# It has been asked of me that I clarify what I mean by implementing
# indexed grammars like Prolog's in Icon. What follows here is a
# very simple example. Note how Prolog has the advantage of either
# producing or recognizing. Icon, however, has the advantage of
# providing an immediate interface with the outside world (no need
# for tokenizing or for giving input like :- s([the husband...],[])).
# Please don't anyone gripe about how incomplete the grammars are,
# or about the fact that the Prolog and Icon code are not exactly
# equivalent! Maybe someone will kindly offer us a DCG -> Icon
# converter.
#
# Naturally, the Prolog is much shorter. This is the sort of thing
# it was designed to do.
#
# s --> np(Number), vp(Number).
# np(Number) --> det, n(Number).
# vp(Number) --> v(Number), np(_).
#
# % I get so tired of "man and wife"; let's try "woman and husband" :-)
#
# n(singular) --> [woman].
# n(plural) --> [husbands].
# n(plural) --> [women].
# n(singular) --> [husband].
# v(singular) --> [loves].
# v(singular) --> [hates].
# v(plural) --> [hate].
# v(plural) --> [love].
# det --> [the].
procedure main()
    while input_line := trim(map(!&input),',.?!') do
        write(input_line ? S())
end

procedure S()
    NP() == VP() & pos(0) & (return "yes")
    return "no"
end

procedure NP()
    DET() & tag := N() & (suspend tag)
end

procedure VP()
    tag := V() & NP() | &null & (suspend tag)
end

procedure DET()
    ="the" & =" " | &null & suspend
end

procedure N()
    suspend Nsing() | Nplur()
end

procedure Nsing()
    wordlst := ["husband","woman"]
    =!wordlst & =" " | &null & (suspend "singular")
end

procedure Nplur()
    wordlst := ["husbands","women"]
    =!wordlst & =" " | &null & (suspend "plural")
end

procedure V()
    suspend Vsing() | Vplur()
end

procedure Vsing()
    wordlst := ["loves","hates"]
    =!wordlst & =" " | &null & (suspend "singular")
end

procedure Vplur()
    wordlst := ["love","hate"]
    =!wordlst & =" " | &null & (suspend "plural")
end

-Richard L. Goerwitz goer%sophist@uchicago.bitnet
goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer

From reid@ctc.contel.com Mon Apr 30 14:16:46 1990
Received: from turing.ctc.contel.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA11520; Mon, 30 Apr 90 14:16:46 MST
Received: from demo360.ctc.contel.com by ctc.contel.com (4.0/SMI-4.0) id AA05049; Mon, 30 Apr 90 17:16:55 EDT
Date: Mon, 30 Apr 90 17:16:55 EDT
From: reid@ctc.contel.com (Tom Reid x4505)
Message-Id: <9004302116.AA05049@ctc.contel.com>
To: goer@sophist.uchicago.edu, icon-group@cs.arizona.edu
Subject: RE: grammars, questions
Cc: reid@ctc.contel.com
Status: O

> From goer@sophist.uchicago.edu Mon Apr 30 14:58:46 1990
> From: Richard Goerwitz
> To: reid@ctc.contel.com
> Subject: RE: grammars, questions
>
> What are LL1-style BNFs? It's the LL1 that I don't understand.
> You don't need to post this to the group, unless you want to
> approach this as an "in case not everyone knows what we're
> talking about, here's some background" type posting.
>
> -Richard L.
Goerwitz goer%sophist@uchicago.bitnet
> goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer
>

Richard (and others):

Sorry about that. LL1 grammars are a subset of context-free grammars
(cfgs) that arose in (automatic) parser construction. There are two
kinds of classical parsers: bottom-up and top-down. Bottom-up spawned
the LR, LALR, etc. cfg-subset grammars as the largest languages
recognized by particular kinds of pushdown automata (PDA) which
recognized in linear time and space.

In order to design a top-down predictive, recursive descent parser, you
needed to restrict cfgs to what was termed LL1 grammars. The two basic
restrictions were that no production could have left recursion and that
you could not have common prefixes. The reason for both is in the
following example. Assume that

    A ::= q1
    A ::= q2
    ...
    A ::= qn

are A's productions in a grammar G. In order for G to be LL1 (and thus
have a non-ambiguous, non-backtracking recursive descent parser), the
FIRST sets of q1, q2, ..., qn must be disjoint (i.e., in order to avoid
backtracking, the PDA must be able to uniquely choose which A-production
to apply by looking at just the next token). Unless the language is
trivial, the FIRST set of the left recursive production A ::= A ...
would contain all the other FIRST symbols.

Oh yes, the FIRST set for a production is the set of all terminal
symbols which can begin a string derived from that production's
right-hand side.
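
The disjointness test described above can be sketched in Icon. This is only an illustration under stated assumptions, not anything posted to the list: the procedure names (first_sets, prod_first, disjoint_first, report), the table-of-lists grammar representation, and both toy grammars are hypothetical, and the sketch assumes a grammar with no empty productions, so that the FIRST set of a production depends only on its first symbol.

```icon
# Hypothetical sketch: a grammar is a table mapping each nonterminal to a
# list of productions; each production is a list of symbol strings.
# Assumption: no empty (epsilon) productions.

procedure first_sets(g)
   local first, changed, nt, prod, sym, before
   first := table()
   every first[key(g)] := set([])
   changed := 1
   while \changed do {                      # repeat until no FIRST set grows
      changed := &null
      every nt := key(g) do
         every prod := !g[nt] do {
            sym := prod[1]
            before := *first[nt]
            if \g[sym] then
               first[nt] ++:= first[sym]    # nonterminal: inherit its FIRST set
            else
               insert(first[nt], sym)       # terminal: it begins the string
            if *first[nt] > before then changed := 1
         }
   }
   return first
end

# FIRST set of a single production (determined by its first symbol).
procedure prod_first(g, first, prod)
   if \g[prod[1]] then return first[prod[1]]
   return set([prod[1]])
end

# Succeeds only if, for every nonterminal, the FIRST sets of its
# productions are pairwise disjoint.
procedure disjoint_first(g, first)
   local nt, prods, i, j
   every nt := key(g) do {
      prods := g[nt]
      every i := 1 to *prods - 1 do
         every j := i + 1 to *prods do
            if *(prod_first(g, first, prods[i]) ** prod_first(g, first, prods[j])) > 0 then
               fail
   }
   return
end

procedure report(name, g)
   if disjoint_first(g, first_sets(g)) then
      write(name, ": FIRST sets are disjoint")
   else
      write(name, ": FIRST sets overlap")
end

procedure main()
   local g

   # Toy grammar:  S ::= "a" S | "b" T     T ::= "c" | "d" S
   g := table()
   g["S"] := [["a", "S"], ["b", "T"]]
   g["T"] := [["c"], ["d", "S"]]
   report("right-recursive grammar", g)

   # Left-recursive toy grammar:  A ::= A "+" "t" | "t"
   g := table()
   g["A"] := [["A", "+", "t"], ["t"]]
   report("left-recursive grammar", g)
end
```

In the second grammar the left-recursive production inherits FIRST(A), which already contains "t", so it collides with the alternative A ::= "t". That is the overlap the message describes. A fuller treatment would also have to handle empty productions and FOLLOW sets, which this sketch deliberately leaves out.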
From @RELAY.CS.NET,@dg-rtp.rtp.dg.com:langley@DG-RTP.DG.COM Tue May 1 05:51:44 1990
Received: from relay.cs.net by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA09247; Tue, 1 May 90 05:51:44 MST
Received: from dg-rtp.rtp.dg.com by RELAY.CS.NET id aa16747; 1 May 90 8:51 EDT
Received: from bigbird.rtp.dg.com by dg-rtp.dg.com (4.20/4.7) id AA28301; Tue, 1 May 90 08:49:18 edt via SMTP
Received: by bigbird.rtp.dg.com (4.20/rtp-s01) id AA19283; Tue, 1 May 90 08:50:20 edt
Date: Tue, 1 May 90 08:50:20 edt
From: Mark L Langley
Message-Id: <9005011250.AA19283@bigbird.rtp.dg.com>
Return-Receipt-To: langley@dg-rtp.dg.com
To: icon-group@cs.arizona.edu
Subject: Challenge! & more on grammars, questions
Status: O

<[Challenge] is at the end, skip ahead if you like...>

Richard Goerwitz asks
> >
> > What are LL1-style BNFs?
> > Richard (and others):

To which Tom Reid replied
>
> Sorry about that. LL1 grammars are a subset of context-free grammars (cfgs)
> that arose in (automatic) parser construction. There are two kinds of
> classical parsers: bottom-up and top-down. Bottom-up spawned the LR, LALR,
> etc. cfg-subset grammars as the largest languages recognized by
> particular kinds of pushdown automata (PDA) which recognized in linear time
> and space.
>

If I may add a little,...

LL(1) refers to LEFT-scan-of-input (i.e. reading left to right),
producing a LEFTmost derivation, using at most ONE token of lookahead.
Thus LR(1) means scanning left to right and producing a RIGHTmost
derivation (in reverse), again with one token of lookahead. LALR(1)
means Look-ahead LR(1), which is a technique for reducing LR parsing
tables, which give linear-time parsing but are huge, to something a lot
smaller. (The Icon parser is written in YACC, which is an LALR parser
generator.) LALR(1) is theoretically less powerful than LR(1) but I have
never found a grammar I couldn't rewrite.

LL parsing is essentially what a predictive (non-backtracking) recursive
descent parser does. It is generally thought to be more intuitive -- you
can think about an LL parser as always making forward progress by
consuming one token per state.
LR parsing detects errors as soon as possible (i.e. the fewest number
of tokens that can't be something legitimate are flagged, whereas LL
parsers may kick around and not report an error right away).

While LL parsers can't handle left recursion, LR parsers don't like
right recursion: it tends to overflow the internal pushdown stack. For
example, matching parentheses using right recursion in LR is bad.
Therefore in LR and LALR you should rewrite your rules to be left
recursive. This is usually a mechanical process, but not always.
(Consider what happens if you are expecting some action to take place at
the same time a production is matched.)

There is a well-known theorem (I couldn't find it) that states that any
LL(k) grammar can be rewritten as an LL(1) grammar. This is easy to see
because you can keep "left-factoring" productions.

Can you rewrite an arbitrary LR(k) grammar as an LR(1) grammar? I have
yet to find an LR(k) grammar that I couldn't rewrite, but I haven't
successfully proven the theorem either... But I'm not a bright
theoretician... Anybody?

Mark

From @mirsa.inria.fr:ol@cerisi.cerisi.Fr Tue May 1 11:04:23 1990
Received: from mirsa.inria.fr by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA28571; Tue, 1 May 90 11:04:23 MST
Received: from cerisi.cerisi.fr by mirsa.inria.fr with SMTP (5.59++/IDA-1.2.8) id AA04105; Tue, 1 May 90 20:05:37 +0200
Message-Id: <9005011805.AA04105@mirsa.inria.fr>
Date: Tue, 1 May 90 20:03:23 -0100
Posted-Date: Tue, 1 May 90 20:03:23 -0100
From: Lecarme Olivier
To: langley@DG-RTP.DG.COM
Cc: icon-group@cs.arizona.edu
In-Reply-To: Mark L Langley's message of Tue, 1 May 90 08:50:20 edt <9005011250.AA19283@bigbird.rtp.dg.com>
Subject: Challenge!
& more on grammars, questions Status: O The theorem that for every LR(k) grammar with k>1 there exists an equivalent LR(1) grammar is only stated by Waite & Goos (Compiler construction, Springer-Verlag 1984), but it is demonstrated by Aho & Ullman (The theory of parsing, translation, and compiling, Prentice-Hall 1972). Unfortunately, as explained by Waite & Goos, "the transformation underlying the proof of this theorem is unsuitable for practical purposes". Olivier Lecarme From icon-group-request@arizona.edu Tue May 1 13:57:01 1990 Resent-From: icon-group-request@arizona.edu Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA12403; Tue, 1 May 90 13:57:01 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 1 May 90 13:53 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA17432; Tue, 1 May 90 13:15:57 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Tue, 1 May 90 13:57 MST Date: 1 May 90 20:00:17 GMT From: tank!sophist!goer@handies.ucar.EDU Subject: RE: linguism Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: <9F8C8DA101FFA097D5@Arizona.EDU> Message-Id: <9062@tank.uchicago.edu> Organization: University of Chicago X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: <9004301500.AA06810@bigbird.rtp.dg.com>, <9004301617.AA16804@megaron.cs.arizona.edu> Status: O In article <9004301617.AA16804@megaron.cs.arizona.edu> sboisen@BBN.COM writes: >> Can you say a little about the linguistic convention for describing >...it's not at all clear that CFGs are sufficiently powerful for >representing NL (one classic case for this argument is a construction >like "John, Bill, and Fred love Mary, Sally, and Julie, >respectively"). 
I'd think that the real problem here is that the noun phrases must be
recursively defined (as someone pointed out, it's not a problem if we
are using a bottom-up parser). I think the argument here is that the
two noun phrases, "John, Bill, and Fred" and "Mary, Sally, and Julie"
must be of equal length. I dunno. We should probably extend this
statement even further. We should probably be saying that the two noun
phrases have to have equal length AND that each member in the first set
should plausibly be able to "love" the corresponding member in the
second set. Both of these criteria, the length of the noun phrases, and
the applicability of the action "love" to each of the respective
members, are really semantic considerations.

Let me explain this another way. From the standpoint of the grammar,
there is absolutely nothing wrong with saying, "John, Bill, and Fred
love Mary, Sally, and Julie." You can, if you want, tack on an adverb,
"J, B, and F love M, S, and J very much." Whether that adverb is
"respectively" or "very much" is not important to the grammar. The
consideration that the length of the noun phrases must "make sense"
(i.e. be of the same length, and have members that can love and be
loved) is extraneous to the basic grammar.

Perhaps we should be integrating syntax and semantics from the start.
Still, looking at this sentence in the terms in which we in this group
are currently discussing parsing problems, the word "respectively"
cannot be said to impose any extraordinary new organization on a
sentence. It is just an adverb which the speaker may or may not add,
depending on what his/her meaning is, and whether the word makes sense
in the context of what is being said.

Perhaps irrelevant side note: How often do you really hear people use
the term, respectively, in the context you mentioned? Just curious.
To me it is primarily an affectation of the educated, and I rarely hear
even them using it in contexts where more than two things are being
respectively-ed. This is due to the fact that people don't naturally
think about how many members are in the noun phrases they are using, and
it's pretty easy to forget, and, say, put four nouns in the first set,
and five in the second. The very fact that we have to strain at this
construction tells me that it is not really a fundamental part of the
grammar in the same sense as is the fact that most sentences consist of
a noun and a verb phrase.

Point: Many such examples where natural languages are said to require
exotic parsing mechanisms in fact may not. What they require is a way of
integrating semantics more closely into syntax. We also have to keep our
eyes peeled for cases where marginal or literary usage is thrust into
the core of the grammar. In most such cases there is indeed an important
process at work. However, this process rarely belongs in the basic
structural mechanisms. In the case of "respectively," I believe the
correct interpretation resides in the interactions of syntax and
semantics.

I'd appreciate argument on this point, especially if it is accompanied
by Icon code!

-Richard L.
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From icon-group-request@arizona.edu Tue May 1 18:02:30 1990 Resent-From: icon-group-request@arizona.edu Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA29079; Tue, 1 May 90 18:02:30 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 1 May 90 18:02 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA04828; Tue, 1 May 90 17:54:00 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Tue, 1 May 90 18:04 MST Date: 1 May 90 15:12:40 GMT From: ntvax!leff@tut.cis.ohio-state.EDU Subject: Reversible Assignment Problem Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: <9F6A1EE99FFFA09F0F@Arizona.EDU> Message-Id: <1990May1.151240.11020@dept.csci.unt.edu> Organization: University of North Texas X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O Reversible assignments to global variables inside procedures are not being reversed. The program below prints out three. It should print out one. Obviously, the reversible assignment to z is not being reversed when test does not lead to an eventual success. Why does the reversing of the assignment not take place, and what would make it do so? The Icon Programming Language, Chapter 11, section 11.8.2 did not shed any light on these issues. 
global z

procedure test(i)
    z<-z+1
    if i~=3 then fail
    if i=3 then return 1
end

procedure main()
    z:=0
    every i:=(1 to 10) do
        if test(i) then write("test succeeded ",i," ",z)
end

From goer@sophist.uchicago.EDU Tue May 1 19:28:53 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA04110; Tue, 1 May 90 19:28:53 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Tue, 1 May 90 19:28 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 1 May 90 21:27:23 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA06119; Tue, 1 May 90 21:23:09 CDT
Resent-Date: Tue, 1 May 90 19:28 MST
Date: Tue, 1 May 90 21:23:09 CDT
From: Richard Goerwitz
Subject: reversible assignment
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9F5E450635BFA0A633@Arizona.EDU>
Message-Id: <9005020223.AA06119@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

# Reversible assignments to global variables inside procedures are
# not being reversed.
#
# The program below prints out three. It should print out one. Obviously,
# the reversible assignment to z is not being reversed when test does not
# lead to an eventual success.
#
# Why does the reversing of the assignment not take place, and what would
# make it do so?
#
# The Icon Programming Language, Chapter 11, section 11.8.2 did not shed
# any light on these issues.

global z

procedure test(i)
    z<-z+1
    if i~=3 then fail
    if i=3 then return 1
end

procedure main()
    z:=0   # not needed!!!!!
    every i:= 1 to 10 do
        if test(i) then write("test succeeded ",i," ",z)
end

Okay, it appears that you are misunderstanding the purpose of reversible
assignment, and, because of this, are not quite getting the idea of how
and when to use it (or how it behaves when you do). Don't worry. You
aren't the only one who has had trouble with this....
Basically, Icon is set up for control backtracking. Hence if you say

    i := 0
    i +:= |1 & i = 5

the i = 5 comparison will fail at first. So the expression will
backtrack to the i +:= |1 expression to see if it can produce another
result. The assignment itself doesn't produce another result; however
the expression |1 can (basically | before something makes Icon do it
repeatedly). Hence it produces another 1, and that 1 is then added (+:=)
to i to make i one bigger. This assignment operation succeeds, and so
control passes to the i = 5 comparison. This fails, and so the process
of backtracking and incrementing is repeated once again.

Note that throughout this process, the i, when it is assigned a value,
keeps that value, even if the expression i +:= |1 is being resumed. That
is, if you add 1 to i, then go on to test whether i is equal to 5, and
then resume to increment i again - if you do this, the i will not get
reset to whatever value it had before you started. It will keep the last
assigned value. This is what makes it get bigger every time i +:= |1 is
resumed, rather than going back to the original value it had. Eventually
it will reach the value of 5, and the expression as a whole will
succeed.

Sometimes this feature isn't wanted. Sometimes you don't want the i to
keep the value you gave it if it is resumed. Let me offer a silly
example:

    str := "string of nonsensical strings"
    59 < (position <- find("string",str))
    if \position
    then write("the word \"string\" occurs after position 59")
    else write("the word \"string\" occurs before position 60")

Essentially, position will be assigned the value of the find expression,
and then its value will be compared with 59. If it is less than or equal
to 59 (which it will be every time the comparison takes place), then the
expression (position <- find(...)) will get resumed. When it is resumed,
the former assignment of position will get undone. Then it will be
assigned a new value, and the comparison will be made again.
On the next resumption, find will fail. There are only two places where "string" occurs in str. Because we included the reversible assignment operator, position will be returned to its former value (namely &null), and control will move to the next line.

I know that this example is silly, but I wanted to illustrate the point without having to get into string scanning (the place where reversible assignment seems most handy). You'll eventually get to the chapter in Griswold & Griswold on complex string processing. The Arb() procedure is a nice little example of where you really have to have reversible assignment. Put in general terms, reversible assignment makes backtracking undo assignments. Normally backtracking doesn't do this.

Now, to your sample program. I don't know exactly what you would be using this for, but it doesn't matter. If I say

    a <- 1

a will be assigned the value of 1. Nothing will change this because the expression a <- 1 will not be resumed. I have heard the term "bounded" applied to this situation. Whatever you call it, it means it's done and that's it. Even if the procedure in which it occurs is resumed, you won't see any change. Only if you set it in a context where the expression itself will be resumed will you see any effects. You might write, for example,

    a <- 1 & open("inputfile","r")

If the open() function fails to open "inputfile," then the expression a <- 1 will be resumed. Since there are no generators there, it will not produce another result, and a will be returned to the value it had before the assignment was made.

I hope that this long-winded discursus helps you. Basically, I'd stay away from reversible assignment until you have gotten past generators, and into string scanning far enough to understand, say, the Arb() program. Your basic misconception is that if a procedure is called in which reversible assignment occurs the assignment will be undone. This isn't the case.
It is only if the expression in which it occurs causes control to backtrack through the assignment that it will be undone. It's nice to see questions like this on this newsgroup. The surveys tell us that most Icon users call themselves beginners or, perhaps, intermediate-ers. I sometimes wonder whether the fact that discussion here is dominated by people who have been doing Icon for some time intimidates these people, or whether they feel they are wasting bandwidth. It's not a waste at all! Don't be intimidated! -Richard From wgg@cs.washington.EDU Tue May 1 21:01:22 1990 Resent-From: wgg@cs.washington.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA09716; Tue, 1 May 90 21:01:22 MST Return-Path: wgg@cs.washington.EDU Received: from june.cs.washington.edu by Arizona.EDU; Tue, 1 May 90 21:01 MST Received: by june.cs.washington.edu (5.61/7.0jh) id AA04237; Tue, 1 May 90 20:59:24 -0700 Resent-Date: Tue, 1 May 90 21:02 MST Date: Tue, 1 May 90 20:59:24 -0700 From: wgg@cs.washington.EDU Subject: RE: reversible assignment Resent-To: icon-group@cs.arizona.edu To: goer@sophist.uchicago.EDU, icon-group@arizona.edu Resent-Message-Id: <9F513282723FA0A08F@Arizona.EDU> Message-Id: <9005020359.AA04237@june.cs.washington.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: goer@sophist.uchicago.EDU, icon-group@Arizona.edu Status: O Richard Goerwitz's answer is correct: data backtracking is possible only if the control backtracks to the expression involved. In the expressions z <- 1 z = 10 each has an implicit semi-colon at the end: z <- 1; z = 10; The first semicolon delimits the assignment from the comparison, and prevents backtracking back into the assignment from the comparison if it fails. Thus the assignment expression is ``bounded'' by the semicolon, and cannot be resumed once it has yielded a result. 
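A minimal sketch of this bounded case, with the conjunction form included for contrast. The starting value 0 and the output labels are assumptions chosen for illustration; they are not taken from the thread.

```icon
# Sketch only: z follows the thread; the values and labels are assumptions.
procedure main()
   local z

   # Case 1: two separate expressions. The implicit semicolon bounds the
   # assignment, so the failing comparison cannot backtrack into it.
   z := 0
   z <- 1
   z = 10                        # fails, but z <- 1 is already finished
   write("bounded:   z = ", z)

   # Case 2: one expression joined by conjunction. The failing comparison
   # resumes z <- 1, which has no further result and restores the old value.
   z := 0
   (z <- 1) & (z = 10)
   write("unbounded: z = ", z)
end
```

Run as written, the first write should report z = 1 and the second z = 0.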
On the other hand, in the expression

    (z <- i) & (z = 10)

the assignment itself is not bounded, and it is possible to backtrack from the comparison into the assignment, if the comparison fails.

In most cases the semantics of traditional-appearing control structures in Icon is to bound an expression so that it produces only one result. This prevents ``surprises'', and also avoids the overhead of often unneeded backtracking. Hence the control structures if-then and while-do bound their control expressions (but not their bodies!). Of course, it is easy to phrase ``backtracking'' versions of these control structures:

    basic                     backtracking
    -------------------------------------------------------
    if X then Y else Z        (X & Y) | Z
    while X do Y              every X do Y
    X;Y                       X & Y
    return X                  suspend X

One could easily argue that I've chosen the wrong analogues. (Suppose that the analogue for while-do resumes X only if Y fails, otherwise X just starts over. Consider, too, its behavior when Y contains a break.)

Bill Griswold

From ralph Wed May 2 09:54:57 1990
Date: Wed, 2 May 90 09:54:57 MST
From: "Ralph Griswold"
Message-Id: <9005021654.AA23250@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA23250; Wed, 2 May 90 09:54:57 MST
To: icon-group
Subject: icon-group mail
Status: O

There's been a lot of interesting mail to icon-group recently. We're encouraged to see this activity and hope it will keep up.

Please bear in mind when you're sending e-mail to icon-group that long messages sometimes cause problems. The main difficulty is when one person responds to icon-group mail and includes most or all of the text of the message toward which the response is directed. This sometimes makes such responses very bulky.

While this is no problem for most persons on icon-group, it is for some. Persons who get icon-group mail via modem connections may have difficulty receiving long messages and it's also expensive for them. Long messages also may be refused by electronic gateways.
This, for example, can prevent icon-group mail from getting to electronic news distribution. Please take a little extra time when composing lengthy messages to be sure you only include relevant information. Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From gudeman Wed May 2 12:44:25 1990 Resent-From: "David Gudeman" Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA05435; Wed, 2 May 90 12:44:25 MST Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Wed, 2 May 90 12:40 MST Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA05223; Wed, 2 May 90 12:38:12 MST Resent-Date: Wed, 2 May 90 12:41 MST Date: Wed, 2 May 90 12:38:12 MST From: David Gudeman Subject: backtracking rules (was: Reversible Assignment Problem) Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: <9ECDF6D2891FA09E8B@Arizona.EDU> Message-Id: <9005021938.AA05223@megaron.cs.arizona.edu> In-Reply-To: ntvax!leff@tut.cis.ohio-state.EDU's message of 1 May 90 15:12:40 GMT <1990May1.151240.11020@dept.csci.unt.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Status: O This problem with reversible assignment is part of a larger problem that a lot of people seem to have with Icon. That is the problem of understanding where backtracking "goes". In particular, the reversible assignment problem seems to be caused by the following misunderstanding: global x procedure foo() x <- 1 suspend x end procedure main() every foo() end where people think that when foo gets resumed, that resumes every expression in foo that suspended in the past. But once you pass the semi-colon (an implicit one in this case) the expression before the semicolon is no longer suspended, it is finished. 
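For contrast, here is a sketch in which the reversible assignment is part of the suspended expression itself, so that resuming foo really does resume the assignment. It is not from the original post; the initial value 0 and the output text are assumptions made for illustration.

```icon
# Sketch only: same names as the example above, but the assignment is now
# inside the expression that suspend evaluates.
global x

procedure foo()
   suspend x <- 1      # assignment and suspension in one expression
end

procedure main()
   x := 0
   every foo() do
      write("inside the loop: x = ", x)
   # When every resumes foo, suspend resumes x <- 1, which restores the
   # previous value of x and then fails, so foo fails and the loop ends.
   write("after the loop:  x = ", x)
end
```

Run as written, this should report x = 1 inside the loop and x = 0 after it.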
Here is a good test for what expressions in a procedure can be resumed after a suspend: imagine what would happen if you replaced

    suspend EXPRESSION

with

    every write(image(EXPRESSION))

Basically, the sequence that would be written is the sequence that a calling procedure would see. Also, any backtracking that would be done between written values gets done between real suspensions. If you wrote

    procedure foo()
       x <- 1
       every write(image(x))
    end

from above, would the assignment ever get reversed?

From utah-cs!boulder!ncar.UCAR.EDU!oddjob!sophist.uchicago.edu.richard!zenu! Thu May 3 09:55:47 1990
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA07804; Thu, 3 May 90 09:55:47 MST
Received: from boulder.UUCP by cs.utah.edu (5.61/utah-2.10-cs) id AA13526; Thu, 3 May 90 10:54:27 -0600
Received: by boulder.Colorado.EDU (cu-hub.890824)
Received: by ncar.ucar.EDU (5.61/ NCAR Central Post Office 04/10/90) id AA07950; Thu, 3 May 90 10:53:42 MDT
Received: from tank.uchicago.edu by oddjob.uchicago.edu Thu, 3 May 90 11:16:11 CDT
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 3 May 90 11:17:09 CDT
Return-Path:
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA08195; Thu, 3 May 90 11:12:52 CDT
Received: by sophist.uchicago.edu (smail2.5) id AA00229; 3 May 90 10:31:14 CDT (Thu)
Subject: lifetime of variables
To: icon-group@arizona.edu
Date: Thu, 3 May 90 10:31:13 CDT
X-Mailer: ELM [version 2.2 PL0]
Message-Id: <9005031031.AA00229@sophist.uchicago.edu>
From: utah-cs!boulder!sophist.uchicago.edu!richard (Richard L. Goerwitz III)
Status: O

Why is it that a procedure like

    procedure return_table()
       tbl := table()
       return tbl
    end

works? I guess I never really thought about it before (I don't mentally transfer Icon into equivalent constructions in other languages). If I had no familiarity with Icon, I'd probably say "make tbl static or global, 'cause it'll disappear when return_table() returns, and all you'll be left with is a pointer aiming into the great void."
From icon-group-request@arizona.edu Thu May 3 13:34:18 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron (5.59-1.7/15) via SMTP id AA22791; Thu, 3 May 90 13:34:18 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 3 May 90 13:34 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA04254; Thu, 3 May 90 13:18:55 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 3 May 90 13:34 MST
Date: 3 May 90 20:18:39 GMT
From: swrinde!cs.utexas.edu!jnino@ucsd.EDU
Subject: Differences between version 5 and version 7
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9DFD6CB34E7FA05491@Arizona.EDU>
Message-Id: <1280@gorath.cs.utexas.edu>
Organization: U. Texas CS Dept., Austin, Texas
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I am getting started with this language. I got a book written by R. Griswold and published in 1983 titled "The Icon Programming Language". The preface indicates that Version 5 is used in the book. I'd like to know how different Version 5 is from Version 7 or even Version 8. Is it advisable to go ahead and get an intro to Icon using this book?
Thanks

Jaime Nino

From goer@sophist.uchicago.EDU Thu May 3 16:53:20 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Maggie.Telcom.Arizona.EDU by megaron (5.59-1.7/15) via SMTP id AA05262; Thu, 3 May 90 16:53:20 MST
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Thu, 3 May 90 16:54 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Thu, 3 May 90 18:53:45 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA08853; Thu, 3 May 90 18:49:25 CDT
Resent-Date: Thu, 3 May 90 16:54 MST
Date: Thu, 3 May 90 18:49:25 CDT
From: Richard Goerwitz
Subject: go ahead with Griswold & Griswold
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9DE172D9C61FA0B255@Arizona.EDU>
Message-Id: <9005032349.AA08853@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

Go ahead and use Griswold & Griswold. Icon is mostly backwards compatible with its former incarnations. You'll notice that co-expressions came and went and came back again from version 5 to 7. String scanning no longer operates with simple global variables &pos and &source. These variables still exist. However, their scope is a bit different. No need to worry about the specifics. There are a few nice features, like faster and cleaner options 3 and 4 for sort. We now have math functions that used to be part of the library (sin, etc.).

In general, don't worry about the differences. If something seems awry - which is unlikely - post. You certainly won't be the only one who's had questions :-).

The version of Icon you are using will certainly have documentation to go with it. When you feel comfortable enough with the language, just browse through it. It is well-written and pretty concise. You'll quickly get caught up on the additions that have been made to the language since version 5.

-Richard L.
Goerwitz
goer%sophist@uchicago.bitnet
goer@sophist.uchicago.edu
rutgers!oddjob!gide!sophist!goer

From nevin1@ihlpb.att.com Thu May 3 19:41:49 1990
Message-Id: <9005040241.AA13224@megaron>
Received: from att-in.att.com by megaron (5.59-1.7/15) via SMTP id AA13224; Thu, 3 May 90 19:41:49 MST
From: nevin1@ihlpb.att.com
Date: Thu, 3 May 90 19:41 CDT
Original-From: ihlpb!nevin1 (Nevin J Liber +1 708 979 4751)
To: att!cs.arizona.edu!icon-group
Subject: Re: lifetime of variables
Status: O

>Why is it that a procedure like
>
>    procedure return_table()
>       tbl := table()
>       return tbl
>    end
>
>works. [...]
>If I had no familiarity with Icon, I'd probably say
>"make tbl static or global, 'cause it'll disappear when
>return_table() returns, and all you'll be left with is a pointer aiming
>into the great void."

[Side note: the above is a good explanation of a very common C programming error.]

The table sticks around because it is stored in that area of memory commonly referred to as the "heap". (This is the same type of memory that C's malloc() function returns pointers into.) [Note: there are other ways of implementing call-return mechanisms (eg: copy the object before returning), but they have other problems associated with them.]

One purpose of a heap is to have objects survive procedure calls and returns. Like a static variable, a heap object has limited visibility. However, it differs from statics in that each call to a function like your return_table() returns a DIFFERENT table each time. (I don't mean to say that if tbl were declared static return_table() would return the same table each time; its behavior would not change. What I mean is that in the framework of a language like C, if you return a pointer to a static you will always get the same address, while if you return a pointer to something malloc()ed you will get a different address.)

The other purpose of having a heap is to create objects of arbitrary size or of sizes unknown at compile time.
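The point about each call producing a different table can be checked with a short sketch. The procedure is the one quoted above with a local declaration added; the key "answer", the value 42, and the output text are assumptions made for illustration.

```icon
# Sketch only: the table created inside return_table() outlives the call,
# and each call creates a new, distinct table.
procedure return_table()
   local tbl
   tbl := table()
   return tbl
end

procedure main()
   local t1, t2
   t1 := return_table()
   t2 := return_table()
   t1["answer"] := 42          # still usable after the procedure returned
   write("size of t1: ", *t1, ", size of t2: ", *t2)
   if t1 === t2 then write("same table")
   else write("different tables")
end
```

Run as written, this should report sizes 1 and 0 and then "different tables", since === compares structures by identity rather than by content.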
I hope I haven't rambled too long. It's been a long day. :-)

NEVIN ":-)" LIBER
nevin1@ihlpb.ATT.COM
(708) 831-FLYS

From icon-group-request@arizona.edu Thu May 3 19:48:11 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron (5.59-1.7/15) via SMTP id AA13499; Thu, 3 May 90 19:48:11 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Thu, 3 May 90 19:49 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA27182; Thu, 3 May 90 19:35:29 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Thu, 3 May 90 19:49 MST
Date: 4 May 90 01:45:12 GMT
From: uupsi!sunic!sics.se!sics!soder@rice.EDU
Subject: Icon on Sun386i
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <9DC90397F2DFA022A0@Arizona.EDU>
Message-Id:
Organization: nmp
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
Status: O

I have run Icon 7.5 successfully on a Sun386i for a long time. But without co-expressions.

I recently fetched Icon 8.0. There is a configuration called "sun386i" in the distribution. I copied "rswitch.c" from the "i386_sysv" configuration and ended up with the following "define.h":

    #define SysTime
    #define GetHost
    #define MaxHdr 5000
    #define UNIX 1

No other modifications to the "sun386i" configuration. My knowledge of assembler and co-expression implementation is identical to zero, but this installation passed the co-expression tests and seems to work fine.

Well, there is one Sun386i-specific glitch. (It was there in 7.5 too.) Icon programs generate a random (?) return code, except "stop" that reliably returns '1'. Not even "exit(0)" helps. I've noticed that C programs that just flow out of "main" also return "random" codes. Can this be the problem in "iconx"? I can't easily find out.
Anyway, a C program must end by executing "exit" to be portable.

--
----------------------------------------------------
Hakan Soderstrom          Phone:  +46 (8) 752 1138
NMP-CAD                   Fax:    +46 (8) 750 8056
P.O. Box 1193             E-mail: soder@nmpcad.se
S-164 22 Kista, Sweden

From @RELAY.CS.NET,@dg-rtp.rtp.dg.com:langley@DG-RTP.DG.COM Fri May 4 06:55:47 1990
Received: from relay.cs.net by megaron (5.59-1.7/15) via SMTP id AA16623; Fri, 4 May 90 06:55:47 MST
Received: from dg-rtp.rtp.dg.com by RELAY.CS.NET id aa18447; 4 May 90 9:55 EDT
Received: from bigbird.rtp.dg.com by dg-rtp.dg.com (4.20/4.7) id AA03537; Fri, 4 May 90 09:53:38 edt via SMTP
Received: by bigbird.rtp.dg.com (4.20/rtp-s01) id AA17493; Fri, 4 May 90 09:54:50 edt
Date: Fri, 4 May 90 09:54:50 edt
From: Mark L Langley
Message-Id: <9005041354.AA17493@bigbird.rtp.dg.com>
Return-Receipt-To: langley@dg-rtp.dg.com
To: icon-group@cs.arizona.edu
Subject: Re: lifetime of variables
Status: O

Richard Goerwitz III asks
>
> Why is it that a procedure like
>
>    procedure return_table()
>       tbl := table()
>       return tbl
>    end
>
> works. I guess I never really thought about it before (I don't
> mentally transfer Icon into equivalent constructions in other
> languages). If I had no familiarity with Icon, I'd probably say
> "make tbl static or global, 'cause it'll disappear when
> return_table() returns, and all you'll be left with is a pointer
> aiming into the great void."
>

Ah, this is one of the great things about Icon -- Memory management is done for you. Dynamic storage allocation is the trick.

Imagine two ways of using your office, playroom, or kitchen counter.

    Static Storage Allocation:  Take something out, put it back,
                                Take it out, put it back...
    Dynamic Storage Allocation: Take things out, put them back when
                                you need the space.

To make a long story short, the Icon garbage collector is responsible for collecting things that you no longer need. The interpreter doles out memory as needed.
When it runs out, it finds all the objects that could still be referenced and moves them together. This writes over all the objects that cannot be reached anymore, leaving space at the end. Between this and saying "Mother may I have some more?" to the operating system, it usually avoids running out of memory. Mark From utah-cs!boulder!ncar.UCAR.EDU!oddjob!sophist.uchicago.edu.goer!zenu! Fri May 4 13:48:45 1990 Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA24398; Fri, 4 May 90 13:48:45 MST Received: from boulder.UUCP by cs.utah.edu (5.61/utah-2.10-cs) id AA06666; Fri, 4 May 90 14:48:24 -0600 Received: by boulder.Colorado.EDU (cu-hub.890824) Received: by ncar.ucar.EDU (5.61/ NCAR Central Post Office 04/10/90) id AA06967; Fri, 4 May 90 14:47:06 MDT Received: from tank.uchicago.edu by oddjob.uchicago.edu Fri, 4 May 90 15:17:42 CDT Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 4 May 90 15:18:02 CDT Return-Path: Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA10051; Fri, 4 May 90 15:13:36 CDT Received: by sophist.uchicago.edu (smail2.5) id AA00157; 4 May 90 11:18:15 PDT (Fri) Subject: frequently-asked questions To: icon-group@arizona.edu Date: Fri, 4 May 90 11:18:14 PDT X-Mailer: ELM [version 2.2 PL0] Message-Id: <9005041118.AA00157@sophist.uchicago.edu> From: utah-cs!boulder!sophist.uchicago.edu!goer (Richard qua goer.) Status: O I wonder if we might do well to accumulate a "frequently-asked questions" list, to make things easier for people starting to learn Icon (presumably a large portion of the Icon-group's membership). I'll just post an entry, and if anyone wants to add to it, I'll simply append their additions to my list. I find that people starting to learn Icon tend to make similar mistakes, and I end up answering the same questions over and over. Not that answering them is difficult or tedious. 
I just hate to see people have to find out, after hours of debugging, that they have run into a problem that might easily have been avoided through the use of such a list. Take five minutes out, and add to the list! Problem: Why do I get unexpected results when I initialize a table like this: tbl := table([])? What I want is to make all the keys in tbl have empty lists as their initial values. Answer: Tables, sets, and lists in Icon are handled differently than, say, strings, csets, and integers. When you "dereference" a variable whose value is a string, cset, or integer, you get a string, cset or integer (nothing complicated here). In other words, if you say i := 1 j := i j will end up with a value of 1. When the i is dereferenced, it produces the integer 1, and *that* is what gets assigned to j. With structures like lists, however, dereferencing them produces a "pointer" to the structure in question. It does not produce a copy of the structure (for that, you have to use copy()). This is why, if you say l1 := ["hello"] l2 := ["hello"] if l1 === l2 then write("the same") else write("different") you will see "different" written to the screen. In effect, you have created two lists which, although they bear a structural similarity, reside in different places in memory, and therefore are *different lists*. What is the point here? The point is that, if you say tbl := table([]) you are actually setting up tbl so that each time you insert a new key, it will automatically be assigned the value produced by []. If you had said "tbl := table(1)" this would be fine. "1" produces the integer 1. Remember, however, that [] creates a specific structure (an empty list) and produces a pointer to that list. What you'll end up with, therefore, is a table with keys whose values are all pointers to the one list structure! What this does to your program is make it so that if you make any insertions into any key's value (e.g. 
put(tbl[key1],"hello")), you will find, suddenly, that *all* of the keys' values have been modified.

To make the long story short, you have to initialize the table using &null,

    tbl := table()     # the same as tbl := table(&null)

and then, each time you add a key, do this:

    /tbl[key] := []    # or /tbl[key] := list()

The above expression first checks to see whether key has been inserted into tbl yet, and if not, makes its value the empty list (the forward slash tests for the null value, and so if key is already present in the table, and has been assigned a value, tbl[key] := [] will not take place). You can then go about inserting things into this list as expected.

From ralph Sat May 5 07:19:59 1990
Date: Sat, 5 May 90 07:19:59 MST
From: "Ralph Griswold"
Message-Id: <9005051419.AA14654@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA14654; Sat, 5 May 90 07:19:59 MST
To: icon-group
Subject: Second Edition of the Icon programming language book
Status: O

The second edition of The Icon Programming Language is now available. The second edition describes Version 8 of Icon (the first edition, published in 1983, describes Version 5.9).

In addition to describing Version 8, the second edition is completely revised. Important concepts such as generators and string scanning are presented first, allowing subsequent material to be presented in ways more natural to Icon. New material includes a chapter on the details of running Icon programs, more (and harder) exercises, several large sample programs, and an expanded "mini-reference" to Icon's functions and operations.

Here's the publication information:

    The Icon Programming Language, second edition. Ralph E. Griswold and Madge T. Griswold, Prentice Hall, 1990. 367 pages. $29.95. ISBN 0-13-447889-4.

The book can be ordered from any full-service bookstore or from the Icon Project. The Icon Project pays postage in the United States, Canada, and Mexico.
There is a $13 charge for shipping to other countries, which is by air mail. Orders placed with the Icon Project must be in US dollars to The University of Arizona with a check written on a bank in the United States. Orders also can be charged to MasterCard or Visa.

    Icon Project
    Department of Computer Science
    Gould-Simpson Building
    The University of Arizona
    Tucson, AZ 85721

    602 621-2018 (voice)
    602 621-4246 (FAX)

Please address any questions to me, not icon-project or icon-group.

Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721
+1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph

From icon-group-request@arizona.edu Wed May 9 01:49:45 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA01904; Wed, 9 May 90 01:49:45 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 9 May 90 01:43 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA24236; Wed, 9 May 90 01:09:52 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Wed, 9 May 90 01:50 MST
Date: 9 May 90 08:08:06 GMT
From: zaphod.mps.ohio-state.edu!sdd.hp.com!uakari.primate.wisc.edu!samsung!munnari.oz.au!mudla!ok@tut.cis.ohio-state.EDU
Subject: RE: encompassing formalism (stealing from Prolog)
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <99A8C815087FA04FA2@Arizona.EDU>
Message-Id: <3948@munnari.oz.au>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu
References: <9004301628.AA03522@sophist.uchicago.edu>

In article <9004301628.AA03522@sophist.uchicago.edu>, goer@SOPHIST.UCHICAGO.EDU (Richard Goerwitz) writes:
> The other formalism is more process-oriented (hence it is ironic that
> Prolog is the main language which implements it). You say -
> S -> NP VP

*process*-oriented?
It's just a declarative constraint on a node in a constituent structure: a node labelled S may dominate two daughters, one labelled NP and one labelled VP, and the one labelled NP must precede the one labelled VP. Rules like this can be used as easily for generation as for parsing.

> Particularly interesting for us here is the way Prolog implements this
> formalism. Assuming the Prolog you use implements definite clause
> grammar notation, you can say,

Since there is a *freely* distributable DCG->Prolog rule-at-a-time translator which has been broadcast over the net, I think it's safe to say that any Prolog system which _hasn't_ got DCG rules isn't trying.

The really exciting thing about grammar rules in Prolog is that (if you avoid cuts and non-logical features of Prolog) you have a non-directional declarative formalism which can be processed in a variety of ways: one and the same set of rules may be loaded directly as Prolog code, or parsed with a left-corner parser, or given to a chart parser, or (and this _happens_) used for generation. In each case one "compiles" rather easily from rules to Prolog.

> I don't see any reason that this couldn't be implemented EASILY in Icon.
> And Icon has some neat advantages over Prolog,
> such as very good string handling.

I think Icon is a _wonderful_ language. But it isn't supposed to be a declarative language. Most of the time when I write grammar rules in Prolog I am using them to _generate_ lists. Some of the rest of the time I don't know whether the code I write will generate or parse, and have no reason to care. In the case of PATR-II, people are writing large grammars where one and the same grammar is used for both parsing and generation, basically by switching control strategies in a kind of interpreter.

Yes, Icon has very good string handling. Anyone with substantial string-handling problems would be crazy not to use Icon if they had the chance. But what has that to do with parsing?
I think that the most important lesson I ever learned about SNOBOL was when I enthused about it to an anthropologist, who said "it can parse sequences of characters? Great! Can it parse sequences of words? No? Then it's no use to me!" That is one of (several) respects in which Icon improves dramatically on SNOBOL: you _can_ parse a sequence of words in Icon using the same basic mechanisms that you use for string scanning. I imagine that someone writing a parser for English (or Akkadian!) in Icon would represent a sentence as a list of (pointers to) dictionary entries, where a dictionary entry might be a record or quite possibly a set of "senses". > [still talking about grammar rules in Prolog] > There needs to be some research on just > how far these indexed grammars can represent natural languages. Prolog grammar rules have the full power of Turing machines, because the additional arguments may be arbitrarily complex. (So may the attribute/value matrices used in several current formalisms.) > Recently a formalism called PATR has been developed. PATR is based on the idea of "complex categories". The label on a node of the constituent structure is taken to be, not a simple name as in BNF, but an attribute/value matrix in which the traditional category label itself, if there is such a thing at all, is merely one of the attributes. For example, instead of the simple categories S(entence), V(erb)P(hrase), V(erb), it is common to talk about [cat=v,bar=2], [cat=v,bar=1], [cat=v,bar=0] in order to capture certain regularities. For example, there is something called the Head Feature Convention in GPSG, which basically says that in a meaningful rule X0 -> X1 ... Xn there is a distinguished daughter Xi called the "head" of the phrase and X0 and Xi have certain features in common (such as 'cat' but not 'bar'). Information is passed around in PATR by a method similar to unification. Icon can certainly implement this, but so can Pascal... The point is that it isn't directional. 
In one use of a rule, an attribute may be in effect copied from the parent to its head daughter; in another use of the same rule in the same parse, the same attribute may be in effect copied from the daughter to the parent. The fact that PATR-II has been implemented in Lisp as well as Prolog shows that backtracking built into the implementation language is not necessary. Icon may well make a good base for such parsers and generators, but don't expect it to have any advantage over Lisp (other than size, and of course price...).

From cargo@tardis.cray.com Wed May 9 08:23:13 1990
Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19476; Wed, 9 May 90 08:23:13 MST
Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA22200; Wed, 9 May 90 08:54:35 CDT
Received: from zk.cray.com by hall.cray.com id AA20018; 4.1/CRI-3.12; Wed, 9 May 90 08:54:33 CDT
Received: by zk.cray.com id AA06993; 3.2/CRI-3.12; Wed, 9 May 90 08:54:42 CDT
Date: Wed, 9 May 90 08:54:42 CDT
From: cargo@tardis.cray.com (David S. Cargo)
Message-Id: <9005091354.AA06993@zk.cray.com>
To: icon-group@cs.arizona.edu
Subject: Icon 8.0 MS-DOS performance

A user of some Icon programs for MS-DOS written in Icon 7.0 was asking me if V8.0 had any performance differences over V7.0. Anybody know?

dsc

From ralph Wed May 9 08:38:08 1990
Date: Wed, 9 May 90 08:38:08 MST
From: "Ralph Griswold"
Message-Id: <9005091538.AA20409@megaron.cs.arizona.edu>
Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA20409; Wed, 9 May 90 08:38:08 MST
To: cargo@tardis.cray.com
Subject: Re: Icon 8.0 MS-DOS performance
Cc: icon-group
In-Reply-To: <9005091354.AA06993@zk.cray.com>

Version 8 is faster, especially with large sets and tables. It also has somewhat smaller structures for lists, tables, sets, and records.

Anyone using 7.0 under MS-DOS should upgrade to 8.0 for two reasons: 8.0 fixes several bugs and if you need help from the Icon Project, you'll have to be running 8.0.
Ralph Griswold / Dept of Computer Science / Univ of Arizona / Tucson, AZ 85721 +1 602 621 6609 ralph@cs.arizona.edu uunet!arizona!ralph From icon-group-request@arizona.edu Wed May 9 18:01:30 1990 Resent-From: icon-group-request@arizona.edu Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA07060; Wed, 9 May 90 18:01:30 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 9 May 90 18:01 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA21791; Wed, 9 May 90 17:51:41 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Wed, 9 May 90 18:03 MST Date: 8 May 90 17:24:21 GMT From: hpfcso!hpldola!schreck@hplabs.hp.COM Subject: RE: Reversible Assignment Problem Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: <9920E9BE77FFA0E30F@Arizona.EDU> Message-Id: <1130001@hpldola.HP.COM> Organization: HP Elec. Design Div. -ColoSpgs X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: <1990May1.151240.11020@dept.csci.unt.edu> / hpldola:comp.lang.icon / leff@dept.csci.unt.edu (Dr. Laurence L. Leff) / 9:12 am May 1, 1990 / > Reversible assignments to global variables inside procedures are > not being reversed. > The program below prints out three. It should print out one. Obviously, > the reversible assignment to z is not being reversed when test does not > lead to an eventual success. > Why does the reversing of the assignment not take place, and what would > make it do so? > The Icon Programming Language, Chapter 11, section 11.8.2 did not shed > any light on these issues. 
> global z
> procedure test(i)
>    z<-z+1
>    if i~=3 then fail
>    if i=3 then return 1
> end
>
> procedure main()
>    z:=0
>    every i:=(1 to 10) do if test(i) then write("test succeeded ",i," ",z)
> end

There is no reason for reversing the assignment, because the choice point created by "<-" is never resumed. The expression z <- z+1 succeeds, and the if expressions that follow it are evaluated separately, so their failure never backtracks into it. To get the effect you're looking for, you could substitute the following for the body of test:

   return (z <- z+1, i = 3, 1)

When the i = 3 expression fails, the assignment will be resumed and z will be restored to its original value. Backtracking is initiated, in this case, by a failure.

From icon-group-request@arizona.edu Fri May 11 07:50:47 1990
Resent-From: icon-group-request@arizona.edu
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19955; Fri, 11 May 90 07:50:47 MST
Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Fri, 11 May 90 07:51 MST
Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA25339; Fri, 11 May 90 07:43:49 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions)
Resent-Date: Fri, 11 May 90 07:52 MST
Date: 11 May 90 14:43:35 GMT
From: usc!zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU
Subject: Boolean
Sender: icon-group-request@arizona.edu
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <97E3E4092D7FA0EDC9@Arizona.EDU>
Message-Id: <3919@uwm.edu>
Organization: University of Wisconsin-Milwaukee
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

Iconists don't seem to use the word "boolean" very much. As an emigrant from Pascal-land I guess I notice this. I suppose the reason is partly that boolean concepts are written into the fabric of the language by the succeed/fail mechanism and partly that Icon allows constructs that belie strict structuring but are sensible and useful. I have in mind such things as break and stop() which render unnecessary the typical WHILE.....AND NOT FINISHED of Pascal.

Occasionally though, I find it desirable to establish a global variable which toggles between having a value and having the null value. For example, I have a program which enables printed output in mixed Hebrew and English, the input file being in normal English and transcribed Hebrew. An arbitrary symbol (I used tilde) tells the program that a change has taken place, and this can be recorded by

   roman := 1

or

   roman := &null

In this way the program always "knows" what mode it is in, as it can always check

   \roman

or

   /roman

and maybe change it while it is about it:

   if (\roman := &null)

Pascal has a rather neat

   ROMAN := TRUE
   ...
   ROMAN := NOT ROMAN

which toggles a boolean variable. I have represented this in Icon by

   roman := 1
   ....
   roman :=: other

(where other was previously undefined.) Maybe some of you have evolved better ways of handling such issues.

--
Alan D. Corre
Department of Hebrew Studies
University of Wisconsin-Milwaukee (414) 229-4245
PO Box 413, Milwaukee, WI 53201 corre@csd4.csd.uwm.edu

From utah-cs!boulder!ncar.UCAR.EDU!oddjob!sophist.uchicago.edu.goer!zenu!
Sun May 13 20:49:14 1990 Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA03057; Sun, 13 May 90 20:49:14 MST Received: from boulder.UUCP by cs.utah.edu (5.61/utah-2.11-cs) id AA02629; Sun, 13 May 90 21:48:54 -0600 Received: by boulder.Colorado.EDU (cu-hub.890824) Received: by ncar.ucar.EDU (5.61/ NCAR Central Post Office 04/10/90) id AA10385; Sun, 13 May 90 21:48:28 MDT Received: from tank.uchicago.edu by oddjob.uchicago.edu Sun, 13 May 90 22:26:29 CDT Received: from sophist.uchicago.edu by tank.uchicago.edu Sun, 13 May 90 22:27:35 CDT Return-Path: Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA21993; Sun, 13 May 90 22:23:05 CDT Received: by sophist.uchicago.edu (smail2.5) id AA00361; 13 May 90 20:06:09 PDT (Sun) Subject: determinism - how? To: icon-group@arizona.edu Date: Sun, 13 May 90 20:06:09 PDT X-Mailer: ELM [version 2.2 PL0] Message-Id: <9005132006.AA00361@sophist.uchicago.edu> From: utah-cs!boulder!sophist.uchicago.edu!goer (Richard Goerwitz qua goer.) This question is not strictly related to Icon, but since many of those reading this group are interested in parsing strategies (whether for natural or "artificial" languages) I felt it reasonable to seek some guidance here. Let me add that I'd enjoy seeing Icon code as part of any response that might appear. Let's say we have a regular expression like a*aab (I use plain ol' regular expressions because a previous discussion has shown me that people utilize different notational conventions, depending on whether their training is primarily computational or linguistic). I'd figure that the above regular expression would translate into a transition network having an initial state (call it zero), with two arcs leading from it, the one labeled "a" (leading back to itself) and the other labeled "aa" (leading to state 1). From state one would be another arc leading to the final state (state 2). This arc would be labeled "b." 
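
The network just described can be written down directly as a list of arcs and run without committing to any one arc: keep the set of every state the network could be in after each character. What follows is only a minimal sketch under stated assumptions. The record name arc, the procedure accepts, and the helper state 3 (which splits the "aa" arc into two single-character arcs) are all made up for illustration and are not part of the original message.

```icon
record arc(src, label, dst)

procedure main()
   local net, s
   # State 0 loops on "a"; the "aa" arc from 0 to 1 is split in two
   # through a hypothetical helper state 3; state 1 reaches final
   # state 2 on "b".
   net := [arc(0, "a", 0), arc(0, "a", 3), arc(3, "a", 1), arc(1, "b", 2)]
   every s := "aab" | "aaaab" | "ab" | "aabb" do
      if accepts(net, s, 0, 2) then write(s, ": accepted")
      else write(s, ": rejected")
end

# Succeeds if string s can take the network from start to final.
# current holds every state the network might be in so far.
procedure accepts(net, s, start, final)
   local current, nxt, c, a
   current := set([start])
   every c := !s do {
      nxt := set([])
      every a := !net do
         if member(current, a.src) & a.label == c then insert(nxt, a.dst)
      current := nxt
   }
   return member(current, final)
end
```

Because a whole set of states is carried forward, no arc is ever a "false path" that has to be undone. The subset construction found in automata texts amounts to computing these state sets ahead of time and treating each distinct set as one state of a deterministic machine.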
Problem: The resulting transition network will not convert into a deterministic finite state automaton. In more concrete terms, if you were to turn a*aab loose on a string beginning with "aa," you wouldn't know that the arc labeled "aa" led up a "false path" until the automaton reached the next state (1), and attempted to cross over to state 2 (via "b").

Normally, when I am confronted with this sort of situation, I just laugh and use a pushdown automaton of some sort. Clearly, though, it is possible to make this into a deterministic automaton. All you gotta do is turn a*aab into aaa*b. I'd just rearrange everything I run into in this manner were it not for the fact that things get considerably nastier when you get involved in things like (a*|b)(aa|b|c). Is there some conversion method I am overlooking?

NB: I'm coming at this from the standpoint of a student of the humanities, and so if I am given references to computational journals, chances are that I'll have more difficulty using them than a bit of sample Icon (or, for that matter, C, Prolog, or even Lisp) code. I admit that I prefer to read Icon code, though (hence my posting to this group). Beggars can't be choosers, though, I guess, so I will gladly accept any suggestions, references, or even flames that come my way.

-Richard L.
Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer

From tenaglia@fps.mcw.edu Mon May 14 10:23:53 1990
From: tenaglia@fps.mcw.edu
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA11961; Mon, 14 May 90 10:23:53 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.06) with UUCP id AA21961; Mon, 14 May 90 12:16:59 EDT
Received: by uwm.edu; id AA02910; Mon, 14 May 90 10:58:24 -0500
Date: Mon, 14 May 90 10:58:24 -0500
Message-Id: <9005141558.AA02910@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Mon, 14 May 90 10:58:58 CDT
Apparently-To: rutgers!uwvax!cs.arizona.edu!icon-group

> Iconists don't seem to use the word "boolean" very much. As an
> emigrant from Pascal-land I guess I notice this. I suppose the
> reason is partly that boolean concepts are written into the fabric
> of the language by the succeed/fail mechanism and partly that Icon
> allows constructs that belie strict structuring but are sensible and
> useful.

I have found 2 useful approaches to it. The quick and dirty method I use for throw away one shot deals works like this.

   global toggle
   ...
   toggle := 1
   ...
   if change(state) then toggle := -toggle
   (toggle = 1) | map(item,&lcase,&ucase)

This fragment is handy for things like converting source to all upper case except for quoted strings. To accomplish the same thing in a more permanent program, one might use named variables for readability.

   global true,false,condition
   ...
   true := 1
   false := -1
   ...
   condition := true
   ...
   if change(state) then condition := -condition
   (condition = true) | map(item,&lcase,&ucase)
   ...
   or
   ...
   case condition of
      {
      true : condition := false
      false: condition := true
      default: stop("Logic has ceased to function!")
      }

After looking at these, it becomes obvious that they are the same thing. If one wanted to get very tricky with bit masks and 'exclusive or', that might be a way to handle large amounts of booleans.
I haven't gotten the latest Icon book yet, so I don't know if the bit operations include a bitest() procedure which helps process binary bit data. Here's how it might work.

   procedure bitest(bitpat,boolnum)
      local i,count
      count := 0
      # pad the shorter string on the left with "0" so the lengths agree
      (*bitpat >= *boolnum) | (bitpat := right(bitpat,*boolnum,"0"))
      (*boolnum >= *bitpat) | (boolnum := right(boolnum,*bitpat,"0"))
      # count the positions where the two strings agree
      every i := 1 to *bitpat do
         if bitpat[i] == boolnum[i] then count +:= 1
      if count = 0 then return "none"
      if count = *bitpat then return "full"
      return "some"
   end

Returns the degree of bitmatch. Whether 'full', 'none', or 'some'. Or else maybe it could return a list containing the position numbers of matches? Or maybe 0 - 0 matches might not be included? Any other nifty variations?

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu

From nowlin@iwtqg.att.COM Mon May 14 12:59:15 1990
Resent-From: nowlin@iwtqg.att.COM
Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA24666; Mon, 14 May 90 12:59:15 MST
Received: from att-in.att.com by Arizona.EDU; Mon, 14 May 90 12:55 MST
Resent-Date: Mon, 14 May 90 12:57 MST
Date: Mon, 14 May 90 13:57 CDT
From: nowlin@iwtqg.att.COM
Subject: RE: boolean
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <955DBB548ABFA108D4@Arizona.EDU>
Message-Id: <955E203F329FA1034C@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu

> > Iconists don't seem to use the word "boolean" very much. As an
> > emigrant from Pascal-land I guess I notice this. I suppose the
> > reason is partly that boolean concepts are written into the fabric
> > of the language by the succeed/fail mechanism and partly that Icon
> > allows constructs that belie strict structuring but are sensible and
> > useful.
>
> I have found 2 useful approaches to it. The quick and dirty method I use
> for throw away one shot deals works like this.
>
>    global toggle
>    ...
>    toggle := 1
>    ...
>    if change(state) then toggle := -toggle
>    (toggle = 1) | map(item,&lcase,&ucase)
>
> This fragment is handy for things like converting source to all upper case
> except for quoted strings.

Icon doesn't need booleans. Unset versus set or &null versus some value handle the problem. I know I saw someone say this before but what the hay. A COMPLETE and SIMPLISTIC example for converting everything not enclosed in double quotes to upper case follows:

   procedure main()
      chgcase := 1
      tmp := &null
      while inline := read() do {
         outline := ""
         inline ? {
            while part := tab(upto('"')) do {
               if \chgcase then outline ||:= map(part,&lcase,&ucase)
               else outline ||:= part
               outline ||:= move(1)
               # swap with the null-valued tmp to flip between set and unset
               chgcase :=: tmp
            }
            if \chgcase then outline ||:= map(tab(0),&lcase,&ucase)
            else outline ||:= tab(0)
         }
         write(outline)
      }
   end

This example is to illustrate set and unset used as boolean and is not a complete solution. Notice that this program fails when used to print itself. There are other design problems too. Fix it?

> If one wanted to get very tricky with bit masks and 'exclusive or', that
> might be a way to handle large amounts of booleans.
>
> Returns the degree of bitmatch. Whether 'full', 'none', or 'some'. Or else
> maybe it could return a list containing the position numbers of matches? Or
> maybe 0 - 0 matches might not be included? Any other nifty variations?

Not like any boolean I ever saw. I thought boolean implied on or off?
Jerry Nowlin (...!att!iwtqg!nowlin) From gudeman Mon May 14 14:08:20 1990 Date: Mon, 14 May 90 14:08:20 MST From: "David Gudeman" Message-Id: <9005142108.AA01957@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA01957; Mon, 14 May 90 14:08:20 MST To: icon-group@cs.arizona.edu In-Reply-To: nowlin@iwtqg.att.COM's message of Mon, 14 May 90 13:57 CDT <955E203F329FA1034C@Arizona.EDU> Subject: boolean How about using csets for booleans? After all, they _do_ form a boolean algebra. Just subsitute &cset for TRUE '' for FALSE ++ for AND ** for OR ~ for NOT Of course, this isn't very efficient... More seriously, Pascal boolean values and operations represent an inadequate attempt to force predicates to be functions. This is because Pascal does not support true predicates. Icon doesn't support true predicates either, but Icon's succeed/fail is closer to the pure concept of valid/invalid than are Pascal's booleans. There _are_ some applications for having true/false as values, but these applications are fairly rare, and the paradigm is easily simulated by other types (as has been pointed out before). From cargo@tardis.cray.com Mon May 14 14:41:08 1990 Received: from timbuk.cray.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA04417; Mon, 14 May 90 14:41:08 MST Received: from hall.cray.com by timbuk.CRAY.COM (4.1/CRI-1.34) id AA25102; Mon, 14 May 90 16:40:50 CDT Received: from zk.cray.com by hall.cray.com id AA16667; 4.1/CRI-3.12; Mon, 14 May 90 16:40:47 CDT Received: by zk.cray.com id AA11027; 3.2/CRI-3.12; Mon, 14 May 90 16:41:01 CDT Date: Mon, 14 May 90 16:41:01 CDT From: cargo@tardis.cray.com (David S. Cargo) Message-Id: <9005142141.AA11027@zk.cray.com> To: icon-group@cs.arizona.edu Subject: boolean I have found that I use one of two methods of dealing with "boolean" operations in Icon. One is the aforementioned use of null values in a variable. (I had a long time trying to memorize what operation the / and \ operators performed. 
I knew that they were for testing for null and nonnull values, but I could never remember which did what. I finally developed this mnemonic device. / slopes Up and tests for Undefined; \ slopes Down and tests for Defined. I recognize that defined and undefined are not the exact Icon concepts, but at least I can remember the operations now.) The other way I deal with boolean operations is to use null records: record true() record false() ... flag := true() ... if type(flag) == "true" then ... I have also seen someone learning to program in Icon use global false, true ... false := "false" true := "true" ... etc. dsc From icon-group-request@arizona.edu Tue May 15 10:01:48 1990 Resent-From: icon-group-request@arizona.edu Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA11683; Tue, 15 May 90 10:01:48 MST Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Tue, 15 May 90 09:58 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA06788; Tue, 15 May 90 09:54:45 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Tue, 15 May 90 10:02 MST Date: 15 May 90 16:54:40 GMT From: usc!zaphod.mps.ohio-state.edu!uwm.edu!csd4.csd.uwm.edu!corre@ucsd.EDU Subject: RE: boolean Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: <94AD16B4707FA1085A@Arizona.EDU> Message-Id: <3979@uwm.edu> Organization: University of Wisconsin-Milwaukee X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu References: <9005142141.AA11027@zk.cray.com> In article <9005142141.AA11027@zk.cray.com> cargo@TARDIS.CRAY.COM (David S. Cargo) writes: >I have found that I use one of two methods of dealing with "boolean" >operations in Icon. One is the aforementioned use of null values >in a variable. 
(I had a long time trying to memorize what operation >the / and \ operators performed. I knew that they were for testing >for null and nonnull values, but I could never remember which did >what. I had the same problem. I decided that the "natural" state of a variable is null and the "natural" slash succeeds therefor. (Who ever saw a backslash before they saw a computer?) -- Alan D. Corre Department of Hebrew Studies University of Wisconsin-Milwaukee (414) 229-4245 PO Box 413, Milwaukee, WI 53201 corre@csd4.csd.uwm.edu From gmt Tue May 15 10:18:06 1990 Date: Tue, 15 May 90 10:18:06 MST From: "Gregg Townsend" Message-Id: <9005151718.AA13161@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA13161; Tue, 15 May 90 10:18:06 MST To: icon-group Subject: mnemonics for / and \ I remember the meanings of / and \ by the slant of the first consonant in: / iZnull \ Notnull I read that first in this group, but I don't know who to credit. Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721 +1 602 621 4325 gmt@cs.arizona.edu 110 57 16 W / 32 13 45 N / +758m From goer@sophist.uchicago.EDU Wed May 16 10:30:28 1990 Resent-From: goer@sophist.uchicago.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA10601; Wed, 16 May 90 10:30:28 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Wed, 16 May 90 10:28 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 16 May 90 12:27:00 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA25307; Wed, 16 May 90 12:22:22 CDT Resent-Date: Wed, 16 May 90 10:28 MST Date: Wed, 16 May 90 12:22:22 CDT From: Richard Goerwitz Subject: problem: Use records, tables, or lists? 
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <93E0453CE8FFA0871E@Arizona.EDU>
Message-Id: <9005161722.AA25307@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

I have to make a very basic decision about how I'm going to implement a lexicon in some language processing I'm doing. It's one of those cases where many solutions present themselves, and the problem is which solution will really prove fastest and most flexible in the long run. It'll take me a moment to explain, so bear with me....

Let's say that I have a program which creates a little database. Let's say that, at run time, I'm reading in a key, and maybe between five and ten fields of information for that key:

   language:  part_of_speech, noun
              number, singular
              gender,
              person, 3
              etc.

Now I don't want to use a record, because the record is fixed in terms of the number of fields it has. Though access is very fast, and I'd LIKE to use a record, I can't rely on knowing just what and how many fields will be filled at run-time. I also might need to augment the record at run-time. The natural idea of creating a table with the keys being lexical items (e.g. "language"), and the values being a record, won't work.

How about we keep the table, but instead of using a record as each key's value, use instead another table? I dunno. What if the lexicon is a couple of thousand words long? Will thousands of tables with just five or ten elements work out? Maybe someone familiar with the internals of Icon heap allocation and memory management will offer a guess as to whether this will ultimately prove a productive method. Like records, tables at least offer easy access via fields or keys. Unlike records, though, they are easily manipulated at run-time. This is their big advantage.

Another possibility is to use a list to store the various fields' values (["part_of_speech.verb","number.singular"], or the like).
It would be fairly easy to extract the values (untested!!!): procedure Get_Value(key) return !List ? (tab(find("."))==key,move(1),tab(0)) end The lists would be fairly small, but changing the values of fields would become non-trivial, and perhaps a bit slow. So would simply accessing them. The above procedure is going to take some time every time it's called. From previous experience with Icon, I'd say it'd be less than a twentieth the speed of a simple record access. Anyway, my question is which of these various solutions might in the long run prove best. The record solution isn't workable. The tables and lists are fine. I don't know which will prove better in terms of memory/speed. Nor do I know whether there are other solutions I might use (other than, say, using a set rather than a list, so that insertions are not duplicating already existing material - but then how much overhead is there for sets, over and above what I would expect for a list?). Any suggestions would be welcome. Please feel free to write me or post. I'm not *only* interested in high-power comments about the nature of the underlying implementation. I'm sure that there are lots of things I haven't thought of. -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From cjeffery Wed May 16 12:24:47 1990 Resent-From: "Clinton Jeffery" Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19678; Wed, 16 May 90 12:24:47 MST Received: from megaron.cs.Arizona.EDU by Arizona.EDU; Wed, 16 May 90 12:26 MST Received: from caslon.cs.arizona.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19652; Wed, 16 May 90 12:24:08 MST Received: by caslon; Wed, 16 May 90 12:24:07 mst Resent-Date: Wed, 16 May 90 12:26 MST Date: Wed, 16 May 90 12:24:07 mst From: Clinton Jeffery Subject: problem: Use records, tables, or lists? 
Resent-To: icon-group@cs.arizona.edu To: goer@sophist.uchicago.EDU Cc: icon-group@arizona.edu Resent-Message-Id: <93CFC6EF2E5FA11CD4@Arizona.EDU> Message-Id: <9005161924.AA08082@caslon> In-Reply-To: Richard Goerwitz's message of Wed, 16 May 90 12:22:22 CDT <9005161722.AA25307@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: goer@sophist.uchicago.EDU X-Vms-Cc: icon-group@Arizona.edu Richard Goerwitz asks: what is the best Icon representation for lexicon entries which consist of a variable number of name-attribute pairs? Records are Icon's fastest group data type. Unfortunately, records do not work for everything. A table of tables would provide acceptable speed but it would be Really Really space-consuming. Use it only if your lexical entries have a LOT of fields. A table of lists would use way less space, but run slower. If your lexicon entries have a largely common set of field names, I would suggest a hybrid approach. Declare a record like record lexentry( part_of_speech, number, gender, person , other ) And allocate a list for the "other" field only for those entries which have exotic fields. Records use way less space than lists, so you might declare ALL of the common fields, and use the "other" field only for really weird words. You can hide the hybrid approach with a few procedures similar to the one you suggested, or better yet write an Idol class, and share it with me! Here's a start: procedure Get_Value(rec,key) return case key of { "part_of_speech": rec.part_of_speech "number": rec.number "gender": rec.gender "person": rec.person default: (!(\(rec.other)) ? (tab(find("."))==key,move(1),tab(0))) } end This is still a linear search through a bunch of strings, but when you know you are accessing one of the builtin fields, you can just access the field directly by name, e.g. 
rec.part_of_speech Hope this helps, Clint From nowlin@iwtqg.att.COM Wed May 16 13:17:17 1990 Resent-From: nowlin@iwtqg.att.COM Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA24036; Wed, 16 May 90 13:17:17 MST Received: from att-in.att.com by Arizona.EDU; Wed, 16 May 90 13:12 MST Resent-Date: Wed, 16 May 90 13:15 MST Date: Wed, 16 May 90 14:17 CDT From: nowlin@iwtqg.att.COM Subject: RE: problem: Use records, tables, or lists? Resent-To: icon-group@cs.arizona.edu To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu Resent-Message-Id: <93C8E778833FA11D54@Arizona.EDU> Message-Id: <93C9678E1FDFA11951@Arizona.EDU> Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268) X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu > Let's say that I have a program which creates a little database. > Lets say that, at run time, I'm reading in a key, and maybe between > five and ten fields of information for that key: > > language: part_of_speech, noun > number, singular > gender, > person, 3 > etc. > > Now I don't want to use a record, because the record is fixed in > terms of the number of fields it has. Though access is very fast, > and I'd LIKE to use a record, I can't ... > ... > Anyway, my question is which of these various solutions might in > the long run prove best. The record solution isn't workable. The > tables and lists are fine. I don't know which will prove better in > terms of memory/speed. ... I think the first rule of Icon programming should be: 1) Don't worry about efficiency until you get it to work. One of the beauties of Icon is that you can get this kind of application going in a flash. If you find efficiency in speed or memory a problem then the real decision should be if you want to leave it in Icon so it's easy to maintain (or if Icon is your only choice) or if you want to convert it to something like C and really make it efficient. 
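
In that spirit, the table-of-tables layout from the quoted question takes only a few lines to try before any tuning is considered. This is a minimal sketch, not a recommendation on memory use: the sample word, its field values (including the gender value added at run time), and the variable names are all made up for illustration.

```icon
procedure main()
   local lexicon, entry, word, k
   lexicon := table()

   # One inner table per word; only the fields actually present are stored.
   entry := table()
   entry["part_of_speech"] := "noun"
   entry["number"] := "singular"
   entry["person"] := 3
   lexicon["language"] := entry

   # A field can be added at run time, which a record would not allow.
   lexicon["language"]["gender"] := "neuter"

   # Walk every word and every field it happens to have.
   every word := key(lexicon) do
      every k := key(lexicon[word]) do
         write(word, ": ", k, " = ", lexicon[word][k])
end
```

If this turns out too slow or too large for a few thousand words, the same accessors can later be pointed at records or lists without changing the callers.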
This application sounds like a real good use for objects with inheritance. A base class of words could have attributes like length and frequency of use. The sub-class noun could have attributes like plural or singular and the sub-class verb could have attributes like tense. You get the idea.

I've done some pseudo class stuff like this with records of records in Icon. You can include functions as fields in records that use other fields in the record as arguments.

   record word(word,length,freq,details)
   record noun(number,gender,...)
   record verb(tense,object,...)

With this data layout a noun or verb record could be assigned to the details member of a word record and another word.noun record could be assigned to the object member of a verb record? This example assumes I know something about English grammar which could suffer the fate of most assumptions.

Maybe the Idol language (which I've heard about but haven't looked into yet...I'm busy) actually has this kind of feature built in. Does it?

Jerry

From tenaglia@fps.mcw.edu Wed May 16 14:09:46 1990
Received: from RUTGERS.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA29152; Wed, 16 May 90 14:09:46 MST
Received: from uwm.UUCP by rutgers.edu (5.59/SMI4.0/RU1.3/3.06) with UUCP id AA09621; Wed, 16 May 90 16:45:02 EDT
Received: by uwm.edu; id AA08551; Wed, 16 May 90 15:38:13 -0500
Message-Id: <9005162038.AA08551@uwm.edu>
Received: from mis by fps.mcw.edu (DECUS UUCP w/Smail); Wed, 16 May 90 14:59:07 CDT
Received: by mis.mcw.edu (DECUS UUCP w/Smail); Wed, 16 May 90 14:41:06 CDT
Date: Wed, 16 May 90 14:41:06 CDT
From: Chris Tenaglia - 257-8765
To: icon-group@cs.arizona.edu
Subject: Lexicon Database

Concerning the generation of a lexicon, and what are the best structures,... I think I have played with similar concepts in a different application a long time ago. I think a table is nice. The word in question is the entry. The assigned value is a long delimited string with a fixed rule tree for each part of language.
For example :

   vocab := table()
   vocab["car"]  := "noun,singular,neuter,A motorized vehicle"
   vocab["cows"] := "noun,plural,female,Female Cattle"
   vocab["red"]  := "adjective,Color"
   vocab["a"]    := "article,singular,Indefinite article for one"
   vocab["any"]  := "article,plural,Indefinite article for many"
   vocab["paint"]:= "verb,transitive,Apply a colored fluid"
   vocab["jump"] := "verb,intransitive,Hop over"

Depending on the eventual application, this may or may not work. One travesty generator I wrote used separate lists loaded from files for each part of the language. It was pretty random and useless. The structure above is more flexible and easy to parse.

Chris Tenaglia (System Manager)
Medical College of Wisconsin
8701 W. Watertown Plank Rd.
Milwaukee, WI 53226
(414)257-8765
tenaglia@mis.mcw.edu

From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Wed May 16 19:28:36 1990
Received: from umich.edu by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA19823; Wed, 16 May 90 19:28:36 MST
Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA18730; Wed, 16 May 90 22:28:12 -0400
Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Wed, 16 May 90 22:27:03 EDT
Date: Wed, 16 May 90 18:36:08 EDT
From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu
To: icon-group@cs.arizona.edu
Message-Id: <225766@Wayne-MTS>
Subject: Booleans

Booleans and success/failure play complementary roles in a programming language. You can think of a boolean as capturing and preserving the result of success/failure.

It's too bad that Icon doesn't have natural boolean operators. Of course there are umpteen ways to fake them, but none of those ways are quite as nice as the standard boolean operators (where I include short-circuit "and" and "or" among the standard boolean operators, as they are in C). But I'd much rather have a language like Icon that has success/failure but lacks true booleans than a language like Pascal that lacks success/failure but has true booleans.
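
For comparison, one of the "umpteen ways to fake them" in Icon is to capture the outcome of a test as a null or non-null value and turn it back into success or failure later with \ and /. This is a minimal sketch; the variable names and the sample string are made up for illustration, and the value 1 is just an arbitrary non-null marker.

```icon
procedure main()
   local found, line
   line := "an example line"

   # Capture success/failure as a value: non-null for success, null for failure.
   found := if find("example", line) then 1 else &null

   # Later, convert the stored value back into success or failure.
   if \found then write("found it")

   # Negation: flip between the marker value and &null.
   found := if \found then &null else 1
   if /found then write("now cleared")
end
```

Short-circuit "and" and "or" over such values fall out of conjunction and alternation, for example (\a & \b) or (\a | \b), which is roughly the faking the message has in mind.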
My approach in SPLASH is to provide both. The ? operator converts success/failure to true/false, while the "is" operator converts true/false to success/failure. Here's an example of a SPLASH generator that illustrates these operators. The generator merges the output of a sequence of other generators. merge: generic(t) process(stream(*): generator(t)) yield t is declare found: boolean in do { % get one element from each stream found := false for i in stream'range do found |:= ?yield *stream(i) % yield is like suspend } until is ~found end merge Paul Abrahams From markc%essex.ac.uk@NSFnet-Relay.AC.UK Thu May 17 00:26:16 1990 Received: from nsfnet-relay.ac.uk by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA04423; Thu, 17 May 90 00:26:16 MST Received: from sun.nsfnet-relay.ac.uk by vax.NSFnet-Relay.AC.UK via Janet with NIFTP id aa01072; 17 May 90 8:03 BST Received: from ese by servax0.sx.ac.uk SMTP/TCP id aa15922; 17 May 90 8:20 WET DST From: Clark Mark Date: Thu, 17 May 90 08:20:24 +0100 Message-Id: <794.9005170720@ese.essex.ac.uk> To: icon-group@cs.arizona.edu Subject: Withdrawl Please remove my name from your icon-group, thanks. From nowlin@iwtqg.att.COM Thu May 17 05:50:28 1990 Resent-From: nowlin@iwtqg.att.COM Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA18791; Thu, 17 May 90 05:50:28 MST Received: from att-in.att.com by Arizona.EDU; Thu, 17 May 90 05:49 MST Resent-Date: Thu, 17 May 90 05:51 MST Date: Thu, 17 May 90 07:37 CDT From: nowlin@iwtqg.att.COM Subject: RE: problem: Use records, tables, or lists? 
Resent-To: icon-group@cs.arizona.edu
To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu
Resent-Message-Id: <933DBD555FFFA12404@Arizona.EDU>
Message-Id: <933E0EF7CFFFA1195A@Arizona.EDU>
Original-From: iwtqg!nowlin (Jerry D Nowlin +1 312 979 7268)
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: arizona.edu!icon-group%att.UUCP@cs.arizona.edu

Regarding records and their use in holding features such as number, gender, word-class: I've looked over the solutions offered, and have found them tempting. Whatever I end up doing, I'll post the code. It'll end up a left corner bottom-up parser (that way I can avoid the problem of left recursion which so often comes up in natural languages).

I've decided, at least provisionally, against the "record" solution. Let me explain why. The reason might not be obvious to people working primarily with regular languages or with programming languages that can be handled with a deterministic pushdown automaton.

If I use records, I'll have to do things like decide what the most often used categories will be, and then enforce a constant spelling across all files (i.e. no sing, s, singular - just one of them). What's terrible about this is that, even if I do remember all the naming conventions, I'll be saddled with lots of extraneous record fields. Arabic and Ugaritic, say, will use a dual category. This will be superfluous for English, German, Hebrew, etc. (but not, say, classical Greek). Likewise, gender will be important for French, German, Latin, Arabic, less so for Dutch, and hardly at all for English. What I'm attempting to illustrate is that, if the system is to achieve some theoretical elegance (and a nice, clean look, too), it's not going to be desirable to spend any effort trying to predict what categories will be most often used. Even if we were talking about a single-language system, many categories would not come into play for a given range of constructions.
It is true that, at some point, especially when dealing with a specific set of problems within a specific language, I *might* find it sensible to introduce records. However, as Jerry Nowlin pointed out, at this stage it is important to make it work rather than start worrying too much about speed. This doesn't mean that speed is not a consideration. It is important that I not lose sight of it. Memory requirements are also important (which is why I can't just use tables and forget it). My tendency right now is to use lists or sets. But I dunno. I do know that records will take me way out of line with what the system itself needs in order to operate cleanly and elegantly. -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From esquire!info8!yost@cmcl2.NYU.EDU Thu May 17 08:02:51 1990 Received: from NYU.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA25200; Thu, 17 May 90 08:02:51 MST Received: by cmcl2.NYU.EDU (5.61/1.34) id AA07024; Thu, 17 May 90 11:03:40 -0400 Received: from info8 by ESQUIRE.DPW. id aa21254; 17 May 90 10:57 EDT Received: from localhost by info8. (4.0/SMI-4.0) id AA02151; Thu, 17 May 90 10:59:56 EDT Message-Id: <9005171459.AA02151@info8.> From: yost@DPW.COM (Dave Yost) Reply-To: yost@DPW.COM (Dave Yost) To: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu Cc: icon-group@cs.arizona.edu, yost@cmcl2.NYU.EDU Subject: Re: Booleans In-Reply-To: Your message of Wed, 16 May 90 18:36:08 EDT. <225766@Wayne-MTS> Phone: +1 212-266-0796 (Voice Direct Line) Fax: +1 212-266-0790 Organization: Davis Polk & Wardwell 1 Chase Manhattan Plaza New York, NY 10005 Date: Thu, 17 May 90 10:59:53 -0400 Sender: yost@info8.NYU.EDU > Booleans and success/failure play complementary roles in a programming > language. You can think of a boolean as capturing and preserving the result > of success/failure. It's too bad that Icon doesn't have natural boolean > operators. 
Of course there are umpteen ways to fake them, but none of those > ways are quite as nice as the standard boolean operators (where I include > short-circuit "and" and "or" among the standard boolean operators, as they are > in C). But I'd much rather have a language like Icon that has success/failure > but lacks true booleans than a language like Pascal that lacks success/failure > but has true booleans. I agree! Is there even a convention in Icon on how to store true/false state? I've used both (\x) and (x = 1) and (x ~= 0) as true/false indicators. The last two (last one preferred) I have found better because I get the extra benefit of a runtime error if I try to test x before it is set -- which can also be *not* what you want sometimes. Mostly I would rather have a syntax that says what I mean than to use a nonstandardized fake. Someone reading the code (x ~= 0) has to look around to see if x can take on more than two values. --dave yost yost@dpw.com or uunet!esquire!yost Please ignore the From or Reply-To fields above, if different. 
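
A minimal sketch of the two flag conventions mentioned in the message above, for readers who want to compare them side by side. The variable names `verbose` and `debug` are made up for illustration; nothing here is an established Icon standard, which is exactly the complaint being raised.

```icon
procedure main()
   local verbose, debug

   # Convention 1: &null means false, any non-null value means true.
   # \x succeeds if x is non-null; /x succeeds if x is null.
   verbose := 1
   if \verbose then write("verbose is on")
   if /debug then write("debug is off (never set)")

   # Convention 2: an integer flag tested with ~=.
   # Testing debug ~= 0 while debug is still &null would be a
   # run-time error, which is the "extra benefit" described above.
   debug := 0
   if debug ~= 0 then write("debug is on")
   else write("debug is off (set to 0)")
end
```

The first convention never raises an error on an unset flag; the second does, so the choice depends on whether an unset flag should count as false or as a bug.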
From utah-cs!cs.utexas.edu!yale!LRW.COM!lrw!leichter Thu May 17 08:39:49 1990 Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA28306; Thu, 17 May 90 08:39:49 MST Received: from cs.utexas.edu by cs.utah.edu (5.61/utah-2.11-cs) id AA17130; Thu, 17 May 90 09:39:21 -0600 Posted-Date: Thu, 17 May 90 11:08:45 EDT Received: from ames.arc.nasa.gov by cs.utexas.edu (5.61/1.62) id AA16008; Thu, 17 May 90 10:35:48 -0500 Received: from harvard.harvard.edu by ames.arc.nasa.gov (5.61/1.2); Thu, 17 May 90 08:35:30 -0700 Received: by harvard.harvard.edu (5.54/a0.25) (for cs.utexas.edu!utah-cs!arizona!CS.Arizona.EDU!icon-group) id AA28867; Thu, 17 May 90 11:35:11 EDT Received: from lrw.UUCP by BULLDOG.CS.YALE.EDU via UUCP; Thu, 17 May 90 11:08:10 EDT Message-Id: <9005171508.AA01693@BULLDOG.CS.YALE.EDU> Received: by lrw.UUCP (DECUS UUCP w/Smail); Thu, 17 May 90 11:08:45 EDT Date: Thu, 17 May 90 11:08:45 EDT From: To: CS.Arizona.EDU!icon-group@cs.utexas.edu Subject: RE: problem: Use records, tables, or lists? X-Vms-Mail-To: IN::"icon-group@CS.Arizona.EDU" There's actually a standard technique to deal with this kind of problem. Stated more generally: You have a set of pairs of names and (attribute,value) pairs; for example: {("green",{("part","adjective"),("plural","none")}), ("dog",{("part","noun")})} You are viewing this as a two-level hierarchy: First you map from the name ("green") to the set of pairs {(part,adjective),(plural,none)}; then within that set of pairs, you map an attribute ("part") to a value ("adjective"). The problem with this approach, as you've noted, is that the lower level of the hierarchy is difficult to implement efficiently: It consists of many small collections, and you are forced to pay the overhead of a data structure per collection. So, the trick is to amortize the cost by collapsing the hierarchy. 
Logically, this involves changing the collection above to the following:

    {(("green","part"),"adjective"),(("green","plural"),"none"),
     (("dog","part"),"noun")}

That is, where you previously had two functions of one argument, word-to-attribute-list and attribute-to-value, you now have a single function, word-and-attribute-to-value.

In Icon terms, this is very simple: Make a single table whose keys consist of name-attribute pairs, which could be records or simply strings (e.g., "green:part"), and whose values are the values you want associated.

The advantage of this approach is that you pay the cost of table maintenance only once. As long as large tables are efficiently implemented, this method will work very well.

What you lose if you use this approach is the ability to pick up all the attribute-value pairs associated with a single word: There is no efficient way to extract from a table everything whose key is of the form "green:". Depending on your application, this may not be an issue at all, or it may be one that you can work around. For example, you can maintain a separate table in which the keys are words and the values are lists of attributes. This will work well unless you need to look up all the attributes very often, or you change the attribute lists associated with a given word frequently.

There are, of course, many possible optimizations. For example, if there is a list of common attributes which almost all words have values for, it is probably better to store those in a separate record, and only store the extras in the table. Optimizing the common case often gives you most of the performance advantage with very little of the extra cost.
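
The collapsed-hierarchy idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the procedure name `setattr`, the table names `lex` and `attrs`, and the ":" separator are hypothetical choices, and the sketch assumes that words never contain a colon.

```icon
procedure main()
   local lex, attrs, a

   lex := table()      # "word:attribute" -> value (the single flat table)
   attrs := table()    # word -> list of attribute names (secondary index)

   setattr(lex, attrs, "green", "part", "adjective")
   setattr(lex, attrs, "green", "plural", "none")
   setattr(lex, attrs, "dog", "part", "noun")

   # Direct lookup by word and attribute.
   write(lex["green:part"])

   # Everything known about one word, via the secondary index.
   every a := !attrs["green"] do
      write("green.", a, " = ", lex["green:" || a])
end

# Store one value and keep the per-word attribute list in step.
procedure setattr(lex, attrs, word, attr, value)
   /attrs[word] := []
   if /lex[word || ":" || attr] then put(attrs[word], attr)
   lex[word || ":" || attr] := value
   return value
end
```

The secondary index costs one small list per word, so it only pays off if listing all attributes of a word is actually needed.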
-- Jerry From gmt Thu May 17 10:07:02 1990 Date: Thu, 17 May 90 10:07:02 MST From: "Gregg Townsend" Message-Id: <9005171707.AA04752@megaron.cs.arizona.edu> Received: by megaron.cs.arizona.edu (5.59-1.7/15) id AA04752; Thu, 17 May 90 10:07:02 MST To: icon-group Subject: memory requirements of tables Small tables in Icon v8 use *much* less space than in version 7. This doesn't necessarily make them the best approach (small records are still cheaper), but don't make any decisions based on the old version's behavior. Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721 +1 602 621 4325 gmt@cs.arizona.edu 110 57 16 W / 32 13 45 N / +758m From goer@sophist.uchicago.EDU Fri May 18 13:49:44 1990 Resent-From: goer@sophist.uchicago.EDU Received: from Maggie.Telcom.Arizona.EDU by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA00357; Fri, 18 May 90 13:49:44 MST Return-Path: goer@sophist.uchicago.EDU Received: from tank.uchicago.edu by Arizona.EDU; Fri, 18 May 90 13:48 MST Received: from sophist.uchicago.edu by tank.uchicago.edu Fri, 18 May 90 15:21:45 CDT Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA28592; Fri, 18 May 90 15:14:27 CDT Resent-Date: Fri, 18 May 90 13:48 MST Date: Fri, 18 May 90 15:14:27 CDT From: Richard Goerwitz Subject: deterministic automata Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: <923202F42DFFA12DCC@Arizona.EDU> Message-Id: <9005182014.AA28592@sophist.uchicago.edu> X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu Thanks to everyone who responded regarding converting nondeterministic finite state automata to deterministic ones. I read and thought about the responses for a couple of days. The dfsa would be too big, I gathered - at least in some instances. And rarely would all of it get used. So I just built a program that makes the nfsa, and then builds the dfsa as it goes, removing epsilon moves, and collapsing states together on the fly. 
It is true that this system works out conceptually very cleanly. I just found that it is slow, slow, slow. The reason is that, in order for it to work, I could not play games that I use with fsa's and pushdown automata, namely utilize Icon's existing functions like find, any, match, etc. I had to break each step down into a node and arcs labeled with a single character. Only in this manner is it easy to find out, after removing epsilon moves, and multiplying out the states, whether two or more arcs with the same label diverge from a single state(-set).

I also have not found a good way of using Icon to make references to state-sets clean. It's not a matter of making a table that stores simple integer-labeled nodes. We're not talking about states that can be labeled with simple integers anymore, but rather sets of integers (corresponding to states in the nfsa).

Anyway, although converting an nfsa to a dfsa proved pretty easy, I have yet to make it really efficient within Icon.

Having said this, let me ask if anyone else has played with dfsa's in Icon. Has anyone found a clean way of referring to (and checking for the previous existence of) sets of states? I don't even have a good way of, say, collecting final state-sets together and storing them. I have to convert them to some other data type (usually a string), and then store them in this form. Lotsa space.

I also have no way of easily getting back to using any() for what corresponds to regular expressions such as [a-z]. I'm sure I could somehow collect the arcs, check to see whether they point to the same state-sets (with the attendant problems noted above regarding uniqueness of sets), and then make them all point to a string equivalent to the set which they all lead to, and then enter that string equivalent in a table.... My mind's beginning to wander as I try to fathom the overhead. Is this really a job I should reasonably only be doing in C, or am I just missing some shortcuts?
I have, by the way, made the nfsa work all by itself pretty nicely and efficiently in Icon (small, too - just a couple of table entries). It's the dfsa that's killing me. -Richard L. Goerwitz goer%sophist@uchicago.bitnet goer@sophist.uchicago.edu rutgers!oddjob!gide!sophist!goer From nevin1@ihlpb.att.com Fri May 18 16:18:47 1990 Message-Id: <9005182318.AA10442@megaron.cs.arizona.edu> Received: from att-in.att.com by megaron.cs.arizona.edu (5.59-1.7/15) via SMTP id AA10442; Fri, 18 May 90 16:18:47 MST From: nevin1@ihlpb.att.com Date: Fri, 18 May 90 17:53 CDT Original-From: ihlpb!nevin1 (Nevin J Liber +1 708 979 4751) To: att!cs.arizona.edu!icon-group Cc: Richard Goerwitz Subject: Re: deterministic automata Richard Goerwitz writes: >We're not talking about >states that can be labeled with simple integers anymore, but rather sets >of integers (corresponding to states in the nfsa). But you CAN still use integers (especially since Icon now has arbitrarily large integers); all you need to do is encode the states. You could use a binary encoding scheme, where each position corresponds to a state in the NFA (non-deterministic finite automata). Suppose that your DFA (deterministic finite automata) state is [0,2,3]. Converting this to a binary string, you get (from most significant bit to least significant bit) "1101" (bits 3, 2, and 0 are "on"). This is easily converted to 13, an integer (in Icon, try converting with integer("2r" || "1101"), or the "uglier" 2^3 + 2^2 +2^0), or you can keep it in binary string form if that is more convenient. [Note: for the string form, it may or may not prove useful to pad it out with 0's on the left so all the strings have the same length, which would be equal to the number of states in the NFA.] 
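
The encoding described above might look something like this in practice. It is a sketch only: the procedure name `encode` and the table `seen` are hypothetical, and NFA states are assumed to be numbered from 0 to n-1.

```icon
procedure main()
   local seen, key

   seen := table()                  # encoded state-set -> DFA state label

   key := encode(set([0, 2, 3]), 4)
   seen[key] := "D1"

   write(key)                       # the padded binary string form
   write(integer("2r" || key))      # the same set as a single integer

   # The same members in a different order give the same key,
   # so an existing state-set is found by ordinary table lookup.
   if \seen[encode(set([3, 0, 2]), 4)] then
      write("state-set already known")

   # Membership test for NFA state 2, counting from the right.
   if key[-1 - 2] == "1" then write("NFA state 2 is a member")
end

# Turn a set of NFA state numbers into a string of n binary digits,
# least significant digit (state 0) on the right.
procedure encode(states, n)
   local s, i
   s := repl("0", n)
   every i := !states do
      s[-1 - i] := "1"
   return s
end
```

Keeping the string form makes the membership test a simple subscript; converting to an integer makes the key smaller but requires bit arithmetic to test membership.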
Once in this form, the problem you had with the inefficiency of having different sets of the exact same elements goes away, and it is still relatively easy to check and see if a given NFA state is part of a certain DFA state (in binary string form, for example, use dfaState[-1 - nfaState]). >Anyway, although converting an nfsa to a dfsa proved pretty easy, I have >yet to make it really efficient within Icon. But how often are you going to do the conversion? Unless you are building these state machines on the fly, this part of the project probably isn't worth making more efficient. NEVIN ":-)" LIBER nevin1@ihlpb.ATT.COM (708) 831-FLYS From icon-group-request@arizona.edu Wed May 23 12:41:08 1990 Resent-From: icon-group-request@arizona.edu Received: from Arizona.EDU (Maggie.Telcom.Arizona.EDU) by megaron (5.61/15) via SMTP id AA19800; Wed, 23 May 90 12:41:08 -0700 Received: from ucbvax.Berkeley.EDU by Arizona.EDU; Wed, 23 May 90 12:40 MST Received: by ucbvax.Berkeley.EDU (5.63/1.41) id AA08014; Wed, 23 May 90 12:32:25 -0700 Received: from USENET by ucbvax.Berkeley.EDU with netnews for icon-group@arizona.edu (icon-group@arizona.edu) (contact usenet@ucbvax.Berkeley.EDU if you have questions) Resent-Date: Wed, 23 May 90 12:41 MST Date: 23 May 90 18:27:14 GMT From: usc!snorkelwacker!ai-lab!idsardi@ucsd.EDU Subject: graphics Sender: icon-group-request@arizona.edu Resent-To: icon-group@cs.arizona.edu To: icon-group@arizona.edu Resent-Message-Id: <8E4D882E88FF2012D2@Arizona.EDU> Message-Id: <8683@rice-chex.ai.mit.edu> Organization: MIT Artificial Intelligence Laboratory X-Envelope-To: icon-group@CS.Arizona.EDU X-Vms-To: icon-group@Arizona.edu I'm looking to draw some trees and graphs, especially on the Mac. I have ProIcon and was wondering whether there's a hack that would allow quickdraw calls to be made on ProIcon windows. Barring that does anyone have character-oriented graphics, stuff that would approximate line drawing. 
Thanks, Bill Idsardi From @um.cc.umich.edu:Paul_Abrahams@Wayne-MTS Thu May 24 07:34:19 1990 Received: from umich.edu by megaron (5.61/15) via SMTP id AA10074; Thu, 24 May 90 07:34:19 -0700 Received: from ummts.cc.umich.edu by umich.edu (5.61/1123-1.0) id AA03230; Thu, 24 May 90 10:34:18 -0400 Received: from Wayne-MTS by um.cc.umich.edu via MTS-Net; Thu, 24 May 90 10:33:00 EDT Date: Wed, 23 May 90 22:45:42 EDT From: Paul_Abrahams%Wayne-MTS@um.cc.umich.edu To: icon-group@cs.arizona.edu Message-Id: <227438@Wayne-MTS> Subject: Coexpressions revisited In some earlier messages I advanced the view that there really wasn't much use in Icon for coexpressions other than those that returned values via `suspend'. In other words, coexpressions should be limited to the generalization of generators that would enable them to be called from different places or from the same place at different times. Such coexpressions seemed capable of doing all the "interesting" examples, such as interleaving the output of two generators. Since then I've thought of another example that shows why the more general form of coexpressions might be needed after all. This example is not theoretical at all. Suppose we want to write an error-listing routine `errlist' that receives error messages, each consisting of a page number and a string. We want the output of this routine to indicate all the errors on a page, listing the page number *just once*. Easy enough with an ordinary procedure, it seems. But now let's impose one restriction: `errlist' cannot use any static or global variables to save its state from one call to another. This restriction is one often encountered in writing recursive procedures. Now `errlist' has a problem: how to know if an error message has the same page number as its predecessor. 
Here's how `errlist' might be written using coexpressions (I haven't tested it):

    record errmessage(page, text)

    procedure errlist()
       local prevpage, err
       repeat {
          # further messages for the page already shown: text only
          while (err := @&source).page = \prevpage do
             write(repl(" ", 10), err.text)
          # first message for a new page: show the page number once
          write(right(err.page, 8), " ", err.text)
          prevpage := err.page
       }
    end

If `errlist' produced values instead of receiving them, we could make it into a single coexpression that suspended the values in turn. But there's no `accept' expression corresponding to `suspend', so `errlist' needs to use @ to pick up the message pairs, and the various callers need to use @ to transmit those pairs.

There is a kludgy way around this: `errlist' suspends a record object, and the caller fills in the object. When `errlist' gets control back, it processes whatever is in the object. But this kludge is not at all satisfying; not only is it awkward to use, but it requires special handling to initiate and terminate `errlist'.

A more intuitive explanation of what's going on here is that connecting coexpressions via `suspend' provides a form of input piping, rather like Unix but generalized to allow several inputs to one filter. Input piping usually suffices, but in cases such as `errlist', output piping is needed as well.

Paul Abrahams
abrahams%wayne-mts@um.cc.umich.edu

From pax@ihcup.att.com Thu May 24 15:48:42 1990
Date: Thu, 24 May 90 15:48:42 -0700
From: pax@ihcup.att.com
Message-Id: <9005242248.AA14248@megaron.cs.arizona.edu>
Received: from att.UUCP by megaron.cs.arizona.edu (5.61/15) via UUCP id AA14248; Thu, 24 May 90 15:48:42 -0700
To: icon-group@arizona.att.com
Subject: Icon cross ref

I seem to remember some time ago that someone posted to this group the Icon source for an Icon Cross Reference program. At the time I did not save the source but would now like to have such a program. I need to cross reference Icon V8 procedures and global variables.
I would appreciate it very much if the provider(s) of cross reference program(s) would send me the source to: att!ihcup!pax Thanks Joe T. Hall AT&T Bell Laboratories 200 Park Plaza, Room IHP 2B-524 Naperville, Illinois 60566-7050 USA att!ihcup!pax tel: +1 708 713-7285 fax: +1 708 713-7480 tlx:157294384(JTHALL) From buchs@Mayo.edu Tue May 29 12:57:36 1990 Received: from fermat.Mayo.edu by megaron (5.61/15) via SMTP id AA04505; Tue, 29 May 90 12:57:36 -0700 Received: from FALCON.DECnet MAIL11D_V3 by fermat.Mayo.edu (5.57/Ultrix2.4-C) id AA16575; Tue, 29 May 90 14:48:07 CDT Date: Tue, 29 May 90 14:48:06 CDT Message-Id: <9005291948.AA16575@fermat.Mayo.edu> From: buchs@Mayo.edu To: :"icon-group@cs.arizona.edu"@FERMAT Cc: BUCHS@fermat.Mayo.edu Subject: beginner help I have just started with Icon. It looks like I really need to get "The Icon Programming Language" to get very far. What have others done? I have found a bit of info in some of the TR documents and I could probably read through some of the library programs to get ideas. I am trying to parse a file with lines of backslash delimited fields, with no trailing delimiter: field1\field2\field3 I thought I was on to an elegant way with the string scanning operator: procedure main() while line := read() do { line ? while write(tab(find("\\"))) do move(1) } end But I cannot get the last field. Any ideas? ------------------------------------------------------------- Kevin Buchs Internet: buchs@mayo.edu Mayo Foundation Is this my life or is it just an Rochester, MN 55905 incredible, high-speed, simulation? (507) 284-0009 -S. R. 
Cleaves
-------------------------------------------------------------

From goer@sophist.uchicago.EDU Tue May 29 16:37:23 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Arizona.EDU (Maggie.Telcom.Arizona.EDU) by megaron (5.61/15) via SMTP id AA21065; Tue, 29 May 90 16:37:23 -0700
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Tue, 29 May 90 16:27 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Tue, 29 May 90 18:26:43 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA13953; Tue, 29 May 90 18:22:07 CDT
Resent-Date: Tue, 29 May 90 16:29 MST
Date: Tue, 29 May 90 18:22:07 CDT
From: Richard Goerwitz
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <8976B8E3B55F20A047@Arizona.EDU>
Message-Id: <9005292322.AA13953@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

I am trying to parse a file with lines of backslash delimited fields, with no trailing delimiter:

    field1\field2\field3

I thought I was on to an elegant way with the string scanning operator:

    procedure main()
       while line := read() do {
          line ? while write(tab(find("\\")))
                 do move(1)
       }
    end

It looks to me as though, for a beginner, you have gotten pretty far into string scanning. Nice work.

The problem is easy to see - once, that is, you "get it." Find() will tell you the position in line (the subject of the string-scanning operation) at which a backslash occurs. Tab() will then move you there. This works fine for a while. But what happens when you've come to the last \, and have move()'d past it? You have no backslashes left on the line to find(), and so naturally find() fails, the loop fails, the scanning expression fails, and you are sent back for another input string.

Try doing something like this:

    while line := read() do {
       line ? {
          while write(tab(find("\\")|0))
          do move(1) | break
       }
    }

The expression a|b yields all results in a, then those for b.
In the context above, the a side is all that will normally get evaluated. In other words, when you say, tab(find(x)), find will only look for one result. Icon only starts backtracking and doing all those fancy result-sequences in contexts like

    every i := tab(find(x)) do ...

Right here we are looking for only one result.

What I've done above is to show you how to exploit Icon's desire to find just one result. find() in tab(find(x)) will return an integer as long as there is a backslash ahead of it in the current subject. When it fails, that's it. However, if we say

    tab(find(x)|0)

when find() fails, Icon has another expression it can try to see if it produces another result, namely the 0. Tab(0) means, "go to the end of the line."

You'll note above that I put in the expression "move(1) | break" as the do-clause associated with tab(find("\\")|0). All this does is make sure that if you can't move one character (i.e. you've hit line's end), the loop will terminate. "move(1) | break" means, at least in this context, "try to move(1), and if you can't, try to do whatever is on the other side of the bar (namely break out of the loop)."

This should do what you want, though I confess that I haven't tested the code. You might want to use the fields of your backslashed strings in various ways, but I'd guess your best bet is to place them in a list:

    tmp_lst := []
    line ? {
       while put(tmp_lst,tab(find("\\")|0))
       do move(1) | break
    }

now do something with tmp_lst, like save it in a bigger list, or permute it, or just print it...

Get it? Am I being too pedantic, or is this okay? Nice job picking up string scanning so fast.
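
The list-building loop above can be wrapped into a small self-contained program along these lines. This is a sketch rather than a tested library routine: the procedure name `fields` and the separator parameter `sep` are introduced here for illustration only.

```icon
procedure main()
   local f

   every f := !fields("field1\\field2\\field3", "\\") do
      write(f)
end

# Return a list of the sep-delimited fields of line, including the last
# one, which has no trailing separator.
procedure fields(line, sep)
   local result
   result := []
   line ? {
      # tab(0) supplies the rest of the line once find() has failed;
      # move(*sep) then fails at the end of the line and the loop is left.
      while put(result, tab(find(sep) | 0)) do
         move(*sep) | break
   }
   return result
end
```

Without the `break`, the loop would keep succeeding on tab(0) at the end of the line and never terminate, which is why the do-clause cannot simply be move(1).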
From nevin1@ihlpb.att.com Tue May 29 17:17:59 1990
Message-Id: <9005300017.AA24779@megaron>
Received: from att-in.att.com by megaron (5.61/15) via SMTP id AA24779; Tue, 29 May 90 17:17:59 -0700
From: nevin1@ihlpb.att.com
Date: Tue, 29 May 90 19:16 CDT
Original-From: ihlpb!nevin1 (Nevin J Liber +1 708 979 4751)
To: att!cs.arizona.edu!icon-group
Cc: att!Mayo.edu!buchs
Subject: Re: beginner help

>I am trying to parse a file with lines of backslash delimited
>fields, with no trailing delimiter:
>    field1\field2\field3
>I thought I was on to an elegant way with the string scanning
>operator:
>
>    procedure main()
>    while line := read() do {
>         line ? while write(tab(find("\\")))
>                do move(1)
>         }
>    end
>
>But I cannot get the last field. Any ideas?

The reason that you don't get the last field is that your expression fails just before that point. What happens is find() doesn't see a backslash so it fails, and since failure is inherited, tab() fails, write() fails, the inner while clause fails, the string scanning fails, the do part of the outer while loop is done, and control is passed back to the outer while clause (to read in another line). You need to specify what happens when find() can't find a backslash. Here is how I would have coded it (late at night after work :-)):

    procedure main()
       local line
       while line := read() do
          line ? while write(tab(upto('\\') | 0)) & move(1)
    end

Note: you could use find("\\") in place of the upto('\\'), but I prefer using upto() for three reasons:

1. It emphasizes that you are looking for a delimiter of length 1.
2. It allows you to look for more than one delimiter at the same time.
3. If I recall correctly, it is the idiom found in the Icon documentation.

Anyway, here is an explanation of what goes on at the last field: the upto('\\') fails, so by alternation (the | operator) a write(tab(0)) is performed, printing out the last field.
Next move(1) fails (since the current position is at the end of the string, it cannot move over one position to the right), the inner while clause fails, the string scanning fails, the do part of the outer while loop is done, and control is passed back to the outer while loop (to read in another line). NEVIN ":-)" LIBER ..!gargoyle!igloo!nevin (708) 831-FLYS From CELEX@HNYMPI52.BITNET Wed May 30 03:19:39 1990 Received: from rvax.ccit.arizona.edu by megaron (5.61/15) via SMTP id AA22964; Wed, 30 May 90 03:19:39 -0700 Received: from HNYMPI52.BITNET by rvax.ccit.arizona.edu; Wed, 30 May 90 03:18 MST Date: Wed, 30 May 90 11:14 N From: CELEX@HNYMPI52.BITNET Subject: splitting up lines. To: icon-group@cs.arizona.edu Message-Id: <891BFBCD971F602E5A@rvax.ccit.arizona.edu> X-Original-To: icon-group@cs.arizona.edu, CELEX X-Envelope-To: icon-group@cs.arizona.edu I found the recent discussion on dividing a line into its fields very interesting. A lot of our files are organized in this way, so we have to use these techniques pretty often. For finding the n-th field of a line we use the following procedure: procedure field(line,n) if n = 1 then return line[1:find("\\",line)] every x := 1 + find("\\",line) \ (n - 1) return line[x:find("\\",line,x)] end It does the job, but I wonder if this can be done quicker or more elegantly. Any comments? Marcel Bingley CELEX University of Nijmegen Nijmegen - The Netherlands From buchs@Mayo.edu Wed May 30 08:14:53 1990 Received: from fermat.Mayo.edu by megaron (5.61/15) via SMTP id AA03402; Wed, 30 May 90 08:14:53 -0700 Received: from FALCON.DECnet MAIL11D_V3 by fermat.Mayo.edu (5.57/Ultrix2.4-C) id AA18237; Wed, 30 May 90 10:09:25 CDT Date: Wed, 30 May 90 10:09:24 CDT Message-Id: <9005301509.AA18237@fermat.Mayo.edu> From: buchs@Mayo.edu To: :"icon-group@cs.arizona.edu"@FERMAT Cc: BUCHS@fermat.Mayo.edu Subject: Re: beginner help Thanks to everyone who helped me over my first hurdle. 
-------------------------------------------------------------
Kevin Buchs             Internet: buchs@mayo.edu
Mayo Foundation         Is this my life or is it just an
Rochester, MN 55905     incredible, high-speed, simulation?
(507) 284-0009                            -S. R. Cleaves
-------------------------------------------------------------

From goer@sophist.uchicago.EDU Wed May 30 12:23:29 1990
Resent-From: goer@sophist.uchicago.EDU
Received: from Arizona.EDU (Maggie.Telcom.Arizona.EDU) by megaron (5.61/15) via SMTP id AA21030; Wed, 30 May 90 12:23:29 -0700
Return-Path: goer@sophist.uchicago.EDU
Received: from tank.uchicago.edu by Arizona.EDU; Wed, 30 May 90 12:09 MST
Received: from sophist.uchicago.edu by tank.uchicago.edu Wed, 30 May 90 14:08:04 CDT
Received: by sophist.uchicago.edu (3.2/UofC3.0) id AA15300; Wed, 30 May 90 14:03:28 CDT
Resent-Date: Wed, 30 May 90 12:15 MST
Date: Wed, 30 May 90 14:03:28 CDT
From: Richard Goerwitz
Subject: splitting up lines
Resent-To: icon-group@cs.arizona.edu
To: icon-group@arizona.edu
Resent-Message-Id: <88D10B1CC47F20B3CB@Arizona.EDU>
Message-Id: <9005301903.AA15300@sophist.uchicago.edu>
X-Envelope-To: icon-group@CS.Arizona.EDU
X-Vms-To: icon-group@Arizona.edu

I found the recent discussion on dividing a line into its fields very interesting. A lot of our files are organized in this way, so we have to use these techniques pretty often. For finding the n-th field of a line we use the following procedure:

    procedure field(line,n)
       if n = 1 then return line[1:find("\\",line)]
       every x := 1 + find("\\",line) \ (n - 1)
       return line[x:find("\\",line,x)]
    end

Icon is pretty good about letting us dispense with what I call disparagingly the "ij stuff" (this isn't a joke about Dutch, by the way :-), but rather a reference to the usual variable names used in explicit subscripting operations).

    procedure find_field(line,sep,n)
       x := 0
       line ? {
          until (x +:= 1) = n
          do tab(find(sep)+*sep) | fail
          target := tab(find(sep)|0)
       }
       return target
    end

Note that sep defines the field separator.
It has to be a string. It would be pretty easy to have it be itself a pattern. A few weeks ago I posted a program called find_re that works like find above, except that it takes an egrep-style regular expression. I have a new version around if anyone wants it. I see no reason to keep on posting code, when the old code works (it has some minor bugs) - at least until I can prod people into trying it out and letting me know if it works as it should. I can only be just so imaginative on my own :-).

Is this more elegant than what you posted? There is no disputing matters of taste. Take your pick. Probably I'd use a completely different approach, like making your field-finder a matching procedure used in string scanning expressions. What sort of context would you use this in?

-Richard L. Goerwitz
goer%sophist@uchicago.bitnet
goer@sophist.uchicago.edu
rutgers!oddjob!gide!sophist!goer
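
For anyone who wants to try the find_field procedure from the message above, a small driver might look like this. The procedure body is repeated from the posting with local declarations added; the sample line and the "no such field" fallback string are made up for the example.

```icon
procedure main()
   local line

   line := "field1\\field2\\field3"
   write(find_field(line, "\\", 2))
   write(find_field(line, "\\", 3))
   # find_field fails when there are fewer than n fields, so
   # alternation can supply a fallback value.
   write(find_field(line, "\\", 4) | "no such field")
end

# As posted above, with x and target declared local.
procedure find_field(line, sep, n)
   local x, target
   x := 0
   line ? {
      until (x +:= 1) = n
      do tab(find(sep) + *sep) | fail
      target := tab(find(sep) | 0)
   }
   return target
end
```

Because the procedure fails rather than returning an empty string for a missing field, callers can tell "field is empty" apart from "field does not exist".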