patterns.icn: Procedures for SNOBOL4-style pattern matching

link patterns
June 10, 1988; Ralph E. Griswold
This file is in the public domain.

These procedures provide procedural equivalents for most SNOBOL4
patterns and some extensions.

Procedures and their pattern equivalents are:

     Any(s)         ANY(S)
     Arb()          ARB
     Arbno(p)       ARBNO(P)
     Arbx(i)        ARB(I)
     Bal()          BAL
     Break(s)       BREAK(S)
     Breakx(s)      BREAKX(S)
     Cat(p1,p2)     P1 P2
     Discard(p)     /P
     Exog(s)        \S
     Find(s)        FIND(S)
     Len(i)         LEN(I)
     Limit(p,i)     P \ i
     Locate(p)      LOCATE(P)
     Marb()         longest-first ARB
     Notany(s)      NOTANY(S)
     Pos(i)         POS(I)
     Replace(p,s)   P = S
     Rpos(i)        RPOS(I)
     Rtab(i)        RTAB(I)
     Span(s)        SPAN(S)
     String(s)      S
     Succeed()      SUCCEED
     Tab(i)         TAB(I)
     Xform(f,p)     F(P)
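
As a rough illustration of how procedures of this kind can be built on
Icon's string scanning functions, here is a minimal sketch of three of
them. The definitions are stand-ins written for this example, on the
assumption that each pattern is a generator that moves &pos and produces
the matched substring; they are not claimed to match the library's own
source.

```icon
# Illustrative stand-ins only; the library's own definitions may differ.

procedure Span(s)               # SPAN(S): longest initial run of characters in s
   suspend tab(many(s))
end

procedure Len(i)                # LEN(I): exactly i characters
   suspend move(i)
end

procedure Arb()                 # ARB: shortest match first, then longer ones
   suspend tab(&pos to *&subject + 1)
end

procedure main()
   "2024-06" ? {
      write(Span('0123456789'))      # writes 2024
      write(Len(1))                  # writes -
      every write("[", Arb(), "]")   # writes [], then [0], then [06]
      }
end
```

Because each procedure suspends rather than returns, a later failure can
resume it, at which point tab() and move() restore the previous position.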

The following procedures relate to the application and control
of pattern matching:

     Apply(s,p)     S ? P
     Mode()         anchored or unanchored matching, depending on
                    whether Mode is set to Anchor or Float
     Anchor()       &ANCHOR = 1  (in effect when Mode := Anchor)
     Float()        &ANCHOR = 0  (in effect when Mode := Float)
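
The difference between anchored and unanchored application can be
sketched as follows. All names ending in "Sketch", and the sample pattern
Digits, are hypothetical and exist only in this example. The sketch
assumes that the mode is a global variable holding a procedure and that
the pattern is passed as a procedure taking no arguments. The library's
actual Apply may differ in detail.

```icon
# Hypothetical sketch; none of these names are part of the library.

global ModeSketch               # holds the procedure that positions the match

procedure AnchorSketch()        # anchored: match only at the current position
   suspend ""
end

procedure FloatSketch()         # unanchored: try each start position in turn
   suspend tab(&pos to *&subject + 1)
end

procedure ApplySketch(s, p)     # roughly S ? P, with p a parameterless procedure
   suspend s ? (ModeSketch() & p())
end

procedure Digits()              # sample pattern: a run of decimal digits
   suspend tab(many('0123456789'))
end

procedure main()
   ModeSketch := FloatSketch
   write(ApplySketch("abc123", Digits))                # writes 123
   ModeSketch := AnchorSketch
   write(ApplySketch("abc123", Digits) | "no match")   # writes no match
end
```

In the unanchored case the positioning procedure is resumed each time the
pattern fails, so the match is retried one character further along.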

In addition to the procedures above, the following expressions
can be used:

     p1() | p2()    P1 | P2
     v <- p()       P . V  (approximate)
     v := p()       P $ V  (approximate)
     fail           FAIL
     =s             S  (in place of String(s))
     p1() || p2()   P1 P2  (in place of Cat(p1,p2))
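
The two assignment forms differ in what happens when a later component
fails. The following sketch uses only built-in Icon operations; the
helper name AfterFailure is hypothetical. It shows that reversible
assignment (<-) is undone on backtracking, much as a conditional
assignment never takes effect when the match fails, while ordinary
assignment (:=) persists, as an immediate assignment does.

```icon
procedure AfterFailure()        # hypothetical helper: returns [v, w]
   local v, w
   v := w := "unset"
   # In both scans ="ab" succeeds, but the following ="x" fails, so the
   # whole conjunction backtracks and the scan fails.
   "abc" ? ((v <- ="ab") & ="x")
   "abc" ? ((w := ="ab") & ="x")
   return [v, w]
end

procedure main()
   local r
   r := AfterFailure()
   write(r[1])                  # writes unset: <- restored the old value
   write(r[2])                  # writes ab: := kept the assignment
end
```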

Using this system, most SNOBOL4 patterns can be satisfactorily
transliterated into Icon procedures and expressions. For example,
the pattern

        SPAN("0123456789") $ N "H" LEN(*N) $ LIT

can be transliterated into

        (n <- Span('0123456789')) || ="H" ||
           (lit <- Len(n))
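
To make the transliteration concrete, here is a self-contained sketch
that runs it against sample strings. Span and Len are minimal stand-ins
defined locally so the example does not need the library, and the helper
name Hollerith is hypothetical. The scan is anchored at the start of the
string.

```icon
procedure Span(s)               # stand-in for the library's Span
   suspend tab(many(s))
end

procedure Len(i)                # stand-in for the library's Len
   suspend move(i)
end

procedure Hollerith(s)          # hypothetical helper: produces the literal or fails
   local n, lit
   s ? {
      if (n <- Span('0123456789')) || ="H" || (lit <- Len(n)) then
         return lit
      }
end

procedure main()
   write(Hollerith("3HABCDEF"))             # writes ABC
   write(Hollerith("3HAB") | "no match")    # writes no match
end
```

The digit string assigned to n is converted to an integer automatically
when it is passed to move(). In the second call only two characters
follow the H, so Len(n) fails and the whole match fails.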

Concatenation of components is necessary to preserve the
pattern-matching properties of SNOBOL4.
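
One way to see why concatenation matters is to compare it with the
alternatives. In the sketch below, Arb is a local stand-in and the three
helper names are hypothetical. With concatenation, a failing component
causes the earlier one to be resumed and the result is the whole matched
string. With conjunction, backtracking still occurs but only the last
component's value is produced. With separate expressions, the earlier
component cannot be resumed at all.

```icon
procedure Arb()                 # stand-in: shortest match first
   suspend tab(&pos to *&subject + 1)
end

procedure Joined(s)             # concatenation: backtracks, yields whole match
   return s ? (Arb() || ="c")
end

procedure Conjoined(s)          # conjunction: backtracks, yields last value only
   return s ? (Arb() & ="c")
end

procedure Separate(s)           # separate expressions: Arb is never resumed
   return s ? { Arb(); ="c" }
end

procedure main()
   write(Joined("abc") | "no match")      # writes abc
   write(Conjoined("abc") | "no match")   # writes c
   write(Separate("abc") | "no match")    # writes no match
end
```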

Caveat: Simulating SNOBOL4 pattern matching with the procedures
above is inefficient compared with writing the equivalent Icon
string scanning expressions directly.

