Icon Program Library: procs/soundex1.icn

soundex1.icn: Procedures for Soundex algorithm

link soundex1
April 30, 1993; John David Stone
This file is in the public domain.

When names are communicated by telephone, they are often transcribed
incorrectly.  An organization that has to keep track of many names
therefore needs some system of representing or encoding a name that
mitigates the effects of transcription errors.  One idea, originally
proposed by Margaret K. Odell and Robert C. Russell, uses the following
encoding system to bring together occurrences of the same surname,
variously spelled:

Encode each of the letters of the name according to the
following equivalences:

      a, e, h, i, o, u, w, y -> *
      b, f, p, v             -> 1
      c, g, j, k, q, s, x, z -> 2
      d, t                   -> 3
      l                      -> 4
      m, n                   -> 5
      r                      -> 6
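
The equivalences above amount to a character translation table.  The
following is a minimal sketch in Python rather than Icon, and is not taken
from soundex1.icn; the names GROUPS, CODE_TABLE, and encode_letters are
illustrative only.  It assumes the name contains nothing but letters and
lower-cases it before translating.

```python
# Letter groups copied from the table above: code -> letters sharing that code.
GROUPS = {
    "*": "aehiouwy",
    "1": "bfpv",
    "2": "cgjkqsxz",
    "3": "dt",
    "4": "l",
    "5": "mn",
    "6": "r",
}

# Invert the groups into a per-letter translation table.
CODE_TABLE = str.maketrans(
    {letter: code for code, letters in GROUPS.items() for letter in letters}
)

def encode_letters(name):
    """Replace every letter of `name` with its code (hypothetical helper)."""
    # Assumption: characters outside a-z are left unchanged by translate().
    return name.lower().translate(CODE_TABLE)

print(encode_letters("Robert"))   # 6*1*63
print(encode_letters("Pfister"))  # 11*23*6
```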


If any two adjacent letters have the same code, change the code for the
second one to *.
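
The adjacency rule can be sketched as a single pass over a string of
codes.  This Python fragment is an illustration, not the library's
procedure, and collapse_adjacent is a made-up name.  It assumes that each
code is compared with the original code of the letter before it, so that a
run of three or more equal codes keeps only its first member; the
description above does not say how such runs are handled.

```python
def collapse_adjacent(codes):
    """Return `codes` with each repeat of the preceding code changed to '*'."""
    out = []
    for i, code in enumerate(codes):
        # Assumption: compare against the original (unmodified) previous code.
        if i > 0 and code == codes[i - 1]:
            out.append("*")
        else:
            out.append(code)
    return "".join(out)

print(collapse_adjacent("11*23*6"))  # 1**23*6  (codes for "Pfister")
print(collapse_adjacent("3*522*2"))  # 3*52**2  (codes for "Tymczak")
```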

The Soundex representation consists of four characters: the initial letter
of the name, and the first three digit (non-asterisk) codes corresponding
to letters after the initial.  If there are fewer than three such digit
codes, use all that there are, and add zeroes at the end to make up the
four-character representation.
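
Putting the steps together, one possible reading of the whole encoding is
sketched below in Python.  It is not the code in soundex1.icn, and
soundex_sketch is a hypothetical name.  Several details are assumptions
that the description leaves open: characters that are not letters are
ignored, the initial letter is returned in upper case, an empty name yields
an empty string, and repeated codes are detected by comparison with the
original code of the preceding letter.

```python
GROUPS = {"*": "aehiouwy", "1": "bfpv", "2": "cgjkqsxz",
          "3": "dt", "4": "l", "5": "mn", "6": "r"}
CODE_OF = {letter: code for code, letters in GROUPS.items() for letter in letters}

def soundex_sketch(name):
    """Four-character representation following the description above."""
    # Assumption: drop anything that is not one of the 26 letters.
    letters = [ch for ch in name.lower() if ch in CODE_OF]
    if not letters:
        return ""  # assumption: nothing to encode
    codes = [CODE_OF[ch] for ch in letters]
    # Keep digit codes after the initial, skipping '*' and repeats of the
    # preceding letter's code (the initial letter's code counts as preceding).
    digits = [
        code
        for i, code in enumerate(codes)
        if i > 0 and code != "*" and code != codes[i - 1]
    ]
    # Initial letter plus at most three digits, padded with zeroes to length 4.
    return (letters[0].upper() + "".join(digits[:3])).ljust(4, "0")

print(soundex_sketch("Robert"))   # R163
print(soundex_sketch("Rupert"))   # R163
print(soundex_sketch("Lee"))      # L000
```

Under these assumptions "Robert" and "Rupert" receive the same
representation, which is the grouping effect the encoding is meant to
achieve.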
