TopoVista Tile Format

Version 4F

Overview

A TopoVista tile is a square grid of elevation values accompanied by a tree of error values. Data is sequenced to allow progressive display, with initial coarse data repeatedly refined by subsequent data that doubles the resolution. The data stream is segmented to allow selective discarding by an intelligent router. A compact, byte-oriented data format is further compressed using gzip.

A grid of order k has 2^k+1 points along each edge: an order 3 grid has 9×9=81 points forming 8×8=64 square cells. The points along an edge are shared by two adjacent grids. A grid of order k also has an error tree of 2k levels. Some or all of these levels are omitted from the file, trading space savings for computational cost.
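The size arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name grid_dimensions is hypothetical and not part of the format.

```python
def grid_dimensions(k: int) -> tuple:
    """Return (points_per_edge, total_points, total_cells) for a grid of order k."""
    if k < 0:
        raise ValueError("grid order must be non-negative")
    cells_per_edge = 2 ** k               # the edge is divided into 2^k cells
    points_per_edge = cells_per_edge + 1  # one more point than cells along an edge
    return points_per_edge, points_per_edge ** 2, cells_per_edge ** 2


# Order 3, as in the text: 9 points per edge, 81 points, 64 cells.
print(grid_dimensions(3))
```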

File Names

For tiles based on UTM data, the file names are conventionally of the form xxxUyyyc.tvg, encoding a location within a UTM zone. (TopoVista data does not tile across zone boundaries.)

xxx and yyy are X and Y coordinates, as described below. They are unsigned decimal integers with no leading zeroes, so xxx or yyy may be shorter or longer than three characters.

U is a character encoding the UTM zone, with A meaning zone 1 and Z meaning zone 26. This covers longitudes 24W to 180W, including all of the US except for the most westerly Aleutian Islands.

c is a character identifying the grid order: a means k=1, b means k=2, etc. For a typical tile size of 256×256 cells (257×257 points), k=8 and the character c is h.

The character c also defines the units used for the coordinates, which are counted in gridwidths. A gridwidth depends on both the size and the resolution of a grid. For 30 meter resolution and k=8 (c=h), the gridwidth is 30×256 meters.

xxx and yyy give the coordinates of the southwest (lower left) corner of the grid. For UTM data, the origin is the point 10 meters east of the intersection of the equator with the central meridian of the zone. The reason for the 10 meter offset is that the 30-meter USGS data grid does not include the central meridian.

Each X coordinate is biased by 500 to avoid negative values. A bias of 500 is sufficient to handle a worst case of a 64×64 grid at 10 meter resolution at the southernmost latitude of Hawaii. Y values are assumed to be north of the equator and so are unbiased.

To give a concrete example, for 30 meter data in UTM zone 12 (L), the filename 502L465h.tvg specifies a 256×256 grid (c = h). Removing the bias of 500 from the X value 502 leaves 2, so the corner is 2×256×30+10 meters east and 465×256×30 meters north of (0 N, 111 W). This is a 7.7 km square surrounding Sabino Canyon (in the Santa Catalina Mountains near Tucson, Arizona).
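The naming rules above can be sketched in Python as follows. This is not the TopoVista code: the names TileName, parse_tile_name, sw_corner_offset_m and central_meridian_deg are hypothetical. The resolution is not encoded in the file name, so it is passed in by the caller, and the sketch assumes the 10 meter origin offset applies as described for the 30-meter USGS data.

```python
import re
from typing import NamedTuple, Tuple


class TileName(NamedTuple):
    x: int      # biased X coordinate, in gridwidths
    zone: int   # UTM zone, 1..26
    y: int      # Y coordinate, in gridwidths
    order: int  # grid order k


# xxx U yyy c .tvg : digits, zone letter, digits, order letter
_NAME = re.compile(r"^(\d+)([A-Z])(\d+)([a-z])\.tvg$")


def parse_tile_name(name: str) -> TileName:
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"not a conventional tile name: {name!r}")
    x, zone_ch, y, order_ch = m.groups()
    zone = ord(zone_ch) - ord("A") + 1    # A means zone 1, Z means zone 26
    order = ord(order_ch) - ord("a") + 1  # a means k=1, h means k=8
    return TileName(int(x), zone, int(y), order)


def sw_corner_offset_m(tile: TileName, resolution_m: int,
                       origin_offset_m: int = 10) -> Tuple[int, int]:
    """Meters east and north of (equator, central meridian) for the SW corner."""
    gridwidth = resolution_m * 2 ** tile.order  # e.g. 30 x 256 meters
    east = (tile.x - 500) * gridwidth + origin_offset_m  # remove the X bias
    north = tile.y * gridwidth                           # Y is unbiased
    return east, north


def central_meridian_deg(zone: int) -> int:
    """Central meridian of a UTM zone in degrees east (negative means west)."""
    return 6 * zone - 183


tile = parse_tile_name("502L465h.tvg")
print(tile, sw_corner_offset_m(tile, 30), central_meridian_deg(tile.zone))
```

For the example file name this gives zone 12, k=8, and a corner 15370 meters east and 3571200 meters north of (0 N, 111 W).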

File Contents

Each file contains a series of variable-length records. These records are the basic units that could be discarded by a smart network router.

There are three types of records: header, elevation, and error. The header comes first. Elevation records and error records each appear in order of increasing detail, but the two streams may interleave in any sequence.

Each record consists of:

  1. four identifying characters TV4F
  2. one flag byte
  3. a three-byte (big-endian) integer giving the total record length
  4. data, as a TypedOutputStream, possibly gzipped

The flag byte values are:

Record lengths exceeding 65535 bytes, the limit of a two-byte length field, are expected to be very rare, but they are possible with large uncompressed grids. This is why the length field is three bytes wide.
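A reader might pick apart the fixed eight-byte record prefix as in the Python sketch below. The function name read_record_header is hypothetical. The sketch returns the flag byte and length uninterpreted: it does not assume any particular flag values, and it does not decide whether the "total record length" counts the eight prefix bytes themselves. Decoding the TypedOutputStream payload is outside its scope.

```python
import io
from typing import BinaryIO, Optional, Tuple

MAGIC = b"TV4F"  # four identifying characters


def read_record_header(stream: BinaryIO) -> Optional[Tuple[int, int]]:
    """Read one record prefix; return (flag, length), or None at end of stream."""
    head = stream.read(8)  # 4 magic + 1 flag + 3 length bytes
    if not head:
        return None
    if len(head) < 8:
        raise ValueError("truncated record prefix")
    if head[:4] != MAGIC:
        raise ValueError("bad record identifier")
    flag = head[4]
    length = int.from_bytes(head[5:8], "big")  # three-byte big-endian integer
    return flag, length


# Synthetic prefix for illustration only; the flag value 1 is arbitrary.
sample = MAGIC + bytes([1]) + (70000).to_bytes(3, "big")
print(read_record_header(io.BytesIO(sample)))
```

The sample length of 70000 does not fit in two bytes, which is the case the three-byte field exists for.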

Header Record

The header record always comes first, and contains these fields:

  1. t   title string (may be empty)
  2. v   version string (may be empty)
  3. c   comment string (may be empty)
  4. r   horizontal resolution in meters (typically 10 or 30)
  5. z   UTM zone, if applicable, else zero
  6. d   coordinate datum (1=NAD27, 2=WGS72, 3=WGS84, 4=NAD83)
  7. k   grid order
  8. e   number of levels of error tree present
  9. x   coordinate of W edge (biased by 500)
  10. y   coordinate of S edge
  11. g   elevation granularity (height of one unit, in mm)
  12. l   minimum elevation
  13. h   maximum elevation

Additional fields, either integer or string, may be added later to extend the format. Tile-reading programs should ignore any extra fields for which they are unprepared.

Elevation Records

Each elevation record is a sequence of integer values representing elevation measurements. A value of −1000 (regardless of units) represents missing data.

The first elevation record gives the corner elevations and defines the minimal four-point k=0 grid. Each subsequent record increases k by 1, doubling the resolution and quadrupling the number of points. Note that each record corresponds to two levels of the triangulation.

For the k=0 record, the grid's four corners are given SW, SE, NW, NE. Subsequent records give

  1. new points added to existing rows
  2. new rows of points

For example, consider this k=2 grid:

    3 e 6 f 4
    l m n o p
    7 c 8 d 9
    g h i j k
    1 a 5 b 2

The k=0 record starts with the elevations of points 1,2,3,4. The k=1 record adds 5,6 and then 7,8,9. The k=2 record adds a,b,c,d,e,f and then g,h,i,j,k and l,m,n,o,p.
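The point ordering in this example can be reproduced with a short Python sketch. It assumes, as the example suggests, that rows are visited south to north and points within a row west to east. The names record_points, EXAMPLE and labels are hypothetical. Coordinates are (row, column) on the grid of order k, with row 0 as the southern edge.

```python
from typing import Iterator, Tuple


def record_points(k: int) -> Iterator[Tuple[int, int]]:
    """Yield (row, col) of the points carried by the order-k elevation record."""
    if k == 0:
        # The four corners: SW, SE, NW, NE.
        yield from [(0, 0), (0, 1), (1, 0), (1, 1)]
        return
    n = 2 ** k
    # 1. New points added to existing rows (even rows, odd columns).
    for row in range(0, n + 1, 2):
        for col in range(1, n, 2):
            yield (row, col)
    # 2. New rows of points (odd rows, every column).
    for row in range(1, n, 2):
        for col in range(n + 1):
            yield (row, col)


# The k=2 example grid from the text, listed south to north.
EXAMPLE = ["1a5b2", "ghijk", "7c8d9", "lmnop", "3e6f4"]


def labels(k: int, full_order: int = 2) -> str:
    """Labels of the example grid in the order the order-k record lists them."""
    step = 2 ** (full_order - k)  # scale order-k coordinates up to the full grid
    return "".join(EXAMPLE[r * step][c * step] for r, c in record_points(k))


for k in range(3):
    print(k, labels(k))
```

The three records list the labels as 1234, then 56789, then abcdefghijklmnop, matching the description above.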

After the first four points, each new point is midway between two existing points and is given as a difference from the expected (interpolated) elevation.

Error Records

The file may include precalculated error values for the TopoVista triangulation. This is a binary tree traversed breadth first, with one tree level per elevation record.

The reader must reconstruct any levels missing from the tree. The bottom tree level (the leaves, with error=0) is always omitted. The next two levels are easily recalculated and also usually omitted. Higher levels can also be omitted to save space, but at higher levels the space requirement decreases and the calculation cost increases.

The error trees (only) are dependent on the TopoVista triangulation. Division of the top-level square into triangles is controlled by the parities of the X and Y values; see the code for details.

An error record is just a sequence of integer error entries. Each entry is given as a difference, generally negative, from the value of the parent triangle's error.

