Developing Time-Oriented Database Applications in SQL

Richard T. Snodgrass

Principles

Throughout the book, principles summarize the material. These 200-odd principles, which convey much of the technical content of the book, are gathered here. If you understand the principles, you are well on your way to being a temporal database expert.

The page number on which each principle appears is given in parentheses.

"At least one" queries can be easily stated using an additional correlation name in the FROM clause (p. 5).

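As a minimal sketch of the "at least one" pattern, the following Python snippet runs an SQL self-join through the standard sqlite3 module. The enrollment table and its data are hypothetical, invented only for illustration; the second correlation name (e2) is what lets the query ask for students who share at least one course with Ann.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enrollment (student TEXT, course TEXT);  -- hypothetical table
INSERT INTO enrollment VALUES
  ('Ann', 'DB'), ('Ann', 'OS'),
  ('Bob', 'DB'), ('Cy', 'AI');
""")

# e1 ranges over Ann's rows, e2 over everyone else's rows.
# A student qualifies if at least one pair (e1, e2) agrees on the course.
rows = conn.execute("""
SELECT DISTINCT e2.student
FROM enrollment AS e1, enrollment AS e2
WHERE e1.student = 'Ann'
  AND e2.student <> 'Ann'
  AND e1.course = e2.course
ORDER BY e2.student
""").fetchall()

shared = [r[0] for r in rows]
print(shared)
```

DISTINCT is needed because a student who shares several courses with Ann would otherwise appear once per shared course.
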
An instant is an anchored location on the time line. An SQL-92 datetime denotes an instant (p. 26).

The SQL-92 datetime types DATE, TIME, and TIMESTAMP differ in the fields (year, month, day, hour, minute, and second) they contain (p. 28).

The time zone can be stored with SQL-92 TIME and TIMESTAMP values (p. 29).

An interval is an unanchored, directional duration of the time line (p. 30).

Intervals have a qualifier that specifies the leading field, an optional trailing field, and an optional precision for the leading and trailing fields (p. 31).

Year-month intervals contain a year, a month, or both fields (p. 31).

Day-time intervals contain day, hour, minute, and second fields, in any contiguous sequence (p. 32).

No vendor supports SQL-92 at the Full SQL level of conformance. All products include idiosyncrasies in their temporal support that render porting to other DBMSs difficult (p. 42).

The year 2000 problem is a specific instance of a more general problem of an (often unstated) assumption that will be invalidated purely by the course of time (p. 65).

An instant has no duration, but its representation as a particular granule always does (p. 75).

Which second an SQL timestamp before 1958 denotes is not adequately specified in the standard (p. 77).

An SQL-92 TIME value is really an interval that can be added to midnight of a particular day to specify an instant (p. 79).

Use TIME (without time zones) exceedingly carefully, as the standard is imprecise and defective in its application of implicit time zones (p. 80).

Use TIME WITH TIME ZONE carefully, as the time zone stored in such a value is often ignored (p. 80).

Whether or not leap seconds are included in day-time intervals is not specified in the SQL-92 standard (p. 81).

A period is an anchored duration of the time line (p. 89).

The primary representations of periods are closed-closed and closed-open pairs of datetimes, and a pair of a starting datetime and an interval, with both components of the same granularity (p. 90).

The time zone of a period, if any, should be stored with the first datetime of the representation (p. 90).

Equality testing on periods is highly dependent on their underlying representation (p. 90).

While datetimes and intervals are totally ordered, periods are only partially ordered, with 13 possible relationships between two periods (p. 91).

The preferred representation of a period is a closed-open pair of datetimes (p. 91).

When comparing a datetime with a period, consider the datetime to be a period of a single granule in duration (p. 92).
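
The 13 relationships can be enumerated mechanically once periods are held as closed-open pairs. The sketch below is one possible classification in Python; the relationship names follow the conventional Allen terminology, which is an assumption here rather than something this list states, and the helper names relate and as_period are hypothetical. It assumes every period satisfies start < end and that the granularity is one day.

```python
from datetime import date, timedelta

def relate(a, b):
    """Classify closed-open period a = (start, end) against period b."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only the two partial-overlap cases remain.
    return "overlaps" if s1 < s2 else "overlapped-by"

def as_period(day):
    """Treat a datetime as a period of a single granule (here: one day)."""
    return (day, day + timedelta(days=1))

january = (date(2020, 1, 1), date(2020, 2, 1))
february = (date(2020, 2, 1), date(2020, 3, 1))
print(relate(january, february))                      # periods that meet
print(relate(as_period(date(2020, 1, 15)), january))  # a day within January
```

With the closed-open convention, two periods that meet share no instant, so a test such as e1 == s2 needs no granule arithmetic.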

SQL is adequate to handle queries on isolated temporal columns (p. 113).

A valid-time table records the history of the modeled reality. The history can be retained by adding timestamp column(s) (p. 116).

The original primary key is not, by itself, a primary key of the temporal table (p. 117).

Adding the timestamp does not serve to convert a nontemporal key to a temporal key (p. 118).

A sequenced constraint is one that is applied independently at each point in time (p. 118).

A sequenced primary key can be expressed as an SQL assertion or table constraint (p. 118).
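
SQLite, used here through Python's sqlite3 module, does not support CREATE ASSERTION, so the following sketch expresses the sequenced primary key condition as a query that returns the offending key values. An assertion would wrap the same condition in NOT EXISTS. The incumbent table, its columns, and the data are hypothetical, and rowid is an SQLite-specific way to tell two rows apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incumbent (ssn TEXT, pcn TEXT, start_date TEXT, end_date TEXT);
INSERT INTO incumbent VALUES
  ('111', 'A', '2020-01-01', '2020-06-01'),
  ('111', 'A', '2020-06-01', '2021-01-01'),  -- meets the previous row: allowed
  ('222', 'B', '2020-01-01', '2020-09-01'),
  ('222', 'B', '2020-07-01', '2021-01-01');  -- overlaps the previous row
""")

# Two different rows with the same key violate the sequenced primary key
# if their closed-open periods overlap at any instant.
VIOLATIONS = """
SELECT DISTINCT i1.ssn, i1.pcn
FROM incumbent AS i1, incumbent AS i2
WHERE i1.ssn = i2.ssn AND i1.pcn = i2.pcn
  AND i1.rowid < i2.rowid
  AND i1.start_date < i2.end_date
  AND i2.start_date < i1.end_date
"""
violations = conn.execute(VIOLATIONS).fetchall()
print(violations)
```

Note that the two rows for ('222', 'B') have different start dates, so a conventional key on (ssn, pcn, start_date) would accept them even though they overlap.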

The special value "now" can be stored as a specific instant value that will not occur otherwise (p. 119).

"Now" can also be represented with a null value, but this complicates queries (p. 120).

"Now" can be represented with "forever," or a close approximation. However, it still renders the data a rather inaccurate model of reality (p. 120).

Two rows are value-equivalent if the values of their nontimestamp columns are identical. Value equivalence is a weak form of duplication (p. 121).

Two rows are sequenced duplicates if they are duplicates at some instant (p. 121).

Two rows are current duplicates if they are sequenced duplicates at the current instant (p. 121).

Two rows are nonsequenced duplicates if the values of all columns are identical (p. 122).

The SQL UNIQUE constraint prevents nonsequenced duplicates (p. 122).

Nonsequenced uniqueness constraints are easy to specify in SQL, but do not correspond to a naturally stated condition on the modeled reality (p. 123).

Current uniqueness constraints require an SQL constraint or assertion, and are rather fragile (p. 123).

The SQL UNIQUE constraint with the end date prevents current duplicates, if future data is never stored (p. 124).

Sequenced uniqueness constraints are also specified with an SQL constraint or assertion. Such constraints are analogous to conventional uniqueness constraints on nontemporal tables (p. 124).

Current referential integrity requires an SQL constraint or assertion (p. 127).

Nonsequenced referential integrity is easy to express, but is unnatural (p. 128).

Sequenced referential integrity is the natural extension to time-varying tables, but requires a complex SQL assertion (p. 129).

Exploiting contiguous histories in the referenced table simplifies sequenced referential integrity when both tables are temporal (p. 130).

The case where the referencing table is nontemporal but the referenced table is temporal reduces to the other cases just described (p. 131).

Temporal constraints and assertions should be DEFERRABLE INITIALLY DEFERRED, with each transaction containing a modification resetting this via a SET CONSTRAINTS ALL DEFERRED (p. 132).

Executing a query on the current state of a temporal table requires an additional predicate (p. 143).

Current joins over two temporal tables are not that much harder (p. 144).

Time-slice queries, over a previous state, require an additional predicate for each temporal table (p. 145).
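
The additional predicate for current and time-slice queries can be sketched as follows, again running SQL through Python's sqlite3 module. The incumbent table is hypothetical, periods are closed-open, and "forever" is approximated by 9999-12-31. In SQL proper, the current query would compare against CURRENT_DATE; the instant is passed as a parameter here so that the same statement serves both cases.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incumbent (ssn TEXT, pcn TEXT, start_date TEXT, end_date TEXT);
INSERT INTO incumbent VALUES
  ('111', 'A', '2019-01-01', '2020-01-01'),
  ('111', 'B', '2020-01-01', '9999-12-31'),
  ('222', 'A', '2020-01-01', '9999-12-31');
""")

def positions_as_of(day):
    """Return (ssn, pcn) pairs valid at the given ISO date string."""
    return conn.execute(
        """SELECT ssn, pcn FROM incumbent
           WHERE start_date <= ? AND ? < end_date   -- the extra predicate
           ORDER BY ssn, pcn""",
        (day, day),
    ).fetchall()

current = positions_as_of(date.today().isoformat())  # current state
past = positions_as_of("2019-06-01")                 # time slice
print(current, past)
```

A join over two temporal tables would repeat the predicate once per table, which is why time-slice queries need one additional predicate for each temporal table.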

A selection (a predicate over a nontimestamp column) is a sequenced selection on a temporal table (p. 146).

A sequenced projection of specified columns in the SELECT clause can be effected by including the timestamp columns (p. 146).

A query using ORDER BY is automatically sequenced, whether or not the timestamp columns are retained (p. 148).

A UNION ALL over temporal tables is automatically sequenced if the timestamp columns are retained (p. 148).

A sequenced join requires four SELECT statements and complex inequality predicates (p. 150).
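
One way to see why four SELECT statements arise: the result period is the intersection of the two input periods, and there are four cases according to which row starts later and which row ends earlier. The sketch below is one formulation of that case analysis, not necessarily the exact statements given on the cited page; the salary and dept tables are hypothetical and periods are closed-open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salary (ssn TEXT, amount INTEGER, start_date TEXT, end_date TEXT);
CREATE TABLE dept   (ssn TEXT, name TEXT,      start_date TEXT, end_date TEXT);
INSERT INTO salary VALUES
  ('111', 50, '2020-01-01', '2020-07-01'),
  ('111', 60, '2020-07-01', '2021-01-01');
INSERT INTO dept VALUES
  ('111', 'Toys', '2020-03-01', '2020-10-01');
""")

SEQUENCED_JOIN = """
-- Case 1: the salary period lies within the dept period
SELECT s.ssn, s.amount, d.name, s.start_date AS start_date, s.end_date AS end_date
FROM salary AS s, dept AS d
WHERE s.ssn = d.ssn
  AND d.start_date <= s.start_date AND s.end_date <= d.end_date
UNION ALL
-- Case 2: salary starts later, dept ends earlier
SELECT s.ssn, s.amount, d.name, s.start_date, d.end_date
FROM salary AS s, dept AS d
WHERE s.ssn = d.ssn
  AND d.start_date <= s.start_date AND d.end_date < s.end_date
  AND s.start_date < d.end_date
UNION ALL
-- Case 3: dept starts later, salary ends earlier
SELECT s.ssn, s.amount, d.name, d.start_date, s.end_date
FROM salary AS s, dept AS d
WHERE s.ssn = d.ssn
  AND s.start_date < d.start_date AND s.end_date <= d.end_date
  AND d.start_date < s.end_date
UNION ALL
-- Case 4: the dept period lies strictly within the salary period
SELECT s.ssn, s.amount, d.name, d.start_date, d.end_date
FROM salary AS s, dept AS d
WHERE s.ssn = d.ssn
  AND s.start_date < d.start_date AND d.end_date < s.end_date
ORDER BY 4
"""
joined = conn.execute(SEQUENCED_JOIN).fetchall()
for row in joined:
    print(row)
```

The four WHERE clauses are mutually exclusive, so UNION ALL does not introduce duplicates that were not already present in the inputs.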

A sequenced NOT EXISTS (EXCEPT, NOT IN) requires four SELECT statements, each with a nested NOT EXISTS (p. 155).

A nonsequenced query considers the timestamp columns as just additional columns (p. 158).

Coalescing reduces the number of rows by merging the periods of validity of value-equivalent rows (p. 160).

Coalescing may remove or retain sequenced duplicates (p. 160).

Coalescing with duplicate removal can be done with a single (complex) SQL statement (p. 166).
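
To show the intended result of coalescing with duplicate removal, here is a procedural sketch in Python. It deliberately swaps the single SQL statement for a sort-and-merge loop, since that makes the merging rule easier to read: value-equivalent rows whose closed-open periods overlap or meet are combined. The function name coalesce and the (value, start, end) row layout are assumptions made for illustration.

```python
from itertools import groupby

def coalesce(rows):
    """Merge value-equivalent rows whose closed-open periods overlap or meet.

    rows: iterable of (value, start, end) tuples with start < end.
    """
    merged = []
    for value, group in groupby(sorted(rows), key=lambda r: r[0]):
        cur_start = cur_end = None
        for _, start, end in group:          # sorted by start within a value
            if cur_end is not None and start <= cur_end:
                cur_end = max(cur_end, end)  # overlaps or meets: extend
            else:
                if cur_end is not None:
                    merged.append((value, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((value, cur_start, cur_end))
    return merged

rows = [
    ("A", "2020-03-01", "2020-05-01"),
    ("A", "2020-01-01", "2020-03-01"),  # meets the row above
    ("A", "2020-04-01", "2020-06-01"),  # overlaps the first row
    ("A", "2020-08-01", "2020-09-01"),  # separated by a gap: kept apart
    ("B", "2020-01-01", "2020-02-01"),
]
result = coalesce(rows)
print(result)
```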

To coalesce while retaining duplicates, it is necessary to merge value-equivalent rows whose periods of validity meet (p. 169).

A current modification concerns something that happens right now (p. 177).

Ensuring uniqueness requires a WHERE predicate, an augmented primary key constraint, or a uniqueness constraint (p. 179).

Ensuring referential integrity with a current insertion in the restricted case requires an additional WHERE predicate (p. 180).

Filling the gap in the referenced table is an easy way to ensure referential integrity for current insertions into the referencing table (p. 180).

When unrestricted modifications are possible, such modifications may generate gaps that must be filled to ensure referential integrity (p. 182).

In the restricted case, a current deletion is converted into an update of the end date (p. 183).

In the general case, a current deletion is implemented as an update, for those currently valid periods, and a delete, for those periods starting in the future (p. 184).

A current update in the restricted case is implemented by an update to end the current row at "now" and an insertion of the new values (p. 184).
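
The restricted-case current deletion and current update can be sketched as below. The assumptions are loud ones: the incumbent table is hypothetical, "forever" is stored as 9999-12-31, no future data is ever stored, and "now" is passed in as a parameter rather than read from CURRENT_DATE so that the example is repeatable. The sketch also ignores the corner case of a row that started at "now".

```python
import sqlite3

FOREVER = "9999-12-31"
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incumbent (ssn TEXT, pcn TEXT, start_date TEXT, end_date TEXT);
INSERT INTO incumbent VALUES
  ('111', 'A', '2019-01-01', '9999-12-31'),
  ('222', 'C', '2019-01-01', '9999-12-31');
""")

def current_delete(ssn, now):
    """Current deletion: end the currently valid row at 'now'."""
    with conn:
        conn.execute(
            "UPDATE incumbent SET end_date = ? WHERE ssn = ? AND end_date = ?",
            (now, ssn, FOREVER))

def current_update(ssn, new_pcn, now):
    """Current update: end the current row at 'now', then insert new values."""
    with conn:
        ended = conn.execute(
            "UPDATE incumbent SET end_date = ? WHERE ssn = ? AND end_date = ?",
            (now, ssn, FOREVER)).rowcount
        if ended:  # only insert if there was a current row to update
            conn.execute("INSERT INTO incumbent VALUES (?, ?, ?, ?)",
                         (ssn, new_pcn, now, FOREVER))

current_update("111", "B", "2020-05-01")
current_delete("222", "2020-06-01")
history = conn.execute(
    "SELECT * FROM incumbent ORDER BY ssn, start_date").fetchall()
for row in history:
    print(row)
```

Both operations leave the old row in place with a shortened period, which is how the history is retained.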

A current update in the general case is implemented by two updates and an insertion (p. 187).

A current modification is simply a sequenced modification with a period of applicability of "now" to "forever" (p. 188).

A sequenced deletion is implemented by four statements: an insertion, two updates, and a deletion (p. 193).
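
The four statements can be sketched as follows for a deletion over a period of applicability [pa_start, pa_end). This is an illustrative reading of the principle, with a hypothetical incumbent table, closed-open periods, and a hypothetical helper named sequenced_delete. The insertion must run first, while rows spanning the whole period of applicability are still intact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incumbent (ssn TEXT, pcn TEXT, start_date TEXT, end_date TEXT);
INSERT INTO incumbent VALUES
  ('111', 'A', '2019-01-01', '2020-04-01'),
  ('111', 'B', '2020-04-01', '2020-05-01'),
  ('111', 'C', '2020-05-01', '2021-01-01'),
  ('222', 'D', '2019-01-01', '2021-01-01');
""")

def sequenced_delete(ssn, pa_start, pa_end):
    with conn:
        # 1. Insertion: rows spanning the whole period keep their tail.
        conn.execute(
            """INSERT INTO incumbent
               SELECT ssn, pcn, ?, end_date FROM incumbent
               WHERE ssn = ? AND start_date < ? AND end_date > ?""",
            (pa_end, ssn, pa_start, pa_end))
        # 2. Update: rows starting before the period are cut at its start.
        conn.execute(
            """UPDATE incumbent SET end_date = ?
               WHERE ssn = ? AND start_date < ? AND end_date > ?""",
            (pa_start, ssn, pa_start, pa_start))
        # 3. Update: rows ending after the period now begin at its end.
        conn.execute(
            """UPDATE incumbent SET start_date = ?
               WHERE ssn = ? AND start_date < ? AND end_date > ?""",
            (pa_end, ssn, pa_end, pa_end))
        # 4. Deletion: rows entirely within the period are removed.
        conn.execute(
            """DELETE FROM incumbent
               WHERE ssn = ? AND start_date >= ? AND end_date <= ?""",
            (ssn, pa_start, pa_end))

sequenced_delete("111", "2020-03-01", "2020-06-01")
sequenced_delete("222", "2020-03-01", "2020-06-01")
after = conn.execute(
    "SELECT * FROM incumbent ORDER BY ssn, start_date").fetchall()
for row in after:
    print(row)
```

The row for '222' shows why an insertion is needed at all: a single stored period is split into two, one on each side of the deleted period.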

A sequenced deletion can violate a nonsequenced uniqueness constraint. It cannot violate a current or sequenced uniqueness constraint (p. 193).

A sequenced update is implemented by five statements: two insertions and three updates (p. 194).

Nonsequenced modifications are usually difficult to state in English because they are expressed in terms of the representation, but easier to express in SQL for the same reason (p. 197).

Nonsequenced modifications are rare (p. 198).

Correctly implementing a nonsequenced modification in the presence of sequenced constraints is difficult and must be done on a case-by-case basis (p. 198).

Current modifications that mention other tables require an additional overlap predicate for each correlation name (p. 200).

Sequenced updates referring to other tables are complex to convert into SQL (p. 206).

Temporal partitioning finesses the limitation of CURRENT_DATE not being permitted as column values by splitting a valid-time table into a current store containing only current information and a history store (p. 206).

Current queries are simplified when applied to a partitioned valid-time table (p. 207).

If the history store does not include current data, then sequenced and nonsequenced queries are more complex. Otherwise, such data must be replicated across both tables (p. 208).

Current insertions only impact the current store of a partitioned table (p. 208).

Sequenced updates are tedious due to the extensive case analysis required for the current and history stores separately (p. 211).

The modifications found in nontemporal applications are all current modifications (p. 216).

A tracking log retains the past states of a table without impacting the monitored table. As differentiated from a valid-time table, which models the state of the enterprise over time, a tracking log captures the state of the monitored table itself over time (p. 220).

The schema of a tracking log comprises the columns of the monitored table, along with a single timestamp column. Its key is simply the primary key of the monitored table and the timestamp column (p. 221).

Triggers allow the tracking log to be maintained automatically, without necessitating changes to the application code (p. 222).
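
A minimal sketch of a trigger-maintained tracking log, using SQLite triggers through Python's sqlite3 module. The account table is hypothetical, the log stores after-images, and deletions are not handled. To keep the example repeatable, the timestamp column is filled from a logical counter (one more than the largest value already in the log); a production trigger would more plausibly record CURRENT_TIMESTAMP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);

-- Tracking log: the monitored table's columns plus one timestamp column.
CREATE TABLE account_log (
  id INTEGER, balance INTEGER, when_changed INTEGER,
  PRIMARY KEY (id, when_changed)
);

CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
  INSERT INTO account_log VALUES (
    NEW.id, NEW.balance,
    (SELECT COALESCE(MAX(when_changed), 0) + 1 FROM account_log));
END;

CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
  INSERT INTO account_log VALUES (
    NEW.id, NEW.balance,
    (SELECT COALESCE(MAX(when_changed), 0) + 1 FROM account_log));
END;
""")

# The application modifies only the monitored table; it never touches the log.
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 80 WHERE id = 1")
conn.execute("INSERT INTO account VALUES (2, 50)")

log = conn.execute(
    "SELECT id, balance, when_changed FROM account_log ORDER BY when_changed"
).fetchall()
print(log)
```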

Extracting a prior state involves looking at both the monitored table and the tracking log (p. 223).

Queries on past states of a monitored table are easiest to express via a reconstruction view (p. 226).

Sequenced and nonsequenced queries on a tracking log are best stated on a view that extracts the states as a transaction-time state table (p. 229).

Sequenced and nonsequenced modifications are not allowed on tracking logs (p. 230).

The time sequence of an object records the evolution over time of that object (p. 231).

If arbitrary insertions are allowed, the reconstruction algorithm becomes more complex (p. 232).

A backlog is a tracking log with the modification operation (insert, delete, update) explicitly identified (p. 233).

A tracking log can contain before-images, after-images, or both, with differing implications for reconstruction (p. 235).

Using after-images consistently simplifies the reconstruction algorithm considerably (p. 236).
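
When the log holds after-images only, reconstructing a past state reduces to picking, for each key, the most recent entry at or before the requested time. The sketch below assumes a hypothetical account_log table with an integer logical timestamp and no logged deletions; it is an illustration of the idea rather than the reconstruction algorithm given in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_log (id INTEGER, balance INTEGER, when_changed INTEGER);
INSERT INTO account_log VALUES
  (1, 100, 1),   -- account 1 created
  (2,  50, 2),   -- account 2 created
  (1,  80, 3);   -- account 1 updated
""")

def state_as_of(t):
    """Reconstruct the monitored table as it stood at logical time t."""
    return conn.execute(
        """SELECT l.id, l.balance
           FROM account_log AS l
           WHERE l.when_changed = (SELECT MAX(l2.when_changed)
                                   FROM account_log AS l2
                                   WHERE l2.id = l.id
                                     AND l2.when_changed <= ?)
           ORDER BY l.id""",
        (t,),
    ).fetchall()

print(state_as_of(2))
print(state_as_of(3))
```

The same SELECT, with the parameter replaced by a column of a driving table, could serve as the body of a reconstruction view.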

Using after-images consistently also greatly simplifies the conversion of the tracking log to a transaction-time state table (p. 238).

Sequenced and nonsequenced queries are best expressed on a transaction-time state table view, rather than on the underlying tracking log (p. 238).

If after-images are used in the tracking log, then the monitored table itself is superfluous (p. 239).

Simple approaches to maintaining the tracking log impose rather harsh constraints on the application (p. 240).

The reconstruction algorithm may yield states inconsistent with serializability (p. 242).

Achieving fully accurate transaction semantics for the reconstructed table is difficult (p. 244).

Different organizations of the tracking log result from various initial assumptions and constraints on the monitored table (p. 248).

Tracking logs support transaction time, which is orthogonal to valid time (p. 249).

The reconstructed state of a transaction-time table as of a point in the past will never change, independent of when that reconstruction query or view is evaluated. The state at a point in time of a valid-time table can change, as new information is received and incorporated into the table (p. 250).

The schema of a transaction-time state table comprises the columns of the monitored table, along with two timestamp columns denoting the period of presence. Its key is simply the primary key of the monitored table and the start timestamp column (p. 254).

Triggers on the monitored table can be used to maintain a transaction-time state table (p. 256).

A transaction-time state table may also be maintained directly (p. 257).

Maintaining the state table via direct modifications obviates the need for a materialized monitored table, decreasing the space overhead for capturing changes over time (p. 259).

Reconstruction is easy to express on a state table, though only states in the past or present should be requested (p. 259).

Sequenced queries over transaction-time state tables are expressed identically to such queries over valid-time state tables. However, their semantics is in terms of "when was it recorded that" (p. 260).

Tracking logs, backlogs, and transaction-time state tables have identical information content (p. 262).

A transaction-time state table may be represented with two tables, a current and an archival store (p. 264).

The two-table representation requires a few replacements in the legacy code (p. 265).

Both the monitored table and the transaction-time state table can be defined as views on the two stores (p. 266).

In a tripartitioned table, the current store consists of just the primary key and the start date (p. 266).

A view reconstitutes the state table from the three underlying tables (p. 268).

The archival store can be reduced in size by purging information on invalid entities (p. 269).

Although vacuuming a transaction-time table helps contend with the unchecked growth of the table, it violates the underlying semantics of the table, revising the meaning of subsequent queries (p. 270).

The DBA may wish to vacuum the archival store based on a combination of criteria (p. 271).

The vacuum log, which indicates the meaning of queries on the state table, should be maintained automatically (p. 271).

Vacuuming specifications should be monotonic to avoid time-dependent assumptions (p. 271).

A bitemporal state table contains four timestamp columns, two specifying the period of validity and two specifying the period of presence (p. 279).

Stating that a key on a bitemporal table is valid-time sequenced requires an assertion (p. 281).

Current insertions require only that the valid and transaction timestamps be appropriately specified (p. 283).

For modifications on bitemporal tables, the first stage contends with valid time, resulting in a series of SQL statements (p. 285).

Only two kinds of modifications are permitted on bitemporal state tables: insertions with a transaction time of "now" to "forever," and updates that set the transaction-stop time to "now" (p. 286).

Deletions on bitemporal tables follow these same two stages: first consider valid time, then transaction time (p. 293).

One approach to a sequenced insertion is to simply insert the new period of validity, without regard to how it interacts with the period of validity of the existing rows (p. 296).

A second approach to a sequenced insertion on a bitemporal table computes a new period of validity (p. 299).

Sequenced deletions on valid-time tables require some four SQL statements; when mapped to bitemporal tables, a total of six statements are required (p. 301).

Sequenced updates require applying the same two-stage transformation process, resulting in some eight SQL statements to implement a single sequenced update (p. 305).

Nonsequenced modifications are initially complex to write, but require no subsequent transformations (p. 306).

A transaction time-slice query corresponds to a vertical slice in the time diagram (p. 307).

A valid time-slice query corresponds to a horizontal slice in the time diagram, resulting in a transaction-time state table (p. 309).

A bitemporal time-slice query extracts a single point from a time diagram, resulting in a snapshot table (p. 311).
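
A bitemporal time-slice can be sketched with two pairs of predicates, one per time dimension. The assignment table, its column names, and its data are hypothetical; both periods are closed-open and "forever" is stored as 9999-12-31. The data models a retroactive correction: on 2020-03-10 it was recorded that employee 111 had actually moved to Books on 2020-03-01.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignment (
  ssn TEXT, dept TEXT,
  vt_start TEXT, vt_end TEXT,    -- period of validity
  tt_start TEXT, tt_stop TEXT    -- period of presence
);
INSERT INTO assignment VALUES
  ('111', 'Toys',  '2020-01-01', '9999-12-31', '2020-01-05', '2020-03-10'),
  ('111', 'Toys',  '2020-01-01', '2020-03-01', '2020-03-10', '9999-12-31'),
  ('111', 'Books', '2020-03-01', '9999-12-31', '2020-03-10', '9999-12-31');
""")

def snapshot(valid_at, recorded_at):
    """What was believed at recorded_at about the state at valid_at?"""
    return conn.execute(
        """SELECT ssn, dept FROM assignment
           WHERE vt_start <= ? AND ? < vt_end     -- valid time-slice
             AND tt_start <= ? AND ? < tt_stop    -- transaction time-slice
           ORDER BY ssn""",
        (valid_at, valid_at, recorded_at, recorded_at),
    ).fetchall()

before_fix = snapshot("2020-03-05", "2020-03-07")
after_fix = snapshot("2020-03-05", "2020-04-01")
print(before_fix, after_fix)
```

Dropping the valid-time predicates gives a transaction time-slice, and dropping the transaction-time predicates gives a valid time-slice, provided the corresponding timestamp columns are kept in the SELECT list.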

All combinations of current, sequenced, and nonsequenced over valid time and transaction time are possible and sensible (p. 320).

Current/current queries are common and can be easily stated in SQL via currency predicates (p. 320).

Sequenced/current queries allow you to probe the history as best known (p. 321).

Current/nonsequenced queries concern incorrectly stored information about now (p. 322).

Nonsequenced/nonsequenced queries tease out the interaction between valid time and transaction time (p. 323).

An integrity constraint can be implemented by first writing a SELECT statement, then embedding it in a CHECK constraint (p. 323).

There are six variants corresponding to each nontemporal integrity constraint (p. 326).

A sequenced/current foreign key constraint can be expressed as an embedded sequenced/current query (p. 327).

By studying the particulars of the desired temporal integrity constraint, often a much simpler expression is possible (p. 328).

For referential integrity constraints between tables supporting differing aspects of time, use a current constraint if the time support is missing (p. 329).

A current partition has the advantage that a valid-end time of "forever" need not be stored; the valid-end timestamp is implicit in the current store (p. 331).

The current store has the disadvantage that the passage of time alone can cause rows to enter and exit this store; managing this movement is awkward (p. 331).

A history store is much easier to maintain than a current store and retains many of its advantages. The archival store can be maintained automatically through triggers defined on the history store (p. 332).

The full bitemporal table can be expressed as a view (p. 333).

The archival store can be narrowed by storing only one transaction timestamp: transaction start. The drawback is a more complex view reconstituting the bitemporal state table (p. 336).

Often the history store is the largest component of a bitemporal state table, implying that vacuuming the archival store may not be effective in substantially reducing the size of such a table (p. 338).

The temporal aspects of the application should be initially ignored when developing the conceptual schema (p. 344).

For each entity and relationship type, decide whether the valid time should be recorded, and if so, its granularity (p. 349).

For each attribute, determine if the valid time of the attribute's value should be captured, and if so, the associated granularity (p. 350).

A time-varying key uniquely identifies a particular entity at each point in time. A nontemporal key identifies a particular entity over all time (p. 351).

For each entity and relationship type, decide whether the transaction-time extent should be captured, and if so, its granularity. For each attribute, determine if the transaction time should be recorded, and if so, its granularity (p. 354).

For each attribute that is a user-defined time, choose a granularity supported by SQL-92 (p. 358).

For tables corresponding to entity and relationship types for which valid time is to be recorded, add either a single instant timestamp column or a period timestamp, represented with two instant timestamp columns (p. 360).

Decompose tables so that all attributes of a table have an identical temporal support and precision (p. 361).

For tables corresponding to entity and relationship types for which transaction time is to be recorded, add either two timestamps denoting the period of presence or a single transaction timestamp, optionally with an additional operation column, if the table is to be a backlog (p. 362).

Transaction-time support may induce additional temporal support decomposition (p. 362).

A valid-time sequenced primary key may require an assertion and an additional surrogate identifier column; there are three cases (p. 363).

The WHEN_CHANGED column should be added to the primary key of transaction-time tables to effect a current, and thus sequenced, primary key (p. 365).

There are three cases for the primary key of bitemporal tables, mirroring those for valid-time tables (p. 366).

For transaction-time tables implemented as backlogs, only the last insert or update entry is relevant (p. 367).

If the referenced table is nontemporal, the original foreign key constraints may be retained (p. 368).

Referential integrity on valid-time tables should be expressed as a sequenced constraint (p. 368).

For foreign keys referencing bitemporal tables, if the referencing table supports transaction time, use a transaction-time current constraint. If the referencing table is instant-stamped in valid time, simply ignore the valid time (p. 370).

For foreign keys over bitemporal tables, use sequenced/current constraints (p. 374).

Some key, uniqueness, and participation constraints may hold over the entire lifespan (or valid time) of the associated entity (or relationship) type, and are thus designated as time-invariant (p. 378).

Also consider whether the key, uniqueness, and participation constraints hold over the entire transaction-time extent (p. 379).

Classify each bitemporal entity and relationship type as fully general, retroactive, degenerate, or postactive (p. 379).

A transaction time-invariant sequenced primary key requires an assertion stating that the periods associated with any particular key value are contiguous (p. 380).

Time-invariant uniqueness requires an assertion that the column(s) are unique with respect to the time-invariant primary key (p. 381).

The valid-time start column may be dropped from a table corresponding to a degenerate entity type (p. 382).

SQL3 has a period type constructor. Period types can be constructed from datetime and exact numeric element types (p. 403).

SQL3 period literals support all combinations of open and closed delimiting datetimes (p. 404).

Valid-time support is specified in SQL3 with an ADD VALIDTIME clause (p. 406).

Constraints expressed on nontemporal tables are interpreted in SQL3 as current constraints when valid-time support is added to the table (p. 408).

An SQL-92 statement can be converted into a sequenced statement in SQL3 simply by prepending VALIDTIME (p. 408).

Unlike the complex statements required in SQL-92, the sequenced variant in SQL3 is almost identical to the nontemporal analog (p. 410).

An SQL-92 statement can be converted into a nonsequenced statement in SQL3 simply by prepending NONSEQUENCED VALIDTIME (p. 410).

A valid time-slice query is nonsequenced, with the associated timestamp period compared with the specified instant (p. 412).

Current modifications in SQL3 are the same whether applied to nontemporal tables or to tables with valid-time support (p. 416).

The period of applicability of a sequenced modification is specified in SQL3 immediately after VALIDTIME (p. 417).

The SQL-92 and SQL3 versions of nonsequenced modifications are quite similar (p. 418).

Even complex sequenced modifications over several tables can be easily expressed in SQL3 via the VALIDTIME construct (p. 419).

Temporal partitioning is an aspect of physical design (p. 419).

In SQL3, temporal partitioning has no impact on queries (p. 419).

The representation of a table with transaction-time support in SQL3 is specified with physical design statements supported by the DBMS (p. 421).

Transaction-time sequenced queries are signaled in SQL3 with the TRANSACTIONTIME prefix (p. 423).

Valid-time support and transaction-time support in concert result in a bitemporal table (p. 427).

All combinations of current, sequenced, and nonsequenced queries over valid time and transaction time are easily expressed in SQL3 (p. 429).

Following NONSEQUENCED VALIDTIME with a period expression in SQL3 renders the result a table with valid-time support (p. 434).

SQL3 integrity constraints on bitemporal tables can be any combination of current, sequenced, and nonsequenced (p. 434).

Current modifications in SQL3 on bitemporal tables are identical to their nontemporal counterparts (p. 435).

The period of applicability for valid-time sequenced modifications in SQL3 is specified immediately following the VALIDTIME prefix (p. 436).

For tables corresponding to entity and relationship types for which valid time is to be recorded, use an AS VALIDTIME clause in SQL3 (p. 438).

TRANSACTIONTIME, without a stated granularity, may be used in SQL3 (p. 439).

Transaction-time support may induce additional temporal support decomposition (p. 439).

In SQL3, the primary key should reflect the temporal support accorded the table (p. 440).

Time-invariant primary keys are inherently nonsequenced (p. 440).