Colloquium Speaker

Speaker: Beth Plale
College of Computing
Georgia Institute of Technology
Topic: Dynamic Querying of Data Streams with the dQUOB System
Date: Tuesday, April 3, 2001
Time: 11:00 AM
Place: Gould-Simpson, Room 701

Refreshments will be served in the 7th-floor lobby of Gould-Simpson at 10:45 AM.


This talk is on the management of large data streams found in high-performance parallel and distributed applications. By viewing the data stream as though it were data in a database, we gain access to the data using standard SQL queries. Further, by associating computation with the query, we can perform computations on the data en route from data provider to consumer. The need for such a capability is amply demonstrated in scientific visualization, where multiple consumers may be interested in visualizing the results from a complex scientific model, but with different views of the data (e.g., 2D vs. 3D), at different resolutions (e.g., laptop vs. visualization engine), and for different regions of the atmosphere. For instance, an SQL query serves to extract the region of interest, and the computation executes a clustering algorithm over the 3D data to create a lower-resolution image.
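
As a rough illustration of the query/computation pairing described above, the following Python sketch filters a stream of 3D samples to a region of interest and then reduces its resolution. It is not the dQUOB implementation: the function names, the record fields (x, y, z, value), and the use of simple block averaging in place of a clustering algorithm are all assumptions made for the example.

```python
from collections import defaultdict


def select_region(stream, x_range, y_range, z_range):
    """Yield samples inside the region of interest.

    Hypothetical stand-in for the WHERE clause of an SQL query over
    the stream; each range is a half-open (low, high) pair.
    """
    for s in stream:
        if (x_range[0] <= s["x"] < x_range[1]
                and y_range[0] <= s["y"] < y_range[1]
                and z_range[0] <= s["z"] < z_range[1]):
            yield s


def downsample(samples, cell):
    """Average sample values over cubic cells of side `cell`.

    Block averaging is used here as a simple substitute for the
    clustering computation; it returns {cell index: mean value}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for s in samples:
        key = (s["x"] // cell, s["y"] // cell, s["z"] // cell)
        acc = sums[key]
        acc[0] += s["value"]
        acc[1] += 1
    return {k: total / n for k, (total, n) in sums.items()}
```

Because select_region is a generator, the computation consumes samples as they arrive rather than after the whole stream has been stored, which is the sense in which the work happens en route.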

The scientific model is but one example of a class of applications for which a query/computation approach to managing data streams is useful. The proliferation of Internet access and the continual drop in the per-flop cost of cluster computers have created numerous such examples, as high-performance computations have scaled in multiple directions to utilize the wealth of resources available and to satisfy the broadening demands of new users. Other examples include high-performance processing applications, distributed collaborative environments, distributed data analysis, and remote instrument control.

In this talk I discuss the key results and findings of applying the relational approach to managing large data streams found in scientific visualization. A significant strength of the approach is the uniform way in which the system handles dynamic changes, whether they originate from user requests, the environment, or the data sources. I discuss these results as well.