The University of Arizona

Ergalics



Overview

Ergalics is "the science of computational tools and of computation itself." See an expanded discussion of ergalics and a derivation of this term.

The articulation of scientific theories and the evaluation of such theories via hypothesis testing are found in isolated sub-disciplines of computer science, including HCI, empirical software engineering, and web science. Where our department is notable, and perhaps unique, is in its application of ergalics across computer science, including those sub-disciplines concerned with specific software systems artifacts: databases, networks, multimedia systems, and operating systems.

All of the projects described below share several important characteristics. They propose predictive causal models about members of an identified class of computational tool, and they subject those predictive models to hypothesis testing. They thus strive to articulate fundamental properties of, or a fundamental understanding of, the behavior of those tools or the nature of users' interaction with them. They all embrace empirical generalization.

The Ergalics Focal Problem

The focal problem embodied by ergalics is concise:

What are the predictive causal models that underlie computational tools, the use of such tools, and computation itself?

Why is this important? Succinctly: a predictive model can be tested by comparing its predictions with what we observe in experiments. Agreement lends credence to the model and helps us uncover causal relationships that enable control of the computational tool. Control in turn enables improvement (in performance, functionality, reliability, and other engineering goals) and possibly mathematical insights in the form of new theorems. Prediction, along with explanation, yields understanding.

This focal problem is a general form of more specific problems whose detailed answers will help advance the power, utility, and ease of use of specific computational tools and will help us understand the potential and limitations of those tools and of computation itself.

The overarching focal problem and the specific research problems that share its goal of predictive causal models cannot be answered through mathematical theorems, for we are nowhere near understanding most complex computational tools to the degree required to state and prove such theorems. These pressing problems also cannot be addressed through the building of engineering artifacts, for that activity cannot address problems about an entire class of computational tool. Rather, addressing this focal problem requires a new perspective, in addition to the mathematical and engineering perspectives: that of science. This perspective encourages the understanding of computational tools and computation through the articulation of technology-independent principles and the determination of technology-independent limitations. This yields deep insights into tool developers, tool users, and the tools themselves, enabling refinements and new tools that are more closely aligned with the innate abilities and limitations of those developers and users.

Areas and Projects

Ergalics is a focal problem and an associated methodology. It is not a discipline; rather, it transcends and embraces most sub-disciplines of computer science. The methodology of empirical generalization has long been present within human-computer interaction and more recently in experimental software engineering and web science. The projects listed below expand the use of this methodology across many other sub-disciplines of computer science, including classes of computational tools (databases, computer-aided instruction, high-performance computing, intelligent agents, multimedia, networks, operating systems, and robotics), as well as more foundational efforts (algorithms, automation, cognition, metrology, and philosophy). In adopting the scientific perspective, these projects augment the highly successful mathematical and engineering perspectives now prevalent in computer science.

(Unless otherwise indicated, each participating person is in the Computer Science Department at the University of Arizona.)

AZDBLab

The Arizona Database Laboratory (or AZDBLab) is a Laboratory Information Management System (LIMS). AZDBLab currently supports experiments on four DBMSes, two commercial and two open-source.

Faculty
Industrial Collaborator
Sabah Currim (Alumni Office, UA)

Concrete Complexity

Algorithmic ergalics adopts the scientific perspective, augmenting the mathematical perspective of asymptotic complexity. It attempts to determine a cost formula that states the cost of an implementation of an algorithm in terms of a number of parameters: input size, but also type of processor, speed of main memory, and other relevant factors.

Validating Cost Models

The goal of this project is to scientifically validate the cost formula or interesting statements about the formula via experiments with actual implementation(s) and actual test data.
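As a rough illustration of what validating a cost formula can look like, the sketch below fits a hypothesized formula to measurements at small input sizes and then checks its prediction at a larger, held-out size. Everything in it is an assumption made for illustration, not the project's method: the algorithm (a textbook merge sort), the cost measure (comparison counts rather than time), the one-parameter formula cost ≈ a · n · log2(n), and the names `merge_sort_comparisons`, `fit_scale`, and `relative_error`. A realistic formula would also include machine parameters such as processor type and memory speed.

```python
import math
import random

def merge_sort_comparisons(values):
    """Sort a copy of values with top-down merge sort; return the comparison count."""
    count = 0

    def sort(xs):
        nonlocal count
        if len(xs) <= 1:
            return xs
        mid = len(xs) // 2
        left, right = sort(xs[:mid]), sort(xs[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            count += 1  # one element comparison
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    sort(list(values))
    return count

def fit_scale(sizes, costs):
    """Least-squares estimate of a in the assumed formula cost = a * n * log2(n)."""
    xs = [n * math.log2(n) for n in sizes]
    return sum(x * c for x, c in zip(xs, costs)) / sum(x * x for x in xs)

def relative_error(predicted, observed):
    return abs(predicted - observed) / observed

rng = random.Random(0)  # fixed seed so the experiment is repeatable

def measure(n):
    """Observed cost of sorting n pseudo-random values."""
    return merge_sort_comparisons([rng.random() for _ in range(n)])

# Fit on small inputs, then test the prediction on a larger, unseen input size.
train_sizes = [256, 512, 1024, 2048]
a = fit_scale(train_sizes, [measure(n) for n in train_sizes])
test_n = 8192
predicted = a * test_n * math.log2(test_n)
observed = measure(test_n)
error = relative_error(predicted, observed)
print(f"a = {a:.3f}, predicted = {predicted:.0f}, observed = {observed}, error = {error:.1%}")
```

The hypothesis under test is the prediction at the held-out size: a large relative error would count as evidence against the formula, which is the point of treating it as a scientific claim rather than a definition.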

Faculty
Robert Maier (UA Math Dept)
Industrial Collaborator
Youngkyoon Suh (KISTI)

Database Ergalics

This area concerns database management systems and associated utilities as computational tools of study, as well as the interaction of users with such tools.

Science of Databases

There are questions concerning fundamental limits of DBMS architectures that simply cannot be answered by investigating a single algorithm or even a single DBMS. Rather, addressing such questions requires the development of predictive models across multiple DBMSes. The objective is to understand database management systems as a general class of computational artifacts, to come up with insights and ultimately with predictive models about how such systems, again, as a general class, behave.

Faculty
Industrial Collaborators
Sabah Currim (Alumni Office, UA)
Youngkyoon Suh (KISTI)
Rui Zhang (Dataware Ventures)

Ergalic Metrology

Metrology is "the science of measurement," so ergalic metrology is the science of measuring computational tools. The International Bureau of Weights and Measures (BIPM) defines metrology as "the science of measurement, embracing both experimental and theoretical determinations at any level of uncertainty in any field of science and technology."

Repeatable Measurement

This project's goal is to understand how to obtain repeatable, low-variance measurements of CPU time and I/O, which is a surprisingly difficult problem, and then how to measure DBMS activity accurately. It turns out that even simple interactions with a DBMS exhibit wide variance in time.
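One simple way to see the variance in question is to repeat the same task several times and summarize the spread of the measured CPU times. The sketch below does only that; it is not the project's measurement protocol. The helper names `measure_cpu_time` and `summarize` are hypothetical, and the stand-in task is a plain Python computation rather than a DBMS query.

```python
import statistics
import time

def measure_cpu_time(task, repeats=10):
    """Run task repeatedly; return the CPU time (seconds) consumed by each run."""
    samples = []
    for _ in range(repeats):
        start = time.process_time()  # CPU time of this process, not wall-clock time
        task()
        samples.append(time.process_time() - start)
    return samples

def summarize(samples):
    """Mean, sample standard deviation, and coefficient of variation of the samples."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    cv = stdev / mean if mean > 0 else 0.0
    return {"mean": mean, "stdev": stdev, "cv": cv}

if __name__ == "__main__":
    # Stand-in workload (an assumption); a real study would time a DBMS interaction.
    samples = measure_cpu_time(lambda: sum(i * i for i in range(200_000)))
    print(summarize(samples))
```

The coefficient of variation (standard deviation divided by mean) gives a unit-free figure for comparing repeatability across tasks of different lengths.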

Faculty
Industrial Collaborators
Sabah Currim (Alumni Office, UA)
Youngkyoon Suh (KISTI)
Rui Zhang (Dataware Ventures)

Ergalic Theories

The methodology of ergalics has resulted in extant scientific theories about computational tools.

A Field Guide to the Science of Computation

This NSF-funded project is developing an on-line service, the Field Guide, that presents extant scientific theories, thereby helping people discover and apply the scientific principles of computing. It will enable people to find practical information about how computational things work, analogous to the Boy Scout Handbook helping the scout out in the field. The Field Guide will be seeded by the work of this project and will incorporate community involvement in creating useful new entries.

Faculty
Peter Denning (Naval Postgraduate School)
Susan Higgins (Naval Postgraduate School)
K12 Teachers
Rama Rao Cheepurlpalli (Luz Academy)
Matthew Johnston (BASIS School)
Graduate Students

Green Ergalics

This area attempts to develop predictive models that can be used to reduce energy consumption, thereby helping our environment.

Dynamic REsource Allocation and Management

The goal of the DREAM project is to dynamically match performance levels and resource availability to the demands of a variety of tasks, in order to save energy. As an example, Program-Context-Based Prediction develops predictive models of resource usage in operating systems.

Faculty

Multimedia Ergalics

This area concerns the scientific study of computational tools that allow users to interact with and manipulate multi-media information (e.g., images, sound, video).

SLIC

The SLIC (Semantically Linked Instructional Content) project aims to help students and scholars efficiently browse educational videos of lectures and talks and find segments of interest.

This project has developed a dynamic Hidden Markov Model for slide changes during academic presentations. The probability that each slide was shown at each point in time is modeled using a combination of image features and temporal modeling of slide changes and camera events. Variants of the model are tested and compared using predictive accuracy on ground truth data. This has confirmed the hypothesis that the combined model does better than either spatial or temporal modeling alone. The combined model is able to automatically match slides to presentation videos with accuracies exceeding 90%.
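To make the idea of combining image evidence with temporal modeling concrete, here is a minimal sketch using a plain Hidden Markov Model decoded with the Viterbi algorithm. It is not the SLIC model: the transition weights, the emission numbers, and the names `transition_row`, `viterbi`, and `spatial_only` are all assumptions invented for illustration, and camera events are not modeled at all.

```python
import math

def transition_row(i, n, stay=0.8, advance=0.15, jump=0.05):
    """Assumed transition probabilities from slide i: mostly stay, sometimes advance,
    rarely jump anywhere. Each row sums to 1."""
    row = [jump / n] * n
    row[i] += stay
    row[min(i + 1, n - 1)] += advance
    return row

def spatial_only(emission):
    """Baseline: pick the best-matching slide for each frame independently."""
    return [max(range(len(frame)), key=lambda s: frame[s]) for frame in emission]

def viterbi(emission):
    """emission[t][s] = P(image features of frame t | slide s).
    Returns the most likely slide sequence under the assumed transitions."""
    n = len(emission[0])
    trans = [transition_row(i, n) for i in range(n)]
    score = [math.log(1.0 / n) + math.log(emission[0][s]) for s in range(n)]
    back = []
    for t in range(1, len(emission)):
        new_score, pointers = [], []
        for s in range(n):
            prev = max(range(n), key=lambda p: score[p] + math.log(trans[p][s]))
            new_score.append(score[prev] + math.log(trans[prev][s]) + math.log(emission[t][s]))
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):  # follow back-pointers from the last frame
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Made-up emission probabilities for 6 frames and 3 slides; frame 3 is noisy.
emission = [
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.20, 0.70, 0.10],
    [0.45, 0.40, 0.15],
    [0.10, 0.20, 0.70],
    [0.10, 0.10, 0.80],
]
print("spatial only:", spatial_only(emission))
print("combined:    ", viterbi(emission))
```

In this toy input the per-frame baseline jumps back to the first slide on the noisy frame, while the combined decoding keeps the second slide because a backward jump is unlikely under the assumed transitions.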

Faculty
Industrial Collaborator
Arnon Amir (IBM Almaden Research Center)
Graduate Students

Network Ergalics

This area concerns computer networks as computational tools of study.

Evolution of Internet Topology

This project attempts to understand the Internet topology and its evolution. Such a topology is a time-varying graph in which a node represents an Autonomous System (AS) and a link represents the existence of one or more BGP routing sessions between the two nodes it connects. Specifically, the goal is an empirical description and an articulated model of the topology's evolution: when and where AS nodes and inter-AS links are added or removed over time.
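The time-varying graph described above can be represented as a sequence of snapshots, with evolution expressed as the differences between consecutive snapshots. The sketch below is one hypothetical representation, not the project's data format: `normalize_links` and `diff_snapshots` are invented names, the AS numbers come from the range reserved for documentation, and nodes are derived from links, so an AS with no links would be invisible.

```python
def normalize_links(links):
    """Treat links as undirected: (a, b) and (b, a) are the same inter-AS link."""
    return {tuple(sorted(link)) for link in links}

def nodes_of(links):
    """All AS numbers that appear in at least one link."""
    return {asn for link in links for asn in link}

def diff_snapshots(old_links, new_links):
    """Nodes and links added or removed between two topology snapshots."""
    old, new = normalize_links(old_links), normalize_links(new_links)
    old_nodes, new_nodes = nodes_of(old), nodes_of(new)
    return {
        "links_added": new - old,
        "links_removed": old - new,
        "nodes_added": new_nodes - old_nodes,
        "nodes_removed": old_nodes - new_nodes,
    }

# Two hypothetical snapshots, using documentation-only AS numbers.
snapshot_t0 = [(64496, 64497), (64497, 64498)]
snapshot_t1 = [(64497, 64496), (64496, 64499)]
changes = diff_snapshots(snapshot_t0, snapshot_t1)
print(changes)
```

Applying the same diff to each consecutive pair of snapshots gives the "when and where" record of additions and removals that an evolution model would then need to explain.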

Faculty
Lixia Zhang (UCLA)

Operating System Ergalics

This area concerns operating systems and associated utilities as computational tools of study, as well as the interaction of users with such tools.

Modeling Scientific Applications

This project models complex high-performance computing (HPC) scientific applications running on hundreds of thousands of processors through a small number of input parameters. This project has two related components.

In one component, the goal is to produce a cost formula for the running time of iterative HPC programs. A short series of training runs using small processor counts and just a few iterations provides data for a regression analysis, which in turn yields the parameters of the cost formula.
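The regression step can be sketched as follows. The form of the cost formula here is an assumption chosen for illustration, time ≈ iterations × (a / p + b · log2(p)) for p processors, and is not the project's formula. The training data are synthetic, generated in the code from made-up coefficients, so the example only shows that the fitting procedure recovers them; `features`, `fit_two_terms`, and `predict` are hypothetical names.

```python
import math

def features(processors, iterations):
    """Regressors of the assumed formula: per-processor work and a log(p) overhead term."""
    return (iterations / processors, iterations * math.log2(processors))

def fit_two_terms(rows):
    """Least squares for y = a*x1 + b*x2 (no intercept), solved via the normal equations.
    rows is a list of (x1, x2, y) tuples."""
    s11 = sum(x1 * x1 for x1, _, _ in rows)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    s1y = sum(x1 * y for x1, _, y in rows)
    s2y = sum(x2 * y for _, x2, y in rows)
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b

def predict(a, b, processors, iterations):
    x1, x2 = features(processors, iterations)
    return a * x1 + b * x2

# Synthetic training runs (NOT measured data): small processor counts, few iterations.
TRUE_A, TRUE_B = 120.0, 0.5  # made-up coefficients used only to generate the data
rows = []
for p in (2, 4, 8, 16):
    for iters in (5, 10):
        x1, x2 = features(p, iters)
        rows.append((x1, x2, TRUE_A * x1 + TRUE_B * x2))

a, b = fit_two_terms(rows)
# Extrapolate to a much larger configuration than any training run.
estimate = predict(a, b, processors=1024, iterations=1000)
print(f"a = {a:.3f}, b = {b:.3f}, predicted time = {estimate:.1f}")
```

With real measurements the fit would not be exact, and the quality of the extrapolation to large processor counts is precisely what would need to be tested experimentally.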

In another component, the goal is to develop a cost formula for predicting performance of an HPC application at an arbitrary CPU frequency using hardware performance counters given a single program trace.
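A very simple model of this kind splits execution time into a part that scales with CPU frequency and a part that does not. The sketch below assumes that memory-stall time is independent of core frequency and that the remaining cycles scale inversely with it; both are simplifying assumptions, not the project's cost formula. The counter names `total_cycles` and `stall_cycles` and the function `predict_time` are hypothetical.

```python
def predict_time(total_cycles, stall_cycles, traced_hz, target_hz):
    """Predict running time at target_hz from counters collected in one trace at traced_hz.

    Assumed model: T(f) = core_cycles / f + memory_time, where memory_time does not
    change with the CPU frequency f.
    """
    if stall_cycles > total_cycles:
        raise ValueError("stall cycles cannot exceed total cycles")
    core_cycles = total_cycles - stall_cycles   # work that scales with frequency
    memory_time = stall_cycles / traced_hz      # seconds spent stalled on memory
    return core_cycles / target_hz + memory_time

# Hypothetical counter values from a single trace taken at 2 GHz.
trace = {"total_cycles": 3e9, "stall_cycles": 1e9, "traced_hz": 2e9}
at_traced = predict_time(**trace, target_hz=2e9)
at_half = predict_time(**trace, target_hz=1e9)
print(f"{at_traced:.2f} s at 2 GHz, {at_half:.2f} s at 1 GHz")
```

Under these assumptions, halving the frequency does not double the running time, because the memory-bound portion is unchanged; how well that holds for a given application is an empirical question.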

Faculty
Jaxk Reeves (Statistics, University of Georgia)
Industrial Collaborators
Bronis R. de Supinski (Lawrence Livermore National Laboratory)
Martin Schulz (Lawrence Livermore National Laboratory)
Graduate Students
Brad Barnes (University of Georgia)
Jeonifer Garren (Statistics, University of Georgia)
