Russell Lewis

Senior Lecturer
Department of Computer Science, University of Arizona

Current Classes (Fall 24):

CSC 252 - Computer Organization
CSC 337 - Web Development
CSC 452 - Principles of Operating Systems

Previous Classes:

CSC 120 - Introduction to Computer Programming II
- Summer 17, Fall 19, Spring 20, Fall 20, Spring 21, Fall 21, Spring 22, Summer 22, Summer 24
CSC 127a - Introduction to Computer Science (now replaced by CSC 110)
- Fall 15, Spring 16, Summer 16
CSC 245 - Discrete Mathematics
- Summer 15
CSC 252 - Computer Organization
- Fall 16, Spring 17, Fall 17, Spring 18, Fall 18, Summer 19, Summer 20, Summer 21, Fall 22, Fall 23, Spring 24, Summer 24
CSC 345 - Analysis of Discrete Structures (a.k.a. Data Structures and Algorithms)
- Fall 15, Spring 16, Spring 17, Spring 18, Fall 18
CSC 346 - Cloud Computing
- Spring 19, Fall 19, Spring 20, Fall 20, Fall 21
CSC 352 - System Programming and UNIX
- Fall 16, Fall 17
CSC 452 - Principles of Operating Systems
- Fall 22, Spring 23, Fall 23, Spring 24
CSC 552 - Advanced Operating Systems
- Spring 23, Spring 24

Education

Work Experience

Future Research

I plan to be focusing my future efforts on implementing, analyzing, and demonstrating use cases for Subcontexts.

Subcontexts Overview
Subcontexts make it possible to have a plurality of executing entities (somewhat analogous to processes) share a single virtual address space, without violating the safety guarantees that processes normally expect. That is, code running in one process would have no access to the code or data of any other process, even though they share the same virtual address space. We call each such executing entity a "subcontext."
This can be accomplished by having a different page table for each subcontext - and each subcontext's page table contains entries only for its own pages. Thus, while the subcontexts share a single conceptual space, they share no pages. So why is that any different than normal virtual memory?
The first difference is that the pages of the various subcontexts must all be non-overlapping in the virtual space - so, for any given virtual address, there is at most one entity which has a page for that address. This places a limit on where new pages can be placed: a subcontext cannot mmap() any new page into addresses which overlap pages used by another subcontext in the same virtual address space.
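As a rough illustration of the non-overlap rule, here is a minimal Python sketch of the bookkeeping involved. The AddressSpace class and its mmap() method are hypothetical names invented for this example; they model only the ownership check, not real page tables or any actual Subcontexts implementation.

```python
class AddressSpace:
    """Toy model: tracks which subcontext owns each mapped address range."""

    def __init__(self):
        self.regions = []  # list of (start, end, owner); end is exclusive

    def mmap(self, owner, start, length):
        """Record a new range for `owner`; refuse it if it would overlap
        a range owned by a different subcontext."""
        end = start + length
        for r_start, r_end, r_owner in self.regions:
            overlaps = start < r_end and r_start < end
            if overlaps and r_owner != owner:
                return False
        self.regions.append((start, end, owner))
        return True


space = AddressSpace()
print(space.mmap("A", 0x1000, 0x2000))  # True
print(space.mmap("B", 0x2000, 0x1000))  # False: overlaps A's range
print(space.mmap("B", 0x3000, 0x1000))  # True: starts where A's range ends
```

The check is the only part that differs from an ordinary per-process address space, where two processes are free to use the same virtual addresses for unrelated pages.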
Second, each subcontext can be mapped into a plurality of different virtual address spaces. In each address space, the subcontext has the same pages - meaning that they share the same virtual addresses, and also the same physical backing pages. When a page is mmap()ed or munmap()ed from the subcontext, this change atomically occurs in all of the virtual address spaces into which the subcontext is mapped. Thus, the subcontext effectively exists in all of the virtual address spaces in parallel.
Third and most importantly, we can define mappings between subcontexts. For instance, we might declare that subcontext A has a readonly mapping to subcontext B, which we would write as A->B(ro). Due to this mapping, every virtual address space which contains subcontext A must also contain subcontext B. More importantly, in each of these address spaces, every page in B is readable (but not writable) in A. In addition to readonly mappings, we also allow for read/write mappings - in which case A would see the same permissions as B for B's pages. Mappings are transitive but not symmetric - so B does not necessarily have any mapping to A's pages, but if B has a mapping to the pages of C, then so does A.
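The transitive, non-symmetric nature of mappings can be sketched as a graph walk. The following Python is only an illustration: the function name effective_mappings and the dictionary layout are invented here, and the rule for combining permissions along a chain (any readonly link makes the whole chain readonly) is an assumption that the text above does not spell out.

```python
from collections import deque

RO, RW = "ro", "rw"


def effective_mappings(direct, start):
    """Return {target: mode} for every subcontext reachable from `start`.

    `direct` maps a subcontext name to {target: mode} for its direct mappings.
    Assumption: a chain is read/write only if every link in it is read/write.
    """
    best = {}
    queue = deque([(start, RW)])
    while queue:
        node, mode_so_far = queue.popleft()
        for target, mode in direct.get(node, {}).items():
            if target == start:
                continue  # a subcontext needs no mapping to itself
            combined = RW if (mode_so_far == RW and mode == RW) else RO
            # Revisit a target only when a stronger (rw) path is found.
            if target not in best or (best[target] == RO and combined == RW):
                best[target] = combined
                queue.append((target, combined))
    return best


direct = {"A": {"B": RO}, "B": {"C": RW}}
print(effective_mappings(direct, "A"))  # A reaches B and C, both readonly
print(effective_mappings(direct, "B"))  # B reaches only C; nothing maps back to A
```

In this example A->B(ro) and B->C(rw) give A a mapping to C, while B gains no access to A.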
Mappings make it possible to share complex data structures. Subcontext B can now define data structures that include pointers - such as trees, linked lists, graphs, or more complex variants - in its own memory, and A will be able to read them. A can use B's pointers directly, without any need for indirection. Moreover, as B changes its pages by mmap()ing new pages or munmap()ing old ones, these changes are atomically reflected in the effective page table of A. This means that A and B only have to initialize their relationship once - by establishing a mapping - and all future changes to B's pages will be visible in A, at all times.
In addition to complex data structures, mappings make it possible to share long buffers - meaning that B can write data to a buffer, and A can read it without need for copies or interaction with the kernel. Likewise, if A has a read/write mapping to B, A can write data into B's pages, and B can read it without additional copies.
Finally, subcontexts allow the creation of "entry points" into other subcontexts. These are pre-defined call locations in a subcontext which another subcontext is allowed to jump to. In our A->B example, for instance, A could provide one or more entry points to B. B is now allowed to call A - but only at the entry points. When B calls into A, the thread switches context to A, and has access to all of the pages that A can access, and will continue to do so until the call returns.
In this way, subcontexts allow the creation of extremely fast local RPC services. In this model, we view A as a service and B as its client; A has some access to B's memory (perhaps readonly, perhaps read/write), and B has the ability to call A's registered services.
With a software-only implementation, subcontexts make it possible to implement an elegant shared-memory IPC abstraction; with hardware support, we believe that (for some types of services) the performance of a service subcontext could be competitive with a kernel-mode service implementation.

Past Research

Currently, I am focusing on file fingerprinting, recipe generation, deduplication, and efficient transfers of large files over long distances.

File Fingerprinting
File fingerprinting is the art of scanning a file and generating a relatively small set of characteristic values which are thought to represent the file. Ideally, two files which have some high amount of duplicate data should produce at least a few common fingerprints, without any need to directly compare the contents of the files.
File fingerprinting is always a probabilistic endeavor; there is no guarantee that all, or even most, of the duplication between two files will be discovered by comparing their fingerprints. However, some file fingerprinting algorithms are better than others, and some provide attractive upper bounds on false negatives. We are currently investigating which file fingerprinting algorithms work best, and hope to be able to describe the "best in breed" for a variety of purposes.
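To make the idea concrete, here is a minimal Python sketch of one generic fingerprinting scheme: hash every fixed-size window of the file and keep only the hashes divisible by a chosen modulus. This is a textbook-style sampling approach chosen for illustration, not necessarily one of the algorithms under investigation; the function name and the window and modulus parameters are assumptions made for this example.

```python
import hashlib


def fingerprints(data: bytes, window: int = 16, modulus: int = 16) -> set:
    """Hash each `window`-byte substring; keep hashes divisible by `modulus`."""
    selected = set()
    for i in range(len(data) - window + 1):
        digest = hashlib.sha1(data[i:i + window]).digest()
        value = int.from_bytes(digest[:8], "big")
        if value % modulus == 0:
            selected.add(value)
    return selected


original = bytes(range(256)) * 8
appended = original + b"some new data appended later"
shared = fingerprints(original) & fingerprints(appended)
print(len(shared), "fingerprints in common")
```

Because selection depends only on window content, every fingerprint of the original file also appears in the appended file. Hashing each window from scratch is slow; a practical version would likely use a rolling hash.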
Recipe Generation
A recipe for a file is a description of the file in terms of the contents of its blocks. Depending on the recipe format, the blocks may be fixed-size or variable size; they may or may not overlap each other. However, every byte in the file must be covered by at least one of the blocks in the recipe.
Recipes can be generated by leveraging file fingerprinting techniques; the fingerprints can be used to determine the location and/or length of blocks. In this way, we can generate recipes such that two files with duplicate data will often have one or more duplicate blocks.
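One common way to let content determine block boundaries is content-defined chunking. The Python below is a hedged sketch of that general technique, not the recipe format described in the publications listed on this page; make_recipe, its parameters, and the (offset, length, hash) tuple layout are all assumptions made for illustration.

```python
import hashlib


def make_recipe(data: bytes, window: int = 8, divisor: int = 64):
    """Return a list of (offset, length, block_hash) covering every byte.

    A block ends wherever the hash of its last `window` bytes is divisible
    by `divisor`, so boundaries follow content rather than fixed offsets.
    """
    recipe = []
    start = 0
    for end in range(1, len(data) + 1):
        at_boundary = False
        if end - start >= window:
            tail = hashlib.sha1(data[end - window:end]).digest()
            at_boundary = int.from_bytes(tail[:4], "big") % divisor == 0
        if at_boundary or end == len(data):
            block = data[start:end]
            recipe.append((start, len(block), hashlib.sha1(block).hexdigest()))
            start = end
    return recipe


data = bytes(range(256)) * 4
recipe = make_recipe(data)
print(len(recipe), "blocks covering", sum(n for _, n, _ in recipe), "bytes")
```

The blocks here are variable-size and non-overlapping, and together they cover the whole file, which is the one requirement every recipe must meet.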
Deduplication and Efficient File Transfer
Recipes and file fingerprinting can be used to reduce storage and network transfer costs by detecting, and then eliminating, duplicate data in a file system. Files which share some amount of duplicate content need to store only their unique data; shared data is stored only once.
We are investigating leveraging this technique in order to build efficient distributed file systems. We hope to deploy this technology in file systems related to biological computing, where data sets are huge and generally are appended to over time. If a multi-petabyte dataset, thousands of miles away, is updated periodically with new data, can we automatically detect which block(s) have been changed, and only send those over the network - rather than re-transferring the entire dataset?
While tools exist to do this detection at transfer time (such as rsync), this requires a full, brute-force scan of both the origin and destination files. Moreover, such scans are not normally adaptive, and thus gigantic files can require gigantic data transfers simply to confirm that data has not changed. We are investigating a more general and efficient system, where files are scanned only once (when they are updated) and where recipes automatically adapt their scale to detect duplication (if and when it exists) with a minimum of network overhead.
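The basic transfer saving can be sketched in a few lines of Python. This is a deliberately simplified model: the blocks are given up front, the remote side is a plain dictionary, and the names block_hash, plan_transfer, and rebuild are hypothetical. It does not attempt the adaptive, multi-scale behavior described above.

```python
import hashlib


def block_hash(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def plan_transfer(blocks, remote_hashes):
    """Return the file's recipe (list of hashes) and the blocks the remote lacks."""
    recipe = [block_hash(b) for b in blocks]
    missing = {}
    for h, b in zip(recipe, blocks):
        if h not in remote_hashes:
            missing[h] = b
    return recipe, missing


def rebuild(recipe, store):
    """Reassemble a file from a recipe and a hash-indexed block store."""
    return b"".join(store[h] for h in recipe)


old_blocks = [b"alpha", b"beta", b"gamma"]
new_blocks = old_blocks + [b"delta"]  # the dataset was appended to
remote_store = {block_hash(b): b for b in old_blocks}

recipe, missing = plan_transfer(new_blocks, set(remote_store))
remote_store.update(missing)  # only the missing block crosses the network
print(len(missing), "block(s) sent")  # prints: 1 block(s) sent
```

After the update, the remote side can rebuild the new file from the recipe alone, even though only one block was transferred.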

Publications and Patents

My Favorites

Accordion: Multi-Scale Recipes... (HotStorage '15)
Generate and use multi-scale recipes (that is, recipes which are the union of many different recipes, each generated at a different scale) to very efficiently find duplication between two files, or between a file and a large index of blocks. We found that, under certain assumptions, it was possible to perform a multi-scale comparison which was no slower (in the worst case) than existing single-scale comparisons, but which was, in the best case, more than an order of magnitude faster.
Subcontexts (U.S. Patent 7,543,126 B2)
Innovative way to organize page permissions and thread / process / address space identification. Allows for creation of user-level services directly mapped into client processes, while retaining all of the protections of kernel code.
Preventing Deadlocks (U.S. Patent 8,117,616)
Use a secondary (read/write) lock to serialize the operation of locking threads; allow high locking concurrency while preventing deadlock caused by threads which must use dangerous locking strategies

Master's Thesis

Bodyguard (Application Protection Inside an Untrusted OS)

Assume that you have a well-debugged program performing some critical task. Can you trust the operating system that it runs on?
Bodyguard is a minimal hypervisor, combined with a thin shim layer that stands between your program and the untrusted operating system. The shim and the hypervisor communicate using a channel which the OS can neither intercept, corrupt, snoop upon, nor emulate; together they can detect and prevent any attempt by the OS to snoop on or corrupt the state of your program.
However, Bodyguard is carefully designed to allow the operating system to do any number of safe memory accesses. The OS can swap your pages to disk, and then restore them; it can perform COW copies; it can accept buffers from your program (for write() and the like) and deliver buffers to your program (for read()).
The essence of Bodyguard is that it allows these safe accesses but prevents any sort of unauthorized access; it is invisible unless the operating system does something against the rules.

IBM Publications

Undoable Writes (2010)
CPU Cache improvement to reduce bus traffic during atomic instructions
Application Protection Inside an Untrusted OS (2010)
See: Master's Thesis above
Trace Cache Self Modification (2006), Cpu Self-Optimization Thread (2007)
Allows for dynamic optimization of microinstructions stored in the trace cache of a CPU
Compress Debug Data On-the-Fly Before Dump (2006)
Improved performance and storage requirements for point-in-time debug snapshots in an embedded system
Linked List Stack (2006)
Allow for unlimited growth of a thread's stack
Executing Alternate Activities During a Page Swap (2004)
Automatically do useful (alternate) work on a thread which otherwise would be blocked for I/O

IBM Patents

Determining the End of Valid Log in a Log of Write Records Using a Next Pointer and a Far Ahead Pointer (U.S. Patent 8,171,257)
Enhancement to a log format which eliminates reliance on a fixed block to store checkpoints
Discontiguous Multiple Issue of Instructions (U.S. Patent 7,822,948)
Reduces hardware cost of highly-superscalar instruction issue units in a CPU by reducing the need to check for data hazards
Scheduling Grace (U.S. Patent 8,024,739)
Allow interaction between user and kernel to lessen the chance of inopportune preemption
Executing Multiple Threads in a Processor (U.S. Patent 8,607,244)
Hardware-assisted preemptive scheduling between logical threads in a single core, without the use of timer interrupts
Soft Protections to Safeguard Program Execution (U.S. Patent Application 20080016305 A1, Abandoned)
Send warnings to a thread as it nears the end of its allocated stack; warnings can be masked off when it would be unsafe to handle a signal (or when it would cause recursive warnings)
Slow Modify Cache (U.S. Patent 7,383,388 B2)
Mix different memory technologies in a CPU cache; some cache lines may respond quickly to lookups, but will take a longer time to update
Method to Generate a Formatted Trace (U.S. Patent 7,305,660)
In debug statements for code inside an embedded device, use printf()-like statements without the CPU or memory cost associated with runtime string expansion
Depth Counter to Reduce Number of Items Considered for Loop Detection in a Reference-Counting Garbage Collector (U.S. Patent 7,315,873)
Trivial mechanism to cull the list of candidate garbage objects which must be considered by a costly analysis algorithm.
