Current Classes (Fall 24):
Previous Classes:
I plan to focus my future efforts on implementing, analyzing, and demonstrating use cases for Subcontexts.
This can be accomplished by having a different page table for each subcontext - and each subcontext's page table contains entries only for its own pages. Thus, while the subcontexts share a single conceptual space, they share no pages. So why is that any different than normal virtual memory?
The first difference is that the pages of the various subcontexts must all be non-overlapping in the virtual space - so, for any given virtual address, there is at most one entity which has a page for that address. This places a limit on where new pages can be placed: a subcontext cannot mmap() any new page into addresses which overlap pages used by another subcontext in the same virtual address space.
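The non-overlap rule can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `AddressSpace` class, its method names, and the 4096-byte page size are hypothetical and are not part of any Subcontexts implementation. As a simplification, the model refuses every overlap, including one with the requesting subcontext's own pages.

```python
PAGE = 4096  # assumed page size, for illustration only


class AddressSpace:
    """Toy model of one virtual address space shared by several subcontexts."""

    def __init__(self):
        # Each region is (start, end, owner); end is exclusive.
        self.regions = []

    def map_pages(self, owner, start, length):
        """Record a new mapping for `owner`, refusing any overlap."""
        if start % PAGE or length <= 0 or length % PAGE:
            raise ValueError("start and length must be page-aligned")
        end = start + length
        for r_start, r_end, r_owner in self.regions:
            # Two half-open ranges overlap when each starts before the other ends.
            if start < r_end and r_start < end:
                raise ValueError(
                    f"{owner}: range overlaps pages owned by {r_owner}"
                )
        self.regions.append((start, end, owner))

    def owner_of(self, addr):
        """Return the single subcontext owning `addr`, or None if unmapped."""
        for r_start, r_end, r_owner in self.regions:
            if r_start <= addr < r_end:
                return r_owner
        return None
```

Because overlaps are rejected at mapping time, `owner_of` can never find two candidates for the same address, which is the "at most one entity" property described above.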
Second, each subcontext can be mapped into a plurality of different virtual address spaces. In each address space, the subcontext has the same pages - meaning that they share the same virtual addresses, and also the same physical backing pages. When a page is mmap()ed or munmap()ed from the subcontext, this change atomically occurs in all of the virtual address spaces into which the subcontext is mapped. Thus, the subcontext effectively exists in all of the virtual address spaces in parallel.
Third and most importantly, we can define mappings between subcontexts. For instance, we might declare that subcontext A has a readonly mapping to subcontext B, which we would write as A->B(ro). Due to this mapping, every virtual address space which contains subcontext A must also contain subcontext B. More importantly, in each of these address spaces, every page in B is readable (but not writable) in A. In addition to readonly mappings, we also allow for read/write mappings - in which case A would see the same permissions as B for B's pages. Mappings are transitive but not symmetric - so B does not necessarily have any mapping to A's pages, but if B has a mapping to the pages of C, then so does A.
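The "transitive but not symmetric" rule amounts to a reachability computation over the mapping graph. The sketch below is illustrative only: the function name and the dictionary layout are hypothetical, and it assumes (the text above does not say this) that a chain of mappings is read/write only if every link in the chain is read/write. It also ignores B's own per-page permissions, which would further limit what A sees through a read/write mapping.

```python
from collections import deque

RO, RW = "ro", "rw"


def effective_access(mappings, source):
    """Return {target: mode} for every subcontext reachable from `source`.

    `mappings` holds only direct mappings: {subcontext: {target: mode}}.
    Assumption: a chain is read/write only if every link is read/write.
    """
    best = {}
    queue = deque([(source, RW)])
    while queue:
        node, mode_so_far = queue.popleft()
        for target, link_mode in mappings.get(node, {}).items():
            if target == source:
                continue  # a cycle back to the source adds nothing new
            mode = RW if (mode_so_far == RW and link_mode == RW) else RO
            # Revisit a target only when this path upgrades it from ro to rw,
            # so each subcontext is queued at most twice.
            if target not in best or (best[target] == RO and mode == RW):
                best[target] = mode
                queue.append((target, mode))
    return best
```

With `mappings = {"A": {"B": RO}, "B": {"C": RW}}`, A reaches both B and C, B reaches only C, and C reaches nothing, so the relation is not symmetric.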
Mappings make it possible to share complex data structures. Subcontext B can now define data structures that include pointers - such as trees, linked lists, graphs, or more complex variants - in its own memory, and A will be able to read them. A can use B's pointers directly, without any need for indirection. Moreover, as B changes its pages by mmap()ing new pages or munmap()ing old ones, these changes are atomically reflected in the effective page table of A. This means that A and B only have to initialize their relationship once - by establishing a mapping - and all future changes to B's pages will be visible in A, at all times.
In addition to complex data structures, mappings make it possible to share long buffers - meaning that B can write data to a buffer, and A can read it without need for copies or interaction with the kernel. Likewise, if A has a read/write mapping to B, A can write data into B's pages, and B can read it without additional copies.
Finally, subcontexts allow the creation of "entry points" into other subcontexts. These are pre-defined call locations in a subcontext which another subcontext is allowed to jump to. In our A->B example, for instance, A could provide one or more entry points to B. B is now allowed to call A - but only at the entry points. When B calls into A, the thread switches context to A, and has access to all of the pages that A can access, and will continue to do so until the call returns.
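The calling discipline for entry points can be modeled in a few lines. This is a toy sketch, not the real mechanism: the `Subcontext` and `Thread` classes and their methods are hypothetical names, and where a real implementation would switch page tables, the model merely tracks which subcontext a thread is currently executing in.

```python
class Subcontext:
    """Toy subcontext that exposes named entry points to specific callers."""

    def __init__(self, name):
        self.name = name
        self.entry_points = {}  # (caller name, entry name) -> callable

    def expose(self, caller, func):
        """Allow `caller` to enter this subcontext, but only through `func`."""
        self.entry_points[(caller.name, func.__name__)] = func


class Thread:
    """Toy thread that remembers which subcontext it is executing in."""

    def __init__(self, home):
        self.context_stack = [home]

    @property
    def current(self):
        return self.context_stack[-1]

    def call(self, target, entry_name, *args):
        key = (self.current.name, entry_name)
        if key not in target.entry_points:
            raise PermissionError(
                f"{self.current.name} may not call {target.name}.{entry_name}"
            )
        # Stand-in for the context switch: the thread runs as `target`
        # until the call returns, then drops back to the caller's context.
        self.context_stack.append(target)
        try:
            return target.entry_points[key](self, *args)
        finally:
            self.context_stack.pop()
```

A thread whose home is B can call an entry point that A exposed to B and observes `thread.current` as A for the duration of the call; any other target or caller raises `PermissionError`.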
In this way, subcontexts allow the creation of extremely fast local RPC services. In this model, we view A as a service and B as its client; A has some access to B's memory (perhaps readonly, perhaps read/write), and B has the ability to call A's registered services.
With a software-only implementation, subcontexts make it possible to implement an elegant shared-memory IPC abstraction; with hardware support, we believe that (for some types of services) the performance of a service subcontext could be competitive with a kernel-mode service implementation.
Currently, I am focusing on file fingerprinting, recipe generation, deduplication, and efficient transfers of large files over long distances.
File fingerprinting is always a probabilistic endeavor; there is no guarantee that all, or even most, of the duplication between two files will be discovered by comparing their fingerprints. However, some file fingerprinting algorithms are better than others, and some provide attractive upper bounds on false negatives. We are currently investigating which file fingerprinting algorithms work best, and hope to be able to describe the "best in breed" for a variety of purposes.
Recipes can be generated by leveraging file fingerprinting techniques; the fingerprints can be used to determine the location and/or length of blocks. In this way, we can generate recipes such that two files with duplicate data will often have one or more duplicate blocks.
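One common way to let fingerprints determine block boundaries is content-defined chunking. The sketch below is a minimal illustration of that idea, not the algorithm under investigation here. It uses a plain additive rolling sum as a stand-in for a stronger fingerprint, and the function name, the `(offset, length, digest)` recipe layout, and the `window`, `mask`, and `min_block` parameters are all assumptions chosen for the example.

```python
import hashlib


def make_recipe(data, window=16, mask=0x3F, min_block=32):
    """Split `data` into content-defined blocks.

    Returns a list of (offset, length, sha256 hex digest) tuples.
    A block ends where the rolling sum of the last `window` bytes has
    all of the `mask` bits set, provided the block is long enough.
    """
    recipe = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # slide the window forward
        length = i - start + 1
        if length >= min_block and (rolling & mask) == mask:
            block = data[start:i + 1]
            recipe.append((start, length, hashlib.sha256(block).hexdigest()))
            start = i + 1
    if start < len(data):
        # Whatever remains after the last boundary forms the final block.
        block = data[start:]
        recipe.append((start, len(block), hashlib.sha256(block).hexdigest()))
    return recipe
```

Because each boundary decision depends only on the bytes seen so far, appending data to a file leaves every block except the last one unchanged, which suits data sets that grow over time. Insertions in the middle of a file shift later boundaries only until the chunker resynchronizes, and that is not guaranteed, in keeping with the probabilistic nature of fingerprinting.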
We are investigating leveraging this technique in order to build efficient distributed file systems. We hope to deploy this technology in file systems related to biological computing, where data sets are huge and generally are appended to over time. If a multi-petabyte dataset, thousands of miles away, is updated periodically with new data, can we automatically detect which block(s) have been changed, and only send those over the network - rather than re-transferring the entire dataset?
While tools exist to do this detection at transfer time (such as rsync), this requires a full, brute-force scan of both the origin and destination files. Moreover, such scans are not normally adaptive, and thus gigantic files can require gigantic data transfers simply to confirm that data has not changed. We are investigating a more general and efficient system, where files are scanned only once (when they are updated) and where recipes automatically adapt their scale to detect duplication (if and when it exists) with a minimum of network overhead.
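Once both sides hold recipes, deciding what to transfer needs only the recipes themselves, not the file contents. The following is a minimal sketch under stated assumptions: a recipe is taken to be a list of `(offset, length, digest)` tuples, the function name and the "reuse"/"send" plan format are hypothetical, and the adaptive scaling mentioned above is not modeled.

```python
def transfer_plan(origin_recipe, destination_recipe):
    """Decide, block by block, what must cross the network.

    Each recipe is a list of (offset, length, digest) tuples.
    Returns a list of steps, one per origin block:
      ("reuse", destination_offset, length) - destination already has the data
      ("send",  origin_offset,      length) - data must be transferred
    """
    have = {}
    for offset, _length, digest in destination_recipe:
        have.setdefault(digest, offset)  # remember the first copy of each block

    plan = []
    for offset, length, digest in origin_recipe:
        if digest in have:
            plan.append(("reuse", have[digest], length))
        else:
            plan.append(("send", offset, length))
    return plan
```

For a data set that has only been appended to, every step but the trailing ones comes out as "reuse", so the network cost is the recipe exchange plus the new blocks.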
Assume that you have a well-debugged program performing some critical task. Can you trust the operating system that it runs on?
Bodyguard is a minimal hypervisor, combined with a thin shim layer that stands between your program and the untrusted operating system. The shim and the hypervisor communicate using a channel which the OS can neither intercept, corrupt, snoop upon, nor emulate; together they can detect and prevent any attempt by the OS to snoop on or corrupt the state of your program.
However, Bodyguard is carefully designed to allow the operating system to do any number of safe memory accesses. The OS can swap your pages to disk, and then restore them; it can perform COW copies; it can accept buffers from your program (for write() and the like) and deliver buffers to your program (for read()).
The essence of Bodyguard is that it allows these safe accesses but prevents any sort of unauthorized access; it is invisible unless the operating system does something against the rules.