4 USING PATHS TO OPTIMIZE CODE

Poetry: the best words in the best order.
-- Samuel Taylor Coleridge


This chapter presents a case study in using paths to improve execution speed in a networking subsystem. Specifically, it targets protocol processing latency. This problem was chosen because optimizing for latency is often considered hard: in contrast to throughput-oriented optimizations, there is rarely a single dominant latency bottleneck [55, 112]. Instead, improving latency typically requires improving protocol processing along the entire path of execution [18, 51]. In this sense, the problem is ideally suited to demonstrating some of the potential benefits of Scout paths. It is important to keep in mind, however, that the approach taken in this case study is by no means the only way paths can be exploited to improve the execution speed of a system. Dynamic code generation [60], manually crafted vertically integrated code paths [56, 23], and language-based approaches [14] represent a few other possibilities in this spectrum.

The case study proposes and analyzes four techniques targeted at improving protocol processing. The first three are path-based; the last is a compiler-based technique that addresses the overhead of the deep call chains commonly encountered during path execution (and in systems code in general). The path-based techniques optimize for a particular sequence of partial processing functions. This means that different code is needed for each performance-critical sequence, but not necessarily for each path, since paths traversing the same sequence of modules may be able to share the same optimized code. It also means that a Scout realization is straightforward: the optimized code for each function sequence can, for example, be pre-generated at system build time. At runtime, the only additional processing required is to match the processing function sequence of a newly created path against the sequences for which optimized code was pre-generated. If there is a match, the function pointers in the interfaces of the path's stages can be redirected to the optimized code.
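
The runtime matching step described above can be illustrated with a minimal sketch in C. It is not Scout's actual interface: the types (proc_fn, struct stage, struct path, struct optimized_seq), the table seq_table, and the stand-in processing functions (eth_in, ip_in, udp_in and their _opt variants) are all hypothetical names introduced for illustration. The sketch assumes each stage exposes a single function pointer and that the optimized variants were produced at build time.

```c
#include <stddef.h>

#define MAX_STAGES 8

/* Hypothetical signature for a partial processing function. */
typedef int (*proc_fn)(int msg);

/* Hypothetical stage: its interface holds one function pointer. */
struct stage { proc_fn deliver; };

struct path {
    size_t nstages;
    struct stage stages[MAX_STAGES];
};

/* One table entry: a generic function sequence and the optimized
 * code assumed to have been pre-generated for it at build time. */
struct optimized_seq {
    size_t len;
    proc_fn generic[MAX_STAGES];
    proc_fn optimized[MAX_STAGES];
};

/* Stand-in processing functions (illustrative only). The optimized
 * variants compute the same result as their generic counterparts. */
int eth_in(int msg)     { return msg + 1; }
int ip_in(int msg)      { return msg + 10; }
int udp_in(int msg)     { return msg + 100; }
int eth_in_opt(int msg) { return msg + 1; }
int ip_in_opt(int msg)  { return msg + 10; }
int udp_in_opt(int msg) { return msg + 100; }

static const struct optimized_seq seq_table[] = {
    { 3, { eth_in, ip_in, udp_in },
         { eth_in_opt, ip_in_opt, udp_in_opt } },
};
#define SEQ_TABLE_LEN (sizeof seq_table / sizeof seq_table[0])

/* Does the path's function sequence equal the entry's generic one? */
static int seq_matches(const struct path *p, const struct optimized_seq *s)
{
    if (p->nstages != s->len)
        return 0;
    for (size_t i = 0; i < s->len; i++)
        if (p->stages[i].deliver != s->generic[i])
            return 0;
    return 1;
}

/* Called once for a newly created path. On a match, redirect every
 * stage's function pointer to the optimized code and return 1;
 * otherwise leave the path untouched and return 0. */
int path_optimize(struct path *p)
{
    for (size_t t = 0; t < SEQ_TABLE_LEN; t++) {
        const struct optimized_seq *s = &seq_table[t];
        if (!seq_matches(p, s))
            continue;
        for (size_t i = 0; i < s->len; i++)
            p->stages[i].deliver = s->optimized[i];
        return 1;
    }
    return 0;
}

/* Pass a message through every stage of the path in order. */
int path_run(const struct path *p, int msg)
{
    for (size_t i = 0; i < p->nstages; i++)
        msg = p->stages[i].deliver(msg);
    return msg;
}
```

In this sketch, two paths that traverse the same function sequence end up pointing at the same optimized functions, which mirrors the sharing noted above. A path whose sequence has no table entry simply keeps its generic functions, so the matching cost is paid only once, at path creation.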

4.1 Preliminaries

4.2 Latency Reducing Techniques

4.3 Evaluation

4.4 Concluding Remarks
