Date: Tuesday, March 1, 2016
Time: 11:00 am
Concludes: 12:15 pm
Location: Gould-Simpson 906
Faculty Host: Dr. Michelle Strout
Speaker: Peng (Ryan) Huang
Title: Ph.D. Candidate
Affiliation: UC San Diego

Toward Understanding and Dealing with Failures in Modern Systems

Many of the services we use every day now run in data centers or on mobile devices. However, building systems on these modern platforms to provide reliable services is difficult. Despite the large amount of work put into reliability, modern systems (i.e., cloud and mobile) continue to experience million-dollar outages and frustrating anomalies like battery drain.

In this talk, I will describe my research efforts to better understand and proactively deal with the reliability challenges in modern systems. First, I will discuss work that examines failures in cloud services. Instead of focusing on conventional root-cause analysis, this work takes a unique angle: it looks into the fault-tolerance mechanisms in cloud systems and analyzes why they did not prevent the service failures. I will summarize several challenges (and opportunities) facing cloud system reliability. One such challenge is misconfiguration: existing fault-tolerance techniques often cannot tolerate (or, worse, are nullified by) configuration errors, and misconfiguration has become a major source of cloud outages. I will then present work that enables cloud practitioners to proactively prevent configuration errors by using a systematic validation framework. The framework consists of a declarative language for expressing configuration specifications, a service that continuously checks whether the configuration obeys its specification, and a tool that automatically infers basic specifications. I will also briefly cover the challenge of app misbehavior on mobile devices and a proactive system-level solution to tackle it.


Peng Huang is a Ph.D. candidate at UC San Diego, advised by Professor Yuanyuan Zhou. His research interests lie at the intersection of systems, software engineering, and programming languages. He is particularly interested in understanding emerging problems in real-world systems and applying that understanding to new techniques that improve those systems. His work has been applied in industry, including at Microsoft and Teradata, and deployed to many real users. He is currently a research contractor with Facebook, working on configuration management. Peng received his MS from UC San Diego, and his BS in computer science and BA in economics from Peking University.