CSE 600 Colloq - Thomas Wenisch

Friday, November 6, 2015 - 14:00 to 15:20
Room 120, New CS Building

Title: Killer Microseconds and the Tail at Scale

Online Data Intensive (OLDI) applications, which process terabytes of data
with sub-second latencies, are the cornerstone of modern internet services
like web search and social media. In this talk, I discuss two system design
challenges that make it very difficult to build efficient OLDI applications
and speculate on possible solution directions.
(1) Killer Microseconds---today's CPUs are highly effective at hiding the
nanosecond-scale latency of memory accesses and operating systems are
highly effective at hiding the millisecond-scale latency of disks.
However, modern high-performance networking and flash I/O frequently lead
to situations where data are a few microseconds away. Neither hardware nor
software offers effective mechanisms to hide these microsecond-scale stalls.
(2) The Tail at Scale---OLDI services typically rely on a strategy of
sharding their data sets over hundreds or even thousands of servers to meet
latency objectives. However, this strategy mandates that fully processing
a request requires waiting for the slowest straggler among these servers.
As a result, exceedingly rare events, such as transient network congestion,
interrupts, OS background activity, or CPU power state changes, which have
negligible impact on the throughput of a single server, nevertheless come to
dominate the latency distribution of the OLDI service. At 1000-node scale,
the fifth '9' (the 99.999th percentile) of an individual server's latency
distribution becomes the 99th-percentile latency tail of the entire request.
These two challenges cause OLDI
operators to execute their workloads inefficiently at low utilization to
avoid compounding stalls and tails with queueing delays. There is a
pressing need for systems researchers to find ways to
hide microsecond-scale stalls and track down and address the rare triggers
of 99.999% tail performance anomalies that destroy application-level
latency objectives.
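
The 1000-node figure in the abstract can be checked with a short calculation.
The sketch below is illustrative only and is not taken from the talk. It
assumes that a request fans out to every server, that it completes only when
the slowest server replies, and that server latencies are independent. The
function names are hypothetical.

```python
def request_quantile(server_quantile: float, fanout: int) -> float:
    """Probability that every one of `fanout` independent servers responds
    within the latency found at `server_quantile` of a single server's
    latency distribution (assumption: independent, identical servers)."""
    return server_quantile ** fanout


def required_server_quantile(target_request_quantile: float, fanout: int) -> float:
    """Per-server quantile whose latency bounds the whole request at
    `target_request_quantile`, under the same independence assumption."""
    return target_request_quantile ** (1.0 / fanout)


if __name__ == "__main__":
    fanout = 1000
    # Fraction of requests where all 1000 servers beat their own
    # 99.999th-percentile latency.
    print(f"{request_quantile(0.99999, fanout):.5f}")
    # Per-server percentile needed for a 99th-percentile request target.
    print(f"{required_server_quantile(0.99, fanout):.7f}")
    # Fraction of requests that avoid every server's 1-in-100 slow case.
    print(f"{request_quantile(0.99, fanout):.2e}")
```

Under these assumptions, about 99% of requests avoid every server's
1-in-100,000 slow case, so the per-server 99.999th percentile sets roughly
the 99th-percentile latency of the whole request. Conversely, a per-server
event that occurs 1% of the time touches nearly every request at this
fan-out.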

Special acknowledgements to Luiz Barroso, Partha Ranganathan, and Mike
Marty for many of the ideas in this talk.


Thomas Wenisch is an Associate Professor of Computer Science and
Engineering at the University of Michigan, specializing in computer
architecture. His prior research includes memory streaming for commercial
server applications, multiprocessor memory systems, memory disaggregation,
and rigorous sampling-based performance evaluation methodologies. His
ongoing work focuses on computational sprinting, server and data center
architectures, programming models for byte-addressable NVRAM, and
architectures to enable hand-held 3D ultrasound. Wenisch received the NSF
CAREER award in 2009. Prior to his academic career, Wenisch was a software
developer at American Power Conversion, where he worked on data center
thermal topology estimation. He received his Ph.D. in Electrical and
Computer Engineering from Carnegie Mellon University.
