nguyen@osdi16@USENIX

Yak: A High-Performance Big-Data-Friendly Garbage Collector

Authors: Khanh Nguyen ; Lu Fang ; Guoqing Xu ; Brian Demsky ; Shan Lu ; Sanazsadat Alamian ; Onur Mutlu

Most “Big Data” systems are written in managed languages, such as Java, C#, or Scala. These systems suffer from severe memory problems due to the massive volume of objects created to process input data. Allocating and deallocating a sea of data objects puts a severe strain on existing garbage collectors (GC), leading to high memory management overheads and reduced performance. This paper describes the design and implementation of Yak, a “Big Data” friendly garbage collector that provides high throughput and low latency for all JVM-based languages. Yak divides the managed heap into a control space (CS) and a data space (DS), based on the observation that a typical data-intensive system has a clear distinction between a control path and a data path. Objects created in the control path are allocated in the CS and subject to regular tracing GC. The lifetimes of objects in the data path often align with epochs creating them. They are thus allocated in the DS and subject to region-based memory management. Our evaluation with three large systems shows very positive results.
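The split between a control space and a data space can be illustrated with a toy model. The sketch below is not Yak's implementation: the class `SplitHeapSketch` and its methods `epochStart`, `epochEnd`, and `allocate` are hypothetical names chosen for illustration, and the "heap" is simulated with ordinary Java collections. It assumes that anything allocated while an epoch is active belongs to the data path and can be released in bulk when that epoch ends. It ignores objects that outlive their epoch, which the abstract does not discuss.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a heap split into a control space (CS) and a data space (DS).
// All names are hypothetical; this only mimics the allocation policy in the abstract.
class SplitHeapSketch {
    // CS: objects here would be managed by an ordinary tracing collector.
    private final List<Object> controlSpace = new ArrayList<>();
    // DS: one region per active epoch; nested epochs form a stack of regions.
    private final Deque<List<Object>> regions = new ArrayDeque<>();

    // Marks the start of an epoch and opens a fresh region for it.
    void epochStart() {
        regions.push(new ArrayList<>());
    }

    // Ends the innermost epoch, drops its whole region at once,
    // and returns the number of objects reclaimed.
    int epochEnd() {
        if (regions.isEmpty()) {
            throw new IllegalStateException("no active epoch");
        }
        return regions.pop().size();
    }

    // Records an allocation: DS if an epoch is active (assumed data path), CS otherwise.
    <T> T allocate(T obj) {
        if (regions.isEmpty()) {
            controlSpace.add(obj);
        } else {
            regions.peek().add(obj);
        }
        return obj;
    }

    int controlSpaceSize() {
        return controlSpace.size();
    }

    // Total number of objects across all live regions.
    int dataSpaceSize() {
        int total = 0;
        for (List<Object> region : regions) {
            total += region.size();
        }
        return total;
    }
}
```

In this model, an allocation made before any `epochStart()` call lands in the control space, while allocations made between `epochStart()` and `epochEnd()` land in that epoch's region and disappear together when the epoch ends. The point of the design is that reclaiming a region costs one operation regardless of how many objects it holds, whereas a tracing collector would have to visit each live object.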