zhang@fast19@USENIX

Finesse: Fine-Grained Feature Locality based Fast Resemblance Detection for Post-Deduplication Delta Compression

Authors: Yucheng Zhang, Wen Xia, Dan Feng, Hong Jiang, Yu Hua, Qiang Wang

In storage systems, delta compression is often used as a complementary data reduction technique for data deduplication because it can eliminate redundancy among non-duplicate but highly similar chunks. Currently, what we call 'N-transform Super-Feature' (N-transform SF) is the most popular and widely used approach to computing data similarity for detecting delta compression candidates. Our observations suggest, however, that N-transform SF is compute-intensive, because it must linearly transform each Rabin fingerprint of a data chunk N times to obtain N features, and that it can be simplified by exploiting the fine-grained feature locality that exists among highly similar chunks, which eliminates the time-consuming linear transformations. Therefore, we propose Finesse, a fine-grained feature-locality-based fast resemblance detection approach that divides each chunk into several fixed-sized subchunks, computes features from these subchunks individually, and then groups the features into super-features. Experimental results show that, compared with the state-of-the-art N-transform SF approach, Finesse accelerates the similarity computation for resemblance detection by 3.2× to 3.5× and increases the final throughput of a deduplicated and delta-compressed prototype system by 41% to 85%, while achieving comparable compression ratios.
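
The three steps named in the abstract (split a chunk into fixed-sized subchunks, compute one feature per subchunk, group the features into super-features) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration only: a truncated BLAKE2b hash stands in for Rabin fingerprints, the feature of a subchunk is taken to be the maximum window hash, consecutive features are grouped in order, and the names `window_hashes`, `subchunk_features`, `super_features` and the defaults (12 features, 3 super-features, 8-byte windows) are hypothetical.

```python
import hashlib


def _h64(data: bytes) -> int:
    """64-bit hash of a byte string (assumed stand-in for a Rabin fingerprint)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def window_hashes(data: bytes, window: int = 8):
    """Yield one hash per sliding window over data."""
    if len(data) < window:
        # Data shorter than one window: hash it as a whole.
        yield _h64(data)
        return
    for i in range(len(data) - window + 1):
        yield _h64(data[i:i + window])


def subchunk_features(chunk: bytes, n_features: int = 12, window: int = 8):
    """Split chunk into n_features fixed-sized subchunks, one feature each.

    Each feature is the maximum window hash inside its subchunk, so no
    per-fingerprint linear transformation is needed (assumed selection rule).
    """
    size = max(1, -(-len(chunk) // n_features))  # ceiling division
    features = []
    for k in range(n_features):
        sub = chunk[k * size:(k + 1) * size]
        features.append(max(window_hashes(sub, window)) if sub else 0)
    return features


def super_features(features, n_sf: int = 3):
    """Group consecutive features and hash each group into one super-feature."""
    per_group = len(features) // n_sf
    sfs = []
    for g in range(n_sf):
        group = features[g * per_group:(g + 1) * per_group]
        payload = b"".join(f.to_bytes(8, "big") for f in group)
        sfs.append(hashlib.blake2b(payload, digest_size=8).hexdigest())
    return sfs


if __name__ == "__main__":
    base = bytes(range(256)) * 16          # 4096-byte chunk
    edited = bytearray(base)
    edited[10] ^= 0xFF                     # small edit near the start
    sf_a = super_features(subchunk_features(base))
    sf_b = super_features(subchunk_features(bytes(edited)))
    shared = sum(a == b for a, b in zip(sf_a, sf_b))
    print(f"shared super-features: {shared} of {len(sf_a)}")
```

In this sketch, two chunks would be treated as delta compression candidates when they share at least one super-feature. Because an edit confined to one subchunk can only change that subchunk's feature, the super-features built from the untouched subchunks still match, which is the locality the approach relies on.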