The similarity matrix measures the pairwise similarities between a set of data points. It is an essential concept in data processing and is routinely used in practical applications. Obtaining the similarity matrix is usually trivial when the data points are completely observed. However, obtaining a high-quality similarity matrix often becomes difficult when observations are incomplete, and the problem is harder still on sequential data streams. To address this challenge, we propose matrix correction algorithms that leverage the positive semi-definiteness of the similarity matrix to improve similarity estimation in both offline and online scenarios. Our approaches come with theoretical performance guarantees and are well suited to parallel execution on large-scale data. In empirical evaluations, they are both effective and efficient, significantly outperforming classical imputation-based methods and improving the performance of downstream applications.
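
To make the role of positive semi-definiteness concrete, the sketch below shows one generic form of correction; it is an illustration under stated assumptions and not the algorithms proposed in this work. It assumes cosine similarity, marks missing entries as NaN, estimates each pairwise similarity from co-observed features only, and then projects the resulting matrix onto the positive semi-definite cone by clipping negative eigenvalues, which is the standard Frobenius-norm projection. The function names `cosine_similarity_with_missing` and `project_to_psd` are hypothetical.

```python
import numpy as np


def cosine_similarity_with_missing(X):
    """Estimate pairwise cosine similarity from co-observed features only.

    X is an (n, d) array in which NaN marks a missing entry. This pairwise
    estimate is an illustrative assumption, not the authors' estimator.
    """
    n = X.shape[0]
    S = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            # Features observed in both rows i and j.
            mask = ~np.isnan(X[i]) & ~np.isnan(X[j])
            a, b = X[i, mask], X[j, mask]
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            # Fall back to 0 when nothing is co-observed or a norm is zero.
            s = float(a @ b / denom) if denom > 0 else 0.0
            S[i, j] = S[j, i] = s
    return S


def project_to_psd(S):
    """Project a square matrix onto the PSD cone (Frobenius norm).

    Symmetrizes S, then sets negative eigenvalues to zero.
    """
    S = (S + S.T) / 2.0
    w, V = np.linalg.eigh(S)
    w = np.clip(w, 0.0, None)
    # Scale each eigenvector column by its clipped eigenvalue, then recombine.
    return (V * w) @ V.T


if __name__ == "__main__":
    # A symmetric "similarity" matrix that is not PSD, as can arise when
    # entries are estimated independently from incomplete observations.
    S_est = np.array([[1.0, 0.9, -0.9],
                      [0.9, 1.0, 0.9],
                      [-0.9, 0.9, 1.0]])
    S_fix = project_to_psd(S_est)
    print("min eigenvalue before:", np.linalg.eigvalsh(S_est).min())
    print("min eigenvalue after: ", np.linalg.eigvalsh(S_fix).min())
```

Because the PSD cone is convex and a true similarity matrix of this kind lies inside it, this projection cannot move the estimate farther from the true matrix in Frobenius norm. Note that plain eigenvalue clipping does not preserve a unit diagonal, so a practical correction would need to handle that constraint separately.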