https://papers.cool/arxiv/cs.DSData Structures and Algorithms2024-06-14T00:00:00+00:00python-feedgenCool Papers - Immersive Paper Discoveryhttps://papers.cool/arxiv/2406.08847Roping in Uncertainty: Robustness and Regularization in Markov Games2024-06-14T00:00:00+00:00Jeremy McMahanGiovanni ArtiglioQiaomin XieWe study robust Markov games (RMG) with $s$-rectangular uncertainty. We show a general equivalence between computing a robust Nash equilibrium (RNE) of a $s$-rectangular RMG and computing a Nash equilibrium (NE) of an appropriately constructed regularized MG. The equivalence result yields a planning algorithm for solving $s$-rectangular RMGs, as well as provable robustness guarantees for policies computed using regularized methods. However, we show that even for just reward-uncertain two-player zero-sum matrix games, computing an RNE is PPAD-hard. Consequently, we derive a special uncertainty structure called efficient player-decomposability and show that RNE for two-player zero-sum RMG in this class can be provably solved in polynomial time. This class includes commonly used uncertainty sets such as $L_1$ and $L_\infty$ ball uncertainty sets.https://papers.cool/arxiv/2406.09281Computing congruences of finite inverse semigroups2024-06-14T00:00:00+00:00Luna ElliottAlex LevineJames D. MitchellIn this paper we present an algorithm for computing a congruence on an inverse semigroup from a collection of generating pairs. This algorithm uses a myriad of techniques from computational group theory, automata, and the theory of inverse semigroups. An initial implementation of this algorithm outperforms existing implementations by several orders of magnitude.https://papers.cool/arxiv/2406.08595Approximating Maximum Matching Requires Almost Quadratic Time2024-06-14T00:00:00+00:00Soheil BehnezhadMohammad RoghaniAviad RubinsteinWe study algorithms for estimating the size of maximum matching. This problem has been subject to extensive research. For $n$-vertex graphs, Bhattacharya, Kiss, and Saranurak [FOCS'23] (BKS) showed that an estimate that is within $\varepsilon n$ of the optimal solution can be achieved in $n^{2-\Omega_\varepsilon(1)}$ time, where $n$ is the number of vertices. While this is subquadratic in $n$ for any fixed $\varepsilon > 0$, it gets closer and closer to the trivial $\Theta(n^2)$ time algorithm that reads the entire input as $\varepsilon$ is made smaller and smaller. In this work, we close this gap and show that the algorithm of BKS is close to optimal. In particular, we prove that for any fixed $\delta > 0$, there is another fixed $\varepsilon = \varepsilon(\delta) > 0$ such that estimating the size of maximum matching within an additive error of $\varepsilon n$ requires $\Omega(n^{2-\delta})$ time in the adjacency list model.https://papers.cool/arxiv/2406.08624A Sublinear Algorithm for Approximate Shortest Paths in Large Networks2024-06-14T00:00:00+00:00Sabyasachi BasuNadia KōshimaTalya EdenOmri Ben-EliezerC. SeshadhriComputing distances and finding shortest paths in massive real-world networks is a fundamental algorithmic task in network analysis. There are two main approaches to solving this task. On one hand are traversal-based algorithms like bidirectional breadth-first search (BiBFS) with no preprocessing step and slow individual distance inquiries. On the other hand are indexing-based approaches, which maintain a large index. This allows for answering individual inquiries very fast; however, index creation is prohibitively expensive. We seek to bridge these two extremes: quickly answer distance inquiries without the need for costly preprocessing. In this work, we propose a new algorithm and data structure, WormHole, for approximate shortest path computations. WormHole leverages structural properties of social networks to build a sublinearly sized index, drawing upon the explicit core-periphery decomposition of Ben-Eliezer et al. Empirically, the preprocessing time of WormHole improves upon index-based solutions by orders of magnitude, and individual inquiries are consistently much faster than in BiBFS. The acceleration comes at the cost of a minor accuracy trade-off. Nonetheless, our empirical evidence demonstrates that WormHole accurately answers essentially all inquiries within a maximum additive error of 2. We complement these empirical results with provable theoretical guarantees, showing that WormHole requires $n^{o(1)}$ node queries per distance inquiry in random power-law networks. In contrast, any approach without a preprocessing step requires $n^{\Omega(1)}$ queries for the same task. WormHole does not require reading the whole graph. Unlike the vast majority of index-based algorithms, it returns paths, not just distances. For faster inquiry times, it can be combined effectively with other index-based solutions, by running them only on the sublinear core.https://papers.cool/arxiv/2406.08711Matching with Nested and Bundled Pandora Boxes2024-06-14T00:00:00+00:00Robin BowersBo WaggonerWe consider max-weighted matching with costs for learning the weights, modeled as a "Pandora's Box" on each endpoint of an edge. Each vertex has an initially-unknown value for being matched to a neighbor, and an algorithm must pay some cost to observe this value. The goal is to maximize the total matched value minus costs. Our model is inspired by two-sided settings, such as matching employees to employers. Importantly for such settings, we allow for negative values which cause existing approaches to fail. We first prove upper bounds for algorithms in two natural classes. Any algorithm that "bundles" the two Pandora boxes incident to an edge is an $o(1)$-approximation. Likewise, any "vertex-based" algorithm, which uses properties of the separate Pandora's boxes but does not consider the interaction of their value distributions, is an $o(1)$-approximation. Instead, we utilize Pandora's Nested-Box Problem, i.e. multiple stages of inspection. We give a self-contained, fully constructive optimal solution to the nested-boxes problem, which may have structural observations of interest compared to prior work. By interpreting each edge as a nested box, we leverage this solution to obtain a constant-factor approximation algorithm. Finally, we show any ``edge-based'' algorithm, which considers the interactions of values along an edge but not with the rest of the graph, is also an $o(1)$-approximation.https://papers.cool/arxiv/2406.08985The Behavior of Tree-Width and Path-Width under Graph Operations and Graph Transformations2024-06-14T00:00:00+00:00Frank GurskiRobin WeishauptTree-width and path-width are well-known graph parameters. Many NP-hard graph problems allow polynomial-time solutions, when restricted to graphs of bounded tree-width or bounded path-width. In this work, we study the behavior of tree-width and path-width under various unary and binary graph transformations. Doing so, for considered transformations we provide upper and lower bounds for the tree-width and path-width of the resulting graph in terms of the tree-width and path-width of the initial graphs or argue why such bounds are impossible to specify. Among the studied, unary transformations are vertex addition, vertex deletion, edge addition, edge deletion, subgraphs, vertex identification, edge contraction, edge subdivision, minors, powers of graphs, line graphs, edge complements, local complements, Seidel switching, and Seidel complementation. Among the studied, binary transformations we consider the disjoint union, join, union, substitution, graph product, 1-sum, and corona of two graphs.https://papers.cool/arxiv/2406.09137Dynamic Correlation Clustering in Sublinear Update Time2024-06-14T00:00:00+00:00Vincent Cohen-AddadSilvio LattanziAndreas MaggioriNikos ParotsidisWe study the classic problem of correlation clustering in dynamic node streams. In this setting, nodes are either added or randomly deleted over time, and each node pair is connected by a positive or negative edge. The objective is to continuously find a partition which minimizes the sum of positive edges crossing clusters and negative edges within clusters. We present an algorithm that maintains an $O(1)$-approximation with $O$(polylog $n$) amortized update time. Prior to our work, Behnezhad, Charikar, Ma, and L. Tan achieved a $5$-approximation with $O(1)$ expected update time in edge streams which translates in node streams to an $O(D)$-update time where $D$ is the maximum possible degree. Finally we complement our theoretical analysis with experiments on real world data.https://papers.cool/arxiv/2406.09150Reducing the Space Used by the Sieve of Eratosthenes When Factoring2024-06-14T00:00:00+00:00Samuel HartmanJonathan P. SorensonWe present a version of the sieve of Eratosthenes that can factor all integers $\le x$ in $O(x \log\log x)$ arithmetic operations using at most $O(\sqrt{x}/\log\log x)$ bits of space. This is an improved space bound under the condition that the algorithm takes at most $O(x\log\log x)$ time. We also show our algorithm performs well in practice.https://papers.cool/arxiv/2406.09255Compact Parallel Hash Tables on the GPU2024-06-14T00:00:00+00:00Steef HegemanDaan WöltgensAnton WijsAlfons LaarmanOn the GPU, hash table operation speed is determined in large part by cache line efficiency, and state-of-the-art hashing schemes thus divide tables into cache line-sized buckets. This raises the question whether performance can be further improved by increasing the number of entries that fit in such buckets. Known compact hashing techniques have not yet been adapted to the massively parallel setting, nor have they been evaluated on the GPU. We consider a compact version of bucketed cuckoo hashing, and a version of compact iceberg hashing suitable for the GPU. We discuss the tables from a theoretical perspective, and provide an open source implementation of both schemes in CUDA for comparative benchmarking. In terms of performance, the state-of-the-art cuckoo hashing benefits from compactness on lookups and insertions (most experiments show at least 10-20% increase in throughput), and the iceberg table benefits significantly, to the point of being comparable to compact cuckoo hashing--while supporting performant dynamic operation.https://papers.cool/arxiv/2406.09373Efficient Discrepancy Testing for Learning with Distribution Shift2024-06-14T00:00:00+00:00Gautam ChandrasekaranAdam R. KlivansVasilis KontonisKonstantinos StavropoulosArsen VasilyanA fundamental notion of distance between train and test distributions from the field of domain adaptation is discrepancy distance. While in general hard to compute, here we provide the first set of provably efficient algorithms for testing localized discrepancy distance, where discrepancy is computed with respect to a fixed output classifier. These results imply a broad set of new, efficient learning algorithms in the recently introduced model of Testable Learning with Distribution Shift (TDS learning) due to Klivans et al. (2023). Our approach generalizes and improves all prior work on TDS learning: (1) we obtain universal learners that succeed simultaneously for large classes of test distributions, (2) achieve near-optimal error rates, and (3) give exponential improvements for constant depth circuits. Our methods further extend to semi-parametric settings and imply the first positive results for low-dimensional convex sets. Additionally, we separate learning and testing phases and obtain algorithms that run in fully polynomial time at test time.