Total: 17

Compressed Path Databases (CPDs) are a leading technique for optimal pathfinding in graphs with static edge costs. In this work we investigate CPDs as admissible heuristic functions and we apply them in two distinct settings: problems where the graph is subject to dynamically changing costs, and anytime settings where deliberation time is limited. Conventional heuristics derive cost-to-go estimates by reasoning about a tentative and usually infeasible path, from the current node to the target. CPD-based heuristics derive cost-to-go estimates by computing a concrete and usually feasible path. We exploit such paths to bound the optimal solution, not just from below but also from above. We demonstrate the benefit of this approach in a range of experiments on standard gridmaps and in comparison to Landmarks, a popular alternative also developed for searching in explicit state-spaces.

We present a simple combination of A* and IDA*, which we call A*+IDA*. It runs A* until memory is almost exhausted, then runs IDA* below each frontier node without duplicate checking. It is widely believed that this algorithm is called MREC, but MREC is just IDA* with a transposition table. A*+IDA* is the first algorithm to run significantly faster than IDA* on the 24-Puzzle, by a factor of almost 5. A complex algorithm called dual search was reported to significantly outperform IDA* on the 24-Puzzle, but the original version does not. We made improvements to dual search and our version combined with A*+IDA* outperforms IDA* by a factor of 6.7 on the 24-Puzzle. Our disk-based A*+IDA* shows further improvement on several hard 24-Puzzle instances. We also found optimal solutions to a subset of random 27 and 29-Puzzle problems. A*+IDA* does not outperform IDA* on Rubik’s Cube, for reasons we explain.

We study the following fundamental graph problem that models the important task of deanonymizing social networks. We are given a graph representing an eponymous social network and another graph, representing an anonymous social network, which has been produced by the original one after removing some of its nodes and adding some noise on the links. Our objective is to correctly associate as many nodes of the anonymous network as possible to their corresponding node in the eponymous network. We present two algorithms that attack the problem by exploiting only the structure of the two graphs. The first one exploits bipartite matching computations and is relatively fast. The second one is a local search heuristic which can use the outcome of our first algorithm as an initial solution and further improve it. We have applied our algorithms on inputs that have been produced by well-known random models for the generation of social networks as well as on inputs that use real social networks. Our algorithms can tolerate noise at the level of up to 10%. Interestingly, our results provide further evidence to which graph generation models are most suitable for modeling social networks and distinguish them from unrealistic ones.

Many practical problems are too difficult to solve optimally, motivating the need to found suboptimal solutions, particularly those with bounds on the final solution quality. Algorithms like Weighted A*, A*-epsilon, Optimistic Search, EES, and DPS have been developed to find suboptimal solutions with solution quality that is within a constant bound of the optimal solution. However, with the exception of weighted A*, all of these algorithms require performing node re-expansions during search. This paper explores the properties of priority functions that can find bounded suboptimal solution without requiring node re-expansions. After general bounds are developed, two new convex priority functions are developed that can outperform weighted A*.

In this paper, the Minimum Cost Submodular Cover problem is studied, which is to minimize a modular cost function such that the monotone submodular benefit function is above a threshold. For this problem, an evolutionary algorithm EASC is introduced that achieves a constant, bicriteria approximation in expected polynomial time; this is the first polynomial-time evolutionary approximation algorithm for Minimum Cost Submodular Cover. To achieve this running time, ideas motivated by submodularity and monotonicity are incorporated into the evolutionary process, which likely will extend to other submodular optimization problems. In a practical application, EASC is demonstrated to outperform the greedy algorithm and converge faster than competing evolutionary algorithms for this problem.

Recently, Evolution Strategies (ES) have been successfully applied to solve problems commonly addressed by reinforcement learning (RL). Due to the simplicity of ES approaches, their runtime is often dominated by the RL-task at hand (e.g., playing a game). In this work, we introduce Progressive Episode Lengths (PEL) as a new technique and incorporate it with ES. The main objective is to allow the agent to play short and easy tasks with limited lengths, and then use the gained knowledge to further solve long and hard tasks with progressive lengths. Hence allowing the agent to perform many function evaluations and find a good solution for short time horizons before adapting the strategy to tackle larger time horizons. We evaluated PEL on a subset of Atari games from OpenAI Gym, showing that it can substantially improve the optimization speed, stability and final score of canonical ES. Specifically, we show average improvements of 80% (32%) after 2 hours (10 hours) compared to canonical ES.

In this paper, we define Jump Point Graphs (JP), a preprocessing-based path-planning technique similar to Subgoal Graphs (SG). JP allows for the first time the combination of Jump Point Search style pruning in the context of abstraction-based speedup techniques, such as Contraction Hierarchies. We compare JP with SG and its variants and report new state-of-the-art results for grid-based pathfinding.

We tackle two long-standing problems related to re-expansions in heuristic search algorithms. For graph search, A* can require Ω(2ⁿ) expansions, where n is the number of states within the final f bound. Existing algorithms that address this problem like B and B’ improve this bound to Ω(n²). For tree search, IDA* can also require Ω(n²) expansions. We describe a new algorithmic framework that iteratively controls an expansion budget and solution cost limit, giving rise to new graph and tree search algorithms for which the number of expansions is O(n log C*), where C* is the optimal solution cost. Our experiments show that the new algorithms are robust in scenarios where existing algorithms fail. In the case of tree search, our new algorithms have no overhead over IDA* in scenarios to which IDA* is well suited and can therefore be recommended as a general replacement for IDA*.

While computing resources have continued to grow, methods for building and using large heuristics have not seen significant advances in recent years. We have observed that direction-optimizing breadth-first search, developed for and used broadly in the Graph 500 competition, can also be applied for building heuristics. But, the algorithm cannot run efficiently using external memory -- when the heuristics being built are larger than RAM. This paper shows how to modify direction-optimizing breadth-first search to build external-memory heuristics. We show that the new approach is not effective in state spaces with low asymptotic branching factors, but in other domains we are able to achieve up to a 3x reducing in runtime when building an external-memory heuristic. The approach is then used to build a 2.6TiB Rubik's Cube heuristic with 5.8 trillion entries, the largest pattern database heuristic ever built.

Artificial Intelligence has seen several breakthroughs in two-player perfect information game. Nevertheless, Doudizhu, a three-player imperfect information game, is still quite challenging. In this paper, we present a Doudizhu AI by applying deep reinforcement learning from games of self-play. The algorithm combines an asymmetric MCTS on nodes of information set of each player, a policy-value network that approximates the policy and value on each decision node, and inference on unobserved hands of other players by given policy. Our results show that self-play can significantly improve the performance of our agent in this multi-agent imperfect information game. Even starting with a weak AI, our agent can achieve human expert level after days of self-play and training.

Aggregating responses from crowd workers is a fundamental task in the process of crowdsourcing. In cases where a few experts are overwhelmed by a large number of non-experts, most answer aggregation algorithms such as the majority voting fail to identify the correct answers. Therefore, it is crucial to extract reliable experts from the crowd workers. In this study, we introduce the notion of "expert core", which is a set of workers that is very unlikely to contain a non-expert. We design a graph-mining-based efficient algorithm that exactly computes the expert core. To answer the aggregation task, we propose two types of algorithms. The first one incorporates the expert core into existing answer aggregation algorithms such as the majority voting, whereas the second one utilizes information provided by the expert core extraction algorithm pertaining to the reliability of workers. We then give a theoretical justification for the first type of algorithm. Computational experiments using synthetic and real-world datasets demonstrate that our proposed answer aggregation algorithms outperform state-of-the-art algorithms.

Computing cycle-free solutions in cyclic AND/OR search spaces is an important AI problem. Previous work on optimal depth-first search strongly assumes the use of consistent heuristics, the need to keep all examined states in a transposition table, and the existence of solutions. We give a new theoretical analysis under relaxed assumptions where previous results no longer hold. We then present a generic approachto proving unsolvability, and apply it to RBFAOO and BLDFS, two state-of-the-art algorithms. We demonstrate the performance in domain-independent nondeterministic planning

There are currently two broad strategies for optimal Multi-agent Pathfinding (MAPF): (1) search-based methods, which model and solve MAPF directly, and (2) compilation-based solvers, which reduce MAPF to instances of well-known combinatorial problems, and thus, can benefit from advances in solver techniques. In this work, we present an optimal algorithm, BCP, that hybridizes both approaches using Branch-and-Cut-and-Price, a decomposition framework developed for mathematical optimization. We formalize BCP and compare it empirically against CBSH and CBSH-RM, two leading search-based solvers. Conclusive results on standard benchmarks indicate that its performance exceeds the state-of-the-art: solving more instances on smaller grids and scaling reliably to 100 or more agents on larger game maps.

Minimum vertex cover (MinVC) is a prominent NP-hard problem in artificial intelligence, with considerable importance in applications. Local search solvers define the state of the art in solving MinVC. However, there is no single MinVC solver that works best across all types of MinVC instances, and finding the most suitable solver for a given application poses considerable challenges. In this work, we present a new local search framework for MinVC called MetaVC, which is highly parametric and incorporates many effective local search techniques. Using an automatic algorithm configurator, the performance of MetaVC can be optimized for particular types of MinVC instances. Through extensive experiments, we demonstrate that MetaVC significantly outperforms previous solvers on medium-size hard MinVC instances, and shows competitive performance on large MinVC instances. We further introduce a neural-network-based approach for enhancing the automatic configuration process, by identifying and terminating unpromising configuration runs. Our results demonstrate that MetaVC, when automatically configured using this method, can achieve improvements in the best known solutions for 16 large MinVC instances.

The task of real-time combat game is to coordinate multiple units to defeat their enemies controlled by the given opponent in a real-time combat scenario. It is difficult to design a high-level Artificial Intelligence (AI) program for such a task due to its extremely large state-action space and real-time requirements. This paper formulates this task as a collective decentralized partially observable Markov decision process, and designs a Deep Decentralized Policy Network (DDPN) to model the polices. To train DDPN effectively, a novel two-stage learning algorithm is proposed which combines imitation learning from opponent and reinforcement learning by no-regret dynamics. Extensive experimental results on various combat scenarios indicate that proposed method can defeat different opponent models and significantly outperforms many state-of-the-art approaches.

Cardiac trabeculae are fine rod-like muscles whose ends are attached to the inner walls of ventricles. Accurate extraction of trabeculae is important yet challenging, due to the background noise and limited resolution of cardiac images. Existing works proposed to handle this task by modeling the trabeculae as topological handles for better extraction. Computing optimal representation of these handles is essential yet very expensive. In this work, we formulate the problem as a heuristic search problem, and propose novel heuristic functions based on advanced topological techniques. We show in experiments that the proposed heuristic functions improve the computation in both time and memory.

This paper is concerned with the class of non-convex optimization problems with orthogonality constraints. We develop computationally efficient relaxations that transform non-convex orthogonality constrained problems into polynomial-time solvable surrogates. A novel penalization technique is used to enforce feasibility and derive certain conditions under which the constraints of the original non-convex problem are guaranteed to be satisfied. Moreover, we extend our approach to a feasibility-preserving sequential scheme that solves penalized relaxation to obtain near-globally optimal points. Experimental results on synthetic and real datasets demonstrate the effectiveness of the proposed approach on two practical applications in machine learning.