Total: 17

Evolutionary algorithms (EAs) have been widely applied to solve multi-objective optimization problems. In contrast to great practical successes, their theoretical foundations are much less developed, even for the essential theoretical aspect, i.e., running time analysis. In this paper, we propose a general approach to estimating upper bounds on the expected running time of multi-objective EAs (MOEAs), and then apply it to diverse situations, including bi-objective and many-objective optimization as well as exact and approximate analysis. For some known asymptotic bounds, our analysis not only provides their leading constants, but also improves them asymptotically. Moreover, our results provide some theoretical justification for the good empirical performance of MOEAs in solving multi-objective combinatorial problems.

The minimum weight vertex cover (MWVC) problem is an important combinatorial optimization problem with various real-world applications. Due to its NP-hardness, most works on solving MWVC focus on heuristic algorithms that can return a good-quality solution in reasonable time. In this work, we propose two dynamic strategies that adjust the behavior of the algorithm during search, which are used to improve a state-of-the-art local search algorithm for MWVC named FastWVC, resulting in two local search algorithms called DynWVC1 and DynWVC2. Previous MWVC algorithms are evaluated on graphs with random or hand-crafted weights. In this work, we evaluate the algorithms on vertex-weighted graphs obtained from an important real-world problem, the map labeling problem. Experiments show that our algorithms obtain better results than previous algorithms for MWVC and maximum weight independent set (MWIS) on these real-world instances. We also test our algorithms on massive graphs studied in previous works, and show significant improvements there.

Evolution Strategies (ES) have recently been demonstrated to be a viable alternative to reinforcement learning (RL) algorithms on a set of challenging deep learning problems, including Atari games and MuJoCo humanoid locomotion benchmarks. While the ES algorithms in that work belonged to the specialized class of natural evolution strategies (which resemble approximate gradient RL algorithms, such as REINFORCE), we demonstrate that even a very basic canonical ES algorithm can achieve the same or even better performance. This success of a basic ES algorithm suggests that the state-of-the-art can be advanced further by integrating the many advances made in the field of ES in the last decades. We also demonstrate that ES algorithms have very different performance characteristics than traditional RL algorithms: on some games, they learn to exploit the environment and perform much better, while on others they can get stuck in suboptimal local minima. Combining their strengths and weaknesses with those of traditional RL algorithms is therefore likely to lead to new advances in the state-of-the-art for solving RL problems.
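
As a rough illustration of what a basic ES loop can look like, the following minimal sketch implements one common textbook form of a (mu, lambda)-ES with a fixed step size and log-rank recombination weights. The abstract does not specify the exact variant, so the function name `canonical_es`, all parameter defaults, and the toy `sphere` objective (minimized here, whereas RL would maximize reward) are assumptions made purely for illustration.

```python
import math
import random


def canonical_es(objective, x0, sigma=0.3, lam=20, mu=5, iterations=200, seed=0):
    """Minimize `objective` with a simple (mu, lambda)-ES (illustrative sketch).

    All names and defaults are hypothetical; sigma is kept fixed for simplicity.
    """
    rng = random.Random(seed)
    n = len(x0)
    # Log-rank recombination weights: better-ranked offspring count more.
    raw = [math.log(mu + 0.5) - math.log(i) for i in range(1, mu + 1)]
    total = sum(raw)
    weights = [r / total for r in raw]

    mean = list(x0)
    for _ in range(iterations):
        offspring = []
        for _ in range(lam):
            # Sample one offspring by perturbing the current mean.
            noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
            candidate = [m + sigma * e for m, e in zip(mean, noise)]
            offspring.append((objective(candidate), noise))
        # Rank offspring by objective value (lower is better here).
        offspring.sort(key=lambda pair: pair[0])
        # Move the mean along the weighted average of the mu best perturbations.
        for j in range(n):
            step = sum(w * offspring[i][1][j] for i, w in enumerate(weights))
            mean[j] += sigma * step
    return mean


def sphere(x):
    """Toy objective (an assumption for this sketch): sum of squares."""
    return sum(v * v for v in x)


start = [3.0, -2.0, 1.0]
result = canonical_es(sphere, start)
```

In an RL setting, the parameter vector would hold policy network weights and the objective would be the negated episode return; neither is modeled in this toy example.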

We present a new preprocessing algorithm for embedding the nodes of a given edge-weighted undirected graph into a Euclidean space. The Euclidean distance between any two nodes in this space approximates the length of the shortest path between them in the given graph. Later, at runtime, a shortest path between any two nodes can be computed with an A* search using the Euclidean distances as heuristic. Our preprocessing algorithm, called FastMap, is inspired by the data-mining algorithm of the same name and runs in near-linear time. Hence, FastMap is orders of magnitude faster than competing approaches that produce a Euclidean embedding using Semidefinite Programming. FastMap also produces admissible and consistent heuristics and therefore guarantees the generation of shortest paths. Moreover, FastMap applies to general undirected graphs for which many traditional heuristics, such as the Manhattan Distance heuristic, are not well defined. Empirically, we demonstrate that A* search using the FastMap heuristic is competitive with A* search using other state-of-the-art heuristics, such as the Differential heuristic.

Focal search (FS) is a bounded-suboptimal search (BSS) variant of A*. Like A*, it uses an open list whose states are sorted in increasing order of their f-values. Unlike A*, it also uses a focal list containing all states from the open list whose f-values are no larger than a suboptimality factor times the smallest f-value in the open list. In this paper, we develop an anytime version of FS, called anytime FS (AFS), that is useful when deliberation time is limited. AFS finds a "good" solution quickly and refines it to better and better solutions if time allows. It does this refinement efficiently by reusing previous search efforts. On the theoretical side, we show that AFS is bounded suboptimal and that anytime potential search (ATPS/ANA*), a state-of-the-art anytime bounded-cost search (BCS) variant of A*, is a special case of AFS. In doing so, we bridge the gap between anytime search algorithms based on BSS and BCS. We also identify different properties of priority functions, used to sort the focal list, that may allow for efficient reuse of previous search efforts. On the experimental side, we demonstrate the usefulness of AFS for solving hard combinatorial problems, such as the generalized covering traveling salesman problem and the multi-agent pathfinding problem.
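
The open-list and focal-list mechanics described above can be sketched as follows. This is a minimal, non-anytime focal search that uses linear scans instead of heaps for readability; it is not AFS itself. The function name `focal_search`, the dictionary-based graph encoding, and the toy instance are assumptions made for illustration.

```python
def focal_search(graph, h, start, goal, w, priority):
    """Plain focal search (illustrative sketch, not the anytime variant).

    graph:    dict mapping a node to a list of (neighbor, edge_cost) pairs
    h:        dict mapping a node to an admissible heuristic value
    w:        suboptimality factor, w >= 1
    priority: function node -> number used to order the focal list (lower first)
    Returns the cost of the solution found, or None if the goal is unreachable.
    """
    g = {start: 0}
    open_set = {start}
    while open_set:
        # Smallest f-value in the open list.
        f_min = min(g[n] + h[n] for n in open_set)
        # Focal list: open states whose f-value is at most w * f_min.
        focal = sorted(n for n in open_set if g[n] + h[n] <= w * f_min)
        # Choose from the focal list by the secondary priority function.
        node = min(focal, key=priority)
        if node == goal:
            return g[node]
        open_set.remove(node)
        for neighbor, cost in graph.get(node, []):
            new_g = g[node] + cost
            if neighbor not in g or new_g < g[neighbor]:
                g[neighbor] = new_g
                open_set.add(neighbor)  # insert, or reopen with a better g-value
    return None


# Hypothetical toy instance: the optimal S-G path costs 2 (via A).
graph = {"S": [("A", 1), ("B", 1)], "A": [("G", 1)], "B": [("G", 2)]}
h = {"S": 2, "A": 1, "B": 1, "G": 0}

optimal_cost = focal_search(graph, h, "S", "G", 1.0, lambda n: h[n])
bounded_cost = focal_search(graph, h, "S", "G", 1.5, lambda n: h[n])
```

With `w = 1.0` the focal list contains only states with minimal f-value, so the search behaves like A* with a particular tie-breaking rule. With an admissible heuristic and `w > 1`, the returned cost is at most `w` times the optimal cost.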

Many state-of-the-art algorithms for solving hard combinatorial problems in artificial intelligence (AI) include elements of stochasticity that lead to high variations in runtime, even for a fixed problem instance. Knowledge about the resulting runtime distributions (RTDs) of algorithms on given problem instances can be exploited in various meta-algorithmic procedures, such as algorithm selection, portfolios, and randomized restarts. Previous work has shown that machine learning can be used to individually predict mean, median and variance of RTDs. To establish a new state-of-the-art in predicting RTDs, we demonstrate that the parameters of an RTD should be learned jointly and that neural networks can do this well by directly optimizing the likelihood of an RTD given runtime observations. In an empirical study involving five algorithms for SAT solving and AI planning, we show that neural networks predict the true RTDs of unseen instances better than previous methods, and can even do so when only few runtime observations are available per training instance.

The maximum k-plex, a generalization of the maximum clique, is used to model a great number of real-world problems. The aim of this paper is to propose a novel exact k-plex algorithm that can deal with large-scale graphs with millions of vertices and edges. Specifically, we first propose several new graph reduction methods through a careful analysis of the structures of induced subgraphs. Afterwards, we present a preprocessing method to simplify initial graphs. Additionally, we present a branch-and-bound algorithm integrating the reduction methods as well as a new dynamic vertex selection mechanism. We perform intensive experiments to evaluate our algorithm, and show that the proposed strategies are effective and our algorithm outperforms state-of-the-art algorithms, especially for real-world massive graphs.

The classical disjoint shortest path problem has recently attracted renewed interest from researchers in the network planning and optimization community. However, requiring the shortest paths to be completely vertex or edge disjoint might be too restrictive and demand many more resources in a network. Partially disjoint shortest paths, in which a bounded number of shared vertices or edges is allowed, balance the degree of disjointness against the occupied network resources. In this paper, we consider the problem of finding k shortest paths which are edge disjoint but partially vertex disjoint. For a pair of distinct vertices in a network graph, the problem aims to optimally find k edge disjoint shortest paths among which at most a bounded number of vertices are shared by at least two paths. In particular, we present novel techniques for exactly solving the problem with a runtime that significantly improves the current best result. The proposed algorithm is also validated by computer experiments on both synthetic and real networks, which demonstrate its superior efficiency: it is up to three orders of magnitude faster than the state of the art.

We study the impact of tie-breaking on the behavior of greedy best-first search with a fixed state space and fixed heuristic. We prove that it is NP-complete to determine the number of states that need to be expanded by greedy best-first search in the best case or in the worst case. However, the best- and worst-case behavior can be computed in polynomial time for undirected state spaces. We perform computational experiments on benchmark tasks from the International Planning Competitions that compare the best and worst cases of greedy best-first search to FIFO, LIFO and random tie-breaking. The experiments demonstrate the importance of tie-breaking in greedy best-first search.
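
To make the effect of tie-breaking concrete, the following minimal sketch counts the states expanded by greedy best-first search under FIFO and LIFO tie-breaking among states with equal heuristic value. The function name `gbfs_expansions`, the graph encoding, and the toy state space are assumptions chosen for illustration; they are not taken from the benchmark tasks mentioned above.

```python
def gbfs_expansions(graph, h, start, goal, tie_breaking="fifo"):
    """Count states expanded by greedy best-first search (illustrative sketch).

    graph:        dict mapping a state to a list of successor states
    h:            dict mapping a state to its heuristic value
    tie_breaking: "fifo" (oldest open state first) or "lifo" (newest first)
    Returns the number of expansions before the goal is selected, or None.
    """
    open_list = [start]  # kept in insertion order
    seen = {start}
    expanded = 0
    while open_list:
        # Greedy best-first search always selects a state with minimal h.
        h_min = min(h[s] for s in open_list)
        candidates = [s for s in open_list if h[s] == h_min]
        state = candidates[0] if tie_breaking == "fifo" else candidates[-1]
        open_list.remove(state)
        if state == goal:
            return expanded
        expanded += 1
        for successor in graph.get(state, []):
            if successor not in seen:
                seen.add(successor)
                open_list.append(successor)
    return None


# Hypothetical state space: A leads into a dead end, B leads to the goal G.
graph = {"S": ["A", "B"], "A": ["A1"], "A1": ["A2"], "A2": [], "B": ["G"]}
h = {"S": 2, "A": 1, "B": 1, "A1": 1, "A2": 1, "G": 0}

fifo_count = gbfs_expansions(graph, h, "S", "G", "fifo")
lifo_count = gbfs_expansions(graph, h, "S", "G", "lifo")
```

On this instance, FIFO expands S, A, and B before reaching the goal, while LIFO expands only S and B, so the same heuristic yields different search effort depending solely on the tie-breaking rule.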

As an important and challenging problem in artificial intelligence (AI) game playing, StarCraft micromanagement involves a dynamically adversarial game playing process with complex multi-agent control within a large action space. In this paper, we propose a novel knowledge-guided agent-tactic-aware learning scheme, that is, opponent-guided tactic learning (OGTL), to cope with this micromanagement problem. In principle, the proposed scheme takes a two-stage cascaded learning strategy which is capable of not only transferring the human tactic knowledge from the human-made opponent agents to our AI agents but also improving the adversarial ability. With the power of reinforcement learning, such a knowledge-guided agent-tactic-aware scheme has the ability to guide the AI agents to achieve high winning-rate performances while accelerating the policy exploration process in a tactic-interpretable fashion. Experimental results demonstrate the effectiveness of the proposed scheme against the state-of-the-art approaches in several benchmark combat scenarios.

Subset selection is a fundamental problem in many areas, which aims to select the best subset of size at most $k$ from a universe. Greedy algorithms are widely used for subset selection, and have shown good approximation performances in deterministic situations. However, their behaviors are stochastic in many realistic situations (e.g., large-scale and noisy). For general stochastic greedy algorithms, bounded approximation guarantees were obtained only for subset selection with monotone submodular objective functions, while real-world applications often involve non-monotone or non-submodular objective functions and can be subject to a more general constraint than a size constraint. This work proves their approximation guarantees in these cases, and thus largely extends the applicability of stochastic greedy algorithms.
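
For reference, the standard deterministic greedy algorithm under a size constraint can be sketched as below, here on maximum coverage, which is a monotone submodular objective. This is only the deterministic baseline, not the stochastic variants analyzed in this work; the function name `greedy_max_coverage` and the toy instance are assumptions for illustration.

```python
def greedy_max_coverage(sets, k):
    """Deterministic greedy subset selection for maximum coverage (sketch).

    sets: list of Python sets; item i covers the elements in sets[i]
    k:    maximum number of items to select
    Returns (chosen indices in selection order, set of covered elements).
    """
    chosen = []
    covered = set()
    for _ in range(k):
        remaining = [i for i in range(len(sets)) if i not in chosen]
        if not remaining:
            break
        # Pick the item with the largest marginal gain in coverage.
        best = max(remaining, key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no remaining item adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Hypothetical instance with four candidate items.
candidate_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
chosen, covered = greedy_max_coverage(candidate_sets, 2)
```

A stochastic greedy variant would replace the exact `max` step with a noisy or sampled choice, which is the setting where the approximation guarantees discussed above become relevant.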

The problem of selecting a sequence of items from a universe that maximizes some given objective function arises in many real-world applications. In this paper, we propose an anytime randomized iterative approach, POSeqSel, which maximizes the given objective function and minimizes the sequence length simultaneously. We prove that for any previously studied objective function, POSeqSel can always reach or improve the best known approximation guarantee within a reasonable running time. Empirical results exhibit the superior performance of POSeqSel.

The subset selection problem that selects a few items from a ground set arises in many applications such as maximum coverage, influence maximization, sparse regression, etc. The recently proposed POSS algorithm is a powerful approximation solver for this problem. However, POSS requires centralized access to the full ground set, and thus is impractical for large-scale real-world applications, where the ground set is too large to be stored on one single machine. In this paper, we propose a distributed version of POSS (DPOSS) with a bounded approximation guarantee. DPOSS can be easily implemented in the MapReduce framework. Our extensive experiments using Spark, on various real-world data sets with size ranging from thousands to millions, show that DPOSS can achieve competitive performance compared with the centralized POSS, and is almost always better than the state-of-the-art distributed greedy algorithm RandGreeDi.

Anytime algorithms enable intelligent systems to trade computation time for solution quality. To exploit this crucial ability in real-time decision-making, the system must decide when to interrupt the anytime algorithm and act on the current solution. Existing meta-level control techniques, however, address this problem by relying on significant offline work that diminishes their practical utility and accuracy. We formally introduce an online performance prediction framework that enables meta-level control to adapt to each instance of a problem without any preprocessing. Using this framework, we then present a meta-level control technique and two stopping conditions. Finally, we show that our approach outperforms existing techniques that require substantial offline work. The result is efficient nonmyopic meta-level control that reduces the overhead and increases the benefits of using anytime algorithms in intelligent systems.

Contraction hierarchies and (N-level) subgoal graphs are two preprocessing-based path-planning algorithms that have so far only been compared experimentally through the grid-based path-planning competitions, where both algorithms had undominated runtime/memory trade-offs. Subgoal graphs can be considered as a framework that can be customized to different domains through the choice of a reachability relation R that identifies pairs of nodes on a graph between which it is easy to find shortest paths. Subgoal graphs can exploit R in various ways to speed up queries and reduce memory requirements. In this paper, we break down the differences between N-level subgoal graphs and contraction hierarchies, and augment contraction hierarchies with ideas from subgoal graphs to exploit R. We propose three different modifications, analyze their runtime/memory trade-offs, and provide experimental results on grids using canonical-freespace-reachability as R, which show that both N-level subgoal graphs and contraction hierarchies are dominated in terms of the runtime/memory trade-off by some of our new variants.

The minimum weight dominating set (MWDS) problem is NP-hard and also important in many applications. Recent heuristic MWDS algorithms can hardly solve massive real-world graphs effectively. In this paper, we design a fast local search algorithm called FastMWDS for the MWDS problem, which aims to obtain a good solution on massive graphs within a short time. In this novel local search framework, we propose two ideas to make it effective. Firstly, we design a new fast construction procedure with four reduction rules to cut down the size of massive graphs. Secondly, we propose the three-valued two-level configuration checking strategy to improve local search, which is a variant of configuration checking (CC) with two levels and multiple values. Experimental results on a broad range of massive real-world graphs show that FastMWDS finds much better solutions than state-of-the-art MWDS algorithms.

Curriculum learning is often introduced as a means to improve agent training for complex tasks, where the goal is to generate a sequence of easier subtasks for an agent to train on, such that final performance or learning speed is improved. However, conventional curricula are mainly designed for one agent with a fixed action space and a sequential simple-to-hard training manner. Instead, we present a novel curriculum learning strategy by introducing the concept of master-slave agents and enabling flexible action settings for agent training. Multiple agents, referred to as the master agent for the target task and slave agents for the subtasks, are trained concurrently within different action spaces by sharing a perception network with an asynchronous strategy. Extensive evaluation on the VizDoom platform demonstrates that the joint learning of the master agent and slave agents is mutually beneficial. Significant improvement is obtained over A3C in terms of learning speed and performance.