
LTLf synthesis is the automated construction of a reactive system from a high-level description, expressed in LTLf, of its finite-horizon behavior. So far, the conversion of LTLf formulas to deterministic finite-state automata (DFAs) has been identified as the primary bottleneck to the scalability of synthesis. Recent investigations have also shown that the size of the DFA state space plays a critical role in synthesis. Therefore, effectively resolving the bottleneck requires the conversion to be efficient in both time and memory and to prevent state-space explosion. Current conversion approaches, however, which are based on either explicit-state or symbolic-state representation, fail to address these necessities adequately at scale: Explicit-state approaches generate minimal DFAs but are slow due to expensive DFA minimization. Symbolic-state representations can be succinct, but due to the lack of DFA minimization they generate such large state spaces that even their symbolic representations cannot compensate for the blow-up. This work proposes a hybrid representation approach for the conversion. Our approach utilizes both explicit and symbolic representations of the state space and effectively leverages their complementary strengths. In doing so, we offer an LTLf-to-DFA conversion technique that addresses all three necessities, hence resolving the bottleneck. A comprehensive empirical evaluation on conversion and synthesis benchmarks supports the merits of our hybrid approach.

Both search-based and translation-based planning systems usually operate on grounded representations of the problem. Planning models, however, are commonly defined using lifted description languages. Thus, planning systems usually generate a grounded representation of the lifted model as a preprocessing step. For HTN planning models, only one method to ground lifted models has been published so far. In this paper we present a new approach for grounding HTN planning problems that produces smaller groundings in a shorter timespan than the previously published method.

We study PO and POCL plans with regard to their makespan – the execution time when allowing the parallel execution of causally independent actions. Partially ordered (PO) plans are often assumed to be equivalent to partial order causal link (POCL) plans, where the causal relationships between actions are explicitly represented via causal links. As a first contribution, we study the similarities and differences of PO and POCL plans, thereby clarifying a common misconception about their relationship: There are PO plans for which there does not exist a POCL plan with the same orderings. We prove that we can still always find a POCL plan with the same makespan in polynomial time. As another main result we prove that turning a PO or POCL plan into one with minimal makespan by only removing ordering constraints (called deordering) is NP-complete. We provide a series of further results on special cases and implications, such as reordering, where orderings can be changed arbitrarily.
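As a small illustration of the makespan notion used above, the following sketch computes the makespan of a partially ordered plan as the longest chain in its ordering DAG. The function name `makespan`, the unit-duration default, and the representation of orderings as `(before, after)` pairs are assumptions made for this example, not notation from the paper.

```python
from collections import defaultdict

def makespan(actions, orderings, duration=None):
    """Makespan of a partially ordered plan: longest chain in the ordering DAG.

    actions:   list of action names
    orderings: list of (before, after) pairs
    duration:  optional dict action -> duration (assumed to be 1 if omitted)
    """
    duration = duration or {a: 1 for a in actions}
    succs = defaultdict(list)
    indeg = {a: 0 for a in actions}
    for before, after in orderings:
        succs[before].append(after)
        indeg[after] += 1

    # Kahn-style topological sweep; finish[a] is the earliest time a can end.
    ready = [a for a in actions if indeg[a] == 0]
    finish = {a: duration[a] for a in ready}
    processed = 0
    while ready:
        a = ready.pop()
        processed += 1
        for b in succs[a]:
            finish[b] = max(finish.get(b, 0), finish[a] + duration[b])
            indeg[b] -= 1
            if indeg[b] == 0:
                ready.append(b)
    if processed != len(actions):
        raise ValueError("ordering constraints contain a cycle")
    return max(finish.values(), default=0)

# A totally ordered plan versus a deordered variant of the same actions.
total = [("a", "b"), ("b", "c"), ("c", "d")]
deordered = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(makespan(["a", "b", "c", "d"], total))      # 4
print(makespan(["a", "b", "c", "d"], deordered))  # 3
```

Evaluating the makespan of a given ordering is easy, as the sketch shows; the hardness result concerns choosing which ordering constraints to remove while keeping the plan valid.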

Markov decision processes (MDPs) are the de facto framework for sequential decision making in the presence of stochastic uncertainty. A classical optimization criterion for MDPs is to maximize the expected discounted-sum payoff, which ignores low-probability catastrophic events with highly negative impact on the system. On the other hand, risk-averse policies require the probability of undesirable events to be below a given threshold, but they do not account for optimization of the expected payoff. We consider MDPs with discounted-sum payoff and with failure states that represent catastrophic outcomes. The objective of risk-constrained planning is to maximize the expected discounted-sum payoff among risk-averse policies that ensure the probability of encountering a failure state is below a desired threshold. Our main contribution is an efficient risk-constrained planning algorithm that combines UCT-like search with a predictor learned through interaction with the MDP (in the style of AlphaZero) and with risk-constrained action selection via linear programming. We demonstrate the effectiveness of our approach with experiments on classical MDPs from the literature, including benchmarks with on the order of 10^6 states.

Automated Planning addresses the problem of finding a sequence of actions, a plan, transforming the environment from its initial state to some goal state. In real-world environments, exogenous events might occur and might modify the environment without the agent's consent. Besides disrupting the agent's plan, events might hinder the agent's pursuit of its goals and even cause damage (e.g., destroying the robot). In this paper, we leverage the notion of Safe States in dynamic environments in the presence of non-deterministic exogenous events that might eventually cause dead-ends (e.g., “damage” the agent) if the agent is not careful while executing its plan. We introduce a technique for generating plans that constrains the number of consecutive “unsafe” actions in a plan and a technique for generating “robust” plans that effectively evade event effects. A combination of both approaches plans and executes robust plans between safe states. We empirically show that such an approach effectively navigates the agent towards its goals in spite of the presence of dead-ends.

A temporal graph is a dynamic graph where every edge is assigned a set of integer time labels that indicate at which discrete time steps the edge is available. In this paper, we study how changes of the time labels, corresponding to delays on the availability of the edges, affect the reachability sets from given sources. The questions about reachability sets are motivated by numerous applications of temporal graphs in network epidemiology and scheduling problems in supply networks in manufacturing. We introduce control mechanisms for reachability sets that are based on two natural operations of delaying time events. The first operation, termed merging, is global and batches together consecutive time labels in the whole network simultaneously. This corresponds to postponing all events until a particular time. The second operation imposes independent delays on the time labels of every edge of the graph. We provide a thorough investigation of the computational complexity of different objectives related to reachability sets when these operations are used. For the merging operation, we prove NP-hardness results for several minimization and maximization reachability objectives, even for very simple graph structures. For the second operation, we prove that the minimization problems are NP-hard when the number of allowed delays is bounded. We complement this with a polynomial-time algorithm for the case of unbounded delays.
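To make the notion of a reachability set concrete, here is a minimal sketch of how it could be computed and how delaying one edge can change it. It assumes undirected edges, positive integer labels, and temporal paths with strictly increasing labels; the paper's exact path semantics may differ, and the names `reachability_set` and `delay` are hypothetical.

```python
def reachability_set(labels, source):
    """Vertices reachable from `source` along paths with strictly increasing labels.

    labels: dict mapping an undirected edge (u, v) to a set of integer
            time labels (assumed to be >= 1).
    """
    # Flatten to time-stamped edge events and sweep them chronologically.
    events = sorted((t, u, v) for (u, v), ts in labels.items() for t in ts)
    arrival = {source: 0}  # earliest arrival time per reached vertex
    i = 0
    while i < len(events):
        t = events[i][0]
        # Handle all events with label t as one batch so that two edges
        # carrying the same label cannot be chained on a single path.
        updates = {}
        while i < len(events) and events[i][0] == t:
            _, u, v = events[i]
            for x, y in ((u, v), (v, u)):
                if x in arrival and arrival[x] < t and y not in arrival:
                    updates[y] = t
            i += 1
        arrival.update(updates)
    return set(arrival)

def delay(labels, edge, amount):
    """Return a copy of `labels` with every label of `edge` shifted by `amount`."""
    return {e: ({t + amount for t in ts} if e == edge else set(ts))
            for e, ts in labels.items()}

graph = {("a", "b"): {1}, ("b", "c"): {2}}
print(sorted(reachability_set(graph, "a")))                        # ['a', 'b', 'c']
print(sorted(reachability_set(delay(graph, ("a", "b"), 1), "a")))  # ['a', 'b']
```

In the example, delaying edge (a, b) by one step makes its label collide with that of (b, c), so c drops out of the reachability set of a.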

In many usage scenarios of AI Planning technology, users will want not just a plan π but an explanation of the space of possible plans, justifying π. In particular, in oversubscription planning where not all goals can be achieved, users may ask why a conjunction A of goals is not achieved by π. We propose to answer this kind of question with the goal conjunctions B excluded by A, i.e., that could not be achieved if A were to be enforced. We formalize this approach in terms of plan-property dependencies, where plan properties are propositional formulas over the goals achieved by a plan, and dependencies are entailment relations in plan space. We focus on entailment relations of the form $\bigwedge_{g \in A} g \Rightarrow \neg \bigwedge_{g \in B} g$, and devise analysis techniques globally identifying all such relations, or locally identifying the implications of a single given plan property (user question) $\bigwedge_{g \in A} g$. We show how, via compilation, one can analyze dependencies between a richer form of plan properties, specifying formulas over action subsets touched by the plan. We run comprehensive experiments on adapted IPC benchmarks, and find that the suggested analyses are reasonably feasible at the global level, and become significantly more effective at the local level.
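The exclusion relation can be illustrated on a toy plan space that is enumerated explicitly. The sketch below assumes each plan is summarized by the set of goals it achieves; the function names and the brute-force enumeration are illustrative assumptions and are not the analysis techniques of the paper, which work without enumerating plans.

```python
from itertools import combinations

def excludes(plan_goal_sets, A, B):
    """True if no plan achieves every goal in A together with every goal in B."""
    required = frozenset(A) | frozenset(B)
    return not any(required <= set(achieved) for achieved in plan_goal_sets)

def minimal_exclusions(plan_goal_sets, goals, A):
    """All subset-minimal goal conjunctions B (disjoint from A) excluded by A."""
    A = frozenset(A)
    others = sorted(set(goals) - A)
    found = []
    # Enumerate candidates by increasing size so minimal sets are found first.
    for size in range(1, len(others) + 1):
        for B in map(frozenset, combinations(others, size)):
            if any(prev <= B for prev in found):
                continue  # a smaller excluded conjunction already covers B
            if excludes(plan_goal_sets, A, B):
                found.append(B)
    return found

# Toy plan space: three plans, each described by the goals it achieves.
plans = [{"g1", "g2"}, {"g2", "g3"}, {"g1"}]
print(minimal_exclusions(plans, ["g1", "g2", "g3"], {"g1"}))
# [frozenset({'g3'})]
```

Here the answer to "why not g1?" for a plan achieving g2 and g3 would be that enforcing g1 excludes g3.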

Suboptimal heuristic search algorithms can benefit from reasoning about heuristic error, especially in a real-time setting where there is not enough time to search all the way to a goal. However, current reasoning methods implicitly or explicitly incorporate assumptions about the cost-to-go function. We consider a recent real-time search algorithm, called Nancy, that manipulates explicit beliefs about the cost-to-go. The original presentation of Nancy assumed that these beliefs are Gaussian, with parameters following a certain form. In this paper, we explore how to replace these assumptions with actual data. We develop a data-driven variant of Nancy, DDNancy, that bases its beliefs on heuristic performance statistics from the same domain. We extend Nancy and DDNancy with the notion of persistence and prove their completeness. Experimental results show that DDNancy can perform well in domains in which the original assumption-based Nancy performs poorly.

In this paper, we focus on the inference of mutex groups in the lifted (PDDL) representation. We formalize the inference and prove that the most commonly used translator from the Fast Downward (FD) planning system infers a certain subclass of mutex groups, called fact-alternating mutex groups (fam-groups). Based on that, we show that the previously proposed fam-groups-based pruning techniques for the STRIPS representation can be utilized during the grounding process with lifted fam-groups, i.e., before the full STRIPS representation is known. Furthermore, we propose an improved inference algorithm for lifted fam-groups that produces a richer set of fam-groups than the FD translator and we demonstrate a positive impact on the number of pruned operators and overall coverage.

People sometimes act differently when making decisions affecting the present moment versus decisions affecting the future only. This is referred to as time-inconsistent behaviour, and can be modeled as agents exhibiting present bias. A resulting phenomenon is abandonment, which is when an agent initially pursues a task, but ultimately gives up before reaping the rewards. With the introduction of the graph-theoretic time-inconsistent planning model due to Kleinberg and Oren, it has been possible to investigate the computational complexity of how a task designer can best support a present-biased agent in completing the task. In this paper, we study the complexity of finding a choice reduction for the agent; that is, how to remove edges and vertices from the task graph such that a present-biased agent will remain motivated to reach his target even for a limited reward. While this problem is NP-complete in general, this is not necessarily true for instances which occur in practice, or for solutions which are of interest to task designers. For instance, a task designer may desire to find the best task graph which is not too complicated. We therefore investigate the problem of finding simple motivating subgraphs. These are structures where the agent will modify his plan at most k times along the way. We quantify this simplicity in the time-inconsistency model as a structural parameter: the number of branching vertices (vertices with out-degree at least 2) in a minimal motivating subgraph. Our results are as follows: We give a linear-time algorithm for finding an optimal motivating path, i.e., when k = 0. On the negative side, we show that finding a simple motivating subgraph is NP-complete even if we allow only a single branching vertex, revealing that simple motivating subgraphs are indeed hard to find. However, we give a pseudo-polynomial algorithm for the case when k is fixed and edge weights are rationals, which might be a reasonable assumption in practice.
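The following sketch simulates a present-biased agent on a small task graph to show abandonment and the effect of a choice reduction. It assumes one common formulation of the Kleinberg and Oren model: at vertex v the agent perceives the cost of moving to w as bias times the edge cost plus the true cheapest remaining cost from w, and it continues only while this perceived cost does not exceed the reward. The paper may parameterize the bias differently, and the function name and graph encoding are hypothetical.

```python
from functools import lru_cache

def simulate_agent(graph, source, target, bias, reward):
    """Walk a present-biased agent through an acyclic task graph.

    graph: dict vertex -> dict successor -> edge cost
    Returns (visited path, whether the target was reached).
    """
    @lru_cache(maxsize=None)
    def dist(v):
        # True (unbiased) cheapest cost from v to the target.
        if v == target:
            return 0.0
        return min((c + dist(w) for w, c in graph.get(v, {}).items()),
                   default=float("inf"))

    path, v = [source], source
    while v != target:
        # Perceived cost: the immediate edge is inflated by the bias.
        options = [(bias * c + dist(w), w) for w, c in graph.get(v, {}).items()]
        if not options:
            return path, False
        perceived, nxt = min(options)
        if perceived > reward:
            return path, False  # the agent abandons the task here
        path.append(nxt)
        v = nxt
    return path, True

task = {"s": {"a": 1, "b": 3}, "a": {"t": 5}, "b": {"t": 1}}
print(simulate_agent(task, "s", "t", bias=3, reward=10))     # (['s', 'a'], False)

# Choice reduction: removing vertex a keeps the agent motivated.
reduced = {"s": {"b": 3}, "b": {"t": 1}}
print(simulate_agent(reduced, "s", "t", bias=3, reward=10))  # (['s', 'b', 't'], True)
```

In the full graph the agent is lured onto the cheap first step towards a and then gives up; the reduced graph is a motivating subgraph with no branching vertex.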

The controllability of a temporal network is defined as an agent's ability to navigate around the uncertainty in its schedule and is well-studied for certain networks of temporal constraints. However, many interesting real-world problems can be better represented as Probabilistic Simple Temporal Networks (PSTNs) in which the uncertain durations are represented using potentially-unbounded probability density functions. This can make it inherently impossible to control for all eventualities. In this paper, we propose two new dynamic controllability algorithms that attempt to maximize the likelihood of successfully executing a schedule within a PSTN. The first approach, which we call Min-Loss DC, finds a dynamic scheduling strategy that minimizes loss of control by using a conflict-directed search to decide where to sacrifice the control in a way that optimizes overall success. The second approach, which we call Max-Gain DC, works in the other direction: it finds a dynamically controllable schedule and then attempts to progressively strengthen it by capturing additional uncertainty. Our approaches are the first known that work by finding maximally dynamically controllable schedules. We empirically compare our approaches against two existing PSTN offline dispatch approaches and one online approach and show that our Min-Loss DC algorithm outperforms the others in terms of maximizing execution success while maintaining competitive runtimes.

This paper studies the computational complexity of temporal planning, as represented by PDDL 2.1, interpreted over dense time. When time is considered discrete, the problem is known to be EXPSPACE-complete. However, the official PDDL 2.1 semantics, and many implementations, interpret time as a dense domain. This work provides several results about the complexity of the problem, studying a few interesting cases: whether a minimum amount ϵ of separation between mutually exclusive events is given, in contrast to the separation being simply required to be non-zero, and whether or not actions are allowed to overlap already running instances of themselves. We prove the problem to be PSPACE-complete when self-overlap is forbidden, whereas, when allowed, it becomes EXPSPACE-complete with ϵ-separation and undecidable with non-zero separation. These results clarify the computational consequences of different choices in the definition of the PDDL 2.1 semantics, which were vague until now.

Solving a Multi-Agent Pathfinding (MAPF) problem involves finding non-conflicting paths that lead a number of agents to their goal locations. In the sum-of-costs variant of MAPF, one is also required to minimize the total number of moves performed by agents before stopping at the goal. Not surprisingly, since MAPF is combinatorial, a number of compilations to Satisfiability solving (SAT) and Answer Set Programming (ASP) exist. In this paper, we propose the first family of compilations to ASP that solve sum-of-costs MAPF over 4-connected grids. Unlike existing compilations to ASP that we are aware of, our encoding is the first that, after grounding, produces a number of clauses that is linear in the number of agents. In addition, the representation of the optimization objective is also carefully written, such that its size after grounding does not depend on the size of the grid. In our experimental evaluation, we show that our approach outperforms search- and SAT-based sum-of-costs MAPF solvers when grids are congested with agents.
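For readers unfamiliar with the objective, this sketch evaluates a candidate MAPF solution in plain Python rather than ASP. It assumes each path is a list of grid cells indexed by time step, that an agent rests at its final cell afterwards, and that both vertex conflicts and swap conflicts are forbidden; the helper names are hypothetical.

```python
def agent_cost(path):
    """Time step at which the agent arrives at its goal for the last time."""
    goal = path[-1]
    t = len(path) - 1
    while t > 0 and path[t - 1] == goal:
        t -= 1  # trailing waits at the goal do not count
    return t

def sum_of_costs(paths):
    return sum(agent_cost(p) for p in paths)

def has_conflict(paths):
    """Detect vertex conflicts (same cell) and swap conflicts (exchanged cells)."""
    horizon = max(len(p) for p in paths)

    def pos(path, t):
        return path[min(t, len(path) - 1)]  # agents rest at their final cell

    for t in range(horizon):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                a, b = paths[i], paths[j]
                if pos(a, t) == pos(b, t):
                    return True
                if pos(a, t) == pos(b, t + 1) and pos(a, t + 1) == pos(b, t):
                    return True
    return False

paths = [[(0, 0), (0, 1), (0, 2)],
         [(1, 1), (1, 1), (0, 1)]]
print(has_conflict(paths), sum_of_costs(paths))          # False 4
print(has_conflict([[(0, 0), (0, 1)], [(0, 1), (0, 0)]]))  # True
```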

Novelty pruning is a planning technique that focuses on exploring states that are novel, i.e., those containing facts that have not been seen before. This seemingly simple idea has had a huge impact on the state of the art in planning, though its effectiveness is not yet entirely understood. We relate novelty to dominance pruning, which compares states to previously seen states to eliminate those that are provably worse in terms of goal distance. Novelty can be interpreted as an unsafe approximation of dominance, where states containing novel facts are relevant because they enable new paths to the goal and, therefore, are less likely to be dominated by others. This provides a framework to understand the success of novelty, resulting in new variants that combine both techniques.
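The simplest form of the novelty test described above can be sketched in a few lines. The example assumes states are sets of facts and checks novelty with respect to single facts only; the class name `NoveltyFilter` is hypothetical, and the sketch does not implement the dominance-based variants the paper proposes.

```python
class NoveltyFilter:
    """Keeps a state only if it contains at least one fact not seen before."""

    def __init__(self):
        self.seen = set()

    def is_novel(self, state):
        new_facts = set(state) - self.seen
        self.seen |= new_facts  # remember the facts for later states
        return bool(new_facts)

f = NoveltyFilter()
print(f.is_novel({"at-a"}))             # True
print(f.is_novel({"at-a", "has-key"}))  # True: "has-key" is new
print(f.is_novel({"has-key"}))          # False: would be pruned
```

The last state is pruned even though it might lie on the only path to a goal, which is why the pruning is unsafe in the sense discussed above.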

The research in hierarchical planning has made considerable progress in the last few years. Many recent systems do not rely on hand-tailored advice anymore to find solutions, but are supposed to be domain-independent systems that come with sophisticated solving techniques. In principle, this development would make the comparison between systems easier (because the domains are not tailored to a single system anymore) and – much more important – also the integration into other systems, because the modeling process is less tedious (due to the lack of advice) and there is no (or less) commitment to a certain planning system the model is created for. However, these advantages are destroyed by the lack of a common input language and feature set supported by the different systems. In this paper, we propose an extension to PDDL, the description language used in non-hierarchical planning, to the needs of hierarchical planning systems.

The need for multiple plans has been established by various planning applications. In some, solution quality has the predominant role, while in others diversity is the key factor. Most recent work takes both plan quality and solution diversity into account under the generic umbrella of diverse planning. There is no common agreement, however, on a collection of computational problems that fall under that generic umbrella. In particular, this might lead to comparisons between planners that have different solution guarantees or optimization criteria in mind. In this work we revisit the diverse planning literature in search of such a collection of computational problems, classifying the existing planners according to these problems. We formally define a taxonomy of computational problems with respect to both plan quality and solution diversity, extending the existing work. We propose a novel approach to diverse planning, exploiting existing classical planners via planning task reformulation and choosing a subset of plans of the required size in post-processing. Based on that, we present planners for two computational problems that most existing planners solve. Our experiments show that the proposed approach significantly improves over the best-performing existing planners in terms of coverage, overall solution quality, and overall diversity according to various diversity metrics.

The need for finding a set of plans rather than one has been motivated by a variety of planning applications. The problem is studied in the context of both diverse and top-k planning: while diverse planning focuses on the difference between pairs of plans, the focus of top-k planning is on the quality of each individual plan. Recent work in diverse planning additionally introduced restrictions on solution quality. Naturally, there are application domains where diversity plays the major role and domains where quality is the predominant feature. In both cases, however, the number of plans produced is often an artificial constraint, and therefore the actual number has little meaning. Inspired by the recent work in diverse planning, we propose a new family of computational problems called top-quality planning, where solution validity is defined through a plan quality bound rather than an arbitrary number of plans. Switching to bounding plan quality allows us to implicitly represent sets of plans. In particular, it makes it possible to represent sets of plans that correspond to valid plan reorderings with a single plan. We formally define the unordered top-quality planning computational problem and present the first planner for that problem. We empirically demonstrate the superior performance of our approach compared to a top-k planner-based baseline, ranging from a 41% increase in coverage for finding all optimal plans to a 69% increase in coverage for finding all plans of quality up to 120% of optimal plan cost. Finally, complementing the new approach with a complete procedure for generating all valid reorderings of a given plan, we derive a top-quality planner. We show the planner to be competitive with a top-k planner-based baseline.

We extend goal recognition design to account for partially informed agents. In particular, we consider a two-agent setting in which one agent, the actor, seeks to achieve a goal but has only incomplete information about the environment. The second agent, the recognizer, has perfect information and aims to recognize the actor's goal from its behavior as quickly as possible. As a one-time offline intervention and with the objective of facilitating the recognition task, the recognizer can selectively reveal information to the actor. The problem of selecting which information to reveal, which we call information shaping, is challenging not only because the space of information shaping options may be large, but also because more information revelation need not make it easier to recognize an agent's goal. We formally define this problem, and suggest a pruning approach for efficiently searching the search space. We demonstrate the effectiveness and efficiency of the suggested method on standard benchmarks.

Many important applications, including robotics, data-center management, and process control, require planning action sequences in domains with continuous state and action spaces and discontinuous objective functions. Monte Carlo tree search (MCTS) is an effective strategy for planning in discrete action spaces. We provide a novel MCTS algorithm (voot) for deterministic environments with continuous action spaces, which, in turn, is based on a novel black-box function-optimization algorithm (voo) to efficiently sample actions. The voo algorithm uses Voronoi partitioning to guide sampling, and is particularly efficient in high-dimensional spaces. The voot algorithm has an instance of voo at each node in the tree. We provide regret bounds for both algorithms and demonstrate their empirical effectiveness in several high-dimensional problems including two difficult robotics planning problems.

In this paper, we study the one-shot and lifelong versions of the Target Assignment and Path Finding problem in automated sortation centers, where each agent needs to constantly assign itself a sorting station, move to its assigned station without colliding with obstacles or other agents, wait in the queue of that station to obtain a parcel for delivery, and then deliver the parcel to a sorting bin. The throughput of such centers is largely determined by the total idle time of all stations since their queues can frequently become empty. To address this problem, we first formalize and study the one-shot version that assigns stations to a set of agents and finds collision-free paths for the agents to their assigned stations. We present efficient algorithms for this task based on a novel min-cost max-flow formulation that minimizes the total idle time of all stations in a fixed time window. We then demonstrate how our algorithms for solving the one-shot problem can be applied to solving the lifelong problem as well. Experimentally, we believe we are the first researchers to consider real-world automated sortation centers using an industrial simulator with realistic data and a kinodynamic model of real robots. On this simulator, we showcase the benefits of our algorithms by demonstrating their efficiency and effectiveness for up to 350 agents.

Hierarchical Task Networks (HTN) planning uses a decomposition process guided by domain knowledge to guide search towards a planning task. While many HTN planners allow calls to external processes (e.g. to a simulator interface) during the decomposition process, this is a computationally expensive process, so planner implementations often use such calls in an ad-hoc way using very specialized domain knowledge to limit the number of calls. Conversely, the classical planners that are capable of using external calls (often called semantic attachments) during planning are limited to generating a fixed number of ground operators at problem grounding time. We formalize Semantic Attachments for HTN planning using semi coroutines, allowing such procedurally defined predicates to link the planning process to custom unifications outside of the planner, such as numerical results from a robotics simulator. The resulting planner then uses such coroutines as part of its backtracking mechanism to search through parallel dimensions of the state-space (e.g. through numeric variables). We show empirically that our planner outperforms the state-of-the-art numeric planners in a number of domains using minimal extra domain knowledge.

Agents operating in a multi-agent environment must consider not just their actions, but also those of the other agents in the system. Artificial social systems are a well-known means for coordinating a set of agents, without requiring centralized planning or online negotiation between agents. Artificial social systems enact a social law which restricts the agents from performing some actions under some circumstances. A robust social law prevents the agents from interfering with each other, but does not prevent them from achieving their goals. Previous work has addressed how to check if a given social law, formulated in a variant of ma-strips, is robust, via compilation to planning. However, the social law was manually specified. In this paper, we address the problem of automatically synthesizing a robust social law for a given multi-agent environment. We treat the problem of social law synthesis as a search through the space of possible social laws, relying on the robustness verification procedure as a goal test. We also show how to exploit additional information produced by the robustness verification procedure to guide the search.

Generalized planning aims at computing an algorithm-like structure (generalized plan) that solves a set of multiple planning instances. In this paper we define negative examples for generalized planning as planning instances that must not be solved by a generalized plan. In this regard, the paper extends the notion of validation of a generalized plan to the problem of verifying that a given generalized plan solves the set of input positive instances while it fails to solve a given input set of negative examples. This notion of plan validation allows us to define quantitative metrics to assess the generalization capacity of generalized plans. The paper also shows how to incorporate this new notion of plan validation into a compilation for plan synthesis that takes both positive and negative instances as input. Experiments show that incorporating negative examples can accelerate plan synthesis in several domains and leverage quantitative metrics to evaluate the generalization capacity of the synthesized plans.

The objective of goal recognition is to infer a goal that accounts for the observed behavior of an actor. In this work, we introduce and formalize the notion of active goal recognition in which we endow the observer with agency to sense, reason, and act in the world with a view to enhancing and possibly expediting goal recognition, and/or to intervening in goal achievement. To this end, we present an algorithm for active goal recognition and a landmark-based approach to the elimination of hypothesized goals which leverages automated planning. Experiments demonstrate the merits of providing agency to the observer, and the effectiveness of our approach in potentially enhancing the observational power of the observer, as well as expediting and in some cases making possible the recognition of the actor's goal.

The objective of top-k planning is to determine a set of k different plans with lowest cost for a given planning task. In practice, such a set of best plans can be preferred to a single best plan generated by ordinary optimal planners, as it allows the user to choose between different alternatives and thus take into account preferences that may be difficult to model. In this paper we show that, in general, the decision problem version of top-k planning is PSPACE-complete, as is the decision problem version of ordinary classical planning. This does not hold for polynomially bounded plans for which the decision problem turns out to be PP-hard, while the ordinary case is NP-hard. We present a novel approach to top-k planning, called sym-k, which is based on symbolic search, and prove that sym-k is sound and complete. Our empirical analysis shows that sym-k exceeds the current state of the art for both small and large k.
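To illustrate what a top-k planner returns, here is a naive explicit-state sketch that enumerates plans in order of cost with a priority queue. It is not sym-k, which relies on symbolic search; it assumes strictly positive action costs, and the successor-function interface and the `max_expansions` safety bound are hypothetical choices made for this example.

```python
import heapq

def top_k_plans(successors, initial, is_goal, k, max_expansions=100_000):
    """Return up to k cheapest plans as (cost, [actions]) pairs, cheapest first.

    successors(state) yields (action, cost, next_state) triples.
    No duplicate detection is used: different action sequences reaching
    the same state are different plans and must all be kept.
    """
    queue = [(0, 0, initial, ())]  # (cost, tie-breaker, state, plan so far)
    counter = 1
    plans = []
    expansions = 0
    while queue and len(plans) < k and expansions < max_expansions:
        cost, _, state, plan = heapq.heappop(queue)
        if is_goal(state):
            plans.append((cost, list(plan)))
        # Goal states are expanded too, since a plan may pass through one.
        expansions += 1
        for action, c, nxt in successors(state):
            heapq.heappush(queue, (cost + c, counter, nxt, plan + (action,)))
            counter += 1
    return plans

graph = {
    "s": [("a", 1, "g"), ("b", 2, "g"), ("c", 1, "m")],
    "m": [("d", 3, "g")],
    "g": [],
}
print(top_k_plans(lambda s: graph[s], "s", lambda s: s == "g", k=2))
# [(1, ['a']), (2, ['b'])]
```

Because states popped from the queue have non-decreasing cost, the first k goal pops are k cheapest plans; the price is that the queue can grow exponentially, which is one motivation for more compact representations.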