Total: 71

In polymatrix coordination games, each player x is a node of a graph and must select an action in her strategy set. Nodes are playing separate bimatrix games with their neighbors in the graph. Namely, the utility of x is given by the preference she has for her action plus, for each neighbor y, a payoff which strictly depends on the mutual actions played by x and y. We propose the new class of distance polymatrix coordination games, properly generalizing polymatrix coordination games, in which the overall utility of player x further depends on the payoffs arising by mutual actions of players v,z that are the endpoints of edges at any distance h Keywords: Agent-based and Multi-agent Systems: Algorithmic Game Theory Agent-based and Multi-agent Systems: Computational Social Choice Agent-based and Multi-agent Systems: Noncooperative Games

In its most traditional setting, the main concern of optimization theory is the search for optimal solutions for instances of a given computational problem. A recent trend of research in artificial intelligence, called solution diversity, has focused on the development of notions of optimality that may be more appropriate in settings where subjectivity is essential. The idea is that instead of aiming at the development of algorithms that output a single optimal solution, the goal is to investigate algorithms that output a small set of sufficiently good solutions that are sufficiently diverse from one another. In this way, the user has the opportunity to choose the solution that is most appropriate to the context at hand. It also displays the richness of the solution space. When combined with techniques from parameterized complexity theory, the paradigm of diversity of solutions offers a powerful algorithmic framework to address problems of practical relevance. In this work, we investigate the impact of this combination in the field of Kemeny Rank Aggregation, a well-studied class of problems lying in the intersection of order theory and social choice theory and also in the field of order theory itself. In particular, we show that KRA is fixed-parameter tractable with respect to natural parameters providing natural formalizations of the notions of diversity and of the notion of a sufficiently good solution. Our main results work both when considering the traditional setting of aggregation over linearly ordered votes, and in the more general setting where votes are partially ordered.

We present a new and rich model of school choice with flexible diversity goals and specialized seats. The model also applies to other settings such as public housing allocation with diversity objectives. Our method of expressing flexible diversity goals is also applicable to other settings in moral multi-agent decision making where competing policies need to be balanced when allocating scarce resources. For our matching model, we present a polynomial-time algorithm that satisfies desirable properties, including strategyproofness and stability under several natural subdomains of our problem. We complement the results by providing a clear understanding about what results do not extend when considering the general model.

We study the classic problem of fairly allocating a set of indivisible goods among a group of agents, and focus on the notion of approximate proportionality known as PROPm. Prior work showed that there exists an allocation that satisfies this notion of fairness for instances involving up to five agents, but fell short of proving that this is true in general. We extend this result to show that a PROPm allocation is guaranteed to exist for all instances, independent of the number of agents or goods. Our proof is constructive, providing an algorithm that computes such an allocation and, unlike prior work, the running time of this algorithm is polynomial in both the number of agents and the number of goods.

We develop a new framework for designing truthful, high-revenue (combinatorial) auctions for limited supply. Our mechanism learns within an instance. It generalizes and improves over previously-studied random-sampling mechanisms. It first samples a participatory group of bidders, then samples several learning groups of bidders from the remaining pool of bidders, learns a high-revenue auction from the learning groups, and finally runs that auction on the participatory group. Previous work on random-sampling mechanisms focused primarily on unlimited supply. Limited supply poses additional significant technical challenges, since allocations of items to bidders must be feasible. We prove guarantees on the performance of our mechanism based on a market-shrinkage term and a new complexity measure we coin partition discrepancy. Partition discrepancy simultaneously measures the intrinsic complexity of the mechanism class and the uniformity of the set of bidders. We then introduce new auction classes that can be parameterized in a way that does not depend on the number of bidders participating, and prove strong guarantees for these classes. We show how our mechanism can be implemented efficiently by leveraging practically-efficient routines for solving winner determination. Finally, we show how to use structural revenue maximization to decide what auction class to use with our framework when there is a constraint on the number of learning groups.

We consider the problem of the conjoint selection and allocation of projects to a population of agents, e.g. students are assigned papers and shall present them to their peers. The selection can be constrained either by quotas over subcategories of projects, or by the preferences of the agents themselves. We explore fairness and optimality issues and refine the analysis of the rank-maximality and popularity optimality concepts. We show that they are compatible with reasonable fairness requirements related to rank-based envy-freeness and can be adapted to select globally good projects according to the preferences of the agents.

To address the dynamic nature of real-world networks, we generalize competitive diffusion games and Voronoi games from static to temporal graphs, where edges may appear or disappear over time. This establishes a new direction of studies in the area of graph games, motivated by applications such as influence spreading. As a first step, we investigate the existence of Nash equilibria in competitive diffusion and Voronoi games on different temporal graph classes. Even when restricting our studies to temporal paths and cycles, this turns out to be a challenging undertaking, revealing significant differences between the two games in the temporal setting. Notably, both games are equivalent on static paths and cycles. Our two main technical results are (algorithmic) proofs for the existence of Nash equilibria in temporal competitive diffusion and temporal Voronoi games when the edges are restricted not to disappear over time.

We study the parameterized complexity of counting variants of Swap- and Shift-Bribery, focusing on the parameterizations by the number of swaps and the number of voters. Facing several computational hardness results, using sampling we show experimentally that Swap-Bribery offers a new approach to the robustness analysis of elections.

In their AAMAS 2020 paper, Szufa et al. presented a "map of elections" that visualizes a set of 800 elections generated from various statistical cultures. While similar elections are grouped together on this map, there is no obvious interpretation of the elections' positions. We provide such an interpretation by introducing four canonical “extreme” elections, acting as a compass on the map. We use them to analyze both a dataset provided by Szufa et al. and a number of real-life elections. In effect, we find a new parameterization of the Mallows model, based on measuring the expected swap distance from the central preference order, and show that it is useful for capturing real-life scenarios.

A common theme of decision making in multi-agent systems is to assign utilities to alternatives, which individuals seek to maximize. This rationale is questionable in coalition formation where agents are affected by other members of their coalition. Based on the assumption that agents are benevolent towards other agents they like to form coalitions with, we propose loyalty in hedonic games, a binary relation dependent on agents' utilities. Given a hedonic game, we define a loyal variant where agents' utilities are defined by taking the minimum of their utility and the utilities of agents towards which they are loyal. This process can be iterated to obtain various degrees of loyalty, terminating in a locally egalitarian variant of the original game. We investigate axioms of group stability and efficiency for different degrees of loyalty. Specifically, we consider the problem of finding coalition structures in the core and of computing best coalitions, obtaining both positive and intractability results. In particular, the limit game possesses Pareto optimal coalition structures in the core.

The Shapley value is a well recognised method for dividing the value of joint effort in cooperative games. However, computing the Shapley value is known to be computationally hard, so stratified sample-based estimation is sometimes used. For this task, we provide two contributions to the state of the art. First, we derive a novel concentration inequality that is tailored to stratified Shapley value estimation using sample variance information. Second, by sequentially choosing samples to minimize our inequality, we develop a new and more efficient method of sampling to estimate the Shapley value. We evaluate our sampling method on a suite of test cooperative games, and our results demonstrate that it outperforms or is competitive with existing stratified sample-based estimation approaches to computing the Shapley value.

We study the problem of fairly allocating indivisible items to agents with different entitlements, which captures, for example, the distribution of ministries among political parties in a coalition government. Our focus is on picking sequences derived from common apportionment methods, including five traditional divisor methods and the quota method. We paint a complete picture of these methods in relation to known envy-freeness and proportionality relaxations for indivisible items as well as monotonicity properties with respect to the resource, population, and weights. In addition, we provide characterizations of picking sequences satisfying each of the fairness notions, and show that the well-studied maximum Nash welfare solution fails resource- and population-monotonicity even in the unweighted setting. Our results serve as an argument in favor of using picking sequences in weighted fair division problems.

We study generalizations of stable matching in which agents may be matched fractionally; this models time-sharing assignments. We focus on the so-called ordinal stability and cardinal stability, and investigate the computational complexity of finding an ordinally stable or cardinally stable fractional matching which either maximizes the social welfare (i.e., the overall utilities of the agents) or the number of fully matched agents (i.e., agents whose matching values sum up to one). We complete the complexity classification of both optimization problems for both ordinal stability and cardinal stability, distinguishing between the marriage (bipartite) and roommates (non-bipartite) cases and the presence or absence of ties in the preferences. In particular, we prove a surprising result that finding a cardinally stable fractional matching with maximum social welfare is NP-hard even for the marriage case without ties. This answers an open question and exemplifies a rare variant of stable marriage that remains hard for preferences without ties. We also complete the picture of the relations of the stability notions and derive structural properties.

One practical requirement in solving dynamic games is to ensure that the players play well from any decision point onward. To satisfy this requirement, existing efforts focus on equilibrium refinement, but the scalability and applicability of existing techniques are limited. In this paper, we propose Temporal-Induced Self-Play (TISP), a novel reinforcement learning-based framework to find strategies with decent performances from any decision point onward. TISP uses belief-space representation, backward induction, policy learning, and non-parametric approximation. Building upon TISP, we design a policy-gradient-based algorithm TISP-PG. We prove that TISP-based algorithms can find approximate Perfect Bayesian Equilibrium in zero-sum one-sided stochastic Bayesian games with finite horizon. We test TISP-based algorithms in various games, including finitely repeated security games and a grid-world game. The results show that TISP-PG is more scalable than existing mathematical programming-based methods and significantly outperforms other learning-based methods.

When can cooperation arise from self-interested decisions in public goods games? And how can we help agents to act cooperatively? We examine these classical questions in a pivotal participation game, a variant of public good games, where heterogeneous agents make binary participation decisions on contributing their endowments, and the public project succeeds when it has enough contributions. We prove it is NP-complete to decide the existence of a cooperative Nash equilibrium such that the project succeeds. We demonstrate that the decision problem becomes easy if agents are homogeneous enough. We then propose two algorithms to help cooperation in the game. Our first algorithm adds an external investment to the public project, and our second algorithm uses matching funds. We show the cost to induce a cooperative Nash equilibrium is near-optimal for both algorithms. Finally, the cost of matching funds can always be smaller than the cost of adding an external investment. Intuitively, matching funds provide a greater incentive for cooperation than adding an external investment does.

We study learning dynamics in distributed production economies such as blockchain mining, peer-to-peer file sharing and crowdsourcing. These economies can be modelled as multi-product Cournot competitions or all-pay auctions (Tullock contests) when individual firms have market power, or as Fisher markets with quasi-linear utilities when every firm has negligible influence on market outcomes. In the former case, we provide a formal proof that Gradient Ascent (GA) can be Li-Yorke chaotic for a step size as small as Θ(1/n), where n is the number of firms. In stark contrast, for the Fisher market case, we derive a Proportional Response (PR) protocol that converges to market equilibrium. The positive results on the convergence of the PR dynamics are obtained in full generality, in the sense that they hold for Fisher markets with any quasi-linear utility functions. Conversely, the chaos results for the GA dynamics are established even in the simplest possible setting of two firms and one good, and they hold for a wide range of price functions with different demand elasticities. Our findings suggest that by considering multi-agent interactions from a market rather than a game-theoretic perspective, we can formally derive natural learning protocols which are stable and converge to effective outcomes rather than being chaotic.

To promote efficient interactions in dynamic and multi-agent systems, there is much interest in techniques that allow agents to represent and reason about social norms that govern agent interactions. Much of this work assumes that norms are provided to agents, but some work has investigated how agents can identify the norms present in a society through observation and experience. However, the norm-identification techniques proposed in the literature often depend on a very specific and domain-specific representation of norms, or require that the possible norms can be enumerated in advance. This paper investigates the problem of identifying norm candidates from a normative language expressed as a probabilistic context-free grammar, using Markov Chain Monte Carlo (MCMC) search. We apply our technique to a simulated robot manipulator task and show that it allows effective identification of norms from observation.

We present a multi-agent learning algorithm, ALMA-Learning, for efficient and fair allocations in large-scale systems. We circumvent the traditional pitfalls of multi-agent learning (e.g., the moving target problem, the curse of dimensionality, or the need for mutually consistent actions) by relying on the ALMA heuristic as a coordination mechanism for each stage game. ALMA-Learning is decentralized, observes only own action/reward pairs, requires no inter-agent communication, and achieves near-optimal (<5% loss) and fair coordination in a variety of synthetic scenarios and a real-world meeting scheduling problem. The lightweight nature and fast learning constitute ALMA-Learning ideal for on-device deployment.

We propose a new approach to intention progression in multi-agent settings where other agents are effectively black boxes. That is, while their goals are known, the precise programs used to achieve these goals are not known. In our approach, agents use an abstraction of their own program called a partially-ordered goal-plan tree (pGPT) to schedule their intentions and predict the actions of other agents. We show how a pGPT can be derived from the program of a BDI agent, and present an approach based on Monte Carlo Tree Search (MCTS) for scheduling an agent's intentions using pGPTs. We evaluate our pGPT-based approach in cooperative, selfish and adversarial multi-agent settings, and show that it out-performs MCTS-based scheduling where agents assume that other agents have the same program as themselves.

We study the Connected Fair Division problem (CFD), which generalizes the fundamental problem of fairly allocating resources to agents by requiring that the items allocated to each agent form a connected subgraph in a provided item graph G. We expand on previous results by providing a comprehensive complexity-theoretic understanding of CFD based on several new algorithms and lower bounds while taking into account several well-established notions of fairness: proportionality, envy-freeness, EF1 and EFX. In particular, we show that to achieve tractability, one needs to restrict both the agents and the item graph in a meaningful way. We design (XP)-algorithms for the problem parameterized by (1) clique-width of G plus the number of agents and (2) treewidth of G plus the number of agent types, along with corresponding lower bounds. Finally, we show that to achieve fixed-parameter tractability, one needs to not only use a more restrictive parameterization of G, but also include the maximum item valuation as an additional parameter.

Distributed constraint optimization problems (DCOPs) are a powerful model for multi-agent coordination and optimization, where information and controls are distributed among multiple agents by nature. Sampling-based algorithms are important incomplete techniques for solving medium-scale DCOPs. However, they use tables to exactly store all the information (e.g., costs, confidence bounds) to facilitate sampling, which limits their scalability. This paper tackles the limitation by incorporating deep neural networks in solving DCOPs for the first time and presents a neural-based sampling scheme built upon regret-matching. In the algorithm, each agent trains a neural network to approximate the regret related to its local problem and performs sampling according to the estimated regret. Furthermore, to ensure exploration we propose a regret rounding scheme that rounds small regret values to positive numbers. We theoretically show the regret bound of our algorithm and extensive evaluations indicate that our algorithm can scale up to large-scale DCOPs and significantly outperform the state-of-the-art methods.

Citizens' assemblies need to represent subpopulations according to their proportions in the general population. These large committees are often constructed in an online fashion by contacting people, asking for the demographic features of the volunteers, and deciding to include them or not. This raises a trade-off between the number of people contacted (and the incurring cost) and the representativeness of the committee. We study three methods, theoretically and experimentally: a greedy algorithm that includes volunteers as long as proportionality is not violated; a non-adaptive method that includes a volunteer with a probability depending only on their features, assuming that the joint feature distribution in the volunteer pool is known; and a reinforcement learning based approach when this distribution is not known a priori but learnt online.

We study the recently introduced cake-cutting setting in which the cake is represented by an undirected graph. This generalizes the canonical interval cake and allows for modeling the division of road networks. We show that when the graph is a forest, an allocation satisfying the well-known criterion of maximin share fairness always exists. Our result holds even when separation constraints are imposed; however, in the latter case no multiplicative approximation of proportionality can be guaranteed. Furthermore, while maximin share fairness is not always achievable for general graphs, we prove that ordinal relaxations can be attained.

This paper is part of an ongoing endeavor to bring the theory of fair division closer to practice by handling requirements from real-life applications. We focus on two requirements originating from the division of land estates: (1) each agent should receive a plot of a usable geometric shape, and (2) plots of different agents must be physically separated. With these requirements, the classic fairness notion of proportionality is impractical, since it may be impossible to attain any multiplicative approximation of it. In contrast, the ordinal maximin share approximation, introduced by Budish in 2011, provides meaningful fairness guarantees. We prove upper and lower bounds on achievable maximin share guarantees when the usable shapes are squares, fat rectangles, or arbitrary axes-aligned rectangles, and explore the algorithmic and query complexity of finding fair partitions in this setting.

We study the secretary problem in multi-agent environments. In the standard secretary problem, a sequence of arbitrary awards arrive online, in a random order, and a single decision maker makes an immediate and irrevocable decision whether to accept each award upon its arrival. The requirement to make immediate decisions arises in many cases due to an implicit assumption regarding competition. Namely, if the decision maker does not take the offered award immediately, it will be taken by someone else. We introduce a novel multi-agent secretary model, in which the competition is explicit. In our model, multiple agents compete over the arriving awards, but the decisions need not be immediate; instead, agents may select previous awards as long as they are available (i.e., not taken by another agent). If an award is selected by multiple agents, ties are broken either randomly or according to a global ranking. This induces a multi-agent game in which the time of selection is not enforced by the rules of the games, rather it is an important component of the agent's strategy. We study the structure and performance of equilibria in this game. For random tie breaking, we characterize the equilibria of the game, and show that the expected social welfare in equilibrium is nearly optimal, despite competition among the agents. For ranked tie breaking, we give a full characterization of equilibria in the 3-agent game, and show that as the number of agents grows, the winning probability of every agent under non-immediate selections approaches her winning probability under immediate selections.