This paper considers the capacity expansion problem in two-sided matching, where the policymaker is allowed to allocate some extra seats in addition to the standard seats. In the medical residency match, each hospital accepts a limited number of doctors, and such capacity constraints are typically fixed in advance. However, these exogenous constraints can compromise the welfare of the doctors; some popular hospitals inevitably turn away some of their favorite doctors. Meanwhile, hospitals often also benefit from accepting a few extra doctors. To tackle this problem, we propose an anytime method in which an upper confidence tree searches the space of capacity expansions, and the deferred acceptance method finds the resident-optimal stable assignment for each expansion. Constructing a good search tree representation significantly boosts the performance of the proposed method. Our simulations show that the proposed method identifies an almost optimal capacity expansion with a significantly smaller computational budget than exact methods based on mixed-integer programming.
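For concreteness, below is a minimal sketch of the doctor-proposing deferred acceptance step that such a search would evaluate at each tree node. The data layout and function name are illustrative assumptions, not the paper's implementation; the capacities passed in would include the extra seats assigned by the node being evaluated.

```python
# Minimal doctor-proposing deferred acceptance sketch (illustrative).
def deferred_acceptance(doctor_prefs, hospital_rank, capacities):
    """doctor_prefs[d]: list of hospitals, most preferred first.
    hospital_rank[h][d]: rank of doctor d at hospital h (lower is better).
    capacities[h]: seats at hospital h (standard seats + extra seats)."""
    next_choice = {d: 0 for d in doctor_prefs}   # next hospital each doctor will try
    tentative = {h: [] for h in hospital_rank}   # doctors tentatively held by each hospital
    free = list(doctor_prefs)
    while free:
        d = free.pop()
        if next_choice[d] >= len(doctor_prefs[d]):
            continue                             # d exhausted the list; stays unmatched
        h = doctor_prefs[d][next_choice[d]]
        next_choice[d] += 1
        tentative[h].append(d)
        tentative[h].sort(key=lambda x: hospital_rank[h][x])
        if len(tentative[h]) > capacities[h]:
            free.append(tentative[h].pop())      # reject the worst-ranked doctor
    return tentative
```

The doctor-proposing direction is what yields the resident-optimal stable assignment mentioned in the abstract.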
Norms help regulate a society. Norms may be explicit (represented in structured form) or implicit. We address the emergence of explicit norms by developing agents who provide and reason about explanations for norm violations when deciding sanctions and identifying alternative norms. These agents use a genetic algorithm to produce norms and reinforcement learning to learn the values of these norms. We find that applying explanations leads to norms that provide better cohesion and goal satisfaction for the agents. Our results are stable across societies with differing attitudes toward generosity.
We study the problem of fairly allocating a set of indivisible goods to a set of n agents. The envy-freeness up to any good (EFX) criterion, which requires that no agent prefer another agent's bundle after the removal of any single good, is known to be a remarkable analogue of envy-freeness when the resource is a set of indivisible goods. In this paper, we investigate EFX for restricted additive valuations, that is, every good has a non-negative value, and every agent is interested in only some of the goods. We introduce a natural relaxation of EFX called EFkX, which requires that no agent envy another agent after the removal of any k goods. Our main contribution is an algorithm that finds a complete (i.e., no good is discarded) EF2X allocation for restricted additive valuations. In our algorithm we devise new concepts, namely configuration and envy-elimination, which may be of independent interest. We also use our new tools to find an EFX allocation for restricted additive valuations that discards at most n/2 - 1 goods.
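A minimal sketch of what the EFkX condition means computationally under additive valuations; the data layout is an assumption, and the paper's algorithm itself is not reproduced here. Under additivity, the binding case is removing the k goods of least value to the envious agent, so only that case needs checking.

```python
def is_efkx(values, bundles, k):
    """True iff no agent envies another after the removal of any k goods from
    the envied bundle (k=1 gives EFX, k=2 gives EF2X).
    values[i][g]: agent i's value for good g; bundles[i]: goods of agent i."""
    n = len(bundles)
    for i in range(n):
        own = sum(values[i][g] for g in bundles[i])
        for j in range(n):
            if i == j:
                continue
            # remaining value is maximized by removing the k goods i values least
            vals = sorted((values[i][g] for g in bundles[j]), reverse=True)
            if sum(vals[:max(len(vals) - k, 0)]) > own:
                return False
    return True
```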
We consider an agent community wishing to decide on several binary issues by means of issue-by-issue majority voting. For each issue and each agent, one of the two options is better than the other. However, some of the agents may be confused about some of the issues, in which case they may vote for the option that is objectively worse for them. A benevolent external party wants to help the agents to make better decisions, i.e., select the majority-preferred option for as many issues as possible. This party may have one of the following tools at its disposal: (1) educating some of the agents, so as to enable them to vote correctly on all issues, (2) appointing a subset of highly competent agents to make decisions on behalf of the entire group, or (3) guiding the agents on how to delegate their votes to other agents, in a way that is consistent with the agents' opinions. For each of these tools, we study the complexity of the decision problem faced by this external party, obtaining both NP-hardness results and fixed-parameter tractability results.
Voting is a crucial methodology for eliciting and combining agents' preferences and information across many applications. Just as there are numerous voting rules exhibiting different properties, we also see many different voting systems. In this paper we investigate how different voting systems perform as a function of the characteristics of the underlying voting population and social network. In particular, we compare direct democracy, liquid democracy, and sortition in a ground truth voting context. Through simulations -- using both real and artificially generated social networks -- we illustrate how voter competency distributions and levels of direct participation affect group accuracy differently in each voting mechanism. Our results can be used to guide the selection of a suitable voting system based on the characteristics of a particular voting setting.
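As a flavor of such a simulation, the toy experiment below scores a mechanism by how often a majority of its participants identifies the ground truth, under an assumed competence distribution; it covers only direct democracy and sortition, since liquid democracy additionally depends on the delegation structure of the social network. All parameters are illustrative.

```python
import random

def group_accuracy(competences, trials=10_000):
    """Probability that a majority of voters, each independently correct with
    probability p_i, votes for the ground-truth option."""
    n = len(competences)
    correct = 0
    for _ in range(trials):
        votes = sum(random.random() < p for p in competences)
        correct += votes > n / 2
    return correct / trials

random.seed(0)
population = [random.uniform(0.4, 0.8) for _ in range(101)]   # assumed competences
direct = group_accuracy(population)                           # everyone votes
panel = group_accuracy(random.sample(population, 11))         # sortition: random panel
print(f"direct democracy: {direct:.3f}, sortition panel: {panel:.3f}")
```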
We study signaling in Bayesian ad auctions, in which bidders' valuations depend on a random, unknown state of nature. The auction mechanism has complete knowledge of the actual state of nature, and it can send signals to bidders so as to disclose information about the state and increase revenue. For instance, a state may collectively encode some features of the user that are known to the mechanism only, since the latter has access to data sources inaccessible to the bidders. We study the problem of computing how the mechanism should send signals to bidders in order to maximize revenue. While this problem has already been addressed in the easier setting of second-price auctions, to the best of our knowledge, our work is the first to explore ad auctions with more than one slot. In this paper, we focus on public signaling and VCG mechanisms, under which bidders truthfully report their valuations. We start with a negative result, showing that, in general, the problem does not admit a PTAS unless P = NP, even when bidders' valuations are known to the mechanism. The rest of the paper is devoted to settings in which this negative result can be circumvented. First, we prove that, with known valuations, the problem can indeed be solved in polynomial time when either the number of states d or the number of slots m is fixed. Moreover, in the same setting, we provide an FPTAS for the case in which bidders are single-minded, but d and m can be arbitrary. Then, we switch to the random valuations setting, in which valuations are drawn according to some probability distribution. In this case, we show that the problem admits an FPTAS, a PTAS, and a QPTAS when, respectively, d is fixed, m is fixed, and bidders' valuations are bounded away from zero.
The Stackelberg security game is played between a defender and an attacker, where the defender needs to allocate a limited amount of resources to multiple targets in order to minimize the loss due to adversarial attack. While allowing targets to have different values, classic settings often assume uniform requirements for defending the targets. This enables existing results on mixed strategies (randomized allocation algorithms) to adopt a compact representation of the mixed strategies. In this work, we initiate the study of mixed strategies for security games in which the targets can have different defending requirements. In contrast to the case of uniform defending requirements, for which an optimal mixed strategy can be computed efficiently, we show that computing the optimal mixed strategy is NP-hard for general defending requirements. However, we show that strong upper and lower bounds on the defending result of the optimal mixed strategy can be derived. We propose an efficient, close-to-optimal Patching algorithm that computes mixed strategies using only a few pure strategies. We also study the setting in which the game is played on a network and resource sharing is enabled between neighboring targets. Our experimental results demonstrate the effectiveness of our algorithm on several large real-world datasets.
We study the problem of allocating m indivisible items to n agents with additive utilities. It is desirable for the allocation to be both fair and efficient, which we formalize through the notions of envy-freeness and Pareto-optimality. While envy-free and Pareto-optimal allocations may not exist for arbitrary utility profiles, previous work has shown that such allocations exist with high probability assuming that all agents’ values for all items are independently drawn from a common distribution. In this paper, we consider a generalization of this model where each agent’s utilities are drawn independently from a distribution specific to the agent. We show that envy-free and Pareto-optimal allocations are likely to exist in this asymmetric model when m=Ω(n log n), which is tight up to a log log gap that remains open even in the symmetric setting. Furthermore, these guarantees can be achieved by a polynomial-time algorithm.
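To illustrate the phenomenon empirically (this is not the paper's proof technique), one can allocate each good to the agent who values it most, which is Pareto-optimal under additive utilities, and then test the result for envy-freeness; the per-agent distributions below are arbitrary choices for the asymmetric model.

```python
import random

def welfare_max_allocation(values):
    """Give each item to the agent who values it most (values[i][g])."""
    n, m = len(values), len(values[0])
    bundles = [[] for _ in range(n)]
    for g in range(m):
        bundles[max(range(n), key=lambda i: values[i][g])].append(g)
    return bundles

def is_envy_free(values, bundles):
    n = len(bundles)
    totals = [[sum(values[i][g] for g in b) for b in bundles] for i in range(n)]
    return all(totals[i][i] >= totals[i][j] for i in range(n) for j in range(n))

random.seed(1)
n, m = 10, 200                          # m comfortably above n log n
values = [[random.betavariate(2 + i % 3, 2) for _ in range(m)]
          for i in range(n)]            # distribution specific to each agent i
print("envy-free:", is_envy_free(values, welfare_max_allocation(values)))
```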
We study the problem of allocating indivisible goods among agents in a fair manner. While envy-free allocations of indivisible goods are not guaranteed to exist, envy-freeness can be achieved by additionally providing some subsidy to the agents. These subsidies can be alternatively viewed as a divisible good (money) that is fractionally assigned among the agents to realize an envy-free outcome. In this setup, we bound the subsidy required to attain envy-freeness among agents with dichotomous valuations, i.e., among agents whose marginal value for any good is either zero or one. We prove that, under dichotomous valuations, there exists an allocation that achieves envy-freeness with a per-agent subsidy of either 0 or 1. Furthermore, such an envy-free solution can be computed efficiently in the standard value-oracle model. Notably, our results hold for general dichotomous valuations and, in particular, do not require the (dichotomous) valuations to be additive, submodular, or even subadditive. Also, our subsidy bounds are tight and provide a linear (in the number of agents) factor improvement over the bounds known for general monotone valuations.
Given the ubiquity of AI-based decisions that affect individuals’ lives, providing transparent explanations about algorithms is ethically sound and often legally mandatory. How do individuals strategically adapt following explanations? What are the consequences of adaptation for algorithmic accuracy? We simulate the interplay between explanations shared by an Institution (e.g., a bank) and the dynamics of strategic adaptation by Individuals reacting to such feedback. Our model identifies key aspects of strategic adaptation and the challenges that an institution could face as it attempts to provide explanations. Adopting an agent-based approach, our model scrutinizes: i) the impact of transparency in explanations, ii) the interaction between faking behavior and detection capacity, and iii) the role of behavioral imitation. We find that the risks of transparent explanations are alleviated if effective methods to detect faking behaviors are in place. Furthermore, we observe that behavioral imitation, as often happens across societies, can alleviate malicious adaptation and contribute to accuracy, even after transparent explanations.
In participatory budgeting, the stakeholders collectively decide which projects from a set of proposed projects should be implemented. This decision is subject to both time and monetary constraints. In reality, it is often impossible to determine the exact cost of each project in advance; the cost is only known after a project is finished. To reduce risk, one can implement projects one after the other, so as to be able to react to cost overruns of previous projects. However, this drastically increases the execution time. We generalize existing frameworks to capture this setting, study desirable properties of algorithms for this problem, and show that some desirable properties are incompatible. Then we present and analyze algorithms that trade off these desirable properties.
Residential segregation in metropolitan areas is a phenomenon that can be observed all over the world. Recently, it has been investigated via game-theoretic models. There, selfish agents of two types are equipped with a monotone utility function that ensures higher utility if an agent has more same-type neighbors. The agents strategically choose their location on a given graph that serves as the residential area, in order to maximize their utility. However, sociological polls suggest that real-world agents actually favor mixed-type neighborhoods and hence should be modeled via non-monotone utility functions. To address this, we study Swap Schelling Games with single-peaked utility functions. Our main finding is that tolerance, i.e., agents favoring fifty-fifty neighborhoods or being in the minority, is necessary for equilibrium existence on almost regular or bipartite graphs. Regarding the quality of equilibria, we derive (almost) tight bounds on the Price of Anarchy and the Price of Stability. In particular, we show that the latter is constant on bipartite and almost regular graphs.
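As an illustration of what a single-peaked utility looks like in this setting, here is one simple functional form; the tent shape and its parameterization are assumptions for illustration, not the paper's definition.

```python
def single_peaked_utility(same_type_fraction, peak):
    """Tent-shaped utility over the fraction of same-type neighbors: maximal
    (1.0) at `peak` and falling off linearly toward the extremes 0 and 1.
    A peak <= 1/2 corresponds to the 'tolerant' agents for which equilibrium
    existence is established on almost regular or bipartite graphs."""
    if same_type_fraction <= peak:
        return same_type_fraction / peak if peak > 0 else 1.0
    return (1 - same_type_fraction) / (1 - peak) if peak < 1 else 1.0
```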
Modeling how agents form their opinions is of paramount importance for designing marketing and electoral campaigns. In this work, we present a new framework for opinion formation which generalizes the well-known Friedkin-Johnsen model by incorporating three important features: (i) social group membership, which limits the amount of influence that people not belonging to the same group may exert on a given agent; (ii) both attraction among friends and repulsion among enemies; (iii) different strengths of influence exerted on a given agent by different people, even if the social relationships among them are the same. We show that, despite its generality, our model always admits a pure Nash equilibrium which, under suitable mild conditions, is even unique. Next, we analyze the performance of these equilibria with respect to a social objective function defined as a convex combination, parametrized by a value λ∈[0,1], of the cost yielded by the untruthfulness of the declared opinions and the total cost of social pressure. We prove bounds on both the price of anarchy and the price of stability which show that, for not-too-extreme values of λ, performance at equilibrium is very close to optimal. For instance, in several interesting scenarios, the prices of anarchy and stability are both equal to max{2λ,1-λ}/min{2λ,1-λ}, which never exceeds 2 for λ∈[1/5,1/2].
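As a quick sanity check of the final expression, one can verify the claimed bound directly:

```latex
f(\lambda) = \frac{\max\{2\lambda,\,1-\lambda\}}{\min\{2\lambda,\,1-\lambda\}}
= \begin{cases}
\dfrac{1-\lambda}{2\lambda}, & \lambda \le \tfrac{1}{3} \quad\text{(decreasing)},\\[4pt]
\dfrac{2\lambda}{1-\lambda}, & \lambda \ge \tfrac{1}{3} \quad\text{(increasing)},
\end{cases}
\qquad
f\!\left(\tfrac{1}{5}\right) = f\!\left(\tfrac{1}{2}\right) = 2,
\quad
f\!\left(\tfrac{1}{3}\right) = 1,
```

so f is at most 2 on [1/5,1/2], attaining the bound exactly at both endpoints and its minimum of 1 at λ = 1/3.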
In this work we introduce a new class of mechanisms composed of a traditional Generalized Second Price (GSP) auction and a fair division scheme, in order to achieve some desired level of fairness between groups of Bayesian strategic advertisers. We propose two mechanisms, beta-Fair GSP and GSP-EFX, that compose GSP with, respectively, an envy-free up to one item and an envy-free up to any item fair division scheme. The payments of GSP are adjusted in order to compensate advertisers that suffer a loss of efficiency due to the fair division stage. We investigate the strategic learning implications of the deployment of sponsored search auction mechanisms that obey such fairness criteria. We prove that, for both mechanisms, if bidders play so as to minimize their external regret, they are guaranteed to reach an equilibrium with good social welfare. We also prove that the mechanisms are budget balanced, so that the payments charged by the traditional GSP mechanism are a good proxy for the total compensation offered to the advertisers. Finally, we evaluate the quality of the allocations through experiments on real-world data.
Motivated by putting empirical work based on (synthetic) election data on a more solid mathematical basis, we analyze six distances among elections, including the challenging-to-compute but very precise swap distance and the distance used to form the so-called map of elections. Among the six, the latter seems to strike the best balance between computational complexity and expressiveness.
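For reference, the basic building block of the swap distance is the Kendall tau (swap) distance between two individual rankings, sketched below; the full distance between elections additionally minimizes over matchings of candidates and voters, which is what makes it challenging to compute. The sketch is an illustration, not the paper's implementation.

```python
def kendall_tau(rank_a, rank_b):
    """Number of candidate pairs ordered differently by the two rankings,
    i.e., the number of adjacent swaps needed to turn one into the other."""
    pos_b = {c: k for k, c in enumerate(rank_b)}
    swaps = 0
    for i in range(len(rank_a)):
        for j in range(i + 1, len(rank_a)):
            if pos_b[rank_a[i]] > pos_b[rank_a[j]]:
                swaps += 1
    return swaps

print(kendall_tau(['a', 'b', 'c'], ['c', 'a', 'b']))  # 2
```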
Advances in multi-agent reinforcement learning (MARL) enable sequential decision making for a range of exciting multi-agent applications such as cooperative AI and autonomous driving. Explaining agent decisions is crucial for improving system transparency, increasing user satisfaction, and facilitating human-agent collaboration. However, existing works on explainable reinforcement learning mostly focus on the single-agent setting and are not suitable for addressing challenges posed by multi-agent environments. We present novel methods to generate two types of policy explanations for MARL: (i) policy summarization about the agent cooperation and task sequence, and (ii) language explanations to answer queries about agent behavior. Experimental results on three MARL domains demonstrate the scalability of our methods. A user study shows that the generated explanations significantly improve user performance and increase subjective ratings on metrics such as user satisfaction.
A fundamental question in social choice and multi-agent systems is how to aggregate ordinal preferences expressed by agents into a measurably prudent collective choice. A promising line of recent work views ordinal preferences as a proxy for underlying cardinal preferences and aims to optimize distortion, the worst-case approximation ratio of the (utilitarian) social welfare. When agents rank the full set of alternatives, prior work identifies near-optimal voting rules for selecting one or more alternatives. However, ranking all the alternatives is prohibitive when there are many of them. In this work, we consider the setting where each agent ranks only her t favorite alternatives and identify almost tight bounds on the best possible distortion when selecting a single alternative or a committee of alternatives of a given size k. Our results also extend to approximating higher moments of social welfare. Along the way, we close a gap left open in prior work by identifying asymptotically tight distortion bounds for committee selection given full rankings.
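For reference, the utilitarian distortion that this line of work optimizes is usually defined as follows, where σ(u) denotes the (partial) ranking profile induced by a utility profile u and f is the voting rule; the notation here is assumed for illustration:

```latex
\mathrm{dist}(f) \;=\; \sup_{u}\;
\frac{\max_{a \in A} \sum_{i} u_i(a)}
     {\sum_{i} u_i\!\left(f(\sigma(u))\right)}
```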
We study settings in which agents with incomplete preferences need to make a collective decision. We focus on a process of majority dynamics where issues are addressed one at a time and undecided agents follow the opinion of the majority. We assess the effects of this process on various consensus notions—such as the Condorcet winner—and show that in the worst case, myopic adherence to the majority damages existing consensus; yet, simulation experiments indicate that the damage is often mild. We also examine scenarios where the chair of the decision process can control the existence (or the identity) of consensus, by determining the order in which the issues are discussed.
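A minimal sketch of one pass of such a process; the data layout and the tie-handling convention are illustrative assumptions, not the paper's model.

```python
def majority_dynamics(prefs, order):
    """prefs[i][q] in {1, -1, 0}: agent i's view on issue q (0 = undecided).
    order: the sequence in which the chair addresses the issues."""
    outcomes = {}
    for q in order:
        decided = sum(row[q] for row in prefs)                # net decided opinion
        maj = 1 if decided > 0 else -1 if decided < 0 else 0  # ties: stay undecided
        votes = [row[q] if row[q] != 0 else maj for row in prefs]
        total = sum(votes)
        outcomes[q] = 1 if total > 0 else -1 if total < 0 else 0
    return outcomes
```

This sketch treats issues independently for simplicity; in the paper's richer setting, agents' undecidedness on later issues can depend on earlier outcomes, which is what gives the chair control via the agenda.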
Social decision schemes (SDSs) map the preferences of individual voters over multiple alternatives to a probability distribution over the alternatives. In order to study properties such as efficiency, strategyproofness, and participation for SDSs, preferences over alternatives are typically lifted to preferences over lotteries using the notion of stochastic dominance (SD). However, requiring strategyproofness or strict participation with respect to this preference extension only leaves room for rather undesirable SDSs such as random dictatorships. Hence, we focus on the natural but little-understood pairwise comparison (PC) preference extension, which postulates that one lottery is preferred to another if the former is more likely to return a preferred outcome. In particular, we settle three open questions raised by Brandt in "Rolling the Dice: Recent Results in Probabilistic Social Choice" (2017): (i) there is no Condorcet-consistent SDS that satisfies PC-strategyproofness; (ii) there is no anonymous and neutral SDS that satisfies PC-efficiency and PC-strategyproofness; and (iii) there is no anonymous and neutral SDS that satisfies PC-efficiency and strict PC-participation. All three impossibilities require m ≥ 4 alternatives and turn into possibilities when m ≤ 3.
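For reference, the PC extension compares two lotteries p and q by the probability that each returns the preferred outcome when both are sampled independently (standard definition, stated here for convenience):

```latex
p \;\succsim_i^{PC}\; q
\;\iff\;
\sum_{\substack{a,b \in A \\ a \succ_i b}} p(a)\,q(b)
\;\ge\;
\sum_{\substack{a,b \in A \\ a \succ_i b}} q(a)\,p(b)
```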
We consider opinion diffusion for undirected networks with sequential updates when the opinions of the agents are single-peaked preference rankings. Our starting point is the study of preserving single-peakedness. We identify voting rules that, when given a single-peaked profile, output at least one ranking that is single-peaked with respect to a single-peaked axis of the input. For such voting rules we show convergence to a stable state of the diffusion process that uses the voting rule as the agents' update rule. Further, we establish an efficient algorithm that maximizes the spread of extreme opinions.
Electing a single committee of small size is a classical and well-understood voting situation. Being interested in a sequence of committees instead, we introduce two time-dependent multistage models based on simple scoring-based voting. Therein, we are given a sequence of voting profiles (stages) over the same set of agents and candidates, and our task is to find, for each stage, a small committee with a high score. In the conservative model we additionally require that any two consecutive committees have a small symmetric difference. Analogously, in the revolutionary model we require large symmetric differences. We prove both models to be NP-hard even for a constant number of agents, and, based on this, initiate a parameterized complexity analysis for the most natural parameters and combinations thereof. Among other results, we prove both models to be in XP yet W[1]-hard regarding the number of stages, and that being revolutionary seems to be "easier" than being conservative.
Network Creation Games are an important framework for understanding the formation of real-world networks. These games usually assume a set of indistinguishable agents strategically buying edges at a uniform price, leading to a network among them. However, in real life, agents are heterogeneous and their relationships often display a bias towards similar agents, say of the same ethnic group. This homophilic behavior on the agent level can then lead to the emergent global phenomenon of social segregation. We study Network Creation Games with multiple types of homophilic agents and non-uniform edge costs, introducing two models that focus on the perception of same-type and different-type neighboring agents, respectively. Despite their different initial conditions, both our theoretical and experimental analyses show that the composition and segregation strength of the resulting stable networks are almost identical, indicating a robust structure of social networks under homophily.
An autonomous broker that liaises between retail customers and power-generating companies (GenCos) is essential for the smart grid ecosystem. The efficiency brought to the smart grid setup by such brokers can be studied through a well-developed simulation environment. In this paper, we describe the design of one such energy broker called VidyutVanika21 (VV21) and analyze its performance using a simulation platform called PowerTAC (Power Trading Agent Competition). Specifically, we discuss the retail (VV21–RM) and wholesale market (VV21–WM) modules of VV21 that help the broker achieve high net profits in a competitive setup. Supported by game-theoretic analysis, VV21–RM designs tariff contracts that (a) maintain a balanced portfolio of different types of customers; (b) sustain an appropriate level of market share; and (c) introduce surcharges on customers to reduce energy usage during peak demand times. VV21–WM aims to reduce the cost of procurement by following the supply curve of the GenCo to identify its lowest ask for a particular auction, which is then used to generate suitable bids. We further demonstrate the efficacy of the retail and wholesale strategies of VV21 in the PowerTAC 2021 finals and through several controlled experiments.
We consider designing reward schemes that incentivize agents to create high-quality content (e.g., videos, images, text, ideas). The problem is at the center of a real-world application where the goal is to optimize the overall quality of generated content on user-generated content platforms. We focus on anonymous independent reward schemes (AIRS) that only take the quality of an agent's content as input. We prove the general problem is NP-hard. If the cost function is convex, we show the optimal AIRS can be formulated as a convex optimization problem and propose an efficient algorithm to solve it. Next, we explore the optimal linear reward scheme and prove it has a 1/2-approximation ratio, and the ratio is tight. Lastly, we show the proportional scheme can be arbitrarily bad compared to AIRS.
Although multistage tasks involving multiple sequential goals are common in real-world applications, they are not fully studied in multi-agent reinforcement learning (MARL). To accomplish a multistage task, agents have to achieve cooperation on different subtasks. Exploring the collaborative patterns of different subtasks and the sequence of completing the subtasks leads to an explosion in the search space, which poses great challenges to policy learning. Existing works designed for single-stage tasks, where agents learn to cooperate only once, usually suffer from low sample efficiency in multistage tasks as agents explore aimlessly. Inspired by how humans improve cooperation through goal consistency, we propose the Multi-Agent Goal Consistency (MAGIC) framework to improve sample efficiency for learning in multistage tasks. MAGIC adopts a goal-oriented actor-critic model to learn both local and global views of goal cognition, which helps agents understand the task at the goal level so that they can conduct targeted exploration accordingly. Moreover, to improve exploration efficiency, MAGIC employs two-level goal consistency training to drive agents to formulate a consistent goal cognition. Experimental results show that MAGIC significantly improves sample efficiency and facilitates cooperation among agents, compared with state-of-the-art MARL algorithms, in several challenging multistage tasks.