| Total: 65
Time-inconsistency is a characteristic of human behavior in which people plan for long-term benefits but take actions that differ from the plan due to conflicts with short-term benefits. Such time-inconsistent behavior is believed to be caused by present bias, a tendency to overestimate immediate rewards and underestimate future rewards. It is essential in behavioral economics to investigate the relationship between present bias and time-inconsistency. In this paper, we propose a model for analyzing agent behavior with present bias in tasks to make progress toward a goal over a specific period. Unlike previous models, the state sequence of the agent can be described analytically in our model. Based on this property, we analyze three crucial problems related to agents under present bias: task abandonment, optimal goal setting, and optimal reward scheduling. Extensive analysis reveals how present bias affects the condition under which task abandonment occurs and optimal intervention strategies. Our findings are meaningful for preventing task abandonment and intervening through incentives in the real world.
Policy gradient methods enjoy strong practical performance in numerous tasks in reinforcement learning. Their theoretical understanding in multiagent settings, however, remains limited, especially beyond two-player competitive and potential Markov games. In this paper, we develop a new framework to characterize optimistic policy gradient methods in multi-player Markov games with a single controller. Specifically, under the further assumption that the game exhibits an equilibrium collapse, in that the marginals of coarse correlated equilibria (CCE) induce Nash equilibria (NE), we show convergence to stationary epsilon-NE in O(1/epsilon^2) iterations, where O suppresses polynomial factors in the natural parameters of the game. Such an equilibrium collapse is well-known to manifest itself in two-player zero-sum Markov games, but also occurs even in a class of multi-player Markov games with separable interactions, as established by recent work. As a result, we bypass known complexity barriers for computing stationary NE when either of our assumptions fails. Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
We consider a social choice setting in which agents and alternatives are represented by points in a metric space, and the cost of an agent for an alternative is the distance between the corresponding points in the space. The goal is to choose a single alternative to (approximately) minimize the social cost (cost of all agents) or the maximum cost of any agent, when only limited information about the preferences of the agents is given. Previous work has shown that the best possible distortion one can hope to achieve is 3 when access to the ordinal preferences of the agents is given, even when the distances between alternatives in the metric space are known. We improve upon this bound of 3 by designing deterministic mechanisms that exploit a bit of cardinal information. We show that it is possible to achieve distortion 1+sqrt(2) by using the ordinal preferences of the agents, the distances between alternatives, and a threshold approval set per agent that contains all alternatives for whom her cost is within an appropriately chosen factor of her cost for her most-preferred alternative. We show that this bound is the best possible for any deterministic mechanism in general metric spaces, and also provide improved bounds for the fundamental case of a line metric.
In pursuit of participatory budgeting (PB) outcomes with broader fairness guarantees, we initiate the study of lotteries over discrete PB outcomes. As the projects have heterogeneous costs, the amount spent may not be equal ex ante and ex post. To address this, we develop a technique to bound the amount by which the ex-post spend differs from the ex-ante spend---the property is termed budget balanced up to one project (BB1). With respect to fairness, we take a best-of-both-worlds perspective, seeking outcomes that are both ex-ante and ex-post fair. Towards this goal, we initiate a study of ex-ante fairness properties in PB, including Individual Fair Share (IFS), Unanimous Fair Share (UFS) and their stronger variants, as well as Group Fair Share (GFS). We show several incompatibility results between these ex-ante fairness notions and existing ex-post concepts based on justified representation. One of our main contributions is a randomized algorithm which simultaneously satisfies ex-ante Strong UFS, ex-post full justified representation (FJR) and ex-post BB1 for PB with binary utilities.
Envy-freeness is one of the most important fairness concerns when allocating items. We study envy-free house allocation when agents have uncertain preferences over items and consider several well-studied preference uncertainty models. The central problem that we focus on is computing an allocation that has the highest probability of being envy-free. We show that each model leads to a distinct set of algorithmic and complexity results, including detailed results on (in-)approximability. En route, we consider two related problems of checking whether there exists an allocation that is possibly or necessarily envy-free. We give a complete picture of the computational complexity of these two problems for all the uncertainty models we consider.
We develop a model of content filtering as a game between the filter and the content consumer, where the latter incurs information costs for examining the content. Motivating examples include censoring misinformation, spam/phish filtering, and recommender systems acting on a stream of content. When the attacker is exogenous, we show that improving the filter’s quality is weakly Pareto improving, but has no impact on equilibrium payoffs until the filter becomes sufficiently accurate. Further, if the filter does not internalize the consumer’s information costs, its lack of commitment power may render it useless and lead to inefficient outcomes. When the attacker is also strategic, improvements in filter quality may decrease equilibrium payoffs.
Equitability (EQ) in fair division requires that items be allocated such that all agents value the bundle they receive equally. With indivisible items, an equitable allocation may not exist, and hence we instead consider a meaningful analog, EQx, that requires equitability up to any item. EQx allocations exist for monotone, additive valuations. However, if (1) the agents' valuations are not additive or (2) the set of indivisible items includes both goods and chores (positively and negatively valued items), then prior to the current work it was not known whether EQx allocations exist or not. We study both the existence and efficient computation of EQx allocations. (1) For monotone valuations (not necessarily additive), we show that EQx allocations always exist. Also, for the large class of weakly well-layered valuations, EQx allocations can be found in polynomial time. Further, we prove that approximately EQx allocations can be computed efficiently under general monotone valuations. (2) For non-monotone valuations, we show that an EQx allocation may not exist, even for two agents with additive valuations. Under some special cases, however, we show existence and efficient computability of EQx allocations. This includes the case of two agents with additive valuations where each item is either a good or a chore, and there are no mixed items.
Principal-agent problems arise when one party acts on behalf of another, leading to conflicts of interest. The economic literature has extensively studied principal-agent problems, and recent work has extended this to more complex scenarios such as Markov Decision Processes (MDPs). In this paper, we further explore this line of research by investigating how reward shaping under budget constraints can improve the principal's utility. We study a two-player Stackelberg game where the principal and the agent have different reward functions, and the agent chooses an MDP policy for both players. The principal offers an additional reward to the agent, and the agent picks their policy selfishly to maximize their reward, which is the sum of the original and the offered reward. Our results establish the NP-hardness of the problem and offer polynomial approximation algorithms for two classes of instances: Stochastic trees and deterministic decision processes with a finite horizon.
We address the problem of improving the worst-case efficiency of pure Nash equilibria (aka, the price of anarchy) in affine congestion games, through a novel use of signalling. We assume that, for each player in the game, a most preferred strategy is publicly signalled. This can be done either distributedly by the players themselves, or be the outcome of some centralized algorithm. We apply this signalling scheme to two well-studied scenarios: games with partially altruistic players and games with resource taxation. We show a significant improvement in the price of anarchy of these games, whenever the aggregate signalled strategy profile is a good approximation of the game social optimum.
We provide the first large-scale data collection of real-world approval-based committee elections. These elections have been conducted on the Polkadot blockchain as part of their Nominated Proof-of-Stake mechanism and contain around one thousand candidates and tens of thousands of (weighted) voters each. We conduct an in-depth study of application-relevant questions, including a quantitative and qualitative analysis of the outcomes returned by different voting rules. Besides considering proportionality measures that are standard in the multiwinner voting literature, we pay particular attention to less-studied measures of overrepresentation, as these are closely related to the security of the Polkadot network. We also analyze how different design decisions such as the committee size affect the examined measures.
When selecting committees based on preferences of voters, a variety of different criteria can be considered. Two natural objectives are maximizing the utilitarian welfare (the sum of voters' utilities) and coverage (the number of represented voters) of the selected committee. Previous work has studied the impact on utilitarian welfare and coverage when requiring the committee to satisfy minimal requirements such as justified representation or weak proportionality. In this paper, we consider the impact of imposing much more demanding proportionality axioms. We identify a class of voting rules that achieve strong guarantees on utilitarian welfare and coverage when combined with appropriate completions. This class is defined via a weakening of priceability and contains prominent rules such as the Method of Equal Shares. We show that committees selected by these rules (i) can be completed to achieve optimal coverage and (ii) can be completed to achieve an asymptotically optimal approximation to the utilitarian welfare if they additionally satisfy EJR+. Answering an open question of Elkind et al. (2022), we use the Greedy Justified Candidate Rule to obtain the best possible utilitarian guarantee subject to proportionality. We also consider completion methods suggested in the participatory budgeting literature and other objectives besides welfare and coverage.
Coalition formation is concerned with the question of how to partition a set of agents into disjoint coalitions according to their preferences. Deviating from most of the previous work, we consider an online variant of the problem, where agents arrive in sequence and whenever an agent arrives, they have to be assigned to a coalition immediately and irrevocably. The scarce existing literature on online coalition formation has focused on the objective of maximizing social welfare, a demanding requirement, even in the offline setting. Instead, we seek to achieve stable coalition structures in an online setting, and focus on stability concepts based on deviations by single agents. We present a comprehensive picture in additively separable hedonic games, leading to dichotomies, where positive results are obtained by deterministic algorithms and negative results even hold for randomized algorithms.
In approval-based committee (ABC) voting, the goal is to choose a subset of predefined size of the candidates based on the voters’ approval preferences over the candidates. While this problem has attracted significant attention in recent years, the incentives for voters to participate in an election for a given ABC voting rule have been neglected so far. This paper is thus the first to explicitly study this property, typically called participation, for ABC voting rules. In particular, we show that all ABC scoring rules even satisfy group participation, whereas most sequential rules severely fail participation. We furthermore explore several escape routes to the impossibility for sequential ABC voting rules: we prove for many sequential rules that (i) they satisfy participation on laminar profiles, (ii) voters who approve none of the elected candidates cannot benefit by abstaining, and (iii) it is NP-hard for a voter to decide whether she benefits from abstaining
Motivated by recent work in computational social choice, we extend the metric distortion framework to clustering problems. Given a set of n agents located in an underlying metric space, our goal is to partition them into k clusters, optimizing some social cost objective. The metric space is defined by a distance function d between the agent locations. Information about d is available only implicitly via n rankings, through which each agent ranks all other agents in terms of their distance from her. Still, even though no cardinal information (i.e., the exact distance values) is available, we would like to evaluate clustering algorithms in terms of social cost objectives that are defined using d. This is done using the notion of distortion, which measures how far from optimality a clustering can be, taking into account all underlying metrics that are consistent with the ordinal information available. Unfortunately, the most important clustering objectives (e.g., those used in the well-known k-median and k-center problems) do not admit algorithms with finite distortion. To sidestep this disappointing fact, we follow two alternative approaches: We first explore whether resource augmentation can be beneficial. We consider algorithms that use more than k clusters but compare their social cost to that of the optimal k-clusterings. We show that using exponentially (in terms of k) many clusters, we can get low (constant or logarithmic) distortion for the k-center and k-median objectives. Interestingly, such an exponential blowup is shown to be necessary. More importantly, we explore whether limited cardinal information can be used to obtain better results. Somewhat surprisingly, for k-median and k-center, we show that a number of queries that is polynomial in k and only logarithmic in n (i.e., only sublinear in the number of agents for the most relevant scenarios in practice) is enough to get constant distortion.
We study online learning and equilibrium computation in games with polyhedral decision sets, a property shared by normal-form games (NFGs) and extensive-form games (EFGs), when the learning agent is restricted to utilizing a best-response oracle. We show how to achieve constant regret in zero-sum games and O(T^0.25) regret in general-sum games while using only O(log t) best-response queries at a given iteration t, thus improving over the best prior result, which required O(T) queries per iteration. Moreover, our framework yields the first last-iterate convergence guarantees for self-play with best-response oracles in zero-sum games. This convergence occurs at a linear rate, though with a condition-number dependence. We go on to show a O(T^(-0.5)) best-iterate convergence rate without such a dependence. Our results build on linear-rate convergence results for variants of the Frank-Wolfe (FW) algorithm for strongly convex and smooth minimization problems over polyhedral domains. These FW results depend on a condition number of the polytope, known as facial distance. In order to enable application to settings such as EFGs, we show two broad new results: 1) the facial distance for polytopes in standard form is at least γ/k where γ is the minimum value of a nonzero coordinate of a vertex of the polytope and k≤n is the number of tight inequality constraints in the optimal face, and 2) the facial distance for polytopes of the form Ax=b, Cx≤d, x≥0 where x∈R^n, C≥0 is a nonzero integral matrix, and d≥0, is at least 1/(c√n), where c is the infinity norm of C. This yields the first such results for several problems such as sequence-form polytopes, flow polytopes, and matching polytopes.
We study the problem of fair sequential decision making given voter preferences. In each round, a decision rule must choose a decision from a set of alternatives where each voter reports which of these alternatives they approve. Instead of going with the most popular choice in each round, we aim for proportional representation, using axioms inspired by the multi-winner voting literature. The axioms require that every group of α% of the voters, if it agrees in every round (i.e., approves a common alternative), then those voters must approve at least α% of the decisions. A stronger version of the axioms requires that every group of α% of the voters that agrees in a β fraction of rounds must approve β⋅α% of the decisions. We show that three attractive voting rules satisfy axioms of this style. One of them (Sequential Phragmén) makes its decisions online, and the other two satisfy strengthened versions of the axioms but make decisions semi-online (Method of Equal Shares) or fully offline (Proportional Approval Voting). We present empirical results for these rules based on synthetic data and U.S. political elections. We also run experiments using the moral machine dataset about ethical dilemmas. We train preference models on user responses from different countries and let the models cast votes. We find that aggregating these votes using our rules leads to a more equal utility distribution across demographics than making decisions using a single global preference model.
Given a mapping from a set of players to the leaves of a complete binary tree (called a seeding), a knockout tournament is conducted as follows: every round, every two players with a common parent compete against each other, and the winner is promoted to the common parent; then, the leaves are deleted. When only one player remains, it is declared the winner. This is a popular competition format in sports, elections, and decision-making. Over the past decade, it has been studied intensively from both theoretical and practical points of view. Most frequently, the objective is to seed the tournament in a way that ``assists'' (or even guarantees) some particular player to win the competition. We introduce a new objective, which is very sensible from the perspective of the directors of the competition: maximize the profit or popularity of the tournament. Specifically, we associate a ``score'' with every possible match, and aim to seed the tournament to maximize the sum of the scores of the matches that take place. We focus on the case where we assume a total order on the players' strengths, and provide a wide spectrum of results on the computational complexity of the problem.
We study fair distribution of a collection of m indivisible goods among a group of n agents, using the widely recognized fairness principles of Maximin Share (MMS) and Any Price Share (APS). These principles have undergone thorough investigation within the context of additive valuations. We explore these notions for valuations that extend beyond additivity. First, we study approximate MMS under the separable (piecewise-linear) concave (SPLC) valuations, an important class generalizing additive, where the best known factor was 1/3-MMS. We show that 1/2-MMS allocation exists and can be computed in polynomial time, significantly improving the state-of-the-art. We note that SPLC valuations introduce an elevated level of intricacy in contrast to additive. For instance, the MMS value of an agent can be as high as her value for the entire set of items. We use a relax-and-round paradigm that goes through competitive equilibrium and LP relaxation. Our result extends to give (symmetric) 1/2-APS, a stronger guarantee than MMS. APS is a stronger notion that generalizes MMS by allowing agents with arbitrary entitlements. We study the approximation of APS under submodular valuation functions. We design and analyze a simple greedy algorithm using concave extensions of submodular functions. We prove that the algorithm gives a 1/3-APS allocation which matches the best-known factor. Concave extensions are hard to compute in polynomial time and are, therefore, generally not used in approximation algorithms. Our approach shows a way to utilize it within analysis (while bypassing its computation), and hence might be of independent interest.
In today's online advertising markets, a crucial requirement for an advertiser is to control her total expenditure within a time horizon under some budget. Among various budget control methods, throttling has emerged as a popular choice, managing an advertiser's total expenditure by selecting only a subset of auctions to participate in. This paper provides a theoretical panorama of a single advertiser's dynamic budget throttling process in repeated second-price auctions. We first establish a lower bound on the regret and an upper bound on the asymptotic competitive ratio for any throttling algorithm, respectively, when the advertiser's values are stochastic and adversarial. Regarding the algorithmic side, we propose the OGD-CB algorithm, which guarantees a near-optimal expected regret with stochastic values. On the other hand, when values are adversarial, we prove that this algorithm also reaches the upper bound on the asymptotic competitive ratio. We further compare throttling with pacing, another widely adopted budget control method, in repeated second-price auctions. In the stochastic case, we demonstrate that pacing is generally superior to throttling for the advertiser, supporting the well-known result that pacing is asymptotically optimal in this scenario. However, in the adversarial case, we give an exciting result indicating that throttling is also an asymptotically optimal dynamic bidding strategy. Our results bridge the gaps in theoretical research of throttling in repeated auctions and comprehensively reveal the ability of this popular budget-smoothing strategy.
Usually, to apply game-theoretic methods, we must specify utilities precisely, and we run the risk that the solutions we compute are not robust to errors in this specification. Ordinal games provide an attractive alternative: they require specifying only which outcomes are preferred to which other ones. Unfortunately, they provide little guidance for how to play unless there are pure Nash equilibria; evaluating mixed strategies appears to fundamentally require cardinal utilities. In this paper, we observe that we can in fact make good use of mixed strategies in ordinal games if we consider settings that allow for folk theorems. These allow us to find equilibria that are robust, in the sense that they remain equilibria no matter which cardinal utilities are the correct ones -- as long as they are consistent with the specified ordinal preferences. We analyze this concept and study the computational complexity of finding such equilibria in a range of settings.
Recent techniques based on Mean Field Games (MFGs) allow the scalable analysis of multi-player games with many similar, rational agents. However, standard MFGs remain limited to homogeneous players that weakly influence each other, and cannot model major players that strongly influence other players, severely limiting the class of problems that can be handled. We propose a novel discrete time version of major-minor MFGs (M3FGs), along with a learning algorithm based on fictitious play and partitioning the probability simplex. Importantly, M3FGs generalize MFGs with common noise and can handle not only random exogeneous environment states but also major players. A key challenge is that the mean field is stochastic and not deterministic as in standard MFGs. Our theoretical investigation verifies both the M3FG model and its algorithmic solution, showing firstly the well-posedness of the M3FG model starting from a finite game of interest, and secondly convergence and approximation guarantees of the fictitious play algorithm. Then, we empirically verify the obtained theoretical results, ablating some of the theoretical assumptions made, and show successful equilibrium learning in three example problems. Overall, we establish a learning framework for a novel and broad class of tractable games.
Dynamic mechanism design is a challenging extension to ordinary mechanism design in which the mechanism designer must make a sequence of decisions over time in the face of possibly untruthful reports of participating agents. Optimizing dynamic mechanisms for welfare is relatively well understood. However, there has been less work on optimizing for other goals (e.g., revenue), and without restrictive assumptions on valuations, it is remarkably challenging to characterize good mechanisms. Instead, we turn to automated mechanism design to find mechanisms with good performance in specific problem instances. We extend the class of affine maximizer mechanisms to MDPs where agents may untruthfully report their rewards. This extension results in a challenging bilevel optimization problem in which the upper problem involves choosing optimal mechanism parameters, and the lower problem involves solving the resulting MDP. Our approach can find truthful dynamic mechanisms that achieve strong performance on goals other than welfare, and can be applied to essentially any problem setting---without restrictions on valuations---for which RL can learn optimal policies.
Researchers building behavioral models, such as behavioral game theorists, use experimental data to evaluate predictive models of human behavior. However, there is little agreement about which loss function should be used in evaluations, with error rate, negative log-likelihood, cross-entropy, Brier score, and squared L2 error all being common choices. We attempt to offer a principled answer to the question of which loss functions should be used for this task, formalizing axioms that we argue loss functions should satisfy. We construct a family of loss functions, which we dub ``diagonal bounded Bregman divergences'', that satisfy all of these axioms. These rule out many loss functions used in practice, but notably include squared L2 error; we thus recommend its use for evaluating behavioral models.
We give a quantitative analysis of the independence of irrelevant alternatives (IIA) axiom. IIA says that the society's preference between x and y should depend only on individual preferences between x and y: we show that, in several contexts, if the individuals express their preferences about additional (``irrelevant'') alternatives, this information helps to estimate better which of x and y has higher social welfare. Our contribution is threefold: (1) we provide a new tool to measure the impact of IIA on social welfare (pairwise distortion), based on the well-established notion of voting distortion, (2) we study the average impact of IIA in both general and metric settings, with experiments on synthetic and real data and (3) we study the worst-case impact of IIA in the 1D-Euclidean metric space.
We study the computational complexity of fairly allocating a set of indivisible items under externalities. In this recently-proposed setting, in addition to the utility the agent gets from their bundle, they also receive utility from items allocated to other agents. We focus on the extended definitions of envy-freeness up to one item (EF1) and of envy-freeness up to any item (EFX), and we provide the landscape of their complexity for several different scenarios. We prove that it is NP-complete to decide whether there exists an EFX allocation, even when there are only three agents, or even when there are only six different values for the items. We complement these negative results by showing that when both the number of agents and the number of different values for items are bounded by a parameter the problem becomes fixed-parameter tractable. Furthermore, we prove that two-valued and binary-valued instances are equivalent and that EFX and EF1 allocations coincide for this class of instances. Finally, motivated from real-life scenarios, we focus on a class of structured valuation functions, which we term agent/item-correlated. We prove their equivalence to the "standard" setting without externalities. Therefore, all previous results for EF1 and EFX apply immediately for these valuations.