Computer Science and Game Theory

2026-05-11 | | Total: 13

#1 Nash without Numbers: A Social Choice Approach to Mixed Equilibria in Context-Ordinal Games [PDF] [Copy] [Kimi] [REL]

Authors: Ian Gemp, Crystal Qian, Marc Lanctot, Kate Larson

Nash equilibrium serves as a fundamental mathematical tool in economics and game theory. However, it classically assumes knowledge of player utilities, whereas economics generally regards preferences as more fundamental. To leverage equilibrium analysis in strategic scenarios, one must first elicit numerical utilities consistent with player preferences, a delicate and time-consuming process. In this work, we forgo precise utilities and generalize the Nash equilibrium to a setting where we only assume a player is capable of providing an ordinal ranking of their actions within the context of other players' joint actions. The key technical challenge is to rethink the definition of a best-response. While the classical definition identifies actions maximizing expected payoff, we naturally look towards social choice theory for how to aggregate preferences to identify the most preferred actions. We define this generalized notion of a context-ordinal Nash equilibrium, establish its existence under mild conditions on aggregation methods, introduce notions of regularization, approximation, and regret, explore complexity for simple settings, and develop learning rules for computing such equilibria. In doing so, we provide a generalization of Nash equilibrium and demonstrate its direct applicability to elicited preferences in human experiments.

Subjects: Computer Science and Game Theory , Multiagent Systems , General Economics

Publish: 2026-05-08 16:51:42 UTC


#2 Zero-determinant Strategy for Moving Target Defense: Existence, Performance, and Computation [PDF] [Copy] [Kimi] [REL]

Authors: Zhaoyang Cheng, Guanpu Chen, Yiguang Hong, Ming Cao, Mikael Skoglund

Moving Target Defense (MTD) is commonly formulated as a repeated security game to mitigate persistent threats. Although the strong Stackelberg equilibrium (SSE) characterizes the defender's optimal strategy in the leader-follower framework, computing the SSE often incurs high computational complexity, which significantly limits its practical deployment in MTD problems with multiple targets. This paper proposes adopting a zero-determinant (ZD) strategy for constructing an MTD strategy that achieves both high defensive performance and substantially low computational complexity. We first derive a necessary and sufficient condition for the existence of ZD strategies and investigate the performance of ZD strategies, which shows their upper-bound performance matches that of the SSE strategy. We then formulate two programs to find the optimal ZD strategy parameters under different conditions. Moreover, we design an algorithm to compute the proposed ZD strategies, along with the computational complexity analysis in comparison with the traditional SSE computation. Finally, we conduct experiments on two practical applications to verify our results.

Subjects: Computer Science and Game Theory , Cryptography and Security

Publish: 2026-05-08 15:16:10 UTC


#3 Differentially Private Auditing Under Strategic Response [PDF1] [Copy] [Kimi1] [REL]

Author: Florian A. D. Burnat

Regulatory audits of AI systems increasingly rely on differential privacy (DP) to protect training data and model internals. We study audit design when the audited developer can strategically respond to the privacy-constrained audit interface. We formalize privacy-constrained auditing as a bilevel Stackelberg game, in which an auditor commits to a query policy and DP budget allocation across harm dimensions, and a strategic developer reallocates mitigation efforts in response. We introduce the welfare-weighted under-detection gap $B_w$, the welfare-weighted true residual harm the audit fails to detect at the developer's strategic best response, and prove that naive DP auditing (uniform or harm-proportional allocation) induces a strictly larger $B_w$ than any non-strategic mitigation baseline whenever effective detectability is heterogeneous, the welfare weights are not comonotone with detectability, and the developer's optimum is interior. We characterize the optimal auditor allocation as a four-factor balance of welfare weight, audit miss-probability, detectability elasticity, and mitigation-cost curvature, and provide a single-level reformulation of the bilevel problem via the developer's KKT system. We propose Strategic Private Audit Design (SPAD), a projected-gradient algorithm with hypergradients computed through the developer's best response.

Subjects: Computer Science and Game Theory , Cryptography and Security , Machine Learning

Publish: 2026-05-08 12:44:03 UTC


#4 The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting [PDF] [Copy] [Kimi] [REL]

Authors: Lauri Lovén, Sasu Tarkoma

Eliciting truthful reports from autonomous agents is a core problem in scalable AI oversight: a principal scores the agent's report using a strictly proper scoring rule, but the agent also benefits from the report through a non-accuracy channel (approval for autonomous action, allocation share, downstream control). The same structure appears in classical mechanism-design settings such as marketplace operation. Our main result is an endogeneity: the principal's optimal oversight necessarily uses a non-affine approval function to screen types, yet any non-affine approval makes truthful reporting suboptimal under the combined objective whenever deviation is undetectable. The principal cannot avoid the perturbation that undermines calibration. This impossibility holds for all strictly proper scoring rules, with a closed-form perturbation formula. A constructive escape exists: a step-function approval threshold achieves first-best screening for every strictly proper scoring rule, because the agent's binary inflate-or-not choice creates a type-space threshold regardless of the generator's curvature. Under the Brier score specifically, the type-independent inflation cost yields a welfare equivalence between second-best and first-best; we prove this equivalence is unique to Brier (the welfare gap under smooth $C^1$ oversight is bounded below by $Ω(\text{Var}(1/G'') (γ/β)^2)$ for every non-Brier rule). Two instances develop the framework: AI agent oversight (the lead motivating setting) and marketplace operation (a parallel mechanism-design domain). The message for AI alignment is direct: smooth scoring-based oversight cannot elicit truthful reports from a strategic agent; sharp thresholds are the calibration-preserving design.

Subjects: Computer Science and Game Theory , Artificial Intelligence , Multiagent Systems , Theoretical Economics , Optimization and Control

Publish: 2026-05-08 12:42:28 UTC


#5 Quotient Semivalues for False-Name-Resistant Data Attribution [PDF] [Copy] [Kimi] [REL]

Authors: Florian A. D. Burnat, Brittany I. Davidson

Data valuation methods allocate payments and audit training data's contribution to machine-learning pipelines; however, they often assume passive contributors. In reality, contributors can split datasets across pseudonymous identities, duplicate high-value examples, create near-duplicates, or launder synthetic variants to inflate their share. We formalize this as false-name manipulation in ML data attribution. Our main construction is the quotient semivalue mechanism: compute Shapley-, Banzhaf-, or Beta-style values over evidence-backed attribution clusters instead of raw identities, using a canonical-representative operator to absorb within-cluster duplication. We prove an impossibility: on a fixed monotone data-value game, exact Shapley-fair attribution over reported identities is incompatible with unrestricted false-name-proofness, even on binary-valued instances, and characterize the split-gain of a general semivalue on a unanimity counter-example. The mechanism is exactly false-name-proof under two structural conditions: false-name-neutral within-cluster allocation and quotient-stable manipulations. Under imperfect provenance, when these conditions hold approximately, manipulation gain and fairness loss are bounded by three measurable quantities: escaped-cluster mass, value-estimation error, and clustering distance. We instantiate the mechanisms in DataMarket-Gym, a benchmark for attribution under strategic provider attacks. On synthetic classification tasks, quotient semivalues with example-level evidence reduce manipulation gain on duplicate and near-duplicate Sybil attacks from $1.74$ under baseline Shapley to $0.96$, near the honest level. The cosine-threshold and (false-merge, false-split) rate sweeps trace the corresponding fairness--Sybil frontier.

Subjects: Computer Science and Game Theory , Cryptography and Security , Machine Learning

Publish: 2026-05-08 12:34:20 UTC


#6 Incentivizing User Data Contributions for LLM Improvement under Withdrawal Rights [PDF1] [Copy] [Kimi1] [REL]

Authors: Di Feng, Chenhao Zhang, Zhanzhan Zhao

The continued improvement of large language models (LLMs) increasingly depends on eliciting high-quality, user-generated data, yet such data are costly to provide and often withheld due to privacy and effort concerns. This creates a fundamental design challenge: how to incentivize data contribution when model improvements require coordinated, threshold-level inputs, while contributions remain privately costly and partially reversible. We develop and theoretically analyze incentive mechanisms for user data contribution that explicitly account for threshold effects and reversibility, focusing on how subsidies and withdrawal rights can be jointly designed to overcome coordination failure. As a natural benchmark, we first consider subsidy-based incentives, under which users respond to posted payments with privately optimal floor contributions. These decentralized responses may fall below the improvement threshold, resulting in subsidy expenditure without model improvements. We then analyze mechanisms with withdrawal rights, in which users report costs, the provider centrally assigns contribution burdens, and users may withdraw before training. We prove that combining cost reporting with personalized assignment can eliminate inefficient provision by ensuring that data are collected only when improvement is sustainable, converting infeasible instances into a null outcome rather than subsidy leakage. Finally, we compare two withdrawal protocols. The simultaneous protocol can achieve lower total cost, while the small-first sequential protocol better incentivizes participation, encouraging greater data provision and thereby increasing the probability of crossing the improvement threshold.

Subject: Computer Science and Game Theory

Publish: 2026-05-08 08:14:29 UTC


#7 Game-Theoretic Analysis of Transaction Selection in DAG-Based Distributed Ledgers [PDF1] [Copy] [Kimi1] [REL]

Authors: Sebastian Müller, Alexandre Reiffers-Masson

Transaction selection in parallel or DAG-based distributed ledger technologies (DLTs) is a crucial challenge that directly impacts throughput, fairness, and validator incentives. In these systems, validators independently choose transactions to include in their blocks, often relying on naive heuristics like uniform or proportional selection. This can lead to inefficient outcomes when validators prioritize their own rewards without considering collective impacts. We analyze two fee allocation mechanisms used in practice: Random Fee Allocation (RFA), where transaction fees are randomly assigned to one validator, and Collaborative Fee Sharing (CFS), where fees are distributed equally among all validators. Using a single-shot game-theoretic framework, we derive symmetric Nash equilibria (NE) for selecting transactions for both mechanisms and propose an optimization-based method to compute these equilibria. Numerical simulations demonstrate that the NE of CFS consistently achieves higher throughput and rewards compared to the NE of RFA, particularly under skewed fee distributions. Additionally, we compare these equilibrium strategies to naive benchmarks (uniform and proportional selection), showing that the proportional strategy outperforms the NE of RSA in many situations. These findings may provide actionable insights into the design of transaction selection and incentive mechanisms, enabling more robust and high-performance DAG-based DLTs.

Subjects: Computer Science and Game Theory , Cryptography and Security

Publish: 2026-05-08 07:42:42 UTC


#8 Incentive Design in Competitive Resource Allocation: Exploiting Valuation Asymmetry in Tullock Contests [PDF1] [Copy] [Kimi1] [REL]

Authors: Gilberto Diaz-Garcia, Keith Paarporn, Jason R. Marden

In competitive resource allocation, a central coordinator may seek to gain an advantage not by directly controlling subordinate agents, but by strategically manipulating the information they receive. We study this problem within the framework of multi-player Tullock contests, where the coordinator influences subordinate players by designing their reported valuations of the contested prize, a mechanism that preserves the Tullock structure of the subordinates' objectives and thereby enables tractable equilibrium analysis. We first characterize the Nash equilibrium of the general multi-player Tullock contest, establishing how valuations and per-unit costs jointly determine equilibrium bids and payoffs. We then derive the optimal reported valuations for a coordinator managing two subordinates against a single opponent, and show that the structure of the optimal solution extends to contests with an arbitrary number of subordinates, reducing the coordinator's optimization to a two-variable problem regardless of system size.

Subject: Computer Science and Game Theory

Publish: 2026-05-07 23:47:09 UTC


#9 In-Context Credit Assignment via the Core [PDF] [Copy] [Kimi] [REL]

Authors: Keegan Harris, Siddharth Prasad, Asher Trockman

We propose incentive-aligned mechanisms for in-context credit assignment: the task of assigning credit for AI-generated content (e.g. code, news articles, short-form videos) among creators whose intellectual property appears in the context window. Our approach is based on the least core solution concept from cooperative game theory, which distributes value in a way that is as stable as possible by ensuring that no subset of creators is significantly under-compensated relative to the value they could generate on their own. We develop algorithms for approximating the least core, which leverage novel routines for constraint seeding and constraint separation. On a web retrieval credit assignment task, we find that our approaches are capable of approximating the least core using orders of magnitude fewer LLM calls compared to alternative methods.

Subjects: Computer Science and Game Theory , Artificial Intelligence , Machine Learning

Publish: 2026-05-07 20:30:44 UTC


#10 A Simple Method for School Choice Lotteries [PDF] [Copy] [Kimi] [REL]

Author: Yasunori Okumura

This note proposes a simple polynomial-time method for constructing an ex ante stable school-choice lottery satisfying equal treatment of equals. The method applies the ETE reassignment to a constrained efficient stable matching and yields a lottery that is not ordinally dominated by any other ex ante stable lottery.

Subjects: Computer Science and Game Theory , Theoretical Economics

Publish: 2026-05-07 07:06:05 UTC


#11 Pricing, Matching, and Bundling: an Equilibrium Analysis of Online Platforms [PDF] [Copy] [Kimi] [REL]

Author: Gary Qiurui Ma

Modern online platforms such as marketplaces, ride-hailing services, and food-delivery systems serve a dual role: they are both markets where participants interact and transact, and operators that design and govern how these markets function. These platforms connect multiple sides, for example buyers, sellers, and couriers, facilitating access that would otherwise be difficult to achieve. By setting the rules of the market, platforms determine who participates, how interactions take place, and how value is created and distributed. In response to these rules, participants may behave strategically, deciding whether to join the platform and which transactions to pursue. This thesis studies how platform design affects market outcomes through three key levers: pricing that determines participants' gains when operating on a platform; matching that governs which interactions are feasible among participants; and bundling that shapes the structure of supply when the platform itself acts as a market participant. Across these levers, the goal in this thesis is to understand how platforms can be designed to balance platform profitability with overall market welfare. The first part of this thesis studies pricing, including both the commission fees that participants pay to a platform and the prices associated with each transaction. The second part of this thesis studies matching. By shaping recommendation systems and consumer search, platforms influence which transactions take place. The third part of this thesis analyzes bundling. As a marketplace operator, a platform may be able to source products from sellers and offer them as bundled packages to buyers. Collectively, this thesis shows how pricing, matching, and bundling serve as complementary design levers through which platforms can shape market outcomes.

Subject: Computer Science and Game Theory

Publish: 2026-05-06 22:12:46 UTC


#12 The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents [PDF] [Copy] [Kimi3] [REL]

Authors: Jiayuan Liu, Tianqin Li, Shiyi Du, Xin Luo, Haoxuan Zeng, Emanuel Tewolde, Tai Sing Lee, Tonghan Wang, Carl Kingsford, Vincent Conitzer

Context window expansion is often treated as a straightforward capability upgrade for LLMs, but we find it systematically fails in multi-agent social dilemmas. Across 7 LLMs and 4 games over 500 rounds, expanding accessible history degrades cooperation in 18 of 28 model--game settings, a pattern we term the memory curse. We isolate the underlying mechanism through three analyses. First, lexical analysis of 378,000 reasoning traces associates this breakdown with eroding forward-looking intent rather than rising paranoia. We validate this using targeted fine-tuning as a cognitive probe: a LoRA adapter trained exclusively on forward-looking traces mitigates the decay and transfers zero-shot to distinct games. Second, memory sanitization holds prompt length fixed while replacing visible history with synthetic cooperative records, which restores cooperation substantially, proving the trigger is memory content, not length alone. Finally, ablating explicit Chain-of-Thought reasoning often reduces the collapse, showing that deliberation paradoxically amplifies the memory curse. Together, these results recast memory as an active determinant of multi-agent behavior: longer recall can either destabilize or support cooperation depending on the reasoning patterns it elicits.

Subjects: Computation and Language , Artificial Intelligence , Computer Science and Game Theory , Multiagent Systems

Publish: 2026-05-08 17:47:13 UTC


#13 Response Time Enhances Alignment with Heterogeneous Preferences [PDF] [Copy] [Kimi] [REL]

Authors: Federico Echenique, Alireza Fallah, Baihe Huang, Michael I. Jordan

Aligning large language models (LLMs) to human preferences typically relies on aggregating pooled feedback into a single reward model. However, this standard approach assumes that all labelers share the same underlying preferences, ignoring the fact that real-world labelers are highly heterogeneous and usually anonymous. Consequently, relying solely on binary choice data fundamentally distorts the learned policy, making the true population-average preference unidentifiable. To overcome this critical limitation, we demonstrate that augmenting preference datasets with a simple, secondary signal -- the user's response time -- can restore the identifiability of the population's average preference. By modeling each decision as a Drift-Diffusion Model (DDM), we introduce a novel, consistent estimator of heterogeneous preferences that successfully corrects the distortions of standard choice-only labels. We prove that our estimator asymptotically converges to the true average preference even in extreme cases where each anonymous labeler contributes only a single choice. Empirically, across both synthetic and real-world datasets, our method consistently outperforms standard baselines that otherwise fail and plateau at a bias floor. Because response times are essentially free to record and require zero user tracking or identification, our results bring promises and open up new opportunities for future data-collection pipelines to improve the social benefit without requiring user-level identifiers or repeated elicitations.

Subjects: Machine Learning , Computer Science and Game Theory , Theoretical Economics , Machine Learning

Publish: 2026-05-07 22:05:23 UTC