2025-04-04 | | Total: 4
The on-demand ride-hailing industry has experienced rapid growth, transforming transportation norms worldwide. Despite improvements in efficiency over traditional taxi services, significant challenges remain, including drivers' strategic repositioning behavior, customer abandonment, and inefficiencies in dispatch algorithms. To address these issues, we introduce a comprehensive mean field game model that systematically analyzes the dynamics of ride-hailing platforms by incorporating driver repositioning across multiple regions, customer abandonment behavior, and platform dispatch algorithms. Using this framework, we identify all possible mean field equilibria as the Karush-Kuhn-Tucker (KKT) points of an associated optimization problem. Our analysis reveals the emergence of multiple equilibria, including the inefficient "Wild Goose Chase" one, characterized by drivers pursuing distant requests, leading to suboptimal system performance. To mitigate these inefficiencies, we propose a novel two-matching-radius nearest-neighbor dispatch algorithm that eliminates undesirable equilibria and ensures a unique mean field equilibrium for multi-region systems. The algorithm dynamically adjusts matching radii based on driver supply rates, optimizing pick-up times and waiting times for drivers while maximizing request completion rates. Numerical experiments and simulation results show that our proposed algorithm reduces customer abandonment, minimizes waiting times for both customers and drivers, and improves overall platform efficiency.
We introduce a model of sequential social learning in which a planner may pay a cost to adjust the private signal precision of some agents. This framework presents a new optimization problem for social learning that sheds light on practical policy questions, such as how the socially optimal level of ad personalization changes according to current beliefs or how a biased planner might derail social learning. We then characterize the optimal policies of an altruistic planner who maximizes social welfare and a biased planner who seeks to induce a specific action. Even for a planner who has equivalent knowledge to an individual, cannot lie or cherry-pick information, and is fully observable, we demonstrate that it can dramatically influence social welfare in both positive and negative directions. An important area for future exploration is how one might prevent these latter outcomes to protect against the manipulation of social learning.
Additive feature explanations rely primarily on game-theoretic notions such as the Shapley value by viewing features as cooperating players. The Shapley value's popularity in and outside of explainable AI stems from its axiomatic uniqueness. However, its computational complexity severely limits practicability. Most works investigate the uniform approximation of all features' Shapley values, needlessly consuming samples for insignificant features. In contrast, identifying the k most important features can already be sufficiently insightful and yields the potential to leverage algorithmic opportunities connected to the field of multi-armed bandits. We propose Comparable Marginal Contributions Sampling (CMCS), a method for the top-k identification problem utilizing a new sampling scheme taking advantage of correlated observations. We conduct experiments to showcase the efficacy of our method in compared to competitive baselines. Our empirical findings reveal that estimation quality for the approximate-all problem does not necessarily transfer to top-k identification and vice versa.
In this paper, we study the online shortest path problem in directed acyclic graphs (DAGs) under bandit feedback against an adaptive adversary. Given a DAG G=(V,E) with a source node vs and a sink node vt, let X⊆{0,1}|E| denote the set of all paths from vs to vt. At each round t, we select a path xt∈X and receive bandit feedback on our loss ⟨xt,yt⟩∈[−1,1], where yt is an adversarially chosen loss vector. Our goal is to minimize regret with respect to the best path in hindsight over T rounds. We propose the first computationally efficient algorithm to achieve a near-minimax optimal regret bound of ˜O(√|E|Tlog|X|) with high probability against any adaptive adversary, where ˜O(⋅) hides logarithmic factors in the number of edges |E|. Our algorithm leverages a novel loss estimator and a centroid-based decomposition in a nontrivial manner to attain this regret bound. As an application, we show that our algorithm for DAGs provides state-of-the-art efficient algorithms for m-sets, extensive-form games, the Colonel Blotto game, shortest walks in directed graphs, hypercubes, and multi-task multi-armed bandits, achieving improved high-probability regret guarantees in all these settings.