2026-02-24 | Total: 24
We study generative modeling for time series using entropic optimal transport and the Schrödinger bridge (SB) framework, with a focus on applications in finance and energy modeling. Extending the diffusion-based approach of Hamdouche, Henry-Labordère, Pham, 2023, we introduce a jump-diffusion Schrödinger bridge model that allows for discontinuities in the generative dynamics. Starting from a Schrödinger bridge entropy minimization problem, we reformulate the task as a stochastic control problem whose solution characterizes the optimal controlled jump-diffusion process. When sampled on a fixed time grid, this process generates synthetic time series matching the joint distributions of the observed data. The model is fully data-driven, as both the drift and the jump intensity are learned directly from the data. We propose practical algorithms for training, sampling, and hyperparameter calibration. Numerical experiments on simulated and real datasets, including financial and energy time series, show that incorporating jumps substantially improves the realism of the generated data, in particular by capturing abrupt movements, heavy tails, and regime changes that diffusion-only models fail to reproduce. Comparisons with state-of-the-art generative models highlight the benefits and limitations of the proposed approach.
This paper describes a discrete-time model of regularly-issued sovereign debt dynamics under a deficit-driven nominal debt growth regime that explicitly accounts for granular maturity. New issuance follows fixed allocations across a finite maturity ladder, and the government budget constraint determines total borrowing endogenously. In the deterministic baseline, we identify a sustainability condition for convergence to a steady-state and derive closed-form steady portfolio shares, as well as key metrics for steady cost and risk (proxied as one-period rollover ratio). Extending the model to a stochastic recurrence equation (SRE) driven by interest rates and (normalized) deficits that are stationary and mean-reverting, and using a future-cashflow state representation of debt, we identify an analogous condition for ergodic convergence to a unique invariant distribution. This implies that metrics calculated by Monte Carlo debt simulations driven by factors with these properties will recover the ergodic means of the underlying system, independently of initial conditions, provided the simulation horizon is sufficiently long. Analytical formulae for expectations of certain key metrics under this invariant distribution are derived, and agreement with simulation is observed. We find that the introduction of stochastic interest-rate/deficit correlation into the framework leads to intuitive correction terms to their deterministic-baseline counterparts.
Corporate insiders trade for diverse reasons, often possessing Material Non-Public Information (MNPI). Determining whether specific trades leverage MNPI is a significant challenge due to inherent complexity. This study focuses on two critical objectives: accurately detecting Unlawful Insider Trading (UIT) and identifying key features explaining classification. The analysis demonstrates how combining Shapley Values (SHAP) and Causal Forest (CF) reveals these explanatory drivers. The findings underscore the necessity of causality in identifying and interpreting UIT, requiring the consideration of alternative scenarios and potential outcomes. Within a high-dimensional feature space, the proposed architecture integrates state-of-the-art techniques to achieve high classification accuracy. The framework provides robust feature rankings via SHAP and causal significance assessments through CF, facilitating the discovery of unique causal relationships. Statistically significant relationships are documented between the outcome and several key features, including director status, price-to-book ratio, return, and market beta. These features significantly influence the likelihood of UIT, suggesting potential links between insider behavior and factors such as information asymmetry, valuation risk, market volatility, and stock performance. The analysis draws attention to the complexities of financial causality, noting that while initial descriptors offer intuitive insights, deeper examination is required to understand nuanced impacts. These findings reaffirm the architectural flexibility of decision tree models. By incorporating heterogeneity during tree construction, these models effectively uncover latent structures within trade, finance, and governance data, characterizing fraudulent behavior while maintaining reliable results.
This paper reformulates the Greenwood and Guner (2009) marriage and divorce model in continuous time using the HACT methods of Achdou et al. (2022). Replacing the AR(1) match quality process with an Ornstein-Uhlenbeck process yields a tridiagonal generator, reducing the computational complexity of both the value function and stationary distribution calculations from quadratic to linear in the number of grid points. The continuous-time model closely replicates the discrete-time equilibrium across all key outcomes, including the share of married households, the marriage rate, and the divorce rate, while achieving substantial gains in computation time and memory usage.
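To illustrate why an Ornstein-Uhlenbeck process gives a tridiagonal generator, here is a minimal sketch using a standard upwind finite-difference scheme. The abstract does not specify the paper's discretisation, so the grid bounds, parameter values, reflecting boundaries, and the function name `ou_generator` are all assumptions made for illustration.

```python
def ou_generator(n=101, z_min=-2.0, z_max=2.0, theta=0.5, mu=0.0, sigma=0.4):
    """Upwind finite-difference generator of dz = theta*(mu - z) dt + sigma dW.

    Returns the grid and the three diagonals (lower, diag, upper). Storage is
    3n numbers instead of the n*n entries of a dense transition matrix.
    All parameter values are illustrative assumptions.
    """
    dz = (z_max - z_min) / (n - 1)
    grid = [z_min + i * dz for i in range(n)]
    diffusion = sigma ** 2 / (2.0 * dz ** 2)
    lower, diag, upper = [0.0] * n, [0.0] * n, [0.0] * n
    for i, z in enumerate(grid):
        drift = theta * (mu - z)
        # Upwinding: positive drift feeds the upper neighbour, negative the lower.
        up = max(drift, 0.0) / dz + diffusion
        down = max(-drift, 0.0) / dz + diffusion
        # Reflecting boundaries: no mass leaves the grid.
        if i == 0:
            down = 0.0
        if i == n - 1:
            up = 0.0
        lower[i], upper[i] = down, up
        diag[i] = -(up + down)  # each row of a generator sums to zero
    return grid, lower, diag, upper


grid, lower, diag, upper = ou_generator()
```

Because each state connects only to its two neighbours, linear systems involving this matrix can be solved with a tridiagonal (Thomas) solver in time linear in the number of grid points.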
We study the conditions under which technological advances, in combination with a lognormal wage distribution, incentivize agents into an inefficient educational arms race. Our model emphasizes that lognormal wage distributions imply that agents' wages increase exponentially in the level of their skill as well as in the level of technology. In turn, this exponential relation between skills, technology, and wages pressures agents into an exhausting race for the tails of the economy's skill distribution. Moreover, technological advances and overinvestment in education increase GDP and inequality, while welfare may decline. In an alternative interpretation, our model studies firms that invest in the artificial intelligence of their chatbots and AI agents. For a wide range of specifications, firms, just like humans, have an incentive to choose corner solutions where investment is limited only by borrowing constraints.
VOLARE (VOLatility Archive for Realized Estimates - https://volare.unime.it) is an open research infrastructure providing standardized realized volatility and covariance measures constructed from ultra-high-frequency financial data. The platform processes tick-level observations across equities, exchange rates, and futures using an asset-specific pipeline that addresses heterogeneous trading calendars, microstructure noise, and timestamp precision. For equities, price series are cleaned using a documented outlier detection procedure and sampled at regular intervals. VOLARE delivers a comprehensive set of realized estimators, including realized variance, range-based measures, bipower variation, semivariances, realized quarticity, realized kernels, and multivariate covariance measures, ensuring methodological consistency and cross-asset comparability. In addition to bulk dataset download, the platform supports interactive visualization and real-time estimation of established volatility models such as HAR and MEM specifications.
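As a rough illustration of three of the estimators named above, the sketch below computes realized variance, bipower variation, and semivariances from a short price series using their textbook definitions. It is not VOLARE's pipeline: the cleaning, sampling, and any finite-sample scaling used by the platform are not described in the abstract, and the function names and prices are hypothetical.

```python
import math


def log_returns(prices):
    """Log returns between consecutive sampled prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def realized_variance(returns):
    """Sum of squared intraday returns."""
    return sum(r * r for r in returns)


def bipower_variation(returns):
    """(pi/2) * sum of products of adjacent absolute returns (jump-robust)."""
    return (math.pi / 2.0) * sum(abs(a) * abs(b) for a, b in zip(returns, returns[1:]))


def semivariances(returns):
    """Negative and positive realized semivariances."""
    neg = sum(r * r for r in returns if r < 0)
    pos = sum(r * r for r in returns if r > 0)
    return neg, pos


# Hypothetical prices sampled at regular intervals within one day.
prices = [100.0, 100.4, 100.1, 100.9, 100.7, 101.2]
r = log_returns(prices)
rv = realized_variance(r)
bv = bipower_variation(r)
rs_neg, rs_pos = semivariances(r)
```

By construction the two semivariances add up to the realized variance, which gives a quick sanity check on any implementation.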
Two-sided platforms must recommend users to users, where matches (termed "dates" in this paper) require mutual interest and activity on both sides. Naive ranking by predicted dating probabilities concentrates exposure on a small subset of highly responsive users, generating congestion and overstating efficiency. We model recommendation as a many-to-many matching problem and design integrators that map predicted login, like, and reciprocation probabilities into recommendations under attention constraints. We introduce "effective dates", a congestion-adjusted metric that discounts matches involving overloaded receivers. We then propose "exposure-constrained deferred acceptance" (ECDA), which limits receiver exposure in terms of expected likes or dates rather than headcount. Using production-grade predictions from a large Japanese dating platform, we show in calibrated simulations that ECDA increases effective dates and receiver-side dating probability despite reducing total dates. A large-scale regional field experiment confirms these effects in practice, indicating that exposure control improves equity and early-stage matching efficiency without harming downstream engagement.
In this paper, we study how class imbalance, typical of low-default credit portfolios, affects the performance of logistic regression models. Using a simulation study with controlled data-generating mechanisms, we vary (i) the level of class imbalance and (ii) the strength of association between the predictors and the response. The results show that, for a given strength of association, achievable classification accuracy deteriorates markedly as the event rate decreases, and the optimal classification cut-off shifts with the level of imbalance. In contrast, the Gini coefficient is comparatively stable with respect to class imbalance once sample sizes are sufficiently large, even when classification accuracy is strongly affected. As a practical guideline, we summarise attainable classification performance as a function of the event rate and strength of association between the predictors and the response.
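One way to see why a rank-based measure can be less sensitive to the event rate than accuracy: assuming the Gini coefficient here is the usual credit-scoring quantity 2*AUC - 1 (an assumption, since the abstract does not define it), the sketch below shows that replicating the non-events, which lowers the event rate, leaves the Gini unchanged. The scores and labels are made up, and this exact invariance says nothing about the sampling variability studied in the paper.

```python
def auc(scores, labels):
    """Probability that a random event outscores a random non-event (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(scores, labels):
    """Gini coefficient (accuracy ratio), assumed here to equal 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0


# Made-up scores: events (label 1) tend to score higher than non-events.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
balanced_gini = gini(scores, labels)

# Lower the event rate from 50% to about 9% by repeating each non-event 10 times.
imb_scores = [0.9, 0.3] + [0.8, 0.2] * 10
imb_labels = [1, 1] + [0, 0] * 10
imbalanced_gini = gini(imb_scores, imb_labels)
```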
Market-order flow in financial markets exhibits long-range correlations, a widely known stylised fact. A popular explanation is the Lillo-Mike-Farmer (LMF) order-splitting theory. However, quantitative tests of this theory have historically relied on proprietary datasets with trader identifiers, limiting reproducibility and cross-market validation. We show that the LMF theory can be validated using publicly available Johannesburg Stock Exchange (JSE) data by leveraging recently developed methods for reconstructing synthetic metaorders. We demonstrate the validation using three years of Transaction and Quote (TAQ) data for the 100 largest stocks on the JSE, assuming that either N=50 or N=150 effective traders manage metaorders in the market.
This paper synthesises the existing research on the dynamics of innovation diffusion, with a focus on Bass-type models and their extensions. The theoretical foundation of innovation diffusion proposed by Rogers (1962) and the seminal work of Bass (1969) serve as a starting point for the analysis. We identify and examine various generalizations and stochastic extensions of the Bass model, including counting processes, diffusion processes, and uncertain processes, as well as parameter estimation techniques, from classical statistical techniques to more advanced techniques such as Bayesian filtering and metaheuristic optimisation. We finally explore alternative models of innovation diffusion, with a particular focus on agent-based models. This overview of the evolution of Bass-type models illustrates the progress made in innovation diffusion research over the past decades.
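For readers unfamiliar with the baseline that these extensions build on, the following sketch evaluates the classical deterministic Bass model, dF/dt = (p + qF)(1 - F), both in closed form and by Euler integration. The coefficient values p = 0.03 and q = 0.38 are illustrative assumptions, not estimates from the survey.

```python
import math


def bass_cdf(t, p, q):
    """Closed-form cumulative adoption fraction F(t) of the Bass model, F(0) = 0."""
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)


def bass_euler(t_end, p, q, dt=0.001):
    """Euler integration of dF/dt = (p + q*F) * (1 - F) as a cross-check."""
    f = 0.0
    for _ in range(int(round(t_end / dt))):
        f += dt * (p + q * f) * (1.0 - f)
    return f


def peak_time(p, q):
    """Time at which the adoption rate peaks (requires q > p)."""
    return math.log(q / p) / (p + q)


p, q = 0.03, 0.38  # illustrative innovation and imitation coefficients
t_star = peak_time(p, q)
f_at_peak = bass_cdf(t_star, p, q)   # equals (q - p) / (2q) analytically
f_closed = bass_cdf(10.0, p, q)
f_numeric = bass_euler(10.0, p, q)
```

The stochastic variants mentioned in the abstract replace this deterministic law with counting or diffusion processes whose mean behaviour follows a similar S-shaped curve.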
A growing literature documents how religious institutions shape behavior through social influence, but less is known about what happens when religious movements gain political power and use the tools of government to advance their agenda. We use a regression discontinuity design on close mayoral elections in Brazil to show that mayors from parties institutionally tied to Pentecostal denominations raise teenage fertility by 3 per 1,000 (a 40% increase). This effect appears for cohorts exposed to middle school during the administration. Consistent with a school-based mechanism, we find that the likelihood that municipal schools offer sexual education programs falls by 12.5 percentage points, with no changes in state schools outside mayoral control. We also find elevated STD rates and higher middle school dropout rates, while slightly older cohorts show no effects. Results are not explained by changes in contraceptive availability in public clinics, pointing to sexual education as the primary mechanism. We also find no effects from other right-wing parties, indicating the importance of institutional links to Pentecostal parties.
This paper investigates the extent of political rent seeking in Hungary in the 2010s. Political capitalism, in which powerful private interests influence public policy for private gain, creates opportunities for rent seeking that vary across sectors. The analysis is based on a theoretical model in which rent seeking occurs in three stages: changes in economic institutions grant regulatory privileges, which are enhanced by political-business networks; these privileges lead to scarcities and increased market power in certain markets; and this market power then generates rents. To quantify this, the study evaluates Hungarian political capitalism by examining the impact of political decisions on firms' rents, analysing the profit trends of the 1,000 largest Hungarian firms (selected annually by net sales) and comparing their mean profit share (earnings before tax) across two periods: 2008-2012 and 2019-2023. A significant increase in a sector's mean profit share was assumed to indicate increased rent seeking. Using Welch's two-sample t-tests, three sectors were identified as potentially experiencing increased rent seeking: agriculture, construction, and financial and insurance activities. Quantitative findings include a 320% increase in mean agricultural profit share (70% in mean ROA), a more than fivefold increase in construction mean profit share (mean ROA from 3.3% to 10.1%), and a more than 6.5-fold increase in financial sector mean profit share. Furthermore, a similar Czech analysis showed no significant increases in any sector's profit share, suggesting that the detected rises in Hungarian sectors are linked to domestic activities rather than external factors, which strengthens the findings.
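The period comparison described above rests on Welch's two-sample t-test, which does not assume equal variances. A minimal sketch of the statistic and the Welch-Satterthwaite degrees of freedom follows; the two samples are toy numbers, not data from the study, and the function name `welch_t` is hypothetical.

```python
import math


def welch_t(a, b):
    """Welch's t statistic for mean(b) - mean(a) and its approximate degrees of freedom."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # unbiased sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se2 = va / na + vb / nb                        # squared standard error
    t = (mb - ma) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Toy samples standing in for one sector's profit shares in two periods.
early = [1.0, 2.0, 3.0, 4.0, 5.0]
late = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(early, late)
```

A p-value would then come from a Student t distribution with `dof` degrees of freedom, for example through a statistics library.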
We propose a fourth-order compact finite-difference (HOC-FD) scheme for the transformed Bates partial integro-differential equation (PIDE). The method employs an implicit-explicit (IMEX) Crank-Nicolson framework for local terms and Simpson quadrature for the jump integral. Benchmarks against second-order finite differences (FD) and quadratic finite elements (FEM, p=2) confirm near-fourth-order spatial accuracy for HOC-FD, near-second-order for FEM, and second-order temporal convergence for all time integrators. Efficiency tests show that HOC-FD achieves similar accuracy at up to two orders of magnitude lower runtime than FEM, establishing it as a practical baseline for option pricing under stochastic volatility jump-diffusion models.
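To make the quadrature step concrete, here is a minimal sketch of composite Simpson integration applied to a jump integral of the form of the integral of u(x + y) f(y) dy, with f a normal density for log-jump sizes. The exact integrand of the transformed PIDE is not given in the abstract, so the density parameters, truncation width, and function names below are assumptions.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def normal_pdf(y, mean, std):
    """Density of a normal distribution, used here for log-jump sizes."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def jump_integral(u, x, mean, std, width=8.0, n=200):
    """Approximate the integral of u(x + y) * pdf(y) over a truncated range."""
    a, b = mean - width * std, mean + width * std
    return simpson(lambda y: u(x + y) * normal_pdf(y, mean, std), a, b, n)


# Sanity checks with hypothetical jump parameters (mean -0.1, std 0.2).
mass = jump_integral(lambda s: 1.0, 0.0, -0.1, 0.2)     # density integrates to about 1
shifted = jump_integral(lambda s: s, 1.0, -0.1, 0.2)     # about x + mean = 0.9
cubic = simpson(lambda s: s ** 3, 0.0, 1.0, 2)           # Simpson is exact for cubics
```

Simpson's rule is fourth-order accurate for smooth integrands, which is one plausible reason to pair it with a fourth-order spatial scheme.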
We quantify the Tariff Laffer Curve for the U.S. using a multi-sector Ricardian model calibrated to the 2025 U.S. trade war. We find revenue-maximizing tariffs of 20-30 percent and welfare-maximizing rates of 0-10 percent. We define the Marginal Fiscal Efficiency Index to partition tariffs into welfare-improving, trade-off, and revenue-decreasing regions. Expanding the trade war to more partners raises peak revenue even under retaliation, whereas coordinated retaliation sharply erodes welfare. By January 2026, 20 percent of U.S. tariffs exceed their Laffer peaks. Inverse-optimum estimation reveals diminished U.S. concern for foreign welfare, punitive treatment of China, and rising revenue motives.
This paper investigates whether short-term market overreactions can be systematically predicted and monetized as momentum signals using high-frequency emotional information and modern machine learning methods. Focusing on Apple Inc. (AAPL), we construct a comprehensive intraday dataset that combines volatility-normalized returns with transformer-based emotion features extracted from Twitter messages. Overreactions are defined as extreme return realizations relative to contemporaneous volatility and transaction costs and are modeled as a three-class prediction problem. We evaluate the performance of several nonlinear classifiers, including XGBoost, Random Forests, Deep Neural Networks, and Bidirectional LSTMs, across multiple intraday frequencies (1-, 5-, 10-, and 15-minute data). Model outputs are translated into trading strategies and assessed using risk-adjusted performance measures and formal statistical tests. The results show that machine learning models significantly outperform benchmark overreaction rules at ultra-short horizons, while classical behavioral momentum effects dominate at intermediate frequencies, particularly around 10 minutes. Explainability analysis based on SHAP reveals that volatility and negative emotions, especially fear and sadness, play a central role in driving predicted overreactions. Overall, the findings demonstrate that emotion-driven overreactions contain a predictable structure that can be exploited by machine learning models, offering new insights into the behavioral origins of intraday momentum and the interaction between sentiment, volatility, and algorithmic trading.
Post-hoc explainability is central to credit risk model governance, yet widely used tools such as coefficient-based attributions and SHapley Additive exPlanations (SHAP) often produce numerical outputs that are difficult to communicate to non-technical stakeholders. This paper investigates whether large language models (LLMs) can serve as post-hoc explainability tools for credit risk predictions through in-context learning, focusing on two roles: translators and autonomous explainers. Using a personal lending dataset from LendingClub, we evaluate three commercial LLMs: GPT-4-turbo, Claude Sonnet 4, and Gemini-2.0-Flash. Results provide strong evidence for the translator role. In contrast, autonomous explanations show low alignment with model-based attributions. Few-shot prompting improves feature overlap for logistic regression but does not consistently benefit XGBoost, suggesting that LLMs have limited capacity to recover non-linear, interaction-driven reasoning from prompt cues alone. Our findings position LLMs as effective narrative interfaces grounded in auditable model attributions, rather than as substitutes for post-hoc explainers in credit risk model governance. Practitioners should leverage LLMs to bridge the communication gap between complex model outputs and regulatory or business stakeholders, while preserving the rigor and traceability required by credit risk governance frameworks.
This paper investigates systemic risk transmission across stablecoin markets using Quantile Vector Autoregression (QVAR). Analyzing eight major stablecoins with daily data from 2021 to 2025, supplemented by minute-level event studies on three additional coins that experienced major depegs up to 2025, we document three findings. First, the stabilization mechanism dictates tail-risk behavior: fiat-backed stablecoins function as "stability anchors" with near-zero net spillovers across quantiles, while algorithmic and crypto-collateralized designs become risk amplifiers specifically under extreme market conditions. Second, the theoretical risk isolation between fiat and crypto markets breaks down during stress: direct volatility channels emerge between the US Dollar Index and Bitcoin that bypass stablecoin intermediation. Third, Forbes-Rigobon contagion tests across four depeg events show heterogeneous transmission: after adjusting for volatility, algorithmic stablecoins exhibit significant residual contagion while fiat-backed coins show flight-to-quality effects. These findings imply that uniform stablecoin regulation is inappropriate; regulatory capital buffers for extreme losses should be 2-3x higher for non-fiat-backed stablecoins than median-based measures indicate.
Whether and how race is used in selective admissions remains a central question in higher education and civil rights law. In Students for Fair Admissions v. Harvard (2023), the Supreme Court held that race-based affirmative action in college admissions violates the Equal Protection Clause, purportedly ending the practice. This report examines admissions at a public medical school in the pre-SFFA period. Using applicant-level data on over 11,000 applications to Texas Tech University Health Sciences Center Medical School for the 2021 and 2022 cycles, I relate admission decisions to academic merit (MCAT, GPA, science GPA), race, gender, and situational judgment (Casper) scores. Summary statistics, academic-index decompositions, and logistic regression models provide strong evidence of racial preferences: African American and Hispanic applicants are preferred relative to academically similar White and Asian applicants. Counterfactual and preference-removal analyses quantify the magnitude of these disparities. The findings document the kind of race-based preferences that SFFA was meant to address and establish a baseline for assessing whether admissions practice changed after the decision.
The rapid advancement of Large Language Models (LLMs) has led to a surge of financial benchmarks, evolving from static knowledge tests to interactive trading simulations. However, current evaluations of real-time trading performance overlook a critical failure mode: severe behavioral instability in sequential decision-making under uncertainty. We empirically show that LLM-based trading agents exhibit extreme run-to-run variance, inconsistent action sequences even under deterministic decoding, and irrational action flipping across adjacent time steps. These issues stem from stateless autoregressive architectures lacking persistent action memory, as well as sensitivity to continuous-to-discrete action mappings in portfolio allocation. As a result, many existing financial trading benchmarks produce unreliable, non-reproducible, and uninformative evaluations. To address these limitations, we propose AlphaForgeBench, a principled framework that reframes LLMs as quantitative researchers rather than execution agents. Instead of emitting trading actions, LLMs generate executable alpha factors and factor-based strategies grounded in financial reasoning. This design decouples reasoning from execution, enabling fully deterministic and reproducible evaluation while aligning with real-world quantitative research workflows. Experiments across multiple state-of-the-art LLMs show that AlphaForgeBench eliminates execution-induced instability and provides a rigorous benchmark for assessing financial reasoning, strategy formulation, and alpha discovery.
Why do some community-cooperation projects catalyse participation through durable, resilient collaboration networks while others result in negligible impact and leave the local social fabric unchanged? We argue outcomes hinge on participation architecture: simple, visible routines -- onboarding help, templated tasks, lightweight contribution/benefit tracking -- that create easy "entry portals" and route work across clusters without heavy hierarchy. We introduce Project Intervention Response Analysis (PIRA), a mixed anthropological-network-analysis framework that compares observed community networks with counterfactual networks that exclude project-induced ties. PIRA also adds a new egocentric metric to detect "architectural alters" -- latent facilitators and boundary spanners. We begin validating PIRA in a three-month field study in Pomerini, Tanzania, where NGOs coordinated citizens, associations, and specialists. Findings indicate that sociotechnical participation architectures -- not charismatic hubs -- underwrite durable coordination. PIRA offers a reusable method to link organizational design mechanisms to formal network signatures.
We introduce Leap+Verify, a framework that applies speculative execution -- predicting future model weights and validating predictions before acceptance -- to accelerate neural network training. Inspired by speculative decoding in language model inference and by the Automatically Scalable Computation (ASC) architecture for program execution, Leap+Verify decomposes training into three dynamically detected regimes (chaotic, transition, stable) using activation-space cosine similarity as a real-time Lyapunov proxy signal. Within each regime, analytic weight predictors (momentum, linear, quadratic extrapolation) attempt to forecast model parameters K training steps ahead; predictions are accepted only when validated against a held-out loss criterion. We evaluate Leap+Verify on GPT-2 124M and Qwen 2.5-1.5B trained on WikiText-103 across five random seeds, sweeping prediction depth K in {5, 10, 25, 50, 75, 100}. Momentum-based prediction (Adam moment extrapolation) fails catastrophically at both scales, with predicted losses exceeding actuals by 100-10,000x -- a universal norm explosion in optimizer-state extrapolation. Finite-difference predictors (linear, quadratic) succeed where momentum fails: at 124M, they achieve 24% strict acceptance at K=5 in stable regimes; at 1.5B, they achieve 37% strict acceptance in transition regimes. The scale-dependent finding is in regime distribution: GPT-2 124M spends 34% of training in stable regime, while Qwen 1.5B spends 64% in chaotic regime and reaches stable in only 0-2 of 40 checkpoints. Larger models are more predictable when predictable, but less often predictable -- the practical bottleneck shifts from predictor accuracy to regime availability. Cross-seed results are highly consistent (less than 1% validation loss variance), and the three-regime framework produces identical phase boundaries (plus or minus 50 steps) across seeds.
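The finite-difference predictors and the accept-or-reject step can be sketched as follows. This is a toy illustration, not the authors' implementation: the abstract does not state the checkpoint spacing or the exact held-out criterion, so the sketch assumes equally spaced checkpoints one step apart and accepts a leap only if the held-out loss does not get worse. All names (`linear_predict`, `quadratic_predict`, `leap_or_fallback`) are hypothetical.

```python
def linear_predict(w_prev, w_now, k):
    """Extrapolate k steps ahead along the last observed weight difference."""
    return [b + k * (b - a) for a, b in zip(w_prev, w_now)]


def quadratic_predict(w_prev2, w_prev, w_now, k):
    """Newton backward-difference extrapolation through three equally spaced checkpoints."""
    out = []
    for a, b, c in zip(w_prev2, w_prev, w_now):
        d1 = c - b              # first backward difference
        d2 = c - 2.0 * b + a    # second backward difference
        out.append(c + k * d1 + 0.5 * k * (k + 1) * d2)
    return out


def leap_or_fallback(candidate, current, heldout_loss, tolerance=0.0):
    """Accept the predicted weights only if held-out loss does not get worse (assumed rule)."""
    if heldout_loss(candidate) <= heldout_loss(current) + tolerance:
        return candidate, True
    return current, False


def toy_loss(w):
    """Stand-in held-out loss with its minimum at w = 1 in every coordinate."""
    return sum((x - 1.0) ** 2 for x in w)


# Toy one-parameter trajectory moving toward the optimum at 1.0.
w0, w1, w2 = [0.0], [0.1], [0.2]
short_leap, short_ok = leap_or_fallback(linear_predict(w1, w2, 5), w2, toy_loss)
long_leap, long_ok = leap_or_fallback(linear_predict(w1, w2, 50), w2, toy_loss)
```

In this toy run the 5-step leap lands closer to the optimum and is accepted, while the 50-step leap overshoots and is rejected, so training would continue from the current weights.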
Concentrated liquidity provision in decentralized exchanges presents a fundamental Impulse Control problem. Liquidity Providers (LPs) face a non-trivial trade-off between maximizing fee accrual through tight price-range concentration and minimizing the friction costs of rebalancing, including gas fees and swap slippage. Existing methods typically employ heuristic or threshold strategies that fail to account for market dynamics. This paper formulates liquidity management as an optimal control problem and derives the corresponding Hamilton-Jacobi-Bellman quasi-variational inequality (HJB-QVI). We present RAmmStein, an approximate solution based on Deep Reinforcement Learning that takes the mean-reversion speed (theta) of an Ornstein-Uhlenbeck process, among other features, as input to the model. We demonstrate that the agent learns to separate the state space into regions of action and inaction. We evaluate the framework using high-frequency 1Hz Coinbase trade data comprising over 6.8M trades. Experimental results show that RAmmStein achieves a superior net ROI of 0.72% compared to both passive and aggressive strategies. Notably, the agent reduces rebalancing frequency by 67% compared to a greedy rebalancing strategy while maintaining 88% active time. Our results demonstrate that regime-aware laziness can significantly improve capital efficiency by preserving the returns that would otherwise be eroded by operational costs.
This paper documents the work of the Clean Air Task Force (CATF) International Working Group (IWG) on Fusion Cost Analysis in 2024-2025, and the methodological extensions implemented in the CATF-supported branch of the pyFECONs fusion power-plant costing framework. Using the standards-aligned chart-of-accounts and physics-to-economics workflow established by ARPA-E, the IWG development reorganizes and deepens the framework around three architecture-defining cost-driver tracks for Magnetic Fusion Energy (MFE), Inertial Fusion Energy (IFE), and Magneto-Inertial Fusion Energy (MIFE). In particular, the generic driver placeholder in Account 22.1.3 is treated as a controlled swap-point and replaced by a full cost-account development for the dominant driver in each class, enabling auditable traceability from requirements and geometry to rolled-up plant costs. On top of this driver-centric foundation, we introduce a probabilistic costing layer that compounds materials price uncertainty, TRL-based maturity uncertainty, and learning-curve uncertainty into cost distributions. We then describe safety-informed costing that enumerates fusion-relevant hazards and maps mitigating systems, structures, and provisions into standardized accounts, together with scenario-parameterized regulatory and financial adders. Finally, we document expanded macroeconomic and finance parameterization and a value-metrics module that complements LCOE with investment and planning measures (NPV, IRR, MIRR, revenue requirements, WACC-based annualization, and residual and follow-on value), all computed from the same COA-mapped outputs. Collectively, these additions convert a deterministic, standards-aligned costing backbone into an extensible analysis environment suitable for transparent sensitivity studies, uncertainty propagation, and safety- and finance-coupled interpretation of fusion pilot-plant and NOAK scenarios.
Reliable real estate price indicators are typically published at city level and low frequency, limiting their use for neighborhood-scale monitoring and long-horizon planning. We study whether sub-city price indices can be forecasted at weekly frequency by combining physical development signals from satellite radar with market narratives from news text. Using over 350,000 transactions from Dubai Land Department (2015-2025), we construct weekly price indices for 19 sub-city regions and evaluate forecasts from 2 to 34 weeks ahead. Our framework fuses regional transaction history with Sentinel-1 SAR backscatter, news sentiment combining lexical tone and semantic embeddings, and macroeconomic context. Results are strongly horizon dependent: at horizons up to 10 weeks, price history alone matches multimodal configurations, but beyond 14 weeks sentiment and SAR become critical. At long horizons (26-34 weeks), the full multimodal model reduces mean absolute error from 4.48 to 2.93 (35% reduction), with gains statistically significant across regions. Nonparametric learners consistently outperform deep architectures in this data regime. These findings establish benchmarks for weekly sub-city index forecasting and demonstrate that remote sensing and news sentiment materially improve predictability at strategically relevant horizons.