2026-02-03 | Total: 12
This study examines the effects of Trump-era tariffs on financial market efficiency by applying multifractal detrended fluctuation analysis to the return and absolute return time series of six major financial assets: the S&P 500, SSEC, VIX, BTC/USD, EUR/USD, and Gold. Using the Hurst exponent $h(2)$ and multifractal strength, we assess how market dynamics responded to two major global shocks: the COVID-19 pandemic and the implementation of the Trump tariff policy in 2025. The results show that COVID-19 induced substantial changes in both the Hurst exponent and multifractal strength, particularly for the S&P 500, BTC/USD, EUR/USD, and Gold. In contrast, the effects of the Trump tariffs were more moderate but still observable across all examined time series. The Chinese market index (SSEC) remained largely unaffected by either event, apart from a distinct response to domestic stimulus measures. In addition, the VIX exhibited anti-persistent behavior with $h(2) < 0.5$, consistent with the rough volatility framework. These findings underscore the usefulness of multifractal analysis in capturing structural shifts in market efficiency under geopolitical and systemic shocks.
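As a rough illustration of how a generalized Hurst exponent such as $h(2)$ can be estimated, the sketch below implements a simplified detrended fluctuation analysis in plain Python. It is not the authors' code: it assumes linear detrending, non-overlapping segments taken from the start of the series only, and $q \neq 0$. The function names and the choice of scales are illustrative assumptions.

```python
import math
import random


def detrended_variance(segment):
    """Mean squared residual after removing a least-squares line from one segment."""
    n = len(segment)
    t_mean = (n - 1) / 2.0
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(segment)) / n


def fluctuation(series, scale, q=2.0):
    """q-th order fluctuation function F_q(s) at one scale s (assumes q != 0)."""
    mean = sum(series) / len(series)
    profile, running = [], 0.0
    for x in series:  # profile = cumulative sum of the mean-removed series
        running += x - mean
        profile.append(running)
    n_seg = len(profile) // scale
    variances = [detrended_variance(profile[v * scale:(v + 1) * scale])
                 for v in range(n_seg)]
    return (sum(v ** (q / 2.0) for v in variances) / n_seg) ** (1.0 / q)


def generalized_hurst(series, scales, q=2.0):
    """Least-squares slope of log F_q(s) against log s, an estimate of h(q)."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(fluctuation(series, s, q)) for s in scales]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))


# Synthetic check: uncorrelated noise should give h(2) near 0.5,
# while its cumulative sum (a random walk) should give a much larger value.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
random_walk, level = [], 0.0
for step in white_noise:
    level += step
    random_walk.append(level)

scales = [16, 32, 64, 128, 256]
h2_noise = generalized_hurst(white_noise, scales)
h2_walk = generalized_hurst(random_walk, scales)
```

Repeating the slope estimate over a range of $q$ values, and taking the spread of the resulting $h(q)$, is one common way to summarize multifractal strength; the exact definition used in the study is not stated in the abstract.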
Financial markets exhibit temporal organization that is not fully captured by volatility measures or linear correlation structure. We study a null-validated topological approach for quantifying market complexity and apply it to Bitcoin daily log returns. The analysis uses the $L^1$ norm of persistence landscapes computed from sliding-window delay embeddings. This quantity shows strong co-movement with stochastic volatility during periods of market stress, but remains intermittently elevated during low-volatility regimes, indicating dynamical structure beyond fluctuation scale. Rolling correlation analysis reveals that the dependence between geometry and volatility is not stationary. Surrogate-based null models provide statistical validation of these observations. Rejection of shuffle surrogates rules out explanations based on marginal distributions alone, while departures from phase-randomized surrogates indicate sensitivity to nonlinear and phase-dependent temporal organization beyond linear correlations. These results demonstrate that persistence landscape norms provide complementary information about market dynamics across market conditions.
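The two surrogate types and the delay embedding mentioned above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' pipeline: the persistence-landscape computation itself is omitted, a naive $O(n^2)$ discrete Fourier transform stands in for an FFT, and all function names are assumptions. A shuffle surrogate keeps the marginal distribution but destroys temporal order; a phase-randomized surrogate keeps the amplitude spectrum (and hence the linear autocorrelation) but randomizes the Fourier phases.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative series)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def shuffle_surrogate(x, rng):
    """Random permutation: same values, temporal order destroyed."""
    y = list(x)
    rng.shuffle(y)
    return y


def phase_randomized_surrogate(x, rng):
    """Keep each Fourier amplitude, draw new phases, keep the result real-valued."""
    n = len(x)
    spectrum = dft(x)
    randomized = list(spectrum)  # k = 0 (and the Nyquist term for even n) stay unchanged
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        randomized[k] = abs(spectrum[k]) * cmath.exp(1j * phase)
        randomized[n - k] = randomized[k].conjugate()  # Hermitian symmetry
    return [(sum(randomized[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def delay_embed(x, dim, tau):
    """Delay-coordinate vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    return [tuple(x[i + j * tau] for j in range(dim))
            for i in range(len(x) - (dim - 1) * tau)]


rng = random.Random(1)
returns = [rng.gauss(0.0, 1.0) for _ in range(64)]  # synthetic stand-in for log returns
shuffled = shuffle_surrogate(returns, rng)
phase_randomized = phase_randomized_surrogate(returns, rng)
cloud = delay_embed(returns, dim=3, tau=2)
```

In a surrogate test, the statistic of interest (here, the landscape norm) would be computed on many such surrogates and compared with its value on the original series.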
We study whether generative AI can automate feature discovery in U.S. equities. Using large language models with retrieval-augmented generation and structured/programmatic prompting, we synthesize economically motivated features from analyst, options, and price-volume data. These features are then used as inputs to a tabular machine-learning model to forecast short-horizon returns. Across multiple datasets, AI-generated features are consistently competitive with baselines, with Sharpe improvements ranging from 14% to 91% depending on dataset and configuration. Retrieval quality is pivotal: better knowledge bases materially improve outcomes. The AI-generated signals are weakly correlated with traditional features, supporting combination. Overall, generative AI can meaningfully augment feature discovery when retrieval quality is controlled, producing interpretable signals while reducing manual engineering effort.
Prediction markets offer a natural testbed for trading agents: contracts have binary payoffs, prices can be interpreted as probabilities, and realized performance depends critically on market microstructure, fees, and settlement risk. We introduce PredictionMarketBench, a SWE-bench-style benchmark for evaluating algorithmic and LLM-based trading agents on prediction markets via deterministic, event-driven replay of historical limit-order-book and trade data. PredictionMarketBench standardizes (i) episode construction from raw exchange streams (orderbooks, trades, lifecycle, settlement), (ii) an execution-realistic simulator with maker/taker semantics and fee modeling, and (iii) a tool-based agent interface that supports both classical strategies and tool-calling LLM agents with reproducible trajectories. We release four Kalshi-based episodes spanning cryptocurrency, weather, and sports. Baseline results show that naive trading agents can underperform due to transaction costs and settlement losses, while fee-aware algorithmic strategies remain competitive in volatile episodes.
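The point about transaction costs can be made concrete with a small sketch of the per-contract arithmetic for a binary contract that settles at 1 or 0. The flat per-contract `fee` is a hypothetical placeholder; it is not the exchange's actual fee schedule and not the benchmark's simulator.

```python
def expected_pnl_per_contract(prob_yes, price, fee):
    """Expected profit from buying one YES contract at `price` (in [0, 1]).

    The contract pays 1 if the event occurs and 0 otherwise;
    `fee` is a hypothetical flat cost per contract.
    """
    return prob_yes * 1.0 - price - fee


def realized_pnl(outcome_yes, price, fee, contracts=1):
    """Profit after settlement for a long YES position."""
    payoff = 1.0 if outcome_yes else 0.0
    return contracts * (payoff - price - fee)


# An agent believing the event has probability 0.58 and paying 0.55
# sees a 0.03 edge before fees; a 0.04 fee turns that edge negative.
edge_no_fee = expected_pnl_per_contract(0.58, 0.55, 0.0)
edge_with_fee = expected_pnl_per_contract(0.58, 0.55, 0.04)
loss_if_no = realized_pnl(False, 0.55, 0.04, contracts=10)
```

A fee-aware strategy would trade only when the expected value remains positive after the fee term, which is the kind of check a naive agent skips.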
This paper addresses stock price movement prediction by leveraging LLM-based news sentiment analysis. Earlier works have largely proposed and assessed sentiment analysis models and stock movement prediction methods separately. Although promising results have been achieved, a clear and in-depth understanding of the benefit of news sentiment for this task, as well as a comprehensive assessment of different architecture types in this context, is still lacking. Herein, we conduct an evaluation study that compares three LLMs, namely DeBERTa, RoBERTa, and FinBERT, for sentiment-driven stock prediction. Our results suggest that DeBERTa outperforms the other two models with an accuracy of 75%, and that an ensemble combining the three models can increase the accuracy to about 80%. We also find that news sentiment features can slightly benefit some stock market prediction models, i.e., LSTM-, PatchTST-, and tPatchGNN-based classifiers and PatchTST- and TimesNet-based regression models.
This study addresses the low-volatility Chinese Public Real Estate Investment Trusts (REITs) market, proposing a large language model (LLM)-driven trading framework based on multi-agent collaboration. The system constructs four types of analytical agents (announcement, event, price momentum, and market), each conducting analysis from a different dimension. The prediction agent then integrates these multi-source signals to output directional probability distributions across multiple time horizons, and the decision agent generates discrete position adjustment signals based on the prediction results and risk control constraints, thereby forming a closed loop of analysis, prediction, decision, and execution. This study further compares two pathways for the prediction agent: directly calling the general-purpose large model DeepSeek-R1 versus using a specialized small model, Qwen3-8B, fine-tuned via supervised fine-tuning and reinforcement learning alignment. In the backtest from October 2024 to October 2025, both agent-based strategies significantly outperformed the buy-and-hold benchmark in terms of cumulative return, Sharpe ratio, and maximum drawdown. The results indicate that the multi-agent framework can effectively enhance the risk-adjusted return of REITs trading, and the fine-tuned small model performs close to or even better than the general-purpose large model in some scenarios.
Overfitting remains a critical challenge in data-driven financial modeling, where machine learning (ML) systems learn spurious patterns in historical prices and fail out of sample and in deployment. This paper introduces the GT-Score, a composite objective function that integrates performance, statistical significance, consistency, and downside risk to guide optimization toward more robust trading strategies. This approach directly addresses critical pitfalls in quantitative strategy development, specifically data snooping during optimization and the unreliability of statistical inference under non-normal return distributions. Using historical stock data for 50 S&P 500 companies spanning 2010-2024, we conduct an empirical evaluation that includes walk-forward validation with nine sequential time splits and a Monte Carlo study with 15 random seeds across three trading strategies. In walk-forward validation, GT-Score improves the generalization ratio (validation return divided by training return) by 98% relative to baseline objective functions. Paired statistical tests on Monte Carlo out-of-sample returns indicate statistically detectable differences between objective functions (p < 0.01 for comparisons with Sortino and Simple), with small effect sizes. These results suggest that embedding an anti-overfitting structure into the objective can improve the reliability of backtests in quantitative research. Reproducible code and processed result files are provided as supplementary materials.
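The generalization ratio defined above (validation return divided by training return) and a sequence of walk-forward splits can be sketched as follows. The abstract does not say whether the windows are expanding or rolling, so the expanding-window layout here is an assumption, as are the function names and the example numbers.

```python
def walk_forward_splits(n_obs, n_splits, val_size):
    """Expanding-window splits (an assumed layout).

    Split i trains on observations [0, start_i) and validates on the next
    `val_size` observations; the last validation window ends at n_obs.
    """
    first_val_start = n_obs - n_splits * val_size
    if first_val_start <= 0:
        raise ValueError("not enough observations for the requested splits")
    splits = []
    for i in range(n_splits):
        start = first_val_start + i * val_size
        splits.append((range(0, start), range(start, start + val_size)))
    return splits


def generalization_ratio(train_return, val_return):
    """Validation return divided by training return, as defined in the text."""
    if train_return == 0:
        raise ValueError("training return must be non-zero")
    return val_return / train_return


# Nine sequential splits over 1000 hypothetical observations.
splits = walk_forward_splits(n_obs=1000, n_splits=9, val_size=100)
# A strategy earning 20% in training but 5% in validation keeps a quarter of its edge.
ratio = generalization_ratio(train_return=0.20, val_return=0.05)
```

A ratio near 1 means the validation return is close to the training return; values well below 1 are a symptom of overfitting.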
Time series encountered in practice are rarely stationary. When the data distribution changes, a forecasting model trained on past observations can lose accuracy. We study a small-footprint test-time adaptation (TTA) framework for causal time-series forecasting and direction classification. The backbone is frozen, and only normalization affine parameters are updated using recent unlabeled windows. For classification we minimize entropy and enforce temporal consistency; for regression we minimize prediction variance across weak time-preserving augmentations and optionally distill from an EMA teacher. A quadratic drift penalty and an uncertainty-triggered fallback keep updates stable. We evaluate this framework in two stages: synthetic regime shifts on ETT benchmarks, and daily equity and FX series (SPY, QQQ, EUR/USD) across pandemic, high-inflation, and recovery regimes. On synthetic gradual drift, normalization-based TTA improves forecasting error, while in financial markets a simple batch-normalization statistics update is a robust default and more aggressive norm-only adaptation can even hurt. Our results provide practical guidance for deploying TTA on non-stationary time series.
The balancing market in the energy sector plays a critical role in physically and financially balancing supply and demand. Modeling the dynamics of the balancing market can provide valuable insights and forecasts for power grid stability and secure energy supply. While complex machine learning models can achieve high accuracy, their black-box nature severely limits model interpretability. In this paper, we explore the trade-off between model accuracy and interpretability for the energy balancing market. In particular, we take the example of forecasting the manual frequency restoration reserve (mFRR) activation price in the balancing market using real market data from different energy price zones. We explore the interpretability of mFRR forecasting using two models: extreme gradient boosting (XGBoost) and the explainable boosting machine (EBM). We also integrate the two models, and we benchmark all the models against a naive baseline model. Our results show that EBM provides forecasting accuracy comparable to XGBoost while yielding a considerable level of interpretability. Our analysis also underscores the challenge of accurately predicting the mFRR price in instances where the activation price deviates significantly from the spot price. Importantly, EBM's interpretability features reveal insights into non-linear mFRR price drivers and regional market dynamics. Our study demonstrates that EBM is a viable and valuable interpretable alternative to complex black-box AI models for forecasting in the balancing market.
In this work, we propose to apply a new model fusion and learning paradigm, known as Combinatorial Fusion Analysis (CFA), to Bitcoin price prediction. Price prediction for financial products has long been a major topic in finance, as successful prediction can yield significant profit. Every machine learning model has its own strengths and weaknesses, which hinders progress toward robustness. CFA has been used to enhance models by leveraging the rank-score characteristic (RSC) function and cognitive diversity in the combination of a moderate set of diverse and relatively well-performing models. Our method utilizes both score and rank combinations as well as other weighted combination techniques. Key metrics such as RMSE and MAPE are used to evaluate the performance of our methodology. Our proposed method achieves a notable MAPE of 0.19%. It greatly improves upon individual model performance and outperforms other Bitcoin price prediction models.
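The ingredients named above (rank-score characteristic function, cognitive diversity, score and rank combination) can be illustrated with a short sketch. It follows one common formulation from the CFA literature, in which the RSC function maps each rank to the normalized score at that rank and cognitive diversity is the root-mean-square gap between two RSC functions. The paper's exact definitions, weighting schemes, and normalization may differ, and the example scores are made up.

```python
import math


def normalize(scores):
    """Min-max scale scores to [0, 1] so different models are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("scores must not all be equal")
    return [(s - lo) / (hi - lo) for s in scores]


def ranks(scores):
    """Rank 1 goes to the highest score; ties are broken by original order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def rsc(scores):
    """Rank-score characteristic: normalized score found at rank 1, 2, ..., n."""
    return sorted(normalize(scores), reverse=True)


def cognitive_diversity(scores_a, scores_b):
    """Root-mean-square difference between two RSC functions (assumed form)."""
    fa, fb = rsc(scores_a), rsc(scores_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fa, fb)) / len(fa))


def score_combination(systems):
    """Average the normalized scores each system assigns to every item."""
    normed = [normalize(s) for s in systems]
    return [sum(col) / len(col) for col in zip(*normed)]


def rank_combination(systems):
    """Average the ranks each system assigns to every item (lower is better)."""
    ranked = [ranks(s) for s in systems]
    return [sum(col) / len(col) for col in zip(*ranked)]


# Two hypothetical scoring systems over the same four items.
model_a = [0.9, 0.1, 0.5, 0.3]
model_b = [0.2, 0.8, 0.6, 0.4]
combined_scores = score_combination([model_a, model_b])
combined_ranks = rank_combination([model_a, model_b])
diversity = cognitive_diversity(model_a, model_b)
```

In this framing, pairs of models with higher cognitive diversity are the more promising candidates for combination, provided each performs reasonably well on its own.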
Benoit Mandelbrot's scientific legacy spans an extraordinary range of disciplines, from linguistics and fluid turbulence to cosmology and finance, suggesting the intellectual temperament of a "fox" in Isaiah Berlin's famous dichotomy of thinkers. This essay argues, however, that Mandelbrot was, at heart, a "hedgehog": a thinker unified by a single guiding principle. Across his diverse pursuits, the concept of scaling -- manifested in self-similarity, power laws, fractals, and multifractals -- served as the central idea that structured his work. By tracing the continuity of this scaling paradigm through his contributions to mathematics, physics, and economics, the paper reveals a coherent intellectual trajectory masked by apparent eclecticism. Mandelbrot's enduring insight in the modeling of natural and social phenomena can be understood through the lens of the geometry and statistics of scale invariance.
We document stable cross-asset patterns in cryptocurrency limit-order-book microstructure: the same engineered order book and trade features exhibit remarkably similar predictive importance and SHAP dependence shapes across assets spanning an order of magnitude in market capitalization (BTC, LTC, ETC, ENJ, ROSE). The data cover Binance Futures perpetual contract order books and trades at 1-second frequency from January 1, 2022 to October 12, 2025. Using a unified CatBoost modeling pipeline with a direction-aware GMADL objective and time-series cross-validation, we show that feature rankings and partial effects are stable across assets despite heterogeneous liquidity and volatility. We connect these SHAP structures to microstructure theory (order flow imbalance, spread, and adverse selection) and validate tradability via a conservative top-of-book taker backtest as well as a fixed-depth maker backtest. Our primary novelty is a robustness analysis of a major flash crash, where the divergent performance of our taker and maker strategies empirically validates classic microstructure theories of adverse selection and highlights the systemic risks of algorithmic trading. Our results suggest a portable microstructure representation of short-horizon returns and motivate universal feature libraries for crypto markets.