Addressing Market Regime Changes and Heavy-Tailed Returns in Portfolio Optimization via Bayesian VAR and Elliptical Black-Litterman

2026-06-08 • Machine Learning

Machine Learning, Artificial Intelligence

AI summary

The authors developed a new method called BAVAR-BLED to improve how computers decide to invest money in the stock market. Their approach better handles rare but large swings in market returns and adapts when market conditions change over time. They combine statistical models with machine learning tools like transformers and CNNs to predict risks and returns more realistically. When tested on 29 Dow Jones Industrial Average stocks over ten years, their method performed better than existing techniques, with higher total returns and better risk-adjusted performance as measured by the Sharpe and Sortino ratios.

Deep Reinforcement Learning, Portfolio Optimization, Bayesian-Averaging Vector Autoregressive (BAVAR), Black-Litterman Model, Elliptical Distributions, Student's t-distribution, Transformer Networks, Convolutional Neural Networks (CNN), Sharpe Ratio, Sortino Ratio

Authors

Daniil Mikriukov, Ruoyu Sun, Angelos Stefanidis, Jionglong Su, Zhengyong Jiang

Abstract

Deep reinforcement learning (DRL) frameworks for portfolio optimization have shown promise for their ability to learn allocation rules dynamically from market data. However, these models fail to account for fat-tailed returns, which characterize actual market behavior with more frequent extreme events. Furthermore, historical data is treated homogeneously, without accounting for temporal importance, which leads models to fail during regime changes. We propose a new BAVAR-BLED algorithm that combines methods derived from Bayesian-Averaging Vector Autoregressive (BAVAR) and the Black-Litterman model using Elliptical Distributions (BLED) within a Twin Delayed Deep Deterministic Policy Gradient (TD3) architecture. BAVAR captures a set of vector autoregressive representations that consider multi-scale temporal features, enabling adaptive allocation decisions based on regime-aware estimates of return expectations and dispersion matrices. These estimates serve as prior inputs to BLED, a model that uses Student's t-distributions, allowing for more realistic fat-tailed return estimates. The BAVAR-BLED algorithm uses transformer networks for view construction and CNNs for risk-aversion estimates, which modify dynamic allocation decisions based on market conditions. An evaluation of 29 Dow Jones Industrial Average constituents over a decade-long market period shows that BAVAR-BLED significantly outperforms state-of-the-art methods, achieving Sharpe and Sortino ratios of 1.72 and 2.70, respectively, and total returns of 57.26%.
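
For readers unfamiliar with the two risk-adjusted metrics reported above, the following is a minimal sketch of how Sharpe and Sortino ratios are commonly computed from a series of periodic returns. It is not the authors' evaluation code: the abstract does not state the risk-free rate, the annualization factor, or the downside-deviation convention, so the function names, the `risk_free`, `target`, and `periods_per_year` parameters, and their defaults are all assumptions made for illustration.

```python
import math
from statistics import mean


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample std. dev.

    risk_free is a per-period rate (assumed 0 here); periods_per_year=252
    assumes daily returns. Both are illustrative defaults.
    """
    excess = [r - risk_free for r in returns]
    mu = mean(excess)
    # Sample variance (n - 1 in the denominator).
    variance = sum((x - mu) ** 2 for x in excess) / (len(excess) - 1)
    std_dev = math.sqrt(variance)
    if std_dev == 0:
        return float("nan")
    return mu / std_dev * math.sqrt(periods_per_year)


def sortino_ratio(returns, target=0.0, periods_per_year=252):
    """Annualized Sortino ratio: mean excess return over downside deviation.

    Downside deviation here is the root mean square of returns below the
    target, averaged over all periods (one common convention; assumed).
    """
    excess = [r - target for r in returns]
    downside = math.sqrt(sum(min(x, 0.0) ** 2 for x in excess) / len(excess))
    if downside == 0:
        return float("nan")
    return mean(excess) / downside * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Toy per-period returns, not data from the paper.
    toy_returns = [0.01, -0.02, 0.03, 0.00]
    print(sharpe_ratio(toy_returns, periods_per_year=1))
    print(sortino_ratio(toy_returns, periods_per_year=1))
```

Because the Sortino ratio penalizes only returns below the target, it is typically higher than the Sharpe ratio for a strategy whose volatility comes mostly from gains, which is consistent with the reported pair of 1.72 and 2.70.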
