Finance
Authors (alphabetical): Zhenyu Gao, Wenxi Jiang, Yutong Yan
Large Language Models · Lookahead Bias · Financial Forecasting · Asset Pricing
Presentations: ABFER 2026 (scheduled) · CICF 2026 (scheduled) · NBER AI and Economic Measurement 2026 (scheduled)
▸ Abstract
We develop a statistical test to detect lookahead bias in economic forecasts generated by large language models (LLMs). Leveraging state-of-the-art pretraining data detection techniques, we estimate the likelihood that a given prompt appeared in an LLM's training corpus—a statistic we term Lookahead Propensity (LAP). We formally show that a positive correlation between LAP and forecast accuracy indicates both the presence and magnitude of lookahead bias. We apply the test to two forecasting settings: news headlines predicting stock returns and earnings call transcripts predicting capital expenditures. Our approach provides a cost-efficient diagnostic tool for assessing the validity and reliability of LLM-generated forecasts.
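The core diagnostic can be illustrated with a minimal sketch: compute the correlation between per-prompt LAP scores and realized forecast accuracy, and test whether it is positive. This is not the paper's implementation; the data below are simulated and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical per-prompt statistics:
# lap      — estimated likelihood the prompt appeared in the training corpus
# accuracy — realized forecast accuracy for the same prompt
lap = rng.uniform(0, 1, n)
accuracy = 0.5 + 0.3 * lap + rng.normal(0, 0.2, n)  # simulated leakage effect

# Sample correlation between LAP and accuracy
rho = np.corrcoef(lap, accuracy)[0, 1]

# One-sided permutation test: is the correlation significantly positive?
perm = np.array([np.corrcoef(rng.permutation(lap), accuracy)[0, 1]
                 for _ in range(2000)])
p_value = np.mean(perm >= rho)

print(f"correlation = {rho:.3f}, one-sided permutation p = {p_value:.4f}")
```

Under the test's logic, a significantly positive correlation signals that prompts the model has likely seen are forecast more accurately, which is the signature of lookahead bias.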
Authors: Yutong Yan, Raphael Tang, Zhenyu Gao, Wenxi Jiang, Yao Lu
Large Language Models · Lookahead Bias · Time-Aware Pretraining
Presentations: AFA Poster Session (2026)
▸ Abstract
In financial backtesting, large language models pretrained on internet-scale data risk introducing lookahead bias that undermines their forecasting validity, as they may have already seen the true outcome during training. To address this, we present DatedGPT, a family of twelve 1.3B-parameter language models, each trained from scratch on approximately 100 billion tokens of temporally partitioned data with strict annual cutoffs spanning 2013 to 2024. We further enhance each model with instruction fine-tuning on both general-domain and finance-specific datasets curated to respect the same temporal boundaries. Perplexity-based probing confirms that each model's knowledge is effectively bounded by its data cutoff year, while evaluation on standard benchmarks shows competitive performance with existing models of similar scale. We provide an interactive web demo that allows users to query and compare responses from models across different cutoff years.
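The perplexity-based probing step can be sketched in a few lines: perplexity is the exponential of the average per-token negative log-probability, and a model whose knowledge stops at its cutoff should be markedly more "surprised" by post-cutoff text. The log-probabilities below are hypothetical placeholders, not DatedGPT outputs.

```python
import numpy as np

def perplexity(token_logprobs):
    """Perplexity = exp(-mean per-token log-probability)."""
    return float(np.exp(-np.mean(token_logprobs)))

# Hypothetical per-token log-probs from a model with a 2018 data cutoff:
pre_cutoff  = [-2.1, -1.8, -2.4, -1.9]   # sentence about a 2017 event
post_cutoff = [-5.2, -4.8, -6.1, -5.5]   # sentence about a 2020 event

print(perplexity(pre_cutoff), perplexity(post_cutoff))
```

A large perplexity gap between pre- and post-cutoff text is the evidence consistent with the model's knowledge being bounded by its cutoff year.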
Deciphering Green Preferences and Climate Risk Perceptions: An NLP Approach
Authors (alphabetical): Darwin Choi, Zhenyu Gao, Wenxi Jiang, Yutong Yan, Hulai Zhang
Climate Finance · Institutional Investors · ESG · Textual Analysis · NLP
Presentations: European Sustainable Finance PhD Workshop (2025)
▸ Abstract
We employ Natural Language Processing (NLP) to scrutinize regulatory filings, identifying institutional investors' climate change preferences and risk perceptions. These preferences and risk perceptions grow over time and are stronger after a fund has signed the Principles for Responsible Investment (PRI) or is located in regions with stronger global warming beliefs. Investors preferring green assets tend to decrease their portfolio weights in environmentally unfriendly stocks, reflecting a desire to align investments with their values. However, the relationship between climate risk perceptions and portfolio weights of brown stocks varies due to heterogeneous investment strategies. Investors with higher climate risk perceptions are more likely to support environmental shareholder proposals, whereas investors with green preferences are not. These findings provide new insights into sustainable investing behavior under differing investor motivations.
Machine Learning
Bandit Algorithms for Factorial Experiments
Authors: Yutong Yan, Audrey Durand, Joelle Pineau
Machine Learning · Optimization · Multi-Armed Bandits
Presentations: WiML Workshop, NeurIPS 2019
▸ Abstract
A multi-armed bandit algorithm is developed for factorial experiments. Using tools from advanced probability theory, I first prove that the UCT algorithm with a Laplace bound has lower computational complexity than the naïve UCT algorithm. I begin by analyzing UCB1 for non-stationary bandit problems, and then prove that UCT with Laplace bounds achieves a tighter concentration bound. I also demonstrate that the probability of suboptimal choices converges to zero as the failure probability vanishes. In deep learning settings, experimental results are consistent with the theoretical regret bound.
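The UCB1 index underlying UCT can be sketched on a toy Bernoulli bandit; this is a generic illustration of the confidence-bound mechanism, not the paper's factorial or Laplace-bound variant. All arm probabilities below are hypothetical.

```python
import numpy as np

def ucb1(probs, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return the pull count of each arm."""
    rng = np.random.default_rng(seed)
    k = len(probs)
    counts = np.zeros(k, dtype=int)  # times each arm was pulled
    sums = np.zeros(k)               # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1              # pull each arm once to initialize
        else:
            means = sums / counts
            bonus = np.sqrt(2 * np.log(t) / counts)  # UCB1 exploration bonus
            arm = int(np.argmax(means + bonus))
        reward = float(rng.random() < probs[arm])
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb1([0.3, 0.5, 0.8], horizon=5000)
print(counts)  # the 0.8 arm should receive most pulls
```

Tighter confidence radii (such as Laplace-style bounds) shrink the exploration bonus, which is what drives the improved behavior the abstract refers to.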
A Theoretical Analysis of Upper Confidence Bound Applied to Trees
Authors: Yutong Yan, Audrey Durand, Joelle Pineau
Machine Learning · Optimization · Tree Search
▸ Abstract
Using factorial experiments, I explore multi-armed bandit problems in which a player episodically selects actions (here, sequences) and observes the outcomes. I consider Upper Confidence Bound applied to Trees (UCT), a popular tree-search algorithm, to identify the sequence of choices that maximizes an objective function. Using synthetic experiments, I demonstrate that applying tighter concentration bounds to linear bandits can significantly improve the performance of UCT for tree search. The next step is to investigate various factorial experimental design configurations. I also compare the performance of algorithms under three formulations of the factorial experiment: 1) standard bandits; 2) linear bandits; and 3) bandits for tree search. I observe that capturing the underlying tree structure is essential for robustness, whether or not the outcome function is linear. Furthermore, the algorithms under the tree-search bandit formulation of factorial experimental designs appear more robust to the noise variance than the other approaches.
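The relationship between a factorial experiment and the bandit formulations can be sketched concretely: each arm is one assignment of the factors, and under the linear-bandit formulation the arm's feature vector is simply that assignment. The 2×2×2 design and the coefficient vector below are hypothetical illustrations, not the paper's experiments.

```python
import itertools
import numpy as np

# A 2x2x2 factorial experiment: every arm is a combination of three
# binary factors, giving 2**3 = 8 arms in total.
arms = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
print(arms.shape)  # (8, 3)

# Linear-bandit view (hypothetical coefficients): the expected outcome
# of an arm is a linear function of its factor assignment.
theta = np.array([0.2, -0.1, 0.4])
expected = arms @ theta
best = arms[np.argmax(expected)]
print(best)  # factor setting with the highest expected outcome
```

The standard-bandit formulation would instead treat the 8 arms as unrelated, and the tree-search formulation would choose the factors sequentially along a path of depth 3.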
