Empirical Measure Large Deviations for Reinforced Chains on Finite Spaces
Abstract
Let $A$ be a transition probability kernel on a finite state space $\Delta^o =\{1, \ldots , d\}$ such that $A(x,y)>0$ for all $x,y \in \Delta^o$. Consider a reinforced chain given as a sequence $\{X_n, \; n \in \mathbb{N}_0\}$ of $\Delta^o$-valued random variables, defined recursively according to $$L^n = \frac{1}{n}\sum_{i=0}^{n-1} \delta_{X_i}, \;\; P(X_{n+1} \in \cdot \mid X_0, \ldots, X_n) = L^n A(\cdot).$$ We establish a large deviation principle for $\{L^n\}$. The rate function takes a strikingly different form from the Donsker-Varadhan rate function associated with the empirical measure of the Markov chain with transition kernel $A$, and is described in terms of a novel deterministic infinite-horizon discounted-cost control problem with linear controlled dynamics and a nonlinear running cost involving the relative entropy function. Proofs are based on an analysis of the time reversal of controlled dynamics in representations for log-transforms of exponential moments, and on weak convergence methods.
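The recursion in the abstract can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: the function name `simulate_reinforced_chain`, the 0-based state labels, the fixed starting state, and the example kernel are all assumptions made for illustration. It also assumes that each new state is drawn from the empirical measure of all states visited so far, which keeps the recursion well defined from the first step; the abstract's indexing of $L^n$ may differ by one.

```python
import random


def simulate_reinforced_chain(A, n_steps, x0=0, seed=0):
    """Simulate a reinforced chain and return its empirical measure.

    A       : d x d row-stochastic matrix (list of lists) with positive entries
    n_steps : number of transitions to simulate
    x0      : initial state, labelled 0..d-1 (an assumption; the abstract uses 1..d)
    seed    : seed for the pseudo-random generator
    """
    rng = random.Random(seed)
    d = len(A)
    counts = [0] * d          # visit counts, proportional to the empirical measure
    counts[x0] = 1
    for _ in range(n_steps):
        # Unnormalised weights of (L A)(y) = sum_x L(x) A(x, y).
        # Assumption: L is the empirical measure of every state visited so far.
        weights = [sum(counts[x] * A[x][y] for x in range(d)) for y in range(d)]
        nxt = rng.choices(range(d), weights=weights)[0]
        counts[nxt] += 1
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical two-state kernel with strictly positive entries.
    A = [[0.9, 0.1],
         [0.2, 0.8]]
    print(simulate_reinforced_chain(A, n_steps=10_000))
```

Running the sketch for several seeds and values of `n_steps` gives a rough feel for how the empirical measure fluctuates; the large deviation principle in the paper quantifies the exponential decay of the probability of atypical values, which a direct simulation of this kind cannot estimate reliably.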
- Publication: arXiv e-prints
- Pub Date: May 2022
- DOI: 10.48550/arXiv.2205.09291
- arXiv: arXiv:2205.09291
- Bibcode: 2022arXiv220509291B
- Keywords: Mathematics - Probability; Mathematics - Optimization and Control; 60F10 (Primary) 93E03 (Secondary)