Description
Raymond Wong, Ph.D.
Associate Professor of Statistics
Texas A&M University
“Balancing Weights for Offline Reinforcement Learning”
Offline policy evaluation is a fundamental and challenging problem in reinforcement learning. In this talk, I will focus on estimating the value of a target policy from pre-collected data generated by a possibly different policy, under the framework of infinite-horizon Markov decision processes. I will discuss a novel estimator that uses approximately projected state-action balancing weights for policy value estimation. These weights are motivated by the marginal importance sampling method in reinforcement learning and the covariate balancing idea in causal inference. Corresponding asymptotic convergence results will be presented. Our results scale with both the number of trajectories and the number of decision points per trajectory. As such, consistency can still be achieved with a limited number of subjects when the number of decision points diverges.
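The weights in the talk are motivated by marginal importance sampling; the core idea is to reweight rewards observed under the data-collecting (behavior) policy so that their average estimates the target policy's value. The following is a minimal toy sketch of that reweighting idea in a one-step (bandit) setting, not the estimator from the talk: the two-action setup, the policy probabilities, and the rewards are all illustrative assumptions.

```python
import random

def importance_sampling_value(n_samples: int, seed: int = 0) -> float:
    """Estimate a target policy's value from behavior-policy data by
    importance weighting (toy one-step illustration; hypothetical setup)."""
    rng = random.Random(seed)

    # Hypothetical two-action problem (not from the talk):
    behavior = {0: 0.8, 1: 0.2}   # behavior policy b(a) that generated the data
    target = {0: 0.2, 1: 0.8}     # target policy pi(a) we want to evaluate
    reward = {0: 1.0, 1: 2.0}     # deterministic reward for each action

    total = 0.0
    for _ in range(n_samples):
        # Draw an action from the behavior policy, as in the logged data.
        a = 0 if rng.random() < behavior[0] else 1
        # Reweight the observed reward by the ratio pi(a) / b(a).
        total += (target[a] / behavior[a]) * reward[a]
    return total / n_samples

# True target value here is 0.2 * 1.0 + 0.8 * 2.0 = 1.8; with many samples
# the weighted average from behavior data converges to it.
estimate = importance_sampling_value(100_000)
```

In the infinite-horizon setting of the talk, the per-action ratio above is replaced by a marginal ratio of state-action distributions, which the balancing weights approximate without requiring the behavior policy to be known.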
Biography
Raymond K. W. Wong is an associate professor and the Ph.D. program director in the Department of Statistics at Texas A&M University. Before joining Texas A&M University, he held a regular faculty position at Iowa State University. He received his Ph.D. in Statistics from the University of California at Davis in 2014. His research interests include causal inference, functional data analysis, low-rank modeling and reinforcement learning. He has served as an associate editor for several statistical journals, including the Canadian Journal of Statistics, the Journal of Computational and Graphical Statistics, and the Journal of the American Statistical Association Review. He is also a member of the editorial board for Chemometrics and Intelligent Laboratory Systems.
Location:
BB 1.01.20L
Category:
Campus Events