Statistician · Bayesian Methods
Dept. of Statistical Science · Duke
YINYIHONG LIU
Durham, NC
I am a statistician working on Bayesian methods for problems where identity, treatment, and physical structure are all uncertain — entity resolution via partition-valued models, bandit algorithms for optimal treatment regimes under surrogate outcomes, and Gaussian-process emulation of high-energy hydrodynamic simulations. M.Sc. in Statistical Science at Duke, advised by Eric B. Laber and Rebecca C. Steorts. B.Sc. magna cum laude in Mathematics & Data Science from NYU Shanghai.
§01 Research threads Duke · ongoing
I.
Bayesian Microclustering for Entity Resolution
Bounded microclustering models that match identities across noisy administrative records — partition-valued priors with calibrated cluster-size behavior. With R. C. Steorts.
p(Λ | X) ∝ p(X | Λ) · p(Λ)
II.
Bandits with Surrogate Outcomes
Bandit algorithms for estimating optimal treatment regimes when only partially ordered surrogate outcomes are observed. With E. B. Laber.
π* = argmax_π E[Y(π) | S]
III.
Gaussian-Process Emulation for Physics
Modeling outputs of hydrodynamic simulations with Gaussian processes; quantifying uncertainty in Bayesian parameter estimation. With S. Mak and the JETSCAPE collaboration.
f(x) ∼ GP(μ(x), k(x, x′))
IV.
Random Forest Theory
Investigated consistency and asymptotic normality of random-forest estimators (NYU Shanghai senior thesis, with W. Wu and C. Gu).
√n (f̂_n − f) ⇒ N(0, σ²)
§02 Honors & awards
2023
Dean's Research Award for Master's Students
Duke
2022
Major Honors in Mathematics — top mathematics major
NYU Shanghai
2022
NYU Shanghai Excellence Award — top 20% of class
NYU Shanghai
2018–22
Dean's List, every semester
NYU Shanghai
§03 Teaching Duke · NYUSH
2023 Su
Bayesian Inference for Nuclear Physics — workshop TA
virtual
F19 · S20
MATH-SHU 235 — Probability & Statistics
NYUSH
§04 Publications peer-reviewed & in prep.
in prep.
Manuscript
Bounded Microclustering Models for Entity Resolution
Liu, Y., Aleshin-Guendel, S., Marchant, N. G., Steorts, R. C.
in prep.
Manuscript
Bandit Algorithms under Partially Ordered Surrogates
Liu, Y., Laber, E. B., Brooks, M.
in prep.
Manuscript
Transfer Learning for Bayesian Parameter Estimation of Hydrodynamic Simulations
Liu, Y., Miller, J., Mak, S., and the JETSCAPE collaboration
2021
IEEE · ICSPML
Airbnb Pricing Based on Statistical Machine Learning Models
Liu, Y.
§05 Talks & selected work
2022
Translator — Causal Inference for Statistics, Social, and Biomedical Sciences, Imbens & Rubin
Fudan Univ. · Stat. & Data Science · Chinese ed.
2021
Invited talk — Airbnb Pricing Based on Statistical Machine Learning Models
ICSPML · Stanford (virtual)

All models are wrong, but some are useful.

— G. E. P. Box