Developing weighting methods for maximizing the power of hypothesis tests on bandit data (Advisor: Susan A. Murphy, Harvard University)

In this project, I compared several methods for constructing hypothesis tests on bandit data. I then developed an algorithm that adaptively weights the observations at different time steps to maximize the power of the hypothesis test. In simulations, our weighting methods increased power by about 3-5 percent while properly controlling the type I error.

Causal inference for data with outcome misclassification and selection bias (Advisor: Hong Zhang, University of Science and Technology of China)

In this project, I derived a method for estimating treatment effects from data with misclassification in a binary outcome and selection bias. I am currently running simulations to evaluate the method.

Slip and guess prediction based on the question text (Advisor: Qi Liu, University of Science and Technology of China)

I first designed a neural network to model the slip and guess effects based on question text. I then independently implemented the model in Python and ran experiments on a large real-world dataset of 50 million records. The results show that our model is applicable across different cognitive models and can improve the performance of cognitive modelling tasks.

Estimating the mediation effect of maternal genotype on babies’ birth weight (Advisor: Hong Zhang, University of Science and Technology of China)

I compared the performance of three methods for estimating the causal mediation effects of maternal genotypes on children’s birth weight. Two of them come from the existing literature [1,2] and use structural equation modelling and adjusted linear regression. I also tried a new method based on the EM algorithm. Simulation results show that the EM-based method is more efficient than the other two, though it is more computationally expensive.

Implementation of a novel model (CCMO) in statistical genetics (Advisor: Hong Zhang, University of Science and Technology of China)

I worked with two graduate students to implement CCMO [3], a model proposed by Prof. Hong Zhang for handling gene-environment interaction problems. We also compared it with Haplin [4] (another method for the same problem) and with logistic regression in simulations.

Causal inference seminars (University of Science and Technology of China)

We discussed Causal Inference by Donald B. Rubin and some recent papers in causal inference. I have presented more than 10 times in the seminars.


[1] Warrington, N. M., Freathy, R. M., Neale, M. C., & Evans, D. M. (2018). Using structural equation modelling to jointly estimate maternal and fetal effects on birthweight in the UK Biobank. International Journal of Epidemiology, 47(4), 1229-1241.

[2] Warrington, N. M., Beaumont, R. N., Horikoshi, M., Day, F. R., Helgeland, Ø., Laurin, C., … & Wood, A. R. (2019). Maternal and fetal genetic effects on birth weight and their relevance to cardio-metabolic risk factors. Nature Genetics, 51(5), 804-814.

[3] Zhang, H., Mukherjee, B., Arthur, V., Hu, G., Hochner, H., & Chen, J. (2020). An efficient and computationally robust statistical method for analyzing case-control mother–offspring pair genetic association studies. Annals of Applied Statistics, 14(2), 560-584.

[4] Gjessing, H. K., & Lie, R. T. (2006). Case‐parent triads: Estimating single‐ and double‐dose effects of fetal and maternal disease gene haplotypes. Annals of Human Genetics, 70(3), 382-396.