Dr. Hao Feng develops and applies biostatistical and bioinformatics approaches to better understand high-throughput omics data. He uses epigenetic data to define biomarkers and to predict and classify disease subtypes. His approaches have wide applications in cancer and neurological studies, including liver cancer, Alzheimer’s disease (AD) and PTSD. He has also collaborated on research into how Zika virus alters the transcriptome, and how environmental radiation exposure induces epigenetic variables that inform lung cancer.
Working in industry and academic medicine, he has developed Bayesian methods to study tumor growth inhibition and has developed statistical models for the analysis of genetics data, with applications to cancer immunology and trauma research.
Personal Web page:
My research interests include biostatistics, bioinformatics and computational biology. My main focus is to develop and apply biostatistics and bioinformatics methods to better understand high-throughput omics data, with an emphasis on applications in cancer. I use epigenetic data to define biomarkers, classify cancer subtypes and predict disease from cell-free DNA. I have developed a number of open-source software tools that are freely available on Bioconductor and R-CRAN, with over 10,000 downloads annually. These tools have been widely used in studies of various cancer types. I collaborate closely with physicians and wet-lab researchers to decipher signals from cancer genomics and epigenomics data.
Metrics from Web of Science/publons and/or Scopus/SciVal:
- H-index: 7
- Total publications from CV: 14
- Total citations: 479
- Publications in top-tier journals: 80%
- Collaborative publishing - international: 80%/20%
Reviewer for publications including:
- Journal of Applied Statistics
- Statistical Methods in Medical Research
- Scientific Reports
- Journal of Alzheimer’s Disease
Contributions to science:
- Statistical methods in bioinformatics and computational biology
- Experimental design, data analysis and method development in high-throughput omics data
- Biomarker discovery and signal deconvolution in epigenetics data
- Applications of statistical methods in cancer, neurodegenerative disease, PTSD and cell-free DNA data
- DSS: Dispersion Shrinkage for Sequencing, an R/Bioconductor package for differential analysis of high-throughput sequencing data, including differential expression for RNA-seq and differential methylation for bisulfite-sequencing data
- InfiniumPurify: a comprehensive R package for estimating and accounting for tumor purity based on DNA methylation Infinium 450k array data
- cfDNAMethy: a reference-free and reference-based method for disease prediction by cell-free DNA methylation