
Using novel statistical and machine learning methods to analyze data
Overview
This study aims to develop and apply novel statistical and machine learning methods to analyze data provided by the Precision Health (PH) platform, such as electronic health record (EHR) data and de-identified clinical notes, along with genotype data from the Michigan Genomics Initiative (MGI), to study disease etiology. Specifically, the project will leverage machine learning approaches, including natural language processing (NLP) models and large language models (LLMs), to extract disease-relevant information from the de-identified clinical notes in PH and integrate it with genotype data from MGI. In addition, the project aims to develop statistical models to understand how historical EHR measurements may influence future clinical or disease outcomes, such as whether past body mass index (BMI) measurements can improve the prediction of disease risk.
Principal Investigator(s)
Xiang Zhou, PhD
Matt Zawistowski, PhD
Digital Health Innovation Support
Digital Health Innovation provided genetic data, de-identified radiology reports, and Armis2 computing power.