Main Page
Welcome to the U-M Big Data Summer Institute 2019 Wiki!
Contents
- 1 Reading Material
- 2 2019 Presentations
- 2.1 Week 1
- 2.1.1 Day 1: RCRS Training, June 17
- 2.1.2 Day 2: Reproducible Research, Study Design and Inference, and Linear Regression, June 18
- 2.1.3 Day 3: Logistic Regression, Observational Data and Bias, and Probability, June 19
- 2.1.4 Day 4: Causal Inference, Parameter Estimation/Likelihood, and Linear Algebra, June 20
- 2.1.5 Day 5: Data Wrangling in R with dplyr, Parts I and II, June 21
- 2.2 Week 2
- 2.2.1 Day 6: Visualizing Data in R with ggplot2 - Part I & II and Generalized Linear Models, June 24
- 2.2.2 Day 7: Machine Learning I & Model Selection, June 25
- 2.2.3 Day 8: Machine Learning II & Unsupervised Learning/Clustering I, June 26
- 2.2.4 Day 9: Unsupervised Learning/Clustering II and Model Selection II, June 27
- 2.2.5 Day 10: Python Workshop I and II, June 28
- 2.3 Week 3
- 2.4 Week 4
- 2.4.1 Day 16: From Genomics to Prevention of Cardiovascular Diseases and R Markdown, July 8
- 2.4.2 Day 17: Bayes Computation I and II, July 9
- 2.4.3 Day 18: Reading Like a Scientific Writer and Social Network, July 10
- 2.4.4 Day 19: Stroke Disparities and Human-Centered Computing: Using Speech to Understand Behavior, July 11
- 2.4.5 Day 20: Spatial Epidemiology, July 12
- 2.5 Week 5
- 2.5.1 Day 21: Natural Language Processing I and II, July 15
- 2.5.2 Day 22: Optimization I and II, July 16
- 2.5.3 Day 23: Writing from Point A to Point B and Bayesian Data Integration and Precision Medicine, July 17
- 2.5.4 Day 24: Radiation Oncology and Imaging Analysis and Optimization in Health, July 18
- 2.5.5 Day 25: Confessions of a Clinical Researcher, July 19
- 2.6 Week 6
- 3 Day 29: Symposium
- 4 Additional Resources
Reading Material
Machine Learning Group
Research Lecture Slides
- Introduction
- Explore MIMIC
- Getting x and y
- Some tips
- Sample Pipeline
- Training Pipeline
- CNN
- LSTM
- Structuring
- Dataset and DataLoader
- PyTorch models
- Model Development
- Population
- Benchmark Features
- Inclusion Exclusion
Readings
- ARF Epidemiology
- CNN for Sentence Classification
- Learning from Heterogeneous Temporal Data
- MIMIC-III
- MIMIC Benchmarks and Multitask RNN
- RNNs for Multivariate Time Series
- TREWScore for Septic Shock
- TREWScore Supplement
Genomics Group
Lectures
Intro Exercises
Papers
Population Genetics
Single Cell RNA
Transcriptomics
Online videos to better understand genetics and genomics
Genetics
- Introduction to Genetics by 23andMe (5 videos)
- TED-Ed : How Mendel's pea plants helped us understand genetics - Hortensia Jiménez Díaz
- Genetic Recombination and Gene Mapping by Bozeman Science
- Useful Genetics : A college-level comprehensive genetics course with 292 lectures offered by Rosie Redfield at UBC
Useful 3D Animations
- From DNA to protein - 3D Animation
- DNA Transcription - 3D Animation
- DNA splicing - 3D Animation
- mRNA Translation - 3D Animation
- How DNA is packaged - 3D Animation
- The Central Dogma - 3D Animation
Gene Regulation and Epigenetics
- Epigenetics Lecture by SciShow
- Hi-C Technique : A 3D map of the Human Genome
- The ENCODE Project
- RNAi by Nature Video
Sequencing Technologies
- TED-Ed : The race to sequence the human genome - Tien Nguyen
- DropSeq - Droplet-based Single Cell Sequencing by McCarroll Lab
Data Mining on Large Complex Datasets
Papers
2019 Presentations
Week 1
Day 1: RCRS Training, June 17
- Al-Marzouki - The effect of scientific misconduct on the results of clinical trials: A Delphi survey
- Baggerly Coombes - Deriving Chemosensitivity from cell lines: forensic bioinformatics and reproducible research in high-throughput biology
- Benjamini - Redefine statistical significance
- Ethical guidelines - Ethical Guidelines for Statistical Practice article
- Ethics reviews - Ethical statistical practice review slides
- Moving to a World beyond - Moving to a World Beyond “p < 0.05”
- On being a scientist - On being a scientist article
- On being a scientist ppt - On being a scientist slides
Recorded Lectures
- Cluster computing - Dan Barker
- RCRS Training - Mukherjee
Day 2: Reproducible Research, Study Design and Inference, and Linear Regression, June 18
- Reproducible Research LeFaive
- Study design and Inference Little
- Linear regression Hector
Recorded Lectures
- Reproducible research - LeFaive
- Study designs & Inference - Little
- Linear regression - Hector
Day 3: Logistic Regression, Observational Data and Bias, and Probability, June 19
- Logistic regression - Hector
- Observational data and Bias - Little
- Probability - Hartman
Recorded Lectures
- Logistic regression - Hector
- Observational Data, Bias - Little
- Probability review - Hartman
Day 4: Causal Inference, Parameter Estimation/Likelihood, and Linear Algebra, June 20
- Causal Inference - Wu
- Parameter estimation and Likelihood - Little
- Linear algebra - Hartman
Recorded Lectures
- Causal Inference - Wu
- Parameter Estimation / Likelihood - Little
- Linear Algebra - Hartman
Day 5: Data Wrangling in R with dplyr, Parts I and II, June 21
- Dplyr slides - Flickinger
- Dplyr flight practice - Flickinger
- Dplyr flight practice answers - Flickinger
- Dplyr and OCLS practice - Flickinger
- Dplyr and OCLS practice answers - Flickinger
- Dplyr code - Flickinger
- Journey lectures - Boehnke
Recorded Lectures
- R Workshop dplyr Part II - Flickinger
- Journey Lecture - Boehnke
- No recording available (Data Wrangling in R with dplyr - Flickinger)
Week 2
Day 6: Visualizing Data in R with ggplot2 - Part I & II and Generalized Linear Models, June 24
- Ggplot2 slides - Flickinger
- Ggplot2 mpg exercise - Flickinger
- Ggplot2 mpg exercise answers - Flickinger
- Ggplot2 flights exercise - Flickinger
- Ggplot2 flights exercise answers - Flickinger
Recorded Lectures
- Visualizing Data in R Part I - Flickinger
- Visualizing Data in R - Part II - Flickinger
- Generalized Linear Models - Hartman
Day 7: Machine Learning I & Model Selection, June 25
- Model selection I & II slides - Beesley
- Machine Learning I slides - Wiens
Recorded Lectures
- Model Selection 1 - Beesley
- No recording available (Machine Learning I - Wiens)
Day 8: Machine Learning II & Unsupervised Learning/Clustering I, June 26
- Machine Learning II slides - Wiens
Recorded Lectures
- Correlated data models - Hartman
- No recording available (Machine Learning II - Wiens)
- No recording available (Unsupervised Learning, Clustering I - Koutra)
Day 9: Unsupervised Learning/Clustering II and Model Selection II, June 27
- Model Selection II slides - Beesley
Recorded Lectures
- Model Selection Part II - Beesley
- Assessment of predictive models - Boonstra
- No recording available (Unsupervised Learning, Clustering II - Koutra)
Day 10: Python Workshop I and II, June 28
- Python Workshop I and II - Kamran
Recorded Lectures
- Python Workshop - Kamran
- Python workshop Part II - Kamran
- Journey Lecture - Banerjee
Week 3
Day 11: Data Mining I and Visualization I, July 1
- Visualization I slides - Kay
Recorded Lectures
- Data Mining Part I - Gryak
- Visualization I - Kay
Day 12: Data Mining II and Precision Health, July 2
- Precision Health slides - Kheterpal
Recorded Lectures
- Precision Health - Kheterpal
- No recording available (Data Mining II - Gryak)
Day 13: Visualization II and Intro to Bayes I, July 3
- Visualization II slides - Kay
- Intro to Bayes I slides - Wen
Recorded Lectures
- Introduction to Bayes - Wen
- No recording available (Visualization II - Kay)
Day 14: July 4 - NO CLASS
Day 15: Intro to Bayes II, July 5
- Intro to Bayes II slides - Wen
Recorded Lectures
- Journey lecture - Panigrahi
- Introduction to Bayes II - Wen
- No recording available (Journey Lectures: Taylor)
Week 4
Day 16: From Genomics to Prevention of Cardiovascular Diseases and R Markdown, July 8
Recorded Lectures
- Genomics to Prevention of Cardiovascular Disease - Surakka
- R Markdown - Boonstra
Day 17: Bayes Computation I and II, July 9
Recorded Lectures
- No recordings available (Computation I & II - Chen)
Day 18: Reading Like a Scientific Writer and Social Network, July 10
- Reading Like a Scientific Writer slides - Griffiths
Recorded Lectures
- No recording available (Reading like a scientific writer - Griffiths)
Day 19: Stroke Disparities and Human-Centered Computing: Using Speech to Understand Behavior, July 11
- Stroke Disparities slides - Lisabeth
- Human-Centered Computing slides - Provost
Recorded Lectures
- Stroke disparities - Lisabeth
- No recording available (Human-Centered Computing: Using Speech to Understand Behavior - Provost)
Day 20: Spatial Epidemiology, July 12
- Spatial Epidemiology slides - Zelner
Recorded Lectures
- Journey Lecture - Spino
- No recording available (Spatial Epidemiology - Zelner)
Week 5
Day 21: Natural Language Processing I and II, July 15
Recorded Lectures
- No recording available (Natural Language Processing I & II - Singh)
Day 22: Optimization I and II, July 16
- Optimization slides - Kang
Recorded Lectures
- No recording available (Optimization I & II - Kang)
Day 23: Writing from Point A to Point B and Bayesian Data Integration and Precision Medicine, July 17
- Writing from Point A to Point B slides - Griffiths
- Science of Writing paper - Griffiths
Recorded Lectures
- Data Integration & Precision Medicine - Baladandayuthapani
- No recording available (Writing from Point A to Point B: Simple Strategies for Conveying Complex Ideas - Griffiths)
Day 24: Radiation Oncology and Imaging Analysis and Optimization in Health, July 18
- Radiation Oncology slides - Rao
- Optimization in Health slides - Denton
- Two Stage Biomarker Protocols paper
- Optimizing Active Surveillance Strategies paper
- Optimization of Prostate Biopsy Referral Decisions paper
Recorded Lectures
- Cancer Surveillance - Denton
- Radiation oncology & precision medicine - Rao
Day 25: Confessions of a Clinical Researcher, July 19
Recorded Lectures
- No recording available (Better, Not Just Bigger Data Analytics: Confessions of a Clinical Researcher - Nallamothu)
- No recording available (Journey Lecture - Orozco del Pino)
- No recording available (Journey Lecture - Beesley)
Week 6
Day 26: Preparing for Graduate School and CVs and Resumes, July 22
- Preparing for Graduate School slides - Kidwell
- CVs and Resumes slides - Forbes
- Resume Rubric slides
- CV Guide handout
- Action Verb handout
- Resume Samples handout
- Word Tricks handout
Recorded Lectures
- CVs & Resumes - Forbes
- Preparing for Grad School - Kidwell
Day 27: Pick Me!, July 23
Recorded Lectures
- No recording available (TBD - Griffiths)
Day 29: Symposium
2019 Professor Lectures Presentations
2019 Student Poster Presentations
2018 Student Poster Presentations
- Imaging Group Presentation
- Machine Learning Group Presentation
- Genetics Group Presentation
- Data Mining Presentation
2017 Student Poster Presentations
2017 Symposium Reference Files
2017 Symposium Projects
- A Time-to-Event Analysis of Heart Failure via Electronic Health Records
- Melanoma Detection by Classifying Skin Lesion Images
- Classifying Skin Lesions Images Using Adaptive Boosting
- Machine Learning Classification of Skin Lesion Images
- Genomics: Genome Storage and Assembly
- Predicting the Transcriptome from the Genome
- Classification of Cell Types from Peripheral Mononuclear Blood Cells
- EHR-Based Study of Long-Term Infectious Diseases
- Visualizing Lab and Phenotype Associations Using PheWAS and Electronic Health Records
- Data Mining: Microenvironment Microarray Spot Based Approach for Cell Prediction
- Estimating Cell Growth with Machine Learning and Data Mining
Additional Resources
- Daily Schedule - Last update June 4, 2019
- Social events Schedule - Last update June 6, 2019