Main Page
Welcome to the U-M Big Data Summer Institute 2018 Wiki!
Contents
- 1 Reading Material
- 2 2018 Presentations
- 3 Day 29: Symposium
- 4 Additional Resources
Reading Material
Data Mining Group
Papers
HTML Slides
- 6.19 Slides
- 6.20 Slides
- 6.21 Slides
- 6.22 Slides
- 6.27 Slides
- 6.28 Slides
- 7.2 Slides
- 7.9 Slides
- 7.10 Slides
- 7.11 Slides
- BDSI Classification 1
- BDSI Classification 2
Machine Learning Group
Research Lecture Slides
- Introduction
- Explore MIMIC
- Getting x and y
- Some tips
- Sample Pipeline
- Training Pipeline
- CNN
- LSTM
- Structuring
- Dataset and DataLoader
- pytorch models
- Model Development
- Population
- Benchmark Features
- Inclusion Exclusion
Lecture Notes
- Recurrent Neural Networks
- Convolutional Neural Networks
- Deep Learning
- BDSI Lecture
- Flux Guide
- Python Guide
- Python Tutorial
Readings
- ARF Epidemiology
- CNN for Sentence Classification
- Learning from Heterogeneous Temporal Data
- MIMIC 3
- MIMIC Benchmarks and Multitask RNN
- RNNs for Multivariate Time Series
- TREWScore for Septic Shock
- TREWScore Supplement
Genomics Group
Lectures
Intro Exercises
Papers
Population Genetics
Single Cell RNA
Transcriptomics
Online videos to better understand genetics and genomics
Genetics
- Introduction to Genetics by 23andMe (5 videos)
- TED-Ed : How Mendel's pea plants helped us understand genetics - Hortensia Jiménez Díaz
- Genetic Recombination and Gene Mapping by Bozeman Science
- Useful Genetics : A college-level comprehensive genetics course with 292 lectures offered by Rosie Redfield at UBC
Useful 3D Animations
- From DNA to protein - 3D Animation
- DNA Transcription - 3D Animation
- DNA splicing - 3D Animation
- mRNA Translation - 3D Animation
- How DNA is packaged - 3D Animation
- The Central Dogma - 3D Animation
Gene Regulation and Epigenetics
- Epigenetics Lecture by SciShow
- Hi-C Technique : A 3D map of the Human Genome
- The ENCODE Project
- RNAi by Nature Video
Sequencing Technologies
- TED-Ed : The race to sequence the human genome - Tien Nguyen
- DropSeq - Droplet-based Single Cell Sequencing by McCarroll Lab
Imaging Group
Papers
2018 Presentations
Week 1
Day 0: June 17
- Orientation Slides 2018 - Bhramar Mukherjee, PhD
Day 1: June 18
- Welcome Slides 2018 - Bhramar Mukherjee, PhD
- Training for Responsible Conduct in Research - Bhramar Mukherjee, PhD
- BDSIOrientationSupplement - Bhramar Mukherjee, PhD
- BDSI Event Presentation - Robert Peng
- BDSI Life in Ann Arbor - Stephen Salermo
- Al-Marzouki_s05 - The effect of scientific misconduct on the results of clinical trials: A Delphi survey
- baggerlycoombes - Deriving chemosensitivity from cell lines: forensic bioinformatics and reproducible research in high-throughput biology
- ethicalguidelines - Ethical Guidelines for Statistical Practice
- ODS - On being a scientist 2009
- breiman - Statistical Modeling: The Two Cultures
- breimaninterview - A conversation with Leo Breiman
Recorded Lectures
Day 2: June 19
- Study Design and Inference - Roderick Little, PhD
- Reproducible Research - Jedidiah Carlson
- Data Processing - Jed Carlson
- cameronpaulingpnas1976 - Supplemental ascorbate in the supportive treatment of cancer: Prolongation of survival times in terminal human cancer.
- comroe1977 - Experimental studies designed to evaluate the management of patients with incurable cancer
- CREAGAN - Failure of High-Dose Vitamin C therapy to benefit patients with advanced cancer
- neyman34jrss - On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection
Recorded Lectures
- Reproducible Research - J. Carlson
- Study Design and Inference - R. Little
Day 3: June 20
- Observational Data - Roderick Little, PhD
- castnejm1989 - Preliminary report
- Tocainideahj80 - Prophylaxis of ventricular tachyarrhythmias
- wakefieldlancet - Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children
- R intro file - Matthew Flickinger
- R intro Lecture - Matthew Flickinger
- Linear Algebra - Klemmer
- Matrix Algebra Lecture - Klemmer
Recorded Lectures
- Big Data pt. 2 - Roderick Little, PhD
- Linear Algebra - Klemmer
Day 4: June 21
- Linear Regression Slides - Matthew Zawistowski
- Logistic Regression Slides - Matthew Zawistowski
- R dplyr Slides - Matthew Flickinger
- dplyr R code - Flickinger
- dplyr R Flights - Flickinger
- dplyr OCSLS - Flickinger
Recorded Lectures
- dplyr - Matthew Flickinger
- Logistic Regression - Matthew Zawistowski
- Linear Regression - Matthew Zawistowski
Day 5: June 22
- State of Institute Slides - Bhramar Mukherjee, PhD
- Big Data 3 Slides - Roderick Little, PhD
- ggplot2 Slides - Matthew Flickinger, PhD
- Dplyr_Sms
- ggplot2
- Parameter Estimation - Roderick Little, PhD
- Fisher22philtransa - On the Mathematical Foundations of Theoretical Statistics
Recorded Lectures
- Parameter Estimation - Rod Little
- Journey Lecture - Sanchez
Week 2
Day 6: June 25
- Matrix Computations - Peisong Han
- Model Selection - Lauren Beesley
- Python Lecture 1 - Max Smith
- Python Notebook 1 - Max Smith
Recorded Lectures
- Matrix Computations - Han
Day 7: June 26
- Model Selection II - Lauren Beesley
- Machine Learning - Jenna Wiens
- Python Lecture 2 - Max Smith
- Python Notebook 2 - Max Smith
Recorded Lectures
Day 8: June 27
- Clustering - Danai Koutra
- Machine Learning II - Jenna Wiens
- Python Lecture 3 - Max Smith
- Python Notebook 3 - Max Smith
Day 9: June 28
- Causal Inference Slides - Zhenke Wu
- Clustering Part 2 Slides - Danai Koutra
Recorded Lectures
Day 10: June 29
- R Loops Slides - Flickinger
- dplyr NYC Flights
- dplyr OCSLS
- dplyr sms
- ggplot MPG
- ggplot NYC Flights
- R Simulations
Recorded Lectures
- Reading Like A Scientist - Griffiths
- R and Loops - Flickinger
Week 3
Day 11: July 2
- Information Visualization Slides - Kay
- Data Mining Slides - Najarian
Recorded Lectures
- Data Mining - Najarian
- Information Visualization - Kay
Day 12: July 3
- Data Mining 2 Slides - Najarian
- Bayes Slides - Wen
Recorded Lectures
- Data Mining 2 - Najarian
- Bayesian Statistics - Wen
Day 13: July 4 (NO CLASS)
Day 14: July 5
- Bayes Slides 2 - Wen
- Information Visualization 2 Slides - Kay
Recorded Lectures
- Information Visualization 2 - Kay
- Information Visualization 2 Contd. - Kay
- Bayesian Statistics 2 - Wen
Day 15: July 6
- Grad School Slides - Kidwell
Recorded Lectures
- Grad School - Kidwell
Week 4
Day 16: July 9
- Troubleshooting R Slides - Flickinger
- Troubleshooting R code - Flickinger
Recorded Lectures
- Computing - Sun
- Troubleshooting R - Flickinger
Day 17: July 10
- Bayesian Data Analysis - Chen
Recorded Lectures
- Bayesian Data Analysis pt 1 - Chen
- Bayesian Data Analysis pt 2 - Chen
Day 18: July 11
- Quinn Research Slides - Quinn
Recorded Lectures
- Quinn Lecture - Quinn
Day 19: July 12
- Predictive Analytics Slides - Denton
Day 20: July 13
- Resume/CV Slides - Forbes
- Resume Rubric - Forbes
Recorded Lectures
- Resume/CVs Lecture - Forbes
Week 5
Day 21: July 16
- Natural Language Processing Slides - Karandeep
Recorded Lectures
- Natural Language Processing Lecture - Karandeep
Day 22: July 17
- Optimization - Jiang
Recorded Lectures
Day 23: July 18
Recorded Lectures
- Scientific Presentations Lecture - Zoellner
- Pick Me Lecture - Griffiths
Day 24: July 19
Recorded Lectures
- Epigenomics Lecture - Smith
- Social Network Lecture - Budak
Day 25: July 20
Recorded Lectures
- Mathematical Modeling Lecture - Eisenberg
- Journey Lecture - Pedro Orozco, BDSI Student Coordinator
Week 6
Day 26: July 23
Day 27: July 24
Day 28: July 25
Day 29: Symposium
2017 Student Poster Presentations
2017 Symposium Reference Files
2017 Symposium Projects
- A Time-to-Event Analysis of Heart Failure via Electronic Health Records
- Melanoma Detection by Classifying Skin Lesion Images
- Classifying Skin Lesion Images Using Adaptive Boosting
- Machine Learning Classification of Skin Lesion Images
- Genomics: Genome Storage and Assembly
- Predicting the Transcriptome from the Genome
- Classification of Cell Types from Peripheral Mononuclear Blood Cells
- EHR-Based Study of Long-Term Infectious Diseases
- Visualizing Lab and Phenotype Associations Using PheWAS and Electronic Health Records
- Data Mining: Microenvironment Microarray Spot Based Approach for Cell Prediction
- Estimating Cell Growth with Machine Learning and Data Mining
Additional Resources
- DataCamp Resources
- Daily Schedule - Last updated July 3, 2018