Difference between revisions of "Main Page"
From U-M Big Data Summer Institute Wiki
(→Day 22 July 12) |
(→Symposium) |
(37 intermediate revisions by the same user not shown) | |||
Line 88: | Line 88: | ||
=== Day 17 July 5 === | === Day 17 July 5 === | ||
− | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=63727adf-6c0d-41fa-aab6-649583ada3ca Supervised Machine Learning (slides & audio)] - Hui Jiang, PhD | + | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=63727adf-6c0d-41fa-aab6-649583ada3ca Supervised Machine Learning 1 (slides & audio)] - Hui Jiang, PhD |
* [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=2ef517f5-bcf4-4656-b437-b8bd277a1a52 Intro to Bayes (slides & audio)] - Bhramar Mukherjee, PhD | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=2ef517f5-bcf4-4656-b437-b8bd277a1a52 Intro to Bayes (slides & audio)] - Bhramar Mukherjee, PhD | ||
Line 95: | Line 95: | ||
=== Day 19 July 7 === | === Day 19 July 7 === | ||
− | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=bce833e7-f85b-4093-930e-69d53d7a84aa Bayes | + | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=bce833e7-f85b-4093-930e-69d53d7a84aa Bayes Computation 1 (slides & audio)] - Tim Johnson, PhD |
− | * [https://drive.google.com/open?id=0B2ht_TCS6xC-MlhhOHR4ZlpfT1U Unsupervised Machine Learning (slides)] - Jenna Wiens, PhD | + | * [https://drive.google.com/open?id=0B2ht_TCS6xC-MlhhOHR4ZlpfT1U Unsupervised Machine Learning 1 (slides)] - Jenna Wiens, PhD |
=== Day 20 July 8 === | === Day 20 July 8 === | ||
Line 111: | Line 111: | ||
=== Day 23 July 13 === | === Day 23 July 13 === | ||
− | * [ | + | * [https://drive.google.com/open?id=0B2ht_TCS6xC-TmE1eEh3dGZ1UG8 Reproducible Research (slides)] - Jed Carlson, Biostatistics PhD student |
− | + | ||
=== Day 24 July 14 === | === Day 24 July 14 === | ||
− | * [ | + | * [https://drive.google.com/open?id=0B2ht_TCS6xC-WkZpekZReDY5Mnc Social Network Analysis (slides)] - Eytan Adar, PhD |
− | * [ | + | * [https://drive.google.com/open?id=0B2ht_TCS6xC-SFhqcmJyUHBPaDg Unsupervised Machine Learning 2 (slides)] - Jenna Wiens, PhD |
=== Day 25 July 15 === | === Day 25 July 15 === | ||
− | * [ | + | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=d667c41f-bdc8-4643-9aaa-c2f6f4a9a5b1 My Journey to Bigger than Average Data (slides & audio)] - Joe Messana, MD |
− | + | ||
=== Day 26 July 18 === | === Day 26 July 18 === | ||
− | * [ | + | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=5e64930a-2f7a-4f69-95b7-61afb986fd00 Supervised Machine Learning 3 (slides & audio)] - Hui Jiang, PhD |
− | * [ | + | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=ebd61dd6-1f44-4d48-b6f6-34e7c532c601 Population Genetics (slides & audio)] - Sebastian Zoellner, PhD |
+ | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=a97b5884-2ffc-4c56-a979-7a27f4b0ce78 Adventures in Human Genetics (slides & audio)] - Goncalo Abecasis, PhD, Department of Biostatistics Chair at the University of Michigan | ||
=== Day 27 July 19 === | === Day 27 July 19 === | ||
− | * [ | + | * [https://drive.google.com/open?id=0B2ht_TCS6xC-MHdqMGNJbGUzMWc Data Privacy and Security (slides)] - Jacob Abernethy, PhD |
− | * [http:// | + | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=17eac995-13a4-4ca5-9bab-aaaf1a3fd2b8 Learning Health Systems (slides & audio)] - Karandeep Singh, PhD |
+ | |||
+ | === Day 28 July 20 === | ||
+ | * [https://sph.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=9acb4056-07b8-4f59-89db-0a7d323439a1 Journey Lecture (slides & audio)] - Phil Boonstra, PhD | ||
+ | |||
+ | == Symposium == | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-enM2bF90VHNnMmc Symposium Welcome] - Bhramar Mukherjee, PhD | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-SXkydVBKMm56WHc Identifying and Correcting for Contamination in DNA Sequencing Studies] - Michael Boehnke, PhD | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-MWxjT0JoeXc1b2c Assessing Time-Varying Causal Effect Moderation using Intensively Collected Longitudinal Intervention Data] - Susan Murphy, PhD | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-bWkwWkhaZy1RaUk Statistical Methods for Personalized Medicine] - Jeremy Taylor, PhD | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-ek1DaVZtb0Vvcnc Software For (and With) Big Data] - Eytan Adar, PhD | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-WmN3R3M3QWtCd00 Using Doctors' Notes to Uncover Everyday Natural Experiments in Healthcare] - Karandeep Singh, MD | ||
+ | * [http://web.eecs.umich.edu/~jabernet/BDSI_2016/flint_water_talk.html#1 Data Science for the Flint Water Crisis] - Jacob Abernethy, PhD and Eric Schwartz, PhD | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-ZlVYd3NiYldiQ28 Doing Data Science] - Rachel Schutt, Chief Data Scientist at News Corp | ||
+ | |||
+ | '''Student Group Presentations''' | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-OVY5WlNhT2VnOTA Data Mining] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-amN6NkMxNnFLNm8 Electronic Health Records (EHR)] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-OEtIT1E2RHFFQlk Genomics] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-TG9ZMzJFcngxWm8 Machine Learning] | ||
+ | |||
+ | '''Student Poster Presentations''' | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-RnVJTHRKWngyRkU Using Data Mining Techniques and Classification Algorithms to Predict Impact of Academic Papers] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-WHc3SXZTdmhXMjA Clustering Gene Expression Profiles of Single Cells Using Expectation Maximization] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-dmJwRUV0WGVLTDA Data Mining: Association Rules Using the Apriori Algorithm] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-cW82UmNwQkk3MnM Exploring U.S. Population Stratification with Genes for Good Data] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-UTdXSnNVMU4ySkk Network Structure in Offshore Leaks Data] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-Z2JsZ3ZMNW53QUE A Genome-Wide Association Study for Mental Illness] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-WmkyWXJSWUV1bjg Evaluation of the End Stage Renal Disease Quality Incentive Program] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-VGJQcTQtNnZpSFU Genome-Wide Association Study of Alcohol Consumption and Stress Levels] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-MjZENURlQzNuaEU Preservation of Semantic Parallels Through Word2Vec] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-eWN1WTFmNWJPNFE Single Cell Difference in Gene Expression between Macular and Peripheral Human Retinal Cells] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-U05xUkswNFM3Unc Twitter Sentiment Analysis & The Stock Market] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-SUpnM2NpeDdyWlE Data Curation & Wrangling] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-cHVxU0JCRWlDX0U Predicting CKD and Potential Risk Factors with Multiple Linear Regression] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-MVFCSXZzYUJlMXM The ICIJ Panama Leak: Temporal & Spatial Visualizations of the World's Hidden Wealth] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-eWkwMDdzM0h1MVU A Streamlined Approach for Comparing Two Genomic Catalogs] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-MnZNVWJKLWt5QTQ From Interviews to Blood Tests: Using Statistical and Machine Learning Models to Predict Kidney Function] | ||
+ | * [https://drive.google.com/open?id=0B2ht_TCS6xC-QmdiYkJYamRwelk Flagging Facilities: An Empirical Null Method] | ||
== Getting started == | == Getting started == |
Latest revision as of 12:08, 1 August 2016
Welcome to the U-M Big Data 2016 Summer Program Wiki!
Consult the User's Guide for information on using the wiki software.
Contents
- 1 Reading Material
- 2 2016 Presentations
- 2.1 Day 1: June 13
- 2.2 Day 2: June 14
- 2.3 Day 3: June 15
- 2.4 Day 4: June 16
- 2.5 Day 5 June 17
- 2.6 Day 6 June 20
- 2.7 Day 7 June 21
- 2.8 Day 8 June 22
- 2.9 Day 9 June 23
- 2.10 Day 10 June 24
- 2.11 Day 11 June 27
- 2.12 Day 12 June 28
- 2.13 Day 13 June 29
- 2.14 Day 14 June 30
- 2.15 Day 15 July 1
- 2.16 Day 17 July 5
- 2.17 Day 18 July 6
- 2.18 Day 19 July 7
- 2.19 Day 20 July 8
- 2.20 Day 21 July 11
- 2.21 Day 22 July 12
- 2.22 Day 23 July 13
- 2.23 Day 24 July 14
- 2.24 Day 25 July 15
- 2.25 Day 26 July 18
- 2.26 Day 27 July 19
- 2.27 Day 28 July 20
- 3 Symposium
- 4 Getting started
Reading Material
Data Mining Group
- Automatic Construction and Natural-Language Description
- Interestingness Measures for Data Mining: A Survey
- Fast Algorithms for Mining Association Rules
EHR Group
- Statistical Inference, Learning and Models in Big Data
- USRDS 2015 Annual Data Report, Volume 1: Chronic Kidney Disease
- Dialysis Facility Characteristics and Services
- Data Dictionary for Quarterly Dialysis Facility Compare
- USRDS Introduction to Volume 1: CKD in the United States
Genomics Group
- A Global Reference for Human Genetic Variation
- A Map of Human Genome Variation from Population-Scale Sequencing
2016 Presentations
Day 1: June 13
- Orientation 2016 (slides) - Bhramar Mukherjee, PhD
- Welcome Reception (slides) - Bhramar Mukherjee, PhD
- Computing Resources (slides)
- Ethics Review (slides)
- On Being A Scientist (slides)
- Life in Ann Arbor (slides) - Evan Reynolds, Biostatistics PhD Student
- Intro to Probabilities and Distributions (file) - Rebecca Rothwell, Biostatistics PhD student
Day 2: June 14
- Computing Platforms (website) - Jacob Abernethy, PhD
- Introductory Statistics (slides & audio) - Bhramar Mukherjee, PhD
Day 3: June 15
- Intro to Unix (slides) - Hyun Min Kang, PhD
Day 4: June 16
- Data Structure using Python (website) - Jacob Abernethy, PhD
- Observational Data, Bias and Confounding (slides & audio) - Roderick Little, PhD
Day 5 June 17
- Data Structure using Python (website) - Jacob Abernethy, PhD
- From Pure Mathematics to Gene Discovery: A Biostatistician's Journey (slides & audio) - Michael Boehnke, PhD
Day 6 June 20
- Statistical Modeling using R (slides) - Phil Boonstra, PhD
- Study Design and Inference (slides & audio) - Roderick Little, PhD
Day 7 June 21
- Visualization using R (slides & audio) - Phil Boonstra, PhD
- Distributed Computing (slides) - Harsha Madhyastha, PhD
Day 8 June 22
- Fundamentals of Data Processing (slides) - Jed Carlson, Biostatistics PhD student
Day 9 June 23
- Matrix Computation (audio) - Jason Estes, Research Fellow in Biostatistics
Day 10 June 24
- Career Journey and A Principal Approach to Dimensionality Reduction Part 1 (slides) - Stephen Gliske, PhD
- Career Journey and A Principal Approach to Dimensionality Reduction Part 2 (slides) - Stephen Gliske, PhD
- Career Journey and A Principal Approach to Dimensionality Reduction Part 3 (slides) - Stephen Gliske, PhD
Day 11 June 27
- Large Scale Optimization Part 1 (slides & audio) - Ambuj Tewari, PhD
- Causal Inference (audio) - Lu Wang, PhD
- Likelihood Functions and Parameter Estimation (slides) - Matthew Zawistowski, Research Specialist at the University of Michigan
Day 12 June 28
- Large Scale Optimization Part 2 (slides & audio) - Ambuj Tewari, PhD
- Sequential Decision Making (slides & audio) - Ambuj Tewari, PhD
Day 13 June 29
- R - dplyr (slides) - Matthew Flickinger, Senior Analyst at the University of Michigan
- R - Troubleshooting (slides) - Matthew Flickinger, Senior Analyst at the University of Michigan
Day 14 June 30
- Clustering: Graphical Models and Sampling Algorithm Part 1 (slides & audio) - Long Nguyen, PhD
- Clustering: Graphical Models and Sampling Algorithm Part 2 (slides & audio) - Long Nguyen, PhD
Day 15 July 1
- My SMART Journey (slides & audio) - Kelley Kidwell, PhD
- My Spatial Journey to UM Biostatistics (slides & audio) - Veronica Berrocal, PhD
Day 17 July 5
- Supervised Machine Learning 1 (slides & audio) - Hui Jiang, PhD
- Intro to Bayes (slides & audio) - Bhramar Mukherjee, PhD
Day 18 July 6
- R Data Visualization (slides) - Matthew Flickinger, Senior Analyst at the University of Michigan
Day 19 July 7
- Bayes Computation 1 (slides & audio) - Tim Johnson, PhD
- Unsupervised Machine Learning 1 (slides) - Jenna Wiens, PhD
Day 20 July 8
- Bayes Computation 2 (slides & audio) - Tim Johnson, PhD
- The Moments and the Journey (slides & audio) - Bhramar Mukherjee, PhD
Day 21 July 11
- Visualization 1 (slides) - Eytan Adar, PhD
- Supervised Machine Learning 2 (slides & audio) - Hui Jiang, PhD
Day 22 July 12
- Visualization 2 (slides) - Eytan Adar, PhD
- Personalized Medicine (slides & audio) - Lu Wang, PhD
Day 23 July 13
- Reproducible Research (slides) - Jed Carlson, Biostatistics PhD student
Day 24 July 14
- Social Network Analysis (slides) - Eytan Adar, PhD
- Unsupervised Machine Learning 2 (slides) - Jenna Wiens, PhD
Day 25 July 15
- My Journey to Bigger than Average Data (slides & audio) - Joe Messana, MD
Day 26 July 18
- Supervised Machine Learning 3 (slides & audio) - Hui Jiang, PhD
- Population Genetics (slides & audio) - Sebastian Zoellner, PhD
- Adventures in Human Genetics (slides & audio) - Goncalo Abecasis, PhD, Department of Biostatistics Chair at the University of Michigan
Day 27 July 19
- Data Privacy and Security (slides) - Jacob Abernethy, PhD
- Learning Health Systems (slides & audio) - Karandeep Singh, PhD
Day 28 July 20
- Journey Lecture (slides & audio) - Phil Boonstra, PhD
Symposium
- Symposium Welcome - Bhramar Mukherjee, PhD
- Identifying and Correcting for Contamination in DNA Sequencing Studies - Michael Boehnke, PhD
- Assessing Time-Varying Causal Effect Moderation using Intensively Collected Longitudinal Intervention Data - Susan Murphy, PhD
- Statistical Methods for Personalized Medicine - Jeremy Taylor, PhD
- Software For (and With) Big Data - Eytan Adar, PhD
- Using Doctors' Notes to Uncover Everyday Natural Experiments in Healthcare - Karandeep Singh, MD
- Data Science for the Flint Water Crisis - Jacob Abernethy, PhD and Eric Schwartz, PhD
- Doing Data Science - Rachel Schutt, Chief Data Scientist at News Corp
Student Group Presentations
- Data Mining
- Electronic Health Records (EHR)
- Genomics
- Machine Learning
Student Poster Presentations
- Using Data Mining Techniques and Classification Algorithms to Predict Impact of Academic Papers
- Clustering Gene Expression Profiles of Single Cells Using Expectation Maximization
- Data Mining: Association Rules Using the Apriori Algorithm
- Exploring U.S. Population Stratification with Genes for Good Data
- Network Structure in Offshore Leaks Data
- A Genome-Wide Association Study for Mental Illness
- Evaluation of the End Stage Renal Disease Quality Incentive Program
- Genome-Wide Association Study of Alcohol Consumption and Stress Levels
- Preservation of Semantic Parallels Through Word2Vec
- Single Cell Difference in Gene Expression between Macular and Peripheral Human Retinal Cells
- Twitter Sentiment Analysis & The Stock Market
- Data Curation & Wrangling
- Predicting CKD and Potential Risk Factors with Multiple Linear Regression
- The ICIJ Panama Leak: Temporal & Spatial Visualizations of the World's Hidden Wealth
- A Streamlined Approach for Comparing Two Genomic Catalogs
- From Interviews to Blood Tests: Using Statistical and Machine Learning Models to Predict Kidney Function
- Flagging Facilities: An Empirical Null Method