What is statistics?
How does statistics fit into data science?
Start with a research question
Broadly speaking, research questions fall into two categories:
Prediction questions require training a model that will perform well on new data
Inference questions require a model that can assess relationships between an outcome and one or more predictor variables
In this class, we will focus on inference. Next semester, you will focus more on prediction.
Based on the research question, we can identify the population of interest.
Often, it is unrealistic to collect data on the entire population, so we collect a sample.
But how?
Say I want a sample of five students in this class so that I can estimate the proportion of the class that identifies as extroverts. How could I choose five people?
Simple random sampling: the gold standard, but not always practical
Many statistical methods assume simple random sampling
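As a rough illustration of the five-student example, here is a minimal Python sketch of simple random sampling. The roster, the extrovert responses, and the helper names (`simple_random_sample`, `sample_proportion`) are all hypothetical and made up for this example; in a real study, responses would be collected only from the sampled students.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed only makes the draw reproducible
    return rng.sample(population, n)


def sample_proportion(responses):
    """Proportion of True values in a list of yes/no responses."""
    return sum(responses) / len(responses)


# Hypothetical class roster of 40 students (made-up identifiers).
roster = [f"student_{i:02d}" for i in range(1, 41)]

# Hypothetical survey answers: True means the student identifies as an extrovert.
is_extrovert = {name: (i % 3 == 0) for i, name in enumerate(roster)}

chosen = simple_random_sample(roster, n=5, seed=702)
p_hat = sample_proportion([is_extrovert[name] for name in chosen])

print(chosen)
print(f"Sample proportion of extroverts: {p_hat:.2f}")
```

Because every student has the same chance of being chosen, the sample proportion is an unbiased estimate of the class proportion, although with only five students it can vary a lot from one sample to the next.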
Syllabus
Andrea Lane (you can call me Andrea!) she/her
Assistant Professor of the Practice, MIDS and Dept of Statistical Sciences
PhD in Biostatistics, Emory University
Work in health/social justice/community-engaged applications
Hobbies: Sports (baseball, football, basketball), board games, moviesssss
Fit and interpret statistical models, including linear and generalized linear models.
Connect statistical modeling concepts to underlying statistics fundamentals including probability distributions and estimation.
Map a research question and dataset to the appropriate statistical model.
Make careful and critical decisions about model building and consider real-world implications.
Communicate (through written and oral communication) model results to a broad audience.
Intuitive Introductory Statistics by Douglas A. Wolfe and Grant Schneider. (Available through the Duke Library)
An Introduction to Statistical Learning with Applications in R, 2nd edition by James, G., Witten, D., Hastie, T., and Tibshirani, R. (Available online)
Introduction to Modern Statistics, Second Edition by Mine Çetinkaya-Rundel and Johanna Hardin. (Available online)
Canvas has the link to the course website, which is where course materials will be posted
Assignments will be submitted on Gradescope, which you can access through Canvas
Announcements will be posted on the #ids702-fa25 Slack channel in the MIDS Workspace. You’re also welcome to post questions or resources there!
Andrea: Thursday after class (4:30-5:30) in the classroom if available or my office (Gross 223 - by the lockers)
Kayla and Atreya: TBD
Communication: Slack or email; if you have not heard back after 48 hours (weekdays only), please follow up
Late submissions:
50% credit if submitted within 24 hours of the deadline
One no-questions-asked 24-hour extension for homework or statistics reflection
No make-up assignments
Academic integrity:
As a Duke student, you agree to uphold the Duke Community Standard
Read Nick Eubank’s advice on using ChatGPT
Duke Counseling and Psychological Services (CAPS)
Student Disability Access Office (SDAO)
Academic Resource Center (ARC)