Statistical Analysis Plan

Due: Friday, November 3, at 11:59 PM

Purpose

The purpose of the statistical analysis plan is to practice articulating the process you will use to answer your research questions. Some decisions about the modeling and analysis process must be made a priori. This is a chance to practice making those decisions and get feedback on your analysis plan prior to completing your final report.

Statistical Analysis Plan

You will generate a 1-2 page statistical analysis plan that contains the elements listed below. You do not need to run any code for this, so you can write the plan in whatever format you’d like (e.g., Quarto, R Markdown, Word). Always use descriptive variable labels instead of “raw” variable names (e.g., “Salary ($)” instead of “salary_in_usd”).

  • Data Overview: Provide the chief characteristics of your data, including sample size, number of variables, and source (you can use any citation format, but include more than just the link). Briefly describe how the data were collected. What does each row represent? Include your research questions in this section. (This section is exactly the same as in the EDA report, so you can copy and paste it.)

  • Modeling: For each research question, address the following:

    • What kind of model will you fit?

    • Is your question related to inference or prediction? How does this affect your modeling process, including variable selection and model assessment?

    • List the variables that you are selecting a priori to include in your model. If there are many predictors in your dataset, you can specify broad categories instead of specific variables. Remember to use variable labels instead of raw variable names.

    • You are required to include at least one interaction term in your model. Specify an interaction term that is meaningful for your research question. For example, you could say something like “we are interested in how the relationship between vaccination rate and COVID infection totals changed over time, so we will include an interaction term between month and vaccination rate.”

  • Potential challenges: How might you address the challenges that you identified in your EDA report?
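No code is required for the plan itself. If it helps to see what the interaction term in the Modeling element means in practice, here is a minimal Python sketch. It assumes a linear model fit by ordinary least squares. The data are synthetic and generated only for illustration, and the names `month`, `vacc_rate`, `infections`, `design_matrix`, and `vacc_slope` are hypothetical, not taken from any course dataset.

```python
import numpy as np

# Synthetic data for illustration only; these are NOT real COVID figures.
rng = np.random.default_rng(0)
n = 200
month = rng.integers(1, 13, size=n).astype(float)  # month index, 1 through 12
vacc_rate = rng.uniform(0.0, 1.0, size=n)          # proportion vaccinated

# Assumed data-generating process: the association between vaccination
# rate and infections becomes more negative in later months.
# Order: intercept, month, vaccination rate, month x vaccination rate.
true_coefs = np.array([500.0, 10.0, -100.0, -20.0])


def design_matrix(month, vacc_rate):
    """Columns: intercept, month, vaccination rate, and their product."""
    return np.column_stack(
        [np.ones_like(month), month, vacc_rate, month * vacc_rate]
    )


X = design_matrix(month, vacc_rate)
infections = X @ true_coefs + rng.normal(0.0, 5.0, size=n)

# Ordinary least squares estimate of the four coefficients.
coefs, *_ = np.linalg.lstsq(X, infections, rcond=None)


def vacc_slope(coefs, month_value):
    """Change in infections per unit of vaccination rate in a given month."""
    return coefs[2] + coefs[3] * month_value


print("Estimated slope in month 1: ", vacc_slope(coefs, 1))
print("Estimated slope in month 12:", vacc_slope(coefs, 12))
```

With the interaction term in the model, the slope for vaccination rate is no longer a single number: it depends on the month. That dependence is what a plan should describe in words when it justifies an interaction.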

Submit one report per group. One person should submit the report and select the other group members in the Gradescope submission. Be sure to assign pages in Gradescope when you submit.