"Small" Data: Prediction, Inference, Causality


Stanford School of Engineering



"Small" data are datasets that allow interaction, visualization, exploration and analysis on a local machine to drive business intelligence. This course explores the difference between "small" data and big data and provides an introduction to applied data analysis, with an emphasis on a conceptual framework for thinking about data from both statistical and machine learning perspectives.



Prerequisites

  • STATS116
  • Experience with R at the level of STATS195
  • One year of college-level calculus (through calculus of several variables, such as CME100 or MATH51)
  • Background in statistics, experience with spreadsheets recommended.
  • An undergraduate degree with a GPA of 3.0 or equivalent

Topics include

  • Binary classification
  • Bootstrapping
  • Causal inference
  • Experimental design
  • Machine learning
  • Regression
  • Statistics (frequentist, Bayesian)
  • Time series modeling

Note on Course Availability

The course schedule is displayed for planning purposes; courses can be modified or cancelled. Course availability will be considered finalized on the first day of open enrollment. For quarterly enrollment dates, please refer to our graduate certificate homepage.