Statistical Learning (Self-Paced)

SOHS-YSTATSLEARNING

Stanford School of Humanities and Sciences



Description

This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical).

This is not a math-heavy class, so we try to describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data analysis. Computing is done in R. There are lectures devoted to R, starting with tutorials from the ground up and progressing to more detailed sessions that implement the techniques in each chapter.

The lectures cover all the material in An Introduction to Statistical Learning, with Applications in R by James, Witten, Hastie and Tibshirani (Springer, 2013). As of January 5, 2014, the PDF of this book is available for free, with the consent of the publisher, on the book website.

Prerequisites

First courses in statistics, linear algebra, and computing.

Instructors

Trevor Hastie, John A Overdeck Professor of Statistics, Stanford University

Robert Tibshirani, Professor in the Departments of Health Research and Policy and of Statistics, Stanford University

001 Open for Enrollment Online, Open edX


Delivery Option: Online
Fees: Online Course $0.00

Notes

Statement of Accomplishment

If you complete the course and achieve a passing grade of at least 50% on the quizzes, you can generate a Statement of Accomplishment from within the course. If you score 90% or higher, your statement will be "with distinction".

Textbooks & Resources

A free online version of An Introduction to Statistical Learning, with Applications in R by James, Witten, Hastie and Tibshirani (Springer, 2013) is available from the book website. Springer has agreed to this, so there is no need to worry about copyright. Of course, you may not distribute printed versions of this PDF file.

You can get R for free from http://cran.us.r-project.org/. Typically it installs with a click. You can get RStudio from http://www.rstudio.com/, also for free, with a similarly easy install.

Time Commitment

It will take approximately 3-5 hours per week to go through the materials and exercises in each section.