Linear Regression, GLMs and GAMs with R

How to extend linear regression to specify and estimate generalized linear models and additive models.

Last updated 2022-01-10 | 4.4

What you'll learn

  • Understand the assumptions of ordinary least squares (OLS) linear regression.
  • Specify, estimate and interpret linear (regression) models using R.
  • Understand how the assumptions of OLS regression are modified (relaxed) in order to specify, estimate and interpret generalized linear models (GLMs).
  • Specify, estimate and interpret GLMs using R.
  • Understand the mechanics and limitations of specifying, estimating and interpreting generalized additive models (GAMs).

Requirements

  • Students will need to install R and R Commander software, but ample instruction for doing so is provided.

Description

Linear Regression, GLMs and GAMs with R demonstrates how to use R to extend the basic assumptions and constraints of linear regression in order to specify, estimate, and interpret generalized linear models (GLMs) and generalized additive models (GAMs). The course demonstrates the estimation of GLMs and GAMs by working through a series of practical examples from the book Generalized Additive Models: An Introduction with R by Simon N. Wood (Chapman & Hall/CRC Texts in Statistical Science, 2006).

Linear statistical models have a univariate response modeled as a linear function of predictor variables plus a zero-mean random error term. The assumption of linearity is a critical (and limiting) characteristic. Generalized linear models (GLMs) relax this assumption: they permit the expected value of the response variable to be a smooth (possibly non-linear) monotonic function of the linear predictor. GLMs also relax the assumption that the response variable is normally distributed by allowing for other distributions from the exponential family (e.g. normal, Poisson, binomial).

Generalized additive models (GAMs) are extensions of GLMs. GAMs replace some or all of the linear terms in the predictor with smooth functions of the predictor variables, estimated with non-parametric smoothers. Non-parametric smoothers like lowess (locally weighted scatterplot smoothing) fit a smooth curve to data using localized subsets of the data.

This course provides an overview of modeling GLMs and GAMs using R. GLMs, and especially GAMs, have evolved into standard statistical methodologies of considerable flexibility. The course addresses recent approaches to modeling, estimating and interpreting GAMs, and its focus is on modeling and interpreting GLMs and especially GAMs with R. Use of the freely available R software illustrates the practicalities of linear, generalized linear, and generalized additive models.
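
To make the idea of a lowess smoother concrete, here is a minimal sketch using base R's lowess() function. The data are simulated for illustration only (a noisy sine curve); they are an assumption of this sketch and are not taken from the course materials.

```r
# Simulated data (hypothetical): a noisy sine curve
set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.3)

# lowess() returns a list with components x and y (the smoothed values).
# The argument f is the smoother span: the fraction of the data used in
# each local fit. A smaller span follows the data more closely.
fit_wiggly <- lowess(x, y, f = 0.1)
fit_smooth <- lowess(x, y, f = 0.5)

# Crude roughness measure: sum of squared second differences of the curve
roughness <- function(fit) sum(diff(fit$y, differences = 2)^2)
roughness(fit_wiggly)
roughness(fit_smooth)

# Draw both curves over the raw data when run interactively
if (interactive()) {
  plot(x, y, col = "grey50", main = "lowess with two spans")
  lines(fit_wiggly, col = "blue")
  lines(fit_smooth, col = "red")
}
```

The span plays the role of a smoothing parameter: choosing it trades off fidelity to the data against smoothness of the fitted curve.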

Who this course is for:

  • This course would be useful for anyone involved in estimating linear models, including graduate students and working professionals in quantitative modeling and data analysis.
  • The focus, and majority of content, of this course is on generalized additive modeling. Anyone who wishes to learn how to specify, estimate and interpret GAMs would especially benefit from this course.

Course content

5 sections • 69 lectures

Introduction to Course 01:51

Preliminaries: Installing R, RStudio, R Commander, Course Materials and Exercise 05:16

Beginning Agenda (slides) 08:18

What is Linear Modeling? (slides, part 1) 05:11

The term "linear" refers to the fact that we are fitting a line or, more generally, a model that is linear in its coefficients. The term "model" refers to the equation that summarizes the fitted line. The term "linear model" is often taken as synonymous with "linear regression model".
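
As a minimal sketch of what fitting a line looks like in R, the following uses lm() on simulated data. The variable names and the true intercept and slope (2 and 0.5) are assumptions chosen for illustration, not values from the course.

```r
# Simulated data (hypothetical): straight line plus normal noise
set.seed(42)
n <- 100
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 1)   # true intercept 2, true slope 0.5
dat <- data.frame(x = x, y = y)

# Fit the linear regression model y = b0 + b1 * x + error
fit <- lm(y ~ x, data = dat)
coef(fit)      # estimated intercept and slope
summary(fit)   # standard errors, t-tests, R-squared

# A model with a squared term is still a linear model, because it is
# linear in the coefficients even though the fitted curve is not a line
fit_quad <- lm(y ~ x + I(x^2), data = dat)
coef(fit_quad)
```

The estimated coefficients should land close to the values used to simulate the data, which is a quick sanity check on the fit.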

Assumptions of Linear Modeling (slides, part 2) 06:08

Assumptions of Linear Models (regression):

  1. The residuals are independent
  2. The residuals are normally distributed
  3. The residuals have a mean of 0 at all values of X
  4. The residuals have constant variance
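
The four assumptions listed above can be checked informally from the residuals of a fitted model. The sketch below uses simulated data and a handful of base R tools; the particular checks shown are one reasonable choice among many, not necessarily the ones used in the lectures.

```r
# Simulated data (hypothetical) that satisfies the assumptions
set.seed(7)
dat <- data.frame(x = runif(150, 0, 10))
dat$y <- 1 + 2 * dat$x + rnorm(150, sd = 1.5)

fit <- lm(y ~ x, data = dat)
res <- residuals(fit)

# 1. Independence: lag-1 autocorrelation of the residuals (near 0 is good)
acf(res, plot = FALSE)$acf[2]

# 2. Normality: Shapiro-Wilk test (a large p-value gives no evidence
#    against normality)
shapiro.test(res)

# 3. Mean zero: with an intercept the overall mean is zero by construction,
#    so look for trends in the residuals against the fitted values instead
mean(res)
cor(res, fitted(fit)^2)

# 4. Constant variance: compare residual spread in the lower and upper
#    halves of the fitted values
lower <- fitted(fit) <= median(fitted(fit))
c(lower = var(res[lower]), upper = var(res[!lower]))

# Standard diagnostic plots when run interactively
if (interactive()) {
  par(mfrow = c(2, 2))
  plot(fit)
}
```

In practice the diagnostic plots are usually more informative than any single test statistic.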

Desirable Properties of Beta-hat (slides, part 3) 07:19

Example: Estimate Age of Universe (slides) 04:39

Example: Estimate Age of Universe Live in R (part 1) 07:44

Example: Estimate Age of Universe Live in R (part 2) 09:22

Example: Estimating Age of the Universe (part 3) 08:50

Finish Example and More Notes on Linear Modeling 08:31

Linear Modeling Exercises 01:48

Introduction to GLMs (slides, part 1) 06:58

In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.
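
The two ingredients named above, the link function and the variance function, are both stored in R's family objects. The short sketch below inspects them and then confirms, on simulated data (an assumption of this example), that a GLM with a Gaussian family and identity link reproduces ordinary linear regression.

```r
# A family object bundles the link and the variance function
fam <- poisson(link = "log")
fam$linkfun(10)    # link applied to a mean of 10, i.e. log(10)
fam$linkinv(0)     # inverse link at linear predictor 0, i.e. exp(0) = 1
fam$variance(3)    # Poisson variance equals the mean: 3

binomial()$variance(0.5)   # mu * (1 - mu) = 0.25
gaussian()$variance(5)     # constant variance function: 1

# Simulated data (hypothetical) to compare lm() and glm()
set.seed(3)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)

fit_lm  <- lm(y ~ x)
fit_glm <- glm(y ~ x, family = gaussian(link = "identity"))

# The coefficient estimates agree up to numerical tolerance
all.equal(coef(fit_lm), coef(fit_glm))
```

Changing only the family argument is what moves glm() from ordinary regression to logistic, Poisson, or other models.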

Introduction to GLMs (slides, part 2) 07:29

Introduction to GLMs (slides, part 3) 07:50

Introduction to GLMs (slides, part 4) 06:44

Example: Binomial (Proportion) Model with Heart Disease (part 1) 07:50

Proportion data has values that fall between zero and one. Naturally, it would be nice to have the predicted values also fall between zero and one. One way to accomplish this is to use a generalized linear model (glm) with a logit link and the binomial family.
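
A minimal sketch of such a model follows. The data are simulated rather than the heart disease data used in the lectures, and the names dose, cases and non_cases are hypothetical. The response is given as a two-column matrix of successes and failures, which is one standard way to pass proportion data to glm().

```r
# Simulated grouped data (hypothetical): 50 patients at each predictor level
set.seed(11)
dose <- seq(20, 460, by = 40)
n_patients <- rep(50, length(dose))
true_p <- plogis(-3 + 0.02 * dose)          # true probability of a case
cases <- rbinom(length(dose), size = n_patients, prob = true_p)
dat <- data.frame(dose, cases, non_cases = n_patients - cases)

# Binomial GLM with a logit link
fit <- glm(cbind(cases, non_cases) ~ dose,
           family = binomial(link = "logit"), data = dat)
coef(fit)

# Predicted proportions are guaranteed to lie strictly between 0 and 1
p_hat <- predict(fit, type = "response")
range(p_hat)

# For contrast, a straight line fitted to the raw proportions is not
# constrained and can predict values outside [0, 1] when extrapolated
fit_lm <- lm(I(cases / (cases + non_cases)) ~ dose, data = dat)
predict(fit_lm, newdata = data.frame(dose = c(0, 600)))
```

The logit link maps the unbounded linear predictor onto the unit interval, which is why the fitted proportions stay in range.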

Example: Binomial (Proportion) Model with Heart Disease (part 2) 07:26

Example: Binomial (Proportion) Model with Heart Disease (part 3) 08:16

Example: Binomial (Proportion) Model with Heart Disease (part 4) 06:22

GLM Exercises 01:05

Current Agenda 01:46

Linear Regression Exercise Solutions (part 1) 07:31

Linear Regression Exercise Solutions (part 2) 07:29

GLM Exercise Solutions (part 3) 09:30

Example: Poisson Model with Count Data (part 1) 08:15

In statistics, Poisson regression is a form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable Y has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown parameters. A Poisson regression model is sometimes known as a log-linear model, especially when used to model contingency tables.

Poisson regression models are generalized linear models with the logarithm as the (canonical) link function, and the Poisson distribution function as the assumed probability distribution of the response.
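
The following sketch fits a Poisson regression with a log link to simulated counts. The coefficients used to generate the data (0.5 and 1.2) are assumptions for illustration. It also computes a rough dispersion estimate, since a value well above 1 would suggest overdispersion relative to the Poisson assumption.

```r
# Simulated count data (hypothetical): log of the mean is linear in x
set.seed(5)
n <- 200
x <- runif(n, 0, 2)
mu <- exp(0.5 + 1.2 * x)
y <- rpois(n, lambda = mu)

# Poisson GLM with the canonical log link
fit <- glm(y ~ x, family = poisson(link = "log"))
coef(fit)

# On the log scale effects are additive, so exponentiating a coefficient
# gives the multiplicative change in the expected count per unit of x
exp(coef(fit)[["x"]])

# Rough overdispersion check: Pearson chi-square over residual degrees of
# freedom should be near 1 when the Poisson assumption holds
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion
```

Because these data really are Poisson, the dispersion estimate should be close to 1; real count data often are not so well behaved.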

Example: Poisson Model with Count Data (part 2) 09:29

Example: Binary Response Variable (part 1) 04:43

Example: Binary Response Variable (part 2) 06:12

Exercise: GLM to GAM 01:40

Example: Log-Linear Model for Categorical Data 05:55

Log-linear analysis is a technique used in statistics to examine the relationship between more than two categorical variables.
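
As an illustration, a log-linear model can be fitted in R as a Poisson GLM on the cell counts of a contingency table. The sketch below uses the HairEyeColor table that ships with R as a stand-in; it is not the dataset used in the lecture.

```r
# Built-in 4 x 4 x 2 table of hair colour, eye colour and sex,
# flattened to one row per cell with the count in column Freq
tab <- as.data.frame(HairEyeColor)
head(tab)

# Mutual independence: main effects only
indep <- glm(Freq ~ Hair + Eye + Sex, family = poisson, data = tab)

# Allow hair and eye colour to be associated; sex independent of both
assoc <- glm(Freq ~ Hair * Eye + Sex, family = poisson, data = tab)

# Likelihood-ratio (deviance) comparison of the two nested models
anova(indep, assoc, test = "Chisq")

# Residual deviance of each model
c(independence = deviance(indep), association = deviance(assoc))
```

A large drop in deviance when the interaction is added indicates that the two variables are associated.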

More on Deviance and Overdispersion (slides) 03:11

What are GAMs? (Crawley, slides, part 1) 07:41

In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. GAMs were originally developed by Trevor Hastie and Robert Tibshirani to blend properties of generalized linear models with additive models.
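
A minimal sketch of a GAM in R follows, using the mgcv package (the package that accompanies Wood's book and is distributed with standard R installations). The built-in airquality data are used here as a stand-in and are an assumption of this example; they are not the ozone data used in the lectures.

```r
library(mgcv)

# Drop rows with missing values in the variables used
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Each s() term is an unknown smooth function estimated from the data;
# the linear predictor is the sum of these smooths plus an intercept
fit <- gam(Ozone ~ s(Solar.R) + s(Wind) + s(Temp),
           data = aq, method = "REML")

# Effective degrees of freedom (edf) near 1 suggest a nearly linear
# effect; larger values indicate a more flexible curve
summary(fit)

# Plot the estimated smooths on one page when run interactively
if (interactive()) plot(fit, pages = 1)
```

Inference focuses on the shape of each estimated smooth, which is why the plots are usually the main output of interest.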

What are GAMs? (Crawley, slides, part 2) 06:02

Demonstrate GAM Ozone Data (part 1) 09:40

Demonstrate GAM Ozone Data (part 2) 09:42

General Approaches for Fitting GAMs (slides) 02:44

What are GAMs? (Wood, slides, part 1) 11:34

Univariate Polynomial GAMs (Wood, slides, part 2) 07:27

Univariate Polynomial GAMs (Wood, slides, part 3) 05:52

GAMs as 4th Order Polynomials (slides, part 1) 06:21

GAMs as 4th Order Polynomials (slides, part 2) 04:29

GAMs as Regression Splines (slides) 03:38

Cubic Splines (slides, part 1) 08:45

Cubic Splines (slides, part 2) 04:21

Function to Establish Basis for Spline (slides) 07:33

Build-a-GAM (slides, part 1) 07:46

Build-a-GAM (slides, part 2) 10:16

Build-a-GAM (slides, part 3) 06:17

Build-a-GAM Demonstration in R Script 11:34

Build-a-GAM Cross Validation 08:13

Bivariate GAMs with 2 Explanatory Independent Variables (slides, part 1) 09:17

Bivariate GAMs with 2 Explanatory Independent Variables (slides, part 2) 07:31

Exercises 01:33

Current Agenda (slides) 05:23

Cherry Trees and Finer Control (slides, part 1) 08:10

Finer Control of GAM (slides, part 2) 10:52

Using Smoothers with More than One Predictor (slides) 07:04

More on Alternative Smoothing Bases (slides) 08:06

Parametric Model Terms (slides) 08:29

Example: Brain Imaging (part 1) 07:51

Example: Brain Imaging (part 2) 08:09

Example: Brain Imaging (part 3) 07:38

Example: Brain Imaging (part 4) 07:03

Example: Brain Imaging (part 5) 07:41

Example: Air Pollution in Chicago (part 1) 09:33

Example: Air Pollution in Chicago (part 2) 09:17

Air Pollution in Chicago (part 3) 04:40

More Exercises 05:41