-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathForecasting-3-1-ARMA-Intro.Rmd
148 lines (94 loc) · 4.26 KB
/
Forecasting-3-1-ARMA-Intro.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
# ARIMA Models
The basic idea in an ARMA model is that past values in the time series have information about the current state. An AR model, the first part of ARMA, models the current state as a linear function of past values:
$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + ... + \phi_p x_{t-p} + e_t$$
![](./figs/SpeciesPlot.jpeg)
## Overview
### Components of an ARIMA model
You will commonly see ARIMA models referred to as *Box-Jenkins* models. This model has 3 components (p, d, q):
- **AR autoregressive** $y_t$ depends on past values. The AR level is maximum lag $p$.
$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + ... + \phi_p x_{t-p} + e_t$$
- **I differencing** $x_t$ may be a difference of the observed time series. The number of differences is denoted $d$. First difference is $d=1$:
$$x_t = y_t - y_{t-1}$$
- **MA moving average** The error $e_t$ can be a sum of a time series of independent random errors. The maximum lag is denoted $q$.
$$e_t = \eta_t + \theta_1 \eta_{t-1} + \theta_2 \eta_{t-2} + ... + \theta_q \eta_{t-q},\quad \eta_t \sim N(0, \sigma)$$
#### Create some data from an AR(2) Model {-}
$$x_t = 0.5 x_{t-1} + 0.3 x_{t-2} + e_t$$
```{r arima.sim, fig.align="center"}
dat = arima.sim(n=1000, model=list(ar=c(.5,.3)))
plot(dat)
abline(h=0, col="red")
```
Compare AR(2) and random data.
```{r arimavsrn, fig.align="center", echo=FALSE}
par(mfrow=c(1,2))
plot(dat[1:500],type="l",ylab="dat")
abline(h=0, col="red")
title("ar(2)")
plot(rnorm(length(dat))[1:500],type="l",ylab="dat",xlab="Time")
abline(h=0, col="red")
title("random normal")
```
#### AR(2) is auto-correlated {-}
Plot the data at time $t$ against the data at time $t-1$
```{r arimavsrncorr, fig.align="center", echo=FALSE}
par(mfrow=c(1,2))
TT=length(dat)
plot(dat[2:TT],dat[1:(TT-1)],type="p")
title("ar(2)")
rn=rnorm(length(dat))
plot(rn[2:TT],rn[1:(TT-1)],type="p")
title("random normal")
```
### Box-Jenkins method
This refers to a step-by-step process of selecting a forecasting model. You need to go through the steps otherwise you could end up fitting a nonsensical model or using fitting a sensible model with an algorithm that will not work on your data.
A. Model form selection
1. Evaluate stationarity and seasonality
2. Selection of the differencing level (d)
3. Selection of the AR level (p)
4. Selection of the MA level (q)
B. Parameter estimation
C. Model checking
### ACF and PACF functions
#### The ACF function {-}
The auto-correlation function (ACF) is the correlation between the data at time $t$ and $t+1$. This is one of the basic diagnostic plots for time series data.
```{r acf, fig.align="center"}
acf(dat[1:50])
```
The ACF simply shows the correlation between all the data points that are lag $p$ apart. Here are the correlations for points lag 1 and lag 10 apart. `cor()` is the correlation function.
```{r corr}
cor(dat[2:TT], dat[1:(TT-1)])
cor(dat[11:TT], dat[1:(TT-10)])
```
The values match what we see in the ACF plot.
```{r acf2, fig.align="center", echo=FALSE}
par(mfrow=c(1,2))
plot(dat[1:50], type="b", pch="x")
title("dat[1:50]")
acf(dat,lag.max=25)
```
#### ACF for independent data {-}
Temporally independent data shows no significant autocorrelation.
```{r acf.random, fig.align="center",echo=FALSE}
rn <- rnorm(TT)
par(mfrow=c(1,2))
plot(rn[1:50], type="b", pch="x")
title("random 1:50")
acf(rn,lag.max=25)
```
#### PACF function {-}
In the ACF for the AR(2), we see that $x_t$ and $x_{t-3}$ are correlated even those the model for $x_t$ does not include $x_{t-3}$. $x_{t-3}$ is correlated with $x_t$ indirectly because $x_{t-3}$ is directly correlated with $x_{t-2}$ and $x_{t-1}$ and these two are in turn directly correlated with $x_t$. The partial autocorrelation function removes this indirect correlation. Thus the only significant lags in the PACF should be the lags that appear in the process model.
For example, if the model is
#### Partial ACF for AR(2) {-}
$$x_t = 0.5 x_{t-1} + 0.3 x_{t-2} + e_t$$
then only the first two lags should be significant in the PACF.
```{r pacf.ar2, fig.align="center"}
pacf(dat)
```
#### Partial ACF for AR(1) {-}
Similarly if the process model is
$$x_t = 0.5 x_{t-1} + e_t$$
The PACF should only have significant values at lag 1.
```{r pacf.ar3, fig.align="center"}
dat <- arima.sim(TT, model=list(ar=c(.5)))
pacf(dat)
```