Data Forecasting – Stochastic Hydrology – Lecture Notes (source: Docsity.com)

Data Generation:
Consider the AR(1) model, Xt = φ1 Xt-1 + et, where et has zero mean and is uncorrelated.
e.g., φ1 = 0.5: the AR(1) model is Xt = 0.5 Xt-1 + et

X1 = 3.0 (assumed)
X2 = 0.5 × 3.0 + 0.335 = 1.835
X3 = 0.5 × 1.835 + 1.226 = 2.14
And so on…

Data Forecasting:
Consider the AR(1) model, Xt = φ1 Xt-1 + et. The forecast is the expected value:
E[Xt] = φ1 E[Xt-1] + E[et]
X̂t = φ1 Xt-1, since the expected value of et is zero.

Consider the ARMA(1,1) model, Xt = φ1 Xt-1 + θ1 et-1 + et:
E[Xt] = φ1 Xt-1 + θ1 et-1 + 0, where et-1 is the error in the forecast in the previous period.
e.g., φ1 = 0.5, θ1 = 0.4: the forecast model is written as X̂t = 0.5 Xt-1 + 0.4 et-1

Say X1 = 3.0:
X̂2 = 0.5 × 3.0 + 0.4 × 0 = 1.5 (initial error assumed to be zero)
The actual value is X2 = 2.8, so the error is e2 = 2.8 – 1.5 = 1.3
X̂3 = 0.5 × 2.8 + 0.4 × 1.3 = 1.92 (the actual value X2 and the forecast error e2 are used)

Case study – 1
Rainfall data for Bangalore city is considered.
• The time series plot, autocorrelation function, partial autocorrelation function and power spectrum are plotted for the daily, monthly and yearly data.

Case study – 1 (Contd.)
Daily data – Time series plot [Figure: rainfall in mm vs. time]

Case study – 1 (Contd.)
Daily data – Correlogram [Figure: sample autocorrelation function vs. lag (0–600)]

Case study – 1 (Contd.)
Monthly data – Time series plot [Figure: rainfall in mm vs. time]
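The AR(1) generation and ARMA(1,1) forecasting recursions from the opening section can be sketched directly in Python. The shocks (0.335, 1.226), the starting value X1 = 3.0, and the observed X2 = 2.8 are taken from the worked example; in practice the et would be sampled from a zero-mean, uncorrelated distribution.

```python
def ar1_generate(x1, phi1, shocks):
    """Generate an AR(1) series X_t = phi1 * X_{t-1} + e_t."""
    xs = [x1]
    for e in shocks:
        xs.append(phi1 * xs[-1] + e)
    return xs

def arma11_forecast(prev_x, prev_e, phi1, theta1):
    """One-step forecast: X_hat_t = phi1 * X_{t-1} + theta1 * e_{t-1}."""
    return phi1 * prev_x + theta1 * prev_e

# AR(1) generation with phi1 = 0.5 and the shocks from the notes
series = ar1_generate(3.0, 0.5, [0.335, 1.226])
print(series)  # [3.0, 1.835, 2.1435]

# ARMA(1,1) forecasting with phi1 = 0.5, theta1 = 0.4
x2_hat = arma11_forecast(3.0, 0.0, 0.5, 0.4)  # initial error assumed zero
e2 = 2.8 - x2_hat                             # actual X2 = 2.8, so e2 = 1.3
x3_hat = arma11_forecast(2.8, e2, 0.5, 0.4)   # actual X2 used, not the forecast
print(x2_hat, x3_hat)
```

Note that the forecast update always uses the *actual* previous observation and the resulting forecast error, exactly as in the slide's step from X̂2 = 1.5 to X̂3 = 1.92.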
Case study – 1 (Contd.)
Monthly data – Correlogram [Figure: sample autocorrelation function vs. lag]
Case study – 1 (Contd.)
Monthly data – Partial auto correlation function [Figure]

Case study – 1 (Contd.)
Yearly data – Correlogram [Figure: sample autocorrelation function vs. lag (0–16), with dashed confidence bands]
Case study – 1 (Contd.)
Yearly data – Partial auto correlation function [Figure]

Case study – 1 (Contd.)
Yearly data – Power spectrum [Figure: I(k) vs. wk]

Case study – 2 (Contd.)
Correlogram [Figure: sample autocorrelation function vs. lag, with dashed confidence bands]
Case study – 2 (Contd.)
Partial auto correlation function [Figure]

Case study – 2 (Contd.)
Power spectrum – original data [Figure: power density (approx. 1725–1750) vs. w(k) (0–3.5)]
Case study – 2 (Contd.)
Partial auto correlation function – differenced data [Figure]

Case study – 2 (Contd.)
Power spectrum – differenced data [Figure: I(k) vs. wk]

CASE STUDIES ON ARMA MODELS

Case study – 3 (Contd.)
Correlogram – original series [Figure: sample autocorrelation function vs. lag]
Case study – 3 (Contd.)
Partial auto correlation function – original series [Figure]

Case study – 3 (Contd.)
Power spectrum – original series [Figure: I(k) vs. wk]

Case study – 3 (Contd.)
Partial auto correlation function – standardized series [Figure]

Case study – 3 (Contd.)
Power spectrum – standardized series [Figure: I(k) vs. wk]

Case study – 3 (Contd.)
• The standardized series is considered for fitting the ARMA models.
• Total length of the data set: N = 480.
• Half the data set (240 values) is used to construct the model and the other half is used for validation.
• Both contiguous and non-contiguous models are studied.
• Non-contiguous models consider only the most significant AR and MA terms, leaving out the intermediate terms.

Case study – 3 (Contd.)
Contiguous models, with likelihood values Li = -(N/2) ln σi² - ni (σi²: residual variance; ni: number of parameters of model i):

Sl. No   Model        Likelihood value
1        ARMA(1,0)    29.33
2        ARMA(2,0)    28.91
3        ARMA(3,0)    28.96
4        ARMA(4,0)    31.63
5        ARMA(5,0)    30.71
6        ARMA(6,0)    29.90
7        ARMA(1,1)    30.58
8        ARMA(1,2)    29.83
9        ARMA(2,1)    29.83
10       ARMA(2,2)    28.80
11       ARMA(3,1)    29.45

Case study – 3 (Contd.)
Non-contiguous models:

Sl. No   Model        Likelihood value
1        ARMA(2,0)    28.52
2        ARMA(3,0)    28.12
3        ARMA(4,0)    28.21
4        ARMA(5,0)    30.85
5        ARMA(6,0)    29.84
6        ARMA(7,0)    29.12
7        ARMA(2,2)    29.81
8        ARMA(2,3)    28.82
9        ARMA(3,2)    28.48
10       ARMA(3,3)    28.06
11       ARMA(4,2)    28.65

Case study – 3 (Contd.)
• For this time series, the highest likelihood values are
  – contiguous model: 31.63, for ARMA(4,0)
  – non-contiguous model: 30.85
• Hence the contiguous ARMA(4,0) model can be used.
• The parameters of the selected model are:
  φ1 = 0.2137, φ2 = 0.0398, φ3 = 0.054, φ4 = 0.1762, Constant = -0.0157
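The selection rule used in the tables above (pick the model with the largest Li = -(N/2) ln σi² - ni, as reconstructed here) can be sketched as follows. The residual variances below are illustrative placeholders, since the notes report only the resulting likelihood values, not the fitted variances.

```python
import math

def likelihood(N, resid_var, n_params):
    """Model-selection likelihood L_i = -(N/2) * ln(sigma_i^2) - n_i:
    larger L_i rewards a small residual variance (good fit) and
    penalizes extra parameters (parsimony)."""
    return -(N / 2.0) * math.log(resid_var) - n_params

# Hypothetical residual variances for three candidate models; N = 240
# values were used for model construction in the case study.
candidates = {
    "ARMA(1,0)": (0.95, 1),
    "ARMA(4,0)": (0.90, 4),
    "ARMA(2,2)": (0.94, 4),
}
scores = {m: likelihood(240, v, n) for m, (v, n) in candidates.items()}
best = max(scores, key=scores.get)
print(best)
```

With these placeholder variances, the sharper fit of the four-parameter model outweighs its parameter penalty, mirroring how ARMA(4,0) wins in the contiguous table despite AR(1)'s simplicity.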
Case study – 3 (Contd.)
• The simplest model, AR(1), results in the least value of the MSE.
• For one-step forecasting, quite often the simplest model is appropriate.
• Also, as the number of parameters increases, the MSE increases, which is contrary to the common belief that models with a large number of parameters give better forecasts.
• The AR(1) model is recommended for forecasting the series, with parameters φ1 = 0.2557 and C = -0.009.

Case study – 3 (Contd.)
Validation tests on the residual series:
• Significance of residual mean
• Significance of periodicities
• Cumulative periodogram test (Bartlett's test)
• White noise tests: Whittle's test, Portmanteau test

The residuals are obtained as
et = Xt - ( Σj=1..m1 φj Xt-j + Σj=1..m2 θj et-j )
i.e., the data minus the value simulated from the model.

Case study – 3 (Contd.)
Significance of residual mean, η(e) = N^(1/2) ē / ρ̂^(1/2):

Sl. No   Model        η(e)     t0.95(239)
1        ARMA(1,0)    0.002    1.645
2        ARMA(2,0)    0.006    1.645
3        ARMA(3,0)    0.008    1.645
4        ARMA(4,0)    0.025    1.645
5        ARMA(5,0)    0.023    1.645
6        ARMA(6,0)    0.018    1.645
7        ARMA(1,1)    0.033    1.645
8        ARMA(1,2)    0.104    1.645
9        ARMA(2,1)    0.106    1.645
10       ARMA(2,2)    0.028    1.645

η(e) < t0.95(240-1) in every case; all models pass the test.

Case study – 3 (Contd.)
Significance of periodicities (η values for the 1st to 4th periodicity):

Sl. No   Model        1st     2nd     3rd     4th     F0.95(2,238)
1        ARMA(1,0)    0.527   1.092   0.364   0.065   3.00
2        ARMA(2,0)    1.027   2.458   0.813   0.129   3.00
3        ARMA(3,0)    1.705   4.319   1.096   0.16    3.00
4        ARMA(4,0)    3.228   6.078   0.948   0.277   3.00
5        ARMA(5,0)    3.769   7.805   1.149   0.345   3.00
6        ARMA(6,0)    4.19    10.13   1.262   0.441   3.00
7        ARMA(1,1)    4.737   10.09   2.668   0.392   3.00
8        ARMA(1,2)    6.786   10.67   2.621   0.372   3.00
9        ARMA(2,1)    7.704   12.12   2.976   0.422   3.00
10       ARMA(2,2)    6.857   13.22   3.718   0.597   3.00
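The residual-mean significance test above can be sketched as follows, assuming the statistic η(e) = √N · ē / √ρ̂ with ρ̂ the residual variance (the formula as reconstructed from the notes). The residual series here is illustrative, not the case-study data.

```python
import math

def residual_mean_stat(residuals):
    """eta(e) = sqrt(N) * |mean(e)| / sqrt(var(e)); values exceeding the
    t-distribution critical point indicate a residual mean that differs
    significantly from zero, i.e. a biased model."""
    N = len(residuals)
    mean = sum(residuals) / N
    var = sum((e - mean) ** 2 for e in residuals) / N
    return math.sqrt(N) * abs(mean) / math.sqrt(var)

# Illustrative residual series: small noise alternating around zero,
# 240 values as in the case study's construction set.
residuals = [0.1 * (-1) ** t for t in range(240)]
eta = residual_mean_stat(residuals)
print(eta < 1.645)  # compare against t_0.95(239) = 1.645: the test passes
```

A model whose residuals drift away from zero mean would produce a large η(e) and fail, which is what the table's uniform pass indicates did not happen here.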
Case study – 3 (Contd.)
Significance of periodicities by Bartlett's test (cumulative periodogram test):
The periodogram ordinates of the residual series are
γk² = { (2/N) Σt=1..N et cos(wk t) }² + { (2/N) Σt=1..N et sin(wk t) }²
and the cumulative periodogram is
gk = Σj=1..k γj² / Σk=1..N/2 γk²,   k = 1, 2, …, N/2

Case study – 3 (Contd.)
Cumulative periodogram for the original series without standardizing [Figure: gk vs. lag k, k = 0…240]

Case study – 3 (Contd.)
• The confidence limits (±1.35/(N/2)^(1/2) = ±0.123) are plotted for 95% confidence.
• The cumulative periodogram lies within the significance bands, confirming that no significant periodicity is present in the residual series.
• The model passes the test.

Case study – 3 (Contd.)
Whittle's test for white noise, η(e) = (N/n1)(ρ̂0/ρ̂1 - 1):

Model           n1 = 73   n1 = 49   n1 = 25
Critical value  1.29      1.39      1.52
ARMA(1,0)       0.642     0.917     0.891
ARMA(2,0)       0.628     0.898     0.861
ARMA(3,0)       0.606     0.868     0.791
ARMA(4,0)       0.528     0.743     0.516
ARMA(5,0)       0.526     0.739     0.516
ARMA(6,0)       0.522     0.728     0.493
ARMA(1,1)       0.595     0.854     0.755
ARMA(1,2)       0.851     1.256     1.581   (model fails at n1 = 25)
ARMA(2,1)       0.851     1.256     1.581   (model fails at n1 = 25)
ARMA(2,2)       0.589     0.845     0.737

Case study – 3 (Contd.)
Portmanteau test for white noise, η(e) = N Σk=1..kmax (rk(e)/r0(e))²:

Model          kmax = 48   kmax = 36   kmax = 24   kmax = 12
χ²0.95(kmax)   65.0        50.8        36.4        21.0
ARMA(1,0)      31.44       33.41       23.02       14.8
ARMA(2,0)      32.03       34.03       24.47       15.17
ARMA(3,0)      30.17       32.05       21.61       13.12
ARMA(4,0)      20.22       21.49       11.85       4.31
ARMA(5,0)      19.84       21.08       11.75       4.14
ARMA(6,0)      19.64       20.87       11.48       3.79
ARMA(1,1)      29.89       31.76       22.24       12.76
ARMA(1,2)      55.88       59.38       48.37       39.85   (model fails)
ARMA(2,1)      55.88       59.38       48.37       38.85   (model fails)
ARMA(2,2)      28.62       30.41       20.39       11.25

Case study – 4 (Contd.)
Partial Auto Correlation function [Figure: Fig 10, partial autocorrelation function of flows; PAC vs. lag (0–120)]
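Bartlett's cumulative periodogram test described above can be sketched directly from its formulas. Assumptions in this sketch: frequencies wk = 2πk/N, an illustrative white-noise residual series, and the notes' band width ±1.35/√(N/2) (≈ ±0.123 for N = 240) around the diagonal.

```python
import math, random

def cumulative_periodogram(e):
    """g_k = sum_{j<=k} gamma_j^2 / sum_all gamma^2, where
    gamma_k^2 = ((2/N) sum e_t cos(w_k t))^2 + ((2/N) sum e_t sin(w_k t))^2
    and w_k = 2*pi*k/N (an assumed frequency grid), k = 1..N/2."""
    N = len(e)
    gamma2 = []
    for k in range(1, N // 2 + 1):
        wk = 2.0 * math.pi * k / N
        c = (2.0 / N) * sum(e[t] * math.cos(wk * (t + 1)) for t in range(N))
        s = (2.0 / N) * sum(e[t] * math.sin(wk * (t + 1)) for t in range(N))
        gamma2.append(c * c + s * s)
    total = sum(gamma2)
    acc, g = 0.0, []
    for v in gamma2:
        acc += v
        g.append(acc / total)
    return g

random.seed(0)
e = [random.gauss(0.0, 1.0) for _ in range(240)]  # illustrative residuals
g = cumulative_periodogram(e)
band = 1.35 / math.sqrt(len(e) / 2)  # 95% band half-width, ~0.123 for N = 240
# For white noise, g_k should climb roughly linearly and stay within
# +/- band of the diagonal k/(N/2).
inside = all(abs(g[k] - (k + 1) / len(g)) <= band for k in range(len(g)))
print(round(band, 3), inside)
```

A residual series with a strong hidden periodicity would concentrate γk² near one frequency, making gk jump outside the band, which is exactly what the test is designed to flag.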
Case study – 4 (Contd.)
• Power Spectrum [Figure: I(k) vs. W(k), k = 0…7, I(k) up to about 1.2×10⁹]

Case study – 4 (Contd.)
Parameters of the candidate models (constant, AR terms φj, MA terms θj):
• ARMA(1,0): constant = -0.097; φ1 = 0.667
• ARMA(2,0): constant = 0.049; φ1 = 0.042; φ2 = 0.044
• ARMA(3,0): constant = -0.111; φ1 = 0.767; φ2 = -0.148; φ3 = -0.003
• ARMA(4,0): constant = 0.052; φ1 = 0.042; φ2 = 0.058; φ3 = 0.063; φ4 = 0.044
• ARMA(5,0): constant = 0.055; φ1 = 0.042; φ2 = 0.058; φ3 = 0.063; φ4 = 0.048; φ5 = 0.038
• ARMA(6,0): constant = -0.124; φ1 = 0.764; φ2 = -0.155; φ3 = 0.034; φ4 = -0.029; φ5 = -0.023; φ6 = -0.021
• ARMA(7,0): constant = 0.056; φ1 = 0.043; φ2 = 0.062; φ3 = 0.065; φ4 = 0.048; φ5 = 0.058; φ6 = 0.075; φ7 = 0.065
• ARMA(8,0): constant = 0.054; φ1 = 0.042; φ2 = 0.060; φ3 = 0.063; φ4 = 0.048; φ5 = 0.058; φ6 = 0.078; φ7 = 0.088; φ8 = 0.061
• ARMA(1,1): constant = -0.131; φ1 = 0.551; θ1 = 0.216
• ARMA(2,1): constant = -0.104; φ1 = 0.848; φ2 = -0.204; θ1 = -0.083
• ARMA(3,1): constant = -0.155; φ1 = 0.351; φ2 = 0.165; φ3 = -0.055; θ1 = 0.418
• ARMA(4,1): constant = -0.083; φ1 = 1.083; φ2 = -0.400; φ3 = 0.091; φ4 = -0.060; θ1 = -0.318
• ARMA(1,2): constant = -0.139; φ1 = 0.526; θ1 = 0.241; θ2 = 0.025
• ARMA(2,2): constant = 378; φ1 = 1980; φ2 = 1160; θ1 = 1960; θ2 = 461
• ARMA(3,2): constant = 577000; φ1 = 30600; φ2 = 38800; φ3 = 32900; θ1 = 7350; θ2 = 6290
• ARMA(0,1): constant = -0.298; θ1 = 0.594
• ARMA(0,2): constant = -0.297; θ1 = 0.736; θ2 = 0.281

Case study – 5
Sakleshpur rainfall data is considered in this case study. [Figure: time series plot, rainfall in mm vs. time]
Case study – 5 (Contd.)
Correlogram [Figure: sample autocorrelation function vs. lag (0–30), with dashed confidence bands]
Case study – 5 (Contd.)
PAC function [Figure: sample partial autocorrelation function vs. lag (0–30)]
Case study – 5 (Contd.)
• ARMA(5,0) is selected, having the highest likelihood value.
• The parameters of the selected model are:
  φ1 = 0.40499, φ2 = 0.15223, φ3 = -0.02427, φ4 = -0.2222, φ5 = 0.083435, Constant = -0.000664

Case study – 5 (Contd.)
• Significance of residual mean:

Model        η(e)        t0.95(N)
ARMA(5,0)    0.000005    1.6601

Case study – 5 (Contd.)
Significance of periodicities:

Periodicity   η          F0.95(2,239)
1st           0.000      3.085
2nd           0.00432    3.085
3rd           0.0168     3.085
4th           0.0698     3.085
5th           0.000006   3.085
6th           0.117      3.085

Case study – 5 (Contd.)
• ARMA(1,2) is selected, having the least MSE value for one-step forecasting.
• The parameters of the selected model are:
  φ1 = 0.35271, θ1 = 0.017124, θ2 = -0.216745, Constant = -0.009267

Case study – 5 (Contd.)
• Significance of residual mean:

Model        η(e)       t0.95(N)
ARMA(1,2)    -0.0026    1.6601

Case study – 5 (Contd.)
Significance of periodicities:

Periodicity   η         F0.95(2,239)
1st           0.000     3.085
2nd           0.0006    3.085
3rd           0.0493    3.085
4th           0.0687    3.085
5th           0.0003    3.085
6th           0.0719    3.085

Markov Chains
• The conditional probability P[Xt = aj | Xt-1 = ai] gives the probability that the process is in state j at time t, given that it was in state i at time t-1.
• The conditional probability is independent of the states occupied prior to t-1.
• For example, if Xt-1 is a dry day, it gives the probability that Xt is a dry day or a wet day.
• This probability is commonly called the transition probability.

Markov Chains
• It is usually written as Pijᵗ, indicating the probability of a step from ai to aj at time t:
  Pijᵗ = P[Xt = aj | Xt-1 = ai]
• If Pij is independent of time, the Markov chain is said to be homogeneous, i.e., Pijᵗ = Pijᵗ⁺ᵀ for all t and τ: the transition probabilities remain the same across time.

Markov Chains
Transition Probability Matrix (TPM): an m × m matrix whose rows correspond to the state at time t and whose columns correspond to the state at time t+1:

        state at t+1
        1     2     3    …    m
  1  [ P11   P12   P13   …   P1m ]
  2  [ P21   P22   P23   …   P2m ]
  3  [ P31    .     .    …    .  ]
  .  [  .     .     .    …    .  ]
  m  [ Pm1   Pm2   Pm3   …   Pmm ]
(rows indexed by the state at t)

Markov Chains
With initial state probabilities p(0) = [p1(0) p2(0) … pm(0)], the state probabilities after one step are p(1) = p(0) × P:
p1(1) = p1(0) P11 + p2(0) P21 + … + pm(0) Pm1   …. probability of going to state 1
p2(1) = p1(0) P12 + p2(0) P22 + … + pm(0) Pm2   …. probability of going to state 2
And so on…

Markov Chains
Therefore p(1) = [p1(1) p2(1) … pm(1)], a 1 × m vector, and
p(2) = p(1) × P = p(0) × P × P = p(0) × P²
In general, p(n) = p(0) × Pⁿ

Markov Chains
• As the process advances in time, pj(n) becomes less dependent on p(0).
• The probability of being in state j after a large number of time steps becomes independent of the initial state of the process.
• The process reaches a steady state at very large n.
• As the process reaches the steady state, Pⁿ no longer changes with n, and p = p × P.

Example – 1 (contd.)
p(2) = [0.7 0.3] × [ 0.7  0.3
                     0.4  0.6 ] = [0.61 0.39]
The required probability is 0.39.
3. Probability that day 100 is a rainfall day, given that day 0 is a non-rainfall day: use p(n) = p(0) × Pⁿ.

Example – 1 (contd.)
P²  = P × P   = [ 0.61    0.39
                  0.52    0.48   ]
P⁴  = P² × P² = [ 0.5749  0.4251
                  0.5668  0.4332 ]
P⁸  = P⁴ × P⁴ = [ 0.5715  0.4285
                  0.5714  0.4286 ]
P¹⁶ = P⁸ × P⁸ = [ 0.5714  0.4286
                  0.5714  0.4286 ]

Example – 1 (contd.)
Steady-state probability:
p(n) = p(0) × Pⁿ = p(0) × [ 0.5714  0.4286
                            0.5714  0.4286 ] = [0.5714 0.4286]
For the steady state, p = [0.5714 0.4286].
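The matrix-power computation in the example can be sketched in pure Python (no libraries assumed), using the two-state dry/wet TPM P = [[0.7, 0.3], [0.4, 0.6]]:

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(P, n):
    """Compute P^n by repeated multiplication."""
    R = P
    for _ in range(n - 1):
        R = matmul(R, P)
    return R

P = [[0.7, 0.3], [0.4, 0.6]]
P2 = mat_power(P, 2)    # [[0.61, 0.39], [0.52, 0.48]], as in the slides
P16 = mat_power(P, 16)  # both rows converge to the steady state
steady = P16[0]
print([round(x, 4) for x in steady])  # [0.5714, 0.4286]
```

The steady state can be checked independently: solving p = p × P with p1 + p2 = 1 gives p1 = 4/7 ≈ 0.5714 and p2 = 3/7 ≈ 0.4286, matching the rows of P¹⁶ above.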