A Vector Autoregression Weather Model for Electricity Supply and Demand Modeling Yixian Liua , Matthew C. Robertsb,c , Ramteen Sioshansia,c,∗ a Integrated Systems Engineering Department, The Ohio State University, 1971 Neil Avenue, Columbus, OH 43210, United States of America b Department of Agricultural, Environmental, and Development Economics, The Ohio State University, 2120 Fyffe Road, Columbus, Ohio 43210, United States of America c Center for Automotive Research, The Ohio State University, 930 Kinnear Road, Columbus, OH 43212, United States of America Abstract Weather forecasting is crucial to both the demand and supply sides of electricity systems. Temperature has a great effect on energy demand. Moreover, solar and wind are very promising renewable energy sources. In this paper, a large vector autoregression model is built to forecast three important weather variables for 61 cities around the United States. We estimate the vector autoregression model with 16 years of hourly historical data and use two additional years of data for out-of-sample validation. Forecasts of up to sixhours-ahead are generated with good forecasting performance. Our results show that the proposed time series approach is appropriate for short-term forecasting of solar radiation, temperature, and wind speed. Keywords: Forecasting, solar irradiance, wind speed, temperature 1. Introduction Electricity supply and demand are greatly influenced by weather conditions. Temperature, wind speed, and solar radiation are among the most influential factors. Temperature has a great effect on energy use by individuals, and thus on the demand side of the electricity system. Heating and cooling loads depend largely on ambient temperature. Wind and solar generation are increasingly important as renewable energy gains in popularity. Wind power is growing at a rate of 30% annually, with a worldwide installed capacity of 283 GW at the end of 2012. The installed capacity of solar photovoltaic (PV) grew by 41% in 2012, reaching 100 GW. However, the stochastic nature of wind speed and solar radiation raises power system operational challenges as the penetrations of these technologies increase. Accurate short-term forecasting of the two resources could improve power system operational efficiency. There are many works dealing with weather forecasting. Most works focusing on temperature forecasting analyze financial weather derivatives as the prime application. Besides atmospheric models, models attempting to capture these dynamics can be divided into two categories: stochastic approaches (Monte Carlo Simulation) and time-series models. Examples of the former include the works of Alaton et al. ˇ (2002); Benth and Saltyt˙ e-Benth (2005); Oetomo and Stevenson (2005); Svec and Stevenson (2007) and Taylor and Buizza (2004) while examples of the latter includes the works of Campbell and Diebold (2005); ˇ e-Benth et al. (2007); Svec and Stevenson (2007); Taylor and Buizza Oetomo and Stevenson (2005); Saltyt˙ (2004, 2006). According to Oetomo and Stevenson (2005), although a model that relies on auto-regressive moving average processes exhibits a better goodness-of-fit than Monte Carlo simulation models, such models do not necessarily generate better forecasts. ∗ Corresponding author Email addresses: liu.2441@osu.edu (Yixian Liu), roberts.628@osu.edu (Matthew C. Roberts), sioshansi.1@osu.edu (Ramteen Sioshansi) Preprint submitted to Solar Energy June 28, 2015 Another important issue, which Taylor and Buizza (2004); Campbell and Diebold (2005) discuss is point and density forecasting. While time-series models are more popular for wind and temperature forecasting, these techniques are not as widely used for solar radiation forecasting. Numerical weather prediction (NWP) models are a popular approach for solar radiation forecasting, and can be used to generate forecasts up to several days ahead. Chowdhury and Rahman (1987); Hammer et al. (1999); Heinemann et al. (2006); Perez et al. (2010) note that most short-term solar radiation forecasts range from 30 minutes to six hours ahead and rely on satellite-derived cloud-motion forecasts. Perez et al. (2007) uses sky cover predictions as inputs. Heinemann et al. (2006); Remund et al. (2008) note that comparing the forecasts of different methods is useful in providing comparative statistics to validate a forecasting model. Wind speeds are typically forecasted several minutes to several days ahead, often using statistical methods. For example, Erdem and Shi (2011) use auto-regression moving average-based approaches whereas Li and Shi (2010) use artificial neural networks. Other works, such as those of Traiteur et al. (2012); Chen et al. (2013), combine multiple numerical techniques to produce ensemble wind forecasts. Giebel et al. (2011) provide a detailed review of the available techniques for wind speed forecasting. In this paper, we use time-series methods to model and generate hourly temperature, wind speed, and solar radiation forecasts at 61 locations in the United States. Figure 1 shows the locations modeled. The three weather variables at the 61 locations are response variables in a vector autoregression (VAR) model. In addition to estimating the model, we also conduct out-of-sample validation to test the quality of the forecasts produced. We compare our forecasting errors to those reported for other techniques in the literature, including persistence forecasts, showing that our method performs as well or better. We also compare the performance of our VAR, which captures spatial correlation in the response variables, to a model without spatial relationships. We demonstrate the benefits of modeling spatial correlations through better forecasting performance. Figure 1: 61 Locations in the United States that are Modeled The remainder of this paper is organized as follows. In Section 2 we provide descriptive statistics for the three weather variables. In Section 3 the model and estimation methods are introduced. For a large model of this form, we try to find a proper number of residual terms to include to ensure good forecasting performance while maintaining reasonable model size and degrees of freedom. Thirty lags for each time series are utilized and each equation is estimated separately with either ordinary or weighted least squares. In Section 4 we examine the forecasting performance up to six hours ahead and provide comparative statistics with other models. Conclusions and suggestions for future research are provided in Section 5. 2. Weather Data The data we use are from the National Solar Radiation Database (NSRDB), which is produced by the National Renewable Energy Laboratory and distributed by the National Climatic Data Center. The NSRDB 2 contains ground-based solar and meteorological data for 1,454 sites around the United States. Nearly all of the solar data are modeled while meteorological elements, including wind speed and dry bulb temperature, are observed values. Wilcox (2012) provides further details regarding the NSRDB. We use modeled global horizontal irradiance as our solar data. We model hourly wind speed, global solar radiation, and dry bulb temperature at the 61 locations shown in Figure 1 in one single VAR model. The 61 locations are chosen to provide roughly even coverage of the continental United States. Moreover, locations that are close to population centers and areas with good solar and wind resource availability are also included in the dataset. Data covering the years 1990 to 2008 are used, since these data are complete and do not require any modification. Among the 18 years of hourly data, 16 years are used for model estimation and two years are used for out-of-sample model validation. To get an overall feel for the data, Tables 1 through 3 summarize some simple descriptive statistics of the wind speed, solar radiation, and temperature data at six locations. Moreover, Figures 2 through 4 show plots of wind speed, temperature, and solar radiation in Las Vegas, NM (not to be confused with the popular gambling destination) from 2006 to 2008. These figures clearly show seasonal patterns for the three weather variables. Table 1: Descriptive statistics of wind speed data [m/s] Location Maximum Median Mean Standard Deviation Bismarck, ND Las Vegas, NM Dallas, TX Denver, CO Chicago, IL New York, NY 22.70 28.80 19.60 26.80 30.08 23.20 3.60 4.30 4.10 3.60 4.10 4.60 4.26 4.74 4.59 3.85 4.38 5.05 2.77 2.96 2.50 2.25 2.31 2.50 Table 2: Descriptive statistics of global solar radiation data [Wh/m2 ] Location Maximum Median Mean Standard Deviation Bismarck, ND Las Vegas, NM Dallas, TX Denver, CO Chicago, IL New York, NY 975.00 1073.00 1047.00 1036.00 998.00 996.00 6.00 10.00 7.00 8.00 5.00 6.00 158.92 212.45 194.97 181.05 155.47 160.79 242.14 296.51 279.82 262.89 237.70 242.82 Table 3: Descriptive statistics of dry bulb temperature data [◦ C] Location Minimum Maximum Median Mean Standard Deviation Bismarck, ND Las Vegas, NM Dallas, TX Denver, CO Chicago, IL New York, NY -40.00 -22.80 -13.30 -25.60 -29.40 -19.40 43.90 36.60 43.30 38.00 39.40 39.40 6.70 10.30 20.60 10.00 10.60 13.30 6.32 9.96 4.59 9.88 10.29 13.23 13.41 9.71 9.56 10.70 11.28 9.77 3 25 Wind Speed [m/s] 20 15 10 5 0 1/2006 7/2006 1/2007 7/2007 Time 1/2008 7/2008 12/2008 Figure 2: Time Series of Wind Speed in Las Vegas, NM from 2006 to 2008 1200 2 Solar Radiation [Wh/m ] 1000 800 600 400 200 0 1/2006 7/2006 1/2007 7/2007 Time 1/2008 7/2008 12/2008 Figure 3: Time Series of Global Solar Radiation in Las Vegas, NM from 2006 to 2008 3. Methodology A time series approach is proposed to capture the characteristics of the three weather variables. Our approach consists of three parts: (1) a linear trend, (2) a seasonal component, which is represented by Fourier series and Chebyshev polynomials, and (3) a VAR to model the stochastic component of the time series. We detail these three components below. 4 40 20 ° Dry Bulb Temperature [ C] 30 10 0 −10 −20 −30 1/2006 7/2006 1/2007 7/2007 Time 1/2008 7/2008 12/2008 Figure 4: Time Series of Dry Bulb Temperature in Las Vegas, NM from 2006 to 2008 3.1. Trend To check for the presence of a linear trend, we run a simple linear regression of the weather data against hourly time. Both the intercept and time parameters are significant at the 1% level. Hence, a linear trend, though slight, should be included in our model. We represent this trend component by including a term of the form: trendt = β0 + β1 t in our model. 3.2. Seasonality As discussed in Section 2 and illustrated in Figures 2 through 4, there are strong seasonal variations in all three of the weather variables. Because of the hourly time step in our data, it is important to model both diurnal and seasonal seasonality. Since the three weather variables exhibit different diurnal patterns, we use different approaches to represent their diurnal seasonality. For wind and temperature, diurnal seasonality is represented by a Fourier series of the form: P X d(t) d(t) + δs,p · sin 2πp , δc,p · cos 2πp daySeast = 24 24 p=1 where P is the order of the Fourier series, δc,p and δs,p are coefficients on the cosine and sine terms, respectively, and: d(t) = (t mod 24), (1) converts t to hours of the day. Season-of-the-year seasonality is similarly captured by a Fourier series of the form: " ! !# Pˆ X ˆ ˆ d(t) d(t) ˆ ˆ δc,p · cos 2πp annSeast = + δs,p · sin 2πp , (2) 365 365 p=1 where Pˆ is the order of the Fourier series and δˆc,p and δˆs,p are coefficients on the cosine and sine terms, respectively, and: t ˆ , (3) d(t) = 24 5 converts t into days of the year. As discussed by Campbell and Diebold (2005), Fourier series can produce a smooth seasonal pattern with a significant reduction in the number of parameters to be estimated as compared to dummy variables. To find the proper order of the Fourier series, we estimate models with between first- and fifth-order terms. Examining modeled and observed seasonality with different-ordered Fourier series shows that a third-order series is sufficient to capture the seasonality dynamics. We also compare the forecasting performance of the model with third- and fifth-order Fourier series terms, finding them to be similar, further suggesting that third-order terms are sufficient. Thus, we include third-order Fourier series terms for daily and season-ofthe-year seasonality of wind and temperature. The season-of-the-year seasonality of solar radiation is given by the same Fourier series shown in (2). Daily seasonality is modeled using second-order Chebyshev polynomials, as opposed to Fourier series. Following the method outlined by Miranda and Fackler (2002), to define the Chebyshev polynomials we first convert our independent variable, x, where we assume x ∈ [a, b], to the normalized variable: z= 2(x − a) − 1. b−a By definition we have z ∈ [−1, 1]. We then define the Chebyshev polynomials recursively as: Tj (z) = 2 · z · Tj−1 (z) − Tj−2 (z), where: T0 (z) = 1, and: T1 (z) = z. Thus, the second-order Chebyshev polynomial used to model diurnal solar radiation seasonality is given by: ( ) 2 2(xt − at ) 2(xt − at ) − 1 + α2 · 2 · −1 −1 . (4) daySeast = α0 + α1 · b t − at b t − at We use Chebyshev polynomials, as opposed to Fourier series, to model the diurnal seasonality for a number of reasons. First, we only need to model solar radiation during daytime hours, since there is by definition zero solar radiation at night. Moreover, solar radiation follows a predictable diurnal pattern, insomuch as it peaks in the middle of the day. A second-order Chebyshev polynomial is better able to produce this shape of a seasonal pattern than a Fourier series. This is confirmed by our model estimates, since second-order Chebyshev polynomials provide much better goodness-of-fit than Fourier series do. Based on these properties of the diurnal pattern, we define: xt = d(t) − rd(t) ˆ sd(t) − rd(t) ˆ ˆ , ˆ are as defined in equations (1) and (3), r ˆ and s ˆ are the sunrise in equation (4), where d(t) and d(t) d(t) d(t) ˆ and sunset times, respectively, on the day d(t), and at and bt are the minimum and maximum values, ˆ respectively, that xt takes on day d(t). Sunrise and sunset times are computed, based on the day of the year and geographic coordinates of each location modeled, using MATLAB functions developed by the U.S. Geological Survey.1 1 These function are available at http://woodshole.er.usgs.gov/operations/sea-mat/air_sea-html/index.html. 6 3.3. VAR Model VAR is a statistical model capturing the linear interdependencies among multiple time series. Hence, it is beneficial in modeling the temporal and spatial correlation among wind speed, solar radiation, and temperature in different locations. Each variable at each location has an equation explaining its evolution based on time-lagged values of all of the weather variables at all locations. Modeling the three weather variables at 61 locations in a single VAR gives 183 response variables in total. Given the large model size, it is important to determine a suitable number of autoregressive lags and which time-lagged values to include in the model. To do this, we regard one week’s lag as the maximum number to be considered. We estimate multiple VAR(168) models with two response variables only. After estimating several pairs, we find that regardless of the distance between locations, autoregressive lags of 1 and multiples of 24 are significant for most location pairs. This lag structure give us the spatial relationship among locations. Furthermore, Akaike and Bayesian information criteria are used to determine the lag structure. To fully capture the relationship between observations in adjacent periods, we use a VAR model with lags one to 24 and multiples of 24 up to 168 of the form: X Yt = trendt + daySeast + annSeast + Al · Yt−l + Ut , l∈L where: Yt = (y1,t , y2,t , · · · , y183,t )⊤ , is a 183 × 1 vector of hour-t response variables, L = {1, 2, · · · , 24, 48, 72, 96, 120, 144, 168}, is the set of lags modeled, Al are 183 × 183 coefficient matrices for the lagged response variables, and: Ut = (u1,t , u2,t , · · · , u183,t )⊤ , is a 183×1 vector of residuals. Since our data set covers 16 years of hourly observations, t = 1, 2, · · · , 140256. 3.4. Parameter Estimation A VAR of the size proposed is difficult to estimate as a whole system due to computational and memory limitations of computers (the entire system consists of more than 25 million equations). Since the model is actually a seemingly unrelated regression system, we solve this problem by estimating each equation separately. The data used for estimation are hourly observations from 1991 to 2006. The variance/covariance matrix of the residuals is calculated after the estimation. For wind and temperature, ordinary least squares is used for parameter estimation. Weighted least squares is applied for solar radiation. The weights assigned to night observations are zero whereas weights of one are given to daytime observations. We do this because the VAR model is only used to forecast solar radiation during the day—solar radiation is fixed equal to zero during the night since by definition there is no sunlight at night. By applying these weights, the estimated coefficients are better for forecasting solar radiation during the day since nighttime observations are ignored. As discussed in Section 3.2, we calculate sunrise and sunset times for each location based on geographic coordinates and the day of the year. Figure 5 shows the residuals of the three weather variables in Chicago, IL and Las Vegas, NM. It is clear that the residuals display heteroskedasticity. However, Durbin’s alternative test reveals no serial correlation in the residuals. 4. Forecasting and Validation To validate our model, we generate out-of-sample forecasts and compare the performance of our VAR model to a number of benchmark competitors. In doing so, we consider forecasts that are up to six hours ahead and use two years of out-of-sample data covering the years 2007 and 2008. As noted in Section 3.2, 7 25 25 20 20 15 15 Residuals [m/s] Residuals [m/s] 10 10 5 0 5 0 −5 −5 −10 −10 −15 −15 1991 1994 1998 Time 2002 −20 1991 2006 1994 800 800 600 600 400 400 200 0 −200 0 −200 −400 −600 −600 1994 1998 Time 2002 −800 1991 2006 (c) Chicago, IL Solar 1994 1998 Time 2002 2006 (d) Las Vegas, NM Solar 10 5 5 Residuals [° C] 10 0 ° Residuals [ C] 2006 200 −400 −800 1991 2002 (b) Las Vegas, NM Wind Residuals [Wh/m2] 2 Residuals [Wh/m ] (a) Chicago, IL Wind 1998 Time −5 −10 0 −5 −10 −15 1991 1994 1998 Time 2002 −15 1991 2006 (e) Chicago, IL Temperature 1994 1998 Time 2002 2006 (f) Las Vegas, NM Temperature Figure 5: Residuals in Chicago, IL and Las Vegas, NM from 1991 to 2006 we fix solar radiation forecasts equal to zero between sunset and sunrise on each day. We further truncate any negative wind speed and solar radiation forecasts equal to zero, since it is physically impossible for these values to be negative. We use three types of benchmark competitors in this validation. The first is to compare the VAR model to persistence-type forecasting methods. The second compares the VAR model, which includes spatial correlation among the weather variables, to a model that does not include spatial terms, which we term the VARNS model. This benchmark is meant to demonstrate the benefit of taking spatial relationships into account in weather forecasting. The third compares the performance of our VAR model to other forecasting techniques appearing in the literature. Numerous metrics are used in the literature to evaluate forecast accuracy. These include mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and relative root 8 mean square error (RMSE%), which we use in our validation. If we let Fi and Oi denote the hour-i forecast and observation, respectively, of a given variable and N the number of out-of-sample forecasts used, MAE is defined as: N 1 X |Fi − Oi |, N i=1 MAPE is defined as: RMSE is defined as: and RMSE% is defined as: N 1 X Fi − Oi , N i=1 Oi v u N u1 X t (Fi − Oi )2 , N i=1 q 1 N PN i=1 (Fi − PN 1 i=1 Oi N Oi )2 . 4.1. Persistence Forecasts We compare two kinds of persistence-type forecasting methods to our VAR model. The first one is the simple persistence method, which we denote the SP model. The SP model relies upon the weather condition at the current time to forecast future conditions. Again, letting Oi denote the hour-i observation, the simple persistence forecast of the hour-(i + ∆i) weather variable generated at hour i is defined as Oi (i.e., SP assumes that the weather variable will have the same value at hour i + ∆i as it does at hour i). This persistence forecast is applied to all three weather variables for comparison with the VAR model. We also use the clearness persistence forecast proposed by Marquez and Coimbra (2012), which we denote the CP model, to provide an additional benchmark for solar irradiance forecasts generated by our VAR. The clearness persistence model relies on extraterrestrial solar radiation and takes the solar zenith angle as an input. Let θi represent the solar zenith angle at hour i. We then define hour-i extraterrestrial solar radiation as: Si = C · cos(θi ), where C = 1367 W/m2 is the solar constant. The clearness persistence forecast of the hour-(i + ∆i) solar irradiance is then given by: Si+∆i Oi · . Si We use the hourly mean solar zenith angle recorded in the NSRDB to generate our clearness persistence forecasts. 4.2. Model Without Spatial Relationships The VARNS model has a similar structure to the VAR model introduced in Section 3, however the spatial relationships between different locations are excluded from the model. Correlations between the three weather variables at each individual location is kept in the model, however. For each location modeled, the VARNS model has the form: X Zt = trendt + daySeast + annSeast + Al · Zt−l + Ξt , l∈L where: Zt = (y1,t , y2,t , y3,t )⊤ , 9 is a 3 × 1 vector of hour-t weather response variables for the single location being modeled, L = {1, 2, · · · , 24, 48, 72, 96, 120, 144, 168}, is the same set of lags that are modeled in the VAR introduced in Section 3, Al are 3 × 3 coefficient matrices for the lagged response variables, and: Ξt = (ξ1,t , ξ2,t , ξ3,t )⊤ , is a 3 × 1 vector of residuals. It is important to the stress that the first index of the subscripts on the y1,t , y2,t , and y3,t terms used to define Zt do not directly correspond to the first index of the y1,t , y2,t , and y3,t terms used to define Yt . We are merely stressing in our definitions that the full VAR, which has Yt as the response variable, captures all of the spatial relationships between the weather variables at different locations. The VARNS, which has Zt as the response variable, only captures correlations between weather variables at each individual location. 4.3. Results Tables 4 through 6 summarize the forecasting performance of our VAR model using the different metrics discussed above. For some of the metrics, the table reports the average (among the 61 locations modeled) as well as minimum and maximum values. These results are compared with results in the literature in Section 4.4. Table 4: Average, minimum, and maximum (among 61 locations modeled) MAE, MAPE, and RMSE of wind forecasts produced by VAR model Forecast Horizon MAE [m/s] Mean Min Max MAPE [%] Mean Min Max RMSE [m/s] Mean 1-Hour Ahead 2-Hours Ahead 3-Hours Ahead 4-Hours Ahead 5-Hours Ahead 6-Hours Ahead 1.03 1.18 1.28 1.35 1.40 1.44 27.03 30.52 32.80 34.36 35.49 36.36 38.26 40.13 42.28 45.43 47.77 49.66 1.38 1.58 1.70 1.79 1.85 1.90 0.18 0.36 0.54 0.67 0.74 0.79 1.39 1.69 1.91 2.06 2.17 2.27 5.54 11.19 17.12 21.15 23.71 25.30 Table 5: Average, minimum, and maximum (among 61 locations modeled) MAE, MAPE, RMSE, and RMSE% of wind solar produced by VAR model Forecast Horizon MAE [Wh/m2 ] Mean Min Max MAPE [%] Mean Min 1-Hour Ahead 2-Hours Ahead 3-Hours Ahead 4-Hours Ahead 5-Hours Ahead 6-Hours Ahead 34.67 42.55 47.08 49.77 51.33 52.24 37.03 40.81 42.81 43.98 44.69 45.04 21.90 26.43 28.79 30.13 30.76 31.08 48.15 57.33 61.86 64.75 66.12 66.86 26.53 28.70 29.50 29.77 29.77 29.84 Max RMSE [Wh/m2 ] Mean RMSE% [%] Mean 51.58 58.84 60.25 61.47 61.78 62.30 73.19 85.83 93.34 98.07 101.05 102.81 21.31 24.99 27.17 28.55 29.42 29.93 Tables 7 through 9 summarize the average forecasting performance of the benchmarks mentioned above. Tables 7 and 8 show that the VAR model slightly outperforms the VARNS for both temperature and wind. Both the VAR and VARNS models outperform the simple persistence model. 10 Table 6: Average, minimum, and maximum (among 61 locations modeled) MAE and MAPE of temperature forecasts produced by VAR model Forecast Horizon MAE [◦ C] Mean Min Max MAPE [%] Mean Min Max 1-Hour Ahead 2-Hours Ahead 3-Hours Ahead 4-Hours Ahead 5-Hours Ahead 6-Hours Ahead 0.68 0.98 1.22 1.42 1.57 1.70 1.06 1.41 1.77 2.10 2.41 2.67 11.21 16.39 20.62 24.06 26.89 29.20 29.98 46.59 59.84 71.49 81.47 89.80 0.31 0.63 0.87 0.98 1.05 1.10 2.26 3.10 3.70 4.14 4.48 4.74 Table 7: Average (among 61 locations modeled) MAE of VAR, VARNS, and SP models Forecast Horizon Temperature [◦ C] VAR VARNS SP Solar Radiation [Wh/m2 ] VAR VARNS SP Wind Speed [m/s] VAR VARNS SP 1-Hour Ahead 2-Hours Ahead 3-Hours Ahead 4-Hours Ahead 5-Hours Ahead 6-Hours Ahead 0.68 0.98 1.22 1.42 1.57 1.70 34.67 42.55 47.08 49.77 51.33 52.24 1.03 1.18 1.28 1.35 1.40 1.44 0.70 1.06 1.35 1.59 1.79 1.95 0.99 1.77 2.50 3.18 3.80 4.36 34.26 43.64 49.11 52.36 54.22 55.25 63.08 109.19 152.96 194.04 231.74 265.39 1.03 1.20 1.32 1.40 1.46 1.51 1.10 1.36 1.56 1.73 1.88 2.01 Table 8: Average (among 61 locations modeled) MAPE of VAR, VARNS, and SP models Forecast Horizon Temperature [◦ C] VAR VARNS SP Solar Radiation [Wh/m2 ] VAR VARNS SP Wind Speed [m/s] VAR VARNS SP 1-Hour Ahead 2-Hours Ahead 3-Hours Ahead 4-Hours Ahead 5-Hours Ahead 6-Hours Ahead 11.21 16.39 20.62 24.06 26.89 29.20 37.03 40.81 42.81 43.98 44.69 45.04 27.03 30.52 32.80 34.36 35.49 36.36 11.65 17.74 22.82 27.06 30.68 33.71 14.52 25.13 35.00 44.22 52.83 60.67 41.80 49.20 52.01 53.45 54.41 55.03 117.65 295.33 474.65 638.94 775.06 871.71 27.08 30.80 33.24 34.97 36.22 37.20 29.48 36.12 41.21 45.54 49.24 52.44 For solar radiation, the MAE of one-hour-ahead forecasts produced by the VARNS is slight lower than that of the VAR, however the VAR model outperforms the VARNS in terms of MAPE and RMSE. Note that the calculation of MAPE requires a division by zero when actual values are zero. Thus, the MAPE of actual solar irradiance observations that are zero and not correctly forecasted cannot be calculated. Table 9 shows that in general, the VAR model with spatial information is better than the model without in terms of solar forecasting performance. Both the VAR and VARNS are better than the two persistence models, especially when the forecasting horizon increases. This is also illustrated in Figure 6, which shows the average RMSE of the different models as a function of the forecasting horizon. 4.4. Comparative Studies Our VAR model provides good forecasting performance compared to other methods reported in the literature, showing that our model can be used for providing short-term temperature, wind speed, and solar radiation forecasts. The average (across the 61 locations modeled) performance of our model is comparable 11 Table 9: Average (among 61 locations modeled) RMSE of solar radiation forecasts produced by VAR, VARNS, CP, and SP models Forecast Horizon VAR VARNS CP SP 1-Hour Ahead 2-Hours Ahead 3-Hours Ahead 4-Hours Ahead 5-Hours Ahead 6-Hours Ahead 73.19 85.83 93.34 98.07 101.05 102.81 73.20 87.48 96.26 101.82 105.21 107.09 79.61 105.28 136.22 171.39 207.38 240.79 111.06 178.99 242.22 298.01 345.16 383.29 400 VAR VARNS CP SP 350 RMSE [Wh/m 2 ] 300 250 200 150 100 50 1 2 3 4 5 6 Forecast Horizon [Hours Ahead] Figure 6: Average (Among 61 Locations Modeled) RMSE of Solar Radiation Forecasts Produced by VAR, VARNS, CP, and SP Models to other works, while our model performs significantly better at some locations (this is indicated by the minimum values of the performance metrics reported in Tables 4 through 6). Moreover, Tables 4 through 6 suggest that our VAR models provides relatively robust weather forecasts up to six-hours ahead. Perez et al. (2007) forecast wind speed using a blended ensemble, which consists of the Weather Research and Forecasting Single Column Model and time series forecasts that are calibrated with Bayesian model averaging. The MAEs of their hour-ahead wind speed forecasts are between 0.9 m/s and 0.95 m/s during the day and are between 1.01 m/s and 1.07 m/s overnight. Erdem and Shi (2011) compare four approaches based on an autoregressive moving average method for hour-ahead forecasting. Their method has MAEs ranging from 0.8 m/s to 2.3 m/s. Li and Shi (2010) present a comparison study on the application of different artificial neural networks in hour-ahead wind speed forecasting and measure forecasting performance in terms of MAE, MAPE, and RMSE. The best MAE, MAPE, and RMSE among the locations they model are 0.95 m/s, 19.4%, and 1.254 m/s, respectively. Chen et al. (2013) produce wind speed forecasts using a Gaussian process applied to the outputs of an NWP model. Their hour-ahead and five-hour-ahead forecasts have RMSEs of 1.8 m/s and 2.2 m/s, respectively. More short-term solar radiation forecasting is done using cloud motion derived from satellite images. Examples of this method includes the works of Heinemann et al. (2006); Perez et al. (2010); Traiteur et al. 12 (2012). Perez et al. (2010) report an increase in the RMSE% from 25% to 42% as the forecasting horizon goes from hour-ahead to six-hours-ahead. Traiteur et al. (2012) compare their forecasts against single point ground-truth stations and report RMSEs that vary from 68 Wh/m2 to 120 Wh/m2 for hour-ahead forecasts and 140 Wh/m2 to 200 Wh/m2 to six-hour-ahead forecasts. Erdem and Shi (2011) generate one-, two-, and three-hours-ahead solar forecasts and report RMSE%s of 23%, 32%, and 38%, respectively. Remund et al. (2008) compare short-term global radiation forecasts of three different models and find that ECMWF (Global Model of the European Centre for Medium-Range Weather Forecasts) is the best, with an RMSE% that stays at about 38% for one- to five-hours-ahead forecasting. Taylor and Buizza (2004) compare point forecasts of daily air temperature generated by six different models to actual observations. The best MAE of an hour-ahead forecast that they report is 0.9◦ C, as opposed to an average of 0.68◦ C generated by our model. 5. Conclusions In this paper, we propose a time series VAR model to forecast temperature, solar radiation, and wind speed at 61 locations around the United States. The forecasting performance is good for all the three weather variables. Given the influence of the three weather variables on electricity systems, the model is able to provide proper inputs for electricity supply and demand modeling. The consideration of spatial relationship allows the model to provide good forecasts for multiple locations while considering cross correlations among the locations modeled. The comparison of the VAR and VARNS models shows that the spatial terms do help in improving forecasting performance. The VAR model proposed is also flexible in size. The forecasting performance is similar when it is used to forecast the three weather variables for fewer locations (results for these more limited models are excluded for sake of brevity). We also show that the VAR model performs similarly to or better than other methods proposed in the literature, including persistence forecasts. Another important contribution of this paper is that it shows that a time series approach can be used to provide robust short-term solar radiation forecasts with good forecasting performance. This work does suggest several areas of future research. Although the VAR model proposed provides good forecasts, it may be redundant given its large size. Each equation has about five thousands parameters to be estimated. Not every one of these parameters contributes to the overall forecast. Thus, it may be possible to further customize the model and autoregressive structure of the model to better exploit the correlations in the data. The residuals also display heteroskedasticity, which weighted or generalized leastsquares techniques may reduce. Acknowledgments The authors thank Armin Sorooshian and the editor for helpful suggestions and conversations. The work presented in this paper was supported by the National Science Foundation under Grant Number 1029337. This work was also supported by an allocation of computing time from the Ohio Supercomputer Center. References Alaton, P., Djehiche, B., Stillberger, D., 2002. On modelling and pricing weather derivatives. Applied Mathematical Finance 9, 1–20. ˇ Benth, F. E., Saltyt˙ e-Benth, J., 2005. Stochastic Modelling of Temperature Variations with a View Towards Weather Derivatives. Applied Mathematical Finance 12, 53–85. Campbell, S. D., Diebold, F. X., March 2005. Weather Forecasting for Weather Derivatives. Journal of the American Statistical Association 100, 6–16. Chen, N., Qian, Z., Meng, X., Nabney, I. T., 3-9 August 2013. Short-TermWind Power Forecasting Using Gaussian Processes. In: 23rd International Joint Conference on Artificial Intelligence. Beijing, China. Chowdhury, B. H., Rahman, S., 4-8 May 1987. Forecasting sub-hourly solar irradiance for prediction of photovoltaic output. In: 19th IEEE Photovoltaic Specialists Conference. Institute of Electrical and Electronics Engineers, New Orleans, Louisiana, USA. 13 Erdem, E., Shi, J., April 2011. ARMA based approaches for forecasting the tuple of wind speed and direction. Applied Energy 88, 1405–1414. Giebel, G., Brownsword, R., Kariniotakis, G., Denhard, M., Draxl, C., January 2011. The State of the Art in Short-Term Prediction of Wind Power A Literature Overview, 2nd Edition. Tech. rep., ANEMOS.plus. Hammer, A., Heinemann, D., Lorenz, E., L¨ uckehe, B., July 1999. Short-term forecasting of solar radiation: a statistical approach using satellite data. Solar Energy 67, 139–150. Heinemann, D., Lorenz, E., Girodo, M., 2006. Forescasting of Solar Radiation. In: Dunlop, E. D., Wald, L., Suri, M. (Eds.), Solar Energy Resource Management for Electricity Generation from Local Level to Global Scale. Nova Publishers, Ch. 7, pp. 83–94. Li, G., Shi, J., July 2010. On comparing three artificial neural networks for wind speed forecasting. Applied Energy 87, 2313–2320. Marquez, R., Coimbra, C. F. M., 13-17 May 2012. Comparison of Clear-Sky Models for Evaluating Solar Forecasting Skill. In: Proceedings of the World Renewable Energy Forum 2012. American Solar Energy Society, Denver, pp. 4443–4449. Miranda, M. J., Fackler, P. L., 2002. Applied Computational Economics and Finance. MIT Press, Cambridge, Massachusetts. Oetomo, T., Stevenson, M. J., August 2005. Hot or Cold? A Comparison of Different Approaches to the Pricing of Weather Derivatives. Journal of Emerging Market Finance 4, 101–133. Perez, R., Kivalov, S., Schlemmer, J., Hemker Jr., K., Renn´e, D., Hoff, T. E., December 2010. Validation of short and medium term operational solar radiation forecasts in the US. Solar Energy 84, 2161–2172. Perez, R., Moore, K., Wilcox, S. M., Renn´ e, D., Zelenka, A., June 2007. Forecasting solar radiation—Preliminary evaluation of an approach based upon the national forecast database. Solar Energy 81, 697–838. Remund, J., Perez, R., Lorenz, E., 2008. Comparison of Solar Radiation Forecasts for the USA. In: 2008 European PV Conference. Valencia, Spain. ˇ Saltyt˙ e-Benth, J., Benth, F. E., Jalinskas, P., 2007. A Spatial-temporal Model for Temperature with Seasonal Variance. Journal of Applied Statistics 34, 823–841. Svec, J., Stevenson, M. J., 2007. Modelling and forecasting temperature based weather derivatives. Global Finance Journal 18, 185–204. Taylor, J. W., Buizza, R., August 2004. A comparison of temperature density forecasts from GARCH and atmospheric models. Journal of Forecasting 23, 337–355. Taylor, J. W., Buizza, R., January-March 2006. Density forecasting for weather derivative pricing. International Journal of Forecasting 22, 29–42. Traiteur, J. J., Callicutt, D. J., Smith, M., Roy, S. B., October 2012. A Short-Term Ensemble Wind Speed Forecasting System for Wind Power Applications. Journal of Applied Meteorology and Climatology 51, 1763–1774. Wilcox, S. M., August 2012. National Solar Radiation Database 1991-2010 Update: User’s Manual. Tech. Rep. NREL/TP5500-54824, National Renewable Energy Laboratory. 14
© Copyright 2024