An erratum has been published for this article.
Agronomy Journal 92:1047-1053 (2000)
© 2000 American Society of Agronomy

WHEAT

Forecasting Spring Wheat Yield Using Time Series Analysis

A Case Study for the Canadian Prairies

Vijendra Kumar Boken

Dep. of Geography, Southwest Texas State Univ., San Marcos, TX 78666-4616 USA

vijendra_boken{at}und.nodak.edu


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 Time series techniques for...
 The stationary series case
 The nonstationary series case
 Modeling the yield series
 Evaluation of forecasting...
 Concluding remarks
 REFERENCES
 
Techniques commonly used for wheat (Triticum aestivum L.) yield estimation employ weather data over the growing season. However, yield estimates are also required before wheat is sown, particularly by the grain-exporting agencies to help them determine, in advance, wheat-export targets. In that case, time series techniques relying on past yield data can be used for yield forecasting. In this paper, a procedure for applying time series analysis to forecast yield is described. Six techniques (linear trend, quadratic trend, simple exponential smoothing, double exponential smoothing, simple moving averaging, and double moving averaging) were tested to model the average spring wheat yield series for Saskatchewan, Canada. Using 1975–1993, 1975–1994, and 1975–1995 spring wheat yield data, yields were forecasted for 1994, 1995, and 1996, respectively. Based on a deterministic measure (i.e., mean squared error, MSE), it was found that the quadratic model produced the most accurate forecast during the model development periods (1975–1993, 1975–1994, and 1975–1995) and model testing periods (1994, 1995, and 1996). Further, a discussion is provided on improving the forecast by forecasting the yield for homogeneous subareas (within Saskatchewan) instead of for Saskatchewan as a single unit. The subareas could be constructed on the basis of soil-climatic conditions or yield fluctuation, using a geographic information system.

Abbreviations: AR, autoregressive • ARMA, autoregressive moving average • ARIMA, autoregressive integrated moving average • CWB, Canadian Wheat Board • MSE, mean squared error • GIS, geographic information system


    INTRODUCTION
SPRING WHEAT (hereafter referred to as wheat) is a major export crop of the Canadian Prairies. The Prairies extend northwards from 49°N (Canada-USA border) to 54°N lat, and east-west from eastern Manitoba, across Saskatchewan to western Alberta, between approximately 96°W and 114°W long. The Prairie climate is classified as continental (Hare and Thomas, 1974) with severe winters and a short growing season. Wheat is sown in the late spring (from end of April to early May) and harvested in the early autumn (from late August to early September). Of the total wheat produced on the Prairies, approximately 75% is exported. In the past 10 yr, annual wheat exports have ranged from 10 to 20 million tonnes, with an average of about 15 million tonnes, as is evident from the annual reports on exports (e.g., Canadian Grain Commission, 1996). These exports contribute significantly to the Canadian economy. The Canadian Wheat Board (CWB), the agency responsible for exporting wheat, requires preharvest estimates of wheat yield (production per unit of area) to determine wheat quantities available for export in the year ahead. The accuracy of these estimates affects export profits: the more accurate the yield estimates, the wiser the economic decisions and the higher the profits may be.

In this paper, the yield estimates are classified into two categories: (i) long-term, and (ii) short-term estimates. The long-term estimates are required during October–November preceding the sowing year (i.e., at the initial stage of the export planning), and the short-term estimates are required around the harvest time (i.e., at the final stage of export planning). As the points in time at which these two types of estimates are required differ, so do the groups of techniques that can be employed to obtain them. Techniques for obtaining the short-term estimates use weather data over the growing season (e.g., Sakamoto, 1978; Idso et al., 1979; Slabbers and Dunin, 1981; Barnett and Thompson, 1982; Diaz et al., 1983; Campbell et al., 1988; Cordery and Graham, 1989; Walker, 1989; Raddatz et al., 1994; Toure et al., 1995; Kumar and Panu, 1997; Kumar, 1998). Such weather data are not yet available when the long-term estimates are required, that is, before wheat is sown. As the yield is known to be most influenced by weather conditions during the growing season, it is a common practice to estimate yield using weather data; attempts to obtain long-term estimates without using weather data are limited.

As an alternative to weather data, past yield data (i.e., the time series of yield data in past years) are used to obtain the long-term yield estimates by modeling the series using appropriate techniques. Presently, only the linear trend technique is commonly applied to obtain the long-term estimates for the Prairies. Other factors such as economic and market projections (e.g., projected market prices for wheat and other crops) are also taken into account, but the exact mechanism for their inclusion into the yield-estimating procedure was not available. Hence the focus of this paper is restricted to improving the long-term estimates using only past yield data.

With this objective, Saskatchewan was selected for the study as it contributes the largest proportion of exported wheat (about 60% of the total) among the Prairie provinces; the other two Prairie provinces (Alberta and Manitoba) contribute only 25 and 15%, respectively (Walker, 1989).

Various time series models were developed using average yields for the study area for the 1975 to 1996 period. The following section explains theoretical concepts to model a time series using different techniques that are relevant to the present dataset. These techniques include the trend, moving averaging, and exponential smoothing techniques. The subsequent sections describe how these techniques are employed to model the yield series. Finally, a discussion is provided on selection of the best technique and how to further improve the forecast.


    Time series techniques for forecasting: theoretical concepts
In a time series analysis, a variable to be forecasted (yield, in the present case) is modeled as a function of time, that is,

$$Y_t = f(t) + \varepsilon_t \tag{1}$$
where $Y_t$ is yield for year $t$, $f(t)$ is a function of time $t$, and $\varepsilon_t$ refers to error (i.e., the difference between reported yield and forecasted yield for year $t$). Once a functional relationship between yield and time (in other words, a time series model) is established, yield can be forecasted for year $t + 1$. The process for developing this model begins by examining whether the time series under study is stationary or nonstationary.

Series Categorization—Stationary or Nonstationary
A time series can be broadly categorized as: (i) stationary, or (ii) nonstationary. If the statistical properties (mean, variance, and autocorrelation) of the series are independent of time, the series is categorized as stationary; otherwise it is referred to as nonstationary. Categorization of a series is a prerequisite to selection of groups of time series techniques that can be considered to model the series.

The Augmented Dickey-Fuller test is widely used for testing whether a series is stationary or nonstationary (Gujarati, 1995). To execute this test, a regression equation is developed between two variables: (i) $Y_t$, as the dependent variable, and (ii) $Y_{t-1}$, as the explanatory variable. It is then tested whether the regression coefficient thus obtained can be statistically treated as equal to 1. If it can, the series has a unit root and is nonstationary; the series is treated as stationary only if this hypothesis is rejected. The decision is made by comparing the absolute Dickey-Fuller $\tau$ statistic (i.e., the estimated regression coefficient minus 1, divided by its standard error) with the absolute critical $\tau$ statistic: the series is stationary when the former exceeds the latter.
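The test logic can be sketched in a few lines. The following is a minimal illustration, not the SHAZAM procedure used in this study: it runs a plain (non-augmented) Dickey-Fuller regression with an intercept only, on made-up series. The function name, the example data, and the choice of regression terms are assumptions for illustration; the appropriate critical value depends on which terms the regression includes.

```python
import math

def dickey_fuller_tau(y):
    """Tau statistic from regressing (y[t] - y[t-1]) on an intercept and y[t-1].

    The slope of this regression equals (rho - 1), where rho is the coefficient
    of y[t-1] in a regression of y[t] on y[t-1], so tau = slope / standard error.
    """
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    slope = sxd / sxx
    intercept = mean_d - slope * mean_x
    sse = sum((d - intercept - slope * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope / std_err

# Made-up series for illustration only (not the Saskatchewan data).
reverting = [10, 12, 9, 13, 8, 12, 10, 11, 9, 12, 10]
trending = [10, 11, 13, 14, 16, 17, 19, 21, 22, 24]

# Absolute critical value reported in this paper for its own test setup;
# used here only as a placeholder threshold.
CRITICAL_TAU = 3.13

for name, series in [("reverting", reverting), ("trending", trending)]:
    tau = dickey_fuller_tau(series)
    # Tau is normally negative, so this matches the absolute-value comparison.
    verdict = "stationary" if tau < -CRITICAL_TAU else "nonstationary"
    print(name, round(tau, 2), verdict)
```

The series that keeps returning to its mean produces a strongly negative $\tau$, while the steadily rising series does not, so only the former would be classified as stationary.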

A series tested as nonstationary may be transformed to a stationary series. Such a transformation will pave the way for application of techniques otherwise extraneous to the nonstationary series, resulting in employment of an increased number of techniques to model the series. The greater the number of models developed to describe a series, the more reliable the selection of the best technique.

An easy and commonly used method for transforming a nonstationary series to a stationary one is the method of differencing. By applying this method to the present case, a nonstationary series is transformed to a new series by subtracting $Y_{t-1}$ from $Y_t$ (when the level of differencing is 1), $Y_{t-2}$ from $Y_t$ (when the level of differencing is 2), ..., and $Y_{t-n}$ from $Y_t$ (when the level of differencing is $n$). To begin the process of the transformation, the level of differencing is chosen as 1 and the transformed series is tested for stationarity. If the series does not pass the test, the next higher level of differencing is chosen and the process of transformation is repeated. Usually, up to a third level of differencing is sufficient to transform a series.
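As a small illustration, the transformation described above can be written as follows. The function name and the example values are hypothetical. The sketch follows the wording of this section, in which level $n$ means subtracting the value $n$ years earlier; many references instead define second-order differencing as differencing the first differences, which gives a different series.

```python
def difference(series, level=1):
    """Return the series Y_t - Y_(t-level), as described in the text."""
    if level < 1 or level >= len(series):
        raise ValueError("level must be between 1 and len(series) - 1")
    return [series[t] - series[t - level] for t in range(level, len(series))]

# Made-up values for illustration only.
yields = [12, 15, 14, 19, 17, 21]

print(difference(yields, 1))  # level 1: Y_t - Y_(t-1)
print(difference(yields, 2))  # level 2: Y_t - Y_(t-2)
```

Each level of differencing shortens the series by that many observations, which matters for a series as short as the one modeled here.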

The above process of series-categorization and transformation leads to the next step—modeling the series using appropriate time-series techniques. Different techniques can be applied to model stationary and nonstationary series. However, because of the inherent shortness of the series in the present case as elaborated later in the paper, only simple forecasting techniques are considered.


    The stationary series case
Stationary series (original or transformed) can be modeled using the simple moving averaging and the simple exponential smoothing; the more complicated Box-Jenkins techniques such as the autoregressive moving averaging (ARMA) and autoregressive integrated moving averaging (ARIMA) techniques were excluded from the analysis.

Simple Moving Averaging
There are two types of averaging techniques: (i) simple averaging and (ii) moving averaging. With the simple averaging technique, the mean of all observations (i.e., yields in current and past years) in a series is used to forecast yield for the next year. In contrast, with the moving averaging technique the analyst is more concerned with recent observations. A mean is computed for a specified number of the most recent observations, and this mean is used to forecast the next observation. As a new observation becomes available, a new mean is computed by dropping the oldest observation and including the newest one. The new mean is then used to forecast yield for the next year, as expressed by the following equation.

$$\hat{Y}_{t+1} = \frac{Y_t + Y_{t-1} + \cdots + Y_{t-p+1}}{p} \tag{2}$$
where $\hat{Y}_{t+1}$ is forecasted yield for year $t + 1$, $Y_t$ is reported yield for year $t$, and $p$ is the number of terms specified in the moving averaging technique.
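A minimal sketch of the moving average forecast follows; the function name and the example values are made up for illustration.

```python
def moving_average_forecast(series, p):
    """Forecast the next value as the mean of the p most recent observations."""
    if p < 1 or p > len(series):
        raise ValueError("p must be between 1 and len(series)")
    return sum(series[-p:]) / p

# Made-up values for illustration only.
yields = [10, 12, 14, 16, 18]

# With p = 4 the oldest observation (10) is ignored.
print(moving_average_forecast(yields, 4))  # → 15.0
```

Because the example series rises steadily, the forecast of 15 falls below the latest observation of 18, which shows why a simple moving average is suited to stationary rather than trending series.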

In averaging techniques, equal weights are given to all observations. If more weights are to be given to the most recent observations, exponential smoothing is selected to model the series.

Simple Exponential Smoothing
This technique is based on averaging the series data with exponentially decreasing weights. The weights used are $w$ for the most recent observation, $w(1-w)$ for the observation before it, $w(1-w)^2$ for the one before that, and so on. The weight, $w$, is termed the smoothing constant and ranges from 0 to 1. The actual value of $w$ determines the extent to which the most current observation influences the forecast. The resulting exponential smoothing leads to the following equation.

$$S_t = w Y_t + (1 - w) S_{t-1} \tag{3}$$
where $S_t$ and $S_{t-1}$ are the smoothed values for year $t$ and $t-1$, respectively; the smoothed value for year $t$ becomes the forecasted value for year $t + 1$.
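The recursion can be sketched as below. The starting value is not specified in this section, so the sketch assumes the common convention of initializing the smoothed value with the first observation; the function name and data are likewise illustrative.

```python
def exponential_smoothing(series, w):
    """Return the smoothed values S_t = w * Y_t + (1 - w) * S_(t-1).

    Assumption: the recursion starts from the first observation.
    """
    if not 0 <= w <= 1:
        raise ValueError("w must lie between 0 and 1")
    smoothed = []
    previous = series[0]
    for y in series:
        previous = w * y + (1 - w) * previous
        smoothed.append(previous)
    return smoothed

# Made-up values for illustration only.
yields = [10, 20, 30]

smoothed = exponential_smoothing(yields, 0.5)
print(smoothed)      # → [10.0, 15.0, 22.5]
print(smoothed[-1])  # last smoothed value, used as the forecast for the next year
```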

In order to estimate an optimum value for w, a procedure is used which minimizes the MSE in the forecasts; the MSE is defined as follows:

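$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \hat{Y}_i \right)^2 \tag{4}$$
where $\hat{Y}_i$ is estimated yield for year $i$, $Y_i$ is reported yield for year $i$, and $N$ is the total number of observations.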
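One way to search for the smoothing constant is to try a set of candidate values and keep the one with the smallest MSE of the one-step-ahead forecasts. The sketch below assumes a grid of 0.1 to 0.9 and a recursion started from the first observation; the function names and the two example series are hypothetical.

```python
def mse(reported, estimated):
    """Mean squared error between reported and estimated values."""
    return sum((y - f) ** 2 for y, f in zip(reported, estimated)) / len(reported)

def one_step_forecasts(series, w):
    """Forecast for each year after the first: the previous year's smoothed value."""
    smoothed = series[0]  # assumption: start from the first observation
    forecasts = []
    for y in series[1:]:
        forecasts.append(smoothed)
        smoothed = w * y + (1 - w) * smoothed
    return forecasts

def best_smoothing_constant(series, candidates):
    """Return the candidate w with the smallest one-step-ahead MSE."""
    scored = [(mse(series[1:], one_step_forecasts(series, w)), w) for w in candidates]
    return min(scored)[1]

candidates = [k / 10 for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Made-up series for illustration only.
fluctuating = [10, 12, 8, 12, 8, 12, 8, 12, 8]  # swings around a fixed level
rising = [1, 2, 3, 4, 5, 6]                     # steady upward movement

print(best_smoothing_constant(fluctuating, candidates))  # → 0.1
print(best_smoothing_constant(rising, candidates))       # → 0.9
```

A small constant wins when the series fluctuates around a stable level, because chasing each swing increases the error; a large constant wins when the level keeps moving.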


    The nonstationary series case
The following are some of the techniques that can be employed to model a nonstationary series.

Trend Analysis
Because of numerous factors (e.g., technology, fertilizer use, pest management, and use of improved seed varieties), yield series may show a linear or quadratic trend, as defined by Eq. [5] and [6], respectively.

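$$Y_t = \beta_0 + \beta_1 t + \varepsilon_t \tag{5}$$

$$Y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \varepsilon_t \tag{6}$$
where $\beta_0$, $\beta_1$, and $\beta_2$ are coefficients.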
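Both trend models can be fitted by ordinary least squares. The sketch below solves the normal equations directly so that it needs no external library; it assumes time is indexed as 1, 2, ..., n (the indexing used in the study is not stated), and the function names and data are illustrative.

```python
def fit_polynomial_trend(y, degree):
    """Least-squares coefficients [b0, b1, ...] of y on powers of t = 1..n."""
    t_vals = list(range(1, len(y) + 1))
    m = degree + 1
    # Normal equations (X'X) b = X'y, where X has columns t^0, t^1, ..., t^degree.
    a = [[sum(t ** (i + j) for t in t_vals) for j in range(m)] for i in range(m)]
    b = [sum((t ** i) * yt for t, yt in zip(t_vals, y)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (b[i] - tail) / a[i][i]
    return coef

def predict(coef, t):
    """Evaluate the fitted trend at time index t."""
    return sum(c * t ** k for k, c in enumerate(coef))

# Made-up series for illustration only.
linear_series = [5, 8, 11, 14, 17]               # exactly 2 + 3t
quadratic_series = [1.75, 3, 4.75, 7, 9.75, 13]  # exactly 1 + 0.5t + 0.25t^2

linear_coef = fit_polynomial_trend(linear_series, 1)
quadratic_coef = fit_polynomial_trend(quadratic_series, 2)
print(predict(linear_coef, 6))     # forecast for the year after the linear series
print(predict(quadratic_coef, 7))  # forecast for the year after the quadratic series
```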

In addition to a trend, the series may contain a cyclic variation (wave like fluctuation around the trend). To model such a nonstationary cycle, a double moving averaging technique is used (Hanke and Reitsch, 1995).

Double Moving Averaging
A double moving averaging technique is used when a nonstationary series has a linear trend. With this technique, one set of the moving averages is computed first and then a second set of the moving averages is computed from the first one.
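This section does not give formulas for the technique, so the sketch below assumes the form commonly presented in forecasting textbooks: the level is taken as twice the first moving average minus the second, and the slope as 2/(k − 1) times their difference, where k is the number of terms. The function name and data are illustrative.

```python
def double_moving_average_forecast(series, k, horizon=1):
    """Forecast `horizon` steps ahead using moving averages of moving averages.

    Assumed textbook form:
        level = 2 * M1 - M2
        slope = 2 / (k - 1) * (M1 - M2)
        forecast = level + slope * horizon
    where M1 is the latest k-term moving average of the series and M2 is the
    latest k-term moving average of those moving averages.
    """
    if k < 2 or len(series) < 2 * k - 1:
        raise ValueError("need k >= 2 and at least 2k - 1 observations")
    first = [sum(series[i - k + 1:i + 1]) / k for i in range(k - 1, len(series))]
    second = [sum(first[i - k + 1:i + 1]) / k for i in range(k - 1, len(first))]
    m1, m2 = first[-1], second[-1]
    level = 2 * m1 - m2
    slope = 2 / (k - 1) * (m1 - m2)
    return level + slope * horizon

# Made-up series with a perfect linear trend, for illustration only.
yields = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

print(double_moving_average_forecast(yields, 4))
```

On this perfectly linear example the forecast continues the line to 22, whereas a single 4-term moving average would give 17; the second averaging step is what corrects the lag.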

Double Exponential Smoothing
With this technique, each observation in a series is assumed to be the summation of two components: (i) level, L, and (ii) trend, T, components. These components can be estimated using a classical approach, wherein the initial estimates of the components are computed by fitting a linear regression between the series variable (i.e., yield) and time. The resulting regression equation has two parameters: a regression constant and a slope coefficient. The regression constant is used as an initial estimate of the level component, and the slope coefficient is used as the initial estimate of the trend component. The subsequent estimates of the level and trend components are obtained by the following updating equations (Hanke and Reitsch, 1995):

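$$L_t = w_1 Y_t + (1 - w_1)(L_{t-1} + T_{t-1}) \tag{7}$$

$$T_t = w_2 (L_t - L_{t-1}) + (1 - w_2) T_{t-1} \tag{8}$$

$$\hat{Y}_{t+1} = L_t + T_t \tag{9}$$

In Eq. [7] and [8], $w_1$ and $w_2$ are smoothing constants whose optimum values can be estimated by an iteration method, minimizing the MSE; in Eq. [9], $\hat{Y}_{t+1}$ is the forecasted yield for year $t + 1$.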
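The procedure can be sketched as follows, assuming the updating equations take the standard two-parameter (Holt) form found in forecasting textbooks and that time is indexed 1, 2, ..., n so that the regression constant is the level just before the first year. The function names, the candidate grid, and the data are illustrative assumptions.

```python
def linear_fit(y):
    """Intercept and slope of an ordinary least-squares line of y on t = 1..n."""
    n = len(y)
    t = list(range(1, n + 1))
    mean_t, mean_y = sum(t) / n, sum(y) / n
    slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
             / sum((ti - mean_t) ** 2 for ti in t))
    return mean_y - slope * mean_t, slope

def holt(series, w1, w2):
    """Return (one-step-ahead forecast for each year, forecast for the next year)."""
    level, trend = linear_fit(series)  # initial level (at t = 0) and initial trend
    forecasts = []
    for y in series:
        forecasts.append(level + trend)  # forecast made before y is observed
        new_level = w1 * y + (1 - w1) * (level + trend)
        trend = w2 * (new_level - level) + (1 - w2) * trend
        level = new_level
    return forecasts, level + trend

def mse(reported, estimated):
    """Mean squared error between reported and estimated values."""
    return sum((y - f) ** 2 for y, f in zip(reported, estimated)) / len(reported)

def best_constants(series, candidates):
    """Try every (w1, w2) pair; return (mse, w1, w2) for the smallest MSE."""
    return min((mse(series, holt(series, w1, w2)[0]), w1, w2)
               for w1 in candidates for w2 in candidates)

candidates = [k / 10 for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Made-up values for illustration only.
yields = [20, 23, 21, 26, 25, 29, 27, 31]

print(best_constants(yields, candidates))
print(holt(yields, 0.1, 0.1)[1])  # forecast for the year after the series ends
```

The exhaustive search over 81 pairs is cheap for a series of this length, which is presumably why a simple iteration over a coarse grid is adequate.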

The theoretical concepts guiding the time series modeling have been explained above. Following these concepts, time series consisting of annual average yields for Saskatchewan were modeled to obtain the long-term yield estimates.


    Modeling the yield series
There are 20 crop districts in Saskatchewan, for which yield data have been available from the CWB since 1975; the original source of the data was Statistics Canada. From the district-level data, annual average yields for Saskatchewan were computed and thus a 1975 to 1996 yield series was generated (Table 1). From this series, three separate series (1975–1993, 1975–1994, and 1975–1995) were extracted for this study, to broaden the analysis and thereby support more reliable conclusions. The yields were forecasted for 1994, 1995, and 1996 by modeling the 1975 to 1993, 1975 to 1994, and 1975 to 1995 series, respectively, as explained below.


Table 1 Average spring wheat yield for Saskatchewan, 1975 to 1996

 
As a first step in the development of a time series model, the stationarity of a series is examined by the Augmented Dickey-Fuller test. The COINT procedure of SHAZAM statistical software (SHAZAM, version 7, Vancouver, BC; McGraw Hill) was applied to perform this test on the three series selected for the study. Table 2 contains the $\tau$ statistics obtained as a result of the test. It can be observed from Table 2 that the original yield series (i.e., with zero level of differencing) are nonstationary because their absolute $\tau$ statistic is less than the absolute critical $\tau$ statistic (3.13). It is only after the second level of differencing that each of the three series becomes stationary, that is, its absolute $\tau$ value exceeds the absolute critical $\tau$ value. Three original series (nonstationary) and their corresponding transformed series (stationary) are now available to be modeled using time series techniques. Following the theoretical concepts explained earlier regarding the selection of a technique to model a series, four techniques (linear trend, quadratic trend, double exponential smoothing, and double moving averaging) were applied to model the original series, and two techniques (simple exponential smoothing and simple moving averaging) were applied to model the transformed series. The first four techniques yield forecasts for the original series directly; for the latter two, forecasts for the original series were obtained indirectly, from the forecasts for the transformed series.
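The text does not spell out how a forecast for the transformed series was converted back to a yield forecast. One plausible reading, sketched below, is to add the forecasted difference to the reported yield from the appropriate earlier year. The sketch assumes that level-$n$ differencing means subtracting the value $n$ years earlier and uses a simple moving average on the differenced series; the function name and data are hypothetical.

```python
def forecast_via_differencing(series, level, p):
    """Forecast the next value of `series` from its differenced version.

    Steps (all assumptions for illustration):
      1. difference the series: D_t = Y_t - Y_(t-level)
      2. forecast the next difference as the mean of the last p differences
      3. undo the differencing: next Y = forecasted difference + Y from
         `level` years before the forecast year
    """
    diffs = [series[t] - series[t - level] for t in range(level, len(series))]
    if p < 1 or p > len(diffs):
        raise ValueError("p must be between 1 and the number of differences")
    diff_forecast = sum(diffs[-p:]) / p
    return diff_forecast + series[len(series) - level]

# Made-up values for illustration only.
yields = [10, 12, 14, 16, 18, 20]

print(forecast_via_differencing(yields, level=2, p=4))  # → 22.0
```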


Table 2 The $\tau$ statistics computed for different yield series for Saskatchewan. (Asymptotic critical value at the 10% significance level = -3.13)

 
The coefficients for trend models and their standard errors were obtained using JMP-IN software (SAS Inst., Cary, NC; Duxbury Press, New York) and are presented in Table 3. There was no coefficient involved in the moving averaging techniques; the only specification was the number of terms, which was chosen as 4. To develop the simple and double exponential smoothing models, separate computer programs were written in C++. Smoothing constants ($w$, $w_1$, and $w_2$) ranging from 0.1 to 0.9, at an interval of 0.1, were considered. In the case of simple exponential smoothing, the MSE was at its minimum for $w$ equal to 0.1. In the case of double exponential smoothing, all combinations of $w_1$ and $w_2$ were considered; a value of 0.1 for both $w_1$ and $w_2$ resulted in the minimum MSE.


Table 3 Coefficients of trend models developed for spring wheat yield series for Saskatchewan

 
The forecasting performance of the time series models developed above is illustrated in Fig. 1 during two periods: (i) period of model development and (ii) period of model testing. The period of model development refers to the range of years that was considered to model a series; and the period of model testing refers to the year for which yield was forecasted using the model. The following section discusses the process for evaluating the performance of different techniques in an attempt to select the best one.





Fig. 1 (I, II, III). Comparing the spring wheat yields as forecasted by various time series techniques, with the reported yields for Saskatchewan

 

    Evaluation of forecasting techniques
Although various stochastic and deterministic measures are available in the literature to evaluate the performance of forecasting techniques, in this paper the MSE (a deterministic method) was used for the purpose. While the more appropriate period for evaluating the performance of a time series technique is the model testing period, the model testing period includes only a single year in the present case. It was therefore considered appropriate to evaluate the performance also over the model development period. Tables 4 and 5 summarize the performance of the techniques during the model development and the model testing periods, respectively. During both periods, the MSE was a minimum for the case of the quadratic trend. Hence, the quadratic trend technique can be treated as the best forecasting technique. It should, however, be noted here that a technique concluded as the best on the basis of a deterministic method may not remain the best if a stochastic method (based on a regression analysis between the reported yields and forecasted yields) is used to evaluate its performance. In order to underpin the process of selection for the best forecasting technique, further research may be required to develop a hybrid method for evaluating the forecasting performance of a technique using both deterministic and stochastic methods.
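The caution about rankings changing under a regression-based measure can be illustrated with a small sketch. The text says only that the stochastic method is based on a regression between reported and forecasted yields; using the coefficient of determination from that regression as the score is an assumption made here, as are the function names and all numbers.

```python
def mse(reported, forecasted):
    """Mean squared error between reported and forecasted values."""
    return sum((y - f) ** 2 for y, f in zip(reported, forecasted)) / len(reported)

def regression_fit(reported, forecasted):
    """Regress reported on forecasted; return (intercept, slope, r_squared)."""
    n = len(reported)
    mean_f = sum(forecasted) / n
    mean_r = sum(reported) / n
    sff = sum((f - mean_f) ** 2 for f in forecasted)
    srr = sum((r - mean_r) ** 2 for r in reported)
    sfr = sum((f - mean_f) * (r - mean_r) for f, r in zip(forecasted, reported))
    slope = sfr / sff
    intercept = mean_r - slope * mean_f
    return intercept, slope, sfr ** 2 / (sff * srr)

# Made-up numbers for illustration only.
reported = [20, 22, 25, 24]
biased = [18, 20, 23, 22]  # hypothetical technique: always 2 units too low
noisy = [21, 21, 24, 25]   # hypothetical technique: unbiased but erratic

for name, forecasts in [("biased", biased), ("noisy", noisy)]:
    intercept, slope, r2 = regression_fit(reported, forecasts)
    print(name, "MSE:", mse(reported, forecasts), "R^2:", round(r2, 3))
```

In this example the erratic technique has the smaller MSE, yet the consistently biased one tracks the reported yields perfectly in the regression, so the two measures rank the techniques in opposite order.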


Table 4 Evaluating the performance of various techniques to forecast spring wheat yield for Saskatchewan, during the model development period

 

Table 5 Evaluating the performance of various techniques to forecast spring wheat yield for Saskatchewan, during the model testing period

 
Improving the Forecast
After the best technique is selected, it is worth exploring how to further improve the forecast for the study area. To that end, events that lower the performance of the forecasting techniques need to be examined. One such event relevant to the present case is the occurrence of a severe drought (i.e., when yield is significantly lower than the average yield, as happened in 1988). It can be observed from Fig. 1 that the estimated yield for the 1988 drought year deviated widely from the reported yield. This deviation contributed significantly to increasing the MSE and, in turn, lowering the forecasting performance. In addition, the yield for a severe drought year acts as an outlier and influences characteristics (e.g., stationarity) of the series. Although severe drought is an essential component of a yield series for a drought-prone region such as the Prairies, the series in the present case is so short that the event behaves as an outlier rather than as an integral component of the series. A longer series containing a greater number of severe drought events would better withstand the negative impacts of severe droughts on the performance of a forecasting technique and would lead to improved forecasts. A longer series would also make it possible to expand the model testing period and so strengthen conclusions regarding selection of the best technique.

Besides severe drought, a high degree of fluctuation in average reported yields (Fig. 1) complicates forecasting. This complexity could be reduced by constructing homogeneous subareas (within the study area) and forecasting yield separately for each subarea. Forecasted yields from the subareas could then be aggregated to obtain the forecast for the study area, though an appropriate method for combining the forecasts from the subareas still needs to be developed. Since the forecasts for the subareas are expected to be more accurate because of a smaller degree of yield fluctuation, the overall procedure for obtaining the long-term yield estimates for Saskatchewan should become more accurate.

Although the concept of subarea-based yield estimation appears logical, a question arises: how can subareas be constructed for which the yearly fluctuation in yield is less irregular in time and space? This can be attempted, for example, in two ways: (i) by choosing soil-climatic zones as the subareas, or (ii) by applying spatial analysis to the district-level yield data. While information on soil-climatic zones for a region is readily available, a brief description follows regarding the spatial analysis.

There are 20 crop districts in Saskatchewan whose average yields vary significantly (Table 6). In a map with crop district boundaries, the average yield can be shown as point locations (e.g., the centroids of the crop districts). From the spatial distribution of these 20 points (representing district yields), the entire cropped area in the province can be divided into much finer grids. For each grid cell the average yield could be estimated from the district-level yields using interpolation methods (e.g., trend, kriging, inverse distance weighted) available with geographic information system (GIS) software. The grid cells can then be grouped into fewer zones (say, five) by selecting a yield range for each zone (Boken, V.K., M. Hemmasi, and R.P. Waldkirch. 1999. Mapping spatial variation in spring wheat yields in North Dakota using geographic information system. Poster presented at 2nd EPSCoR conference held on Sept. 10, at the North Dakota State University, Fargo (unpublished data)).
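The interpolation and zoning steps can be sketched without GIS software. The example below uses inverse distance weighting, one of the interpolation methods named above, followed by assignment of each interpolated value to a yield range. The coordinates, yields, zone breaks, and function names are all hypothetical.

```python
def idw(points, x, y, power=2):
    """Inverse-distance-weighted estimate at (x, y).

    `points` is a list of (px, py, value) tuples, e.g., district centroids
    with their average yields.
    """
    numerator = denominator = 0.0
    for px, py, value in points:
        dist_sq = (px - x) ** 2 + (py - y) ** 2
        if dist_sq == 0:
            return value  # exactly on a known point
        weight = 1 / dist_sq ** (power / 2)
        numerator += weight * value
        denominator += weight
    return numerator / denominator

def zone_of(value, breaks):
    """Index of the yield range containing `value`.

    `breaks` holds ascending upper bounds; values above the last bound
    fall in the final zone.
    """
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks)

# Hypothetical centroids (x, y, average yield) and zone breaks.
centroids = [(0, 0, 10), (2, 0, 20)]
breaks = [12, 18]  # three zones: <= 12, 12 to 18, > 18

estimate = idw(centroids, 1, 0)
print(estimate, zone_of(estimate, breaks))  # → 15.0 1
```

With only 20 district centroids as input, interpolated values between districts are smooth by construction, which is one reason finer yield data would be needed for the zones to carry real information.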


Table 6 Spring wheat yield variation in Saskatchewan (1975–1996)

 
In order for the concept of the subareas to succeed for improving the long-term yield estimates, the availability of yield data for much finer areal units (as opposed to a crop-district) will be required.


    Concluding remarks
In this paper, an attempt is made to obtain a long-term forecast of average spring wheat yield for Saskatchewan without using any weather data. The analysis is based on time series of past yields. The 1975–1993, 1975–1994, and 1975–1995 yield data were used, and six forecasting techniques (linear trend, quadratic trend, simple exponential smoothing, double exponential smoothing, simple moving averaging, and double moving averaging) were employed to model these series. Using a deterministic measure (i.e., MSE) for evaluating the performance of a forecasting technique, the quadratic-trend technique was found to be the best over the model development periods (1975–1993, 1975–1994, and 1975–1995) and model testing periods (1994, 1995, and 1996). These conclusions, nonetheless, can be strengthened by analyzing a longer yield series. In addition, the forecast of the average spring wheat yield for Saskatchewan may be improved if yields are forecasted for each of a few homogeneous subareas (within Saskatchewan, where yield variation is less irregular in time and space) rather than for Saskatchewan as a single unit. The subareas could be constructed on the basis of soil-climatic zones or using spatial analysis and interpolation tools available with GIS software. Forecasted yields from the subareas could be aggregated to obtain the forecast for the entire province. This approach, however, requires yield data at a much larger scale (i.e., for finer areal units) than the crop-district level, as well as further research on developing a robust method for combining the forecasts from the subareas.


    ACKNOWLEDGMENTS
 
The author thanks the anonymous reviewers for their comments, some encouraging and some critical, but all very useful in bringing the paper to its present form.

Received for publication August 9, 1998.