a Dep. of Crop and Soil Sciences, Michigan State Univ., East Lansing, MI 48824-1325
b W.K. Kellogg Biological Station and Dep. of Crop and Soil Sciences, Michigan State Univ., Hickory Corners, MI 49060-9516
* Corresponding author (kravche1{at}msu.edu)
Received for publication November 10, 2005.
| ABSTRACT |
|---|
Abbreviations: AIC, Akaike Information Criterion; LTER, Long-Term Ecological Research site; RCBD, randomized complete block design; SA, spatial analyses; SA1, random field analysis with correlated errors based on plot data; SA2, random field analysis with trend and spatially correlated residuals; SA2a, random field analysis with trend and correlated errors based on subsample data
| INTRODUCTION |
|---|
The traditional statistical analysis used most commonly in field research is RCBD analysis. Statistical analyses that can be more efficient than RCBD are analyses that account for spatial information, that is, for spatial correlations among experimental units. These include Papadakis analysis and its modifications (Kempton and Howes, 1981; Besag and Kempton, 1986; Bhatti et al., 1991; Ball et al., 1993; Brownie et al., 1993; Casler, 1999), trend analysis (Kirk et al., 1980; Warren and Mendez, 1982; Tamura et al., 1988; Bowman, 1990; Casler, 1999) and random field analysis (Zimmerman and Harville, 1991; Schabenberger and Pierce, 2002).
However, spatial analyses will not necessarily be better than RCBD in all possible circumstances. For example, the effectiveness of spatial procedures in comparison with RCBD depends on block size and orientation and on the strength of spatial correlation (Stroup, 2002). In particular, spatial analyses are not more effective than RCBD when the spatial structure of the studied variable cannot be accurately characterized. Accurate assessment of spatial structure as a rule requires a relatively large number of data points. Thus, spatial analyses often perform poorly in experiments with a relatively small number of experimental plots and/or subsamples. Understandably, spatial analyses have been used routinely and extensively in plant breeding experiments where the number of experimental plots often approaches or exceeds 100 (all previously cited references as well as Cargnelutti et al., 2003; Duarte and Vencovsky, 2005; Hamann et al., 2002; Segovia-Lerma et al., 2004; Weisz et al., 2005; Yang et al., 2004). Spatial analyses have been recommended recently for agronomic experiments with a large number (>60) of subsample yield measurements collected per experimental plot (Hong et al., 2005). However, successful applications of spatial analyses in general agronomic field experiments with smaller numbers of experimental plots have been rather limited.
We hypothesize that when soil C is the response variable under investigation, spatial analysis will be more effective than RCBD even in small-sized experiments. The reasoning leading to this hypothesis is as follows.
It has been noted that if the spatial structure of a studied property is relatively strong (nugget/sill ratio of 0.1 or less), fewer data points might be needed to identify the presence of spatial structure (Kravchenko, 2003) and to characterize it. Thus, for variables with strong spatial structure, accurate assessment can be obtained in smaller experiments than for variables with medium or weak spatial structure (nugget/sill ratios of 0.1 to 0.6 and >0.6, respectively). Among soil properties, soil C content often has been found to have strong spatial structure with relatively large spatial correlation ranges (e.g., Cambardella et al., 1994; Robertson et al., 1993, 1997; McBratney and Pringle, 1998; Mueller and Pierce, 2003; Terra et al., 2004).
Hence, spatial methods of analysis have the potential to provide soil C researchers with a powerful analysis tool that could enable detection of smaller differences between management treatments. Smaller differences in soil C can be generated by management effects in a shorter experimental time and, thus, at less expense. An additional benefit to soil C researchers is the ability to use secondary information from experimental sites in deciding on the potential efficiency and usefulness of different data analysis methods. Topography is among the main factors driving spatial distribution and variability of soil C. Thus, for experiments where soil C is the response variable, topographical diversity of the experimental site might be an indicator of whether spatial methods should be considered.
Thus, the objectives of this study are, first, to evaluate the potential performance of spatial methods of data analysis in small agricultural experiments where soil C is the primary variable of interest and to compare it with the performance of RCBD analysis and, second, to consider whether topographical characteristics of the potential experimental sites can be used by researchers as a decision guide for implementing spatial analyses.
Based on comparisons between the methods of spatial analysis conducted by Zimmerman and Harville (1991), Brownie et al. (1993), and Wu and Dutilleul (1999), we use here only the methods that were found to be most effective in previous studies. Specifically, we consider, (i) analysis with spatially correlated residuals, and (ii) analysis with a trend and spatially correlated residuals. We use actual total soil C measurements rather than simulated data to ensure that variability from both spatial variability patterns and measurement errors reflects realistic field conditions.
| MATERIALS AND METHODS |
|---|
A total of 597 elevation measurements collected from the LTER site with a land-based laser in 1988 (Robertson et al., 1997) were used to characterize topographical diversity of the studied plots. Distance between measurements ranged from 4 to 20 m and approximately eight to nine elevation measurements were available for each plot. The elevation measurements were converted into a cell-based terrain map on a 15- by 15-m grid by means of inverse distance weighting with a power of 2 and six nearest neighbors using ArcGIS 9.0 Spatial Analyst (ESRI, 2004). This terrain map has been reported by Kravchenko et al. (2005). Terrain slope and flow accumulation values were derived from the elevation map using ArcGIS 9.0 Spatial Analyst (ESRI, 2004).
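The interpolation step described above can be illustrated with a minimal sketch of inverse distance weighting using a power of 2 and the six nearest neighbors. This is not the ArcGIS implementation; the function name `idw_estimate` and the sample elevations are hypothetical and serve only to show how the weights are formed.

```python
import math

def idw_estimate(points, target, power=2.0, k=6):
    """Inverse-distance-weighted estimate at `target`.

    points: iterable of (x, y, z) measurements (z = elevation).
    target: (x, y) location of the grid cell to estimate.
    power, k: weighting exponent and number of nearest neighbors
              (2 and 6 match the settings reported in the text).
    """
    tx, ty = target
    # Distance from the target to every measurement, nearest first.
    dists = sorted((math.hypot(x - tx, y - ty), z) for x, y, z in points)
    nearest = dists[:k]
    # A measurement located exactly at the target is returned unchanged.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    weighted_sum = sum(w * z for w, (_, z) in zip(weights, nearest))
    return weighted_sum / sum(weights)

# Hypothetical elevation measurements (x, y in m; z in m above a datum).
sample = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw_estimate(sample, (1.0, 0.0))
```

Because the two hypothetical points are equidistant from the target, they receive equal weights and the estimate is their simple mean.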
Experiment Simulations
For experiment simulations we regarded each of the 11 experimental plots as a uniform separate field site with its corresponding set of soil C data. We will refer to the 11 LTER experimental plots as field sites and to the plots from the simulated experiments placed in these field sites simply as plots.
To compare performance of the statistical analyses, the total soil C data sets from each field site were overlaid with simulated RCBD experiments (Fig. 1). A simulated experiment at each site consisted of three blocks with five plots per block for a total of 15 plots. The blocks were delineated based on closest proximity. Such block delineation is commonly used in setting up field experiments in sites with relatively flat terrain. The plots were 6 by 10 m in size and most of them contained three soil subsamples. Because of missing data, several plots contained only one or two subsamples. Five fictitious treatments were assigned at random to the plots within each block. No treatment effect was simulated, that is, only the original soil C measurements were used in the data analyses. A total of 15 simulated experiments were conducted at each site. Each simulated experiment consisted of a different random assignment of treatments to the experimental plots, that is, a total of 15 randomizations were performed for each of the 11 field sites. The choice of 15 simulated experiments was unrelated to the total number of plots; it was simply a feasible number of randomizations that produced convincing results in the method comparisons.
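The randomization scheme described above (five fictitious treatments assigned at random within each of three blocks, repeated 15 times per field site) can be sketched as follows. The treatment labels, function names, and seed are hypothetical; the original randomizations were not necessarily generated this way.

```python
import random

TREATMENTS = ("T1", "T2", "T3", "T4", "T5")  # five fictitious treatments

def randomize_rcbd(n_blocks=3, treatments=TREATMENTS, rng=None):
    """Return one RCBD layout: a list of blocks, each a random
    ordering of the treatments over that block's plots."""
    if rng is None:
        rng = random.Random()
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # independent randomization within each block
        layout.append(order)
    return layout

def simulate_site(n_randomizations=15, seed=0):
    """Return the set of simulated experiments for one field site."""
    rng = random.Random(seed)
    return [randomize_rcbd(rng=rng) for _ in range(n_randomizations)]

experiments = simulate_site()
```

Each layout keeps every treatment exactly once per block, so only the positions of the treatments change between randomizations while the underlying soil C values stay fixed.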
Description of Spatial Methods of Analysis
Theoretical aspects of the spatial analysis used in this study, known as random field analysis, were developed by Zimmerman and Harville (1991) and are described in detail in a number of other studies (e.g., Brownie et al., 1993; Stroup, 2002; Hong et al., 2005). Hence, we provide only a brief overview of the statistical methods we used.
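First, we looked at a classical RCBD analysis, which remains the most commonly used type of analysis for RCB field experimental designs. The statistical model for soil C observations ($y_{ij}$) in an RCBD analysis consists of treatment and block effects and residuals,

$$y_{ij} = \mu + \tau_i + b_j + e_{ij}$$

where $\mu$ is the overall mean, $\tau_i$ is the effect of the ith treatment, $b_j$ is the effect of the jth block, and $e_{ij}$ are the residuals, assumed to be independent with variance $\sigma^2$. Thus, the covariance matrix for the residuals, R, consists of zero off-diagonal covariance, $\sigma_{ij}^2$, elements, which correspond to zero correlation between the plots separated by distance h,

$$\mathbf{R} = \sigma^2 \mathbf{I}$$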
The first spatial analysis method that we considered is the random field analysis with correlated errors (SA1). The main difference of SA1 method from RCBD is its approach to residuals. Unlike in a traditional analysis of variance the residuals are not assumed to be independent from each other, but correlated. Their correlations reflect the spatial structure in variability of the studied property with expectation that observations/residuals closely located are more similar to each other than those separated by greater distance. Parameters describing the spatial structure form a basis for the covariance matrix of the residuals, R, in the SA1 statistical model. Thus, spatial structure is accounted for by modeling covariance structure of the residuals. The statistical model for the analysis consists of only treatment effects and residuals,
$$y_{ij} = \mu + \tau_i + e_{ij}$$

where the residuals, $e_{ij}$, are spatially correlated with correlations $\rho(h)$, which are related to covariances, C(h), as $C(h) = \sigma^2 \rho(h)$. Semivariogram models used in geostatistical applications, for example, spherical, exponential, and Gaussian, are usually employed to model the covariance structure of the spatially correlated residuals. For illustration, if the spatial structure of the residuals is represented by an exponential model, then the covariance, $\sigma_{ij}^2$, elements of the R matrix are obtained as

$$\sigma_{ij}^2 = \sigma^2 \exp(-h/a)$$

where $\sigma_{ij}^2$ is the covariance between the two plots, i and j, separated by distance h, and a is the spatial correlation range. In theory, the R from spatially correlated residuals would lead to lower values of standard errors for treatments and contrasts between the treatments and, thus, to more efficient analysis.
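As a minimal sketch of how an exponential model populates the R matrix, the code below builds the covariance matrix from plot coordinates. It assumes the parameterization $\sigma^2 \exp(-h/a)$; other conventions (for example, with a factor of 3 in the exponent so that a is a practical range) exist and would change the numbers. The function name and plot coordinates are hypothetical; the variance of 0.041 and range of 25 m are taken from the power analysis settings described later in the text.

```python
import math

def exponential_covariance(coords, sigma2, a):
    """Build the residual covariance matrix R for plots at `coords`.

    coords: list of (x, y) plot centers in meters.
    sigma2: residual variance (sill; no nugget effect is assumed).
    a:      spatial correlation range in meters.
    Assumed form: cov(i, j) = sigma2 * exp(-h_ij / a).
    """
    n = len(coords)
    R = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            h = math.hypot(xi - xj, yi - yj)  # separation distance
            R[i][j] = sigma2 * math.exp(-h / a)
    return R

# Three hypothetical plot centers spaced by the 6- by 10-m plot dimensions.
plots = [(0.0, 0.0), (6.0, 0.0), (0.0, 10.0)]
R = exponential_covariance(plots, sigma2=0.041, a=25.0)
```

The diagonal equals the residual variance, and off-diagonal covariances decay with distance, so the nearer pair of plots (6 m apart) is more strongly correlated than the farther pair (10 m apart). Setting every off-diagonal element to zero recovers the independent-residual structure assumed in the RCBD analysis.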
The second spatial analysis considered in this study is the random field analysis with trend and spatially correlated residuals (SA2), which was found to be superior to other spatial analyses in a number of other studies (e.g., Zimmerman and Harville, 1991; Brownie et al., 1993). The underlying statistical model of this analysis consists of fixed effects of treatments and fixed effects of the functions of location of the experimental units, that is, the x and y coordinates of plots. For illustration, the model with a linear trend in the x and y directions is

$$y_{ij} = \mu + \tau_i + \beta_1 X_{ij} + \beta_2 Y_{ij} + e_{ij}$$

where $X_{ij}$ and $Y_{ij}$ are the x and y coordinates of the plot, $\beta_1$ and $\beta_2$ are the trend coefficients, and $e_{ij}$ are spatially correlated residuals.
We were concerned that in the small simulated experiments of this study the number of plots might be insufficient for adequate modeling of the spatial trend. Thus, we also considered a modified version of the SA2 method (SA2a). In SA2a, instead of plot averages, we used the geo-referenced location and soil C information of every individual subsample, hypothesizing that more numerous geo-referenced subsample data would allow for more accurate estimation of the trend parameters. The statistical model with a linear trend in the x and y directions based on subsample data is

$$y_{ijk} = \mu + \tau_i + \beta_1 X_{ijk} + \beta_2 Y_{ijk} + e_{ijk}$$

where $X_{ijk}$ and $Y_{ijk}$ are the x and y coordinates of the kth subsample in the plot.
In the SA2 (analysis based on the plot locations) the order of the polynomial function used in the model was limited to two for x coordinate and one for y coordinate. For the SA2a (analysis based on subsample locations) we used the polynomial functions of the x and y coordinates up to the fourth order. Only the trend components significant at P < 0.01 were kept in the analysis following recommendations of Tamura et al. (1988) and Brownie et al. (1993).
The analyses were conducted using the PROC MIXED procedure in SAS. For each analysis that involved modeling the covariance structure of the residuals we considered three functions, namely, spherical, exponential, and Gaussian. For each function two parameters, that is, residual variance, $\sigma^2$, and spatial correlation range, a, were estimated by restricted maximum likelihood, the default method of PROC MIXED. Log-likelihood tests (Littell et al., 1996) showed that nugget effects were not significantly different from zero (P < 0.05) and thus were not included in the models. After estimation of the spatial covariance model parameters, the restricted maximum likelihood procedure uses them to obtain estimates of treatment effects along with variances/standard errors for the treatment effects and comparisons between the treatments.
Performance Comparison Criteria
To compare performance of the statistical models of the studied analyses we used the Akaike Information Criterion (AIC), as described in Littell et al. (1996). The AIC is calculated from log-likelihood values, taking into account the number of model parameters. Lower AIC values indicate a better-performing statistical model. For the outcome of each of the 15 randomizations in each field site, the AIC values of the RCBD analysis were compared with those of the SA1, SA2, and SA2a analyses. If the RCBD analysis produced the lowest AIC value, it was concluded that the RCBD model was optimal for the outcome of a given randomization in a given field site and spatial analyses were considered no further. When AIC values of spatial analyses were less than those of RCBD, we proceeded with comparing analysis efficiencies.
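The model selection rule above can be sketched as follows. The sketch assumes the smaller-is-better form AIC = -2 log L + 2p, which is consistent with the statement that lower values are better; the log-likelihoods and parameter counts below are hypothetical and are not results from this study.

```python
def aic(log_likelihood, n_params):
    """Smaller-is-better AIC: -2 * log-likelihood + 2 * parameter count."""
    return -2.0 * log_likelihood + 2.0 * n_params

def best_model(fits):
    """fits maps model name -> (log_likelihood, n_params).
    Returns the name with the lowest AIC and all AIC scores."""
    scores = {name: aic(ll, p) for name, (ll, p) in fits.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fits for one randomization at one field site.
fits = {
    "RCBD": (5.0, 2),  # block and error variances
    "SA1": (8.0, 2),   # residual variance and correlation range
    "SA2": (8.5, 4),   # SA1 parameters plus two trend terms (assumed count)
}
winner, scores = best_model(fits)
```

In this made-up example SA2 has the highest log-likelihood, but its extra parameters are penalized, so SA1 has the lowest AIC and would be carried forward to the efficiency comparison.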
As a criterion of efficiency of the analysis we used the average standard error obtained by averaging standard errors from all pair-wise comparisons between the treatment means (Zimmerman and Harville, 1991; Brownie et al., 1993; Qiao et al., 2000). The smaller the standard error for difference among treatment means the smaller are the differences between the treatments that can be detected by using a certain statistical analysis, thus, the higher the efficiency of that statistical analysis.
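The efficiency criterion can be illustrated with a short sketch that averages the standard errors of all pair-wise treatment differences, given a covariance matrix of the estimated treatment means. The covariance values are hypothetical, and `percent_improvement` is an assumed definition of relative improvement over RCBD rather than a formula given in the text.

```python
import math
from itertools import combinations

def average_pairwise_se(cov):
    """Average standard error over all pair-wise treatment differences.

    cov: covariance matrix of the estimated treatment means.
    Var(mean_i - mean_j) = cov[i][i] + cov[j][j] - 2 * cov[i][j].
    """
    pairs = list(combinations(range(len(cov)), 2))
    ses = [
        math.sqrt(cov[i][i] + cov[j][j] - 2.0 * cov[i][j])
        for i, j in pairs
    ]
    return sum(ses) / len(ses)

def percent_improvement(se_rcbd, se_spatial):
    """Relative reduction in average standard error (assumed definition)."""
    return 100.0 * (se_rcbd - se_spatial) / se_rcbd

# Hypothetical covariance matrix for three uncorrelated treatment means.
cov_means = [
    [0.01, 0.0, 0.0],
    [0.0, 0.01, 0.0],
    [0.0, 0.0, 0.01],
]
avg_se = average_pairwise_se(cov_means)
```

Positive covariances between treatment mean estimates reduce the variance of their difference, which is one way a spatial model can yield smaller standard errors for comparisons than an analysis assuming independence.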
Power Analysis Illustration
Since there was no treatment effect in the simulated experiments of this study, only comparisons in efficiency based on average standard errors were possible. Thus, we decided to further illustrate how higher efficiency of the analysis may translate into smaller statistically detectable differences in total soil C between the treatments. For that we conducted a power analysis following the approach outlined by Stroup (2002), using the SA1 method as an example, and we compared the results with those of the RCBD analysis. We computed the numbers of replications (plots per treatment) needed to declare a certain difference between the treatments to be statistically significant at $\alpha$ = 0.05 with power of 0.80. To obtain the differences we designated one of the fictitious treatments to be a treatment that increases total soil C values, while all the other treatments were assumed not to contribute to an increase in soil C. To model these treatment effects the difference between the treatments was added to the first treatment while the remaining treatments remained intact.
We studied the differences that could be detected with number of replications ranging from 2 to 10. As mentioned previously, the strength and extent of spatial correlation are important in defining how well the spatial analyses perform. Thus, we illustrated power calculations using three different spatial correlation ranges. The spatial correlation ranges considered were 15, 25, and 35 m. The minimum difference/number of replication curves were developed based on the pooled value of the variance obtained from total C data from all 11 data sets, equal to 0.041. For RCBD analysis we used estimated error and block variances of 0.026 and 0.015, respectively, obtained also based on pooled estimates from the simulated experiments of the studied 11 field data sets.
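To give a rough sense of how the minimum detectable difference shrinks with replication, the sketch below uses a simple normal approximation for a two-treatment comparison with independent errors. This is not the method of Stroup (2002) used in the study: it ignores the limited error degrees of freedom and any spatial correlation, so it gives smaller differences than a t- or F-based calculation would. The error variance of 0.026 is the RCBD value quoted above; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def min_detectable_difference(error_variance, n_reps, alpha=0.05, power=0.80):
    """Normal-approximation minimum detectable difference between two
    treatment means, each based on `n_reps` independent plots."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = z.inv_cdf(power)              # quantile for the target power
    se_diff = math.sqrt(2.0 * error_variance / n_reps)
    return (z_alpha + z_power) * se_diff

# Differences for 2 to 10 replications, as considered in the text.
curve = {r: min_detectable_difference(0.026, r) for r in range(2, 11)}
```

The resulting curve decreases in proportion to the inverse square root of the number of replications. In a spatial analysis the effective variance of a treatment difference depends on plot layout and correlation range, which is what the full power calculation accounts for.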
| RESULTS AND DISCUSSION |
|---|
Performance of the SA1 and SA2a methods varied in different field sites. In 9 of the studied 11 data sets the AIC values from at least one of the two studied spatial analyses in either all or in the majority of the 15 randomizations were lower than those from RCBD. This indicates that the statistical models that accounted for spatial correlation were preferable to the RCBD model in these field sites. Spatial analyses were not considered further in the two field sites where RCBD produced the lowest AIC values in majority of 15 randomizations (Table 2). Average standard errors from all pair-wise comparisons between the treatment means were calculated based on both RCBD and SA1 and SA2a analyses for the nine field sites where spatial models produced lower AIC values than those of RCBD. In eight of these nine field sites the average standard errors from at least one of the two studied spatial analyses were lower than those of the RCBD analysis. Thus, the minimum differences between the treatments, which could be detected statistically with a certain level of significance, were lower if data from these field sites were analyzed via spatial methods instead of traditional RCBD analysis.
The SA2a method was more efficient than the SA1 method in five of the studied field sites (Table 2). However, only in field sites 8 and 10 was the improvement in efficiency substantial. The best performance for the SA2a method was observed in field site 8 where it produced a 55.6% improvement in standard error over RCBD as compared to 29.6% improvement of SA1 method.
Effects of Individual Site Characteristics on Method Performances
The variations in performance of spatial methods in different field sites were associated with differences in topographical diversity and variability in the strength of spatial structure, which for soil C is itself to a large extent a function of topographical diversity. The two field sites (field sites 1 and 2) where none of the studied spatial methods of data analysis produced lower AIC values than RCBD analysis were the field sites with somewhat lower variability of total C (Table 1). Field site 1 had the smallest range of C values among the 11 studied field sites. Coefficients of variation (CV) for these two field sites were among the lowest of the studied data (Table 1). Spatial variability characteristics indicated no spatial structure. The sample variograms for soil C data of sites 1 and 2 were best modeled by a pure nugget effect model. As an example, the sample variogram for data set 2 is shown in Fig. 2. The topography of these sites was also among the flattest, with terrain slopes of 0.7 degrees (Table 1).
In five of the studied field sites (field sites 7 to 11) both spatial methods of data analysis produced lower AIC values than RCBD and lower average standard errors than those of RCBD in every one of the 15 randomizations (Table 2). Except for field site 7, these sites had higher ranges of soil C values and higher CV values than those of field sites 1 and 2. They all had an overall stronger spatial correlation than the other field sites in the study. The nugget/sill ratios from the variograms of these data sets were lower than in all other data sets. An example of a sample variogram for these sites (field site 9) is shown in Fig. 2. Field sites 8 to 10 had above-average terrain slopes. Field sites 7 to 10 also had the highest values of maximum flow accumulation among the studied 11 sites. High flow accumulation values usually correspond to the concave areas that accumulate water flowing from the surrounding terrain. Thus, the field sites that include such areas have more distinct patterns in spatial distribution of soil C reflected in stronger spatial correlation.
Minimum Detectable Differences
Power curves reflecting relationships between the number of replications (blocks) and the minimum difference between the treatments that can be detected as statistically significant at a certain level of significance are shown in Fig. 3. For the spatial correlation range of 15 m, comparable to the plot size (6 by 10 m), minimum detectable differences of SA1 are not different from those obtained in a standard RCBD analysis. For example, the minimum difference that can be detected by both RCBD and SA1 methods in an experiment with three replications, plot size of 6 by 10 m, and spatial correlation range of 15 m is equal to 0.45% (Fig. 3). However, at larger spatial correlation ranges (25 and 35 m) minimum detectable differences of SA1 are smaller and the spatial analysis is thus more efficient than the RCBD approach. For example, in an experiment with three replications and plot size of 6 by 10 m the minimum detectable differences in total C obtained by SA1 are equal to 0.30 and 0.25% for 25- and 35-m spatial correlation ranges, respectively. That is, they are substantially lower than the difference that is detected by RCBD analysis (Fig. 3). Thus, accounting for spatial correlation can lead to a substantial reduction in the number of samples in soil C related studies as compared with the numbers of samples reported previously (Garten and Ashwood, 2002).
| CONCLUSIONS |
|---|
As expected, performance of the spatial analyses depended on the strength of spatial variability in the studied sites. In sites where strong spatial structure was observed for soil C distribution, spatial analyses were preferable to RCBD. In most instances, spatial analyses produced consistently lower standard errors which would result in lower minimum detectable differences between treatments than those of RCBD analysis.
We found that topographical diversity was a good indicator of the strength of spatial correlation in soil C and, thus, of the potential efficiency of the spatial analyses. In the sites with more diverse terrain, spatial analyses were more efficient than RCBD. Note that the topographic diversity of the studied fields can be regarded as low overall, with terrain slopes of the field sites not exceeding 2 degrees. However, even in this relatively flat topography, spatial correlation in soil C distribution was sufficiently strong for spatial analyses to produce more efficient results. Only in the field sites with the flattest terrain was the spatial structure of soil C distribution negligible, and thus there was no advantage to using spatial analysis. We believe that using spatial analyses will be even more advantageous when the experiments are placed in sites with more diverse terrain than that of this study. Power analysis supported the empirical observations from the studied sites, indicating higher efficiency of the spatial analysis in data sets with large spatial correlation ranges as compared with standard RCBD analysis.