a Dep. of Crop and Soil Sciences, Michigan State Univ., East Lansing, MI 48824-1325
b W.K. Kellogg Biological Station and Dep. of Crop and Soil Sciences, Michigan State Univ., Hickory Corners, MI 49060-9516
* Corresponding author (kravche1{at}msu.edu)
Received for publication September 1, 2005.
| ABSTRACT |
|---|
Abbreviations: CT, chisel-plowed management with conventional chemical inputs; CTcover, chisel-plowed management with a winter leguminous cover crop and no chemical inputs; NT, no-till management with conventional chemical inputs; N/S, nugget to sill ratio
| INTRODUCTION |
|---|
If a densely sampled secondary variable(s) is correlated with a sparsely sampled primary variable, it can be used to improve the mapping accuracy of the primary variable via various geostatistical procedures, including cokriging, external drift kriging, or regression kriging (Isaaks and Srivastava, 1989; Goovaerts, 1997; Webster and Oliver, 2001). Recent examples of geostatistical applications for predicting soil C and other soil properties using various sources of dense secondary information are given by Gessler et al. (2000), Mueller and Pierce (2003), Terra et al. (2004), and Simbahan et al. (2006); many of the earlier examples are referenced in the review by McBratney et al. (2003). Walter et al. (2003) used a combination of deterministic modeling with stochastic methodology in predicting soil C distribution across a terrain with diverse land uses. Topographical characteristics are among the commonly used sources of additional information for improving the mapping accuracy of soil C in soils of the U.S. Midwest (Gessler et al., 2000; Mueller and Pierce, 2003; Terra et al., 2004).
Another source of dense spatial field information is crop yield data collected via combine monitors. The use of combine yield monitors that obtain georeferenced measurements of crop yield has grown tremendously in the last several years. Almost 20% of all corn (Zea mays L.) and soybean [Glycine max (L.) Merr.] yields in the USA currently are being collected via yield monitors (Economic Research Service, 2006). Yield maps obtained from such monitors for a number of years reflect areas within the field having different yield potentials due to variability of either topographical or soil characteristics. As such, they have been used extensively in precision agriculture research to characterize within-field soil variability and to derive management zones (Lark and Stafford, 1997; Lark and Stafford, 1998; Blackmore, 2000; Flowers et al., 2005). Webster and Oliver (2001) illustrated the application of yield-monitor data in improving soil P mapping. Bishop and McBratney (2001) have used yield-monitor data for predicting soil cation exchange capacity.
We hypothesized that both topographical information and the long-term yield monitor data could be of value in improving accuracy in soil C mapping. Given the large number of producers that collect yield monitor data, the long-term yield monitor data could be particularly valuable when other sources of soil and topographical information are unavailable. There is no information, however, on how useful the long-term yields could be in practice and what level of improvement in accuracy could be reached by using yield as well as topographical information. The objective of this study was to assess the potential for dense topographical data and crop yield monitor data to improve mapping of total soil C. To achieve this objective, we first evaluated the strength of the relationships of soil C with auxiliary data, that is, topographical information, such as relative elevation, terrain slope, terrain curvature, and aspect, and crop grain yield information collected at multiple sites with different management practices. Second, we quantified the improvement achieved in soil C mapping when the auxiliary information was used, and third, we examined factors that potentially affect the usefulness of auxiliary information in C mapping. The studied factors included the strength of the relationships between the primary and secondary variables and the strength of the spatial correlation of the primary variable. We used data from twelve 60- by 60-m field sites, which allowed assessment of the variability in strength of the relationship between C and yield across a diverse landscape. Because intensive soil sampling and C measurements are time- and labor-intensive, in this preliminary study we only obtained total soil C data at the 0- to 5-cm depth.
| MATERIALS AND METHODS |
|---|
The 12 plots selected for this study were four replications of three LTER treatments: chisel-plowed management with conventional chemical inputs (CT), no-till with conventional chemical inputs (NT), and chisel-plowed management with a winter leguminous cover crop and no chemical inputs (CTcover). Soil sampling was conducted in June of 2003. At each plot, approximately 100 georeferenced soil samples were collected from the 0- to 5-cm depth. The sampled areas of each plot were 60 by 60 m. Twenty of the samples were collected on a regular triangular grid with distance between grid points in the east–west direction of 13 m and in the north–south direction of 15 m. The remaining samples were taken at varying distances from the regular grid sampling points. At each sampling location, the sample was taken from between the plant rows and was composited from five 2.5-cm-diameter cores collected within a 0.2-m radius. Total C was measured using a Carlo-Erba (Milan, Italy) CN analyzer. As determined by triplicate analysis runs on approximately one-fifth of the samples, the measurement error varied around 3 to 5%. A detailed description of the sampling scheme, soil analyses, and elevation data and terrain attribute processing was presented by Kravchenko et al. (2006).
The elevation measurements were recorded every 2 m using a 12-channel Leica SR530 real-time kinematic DGPS receiver (Leica Geosystems, St. Gallen, Switzerland). The measurements were converted into a cell-based terrain map on a 4- by 4-m grid using ArcGIS 9.0 Spatial Analyst (Environmental Systems Research Institute, 2004). The grid size was selected so that almost all grids included at least one of the elevation measurement points, thus not affecting derivation of other terrain attributes. Then terrain slope, aspect, and curvature were derived from the elevation data using the surface hydrologic analysis functions of ArcGIS 9.0 Spatial Analyst. Aspect was reclassified into two categories: southern orientation and all other orientations.
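The two-class recoding of aspect can be sketched as follows. This is an illustration only: the text does not state which angular window defines "southern orientation", so the 135 to 225 degree range (clockwise from north) and the function name `reclassify_aspect` are assumptions.

```python
def reclassify_aspect(aspect_deg, south_min=135.0, south_max=225.0):
    """Return 1 for a south-facing cell and 0 for any other orientation.

    aspect_deg is the downslope direction in degrees clockwise from north.
    The 135-225 degree window is an assumed definition of "southern".
    """
    angle = aspect_deg % 360.0  # wrap values such as -90 or 450 into [0, 360)
    return 1 if south_min <= angle <= south_max else 0


# Hypothetical aspect values for five grid cells
aspects = [0.0, 90.0, 180.0, 200.0, 315.0]
south_flags = [reclassify_aspect(a) for a in aspects]
```

A binary indicator of this kind can enter a regression as a single dummy variable, which avoids treating a circular quantity as if it were linear.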
Crop yield data included corn, soybean, and wheat (Triticum aestivum L.) grain yields collected via yield monitors from the studied plots during 1996 through 2002. For a detailed description of the collection, processing, and cleaning of these yield data, see Kravchenko et al. (2005). The number of yield data points that remained within the studied area of each plot after data processing ranged from at least 500 for wheat to as many as 1600 for corn and soybean. Point yield data were converted into cell-based yield maps on a 4- by 4-m grid (ArcGIS 9.0 Spatial Analyst) using inverse distance weighting with power of two and 15 nearest neighbors. Then, yield data from each year were standardized in each plot as (Zi − Zm)/s, where Zi is the yield at location i, and Zm and s are the plot mean yield and the standard deviation, respectively.
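The gridding and standardization steps described above can be sketched in a few lines. This is a minimal illustration, not the ArcGIS procedure used in the study; the function names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which form was used.

```python
import math
from statistics import mean, stdev


def idw_predict(x, y, points, power=2.0, k=15):
    """Inverse distance weighted estimate at (x, y).

    points is a list of (px, py, value); only the k nearest are used.
    """
    nearest = sorted((math.hypot(px - x, py - y), v) for px, py, v in points)[:k]
    if nearest[0][0] == 0.0:  # query coincides with a data point
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


def standardize(values):
    """Convert yields to (Zi - Zm) / s using the plot mean and sample SD."""
    zm, s = mean(values), stdev(values)
    return [(z - zm) / s for z in values]


def average_standardized(yearly_grids):
    """Cell-by-cell mean of standardized grids (one flat list per year)."""
    standardized = [standardize(grid) for grid in yearly_grids]
    return [mean(cell) for cell in zip(*standardized)]
```

Standardizing within each plot and year puts different crops on a common, unitless scale before they are averaged across years.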
Data Analysis
The relationship of total C with 7-yr average standardized yield was studied using simple linear regression. The relationship of total C with topography was studied using multiple regression that included the linear effects of relative elevation, terrain slope, curvature, and reclassified aspect. The need for quadratic terms for elevation, terrain slope, and curvature was investigated separately in each plot and the quadratic terms were added to the regression equation whenever significant at the 0.05 level of significance. To assess the contribution of the 7-yr average crop yield to total C prediction along with topography, the standardized average yield was added to the previously selected topographical regression model of each studied plot. The regression analyses were performed separately in each studied plot as well as in the data set of all 12 plots combined and were conducted using PROC REG (SAS Institute, 2001).
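As an illustration of the simple linear regression of total C on standardized yield, the sketch below computes the intercept, slope, and R² from first principles. It is not the PROC REG analysis used in the study, and the numbers in the usage lines are hypothetical values chosen only to show the call.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r2), where r2 is the share of the
    variance in y explained by x.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    return intercept, slope, r2


# Hypothetical standardized yields and total C values at four locations
std_yield = [-1.0, 0.0, 1.0, 2.0]
total_c = [1.0, 1.5, 2.0, 2.5]
intercept, slope, r2 = simple_linear_regression(std_yield, total_c)
```

The topographical models in the study use several predictors and optional quadratic terms, which would require a multiple regression solver rather than this single-predictor form.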
Regression kriging (Odeh et al., 1995; Hengl et al., 2004) was used to evaluate potential improvement in C mapping when using auxiliary information compared with mapping based on available C measurements via only ordinary kriging. For that, in each plot data set we selected 20 C samples located on a regular triangular grid to be a model (or training) data set. The choice of the model data was based on the following considerations: First, sampling on a regular or semiregular grid is among the most common practices in soil sampling and a triangular grid is known to be somewhat superior to a rectangular grid (Webster and Oliver, 2001). Second, the distances between the grid samples within the plots used in this experiment are obviously much smaller than those of sampling on a field or a farm scale, which is often done on the basis of one sample per hectare or one sample per acre; however, the total number of samples (20) is comparable with the number of samples typically obtained per field and used in mapping in field- or farm-scale sampling situations.
The model data set for the combined 12-plot data was created by using two samples from each plot, one from the southwest corner of the plot and one from the northeast corner of the plot. This produced a relatively regularly sampled model data set with 24 points. The sampling density of this model data set was more similar to that of typical field- or farm-scale sampling than the model data sets from the individual plots. In plot data sets and in the combined data set, all remaining samples were used for independent testing. For individual plots, the independent test data sets consisted of approximately 70 to 80 observations. For the combined data, the independent test data set consisted of >1100 observations.
We considered regression kriging with multiple regression equations with topographical variables to assess mapping improvement due to topographical information, and the regression kriging with simple linear regressions with the 7-yr average standardized yield to assess mapping improvement due to crop yield information. Both regression kriging and ordinary kriging were performed using the model data sets and then were used to obtain predictions of the independent test data values.
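The general idea of regression kriging, a regression trend plus ordinary kriging of the regression residuals, can be sketched as below. This is a simplified illustration under stated assumptions, not the PROC KRIGE2D workflow used in the study: it uses a single covariate (as in the yield-based case), a global neighborhood, and an exponential variogram parameterized by its practical range. All function names and the parameter values in the usage lines are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def exp_variogram(h, nugget, psill, rng):
    """Exponential model with practical range rng; zero at h = 0."""
    return 0.0 if h == 0 else nugget + psill * (1.0 - math.exp(-3.0 * h / rng))


def ordinary_krige(coords, values, target, gamma):
    """Ordinary kriging prediction at target using all supplied points."""
    n = len(coords)
    a = [[gamma(math.dist(p, q)) for q in coords] + [1.0] for p in coords]
    a.append([1.0] * n + [0.0])  # unbiasedness constraint: weights sum to 1
    b = [gamma(math.dist(p, target)) for p in coords] + [1.0]
    weights = solve(a, b)[:n]    # drop the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values))


def regression_krige(coords, covar, values, target, target_covar, gamma):
    """OLS trend on one covariate plus ordinary kriging of the residuals."""
    n = len(values)
    mx, my = sum(covar) / n, sum(values) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(covar, values))
             / sum((x - mx) ** 2 for x in covar))
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(covar, values)]
    trend = intercept + slope * target_covar
    return trend + ordinary_krige(coords, resid, target, gamma)


# Hypothetical variogram parameters and four sample locations (m)
gamma = lambda h: exp_variogram(h, 0.1, 1.0, 30.0)
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
center = ordinary_krige(coords, [1.0, 2.0, 3.0, 4.0], (5.0, 5.0), gamma)
```

In practice the residual variogram would be fitted to the residuals themselves, and the covariate would need to be known at every prediction location, which is why densely sampled yield or terrain data suit this role.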
Sample variograms for the original model data sets for ordinary kriging and for the residuals for regression kriging were fitted with variogram models using the weighted least squares approach, with weights proportional to the number of data pairs in the sample variogram value and inversely proportional to the fitted variogram value (Cressie, 1985). The three most common variogram model types, i.e., spherical, exponential, and Gaussian, were used. A preliminary choice of the model type was based on the visual examination of the sample variogram, then the MSE values of the models that appeared suitable were compared and the one that produced the lowest MSE was selected for kriging. We recognized that the small number of model data points in this study substantially limits the reliability of sample variograms and their fitting.
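The three variogram model types and a weighted least squares criterion in the spirit of Cressie (1985) can be written compactly. The sketch below rests on assumptions: the models use the practical-range parameterization (which may differ from the one in SAS), and the criterion uses the squared fitted value in the denominator as in the commonly stated form of Cressie's weights, whereas the text only says the weights are inversely proportional to the fitted variogram value. The function names are hypothetical.

```python
import math


def spherical(h, nugget, psill, rng):
    """Spherical model; reaches the sill exactly at h = rng."""
    if h >= rng:
        return nugget + psill
    r = h / rng
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, psill, rng):
    """Exponential model; reaches about 95% of the partial sill at h = rng."""
    return nugget + psill * (1.0 - math.exp(-3.0 * h / rng))


def gaussian(h, nugget, psill, rng):
    """Gaussian model; reaches about 95% of the partial sill at h = rng."""
    return nugget + psill * (1.0 - math.exp(-3.0 * (h / rng) ** 2))


def wls_criterion(model, params, lags, gammas, npairs):
    """Weighted sum of squares: sum of N(h) * (sample / fitted - 1)^2.

    lags must be positive and the fitted values nonzero.
    """
    total = 0.0
    for h, g, n in zip(lags, gammas, npairs):
        fitted = model(h, *params)
        total += n * (g / fitted - 1.0) ** 2
    return total
```

Minimizing this criterion over (nugget, partial sill, range) for each candidate model, with any nonlinear optimizer, gives fitted models that can then be compared as described in the text.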
Sample variogram calculations and fitting were also performed for the complete data sets of both individual plots and the combined data set. A ratio between the variogram model nugget and the sill (sum of nugget and partial sill) was used as a characteristic of the spatial correlation strength in total C distribution across studied landscapes. Expressed as a percentage, the nugget to sill ratio (N/S) reflects the contribution to the overall variability of random variation occurring at distances shorter than those of the minimal sampling distance. Sample variogram calculation, variogram model fitting, and kriging were performed using PROC VARIOGRAM, PROC NLIN, and PROC KRIGE2D, respectively (SAS Institute, 2001).
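Written out, the ratio described above takes the following form, where the symbols $c_0$ (nugget) and $c_1$ (partial sill) are notation assumed here for illustration and do not appear in the original text:

```latex
\mathrm{N/S} = \frac{c_0}{c_0 + c_1} \times 100\%
```

For example, a hypothetical fitted model with $c_0 = 0.02$ and $c_1 = 0.08$ would give N/S = 20%, that is, a relatively strong spatial correlation.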
For each test data set, RMSEs were calculated based on predicted and observed test data values. Relative improvement over ordinary kriging due to using regression kriging was assessed using the respective RMSE:
![]() | [1] |
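The image of Eq. [1] is not reproduced in this copy. A common way to express such a relative improvement, assumed here rather than taken from the original equation, is the percentage reduction in RMSE of regression kriging relative to ordinary kriging. The function names below are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_improvement(rmse_ok, rmse_rk):
    """Percentage reduction in RMSE of regression kriging (RK)
    relative to ordinary kriging (OK); negative when RK is worse.
    This form is an assumption, not the article's Eq. [1] verbatim.
    """
    return 100.0 * (rmse_ok - rmse_rk) / rmse_ok
```

Under this form, a positive value means that the auxiliary variable reduced prediction error on the independent test data.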
The relatively small number of C data sets (only 12) used in this study was insufficient for an in-depth assessment of possible sources of influence on relative improvement due to regression kriging over ordinary kriging. Thus, we used multiple simulated data sets. A data set from one of the experimental plots was used as a conditioning data set. Gaussian simulations were performed to create multiple sets of 2500 simulated observations of primary and secondary variables (GSLIB, Deutsch and Journel, 1998). The simulated data sets varied in the strength of the relationship between the primary and secondary variables, with R² values between primary and secondary variables in the simulated data sets ranging from 0.10 to 0.95 in approximately 0.10 increments. The simulated data sets also varied in the strength of spatial correlation in the primary variable with the N/S ratio ranging from 1% (primary variable with strong spatial correlation) to 70% (primary variable with weak spatial correlation) in approximately 10% increments. From each complete set of 2500 simulated observations, we randomly selected 400 data points that were used as a model data set and 100 data points to be used as a test data set. Regression kriging and ordinary kriging predictions for test data were obtained using model data sets and RMSE and relative improvements of regression kriging over ordinary kriging were calculated as described above. Then, based on the results from all simulated data sets, we assessed the effect of the strength of the relationship between the primary and secondary variables, i.e., R², and the effect of the strength of the spatial correlation of the primary variable, i.e., N/S ratio, on the relative improvement in mapping accuracy due to regression kriging over ordinary kriging. The relationship between the obtained relative improvement, R², and N/S data was fitted with a polynomial regression equation.
A hierarchical approach was used in polynomial regression fitting with higher order and interaction terms being added sequentially to the initial first-order equation until further higher order additions were not significant at the 0.05 level.
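The random selection of model and test points from each simulated data set, described above, might be sketched as follows. The function name and the seed are hypothetical, and drawing the two subsets without overlap is an assumption; the text does not state it explicitly.

```python
import random


def split_model_test(n_total=2500, n_model=400, n_test=100, seed=0):
    """Draw disjoint index sets for the model and test data.

    Sampling n_model + n_test indices without replacement and cutting
    the result in two guarantees that no point serves in both roles.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible draw
    chosen = rng.sample(range(n_total), n_model + n_test)
    return chosen[:n_model], chosen[n_model:]


model_idx, test_idx = split_model_test()
```

Keeping the test points out of the model set is what makes the RMSE comparison between regression kriging and ordinary kriging an independent validation.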
| RESULTS AND DISCUSSION |
|---|
Total C was significantly (P < 0.05) positively correlated with yield in all but one studied plot (CT-r4; Table 1). The percentage of total C variability explained by the 7-yr average yield ranged from 7% (CTcover-r3) to 74% (CT-r3; Table 1). In only two of the studied plots did the yield explain >50% of the total C variability.
When 7-yr average yield data were included in the topographical regression models, the yield contribution was statistically significant in only five of the studied plots as well as in the combined data set (Table 1). The regression slopes for yield in these data sets were greater than zero, indicating that even when topographical characteristics were controlled for, the areas within the fields with above-average long-term standardized yield were also the areas with higher total C.
By adding standardized average yield to topographical regression models, we assessed the hypothesis that areas with consistently high or low yields contribute different biomass inputs to the topsoil and that these differences will explain a substantial additional amount of total C variability besides that already accounted for by topographical characteristics. This hypothesis, however, was not well supported by the data. One reason is that the dense topographical data of this study were more accurate than the yield-monitor data, which despite cleaning and checking could still contain a significant amount of spatial error (Drummond and Sudduth, 2003). The other reason for better performance of topographical data in predicting total C is that topography is a factor that substantially influences yields as well as total C. Indeed, in this study the R² values for regression between 7-yr average yield and topographical features ranged from 0.22 to 0.80 (data not shown). These results are consistent with information on the importance of topography in crop yield variability reported by Kravchenko and Bullock (2000), Cox et al. (2003), Jiang and Thelen (2004), Kaspar et al. (2004), and others. Also, it is possible that corn, soybean, or wheat grain yield is not a good indicator of the actual amounts of biomass inputs to the soil.
When total C in the independent test data sets was predicted from the 20 grid samples by regression kriging with 7-yr average standardized yield, R² for the regression between predicted and observed values exceeded 0.50 in five of the studied data sets (Table 2). In regression kriging with topography, R² also exceeded 0.50 in five of the 12 studied data sets. For the combined data set, the R² values between observed and predicted test data were equal to 0.40 and 0.31 for regression kriging with yield and topography, respectively. Overall, prediction performances of topography- and yield-based regression kriging were relatively similar. Using yield produced lower RMSE and higher R² values than those of topography in five plot data sets and in the analysis of the combined data set, while topography performed somewhat better in the remaining seven plot data sets.
We hypothesize that smoothly varying terrain and, related to it, smoothly varying spatial distributions of crop yields and soil C were a partial reason for the lack of apparent advantage in using topography or yield for C mapping in this study, even in those plots where the relationships between the auxiliary variables and C were relatively strong. When the strength of spatial correlation of the primary variable is relatively high (low N/S ratios), the improvement in mapping accuracy due to using secondary variables is known to be relatively small when accounting for secondary information via cokriging procedures (Goovaerts, 1997). Our assessment of regression kriging performance in the simulated data sets at varying R² values and N/S ratios demonstrated that this is also true when secondary information is accounted for via regression kriging. The polynomial regression equation relating relative improvement with R² and the primary variable's N/S ratio indicates that even though R² positively affects the relative improvement, it also significantly interacts with the N/S ratio (α = 0.05):
![]() | [2] |
| CONCLUSIONS |
|---|
| ACKNOWLEDGMENTS |
|---|
| REFERENCES |
|---|