Agronomy Journal 93:802-808 (2001)
© 2001 American Society of Agronomy
PRODUCTION PAPER
Selecting the High-Yield Subpopulation for Diagnosing Nutrient Imbalance in Crops
Lotfi Khiari (a),
Léon-Étienne Parent* (a), and
Nicolas Tremblay (b)
(a) Dep. of Soil Sci. and Agri-Food Eng., Laval Univ., Paul-Comtois Building, Sainte-Foy, QC, Canada G1K 7P4
(b) Hortic. Res. and Dev. Cent., 430 Gouin Blvd., St-Jean-sur-Richelieu, QC, Canada J3B 3E6
* Corresponding author (leon-etienne.parent{at}sga.ulaval.ca)
Received for publication August 10, 2000.
ABSTRACT
Plant nutrient status is currently diagnosed using empirically derived nutrient norms from an arbitrarily defined high-yield subpopulation above a quantitative yield target. Generic models can assist Compositional Nutrient Diagnosis (CND) in providing a yield cutoff value between low- and high-yield subpopulations for small databases. Our objective was to compute the minimum yield target for sweet corn (Zea mays L.) and the corresponding critical CND nutrient imbalance index using a cumulative variance ratio function and the chi-square distribution function. Population (40 observations) and validation (20 observations) data were selected at random from a survey database of 240 observations including commercial yields and leaf nutrient concentrations. A filling value (Rd) was computed as the difference between 100% and the sum of d nutrient proportions [Rd = 100 - (N + P + K + ...)]. The CND nutrient expressions were the row-centered ratios of N, P, and Rd proportions in tissue specimens. Variance ratio computations of CND nutrient expressions among two subpopulations arranged in a decreasing yield order were iterated across population data. The proportion of low-yield subpopulation computed at the inflection point of a cubic cumulative variance ratio function was 67.5%, the minimum proportion of low-yield specimens. That exact probability corresponded to a theoretical chi-square value (CND r2) of 1.5 for three components. The critical CND r2 value was validated using independent samples and the sum of the squared CND nutrient indices. The procedure is applicable to small-size crop nutrient databases for solving nutrient imbalance problems in specific agroecosystems. A calculation example is presented.
Abbreviations: CND, Compositional Nutrient Diagnosis; CVA, Critical Value Approach; DRIS, Diagnosis and Recommendation Integrated System
INTRODUCTION
APPROACHES TO DIAGNOSING leaf nutrient status include the Critical Value Approach (CVA) (Bates, 1971), the Diagnosis and Recommendation Integrated System (DRIS) (Walworth and Sumner, 1987), and Compositional Nutrient Diagnosis (CND) (Parent and Dafir, 1992; Parent et al., 1994). When selecting nutrient norms, a yield cutoff value is decided arbitrarily for defining a high-yield subpopulation. For CVA, the cutoff value is generally 90 to 95% of maximum yield while relating percentage yield to nutrient concentration, assuming that all nutrients except the one being diagnosed are in sufficient, nonexcessive amounts. Parent et al. (1995) found that CND considerably improved the yield-tissue N relationships as polynomial or linear-plateau curves compared with CVA in fertilizer trials with conifer seedlings, onion (Allium cepa L.), and potato (Solanum tuberosum L.).
For DRIS and CND, the high-yield subpopulation is selected from a crop survey database. Walworth and Sumner (1987) proposed to consider variance ratios of nutrient expressions to discriminate between the subpopulations. However, no formal procedure was proposed to optimize the partition. Parent and Dafir (1992) expected that multivariate analysis could provide a means to define the high-yield subpopulation. Parent et al. (1994) proposed the chi-square distribution function to define a CND threshold value for nutrient imbalance.
At the local level, small databases are available to define effective nutrient norms as related to yield target (Walworth et al., 1988). Escano et al. (1981) pointed out that local calibration improved the accuracy of DRIS diagnosis. However, DRIS provides no generic approach to support local diagnosis of nutrient imbalance using small databases.
The objective of this paper is to present a generic approach to select a minimum yield target for the high-yield subpopulation from a small-survey sweet corn database and to provide a CND threshold nutrient imbalance index using the chi-square distribution function.
RATIONALE
As indicated by Parent and Dafir (1992), plant tissue composition forms a d dimensional nutrient arrangement, i.e., simplex (Sd) made of d + 1 nutrient proportions including d nutrients and a filling value defined as follows:
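$$S^d = \{(N, P, K, \ldots, R_d) : N > 0,\ P > 0,\ K > 0,\ \ldots,\ R_d > 0;\ N + P + K + \cdots + R_d = 100\} \tag{1}$$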
where 100 is the dry matter concentration (%); N, P, K, ... are nutrient proportions (%); and Rd is the filling value between 100% and the sum of d nutrient proportions computed as follows:
$$R_d = 100 - (N + P + K + \cdots) \tag{2}$$
The nutrient proportions become scale invariant after they have been divided by the geometric mean (G) of the d + 1 components including Rd (Aitchison, 1986) as follows:
$$G = (N \times P \times K \times \cdots \times R_d)^{1/(d+1)} \tag{3}$$
Row-centered log ratios are computed as follows:
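$$V_N = \ln\left(\frac{N}{G}\right),\quad V_P = \ln\left(\frac{P}{G}\right),\quad V_K = \ln\left(\frac{K}{G}\right),\quad \ldots,\quad V_{R_d} = \ln\left(\frac{R_d}{G}\right) \tag{4}$$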
and
$$V_N + V_P + V_K + \cdots + V_{R_d} = 0 \tag{5}$$
where VX is the CND row-centered log ratio expression for nutrient X. Equation [5] serves as a check that the VX computations have been conducted properly: by definition, the sum of tissue components is 100% (Eq. [1]), and the sum of their row-centered log ratios, including the filling value, must be zero (Eq. [5]).
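The transformation in Eq. [2] to [5] can be sketched in a few lines of Python. This is a minimal illustration, not the authors' spreadsheet: the function name `clr_with_filling` and the tissue values (3.5% N, 0.4% P) are hypothetical and are not taken from Table 1.

```python
import math

def clr_with_filling(nutrients):
    """Row-centered log ratios for nutrient percentages plus a filling value R.

    `nutrients` maps nutrient names to proportions in % of dry matter.
    """
    filling = 100.0 - sum(nutrients.values())           # Eq. [2]
    parts = dict(nutrients, R=filling)                  # d + 1 components
    log_mean = sum(math.log(p) for p in parts.values()) / len(parts)
    geometric_mean = math.exp(log_mean)                 # Eq. [3]
    # Eq. [4]: log of each component divided by the geometric mean
    return {name: math.log(p / geometric_mean) for name, p in parts.items()}

# Hypothetical tissue analysis (% dry matter), for illustration only.
v = clr_with_filling({"N": 3.5, "P": 0.4})
zero_sum_check = sum(v.values())                        # Eq. [5]
print({name: round(x, 3) for name, x in v.items()})
print("sum of log ratios is zero:", abs(zero_sum_check) < 1e-9)
```

Working with the mean of the logarithms avoids multiplying many small proportions together, which is only a numerical convenience; the result is the same geometric mean as in Eq. [3].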
Let V*N, V*P, V*K, ... V*Rd and SD*N, SD*P, SD*K, ... SD*Rd be the CND norms as means and standard deviations of row-centered log ratios of d nutrients, respectively. The row-centered log ratios of independent specimens are standardized as follows:
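$$I_N = \frac{V_N - V_N^*}{SD_N^*},\quad I_P = \frac{V_P - V_P^*}{SD_P^*},\quad I_K = \frac{V_K - V_K^*}{SD_K^*},\quad \ldots,\quad I_{R_d} = \frac{V_{R_d} - V_{R_d}^*}{SD_{R_d}^*} \tag{6}$$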
where IN, ..., IRd are the CND indices.
Additivity or independence among the compositional data is ascertained using a row-centered log ratio transformation (Aitchison, 1986). The CND indices as defined by Eq. [6] are standardized and linearized variables as dimensions of a circle (d + 1 = 2), a sphere (d + 1 = 3), or a hypersphere (d + 1 > 3) in a d + 1 dimensional space. The CND nutrient imbalance index of a diagnosed specimen is its CND r2 and is computed as follows:
$$r^2 = I_N^2 + I_P^2 + I_K^2 + \cdots + I_{R_d}^2 \tag{7}$$
Each specimen is thus characterized by its radius, r, computed from the CND nutrient indices. The sum of d + 1 squared independent, unit-normal variables produces a new variable having a chi-square distribution with d + 1 degrees of freedom (Ross, 1987). Because CND indices are independent, unit-normal variables, the CND r2 values must have chi-square distribution. The chi-square distribution function provides an advantage for CND over DRIS as a generic support model for small databases.
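As a rough sketch of how Eq. [6] and [7] combine, the following Python snippet standardizes a specimen's row-centered log ratios against a set of norms and sums the squared indices. All numbers are hypothetical placeholders chosen for easy arithmetic; they are not the sweet corn norms of this paper, and the function names are illustrative only.

```python
def cnd_indices(v, norm_mean, norm_sd):
    """Standardize row-centered log ratios against CND norms (Eq. [6])."""
    return {k: (v[k] - norm_mean[k]) / norm_sd[k] for k in v}

def cnd_r2(indices):
    """Nutrient imbalance index: sum of squared CND indices (Eq. [7])."""
    return sum(i * i for i in indices.values())

# HYPOTHETICAL specimen and norms (not from Table 1 or Table 2).
v_specimen = {"N": -0.40, "P": -2.70, "R": 3.10}
norm_mean = {"N": -0.30, "P": -2.60, "R": 2.90}
norm_sd = {"N": 0.10, "P": 0.20, "R": 0.25}

indices = cnd_indices(v_specimen, norm_mean, norm_sd)
r2 = cnd_r2(indices)
print({k: round(i, 2) for k, i in indices.items()}, round(r2, 2))
```

The resulting r2 would then be compared with a critical chi-square value for d + 1 degrees of freedom.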
As defined by Eq. [6] and [7], the closer the CND indices, and thus the CND r2 or chi-square values, are to zero, the higher the probability of obtaining a high yield. Theoretically, at a critical chi-square value of zero, where the ideal nutrient balance is reached, 100% of the population would by definition be expected to yield below the cutoff. Conversely, when a high critical chi-square value is set, most crops would be expected to have the potential to produce the correspondingly low target yields. Therefore, we need a criterion to separate a low- from a high-yield subpopulation to provide an acceptable proportion of expected low yields in a population. Simply setting a yield target, as is usually done for establishing nutrient norms, does not provide a minimum yield cutoff value between the low- and high-yield subpopulations in a survey population.
Steps for Selecting a High-Yield Subpopulation
Step 1. The Mathematical Approach
In a survey population, it is desirable to maximize the number of specimens unequivocally belonging to the low-yield or unhealthy subpopulation (Walworth and Sumner, 1987). The DRIS studies showed that the higher the yield was, the narrower the range of leaf nutrient ratios. Thus, the variance ratios of nutrient expressions between the low- and high-yield subpopulations must depend on the partition between the low- and high-yield subpopulations. An optimum partition should be defined between the two subpopulations. However, there had been no formalism proposed for the partition. An optimum partition could be determined by considering variance ratio functions for nutrient indexes along a decreasing order of yield values. At yield cutoff, a proportion of the whole population is assigned to the low-yield subpopulation. That proportion is an exact probability corresponding to a threshold CND r2 value between low- and high-yield subpopulations. The selected approach is then linked to the chi-square distribution function as a generic model for CND. Should exact probabilities increase with higher yield goals, the CND r2 would decrease according to the chi-square distribution function.
The variance ratio must be low when comparing the variance of a nutrient expression for lowest yields with that for the remainder in a survey population. Conversely, the variance ratio must be high when comparing the variance of a nutrient expression for highest yields with that for the remainder in a survey population. Hence, a curvilinear yield-nutrient relationship must show a yield cutoff value between the low- and high-yield subpopulations at a point where the cumulative variance ratio function between the two subpopulations changes its concavity, i.e., at its inflection point (Fig. 1). The first derivative of the cumulative variance ratio function decreases below the inflection point and increases above it. Thus, discrimination between the low- and the high-yield subpopulations is improved above the inflection point. The yield cutoff value at the inflection point must be the minimum yield value for separating the two subpopulations. The high-yield subpopulation above the inflection point is selected as follows:
- Rank the observations in a decreasing yield order.
- Compute the row-centered log ratios of the nutrient proportions using Eq. [2] to [4].
- Iterate a partition of the database between two subpopulations using the Cate-Nelson procedure (Nelson and Anderson, 1977). In the first partition, the two highest yield values form one group, and the remainder of yield values forms another group; thereafter, the three highest yield values form one group, and the remainder of yield values forms the other. This process is repeated until the two lowest yield values form one group, and the remainder of yield values forms the other. At each iteration, the first subpopulation comprises n1 observations, and the second comprises n2 observations for a total of n observations (n = n1 + n2) in the whole database.
- For the two subpopulations obtained at each iteration, compute the variance of CND VX values. Then compute the variance ratio for component X as follows:
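$$f_i(V_X) = \frac{\text{Variance of } V_X \text{ for the } n_1 \text{ observations}}{\text{Variance of } V_X \text{ for the } n_2 \text{ observations}} \tag{8}$$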
where fi(VX) is the ratio function between two subpopulations for nutrient X at the ith iteration (i = n1 - 1) and VX is the CND row-centered log ratio expression for nutrient X. The first variance ratio function computed from the two highest yields is put on the same line as the highest yield (here 12.9 Mg ha-1), thus leaving three empty bottom lines.
- The cumulative variance ratio function is the sum of variance ratios at the ith iteration from the top. The cumulated variance ratio for a given iteration is computed as a proportion of the total sum of variance ratios across all iterations, so that the discrimination power of the VX between low- and high-yield subpopulations can be compared on a common scale. Compute the cumulative variance ratio function FCi(VX) as follows:
$$F_i^C(V_X) = \frac{\sum_{i=1}^{n_1-1} f_i(V_X)}{\sum_{i=1}^{n-3} f_i(V_X)} \times 100 \tag{9}$$
where n1 - 1 is partition number and n is total number of observations (n1 + n2). The denominator is the sum of variance ratios across all iterations, and thus is a constant for component X.
- The cumulative function FCi(VX) related to yield (Y) shows a cubic pattern as follows:
$$F_i^C(V_X) = aY^3 + bY^2 + cY + d \tag{10}$$
The inflection point is the point where the model shows a change in concavity. It is obtained by differentiating Eq. [10] twice as follows:
$$\frac{\partial^2 F_i^C(V_X)}{\partial Y^2} = 6aY + 2b \tag{11}$$
The inflection point is obtained by equating the second derivative of Eq. [10] to zero as follows:
$$Y = -\frac{b}{3a} \tag{12}$$

Fig. 1. Theoretical relationship between yield and the cumulative variance ratio function, FCi (VX), for a given row-centered log ratio.
The solution for the yield cutoff value is -b/3a (Fig. 1). The highest yield cutoff value (highest discrimination power) among the d + 1 nutrient computations was retained to calculate the proportion of the low-yield subpopulation below yield cutoff used as critical value for the chi-square cumulative distribution function. The highest yield cutoff value across nutrient expressions was selected to ascertain that the minimum yield target for a high-yield subpopulation will be classified as a high yield whatever the nutrient expression.
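The iteration described in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the variance ratio is taken as the variance of the first (higher-yield) group divided by that of the remainder, the cumulative function is expressed in percent, and the VX values are hypothetical. The function name `cumulative_variance_ratio` is not from the paper.

```python
from statistics import variance

def cumulative_variance_ratio(v_sorted):
    """Variance ratios and cumulative function for one log ratio V_X.

    `v_sorted` holds V_X values already ranked by decreasing yield.
    Returns (ratios, cumulative) for partitions i = 1 .. n - 3.
    """
    n = len(v_sorted)
    ratios = []
    for n1 in range(2, n - 1):        # n1 = 2 .. n - 2, so each group has >= 2 values
        top, rest = v_sorted[:n1], v_sorted[n1:]
        ratios.append(variance(top) / variance(rest))    # assumed form of Eq. [8]
    total = sum(ratios)
    cumulative, running = [], 0.0
    for f in ratios:
        running += f
        cumulative.append(100.0 * running / total)       # Eq. [9], in percent
    return ratios, cumulative

# HYPOTHETICAL V_N values ranked by decreasing yield (illustration only).
v_sorted = [-0.30, -0.27, -0.31, -0.25, -0.40, -0.18, -0.45, -0.10]
ratios, cumulative = cumulative_variance_ratio(v_sorted)
print(len(ratios), [round(c, 1) for c in cumulative])
```

A cubic in yield would then be fitted to the cumulative values with any least-squares routine, and the cutoff read from its coefficients as -b/3a.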
Step 2. Derive the Theoretical Threshold Nutrient Imbalance Index
The CND r2 values are distributed like chi-square variables. The chi-square distribution function gives the probability P(X > x) that a chi-square random variable X having d + 1 degrees of freedom is greater than a critical chi-square value x. This distribution provides a simple generic model to obtain CND r2. As shown by the chi-square distribution function, the higher the proportion of the low-yield subpopulation is, the lower the critical chi-square. The larger the number of nutrients (d) under diagnosis is, the higher the critical chi-square. We used the proportion of low-yield subpopulation at yield cutoff as an exact probability of the cumulative chi-square distribution function corresponding to a critical chi-square value with three degrees of freedom for the two-nutrient CND r2 examined below.
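To make Step 2 concrete, the sketch below finds the critical chi-square value x for which P(X > x) equals a given proportion of low-yield specimens. It uses only the Python standard library: the upper-tail probability is computed from the series expansion of the regularized lower incomplete gamma function, and x is found by bisection. The function names are illustrative assumptions; a library routine such as SciPy's `scipy.stats.chi2.isf` would normally do the same job.

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a                    # first term of the incomplete gamma series
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

def chi2_critical(prob_above, df, lo=0.0, hi=100.0):
    """Solve P(X > x) = prob_above for x by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > prob_above:
            lo = mid                  # tail probability still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0

# Proportion of low-yield specimens reported in this paper (67.5%), 3 df.
threshold = chi2_critical(0.675, 3)
print(round(threshold, 1))            # approximately 1.5, as reported in the text
```

Raising the proportion lowers the returned value, and raising the degrees of freedom increases it, which matches the two trends stated above.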
Step 3. Validate the Threshold Nutrient Imbalance Index
The theoretical threshold CND r2 obtained in the survey database from the chi-square distribution function can be first validated in a third step by applying Eq. [7] to the survey population. The critical CND indices are derived using the Cate-Nelson partitioning procedure, squared, and summed to give a CND r2 that is compared with the critical chi-square value obtained for the survey population. A second validation is performed following the same steps as above but using a smaller set of randomly selected data.
MATERIALS AND METHODS
Samples of 40 survey (Table 1) and 20 validation specimens (Table 2) were taken randomly in a survey database of 240 sweet corn observations. Briefly, commercial yields were determined and five nutrient determinations were made during the 1995-1997 period in the Montérégie region, south of Montreal, QC, Canada. Plant density averaged 58000 plants ha-1. Soils were Humic Haplaquents and Inceptisols. Fertilizers had been applied before seeding according to local recommendations, and no additional fertilizer was applied thereafter. Composite aboveground portions of corn seedlings (10-15 subsamples randomly taken in 5- by 10-m plots) were sampled 3 to 4 wk after emergence at the V4-V6 stage (approx. 30 cm high) (Benton Jones et al., 1991). Tissues were oven-dried at 70°C for 48 h, ground in a Wiley mill, and digested according to Isaac and Johnson (1976). Nutrient concentrations were determined colorimetrically for N and P using a Bran and Luebbe TRAACS 800 autoanalyzer. Ear yields were collected in two 2-m-long rows in the central part of the plots. Ear yields were weighed at commercial maturity stage with the wrapping leaves removed.
Table 1. Computation procedure to select the high-yield subpopulation (i = 1-13) from tissue analyses of survey specimens in S2.
Statistical Analysis
The CND norms were the means and standard deviations of row-centered log ratios in a high-yield subpopulation. The Cate-Nelson ANOVA procedure (Nelson and Anderson, 1977) was used to partition yield data between two groups in the validation database to determine threshold values for CND nutrient and r2 indices. All computations were made using Excel software (Microsoft, 1997).
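For readers unfamiliar with the partitioning step, the sketch below shows one simplified reading of an ANOVA-style Cate-Nelson partition: observations are ordered by the index, and the two-group split that maximizes the between-group sum of squares of yield is retained. This is an assumption about the general technique, not a reproduction of the authors' spreadsheet, and the data are hypothetical.

```python
def cate_nelson_split(x, y):
    """Return (critical_x, between_group_ss) for the best two-group split.

    Observations are ordered by x; each candidate split keeps at least
    two observations per group.
    """
    pairs = sorted(zip(x, y))
    ys = [p[1] for p in pairs]
    n = len(ys)
    grand_mean = sum(ys) / n
    best = None
    for k in range(2, n - 1):
        m1 = sum(ys[:k]) / k
        m2 = sum(ys[k:]) / (n - k)
        ss = k * (m1 - grand_mean) ** 2 + (n - k) * (m2 - grand_mean) ** 2
        if best is None or ss > best[1]:
            # critical value taken midway between the two bordering x values
            critical_x = (pairs[k - 1][0] + pairs[k][0]) / 2.0
            best = (critical_x, ss)
    return best

# HYPOTHETICAL CND r2 values and ear yields (Mg ha-1), illustration only.
r2_values = [0.2, 0.5, 0.8, 1.0, 2.2, 2.6, 3.3, 4.2]
yields = [9.1, 8.7, 8.9, 8.2, 5.9, 5.5, 6.1, 4.8]
critical_r2, between_ss = cate_nelson_split(r2_values, yields)
print(round(critical_r2, 2), round(between_ss, 2))
```

Taking the midpoint between the two bordering observations is one common convention for reporting the critical value; other conventions exist.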
RESULTS AND DISCUSSION
Compositional Nutrient Diagnosis Example for Simplex S2 (N, P)
Step 1
The computations of the cumulative variance ratio functions in S2 for N and P (d = 2) are presented in Table 1. First, yields were ranked in a decreasing order. The R2 was obtained by subtracting N and P percentages from 100% (Eq. [2]). The geometric mean was computed using Eq. [3], and row-centered log ratios VN, VP, and VR2 were computed using Eq. [4]. Each cumulative variance function, FCi(VN), FCi(VP), and FCi(VR2) (Eq. [9]), is the sum of variance ratios between two groups for row-centered log ratios, fi(VN), fi(VP), and fi(VR2) (Eq. [8]).
For the fifth iteration (Table 1), ear yield was 8.21 Mg ha-1. The variance ratio f5(VN) was computed as follows:
 | (13) |
The cumulative variance function FC5(VN) was computed as follows:
 | (14) |
Relating FCi(VN), FCi(VP), and FCi(VR2) values to yield gave three cubic models (Eq. [10]; Fig. 2) with inflection points at -b/3a (Eq. [12]). Yield cutoff values were -0.76/(3 × 0.03) < 0 for VN, 4.84/(3 × 0.24) = 6.72 Mg ha-1 for VP, and 0.53/(3 × 0.03) = 5.89 Mg ha-1 for VR2. We retained 6.72 Mg ha-1, the highest of the three values, to define the high-yield subpopulation, which included 13 of the 40 specimens, or 32.5% of the population (Table 1).
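The arithmetic behind these cutoffs can be checked with a short sketch. The coefficient signs below are inferred from the quotients as printed in the text (for example, b = -4.84 and a = 0.24 for VP), which is an assumption because the fitted equations themselves appear only in Fig. 2.

```python
def inflection_yield(a, b):
    """Yield at which the cubic aY^3 + bY^2 + cY + d changes concavity."""
    return -b / (3.0 * a)

# (a, b) pairs inferred from the quotients reported in the text.
coefficients = {"VN": (0.03, 0.76), "VP": (0.24, -4.84), "VR2": (0.03, -0.53)}
cutoffs = {name: inflection_yield(a, b) for name, (a, b) in coefficients.items()}

retained = max(cutoffs.values())        # highest cutoff across expressions
n_total, n_high = 40, 13                # counts reported for Table 1
low_yield_proportion = (n_total - n_high) / n_total

print({name: round(y, 2) for name, y in cutoffs.items()})
print(round(retained, 2), low_yield_proportion)
```

The negative value for VN falls outside the observed yield range, so only the VP and VR2 cutoffs are candidates for the partition.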

Fig. 2. Equations relating ear yield to the cumulative variance ratio function in S2 for computing yield cutoff between low- and high-yield subpopulations at inflection point.
Step 2
In simplex S2 (Table 1), 27 of the 40 specimens (i.e., 40 - 13), or 67.5% of the population, were below the yield cutoff of 6.72 Mg ha-1. That yield cutoff value was considered the minimum yield value for separating the low- and high-yield subpopulations. The corresponding chi-square value was 1.5 (Fig. 3).

Fig. 3. The chi-square cumulative distribution functions with 3 df to obtain theoretical threshold Compositional Nutrient Diagnosis (CND) r2 values in S2 (1.5) for yield cutoff at 67.5% of low-yield subpopulation.
Step 3
The validation of the critical chi-square value is conducted using survey and independent specimens and CND norms. The means and standard deviations of VN, VP, and VR2 values for the high-yield specimens (italic values in Table 1) were the CND norms computed as follows:
 | (15) |
 | (16) |
 | (17) |
 | (18) |
 | (19) |
 | (20) |
Thus, V*N + V*P + V*R2 = -0.284 + -2.607 + 2.891 = 0, according to Eq. [5]. In order to validate the critical chi-square distribution of CND r2 values, r2 values were computed by adding up squared nutrient indices. The row-centered log ratio (VN, VP, and VR2) values of the validation population are presented in Table 2. The CND indices IN, IP, and IR2 were calculated using Eq. [6]. For Sample 11 in Table 2 showing a VN value of -0.411, a VP value of -2.718, and a VR2 value of 3.129, nutrient indices were:
 | (21) |
 | (22) |
 | (23) |
The CND indices may have positive or negative values. Nutrient indices close to zero should provide the ideal nutrient balance between N and P. The global nutrient index was computed as follows, squaring the values for IN, IP, and IR2 calculated above:
 | (24) |
Using survey specimens, the relationship between CND r2 and chi-square values had a close fit with a coefficient of determination of >0.999 (P < 0.001) (data not shown), as expected from theory (Ross, 1987).
Validation of the critical CND r2 using independent specimens is shown in Table 2 and Fig. 4. The iterative Cate-Nelson partitioning procedure applied to the relationship between ear yield and CND r2 (Fig. 4) showed a critical value of about 1.4, close to the threshold value of 1.5 obtained at Step 2. The corresponding yield was 6.40 Mg ha-1, close to the yield cutoff of 6.72 Mg ha-1 obtained from survey data at Step 1.
For a d + 1 or three-dimensional diagnosis, a threshold CND r2 value of 1.5 could be illustrated as a critical sphere with a radius r of 1.2 centered at V*N, V*P, and V*R2, and including high-yield specimens. That critical radius would be specific to the sweet corn data examined and yield cutoff obtained because the proportion of the low-yield subpopulation at yield cutoff may be different with our sweet corn database compared with another crop database.
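As a worked illustration of the critical sphere, the radius follows directly from the threshold, and a specimen would lie inside the sphere when its squared indices sum to no more than that threshold (the inequality is our restatement of Eq. [7], not an equation from the original paper):

$$r = \sqrt{1.5} \approx 1.22, \qquad I_N^2 + I_P^2 + I_{R_2}^2 \le 1.5$$

Because the indices are standardized deviations from the norms, the origin of this index space corresponds to the point (V*N, V*P, V*R2).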
Should yield target be higher than yield cutoff, the critical chi-square value must be lower for the same set of nutrient norms. As deduced from Eq. [7], nutrient ranges must also be narrower for the same set of CND nutrient norms. Because the computation of FCi(VX) begins with highest yield level, the high-yield subpopulation is given more weight for defining yield cutoff than the low-yield subpopulation. As a result, should more high-yield specimens be added to the database, the yield cutoff value could also increase, and nutrient norms (primarily standard deviation) could change as well as the CND r2 threshold value.
CONCLUSION
A mathematical approach is presented to separate low- and high-yield subpopulations and establish a critical CND nutrient imbalance index using two sweet corn data sets. A cubic cumulative variance ratio function, FCi(VX), partitioned the low- and high-yield subpopulations at inflection points. The proportion of low-yield specimens in the total population is used as a probability of the chi-square cumulative distribution function to define a critical chi-square value, or CND r2. Other calculation steps using a larger number of nutrients and fertilizer trials are required for validating critical CND index ranges in a hyperdimensional body. This improved CND procedure relies on compositional data analysis and the chi-square distribution function. A small database could generate specific CND nutrient norms as means and standard deviations of nutrient multiratios characterizing the high-yield subpopulation in a given agroecosystem.
ACKNOWLEDGMENTS
We are grateful to the Fédération québécoise des producteurs de fruits et légumes de transformation, the Québec Food Processors Association, the Matching Investment Initiative Program of Agriculture and Agri-Food Canada, and the Natural Science and Engineering Research Council of Canada for financial support. The technical assistance of Yvon Perron and Marcel Tétreault is gratefully acknowledged.
REFERENCES
- Aitchison, J. 1986. Statistical analysis of compositional data. Chapman and Hall, New York.
- Bates, T.E. 1971. Factors affecting critical nutrient concentrations in plant and their evaluation: A review. Soil Sci. 112:116-130.
- Benton Jones, J., Jr., B. Wolf, and H.A. Mills. 1991. Plant analysis handbook. A practical sampling, preparation, analysis, and interpretation guide. Micro-Macro Publ., Athens, GA.
- Escano, C.R., C.A. Jones, and G. Uehara. 1981. Nutrient diagnosis in corn grown in Hydric Dystrandepts: II. Comparison of two systems of tissue diagnosis. Soil Sci. Soc. Am. J. 45:1140-1144.
- Isaac, R., and W. Johnson. 1976. Determination of total nitrogen in plant tissue, using a block digestor. J. AOAC Int. 59:98-100.
- Microsoft. 1997. Microsoft Excel 97. Incline Village, NV.
- Nelson, L.A., and R.L. Anderson. 1977. Partitioning of soil test-crop response probability. p. 19-38. In M. Stelly (ed.) Soil testing: Correlating and interpreting the analytical results. ASA Spec. Publ. 29. ASA, Madison, WI.
- Parent, L.E., and M. Dafir. 1992. A theoretical concept of compositional nutrient diagnosis. J. Am. Soc. Hortic. Sci. 117:239-242.
- Parent, L.E., A.N. Cambouris, and A. Muhawenimana. 1994. Multivariate diagnosis of nutrient imbalance in potato crops. Soil Sci. Soc. Am. J. 58:1432-1438.
- Parent, L.E., M. Poirier, and M. Asselin. 1995. Multinutrient diagnosis of nitrogen status in plants. J. Plant Nutr. 18:1013-1025.
- Ross, S.M. 1987. Introduction to probability and statistics for engineers and scientists. John Wiley & Sons, New York.
- Walworth, J.L., and M.E. Sumner. 1987. The Diagnosis and Recommendation Integrated System (DRIS). Adv. Soil Sci. 6:149-188.
- Walworth, J.L., H.J. Woodard, and M.E. Sumner. 1988. Generation of corn tissue norms from a small, high-yield database. Commun. Soil Sci. Plant Anal. 19:563-577.