Agronomy Journal Journal of Natural Resources and Life Sciences Education
Published online 17 August 2005
Published in Agron J 97:1291-1294 (2005)
DOI: 10.2134/agronj2004.0216
© 2005 American Society of Agronomy
677 S. Segoe Rd., Madison, WI 53711 USA

Cotton

Comparisons of Two Statistical Models for Evaluating Boll Retention in Cotton

Jixiang Wu (a), Johnie N. Jenkins (b,*), Jack C. McCarty, Jr. (b), and Clarence E. Watson (c)

a Dep. of Plant and Soil Sci., Mississippi State Univ., Mississippi State, MS 39762
b Crop Sci. Res. Lab., USDA-ARS, Mississippi State, MS 39762
c MAFES Administration, Mississippi State Univ., Mississippi State, MS 39762

* Corresponding author (jnjenkins{at}msa-msstate.ars.usda.gov)

Received for publication August 13, 2004.

    ABSTRACT
Boll number is one of the most important traits related to yield of upland cotton (Gossypium hirsutum L.). Evaluation of boll retention properties at different fruiting sites would provide useful information for cotton breeding and cotton growth management. The presence or absence of a boll at each fruiting position can be considered as binomially distributed. In this study, 188 upland cotton recombinant inbred (RI) lines, two parental lines, and a control cultivar, Stoneville 474, were used. These lines were planted at Mississippi State, MS in 1999. The data set was analyzed by the mixed linear model and logistic regression model. The results showed that the boll retention for the first position was significantly different among nodes but expressed similar total numbers from the first position among RI lines. Estimates for boll retention were similar for both models; however, the logistic regression model gave smaller confidence intervals for each estimate than the mixed linear model.

Abbreviations: CIL, confidence interval length • RI, recombinant inbred (lines)


    INTRODUCTION
BOLL NUMBER is one of the most important traits related to yield of upland cotton (Gossypium hirsutum L.). The transformation and development of bolls on a plant is time and space dependent. Some researchers have focused on studying boll retention properties at different nodes and positions as well as total boll number per plant (Jenkins et al., 1990a, 1990b; Kerby et al., 1987; Jenkins and McCarty, 1995; Shoemaker, 2000). This research not only evaluated the positional contributions to total yield production but also evaluated growth behavior such as earliness. Previous research showed that bolls from first position contribute 66–75%, and bolls from second position 18–21%, to total yield of modern cultivars (Jenkins et al., 1990a, 1990b; Jenkins and McCarty, 1995; Kerby et al., 1987).

Generally, this space-dependent character, boll number or boll retention, is treated as a continuously distributed variable that can be analyzed by analysis of variance (ANOVA) models; however, one potential problem is that the error variation of boll retention could differ considerably across positions and nodes due to changes in environmental and physiological conditions during the growing season. A data set with heterogeneous error variances would violate one of the assumptions of the ANOVA model. Because boll retention at different fruiting sites could be correlated and have different error variances, a mixed model with different error variance–covariance structures may be used to improve statistical power (Littell et al., 1996).

On the other hand, a boll at a specific fruiting site on a plant is expressed as either present or absent, and thus it can be considered as binomially distributed. Boll retention varies among different fruiting sites. For example, boll retention at the first position in the middle of the plant is generally greater than that at other positions. Logistic regression analysis is often used to investigate the relationship between a binary trait and a set of explanatory variables. Several books have discussed logistic regression (Collett, 1991; Agresti, 1990, 1996; Cox and Snell, 1989; Hosmer and Lemeshow, 1996; McCullagh and Nelder, 1989). Currently, statistical software packages such as SAS are available for logistic regression analysis. In addition, SAS version 8.0 enables researchers to specify categorical variables as explanatory variables in the model (SAS Inst., 1999b).

In the current study, data from first-position bolls of 188 upland cotton RI lines, two parental lines, and one commercial cultivar grown in 1 yr were used. Data were analyzed with the mixed linear model under four error structures and with the logistic regression model. Boll retention and its 95% confidence intervals at different fruiting nodes were calculated for both the mixed linear model and the logistic model. The purpose of this study was to compare the estimates of boll retention and their statistical precision between the two models. The results should help researchers determine which model should be used to analyze a binary-type trait.


    MATERIALS AND METHODS
Materials
One hundred eighty-eight RI lines (F8) were developed by the single-hill (bulked progeny row) procedure (Fehr, 1987) from the G. hirsutum intraspecific cross HS46 (P1) × MARCABUCAG8US-1-88 (P2) (Shappley et al., 1998a, 1998b). A cross between P1 and P2 was made at Mississippi State, MS in 1991, and the F1 generation was grown in 1992. One hundred F2 seeds from one F1 individual were planted in the greenhouse and selfed in 1992. The F3 seeds were planted in 12-m single-row plots (termed single hills) at Mississippi State in the spring of 1994, and plants were self-pollinated and bulked by progeny row. In the winter of 1994, F4 selfed seeds were sent to a nursery in Mexico for generation increase by selfing and were bulked by progeny row to obtain F5 seeds. In the spring of 1995, two-row F5 plots from each F2–derived family were planted, and 25 individual plants were selfed and harvested to obtain F6 seeds. In the winter of 1996, one seed from each of the 25 selfed plants from each F2–derived family was sent to Mexico. Up to eight plants from each family were selfed to produce F7 seeds. In the winter of 1998, up to eight individual plant progenies from each of 94 F2–derived families were planted and hand-harvested separately (F8 seeds). Two lines were randomly chosen from each F2–derived family to reduce the population size.

The 188 RI lines (two lines from each of 94 F2–derived families), two parental lines, and one commercial cultivar, Stoneville 474, were grown at the Plant Science Research Center, Mississippi State, MS in 1999 with four replications. Plots consisted of two rows, 12 m long, with a row spacing of 0.97 m. The field soil was a Leeper silty clay loam. Before boll samples were hand-picked and the plots were machine harvested, 10 normal plants (no aborted terminals) in each two-row plot were randomly selected to determine boll retention for the first position from Main-Stem Nodes 5 to 22. Data were recorded as boll present = 1 or absent = 0. For each plot, boll retention of the first position at each node was calculated as the number of bolls present divided by the number of plants.
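The per-plot calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `boll_retention` and the example scores are hypothetical and are not taken from the study's data.

```python
def boll_retention(presence):
    """Fraction of sampled plants carrying a first-position boll at one node.

    `presence` holds one 0/1 score per sampled plant (1 = boll present).
    """
    if not presence:
        raise ValueError("no plants sampled")
    if any(score not in (0, 1) for score in presence):
        raise ValueError("scores must be 0 (absent) or 1 (present)")
    return sum(presence) / len(presence)


# Hypothetical scores for the 10 plants sampled in one plot at one node.
plot_scores = [1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
print(f"{boll_retention(plot_scores):.0%}")  # prints "40%"
```

With 10 plants per plot, the plot-level retention can only take the values 0, 0.1, ..., 1.0, which is one reason the binomial view of the trait is natural.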

Methods
The linear model used for the mixed linear approach is $y_{ij} = \mu + N_i + e_{ij}$, where $\mu$ is the population mean, $N_i$ is the node effect, and $e_{ij}$ is the residual. The $e_{ij}$ could be genetically correlated across nodes for the same genotype and could have different residual variances; thus, we considered four types of variance–covariance structures [CSH, ARH(1), UN, and CS; see Table 1 for the definitions] in the linear model (Littell et al., 1996; SAS Inst., 1999b). Among them, the CS type is equivalent to the linear model $y_{ij} = \mu + N_i + G_j + e_{ij}$, which can also be analyzed by the GLM or ANOVA method, where $G_j$ is the genotypic effect and is considered random. The least-squares means and standard errors for boll retention at each node were estimated for the four structures. The 95% confidence interval length (CIL) for each parameter was calculated from its standard error as $\mathrm{CIL} = 2 \times t_{0.025} \times \mathrm{SE}$, where SE is the standard error for boll retention at a specific node.


Table 1. Different variance–covariance structures used for mixed linear model analyses.
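Because Table 1 is not reproduced here, the sketch below builds the CS, CSH, and ARH(1) structures from their usual textbook definitions (as used, for example, in SAS PROC MIXED) and counts the parameters each structure needs. The function names and the numeric inputs are illustrative assumptions, not values from the study.

```python
def cs(n, sigma2, sigma1):
    """Compound symmetry: variance sigma2 + sigma1, common covariance sigma1."""
    return [[sigma1 + (sigma2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def csh(sds, rho):
    """Heterogeneous CS: node-specific variances, one common correlation rho."""
    n = len(sds)
    return [[sds[i] * sds[j] * (1.0 if i == j else rho) for j in range(n)]
            for i in range(n)]


def arh1(sds, rho):
    """Heterogeneous AR(1): correlation rho**|i - j| decays with node distance."""
    n = len(sds)
    return [[sds[i] * sds[j] * rho ** abs(i - j) for j in range(n)]
            for i in range(n)]


def n_params(structure, n):
    """Number of covariance parameters for n repeated measures (nodes)."""
    return {"CS": 2, "CSH": n + 1, "ARH(1)": n + 1,
            "UN": n * (n + 1) // 2}[structure]


# Hypothetical standard deviations for three adjacent nodes.
for row in arh1([1.0, 2.0, 3.0], 0.5):
    print(row)

# Nodes 5 to 22 give 18 repeated measures per genotype.
print({s: n_params(s, 18) for s in ("CS", "CSH", "ARH(1)", "UN")})
```

The unstructured (UN) type places no pattern on the matrix, so with 18 nodes it requires far more parameters than the other three structures.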

 
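The total number of bolls present for each node and each genotype over replications was used for the logistic regression model. In the logistic regression analysis, the logit link function was employed. The model used for logistic regression was $\pi_{ij} = e^{\mu + N_i + G_j} / (1 + e^{\mu + N_i + G_j})$, where $\mu$, $N_i$, and $G_j$ are as defined above. The linear predictor $\hat{\eta}_{ij} = \hat{\mu} + \hat{N}_i + \hat{G}_j$ and its standard error $\hat{\sigma}(\hat{\eta}_{ij})$ were calculated. Boll retention was estimated by $\hat{\pi}_{ij} = e^{\hat{\eta}_{ij}} / (1 + e^{\hat{\eta}_{ij}})$, and the 95% CIL was estimated via $\hat{\pi}[\hat{\eta}_{ij} + z_{0.025}\,\hat{\sigma}(\hat{\eta}_{ij})] - \hat{\pi}[\hat{\eta}_{ij} - z_{0.025}\,\hat{\sigma}(\hat{\eta}_{ij})]$, where $\hat{\pi}[\eta] = e^{\eta} / (1 + e^{\eta})$, for the logistic regression model (SAS Inst., 1999a). All data analyses were conducted using SAS 8.0 (SAS Inst., 1999b).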
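The back-transformation from the logit scale to a boll retention estimate and its interval length can be sketched as follows. The sketch assumes the interval is formed on the logit scale and then back-transformed; the function names and the example values of the linear predictor and its standard error are hypothetical.

```python
import math
from statistics import NormalDist


def inv_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def retention_with_cil(eta_hat, se_eta, level=0.95):
    """Return (estimate, lower, upper, CIL) on the probability scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    lower = inv_logit(eta_hat - z * se_eta)
    upper = inv_logit(eta_hat + z * se_eta)
    return inv_logit(eta_hat), lower, upper, upper - lower


# Hypothetical linear predictor and standard error for one node.
estimate, lower, upper, cil = retention_with_cil(eta_hat=-0.25, se_eta=0.03)
print(f"retention = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f}), "
      f"CIL = {cil:.3f}")
```

Because the limits are back-transformed, they always stay between 0 and 1 and are generally not symmetric around the estimate unless the estimate is 0.5.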


    RESULTS
Phenotypic Data
Mean boll retention over all RI cotton lines for different nodes is summarized in Table 2. No fruiting node (Position 1) had boll retention greater than 50%. Node 5 had approximately 20% boll retention while Nodes 6 through 13 had greater than 30% boll retention. Only Nodes 7 and 8 had greater than 40% boll retention. Node 7 had the highest boll retention, reaching approximately 44%. Above Node 7, boll retention decreased. Normally, bolls from the middle nodes account for the majority of the contribution to total boll number per plant. For example, the contribution from Nodes 6 through 15 accounted for 84% of total boll number for the first position. Contribution from Nodes 7–12 accounted for 55%.


Table 2. Statistical properties for boll retention for different nodes of recombinant inbred lines.

 
Mixed Linear Model Analysis
Sums of squares for boll retention at the first position, based on genotype × node means and obtained using the ANOVA approach, are summarized in Table 3. Both genotype and node had significant effects on boll retention at the first position. To further determine the relative importance of genotypic and node effects in the phenotypic variance, we considered both genotypic and node effects as random and estimated their variance components from the results listed in Table 3. Node effects contributed 78.6% of the total variation, while genotypic effects contributed 1.7%. Thus, the data suggested that node effects had a much greater impact on boll retention in this study than genotypic effects, which were almost negligible. The residual, which includes node × genotype interaction effects, contributed 19.7% of the total variation in boll retention.


Table 3. Sum of squares for boll retention (%) among genotypes and nodes by the general linear model.
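One common way to obtain such variance components is the method of moments, which equates ANOVA mean squares to their expectations in a two-way random model with one observation (mean) per genotype × node cell. The sketch below assumes that approach; the source does not state the exact estimator, and the mean squares used here are hypothetical placeholders rather than the values in Table 3.

```python
def variance_shares(ms_genotype, ms_node, ms_residual, n_genotypes, n_nodes):
    """Method-of-moments variance components, returned as shares of the total.

    Assumes E[MS_genotype] = var_e + n_nodes * var_g and
            E[MS_node]     = var_e + n_genotypes * var_n.
    Negative estimates are truncated at zero.
    """
    var_g = max((ms_genotype - ms_residual) / n_nodes, 0.0)
    var_n = max((ms_node - ms_residual) / n_genotypes, 0.0)
    var_e = ms_residual
    total = var_g + var_n + var_e
    return {"genotype": var_g / total,
            "node": var_n / total,
            "residual": var_e / total}


# Hypothetical mean squares; 191 entries and 18 nodes (Nodes 5 to 22).
shares = variance_shares(ms_genotype=136.0, ms_node=1055.0, ms_residual=100.0,
                         n_genotypes=191, n_nodes=18)
for source, share in shares.items():
    print(f"{source}: {share:.1%}")
```

Note that the node mean square is divided by the number of genotypes and the genotype mean square by the number of nodes, so a large mean square does not by itself imply a large variance component.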

 
Residual variances for boll retention on different nodes estimated by the mixed linear model approach for three repeated measurement variance–covariance structures [CSH, ARH(1), and UN] are summarized in Table 4. The residual variances obtained by these three variance–covariance structures varied among nodes, and they were similar for these three variance–covariance structures.


Table 4. Estimates of residual variance for three variance–covariance structures using the mixed linear model approach.†

 
Logistic Regression Analysis
Both genotype and node were considered as categorical explanatory variables in the logistic regression analysis. A stepwise selection procedure was applied to choose the candidate explanatory variables during the analysis. In this study, only node was selected as having significant effects on boll retention in the logistic regression model (Table 5). Nodes 5 through 16 expressed positive effects while Node 17 and above expressed negative effects, indicating that Nodes 5 through 16 had higher boll retention than Node 17 and above. Therefore, boll retention could be estimated by the following formula: $\hat{\pi}_i = e^{\hat{\mu} + \hat{N}_i} / (1 + e^{\hat{\mu} + \hat{N}_i})$, where the estimated values for $N_i$ and $\mu$ are listed in Table 5.


Table 5. Estimated node effects in the logistic regression model using recombinant inbred lines.

 
Comparisons between Mixed Linear Model and Logistic Regression Model
Mean boll retention and standard errors for different nodes in the RI population were estimated for the mixed models and the logistic regression model. Estimated boll retention was very similar for the two approaches (Table 6), suggesting that the mixed models and the logistic model offer similar estimates. Mean 95% CIL was 2.08, 2.11, 2.07, 2.25, and 1.59% for the CSH, ARH(1), UN, CS, and logistic models, respectively. The logistic regression model gave a smaller CIL for each estimate of boll retention than the mixed linear model with any of the four error structures (Table 6). Among the mixed models, the CSH, ARH(1), and UN structures gave smaller mean CIL than the CS structure, suggesting that the use of some error structures in the mixed linear model may provide higher precision than the use of a linear model approach.


Table 6. Estimated first-position boll retention and their 95% confidence interval length (CIL) on different nodes.
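The mean CIL values reported in the text (2.08, 2.11, 2.07, 2.25, and 1.59%) can be turned into relative reductions for the logistic model with a short calculation. The helper name `relative_reduction` is illustrative; the numbers are the means quoted above, not the per-node values in Table 6.

```python
# Mean 95% CIL (in percentage points) quoted in the text for each model.
MIXED_MODEL_CIL = {"CSH": 2.08, "ARH(1)": 2.11, "UN": 2.07, "CS": 2.25}
LOGISTIC_CIL = 1.59


def relative_reduction(reference, alternative):
    """Fractional shortening of the interval relative to the reference."""
    return 1.0 - alternative / reference


reductions = {name: relative_reduction(cil, LOGISTIC_CIL)
              for name, cil in MIXED_MODEL_CIL.items()}
for name, value in reductions.items():
    print(f"{name}: logistic CIL is {value:.1%} shorter")
# CSH: 23.6%, ARH(1): 24.6%, UN: 23.2%, CS: 29.3%
```

On these means, the logistic intervals are roughly 23 to 29% shorter than those from the mixed linear model, with the largest difference against the CS structure.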

 

    DISCUSSION
Understanding the properties of space-dependent boll retention is useful to breeders and growers for both cotton breeding and production. In previous studies, this trait was considered a continuously and normally distributed variable. In this study, we treated boll retention as a binary trait and analyzed it using a logistic regression model. The logistic regression model gave mean estimates similar to those from the mixed model with different error structures but provided different confidence intervals. Although the logistic regression model provided smaller confidence intervals, this alone does not prove that the logistic model is better than the mixed linear model. A possible reason for the difference is that the mixed linear approach (including the ANOVA method) calculates the standard error of each least-squares mean from the residual variance, whereas the standard error for an estimated probability in a logistic regression model depends on the estimated probability and the number of plants observed.
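The contrast between the two kinds of standard error can be illustrated with a simplified calculation. The sketch below uses the ordinary binomial (Wald) standard error on the probability scale as a stand-in for the logistic-model interval, and a residual-variance-based standard error for the linear-model interval; the plant counts, residual variance, and critical value are hypothetical, not figures from the study.

```python
import math
from statistics import NormalDist


def binomial_cil(p, n_plants, level=0.95):
    """Wald interval length for a proportion: depends on p and on n_plants."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return 2.0 * z * math.sqrt(p * (1.0 - p) / n_plants)


def linear_model_cil(residual_variance, n_obs, critical_value):
    """Interval length for a mean: depends on residual variance, not on p."""
    return 2.0 * critical_value * math.sqrt(residual_variance / n_obs)


# Hypothetical: 1000 plants scored per node, retention of 44% versus 10%.
for p in (0.44, 0.10):
    print(f"p = {p:.2f}: binomial CIL = {binomial_cil(p, 1000):.4f}")

# Hypothetical residual variance (proportion scale) with 100 plot means.
print(f"linear-model CIL = {linear_model_cil(0.01, 100, 1.98):.4f}")
```

The binomial interval narrows as the estimated probability moves away from 0.5 and as more plants are scored, while the linear-model interval is governed by the residual variance, so the two need not agree even when the point estimates do.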

Boll retention for the first position in the middle of the plant for the RI lines in this study was lower than that for widely grown cultivars (Jenkins and McCarty, 1995). Retention could be improved through a proper breeding scheme and/or improvement of environmental conditions. Because bolls at Node 16 and above are unstable, small, and of poor fiber quality in cotton production, bolls produced at these nodes can usually be ignored. High boll retention for Nodes 6 through 15 should be very important for improving cotton production because bolls in the middle of a plant normally yield better fiber and account for the major contribution to total cotton yield (Jenkins and McCarty, 1995). Little difference was found for boll retention at the first position for Nodes 7 to 14 among commercial cultivars except for the short-season genotypes DH 126 and DES119 and the full-season cultivars DP90 and DP5690 (Jenkins and McCarty, 1995). Similar results were found in this study; however, unpublished data and many other studies showed that total boll number and cotton yield were mainly controlled by genotypic effects. Possibly, bolls (or boll retention) at the second position account for yield differences among genotypes. Thus, the potential for boll retention at other positions, such as the second position, on the middle nodes of a plant could be an important consideration for yield improvement. Field management should focus on how to improve boll retention probability at the first two positions.

The main objective was to compare the results between the mixed linear model and the logistic regression model. Due to the time and labor required to collect boll retention data for 191 cotton lines, this investigation was conducted only for the first position in 1 yr; however, we believe that the results obtained from a large data set provided reliable information for comparing the two statistical models to evaluate boll retention and possibly other binary traits. If genotype x environment interactions have strong impacts on boll retention, repeating the experiment in multiple environments would be needed.


    NOTES
Contribution of the USDA-ARS in cooperation with the Mississippi Agric. and Forestry Exp. Stn.

