Published in Agron. J. 96:1323-1330 (2004).
© American Society of Agronomy
677 S. Segoe Rd., Madison, WI 53711 USA

Statistics

Simultaneous Tests and Confidence Intervals for the Evaluation of Agricultural Field Trials

Cornelia Frömke* and Frank Bretz

Research Unit of Bioinformatics, Univ. of Hannover, Herrenhaeuser Strasse 2, 30419 Hannover, Germany

* Corresponding author (froemke{at}bioinf.uni-hannover.de)

Received for publication June 18, 2003.

    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 EXAMPLE DATA SET
 STATISTICAL METHODS
 THE MACROS
 ANALYSIS OF THE EXAMPLE
 SECOND EXAMPLE: ANALYSIS OF...
 NUMERICAL COMPARISONS
 CONCLUSIONS
 REFERENCES
 
Agricultural designs commonly involve the simultaneous analysis of more than two factors and more than two levels for each factor. This article addresses the use of adequate statistical tests while taking multiplicity into account. Two data sets and their appropriate statistical analysis are presented as illustrative examples. The first data set is a split-plot design with two fixed factors and a random block factor, and the second data set is a two-factor factorial completely randomized design. Two SAS macros are presented that analyze the data sets properly with respect to the multiplicity problem; they compute simultaneous confidence intervals and multiplicity-adjusted p values. The two macros, called %SimultanTests and %SimultanIntervals, are based on exact evaluations of the underlying multivariate t distribution and are extensions of the published macros %SimTests and %SimIntervals. The macros are widely applicable tools, which can be used with relatively little effort for the analysis of many agricultural designs.

Abbreviations: MCT, multiple-contrast test • SCT, single-contrast test


    INTRODUCTION
Multiple comparison problems appear in almost any agricultural field experiment; see, for example, Clifton-Brown et al. (2001) and Valencia et al. (1999). As the number of experimental questions and hypotheses grows, so does the probability of rejecting at least one true null hypothesis. This is referred to in the literature as the multiplicity problem. To address it, one has to control the familywise error rate (FWE), which is the probability of rejecting at least one true null hypothesis when the entire family of hypotheses is considered.

Special methods have been introduced in the literature to analyze such multiple experimental questions correctly while controlling this error rate; see Hochberg and Tamhane (1987) for a general reference. An example where multiplicity appears could be a study where the effect of three irrigation systems (a, b, and c) is tested on two types of cabbage (Brassica oleracea L.) (1 and 2). One might be interested in which of the systems has the best influence on the yield of cabbage in general. This results in the all-pairwise comparisons of the systems a vs. b, b vs. c, and a vs. c. In addition, one might want to compare the two types of cabbage for each of the irrigation systems, which leads to three more tests. One can easily see that there is a multiplicity problem: If each null hypothesis is tested at an uncorrected level $\alpha$, then the probability of rejecting at least one hypothesis erroneously is larger than the preassigned level $\alpha$. In fact, if the six test statistics were independent and $\alpha$ = 5%, the true error rate could be as large as $1 - 0.95^6 \approx 26.5\%$. Thus, the experimenter would get an erroneously significant result with a higher probability than the intended 5%. Another agricultural example containing multiplicity could be a study with a new insecticide where seven doses and an approved insecticide (positive control) are tested on the mortality of aphids. The experimenter might be interested in which of the doses results in a higher mortality of aphids compared with the control. One would have to test seven null hypotheses: each dose against the approved insecticide. Again, a disregarded multiplicity problem leads to an increased number of erroneously significant results.
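
The error-rate inflation described above can be checked numerically. The following Python sketch assumes independent test statistics, as in the text; the function name is illustrative and not part of any SAS macro.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false rejection among m independent
    true null hypotheses, each tested at the unadjusted level alpha."""
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    alpha = 0.05
    # 6 tests: the irrigation/cabbage example; 7 tests: the insecticide example.
    for m in (1, 3, 6, 7):
        print(f"m = {m}: FWE = {familywise_error_rate(alpha, m):.3f}")
    # Testing each hypothesis at alpha / m (Bonferroni) keeps the FWE below alpha.
    print(f"Bonferroni, m = 6: {familywise_error_rate(alpha / 6, 6):.4f}")
```

For six independent tests at the 5% level this gives the 26.5% quoted above.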

Most statistical software packages implement some multiplicity-controlling procedures. The SAS Version 8 (SAS Inst., Inc., Cary, NC) procedures GLM, MIXED, and MULTTEST, for example, provide multiplicity-adjusted simultaneous comparisons of several means. Many designs can be easily analyzed with PROC GLM. It accepts data from a general linear model framework and performs multiple testing for, among other designs, factorial designs containing nested effects. Multiplicity corrections available in PROC GLM include the $\alpha$ adjustment of Bonferroni (Fisher, 1935), the procedure of Scheffé (Scheffé, 1953), Tukey's all-pairwise comparisons (Tukey, 1953), and Dunnett's comparisons with a control (Dunnett, 1955).

The first two methods are known to be very conservative and should not be used for a well-designed, preplanned experiment. The latter two methods, however, are restricted to special types of experimental questions and thus are not applicable to more general sets of comparisons. Further, PROC GLM does not compute simultaneous confidence intervals for an arbitrary set of comparisons; they are available only for all-pairwise and many-to-one comparisons. PROC MIXED provides essentially the same methods as PROC GLM while allowing for the incorporation of both random and fixed effects.

PROC MULTTEST provides methods to control the familywise error rate for arbitrary sets of comparisons of several means. It is therefore not restricted to pairwise differences. An example is a trend test where several doses of one treatment are tested. Considering all doses in the test, one analyzes whether the measured effect changes with increasing doses. This cannot be done with pairwise tests. Standard multiplicity adjustments such as the procedures according to Holm (1979) and Hochberg (1988) are included in PROC MULTTEST. In addition, PROC MULTTEST offers more powerful resampling methods, such as bootstrap and permutation techniques, which are computationally intensive. Unlike the other methods, these incorporate correlations and distributional characteristics into the adjustments because the joint empirical distribution of the test statistics is used (Westfall and Young, 1993). The disadvantage of PROC MULTTEST is its restriction to one-way layouts. Thus, the evaluation of typical agricultural designs involving more than one factor is not possible. Raw p values computed by, for example, PROC MIXED can be entered in PROC MULTTEST with the option PDATA, but they are then adjusted using standard methods only (e.g., Holm, 1979; Hochberg, 1988). Thus, the more powerful resampling methods, that is, bootstrap and permutation, are not available for complex designs such as a split-plot design.
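
To make the standard adjustments concrete, the following Python sketch computes Bonferroni, Holm (step-down), and Hochberg (step-up) adjusted p values from a list of raw p values. It is an illustration of the textbook procedures only, not a reproduction of PROC MULTTEST; the raw p values are hypothetical.

```python
def bonferroni_adjust(pvals):
    """Single-step Bonferroni: multiply every raw p value by m."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm_adjust(pvals):
    """Holm (1979) step-down adjustment, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvals[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def hochberg_adjust(pvals):
    """Hochberg (1988) step-up adjustment, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)  # largest p first
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (rank + 1) * pvals[idx])
        running_min = min(running_min, candidate)  # enforce monotonicity
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p values
    print("Bonferroni:", bonferroni_adjust(raw))
    print("Holm:      ", holm_adjust(raw))
    print("Hochberg:  ", hochberg_adjust(raw))
```

None of these adjustments uses the correlation between the test statistics, which is the limitation the text points out.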

In this article, we present two SAS macros, %SimultanTests and %SimultanIntervals, to perform multiple comparisons. They take logical dependencies between the hypotheses and stochastic dependencies between the test statistics into account and are thus by construction very powerful. Due to their general structure, the macros are not restricted to specific designs, and they can be used to analyze many agricultural designs with relatively little effort. These two macros are improved versions of %SimTests and %SimIntervals, which were introduced by Westfall et al. (1999). %SimTests and %SimIntervals simulate the multivariate t distribution. Therefore, the resulting critical values and simultaneous confidence limits are subject to simulation error. This error can be reduced with a large number of simulation runs, but at the cost of increased computation time; see Westfall et al. (1999, p. 89) for details. Given a fixed computation time, %SimultanTests and %SimultanIntervals are more precise than the original macros because they use the exact computation of the multivariate t distribution proposed by Genz and Bretz (2002).

This article presents examples that show the analysis of agricultural designs using the macros, together with a brief introduction to the theoretical background. Advantages over the SAS procedures are pointed out directly in the analysis of the examples. We first introduce an example data set and its design. We then describe the statistical methods and give an introduction to the macros, after which the example is analyzed. A second example is then discussed and analyzed. Next, we compare the original macros introduced by Westfall et al. (1999) with our improved versions in terms of runtime and accuracy. Finally, we give some conclusions.


    EXAMPLE DATA SET
The example data set is part of a greenhouse experiment. The effect of two factors on spinach (Spinacia oleracea L.) was studied: substrate, with two levels, and fertilizer, with four doses and a control (water). Each spinach plant was planted into a flowerpot with one of the substrates, and later, one of the fertilizer doses was applied. The pots were placed one behind the other on a rotating conveyor belt. Six blocks, each containing 10 pots, were created to eliminate a possible nuisance factor. In each block, all levels of the two factors appeared twice. In total, 60 plants were used, and the measurements were the dry weights (g). The data set is provided in Table 1.


Table 1. Dry weight of spinach split by the factors substrate and fertilizer.

 
To simplify the application of the substrates, the experimenter used a split-plot design with substrates as the whole-plot factor and fertilizer as the subplot factor. In the further analysis, the factors dose and substrate and the interaction between them are modeled as fixed effects while the factor block and the interaction block by main factor are random effects.

The experimenter was interested in whether there was a significant difference between the substrates and whether increasing doses showed a significantly increased effect. This data set is easily analyzed with PROC GLM, and the resulting F values give us information on whether factors or interactions show an overall effect. Since we are interested in more detailed comparisons among the various factor levels, the SAS procedures are inadequate to address the questions above. In the following, we show how to represent the experimental questions as contrasts and how to evaluate them correctly by taking the multiplicity into account.


    STATISTICAL METHODS
The general linear model we are working with is

\[ y = X\beta + e, \]

where $y$ is the vector containing the observations, $X$ is the design matrix, $\beta$ is the vector of the fixed effects, and $e$ is the normal random error term with independent components having mean zero and common variance $\sigma^2$. Extensions to mixed models are given later. Our interest is in estimating the fixed-effects vector $\beta$ and in testing hypotheses about it. The latter is performed by contrast tests. A single-contrast test (SCT) is defined as

\[ t_{SCT} = \frac{\sum_{i=1}^{k} c_i \bar{y}_i}{s \sqrt{\sum_{i=1}^{k} c_i^2 / n_i}}, \]

where $\bar{y}_i$ denotes the estimate of the true mean $\mu_i$ in the $i$th group ($i = 1, \ldots, k$) and $s$ is the estimator of the pooled standard deviation. The weight $c_i$ denotes the contrast coefficient, subject to the condition $\sum_i c_i = 0$, and $n_i$ denotes the sample size of group $i$. Note that under the null hypothesis the test statistic $t_{SCT}$ is univariate $t$ distributed with $\nu$ degrees of freedom. Using this test, we could analyze, e.g., whether the lowest dose (Dose 1) of the fertilizer in the example of the previous section produces a significantly higher dry weight compared with the control. Then, we have to set the contrast coefficients to $c_1 = -1$, $c_2 = 1$, and $c_3 = c_4 = c_5 = 0$. This reduces the general test statistic of the SCT to

\[ t_{SCT} = \frac{\bar{y}_2 - \bar{y}_1}{s \sqrt{1/n_1 + 1/n_2}}. \]

If we are interested in more than one comparison, we carry out several contrast tests, one SCT for each comparison of interest. However, without consideration of the multiplicity, this usage would result in an increased false positive rate. This error rate can be controlled with the multiple-contrast test (MCT); see, for example, Mukerjee et al. (1987). The test statistic is simply the maximum of all SCTs of interest:

\[ t_{MCT} = \max\{ t_{SCT_1}, \ldots, t_{SCT_q} \}, \]

where $q$ is the number of SCTs. The vector of the $q$ SCT statistics follows a multivariate $t$ distribution; see Bretz et al. (2001).
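
The two statistics can be sketched in a few lines of Python. The group means, the pooled standard deviation, and the sample sizes below are hypothetical placeholders, not values from Table 1; computing p values would additionally require the (multivariate) t distribution and is omitted here.

```python
import math


def sct_statistic(means, sizes, coeffs, s):
    """Single-contrast t statistic: sum(c_i * ybar_i) / (s * sqrt(sum(c_i^2 / n_i)))."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    numerator = sum(c * m for c, m in zip(coeffs, means))
    denominator = s * math.sqrt(sum(c * c / n for c, n in zip(coeffs, sizes)))
    return numerator / denominator


def mct_statistic(means, sizes, contrasts, s):
    """Multiple-contrast statistic: the maximum of the single-contrast statistics."""
    return max(sct_statistic(means, sizes, c, s) for c in contrasts)


if __name__ == "__main__":
    # Hypothetical summary statistics: control followed by four doses.
    means = [2.0, 2.5, 2.4, 3.0, 3.1]
    sizes = [12] * 5
    s = 1.0
    dose1_vs_control = [-1, 1, 0, 0, 0]
    print("SCT (Dose 1 vs. control):", sct_statistic(means, sizes, dose1_vs_control, s))
    many_to_one = [
        [-1, 1, 0, 0, 0],
        [-1, 0, 1, 0, 0],
        [-1, 0, 0, 1, 0],
        [-1, 0, 0, 0, 1],
    ]
    print("MCT (each dose vs. control):", mct_statistic(means, sizes, many_to_one, s))
```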

An example of an MCT is the well-known many-to-one test of Dunnett (1955). This test compares each treatment group of interest with a specified control group, that is, in our example, each of the four dose groups with the control. The contrast matrix of our example is

\[ C = \begin{pmatrix} -1 & 1 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 & 1 \end{pmatrix}, \]

where the first column corresponds to the control and each row defines one SCT. Here, the MCT contains four SCTs. The MCT tells us globally whether at least one of the comparisons is significant, and it also provides a result for each individual contrast.

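Another common example of an MCT is the all-pair test of Tukey (1953). This test analyzes all possible pairwise comparisons. In our example, with five groups, the MCT would contain 10 SCTs with the following contrast matrix:

\[ C = \begin{pmatrix} -1 & 1 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 & 1 \\ 0 & -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 & 1 \\ 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & -1 & 1 \end{pmatrix} \]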
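
Both contrast matrices follow a simple pattern and can be generated for any number of groups. The Python sketch below is illustrative only; the function names and the sign convention (−1 for the reference or lower-indexed group) are assumptions.

```python
from itertools import combinations


def dunnett_contrasts(k, control=0):
    """Many-to-one contrasts: each non-control group minus the control."""
    rows = []
    for treatment in range(k):
        if treatment == control:
            continue
        row = [0] * k
        row[control] = -1
        row[treatment] = 1
        rows.append(row)
    return rows


def tukey_contrasts(k):
    """All-pairwise contrasts: group j minus group i for every pair i < j."""
    rows = []
    for i, j in combinations(range(k), 2):
        row = [0] * k
        row[i] = -1
        row[j] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    k = 5  # control plus four doses
    for name, matrix in (("Dunnett", dunnett_contrasts(k)), ("Tukey", tukey_contrasts(k))):
        print(f"{name}: {len(matrix)} contrasts")
        for row in matrix:
            print("  ", row)
```

For k groups this yields k − 1 many-to-one contrasts and k(k − 1)/2 pairwise contrasts, that is, 4 and 10 for the five fertilizer levels.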


    THE MACROS
As mentioned above, the macros use MCTs to perform tests with an adjustment for multiplicity. More precisely, multiple tests are carried out, and the multiplicity-adjusted results of all SCTs are shown. %SimultanIntervals computes the exact critical value from the multivariate t distribution and computes the simultaneous confidence intervals for each specified null hypothesis. Further, the adjusted p values of all SCTs are shown. A more powerful testing procedure is used by %SimultanTests, which includes logical constraints between the hypotheses in the adjustment for multiplicity.

The invocation of the macros is straightforward. They use three other supporting macros to specify the data set, the summary statistics, and the contrasts; these macros are %MakeGLMStats, %Contrasts, and %Estimates. To perform the statistical analysis via the %Simultan* (abbreviation for %SimultanTests and %SimultanIntervals) macros, one has to specify the name of the data set, the group variable, and the model with %MakeGLMStats. Further, specific sets of contrasts can be chosen with this macro. However, to analyze any sets of contrasts for arbitrary linear models, %Contrasts has to be used. In some designs, for example those with mixed models, the estimation of the least square means, the degrees of freedom, the variance estimator, and the covariance matrix via %MakeGLMStats is not appropriate; then, the %Estimates macro has to be used.

Finally, the %Simultan* macros themselves have to be invoked. All input parameters of both macros have defaults; thus, none of them has to be specified explicitly. Both macros have the input parameters seed, which is the random-number seed, and side, for lower-tailed, upper-tailed, or two-tailed testing. Further, both macros have input parameters that control the accuracy of the critical value computations; these are maxpts, abseps, and releps. Additionally, %SimultanTests provides, with the input parameter type, the choice between logically constrained and unconstrained tests, and the parameter verbose prints supplementary information. %SimultanIntervals has no further input parameters.


    ANALYSIS OF THE EXAMPLE
In this section, the above-introduced example is further discussed and then analyzed using the %Simultan* macros. First, we will introduce the model of our split-plot design. Afterwards, some problems are discussed, which occur when using the procedures in SAS for a split-plot design containing mixed effects. Then, the contrasts of interest and the invocation of the macros are presented, and finally, the results are shown.

The model of our data set is:

\[ y_{ijh} = \mu + \alpha_i + b_h + \phi_{ih} + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijh}, \]

where $y_{ijh}$ ($i = 1, \ldots, I$, $j = 1, \ldots, J$, $h = 1, \ldots, H$) is the yield of spinach of the $i$th substrate (main factor) and of the $j$th dose (subfactor) in the $h$th block. The general mean is denoted by $\mu$, $\alpha_i$ is the effect of level $i$ of the main factor, $\beta_j$ is the effect of level $j$ of the subfactor, and $b_h$ represents level $h$ of the block factor. Two interactions are included in the model: $(\alpha\beta)_{ij}$ is the interaction between level $i$ of the main factor and level $j$ of the subfactor, and $\phi_{ih}$ is the interaction of the $i$th substrate and the $h$th block, which serves as the error term of the main factor. Finally, $\varepsilon_{ijh}$ is the error term of the subfactor. The fixed-effect parameters are subject to the usual summation constraints.

The mean square error for the main factor is a combination of the mean squares of the block and the interaction block by main factor. PROC MIXED can compute these combinations. However, as discussed earlier, it offers no methods to control the multiplicity for arbitrary sets of comparisons of several means, apart from, for example, single-step procedures. Note that the contrast statement in PROC GLM uses the split-plot mean square error, that is, the mean square error for the subfactor, as the default denominator for all F statistics unless a different error term is specified as an option. However, PROC GLM does not allow a linear combination of mean squares to be used as a denominator of an F statistic in a contrast statement. These and other problems occurring with PROC GLM when analyzing a split-plot design with mixed effects are discussed by Littell et al. (2000, p. 56).

In our example, the experimenter was interested in the difference between the two levels of the main-factor substrate and in the increase of the effect with increased doses. We will start with the analysis of the former factor. Because the main factor contained two levels only, there is no multiplicity problem. Thus, we can analyze the comparison between them with PROC MIXED. This is the syntax:

The resulting two-sided p value is 0.0164, and the 95% confidence interval is (0.219; 1.374); at the a priori chosen level of 5%, the difference between the two substrates is significant. Now, we analyze the subfactor dose. As already explained, we are interested in the efficacy of the fertilizer. Efficacy can be demonstrated by showing that an increased dose produces an increased effect. To show the efficacy of a higher dose relative to a lower one, we use the step contrasts described by Hirotsu (1982). In this approach, the lower doses are successively pooled and compared with the weighted average of the remaining higher doses. For the comparisons of the five doses of the fertilizer, we use the contrast matrix defined in Table 2 to test an increase of the spinach dry weight.


Table 2. Step contrasts for the subfactor fertilizer.
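
The pooling idea behind step contrasts can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of Table 2: the coefficients are scaled as weighted group averages (weights proportional to the sample sizes), whereas the published table may use a different, for example integer, scaling; the sample size of 12 per fertilizer level follows from 60 plants and five levels.

```python
from fractions import Fraction


def step_contrasts(sizes):
    """Step contrasts: for each split point, the weighted average of the higher
    groups minus the weighted average of the pooled lower groups."""
    k = len(sizes)
    rows = []
    for split in range(1, k):
        n_low = sum(sizes[:split])
        n_high = sum(sizes[split:])
        low = [Fraction(-n, n_low) for n in sizes[:split]]
        high = [Fraction(n, n_high) for n in sizes[split:]]
        rows.append(low + high)
    return rows


if __name__ == "__main__":
    sizes = [12, 12, 12, 12, 12]  # control, Dose 1, ..., Dose 4
    for row in step_contrasts(sizes):
        print([str(c) for c in row])
```

With five levels this gives four contrasts; the second one compares the pooled control and Dose 1 with the pooled Doses 2, 3, and 4.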

 
Thus, if we want to test with the %Simultan* macros the increase of the spinach dry weight from Fertilizer 1 to 2, then we analyze whether there is an increasing effect from fertilizer control and 1 to 2, 3, and 4. To analyze the doses and the control of the factor fertilizer with the %Simultan* macros, we have to compute the mean square errors, the least square means, and the degrees of freedom first. As described above, these parameters can be estimated via %MakeGLMStats. However, for the split-plot design with mixed effects, this macro produces incorrect values because neither the hierarchical nor the mixed structure can be expressed in the model statement. Thus, we have to compute the above-mentioned estimates ourselves and enter them into the %Estimates macro. For these calculations, we can use the dose results from PROC MIXED to get the least square means and the mean square error of the subfactor. After specifying the estimates, we create our contrasts of interest with the %Contrasts macro. Here, we can use the contrasts of Table 2.

With this information, the invocation of the macros is straightforward. This is the invocation of %Contrasts:

In the macro environment, the first line denotes the contrasts (in brackets), and the last line shows the labels, again in brackets, of the contrasts. Afterwards, we invoke the %Estimates macro:

The first row in the environment denotes the least square means, and the second row gives the error term of the main factor. The third row presents the covariance matrix: the mean square error is multiplied by an identity matrix whose number of rows and columns equals the number of levels of the main factor, and the product is divided by the sample size per level of the main factor. The last row shows the degrees of freedom of the mean square error of the main factor. At the final step, we invoke %SimultanTests and %SimultanIntervals:

Because we are interested in an increasing effect with higher doses, we set the option side=U for one-sided testing for an increase, where U denotes upper-tailed testing. Table 3 shows the results for %SimultanIntervals. The first row presents the estimated critical value for the selected familywise error rate. It is computed from the multivariate t distribution. Then the table with the results is shown. The first two rows denote the labels of the contrasts and the estimates for the treatment combinations. The estimates are simply the differences between the combinations of the least square means. Afterwards, the standard errors of the differences and the test statistics are printed. Next, the unadjusted and the adjusted p values are shown. Finally, the one-sided $1 - \alpha$ confidence interval is presented. As we can see from the adjusted p values and the confidence intervals, none of the comparisons is significant. The output from %SimultanTests is similar (Table 4). Again, first the labels of the contrasts and the estimates of the linear combinations of least square means are shown. Afterwards, the standard errors are presented. Then, the raw p values and the p values from the closed-testing procedure, either adjusted with the method of Bonferroni or computed from the multivariate t distribution, are provided. Although the adjusted p values are smaller than those from %SimultanIntervals, the results are the same: There is no significant increase of the effect with increasing doses. The last value is a measure of the accuracy of the estimated adjusted p values.
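
How the columns of such an output relate to one another can be sketched in Python. The least square means and the mean square error below are hypothetical placeholders (the actual values come from PROC MIXED and are not reproduced here); only the critical value 2.221 is taken from the caption of Table 3, and the covariance matrix is built as the mean square error times an identity matrix divided by the sample size per level.

```python
import math


def contrast_summary(lsmeans, coeffs, mse, n_per_level, critical_value):
    """Estimate, standard error, t statistic, and one-sided lower confidence
    bound for a single contrast of least square means."""
    k = len(lsmeans)
    # Covariance matrix of the least square means: MSE * I_k / n.
    cov = [[mse / n_per_level if i == j else 0.0 for j in range(k)] for i in range(k)]
    estimate = sum(c * m for c, m in zip(coeffs, lsmeans))
    variance = sum(coeffs[i] * cov[i][j] * coeffs[j] for i in range(k) for j in range(k))
    std_err = math.sqrt(variance)
    return {
        "estimate": estimate,
        "std_err": std_err,
        "t": estimate / std_err,
        "lower": estimate - critical_value * std_err,
    }


if __name__ == "__main__":
    lsmeans = [2.0, 2.3, 2.2, 2.6, 2.7]      # hypothetical: control, Dose 1..4
    coeffs = [-1.0, 0.25, 0.25, 0.25, 0.25]  # control vs. pooled doses
    result = contrast_summary(lsmeans, coeffs, mse=1.2, n_per_level=12,
                              critical_value=2.221)
    for key, value in result.items():
        print(f"{key}: {value:.4f}")
    # Upper-tailed testing: the increase is significant if the lower bound exceeds 0.
    print("significant:", result["lower"] > 0)
```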


Table 3. Output of %SimultanIntervals for the first example (estimated 95% quantile = 2.221).

 

Table 4. Output of %SimultanTests for the first example [unconstrained (free combinations) step-down tests].

 

    SECOND EXAMPLE: ANALYSIS OF INTERACTION CONTRASTS
In this example, an experiment is done to analyze the effect of the application of water and a rotation of crops on the fresh weight (g) of lettuce (Lactuca sativa L.). The study took place in a greenhouse where the lettuce was planted into flowerpots. The first factor, the water application, has two levels: normal and reduced application. A rotation of crops is the second factor. It has the levels 1, 2, and 3, so that 1 denotes that the lettuce was planted into unused substrate, Rotation 2 is already-used substrate (once), and Rotation 3 is twice-used substrate. For each combination of the factors, 12 plants were used. Note that in the second rotation level, there are four missing values, which are assumed to be missing at random. Table 5 shows the data set.


Table 5. Fresh weight of lettuce split by water application and crop rotation.

 
The experimenter was interested in possible interactions between the levels of the two factors. An interaction is present when the effect of a level of one factor depends on the level of the other factor. The model contains the grand mean, the two factors, the interaction term, and the error term:

\[ y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \]

where $y_{ijk}$ ($i = 1, \ldots, I$, $j = 1, \ldots, J$, $k = 1, \ldots, K$) is the fresh weight (g) of the $k$th lettuce plant of the $i$th water application and the $j$th crop rotation. The general mean is denoted by $\mu$, $\alpha_i$ is the effect of the $i$th water application, $\beta_j$ is the effect of the $j$th crop rotation, $(\alpha\beta)_{ij}$ represents the interaction term, and $\varepsilon_{ijk}$ is the error term. Later, we will specify our contrasts of interest in terms of the interaction terms $(\alpha\beta)_{ij}$.

The SAS procedures provide a global test of whether the model contains interactions or not. However, in some experiments, it is essential to know which levels of the factors interact. With the macros %SimultanIntervals and %SimultanTests, such computations can be done with interaction contrasts. An interaction contrast has the form:

\[ (\alpha\beta)_{ij} - (\alpha\beta)_{ij'} - (\alpha\beta)_{i'j} + (\alpha\beta)_{i'j'}, \quad i \neq i', \; j \neq j'. \]

For our experiment, the interaction contrasts are specified as shown in Table 6. The contrast coefficients are denoted by a letter and a number; the former represents the level of the water application (n for normal, r for reduced), and the latter is the level of the crop rotation. Thus, a contrast test with the contrast n1 – n2 – r1 + r2 analyzes whether the difference between the first and the second crop rotation at the normal water application is significantly different from the difference at the reduced water application.


Table 6. Interaction contrasts for the lettuce plant example.
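
Interaction contrasts of this kind can be enumerated mechanically. The Python sketch below builds them for a two-by-three layout with the cell labels used in the text (n1, n2, n3, r1, r2, r3); it is an illustration only, the cell ordering is an assumption, and the extra leading zero that %MakeGLMStats requires for the grand mean is not included.

```python
from itertools import combinations


def interaction_contrasts(levels_a, levels_b):
    """All contrasts of the form a1b1 - a1b2 - a2b1 + a2b2 over the cell means."""
    cells = [f"{a}{b}" for a in levels_a for b in levels_b]
    contrasts = {}
    for a1, a2 in combinations(levels_a, 2):
        for b1, b2 in combinations(levels_b, 2):
            coeffs = dict.fromkeys(cells, 0)
            coeffs[f"{a1}{b1}"] = 1
            coeffs[f"{a1}{b2}"] = -1
            coeffs[f"{a2}{b1}"] = -1
            coeffs[f"{a2}{b2}"] = 1
            label = f"{a1}{b1} - {a1}{b2} - {a2}{b1} + {a2}{b2}"
            contrasts[label] = [coeffs[c] for c in cells]
    return cells, contrasts


if __name__ == "__main__":
    cells, contrasts = interaction_contrasts(["n", "r"], [1, 2, 3])
    print("cells:", cells)
    for label, row in contrasts.items():
        print(f"{label}: {row}")
```

For two water applications and three crop rotations this produces three interaction contrasts, one for each pair of rotation levels.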

 
We will use these contrasts in the macro %Contrasts. However, to invoke %SimultanTests and %SimultanIntervals, we still need the summary statistics. Because of the simplicity of the design, it is not necessary to compute the mean square error, the degrees of freedom, and the least square means with, for example, PROC MIXED, as we did in the first example. We can use %MakeGLMStats, which needs only the name of the data set, the class variables, and the model. This is the invocation of %MakeGLMStats:

After the invocation of %MakeGLMStats, %Contrasts has to be specified:


Because %MakeGLMStats computes the summary statistics and this macro needs a coefficient for the grand mean $\mu$, we have to include a zero in the first position of every specified contrast. Finally, the %Simultan* macros have to be invoked. Notice that we did not make an entry for one- or two-sided testing. Our interest is only whether the difference between two levels of the factor crop rotation depends on the level of the water application. Thus, we are testing two sided, which is the default setting of the macros. Table 7 shows the result of the %SimultanTests macro. At a familywise error rate of 5%, all adjusted p values indicate a significant result. This means that the differences between the levels of the crop rotations depend on the level of the water application. %SimultanIntervals shows similar results (Table 8). Again, the adjusted p values from %SimultanIntervals are equal to or greater than the adjusted ones from %SimultanTests. However, we get the additional information of the 95% simultaneous confidence intervals for the parameters of interest.


Table 7. Output of %SimultanTests for the second example [logically constrained (restricted combinations) step-down tests].

 

Table 8. Output of %SimultanIntervals for the second example (estimated 95% quantile = 2.401).

 

    NUMERICAL COMPARISONS
In this section, the original macros from Westfall et al. (1999) are compared with our enhanced versions. First, the output of each original macro is shown and compared with that of the new version. Afterwards, the differences in runtime for a given accuracy are shown for %SimIntervals vs. %SimultanIntervals and for %SimTests vs. %SimultanTests. Then, the accuracy of the adjusted p values, critical values, and simultaneous confidence limits of the macros, given a runtime of 1 min, are compared. For this purpose, the first example data set is used. We begin with the output of %SimIntervals and compare it with our new version. The invocation of the %SimIntervals macro results in the output shown in Table 9. Compared with the output of %SimultanIntervals (see Table 3), this output differs only in those statistics for which the multivariate t distribution is simulated. These are the confidence limits, the adjusted p values, and the critical value. The estimates, standard errors, test statistics, and raw p values are the same as in %SimultanIntervals. Although the outputs seem to be similar, differences between the macros can be seen by comparing the runtime and the accuracy of the results. For all comparisons, we used a 2.6-GHz PC with 256 MB RAM.


Table 9. Numerical comparisons: output of %SimIntervals (estimated 95% quantile = 2.232).

 
First, the macros %SimIntervals and %SimultanIntervals are compared for their runtime given a specific accuracy of the critical value. To ensure an accuracy of 0.0007 or less, it takes a CPU runtime of 181 s for %SimIntervals, whereas %SimultanIntervals needs a CPU runtime of 19 s. Note, however, that these values are specific for the analyzed example. In other applications, the time differences may be smaller or larger depending on the number of contrasts and their correlations.

Allowing a runtime of 1 min for both macros, one can compare the variability of the results. For the same data set, we computed the adjusted p values, lower confidence bounds, and critical values using 20 different random-number seeds to assess the variability. Afterwards, we calculated the standard deviations of all nine results (four adjusted p values, four lower confidence bounds, and the critical value). While %SimIntervals has a standard deviation of the critical value of 0.00092 and a mean standard deviation of 0.00019 of the adjusted p values and 0.0008 of the lower confidence bound, %SimultanIntervals shows a smaller variability. For this macro, the critical value has a standard deviation of 0.00006 and a mean standard deviation of 0.00012 for the adjusted p values and 0.000059 for the lower confidence bound. The results indicate that %SimultanIntervals achieves a higher accuracy than %SimIntervals for a given runtime.
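
The seed-to-seed variability of a simulated critical value can be illustrated with a small Monte Carlo sketch in Python. It does not reproduce either macro: the setting (four treatments compared with a control, equal group sizes, 40 error degrees of freedom, 2000 simulation runs) is hypothetical and chosen only to show why a simulation-based quantile changes with the random-number seed.

```python
import math
import random
import statistics


def simulated_critical_value(seed, n_sims=2000, n_treat=4, df=40, level=0.95):
    """Monte Carlo estimate of the one-sided critical value of the maximum of
    n_treat many-to-one t statistics (balanced design, common control)."""
    rng = random.Random(seed)
    max_stats = []
    for _ in range(n_sims):
        control = rng.gauss(0.0, 1.0)  # standardized control mean
        # s / sigma, using chi-square(df) = Gamma(shape=df/2, scale=2)
        s = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
        stats = [(rng.gauss(0.0, 1.0) - control) / (s * math.sqrt(2.0))
                 for _ in range(n_treat)]
        max_stats.append(max(stats))
    max_stats.sort()
    return max_stats[math.ceil(level * n_sims) - 1]  # empirical quantile


if __name__ == "__main__":
    values = [simulated_critical_value(seed) for seed in range(1, 21)]  # 20 seeds
    print("mean critical value:", round(statistics.mean(values), 3))
    print("standard deviation: ", round(statistics.stdev(values), 4))
```

Increasing n_sims shrinks the standard deviation across seeds but lengthens the runtime, which is the trade-off that an exact evaluation of the multivariate t distribution avoids.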

Now we compare %SimultanTests with %SimTests, again starting with the output. Because the output of %SimultanTests is given in Table 4, Table 10 shows only the output of %SimTests. Naturally, the estimates, standard errors, raw p values, and Bonferroni-adjusted p values from the closed-testing procedure are the same. Again, the statistics differ where the multivariate t distribution is simulated, namely in the p values that are adjusted for multiplicity by the closed-testing procedure using critical values from the multivariate t distribution. As for %SimultanIntervals, we compare %SimultanTests with its original version in terms of runtime and accuracy. First, we consider the runtime required for a specified accuracy of the adjusted p values. To guarantee an accuracy of 0.0001 or better, %SimultanTests needed less than 1 s to analyze the four contrasts, whereas %SimTests required 219 s.
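A rough calculation shows why a simulation-based p value needs a long runtime at this accuracy. The following Python sketch assumes, purely for illustration, that "accuracy" is read as one binomial standard error of a simulated p value, sqrt(p(1 - p)/n); how the macros define their accuracy parameter is not stated in this section, and the function name and the p value of 0.05 are hypothetical.

```python
import math


def simulations_needed(p, target_se):
    """Number of independent simulations n for which the binomial standard
    error sqrt(p * (1 - p) / n) of a simulated p value is at most target_se."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if target_se <= 0.0:
        raise ValueError("target_se must be positive")
    return math.ceil(p * (1.0 - p) / target_se**2)


# Hypothetical adjusted p value of 0.05 at two accuracy targets.
n_for_1e3 = simulations_needed(0.05, 1e-3)
n_for_1e4 = simulations_needed(0.05, 1e-4)
print(f"target SE 0.001:  about {n_for_1e3:,} simulations")
print(f"target SE 0.0001: about {n_for_1e4:,} simulations")
```

Under this assumption, a tenfold gain in accuracy costs about a hundredfold more simulations, so a target of 0.0001 requires several million draws for a p value near 0.05.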


Table 10. Numerical comparisons: output of %SimTests [unconstrained (free combinations) step-down tests].

 
Now, we analyze the macros for differences in the accuracy of the adjusted p values given a runtime of 1 min or less. Both macros are invoked with 20 different random-number seeds, and the mean standard deviation of the four adjusted p values is calculated. With a runtime of 1 min or less, the mean standard deviation of the adjusted p values of %SimTests is 0.00018. In contrast, the mean standard deviation of the results computed by %SimultanTests is 0.00004 with a runtime of 6 s.


    CONCLUSIONS
The SAS procedures provide solutions for the analysis of many agricultural designs. However, some statistical problems, for example, the analysis of a split-plot design with mixed effects, either cannot be solved with these common procedures or are technically tricky to solve. Many of these problems can be analyzed with the macros %SimultanTests and %SimultanIntervals. The macros are powerful and practical tools. They can be used with relatively little effort and are a valuable addition to the analysis of many designs.

In this article, only a small part of the possibilities of the %Simultan* macros is shown. They can also be used for repeated measurements, multiple endpoints, and other applications. Westfall et al. (1999) present many examples with the related macros %SimTests and %SimIntervals. However, the invocation of these macros is slightly different, so the code has to be adapted when our macros are used. The improvement over the macros of Westfall et al. (1999) lies in the efficient calculation of the critical values from the multivariate t distribution instead of their estimation by simulation. As a result, our improved versions have a substantially lower runtime and a higher accuracy than the original macros.
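The difference between calculating and simulating a critical value can be sketched in Python for one special case. This is not the algorithm used in the %Simultan* macros: it assumes a hypothetical equicorrelated structure (all pairwise correlations equal to rho), for which the multivariate t probability reduces to a two-dimensional integral that is evaluated here with Gauss-Hermite quadrature and the trapezoidal rule. All names, the grid sizes, and the values k = 4 and df = 20 are illustrative assumptions, and the grid is tuned to moderate degrees of freedom.

```python
import math
import numpy as np


def prob_all_below(c, k, df, rho=0.5, n_y=48, n_s=801):
    """P(T_1 <= c, ..., T_k <= c) for a k-variate t vector with df degrees
    of freedom and common pairwise correlation rho (0 <= rho < 1)."""
    # Gauss-Hermite rule for weight exp(-y^2 / 2), rescaled to the N(0, 1) density.
    y, w = np.polynomial.hermite_e.hermegauss(n_y)
    w = w / math.sqrt(2.0 * math.pi)
    # Density of s = sqrt(chi2_df / df) on a grid covering essentially all its mass.
    s = np.linspace(1e-6, 4.0, n_s)
    log_f = (math.log(2.0) + 0.5 * df * math.log(0.5 * df) - math.lgamma(0.5 * df)
             + (df - 1.0) * np.log(s) - 0.5 * df * s**2)
    f_s = np.exp(log_f)
    # Given s and the shared normal factor y, the k components are independent.
    arg = (c * s[:, None] - math.sqrt(rho) * y[None, :]) / math.sqrt(1.0 - rho)
    normal_cdf = 0.5 * (1.0 + np.vectorize(math.erf)(arg / math.sqrt(2.0)))
    integrand = f_s * ((normal_cdf**k) @ w)
    # Trapezoidal rule over the s grid.
    h = s[1] - s[0]
    return float(h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])))


def critical_value(k, df, level=0.95, rho=0.5, tol=1e-6):
    """One-sided critical value c with P(max_i T_i <= c) = level, by bisection."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_all_below(mid, k, df, rho) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


c_single = critical_value(k=1, df=20)  # reduces to the ordinary t quantile
c_four = critical_value(k=4, df=20)    # four equicorrelated contrasts
print(f"k = 1: {c_single:.4f}   k = 4: {c_four:.4f}")
```

Because no random numbers are involved, repeated calls return the same value, so the seed-to-seed variability of a simulated critical value does not arise in this sketch.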

The example data sets and the SAS macros proposed in this paper can be downloaded from www.bioinf.uni-hannover.de/~froemke/art/ag04/index.html (verified 2 June 2004).


    ACKNOWLEDGMENTS
 
Our thanks to Dr. A. Wissemeier, BASF Aktiengesellschaft Fertilizer Development, and M. Wilkening for making their data available as examples. For their technical support, help, and advice, we gratefully acknowledge Prof. L.A. Hothorn and Prof. P.H. Westfall. We are indebted to the associate editor and two referees for many constructive comments and suggestions.


    REFERENCES




