Abstract
In crossed, two-factor studies with one observation per factor-level combination, interaction effects between factors can be hard to detect and can make the choice of a suitable statistical model difficult. This article describes hiddenf, an R package that enables users to quantify and characterize a certain form of interaction in two-factor layouts. When effects of one factor (a) fall into two groups depending on the level of another factor, and (b) are constant within these groups, the interaction pattern is deemed "hidden additivity" because within groups, the effects of the two factors are additive, while between groups the factors are allowed to interact. The hiddenf software can be used to estimate, test, and report an appropriate factorial effects model corresponding to hidden additivity, which is intermediate between the unavailable full factorial model and the overly-simplistic additive model. Further, the software also conducts five statistical tests for interaction proposed between 1949 and 2014. A collection of 17 datasets is used for illustration.The crossed, two-factor layout with one observation per factor-level combination is a simple and widely used plan. These designs arise in the context of (a) completely randomized two-factor experiments, (b) randomized complete block experiments that block on a source of nuisance variability, and (c) observational studies with two factors. If the effects of the two factors do not interact, then this design permits inference for both factors simultaneously. The following model is commonly used for data collected using these designs:
\[\label{eq:RCBD} y_{ij}=\mu + \tau_i + \beta_j + \epsilon_{ij}, (\#eq:eqRCBD)\]
where \(i=1,\ldots,c\) and \(j=1,\ldots,r\) are indices for levels of
the two factors \(C\) and \(R\), \(\mu\) represents the overall mean of the
outcome \(y_{ij}\), and the \(\tau_i\) and \(\beta_j\) terms quantify the effects of
factors \(C\) and \(R\). The errors are frequently assumed
independent and identically normally distributed, with constant error
variance \(\sigma^2\). Since data from
these designs are easy to display using two-way tables, frequently the
levels of factors \(R\) and \(C\) are called "rows" and "columns,"
respectively. In this report (and the hiddenf
manual), Factor \(R\) is referred to as
the "row" or "grouped" factor, while factor \(C\) is called the "column" factor.
Model (@ref(eq:eqRCBD)) assumes that the effect of the row factor on the
response \(y_{ij}\) does not depend on
the level of the column factor, and vice versa. In statistical parlance
this is known as the assumption of "additivity." This assumption is
sometimes made out of necessity as the alternative of including the
interaction term from the full factorial effects model precludes
statistical inference. After computing the interaction mean square, zero
degrees of freedom remain with which to estimate the error variance,
\(\sigma^2\). Data arising from
(@ref(eq:eqRCBD)) are shown in Figure 1 panel
A.
If the effect of factor \(C\) varies
across levels of the factor \(R\), then
there is statistical interaction. In this case (@ref(eq:eqRCBD)) is
misspecified. Graphical assessment of additivity is commonly made using
an interaction plot. Figure 1 panel B is an
interaction plot for the cjejuni.mtx data set included in
the hiddenf package (Qiu
2013).
cjejuni.mtx data, which visually exhibit
non-additivity.Since the data in Figure 1 panel B do not
exhibit parallelism, there is graphical evidence to suggest that
(@ref(eq:eqRCBD)) may not be a suitable model. This is problematic
because inferences conducted under (@ref(eq:eqRCBD)) will be incorrect
if (@ref(eq:eqRCBD)) fails to describe the true nature of the variables
under investigation. Fortunately, many methods have been devised to
assess non-additivity in these designs, including those in (Tukey 1949), (Anscombe
and Tukey 1963), (Mandel 1961),
(Mandel 1971), (Johnson and Graybill 1972), (Hirotsu 1982), (Kharrati-Kopaei and Sadooghi-Alvandi 2007),
(Tusell 1990), (Boik 1993b), (Franck,
Nielsen, and Osborne 2013), and (Malik,
Mohring, and Piepho 2014). (Alin and Kurt
2006) provide a review of methods available up to 2006. We return
to the analysis of these and other data in subsequent sections of the
paper.
Despite the wide usage of unreplicated two-way designs, computational
tools to assess non-additivity are lacking, particularly for more recent
methods. Currently, the additivityTests
package (Simecek and Simeckova 2012)
provides test statistics, critical thresholds, and binary reject/fail to
reject decisions for tests proposed in (Tukey
1949), (Mandel 1961), (Boik 1993b), (Tusell
1990), (Johnson and Graybill 1972),
and (Simecek and Simeckova 2012). The
additivityTests package does not produce p-values.
This paper describes the R package hiddenf, which makes
available several tools for the analysis of non-additivity in
unreplicated two-way layouts. This package contributes to available
computational resources by (a) providing statistical and graphical
diagnostic tools within the context of hidden additivity that enable the
user to go beyond overall tests of additivity towards the inferential
goal of characterizing and quantifying the magnitude of interaction, and
(b) providing p-value computations for five methods to detect
non-additivity. Supported tests include those proposed in (Tukey 1949), (Mandel
1961), (Kharrati-Kopaei and
Sadooghi-Alvandi 2007), (Franck, Nielsen,
and Osborne 2013), and (Malik, Mohring,
and Piepho 2014). The latter three are newly available in an
open-source repository via the hiddenf package, which is
available from the Comprehensive R Archive Network (CRAN).
Hidden additivity occurs when the levels of factor \(R\) fall into a smaller number of groups,
such that within these groups the effect of factor \(C\) is constant across levels of factor
\(R\), but there is \(C \times group\) interaction (Franck, Nielsen, and Osborne 2013). Group
membership of levels of \(R\) can be
regarded as latent, unobservable random variables. Inspection of Figure
1 panel B provides an example. If one group is
formed from the two lines (processing plants) that reach their peak in
year 3, and another from the two lines that reach their peak in year 4,
then the year and plant effects are roughly additive within these
groups.
The HiddenF() function accepts an \(r \times c\) data matrix as input and
returns an object of the HiddenF class. Objects of this class
store the p-value from the all-configurations maximum interaction F
(ACMIF) test (Franck, Nielsen, and Osborne
2013) for hidden additivity, the number of configurations under
consideration, the configuration that achieves the maximum interaction F
score (or maximum hidden additivity), and the data as a list. Several
generic functions, including print(), anova(),
summary() and plot(), can be applied to
objects of the HiddenF class to produce output useful for the
quantification and characterization of interaction. By default, the
HiddenF() function considers the row factor as the variable
to group. Submitting a transposed matrix executes the grouping on the
column variable. Deciding which factor to group is subjective, and can
be approached using domain knowledge or a trial-and-error approach.
Figure 2 summarizes the functionality of the
hiddenf package.
HiddenF() function, which
returns an object of the HiddenF class. The user can then
characterize hidden additivity and obtain p-values for tests of
non-additivity.The remainder of the paper is organized as follows. The C. jejuni data are first described as a relevant example. The following section reviews the testing procedure for hidden additivity. Functionality to detect hidden additivity using hiddenf is then detailed. Four other supported methods to detect non-additivity are then reviewed. Seventeen data sets are described and explored using the methods supported by hiddenf. The article concludes with a summary.
We now return to the data from Figure 1 panel
B. The vertical axis in this plot is the proportion of disease-resistant
bacteria samples of the C. jejuni strain taken from turkey
processing facilities. These data were collected over a five year period
across four processing plants in North Carolina. Each line corresponds
to a plant, and the year labels correspond to 2008-2012. The matrix
cjejuni.mtx contains the fractions of bacteria that are
classified as the strain C. jejuni.
> library(hiddenf)
> data(cjejuni.mtx)
> cjejuni.mtx
[,1] [,2] [,3] [,4] [,5]
[1,] 0.16 0.08 0.44 0.06 0.10
[2,] 0.21 0.10 0.16 0.55 0.25
[3,] 0.16 0.08 0.56 0.26 0.26
[4,] 0.07 0.16 0.21 0.42 0.04
Hidden additivity is a useful concept for the analysis of many types
of data because the structure is easy to visualize and interpret. The
ACMIF test (Franck, Nielsen, and Osborne
2013) to detect hidden additivity assigns classical significance
testing-based p-values. A combined testing and plotting approach allows
the researcher to interpret the way in which column effects change
across groups of the rows.
The ACMIF test works by (1) considering each placement of factor \(R\) levels into two nonempty groups, (2)
testing factor \(C \times group\)
interaction, and (3) considering the test that provides the most
evidence of interaction (and greatest additivity of \(C\) and \(R\) within groups) after correcting for
multiplicity using a Bonferroni adjustment. Simulation suggests that the
Bonferroni correction is not overly conservative for \(r \leq 7\) despite the magnitude of the
correction factor \(2^{r-1}-1\),
because the Bonferroni adjusted thresholds are remarkably similar to
simulated critical thresholds for a variety of \(c\) and \(r\) values (Franck,
Nielsen, and Osborne 2013). If \(g\) is an index for the latent group
membership such that \(g=1,2\), then
the factorial effects model corresponding to the hidden additivity
structure is given by
\[\label{eq:groupmod} y_{ij}^g = \mu + \tau_i + \gamma_g + (\tau\gamma)_{ig} + \beta_{j(g)} + \epsilon_{ijg} \\ (\#eq:eqgroupmod)\]
where \(\mu\) is the overall mean, \(\tau_i\) and \(\gamma_g\) are the main effects of factor \(C\) and the grouping variable, \((\tau\gamma)_{ig}\) represents factor \(C \times group\) interaction, and \(\beta_{j(g)}\) is the effect of factor \(R\), nested within group. The ACMIF test is based on the largest \(F\)-ratio corresponding to \(H_0: (\tau\gamma)_{ig} = 0\) among all assignments of \(R\) factor levels into two non-empty groups.
Objects in the HiddenF class can be used to characterize
hidden additivity further with application of the generic
plot(), print(), summary(), and
anova() functions. The following code produces an enhanced
interaction plot to help visualize hidden additivity (Figure 3):
> cjejuni.out <-HiddenF(cjejuni.mtx)
> plot(cjejuni.out, main="Hidden Additivity of time effects across plants",
+ rfactor="plant", cfactor="year", colorvec=c("blue", "red"), lwd=3, legendx=TRUE)
The enhanced interaction plot in Figure 3 displays levels of factor \(C\) (year) on the horizontal axis, levels
of factor \(R\) (plant) using lines
that are visually distinguished by line type, and the outcome values on
the vertical axis. Color is used to display which levels of factor \(R\) are assigned to which groups based on
the maximal interaction \(F\) statistic
among all possible configurations and (@ref(eq:eqgroupmod)). Note that
the grouping colors appear whether or not the ACMIF p-value (test for
\(C \times group\)) to detect hidden
additivity is significant. The default colors are black and red for the
two groups, but they can be customized by supplying an argument called
colorvec which is a vector of length two giving color
names. Another optional argument, legendx, may be set to
TRUE in order to provide a legend whose location will be
determined according to where on the plot the user clicks.
Because of the ellipsis(…), arguments such as main (for a
title) or lwd (for specifying line widths) can be passed to
matplot or legend. The arguments
lty, type, ylab, and
xlab are utilized by the plot.HiddenF()
function to produce the enhanced interaction plot and therefore are not
available for customization by the user.
The print() function returns results from the ACMIF
test.
> print(cjejuni.out)
The ACMIF test for the hidden additivity form of interaction
F=8.965 p-value =0.03309 df=4,8
(Bonferroni-adjusted for all 7 possible configurations)
This output can also be obtained by typing the name of an object of
class HiddenF into the console. The \(p\)-value above has been adjusted for
multiplicity using the Bonferroni multiplier of \(2^{4-1}-1=7\). Additionally, the
method argument in the print.HiddenF()
function supports all of the methods supported by the hiddenf
package discussed later. To see which of these seven possible
configurations of rows in two non-empty groups leads to the greatest
additivity within groups, use the generic summary()
function, which returns information useful for analysis based on
(@ref(eq:eqgroupmod)):
> summary(cjejuni.out)
Number of configurations: 7
Minimum adjusted pvalue: 0.03308869
Rows in group 1: 1 3
Rows in group 2: 2 4
Column means for grp 1: 0.16 0.08 0.5 0.16 0.18
Column means for grp 2: 0.14 0.13 0.185 0.485 0.145
The output above includes the number of configurations under consideration, the smallest Bonferroni adjusted p-value among these, the identity of the rows that fall into each group and the means across levels of \(C\) for both groups. Each of these items is returned in a list (not shown). To produce an ANOVA table:
> anova(cjejuni.out)
The ACMIF test for the hidden additivity form of interaction
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
group 1 0.00000 0.000005 0.0009 0.977350
col 4 0.18753 0.046882 8.0450 0.006606
row 2 0.03673 0.018365 3.1514 0.097874
group:col 4 0.20897 0.052243 8.9648 0.004727
Residuals 8 0.04662 0.005828
C.Total 19 0.47986
(Pvalues in ANOVA table are NOT corrected for multiplicity.)
For these data, the interaction between time (col) and
plant group (group) accounts for more variability (\(43\%\)) than any other term in the hidden
additivity model. This partial coefficient of determination is computed
by dividing the group \(\times\) column
sums of squares by the total sum of squares displayed above. The anova()
function can also accommodate other tests for non-additivity by
specifying the method argument as "KKSA", "Mandel" or
"Tukey" (see Other Tests for Non-additivity section).
Levels of factor \(C\) can be grouped
instead by transposing the input data matrix before applying the
HiddenF() function. The output below suggests no
statistical evidence for hidden additivity in the levels of factor \(C\).
> cjejuni.trans.mtx<-t(cjejuni.mtx)
> cjejuni.trans.out<-HiddenF(cjejuni.trans.mtx)
> print(cjejuni.trans.out)
The ACMIF test for the hidden additivity form of interaction
F=3.63 p-value =0.8671 df=3,9
(Bonferroni-adjusted for all 15 possible configurations)
Note that support for more than 20 rows is not yet included due to the exponential increase in computational demand as the number of grouped levels increases. Factors with more levels than this will be accommodated in future versions of the software.
The center option in the plot command allows the user to
graph the hidden additivity plot with data centered at the row means to
better see the concordance of rows with regard to column effects. This
is a way of de-noising the plot. Figure 4
demonstrates this feature using data from (Graybill 1954). These data will be further
analyzed later. Inspection of the right panel of Figure 4 suggests that the third genotype appears to
give inferior yields for rows in the red group (relative to the
performance of other genotypes in these blocks), but does better for
rows colored black. To produce this plot:
> data(Graybill.mtx)
> Graybill.out <- HiddenF(Graybill.mtx)
> par(mfrow=c(1,2))
> plot(Graybill.out)
> plot(Graybill.out, center=TRUE, main="Hidden Additivity plot\ncenter=TRUE")
The ACMIF approach (Franck, Nielsen, and Osborne 2013) to detect hidden additivity has been described above. The subsections below include code examples and briefly review the other four supported approaches to detect non-additivity in unreplicated studies.
The one degree of freedom test (Tukey 1949) is the earliest and most widely-known approach. The following model for non-additivity corresponds to Tukey’s procedure:
\[\label{eq:tukmod} y_{ij}=\mu + \tau_i + \beta_j + \nu\tau_i\beta_j + \epsilon_{ij}. (\#eq:eqtukmod)\]
A hypothesis test for this approach is based on the null hypothesis
\(H_0:\nu=0\). Under the null,
(@ref(eq:eqtukmod)) reduces to (@ref(eq:eqRCBD)). The one degree of
freedom test arises by fitting the squared fitted values from the
additive model as a model term, then assessing the partial F-test for
statistical significance corresponding to this term.
The TukeyPvalue() function accepts an object of class
HiddenF and returns a list that contains the p-value for the
one degree of freedom test and an lm object that contains the
model that includes the squared predictions from (@ref(eq:eqRCBD)) as
described above. The method argument in the
anova.HiddenF() function can be used to produce an ANOVA
table for this test. Since the p-value is large, the output below does
not provide statistical evidence for non-additivity of the form
suggested in (@ref(eq:eqtukmod)).
> TukeyPvalue(cjejuni.out)
$pvalue
[1] 0.7077391
$singledf.out
Call:
lm(formula = y ~ rows + cols + psq, data = hfobj$tall)
Coefficients:
(Intercept) rows2 rows3 rows4 cols2 cols3
0.095454 0.029036 0.030905 0.005445 -0.026989 0.043692
cols4 cols5 psq
0.044568 0.006369 1.569601
An extension of the one degree of freedom test is based on the following model (Mandel 1961):
\[\label{eq:mandelmod} y_{ij}=\mu + \tau_i + \beta_j + \theta_i\beta_j + \epsilon_{ij}, (\#eq:eqmandelmod)\]
where \(\epsilon_{ij}\) are i.i.d.
\(N(0,\sigma^2)\). The test for
rows-linear non-additivity is based on the null hypothesis \(H_0: \theta_i = 0\) for \(i=1,\ldots, t\). Under this null,
(@ref(eq:eqmandelmod)) reduces to (@ref(eq:eqRCBD)).
By setting \(\mu_i = \mu + \tau_i\) and
\(\theta_i = (\phi_i-1)\), the response
variable \(y_{ij}\) in
(@ref(eq:eqmandelmod)) can be represented as an intercept \(\mu_i\) and slope \(\phi_i\) that depend on effect \(\beta_j\). Letting \(i\) and \(j\) index "rows" and "columns" respectively
leads to the phrase "rows-linear" model, which was established later
(see Alin and Kurt 2006).
The ANOVA table sum of squares associated with the \(\theta_i\) term is
\[SS_{rowlin}=\sum_i( \frac{\sum_j y_{ij}(\bar{y}_{.j}-\bar{y}_{..})}{\sum_j(\bar{y}_{.j}-\bar{y}_{..})^2} -1 )^2 \sum_j(\bar{y}_{.j}-\bar{y}_{..})^2.\]
The residual sum of squares has the following form:
\[SS_{res}=\sum_i \sum_j [(y_{ij}-\bar{y}_{i.}) - \frac{\sum_j y_{ij}(\bar{y}_{.j}-\bar{y}_{..})}{\sum_j(\bar{y}_{.j}-\bar{y}_{..})^2}(\bar{y}_{.j}-\bar{y}_{..}) ]^2.\]
The test statistic is:
\[F_{rowlin} = \frac{SS_{rowlin}/(a-1)}{SS_{res}/(a-1)(b-2)}.\]
The statistic \(F_{rowlin}\) has an
\(F\) distribution with \((a-1)\) numerator and \((a-1)(b-2)\) denominator degrees of freedom
under the null hypothesis. Construction of a columns-linear test is
accomplished by defining a HiddenF object on the transposed
data matrix and calling the MandelPvalue() in a similar
fashion.
The MandelPvalue() function accepts an object of class
HiddenF and returns a list that includes a p-value, sums of
squares, F ratio, and degrees of freedom for the rows-linear test. The
output below fails to detect interaction using Mandel’s rows-linear
test:
> MandelPvalue(cjejuni.out)
$pvalue
[1] 0.9458807
$SumSq
SSRow SSCol SSMandel SSE SSTot
0.036735000 0.187530000 0.009849526 0.245740474 0.479855000
$Fratio
[1] 0.120243
$df
[1] 3 9
The error mean square (MSE) subtable comparison approach (Kharrati-Kopaei and Sadooghi-Alvandi 2007) forms an \(F\) statistic that is the ratio of residual sums of squares that arise from rows being placed into two groups, with each group containing at least two rows,
\[F_{KKSA} = \frac{(r_2-1)SS_{E1}}{(r_1-1)SS_{E2}}.\]
Under the assumptions of additivity and homogeneous variance, \(F_{KKSA}\) has an \(F\) distribution with \((r_1-1)(c-1)\) numerator and \((r_2-1)(c-1)\) denominator degrees of
freedom under the null hypothesis, no matter which rows are placed into
which group. This approach suggests non-additivity when error mean
square values within the subtables are discrepant.
As with the ACMIF procedure, grouping is subjective and is accomplished
by choosing one of the two factors upon which to form the groups, and
then placing that factor’s levels into two groups. Different groupings
result in different p-values. Groups are typically chosen based on a
priori knowledge, inspection of an interaction plot, or by
screening several candidate groupings and applying multiplicity
adjustment to the resulting p-values. The hiddenf package
adopts the authors’ (Kharrati-Kopaei and
Sadooghi-Alvandi 2007) suggestion of assessing the \(2^{r-1}-r-1\) possible groupings for factor
\(R\) and Bonferroni adjusting the
resulting p-values to achieve an \(\alpha\) level test.
The KKSAPvalue() function accepts an object of class
HiddenF and returns a list that contains the maximal \(F\) statistic among configurations, a
Bonferroni adjusted p-value, a length \(r\) vector that indicates the groupings of
levels of factor \(R\) that correspond
to the calculated p-value, and numerator and denominator degrees of
freedom for the test statistic. Data may be transposed in order to form
subtables according to factor \(C\). At
significance level \(\alpha=.05\) the
output below does not reject the null hypothesis of additivity according
to the MSE subtable comparison approach when grouping levels of factor
\(R\).
> t(KKSAPvalue(cjejuni.out))
fmax pvalue grp.vector NumDf DenomDf
[1,] 1.748821 1 Numeric,4 4 4
A recent approach (Malik, Mohring, and Piepho
2014) considers non-additivity elicited due to cell values which
are remote relative to the additive model. Not all cells are assumed to
contribute to the interaction pattern. The method detects non-additivity
by exploiting the large residuals that arise under this structure.
Execution of the residual clustering method proceeds as follows. First,
residuals from (@ref(eq:eqRCBD)) \(\hat{r}_{ij(1)} = y_{ij} -
\hat{y}_{ij(1)}\) are placed into 3 groups using k-means
clustering (MacQueen 1967) in one
dimension. Then, a two degrees of freedom cluster effect is incorporated
into the additive model:
\[y_{ij}=\mu + \tau_i + \beta_j + \kappa_{k(ij)} + \epsilon_{ij}\]
where \(\kappa_{k(ij)}\) represents
the effect of the \(kth\) cluster,
\(k=1,2,3\). The subscript \(k(ij)\) indicates that the \(ijth\) residual is assigned to exactly one
of the \(k\) clusters for \(i=1,\ldots,c\) and \(j=1,\ldots,r\). Since the cluster term is
suggested by the data, the partial \(F\) test corresponding to the cluster term
does not exhibit a central \(F\) distribution under the null. Rather,
the null distribution for this statistic may be approximated via Monte
Carlo simulation for the purpose of conducting statistical inference.
Critical thresholds for a variety of \(c\) and \(r\) are available (Malik, Mohring, and Piepho 2014). The
hiddenf package computes p-values and critical thresholds for
user-supplied \(c\) and \(r\) using the functions
MalikPvalue and MalikTab, respectively. By
default \(N=500\) Monte Carlo
replicates are used to approximate the p-value, which guarantees the
standard error of the p-value \(SE(p-value)
\leq 0.023\).
The MalikPvalue function accepts an object of class
HiddenF and returns a Monte Carlo p-value, observed test
statistic, and Monte Carlo sample size used in the computation. The
k-means approach uses the method of (Hartigan and
Wong 1979) with 100 starting values per Monte Carlo iteration.
The output below suggests that the C. jejuni data do not
exhibit non-additivity of the form suggested by this method. This
approach is invariant to data transposition.
> t(MalikPvalue(cjejuni.out, N=10000))
(Pvalue from Malik's test estimated with N=10000 Monte Carlo datasets)
pvalue Tc N
[1,] 0.8498 40.97983 10000
The MalikTab function accepts as input a user-specified \(r\) and \(c\), and returns a Monte Carlo estimate of the 90%, 95%, and 99% for the null distribution of the test statistic. A variety of such thresholds are provided in (Malik, Mohring, and Piepho 2014). The default number of Monte Carlo replicates is \(N=1,000\). An example is given below.
> Mtab.24.6<-MalikTab(r=24, c=6, N=10000)
> ls(Mtab.24.6)
[1] "q" "Tcsim"
> Mtab.24.6\$q
r c 99% 95% 90%
24.0000 6.0000 445.5725 404.3399 384.0591
additivityPvalues functionThe additivityPvalues function accepts an object of
class HiddenF and returns p-values for each of the supported
methods.
> t(additivityPvalues(cjejuni.out))
(Pvalue from Malik's test estimated with N=500 Monte Carlo datasets)
Malik.pvalue Mandel.pvalue Tukey.pvalue KKSA.pvalue ACMIF.pvalue
[1,] 0.834 0.9459 0.7077 1 0.0331
Table 1 summarizes an investigation of 17 data sets using the methods
supported by hiddenf. These examples are taken from
bioinformatic, agricultural, industrial, and medical settings, and are
all publicly available. To our knowledge, this is the largest collection
and description of data that has been used to study non-additivity in
the literature to date. Table 1 includes columns for references, labels,
and p-values. Note that ACMIF c, columns-linear, and KKSA c are obtained
by submitting the transposed data matrix to HiddenF(). A
brief description of each data set follows.
The "Liming" data arises from an experiment that explores the utility of
seven types of blast furnace slags as liming material in agriculture on
three types of soil (Carter, Collier, and Davis
1951). The data were analyzed in the context of non-additivity in
(Johnson and Graybill 1972) and (Kharrati-Kopaei and Sadooghi-Alvandi 2007). The
outcome variable is yields of corn in bushels per acre.
The "Penicillin" data set (Davies and Goldsmith
1972) assesses variability between six samples of penicillin on
24 plates. The outcome variable is the diameter of the zone of
inhibition. The data were analyzed for non-additivity in (Malik, Mohring, and Piepho 2014).
The "Osmotic" data set (Kharrati-Kopaei and
Sadooghi-Alvandi 2007) explores five varieties of safflower grown
in solutions with six osmotic potentials. Safflowers are an important
crop due to their oil which can be used for cooking and their flowers
which can be used as a substitute for saffron. The outcome variable is
average root weight.
The "Fertilizer" example originally appeared in (Ostle 1963) and was re-analyzed to detect
non-additivity in (Kharrati-Kopaei and
Sadooghi-Alvandi 2007). This example explores the ratio of dry to
wet wheat in four blocks for four fertilizers.
The "Bottles" example was originally analyzed in (E. R. Ott and Snee 1973) and also explored in
(Boik 1993a) and (Kharrati-Kopaei and Sadooghi-Alvandi 2007). The
study assesses the performance of a six-headed machine on five
occasions.
The "Grain Yields" data measures the yield of a grain crop in bushels
per acre for five levels of fertilizer on five blocks. The data come
from (Ostle 1963) and are re-analyzed in
(Kharrati-Kopaei and Sadooghi-Alvandi
2007).
The "Absorbance" data (Mandel 1991)
measures absorbancy of wood pulp of nine polysachharide concentrations
from seven different laboratories. The data are also analyzed in (Alin and Kurt 2006).
The "Permeability" example examines permeability of three sheets of
building material on nine different days. The data appear in (Hald 1952) and (Giesbrecht and Gumpertz 2004).
The "Wool" data (Lentner and Bishop 1993)
examines four cleaning processes for wool from five different batches.
The outcome is losses in weight in \(mg\) of the sample after cleaning.
The "Red Blood Cells" data measures the number of red blood cells
counted by five doctors in ten counting chambers. The data appear in
(Biggs and Macmillan 1948) and were also
analyzed in (Boik 1993b).
The "Tukey" data set is an illustrative example that includes three rows
and four columns from (Tukey 1949).
The "Ethyl Alcohol" data (Osborne, McKelvy, and
Bearce 1913) measures the density of aqueous solutions of ethyl
alcohol at six concentrations and seven temperatures. These data were
also analyzed in (Mandel 1971).
The "Insecticide" data (R. L. Ott and Longnecker
2001) comes from a complete block design that measures the impact
of four plots and three insecticides on the number of string bean
seedlings that emerge.
The "Rubber" data (Mandel 1961) contains
data on stress measurements in \(Kg/cm^2\) on seven types of natural rubber
vulcanizates from 11 laboratories.
The "C. jejuni" data measures the fraction of bacteria found on
disease-resistant turkeys (Qiu 2013).
These data were collected over a five year period across four turkey
plants in North Carolina. These data are displayed in Figures 1 and 3.
The "Wheat" data (Graybill 1954) measures
wheat yields in bushels per acre in four varieties in 13 locations.
These data are plotted in Figure 4.
The eight Copy number variation data sets "CNV1-CNV8" compare
discrepancies in the copy number signal between normal and tumor tissue
samples from a comparative genomic hybridization array. Each data set
corresponds to a separate location in the genome that is tested with the
assay. Six dogs each had two tissue samples (one normal and one tumor)
upon which the assay was conducted. The eight specific sets were
selected from 5899 sets that were analyzed in a previous study and
because they showed evidence of hidden additivity (Franck, Nielsen, and Osborne 2013). The hidden
additivity exhibited in these copy number responses might suggest the
existence of multiple tumor sub-types for lymphoma in dogs. The full
data for this example may be accessed here: http://www.sciencedirect.com/science/article/pii/S0167947313001618.
The results of non-additivity tests using methods supported by the hiddenf package are reported in Table 1. The data come from 17 sources, and there are 24 total sets with the copy number variation contributing eight individual sets. This collection is not intended as a representative sample of the larger class of all unreplicated two-factor data, since many sets were included due to their apparent non-additivity. This exercise is intended to show the breadth of applications for which the study of non-additivity is important. To facilitate discussion, the standard p-value less than \(\alpha=0.05\) criterion is used to declare statistical significance without further multiplicity correction, although the reader is invited to interpret the raw p-values however they like. The ACMIF test for hidden additivity was significant for 16 of 24 sets when grouping levels of factor \(R\). The test was significant for eight of 16 sets when grouping levels of factor \(C\). The Tukey test was significant for eight of 24 sets. The rows-linear and columns-linear tests showed significant non-additivity for seven and six sets (out of 16 possible), respectively. The KKSA test grouping levels of \(R\) and \(C\) revealed non-additivity for 13 of 22 and five of 13 sets, respectively. The Malik approach yielded significant non-additivity for nine of 24 sets. Since all tests assume different restrictions on the form of interaction, no single test is optimal for every conceivable pattern. Hence, differential performance among methods is expected for different types of data. Each test excels at detecting different patterns of non-additivity.
This paper describes the main functionality of the hiddenf package. hiddenf provides descriptive and inferential tools to visualize and characterize hidden additivity. Further, hiddenf includes the ability to compute p-values for five tests for non-additivity. The package is illustrated using seventeen data sets spanning studies in industrial applications, agriculture, and the medical sciences. Non-additivity is evident in many of these data sets, motivating the importance of statistical interaction in unreplicated studies across a variety of scientific domains. The hiddenf package contributes to existing software resources specifically by making three recent tests for non-additivity available in an open-source repository for the first time (in addition to two historical methods) and by providing visualization and descriptive tools to further explore hidden additivity.
| Citation | Label | r | c | ACMIF r | ACMIF c | Tukey | row-linear | col-linear | KKSA r | KKSA c | Malik |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (Carter, Collier, and Davis 1951) | Liming | 7 | 3 | 0.1993 | 0.0110 | 0.9106 | 0.9045 | 0.8604 | 0.0068 | \(\ddagger\) | 0.2471 |
| (Davies and Goldsmith 1972) | Penicillin | 6 | 24 | 0.0387 | \(\diamond\) | 0.5382 | 0.9374 | \(\diamond\) | 1.000 | \(\diamond\) | 0.2231 |
| (Kharrati-Kopaei and Sadooghi-Alvandi 2007) | Osmotic Bars | 6 | 5 | 0.2075 | 0.0388 | 0.4450 | 0.5300 | 0.1848 | 0.1859 | 0.4155 | 0.7453 |
| (Ostle 1963) | Fertilizer | 4 | 4 | 0.0107 | 0.0043 | 0.0012 | 0.0146 | 0.0077 | 0.0339 | 0.0493 | 0.4373 |
| (E. R. Ott and Snee 1973) | Bottles | 6 | 5 | 0.0001 | 0.0031 | 0.0003 | 0.0025 | 0.0006 | 0.0178 | 0.4925 | 0.8839 |
| (Ostle 1963) | Grain yields | 5 | 5 | 0.6556 | 0.1485 | 0.1139 | 0.0103 | 0.4425 | 0.0528 | 1.0000 | 0.0403 |
| (Mandel 1991) | Absorbance | 7 | 9 | 0.0001 | <.0001 | <.0001 | <.0001 | <.0001 | <.0001 | <.0001 | 0.9972 |
| (Hald 1952) | Permeability | 9 | 3 | 0.4066 | 0.2629 | 0.7844 | 0.7761 | 0.9544 | 0.3908 | \(\ddagger\) | 0.8674 |
| (Lentner and Bishop 1993) | Wool | 4 | 5 | 0.2355 | 0.4087 | 0.0359 | 0.0519 | 0.0410 | 0.7155 | 0.1998 | 0.4182 |
| (Biggs and Macmillan 1948) | Red blood cells | 5 | 10 | 0.3291 | 0.0825 | 0.2322 | 0.6790 | 0.4903 | 0.0668 | 1.0000 | 0.8702 |
| (Tukey 1949) | Tukey | 3 | 4 | 0.2213 | 0.5773 | 0.8566 | 0.8743 | 0.9298 | \(\ddagger\) | 0.5806 | 0.4769 |
| (Osborne, McKelvy, and Bearce 1913) | Ethyl Alcohol | 6 | 7 | <.0001 | <.0001 | <.0001 | <.0001 | <.0001 | 0.0002 | 0.0469 | 0.9954 |
| (R. L. Ott and Longnecker 2001) | Insecticide | 3 | 4 | 0.4038 | 0.5569 | 0.6875 | 0.3846 | 0.8125 | \(\ddagger\) | 0.3000 | 0.7603 |
| (Mandel 1961) | Rubber | 11 | 7 | <.0001 | <.0001 | 0.3465 | <.0001 | 0.5403 | 0.0021 | 0.0058 | 0.9787 |
| (Qiu 2013) | C. jejuni | 4 | 5 | 0.0331 | 0.8671 | 0.7077 | 0.9459 | 0.8842 | 1.0000 | 0.2862 | 0.8443 |
| (Graybill 1954) | Wheat | 13 | 4 | <.0001 | 0.0070 | <.0001 | 0.0003 | <.0001 | 0.0320 | 0.0143 | 0.1335 |
| (Franck, Nielsen, and Osborne 2013) | CNV1 | 6 | 2 | 0.0007 | \(\ddagger\) | 0.0138 | \(\ddagger\) | \(\ddagger\) | 0.0310 | \(\ddagger\) | 0.0009 |
| CNV2 | 6 | 2 | 0.0005 | \(\ddagger\) | 0.0659 | \(\ddagger\) | \(\ddagger\) | 0.0010 | \(\ddagger\) | 0.0164 | |
| CNV3 | 6 | 2 | <.0001 | \(\ddagger\) | 0.4674 | \(\ddagger\) | \(\ddagger\) | 0.0017 | \(\ddagger\) | 0.0018 | |
| CNV4 | 6 | 2 | 0.0005 | \(\ddagger\) | 0.1659 | \(\ddagger\) | \(\ddagger\) | 0.1249 | \(\ddagger\) | 0.0182 | |
| CNV5 | 6 | 2 | 0.0005 | \(\ddagger\) | 0.2388 | \(\ddagger\) | \(\ddagger\) | 0.0259 | \(\ddagger\) | 0.0173 | |
| CNV6 | 6 | 2 | 0.0007 | \(\ddagger\) | 0.6187 | \(\ddagger\) | \(\ddagger\) | 0.2799 | \(\ddagger\) | 0.0224 | |
| CNV7 | 6 | 2 | 0.0006 | \(\ddagger\) | 0.0277 | \(\ddagger\) | \(\ddagger\) | 0.0939 | \(\ddagger\) | 0.0193 | |
| CNV8 | 6 | 2 | 0.0007 | \(\ddagger\) | 0.1391 | \(\ddagger\) | \(\ddagger\) | 0.0340 | \(\ddagger\) | 0.0216 |
We’d like to acknowledge Zahra Shenavari of Shiraz University for helpful comments regarding the error mean square subtable comparison approach.