When analyzing multi omics datasets, the search for features that could serve as biomarkers is an important aspect. Because these biomarkers might be used in clinical settings for disease diagnosis etc., it is extremely important to minimize false positives. One possible error source are confounding variables: The biomarker is not directly linked to the disease but influenced by a third (confounding) variable, that in turn is linked to the disease.
The R package metadeconfoundR was developed to address
this issue. It first uses univariate statistics to find associations
between omics features and disease status or metadata. Using nested
linear model comparison post hoc testing, those associations are checked
for confounding effects from other covariates/metadata and a status
label is returned. Instead of assuming The tool is able to handle large
scale multi-omics datasets in a reasonable time, by parallel processing
suitable for high-performance computing clusters. In addition, results
can be summarized by a range of plotting functions.
The main (metadeconfoundR::MetaDeconfound()) analysis is
a two step process:
Figure 1: metadeconfoundR pipeline overview.
First, significant associations between single omics features (like
gut microbial OTUs) and metadata (like disease status, drug
administration, BMI) are identified. Based on the data type of the
respective metadata, either wilcox.test() (for binary),
cor.test() (for continuous numerical) or
kruskal.test() (for neither numerical nor binary) is used.
All three tests are rank-based to minimize assumptions about data
distribution (Fig. 1, left). In addition to collecting p-values
for all computed tests, effect size is measured as Cliff’s Delta and
Spearman’s Rho for binary and continuous data, respectively. Since there
is no suitable effect size metric for categorical data with more than 2
levels, no value is reported here. It is recommended to introduce binary
pseudo-variables for each level of the categorical metadata to partially
circumvent this drawback.
In the second step, all hits are checked for confounding effects and
a status is reported for each feature-metadata combination (Fig.
1, center and right; Fig. 2). A “hit” here is defined as a
feature-metadata association with small enough fdr-corrected p-value and
big enough effect size reported from the first, naive part of the
pipeline. Thresholds for both parameters can be set via
QCutoff and DCutoff when starting the
analysis. Since confounding of signal can only happen with more than one
metadata variable associated to a certain feature, all features with
only one significant metadata association are trivially deconfounded and
get status “No Covariates (OK_nc)”.
The actual confounder detection is done by performing a set of two likelihood ratio tests (LRTs) of nested linear models. For each possible combination of a feature and two of its associated metavariables, three models are fitted to the rank-transformed feature:
LRTs reveal whether inclusion of covariate1 and/or covariate2 significantly improves the performance of the model. Random and/or fixed effects can be added to all models by the user. These additional effects will, however, not be considered in the first naive association testing step of the pipeline.
Importantly, metadeconfoundR will always
rank-transform the features during analysis.
Figure 2: detailed status labelling decision tree.
# CRAN
install.packages("metadeconfoundR")
# github stable version
devtools::install_github("TillBirkner/metadeconfoundR")
# github developmental version
devtools::install_github("TillBirkner/metadeconfoundR@develop")
library(metadeconfoundR)Minimal input consists of two data.frames for feature data (Tab. 1) and metadata (Tab. 2), respectively. Both data.frames must have one row per sample (sample names as rownames) with matching order of sampleIDs and one feature/meta-variable per column. The first column of the metadata data.frame must be binary (i.e be numeric and consist of only 0/1 entries.) Usually this is the control/case variable, but any other binary meta-variable will work as well.
| MS0001 | MS0006 | MS0007 | MS0008 | MS0012 | |
|---|---|---|---|---|---|
| BGI003A | 0 | 42.0 | 4.0 | 153.0 | 126.0 |
| BGI089A | 0 | 155.5 | 34.5 | 360.5 | 116.5 |
| DLF001 | 3 | 67.0 | 6.0 | 443.0 | 40.0 |
| DLF002 | 1 | 58.0 | 18.0 | 175.0 | 181.5 |
| DLF003 | 45 | 43.0 | 0.0 | 66.0 | 74.0 |
| DLF004 | 0 | 41.0 | 1.0 | 206.0 | 37.0 |
| Status | Dataset | Metformin | continuous_dummy | altered_dummy | |
|---|---|---|---|---|---|
| BGI003A | 0 | CHN | 0 | 0.0617863 | 0.0617863 |
| BGI089A | 0 | CHN | 0 | 0.2059746 | 0.2059746 |
| DLF001 | 1 | CHN | 0 | 0.1765568 | 0.1765568 |
| DLF002 | 1 | CHN | 0 | 0.6870228 | 0.6870228 |
| DLF003 | 1 | CHN | 1 | 0.3841037 | 0.3841037 |
| DLF004 | 1 | CHN | 0 | 0.7698414 | 0.7698414 |
MetaDeconfound() has built-in quality checks for
formatting of the input data but it is best to check proper formatting
beforehand.
Ensure that colnames and rownames do not contain any problematic
characters by e.g running them through make.names() and
check for same order of rows in both input data.frames.
data(reduced_feature)
data(metaMatMetformin)
# check correct ordering
all(rownames(metaMatMetformin) == rownames(reduced_feature))
#> [1] TRUE
all(order(rownames(metaMatMetformin)) == order(rownames(reduced_feature)))
#> [1] TRUE
example_output <- MetaDeconfound(
featureMat = reduced_feature,
metaMat = metaMatMetformin,
returnLong = TRUE,
logLevel = "ERROR"
)Random and/or fixed effects can be included in the
modeling process by supplying the randomVar and/or
fixedVar parameter (Fig. 3, right).
RandDataset_output <- MetaDeconfound(
featureMat = reduced_feature,
metaMat = metaMatMetformin,
randomVar = c("Dataset"),
returnLong = TRUE,
logLevel = "ERROR"
)For a full list of input parameters please refer to the help page.
Output can be returned either as a list of wide format data.frames (default) or as a single long format data.frame (Tab. 3). In both cases raw p-values (Ps), multiple testing corrected p-values (Qs), corresponding effect sizes (Ds), and confounding status (status) are reported for each possible combination of a feature and a meta-variable.
While Ps, Qs, and Ds are always solely based on the naive association
testing, the status label reflects effects of included random/fixed
effects. A naively significant feature, metadata association that is
reducible to a random effect can, thus, have a Q-value <
QCutoff and still be labeled as “NS” (not significant).
| feature | metaVariable | Ps | Qs | Ds | status |
|---|---|---|---|---|---|
| MS0001 | Status | 0.0039510 | 0.0103973 | -0.1396941 | AD |
| MS0006 | Status | 0.0000321 | 0.0001147 | 0.2023848 | OK_sd |
| MS0007 | Status | 0.0000001 | 0.0000016 | 0.2425731 | OK_sd |
| MS0008 | Status | 0.7124524 | 0.7740977 | 0.0179492 | NS |
| MS0012 | Status | 0.1847586 | 0.3079311 | 0.0645627 | NS |
Interpretation of the MetaDeconfound() output relies
heavily on the assigned status labels as well as the graphical summary
supplied by the BuildHeatmap() function. The
possible labels also shown in Fig. 2 are: - “NS” (Not
Significant): A non-significant association, or an association fully
reducible to a specified fixed or random effect. - “OK_nc” (no
covariates): A significant association not confounded by other
covariates, because there are no additional covariates associated with
the same feature. - “OK_sd” (strictly deconfounded): A significant
association that remains significant even when accounting for other
covariates associated with the same feature. These associations are
considered robust despite the presence of additional covariates. -
“OK_d” (doubtful): A significant association that remains deconfounded,
but where the estimated coefficient of the covariate in the full model
is weak, in the sense that its confidence interval spans 0. This marks
cases where the deconfounding result is still valid (i.e. a significant
model comparison likelihood ratio test), but the underlying model may be
less stable or less clearly separated from edge cases. - “AD”
(Ambiguously Deconfounded): A significant association for which two
candidate covariates exhibit very similar explanatory power, preventing
the statistically significant disentangling of their confounding
relationship. This situation commonly arises when metadata variables are
strongly correlated or colinear. - “C: …” (Confounded): A significant
association that is no longer supported after accounting for one or more
alternative metadata variables. The observed relationship is likely
driven by the covariate(s) listed here and should not be interpreted as
an independent association.
The prevalence of the different non-NS labels strongly depends on the
complexity and curation level of the supplied metadata. In a first naive
MetaDeconfound() run, including all available metadata,
higher rates of associations labeled “C: …” or “AD” might occur. This
happens when the metadata contain derived/dependent variables (e.g. BMI,
height, and weight) or very similar representations of the same variable
(e.g. height quantile and height in cm).
Here, the summarizing heatmap generated by BuildHeatmap() is useful for identifying structures. While columns with nearly identical label and color distributions often indicate correlated or colinear covariates giving rise to increased numbers of “AD” assignments, recurring confounding relationships between specific metadata variables can reveal redundant measurements, derived variables, or hidden dependencies within the metadata. Users are encouraged to iteratively refine their metadata based on these patterns and their expert knowledge of the included variables, and to rerun MetaDeconfound() to obtain a clearer and more interpretable dependency structure.
Importantly, the presence of correlated metadata variables does not automatically result in large numbers of “AD” or “C: …” assignments. Many correlated variables remain distinguishable in practice because one variable provides a significantly better explanation of the observed feature variation than the other, or because both encode distinct explanatory power. Consequently, “AD” and “C: …” labels should not be interpreted as a general consequence of metadata correlation, but rather as indicators of specific situations in which covariates exhibit substantial overlap in their explanatory power due to dependency, collinearity/redundancy, or where the available data lack sufficient power to disentangle their effects.
MetadeconfoundR can test for associations between to omics spaces,
while controlling for confounders/mediators. Simply supply a second
data.frame of features as mediationMat when running
MetaDeconfound(). Note that only metavariables supplied via
metaMat will be tested as potential confounders/mediators
in this case, and multiple testing p-value correction will be done for
ncol(featureMat)*ncol(mediationMat) by default, while
p-values for the supplied metadata will not be corrected. A different
correction approach can be forced by setting the
adjustLevel argument accordingly.
reduced_featureMedi <- reduced_feature[, 1:40]
mediationMat <- reduced_feature[, 41:50]
example_outputMedi <- MetaDeconfound(
featureMat = reduced_featureMedi,
metaMat = metaMatMetformin,
mediationMat = mediationMat,
returnLong = TRUE,
logLevel = "ERROR"
)When mediation analysis is performed, MetaDeconfound()
will add an additional “groupingVar” column to the final output. This
column states from which input data.frame a reported association
originates: metaMat or mediationMat. When
supplying output from such a MetaDeconfound() run to BuildHeatmap(), this additional “groupingVar”
information is used to split the resulting plot, separating the
mediationMat features from the metadata.
BuildHeatmap(
example_outputMedi,
keepMeta = colnames(metaMatMetformin),
d_range = "full"
) +
theme(strip.background = element_rect(fill = "red"))Figure 3: BuildHeatmap() output for mediation analysis data
Minimal input consists only of an output object from the main
MetaDeconfound() function. This will return in a ggplot2
heatmap with effect size as tile color and black asterisks or grey
circles indicating significant and not confounded or confounded
associations based on corrected p-values. The plot is clustered on both
axes and features as well as meta-variables without any associations
passing effect size and significance cutoffs (default:
q_cutoff = 0.1, d_cutoff = 0.01) are removed
(Fig. 4).
left <- BuildHeatmap(example_output)
right <- BuildHeatmap(RandDataset_output)
grid.arrange(left, right, ncol = 2)Figure 4: default output of the BuildHeatmap() function
For both this default heatmap, as well as the alternative cuneiform
plot (cuneiform = TRUE), a range of customizations are
available. In Fig. 5 meta-variables not passing the effect size
and significance cutoffs are manually kept in the plot
(keepMeta), and the shown range of effect sizes is set from
-1 to +1 (d_range = "full"). For a full list of options,
again, refer to the help page.
BuildHeatmap(
example_output,
cuneiform = TRUE,
keepMeta = colnames(metaMatMetformin),
d_range = "full"
)Figure 5: alternative cuneiform output of the BuildHeatmap() function
The BuildHeatmap() function returns a ggplot2 object.
This makes it possible to perform some easy alterations manually
(Fig. 6).
BuildHeatmap(example_output) +
theme(
legend.position = "none",
axis.text.y = element_text(face = "italic"),
plot.title = element_blank(),
plot.subtitle = element_blank()
)Figure 6: post-plotting ggplot2 alterations
An additional column “groupingVar” can be added to the long format
MetaDeconfound() output, containing a set of group names.
It can be used to separate the supplied metavariables into different
groups. When running Mediation
analysis, this column gets added automatically, so that features
from mediationMat are clearly separated from the
metavariables of metaMat in the BuildHeatmap()
output. This splitting is based on ggplot2::facet_grid(),
which allows for similar customizations through the
ggplot2::theme() function as the main plot. The order of
the groups in the resulting plot can be changed by editing the order of
factor levels of the groupingVar column.
Minimal input consists only of an output object from the main
MetaDeconfound() function. This will return in a list of
ggplot2/ggrpah circos plots with one plot per input feature. All
metavariables significantly associated to the respective feature are
shown as circles with filling color and line thickness depicted as
effect size and confounding status, respectively. The
confounder-confounded relationship between the metavariables are shown
as directional lines connecting them.
Figure 7: BuildConfounderMap() example plot
The function ImportLongPrior() of the metadeconfoundR
package allows for easy integration of a list of feature-metadata
pairings into a new run of the main Metadeconfound()
function.
These pairings must be supplied in the form of a long-format
data.frame with the first column called “feature” containing names of
omics features and the second column called “metaVariable” containing
names of the respective associated metadata variable. When only these
two columns are used as input for ImportLongPrior(), all
given pairings are assumed to be significant.
Additional columns called “Qs” and “status”, as produced by metadeconfoundR can also be supplied.:
| feature | metaVariable | Ps | Qs | Ds | status | |
|---|---|---|---|---|---|---|
| 101 | MS0001 | Metformin | 0.0164764 | 0.0484600 | -0.1553640 | AD |
| 102 | MS0006 | Metformin | 0.0497210 | 0.1243026 | 0.1276733 | NS |
| 103 | MS0007 | Metformin | 0.0000008 | 0.0000059 | 0.3018286 | OK_sd |
| 104 | MS0008 | Metformin | 0.8044179 | 0.8379353 | -0.0161268 | NS |
| 105 | MS0012 | Metformin | 0.1202921 | 0.2227631 | 0.1010798 | NS |
Whenever the “status” column is part of the input, only feature–metadata pairings having a status other than “NS” are assumed to be significant.
To assure only features and metadata present in the current dataset
are imported, the two input data.frames for the
Metadeconfound() function, featureMat and metaMat, also
need to be supplied to ImportLongPrior(). It is therefore
important to have consistent naming of variables between datasets.
minQValues <- ImportLongPrior(longPrior = example_output,
featureMat = reduced_feature,
metaMat = metaMatMetformin)In any case, ImportLongPrior() will return a wide-format
data.frame containing significance values for the supplied pairings. If
a “Qs” column was part of the longPrior input, these multiple-testing
corrected p-values will be used for the output. If no “Qs” was supplied,
artificial “significant” values of -1 will be used to mark significant
associations.
| Status | Dataset | Metformin | continuous_dummy | altered_dummy | |
|---|---|---|---|---|---|
| MS0001 | 0.0103973 | 0 | 0.0484600 | NA | 0.0626943 |
| MS0006 | 0.0001147 | 0 | NA | NA | 0.0031886 |
| MS0007 | 0.0000016 | 0 | 0.0000059 | NA | NA |
| MS0008 | NA | 0 | NA | NA | 0.0008288 |
| MS0012 | NA | 0 | NA | NA | 0.0000325 |
This data.frame can now be used as the minQValues input
parameter of MetaDeconfound().
example_output2 <- MetaDeconfound(featureMat = reduced_feature,
metaMat = metaMatMetformin,
minQValues = minQValues)This way, all significant feature–metadata pairings in the
minQValues data.frame will be treated as potential
confounders in MetaDeconfound().
Note that this example is only to demonstrate the general process of
integrating prior knowledge into a MetaDeconfound()
analysis. Using the output of a MetaDeconfound() run as
minQValues input for a second run with the exact same
features and metadata will not lead to any new insights since the set of
QValues calculated by MetaDeconfound() and the set supplied
using the minQValues parameter are identical in this
case.
Effect sizes returned by MetaDeconfound() are marginal
effect sizes based on naive associations, and so do not take into
account potential confounders. These marginal effect sizes can,
therefore, be inflated and might not reflect the true effect size
attributable exclusively to a metavariable. However, building large
models including all available metavariables in order to compute partial
effect sizes is neither feasible nor robust. Instead, we first run a
normal MetaDeconfound() analysis. Subsequently, we supply
the output from that run to GetPartialEfSizes(). This
function calculates a partial effect size only for those metavariables
labeled not as “NS” (not significant). Random and/or fixed effects used
in the MetaDeconfound() analysis need to be supplied here
again, alongside the original input data.frames.
For each feature, a full linear model with all identified metavariables as predictors is being built alongside a set of smaller models, each missing one of the metavariables. The partial R² is calculated as the difference in the coefficient of determination (“R²”) between the full model and one of the reduced models that excludes the respective metavariable.
These values as well as two different normalizations are reported in three new columns:
Finally, the sign of the marginal effect size, i.e. the direction of the association, is transferred to the partial effect size, and the R² of the full model (“maxRsq”) is appended for each feature.
ex_out_partial <- GetPartialEfSizes(
featureMat = reduced_feature,
metaMat = metaMatMetformin,
metaDeconfOutput = RandDataset_output,
randomVar = c("Dataset")
)Output of GetPartialEfSizes() can be graphically
summarized using BuildHeatmap(), but will show the partial
effect sizes only if the plotPartial parameter is set. Note
that the marginal effect sizes now get replaced by the specified partial
R² values. The default effect size cutoff (d_cutoff) of
0.01 might need to be adjusted to be less strict.
Figure 8: BuildHeatmap() plotting of partial effect sizes calculated by GetPartialEfSizes()
R version 4.6.0 (2026-04-24)
Platform: x86_64-pc-linux-gnu
locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: ggraph(v.2.2.2), kableExtra(v.1.4.0), gridExtra(v.2.3), ggplot2(v.4.0.3), metadeconfoundR(v.1.0.7), detectseparation(v.0.4.0) and rmarkdown(v.2.31)
loaded via a namespace (and not attached): tidyselect(v.1.2.1), viridisLite(v.0.4.3), dplyr(v.1.2.1), farver(v.2.1.2), viridis(v.0.6.5), S7(v.0.2.2), fastmap(v.1.2.0), tweenr(v.2.0.3), digest(v.0.6.39), lifecycle(v.1.0.5), magrittr(v.2.0.5), compiler(v.4.6.0), rlang(v.1.2.0), sass(v.0.4.10), tools(v.4.6.0), igraph(v.2.3.2), yaml(v.2.3.12), knitr(v.1.51), labeling(v.0.4.3), graphlayouts(v.1.2.3), xml2(v.1.5.2), plyr(v.1.8.9), RColorBrewer(v.1.1-3), registry(v.0.5-1), withr(v.3.0.2), purrr(v.1.2.2), sys(v.3.4.3), grid(v.4.6.0), polyclip(v.1.10-7), colorspace(v.2.1-2), scales(v.1.4.0), iterators(v.1.0.14), MASS(v.7.3-65), cli(v.3.6.6), reformulas(v.0.4.4), generics(v.0.1.4), otel(v.0.2.0), rstudioapi(v.0.19.0), reshape2(v.1.4.5), minqa(v.1.2.8), cachem(v.1.1.0), ggforce(v.0.5.0), pander(v.0.6.6), stringr(v.1.6.0), splines(v.4.6.0), vctrs(v.0.7.3), boot(v.1.3-32), Matrix(v.1.7-5), jsonlite(v.2.0.0), slam(v.0.1-55), ggrepel(v.0.9.8), ROI.plugin.lpsolve(v.1.0-2), systemfonts(v.1.3.2), maketools(v.1.3.2), foreach(v.1.5.2), tidyr(v.1.3.2), jquerylib(v.0.1.4), glue(v.1.8.1), ROI(v.1.0-2), nloptr(v.2.2.1), codetools(v.0.2-20), stringi(v.1.8.7), gtable(v.0.3.6), shape(v.1.4.6.1), lmtest(v.0.9-40), lme4(v.2.0-1), tibble(v.3.3.1), logger(v.0.4.2), pillar(v.1.11.1), htmltools(v.0.5.9), circlize(v.0.4.18), R6(v.2.6.1), textshaping(v.1.0.5), Rdpack(v.2.6.6), tidygraph(v.1.3.1), evaluate(v.1.0.5), lpSolveAPI(v.5.5.2.0-17.15), lattice(v.0.22-9), rbibutils(v.2.4.1), backports(v.1.5.1), memoise(v.2.0.1), bslib(v.0.11.0), Rcpp(v.1.1.1-1.1), svglite(v.2.2.2), nlme(v.3.1-169), checkmate(v.2.3.4), xfun(v.0.58), zoo(v.1.8-15), buildtools(v.1.0.0), pkgconfig(v.2.0.3) and GlobalOptions(v.0.1.4)