Informational Odds Ratio: A Useful Measure of Epidemiologic Association in Environment Exposure Studies

Jimmy T. Efird; Suzanne Lea; Amanda Toland; Christopher J. Phillips

doi:10.1177/EHI.S9236

Published: 1 January 2020

Environmental Health Insights, 6(1): (2020). https://doi.org/10.1177/EHI.S9236

Abstract

The informational odds ratio (IOR) measures the post-exposure odds divided by the pre-exposure odds (ie, information gained after knowing exposure status). A desirable property of an adjusted ratio estimate is collapsibility (ie, the combined crude ratio will not change after adjusting for a variable that is not a confounder). Adjusted traditional odds ratios (TORs) are not collapsible. In contrast, Mantel-Haenszel adjusted IORs generally are collapsible. IORs are a useful measure of disease association in environmental case-referent studies, especially when the disease is common in the exposed and/or unexposed groups.

Introduction

A central theme of environmental epidemiology is to quantify the occurrence (eg, incidence, prevalence) and/or outcome (eg, morbidity, mortality) of disease among a population exposed to a putative environmental hazard. The exposed population is then compared with a non-exposed population to determine if exposure is associated with disease. The environmental hazard may be behavioral in nature (eg, cigarette smoking, methamphetamine use, fat in diet), the consequence of modern lifestyle (eg, job stress, inadequate sleep), a by-product of industry (eg, air pollution, groundwater contamination, mercury in fish), or attributable to other sources in one's surroundings (eg, automobile exhaust, pesticide spraying, off-gassing of indoor building materials). Furthermore, the timing of the exposure may be short-lived, long-term, retrospective, prospective, current (ecologic), and/or ongoing. A short-term exposure to a very hazardous agent may have the same impact on health as continuous exposure to a relatively minor hazard. Gene-environment interaction also may play an important role in the underlying disease process.¹

Different epidemiologic measures are available to gauge the association between environmental exposure and disease. The choice of a particular measure depends on its underlying properties and on the context of the study.² A frequently used measure of disease association in environmental exposure studies is the traditional odds ratio (TOR). This measure is defined as the odds for disease given exposure divided by the odds for disease given no exposure (Fig. 1). TORs have the distinct advantage of being invariant to rotation. That is, the disease TOR [ie, (a/b)/(c/d)] is equal to the exposure TOR [ie, (a/c)/(b/d)]. Furthermore, when disease is rare among both the exposed and non-exposed groups, TORs often are used in retrospective analyses as an approximate measure of relative risk (RR) [ie, TOR ≈ RR = (a/e)/(c/f)].³

Figure 1.

Computing TOR and IOR from a 2 x 2 contingency table.

An alternative measure of disease association closely related to the TOR is the informational odds ratio (IOR). The IOR measures the probability for exposure given disease divided by the probability for exposure given no disease (Fig. 1). Using Bayes' theorem, it is easy to see that the IOR is equivalent to the post-exposure odds divided by the pre-exposure odds (Fig. 2).⁴ The IOR resembles the TOR except that the probability terms in the denominator (ie, $P(D)/P(\bar{D})$) are not conditioned on the absence of exposure (ie, $P(D|\bar{E})/P(\bar{D}|\bar{E})$). When defined in the context of a receiver operating characteristic (ROC) curve, the IOR also may be computed by multiplying the TOR by the likelihood ratio for a negative exposure (LR^–) (ie, $P(\bar{E}|D)/P(\bar{E}|\bar{D})$) (Fig. 3). Referring to Figure 1, TOR = (a/b)/(c/d) = 2.58 and LR^– = (c/d)/(g/h) = 0.56. Accordingly, IOR = 2.58 × 0.56 = 1.44. The IOR is interpreted as an outcome measure of information gained after knowing exposure status and may be used in case-referent studies independent of whether the disease is rare or common. When exposure is rare in both disease and non-disease groups, TOR ≈ IOR.

Figure 2.

Equivalence between IOR and the post-exposure odds divided by the pre-exposure odds.
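Because the figure itself is not reproduced here, the equivalence can be sketched as follows. This is a reconstruction from the definitions in the text, not a copy of Figure 2; the cell labels assume that a and b are the exposed diseased and exposed non-diseased counts, and that g and h are the totals of diseased and non-diseased subjects.

```latex
\begin{align*}
\mathrm{IOR}
  &= \frac{P(E \mid D)}{P(E \mid \bar{D})}
   = \frac{P(D \mid E)\,P(E)/P(D)}{P(\bar{D} \mid E)\,P(E)/P(\bar{D})}
   && \text{(Bayes' theorem)} \\
  &= \frac{P(D \mid E)/P(\bar{D} \mid E)}{P(D)/P(\bar{D})}
   = \frac{\text{post-exposure odds}}{\text{pre-exposure odds}} \\
  &= \frac{a/b}{g/h} = \frac{a/g}{b/h}
   && \text{(in terms of cell counts)}
\end{align*}
```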

Figure 3.

Relationship between IOR, TOR and LR^–.
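The relationship IOR = TOR × LR^– can be checked numerically. The following Python sketch is illustrative only: the cell layout (a = exposed and diseased, b = exposed and non-diseased, c = unexposed and diseased, d = unexposed and non-diseased, with column totals g = a + c and h = b + d) is inferred from the formulas in the text, and the counts are hypothetical rather than the Figure 1 data.

```python
def margins(a, b, c, d):
    """Return the column totals (g, h): all diseased and all non-diseased subjects."""
    return a + c, b + d


def traditional_odds_ratio(a, b, c, d):
    """TOR = (a/b)/(c/d): odds of disease if exposed over odds of disease if unexposed."""
    return (a / b) / (c / d)


def negative_likelihood_ratio(a, b, c, d):
    """LR- = (c/d)/(g/h), which equals P(not E | D) / P(not E | not D)."""
    g, h = margins(a, b, c, d)
    return (c / d) / (g / h)


def informational_odds_ratio(a, b, c, d):
    """IOR = (a/g)/(b/h), which equals P(E | D) / P(E | not D)."""
    g, h = margins(a, b, c, d)
    return (a / g) / (b / h)


# Hypothetical counts chosen for easy arithmetic (assumption, not from the paper).
counts = dict(a=40, b=20, c=60, d=80)
tor = traditional_odds_ratio(**counts)
lr_neg = negative_likelihood_ratio(**counts)
ior = informational_odds_ratio(**counts)

print(f"TOR = {tor:.3f}, LR- = {lr_neg:.3f}, IOR = {ior:.3f}")
print(f"TOR x LR- = {tor * lr_neg:.3f}")  # agrees with the IOR computed directly
```

With these counts, exposure is common, so LR^– is well below 1 and the IOR is noticeably smaller than the TOR.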

A desirable property of an adjusted ratio estimate is collapsibility (ie, the combined crude ratio will not change after adjusting for a variable that is not a confounder). TORs are not collapsible.^5,6 Applying standard techniques, we illustrate two approaches for computing a common IOR and 100(1-α)% confidence intervals (CIs) and compare the measures with respect to collapsibility.

Methods

95% robust (Normal theory) CI estimate for IOR

Given a single stratum (j), a large-sample (asymptotically consistent) estimate for var{log(IOR_j)} may be derived using the delta-method (based on a first-order Taylor series) and is seen to equal (1/a - 1/g + 1/b - 1/h) (Fig. 4).^7,8 The latter is equivalent to the robust “sandwich” estimate for var{log(IOR_j)}.^9,10 IORs are ratios of probabilities, and their confidence intervals are computed in a manner analogous to those for risk ratios.¹¹ Applying the central limit theorem (CLT), the computational formula for a 100(1-α)% robust (normal theory) CI estimate for IOR_j is given in Figure 5.¹² The 95% CI_Robust estimate for the crude IOR shown in Figure 1 is given as (1.38-1.50).

Figure 4.

Derivation of var{log(IOR_j)} using the delta-method.

Figure 5.

Computing a robust 100(1-α)% confidence interval estimate for IOR.
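As a minimal sketch of this calculation, the Python function below uses the variance expression stated in the text, (1/a - 1/g + 1/b - 1/h), and assumes the interval takes the usual log-scale normal-theory form exp{log(IOR) ± z·SE}. That form is an assumption, since Figure 5 is not reproduced here. The counts are hypothetical, and the function name `ior_confidence_interval` is illustrative.

```python
import math
from statistics import NormalDist


def ior_confidence_interval(a, b, c, d, alpha=0.05):
    """Return (IOR, lower, upper) for a single 2 x 2 table.

    Assumes rows = exposure, columns = disease, so g = a + c and h = b + d.
    """
    g, h = a + c, b + d
    ior = (a / g) / (b / h)
    # Delta-method variance of log(IOR) as stated in the text.
    var_log_ior = 1 / a - 1 / g + 1 / b - 1 / h
    # Two-sided standard normal quantile, eg about 1.96 for alpha = 0.05.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(var_log_ior)
    return ior, ior * math.exp(-half_width), ior * math.exp(half_width)


# Hypothetical counts (assumption, not the Figure 1 data).
estimate, lower, upper = ior_confidence_interval(a=40, b=20, c=60, d=80)
print(f"IOR = {estimate:.2f}, 95% CI = ({lower:.2f}-{upper:.2f})")
```

Because the interval is built on the logarithmic scale, it is symmetric about the estimate in ratio terms: the product of the two limits equals the square of the IOR.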

Covariate adjusted (pooled) estimate and 100(1-α)% confidence interval (CI) for stratified IOR

A summary estimate or common IOR for a series of 2 X 2 tables may be easily computed by taking the weighted average of stratum-specific IORs, given a fixed-effects model (ie, barring chance, the treatment effect is similar in all strata). Two main weighting techniques for pooling data across strata are traditionally used in practice to compute combined relative-effect estimates.¹³ Below, the methods are presented in the context of estimating a covariate-adjusted IOR and corresponding 100(1-α)% CI.

Woolf method

Assuming IORs are not significantly heterogeneous for k (j = 1 to k) strata and applying Woolf's weighted least squares method, the logarithm of the covariate adjusted (pooled) estimate for a stratified IOR [ie, log(IOR_Woolf)] may be obtained by weighting the logarithm of each stratum-specific IOR_j estimate inversely proportional to its estimated variance (Fig. 6).^14,15 A 100(1-α)% normal theory CI estimate for IOR_Woolf is given in Figure 7.

Figure 6.

Woolf's weighted least squares estimate for the logarithm of IOR.

Figure 7.

Computing a 100(1-α)% confidence interval estimate for IOR_Woolf.
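The inverse-variance weighting described above can be sketched in Python as follows. The sketch assumes the standard Woolf form, in which the pooled log(IOR) is the weighted mean of the stratum-specific log(IOR_j) values with weights 1/var{log(IOR_j)} and the pooled variance is the reciprocal of the summed weights. Figures 6 and 7 are not reproduced here, so this form is an assumption. The strata are hypothetical and the function names are illustrative.

```python
import math
from statistics import NormalDist


def log_ior_and_variance(a, b, c, d):
    """Return log(IOR_j) and its delta-method variance for one stratum."""
    g, h = a + c, b + d
    log_ior = math.log((a / g) / (b / h))
    variance = 1 / a - 1 / g + 1 / b - 1 / h
    return log_ior, variance


def woolf_ior(strata, alpha=0.05):
    """Return (pooled IOR, lower, upper) using inverse-variance weights."""
    weight_sum = 0.0
    weighted_log_sum = 0.0
    for a, b, c, d in strata:
        log_ior, variance = log_ior_and_variance(a, b, c, d)
        weight = 1 / variance
        weight_sum += weight
        weighted_log_sum += weight * log_ior
    pooled_log = weighted_log_sum / weight_sum
    standard_error = math.sqrt(1 / weight_sum)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * standard_error),
        math.exp(pooled_log + z * standard_error),
    )


# Two hypothetical strata as (a, b, c, d); stratum-specific IORs are 2.0 and 2.4.
strata = [(30, 15, 20, 35), (24, 10, 76, 90)]
pooled, lower, upper = woolf_ior(strata)
print(f"IOR_Woolf = {pooled:.3f}, 95% CI = ({lower:.3f}-{upper:.3f})")
```

Since the weights are applied on the logarithmic scale, the pooled value is a weighted geometric mean of the stratum-specific IORs and need not equal the crude IOR of the collapsed table.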

Mantel-Haenszel method

The IOR also may be expressed as the cross-frequency for the a^th cell (ie, a*h/i) of a 2 X 2 table divided by the cross-frequency for the b^th cell (ie, b*g/i). Given a series of 2 X 2 tables (strata) indexed by (j = 1 to k), the weighted Mantel-Haenszel estimate for the common IOR is then computed by separately summing the cross-frequency terms in the numerator and denominator of the IOR estimate over each of the (k) strata (Fig. 8).¹⁶ Here again, we have assumed that the IORs are not significantly heterogeneous for k (j = 1 to k) strata. The term (ŵ) defined in Figure 4, which denotes the inverse var{log(IOR)} estimate, also may be written as a function of the cross-frequencies for the a^th and b^th cells (Fig. 9). A pooled estimate for (ŵ) is then computed by separately summing the terms in the numerator and denominator over each of the (k) strata (j = 1 to k) (Fig. 10).¹⁷ Applying the central limit theorem, a robust 100(1-α)% normal theory CI estimate for IOR_MH is given in Figure 11. Note that the IOR_MH estimate will always be bounded by the minimum and maximum of the stratum-specific IOR estimates, since it represents a weighted average of the individual strata. If the disease ratios g_j/h_j are constant across strata, the Mantel-Haenszel estimate for IOR will equal the combined crude IOR.¹⁷ When b_j/h_j are not constant across strata, the variance estimate of the combined crude IOR will not be consistent, and the Mantel-Haenszel estimate is generally recommended as the measure of association in this case.¹⁷

Figure 8.

Mantel-Haenszel estimate for a common IOR.

Figure 9.

Expressing ŵ in terms of the cross-frequencies for the a^th and b^th cells.

Figure 10.

Computing a pooled version for ŵ over (k) strata (j = 1 to k).

Figure 11.

Computing a robust 100(1-α)% confidence interval estimate for IOR_MH.
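The point estimate described above can be illustrated with a short Python sketch. It sums the cross-frequency terms a_j*h_j/i_j and b_j*g_j/i_j over strata, as stated in the text, and compares the result with the crude IOR of the collapsed table. The confidence interval of Figure 11 is not attempted, because the pooled ŵ formula is not reproduced here. All counts are hypothetical, and the function names are illustrative.

```python
def mantel_haenszel_ior(strata):
    """Common IOR: sum of a*h/i over strata divided by sum of b*g/i over strata."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        g, h = a + c, b + d  # diseased and non-diseased totals in this stratum
        i = g + h            # stratum size
        numerator += a * h / i
        denominator += b * g / i
    return numerator / denominator


def crude_ior(strata):
    """IOR of the single table obtained by summing cell counts over strata."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    g, h = a + c, b + d
    return (a / g) / (b / h)


# Hypothetical strata as (a, b, c, d).
# In `balanced`, g_j/h_j = 1 in both strata; in `unbalanced`, it is 1 and 0.6.
balanced = [(30, 15, 20, 35), (24, 10, 76, 90)]
unbalanced = [(30, 15, 20, 35), (24, 10, 36, 90)]

for label, strata in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(f"{label}: IOR_MH = {mantel_haenszel_ior(strata):.3f}, "
          f"crude IOR = {crude_ior(strata):.3f}")
```

In the balanced example the adjusted and crude estimates coincide, which matches the condition on g_j/h_j stated in the text; in the unbalanced example they differ, while IOR_MH remains between the two stratum-specific IORs.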

Results

Comparison of the Woolf and Mantel-Haenszel methods with respect to collapsibility

A confounding variable is an extraneous variable that masks the true influence of a putative causal variable on the effect (outcome) being studied. By definition, it must be related to both the cause and effect variables.³ Consider the association between “crystal meth” (methamphetamine) use and cardiomyopathy in young patients.¹⁸ Crystal meth users tend to be cigarette smokers and cigarette smoking potentially is associated with cardiomyopathy.¹⁹ Failing to adjust for cigarette smoking may confound the association between crystal meth use and cardiomyopathy. An estimate is collapsible if the combined crude estimate does not change after adjusting for a variable that is not a confounder. It is well known that adjusted TORs are not collapsible.^5,6

Consider the stratified data shown in Figures 12 and 13 corresponding to the collapsed data presented in Figure 1. If Exposure (E) represents the causal factor and Death (D) the effect, then Sex (S) is not a confounding variable since it is not related to Death on either the TOR or IOR scale. However, if Sex (S) represents the causal factor and Death (D) the effect, then Exposure (E) is a confounder because it is related to both Death and Sex. Referring to Figure 14, we see that neither TOR_Woolf nor TOR_MH is collapsible with respect to sex because both adjusted estimates differ from the combined crude TOR. However, referring to Figure 15, we see that the adjusted Mantel-Haenszel IOR estimate for this example is collapsible with respect to sex. On the other hand, the adjusted Woolf IOR estimate is not collapsible with respect to sex. The IOR_Woolf estimate is based on a non-linear (logarithmic) weighted estimate of stratum-specific IORs and accordingly the combined crude IOR does not necessarily remain constant after adjusting for a variable that is not a confounder. In our simple example, we see that the results obtained by the Mantel-Haenszel method are identical to those obtained from a Poisson regression model using robust variance estimation.

Figure 12.

Contingency tables corresponding to data in Figure 1 stratified by sex.

Figure 13.

Contingency tables corresponding to data in Figure 1 stratified by exposure.

Figure 14.

Crude and adjusted TOR estimates corresponding to data in Figures 1, 12 and 13.

Figure 15.

Crude and adjusted IOR estimates corresponding to data in Figures 1, 12 and 13.

Exact confidence intervals for IOR

When sample sizes are small, an exact unconditional CI estimate may be computed for the IOR. However, due to the discrete nature of the problem, the resulting CI estimates tend to be very wide. Consider the case when exposure is rare in both disease and non-disease groups (ie, TOR ≈ IOR). In the example shown in Figure 16, we see that the standard exact CI estimate for the IOR²⁰ is considerably wider than the standard exact CI estimate for the TOR²¹ even though one would expect the coverage to be nearly equal. Furthermore, as illustrated in Figure 17, the standard exact CI for the IOR estimate is neither asymptotically efficient nor consistent. A pseudo “continuity-adjusted” exact confidence interval based on the Farrington-Manning score statistic provides better coverage in some cases; however, the resulting CIs may be too narrow when one or more cell sizes are very small, as illustrated in Figure 16 (IOR = 1.0, CI_MF = 0.0594-11.1435).²² By analogy, the same small-sample concerns apply to RR estimates. Methods for improving the nominal coverage (ie, at least 1-α) of unconditional exact marginal effect estimates have been suggested in the literature.²³

Figure 16.

Comparison of exact confidence interval procedures for TOR and IOR.

Figure 17.

Comparison of asymptotic and exact confidence interval procedures for IOR.

Discussion

A desirable property of an adjusted ratio estimate is that the combined crude ratio will not change after adjusting for a variable that is not a confounder (ie, collapsibility). It is well known in the literature that adjusted TORs are not collapsible. This is illustrated in Figure 14, where both the TOR_Woolf and TOR_MH sex-adjusted estimates differed from the combined crude TOR, even though sex is not a confounding variable. In prospective (cohort) studies, the association between a putative exposure and disease, adjusting for other important model variables, may be computed using the generally collapsible Mantel-Haenszel RR estimate. When disease is rare among both the exposed and non-exposed groups in a case-referent study, the TOR and RR estimates will be approximately equal. However, the outcome of interest in some retrospective environmental exposure studies may be fairly common, and the TOR estimate will not equal the combined crude estimate after adjusting for a variable that is not a confounder.

The IOR is a useful measure of association in environmental case-referent studies, especially when the outcome under consideration is known to occur frequently. Similar to RRs, Mantel-Haenszel adjusted IORs are generally collapsible (criteria for simple and strict collapsibility are discussed in the literature^6,24,25). The IOR measures how much more (or less) likely patients with the disease are to have a particular exposure than those without disease (ie, the post-exposure odds divided by the pre-exposure odds).¹¹ Similar to other relative effect estimates, IORs are logarithmic, meaning that a value of 1.0 corresponds to no association between exposure and disease, while an IOR greater/less than unity indicates a positive/negative association with disease.

Author Contributions

Conceived and designed the experiments: JTE. Analysed the data: JTE. Wrote the first draft of the manuscript: JTE. Contributed to the writing of the manuscript: JTE, SL, AT, CJP. Agree with manuscript results and conclusions: JTE, SL, AT, CJP. Jointly developed the structure and arguments for the paper: JTE, SL, AT, CJP. Made critical revisions and approved final version: JTE, SL, AT, CJP. All authors reviewed and approved of the final manuscript.

Disclosures and Ethics

As a requirement of publication author(s) have provided to the publisher signed confirmation of compliance with legal and ethical obligations including but not limited to the following: authorship and contributorship, conflicts of interest, privacy and confidentiality and (where applicable) protection of human and animal research subjects. The authors have read and confirmed their agreement with the ICMJE authorship and conflict of interest criteria. The authors have also confirmed that this article is unique and not under consideration or published in any other publication, and that they have permission from rights holders to reproduce any copyrighted material. Any disclosures are made in this section. The external blind peer reviewers report no conflicts of interest.

Acknowledgments

Katherine T. Jones (ECU) offered valuable editorial assistance during the writing of this manuscript.

References

1. Efird J. An efficient gatekeeper algorithm for detecting GxE. Cancer Inform. 2010; 12: 115–20.

2. Behrens T., Pigeot I., Ahrens W. Epidemiologische und statistische methoden der risikoabschätzung. Bundesgesundheitsbl. 2009; 52: 1151–60.

3. Wassertheil-Smoller S. Biostatistics and Epidemiology. New York, NY: Springer-Verlag, 1990.

4. Katz M. A probability graph describing the predictive value of a highly sensitive diagnostic test. N Engl J Med. 1944; 291: 1115–6.

5. Cummings P. The relative merits of risk ratios and odds ratios. Arch Pediatr Adoles Med. 2009; 163: 438–45.

6. Wermuth N. Parametric collapsibility and the lack of moderating effects in contingency tables with a dichotomous response variable. J R Statist Soc B. 1987; 49: 353–64.

7. Oehlert G. A note on the delta method. Amer Stat. 1992; 46: 27–9.

8. Katz D., Baptista J., Azen S., Pike M. Obtaining confidence intervals for risk ratio in cohort studies. Biometrics. 1978; 34: 469–74.

9. Kauermann G., Carroll R. A note on the efficiency of sandwich covariance matrix estimation. JASA. 2001; 96: 1387–96.

10. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004; 159: 702–6.

11. Deeks J., Altman D. Diagnostic tests 4: likelihood ratios. Br Med J. 2004; 329: 168–9.

12. Le Cam L. The central limit theorem around 1935. Statist Sci. 1986; 1: 78–91.

13. Morris J., Gardner M. Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates. Br Med J. 1988; 296: 1313–6.

14. Woolf B. On estimating the relation between blood group and disease. Ann Hum Genet. 1955; 19: 251–3.

15. Grizzle J., Starmer F., Koch G. Analysis of categorical data by linear models. Biometrics. 1969; 25: 489–504.

16. Woodward M. Epidemiology: Study Design and Data Analysis. 2nd ed. Boca Raton: Chapman & Hall/CRC, 2005.

17. Greenland S., Robins J. Estimation of a common effects parameter from sparse follow-up data. Biometrics. 1985; 41: 55–68.

18. Yeo K., Wijetunga M., Ito H. The association of methamphetamine use and cardiomyopathy in young patients. Am J Med. 2007; 120: 165–71.

19. Hartz A., Anderson A., Brooks H., Manley J., Parent G., Barboriak J. The association of smoking with cardiomyopathy. N Engl J Med. 1984; 311: 1201–6.

20. Santner T., Snell M. Small-sample confidence intervals for p₁-p₂ and p₁/p₂ in 2 X 2 contingency tables. JASA. 1980; 75: 386–94.

21. Thomas D. Algorithm AS-36. Exact confidence limits for the odds ratio in a 2 X 2 table. Appl Stat. 1971; 20: 105–10.

22. Chan I., Zhang Z. Test-based exact confidence intervals for the difference of two binomial proportions. Biometrics. 1999; 55: 1202–9.

23. Mukhopadhyay P. Exact tests and exact confidence intervals for the ratio of two binomial proportions. Ph.D. Dissertation, NC State University, 2003.

24. Whittemore A. Collapsibility of multidimensional contingency tables. J R Statist Soc B. 1978; 40: 328–40.

25. Geng Z. Collapsibility of relative risk in contingency tables with a response variable. J R Statist Soc B. 1992; 54: 585–93.

© 2012 SAGE Publications. This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 3.0 License (http://www.creativecommons.org/licenses/by-nc/3.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
