Directed Acyclic Graphs

Abstract

A directed acyclic graph (DAG) can be thought of as a kind of flowchart that visualizes an entire causal etiological network, linking causes and effects. In epidemiology, the terms causal graph, causal diagram, and DAG are used synonymously (Greenland et al. 1999). DAGs are considered useful for embedding causality in a formal causal framework (Hernán and Robins 2006; Robins 2001; Hernán et al. 2004). Probability theory has a somewhat different understanding of DAGs, which is discussed later in the chapter. This chapter aims to demonstrate how DAGs can help formalize the search for answers to different research questions in epidemiology.
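
As a rough illustration of the "flowchart linking causes and effects" idea, the sketch below stores a small causal diagram as a mapping from each cause to its direct effects and checks that it contains no directed cycle. The variable names (`age`, `smoking`, `lung_cancer`) and the function `topological_order` are hypothetical and chosen only for this example; they are not taken from the chapter.

```python
from collections import deque


def topological_order(edges):
    """Return the nodes in a cause-before-effect order, or None if a directed cycle exists.

    `edges` maps each node to a list of its direct effects (children).
    Uses Kahn's algorithm: repeatedly remove nodes that have no remaining parents.
    """
    # Collect every node, including those that appear only as effects.
    nodes = set(edges)
    for children in edges.values():
        nodes.update(children)

    # Count incoming arrows for each node.
    indegree = {node: 0 for node in nodes}
    for children in edges.values():
        for child in children:
            indegree[child] += 1

    # Start with nodes that have no causes in the diagram.
    queue = deque(sorted(node for node in nodes if indegree[node] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in edges.get(node, ()):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # If some nodes were never freed, they lie on a directed cycle.
    return order if len(order) == len(nodes) else None


# Hypothetical diagram: age affects both the exposure and the outcome.
causal_diagram = {
    "age": ["smoking", "lung_cancer"],
    "smoking": ["lung_cancer"],
}

order = topological_order(causal_diagram)
is_dag = order is not None
```

A graph in which two variables cause each other, such as `{"a": ["b"], "b": ["a"]}`, makes the function return `None`, which is exactly the situation the "acyclic" requirement rules out.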

References

  • Andersson SA, Madigan D, Perlman MD (1997) A characterization of Markov equivalence classes for acyclic digraphs. Ann Stat 25:505–541

  • Berkson J (1946) Limitations of the application of fourfold tables to hospital data. Biom Bull 2:47–53

  • Bishop CM (2007) Pattern recognition and machine learning. Springer, New York

  • Borsuk ME (2008) Bayesian networks. In: Jørgensen SE, Fath B (eds) Encyclopedia of ecology. Elsevier, Burlington, pp 307–317

  • Bøttcher SG, Dethlefsen C (2011) Deal: learning Bayesian networks with mixed variables. http://CRAN.R-project.org/package=deal. R package version 1.2–34

  • Breitling L (2010) dagR: a suite of R functions for directed acyclic graphs. Epidemiology 21:586–587

  • Chickering D, Meek C (2002) Finding optimal Bayesian networks. In: Darwiche A, Friedman N (eds) Proceedings of the eighteenth annual conference on uncertainty in artificial intelligence (UAI-02). Morgan Kaufmann, San Francisco, pp 94–102

  • Chickering DM (1996) Learning Bayesian networks is NP-complete. In: Fisher D, Lenz HJ (eds) Learning from data: artificial intelligence and statistics V. Lecture notes in statistics, vol 112. Springer, New York, pp 121–130

  • Chickering DM, Heckerman D, Meek C (2004) Large-sample learning of Bayesian networks is NP-hard. J Mach Learn Res 5:1287–1330

  • Cobb BR, Rumí R, Salmerón A (2007) Bayesian network models with discrete and continuous variables. In: Lucas P, Gámez JA, Salmerón A (eds) Advances in probabilistic graphical models. Studies in fuzziness and soft computing, vol 213. Springer, Berlin, pp 81–102

  • Cooper GF, Herskovits E (1992) A Bayesian method for the induction of probabilistic networks from data. Mach Learn 9:309–347

  • Cowell RG, Dawid AP, Lauritzen SL, Spiegelhalter DJ (1999) Probabilistic networks and expert systems. Information science and statistics. Springer, New York

  • Dagum P, Luby M (1993) Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artif Intell 60:141–154

  • Daly R, Shen Q, Aitken S (2011) Learning Bayesian networks: approaches and issues. Knowl Eng Rev 26:99–157

  • Darwiche A (2009) Modeling and reasoning with Bayesian networks. Cambridge University Press, Cambridge

  • Darwiche A (2010) Bayesian networks. Commun ACM 53:80–90

  • Dawid AP (2010a) Beware of the DAG! JMLR workshop Conf Proc 6:59–86

  • Dawid AP (2010b) Seeing and doing: the Pearlian synthesis. In: Dechter R, Geffner H, Halpern JY (eds) Heuristics, probability and causality: a tribute to Judea Pearl. College Publications, London, pp 309–325

  • Dethlefsen C, Højsgaard S (2005) A common platform for graphical models in R: the gRbase package. J Stat Softw 14:1–12

  • Didelez V, Sheehan NA (2007) Mendelian randomisation: why epidemiology needs a formal language for causality. In: Russo F, Williamson J (eds) Causality and probability in the sciences. Texts in philosophy, vol 5. College Publications, London, pp 263–292

  • Fast A, Hay M, Jensen D (2008) Improving accuracy of constraint-based structure learning. Technical Report 08-48, Computer Science Department, University of Massachusetts Amherst

  • Friedman N (1997) Learning belief networks in the presence of missing values and hidden variables. In: Fisher DH (ed) Proceedings of the fourteenth international conference on machine learning (ICML ’97). Morgan Kaufmann, San Francisco, pp 125–133

  • Friedman N (2004) Inferring cellular networks using probabilistic graphical models. Science 303:799–805

  • Friedman N, Goldszmidt M, Wyner A (1999a) Data analysis with Bayesian networks: a bootstrap approach. In: Prade H, Laskey K (eds) Proceedings of the fifteenth annual conference on uncertainty in artificial intelligence (UAI-99). Morgan Kaufmann, San Francisco, pp 196–205

  • Friedman N, Goldszmidt M, Wyner A (1999b) On the application of the bootstrap for computing confidence measures on features of induced Bayesian networks. In: Heckerman D, Whittaker J (eds) Proceedings of the seventh international workshop on artificial intelligence and statistics. Morgan Kaufmann, San Francisco, pp 197–202

  • Geiger D, Heckerman D, King H, Meek C (2001) Stratified exponential families: graphical models and model selection. Ann Stat 29:505–529

  • Geneletti S, Mason A, Best N (2011) Adjusting for selection effects in epidemiologic studies: why sensitivity analysis is the only “solution”. Epidemiology 22:36–39

  • Getoor L, Rhee JT, Koller D, Small P (2004) Understanding tuberculosis epidemiology using structured statistical models. Artif Intell Med 30:233–256

  • Gilks WR, Richardson T, Spiegelhalter D (1996) Markov Chain Monte Carlo in practice. Chapman & Hall, Boca Raton

  • Glover F (1989) Tabu search – part I. ORSA J Comput 1:190–206

  • Glover F (1990) Tabu search – part II. ORSA J Comput 2:4–32

  • Glymour C, Scheines R, Spirtes P, Ramsey J (2012) TETRAD project. http://www.phil.cmu.edu/projects/tetrad/. Accessed 15 Aug 2012

  • Glymour MM (2006) Using causal diagrams to understand common problems in social epidemiology. In: Oakes J, Kaufmann J (eds) Methods in social epidemiology. Jossey-Bass, San Francisco, pp 393–428

  • Glymour MM, Greenland S (2008) Causal diagrams. In: Rothman K, Greenland S, Lash T (eds) Modern epidemiology, 3rd edn. Lippincott Williams & Wilkins, Philadelphia, pp 183–209

  • Greenland S, Brumback B (2002) An overview of relations among causal modelling methods. Int J Epidemiol 31:1030–1037

  • Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology 10:37–48

  • Heckerman D (1999) A tutorial on learning with Bayesian networks. In: Jordan M (ed) Learning in graphical models. MIT, Cambridge, pp 301–354

  • Heckerman D, Geiger D, Chickering DM (1995) Learning Bayesian networks: the combination of knowledge and statistical data. Mach Learn 20:197–243

  • Hernán MA, Robins JM (2006) Instruments for causal inference: an epidemiologist’s dream? Epidemiology 17:360–372

  • Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA (2002) Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol 155:176–184

  • Hernán MA, Hernández-Díaz S, Robins JM (2004) A structural approach to selection bias. Epidemiology 15:615–625

  • Højsgaard S (2012) Graphical independence networks with the gRain package for R. J Stat Softw 46:1–26

  • Højsgaard S, Edwards D, Lauritzen SL (2012) Graphical models with R. Springer, New York

  • Husmeier D (2003) Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics 19:2271–2282

  • Husmeier D (2005) Probabilistic modeling in bioinformatics and medical informatics. Springer, London

  • Imoto S, Goto T, Miyano S (2002) Estimation of genetic networks and functional structures between genes by using Bayesian networks and nonparametric regression. Pac Symp Biocomput 7:175–186

  • Imoto S, Kim S, Goto T, Miyano S, Aburatani S, Tashiro K, Kuhara S (2003) Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network. J Bioinform Comput Biol 1:231–252

  • Jensen FV, Nielsen TD (2007) Bayesian networks and decision graphs. Springer, New York

  • Kalisch M, Bühlmann P (2007) Estimating high-dimensional directed acyclic graphs with the PC-algorithm. J Mach Learn Res 8:613–636

  • Kalisch M, Mächler M, Colombo D, Maathuis MH, Bühlmann P (2012) Causal inference using graphical models with the R package pcalg. J Stat Softw 47:1–26

  • Kirkpatrick S, Gelatt CDJ, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680

  • Kjærulff UB, Madsen AL (2008) Bayesian networks and influence diagrams: a guide to construction and analysis. Springer, New York

  • Knüppel S (2011) DAG program. http://epi.dife.de/dag/. Accessed 3 Oct 2012

  • Knüppel S, Stang A (2010) DAG program: identifying minimal sufficient adjustment sets. Epidemiology 21:159

  • Koller D, Friedman N (2009) Probabilistic graphical models: principles and techniques. MIT, Cambridge

  • Korb KB, Nicholson AE (2011) Bayesian artificial intelligence. 2nd edn. CRC, Boca Raton

  • Lauritzen SL (1990) Graphical models. Clarendon, Oxford

  • Lauritzen SL (1992) Propagation of probabilities, means, and variances in mixed graphical association models. J Am Stat Assoc 87:1098–1108

  • Lauritzen SL (1995) The EM algorithm for graphical association models with missing data. Comput Stat Data An 19:191–201

  • Lauritzen SL, Spiegelhalter DJ (1988) Local computations with probabilities on graphical structures and their application to expert systems. J Roy Stat Soc B 50:157–224

  • Lauritzen SL, Dawid AP, Larsen BN, Leimer HG (1990) Independence properties of directed Markov fields. Networks 20:491–505

  • Li J, Wang ZJ (2009) Controlling the false discovery rate of the association/causality structure learned with the PC algorithm. J Mach Learn Res 10:475–514

  • Liu Z, Malone B, Yuan C (2012) Empirical evaluation of scoring functions for Bayesian network model selection. BMC Bioinform 13:S14

  • Lunn D, Spiegelhalter D, Thomas A, Best N (2009) The BUGS project: evolution, critique and future directions. Stat Med 28:3049–3067

  • Madsen AL, Lang M, Kjærulff UB, Jensen F (2003) The Hugin tool for learning Bayesian networks. In: Nielsen TD, Zhang NL (eds) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 2711. Springer, Berlin, pp 594–605

  • Markowetz F, Spang R (2007) Inferring cellular networks – a review. BMC Bioinform 8(Suppl 6):S5

  • Moral S, Rumí R, Salmerón A (2001) Mixtures of truncated exponentials in hybrid Bayesian networks. In: Benferhat S, Besnard P (eds) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 2143. Springer, Berlin, pp 156–167

  • Murphy K (2007) Software for graphical models: a review. ISBA Bull 14:13–15

  • Murphy K (2012) Software packages for graphical models/ Bayesian networks. http://www.cs.ubc.ca/~murphyk/Software/bnsoft.html. Accessed 15 Aug 2012

  • Nadathur SG, Warren JR (2011) Emergency department triaging of admitted stroke patients – a Bayesian network analysis. Health Inform J 17:294–312

  • Nguefack-Tsague G (2011) Using Bayesian networks to model hierarchical relationships in epidemiological studies. Epidemiol Health 33:e2011006

  • Pearl J (2009) Causality – models, reasoning and inference. 2nd edn. Cambridge University Press, Cambridge

  • R Development Core Team (2012) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org/. Accessed 15 Aug 2012

  • Ramsey J (2010) Bootstrapping the PC and CPC algorithms to improve search accuracy. Tech Rep 101, Department of Philosophy, Carnegie Mellon University. http://repository.cmu.edu/philosophy/101. Accessed 15 Aug 2012

  • Ramsey J, Zhang J, Spirtes P (2006) Adjacency-faithfulness and conservative causal inference. In: Proceedings of the twenty-second annual conference on uncertainty in artificial intelligence (UAI-06). AUAI, Arlington, pp 401–408

  • Robins JM (2001) Data, design, and background knowledge in etiologic inference. Epidemiology 12:313–320

  • Robins JM, Blevins D, Ritter G, Wulfsohn M (1992) G-estimation of the effect of prophylaxis therapy for Pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology 3:319–336

  • Robins JM, Hernán MA, Brumback B (2000) Marginal structural models and causal inference in epidemiology. Epidemiology 11:550–560

  • Robins JM, Scheines R, Spirtes P, Wasserman L (2003) Uniform consistency in causal inference. Biometrika 90:491–515

  • Robinson R (1977) Counting unlabeled acyclic digraphs. In: Little H (ed) Combinatorial mathematics V. Lecture notes in mathematics, vol 622. Springer, Berlin, pp 28–43

  • Rothman KJ (1976) Causes. Am J Epidemiol 104:587–592

  • Rothman KJ, Greenland S, Lash T (2008) Modern epidemiology. 3rd edn. Lippincott Williams & Wilkins, Philadelphia

  • Rubin D (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688–701

  • Scutari M (2010) Learning Bayesian networks with the bnlearn R package. J Stat Softw 35:1–22

  • Shenoy PP (2011) A re-definition of mixtures of polynomials for inference in hybrid Bayesian networks. In: Liu W (ed) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in computer science, vol 6717. Springer, Berlin, pp 98–109

  • Shrier I, Platt RW (2008) Reducing bias through directed acyclic graphs. BMC Med Res Methodol 8:70

  • Spiegelhalter DJ, Lauritzen SL (1990) Sequential updating of conditional probabilities on directed graphical structures. Networks 20:579–605

  • Spirtes P, Glymour C (1990) An algorithm for fast recovery of sparse causal graphs. Report CMU-PHIL-15, Department of Philosophy, Carnegie Mellon University

  • Spirtes P, Meek C, Richardson T (1995) Causal inference in the presence of latent variables and selection bias. In: Besnard P, Hanks S (eds) Proceedings of the eleventh conference on uncertainty in artificial intelligence (UAI-95). Morgan Kaufmann, San Francisco, pp 499–506

  • Spirtes P, Glymour C, Scheines R (2001) Causation, prediction and search, 2nd edn. MIT, Cambridge

  • Stefanini FM, Coradini D, Biganzoli E (2009) Conditional independence relations among biological markers may improve clinical decision as in the case of triple negative breast cancers. BMC Bioinform 10(Suppl 12):S13

  • Textor J (2012) DAGitty v.10. http://www.dagitty.net/. Accessed 3 Oct 2012

  • Textor J, Hardt J, Knüppel S (2011) DAGitty: a graphical tool for analyzing causal diagrams. Epidemiology 22:745

  • Tsamardinos I, Brown LE, Aliferis CF (2006) The max-min hill-climbing Bayesian network structure learning algorithm. Mach Learn 65:31–78

  • VanderWeele TJ, Robins JM (2007a) Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect. Am J Epidemiol 166:1096–1104

  • VanderWeele TJ, Robins JM (2007b) Four types of effect modification: a classification based on directed acyclic graphs. Epidemiology 18:561–568

  • Verma T, Pearl J (1991) Equivalence and synthesis of causal models. In: Bonissone P, Henrion M, Kanal L, Lemmer J (eds) Proceedings of the sixth conference on uncertainty in artificial intelligence (UAI-90). Elsevier, Amsterdam, pp 258–268

  • Verma T, Pearl J (1992) An algorithm for deciding if a set of observed independencies has a causal explanation. In: Dubois D, Wellman MP, D’Ambrosio B, Smets P (eds) Proceedings of the eighth conference on uncertainty in artificial intelligence (UAI-92). Morgan Kaufmann, San Mateo, pp 323–330

  • Wang M, Chen Z, Cloutier S (2007) A hybrid Bayesian network learning method for constructing gene networks. Comput Biol Chem 31:361–372

  • Weinberg CR (1993) Toward a clearer definition of confounding. Am J Epidemiol 137:1–8

  • Weinberg CR (2007) Can DAGs clarify effect modification? Epidemiology 18:569–572

  • Wong ML, Lee SY, Leung KS (2002) A hybrid approach to discover Bayesian networks from databases using evolutionary programming. In: Proceedings of the 2002 IEEE international conference on data mining, ICDM ’02. IEEE Computer Society, Los Alamitos, pp 498–505

Copyright information

© 2014 Springer Science+Business Media New York

About this entry

Cite this entry

Foraita, R., Spallek, J., Zeeb, H. (2014). Directed Acyclic Graphs. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-09834-0_65

  • DOI: https://doi.org/10.1007/978-0-387-09834-0_65

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-09833-3

  • Online ISBN: 978-0-387-09834-0

  • eBook Packages: Medicine, Reference Module Medicine
