Visual Analytics and Imaging Laboratory (VAI Lab)
Computer Science Department, Stony Brook University, NY

An Explainable AI Approach to Large Language Model Assisted Causal Model Auditing and Development

Abstract: Causal networks are widely used in many fields, including epidemiology, social science, medicine, and engineering, to model the complex relationships between variables. While it can be convenient to algorithmically infer these models directly from observational data, the resulting networks are often plagued with erroneous edges. Auditing and correcting these networks may require domain expertise frequently unavailable to the analyst. We propose the use of large language models such as ChatGPT as an auditor for causal networks. Our method presents ChatGPT with a causal network, one edge at a time, to produce insights about edge directionality, possible confounders, and mediating variables. We ask ChatGPT to reflect on various aspects of each causal link and then produce visualizations that summarize these viewpoints, helping the human analyst direct the edge, gather more data, or test further hypotheses. We envision a system where large language models, automated causal inference, and the human analyst and domain expert work hand in hand as a team to derive holistic and comprehensive causal models for any given case scenario. This paper presents first results obtained with an emerging prototype.

Teaser 1: Workflow of our ChatGPT-powered Causal Auditor:

The steps of our workflow are: (1) Algorithmic discovery of the initial (raw) causal model. (2) Query-driven ChatGPT-based edge commentary. (3) Analyst-initiated model refinement informed by the outcomes of steps 1 and 2.
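Step 2 of this workflow can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual implementation: `query_llm` is a hypothetical stand-in for the real ChatGPT API call, the prompt wording is invented for illustration, and the causal model is represented simply as a list of (cause, effect) edge tuples.

```python
# Sketch of the edge-by-edge auditing loop (step 2).
# query_llm is a hypothetical placeholder for the actual ChatGPT call.

def build_audit_prompt(cause, effect):
    """Compose a prompt asking the LLM to reflect on one causal edge."""
    return (
        f"Consider the proposed causal relation '{cause}' -> '{effect}'. "
        "Is the edge direction plausible? Name possible confounders and "
        "mediating variables, and rate your confidence."
    )

def audit_edges(edges, query_llm):
    """Collect LLM commentary for every edge of the raw causal model."""
    return {
        (cause, effect): query_llm(build_audit_prompt(cause, effect))
        for cause, effect in edges
    }

# Example run with a stub in place of a real LLM:
raw_model = [
    ("Percent Fair or Poor Health", "Life Expectancy"),
    ("Food Environment Index", "Violent Crime Rate"),
]
stub = lambda prompt: "commentary: " + prompt[:40]
commentary = audit_edges(raw_model, stub)
```

The per-edge commentary gathered this way would then feed the visual summaries used in step 3 for analyst-initiated refinement.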

A few of the visual elements we have designed, in addition to the node-link diagram that represents our causal model:

Teaser 2: On the top we see the Causal Debate Chart for the relation Percent Fair or Poor Health - Life Expectancy, presenting an overwhelming belief that the former is the cause of the latter. On the bottom is the Causal Relation Environment Chart for the same relation. The intensity of red and green encodes the strength of the mediators and covariates (weak, medium, strong), while the grey color of the cause and effect variables stands for the basic relationship without directional trends. Finally, in the center is the Confounder/Mediator Chart for the relation between Food Environment Index and Violent Crime Rate. This chart shows all causal relationships for all trends in one view. Please see the paper for a more detailed description.

Paper: Y. Zhang, B. Fitzgibbon, D. Garofolo, A. Kota, E. Papenhausen, K. Mueller, "An Explainable AI Approach to Large Language Model Assisted Causal Model Auditing and Development," NL-VIZ Workshop 2023 (jointly held with IEEE VIS), Melbourne, Australia, October 2023. PDF | PPT

Funding: This research was funded in part by a grant from the American Public Health Association (APHA) and a grant from the New York State Strategic Partnership for Industrial Resurgence (SPIR) program.