Formal epistemologists have their FEW, philosophers of reason have their SLACRR, metaphysicians of science have their SMS. Philosophers of causation don’t have anything similar. We think it is high time to rectify this dire situation. This is the first workshop on the philosophy, psychology, and computer science of causation, and we hope more will come. We are inviting submissions from all researchers working on causation, causal cognition, and causal discovery.
Please submit an abstract of 300–1000 words to kyoto23@causation.science. Specifically, please send an email with "submission" as its subject line, and include your name, the title of your talk, and the abstract in the body of the email. If you have a (drafty or polished) paper, or your abstract can't easily be pasted as text (e.g., it contains figures or symbols), please also attach a PDF of the paper or the abstract. Please note that the more of the argument your abstract contains, the more likely it is to be accepted. Limited financial support will be available; please state at the end of the email if your attendance is conditional on receiving such support.
The deadline for submitting abstracts is April 23. We will notify you at the end of April. The workshop itself will take place in Kyoto on June 24–26, 2023. In addition to the talks, we are also planning sightseeing activities in the evenings and on the days surrounding the conference.
We are looking forward to reading your submission,
Jun Otsuka and Tom Wysocki
If you're not on the program but want to attend the talks, please fill out this form.
| | 土 Saturday 24.06 | 日 Sunday 25.06 | 月 Monday 26.06 |
|---|---|---|---|
| | Each presentation is 30 min, followed by 10 min of Q&A | | |
| | Lecture room 3, Graduate School of Letters | Seifuso Villa | |
| 9:15–9:30 | Opening remarks | Meet at the Seifu-kaikan hall at 9:10 to go to the Seifuso Villa. You can't enter the villa on your own. | |
| 9:30–10:10 | Jennifer McDonald: What Causal Models Bring to the Table | Sander Beckers: Backtracking Counterfactuals | Takashi Nicolas Maeda: Discovery of Time Series Causal Models in the Presence of Unobserved Variables |
| 10:20–11:00 | Hanti Lin: Probabilities of Counterfactuals and Counterfactual Probabilities in Causal Models | Jiji Zhang: Actual Causation and Minimality | Jennifer Jhun: Causal Relations in Economic Contexts |
| 11:10–11:50 | Malcolm Forster: Counterfactual Predictive Maps in Causation and Beyond | Weixin Cai: A Plea for Middle-Range Theories of Causation | 🖖 |
| 11:50–14:00 | Lunch break | | |
| 14:00–14:40 | Christopher Hitchcock: Causal Models with Non-causal Constraints | Jonathan Vandenburgh: Knowledge, Causal Safety, and Shortcuts in Machine Learning | Jun Otsuka: Process Theory of Causality and the Causal Markov Condition |
| 14:50–15:30 | Zhao Fan: Did Turing Propose a Causal Analysis of Computability? | Camilo Sarmiento: Formalising Actual Causality and Its Applications to Automated Planning and Computational Ethics | Tom Wysocki: Underdeterministic Causation with String Diagrams |
| 15:30–16:00 | Break | Sightseeing for the willing | |
| 16:00–16:40 | Xiuyuan An: Choice of Variables and the Principle of the Common Cause | Murali Ramachandran: Causation and Two Types of Dependence | |
| 16:50–17:30 | Frederick Eberhardt: Learning an Index of Economic Complexity | Hayato Saigo: Category Algebras and States on Categories: Toward Noncommutative Causal Theories | |
| 17:30–18:30 | | | |
| 18:30–20:30 | Dinner at Boogaloo cafe | | |
Abstracts:
Sober’s (1988, 2001) example of Venetian sea levels and British bread prices raises the question of whether it violates the Principle of the Common Cause (PCC). The ambiguity of “correlation”, however, and the question of where probabilistic dependence comes from, are important to the debate about the PCC. To get probabilities, statistical inference commonly assumes that the samples are independent and identically distributed (IID), where “independent” means that the sample units are independent. The term “unit” can refer to (1) a set of variables or (2) the specific object of study in an investigation (denoted by u in U), while a variable is a real-valued function defined on every unit in U (see Holland, 1986, p. 945). Under the first understanding, therefore, if there are interactions between units (i.e., inter-unit interactions), the resulting correlations are dependencies that contradict the independence part of the IID assumption and thus mislead causal inference (see Zhang & Spirtes, 2014). In responding to Sober, this paper contends that we should specify the variables and units relevant to a study (which I call variable choice) in order to explain correlations, because causal relations may hold for one choice of variables but not for another (Woodward, 2016). If the IID assumption is then applied to infer probabilities, we have to realize that inter-unit causation undermines the independence assumption in IID, but not the PCC. That is why Sober’s case is not a genuine counterexample to the PCC.
Here are the main ideas of the paper. First, it provides an overview of correlation and variable choice in relation to the PCC, with a focus on Sober’s case. Most of the extant literature focuses on (1) the claim that the probabilities in the example are not homogeneous through time and (2) disputes about the notion of “correlation”: whether the correlations between the levels concern token events, variable types, or something else (e.g., Forster, 1988; Hausman & Woodward, 1999; Hoover, 2003; Papineau, 1992; Steel, 2003). This paper focuses on the problem of variable choice.
Second, it introduces two concepts, inter-unit and intra-unit causation, to review Sober’s example. J. Zhang and Spirtes (2014) diagnose that the causal Markov condition (CMC) is formulated in terms of causal structures that depict intra-unit causal relations only. The key idea is that causal influences occur only within the same unit: a cause (A) specific to unit u1 will influence some property (B) of u1 but won’t influence another unit u2. Otherwise, the correlations we calculate between variables A and B are heterogeneous (that is, they do not attribute the same property), and that is how the false inference occurs.
Third, I analyze the concept of correlation as it holds between token variables and its relationship to IID. I point out that in Sober’s case, IID is violated because relevant properties interfere across units, and this inter-unit causation leads to erroneous statistical inference from sample to population. Nevertheless, choosing units correctly depends on how we conceive of causal relations, while correctly applying IID determines how probabilistic dependencies are presented to us: the two are related but not the same thing.
On the one hand, supporters of IID and of intra-unit causal relations place their emphases differently in causal inference. Accepting that statistical inference of causation proceeds in two steps, step 1 being statistical estimation of the probability distribution and step 2 being causal inference using those probabilities, Forster suggests that “the problem is how to use the data to discover whether there is dependence or not” (personal communication), which belongs to step 1, so he prefers to use IID. J. Zhang and Spirtes’s point, by contrast, concerns step 2: they redefine a population for the variables (without a description of what the variables should be) and choose to avoid inter-unit interference so that the CMC can hold.
On the other hand, J. Zhang and Spirtes argue that inter-unit causation makes IID sampling more difficult (2014, p. 248). I do not think so; I argue that IID itself is too strong an assumption to apply strictly. Since IID is an inter-unit relation that unit interactions may undermine, I suggest that we should first choose units correctly so that IID is qualified to apply, and then deal with unbiased sampling of the data (i.e., no inter-unit interaction) as explained by the PCC, taking the PCC as an epistemic principle for causal inference rather than a metaphysical principle in a definition of causation (Reiss, 2015, p. 170).
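The role that the independence half of IID plays here can be seen in a small simulation (an editorial sketch, not part of the abstract): two series with no causal connection, each merely trending over time (cf. the sea levels and bread prices), exhibit a strong sample correlation because the yearly observations are not independent draws.

```python
# Two causally unrelated series that both merely trend upward over
# time. Yearly observations are not independent draws, so the "I" in
# IID fails, and a strong sample correlation appears without any
# common cause.
import random

random.seed(0)
n = 200
sea   = [0.05 * t + random.gauss(0, 1) for t in range(n)]
bread = [0.05 * t + random.gauss(0, 1) for t in range(n)]

def corr(xs, ys):
    """Pearson sample correlation."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

print(round(corr(sea, bread), 2))   # close to 1 despite no causal link
```

Detrending the series, i.e., choosing the units and variables so that the draws are independent, removes the spurious dependence.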
Counterfactual reasoning—envisioning hypothetical scenarios, or possible worlds, where some circumstances are different from what (f)actually occurred (counter-to-fact)—is ubiquitous in human cognition. Conventionally, counterfactually-altered circumstances have been treated as “small miracles” that locally violate the laws of nature while sharing the same initial conditions. In Pearl’s structural causal model (SCM) framework this is made mathematically rigorous via interventions that modify the causal laws while the values of exogenous variables are shared. In recent years, however, this purely interventionist account of counterfactuals has increasingly come under scrutiny from both philosophers and psychologists. Instead, they suggest a backtracking account of counterfactuals, according to which the causal laws remain unchanged in the counterfactual world; differences to the factual world are instead “backtracked” to altered initial conditions (exogenous variables). In the present work, we explore and formalise this alternative mode of counterfactual reasoning within the SCM framework. Despite ample evidence that humans backtrack, the present work constitutes, to the best of our knowledge, the first general account and algorithmisation of backtracking counterfactuals. We discuss our backtracking semantics in the context of related literature and draw connections to recent developments in explainable artificial intelligence (XAI).
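The contrast between the two semantics can be sketched in a few lines (a toy illustration under assumed structural equations, not the authors' formalism). In the model Z := U1, X := Z + U2, Y := X + Z + U3, an interventionist counterfactual breaks the mechanism for X, while a backtracking counterfactual keeps every mechanism and alters the exogenous U1 instead:

```python
# Toy three-variable SCM (illustrative assumption, not the paper's model):
#   Z := U1,  X := Z + U2,  Y := X + Z + U3

def scm(u1, u2, u3):
    z = u1
    x = z + u2
    y = x + z + u3
    return z, x, y

# Factual world: U = (1, 0, 0)  ->  Z = 1, X = 1, Y = 2
z, x, y = scm(1, 0, 0)

# Interventionist counterfactual "had X been 2": a small miracle
# overrides the mechanism for X; the exogenous values stay (1, 0, 0).
def scm_do_x(x_new, u1, u3):
    z = u1
    x = x_new          # mechanism for X replaced by the intervention
    y = x + z + u3
    return z, x, y

_, _, y_interv = scm_do_x(2, 1, 0)   # Z = 1, X = 2, Y = 3

# Backtracking counterfactual "had X been 2": every mechanism stays
# intact; the difference is backtracked to altered initial conditions,
# here U1 = 2, so X's cause Z changes along with it.
_, _, y_back = scm(2, 0, 0)          # Z = 2, X = 2, Y = 4

print(y_interv, y_back)              # the two semantics diverge: 3 4
```

The divergence appears exactly when the counterfactually set variable has causal ancestors: backtracking propagates the change upstream and then back down every path.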
This paper is a plea for an alternative approach to theorizing causation which I call “middle-range theorizing”, a term borrowed from sociology (cf. Merton 1949/1968; Cartwright 2020). This way of theorizing causation is distinguished from “grand theorizing”, which is popular in traditional discussions of causation, causal cognition, and causal discovery. The difference between the two types of theorizing lies primarily in their scopes. Grand theories of causation aim at all causal relationships as their targets of analysis, whereas middle-range theories of causation only intend to say something true of some but not all causal relationships. In this paper, I argue that middle-range theorizing over causation should receive more attention because it is a distinctively beneficial approach that complements, rather than competes with, the much more popular grand theorizing over causation nowadays.
I use theories in the metaphysics of causation as main examples to motivate my plea. Major metaphysical theories of causation are grand theories in the sense I mentioned, since they aim to characterize essential or defining features that make all causal relationships qua causal. Examples include not only monist theories seeking to identify a common essence shared by all causal relationships (e.g., regularity-based accounts, counterfactual-based accounts, process accounts, power theories) but also pluralist theories that sort all causal relationships into mutually irreducible metaphysical categories all at once. Besides the existence of alleged counterexamples to almost every monist grand theory, grand metaphysical theories of causation also face a series of challenges: (1) the challenge from particularity, which emphasizes the ultimate diversity and particularity of causal relationships and hence questions the possibility of constructing a plausible grand theory of causation that identifies a common essence (or, for pluralists, multiple disjunctive and jointly exhaustive essences) among all causal relationships (Anscombe 1971/1981; Cartwright 2004); (2) the challenge from non-factuality, which questions the reality or objectivity of grand metaphysical theories of causation based on the fact that these theories (at least those proposed so far) in fact set little constraint on what real-world causal relationships must look like and instead keep being cast into doubt by new forms of causal relationship found in science (Norton 2003; manuscript); and (3) the challenge from pragmatic insignificance, which questions the scientific (esp. methodological) usefulness of a grand theory even if it tells us something true or objective about the metaphysical nature or essence of all causal relationships (Woodward 2014a; 2014b; 2015).
Instead of defending grand theorizing in causal metaphysics against these challenges, my goal is to suggest an alternative way of theorizing causation – middle-range theorizing – which is, as I shall argue, not only immune from the challenges but also conducive to our understanding of causation. Instead of theorizing over all causal relationships, every middle-range theory of causation focuses only on a limited but significant portion of them. Middle-range theories of causation are possible because causation is like a biological genus, which can be divided into multiple species and in multiple ways. While grand theories of causation aim to help us understand the whole genus of causation, middle-range theories of causation focus on understanding causation at the species level, with each middle-range theory targeting a specific species of causation. It is in this sense that grand theorizing and middle-range theorizing can complement each other and work together to improve our understanding of causation.
In particular, I highlight two types of middle-range theorizing: domain-centered theorizing and property-centered theorizing. Domain-centered theorizing emphasizes the fact that causal relationships in different domains (or sub-domains, sub-sub-domains, etc.) may manifest different characteristics. For example, causal relationships studied by biology and by the social sciences may have properties that those studied by physics do not possess (e.g., the former tend to constitute causal mechanisms whereas the latter do not), and vice versa. And even within the social domain, causal relationships involved in the exchange of dialogue and the influence of ideas may possess unique features that other sub-domains of the social sciences do not have. I argue that it is worth directing our attention to the specificities of causal relationships in different domains (or sub-domains), in order to improve our understanding of these causal relationships as well as our abilities to predict, explain, and control events in these domains (or sub-domains). Clearly, theories focusing on specific causal relationships in different domains (or sub-domains) are middle-range in nature.
Property-centered theorizing emphasizes the fact that besides essential or defining characteristics, causal relationships also possess (or are conceived as possessing) contingent or non-essential characteristics (“non-essential” in the sense that these characteristics are not necessary for a relationship to be causal). Examples include stability, specificity, and proportionality of causal relationships (Woodward 2010), speed of change (Ross 2018), size and duration of causal influence, reversibility (Ross and Woodward 2022), linearity, monotonicity, circularity (or feedback loop), same-level/cross-level causation, threshold effect, causal equilibrium, etc. I argue that these non-essential causal properties also deserve specific theories to answer conceptual, metaphysical, epistemological, and axiological questions associated with them. Insofar as some of these properties are manifested by only some but not all causal relationships, theories about them are middle-range because they do not intend to say anything true of all causal relationships.
Finally, I argue that middle-range theorizing over causation is immune from the challenges to grand metaphysical theories of causation. First, it is immune from the challenge from particularity because, even if causal relationships are ultimately diverse in nature, a significant portion of them may nevertheless reside in the same domain (or sub-domain) or have some non-essential properties in common. This offers a sufficient ground for middle-range theorizing. Second, middle-range theorizing over causation is immune from the challenge from non-factuality because middle-range theories are not meant to be exceptionless, which not only makes them tolerant of counterexamples but also lets them shed light on causal relationships falling within their scopes. Third, middle-range theories of causation are immune from the challenge from pragmatic insignificance because, by focusing on a narrower set of causal relationships, a middle-range theory can achieve a higher level of integration between the metaphysical analysis of these causal relationships and the epistemological lessons informed by this analysis.
Hidalgo & Hausmann's (2009) Economic Complexity Index (ECI) is a first attempt to develop an indicator that explains the growth of a country's economy in terms of the _diversity_ of the traded products. There are at least two ways to interpret the ECI. One can view it as a summary statistic that is descriptive of the different economies and that correlates with future growth. But the far more intriguing possibility is that the ECI is indicative of an underlying cause of future growth. The claim would be that it is not any specific trade pattern or set of goods that leads to growth but that it is the diversity of goods, or the complexity of underlying capabilities used in their production, that is an important cause of growth. What sort of evidence can one offer for or against such an abstract cause of growth? Our approach is motivated by the following consideration: If economic complexity, as tracked by the ECI, is indeed a high-level cause of future growth, then that index should be identifiable by an _unsupervised_ method applied to the relation between trade data and growth. If, in contrast, the ECI is merely a descriptive summary that tracks some but not all (or far too many) features of the relation between a country’s economy and its future growth, then this index will not be identifiable by an unsupervised learning method. This talk will report on our efforts to determine the status of the ECI. The work is part of a more general effort to understand the principles we use to delineate scientific quantities to which we attribute causal status. [Joint work-in-progress with Patrick Burauel.]
In 1936, Alan Turing provided his analysis of computability in his well-known paper “On Computable Numbers, with an Application to the Entscheidungsproblem” (OCN). As scholars have rightly pointed out, Turing’s analysis of computability concerns the idealized human computer rather than the computing machine (e.g., Gandy 1988). In particular, Turing laid out several restrictive conditions for the human computer (Turing 1936, 249-252). These conditions sit at the heart of Turing’s argument in support of what is now called the Church-Turing Thesis (CTT) (“The ‘computable’ numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means” (Turing 1936, 230)), a thesis that is essential for establishing the unsolvability of the Entscheidungsproblem. Understanding Turing’s analysis of computability is therefore crucial to understanding Turing’s results in computability and their implications.
In criticizing Turing’s justification for CTT, Stewart Shanker famously argued that Turing proposed a causal analysis of computability. For instance, Shanker claimed that “Turing assumes that the answer to the question, ‘How did x arrive at the correct answer?’ consists in a specification of the causal sequence of ‘mental states’ which can be modelled – and thence explained – on a Turing Machine” (1998, 9). Moreover, he said, “Turing suggests that the (human) computer’s ‘state of mind’ is the causal intermediary between observed symbols and subsequent action” (1998, 29). This causal interpretation of Turing’s analysis of computability falls under what Jack Copeland and Oron Shagrir called the cognitive approach to computability. Under this approach, at least some of Turing’s restrictive conditions for the human computer reflect “the limitations of human cognitive capacities as these capacities are involved in calculation” (Copeland and Shagrir 2013, 12). In contrast, Copeland and Shagrir argued that Turing’s approach to computability is noncognitive, where Turing’s restrictive conditions for the human computer “merely explicate the concept of effective computation as it is properly used and as it functions in the discourse of logic and mathematics” (2013, 12).
In this talk, I will consider to what extent Turing’s analysis of computability can be regarded as cognitive, let alone causal. First, I will introduce several key concepts: (1) Turing’s restrictive conditions for the human computer; (2) the causal interpretation of Turing’s analysis of computability; and (3) the distinction between the cognitive and noncognitive approaches to computability. Second, I will show that some of Turing’s restrictive conditions for the human computer are evidently not motivated by any cognitive concerns. Instead, they are motivated by logical concerns, such as the notion of logical simplicity.
Third, I will move on to a more controversial restrictive condition, the finiteness of the states of mind (I denote this condition as Finiteness). Besides the causal interpretation, Finiteness is often considered to be motivated by the limitation of the human computer’s “sensory apparatus” (Sieg 2002, 396), which arguably also falls under the cognitive approach to computability. A common argument against the cognitive interpretation of Turing’s analysis of computability (also used by Copeland and Shagrir) appeals to Turing’s suggestion that all the occurrences of “state of mind” in OCN can be replaced with a physical counterpart, “a note of instructions” (Turing 1936, 253). This argument, even if successful, leaves other cognitive factors untouched. Copeland and Shagrir’s distinct argument for the noncognitive interpretation is based on Turing’s example of an uncomputable number 𝛿 in OCN. They argued that Turing would insist on the uncomputability of 𝛿 even if the human computer possessed stronger cognitive capacities. I will argue that they offer no convincing reasons for what Turing would say about 𝛿 in that hypothetical situation.
In the last part of this talk, I will propose an alternative noncognitive interpretation of Finiteness. In OCN, Turing seemed to suggest that “the real question at issue” concerning the justification of CTT is “what are the possible processes which can be carried out in computing a number?” (1936, 249). Turing’s analysis of computability should therefore be regarded as an attempt to answer this “real question”. I will show that Turing’s analysis of computability concerns mainly two processes: recognizing symbols/states (Finiteness) and manipulating symbols. Elucidating Turing’s remarks on recognition and symbols in OCN will reveal that Finiteness is best understood as ensuring that the process of recognizing symbols/states can be carried out mechanically. I conclude that Finiteness is a direct consequence of the notion of effective computation.
This talk is about CP maps, where ‘CP’ is an acronym for Counterfactual Predictive, Conditional Probabilistic, Conceptually Progressive, Confirmationally Projective, or Causal Parental. Counterfactuals are standardly explicated in terms of English subjunctive conditionals ‘if P were the case, then Q would be the case’. CP maps, in contrast, are inferred from frequency tables using the IID (independent and identically distributed) assumption. If the single-case probabilities are identical, then we infer a Counterfactual Predictive map for each data point in the table. This is an account of counterfactual import that formal epistemologists can explore. The talk will apply the idea to a handful of topics, as time permits.
In standard Structural Equation Models (SEMs), it is possible to intervene simultaneously on every combination of variables in the model. This means that standard SEMs cannot include variables that stand in non-causal relations with one another; for example, a model that included variables for temperature in Fahrenheit and for temperature in Celsius would allow us to represent meaningless interventions that set these variables to incompatible values. However, there are both scientific and philosophical contexts where it would be useful to include variables subject to non-causal constraints. For example, if one’s mental state supervenes on the physical state of one’s brain, the variables representing mental state and physical state will be subject to non-causal constraints. One might nonetheless wish to include both variables in a causal model to address issues concerning mental causation. We will present an amendment to the standard formalism for causal models that permits the inclusion of variables subject to non-causal constraints. One interesting feature of the formalism is that it allows us to represent different types of intervention on the same variable.
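A minimal sketch of the worry (my own illustration with hypothetical helper names, not the formalism the talk presents): a model that carries an explicit non-causal constraint can refuse joint interventions that set the Fahrenheit and Celsius variables to incompatible values.

```python
# Hypothetical sketch: temperatures in Celsius and Fahrenheit stand in
# the non-causal constraint F = 9/5 * C + 32. A model that records the
# constraint can reject "interventions" that break it, while a
# standard SEM would happily represent them.

def consistent(c, f):
    """Non-causal constraint linking the two temperature variables."""
    return abs(f - (9 / 5 * c + 32)) < 1e-9

def intervene(c, f):
    """Jointly set both variables, but only to compatible values."""
    if not consistent(c, f):
        raise ValueError("intervention violates a non-causal constraint")
    return {"C": c, "F": f}

print(intervene(100, 212))    # fine: boiling point in both scales
# intervene(0, 100)           # would raise: a meaningless joint setting
```

The same guard generalizes to supervenience: settings of a mental-state variable incompatible with the physical-state variable would be excluded, while distinct intervention types on the remaining free values stay representable.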
This talk gives an overview of how a practicing economist might delineate some domain of interest, using a case study from industrial organization. In such exercises, equilibrium analysis, tightly linked to causal reasoning, plays a central role. However, economists’ practices seem to sit oddly with at least one other account of boundary drawing: I have in mind the New Mechanists in particular, some of whom have addressed the question explicitly. Individuating some relevant domain via equilibrium reasoning often provides a serviceable way of delineating economic domains of interest, but some would seem to reject it as a general strategy. On the contrary, I suggest that this activity is at least conceptually on par with some of the very moves that the New Mechanist might make, and makes perspicuous its own causal underpinnings.
Three kinds of causal models are particularly influential. There are (i) causal Bayes nets, and (ii) structural equation models with independent error variables, both being popular in computer science and in philosophy. And there are (iii) Rubin causal models, most popular in social and health sciences, with Nobel-prize-winning applications. Those three kinds of causal models are often thought to be closely connected. But I propose a reconsideration. To allow for the possibility of indeterminism, Rubin causal models need to be carefully treated: the probabilities of counterfactuals therein should be reinterpreted in terms of counterfactual probabilities. As a result, Rubin causal models should be used in conjunction with causal Bayes nets, rather than structural equation models---contrary to the traditional wisdom.
Inferring causal relationships from observational data alone is called causal discovery. Causal discovery often makes certain assumptions about the data-generating process, the most important being the absence of latent confounders. We have developed a method for causal discovery that drops this assumption and allows latent confounders to exist. In this presentation, we propose a causal discovery method for time series data under the assumptions that causal relations are non-linear and that a latent confounder exists.
This paper looks at what causal model analyses of causation are up to – that is, what justifies treating causal model analyses as cutting-edge counterfactual analyses. Specifically, then, it focuses on analyses of actual causation that take causal models to represent counterfactual dependencies, with counterfactuals underwritten by a similarity semantics. The task becomes that of identifying what causal models bring to the table such that an analysis invoking them improves on a counterfactual analysis that doesn’t. I argue that no substantive contribution made by the models framework requires models. But this is not to say no contribution is made. The real contribution of a models approach, I argue, is heuristic. It lies in their making plain what has thus far skulked in the shadows of a counterfactual analysis – that conditions constraining the evaluation of causation-relevant counterfactuals cannot be simultaneously determinate, categorical, and mind-and-language independent. I conclude by suggesting a view of causation which, in my view, best responds to this challenge.
The past decade has seen the rise of an alternative approach to causal modeling called process theory, which uses category theory (more specifically, symmetric monoidal categories) and its visual representation, string diagrams, to represent causal structures (e.g., Jacobs et al. 2019). This talk introduces the basic framework of process theory and analyzes it from a philosophical standpoint. Process theory models causation as a network of interconnected mechanisms. These mechanisms are represented as boxes that return outputs for given inputs. Boxes are connected to each other by wires of a matching type to form a causal structure. Networks consisting of such boxes and wires are string diagrams. While a causal graph expresses the (causal) regularities among the properties of objects or events, a string diagram emphasizes the aspect of causality as a process, whence the name of the theory (e.g., Salmon 1984). Process theory sheds new light on the philosophical debate over the legitimacy of the causal Markov condition. In process theory, individual causal models are represented as functors from a category of string diagrams to another symmetric monoidal category such as FinStoch, the category of finite sets (representing values of variables) and stochastic matrices (representing conditional probabilities). It is known that the resulting model satisfies the causal Markov condition (more specifically, the principle of the common cause) only when the category of string diagrams contains a particular mechanism called a copier. This observation naturally leads to the questions: when does the copier mechanism exist, and how is its existence to be interpreted? The presentation discusses these points in order to draw out the philosophical implications of process theory.
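In FinStoch the copier has a very concrete form; the following sketch (an illustration under assumed conventions, not the talk's own material) applies it to a distribution and checks that both outputs reproduce the input, with all mass on the diagonal, which is the perfect correlation a common cause routed through a copier induces.

```python
# States on a finite set are probability vectors; the copier sends each
# value x to the pair (x, x) with probability 1, so applying it to a
# distribution p puts all mass on the diagonal of the joint.

def copier(p):
    """Apply the copy map Delta: X -> X x X to a distribution p."""
    n = len(p)
    joint = [[0.0] * n for _ in range(n)]
    for i in range(n):
        joint[i][i] = p[i]        # outcome (i, i) with probability p[i]
    return joint

p = [0.3, 0.7]
joint = copier(p)

# Both marginals reproduce p, and the two copies are perfectly
# correlated: the signature of a common cause feeding two effects.
marg_left  = [sum(row) for row in joint]
marg_right = [sum(col) for col in zip(*joint)]
print(marg_left, marg_right)    # [0.3, 0.7] [0.3, 0.7]
```

In quantum-flavored settings no such copier exists (no-cloning), which is one way of seeing why the existence question the talk raises is substantive.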
Lewis’s original counterfactual analysis of causation takes ‘chainwise’ counterfactual dependence to be necessary and sufficient for causation. But problems involving preemptive causation led him to revise his account twice, so that no actual dependence relation need connect causes to their effects. This paper considers the merits and demerits of two approaches that aim to establish an actual dependence relation, distinguished by their appeal to what we might call event-excluding and fact-fixing dependence, respectively. The residual problems motivate a ‘mixed’ account, incorporating both kinds of dependence.
Although moral responsibility is not circumscribed by causality, the two are closely intertwined. Furthermore, a rational understanding of the evolution of the physical world is inherently linked to the idea of causality. Thus, decision-making AI systems with automated planning inevitably have to deal with causality, especially if they consider aspects of imputability or integrate references to ethical norms. The numerous debates surrounding causation in the last few decades have demonstrated the complexity of this notion and the difficulty of integrating it into planning. As a result, much of the work in computational ethics relegates causality to the background, despite the aforementioned considerations. The contribution of this work is to provide a definition of actual causation suitable for action languages, together with a complete and sound translation of it into logic programming. This definition serves as a formalisation of Wright's NESS test. The resulting logic program enables the handling of complex causal relationships. In addition to enabling agents to reason about causality, this contribution specifically empowers the computational ethics domain to address previously unattainable situations. In a context where ethical considerations in decision-making are increasingly significant, advances in computational ethics can greatly benefit the entire AI community. This is joint work with Gauvain Bourgne, Katsumi Inoue and Jean-Gabriel Ganascia.
In this talk we introduce "category algebras" and "states on categories" as the basis for constructing noncommutative causal theories. The speaker proposed to use these concepts to understand quantum fields and merge causal structures with noncommutative probabilistic structures (Saigo 2021). The problem of merging causal structures and (noncommutative) probabilistic structures, however, does not only emerge when considering quantum fields. On the contrary, it is a problem that must eventually be confronted when thinking quantitatively about causality. This is because causality is generally non-deterministic, and understanding it quantitatively will eventually need to be related to the (generalized) concept of probability. One of the major reasons for considering "noncommutative" probability theory is that causal structures cannot be written in the language of conventional probability theory. In order to treat causal and probabilistic structures in a unified manner, it is necessary to consider a framework broader than that of conventional probability theory, and noncommutative probability theory is a candidate for such a framework. Category algebras and states on categories are almost indispensable for thinking about noncommutative probabilistic structures in terms of causal structures considered as categories.
Two kinds of causes have captured most of the philosophical imagination: deterministic causes, which necessitate their effects, and probabilistic causes, which change the probability of their effects. However, it has recently turned out that there are non-deterministic causes, which, rather than changing the probability of their effects, change their effects’ modal status. For instance, the meteor that killed the dinosaurs is an underdeterministic (token) cause of our existence because big mammals, including humans, wouldn’t have evolved if dinosaurs had kept roaming the earth. As there seems to be no determinate probability of our evolving on an Earth free of dinosaurs, this causal relation cannot be analyzed in probabilistic terms. A new concept is needed, and indeed, a new concept has been provided (Wysocki 2023a, 2023b).
Here, I will introduce a framework that allows for modeling underdeterministic causation. However, instead of the standard framework of causal models (Wysocki 2023b), I will use the category-theoretical framework of string diagrams (Otsuka & Saigo 2023). Why this nonstandard tool? While string diagrams allow for modeling situations that violate the Causal Markov Condition (as shown by Otsuka in his presentation) and hence are more expressive than causal models, the main benefit of using string diagrams is that they can represent deterministic, underdeterministic, and probabilistic causation in a unified way. This framework captures the differences between these three causal species with a single requirement on the string-diagrammatic representation (specifically, a requirement on the columns of the matrices that the string-diagrammatic boxes are mapped onto). This fact shows, first, that the three types of causes are manifestations of a single causal concept; second, that string diagrams are superior to causal models in representing causation.
First, I will motivate the concept of underdeterministic causation, as it’s still new to the literature: we need it whenever non-deterministic causal relations cannot, in principle or in practice, be described probabilistically. I will then introduce an interpretation of string diagrams that can model underdeterminism: specifically, a functor from string diagrams to the category representing variable values and conditional causal modalities. Finally, I will discuss how different concepts are represented with different types of matrices; I will also mention the prospects of representing situations where different causal relations, underdeterministic and probabilistic, co-occur.
To know that something is true, our beliefs should be causally aligned with the world in the right way. This causal alignment fails in Gettier cases, where our evidence is causally disconnected from the proposition we believe. For example, when believing the correct time based on a stopped clock, the clock's display (our evidence) is causally disconnected from the actual time (the proposition we believe). This causal misalignment also occurs in cases of belief based on statistical evidence. For example, if we believe that someone does not have a rare disease simply because the disease is rare, we are forming a belief without any causal or individualized basis for doing so.
In this talk, I will motivate a causal condition on knowledge to explain these cases: causal safety. Causal safety builds on the modal notion of safety, where belief in p is safe if, in nearby possible worlds where one believes p, p is true. Inspired by causal theories of counterfactuals, the nearby worlds are determined by causal relationships rather than by a similarity measure. I argue that causal safety has two components. First, one's evidence E and belief p should be connected by a causal path that does not involve substantial confounding or large errors. Second, the causal variables cannot take values in the actual world that are consistent with a situation where one has the same evidence but where p is false. I argue that the first condition excludes belief based on statistical evidence from knowledge and that the second condition excludes Gettier cases from knowledge.
I will also argue that there are parallels between these cases in epistemology and some of the problems raised for the robustness of machine learning algorithms. One parallel arises when machine learning algorithms rely on causal shortcuts to achieve high accuracy. For example, consider a model that predicts whether a patient has Covid based on chest x-ray images. Suppose that the model relies on a spurious feature, such as the amount of space above the patient's shoulders, which is correlated with the hospital system the patient received x-rays in, which is correlated with whether the patient has Covid (DeGrave et al., 2021). I will argue that relying on a causal shortcut means that one cannot know that the output of the machine learning model is correct.
First, one can note parallels between relying on a causal shortcut and the cases discussed above. Like Gettier cases, relying on a causal shortcut involves getting the right answer for the wrong reason, or for a reason that is causally disconnected from the proposition believed. And like cases of belief based on statistical evidence, relying on a causal shortcut leads to beliefs based on evidence that is merely correlated with the proposition believed rather than individualized evidence. These intuitions can be vindicated using causal safety. For example, suppose that the Covid detector correctly predicts that a patient has Covid, but when a non-Covid x-ray is transformed so that the non-Covid patient is positioned like the Covid patient, the model falsely predicts that the patient has Covid. In this case, the model's reliance on the patient's position violates the second condition of causal safety: there is a causal variable (the patient's position) that takes a value that causes the same evidence (the model's Covid prediction) but where the belief is false (the patient does not have Covid). And if this causal shortcut arises in many of the cases where the Covid detector is used, then the causal path between the model's prediction and Covid status involves substantial confounding from patient position, violating the first condition of causal safety.
I will conclude with some reflections on the implications of this for machine learning. First, I argue that it is plausible that, in order to trust the outputs of machine learning models, we ought to be in a position to know that the model's output is true. If this is the case, then relying on causal shortcuts is a barrier to building trustworthy machine learning systems. Second, I raise the plausible concern that machine learning models designed to maximize accuracy regardless of causal path will always rely on causal shortcuts. If this is the case, then outputs of machine learning models in general may more closely resemble guesses based on statistical evidence than more robust sources of knowledge such as the outputs of scientific instruments or human testimony.
Many accounts of actual causation based on causal models include a minimality clause, which is supposed to disqualify a condition that includes redundant parts from being a cause, similar to what the non-redundancy condition in the INUS account does. We examine this rationale for the minimality clause by exposing the clause's consequences in the structural accounts, in contrast to those of its counterpart in the INUS account, and by analyzing the apparent absence of such a clause in Lewis-style counterfactual theories. We conclude that the prevailing minimality condition in structural accounts of causation does not quite match the pronounced rationale; either the condition or the rationale needs to be revised, and we tentatively explore both options.
Below is the map with the Saturday and Sunday venue; its Google Maps location is here.
Some miscellaneous yet useful information:
Contact: kyoto23@causation.science
Host institution: Center for Applied Philosophy & Ethics, Graduate School of Letters, Kyoto University
This workshop is possible thanks to our sponsors:
RIKEN Center for Advanced Intelligence Project,
Data Science and AI Innovation Research Promotion Center, Shiga University.