Machine learning has risen to the forefront of scientific research because of its strong predictive capabilities. Alongside this rise, researchers have become increasingly interested in uncovering the causal structures that govern the relationships between variables in a system. These structures, often represented as directed acyclic graphs (DAGs), show how a change in one variable may directly or indirectly affect other variables, enabling a deeper understanding of the complex interactions within the system. Although causal knowledge is essential for constraining a model against spurious correlations and for conducting "what-if" analyses, learning causal relationships from observational data, known as causal discovery, remains an active and challenging research area. The difficulty stems from factors such as finite samples, unobserved confounders, and measurement error. Current approaches, including constraint-based and score-based methods, often struggle with high computational complexity because of the combinatorial nature of estimating DAGs. Inspired by the Causality Challenge "Cause-Effect Pairs" workshop at the 2013 Conference on Neural Information Processing Systems, this dissertation adopts a novel approach: it generates a probability distribution over all possible graphs from the cause-effect pair features that were proposed in response to the workshop challenge.
The primary goal of this study is to develop new methods that leverage this probabilistic information and to assess their performance. Building on this approach, the work introduces a novel causal feature selection (CFS) algorithm and establishes a new evaluation criterion for CFS. To further improve experimental performance, this dissertation proposes a graph neural network (GNN)-based probabilistic predictive framework for causal discovery. Conventional causal discovery algorithms face significant challenges in handling large-scale observational datasets and in capturing global structural information. The GNN-based approach addresses these limitations by learning complex causal structures directly from data augmented with statistical and information-theoretic measures. The proposed framework advances causal discovery by offering improved accuracy and scalability on both synthetic and real-world datasets and by introducing a novel synergy between probabilistic learning and causal graph analysis.
In addition to these methodological advances, this dissertation applies counterfactual analysis to the study of affective polarization on social media. By comparing scenarios with and without specific influencer-led conversations on platforms such as Twitter, I analyze the impact of these conversations on public sentiment. This application demonstrates the practical utility of the proposed causal modeling techniques for understanding real-world issues and contributes to the broader field of social media analysis.