The increasing performance of feature extraction and regression modeling across domains raises hopes that machine and deep learning can assist clinicians in numerous healthcare applications. However, the complex, multimodal nature of the problems and the scarcity of high-quality labeled data in this domain introduce several challenges and limitations. These challenges, together with a lack of interpretability, undermine the generalizability and usability of many state-of-the-art machine learning models.
This dissertation focuses on using multimodal sources of data for regression modeling in healthcare applications. The central argument is that domain knowledge describes the nature of each modality's relationship with the target function, and that this relationship can characterize both the appropriate level of representation and an efficient integration method. We define a framework with two heterogeneous modalities: one provides local, fine-grained features, while the other contains higher-level global information. We demonstrate the framework's applicability to multiple healthcare regression tasks.
Within this framework, we propose two approaches for improving performance in the absence of large-scale data: leveraging domain knowledge to choose the level of abstraction at which each modality is represented, and a tree-structured convolutional neural network for integrating information from the heterogeneous modalities. The framework is discussed in more detail for two cases, Alzheimer's disease progression prediction and radiation therapy treatment planning. The former predicts a scalar target variable, while the latter approximates a two-dimensional one. For the first application, performance is compared with previous submissions on the same dataset, and it outperforms the best reported results.
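To make the integration idea concrete, the following is a minimal PyTorch sketch of a two-branch ("tree") network in the spirit described above: a convolutional leaf for the local modality, a dense leaf for the global modality, and a shared regression root. The class name, layer sizes, and input shapes are illustrative assumptions, not the dissertation's actual architecture.

```python
import torch
import torch.nn as nn

class TwoModalityTreeCNN(nn.Module):
    """Hypothetical two-branch ("tree") fusion network: a convolutional
    branch for the local, fine-grained modality and a dense branch for
    the global, higher-level modality, merged at a regression head."""

    def __init__(self, local_channels=1, global_dim=16):
        super().__init__()
        # Leaf 1: convolutions extract local patterns
        # (e.g., from an imaging or time-series modality).
        self.local_branch = nn.Sequential(
            nn.Conv1d(local_channels, 8, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(4),   # pool to a fixed-size summary
            nn.Flatten(),              # -> 8 * 4 = 32 features
        )
        # Leaf 2: a small MLP embeds the global modality
        # (e.g., demographic or clinical variables).
        self.global_branch = nn.Sequential(
            nn.Linear(global_dim, 32),
            nn.ReLU(),
        )
        # Root: concatenated branch outputs feed one regression head.
        self.head = nn.Sequential(
            nn.Linear(32 + 32, 16),
            nn.ReLU(),
            nn.Linear(16, 1),          # scalar target, as in the first case
        )

    def forward(self, x_local, x_global):
        fused = torch.cat([self.local_branch(x_local),
                           self.global_branch(x_global)], dim=1)
        return self.head(fused)

# Example: batch of 2 subjects, a length-64 local signal and 16 global features.
model = TwoModalityTreeCNN()
y_hat = model(torch.randn(2, 1, 64), torch.randn(2, 16))
print(y_hat.shape)  # torch.Size([2, 1])
```

The design point the sketch illustrates is that each modality is processed at its own level of abstraction before fusion, so the network does not force local and global information through a single shared encoder.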