


3. Introduction and Related Work




The paper read for this group meeting is "Multi-modal Graph Learning for Disease Prediction" (SCI Q1).


        Benefiting from the powerful expressive capability of graphs, graph-based approaches have been widely applied to multi-modal medical data and have achieved impressive performance in various biomedical applications. For disease prediction tasks, most existing graph-based methods define the graph manually based on a specified modality (e.g., demographic information), and then integrate the other modalities to obtain the patient representation through Graph Representation Learning (GRL).


        However, constructing an appropriate graph in advance is not a simple matter for these methods. Meanwhile, the complex correlation between modalities is ignored. These factors inevitably leave the model without sufficient information about the patient's condition for a reliable diagnosis. To this end, we propose an end-to-end Multi-modal Graph Learning framework (MMGL) for disease prediction with multi-modality. To effectively exploit the rich disease-related information across modalities, modality-aware representation learning is proposed to aggregate the features of each modality by leveraging the correlation and complementarity between the modalities.
        Furthermore, instead of defining the graph manually, the latent graph structure is captured through adaptive graph learning. This structure can be jointly optimized with the prediction model, thus revealing the intrinsic connections among samples. Our model is also applicable to inductive learning on unseen data. Extensive experiments on two disease prediction tasks demonstrate that the proposed MMGL achieves more favorable performance.

Although the above methods have achieved remarkable performance, three key issues remain for graph-based methods in disease prediction tasks, and even in some other biomedical areas:
        (i) Insufficient inter-modal relationship mining. Each modality provides different information for the diagnosis of a disease, and this information is complementary but also partly redundant. However, both the concatenation [20], [23], [24] and the intra-modal attention mechanisms [21], [22] adopted in previous studies struggle to capture the latent inter-modal correlation, which may bias the learned representation towards a single modality. In addition, general multi-modal shared representation learning methods focus only on capturing the commonalities between modalities and ignore their dissimilarities, possibly losing complementary information.
        (ii) Hand-designing the graph adjacency matrix in a multi-stage framework. Both existing single-graph based methods [11], [16], [25], [26] and multi-graph based methods [20], [22], [27] construct the graph through hand-designed similarity measures, which inevitably require careful tuning and are thus difficult to generalize to downstream tasks. Meanwhile, in a multi-stage framework the individual parts, such as multi-modal representation learning, graph construction, and prediction, are trained independently of each other. This practice not only weakens the integrity of the model, but also leads to suboptimal performance in downstream tasks. A better approach is to learn the graph adaptively, which has been studied in GNNs to some extent [28]–[30]. So far, however, little attention has been paid to graph structure learning in the biomedical field [23].

        (iii) Difficult to apply to inductive learning. Approaches based on spectral graph convolution, such as [11], [16], [24], are hard to generalize to unseen samples. Besides, to accommodate the inductive setting, multi-graph based methods [20], [22], [27] must measure the relationship of unseen samples on each graph, which is essential but cumbersome.
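
To make issue (ii) concrete, here is a minimal NumPy sketch of a hand-designed patient graph. It is not taken from the paper: the Gaussian kernel, the function name `handcrafted_adjacency`, the parameters `sigma` and `threshold`, and the toy demographic values are all assumptions chosen for illustration. It only shows how strongly the resulting graph depends on hand-tuned parameters.

```python
import numpy as np

def handcrafted_adjacency(features, sigma=1.0, threshold=0.5):
    """Build a patient graph from one modality with a Gaussian kernel.

    features: (n_patients, n_features) array of the chosen modality.
    sigma, threshold: hand-tuned parameters (hypothetical names).
    """
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)            # pairwise squared distances
    similarity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    adjacency = (similarity >= threshold).astype(float)
    np.fill_diagonal(adjacency, 0.0)                # no self-loops
    return adjacency

# Toy demographic modality (made-up values): age in decades, sex.
demo = np.array([[6.5, 0.0], [6.6, 0.0], [7.9, 1.0], [8.0, 1.0]])

A_tight = handcrafted_adjacency(demo, sigma=0.5, threshold=0.5)
A_loose = handcrafted_adjacency(demo, sigma=5.0, threshold=0.5)
print(A_tight.sum(), A_loose.sum())
```

With the same data and threshold, changing only `sigma` turns a graph with two separate patient pairs into a fully connected one, which is the kind of sensitivity that motivates learning the graph instead.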
To address the issues mentioned above, this paper concentrates on graph learning for disease prediction with multi-modality. The main contributions are as follows:
1. We propose a Multi-modal Graph Learning model (MMGL) for disease prediction with multi-modality, which is applicable to inductive learning scenarios.
2. To characterize a patient with multi-modality, the proposed modality-aware representation learning (MARL) obtains not only the modality-shared representation, which serves as the commonality, but also the patient-sensitive modality-specified representation, which serves as complementary information.

3. To reveal the intrinsic relations among patients, adaptive graph learning (AGL) is proposed to obtain a latent graph structure that flexibly fits GNN-based downstream tasks. Furthermore, the unified modeling of MARL and AGL can be jointly optimized in an end-to-end way, facilitating more efficient training and inductive testing.
4. Compared with state-of-the-art approaches, MMGL achieves comparable or even significantly better results on two disease datasets, which indicates its advantages in disease prediction tasks. Meanwhile, the visualization of the contribution scores, which reflect the learned dependencies among modalities, provides modality-level explainable decision support for doctors in real medical applications and inspiration for disease research.


Architecture of three types of multi-modal shared representation learning. (a) Direct concatenation. (b) Intra-modal attention based weighted fusion. (c) Our modality-aware representation learning. Fusions (a) and (b) have only one interactive operation between the features of different modalities, which takes place at the end of the module. In contrast, our module has more interactive operations, applying multi-modal attention to features from different modalities.
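
The difference between the three fusion styles in the caption above can be sketched roughly as follows. This is only an illustration under stated assumptions: every modality is taken to be already embedded into a common dimension `d`, and (c) is approximated by generic scaled dot-product attention across the modality axis, which is a stand-in and not necessarily the paper's exact formulation. All function names and weight matrices are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def concat_fusion(modal_feats):
    """(a) Concatenate: (n, m, d) -> (n, m * d)."""
    n, m, d = modal_feats.shape
    return modal_feats.reshape(n, m * d)

def intra_modal_attention_fusion(modal_feats, score_vec):
    """(b) Score each modality from its own features only, then take a weighted sum."""
    scores = modal_feats @ score_vec                 # (n, m)
    weights = softmax(scores, axis=1)                # one weight per modality
    return (weights[:, :, None] * modal_feats).sum(axis=1)   # (n, d)

def cross_modal_attention(modal_feats, W_q, W_k, W_v):
    """(c) Stand-in: every modality attends to all modalities of the same patient."""
    Q = modal_feats @ W_q                            # (n, m, d_k)
    K = modal_feats @ W_k
    V = modal_feats @ W_v
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(Q.shape[-1])  # (n, m, m)
    attn = softmax(scores, axis=-1)                  # modality-to-modality weights
    return attn @ V, attn

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3, 4))                       # 5 patients, 3 modalities, d = 4
fused_a = concat_fusion(X)
fused_b = intra_modal_attention_fusion(X, rng.normal(size=4))
out_c, attn = cross_modal_attention(X, *(rng.normal(size=(4, 4)) for _ in range(3)))
```

In (a) and (b) the modalities meet only in the final reshape or weighted sum, whereas in (c) the `(m, m)` attention map lets each modality's representation depend on every other modality before any fusion happens.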

Overview of the MMGL architecture. The multi-modal features X are first embedded into the modality-specified representation space and the modality-shared representation space through modality-aware representation learning. Then an adjacency matrix A for X is learned through adaptive graph learning. Finally, the prediction results are obtained through a GNN based on A and H, where H = Concat(H_sh, H_sp).
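
The data flow in this overview (H from concatenation, A learned from the representations, then a GNN on A and H) can be sketched as a forward pass. The details below are assumptions, since these notes do not record how AGL or the GNN are defined: the graph is taken as thresholded cosine similarity in a projected space with a hypothetical learnable matrix `P`, and the GNN is a single standard GCN-style propagation step. The representations `H_sh` and `H_sp` are random placeholders standing in for MARL outputs.

```python
import numpy as np

def learn_adjacency(H, P, threshold=0.0):
    """Assumed adaptive graph: cosine similarity after a learnable projection P."""
    Z = H @ P
    Z = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    S = Z @ Z.T                                  # pairwise cosine similarity
    return np.where(S > threshold, S, 0.0)       # drop weak or negative links

def gcn_layer(A, H, W):
    """One GCN-style step with self-loops and symmetric normalization."""
    A_hat = A.copy()
    np.fill_diagonal(A_hat, 1.0)
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return A_norm @ H @ W

def predict(H_sh, H_sp, P, W):
    H = np.concatenate([H_sh, H_sp], axis=1)     # H = Concat(H_sh, H_sp)
    A = learn_adjacency(H, P)                    # graph learned from representations
    logits = gcn_layer(A, H, W)                  # GNN on A and H
    return A, logits.argmax(axis=1)

rng = np.random.default_rng(0)
n, d_sh, d_sp, n_classes = 6, 4, 4, 2
H_sh = rng.normal(size=(n, d_sh))                # placeholder shared representation
H_sp = rng.normal(size=(n, d_sp))                # placeholder specified representation
P = rng.normal(size=(d_sh + d_sp, 3))            # hypothetical graph-learning weights
W = rng.normal(size=(d_sh + d_sp, n_classes))    # hypothetical classifier weights
A, y_pred = predict(H_sh, H_sp, P, W)
```

Because A is computed from H by a differentiable function, `P` and `W` could in principle be trained together with the representation module in an autodiff framework, which is the sense in which the notes describe the model as end-to-end. A new patient only needs a representation to obtain a row of A, which hints at why such a design suits the inductive setting.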

In this paper, we propose a multi-modal graph learning framework named MMGL for disease prediction. To capture the shared and complementary information among modalities, we propose modality-aware representation learning, which simultaneously obtains the modality-specified representation and the modality-shared representation while taking inter-modal correlations into account. Furthermore, a lightweight adaptive graph learning module is proposed to reveal the intrinsic relations among subjects and to construct an optimal graph structure for downstream tasks. Meanwhile, MMGL can be jointly optimized in an end-to-end way, which enables more efficient training and inductive testing. Our ongoing research will extend MMGL to unified graph learning for incomplete data and to more biomedical tasks.

