Open Access Paper
11 September 2024 Neural network integrating graph attention and LSTM based on brain effective connectivity for diagnose of Alzheimer’s disease
Proceedings Volume 13270, International Conference on Future of Medicine and Biological Information Engineering (MBIE 2024); 1327019 (2024) https://doi.org/10.1117/12.3039959
Event: 2024 International Conference on Future of Medicine and Biological Information Engineering (MBIE 2024), 2024, Shenyang, China
Abstract
In recent years, advances in deep learning have substantially improved AD/MCI classification, but the relationship between changes in functional connectivity and structural connectivity remains to be established. To address this issue, this article proposes a diagnostic framework that builds on the brain's effective connectivity network and integrates a Graph Attention Network (GAT) with a Long Short-Term Memory (LSTM) network. By capturing interactions between brain regions and their dynamic changes, the framework improves diagnostic accuracy. Evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the framework performed well at recognizing and predicting Alzheimer's disease, which illustrates its clinical potential. This paper details the design, implementation, and initial validation of the proposed method, emphasizing its effectiveness.

1.

INTRODUCTION

As the worldwide population ages, the diagnosis and treatment of age-related illnesses such as Alzheimer's disease (AD) has become a top priority; AD is an irreversible neurological disorder, and the number of people affected continues to grow. Deep learning has become increasingly capable of using neuroimaging data for classification and prediction tasks. Many studies have used machine learning techniques such as support vector machines (SVMs) [1] and deep learning strategies to predict which MCI patients will progress to AD based on their neuroimaging results [2]. However, these approaches often fail to capture the complex relationships between distinct brain regions, resulting in less accurate diagnosis. Unlike traditional functional networks, an effective connectivity network establishes directed connections between brain regions, revealing their interactions and dynamics during cognitive tasks and thereby uncovering complex interactions between brain regions. Optimizing connection weights with graph attention networks (GATs) permits effective information transfer and processing, and improves the model's ability to perceive subtle differences in brain activity. LSTMs are well suited to processing time-series data and can model the dynamics of the brain's causal networks over time. Therefore, combining a GAT with an LSTM enables the framework to learn and predict key AD biomarkers from neuroimaging data, giving a more accurate and comprehensive diagnostic approach.

As a result, this research proposes an AD diagnosis technique based on brain effective connectivity networks, aiming to improve diagnostic accuracy and comprehensiveness. The paper details the design, implementation, and preliminary evaluation results, validating the proposed method's effectiveness and potential for clinical application.

2.

METHOD

In this section, the brain effective connectivity network based on a Bayesian Network is introduced first, followed by a detailed description of the Hill Climbing (HC) algorithm. To analyse the brain effective connectivity network, a GAT is used to aggregate spatial features and an LSTM-based encoder/decoder is used to predict time series. The construction of the model is illustrated in Figure 1.

Figure 1.

Overall Process Diagram


2.1

Construction of brain effective connectivity networks

2.1.1

Application in the experiment

This study proposes a method based on a Bayesian Network (BN) to construct brain effective connectivity networks, represented by a graph structure matrix A ∈ Rn×n, with the aim of exploring causal relationships between brain regions. We employ a heuristic search algorithm, the HC algorithm, to search for optimal network structures within the space of possible network structures. The Bayesian Information Criterion (BIC) is adopted as our scoring function to evaluate the goodness of fit of different network structures; the structure with the highest BIC score is selected as the current optimal solution. We then continue to search for a better Bayesian network through an iterative local optimization strategy.

2.1.2

Bayesian network

A BN (Bayesian Network) is a probabilistic graphical model used to describe the joint probability distribution of a set of random variables [3]. A BN is a directed acyclic graph (DAG) with structure G = (V, E), consisting of a set of nodes and a set of edges. The set of nodes V = {X1, X2, …, Xn} corresponds to the random variables in the system, while the directed edges (or arcs) in E ⊆ V × V represent causal or conditional dependencies between these variables.

For each node Xi ∈ V in the network, a conditional probability distribution P(Xi | pa(Xi)) is defined, where pa(Xi) denotes the set of parent nodes of Xi. These conditional probability distributions are central to the network, as they quantify the strength of dependency between nodes and their parents. In addition, the joint probability distribution of the whole network can be reconstructed via the Markov condition [4]. We use the BIC (Bayesian Information Criterion) as a model selection criterion for Bayesian networks. The BIC, also known as the Schwarz Information Criterion, evaluates the balance between a model's goodness of fit and its complexity; a higher BIC score means the model fits the data better while avoiding overfitting. The calculation of the BIC involves the model's log-likelihood function (LL), the dataset D, the network structure T (its nodes and edges), and the number of observations N [5].

LL(T | D) = Σ_{i=1..n} Σ_{j=1..qi} Σ_{k=1..ri} N_ijk · log(N_ijk / N_ij)

BIC(T | D) = LL(T | D) − (log N / 2) · |T|

where qi is the number of parent configurations of node Xi, ri is the number of states of Xi, N_ijk is the number of samples in D with Xi in state k and its parents in configuration j, N_ij = Σk N_ijk, and |T| = Σ_{i=1..n} qi(ri − 1) is the number of free parameters of the network.
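As a rough illustration, the decomposable BIC score described above can be computed per node from count statistics. The following sketch assumes discrete variables and the score form LL − (log N / 2) · k; the function name and data layout are hypothetical, not the paper's implementation:

```python
import math
from collections import Counter

def node_bic(data, child, parents, arity):
    """BIC contribution of one node: LL minus (log N / 2) * free parameters.

    data    : list of dicts mapping variable name -> discrete state
    child   : name of the node being scored
    parents : list of its parent node names
    arity   : dict mapping variable name -> number of states
    """
    n = len(data)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    # Log-likelihood term: sum of N_jk * log(N_jk / N_j) over observed counts.
    ll = sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
    # Free parameters: (states of child - 1) per parent configuration.
    k = (arity[child] - 1) * math.prod(arity[p] for p in parents)
    return ll - 0.5 * math.log(n) * k
```

Because the score decomposes over nodes, a structure search only needs to re-score the nodes touched by each candidate edge change.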

2.1.3

Hill climbing algorithm

The HC algorithm is an optimization algorithm based on heuristic search. Its core mechanism is to iteratively evaluate and refine the current network structure so as to find a statistically optimal model within the space of possible network structures. The algorithm starts with an initial network structure and applies a predefined scoring function to evaluate its fit. During each iteration, the algorithm searches the neighborhood of the current structure, incrementally building the network. If a structure with a higher score is found in the neighborhood, it is adopted as the new current structure and the search continues; if no higher-scoring structure is found, the algorithm terminates and the current structure is taken as a local optimum [6]. In constructing brain effective connectivity networks, the HC algorithm is employed to identify causal interactions between brain regions. These interactions are crucial for understanding the pathophysiological processes of neuropsychiatric disorders.
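The loop described above can be sketched as follows. This is a generic illustration over sets of directed edges with a pluggable scoring function; the move set (add, delete, reverse one edge) and the `score` interface are common conventions, not necessarily the paper's exact implementation:

```python
def creates_cycle(edges, n):
    """Kahn's algorithm: the graph is acyclic iff all nodes can be ordered."""
    indeg = {v: 0 for v in range(n)}
    for _, v in edges:
        indeg[v] += 1
    queue = [v for v in range(n) if indeg[v] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    queue.append(b)
    return seen < n

def hill_climb(n, score, max_iter=100):
    """Greedy structure search over DAGs on n nodes.

    score: maps a set of directed edges to a real-valued network score.
    """
    edges, best = set(), score(set())
    for _ in range(max_iter):
        improved = False
        for u in range(n):
            for v in range(n):
                if u == v:
                    continue
                if (u, v) in edges:
                    moves = [edges - {(u, v)},                  # delete edge
                             (edges - {(u, v)}) | {(v, u)}]     # reverse edge
                else:
                    moves = [edges | {(u, v)}]                  # add edge
                for cand in moves:
                    if creates_cycle(cand, n):
                        continue                                # keep it a DAG
                    s = score(cand)
                    if s > best:
                        edges, best, improved = cand, s, True
                        break
        if not improved:
            break                                               # local optimum
    return edges, best
```

The acyclicity check after every move is what guarantees the search stays inside the space of valid BN structures.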

2.2

Graph Attention Network

GAT is a robust graph-based method that lets nodes assimilate information from neighboring nodes via an attention mechanism [7]. In the proposed method, the brain causal network component uses a GAT to process the information between different brain regions. In this way, the model does not merely measure single-node information but also considers the interactions with neighboring nodes. Fig. 2 shows the process of handling brain causal network data with the GAT; the process comprises the input graph, the GAT layers, and the output.

Figure 2.

Process and Analyze Complex Graph Structures.


First, the adjacency matrix of the brain effective connectivity graph G = (V, E) was analyzed, where V = {X1, X2, …, Xn} are the vertices and E represents the directed edges connecting pairs of brain regions. We denote the time-series features corresponding to the brain region of each node as H = {h1, h2, …, hn}, hi ∈ R^120, where hi is the feature of the i-th node and 120 is the feature dimension.

Subsequently, the attention coefficient between node vi and each adjacent node vj was computed, and the scores over the neighborhood were normalized in the following manner:

α_ij = exp( LeakyReLU( aᵀ [W h_i ∥ W h_j] ) ) / Σ_{k∈N_i} exp( LeakyReLU( aᵀ [W h_i ∥ W h_k] ) )

where W is a learnable weight matrix and a is a single-layer neural network with LeakyReLU as the activation function.

For node vi, the output of the multi-head attention mechanism at layer l can be defined as follows:

h_i^(l+1) = ∥_{k=1..K} σ( Σ_{j∈N_i} α_ij^(k) W^(k) h_j^(l) )

where σ is a sigmoid activation function that constrains the score between 0 and 1, and K is the number of independent attention heads. The symbol ∥ denotes the concatenation of the K heads; in the final GAT layer the heads are averaged instead of concatenated. The final node features are obtained after L layers of GAT computation. Finally, BatchNorm1d is applied to batch-normalize the data, which speeds up the convergence of gradient descent. These operations are performed twice to obtain features that aggregate information from second-order neighbors.
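A single attention head of the computation described above can be sketched in NumPy as follows; the dimensions are illustrative (the paper uses 120-dimensional node features), and the explicit self-loop is a common GAT convention assumed here:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(H, A, W, a):
    """One attention head.

    H: (n, d_in) node features; A: (n, n) adjacency matrix;
    W: (d_in, d_out) shared weights; a: (2 * d_out,) attention vector.
    """
    n = H.shape[0]
    Z = H @ W                                  # shared linear transform
    d = Z.shape[1]
    # e_ij = LeakyReLU(a^T [W h_i || W h_j]) splits into two dot products.
    s_src, s_dst = Z @ a[:d], Z @ a[d:]
    e = leaky_relu(s_src[:, None] + s_dst[None, :])
    mask = (A + np.eye(n)) > 0                 # attend to neighbors and self
    e = np.where(mask, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)       # numerically stable softmax
    alpha = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)
    return alpha @ Z                           # attention-weighted aggregation
```

In the multi-head case, K such outputs are concatenated (or averaged in the last layer) before batch normalization.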

2.3

Loss calculation and optimization based on LSTM

2.3.1

Application of LSTM in this experiment

In the experiment, LSTM networks are employed to process and analyze time series data. The specific application process is as follows:

First, 120 consecutive time-series data points are input sequentially into the encoder, and the LSTM network iteratively predicts the features of the next time point. This step-by-step prediction enables the model to continuously update its internal hidden states and memory cells, effectively capturing long-term dependencies and enhancing its ability to handle complex time-series data.

The decoder then receives the memory-cell states and hidden-layer representations output by the encoder and iteratively predicts the subsequent 120 time-series features. During prediction, the decoder uses a threshold mechanism to select its inputs [8]. The application of the LSTM in the experiment is shown in Fig. 3.

Figure 3.

Architecture of the Proposed LSTM

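The encoder/decoder loop described above can be sketched with a hand-rolled LSTM cell. Dimensions are reduced from the paper's 120 for readability, and the output projection `W_out` and the omission of the threshold (input-selection) mechanism are simplifying assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell: forget, input, output gates plus candidate state."""

    def __init__(self, d_in, d_hid, rng):
        # One stacked weight matrix covers all four gates; biases start at zero.
        self.W = rng.standard_normal((4 * d_hid, d_in + d_hid)) * 0.1
        self.b = np.zeros(4 * d_hid)
        self.d_hid = d_hid

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        f, i, o, g = np.split(z, 4)
        f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
        c = f * c + i * np.tanh(g)     # gated update of the memory cell
        h = o * np.tanh(c)             # new hidden state
        return h, c

def encode_decode(series, cell, W_out, horizon):
    """Encode a (T, d) window, then roll out `horizon` predictions."""
    h = c = np.zeros(cell.d_hid)
    for x in series:                   # encoder: absorb the observed window
        h, c = cell.step(x, h, c)
    preds, x = [], series[-1]
    for _ in range(horizon):           # decoder: feed each output back in
        h, c = cell.step(x, h, c)
        x = W_out @ h                  # project hidden state to feature space
        preds.append(x)
    return np.stack(preds)
```

Training would compare `preds` against the next observed window with the MSE loss described below and backpropagate through both loops.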

Finally, the predicted data are compared with the actual observations, the loss function is calculated, and the model parameters are optimized using the backpropagation algorithm. This process reduces prediction error and improves model accuracy. The loss function used here is the Mean Squared Error (MSE):

MSE = (1/T) Σ_{t=1..T} (x_t − x̂_t)²

where x_t is the observed value at time t, x̂_t is the predicted value, and T is the number of predicted time points.

2.3.2

Long short-term memory

Recurrent Neural Networks (RNNs) [9] are a class of deep neural networks that can model the dynamic temporal characteristics of sequences by maintaining internal hidden states. However, basic RNNs encounter the problems of vanishing and exploding gradients when dealing with long sequences. LSTM networks address these issues by adding three gating mechanisms: forget gate, input gate, and output gate [10]. These mechanisms extend the basic RNN, enabling LSTMs to capture long-term dependencies in sequences and make the optimization process easier.
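For reference, the three gates mentioned above take their standard textbook form (these are the generic LSTM equations, not notation specific to this paper), with x_t the input, h_{t-1} the previous hidden state, and ⊙ elementwise multiplication:

```latex
f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)                         % forget gate
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)                         % input gate
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)                         % output gate
c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c [h_{t-1}, x_t] + b_c)
h_t = o_t \odot \tanh(c_t)
```

The additive update of the cell state c_t is what lets gradients flow across long time spans without vanishing.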

3.

RESULT

A model combining a graph attention network and an LSTM is proposed based on the brain effective connectivity network. The model consists of three GAT blocks and an autoencoder with LSTM components. For the experiment, 178 fMRI scans were collected from the ADNI database, with a 1:1 ratio of Alzheimer's patients to cognitively normal individuals. From these data, brain effective connectivity networks were constructed, and metrics such as accuracy, recall, and specificity were evaluated on this dataset.

3.1

Dataset

In the present study, data were derived from the ADNI dataset, specifically ADNI 3 [11]. The final data of 178 subjects comprised 89 Alzheimer's disease patients (AD) and 89 cognitively normal people (CN). Rs-fMRI data were used, and the DPABI software package for Matlab was employed. After pre-processing the 178 datasets to derive signals from the distinct brain regions delineated by the AAL parcellation, the data were subdivided into multiple segments, each encompassing 20 data points. The parcellation contains 116 brain regions in total, with the study focusing on the N = 90 cerebral regions. Average fMRI BOLD signals are employed to measure brain activity, giving a temporal feature matrix F ∈ R^(N×T).
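The segmentation step described above (regional BOLD series cut into 20-point segments) might look like the following; non-overlapping windows are an assumption, since the paper does not specify the stride:

```python
import numpy as np

def segment_bold(F, seg_len=20):
    """Split BOLD signals into consecutive non-overlapping segments.

    F: (N, T) average BOLD signals, one row per brain region.
    Returns an array of shape (num_seg, N, seg_len).
    """
    N, T = F.shape
    num_seg = T // seg_len                 # drop any trailing remainder
    F = F[:, : num_seg * seg_len]
    return F.reshape(N, num_seg, seg_len).transpose(1, 0, 2)
```

Each resulting (N, 20) segment can then be fed to the network as one temporal sample.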

3.2

Brain effective connectivity network construction effect

Fig. 4 shows the brain effective connectivity networks constructed for two randomly selected subjects, one AD patient and one CN subject. Comparing the two reveals a significant difference in connectivity: the number of connections in the effective connectivity network is markedly reduced in the AD patient. This is consistent with recent studies reporting cognitive decline in AD patients. It further suggests that the directional information about inter-regional interactions contained in the brain effective connectivity network is helpful for the auxiliary diagnosis of AD, compared with the traditional brain functional network.

Figure 4.

Brain Effective Connectivity Networks of AD and CN (left AD, right CN)


3.3

Model comparison

The classification performance of our model and of other widely used methods from the literature is listed in Table 1.

Table 1.

Classification Performance.

Model      Accuracy   Recall   F1 score   Precision
SVM [12]   72.2       74.6     72.1       73.7
GAT [13]   58.3       95.2     58.2       72.7
KNN [14]   63.8       63.6     63.6       63.7
Ours       77.7       80.9     80.9       73.3

Table 1 shows that, with appropriate parameters selected for each method on the same data, the proposed method achieves the best results on most of the common evaluation indexes for the classification task. This is because, unlike the comparison methods, the proposed method no longer considers only a single modality of data for diagnosis; it jointly considers the constructed brain effective connectivity network graph data and the time-series data, which makes the diagnostic results more accurate and reliable.
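For clarity, the four indexes reported in Table 1 are computed from the binary confusion matrix as follows; the counts in the usage example are illustrative, not the paper's:

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, recall, F1 and precision from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, f1, precision

# Illustrative counts: 8 true positives, 2 false positives,
# 7 true negatives, 3 false negatives.
acc, rec, f1, prec = metrics(8, 2, 7, 3)
```

Note that recall and F1 favor models that miss few AD cases, which matters clinically more than raw accuracy.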

4.

CONCLUSION

The brain effective connectivity network carries richer information than the traditional brain functional network, especially about inter-regional information interaction, which is closely related to the weakened inter-regional information flow in AD patients reported in other studies. However, to our knowledge, no previous study has used brain effective connectivity networks for the auxiliary diagnosis of AD. This work proposes a model combining a GAT and an LSTM based on the brain effective connectivity network to discriminate between AD and CN; the GAT and LSTM jointly consider the graph structure and time-series information. In addition to good classification performance, the constructed brain effective connectivity networks clearly reflect the differences in brain connectivity between the two groups of subjects.

ACKNOWLEDGMENT

This work was supported by National Training Programs of Innovation and Entrepreneurship for Undergraduates (240217), National Natural Science Foundation of China (62072089) and the Fundamental Research Funds for Central Universities (N2424010-19).

REFERENCES

[1] Han, H., "Frequency-dependent changes in the amplitude of low-frequency fluctuations in amnestic mild cognitive impairment: A resting-state fMRI study," NeuroImage, 55(1), 287–295 (2011). https://doi.org/10.1016/j.neuroimage.2010.11.059

[2] Gao, Y., Lewis, N., Calhoun, V. D. and Miller, R. L., "Interpretable LSTM model reveals transiently-realized patterns of dynamic brain connectivity that predict patient deterioration or recovery from very mild cognitive impairment," Computers in Biology and Medicine, 161, 107005 (2023). https://doi.org/10.1016/j.compbiomed.2023.107005

[3] Neflan, A. V., "Learning SNP dependencies using embedded Bayesian networks," IEEE Computational Systems, (2006).

[4] Gámez, J. A., Mateo, J. L. and Puerta, J. M., "Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood," Data Mining and Knowledge Discovery, 22(1–2), 106–148 (2010).

[5] Adhitama, R. P., Saputro, D. R. S. and Sutanto, S., "Hill climbing algorithm on Bayesian network to determine probability value of symptoms and eye disease," Barekeng, 16(4), 1271–1282 (2022). https://doi.org/10.30598/barekengvol16iss4year2022

[6] Ghosh, K. K., Ahmed, S., Singh, P. K., Geem, Z. W. and Sarkar, R., "Improved Binary Sailfish Optimizer based on Adaptive β-Hill Climbing for feature selection," IEEE Access, 8, 83548–83560 (2020). https://doi.org/10.1109/Access.6287639

[7] Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P. and Bengio, Y., "Graph attention networks," arXiv (Cornell University), (2018).

[8] Revathi, T. K., Balasubramaniam, S., Sureshkumar, V. and Dhanasekaran, S., "An improved Long Short-Term Memory algorithm for cardiovascular disease prediction," Diagnostics, 14(3), 239 (2024). https://doi.org/10.3390/diagnostics14030239

[9] Sherstinsky, A., "Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network," Physica D: Nonlinear Phenomena, 404, 132306 (2020). https://doi.org/10.1016/j.physd.2019.132306

[10] Gers, F. A., Schmidhuber, J. and Cummins, F., "Learning to forget: continual prediction with LSTM," Neural Computation, 12(10), 2451–2471 (2000). https://doi.org/10.1162/089976600300015015

[11] ADNI | Alzheimer's Disease Neuroimaging Initiative, http://adni.loni.usc.edu/

[12] Rabeh, A. B., Benzarti, F. and Amiri, H., "Diagnosis of Alzheimer diseases in early step using SVM (Support Vector Machine)," in 2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV), 364–367 (2016).

[13] Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P. and Bengio, Y., "Graph attention networks," in Proceedings of the 6th International Conference on Learning Representations, (2018).

[14] Vijaykumar, S., "Alzheimer's Disease diagnosis by using dimensionality reduction based on KNN classifier," Biomedical and Pharmacology Journal, 10(4), (2017).
Yu Deng, Jiani Li, Yitong Huang, and Meiyu Liu "Neural network integrating graph attention and LSTM based on brain effective connectivity for diagnose of Alzheimer’s disease", Proc. SPIE 13270, International Conference on Future of Medicine and Biological Information Engineering (MBIE 2024), 1327019 (11 September 2024); https://doi.org/10.1117/12.3039959
KEYWORDS
Brain, Data modeling, Alzheimer disease, Neural networks, Diagnostics, Data processing, Mathematical optimization