
Chronic ankle instability (CAI) is common in patients with lateral collateral ligament (LCL) injuries of the ankle [1,2]. Up to 40% of patients who have previously suffered a lateral ankle sprain develop CAI [3]. These patients often require radical treatment to stabilize the ankle, such as orthoses, taping, or surgical repair of the lateral ligaments [4].
In current practice, physicians diagnose CAI based on physical and medical examinations, and medical imaging and gait kinetics of CAI are rapidly developing into a worldwide research focus [5,6,7,8]. Deep learning algorithms have accordingly been developed to improve the reading of radiographs [9]. To manage CAI precisely, physicians need to perform gait analysis on patients; the lack of gait analysis impedes our understanding of patients' injuries and their impact on locomotion [10]. Many researchers argue that gait analysis is necessary for patients with foot and ankle ligament injuries, especially in severe and recurrent cases [11]. Gait analysis includes spatiotemporal and kinematic characteristics that describe pathological changes in foot and ankle motion during walking [12,13,14]. Such quantitative analysis provides a solid foundation for deep learning to deliver automatic, accurate, and immediate detection for patients with ankle instability [15]. However, few studies have addressed intelligent detection based on gait analysis for patients with CAI.
To address this issue, our study aimed to augment the gait features of CAI patients with highly representative, yet fully private, synthetic data for intelligent detection. To achieve this goal, we first employed Dual Generative Adversarial Networks (Dual-GAN), a dataset augmentation method, to synthesize the significant spatiotemporal and kinematic characteristics measured in our previous studies [16,17]. The Dual-GAN was trained on the injury and control groups independently to learn the probability distributions of the spatiotemporal and kinematic characteristics of walking. We then used the t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm to visually evaluate the similarity between the real and synthesized features. The real and synthesized data were fed, under different strategies, to CAI detection models based on Long Short-Term Memory (LSTM), LSTM-Fully Convolutional Networks (LSTM-FCN), and Convolutional LSTM for training. Finally, confusion matrices were used to profile the diagnostic results of these three intelligent CAI detection models.
The Generative Adversarial Network (GAN) proposed by Goodfellow et al. is an unsupervised deep learning model that plays a zero-sum game between two competing networks, a generator and a discriminator. The generator attempts to synthesize samples that confuse the discriminator, while the discriminator is a binary classifier that tries to distinguish the synthesized samples from a mixture of real and synthesized samples. Several improved GANs have since been proposed for state-of-the-art applications, focusing mainly on optimizing the network structure and the objective function.
Several altered network structures have been proposed to improve GAN performance. Radford et al. [18] presented the Deep Convolutional GAN (DCGAN), which replaces the fully connected layers of the generator/discriminator with transposed/general deep convolutional layers and uses fractional-stride convolutions instead of pooling to improve stability. Zhang et al. [19] synthesized photo-realistic images from text with the Stack GAN (SGAN), which comprises two conditional GANs that decompose a hard problem into sub-problems: the Stage-I GAN synthesizes low-resolution images, and the Stage-II GAN takes the Stage-I results as inputs and generates high-resolution data with realistic details. Yi et al. [20] proposed the Dual-GAN for mutual translation between two domains. Dual learning trains two opposite translators (source-to-target and target-to-source) simultaneously by minimizing the reconstruction loss through the coupled GANs, which form a nested feedback loop that reinforces learning. Compared with the classic GAN, the Dual-GAN converges more easily and has better stability and performance in data synthesis. Moreover, several studies, such as the Triple-GAN for hyperspectral classification on small training datasets [21], the Star GAN for multi-domain image-to-image translation [22], the Big-BiGAN, which improves representation learning through progress in image generation quality [23], and the Cycle-Deblur GAN, which improves the quality of chest cone-beam computed tomography images [24], have shown that these altered networks make GANs suitable for a wide range of image applications.
Furthermore, improved objective and penalty functions have been used to address vanishing gradients and mode collapse in GANs. The f-divergence is a general divergence measure between two probability distributions that includes the Kullback-Leibler divergence, the Jensen-Shannon divergence, etc. Nowozin et al. presented the unified f-GAN framework and discussed the benefits of training GANs with various f-divergence functions [25]. The f-GAN offers a powerful approach to measuring complex distributions without factorizing assumptions and inspired subsequent work on modified GANs that train well, fast, and concisely. Compared with the cross-entropy loss, the least-squares loss brings the synthesized samples closer to the real data: it provides a smooth, non-saturating gradient in the discriminator that forces the synthesized samples toward the decision boundary [26]. The Wasserstein GAN replaces the Jensen-Shannon divergence with the Wasserstein distance to deal with mode collapse and provide a meaningful loss function [27]; however, its training and convergence are more time-consuming than those of the original GAN. Given these issues, Gulrajani et al. proposed a gradient penalty instead of weight clipping in the Wasserstein GAN and demonstrated that the Wasserstein GAN with gradient penalty (WGAN-GP) trains stably [28].
Various GANs have been widely applied to image synthesis for machine learning on small sample sets in recent studies, particularly for brain images, but rarely to time-series synthesis. The Bidirectional GAN, a novel end-to-end network, uses image contexts and latent vectors to optimize brain MR-to-PET synthesis for Alzheimer's Disease (AD) [29,30]. The multi-directional perception GAN (MP-GAN) can visualize morphological features indicating the severity of AD through its ability to capture salient global features [31]. The Tensor-train, High-order pooling, and Semi-supervised learning-based GAN (THS-GAN) has been proposed to assess mild cognitive impairment and AD [32]. Compared with image synthesis, time-series synthesis must learn the distribution and the dynamics of the data synchronously in a time-series embedding space [33,34].
This controlled laboratory study was conducted at the Peking University Third Hospital. The methods used in this study were reviewed and approved by the Institutional Research Board at the Peking University Third Hospital. Each participant provided informed, written consent before entering the study.
Three patients (all male; mean age 34 years, range 32–37 years) diagnosed with CAI resulting from Grade III LCL injuries of the ankle (severe, complete rupture of the ligament fibers) and ten paired control subjects (all male; mean age 28 years, range 24–32 years) were recruited. The control subjects, drawn from the students and staff of the hospital, had no known lower extremity abnormalities, previous injuries, or surgeries. In total, 211 normal gait cycles and 30 injured gait cycles were collected from the participants. The training dataset for the Dual-GAN consisted of 132 normal gait cycles from six control subjects and 20 injured gait cycles from two patients with CAI. The remaining data, from the other four control subjects and one patient, were used to test the proposed model.
As shown in Figure 1, reflective markers were attached to the shank and foot following the Heidelberg Foot Measurement Model (HFMM) [35]. A visual anthropometric model was built in the software Vicon Nexus 1.8.5 (Vicon Motion Systems Ltd., Oxford, UK). The Vicon MX Motion Capture System (Vicon Motion Systems Ltd., Oxford, UK) with eight cameras tracked the markers within the capture volume and recorded their three-dimensional coordinates at 100 Hz. The raw streaming data recorded during walking were exported as .csv files for further analysis.
The raw data were filtered by a low-pass, zero-phase-shift, first-order Butterworth filter and segmented into gait cycles. The gait cycles were normalized to 100 sample points by linear interpolation for comparison among subjects.
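The resampling step can be sketched with a short routine (a minimal illustration; the function name is ours, and the Butterworth filtering stage is omitted):

```python
import numpy as np

def normalize_gait_cycle(cycle, n_points=100):
    """Resample one segmented gait cycle to a fixed number of samples
    by linear interpolation, so cycles are comparable across subjects."""
    cycle = np.asarray(cycle, dtype=float)
    # Original and target sampling positions on a common 0..1 time axis.
    t_old = np.linspace(0.0, 1.0, num=len(cycle))
    t_new = np.linspace(0.0, 1.0, num=n_points)
    return np.interp(t_new, t_old, cycle)

# Example: a cycle recorded over 87 frames becomes 100 samples.
resampled = normalize_gait_cycle(np.sin(np.linspace(0, 2 * np.pi, 87)))
```

Each marker trajectory (or derived variable) is resampled independently in this way before being fed to the networks.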
In this study, we selected basic gait variables, velocity and micro-adjustment variables, and range of motion (ROM) variables as spatiotemporal and kinematic characteristics for CAI detection.
The basic gait variables reported in our previous study, including the percentage of first/second rocker phase in the gait cycle (%), the stride length (mm), and the duration (s/stride), were significantly different between the patients with LCL injuries of the ankle and control subjects [16].
Five markers covering knee, ankle, and foot in HFMM were chosen for measuring velocity and micro-adjustment variables, including TTU, LML, CCL, DMT2, and HLX.
The results [16] revealed that the significant between-group differences in velocity over the gait cycle involved the maximum velocity (Vmax, mm/10⁻²s) of the knee (TTU), ankle (LML), and foot (CCL, DMT2, HLX); the percentage of time to maximum velocity (TVmax, %) of the ankle (LML) and foot (CCL, DMT2, HLX); the minimum velocity (Vmin, mm/10⁻²s) of the TTU, LML, and DMT2; and the percentage of time to minimum velocity (TVmin, %) of the TTU.
In the 2nd rocker phase, the significant between-group differences in micro-adjustment and velocity included the number of valleys of the TTU, LML, and DMT2; the number of peaks of the TTU, LML, CCL, and DMT2; and the mean velocities of the TTU, LML, DMT2, and HLX.
According to our research, the eight ROM variables calculated in [17], including tibiotalar flexion, forefoot/ankle abduction, medial arch angle, lateral arch angle, subtalar rotation, forefoot/ankle supination, MT I–V angle, and lateral malleolus scale, all contributed to the automatic detection of ankle instability.
The Dual-GAN is a closed-loop feedback system consisting of a pair of tasks. Detection results can be obtained from unlabeled data and then used to improve the machine learning models of the dual tasks. The dual network structure performs cross-domain translation: the network learns a bidirectional mapping between the features in domain F and the results in domain R. Generator GA maps a feature f (f∈F) to a result r (r∈R), and generator GB maps a result r (r∈R) back to a feature f (f∈F). Discriminator DA distinguishes the generated results of GA from the real results in domain R, and discriminator DB distinguishes the generated features of GB from the real features in domain F.
As shown in Figure 2, the real feature f is translated to the result GA(f,z) via GA, and the discriminator DA evaluates how plausible the translation GA(f,z) is in domain R. Here, z and z′ below are both random noise vectors. GA(f,z) is then translated back to domain F using GB, which outputs GB(GA(f,z),z′) as the reconstructed f. Similarly, the real result r is translated as GB(r,z′) and then reconstructed as GA(GB(r,z′),z). The discriminator DA is trained with r as positive samples and GA(f,z) as negative samples, whereas DB takes f as positive samples and GB(r,z′) as negative samples. The generators GA and GB produce "fake" outputs to deceive the discriminators DA and DB while minimizing the reconstruction losses ∥GA(GB(r,z′),z)−r∥ and ∥GB(GA(f,z),z′)−f∥, which optimizes the Dual-GAN.
We employed the gradient penalty of WGAN-GP in place of weight clipping to ensure gradient stability in this study. The loss functions of DA and DB are defined as:
$$ L_{d_A} = \mathbb{E}_{\tilde{x}\sim P_g}[D_A(\tilde{x})] - \mathbb{E}_{x\sim P_r}[D_A(x)] + \lambda\,\mathbb{E}_{\bar{x}\sim P_{\bar{x}}}\left[\left(\left\|\nabla_{\bar{x}} D_A(\bar{x})\right\|_2 - 1\right)^2\right] \tag{1} $$

$$ L_{d_B} = \mathbb{E}_{\tilde{x}\sim P_g}[D_B(\tilde{x})] - \mathbb{E}_{x\sim P_f}[D_B(x)] + \lambda\,\mathbb{E}_{\bar{x}\sim P_{\bar{x}}}\left[\left(\left\|\nabla_{\bar{x}} D_B(\bar{x})\right\|_2 - 1\right)^2\right] \tag{2} $$
Here, the loss function LdA is the objective of DA, and LdB is the objective of DB. λ is a hyperparameter, set to 10 in this study. P¯x is defined by sampling uniformly along straight lines between pairs of points drawn from the data distribution (Pf or Pr) and the corresponding generator distribution Pg.
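As a sanity check of Eq (1), the loss can be evaluated in closed form for a toy linear critic D(x) = w·x, whose input gradient is simply w everywhere (a hypothetical sketch with our own names; the real discriminators are the convolutional networks described below):

```python
import numpy as np

def wgan_gp_critic_loss(w, real, fake, lam=10.0, seed=0):
    """Eq (1) for a toy linear critic D(x) = w . x. For a linear critic
    the gradient penalty is the same at every interpolated point x_bar."""
    w = np.asarray(w, float)
    real = np.asarray(real, float)
    fake = np.asarray(fake, float)
    rng = np.random.default_rng(seed)
    # x_bar: uniform samples on straight lines between real/fake pairs
    # (shown for completeness; unused because grad D(x_bar) = w here).
    eps = rng.uniform(size=(real.shape[0], 1))
    x_bar = eps * real + (1.0 - eps) * fake
    grad_norm = np.linalg.norm(w)
    penalty = lam * (grad_norm - 1.0) ** 2
    return np.mean(fake @ w) - np.mean(real @ w) + penalty
```

With a unit-norm w the penalty vanishes and the loss reduces to the plain Wasserstein critic objective; as the gradient norm drifts from 1, the λ-weighted term dominates, which is exactly the stabilizing effect used in the Dual-GAN training.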
Generators GA and GB use the same loss function:
$$ l_g(f,r) = \lambda_F \left\| f - G_B(G_A(f,z), z') \right\| + \lambda_R \left\| r - G_A(G_B(r,z'), z) \right\| - D_B(G_B(r,z')) - D_A(G_A(f,z)) \tag{3} $$
where f∈F; r∈R; and λF, λR are two constant parameters within [100.0, 1000.0]. The Manhattan distance is used to measure the recovery error, forcing the reconstructed samples to obey the domain distribution.
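The structure of Eq (3) can be checked numerically with stand-in translators (a toy sketch; the lambdas, the deterministic translators, and the sum-critics below are our own illustrative choices, and the noise inputs z, z′ are omitted):

```python
import numpy as np

def generator_loss(f, r, g_a, g_b, d_a, d_b, lam_f=100.0, lam_r=100.0):
    """Eq (3): L1 (Manhattan) reconstruction terms plus adversarial terms."""
    f, r = np.asarray(f, float), np.asarray(r, float)
    recon_f = np.sum(np.abs(f - g_b(g_a(f))))  # ||f - GB(GA(f))||_1
    recon_r = np.sum(np.abs(r - g_a(g_b(r))))  # ||r - GA(GB(r))||_1
    return lam_f * recon_f + lam_r * recon_r - d_b(g_b(r)) - d_a(g_a(f))

# Toy translators: GA doubles, GB halves, so both cycles reconstruct exactly
# and only the adversarial terms remain.
loss = generator_loss(
    f=[1.0, 2.0], r=[4.0],
    g_a=lambda x: 2 * x, g_b=lambda x: x / 2,
    d_a=lambda x: float(np.sum(x)), d_b=lambda x: float(np.sum(x)),
)
```

When the two translators are exact inverses, the reconstruction terms are zero and the generators are driven purely by the critics, which is the intended equilibrium of the dual loop.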
Dual-GAN consists of two generators and two discriminators. The discriminators have the same structure while the generators are different. The architecture of generators GA and GB is shown in Figure 3.
Generator GA adopts the FCN. The input of GA is a feature vector and the output is a diagnosis result vector for CAI. Specifically, GA is built with a convolutional layer with 5×5 kernel-size and 2 stride-size, 4 convolutional layers with 3×3 kernel-size and 2 stride-size, as well as 3 convolutional layers with 3×3 kernel-size and 3 stride-size. Each layer includes the leaky rectified linear unit (LReLU) activation function and batch normalization (BN) except for the first layer.
Generator GB also uses the FCN, but its input is a diagnosis result vector for CAI and its output is a feature vector. When the input is a real result vector, a 1×1 convolutional layer is set to match the dimension between the real result vector and GB. Mirroring GA, generator GB is built with a convolutional layer with 1×1 kernel-size, 3 deconvolutional layers with 3×3 kernel-size and 3 stride-size, 4 deconvolutional layers with 3×3 kernel-size and 2 stride-size, as well as a convolutional layer with 5×5 kernel-size and 2 stride-size. Each layer includes the rectified linear unit (ReLU) activation function and BN except for the last layer, which is configured with a TanHyperbolic (tanh) activation function to output the generated feature.
The generators GA and GB make up a U-shaped generator network with skip connections for learning both high- and low-level features, so a link can be established between the gait characteristics and the CAI detection results through these skip connections.
The architecture of discriminators DA and DB is shown in Figure 4. When the input is a real result vector, a 1×1 convolutional layer is set to match the dimension between the real and generated result vectors. Discriminators DA and DB consist of a convolutional layer with 5×5 kernel-size and 2 stride-size, 4 convolutional layers with 3×3 kernel-size and 2 stride-size, as well as a fully connected layer. The LReLU activation function follows each convolutional layer, and BN is configured at layers 2–5.
We used mini-batch stochastic gradient descent with the Root Mean Square Propagation (RMSProp) solver to optimize the Dual-GAN model. The batch size was set to 8 and the dropout rate to 0.3. The learning rate α and the decay rate β were set to 0.001 and 0.9, respectively. We adopted the Manhattan distance to measure the recovery error between synthesized samples and real data, and the gradient penalty replaced weight clipping in the loss function for high-quality data generation. The Dual-GAN training procedure is shown in Figure 5.
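The RMSProp step used here can be sketched as follows (a generic sketch of the update rule with our own variable names, not the Dual-GAN training code itself):

```python
import numpy as np

def rmsprop_step(theta, grad, cache, alpha=0.001, beta=0.9, eps=1e-8):
    """One RMSProp update: scale the gradient by a running root-mean-square
    of its past magnitudes (alpha = learning rate, beta = decay rate)."""
    cache = beta * cache + (1.0 - beta) * grad ** 2
    theta = theta - alpha * grad / (np.sqrt(cache) + eps)
    return theta, cache

theta, cache = np.array([1.0]), np.array([0.0])
theta, cache = rmsprop_step(theta, np.array([0.5]), cache)
```

Dividing by the running RMS restrains the swing amplitude of the updates, which is the property the text relies on for stable Dual-GAN convergence.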
We randomly selected a group of real characteristics and samples synthesized by the Dual-GAN for visualization (Figure 6). The parameters of the Dual-GAN are initialized at epoch 0. After 50 epochs, the network begins to learn the general trend of the spatiotemporal and kinematic characteristics over the gait cycle. After 100 epochs, the generator can synthesize an unsmooth curve close to the real characteristics. After 150–200 epochs, the synthesized sample resembles the real data but still oscillates. After 250 epochs, the Dual-GAN has converged, and the synthesized samples coincide with the real gait characteristics. In the following sections, we feed the synthesized samples to the detection models and verify them using the t-SNE algorithm.
A network architecture based on LSTM learns the long-term sequence characteristics of the spatiotemporal and kinematic variables well. We adopted three LSTM-based deep learning models, including LSTM, LSTM-FCN, and Convolutional LSTM, to detect CAI (Figure 7) [36,37,38,39]. The synthesized and real data were mixed to train the detection models.
LSTM solves the vanishing gradient problem, especially in long-range dependence tasks, and we used it for CAI detection in this study. As shown in Figure 7, the LSTM cell is composed of an input gate, an output gate, and a forget gate. The cell traces the dependency between features in the input sequence: the input gate selectively retains information from the previous time step, the forget gate discards part of the input information from the previous node, and the output gate controls the current cell state output to the next cell. The training procedure of the LSTM-based detection model is shown in Figure 8. The activation function of the LSTM is the logistic sigmoid function. The input gate i, forget gate f, and output gate O are defined as:

$$ i_t = \sigma(W_i \cdot [h_{t-1}, X_t] + b_i) \tag{4} $$

$$ f_t = \sigma(W_f \cdot [h_{t-1}, X_t] + b_f) \tag{5} $$

$$ C_t = f_t C_{t-1} + i_t \tanh(W_c \cdot [h_{t-1}, X_t] + b_c) \tag{6} $$

$$ O_t = \sigma(W_o \cdot [h_{t-1}, X_t] + b_o) \tag{7} $$

$$ h_t = O_t \tanh(C_t) \tag{8} $$

where Xt is the current input; ht−1 is the output at the previous time step; Ct−1 is the cell state at the previous time step; W is a connection weight; b is a bias term; σ is the sigmoid activation function; and tanh is the hyperbolic tangent function.
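Equations (4)–(8) translate directly into a single NumPy cell step (an illustrative implementation with random toy weights; the actual models use 32-unit layers as described later):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step implementing Eqs (4)-(8).
    W: dict of weight matrices of shape (hidden, hidden+input); b: biases."""
    z = np.concatenate([h_prev, x_t])                   # [h_{t-1}, X_t]
    i = sigmoid(W["i"] @ z + b["i"])                    # input gate, Eq (4)
    f = sigmoid(W["f"] @ z + b["f"])                    # forget gate, Eq (5)
    c = f * c_prev + i * np.tanh(W["c"] @ z + b["c"])   # cell state, Eq (6)
    o = sigmoid(W["o"] @ z + b["o"])                    # output gate, Eq (7)
    h = o * np.tanh(c)                                  # hidden output, Eq (8)
    return h, c

rng = np.random.default_rng(0)
n_h, n_x = 4, 3
W = {k: rng.standard_normal((n_h, n_h + n_x)) for k in "ifco"}
b = {k: np.zeros(n_h) for k in "ifco"}
h, c = lstm_step(rng.standard_normal(n_x), np.zeros(n_h), np.zeros(n_h), W, b)
```

Running this step once per sample point of a normalized gait cycle is what lets the model accumulate long-range dependencies across the cycle.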
As shown in Figure 7, the LSTM-FCN model in this study includes a fully convolutional block and an LSTM block. The fully convolutional block comprises three stacked temporal convolutional blocks and a global average pooling layer. The LSTM block consists of a dimension shuffle layer, two general LSTM sub-modules, and a dropout layer. The dimension shuffle layer reduces the training time, improving the model's efficiency. The training procedure of the LSTM-FCN-based detection model is shown in Figure 9.
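The dimension shuffle can be pictured as a transpose of the input tensor (an illustrative sketch; the batch and sequence sizes are our own, not the study's):

```python
import numpy as np

# A batch of 8 univariate gait-feature sequences of 100 time steps each,
# in the usual (batch, time, channels) layout.
batch = np.zeros((8, 100, 1))

# Dimension shuffle: swap the time and channel axes so the LSTM branch
# receives each sequence as ONE time step with 100 "variables" -- the
# trick that makes the LSTM branch of LSTM-FCN cheap to train.
shuffled = np.transpose(batch, (0, 2, 1))
```

After the shuffle, the LSTM unrolls for a single step instead of 100, which is where the training-time saving mentioned above comes from.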
The Convolutional LSTM (Figure 7) not only possesses the temporal modeling capacity of LSTM but can also describe local spatial features, because it applies a convolution structure in both the input-to-state and state-to-state transitions to extract spatial features. A "peephole connection" in each gate is employed to supervise the cell state. The training procedure of the Convolutional LSTM-based detection model is shown in Figure 10.
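The convolution applied in those transitions is, at its core, a local weighted sum over neighboring samples; a minimal 1-D sketch (our own helper, following the deep learning convention of cross-correlation without kernel flipping):

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Temporal convolution (cross-correlation) of a 1-D sequence with a
    small kernel, 'valid' padding: each output is a local weighted sum."""
    x, kernel = np.asarray(x, float), np.asarray(kernel, float)
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

# A difference kernel responds to local slope in the gait signal.
out = conv1d_valid([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0])
```

Sliding such kernels over the gait sequences is how the Convolutional LSTM extracts local spatial structure before the gating of Eqs (4)–(8) handles the temporal dependencies.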
We designed three data feeding strategies (DFSs) for training the CAI detection models and compared their outcomes. The first data feeding strategy (DFS1) included only the significant spatiotemporal characteristics (basic gait variables, velocity variables, and micro-adjustment variables). The second data feeding strategy (DFS2) included only the kinematic characteristics (ROM variables). The third data feeding strategy (DFS3) included both the spatiotemporal and the kinematic characteristics.
Moreover, we created three sub-DFSs to compare the effects of the synthesized data on the detection models. The first sub-DFS (DFS*-1), the same as the training set for the Dual-GAN, included only real characteristics. The second sub-DFS (DFS*-2) included the real characteristics plus 200 synthesized samples for each group. The third sub-DFS (DFS*-3) included the real characteristics plus 1000 synthesized samples for each group.
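The nine resulting training-set combinations (three DFSs × three sub-DFSs) can be enumerated with a small bookkeeping sketch (the group labels and sample counts follow the text; the code itself and its names are ours):

```python
# Feature groups per data feeding strategy, as described above.
FEATURE_SETS = {
    "DFS1": ["spatiotemporal"],                  # basic, velocity, micro-adjustment
    "DFS2": ["kinematic"],                       # ROM variables
    "DFS3": ["spatiotemporal", "kinematic"],
}
# Synthesized samples added per group in each sub-DFS.
SYNTH_PER_GROUP = {"1": 0, "2": 200, "3": 1000}

def training_sets():
    """Yield (name, feature groups, synthesized samples per group)."""
    for dfs, feats in FEATURE_SETS.items():
        for sub, n_synth in SYNTH_PER_GROUP.items():
            yield f"{dfs}-{sub}", feats, n_synth

combos = list(training_sets())
```

Each of the nine combinations is then used to train every detection model, which is how Tables 1–3 below are organized.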
The t-SNE algorithm is a nonlinear dimensionality-reduction method for visualizing the high-dimensional spatiotemporal and kinematic characteristics [40]. t-SNE projects high-dimensional data into a low-dimensional space [41], where each point on the 2-dimensional plane is attracted to the points it resembles in high-dimensional space and repelled by the points it does not [42]. Figure 11 shows the t-SNE visualization of the 132 real control data, 20 real injury data, 200 randomly selected synthesized control data, and 200 randomly selected synthesized injury data. Here, the perplexity was set to 30 to balance local and global features.
The synthesized data overlapped with the corresponding class of real data, since the Dual-GAN learned the distributions of the control and injury groups separately. The distributions of real and synthesized data for tibiotalar flexion (B), lateral arch angle (E), forefoot/ankle supination (G), MT I-V angle (H), and lateral malleolus scale (I) were more similar than those of the other variables. The synthesized data of the control subjects matched the real data more closely than those of the patients, owing to the controls' regular gait rhythm and pattern. Overall, the synthesized data were suitable for training the CAI detection model.
CAI detection is a binary classification between CAI and non-CAI. Classification metrics such as accuracy, precision, recall (sensitivity), and f1-score (balanced score) were selected to evaluate the performance of the CAI detection models. The relevant components are TP (the number of correctly predicted non-CAI samples), TN (the number of correctly predicted CAI samples), FP (the number of samples falsely predicted as non-CAI), and FN (the number of samples falsely predicted as CAI). The metrics are defined below.

Accuracy is the proportion of correctly predicted samples among all samples and measures the overall performance of a detection model.

$$ ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{9} $$

Precision is the proportion of correctly predicted non-CAI samples among all samples predicted as non-CAI.

$$ Precision = \frac{TP}{TP + FP} \tag{10} $$

Recall is the proportion of correctly predicted non-CAI samples among all real non-CAI samples.

$$ Recall = \frac{TP}{TP + FN} \tag{11} $$

The f1-score is the harmonic mean of precision and recall, balancing the two.

$$ f1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{12} $$
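Equations (9)–(12) amount to the following computation from the confusion-matrix counts (a direct transcription, with non-CAI as the positive class as defined above; the example counts are made up):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and f1-score from confusion-matrix
    counts, with non-CAI treated as the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)            # Eq (9)
    precision = tp / (tp + fp)                            # Eq (10)
    recall = tp / (tp + fn)                               # Eq (11)
    f1 = 2 * precision * recall / (precision + recall)    # Eq (12)
    return accuracy, precision, recall, f1

# Example: 8 of 10 non-CAI and 9 of 10 CAI test cycles classified correctly.
acc, prec, rec, f1 = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```

These four values are exactly what is reported per model and per data feeding strategy in Tables 1–3.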
The LSTM-based detection model in this study consisted of an input layer, two hidden layers, and an output layer, with 32 neurons in each input/hidden layer. The output layer was a Softmax classifier with two nodes outputting the detection result (CAI/non-CAI). We adopted a binary cross-entropy loss function after the output layer to estimate the consistency between the output and the true value, and the RMSProp solver was used to restrain the swing amplitude and speed up convergence during gradient descent. The number of iterations was set to 300 and the batch size to 32 during model training. The accuracy, precision, recall, and f1-score for CAI detection using this LSTM model are presented in Table 1.
| DFS | Measure | DFS1 | DFS2 | DFS3 |
|---|---|---|---|---|
| DFS*-1 | accuracy (%) | 74.16 | 88.76 | 88.76 |
|  | precision | 0.94 | 0.89 | 0.89 |
|  | recall | 0.76 | 1.00 | 1.00 |
|  | f1-score | 0.84 | 0.94 | 0.94 |
| DFS*-2 | accuracy (%) | 86.52 | 94.38 | 94.38 |
|  | precision | 1.00 | 0.94 | 0.94 |
|  | recall | 0.85 | 1.00 | 1.00 |
|  | f1-score | 0.92 | 0.97 | 0.97 |
| DFS*-3 | accuracy (%) | 98.88 | 100 | 95.51 |
|  | precision | 1.00 | 1.00 | 0.95 |
|  | recall | 0.99 | 1.00 | 1.00 |
|  | f1-score | 0.99 | 1.00 | 0.98 |
We then replaced the LSTM module with LSTM-FCN in the CAI detection model to improve classification performance. In this model, the LSTM and FCN sub-modules perceive the same input from different views. The LSTM sub-module included a dimension shuffle layer and two LSTM cell layers (32 neurons each) followed by a dropout layer; the dimension shuffle layer transposed the input so that the LSTM received the multivariate time series as a single time step. The FCN sub-module consisted of three stacked convolutional blocks with filter sizes of 128, 256, and 128, respectively, and a global pooling layer. Each convolutional block comprised a temporal convolutional layer accompanied by BN (momentum 0.99, epsilon 0.001) and a ReLU activation function. The outputs of the dropout and global pooling layers were concatenated and fed into the Softmax classifier. The number of iterations was set to 50 and the batch size to 32 during model training. The accuracy, precision, recall, and f1-score for CAI detection using this LSTM-FCN model are presented in Table 2.
| DFS | Measure | DFS1 | DFS2 | DFS3 |
|---|---|---|---|---|
| DFS*-1 | accuracy (%) | 83.15 | 88.76 | 88.76 |
|  | precision | 0.89 | 0.89 | 0.89 |
|  | recall | 0.92 | 1.00 | 1.00 |
|  | f1-score | 0.91 | 0.94 | 0.94 |
| DFS*-2 | accuracy (%) | 87.64 | 96.63 | 94.38 |
|  | precision | 0.89 | 0.96 | 0.94 |
|  | recall | 0.99 | 1.00 | 1.00 |
|  | f1-score | 0.93 | 0.98 | 0.97 |
| DFS*-3 | accuracy (%) | 94.38 | 94.38 | 95.51 |
|  | precision | 0.97 | 0.94 | 0.95 |
|  | recall | 0.96 | 1.00 | 1.00 |
|  | f1-score | 0.97 | 0.97 | 0.98 |
Considering the temporal-spatial characteristics of the gait variables, we further adopted a Convolutional LSTM-based model to detect CAI. The convolutional layer had a filter size of 64 and a kernel size of 1×3 and was accompanied by a ReLU activation function. It was followed by a dropout layer (dropout rate 0.5) and a flatten layer feeding a one-dimensional array into the Softmax classifier. The number of iterations was set to 25 and the batch size to 64 during model training. The accuracy, precision, recall, and f1-score for CAI detection using this Convolutional LSTM model are presented in Table 3.
| DFS | Measure | DFS1 | DFS2 | DFS3 |
|---|---|---|---|---|
| DFS*-1 | accuracy (%) | 88.76 | 88.76 | 88.76 |
|  | precision | 0.89 | 0.89 | 0.89 |
|  | recall | 1.00 | 1.00 | 1.00 |
|  | f1-score | 0.94 | 0.94 | 0.94 |
| DFS*-2 | accuracy (%) | 87.64 | 96.63 | 98.88 |
|  | precision | 1.00 | 1.00 | 1.00 |
|  | recall | 0.86 | 0.96 | 0.99 |
|  | f1-score | 0.93 | 0.98 | 0.99 |
| DFS*-3 | accuracy (%) | 94.38 | 100 | 100 |
|  | precision | 0.99 | 1.00 | 1.00 |
|  | recall | 0.95 | 1.00 | 1.00 |
|  | f1-score | 0.97 | 1.00 | 1.00 |
Based on the results of spatiotemporal and kinematic characteristic augmentation using the Dual-GAN, we employed three LSTM-based models (the LSTM-based, LSTM-FCN-based, and Convolutional LSTM-based models) to compare their performance and reveal the effect of the synthesized data on CAI detection. The detection outputs of the LSTM, LSTM-FCN, and Convolutional LSTM models under the three DFSs are shown in Tables 1–3, and the Receiver Operating Characteristic (ROC) curves for the CAI/non-CAI classification are shown in Figure 12.
In our previous studies [16,17], we took a step toward optimizing the feeding by pre-selecting crucial features for intelligent detection; both the spatiotemporal and the kinematic characteristics of gait were effective in predicting CAI. However, the detection performance of models trained only on real spatiotemporal characteristics (DFS1) was unsatisfactory (Area Under the Curve (AUC): 0.122–0.718). Training on real kinematic characteristics (DFS2) improved the outcome (AUC: 0.923–0.995), and combining the real kinematic and spatiotemporal characteristics (DFS3) also performed well (AUC: 0.738–0.987). In general, training on real data alone did not yield adequate classification outcomes: the accuracy of DFS2 and DFS3 was 88.76% for the LSTM-, LSTM-FCN-, and Convolutional LSTM-based detection models, and the accuracy, precision, recall, and f1-score were only moderate.
We therefore added 200 and then 1000 synthesized characteristics to the training set. With 200 synthesized samples, the detection outcomes improved; with 1000, the accuracy, precision, recall, and f1-score improved dramatically for the spatiotemporal characteristics, the kinematic characteristics, and their combination. The volume of training samples is crucial to the models' detection performance, and the spatiotemporal and kinematic characteristics augmented by the Dual-GAN improved the diagnostic accuracy of the CAI detection models presented in this study.
When we added 200 synthesized samples to the real data set for training the LSTM-based CAI detection model (Table 1), the accuracy improved (from 74.16% to 86.52% for DFS1; from 88.76% to 94.38% for DFS2 and DFS3). After adding 1000 synthesized samples, the accuracy for all three DFSs exceeded 95% (DFS1: 98.88%, DFS2: 100%, DFS3: 95.51%), and the precision, recall, and f1-score were also good.
For the LSTM-FCN-based CAI detection model (Table 2), the accuracy improved (from 83.15% to 87.64% for DFS1; from 88.76% to 96.63% for DFS2; from 88.76% to 94.38% for DFS3) after adding 200 synthesized samples (DFS*-2). Adding 1000 synthesized samples (DFS*-3) further enhanced the detection accuracy of DFS1 (from 87.64% to 94.38%) and DFS3 (from 94.38% to 95.51%), but had no positive effect on DFS2.
The results in Table 3 reveal that the Convolutional LSTM-based classifier outperformed the LSTM- and LSTM-FCN-based CAI detection models for both spatiotemporal and kinematic characteristics. Adding more synthesized data helped the Convolutional LSTM-based classifier as well: with 1000 synthesized samples in the training set, the accuracy of DFS1, DFS2, and DFS3 reached 94.38%, 100%, and 100%, respectively, and the precision, recall, and f1-score were all superior.

Although the detection results were good, several limitations of the current study should be noted. First, the high-quality gait information from the optical motion capture system helped us identify several significant spatiotemporal and kinematic characteristics of CAI that enhanced detection efficiency, but the data collection is time-consuming. At present, wearable devices, such as smartphones, wearable biofeedback sensors, and sensing fabrics, are widely applied to gait monitoring [43,44]. Unlike laboratory-based motion tracking, wearable devices conveniently record daily gait data. Building on our current results, we will explore effective gait features from data recorded by wearable devices, which may enable more practical applications such as multi-sensor-based human activity recognition, bipedal walking robots, and exoskeletons [45,46,47,48]. Second, this paper focused on the performance of LSTM-based models for ankle instability detection. In future research, we will explore more CAI detection models based on characteristic augmentation and combine them to improve detection effectiveness.
In this paper, we first presented an approach for augmenting spatiotemporal and kinematic characteristics using the Dual-GAN, making the training of a series of modified LSTM detection models more data-efficient. The Dual-GAN enables the synthesized data to approximate the real data distribution, as visualized by the t-SNE algorithm.
We then trained LSTM-, LSTM-FCN-, and Convolutional LSTM-based detection models on the real data collected from the controlled laboratory study and on mixed sets of real and synthesized gait features, respectively, to identify patients with CAI. The detection models were tested on real data to validate the positive role of data augmentation and to demonstrate the capability and effectiveness of the modified LSTM algorithms for CAI detection using spatiotemporal and kinematic characteristics of walking.
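The three training configurations compared in the tables differ only in how many synthesized samples are mixed into the real training set; a hypothetical sketch of their assembly (names and the assumption that DFS*-2 and DFS*-3 add 200 and 1000 synthesized samples follow the text, everything else is illustrative) is:

```python
def build_training_sets(real_samples, synthesized_samples):
    """Assemble the three training configurations as lists of samples.

    DFS*-1: real data only; DFS*-2: real + 200 synthesized;
    DFS*-3: real + 1000 synthesized. Illustrative sketch only.
    """
    return {
        "DFS*-1": list(real_samples),
        "DFS*-2": list(real_samples) + list(synthesized_samples[:200]),
        "DFS*-3": list(real_samples) + list(synthesized_samples[:1000]),
    }
```

Each configuration is then used to train the same model architecture, and all three are evaluated on the same held-out real data, so any performance difference is attributable to the augmentation.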
The experimental results show that the proposed data augmentation method improved detection performance when the real dataset was limited, and that the modified LSTM algorithms yielded better classification outcomes for identifying CAI patients among a group of control subjects from gait analysis data than previously reported. This is because the Dual-GAN compensates for the insufficient training set, and the Convolutional LSTM-based detection model captures both the spatial and temporal features of walking.
The techniques proposed in this study offer a new way to extract significant features from a small clinical gait analysis dataset and thereby improve computer-assisted diagnosis of CAI. This is a concrete step toward the long-term goal of developing artificial intelligence-based instruments for clinical diagnosis and rehabilitation. In subsequent studies, we will extend our research to posture control in more sports situations, such as running, jumping, and cutting, and improve the performance of the deep learning models to enhance medical treatment in sports medicine.
The publication of this paper is funded by the Peking University Third Hospital—Haidian Innovation and Transformation Project under Grant Y74482-09, the National Natural Science Foundation of China under Grant 61801019, the China Scholarship Council under Grant 201906465021, and the Fundamental Research Funds for the University of Science and Technology Beijing under Grant FRF-BD-19-012A.
The authors declare there is no conflict of interest.
Table 1. Detection results of the LSTM-based CAI detection model.

| Training set | Measure | DFS1 | DFS2 | DFS3 |
|---|---|---|---|---|
| DFS*-1 | accuracy (%) | 74.16 | 88.76 | 88.76 |
|  | precision | 0.94 | 0.89 | 0.89 |
|  | recall | 0.76 | 1.00 | 1.00 |
|  | f1-score | 0.84 | 0.94 | 0.94 |
| DFS*-2 | accuracy (%) | 86.52 | 94.38 | 94.38 |
|  | precision | 1.00 | 0.94 | 0.94 |
|  | recall | 0.85 | 1.00 | 1.00 |
|  | f1-score | 0.92 | 0.97 | 0.97 |
| DFS*-3 | accuracy (%) | 98.88 | 100 | 95.51 |
|  | precision | 1.00 | 1.00 | 0.95 |
|  | recall | 0.99 | 1.00 | 1.00 |
|  | f1-score | 0.99 | 1.00 | 0.98 |
Table 2. Detection results of the LSTM-FCN-based CAI detection model.

| Training set | Measure | DFS1 | DFS2 | DFS3 |
|---|---|---|---|---|
| DFS*-1 | accuracy (%) | 83.15 | 88.76 | 88.76 |
|  | precision | 0.89 | 0.89 | 0.89 |
|  | recall | 0.92 | 1.00 | 1.00 |
|  | f1-score | 0.91 | 0.94 | 0.94 |
| DFS*-2 | accuracy (%) | 87.64 | 96.63 | 94.38 |
|  | precision | 0.89 | 0.96 | 0.94 |
|  | recall | 0.99 | 1.00 | 1.00 |
|  | f1-score | 0.93 | 0.98 | 0.97 |
| DFS*-3 | accuracy (%) | 94.38 | 94.38 | 95.51 |
|  | precision | 0.97 | 0.94 | 0.95 |
|  | recall | 0.96 | 1.00 | 1.00 |
|  | f1-score | 0.97 | 0.97 | 0.98 |
Table 3. Detection results of the Convolutional LSTM-based CAI detection model.

| Training set | Measure | DFS1 | DFS2 | DFS3 |
|---|---|---|---|---|
| DFS*-1 | accuracy (%) | 88.76 | 88.76 | 88.76 |
|  | precision | 0.89 | 0.89 | 0.89 |
|  | recall | 1.00 | 1.00 | 1.00 |
|  | f1-score | 0.94 | 0.94 | 0.94 |
| DFS*-2 | accuracy (%) | 87.64 | 96.63 | 98.88 |
|  | precision | 1.00 | 1.00 | 1.00 |
|  | recall | 0.86 | 0.96 | 0.99 |
|  | f1-score | 0.93 | 0.98 | 0.99 |
| DFS*-3 | accuracy (%) | 94.38 | 100 | 100 |
|  | precision | 0.99 | 1.00 | 1.00 |
|  | recall | 0.95 | 1.00 | 1.00 |
|  | f1-score | 0.97 | 1.00 | 1.00 |