
Machinery has become an integral part of human life, especially given the technological advancements associated with Industry 4.0. Failures of such crucial machinery lead to unplanned downtime and, ultimately, to economic losses [1]. The consequences are severe in industry and public transport, where failures halt production and disrupt services for the public. Hence, machine diagnostics is of high importance in such settings. Predicting faults at an early stage extends the machine's lifetime, reduces costs, and prevents downtime.
Air and oil leaks are two of the predominant operational failures in public transport, especially in metro trains, which are the focus of this work. Air leaks tend to occur in the dryer component, whereas oil leaks tend to occur in the compressor component [2]. Various sensors, such as pressure transducers, pneumatic sensors, and motor current sensors, are used to analyze and diagnose faults in metro trains [3,4,5]. A fault in a component shows up as an abnormal change in the readings of these sensors. Continuous monitoring of the vulnerable components with these sensors can therefore help identify the occurrence of a fault [6].
Predictive maintenance is an emerging technology in machine diagnostics that aims to predict faults early and schedule maintenance before catastrophic events occur [7]. Detecting anomalies in sensor data at an early stage also reduces maintenance expenses and avoids downtime. Data curation, data pre-processing, diagnosis, and decision-making are the critical aspects of predictive maintenance [8]. This data-driven approach has proven effective due to the vast availability of data and intelligent algorithms for automated analysis [9].
Artificial intelligence plays a promising role in fault prediction and predictive maintenance. Several machine learning and deep learning algorithms are trained on continuous data from sensors attached to the target machinery [10]. The proposed work uses machine learning and deep learning algorithms for anomaly detection (air and oil failures) on the air production unit of metro trains, together with the development of a real-time dashboard, which, to our knowledge, is the first work of its kind. The following section explains the existing works on the use of machine learning and deep learning algorithms for predictive maintenance.
The following are the contributions of the proposed work:
1. To develop a deep learning algorithm for the simultaneous identification of the type and location of the fault, along with GPS quality monitoring from sensor data in metro trains.
2. To integrate an explainable AI technique into the model's prediction and highlight the key sensors contributing to the fault.
3. To develop a dashboard integrating the sensor data analytics as visual graphs, the deep learning model, and the explainable AI results for analysis by engineers.
This work will be of great aid to maintenance engineers for fault analysis in sensor values and assessment of GPS signals in real time. This predictive maintenance application can aid in reducing the downtime and service costs of machine parts, if found damaged.
The following is the outline of the research paper. Section 2 briefs about the existing work relevant to the field of interest. Section 3 explains the material description, the pre-processing techniques used, the different machine learning algorithms, the training parameters, and the methodologies for dashboard development. Section 4 represents the proposed methods' results, graphs, and supporting diagrams. Section 5 compares the proposed work with existing works and draws significant conclusions and future scope.
The detection of air leakage from the pneumatic door of a train is an attempt to reduce train downtime. Deep learning algorithms were applied for automatic feature extraction from extensive data obtained from continuous monitoring by sensors for the task of fault detection [11]. The OSR (open set recognition) concept was used for multi-task classification to predict the known class and detect unknown samples. A lightweight convolutional neural network (CNN) model streamlined with the OSR technique was trained to predict the air leakage. An 8-layer neural network consisting of 6 convolutions and two dense layers was used for air leakage. This model was trained using an SGD (stochastic gradient descent) optimizer with a learning rate of 0.001 for a batch size of 64.
Severe air leakage in the braking pipe results in braking issues and decreases train reliability [12]. Because air leaks are difficult to detect visually, the paper proposes a framework for the simultaneous prediction of the type and severity of air leakage using anomaly detection methods based on the on-and-off logs of the compressor. Around 632,683 data points were collected from May 2016 to October 2016 from 178 VIRM trains, of which 6957 are labeled as "Air Leakage" and 625,726 are labeled as "Normal". The authors used a logistic classifier model for two different classes of compressor behavior for each separate train. One model defines the boundary separating the two classes under everyday situations, and the other models the distributions of the compressor idle time and run time separately using logistic functions. It also detects cases in which compressor idle time is erroneously classified as compressor run time, aiding in anomaly detection. A density-based unsupervised clustering approach is adopted for anomaly detection four weeks in advance and can pre-filter anomalies to prevent false alarms.
The challenges encountered by traditional manufacturing companies during their transition to intelligent factories, notably the scarcity of historical data for training machine learning models, were addressed by Mohan Rajashekarappa et al. [13]. A novel approach of artificially inducing anomalies for data labeling was introduced, and it underscored the importance of proactive readiness for potential future disruptions in newly installed systems. Through two experiments focused on air leakage detection, the proposed methodology demonstrates exceptional performance with RUS-Boosted bagged trees, yielding 98.73% accuracy, 99.40% precision, recall of 99.21%, and an F1 score of 99.30% on the test data.
The critical issue of energy efficiency and fault detection in air conditioning systems emphasizes their intricate nature and substantial energy consumption impact [14]. The study comprises two essential components: First, it investigates the ramifications of various faults within the air conditioning system on its coefficient of performance (COP), shedding light on the potential energy wastage associated with these faults. The research convincingly demonstrates that different faults lead to varying degradation levels in the COP. Second, the paper evaluates the effectiveness of three supervised learning classifier models in classifying these faults: deep learning, support vector machine (SVM), and multi-layer perceptron (MLP). The research assesses the performance of these classifiers across six distinct fault classes, revealing that different faults indeed exert varying negative impacts on the COP.
Najjar et al. [15] predicted air failures of the air production unit (APU) in metro trains. The dataset used for this task was MetroPT, a 6-month analysis of metro trains in Portugal comprising analog, digital, and GPS sensors. The GPS information was excluded from the dataset, and the timestamp was encoded using the label encoding technique. A random forest classifier algorithm was used for the multi-class classification of air failure prediction. The data was undersampled and then split into training and testing sets. A feature importance visualization technique was employed to identify the root cause of the air failure. The random forest classifier produced 85% and 97% accuracies for the binary and multi-class classification tasks, respectively.
A deep learning neural network for anomaly detection in metro trains was developed by Davari et al. [16]. The algorithms used for this task were the sparse autoencoder and variational autoencoder. This work is an unsupervised learning approach for anomaly detection of air failures in trains. The MetroPT dataset was used for this work with sensors placed in the air production unit. The two versions of the autoencoder were used for sensor data reconstruction, and a low-pass filter was used to perform anomaly detection and detect faults. The autoencoder algorithms using the digital data produced precision, recall, and F1 scores of 44%, 13%, and 32% better than that of the algorithms trained on the analog data.
An expert system for the multi-objective optimization of equipment in highway construction was developed by Ali et al. [17]. Particle swarm optimization was used to simultaneously optimize the time, cost, and quality of the construction equipment. This method reduced the time and cost by 35.4% and 39.1%, respectively. Predictive maintenance was applied to concrete manufacturing by Alshboul et al. [18]. Seven different classification algorithms were used, out of which the CatBoost classifier produced an F1-score of 0.985, an accuracy of 0.984, a recall of 0.983, and a ROC curve area of 0.984. A comparative analysis of machine learning algorithms for concrete strength estimation was performed by Alshboul et al. [19]. Three machine learning algorithms, namely, XGBoost, LightGBM, and genetic programming, were used, out of which the LightGBM and XGBoost algorithms surpassed the other studied algorithms with a coefficient of determination of 95.74% and 93.27%, respectively.
The proposed work aims to develop a decision process for failure prediction and identification of the failure type and location using machine learning and multi-task models using deep learning algorithms.
The proposed work adopts the following workflow consisting of different blocks: data acquisition, data pre-processing, feature pre-processing, visualization, model development, validation, and deployment. Figure 1 represents the proposed workflow visually.
The dataset used for this work is named MetroPT [17,18] and comprises sensor information from urban metro trains in Portugal collected during the year 2022. The dataset includes different analogue, digital, and GPS sensors that continuously captured data from the metro trains for six months. The MetroPT dataset has been curated to develop AI algorithms for automated fault prediction and predictive maintenance of metro trains based on sensor data. Table 1 lists the different sensors, their description, and the unit of measurement used for acquiring the MetroPT dataset.
Name | Description | Type of sensor | Unit |
TP2 | Compressor pressure | Analog | Bar |
TP3 | Pneumatic panel pressure | Analog | Bar |
H1 | Pressure of the valve that is activated when the pressure exceeds 10.2 bar | Analog | Bar |
DV_Pressure | Pressure drop due to water discharge by air dryers | Analog | Bar |
Reservoirs | Air tank pressure | Analog | Bar |
Oil temperature | Temperature of oil in compressor | Analog | Celsius |
Flowmeter | Airflow | Analog | m3/h |
Motor current | Current flowing in the motor | Analog | Ampere |
Comp | Electric signal of the compressor based on the air intake | Digital | - |
DV Electric | Electric signal of compressor outlet | Digital | - |
Towers | Specifies the two towers based on the action of air drying | Digital | - |
MPG | Trigger to start the compressor when the pressure is less than 8.2 bar | Digital | - |
LPS | Trigger when the pressure is less than 7 bar | Digital | - |
The pressure switch | Trigger when pressure is detected in the pilot control valve | Digital | - |
Oil level | Trigger when the oil level is less than the threshold | Digital | - |
Caudal Impulses | Trigger for the air flowmeter | Digital | - |
GPS Longitude | Longitude position | Analog | ° |
GPS Latitude | Latitude position | Analog | ° |
GPS Speed | Speed | Analog | Km/h |
The dataset comprises around 15 million sensor data records from Jan 1, 2022 to Jun 30, 2022. The dataset obtained from the source is not directly labeled. The dataset owners have provided information about the failure, like the start time, end time, type, and location of the failure. Based on the start and end times, the appropriate timestamps were found and the values between those timestamps were coded according to the fault type. Table 2 shows the statistical description of the dataset for the parameters, namely, TP2, TP3, H1, DV_pressure (DVP), Reservoirs, Oil_temperature (OT), Flowmeter, Motor_current (MoC), COMP, DV Electric (DVE), Towers, MPG, LPS, Pressure_switch (PrS), Oil_level (OL), Caudal_impulses (CaI), GPS Longitude (GPSLong), GPS Latitude (GPSLat), GPS Speed, GPS Quality, month, day, hour, minute, second. The statistical values of mean, standard deviation (std), minimum value, 25%, 50%, 75%, and maximum values for a dataset are evaluated here. Table 3 represents the label code for the different types of faults occurring in the metro train.
Statistic | TP2 | TP3 | H1 | DVP | Reservoirs | OT | Flowmeter | MoC |
mean | 0.947 | 8.989 | 8.038 | -0.019 | 1.63 | 65.843 | 20.128 | 2.040 |
std | 2.836 | 0.667 | 2.846 | 0.185 | 0.064 | 5.931 | 3.578 | 2.198 |
min | -0.03 | 0.937 | -0.033 | -0.036 | 1.349 | 18.575 | 18.8347 | -0.009 |
25% | -0.009 | 8.492 | 8.332 | -0.0279 | 1.608 | 61.825 | 18.9748 | 0.0024 |
50% | -0.007 | 8.996 | 8.876 | -0.025 | 1.635 | 66.475 | 19.03 | 0.007 |
75% | -0.006 | 9.506 | 9.438 | -0.025 | 1.667 | 70.575 | 19.040 | 3.837 |
max | 10.806 | 10.38 | 10.368 | 8.11 | 1.791 | 80.174 | 37.008 | 9.3375 |
Statistic | COMP | DVE | Towers | MPG | LPS | PrS | OL | CaI |
mean | 0.892 | 0.107 | 0.946 | 0.892 | 0.004 | 0.0 | 0.0 | 0.002 |
std | 0.309 | 0.309 | 0.225 | 0.309 | 0.068 | 0.0 | 0.0 | 0.047 |
min | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
25% | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
50% | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
75% | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
max | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 |
Statistic | GPSLong | GPSLat | GPSSpeed | GPSQuality | ||||
mean | -7.880 | 37.578 | 8.592 | 0.912 | ||||
std | 2.443 | 11.649 | 14.096 | 0.282 | ||||
min | -8.69 | 0.0 | 0.0 | 0.0 | ||||
25% | -8.66106 | 41.1696 | 0.0 | 1.0 | ||||
50% | -8.658 | 41.1858 | 0.0 | 1.0 | ||||
75% | -8.583 | 41.212 | 16.0 | 1.0 | ||||
max | 0.0 | 41.240 | 286.0 | 1.0 | ||||
Statistic | month | day | hour | minute | second | |||
mean | 1.0 | 16.046 | 13.139 | 29.507 | 29.499 | |||
std | 0.0 | 8.919 | 6.444 | 17.318 | 17.318 | |||
min | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | |||
25% | 1.0 | 8.0 | 9.0 | 15.0 | 14.0 | |||
50% | 1.0 | 16.0 | 14.0 | 30.0 | 29.0 | |||
75% | 1.0 | 24.0 | 19.0 | 45.0 | 44.0 | |||
max | 1.0 | 31.0 | 23.0 | 59.0 | 59.0 |
Label Code | Corresponding fault |
0 | Air leak in air dryer |
1 | Air leak in client chamber |
2 | Oil leak in compressor |
3 | No fault |
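The labeling step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the failure windows, the 10-second sampling interval, and the column names `start`, `end`, `timestamp`, and `label` are hypothetical stand-ins, not values from the MetroPT failure reports. Only the label codes follow Table 3.

```python
import pandas as pd

# Hypothetical failure report (start, end, fault code); codes follow Table 3.
failures = pd.DataFrame({
    "start": pd.to_datetime(["2022-01-01 00:00:20", "2022-01-01 00:00:50"]),
    "end": pd.to_datetime(["2022-01-01 00:00:30", "2022-01-01 00:01:00"]),
    "label": [0, 2],  # 0 = air leak in air dryer, 2 = oil leak in compressor
})

# Toy sensor log: one reading every 10 seconds (assumed sampling interval).
readings = pd.DataFrame({
    "timestamp": pd.date_range(
        "2022-01-01 00:00:00", periods=8, freq=pd.Timedelta(seconds=10)
    ),
})

NO_FAULT = 3  # label code for "No fault" in Table 3

# Start with every reading marked as normal, then overwrite the readings
# that fall inside a reported failure window (both ends inclusive).
readings["label"] = NO_FAULT
for failure in failures.itertuples(index=False):
    in_window = readings["timestamp"].between(failure.start, failure.end)
    readings.loc[in_window, "label"] = failure.label

print(readings)
```

Looping over the failure windows keeps the sketch readable; with only a handful of reported failures, the cost is dominated by the vectorized `between` comparison on the sensor log.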
Timestamp data cannot be processed by machine learning and deep learning algorithms. Hence, it needs to be processed. This issue was tackled by extracting the month, week, day, hour, minute, and second information from the timestamp using the Pandas functionalities. Finally, the timestamp feature is removed from the dataset. Two new columns, one for the type of failure and another for the location of the failure, were created from the labels. These columns are created as these are the labels for the multi-task model. Table 4 represents the different failure types and location codes.
Label Code | Type of failure | Location of failure |
0 | Air leak | Air dryer |
1 | Oil leak | Client chamber |
2 | No failure | Compressor |
3 | Not reported | No location |
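A short sketch of the timestamp decomposition and the derivation of the two task labels is given below. The toy rows and the column names are assumptions for illustration. The two mapping dictionaries are one reading of Tables 3 and 5 (for example, label 1, "air leak in client chamber", maps to type 0 and location 1) and should be checked against the actual labeling used.

```python
import pandas as pd

# Toy rows; "timestamp", "TP2", and "label" are illustrative column names.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2022-01-01 06:00:00", "2022-03-15 13:45:30"]),
    "TP2": [0.01, 9.5],
    "label": [1, 2],  # fault codes as in Table 3
})

# Split the timestamp into numeric calendar features, then drop it.
ts = df["timestamp"]
df["month"] = ts.dt.month
df["week"] = ts.dt.isocalendar().week.astype(int)
df["day"] = ts.dt.day
df["hour"] = ts.dt.hour
df["minute"] = ts.dt.minute
df["second"] = ts.dt.second
df = df.drop(columns="timestamp")

# Assumed mappings from the fault code to the two task labels.
TYPE_OF_LABEL = {0: 0, 1: 0, 2: 1, 3: 2}      # 0 air, 1 oil, 2 normal
LOCATION_OF_LABEL = {0: 0, 1: 1, 2: 2, 3: 3}  # dryer, client, compressor, none

df["type"] = df["label"].map(TYPE_OF_LABEL)
df["location"] = df["label"].map(LOCATION_OF_LABEL)

print(df)
```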
The final step in data pre-processing was undersampling, specifically for the stage-1 dataset. The stage-1 dataset comprises 212,104 samples under the fault condition, about 1.63% of the entire data. Around 98.4% of the data belongs to the regular class, showing that the dataset is highly imbalanced. Hence, the regular class was undersampled. Around 2,800,000 randomly selected samples were taken from the entire dataset and used as the data for the first stage. The final dataset comprises about 25 input features and three target vectors. Table 5 describes all the targets in the final dataset.
Name of target vector | Description | Classes |
Type | Code for the type of failure | 0- Air failure |
1- Oil failure | ||
2- Normal | ||
Location | Code for the location of the failure | 0- Air dryer |
1- Client | ||
2- Compressor | ||
3- No location | ||
GPS quality | Quality of the GPS sensor | 0- Good |
1- Bad |
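Random undersampling of the regular class can be sketched as below. The synthetic data, the 2% fault rate, and the 5:1 ratio of kept normal samples to faulty samples are hypothetical choices for illustration and do not reproduce the sample counts reported above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_rows = 10_000

# Synthetic, highly imbalanced data: roughly 2% faulty rows (label 0),
# the rest normal (label 3). Real MetroPT data would be loaded instead.
df = pd.DataFrame({
    "TP2": rng.normal(size=n_rows),
    "label": np.where(rng.random(n_rows) < 0.02, 0, 3),
})

NO_FAULT = 3
faulty = df[df["label"] != NO_FAULT]
normal = df[df["label"] == NO_FAULT]

# Keep every faulty sample and a random subset of the normal samples.
# The 5:1 ratio is an assumption, not the ratio used in the paper.
n_keep = min(len(normal), 5 * len(faulty))
normal_sampled = normal.sample(n=n_keep, random_state=0)

# Concatenate and shuffle so faulty rows are not grouped together.
balanced = (
    pd.concat([faulty, normal_sampled])
    .sample(frac=1.0, random_state=0)
    .reset_index(drop=True)
)

print(balanced["label"].value_counts())
```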
The dataset has no null or duplicate values since the dataset is obtained from sensors that continuously monitor the trains. The feature scaling technique, normalization, was adopted to bring all the features to the same scale (0-1). Appropriate data visualization techniques were used for the dataset's univariate, bivariate, and multivariate analysis. The dataset comprises continuous (analogue sensors) and categorical features (digital sensors); we have split them for data visualization. A histogram with a kernel density estimator function is used to visualize the continuous features in the dataset. A pie chart is used to visualize the categorical features. Finally, the information was visualized using a map that represents the train's route along with the train's speed. Figures 2, 4, 3, and 5 represent the visualization plots obtained from the continuous, categorical, GPS, and entire dataset, respectively. After visualization, a stratified split of ratio 80:20 was made for the training and testing sets, respectively.
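The scaling and splitting steps can be illustrated with the sketch below, which assumes min-max normalization for the (0-1) scaling and uses a per-class random draw to obtain a stratified 80:20 split. The synthetic feature ranges and class proportions are placeholders. The sketch scales before splitting, as the text describes; fitting the minimum and maximum on the training portion only is a common alternative that avoids information leaking from the test set.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_rows = 1000

# Synthetic stand-in for two analogue features and the fault-type target.
df = pd.DataFrame({
    "TP3": rng.uniform(0.9, 10.4, size=n_rows),
    "Oil_temperature": rng.uniform(18.0, 80.0, size=n_rows),
    "type": rng.choice([0, 1, 2], size=n_rows, p=[0.1, 0.1, 0.8]),
})

# Min-max normalization: rescale each feature to the range [0, 1].
features = ["TP3", "Oil_temperature"]
mins, maxs = df[features].min(), df[features].max()
df[features] = (df[features] - mins) / (maxs - mins)

# Stratified 80:20 split: draw 20% of the rows of each class for testing,
# so both sets keep approximately the same class proportions.
test = df.groupby("type").sample(frac=0.2, random_state=42)
train = df.drop(test.index)

print(train["type"].value_counts(normalize=True))
print(test["type"].value_counts(normalize=True))
```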
Multi-task learning is the ability of a neural network to simultaneously obtain multiple outputs from a single input. In this work, based on the sensor information, the multi-task neural network is designed to simultaneously predict the type and location of the failure in the metro train. The multi-task neural network has common pre-processing layers, followed by branches, each corresponding to a particular task. Hence, the multi-task neural network uses parallel processing to simultaneously identify the type and location of the fault and assess the GPS quality.
The multi-task neural network comprises shared layers and task-specific layers. The input layer, a single hidden layer with four neurons, and regularization layers such as dropout and batch normalization are shared by all tasks, whereas each task has its own output layer. The output layer comprises three neurons for the task of failure identification, one neuron for the task of location identification, and one neuron for the task of GPS quality identification. Figure 6 represents the architecture diagram of the proposed multi-task model.
The multi-task neural network described above was trained using the Adam optimizer and a combination of categorical and binary cross-entropy loss functions. The batch size was set to 5000, and the model was trained for 20 epochs. A hybrid loss function was used because two tasks (fault type and location identification) are multi-class, whereas GPS quality identification is binary. Equation 3.1 represents the hybrid loss function used to train the multi-task model.
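The shared-trunk, multiple-head structure can be sketched as a plain NumPy forward pass. This is not the trained model: the weights are random, the ReLU activation is an assumption, and dropout and batch normalization are omitted because they mainly matter during training. The location head is given four units here to match the four location classes in Table 5, which is an assumption, since the text above states one neuron for that task.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, N_HIDDEN = 25, 4  # 25 inputs, one shared hidden layer of 4 units


def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_layer(n_in, n_out):
    # Small random weights and zero biases (illustrative initialization).
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)


params = {
    "shared": init_layer(N_FEATURES, N_HIDDEN),
    "type": init_layer(N_HIDDEN, 3),      # air, oil, normal
    "location": init_layer(N_HIDDEN, 4),  # assumed: one unit per location class
    "gps": init_layer(N_HIDDEN, 1),       # binary GPS quality
}


def forward(x, params):
    # Shared representation used by all three task heads.
    w, b = params["shared"]
    hidden = np.maximum(0.0, x @ w + b)  # ReLU is an assumption

    w_t, b_t = params["type"]
    w_l, b_l = params["location"]
    w_g, b_g = params["gps"]
    return {
        "type": softmax(hidden @ w_t + b_t),
        "location": softmax(hidden @ w_l + b_l),
        "gps_quality": sigmoid(hidden @ w_g + b_g),
    }


batch = rng.random((5, N_FEATURES))  # five normalized sensor records
outputs = forward(batch, params)
for task, probs in outputs.items():
    print(task, probs.shape)
```

Because the hidden layer is computed once and reused by every head, a single pass over a sensor record yields all three predictions.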
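$LF = \sum_{i=1}^{2}\left[-\frac{1}{N}\sum_{j=1}^{N}\sum_{k=1}^{M} y_{kj}\log\big(p(y_{kj})\big)\right]_{i} - \frac{1}{N}\sum_{l=1}^{N}\sum_{p=1}^{2} y_{lp}\log P_{lp}$ | (3.1) |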
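Equation 3.1 sums a categorical cross-entropy term for each of the two multi-class tasks and a binary cross-entropy term for GPS quality. The NumPy sketch below illustrates that sum under the assumption that the three terms are weighted equally. The function names and the small `eps` used to avoid `log(0)` are illustrative choices.

```python
import numpy as np


def categorical_ce(y_onehot, probs, eps=1e-12):
    # Mean over samples of -sum_k y_k * log(p_k).
    return float(-np.mean(np.sum(y_onehot * np.log(probs + eps), axis=1)))


def binary_ce(y, p, eps=1e-12):
    # Two-class cross-entropy written with a single probability per sample.
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def hybrid_loss(y_type, p_type, y_loc, p_loc, y_gps, p_gps):
    # Equal weighting of the three task losses is an assumption.
    return (
        categorical_ce(y_type, p_type)
        + categorical_ce(y_loc, p_loc)
        + binary_ce(y_gps, p_gps)
    )


# Two toy samples with uniform (uninformative) predictions for every task.
y_type = np.eye(3)[[0, 2]]
y_loc = np.eye(4)[[1, 3]]
y_gps = np.array([0, 1])
uniform_loss = hybrid_loss(
    y_type, np.full((2, 3), 1 / 3),
    y_loc, np.full((2, 4), 1 / 4),
    y_gps, np.array([0.5, 0.5]),
)
print(uniform_loss)  # equals log(3) + log(4) + log(2) for uniform predictions
```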
Evaluation is an essential component of the proposed workflow. Model evaluation is done to identify the model's performance on the testing set. Evaluation of the testing set is essential to identify if the trained model has overfit or underfit. The following performance metrics are derived from the confusion matrix.
The confusion matrix comprises four values: true positive, true negative, false positive, and false negative. The diagonal elements of the matrix indicate the correctly classified samples (true positive and true negative), and the non-diagonal elements indicate the misclassified samples (false positive and false negative). The following are the different metrics used to evaluate the trained models.
Accuracy is defined as the ratio of the correct classifications to that of the total classifications. Accuracy is considered the gold standard metric for the evaluation of classification algorithms. The formula for accuracy is mentioned in Equation 3.2.
$\mathrm{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$ | (3.2) |
Precision is defined as the ratio of true positives to all samples predicted as positive. Precision is one of the metrics used to analyze a model's performance under class imbalance. The formula for precision is given in Equation 3.3.
$\mathrm{Precision} = \dfrac{TP}{TP + FP}$ | (3.3) |
Recall is defined as the ratio of true positives to all actual positive samples. Recall is another metric used to analyze multi-class classification under class imbalance. Equation 3.4 represents the mathematical formula for recall.
$\mathrm{Recall} = \dfrac{TP}{TP + FN}$ | (3.4) |
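As a worked example of Equations 3.2 to 3.4, the sketch below computes the three metrics from the four confusion-matrix counts. The counts are made up for illustration and are not results from this work.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts: 90 faults caught, 20 missed, 10 false alarms.
metrics = classification_metrics(tp=90, tn=880, fp=10, fn=20)
print(metrics)
```

With these counts the accuracy is 970/1000, the precision is 90/100, and the recall is 90/110, which shows how a model can look strong on accuracy while still missing a noticeable share of the faults.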
AUC-ROC stands for the area under the receiver operating characteristic curve. The ROC curve plots the true positive rate against the false positive rate at different classification thresholds. The area under that curve is termed the AUC score. An AUC value below 0.5 is considered a poor score, a score of 0.5 corresponds to random guessing, and a score above 0.8 is considered good.
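The AUC computation can be sketched in NumPy by sweeping the decision threshold from the highest score downward and integrating the resulting curve with the trapezoidal rule. The function name is illustrative, and the sketch assumes binary labels, at least one sample of each class, and no tied scores.

```python
import numpy as np


def roc_auc(y_true, scores):
    """Area under the ROC curve (assumes binary labels and no tied scores)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)

    # Sort by decreasing score: lowering the threshold admits one more
    # sample at a time as a predicted positive.
    order = np.argsort(-scores, kind="stable")
    y_sorted = y_true[order]

    true_pos = np.cumsum(y_sorted == 1)
    false_pos = np.cumsum(y_sorted == 0)

    # Rates at each threshold, with the (0, 0) starting point prepended.
    tpr = np.concatenate([[0.0], true_pos / true_pos[-1]])
    fpr = np.concatenate([[0.0], false_pos / false_pos[-1]])

    # Trapezoidal rule over the (fpr, tpr) points.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


auc_mixed = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
auc_perfect = roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
print(auc_mixed, auc_perfect)
```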
Demand for model interpretability has risen in recent years. Many machine learning and deep learning algorithms are considered black boxes that provide an output for the input data without any logical interpretation. Interpretability is particularly needed in areas like healthcare, where life-critical decisions are made. In such settings, explaining the model's predictions can greatly aid clinicians.
In this work, the local interpretable model agnostic explanations (LIME) [19] is used to derive the interpretations of the complex ensemble classifier. As the name suggests, LIME works locally, meaning it works on individual data samples. Also, this method is model agnostic, meaning it works for all models. LIME works by mapping a simple interpretable model (like linear regression) on a complex model [20]. The local region of the data space is considered, where synthetic samples are generated based on original samples. These synthetic samples are labeled based on the prediction of the complex model. Then, a simple interpretable linear regression is trained on the synthetic labeled data. The coefficients of the trained linear regression model represent the interpretations of the complex model on the local space. The LIME tabular function from the LIME library is used to get the interpretations for individual data samples.
A LIME tabular explainer was used for each task: failure type, failure location, and GPS quality. For a predicted instance, the explainer generates a figure showing the features that contributed to the predicted class (positive) and the features that contributed to the counter-class (negative). Hence, the plot reveals which features contribute positively and negatively to a particular class.
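The local-surrogate idea behind LIME can be illustrated with a simplified NumPy sketch. It is not the LIME library and omits steps such as feature discretization and feature selection. The `black_box` function is a hypothetical stand-in for a trained model, and the noise scale, kernel width, and sample count are arbitrary illustrative values.

```python
import numpy as np


def black_box(X):
    # Hypothetical model: feature 0 raises the score, feature 1 lowers it,
    # and feature 2 has no effect.
    return 1.0 / (1.0 + np.exp(-(3.0 * X[:, 0] - 2.0 * X[:, 1])))


def lime_like_explanation(predict_fn, x, n_samples=2000, scale=0.3,
                          kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)

    # 1. Generate synthetic samples in the neighbourhood of the instance.
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))

    # 2. Label them with the complex model's predictions.
    y = predict_fn(Z)

    # 3. Weight samples by proximity to the instance (exponential kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)

    # 4. Fit a weighted linear regression (intercept plus one slope per feature).
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sqrt_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sqrt_w[:, None], y * sqrt_w, rcond=None)

    # The slopes are the local feature contributions.
    return coef[1:]


x0 = np.array([0.2, 0.1, 0.5])
contributions = lime_like_explanation(black_box, x0)
print(contributions)
```

The sign of each slope indicates whether the feature pushes the prediction toward or away from the class near this instance, which corresponds to the positive and negative bars in the explanation plots.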
The ultimate aim of the work is to develop a dashboard for real-time data analytics and predictions. A website was developed using a Python-based web development tool. The website requests the data recorded from the sensors as an Excel sheet or a CSV (comma-separated values) file. The data visualization and prediction tasks are carried out simultaneously upon receiving the data. For the data visualization task, a stacked area chart of the continuous features, a stacked bar chart of the categorical features, and a map depicting the speed and route of the train are produced. For the data prediction task, the latest values are processed and sent to the trained multi-task model for predictions of the failure type, failure location, and quality of the GPS, which are displayed on the website. Finally, the LIME explanations for the model on the provided sample are given for all three tasks: failure type identification, failure location identification, and GPS quality assessment.
The developed multi-task model was trained on the training set with the above-mentioned epochs and batch size. Table 6 reports the performance metrics of the model on the training and testing datasets for fault type identification, fault location identification, and GPS quality identification, respectively.
Set | Time consumption per epoch (ms) | Metric | Type identification | Location identification | GPS quality identification |
Training | 43 | Loss | 0.0031 | 0.0020 | 0.0029 |
Accuracy | 99.94 | 99.998 | 99.99 | ||
Precision | 99.99 | 100 | 100 | ||
Recall | 100 | 100 | 100 | ||
AUC | 1 | 1 | 1 | ||
Testing | 1 | Loss | 0.0035 | 0.0026 | 0.0033 |
Accuracy | 98.89 | 99.12 | 99.24 | ||
Precision | 99.56 | 99.67 | 99.84 | ||
Recall | 99.92 | 99.93 | 99.93 | ||
AUC | 1 | 1 | 1 |
The trained multi-task ANN model produced 98.89%, 99.12% and 99.24% accuracy for failure type identification, failure location identification and GPS quality assessment. Also, the trained model's precision, recall and AUC values are high, indicating that the model overcomes class imbalance issues.
The performance plots showing the values of the performance metrics at each epoch of the training phase are given in Figure 7. The loss gradually decreases over the epochs for the fault type, location, and GPS quality tasks, as shown in Figure 7(a). In contrast, the recall, AUC, and precision increase over the epochs, as shown in Figure 7(b), Figure 7(c), and Figure 7(d), respectively. The curves show no fluctuations in the training phase and no signs of unstable gradients.
The performance plots in Figure 8 show the values of the performance metrics at each epoch of the testing phase. The loss gradually decreases over the epochs for each task, as shown in Figure 8(a). In contrast, the recall, AUC, and precision increase over the epochs for the fault type, location, and GPS quality tasks, as shown in Figure 8(b), Figure 8(c), and Figure 8(d). The precision, recall, and AUC values for GPS quality reached their maximum in the initial epochs and remained the same for the rest. Fluctuations in the precision values are observed for the tasks of fault type and fault location identification.
The confusion matrices on the testing set show that the trained model produced many true negatives and true positives and very few false positives and false negatives. The confusion matrix for the fault type is shown in Figure 9(a), the confusion matrix for the fault location is shown in Figure 9(b), and the confusion matrix for GPS quality is shown in Figure 9(c).
The classification report for fault type classification indicates high precision, recall, and F1 scores for all three classes, showing no sign of class imbalance. The classification report for the trained multi-task model for fault type identification on the testing set is shown in Table 7. The classification report for the trained multi-task model for fault location identification on the testing set is shown in Table 8. The classification report for the trained multi-task model for GPS quality assessment on the testing set is shown in Table 9.
Classes | Precision | Recall | F1-Score |
0 | 0.98 | 1.00 | 0.99 |
1 | 1.00 | 1.00 | 1.00 |
2 | 1.00 | 1.00 | 1.00 |
Classes | Precision | Recall | F1-Score |
0 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
2 | 1 | 1 | 1 |
3 | 1 | 1 | 1 |
Classes | Precision | Recall | F1-Score |
0 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
A website was developed using Streamlit and hosted online using Streamlit share; see [21] for the website URL. Figures 10 and 11 show screenshots of the developed website covering data visualization and predictive modeling. Figures 12, 13, and 14 represent the LIME explanations for the tasks of failure type identification, failure location identification, and GPS quality assessment, respectively. Figure 10 represents the area chart for the continuous features and the bar chart for the categorical features. From these charts, anomalies in the sensor values can be visually identified. Figure 11 represents the map plot for the GPS data; the dots trace the train's route based on the latitude and longitude values, whereas the intensity of each dot depicts the speed. This graph identifies the train's route and the locations at which the train travelled fast or slow. The multi-task neural network predictions are also displayed on the website for each dataset instance.
Table 10 compares the performance of the proposed multi-task neural network to that of the existing works related to the application of predictive maintenance in train fault analysis using sensor data.
Author | Algorithm | Result |
Rajashekarappa et al. [13] | RUS Boosted Classifier | 98.73% accuracy |
Najjar et al. [15] | Random Forest Classifier | 97% accuracy |
Davari et al. [16] | Autoencoder | 44% improvement in precision compared to baseline |
Proposed work | Multi-task Artificial Neural Network | 99% average accuracy |
Based on our knowledge, only two works [15,16] use artificial intelligence for the predictive maintenance of urban metro trains. Najjar et al. [15] worked on predicting air failure of the air production unit (APU) in metro trains. The dataset used for this task was MetroPT, a 6-month analysis of metro trains in Portugal comprising analog, digital, and GPS sensors. The GPS information was excluded from the dataset, and the timestamp was encoded using the label encoding technique. A random forest classifier algorithm was used for the multi-class classification of air failure prediction. The data was undersampled and then split into training and testing sets. A feature importance visualization technique was employed to identify the root cause of the air failure. The proposed work produced testing accuracies of 84% and 87% on the binary and multi-class classification tasks and F1 scores in the ranges of 0.83–0.5 and 0.73–0.97 for the binary and multi-class classification tasks. The proposed model produced better results than the work and has also included oil failures in addition to air failures. Davari et al. [16] developed a deep learning neural network for anomaly detection in metro trains. The algorithms used for this task were the sparse autoencoder and variational autoencoder. This work is an unsupervised learning approach for anomaly detection of air failures in trains. The sparse autoencoder trained on the digital data produced 42% more than those trained on analog data. Also, the variational autoencoder performed better than the sparse autoencoder by 37%. The proposed work considered both analog, digital, and GPS sensors worked on both air and oil failures, and produced state-of-the-art results.
The multi-task model produced excellent results on the training and testing datasets. The performance plots indicate that the model trained well and shows no signs of overfitting or underfitting. Also, the confusion matrices and classification reports suggest that the trained model generalizes and is not affected by class imbalance.
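The per-class precision, recall, and F1 values in a classification report follow directly from the confusion matrix. The sketch below shows that computation on hypothetical counts; the matrix values and the function name `per_class_report` are assumptions for illustration and are not the paper's results.

```python
def per_class_report(confusion):
    """Return {class: (precision, recall, f1)} from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    report = {}
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[k] = (precision, recall, f1)
    return report


# Hypothetical counts for three fault-type classes (0: air, 1: oil, 2: normal).
example = [
    [98, 0, 2],
    [0, 100, 0],
    [0, 0, 100],
]
for label, (p, r, f1) in per_class_report(example).items():
    print(label, round(p, 2), round(r, 2), round(f1, 2))
```

Reading the report per class rather than as a single accuracy figure is what makes a class-imbalance problem visible: a minority class with low recall would stand out even when overall accuracy is high.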
The proposed work has several advantages over the existing works. First, it addresses all the issues faced in the metro trains (air and oil failures). Second, the trained models obtain excellent results. Third, it is agile: the proposed multi-task model takes less time to train and to predict on batch data. Finally, the explainable AI technique LIME has been implemented to interpret the outputs of the multi-task model. This builds trust in the model's predictions and can help engineers analyze an issue in depth.
Figure 12 shows the LIME explanation for a local instance in the fault type prediction task. As observed from the plot, the predicted class is air failure with 100% confidence, and the plot shows which features contribute positively and negatively to the prediction.
Figure 13 shows the LIME explanation for a local instance in the fault location prediction task. As observed from the plot, the predicted class is client with 100% confidence. Motor current, DV pressure, and the day contributed positively to the prediction, followed by GPS speed and reservoir. The GPS longitude, minute, and flowmeter contributed negatively to the prediction.
Figure 14 shows the LIME explanation for a local instance in the GPS quality prediction task. As observed from the plot, the predicted instance is air failure with 100% confidence. Oil temperature made the largest positive contribution, whereas H1, day, and minute made the largest negative contributions. For all plots, the range or condition of each input feature is given, which might be of great use for fault analysis.
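The signed feature contributions in these plots come from a local surrogate model. The sketch below illustrates only the core idea behind LIME: perturb the instance, weight the perturbed samples by proximity, and fit a weighted linear model whose coefficients act as the contributions. It is a simplified stand-in written with the standard library, not the `lime` package's implementation, and `toy_model`, `local_linear_explanation`, and all parameter values are hypothetical.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def local_linear_explanation(predict, instance, n_samples=500,
                             scale=0.5, width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`.

    Returns [intercept, coef_1, ..., coef_d]; the sign of each coefficient
    indicates a positive or negative local contribution.
    """
    rng = random.Random(seed)
    d = len(instance)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        sample = [v + rng.gauss(0.0, scale) for v in instance]
        dist2 = sum((s - v) ** 2 for s, v in zip(sample, instance))
        weight = math.exp(-dist2 / width ** 2)  # closer samples count more
        row = [1.0] + sample
        y = predict(sample)
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve_linear_system(xtwx, xtwy)


def toy_model(x):
    """Hypothetical black box standing in for one output of the trained network."""
    return 2.0 * x[0] - 3.0 * x[1] + 1.0


coefs = local_linear_explanation(toy_model, [1.0, 2.0])
print([round(c, 3) for c in coefs])
```

Because the toy model is itself linear, the surrogate recovers its coefficients; for a neural network the fit is only a local approximation, which is why the explanation is tied to a single instance.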
One reason for the good results is the split of the output labels into fault type and location. This reduces the number of interdependent classes in each stage, which might have improved the performance of the algorithms. We also observed a 6% rise in performance when the features were standardized. On the deep learning side, we developed a multi-task model that splits the classes into individual tasks, so each task receives more attention, which led to better results and shorter training and prediction times.
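The two preprocessing steps described above can be sketched as follows. The mapping from the combined label to (type, location) codes is inferred from the label tables in this paper and should be treated as an assumption; the helper names and the sample oil-temperature readings are hypothetical.

```python
from statistics import mean, pstdev

# Combined label -> (type code, location code). Assumed from the label tables:
# type 0 air, 1 oil, 2 normal; location 0 air dryer, 1 client, 2 compressor,
# 3 no location.
LABEL_TO_TASKS = {
    0: (0, 0),  # air leak in air dryer
    1: (0, 1),  # air leak in client chamber
    2: (1, 2),  # oil leak in compressor
    3: (2, 3),  # no fault
}


def split_labels(labels):
    """Expand one combined label vector into separate type and location targets."""
    types = [LABEL_TO_TASKS[y][0] for y in labels]
    locations = [LABEL_TO_TASKS[y][1] for y in labels]
    return types, locations


def standardize(column):
    """Scale a feature column to zero mean and unit (population) variance."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant column carries no signal
    return [(x - mu) / sigma for x in column]


types, locations = split_labels([0, 1, 2, 3])
print(types, locations)  # [0, 0, 1, 2] [0, 1, 2, 3]

oil_temperature = [60.0, 65.0, 70.0]  # hypothetical readings in Celsius
scaled = standardize(oil_temperature)
print([round(z, 3) for z in scaled])
```

In practice the mean and standard deviation would be estimated on the training split only and reused for the test split, so that no test information leaks into training.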
However, the study has some limitations. First, the dataset was undersampled to 3,000,000 data points, roughly 20% of the entire dataset. Second, the target vectors were expanded into new columns, which introduced more computation and the need to train more models. While this approach was intended to improve the overall performance of the models, it resulted in multiple datasets and, ultimately, multiple ML models to train, increasing computational cost. Third, the selection of machine learning algorithms was limited. Many good machine learning algorithms, such as support vector machines and ensemble learning techniques, were not implemented due to computational constraints and long training durations (the SVM algorithm did not finish training even after 30 minutes!).
In this work, a multi-task model was developed to identify failures simultaneously. The proposed method produced 98.89%, 99.12%, and 99.24% accuracies on the testing set for failure type, failure location, and GPS quality predictions, respectively, exceeding the state-of-the-art methods. The model produced 99.56%, 99.67%, and 99.84% precision on the testing set for failure type, failure location, and GPS quality predictions, respectively. The high accuracy and precision values indicate good model performance and no signs of class imbalance. The deep learning model took 43 seconds for training and 1 second for inference on the test data, showing the fast predictions needed for predictive maintenance applications. Moreover, a real-time interactive dashboard was developed that performs dynamic data visualization and predictions. Finally, the LIME explainable AI technique provides explanations for the predictions, building trust and supporting better analysis by engineers. The developed system would help engineers perform fault analysis and predictive maintenance effectively. Future work will use a database to store the streaming data and deploy the system in real time. We will also develop deep learning algorithms on the entire dataset and employ online learning strategies to update the learned model in real time.
Pratik Vinayak Jadhav: Data Curation and Analysis, Research Design and Methodology, Writing draft. Sairam V. A: Data Curation and Analysis, Research Design and Methodology, Writing draft. Siddharth Sonkavade: Data Curation and Analysis, Research Design and Methodology, Writing draft. Shivali Amit Wagle: Conceptualization, Supervision, Project Development, Writing, Review, and Editing. Preksha Pareek: Conceptualization, Supervision, Project Development, Writing, and Review. Ketan Kotecha: Writing, Review, and Editing, Funding and Resources. Tanupriya Choudhury: Data Analysis, Writing, Review, and Editing.
The authors declare no conflict of interest.
[1] | AfDB (2019) Bujagali Interconnection Project-project completion report. Available from: https://www.afdb.org/en/documents/document/uganda-bujagali-interconnection-project-project-completion-report-101626. |
[2] | African Energy (2018) Cameroon: Africa50 and Stoa acquire stakes in Nachtigal. Available from: https://www.africa-energy.com/live-data/article/cameroon-africa50-and-stoa-acquire-stakes-nachtigal. |
[3] | Alam F, Alam Q, Reza S, et al. (2017) A review of hydropower projects in Nepal. Energy Procedia, 581-585. doi: 10.1016/j.egypro.2017.03.188 |
[4] | Authority BP [BPA] (2011) Leaflet describing the Bui hydropower project, background and proposed benefits, Accra: Bui Power Authority. |
[5] | Bitexco Power (2016) Investment signing ceremony. Available from: http://bitexco.com.vn/newdetail/uob-orix-to-invest-50m-in-vietnambased-bitexco-power 113.html. |
[6] | Blimpo MP, Cosgrove-Davies M (2019) Electricity Access in Sub-Saharan Africa: Uptake, Reliability, and Complementary Factors for Economic Impact, Washington DC: World Bank. |
[7] | Bottelier P (2007) China and the World Bank: how a partnership was built. J Contemp China 16: 239-258. doi: 10.1080/10670560701194475 |
[8] | Briscoe J (1999) The financing of hydropower, irrigation and water supply infrastructure in developing countries. Int J Water Resour Dev 15: 459-491. doi: 10.1080/07900629948718 |
[9] | Bräutigam D (2011) Aid "with Chinese characteristics": Chinese foreign aid and development finance meet the OECD-DAC aid regime. J Int Dev 23: 752-764. doi: 10.1002/jid.1798 |
[10] | Centre for Public Impact (2017) The Bujagali Dam Project in Uganda. Available from: https://www.centreforpublicimpact.org/case-study/bujagali-dam-project-uganda/. |
[11] | Chen W, Dollar D, Tang H (2016) Why is China investing in Africa? Evidence from the firm level. World Bank Econ Rev 32: 610-632. |
[12] | Cheng D, Shi X, Yu J (2020) The impact of the green energy infrastructure on firm productivity: evidence from the three gorges project in the people's republic of china. ADBI Working Paper No.1075, February 2020. Available from: https://www.adb.org/publications/impact-green-energy-infrastructure-firm-productivity. |
[13] | China Exim Bank (2016) White Paper on Green Finance. Available from: english.eximbank.gov.cn. |
[14] | Corfee-Morlot J, Parks P, Ogunleye J (2019) Achieving Clean Energy Access in Sub-Saharan Africa. Available from: https://www.oecd.org/environment/cc/climate-futures/case-study-achieving-clean-energy-access-in-sub-saharan-africa.pdf. |
[15] | Dreher A, Fuchs A, Parks BC, et al. (2017) Aid, China, and Growth: Evidence from a New Global Development Finance Dataset. AidData Working Paper #46. Williamsburg VA: AidData. |
[16] | Eberhard A, Gratwick K, Morella E, et al. (2016) Independent Power Projects in Sub-Saharan Africa: Lessons from Five Key Countries, Washington DC: World Bank. |
[17] | Equator Principles (2013) Equator Principles. Available from: https://equatorprinciples.com/wp-content/uploads/2017/03/equator_principles_III.pdf. |
[18] | Gallagher K (2018) China's global energy finance, Boston: Global Development Policy Center, Boston University. |
[19] | Gugler P, Shi J (2009) Corporate social responsibility for developing country multinational corporations: Lost war in pertaining global competitiveness. J Bus Ethics 87: 3-24. doi: 10.1007/s10551-008-9801-5 |
[20] | Hausermann H (2018) "Ghana must Progress, but we are Really Suffering": Bui Dam, Antipolitics Development, and the Livelihood Implications for Rural People. Society Natural Resour 31: 633-648. doi: 10.1080/08941920.2017.1422062 |
[21] | Heiser W, Liu I, Sachdev KBS (2018) Chinese financing options for Southeast Asian hydropower projects. Int J Hydropower Dams 25: 40-44. |
[22] | Hensengerth O (2011) Interaction of Chinese institutions with host governments in dam construction: the Bui Dam in Ghana. Available from: http://nrl.northumbria.ac.uk/15230/1/Interaction_of_Chinese_Institutions.pdf. |
[23] | Hensengerth O (2013) Chinese hydropower companies and environmental norms in countries of the global South: the involvement of Sinohydro in Ghana's Bui Dam. Environ Dev Sustain 15: 285-300. doi: 10.1007/s10668-012-9410-4 |
[24] | HSA (2019) Hydropower Sustainability Assessment Guidelines and Protocols. Available from: www.hydrosustainability.org. |
[25] | ICOLD (2011) Constitution status. Available from: https://www.icold-cigb.org/userfiles/files/CIGB/INSTITUTIONAL_FILES/Constitution2011.pdf. |
[26] | IEA (2018) International Energy Agency statistics. Available from: www.iea.org/topics/renewables/hydropower/. |
[27] | IEA (2017) Southeast Asia energy outlook 2017. Available from: https://www.iea.org/publications/freepublications/publication/WEO2017SpecialReport_SoutheastAsiaEnergyOutlook.pdf. |
[28] | IEA-ETSAP IRENA (2015) Hydropower Technology Brief. Technology Brief E06. Available from: https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2015/IRENA-ETSAP_Tech_Brief_E06_Hydropower.pdf. |
[29] | IFC (2015) Hydroelectric Power: A Guide for Developers and Investors. Available from: www.ifc.org/wps/wcm/connect/06b2df8047420bb4a4f7ec57143498e5/Hydropower_Report.pdf. |
[30] | IFC (2017) Blended finance at IFC. Available from: https://www.ifc.org/wps/wcm/connect/b775aee2-dd16-4903-89bc-17876825bad8/BF-factsheet-dec2017-01-print.pdf?MOD=AJPERES&CVID=m0Bft1u. |
[31] | IFC (2018) Pioneering responsible business standards: The Equator Principles at 15. Available from: https://www.ifc.org/wps/wcm/connect/news_ext_content/ifc_external_corporate_site/news+and+events/news/insights/perspectives-i2c2 . |
[32] | IHA (2015) Sustainable Development Goals: how does hydropower fit in? Available from: https://www.hydropower.org/blog/sustainable-developmentgoals-how-does-hydropower-fit-in. |
[33] | Ingram E (2018) EDF, IFC, Republic of Cameroon sign agreements to build 420-MW Nachtigal hydropower plant. Hydro Rev, 11. |
[34] | IRENA (2019) Renewable capacity highlights. Available from: https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2019/Mar/RE_capacity_highlights_2019.pdf?la=en&hash=BA9D38354390B001DC0CC9BE03EEE559C280013F. |
[35] | Kirchherr J, Matthews N, Charles KJ, et al. (2017) "Learning it the hard way": social safeguards norms in Chinese-led dam projects in Myanmar, Laos and Cambodia. Energy Policy 102: 529-539. doi: 10.1016/j.enpol.2016.12.058 |
[36] | Le L (2017) Building Hydropower Plants in Uganda: Who is the Best Partner? Freeman Spogli Institute for International Studies, Stanford University and Johns Hopkins School of Advanced International Studies. Available from: https://fsi.stanford.edu/publication/building-hydropower-plants-uganda-who-best-partner. |
[37] | Locher H, Hermansen GY, Johannesson GA, et al. (2010) Initiatives in the hydro sector post-World Commission on Dams-the Hydropower Sustainability Assessment Forum. Water Altern 3: 43-57. |
[38] | Markkanen S, Plummer Braeckman J (2019) Financing Sustainable Hydropower Projects in Emerging Markets: An Introduction to Concepts and Terminology. FutureDAMS Working Paper 003. Manchester: The University of Manchester. |
[39] | Merme V, Ahlers R, Gupta J (2014) Private equity, public affair: Hydropower financing in the Mekong Basin. Global Environ Change 24: 20-29. doi: 10.1016/j.gloenvcha.2013.11.007 |
[40] | Meyer R, Eberhard A, Gratwick K (2018) Uganda's power sector reform: there and back again? Energy Sustainable Dev 43: 75-89. doi: 10.1016/j.esd.2017.11.001 |
[41] | MIGA (2018) Nachtigal Hydro IPP. Available from: https://www.miga.org/project/nachtigal-hydro-ipp. |
[42] | Mott MacDonald (2009) Enhancing Development Benefits to Local Communities from Hydropower Projects: A Literature Review. Available from: http://documents.worldbank.org/curated/en/406951468326991910/pdf/702810ESW0P1100tBenefits0Lit0Review.pdf. |
[43] | Obour P, Owusu K, Agyeman EA, et al. (2016) The impacts of dams on local livelihoods: a study of the Bui Hydroelectric Project in Ghana. Int J Water Resour Dev 32: 286-300. doi: 10.1080/07900627.2015.1022892 |
[44] | Overseas Development Institute (ODI) (2016) Age of Choice: Uganda in the New Development Finance Landscape. Available from: https://www.odi.org/sites/odi.org.uk/files/resource-documents/10459.pdf. |
[45] | Oud E (2002) The evolving context for hydropower development. Energy Policy 30: 1215-1223. doi: 10.1016/S0301-4215(02)00082-4 |
[46] | Pepermans G, Driesen J, Haeseldonckx D, et al. (2005) Distributed generation: definition, benefits and issues. Energy Policy 33: 787-798. doi: 10.1016/j.enpol.2003.10.004 |
[47] | Plummer J (2013) Assessing the effects of pre-construction delay in hydropower projects. PhD Thesis. Cambridge: Department of Engineering, Centre for Sustainable Development, University of Cambridge. |
[48] | Plummer Braeckman J, Disselhoff T, Kirchherr J (2019) Cost and schedule overruns in large hydropower dams: an assessment of projects completed since 2000. Int J Water Resour Dev, 1-16. |
[49] | Poindexter G (2017) CTGC begins construction on the 16-GW Baihetan hydropower station in Southwest China. Hydroworld 8/2017. Available from: https://www.hydroreview.com/2017/08/03/ctgc-begins-construction-on-the-16-gw-baihetan-hydropower-station-in-southwest-china/#gref. |
[50] | Porter IC, Shivakumar J (eds) (2010) Doing a Dam Better: The Lao People's Democratic Republic and the Story of Nam Theun 2, Washington DC: World Bank. |
[51] | Tirpak D, Adams H (2008) Bilateral and multilateral financial assistance for the energy sector of developing countries. Climate Policy 8: 135-151. doi: 10.3763/cpol.2007.0443 |
[52] | WEF-World Economic Forum (2020) The argument for suspending debt payments for emerging economies throughout the pandemic. Available from: https://www.weforum.org/agenda/2020/04/suspend-emerging-developing-economies-debt-payments-covid19-coronavirus. |
[53] | World Bank (1961) Report to the International Bank for Reconstruction and Development-Uganda Electricity Board Project. Available from: documents.worldbank.org. |
[54] | World Bank (1962) Report and recommendations of the President to the Executive directors on a proposed development credit to India for the second Koyna power project. Available from: http://documents.worldbank.org/curated/en/560331468285881087/pdf/multi0page.pdf. |
[55] | World Bank (1973) Appraisal of Kafue Hydropower Project stage II, Zambia. World Bank staff appraisal report, 7 May 1973. Available from: documents.worldbank.org. |
[56] | World Bank (1977) Report and Recommendation of the President of the International Development Association and the International Bank for Reconstruction and Development to the Executive Directors on a proposed credit and proposed loans to the Republic of Malawi for a Third Power Project. Available from: documents.worldbank.org. |
[57] | World Bank (2005) Project appraisal report for Nam Theun II, Laos PDR. Available from: http://documents.worldbank.org/curated/en/250731468277466031/pdf/317640corr.pdf. |
[58] | World Bank Group (2014) Supporting Hydropower: An Overview of the World Bank Group's Engagement. Available from: http://documents.worldbank.org/curated/en/628221468337849536/pdf/91154-REPF-BRI-PUBLIC-Box385314B-ADD-SERIES-Live-wire-knowledge-note-series-LW36-New-a-OKR.pdf. |
[59] | World Bank (2017a) State of Electricity Access Report 2017. Available from: http://documents.worldbank.org/curated/en/364571494517675149/pdf/114841-REVISED-JUNE12-FINAL-SEAR-web-REV-optimized.pdf. |
[60] | World Bank (2017b) Maximizing Finance for Development (MFD). Available from: https://www.worldbank.org/en/about/partners/maximizing-finance-for-development. |
[61] | World Bank (2018) Cameroon: World Bank Group helps boost hydropower capacity. Press release. Available from: https://www.worldbank.org/en/news/press-release/2018/07/19/cameroon-world-bank-group-helps-boost-hydropower-capacity. |
[62] | World Bank and IEA (2015) Progress toward Sustainable Energy 2015. Available from: https://openknowledge.worldbank.org/handle/10986/22148. |
[63] | World Energy Council (2015) World Energy Resources: Charting the Upsurge in Hydropower Development 2015. Available from: https://www.worldenergy.org/assets/downloads/World-Energy-Resources_Charting-the-Upsurge-in-Hydropower-Development_2015_Report2.pdf. |
[64] | Yankson P, Asiedu A, Owusu K, et al. (2018) The livelihood challenges of resettled communities of the Bui dam project in Ghana and the role of Chinese dam-builders. Dev Policy Rev 36: O476-O494. doi: 10.1111/dpr.12259 |
[65] | Zimny J, Michalak P, Bielik S, et al. (2013) Directions in development of hydropower in the world, in Europe and Poland in the period 1995-2011. Renew Sust Energy Rev 21: 117-130. doi: 10.1016/j.rser.2012.12.049 |
1. | Piotr F. Borowski, Nexus between water, energy, food and climate change as challenges facing the modern global, European and Polish economy, 2020, 6, 2471-2132, 397, 10.3934/geosci.2020022 | |
2. | Christopher Schulz, Udisha Saklani, The future of hydropower development in Nepal: Views from the private sector, 2021, 179, 09601481, 1578, 10.1016/j.renene.2021.07.138 | |
3. | Fanqi Zou, Tinghui Li, Feite Zhou, Does the Level of Financial Cognition Affect the Income of Rural Households? Based on the Moderating Effect of the Digital Financial Inclusion Index, 2021, 11, 2073-4395, 1813, 10.3390/agronomy11091813 | |
4. | Divya Narain, Hoong Chen Teo, Alex Mark Lechner, James E.M. Watson, Martine Maron, Biodiversity risks and safeguards of China’s hydropower financing in Belt and Road Initiative (BRI) countries, 2022, 5, 25903322, 1019, 10.1016/j.oneear.2022.08.012 | |
5. | Edoardo Borgomeo, Bill Kingdom, Judith Plummer-Braeckman, Winston Yu, Water infrastructure in Asia: financing and policy options, 2022, 0790-0627, 1, 10.1080/07900627.2022.2062707 | |
6. | Judith Plummer Braeckman, Sanna Markkanen, Perceptions of Risk in Relation to Large Hydropower Projects: A Finance Perspective, 2021, 1556-5068, 10.2139/ssrn.4011266 | |
7. | Emmanuel Yamoah Tenkorang, Francis Enu-Kwesi, Franklin Bendu, Pon Souvannaseng, Evolving Lending Regimes and the Political Economy of Dam Financing in Ghana, 2022, 1556-5068, 10.2139/ssrn.4013584 | |
8. | Yan Yang, Bo Yang, Zijun Xin, Green finance development, environmental attention and investment in hydroelectric power: From the perspective of environmental protection law, 2024, 69, 15446123, 106167, 10.1016/j.frl.2024.106167 | |
9. | Pon Souvannaseng, Fast Finance and the Political Economy of Catastrophic Dam Collapse in Lao PDR: The Case of Xe Pian-Xe Namnoy, 2024, 97, 0030-851X, 261, 10.5509/2024972-art7 |
Name | Description | Type of sensor | Unit |
TP2 | Compressor pressure | Analog | Bar |
TP3 | Pneumatic panel pressure | Analog | Bar |
H1 | Pressure of the valve that is activated when the pressure exceeds 10.2 bar | Analog | Bar |
DV_Pressure | Pressure drop due to water discharge by air dryers | Analog | Bar |
Reservoirs | Air tank pressure | Analog | Bar |
Oil temperature | Temperature of oil in compressor | Analog | Celsius |
Flowmeter | Airflow | Analog | m3/h |
Motor current | Current flowing in the motor | Analog | Ampere |
Comp | Electric signal of the compressor based on the air intake | Digital | - |
DV Electric | Electric signal of compressor outlet | Digital | - |
Towers | Specifies the two towers based on the action of air drying | Digital | - |
MPG | Trigger to start the compressor when the pressure is less than 8.2 bar | Digital | - |
LPS | Trigger when the pressure is less than 7 bar | Digital | - |
The pressure switch | Trigger when pressure is detected in the pilot control valve | Digital | - |
Oil level | Trigger when the oil level is less than the threshold | Digital | - |
Caudal Impulses | Trigger for the air flowmeter | Digital | - |
GPS Longitude | Longitude position | Analog | ° |
GPS Latitude | Latitude position | Analog | ° |
GPS Speed | Speed | Analog | Km/h |
Statistic | TP2 | TP3 | H1 | DVP | Reservoirs | OT | Flowmeter | MoC |
mean | 0.947 | 8.989 | 8.038 | -0.019 | 1.63 | 65.843 | 20.128 | 2.040 |
std | 2.836 | 0.667 | 2.846 | 0.185 | 0.064 | 5.931 | 3.578 | 2.198 |
min | -0.03 | 0.937 | -0.033 | -0.036 | 1.349 | 18.575 | 18.8347 | -0.009 |
25% | -0.009 | 8.492 | 8.332 | -0.0279 | 1.608 | 61.825 | 18.9748 | 0.0024 |
50% | -0.007 | 8.996 | 8.876 | -0.025 | 1.635 | 66.475 | 19.03 | 0.007 |
75% | -0.006 | 9.506 | 9.438 | -0.025 | 1.667 | 70.575 | 19.040 | 3.837 |
max | 10.806 | 10.38 | 10.368 | 8.11 | 1.791 | 80.174 | 37.008 | 9.3375 |
Statistic | COMP | DVE | Towers | MPG | LPS | PrS | OL | CaI |
mean | 0.892 | 0.107 | 0.946 | 0.892 | 0.004 | 0.0 | 0.0 | 0.002 |
std | 0.309 | 0.309 | 0.225 | 0.309 | 0.068 | 0.0 | 0.0 | 0.047 |
min | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
25% | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
50% | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
75% | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
max | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 |
Statistic | GPSLong | GPSLat | GPSSpeed | GPSQuality |
mean | -7.880 | 37.578 | 8.592 | 0.912 |
std | 2.443 | 11.649 | 14.096 | 0.282 |
min | -8.69 | 0.0 | 0.0 | 0.0 |
25% | -8.66106 | 41.1696 | 0.0 | 1.0 |
50% | -8.658 | 41.1858 | 0.0 | 1.0 |
75% | -8.583 | 41.212 | 16.0 | 1.0 |
max | 0.0 | 41.240 | 286.0 | 1.0 |
Statistic | month | day | hour | minute | second |
mean | 1.0 | 16.046 | 13.139 | 29.507 | 29.499 |
std | 0.0 | 8.919 | 6.444 | 17.318 | 17.318 |
min | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
25% | 1.0 | 8.0 | 9.0 | 15.0 | 14.0 |
50% | 1.0 | 16.0 | 14.0 | 30.0 | 29.0 |
75% | 1.0 | 24.0 | 19.0 | 45.0 | 44.0 |
max | 1.0 | 31.0 | 23.0 | 59.0 | 59.0 |
Label Code | Corresponding fault |
0 | Air leak in air dryer |
1 | Air leak in client chamber |
2 | Oil leak in compressor |
3 | No fault |
Label Code | Type of failure | Location of failure |
0 | Air leak | Air dryer |
1 | Oil leak | Client chamber |
2 | No failure | Compressor |
3 | Not reported | No location |
Name of target vector | Description | Classes |
Type | Code for the type of failure | 0- Air failure; 1- Oil failure; 2- Normal |
Location | Code for the location of the failure | 0- Air dryer; 1- Client; 2- Compressor; 3- No location |
GPS quality | Quality of the GPS sensor | 0- Good; 1- Bad |
Set | Time consumption per epoch (ms) | Metric | Type identification | Location identification | GPS quality identification |
Training | 43 | Loss | 0.0031 | 0.0020 | 0.0029 |
Training | 43 | Accuracy | 99.94 | 99.998 | 99.99 |
Training | 43 | Precision | 99.99 | 100 | 100 |
Training | 43 | Recall | 100 | 100 | 100 |
Training | 43 | AUC | 1 | 1 | 1 |
Testing | 1 | Loss | 0.0035 | 0.0026 | 0.0033 |
Testing | 1 | Accuracy | 98.89 | 99.12 | 99.24 |
Testing | 1 | Precision | 99.56 | 99.67 | 99.84 |
Testing | 1 | Recall | 99.92 | 99.93 | 99.93 |
Testing | 1 | AUC | 1 | 1 | 1 |
Classes | Precision | Recall | F1-Score |
0 | 0.98 | 1.00 | 0.99 |
1 | 1.00 | 1.00 | 1.00 |
2 | 1.00 | 1.00 | 1.00 |
Classes | Precision | Recall | F1-Score |
0 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
2 | 1 | 1 | 1 |
3 | 1 | 1 | 1 |
Classes | Precision | Recall | F1-Score |
0 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |