Energy from municipal solid waste (MSW) is steadily being integrated into the global energy feedstock, given the huge amount of waste generated from various sources. This study develops a Multilayer Perceptron Artificial Neural Network for the prediction of the High Heating Value (HHV) of municipal solid waste as a function of moisture content, carbon, hydrogen, oxygen, nitrogen, sulphur, and ash. A total of 123 experimental data points were extracted from reliable databases for training, testing, and validation of the model. The model was trained, validated, and tested with 70%, 20%, and 10% of the municipal solid waste dataset respectively. The predicted HHV was compared with the experimental data for two different training functions, Levenberg-Marquardt backpropagation (LM) and Resilient backpropagation (RP), and with some correlations from the literature. The accuracy of the model was reported based on established performance criteria. The values of Root Mean Squared Error (RMSE), Mean Absolute Deviation (MAD), Mean Absolute Percentage Error (MAPE), and Coefficient of Correlation (CC) were 3.587, 2.409, 21.680, and 0.970 respectively for RP and 3.095, 0.328, 22.483, and 0.986 respectively for LM. Regression analysis was also carried out to determine the level of correlation between the experimental and predicted HHV. The authors conclude that these models can be a useful tool for predicting the heating value of MSW in order to facilitate clean energy production from waste.
1. Introduction
Waste utilization and management is a major challenge in sustainable development, given the mammoth amount of waste being generated from various sources across the globe. Presently, the main trend in waste management is the production of value-added products from Municipal Solid Waste (MSW) [1,2,3], since it reduces the amount of MSW deposited at landfills and also decreases the consumption of traditional fossil fuels [3]. MSW is gaining momentum as a viable source of biofuel [4,5,6], which gives it, like biomass, a relative advantage, since other renewable sources cannot be readily converted to liquid fuel. There is also a high prospect of MSW utilization in electricity generation [7]. The United Nations (UN) projected that the world population will increase by 2.2 billion between 2017 and 2050 [8]. This seemingly exponential increase in the world population, together with the quest for urban life, will subsequently lead to an increase in both MSW generation and energy demand. The main driver of the increase in MSW in urban areas is the change in the consumption pattern and living standard of the urban population. In 2016 alone, 2.01 billion tonnes of solid waste were generated in cities across the globe, at a rate of 0.74 kg per person per day, with 33% not managed in an environmentally safe manner [9]. According to these statistics, the waste generated in cities across the globe may increase from the 2016 level to 3.40 billion tonnes by the year 2050 [9]. In low-income countries, around 90% of the waste generated is not properly disposed of [10]. The MSW generated in developing countries is made up of 55–80% household waste, followed by waste from market and commercial areas, which comprise industrial, institutional, and other related sources, with a 10–30% contribution [11,12]. Residents of developing countries are at risk due to unsustainable and indiscriminate disposal of waste [13].
In view of the severe risk associated with this practice, managing waste is germane to environmental protection and climate change mitigation. Several countries have moved to quantify and utilize the MSW which they generate [14,15,16,17]. Given the cost implications of waste management, waste to energy (WTE) can be considered a sustainable pathway to waste management. It has a high prospect of serving as an alternative source of energy [18] and of producing other value-added products [3], which are economically viable and environmentally sustainable [18,19]. MSW is made up of various heterogeneous substances, from which energy can be produced through various conversion processes such as biological, thermochemical, anaerobic digestion, pyrolysis, incineration and so on [20,21,22,23,24,25]. Methane capture has also been proposed as a way to sustainably manage MSW [16].
However, the design of new systems that can extract fuels from MSW requires knowledge of the fundamental properties of the MSW, especially its High Heating Value (HHV) and elemental composition, as these give scientists and engineers a clue to the utility of the MSW in fuel production [26,27]. The heating value is used to determine the quantity of energy that can be recovered from a given amount of waste when it is subjected to conversion processes [20]. Although the heating value can be determined experimentally using a bomb calorimeter, the current trend in the global economy creates an urgent need to minimize the cost of energy production, which includes the cost of determining the heating value of MSW. Information from routine data such as moisture content, carbon, hydrogen, oxygen, nitrogen, sulphur, and ash content enables rapid decisions about the utilization of MSW. It also provides first-hand information regarding the gaseous emissions and global warming potential of the MSW.
Several models have been developed based on experimental data from the gravimetric composition, ultimate and proximate analysis, and moisture content of waste [20]. However, most of these models do not account for the nonlinear dependencies of MSW [26,27,28]. This has created a knowledge gap regarding the prediction of the HHV of MSW [29,30]. A nonlinear dependency exists in MSW when a change in one property does not correspond proportionally to a change in other properties. For instance, a change in the moisture content or carbon content of MSW may not linearly correspond to a change in oxygen, hydrogen, nitrogen, or sulphur content. Previous correlations and models based on ultimate analysis were developed by Kathiravale, et al. [31], Wilson [32], Meraz, et al. [33], Boumanchar, et al. [34], Niessen [35], Chang [36] and Shi, et al. [37] to estimate the heating value of waste.
Table 1 presents an overview of correlations for the prediction of the HHV of MSW feedstock. In addition, Boumanchar, et al. [34] investigated Multiple Regression Analysis (MLR) and Genetic Programming (GP) formulations for the prediction of the HHV of MSW. Most of the correlations reported (Table 1) were based on MSW [31,33,34,37], while others were based on organic waste [32,35,36].
Apart from the above-mentioned issues related to the prediction of the HHV of MSW, to the best of our knowledge, the literature survey revealed no model that has compared LM and RP for predicting the HHV of biomass. Therefore, the present study constructs a Multilayer Perceptron Artificial Neural Network (MLP-ANN) model for the prediction of the High Heating Value (HHV) of MSW using Levenberg-Marquardt (LM) and Resilient backpropagation (RP) as the training algorithms, with moisture content, carbon, hydrogen, oxygen, nitrogen, sulphur, and ash as the input variables.
2. Materials and methods
2.1. Data collection and processing
Experimental measurements from previous studies, comprising 123 MSW samples, were extracted from Meraz, et al. [33] and Phyllis2 [38], since these are the most comprehensive databases for MSW. Most of the MSW samples fall within the classes of domestic waste, plastic, paper, municipal residue, textile and so on. The input variables are the percentage elemental constituents (C, O, H, N, S, and ash on a dry basis, and moisture content) and the output variable is the HHV (MJ/kg). The dataset for the model was divided into training, testing, and validation subsets in the ratio 70%, 10%, and 20% respectively. The descriptive statistical distribution of the MSW data applied in this study is shown in Table 2.
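The 70%/20%/10% partition described above can be sketched as follows. This is a minimal illustration, assuming a random shuffle of the 123 sample indices; the function name `split_indices`, the seed, and the use of rounding to turn fractions into counts are assumptions for illustration, not details reported in the study.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.70, val_frac=0.20, seed=0):
    """Shuffle sample indices and split them into train/validation/test sets.

    The test set receives whatever remains after the training and
    validation sets are filled (about 10% here).
    """
    rng = np.random.default_rng(seed)          # hypothetical fixed seed
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(123)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 123 samples and this rounding rule, the three subsets hold 86, 25, and 12 samples. A different rounding rule would shift one or two samples between subsets.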
2.2. Principles of MLP-ANN
The Multilayer Perceptron (MLP) is a popular supervised learning technique in ANN whose architecture has been used for several forecasting problems in the literature [39,40,41]. It is a distributed mathematical model inspired by the behaviour of the human brain and nervous system. The MLP basically consists of three types of layer: the input layer, one or more hidden layers, and the output layer. The hidden layers may have one or more activation functions [42,43,44,45]. The input for this study is the elemental composition, ash content, and moisture content of the MSW, while the output is the HHV. In order to determine the optimal prediction model for the HHV, two training algorithms, Levenberg-Marquardt (LM) and Resilient Backpropagation (RP), were applied. The network has two hidden layers, H1 and H2, with 2 and 3 neurons respectively. After initial trials, the logarithmic sigmoid was used as the activation function of the first hidden layer [46,47] and SoftMax for the second hidden layer [42], and a linear function was chosen for the output layer. Figures 1 and 2 show the schematic diagram of the MLP-ANN and the neural network architecture respectively. The training and testing steps involved in using the MLP-ANN to predict the heating value are presented in Figure 1, while the transfer functions used in the model formulation are represented in Figure 2.
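The forward pass of the architecture described above (7 inputs, a 2-neuron logarithmic sigmoid layer, a 3-neuron SoftMax layer, and a linear output) can be sketched in NumPy. This is an illustrative sketch only: the random weight initialisation, the helper names (`logsig`, `softmax`, `init_params`, `forward`), and the placeholder inputs are assumptions, not the trained network from the study.

```python
import numpy as np

def logsig(z):
    """Logarithmic sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = z - z.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def init_params(rng, n_in=7, n_h1=2, n_h2=3, n_out=1):
    """Random placeholder weights; layer sizes follow the text."""
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_h1)), "b1": np.zeros(n_h1),
        "W2": rng.normal(scale=0.5, size=(n_h1, n_h2)), "b2": np.zeros(n_h2),
        "W3": rng.normal(scale=0.5, size=(n_h2, n_out)), "b3": np.zeros(n_out),
    }

def forward(params, X):
    """Return the network output and the second hidden layer activations."""
    h1 = logsig(X @ params["W1"] + params["b1"])    # hidden layer H1
    h2 = softmax(h1 @ params["W2"] + params["b2"])  # hidden layer H2
    y = h2 @ params["W3"] + params["b3"]            # linear output (HHV)
    return y, h2

rng = np.random.default_rng(0)
params = init_params(rng)
X_demo = rng.uniform(0.0, 1.0, size=(5, 7))  # placeholder inputs, not MSW data
y_demo, h2_demo = forward(params, X_demo)
print(y_demo.shape)
```

Each row of `X_demo` stands in for one sample's seven input variables; in practice the inputs would typically be scaled before training.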
2.2.1. Levenberg-Marquardt backpropagation (LM)
The Levenberg-Marquardt algorithm is a well-known algorithm which adapts its behaviour according to the distance from the solution [48]. It has shown better performance in a variety of applications when compared to gradient descent and conjugate gradient methods [49,50]. The method was developed to approach second-order training speed without the need to compute the Hessian matrix [51,52]. Considering the effectiveness and efficiency of Newton's method, LM aims at shifting towards Newton's method for quick convergence. When the scalar α is zero, the method reduces to Newton's method with the approximate Hessian matrix; when α is large, it becomes gradient descent with a small step size. The scalar α is reduced after each successful step and is increased only when a tentative step would increase the performance function. Thus, the performance function, which is typically a sum of squares for feedforward backpropagation networks, is reduced at every iteration of the network training process [53]. The Jacobian matrix, which contains the first derivatives of the network errors with respect to the weights and biases, can be computed using a standard backpropagation technique. This computation is less complex than computing the Hessian matrix. The weight update is achieved as follows:
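$$w_{k+1} = w_k - \left[H + \alpha I\right]^{-1} g$$
such that
$$H \approx J^{T} J, \qquad g = J^{T} e$$
where J is the Jacobian matrix, e is the vector of network errors, H is the approximate Hessian matrix, g is the gradient, α is a scalar constant and I is an identity matrix.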
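To illustrate how the Jacobian J, the scalar α, and the identity matrix I interact during training, the following sketch applies a Levenberg-Marquardt iteration to a small curve-fitting problem rather than to the MLP itself. The toy model `a * exp(b * x)`, the synthetic data, the initial α, and the factors used to decrease and increase α are all assumptions chosen for illustration.

```python
import numpy as np

def residuals(w, x, y):
    """Error vector e for the toy model a * exp(b * x)."""
    return w[0] * np.exp(w[1] * x) - y

def jacobian(w, x):
    """First derivatives of the errors with respect to a and b."""
    return np.column_stack([np.exp(w[1] * x), w[0] * x * np.exp(w[1] * x)])

def levenberg_marquardt(w, x, y, alpha=1e-2, n_iter=50, down=0.1, up=10.0):
    """Minimise the sum of squared errors with adaptive damping alpha."""
    w = np.asarray(w, dtype=float)
    sse = np.sum(residuals(w, x, y) ** 2)
    for _ in range(n_iter):
        e = residuals(w, x, y)
        J = jacobian(w, x)
        # Solve (J^T J + alpha * I) step = J^T e
        step = np.linalg.solve(J.T @ J + alpha * np.eye(w.size), J.T @ e)
        w_try = w - step
        sse_try = np.sum(residuals(w_try, x, y) ** 2)
        if sse_try < sse:      # successful step: accept and reduce alpha
            w, sse = w_try, sse_try
            alpha *= down
        else:                  # tentative step failed: keep w, raise alpha
            alpha *= up
    return w, sse

x_demo = np.linspace(0.0, 1.0, 20)
y_demo = 2.0 * np.exp(0.5 * x_demo)   # synthetic, noise-free target
w_fit, sse_fit = levenberg_marquardt([1.0, 0.0], x_demo, y_demo)
print(w_fit, sse_fit)
```

Because a step is accepted only when it lowers the sum of squared errors, the performance function never increases from one iteration to the next, which matches the behaviour described in the text.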
2.2.2. Resilient backpropagation (RP)
The RP algorithm was developed to overcome the local minima error related to backpropagation. The resilient algorithm provides quick local adaptation during the training process. As one of the faster training algorithms, RP eliminates the harmful effects of the magnitudes of the partial derivatives, which are often associated with multi-layered networks trained with sigmoid functions. The slope of the sigmoid functions used in a multi-layer perceptron approaches zero as the magnitude of the input gets larger. This becomes a problem when steepest descent is used to train a network with sigmoid functions: a very small gradient magnitude causes only small changes in the weights and biases, even when the weights and biases are far from their optimal values [53]. Rather than using the magnitude of the partial derivatives, the RP algorithm uses only the sign of the local gradient to determine the direction of the weight update. Once the update value of each weight has been adapted, the delta weights, Δwjk, are determined as follows [52]:
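$$\Delta w_{jk}(i) = \begin{cases} -\Delta_{jk}(i), & \text{if } \dfrac{\partial E}{\partial w_{jk}}(i) > 0 \\ +\Delta_{jk}(i), & \text{if } \dfrac{\partial E}{\partial w_{jk}}(i) < 0 \\ 0, & \text{otherwise} \end{cases}$$
where $\Delta_{jk}(i)$ is the individual update value and E is the error function. The RP algorithm gives faster convergence during neural network training compared to most other algorithms. Table 3 presents the user-defined parameters for the training of the MLP model.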
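The sign-based update can be illustrated with a short NumPy sketch. It minimises a toy quadratic error whose two coordinates have very different gradient magnitudes, to show that only the gradient sign matters. The step-size factors (1.2 and 0.5), the step bounds, the choice of a variant without weight backtracking, and the toy error function are assumptions for illustration and are not the parameters listed in Table 3.

```python
import numpy as np

def rprop_minimize(grad_fn, w, n_iter=100, delta0=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   delta_min=1e-6, delta_max=50.0):
    """Resilient backpropagation update using only gradient signs."""
    w = np.asarray(w, dtype=float).copy()
    delta = np.full_like(w, delta0)      # one update value per weight
    prev_grad = np.zeros_like(w)
    for _ in range(n_iter):
        grad = grad_fn(w)
        sign_product = grad * prev_grad
        # Same sign as last time: grow the update value
        delta = np.where(sign_product > 0,
                         np.minimum(delta * eta_plus, delta_max), delta)
        # Sign flipped (a minimum was jumped over): shrink the update value
        delta = np.where(sign_product < 0,
                         np.maximum(delta * eta_minus, delta_min), delta)
        # After a sign flip, skip the update for that weight this iteration
        grad = np.where(sign_product < 0, 0.0, grad)
        w -= np.sign(grad) * delta
        prev_grad = grad
    return w

# Toy error E(w) = sum(c * (w - t)**2) with badly scaled curvature c
target = np.array([3.0, -2.0])
curvature = np.array([1000.0, 0.001])

def grad_fn(w):
    return 2.0 * curvature * (w - target)

w_opt = rprop_minimize(grad_fn, [0.0, 0.0])
print(w_opt)
```

Both coordinates approach their targets at a similar pace even though their gradient magnitudes differ by six orders of magnitude, which plain steepest descent with a single learning rate would not achieve.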
2.3. Performance analysis
The performance of the MLP-ANN model was assessed using four statistical measures: mean absolute deviation (MAD), root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of correlation (CC). These metrics were chosen because they have been applied in numerous related studies as an effective means of determining the suitability of a model for prediction. The MAPE expresses the average magnitude of the error as a percentage, regardless of whether the error is positive or negative. It is a dimensionless index. The lower the value of MAPE, the better the performance of the model [37]. According to Chang et al. [54], a MAPE below 10% indicates excellent model performance; a value from 10% to less than 20% is considered good; a value from 20% to less than 50% is classified as acceptable; and a value above 50% is classified as unacceptable. This classification is not absolute, however, since the acceptable MAPE baseline also depends on the characteristics of the dataset. The CC measures the strength of the association between two variables. A value of 0 means there is no correlation between the variables, while values close to −1 or 1 indicate a strong negative or positive correlation respectively. The lower the RMSE, the better the model is expected to be.
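Correlation Coefficient (CC):
$$\mathrm{CC} = \frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}}$$
Root Mean Square Error (RMSE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
Mean Absolute Deviation (MAD):
$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
Mean Absolute Percentage Error (MAPE):
$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
where $y_i$ and $\hat{y}_i$ are the experimental and predicted HHV of sample $i$, $\bar{y}$ and $\bar{\hat{y}}$ are their respective means, and $n$ is the number of samples.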
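The four performance measures and the MAPE classification attributed to Chang et al. [54] can be computed with a few lines of NumPy. This sketch assumes the standard definitions of each metric, with errors taken relative to the experimental value for MAPE. The function names and the demonstration arrays are placeholders, not values from the study, and treating a MAPE of exactly 50% as unacceptable is an assumption, since the text leaves that boundary open.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mad(y_true, y_pred):
    """Mean absolute deviation between experimental and predicted values."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

def cc(y_true, y_pred):
    """Pearson coefficient of correlation."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def classify_mape(value):
    """Map a MAPE value (percent) to the qualitative classes in the text."""
    if value < 10.0:
        return "excellent"
    if value < 20.0:
        return "good"
    if value < 50.0:
        return "acceptable"
    return "unacceptable"   # assumption: exactly 50% is treated as unacceptable

# Placeholder values for illustration only (not experimental data)
y_true_demo = np.array([10.0, 20.0, 30.0, 40.0])
y_pred_demo = np.array([12.0, 18.0, 33.0, 36.0])

print(rmse(y_true_demo, y_pred_demo), mad(y_true_demo, y_pred_demo),
      mape(y_true_demo, y_pred_demo), cc(y_true_demo, y_pred_demo))
print(classify_mape(mape(y_true_demo, y_pred_demo)))
```

For the placeholder arrays, the MAD is 2.75 MJ/kg and the MAPE is 12.5%, which falls in the "good" class.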
3. Results and discussion
The observed and predicted HHV at the testing phase using RP and LM are shown in Figures 3 and 4 respectively, in order to compare the prediction performance of both models. The predicted and actual HHV showed a similar trend, with the minimum and maximum absolute discrepancies being 0.0148 and 8.2034 MJ/kg respectively for LM and 0.0048 and 8.6719 MJ/kg respectively for RP. Both models follow a similar pattern across 90% of the dataset. This further shows the similarity between the predicted and the actual HHV and underlines the advantage of a nonlinear regression model in capturing the relationship between the input and output variables. The results in Figures 3 and 4 could improve if the moisture content were ignored in the development of the model [27], but the significance of this parameter in the practical application of MSW means that it should be accounted for.
Figures 5 and 6 show the regression analysis of the network output against the target for the training, validation, testing, and overall datasets. The regression coefficients between the network output and the target for RP and LM are 0.9704 and 0.9857 respectively. This shows that LM has a better regression coefficient, which translates to better explanatory power than RP; the explanatory power of LM is about 2% greater than that of RP. Both learning algorithms showed that the ANN outputs are in close agreement with the actual HHV.
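A regression analysis of this kind fits a straight line to the network output as a function of the target and reports the correlation coefficient. The sketch below shows one way to do this with NumPy; the function name `regression_fit` and the demonstration arrays are hypothetical and do not reproduce the data behind Figures 5 and 6.

```python
import numpy as np

def regression_fit(target, output):
    """Fit output = slope * target + intercept and return the correlation R."""
    slope, intercept = np.polyfit(target, output, deg=1)
    r = np.corrcoef(target, output)[0, 1]
    return float(slope), float(intercept), float(r)

# Illustrative values only: outputs constructed as an exact linear
# function of the targets, so the fit recovers that line
target_demo = np.array([5.0, 12.0, 18.0, 25.0, 33.0])
output_demo = 0.9 * target_demo + 1.0

slope, intercept, r = regression_fit(target_demo, output_demo)
print(slope, intercept, r)
```

A slope near 1, an intercept near 0, and R near 1 together indicate close agreement between the network output and the experimental HHV.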
Table 4 presents the evaluation of the accuracy of the model based on RMSE, MAD, MAPE and CC. Both algorithms showed good performance with good coverage when tested and validated. However, the comparative analysis of their performance metrics showed that LM has a lower RMSE (3.095), a lower MAD (0.328) and a higher CC (0.986), although the lower MAPE is obtained with the RP algorithm (21.680% against 22.483% for LM). The improved MAPE value obtained with RP may have benefitted from the improved speed of convergence, lower sensitivity to the training parameters [55] and minimal learning steps attributed to RP.
To ensure the validity of the developed model, its statistical measures were compared with those of the GP and MLR models developed by Boumanchar et al. [34]. A comparison of GP with LM and RP shows that the CC of GP is the same as that of RP but lower than that of LM. Also, the RMSE observed for GP is lower than that of RP and LM, though at the expense of CC. The MLR shows the highest RMSE and lowest CC of all the models reported. It is safe to conclude that both the LM and RP algorithms can be used for the prediction of the HHV of MSW, since the variation in their performance is low.
The model developed in this study was further compared with the existing linear correlations from the literature. The comparison was drawn only with correlations that used MSW as their dataset and, on this basis, Shi et al. [37], Meraz et al. [33], and Kathiravale et al. [31] were selected. All the correlations were developed based on the data applied in this study to ensure uniform conditions. The deviation of the correlations and the MLP-ANN model from the actual experimental data is shown in Table 5. Of the models compared, the LM-based model shows the least deviation from the experimental results, lower than that of the RP-based model. This may be due to the inherent ability of the ANN model to learn hidden patterns and adapt to the non-linear distribution of the data, which is not a strength of linear regression.
Conclusions
Ultimate analysis data for different sorts of MSW were used to develop an MLP-ANN model, given the simplicity of this approach, as only little calculation is required. The model was applied to predict the HHV of MSW as an avenue for waste to energy production. It was concluded that both the LM and RP algorithms can be used for the prediction of the HHV of MSW; however, the LM-based model performs better than the RP-based model. This tool will be very useful in the decision-making process for designing thermal conversion systems and can accelerate the MSW to energy conversion process. Further study will include a larger dataset with ANN optimization, in order to further improve the robustness of the model.
Conflict of interest
The authors declare no conflict of interest.