Research article

Sulfur dioxide removal by sol-gel sorbent derived CuO/Alumina sorbents in fixed bed adsorber

  • Received: 05 December 2016 Accepted: 18 January 2017 Published: 10 February 2017
  • Nanostructured alumina supported copper oxide granular sorbents were prepared by the sol-gel method. The properties of the sol-gel derived sorbents were compared with a similar commercial sorbent which has been used in the pilot scale moving-bed copper oxide process for flue gas treatment. The crushing strength of the sol-gel derived sorbents is about 6–7 times that of the commercial samples, while the attrition rate of the former is at least 3 times smaller. At temperatures below 400 °C, SO2 sorption capacity of the sol-gel derived sorbent is about 3 times that of the commercial sorbent with a similar amount of CuO loading (7–9 wt%). The better mechanical properties and higher sulfation capacity of the sol-gel derived alumina supported copper oxide sorbents are due to their unique microstructure and the coating method for CuO.

    Citation: Zhong-Min Wang. Sulfur dioxide removal by sol-gel sorbent derived CuO/Alumina sorbents in fixed bed adsorber[J]. AIMS Environmental Science, 2017, 4(1): 134-144. doi: 10.3934/environsci.2017.1.134



    With the advent of the era of big data, deep learning has become a research hotspot in the field of artificial intelligence. It has shown great advantages in image recognition, speech recognition, natural language processing, and other fields. Sequence labeling is one of the most common problems in natural language processing. Shao et al. [1] assign semantic labels to input sequences, exploiting encoding patterns in the form of latent variables in conditional random fields to capture latent structure in observed data. Lin et al. [2] proposed an attentional segmentation recurrent neural network (ASRNN), which relies on a hierarchical attentional neural semi-Markov conditional random field (semi-CRF) model for sequence labeling tasks.

    Convolutional neural networks (CNNs) have been widely used in computer vision recognition tasks. Djenouri et al. [3] proposed a technique for particle clustering for object detection (CPOD), built on top of region-based methods, using outlier detection, clustering, particle swarm optimization (PSO), and deep convolutional networks to identify smart object data. Shao et al. [4] proposed an end-to-end multi-objective neuroevolution algorithm based on decomposition and dominance (MONEADD) for combinatorial optimization problems to improve the performance of the model in inference. The ImageNet Large Scale Visual Recognition Challenge was held annually from 2010 to 2017. Over that period, the image classification accuracy of the champions increased from 71.8% to 97.3%. The emergence of AlexNet in 2012 was a milestone in the deep learning field. After that, accuracy on the ImageNet dataset was significantly improved by novel CNNs such as VGG [5], GoogleNet [6], ResNet [7,8], DenseNet [9], SE-Net [10], and automatic neural architecture search [11,12,13].

    However, real-world applications such as autonomous driving systems, intelligent robot systems, and mobile device applications must balance accuracy against platform resources and system efficiency. Moreover, most of the best-performing CNNs need to run on a high-performance graphics processing unit (GPU). Real-world tasks have therefore driven the development of more lightweight CNNs that can run on low-performance devices [14,15], such as Xception [16], MobileNet [17], MobileNet V2 [18,19], ShuffleNet [20], ShuffleNet V2 [21], and CondenseNet [22]. Group convolution and depth-wise separable convolution [23] are crucial in these works.

    DenseNet, which received the best paper award at CVPR 2017, outperformed the best-performing ResNet on ImageNet without group convolution or depth-wise separable convolution. Subsequently, SE-Net achieved the best results in the history of ImageNet in ILSVRC 2017, but SE-Net still has too many parameters. Following these works, Huang et al. [9] proposed Learned Group Convolutions to improve the connection and convolution methods of DenseNet. Inspired by these works, we study the use of the Squeeze-and-Excitation block (SE block) to improve a lightweight CNN. Furthermore, we explore how to design the structure of the convolutional layer to enhance the network's performance.

    We propose a more efficient network, CED-Net, which combines the bottleneck layer with learned group convolution and the SE block. Learned group convolution can prune network channels during the training phase, and the SE block can recalibrate the feature channels to enhance the channels that are beneficial to the network. Through experiments, we demonstrate that CED-Net is superior to other lightweight networks in terms of accuracy, number of parameters, and FLOPs.

    In the past few years, designing CNNs with a depth chosen to balance accuracy and performance has been a very active field. Recent work has made considerable progress in algorithm optimization, including pruning redundant connections [24,25,26,27], using low-accuracy or quantized weights [28,29], and designing efficient network architectures.

    Early researchers showed that pruning redundant connections and quantization are effective because deep neural networks often have a substantial number of redundant weights that can be pruned or quantized without sacrificing (and sometimes even improving) accuracy. For CNNs, different pruning techniques may lead to varying levels of granularity [30]. Fine-grained pruning, e.g., independent weight pruning [31], generally achieves a high degree of sparsity. Coarse-grained pruning methods such as filter-level pruning achieve a lower degree of sparsity, but the resulting networks are much more regular, which facilitates efficient implementations.

    Recently, researchers have explored efficient network structures that can be applied on mobile devices, such as MobileNet V2, ShuffleNet V2, and NasNet. In these networks, depth-wise separable convolutions play a vital role: they remove a large number of network parameters without significantly reducing accuracy. However, according to Howard et al. [17,18], a large number of depth-wise separable convolutions will decrease the computational speed of the network. Therefore, CED-Net uses a more efficient group convolution and a densely connected architecture to reduce the number of parameters of the network. Furthermore, because many deep-learning libraries implement group convolutions efficiently, they save a lot of computational time in theory and in practice.

    In addition, the bottleneck layer proposed in ResNet can effectively reduce the parameters of multilayer networks. Our experiments show that, when the network is deeper, CED-Net can achieve higher accuracy with fewer parameters than a CondenseNet of the same structure.

    Huang et al. [9], in the CVPR 2017 best paper, proposed a densely connected network that performs better than the previous champion ResNet on ImageNet. After that, CondenseNet achieved the same accuracy with only half of the parameters of DenseNet. In CondenseNet, learned group convolution plays a key role: the network is trained with sparsity-inducing regularization for a fixed number of iterations, and unimportant filters with low-magnitude weights are then pruned away. Because many deep-learning libraries implement group convolutions efficiently, they save a lot of computational time in theory and in practice.

    Moreover, the Squeeze-and-Excitation structure, which stood out in ILSVRC 2017, has been tested on most well-known networks. Squeeze and Excitation are its two critical operations. The structure explicitly models the interdependencies between feature channels and acts as a new "channel recalibration" strategy. Specifically, by automatically learning the importance of each feature channel, SE-Net enhances useful channels and suppresses useless ones. Most current mainstream networks are built by stacking basic blocks, so the SE module can be embedded in almost all network structures; CED-Net achieves more efficient performance by embedding the SE module.

    In this section, we first introduce the structure and function of the bottleneck layer. Next, we explore how the SE block, used as a channel enhancement block, can improve the performance of CED-Net. Finally, we describe the network details of CED-Net for the CIFAR dataset.

    As shown in Figure 1, H, W, and Cin are the height, width, and number of channels of the input image, respectively, and g is the channel growth rate. CED-Net consists of multiple dense blocks for feature extraction. The dense block is shown in Figure 2(c). It consists of two 1 × 1 LG-Conv (Learned Group Convolution) layers and one 3 × 3 G-Conv (Group Convolution) layer. Each 1 × 1 LG-Conv layer uses a permute operation for channel shuffling to reduce the loss of accuracy. BN-ReLU applies nonlinear activation to the input and output of the dense block, and an AvgPool layer is used for downsampling.

    Figure 1.  A 5-layer dense block with channel enhancement and bottleneck layer.
    Figure 2.  Different networks' bottleneck layer or dense block. (a) ResNet. (b) CondenseNet. (c) CED-Net.

    The bottleneck layer was proposed in ResNet, and its detailed structure is shown in Figure 2(a). The three-layer bottleneck structure consists of 1 × 1, 3 × 3, and 1 × 1 convolutional layers, where the two 1 × 1 convolutions are used to reduce and then increase (restore) dimensions. The 3 × 3 convolutional layer can be seen as a bottleneck with smaller input/output dimensions. We replace the 1 × 1 standard convolution with the learned group convolution and the 3 × 3 standard convolution with the group convolution. Unlike ResNet, CED-Net replaces element-wise addition with channel concatenation, because concatenation can use the semantic information of feature maps at different scales and achieves better performance by increasing the number of channels. Element-wise addition does not take up much memory during network transmission, but it may introduce extra noise that causes some feature map information to be lost.

    Figure 2(b) shows the structure used in CondenseNet. The Permute layer, which enables shuffling between channels, is designed to reduce the adverse effects of introducing the 1 × 1 LG-Conv. However, a deep network with the bottleneck layer still has many parameters. Figure 2(c) shows part of the structure used by CED-Net. This structure has fewer parameters than that in Figure 2(b). Specifically, the condense factor and bottleneck factor in CED-Net are set to 4 and reduced by half compared to CondenseNet. This offsets the parameters introduced by adding a 1 × 1 LG-Conv layer.

    One dense layer used in CED-Net has time complexity $\Theta(25G^2/4 + 4CG)$ with respect to the number of input channels C and the number of output channels G, compared with $\Theta(9CG)$ for an ordinary 3 × 3 convolution. Because C becomes much greater than G as the network deepens, the $4CG$ term dominates and CED-Net reduces the time complexity by about half.

    Figure 3 shows how the channels change as they pass through the bottleneck layer based on learned group convolution. The parameters and amount of calculation are 1/4 of those of the standard bottleneck layer. Based on the image classification comparison experiments on the CIFAR dataset, we conclude that our structure can increase the classification accuracy by 0.4% when the number of parameters and the amount of calculation are almost the same as in CondenseNet (see Section 4). When the network is deeper (depth 272), the number of parameters and the amount of calculation of CED-Net are smaller than those of a CondenseNet of the same depth, while the classification accuracy is higher.
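As a rough numerical check of this comparison, the sketch below evaluates the two cost expressions quoted above for a fixed G and a growing C. The function names and the sample values of C and G are illustrative assumptions and are not settings from the experiments.

```python
def dense_layer_cost(c, g):
    """Cost term of one CED-Net dense layer as quoted in the text: 25*G^2/4 + 4*C*G."""
    return 25 * g * g / 4 + 4 * c * g


def standard_conv_cost(c, g):
    """Cost term of an ordinary 3 x 3 convolution as quoted in the text: 9*C*G."""
    return 9 * c * g


def cost_ratio(c, g):
    """Dense-layer cost divided by standard-convolution cost (equals 4/9 + 25*G/(36*C))."""
    return dense_layer_cost(c, g) / standard_conv_cost(c, g)


if __name__ == "__main__":
    g = 32  # illustrative number of output channels
    for c in (64, 256, 1024, 4096):  # illustrative input-channel counts
        print(f"C={c:5d}  G={g}  ratio={cost_ratio(c, g):.3f}")
```

Algebraically the ratio is 4/9 + 25G/(36C), so it approaches 4/9 as C grows relative to G, which is consistent with the "about half" statement.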

    Figure 3.  Bottleneck layer with Learned Group Convolutions.

    In CED-Net, since the network is densely connected, the input of each convolution layer carries a large amount of channel information, and the output after convolution is the sum of all previous channel information. This leads to the entanglement of channel information and spatial relevance. Furthermore, in lightweight networks, group convolution can significantly reduce the amount of computation by ensuring that each convolution operates only on its corresponding input channel group. However, if multiple group convolutions are stacked together, there is a side effect: each output channel is derived from only a small number of input channels. This reduces the information flow between channel groups and weakens the representation.

    Therefore, we use the channel permute (see Figure 2(c)) and the Squeeze-and-Excitation block to let information circulate between the groups and to allow the network to focus on more helpful information. As shown in Figure 4, Squeeze-and-Excitation blocks can improve the representation of the network by modeling the interdependence between convolution feature channels. The detailed process is divided into two steps: Squeeze and Excitation.
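The side effect described above, and how a channel permutation counters it, can be illustrated with channel indices alone. The sketch below uses the common reshape-transpose-flatten formulation of channel shuffling; whether the Permute layer in CED-Net uses exactly this ordering is an assumption, and the function names are hypothetical.

```python
def permute_channels(num_channels, groups):
    """Return the channel order after a shuffle.

    Equivalent to reshaping the channel axis to (groups, n), transposing to
    (n, groups), and flattening again, where n = num_channels // groups.
    """
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    n = num_channels // groups
    return [g * n + i for i in range(n) for g in range(groups)]


def source_groups(order, groups):
    """For each group of the next group convolution, list the original groups it now sees."""
    n = len(order) // groups
    return [sorted({ch // n for ch in order[k * n:(k + 1) * n]}) for k in range(groups)]


if __name__ == "__main__":
    order = permute_channels(8, 2)
    print(order)                    # new channel order
    print(source_groups(order, 2))  # each new group mixes both original groups
```

Without the permutation, group k of the next layer would only ever see channels produced by group k of the previous layer.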

    Figure 4.  Channel Enhancement with Squeeze-and-Excitation.

    Squeeze. Because of the nature of convolution, each convolution filter in a CNN can only focus on specific spatial information. To alleviate this problem, the Squeeze step, as a global description operation, encodes the global spatial information into a channel descriptor by calculating the mean of each channel through global average pooling.

    $$z_c = F_{sq}(u_c) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} u_c(i, j) \tag{1}$$

    In Eq (1), $z_c$ is the output of the squeeze layer, W and H are the width and height of the input feature map of the current layer, $u_c$ is the input feature map, and $F_{sq}(\cdot)$ represents the global information of the entire feature map. The global average pooling used in this paper squeezes each feature map into a single value that indicates the importance of the corresponding channel.

    Excitation. To take advantage of the information obtained by the squeeze operation, the excitation operation needs to meet two criteria in order to fully capture channel dependencies. First, it must be able to learn nonlinear interactions between channels. Second, it must learn a non-mutually-exclusive relationship. Specifically, the gate mechanism is parameterized by placing two fully connected (FC) layers before and after a nonlinearity (ReLU) and is then activated with the sigmoid function.

    $$s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\,\theta(W_1 z)) \tag{2}$$

    where $\theta$ is the ReLU function, and $W_1 \in \mathbb{R}^{\frac{C}{r} \times C}$ and $W_2 \in \mathbb{R}^{C \times \frac{C}{r}}$ are the weights of the dimensionality reduction layer and the dimensionality increase layer, respectively. Here r is the dimensionality reduction rate and C is the number of channels. To limit the complexity of the model and improve generalization, a "bottleneck" is formed by the two FC layers around the nonlinear map, with r set to 16. Finally, after obtaining the so-called gate, multiplying each channel gate by the corresponding feature map controls the flow of information for that feature map.
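The squeeze and excitation steps of Eqs (1) and (2) can be sketched in plain Python as follows. This is a minimal illustration on nested lists and not the authors' implementation: biases are omitted as in Eq (2), and the toy sizes (C = 4, r = 2, 2 × 2 maps) and hand-picked weights are assumptions chosen only to keep the example small, whereas the paper sets r to 16.

```python
import math


def squeeze(u):
    """Eq (1): global average pooling, u[c][i][j] -> z[c]."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in u]


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def excite(z, w1, w2):
    """Eq (2): s = sigmoid(W2 * relu(W1 * z)); w1 is (C/r) x C, w2 is C x (C/r)."""
    return sigmoid(matvec(w2, relu(matvec(w1, z))))


def se_block(u, w1, w2):
    """Scale every feature map u[c] by its channel gate s[c]."""
    s = excite(squeeze(u), w1, w2)
    return [[[s_c * x for x in row] for row in ch] for s_c, ch in zip(s, u)]


if __name__ == "__main__":
    # Toy input: channel c is a 2 x 2 map filled with the value c + 1.
    u = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
    w1 = [[0.1, 0.2, 0.3, 0.4], [-0.4, -0.3, -0.2, -0.1]]   # shape (C/r, C)
    w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]  # shape (C, C/r)
    print("z =", squeeze(u))
    print("s =", [round(x, 4) for x in excite(squeeze(u), w1, w2)])
```

Each gate lies strictly between 0 and 1, so the block can only rescale channels relative to one another; it does not change the spatial layout of any feature map.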

    We embed the Squeeze-and-Excitation block after the 3 × 3 G-Conv layer because the numbers of input and output feature channels of the 3 × 3 G-Conv are the same and small. The Squeeze-and-Excitation block can enhance the useful channels after feature extraction with only a modest number of extra parameters (see Table 5). According to the research results of Hu et al., this method can balance the accuracy of the model and the number of parameters.

    Algorithm 1 Image classification based on CED-Net
    Input: In = datasets (x_1, y_1), (x_2, y_2), …, (x_m, y_m)
    Output: Op = Classification accuracy: (y_1, y_2, …, y_n)
        Set: CED-Net feature extraction: G_k(·), k ∈ (0, n)
        for i = 1 : m do
          $\mathrm{Softmax}(G_k(x_i)) = \frac{e^{g_i}}{\sum_{k}^{n} e^{g_k}}$
          i ∈ [1, m], where g_i is one class value in G_k(·).
        end for
        Return Op
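The softmax step of Algorithm 1 and the resulting accuracy can be sketched as follows. The class scores are assumed to be already produced by the feature extractor, the function names are hypothetical, and subtracting the maximum score before exponentiation is a standard numerical-stability step that does not appear in the algorithm listing.

```python
import math


def softmax(scores):
    """Softmax over one vector of class scores g_1..g_n."""
    m = max(scores)  # stability shift; leaves the result unchanged
    exps = [math.exp(g - m) for g in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Index of the class with the highest softmax probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def classification_accuracy(batch_scores, labels):
    """Fraction of samples whose predicted class matches the label."""
    correct = sum(predict(s) == y for s, y in zip(batch_scores, labels))
    return correct / len(labels)


if __name__ == "__main__":
    scores = [[1.0, 3.0, 2.0], [5.0, 0.0, 0.0]]  # made-up class scores
    labels = [1, 2]                              # made-up labels
    print(softmax(scores[0]))
    print(classification_accuracy(scores, labels))
```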

    CED-Net can deliver good performance while remaining lightweight because it effectively combines the bottleneck layer structure with channel enhancement blocks. An important difference between CED-Net and other network architectures is that CED-Net has very narrow layers. A relatively small channel growth rate is sufficient to obtain state-of-the-art results on the test dataset. This can increase the proportion of features from the later layers relative to features from the earlier layers. So, we set the channel growth rate of a dense connection layer to 4. We also found that making the early layers too deep significantly increases the FLOPs of the network.

    Architectural details. The model used in our experiments has three dense blocks. Before the data enters the first dense block, the input image goes through a 3 × 3 standard convolution with 16 output channels and a stride of 2. In each dense layer, the number of channel enhancement blocks is set according to the growth rate, the input channels, and the output channels, as given in Eq (3).

    $$n = \frac{C_{out} - C_{in}}{g} \tag{3}$$

    where g is the growth rate, $C_{in}$ is the number of input channels, n is the number of channel enhancement blocks, and $C_{out}$ is the number of output channels. For example, in the experiment, we set the growth rate to 8, 16, and 32, and the dense layers output 256, 736, and 1696 channels, respectively, so the number of channel enhancement blocks in each dense layer is 30.

    For each convolutional layer with a kernel size of 3 × 3, each side of the input is zero-padded to keep the feature size fixed. In general, we add the batch normalization layer and the ReLU function after the last dense layer and then use global average pooling to compress the feature map into one dimension as the input of the Softmax layer. The exact network configuration is shown in Tables 1 and 2.
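Eq (3) can be checked against the CIFAR configuration in Table 1 with a few lines of Python. The stage tuples below are read from Table 1 (16 channels entering the first block, then 256, 736, and 1696 output channels); the function name is a hypothetical helper.

```python
def num_enhancement_blocks(c_in, c_out, g):
    """Eq (3): n = (C_out - C_in) / g, required to be a whole number."""
    diff = c_out - c_in
    if diff % g != 0:
        raise ValueError("C_out - C_in must be a multiple of the growth rate")
    return diff // g


# (C_in, C_out, g) for the three dense blocks, taken from Table 1.
CIFAR_STAGES = [(16, 256, 8), (256, 736, 16), (736, 1696, 32)]

if __name__ == "__main__":
    for c_in, c_out, g in CIFAR_STAGES:
        print(c_in, c_out, g, num_enhancement_blocks(c_in, c_out, g))
```

All three stages give n = 30, which matches the Repeat column of Table 1.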

    Table 1.  Network structure of CED-Net on CIFAR.
    | Layers | Output Size | Output Channels | Repeat | Stride |
    |---|---|---|---|---|
    | 3 × 3 Convolution | 32 × 32 | 16 | 1 | 1 |
    | Dense bottleneck block | 32 × 32 | 256 (g = 8) | 30 | 1 |
    | Avg pooling | 16 × 16 | | 1 | 2 |
    | Dense bottleneck block | 16 × 16 | 736 (g = 16) | 30 | 1 |
    | Avg pooling | 8 × 8 | | 1 | 2 |
    | Dense bottleneck block | 8 × 8 | 1696 (g = 32) | 30 | 1 |
    | Global avg pooling | 1 × 1 | 1696 | 1 | 8 |
    | Fully connected | 1 × 1 | 10 | 1 | |

    Table 2.  Network structure of CED-Net on ImageNet.
    | Layers | Output Size | Output Channels | Repeat | Stride |
    |---|---|---|---|---|
    | 3 × 3 Convolution | 112 × 112 | 64 | 1 | 2 |
    | Dense bottleneck block | 112 × 112 | 96 (g = 8) | 4 | 1 |
    | Avg pooling | 56 × 56 | | 1 | 2 |
    | Dense bottleneck block | 56 × 56 | 192 (g = 16) | 6 | 1 |
    | Avg pooling | 28 × 28 | | 1 | 2 |
    | Dense bottleneck block | 28 × 28 | 448 (g = 32) | 8 | 1 |
    | Avg pooling | 14 × 14 | | 1 | 2 |
    | Dense bottleneck block | 14 × 14 | 1088 (g = 64) | 10 | 1 |
    | Avg pooling | 7 × 7 | | 1 | 2 |
    | Dense bottleneck block | 7 × 7 | 2112 (g = 128) | 8 | 1 |
    | Global avg pooling | 1 × 1 | 2112 | 1 | 7 |
    | Fully connected | 1 × 1 | 1000 | 1 | |


    The training process of CED-Net is shown in Algorithm 1. (x_i, y_i) in the input denote the images and labels of the i-th batch, respectively. For each batch, we use softmax to obtain the output Y_i of CED-Net. Finally, the image features G_k of the n categories are obtained.

    In this section, we conduct experiments on the CIFAR-10, CIFAR-100, and ImageNet (ILSVRC 2012) datasets. First, we compare CED-Net with other advanced convolutional neural networks, such as VGG16, ResNet-101, and DenseNet. Then, we conduct ablation experiments on CED-Net, mainly comparing three networks: the primary network of CED-Net-128, the optimized network with only the bottleneck layer, and the network with only the channel enhancement block. Through these experiments, we verify the effectiveness of our improved method. Next, we introduce the datasets and the evaluation metrics of the experiments.

    The CIFAR-10 and CIFAR-100 datasets consist of colored natural images of 32 × 32 pixels. CIFAR-10 consists of images drawn from 10 classes and CIFAR-100 from 100 classes. The training and test sets contain 50,000 and 10,000 images, respectively, and we hold out 5000 training images as a validation set. We adopt a standard data augmentation scheme (mirroring/shifting): each image is zero-padded with 4 pixels per side and then randomly cropped to generate a 32 × 32 image. The image is flipped horizontally with a probability of 0.5 and normalized by subtracting the channel mean and dividing by the channel standard deviation.

    The ImageNet dataset consists of 224 × 224 pixel colored natural images from 1000 classes. The training and validation sets contain 1,280,000 and 50,000 images, respectively. We adopt the data augmentation scheme at training time and, at test time, rescale to 256 × 256 followed by a 224 × 224 center crop before feeding the input image into the networks.
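The CIFAR augmentation steps described above (zero-pad by 4, random 32 × 32 crop, horizontal flip with probability 0.5, per-channel normalization) can be sketched in plain Python on nested lists. This is a simplified illustration, not the training pipeline used in the paper; the function names are hypothetical, and the channel means and standard deviations are assumed to be computed beforehand.

```python
import random


def zero_pad(channel, pad):
    """Surround a 2-D channel (list of rows) with `pad` zeros on every side."""
    width = len(channel[0]) + 2 * pad
    rows = [[0.0] * pad + list(row) + [0.0] * pad for row in channel]
    border = [[0.0] * width for _ in range(pad)]
    return border + rows + [[0.0] * width for _ in range(pad)]


def crop(channel, top, left, size):
    """Cut a size x size window starting at (top, left)."""
    return [row[left:left + size] for row in channel[top:top + size]]


def hflip(channel):
    """Mirror a channel left to right."""
    return [row[::-1] for row in channel]


def normalize(channel, mean, std):
    return [[(x - mean) / std for x in row] for row in channel]


def augment(image, means, stds, rng, pad=4, size=32, flip_prob=0.5):
    """Apply pad -> random crop -> random flip -> normalize to a [C][H][W] image."""
    top = rng.randint(0, 2 * pad)   # same offsets for all channels
    left = rng.randint(0, 2 * pad)
    flip = rng.random() < flip_prob
    out = []
    for ch, mean, std in zip(image, means, stds):
        ch = crop(zero_pad(ch, pad), top, left, size)
        if flip:
            ch = hflip(ch)
        out.append(normalize(ch, mean, std))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[[1.0] * 32 for _ in range(32)] for _ in range(3)]  # dummy RGB image
    result = augment(image, means=[0.5] * 3, stds=[0.25] * 3, rng=rng)
    print(len(result), len(result[0]), len(result[0][0]))
```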

    We evaluate CED-Net on three criteria:

    Accuracy is the most common metric. It is the number of correctly classified samples divided by the number of all samples. Generally speaking, the higher the accuracy, the better the classifier:

    accuracy=(TP+TN)/(P+N) (4)

    where P (positive) is the number of positive examples in the sample, and N (negative) is the number of negative examples. TP (true positives) is the number of samples that are positive examples that are correctly classified. TN (true negatives) is the number of samples that are actually negative that are correctly classified.

    For a single convolutional kernel we have:

    $$\text{parameters} = k^2 \times C_{in} \times C_{out} \tag{5}$$

    where $k$ is the size of the convolution filter, $C_{in}$ is the number of input channels, and $C_{out}$ is the number of output channels.

    To measure the amount of calculation of the model, we compute the number of FLOPs of each layer. For convolutional kernels, we have:

    $$\text{FLOPs} = 2HW(k^2 C_{in} + 1)\,C_{out} \tag{6}$$

    where 𝐻, 𝑊 are height and width. For fully connected layers, we compute FLOPs as:

    $$\text{FLOPs} = (2I - 1)\,O \tag{7}$$

    where 𝐼 is the input dimensionality and 𝑂 is the output dimensionality.
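Eqs (5) to (7) translate directly into small helper functions. The sketch below applies them to two layers from Table 1; the three input channels of the first convolution (an RGB image) and the use of the output feature-map size for H and W are assumptions, and the function names are hypothetical.

```python
def conv_params(k, c_in, c_out):
    """Eq (5): parameters of a k x k convolution."""
    return k * k * c_in * c_out


def conv_flops(h, w, k, c_in, c_out):
    """Eq (6): FLOPs of a k x k convolution over an h x w feature map."""
    return 2 * h * w * (k * k * c_in + 1) * c_out


def fc_flops(i, o):
    """Eq (7): FLOPs of a fully connected layer with i inputs and o outputs."""
    return (2 * i - 1) * o


if __name__ == "__main__":
    # First CIFAR layer in Table 1: 3 x 3 convolution, 16 output channels, 32 x 32 map.
    print(conv_params(3, 3, 16))         # 432
    print(conv_flops(32, 32, 3, 3, 16))  # 917504
    # Final classifier in Table 1: 1696 features to 10 classes.
    print(fc_flops(1696, 10))            # 33910
```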

    To further prove the stability of CED-Net, we added the interpretation and comparison of precision, recall and F-measure in the ablation experiment:

    precision=TP/(TP+FP) (8)
    recall=TP/(TP+FN) (9)
    F-measure = 2 × precision × recall/(precision + recall) (10)
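The four metrics of Eqs (4) and (8) to (10) can be computed from confusion counts as sketched below. The counts in the example are made up for illustration, and the sketch covers the binary case only; how the per-class values are averaged for the multi-class CIFAR results is not stated in the text.

```python
def accuracy(tp, tn, fp, fn):
    """Eq (4): (TP + TN) / (P + N), with P = TP + FN and N = TN + FP."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Eq (8): TP / (TP + FP)."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Eq (9): TP / (TP + FN)."""
    return tp / (tp + fn)


def f_measure(p, r):
    """Eq (10): harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


if __name__ == "__main__":
    tp, fn, fp, tn = 90, 10, 5, 95  # made-up confusion counts
    p, r = precision(tp, fp), recall(tp, fn)
    print(f"accuracy={accuracy(tp, tn, fp, fn):.4f} precision={p:.4f} "
          f"recall={r:.4f} F-measure={f_measure(p, r):.4f}")
```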

    We train all models with stochastic gradient descent (SGD) using similar optimization hyper-parameters [23,24,25,26,27,28,29,30]. We set the Nesterov momentum weight to 0.9 without dampening and use a weight decay of 0.0001. All models are trained with a mini-batch size of 128 for 200 epochs on the training datasets. We use a cosine annealing learning rate schedule, starting from 0.1 and gradually reducing to 0.

    In this part, we train CED-Net and other advanced convolutional neural networks on the CIFAR-10 and CIFAR-100 datasets. We compare these models under the above three evaluation criteria. See Table 3 for a detailed list.
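The learning rate schedule can be sketched with the usual cosine annealing formula. The text only states the start value (0.1), the end value (0), and the 200 epochs; the exact formula below and the choice to update once per epoch rather than per iteration are assumptions.

```python
import math


def cosine_lr(epoch, total_epochs=200, base_lr=0.1, min_lr=0.0):
    """Cosine annealing from base_lr at epoch 0 down to min_lr at total_epochs."""
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cos_term


if __name__ == "__main__":
    for epoch in (0, 50, 100, 150, 200):
        print(epoch, round(cosine_lr(epoch), 5))
```

The rate falls slowly at first, fastest around the midpoint (where it equals half the initial value), and slowly again near the end.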

    Table 3.  The classification accuracy on CIFAR-10 and CIFAR-100.
    | Model | Params | FLOPs | CIFAR-10 | CIFAR-100 |
    |---|---|---|---|---|
    | VGG-16 | 14.73 M | 314 M | 92.64 | 72.23 |
    | ResNet-101 | 42.51 M | 2515 M | 93.75 | 77.78 |
    | ResNeXt-29 | 9.13 M | 1413 M | 94.82 | 78.83 |
    | MobileNet V2 | 2.30 M | 92 M | 94.43 | 68.08 |
    | DenseNet-121 | 6.96 M | 893 M | 94.04 | 77.01 |
    | CondenseNet-86 | 0.52 M | 65 M | 94.48 | 76.36 |
    | CondenseNet-182 | 4.20 M | 513 M | 95.87 | 80.13 |
    | CED-Net-128 | 0.69 M | 75 M | 94.89 | 77.35 |
    | CED-Net-272 | 5.32 M | 649 M | 96.31 | 80.72 |


    In Table 3, we compare the 128-layer and 272-layer CED-Net with other state-of-the-art CNN architectures. All models were trained for 200 epochs. The results show that, after introducing the bottleneck layer structure and channel enhancement blocks into CondenseNet, CED-Net improves the CIFAR-10 accuracy by 0.4–0.5% (and the CIFAR-100 accuracy by 0.6–1.0%) at a small cost in parameters and FLOPs, compared with the CondenseNet with the same number of stacked blocks. Moreover, compared with the more advanced MobileNet V2, CED-Net-128 is more accurate without using depth-wise separable convolutions; its parameter count is about 30% of that of MobileNet V2 (0.69 M versus 2.30 M), and its FLOPs are also lower (75 M versus 92 M).

    In this part, we train CED-Net and other advanced convolutional neural networks on the ImageNet dataset. We compared these models under the above four evaluation criteria. See Table 4 for a detailed list.

    Table 4.  The classification accuracy on ImageNet.
    | Model | Params | FLOPs | Top-1 | Top-5 |
    |---|---|---|---|---|
    | VGG-16 | 138.36 M | 15.48 G | 71.93 | 90.67 |
    | ResNet-101 | 44.55 M | 7.83 G | 80.13 | 95.4 |
    | MobileNet V2 | 3.5 M | 0.3 G | 71.8 | 91 |
    | DenseNet-121 | 7.98 M | 2.87 G | 74.98 | 92.29 |
    | CondenseNet | 4.8 M | 0.53 G | 73.8 | 91.7 |
    | SE-Net | 115 M | 20.78 G | 81.32 | 95.53 |
    | CED-Net-115 | 9.3 M | 1.13 G | 78.65 | 93.7 |


    In Table 4, we compare the 115-layer CED-Net with other CNN architectures. The results show that the Top-1 and Top-5 accuracies are improved by 4.85 and 2.0 percentage points, respectively, compared with a CondenseNet of the same depth, although the dense bottleneck block used in CED-Net is more complex. Compared with DenseNet, CED-Net increases the number of parameters by 16.5% but needs only 39.4% of the amount of calculation (1.13 G versus 2.87 G FLOPs); the Top-1 and Top-5 accuracies are improved by 3.67 and 1.41 percentage points, respectively. Compared with SE-Net, the Top-1 accuracy of CED-Net is 2.67 percentage points lower, but its parameter count is only 8.1% of that of SE-Net.

    Some misclassified images are shown in Figure 5. There may be unavoidable interfering information in these pictures; it may also be that the network model constructed in this paper does not learn a sufficient number of diverse features and therefore cannot correctly identify every picture with different features.
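The relative figures in this comparison can be recomputed from the rows of Table 4. The sketch below copies the four relevant rows from the table; the dictionary layout and the `compare` helper are illustrative assumptions.

```python
# Rows copied from Table 4: parameters (M), FLOPs (G), Top-1 (%), Top-5 (%).
TABLE4 = {
    "DenseNet-121": {"params_m": 7.98, "flops_g": 2.87, "top1": 74.98, "top5": 92.29},
    "CondenseNet": {"params_m": 4.8, "flops_g": 0.53, "top1": 73.8, "top5": 91.7},
    "SE-Net": {"params_m": 115.0, "flops_g": 20.78, "top1": 81.32, "top5": 95.53},
    "CED-Net-115": {"params_m": 9.3, "flops_g": 1.13, "top1": 78.65, "top5": 93.7},
}


def compare(ours, other, table=TABLE4):
    """Accuracy differences (percentage points) and cost ratios of `ours` versus `other`."""
    a, b = table[ours], table[other]
    return {
        "top1_gain": a["top1"] - b["top1"],
        "top5_gain": a["top5"] - b["top5"],
        "params_ratio": a["params_m"] / b["params_m"],
        "flops_ratio": a["flops_g"] / b["flops_g"],
    }


if __name__ == "__main__":
    for other in ("CondenseNet", "DenseNet-121", "SE-Net"):
        c = compare("CED-Net-115", other)
        print(f"vs {other}: top-1 {c['top1_gain']:+.2f}, top-5 {c['top5_gain']:+.2f}, "
              f"params x{c['params_ratio']:.3f}, FLOPs x{c['flops_ratio']:.3f}")
```

A FLOPs ratio of 0.394 against DenseNet-121 means that CED-Net-115 uses 39.4% of DenseNet's computation, that is, about 60.6% less.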

    Figure 5.  ImageNet misclassified pictures.

    In the dense bottleneck block shown in Figure 2, we use the learned group convolution before and after the 3 × 3 group convolution, which means that there are two consecutive learned group convolutions between two 3 × 3 group convolutions. The two index layers used are redundant, but we think this redundancy is necessary: it can improve the generalization performance of the learned group convolution and help subsequent feature extraction. However, this design increases the amount of calculation and the parameters of the intermediate convolution.

    In this part, we perform an ablation experiment on CED-Net. We trained four models on the CIFAR-10 dataset: CED-Net-128a with no bottleneck layer and no channel enhancement block, CED-Net-128b with the convolutional layer structure changed to the bottleneck layer, CED-Net-128c with the channel enhancement block added to CondenseNet-86, and the CED-Net-128 that we propose in this paper.

    In Table 5, CondenseNet-86 is our basic model. When we turn the structure of CondenseNet into a bottleneck layer, the parameters and FLOPs of the network increase only slightly, and the accuracy increases by about 0.3%. When we add the channel enhancement block to CondenseNet-86, the FLOPs barely increase, but the parameters rise, and the accuracy improves by about 0.3%. In our CED-Net-128, the accuracy is further improved; the increase in parameters is mainly caused by the channel enhancement block, and the increase in FLOPs is mainly caused by the bottleneck layer structure. In addition, the accuracy, precision, recall, and F-measure of each model are very close, which indicates that the four models have extracted stable features.

    Table 5.  The result of ablation experiments on CIFAR-10.
    | Model | Params | FLOPs | Accuracy | Precision | Recall | F-measure |
    |---|---|---|---|---|---|---|
    | CondenseNet (a) | 0.52 M | 65.82 M | 94.48 | 94.50 | 94.48 | 94.49 |
    | CED-Net-128 (b) | 0.59 M | 75.04 M | 94.75 | 94.75 | 94.75 | 94.75 |
    | CED-Net-128 (c) | 0.66 M | 67.04 M | 94.74 | 94.76 | 94.74 | 94.75 |
    | CED-Net-128 | 0.69 M | 75.41 M | 94.89 | 94.89 | 94.89 | 94.88 |
    Note: (a) The basic model of CED-Net, the same as CondenseNet-86, without bottleneck layer and channel enhancement block; (b) the basic model of CED-Net with only a bottleneck layer added; (c) the basic model of CED-Net with only a channel enhancement block added.


    This paper introduces CED-Net, a more efficient densely concatenated convolutional neural network based on a feature enhancement block and a bottleneck layer structure, which increases accuracy through learned group convolution and feature reuse. To make inference efficient, the pruned network can be converted to a network with conventional group convolution, which is efficiently implemented in most deep learning libraries. In our experiments, CED-Net outperformed its underlying network CondenseNet and other advanced convolutional neural networks such as MobileNet V2 and ResNeXt in terms of computational efficiency at the same accuracy level. Moreover, CED-Net has a much simpler structure with higher accuracy. We anticipate further research that combines this framework with Neural Architecture Search (NAS) to design more lightweight convolutional neural network models. We hope our work will draw more attention toward a broader view of using lightweight architectures for deep learning.

    This work was supported by the National Natural Science Foundation of China (No. 61976217), the Opening Foundation of the Key Laboratory of Opto-technology and Intelligent Control, Ministry of Education (KFKT2020-3), the Fundamental Research Funds of Central Universities (No. 2019XKQ YMS87), and the Science and Technology Planning Project of Xuzhou (No. KC21193).

    The authors declare there is no conflict of interest.

    [1] Livengood CD, Markussen JM (1994) FG Technologies for Combined Control of SO2 and NOx. Power Eng 98: 38-42.
    [2] Yeh JT, Demski RJ, Strakey JP, et al. (1985) Combined SO2/NOx Removal from Flue Gas. Detailed Discussion of a New Regenerative Fluidized-Bed Process Developed by the Pittsburgh Energy Technology Center. Environ Prog 4: 223-228.
    [3] Wang ZM, Lin YS (1998) Sol-Gel Synthesis of Pure and Copper Oxide Coated Mesoporous Alumina Granular Particles. J Catal 174: 43-51. doi: 10.1006/jcat.1997.1913
    [4] Deng SG, Lin YS (1997) Granulation of Sol-Gel-Derived Nanostructured Alumina. IChE J 43: 505-514.
    [5] Wang ZM, Sahle-Demessie E, Hassan AA (2011) Selective Oxidation Using Flame Aerosol Synthesized Iron and Vanadium-Doped Nano-TiO2. J Nanotechnology 2011: 209150.
    [6] Wang ZM, Sahle-Demessie E, Hassan AA, et al. (2012) Surface Structure and Photocatalytic Activity of Nano-TiO2 Thin Film for Selective Oxidation. J Environ Eng 138: 923-931
    [7] Wang ZM, Yang G, Biswas P, et al. (2001) Processing of iron-doped titania powders in flame aerosol reactors. Powder Technol 114: 197-204. doi: 10.1016/S0032-5910(00)00321-1
    [8] Wang ZM, Lin YS (1998) Sol-gel-derived Alumina-Supported Copper Oxide Sorbent for Flue Gas Desulfurization. Ind Eng Chem Res 37: 4675-4681. doi: 10.1021/ie980343u
    [9] Sahle-Demessie E, Gonzalez M, Wang ZM, et al. (1999) Synthesizing Alcohols and Ketones by Photoinduced Catalytic Partial Oxidation of Hydrocarbons in TiO2 Film Reactors Prepared by Three Different Methods. Ind Eng Chem Res 38: 3276-3284. doi: 10.1021/ie990054l
    [10] van Heldon HJA, Nabor JE, Zuiderweg J, et al. (1970) Removal of sulfur oxides from gas mixture, U.S. Pat. 3.501,897.
    [11] McCrea DH, Forney AJ, Myers JG (1970) Recovery of sulfur from flue gases using a copper oxide absorbent. J Air Pollut Contr Assoc 20: 819-824. doi: 10.1080/00022470.1970.10469479
    [12] Dautzenberg FM, Nader JE, van Ginneken AJJ (1971) The Shell Flue Gas Desulphurization Process. Chem Eng Prog 67: 86-91.
    [13] Friedman RM, Freeman JJ, Lytle FW (1978) Characterization of Cu/Al2O3 Catalysts. J Catal 55: 10-28. doi: 10.1016/0021-9517(78)90181-1
    [14] Strohmeier BR, Leyden DE, Field RS, et al. (1985) Surface Spectroscopic Characterization of Cu/Al2O3 Catalysts. J Catal 94: 514-530. doi: 10.1016/0021-9517(85)90216-7
    [15] Centi G, Riva A, Passarini N, et al. (1990) Simultaneous Removal of SO2/NOx from Flue-Gases Sorbent-Catalyst Design and Performances. Chem Eng Sci 45: 2679-2686. doi: 10.1016/0009-2509(90)80158-B
    [16] Yoo KS, Kim SD, Park SB (1994) Sulfation of Al2O3 in Flue Gas Desulfurization by CuO/g-Al2O3 Sorbent. Ind Eng Chem Res 33: 1786-1791. doi: 10.1021/ie00031a018
    [17] Centi G, Hodnett BK, Jaeger P, et al. (1995) Development of Copper-on-alumina Catalytic Materials for the Cleanup of Flue Gas and the Disposal of Diluted Ammonium Sulfate Solutions. J Mater Res 10: 553-561. doi: 10.1557/JMR.1995.0553
    [18] Kiel JHA, Edelaar ACS, Prins W, et al. (1992) Performance of Silica Supported Copper-Oxide Sorbents for SOx/NOx Removal from Flue-Gas. 1. Sulfur-Dioxide Absorption and Regeneration Kinetics. Appl Catal B: Environ 1: 13-39.
    [19] Yeh JT, Drummond CJ, Joubert JI (1987) Process Simulation of the Fluidized-Bed Copper-Oxide Process Sulfation Reaction. Environ Prog 6: 44-50. doi: 10.1002/ep.670060123
    [20] Sun X, Tang X, Yi H, et al. (2015) Simultaneous adsorption of SO2 and NO from flue gas over mesoporous alumina. Environ Technol 36: 588-594. doi: 10.1080/09593330.2014.953600
    [21] Wu CM, Baltrusaitis J, Gillanand EG, et al. (2011) Sulfur Dioxide Adsorption on ZnO Nanoparticles and Nanorods. J Phys Chem C 115: 10164-10172.
    [22] Lo Jacono M, Cimino A, Inversi M (1982) Oxidation States of Copper on Alumina Studied by Redox Cycles. J Catal 76: 320-332. doi: 10.1016/0021-9517(82)90263-9
    [23] Habashi F, Mikhail SA, Vo Van K (1976) Reduction of sulfates by hydrogen. Can J Chem 54: 3646-3650. doi: 10.1139/v76-524
    [24] Sacks MD, Tseng TY, Lee SY (1984) Thermal Decomposition of Spherical Hydrated Basic Aluminum Suffate. Ceramic Bulletin 63: 301.
    [25] Yoo JS, Bhattacharyya AA, Radlowski CA, et al. (1992) Advanced De-SOx catalysts: mixed solid solution spinels with cerium oxide. Appl Catal B 1: 169-189. doi: 10.1016/0926-3373(92)80022-R
  • © 2017 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)