Research article

Spatial behavior for the quasi-static heat conduction within the second gradient of type Ⅲ

  • This article focused on investigating the spatial behavior of the quasi-static biharmonic conduction equation within the framework of type Ⅲ of the second gradient in a two-dimensional cylindrical domain. The results of growth or decay estimates were established by using a second-order differential inequality. When the distance tends to infinity, the energy either grows exponentially or decays exponentially. The results showed that the Saint-Venant principle was also valid for the quasi-static biharmonic conduction equation.

    Citation: Jincheng Shi, Shuman Li, Cuntao Xiao, Yan Liu. Spatial behavior for the quasi-static heat conduction within the second gradient of type Ⅲ[J]. Electronic Research Archive, 2024, 32(11): 6235-6257. doi: 10.3934/era.2024290




    CNN: Convolutional Neural Network; GRU: Gated Recurrent Unit; DPI: Deep Packet Inspection; TCP: Transmission Control Protocol; IP: Internet Protocol; RNN: Recurrent Neural Network; LSTM: Long Short-Term Memory; SSR: ShadowsocksR; XAI: Explainable Artificial Intelligence

    Math symbols
    $z_t$ The update gate of GRU
    $r_t$ The reset gate of GRU
    $x_t$ The input of GRU
    $h_t$ The output of GRU
    $\tilde{h}_t$ The candidate output of GRU
    $z_j$ The raw score of the $j$th neuron
    $e^{z_j}$ The exponential value for each element
    $p_j$ The probability that the input sample belongs to the $j$th class
    $P_{max}$ The node with the highest probability
    M The number of mobile applications
    TP Positive samples are correctly identified as positive samples
    TN Negative samples are correctly identified as negative samples
    FP Negative samples are misidentified as positive samples
    FN Positive samples are misidentified as negative samples

    In recent years, there have been significant changes in the way users access the Internet, and the traffic generated by mobile devices has exploded. Mobile applications have become an indispensable part of people's daily lives, and user behavior recognition in mobile applications has become one of the hot research directions in the field of mobile Internet [1]. By monitoring and recognizing mobile application traffic, network security issues such as malicious behavior, network attacks and privacy leaks can be detected and user data and privacy can be protected, which is of great significance for network security control [2]. In addition, the research results can also be used in fields such as mobile application recommendation, advertising delivery and user behavior analysis. Therefore, research on the recognition and management of user behavior traffic in mobile applications has important research value and practical significance.

    Although mobile application traffic identification study is similar to PC traffic identification work, the uniqueness of mobile application traffic, such as the fast iteration speed of application versions and data transmission through encryption protocols, has posed greater challenges to traditional identification methods [3,4]. Currently, mobile traffic identification can be classified into four categories: port-based classification methods, Deep Packet Inspection (DPI) based classification methods, traditional machine learning-based classification methods and deep learning-based classification methods.

    Initially, the major means of classifying network traffic were based on TCP/UDP packet port numbers. However, since many applications do not use registered port numbers or disguise port numbers, this method is often not effective. DPI-based classification methods are a typical rule-based method that requires manual rule-making and matching to achieve traffic identification. This method is time-consuming and becomes less applicable with more encrypted traffic [5]. Traditional machine learning-based classification methods require complex feature engineering techniques to achieve better accuracy, and this method may become ineffective after mobile application updates, which is also one of the bottlenecks faced by machine learning development [6,7]. Deep learning-based classification methods can optimize feature engineering on their own [8,9], solving the problem of over-reliance on manual feature extraction accuracy.

    With the increasing number of installed applications on mobile devices, these applications may run automatically in the background and generate traffic even when the user has not opened them. This poses significant challenges for identifying mobile application traffic. The problem becomes even more severe when communicating through an encrypted channel. Therefore, we propose a mobile application traffic identification model that is resilient to background traffic interference under encrypted channels. The model aims to achieve accurate classification of mobile application traffic in encrypted channels and address the issue of background traffic interference. Furthermore, multiple sets of experiments are conducted to evaluate the performance of the model in classifying mobile application traffic in encrypted channels. The main contributions of this work are as follows:

    1) The adoption of a burst-flow diversion method, where traffic is grouped based on the time intervals between packet arrivals, source-destination IP addresses and port numbers. Different thresholds (0.005 s, 0.05 s, 0.5 s and 1 s) for grouping are compared to assess their impact on the classifier, aiming to find the optimal threshold that enhances the accuracy and robustness of the model's classification.

    2) A neural network which combines Convolutional Neural Networks (CNNs) and Gated Recurrent Units (GRUs) has been designed. This approach utilizes CNNs to extract spatial features from network traffic and employs GRUs to model the temporal sequence, enabling the capturing of dynamic variations in mobile application traffic. Extensive comparative experiments have been conducted to determine the optimal model parameters. This enables the model to achieve superior performance in traffic recognition under encrypted channels.

    3) In scenarios where unknown applications generate background traffic interference, the proposed model not only accurately detects the traffic of known applications under encrypted channels but also incorporates a filtering module to filter out background traffic. The paper explores and analyzes the background traffic detection rate and false positive rate under different confidence thresholds, selecting an appropriate confidence threshold that achieves an optimal balance between background traffic detection and false positives. This leads to an improved recognition accuracy of the model and a stronger anti-interference capability compared to existing methods.

    The organization of the paper is as follows. Section 2 provides a comprehensive review of related work in the field of mobile application traffic identification. Section 3 presents the framework of the proposed method. Section 4 describes the details of the proposed method. Section 5 presents the experimental setup and discusses the results and analysis. Finally, Section 6 concludes the paper with a summary of the findings and suggestions for future research.

    Several studies have explored the effectiveness of various machine learning methods for network traffic classification [10,11]. For instance, Alan et al. [12] successfully identified thousands of applications using the traffic features generated during application launch. However, when the training and testing datasets are collected from different devices, the accuracy drops by as much as 26%. Taylor et al. partitioned the network traffic using adjacent packet time interval thresholds, then performed finer-grained partitioning based on four-tuple information and extracted 18 features related to packet length from packet sequences in different directions. They used support vector machines and random forests to establish classification models, achieving application classification accuracy rates of up to 98% [13].

    Similarly, Park et al. [14] used machine learning techniques to identify traffic patterns generated by the instant messaging application KakaoTalk. Their method selected packet length as the feature sequence and achieved a recognition accuracy rate of 99.7% for KakaoTalk encrypted traffic, albeit with poor scalability. Saltaformaggio et al. proposed a user behavior recognition system called NetScope, which can be deployed on Wi-Fi access points or other network devices. The system had an identification accuracy rate of 78% [15]. Coull et al. [16] found that by analyzing side information such as the size of Apple's iMessage network packets, it is possible to identify message length and language type, as well as to distinguish between five user behaviors such as sending messages, inputting and reading, with classification accuracy rates over 90%.

    Conti et al. extracted features such as packet length and transmission direction from the data stream, and proposed a data stream analysis method based on hierarchical clustering to handle the multiple data streams generated by each operation in an application [17]. They clustered the data streams, labeled the clusters, and then classified the behavior operations using random forests. The experimental results show an identification accuracy of up to 95%. These methods rely heavily on feature engineering by domain experts, which is time-consuming, generalizes poorly and can become ineffective after mobile application upgrades. Deep learning avoids the need for expert feature engineering and has stronger capabilities in learning complex patterns than traditional machine learning methods. The following works demonstrate the effectiveness of deep learning methods for network traffic classification. For example, Wang et al. built a stacked autoencoder classifier that classifies 58 common protocols with accuracy and recall rates exceeding 90% [18]. Hu et al. proposed the CLD-Net model, which combines CNN and LSTM to distinguish encrypted network traffic and accurately recognizes Facebook and Skype application traffic (chat, audio, or file) on the ISCX public dataset [19]. Aceto et al. proposed two frameworks, MIMETIC and DISTILLER. MIMETIC takes two inputs for model training, payload information and protocol/time series features, to classify traffic [20]. With the DISTILLER framework, multitask and multimodal deep learning models can be applied to mobile application classification. Wang et al. proposed an end-to-end encrypted traffic classification model based on 1D-CNN, which performs feature extraction and feature selection before classification [21]. This model was validated on the ISCX public dataset, and the results showed that the classification performance of 1D-CNN is superior to that of 2D-CNN.

    Table 1 categorizes the reviewed works by their primary features. Although existing deep learning-based methods achieve satisfactory identification performance for mobile application traffic, their feature extraction remains insufficient. Additionally, these methods fail to consider the scenario where unknown applications generate background traffic interference in real-world usage, which limits their applicability. Existing works focus on closed testing scenarios for traffic identification, where the training and testing datasets consist of the same traffic classes. This causes a significant decrease in classification accuracy when facing background traffic interference. The interference issue becomes more severe when users communicate through encrypted channels, as both mobile application traffic and background traffic are encrypted, leading to confusion between background traffic and the desired mobile application traffic, thereby affecting the accuracy of mobile application traffic identification. We present a mobile application traffic identification model that is resilient to background traffic interference under encrypted channels. We utilize a neural network approach that combines CNNs and GRUs because CNNs excel in extracting spatial features, while GRUs are adept at handling temporal features among samples. By integrating these two, we can effectively extract latent spatial features from the data and capture temporal characteristics among samples. This methodology empowers the model to maximize feature extraction from mobile application traffic within encrypted channels, consequently enhancing recognition accuracy. Following this, we conducted multiple sets of experiments to evaluate the model's performance in classifying mobile application traffic within encrypted channels.

    Table 1.  Characteristics of reviewed works.
    Author | Primary feature
    Alan et al. [12] | Traffic features generated during application launch
    Taylor et al. [13] | 18 features related to packet length, extracted from packet sequences in different directions
    Park et al. [14] | Packet length as the feature sequence; poor scalability
    Saltaformaggio et al. [15] | User behavior recognition system deployed on Wi-Fi devices
    Coull et al. [16] | Analysis of side information to identify user behavior
    Conti et al. [17] | Hierarchical clustering-based data stream analysis method
    Wang et al. [18] | Stacked autoencoder classification model
    Hu et al. [19] | CLD-Net model combining CNN and LSTM for distinguishing encrypted network traffic
    Aceto et al. [20] | MIMETIC and DISTILLER frameworks
    Wang et al. [21] | End-to-end encrypted traffic classification model based on 1D-CNN


    The overall framework of the mobile application user behavior classification model proposed in this paper is shown in Figure 1. It comprises three main modules: the data preprocessing module, the feature extraction module and the background traffic filtering module.

    Figure 1.  Overall framework of the proposed method.

    In the data preprocessing module, this paper employs three steps to process the experimental dataset: initial screening, traffic grouping, and image transformation. In the traffic grouping step, a Burst-flow diversion method is utilized to divide the traffic samples. The grouping is performed based on the time intervals between data packet arrivals, source and destination IP addresses and port numbers. After grouping, the byte length of the flow samples is standardized to facilitate subsequent model input and processing.

    In the feature extraction module, a neural network design combining CNN and GRU is utilized. This design incorporates the advantages of both CNN and GRU models. It not only accurately and efficiently extracts potential spatial features from the data but also captures temporal features between samples. This allows the model to maximize the extraction of characteristics from mobile application traffic under encrypted channels, thereby enhancing the recognition accuracy.

    In the background traffic filtering module, the model's output probability distribution is compared with a preset confidence threshold. If the highest predicted class probability is greater than or equal to the confidence threshold, that class is taken as the final prediction result. If it is below the confidence threshold, the sample is identified as background traffic and filtered out of the final prediction results. The introduction of the background traffic filtering module effectively reduces interference from background traffic and abnormal data, thereby improving the stability and generalization ability of the model.
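    The thresholding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `filter_background`, the use of `None` to mark background traffic, and the threshold value 0.9 in the usage note are all hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def filter_background(probabilities: Sequence[float],
                      confidence_threshold: float) -> Optional[int]:
    """Return the predicted class index, or None for background traffic.

    `probabilities` is assumed to be the model's output distribution
    over the known application classes.
    """
    # Index of the class with the highest predicted probability.
    best = max(range(len(probabilities)), key=lambda j: probabilities[j])
    if probabilities[best] >= confidence_threshold:
        return best   # confident enough: accept the prediction
    return None       # below the threshold: treat as background traffic
```

    For example, with an illustrative threshold of 0.9, `filter_background([0.05, 0.92, 0.03], 0.9)` accepts class 1, whereas a flat distribution such as `[0.40, 0.35, 0.25]` is filtered out as background traffic.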

    Traffic datasets are typically stored and distributed in .pcap or .pcapng format, and they cannot be directly classified in most cases. Preprocessing of such files is necessary to convert the raw network traffic into a data format suitable for inputting into a classification model. In this paper, the processing of the dataset mainly involves three steps: screening of traffic, traffic grouping, and image transformation. The specific process is as follows.

    In practical network environments, there can be a certain number of TCP retransmissions and corrupt packets. The occurrence and frequency of TCP retransmissions and corrupt packets are primarily dependent on the current network conditions. If a large number of TCP retransmissions and corrupt packets are saved along with the data packets generated by normal communication of mobile applications, it can interfere with the training of the classifier model. Therefore, it is necessary to filter them out.

    One common approach is to filter based on packet characteristics, such as examining packet sequence numbers, checksums and acknowledgment numbers to identify and eliminate packets that might be retransmitted or damaged. Another method involves utilizing the mechanisms within the TCP protocol to filter retransmitted packets, for instance, inspecting the flags in the TCP header to identify retransmitted packets. Additionally, using information such as the order of packet arrival, along with network status and protocol specifications, can help filter out abnormal data that might interfere with training. Here, we employ a method based on packet characteristics for filtering.
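    As a rough illustration of filtering on packet characteristics, the sketch below drops segments with a failed checksum and segments that repeat an already-seen sequence number in the same direction. The paper does not give its exact rules, so the `TcpSegment` record, its field names and the duplicate heuristic are all assumptions made for this example; the input is assumed to be already parsed from the capture file.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class TcpSegment(NamedTuple):
    # Hypothetical, already-parsed TCP segment (not tied to any capture library).
    src: Tuple[str, int]   # (source IP, source port)
    dst: Tuple[str, int]   # (destination IP, destination port)
    seq: int               # TCP sequence number
    payload_len: int       # payload length in bytes
    checksum_ok: bool      # result of checksum verification


def drop_retransmitted_and_corrupt(segments: Iterable[TcpSegment]) -> List[TcpSegment]:
    seen: Set[tuple] = set()
    kept: List[TcpSegment] = []
    for seg in segments:
        if not seg.checksum_ok:
            continue  # corrupt packet
        key = (seg.src, seg.dst, seg.seq, seg.payload_len)
        # Assumed heuristic: a data segment repeating the same direction,
        # sequence number and length is treated as a retransmission.
        if seg.payload_len > 0 and key in seen:
            continue
        seen.add(key)
        kept.append(seg)
    return kept
```

    Segments without payload (pure acknowledgments) legitimately repeat sequence numbers, which is why the duplicate check is applied only to segments that carry data.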

    We adopt the Burst-flow approach to partition the traffic based on the time intervals between packets, source and destination IP addresses and port numbers. The reason we use the burst-flow method to divide the data is twofold. First, the content of traffic within encrypted channels is challenging to analyze directly. By analyzing the bursts in traffic, it becomes possible to make certain inferences about the transmission patterns, frequency, or size of the encrypted data, aiding in understanding the characteristics of data transmission. Second, traffic from different applications may be carried within the same encrypted channel, and from an observer's perspective this traffic shares an identical five-tuple. Clustering the packets into bursts provides a more convenient way to distinguish between them. For a collection of data packets corresponding to a specific mobile application, the packets are sorted by timestamp, which ensures that packets with the same timestamp are grouped together in the same burst group. Next, a threshold value for bursts needs to be determined. If the time difference between the current packet and the previous packet is less than the burst threshold, the current packet is assigned to the burst group to which the previous packet belongs. Otherwise, it is considered the first packet of a new burst group. Section 5.4.1 presents comparative experiments with different burst thresholds (0.005 s, 0.05 s, 0.5 s and 1 s) to evaluate their impact on classification accuracy. Ultimately, 0.5 s is selected as the burst threshold for the experimental setup in this paper.
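    The burst-splitting rule above can be sketched as follows. The `Packet` record and the function name `split_into_bursts` are hypothetical names introduced for this example. The default threshold of 0.5 s is the value selected in the text.

```python
from typing import List, NamedTuple


class Packet(NamedTuple):
    # Hypothetical minimal packet record; field names are illustrative only.
    timestamp: float   # capture time in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    payload: bytes = b""


def split_into_bursts(packets: List[Packet], threshold: float = 0.5) -> List[List[Packet]]:
    """Group packets into bursts by the gap between consecutive packets."""
    bursts: List[List[Packet]] = []
    previous = None
    for packet in sorted(packets, key=lambda p: p.timestamp):
        # A gap below the threshold keeps the packet in the current burst;
        # otherwise the packet becomes the first packet of a new burst.
        if previous is not None and packet.timestamp - previous.timestamp < threshold:
            bursts[-1].append(packet)
        else:
            bursts.append([packet])
        previous = packet
    return bursts
```

    With packets captured at 0.0 s, 0.1 s, 0.4 s, 1.2 s and 1.3 s, the 0.8 s gap between the third and fourth packet exceeds the threshold, so the sketch yields two bursts of three and two packets.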

    After this step, the burst dataset is obtained, denoted as $BurstData=\{burst_1,burst_2,\ldots,burst_i,\ldots,burst_n\}$, where $burst_i$ represents the $i$th burst group and $n$ represents the total number of burst groups. The current burst group may contain data packets generated by more than one device, so a more detailed subdivision of the burst groups is required based on the source and destination IP addresses.

    By traversing the burst dataset, for each burst group $burst_i$, packets with the same or opposite source and destination IP addresses, as well as the same or opposite source and destination port numbers, are combined to form a flow data group. The packets within each flow group are sorted in ascending order based on their timestamps. After this step, the flow dataset is obtained, denoted as $FlowData=\{flow_1,flow_2,\ldots,flow_i,\ldots,flow_n\}$, where $flow_i$ represents the $i$th data flow group and $n$ represents the total number of data flow groups. Each flow group is the data object that will be further transformed into a grayscale image.
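    A minimal sketch of this subdivision is given below. It assumes that "same or opposite" addresses and ports means both directions of one connection belong to the same flow, which is implemented by ordering the two endpoints before using them as a key. The `Packet` record and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Packet(NamedTuple):
    # Hypothetical minimal packet record; field names are illustrative only.
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    payload: bytes = b""


Endpoint = Tuple[str, int]


def flow_key(p: Packet) -> Tuple[Endpoint, Endpoint]:
    # Ordering the endpoints maps A->B and B->A to the same key.
    a, b = (p.src_ip, p.src_port), (p.dst_ip, p.dst_port)
    return (a, b) if a <= b else (b, a)


def split_burst_into_flows(burst: List[Packet]) -> List[List[Packet]]:
    flows: Dict[Tuple[Endpoint, Endpoint], List[Packet]] = defaultdict(list)
    for p in burst:
        flows[flow_key(p)].append(p)
    # Packets inside each flow are sorted by ascending timestamp.
    return [sorted(f, key=lambda p: p.timestamp) for f in flows.values()]
```

    Applying `split_burst_into_flows` to every burst group and concatenating the results would give the flow dataset described above.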

    In order to facilitate the input and processing of subsequent models, the post-partitioned data samples are processed to have a unified byte length. During the establishment of communication, there is frequent interaction between the sender and receiver, and the headers contain more valuable information. Therefore, the original byte sequences of the data packets are extracted starting from the header of the flow sample. The Maximum Transmission Unit (MTU) defines the maximum packet length as 1500 bytes in the network. As described in Section 5.4.2, after practical testing, a standard length of 784 bytes is chosen in this study. If the byte count of a flow sample exceeds the standard length, the first bytes up to the standard length are extracted from the header. If the byte count is less than the standard length, it is padded with 0x00 bytes to reach the standard length. The processed data samples are then transformed into grayscale images. Figure 2 displays the traffic of different applications in grayscale format, representing the flow through encrypted channels.

    Figure 2.  Gray-scale representation of mobile application traffic under an encrypted channel.

    In a grayscale image, each pixel can have 256 shades, with 0x00 representing black and 0xff representing white. One byte consists of 8 bits, and the shade of each pixel in the grayscale image is determined by the value of each byte. The standard-length bytes are transformed into grayscale images of size 28 × 28. The varying interaction behaviors of different types of traffic result in different compositions of the original byte sequences of the data packets. As a result, the generated grayscale images exhibit distinct texture styles, which possess strong representational capabilities.
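    The length standardization and image transformation can be illustrated with the standard library alone. The function names below are hypothetical, and filling the image row by row is an assumption, since the text does not state the pixel order.

```python
from typing import List

SIDE = 28                        # image side length used in the text
STANDARD_LENGTH = SIDE * SIDE    # 784 bytes


def to_fixed_length(flow_bytes: bytes, length: int = STANDARD_LENGTH) -> bytes:
    # Keep the leading bytes, then pad with 0x00 if the sample is too short.
    return flow_bytes[:length].ljust(length, b"\x00")


def to_grayscale(flow_bytes: bytes, side: int = SIDE) -> List[List[int]]:
    fixed = to_fixed_length(flow_bytes, side * side)
    # Each byte value (0..255) becomes one pixel intensity, filled row by row.
    return [list(fixed[r * side:(r + 1) * side]) for r in range(side)]
```

    A flow sample shorter than 784 bytes therefore ends in black (0x00) pixels, while a longer one is truncated to its first 784 bytes.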

    The neural network structure designed for feature extraction in the mobile app user behavior classification model in this paper combines a Convolutional Neural Network (CNN) with a Gated Recurrent Unit (GRU), drawing on the advantages of both. It can not only accurately and efficiently mine potential spatial features from data, but also extract temporal features between samples. CNN extracts the features of traffic samples through the convolutional layer, and then reduces the dimension of the extracted feature information through the pooling layer to reduce the computational complexity of the network. Considering that network traffic data is structured sequential data, this paper uses CNN as part of the neural network structure to learn the spatial features of traffic data. In addition, this paper also uses recurrent neural networks to learn the temporal features of network traffic sequences. LSTM and GRU are two widely used recurrent neural networks designed to mitigate the vanishing and exploding gradient problems of traditional RNNs. Compared with LSTM, GRU can better capture the dependencies with larger intervals in the temporal data, and has fewer parameters and shorter training time in the training process, which is why we choose GRU as the neural network for this paper.

    In the feature extraction module, the designed neural network is used for model training. First, the preprocessed samples are input into the CNN network to extract spatial features. Then, the extracted features are integrated and sent to the GRU network to extract the temporal features of traffic. Finally, the predicted results are output through the fully connected layer and Softmax layer. The specific structure of the feature extraction network model in this paper is shown in Figure 3. We select hyperparameters, such as the number of convolutional layers, the number of fully connected layers, the stride size and the activation function, through comprehensive testing involving numerous parameter combinations. The depth of CNN should neither be too large nor too small, so that it can learn complex relationships while keeping the model converging.

    Figure 3.  Feature extraction network model structure.

    The spatial feature learning module of this network framework consists of the input layer, convolutional layer, pooling layer and fully connected layer of a traditional CNN [22,23]. The specific model architecture parameters are shown in Table 2. The input format of the input layer matches the output format of the preprocessing, which is a fixed format of M × N. To select the best input format, we first set the initial range of M to 10–35 and the range of N to 10–35, and then randomly select different sizes for a series of comparative experiments. After considering all parameters, we find that a 28 × 28 input format achieves the best balance between classification performance and computational efficiency.

    Table 2.  CNN model architecture parameter list.
    Layer number | Operation | Input format | Output format
    1 | Input | 28 × 28 | 28 × 28
    2 | Convolution | 28 × 28 | 32 × (28 × 28)
    3 | Pooling | 32 × (28 × 28) | 32 × (14 × 14)
    4 | Convolution | 32 × (14 × 14) | 32 × (14 × 14)
    5 | Pooling | 32 × (14 × 14) | 32 × (7 × 7)
    6 | FC | 32 × (7 × 7) | 128 × 1


    The proposed model adopts a two-layer convolutional structure for feature extraction, each convolutional layer using 32 convolution kernels of size 3 × 3 to process input data, generating 32 feature maps. In the first convolutional layer, the size of the generated feature map is 28 × 28, while in the second convolutional layer, the size of the generated feature map is 14 × 14. To introduce non-linear transformation, the activation function after the two convolutional operations is the Rectified Linear Unit (ReLU). Compared with traditional Sigmoid and Tanh functions, ReLU does not require exponential calculations and has a trivial derivative, making it more computationally efficient and faster. In addition, ReLU can alleviate the problem of gradient vanishing in deep neural networks; thus, it is chosen as the activation function in this paper [24].

    In the pooling layer, the maximum pooling function with a kernel size of 2 × 2 is used for pooling, which takes the maximum value within the adjacent matrix region as the pooling result. To avoid overfitting, L2 regularization and Dropout techniques are used for training optimization. L2 regularization can penalize high weight values to avoid overfitting to the training data, while Dropout can randomly set neuron outputs to 0, forcing the model to learn more robust features and preventing severe co-adaptation between neurons [25].
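    The feature map sizes listed for the convolutional and pooling layers can be checked with the usual output-size formula. The sketch below assumes stride 1 and one pixel of zero padding for the 3 × 3 convolutions, and stride 2 for the 2 × 2 max pooling. These settings are not stated explicitly in the text and are inferred from the reported sizes (28 → 28 → 14 → 14 → 7). The function names are hypothetical.

```python
from typing import List, Tuple


def window_output(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    # Standard output-size formula for convolution and pooling windows.
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(side: int = 28, channels: int = 32) -> List[Tuple[str, int, int]]:
    """Return (layer name, channels, spatial side) after each layer."""
    shapes = []
    for name in ("conv1", "pool1", "conv2", "pool2"):
        if name.startswith("conv"):
            # Assumed: 3 x 3 kernel, stride 1, padding 1 (size preserved).
            side = window_output(side, kernel=3, stride=1, padding=1)
        else:
            # Assumed: 2 x 2 max pooling with stride 2 (size halved).
            side = window_output(side, kernel=2, stride=2)
        shapes.append((name, channels, side))
    return shapes
```

    Under these assumptions the last pooling layer outputs 32 feature maps of size 7 × 7, so the fully connected layer receives 32 × 7 × 7 = 1568 values and maps them to the 128-dimensional feature vector.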

    The GRU (Gated Recurrent Unit) is an improved version of the LSTM (Long Short-Term Memory) with fewer parameters and only two gates, but with similar functionality [26]. Considering the hardware computational capacity and time cost, this paper uses the GRU to learn the time features of the mobile app user behavior data flow. The feature vector obtained by the Convolutional Neural Network is input into the GRU module for learning to obtain the time features of the data flow. The detailed internal structure of the GRU is shown in Figure 4.

    Figure 4.  The structure of GRU.

    GRU combines the input gate and forget gate of LSTM into a single gate called the update gate, denoted as $z_t$ in the diagram. The update gate determines how much past and new information to retain at the current time step. It controls a combination of the input and forget gates, dictating the extent to which the new candidate value will be incorporated into the current cell state. Specifically, the update gate's computation involves utilizing a sigmoid activation function based on the current input and the hidden state from the previous time step to generate a value between 0 and 1. This value decides how much information gets updated into the state at the current time step.

    GRU also has another gate called the reset gate, denoted as $r_t$ in the diagram, which controls how much past information should be forgotten. The reset gate determines how much of the previous hidden state should be ignored at the current time step. It assists the network in deciding whether to retain past information and which information to discard. The computation of the reset gate is similar to that of the update gate, employing a sigmoid activation function based on the current input and the hidden state from the previous time step to generate a value between 0 and 1.

    In the diagram, $x_t$ represents the input, $h_t$ represents the output and $\tilde{h}_t$ represents the candidate output. Subscripts $t$ and $t-1$ denote the current and previous time steps, respectively. The amount of past memory information that can continue to be retained at the current time step is controlled by $z_t$, or in other words, it determines how much information from the previous time step and the current time step should be passed on to the future. The expressions for each parameter are shown in Eqs (4.1)–(4.4).

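    $$ z_t = \sigma(W_z \cdot [h_{t-1}, x_t]) \tag{4.1} $$
    $$ r_t = \sigma(W_r \cdot [h_{t-1}, x_t]) \tag{4.2} $$
    $$ \tilde{h}_t = \tanh(W \cdot [r_t \times h_{t-1}, x_t]) \tag{4.3} $$
    $$ h_t = (1 - z_t) \times h_{t-1} + z_t \times \tilde{h}_t \tag{4.4} $$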
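
    As an illustration, Eqs (4.1)–(4.4) can be traced in a few lines of plain Python. This is a minimal sketch rather than the Keras implementation used in the experiments: the function names `gru_step`, `sigmoid` and `matvec` are hypothetical, bias terms are omitted because the equations omit them, and each weight matrix is assumed to act on the concatenation of the hidden state and the input.

```python
import math


def sigmoid(v):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x_t, h_prev, W_z, W_r, W):
    """One GRU time step following Eqs (4.1)-(4.4), without bias terms.

    x_t and h_prev are lists of floats; each weight matrix has
    len(h_prev) rows and len(h_prev) + len(x_t) columns.
    """
    concat = h_prev + x_t                                 # [h_{t-1}, x_t]
    z_t = [sigmoid(a) for a in matvec(W_z, concat)]       # update gate, Eq (4.1)
    r_t = [sigmoid(a) for a in matvec(W_r, concat)]       # reset gate, Eq (4.2)
    gated = [r * h for r, h in zip(r_t, h_prev)] + x_t    # [r_t x h_{t-1}, x_t]
    h_cand = [math.tanh(a) for a in matvec(W, gated)]     # candidate, Eq (4.3)
    # Blend the previous state and the candidate, Eq (4.4).
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_cand)]
```

    With all weights set to zero, both gates evaluate to 0.5 and the candidate output to 0, so the new hidden state is half of the previous one, which gives a quick sanity check of Eq (4.4).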

    With the increasing number of installed applications on mobile devices, many applications run automatically in the background and generate traffic even when the user has not opened them. This poses significant challenges for identifying mobile application traffic, and the problem becomes more severe over an encrypted channel, where both the target application traffic and the background traffic are encrypted and therefore difficult to distinguish. Background traffic mixed with the target traffic complicates the model's learning process and makes it harder to capture the genuine behavioral patterns of mobile applications. It leads to unstable predictions and weakens the model's generalization ability and its capability to identify unknown data. It may also force the model to handle a considerable amount of noisy data during training, raising the complexity and time cost of the training process. Hence, a background traffic filtering method is needed.

    The model proposed in this paper effectively reduces the interference of background traffic and abnormal data by constructing a background traffic filtering module to identify and filter untrained application behavior as unknown traffic, thus improving the stability and generalization ability of the model. The principle of background traffic filtering is shown in Figure 5. First, the data stream is input to the trained model presented in Section 4.2. When a data stream is input to the model, the output layer of the model converts the raw scores of each class into probabilities using the Softmax function. Specifically, for a classification problem with N classes, the output layer of the model usually has N neurons, each of which corresponds to a class, and outputs a real number as the raw score for that class. Then, the Softmax function is used to convert these raw scores into probabilities for each class. The definition of the Softmax function is as follows:

    $$ \mathrm{softmax}(z_j) = \frac{e^{z_j}}{\sum_{k=1}^{N} e^{z_k}} \tag{4.5} $$

    where $z_j$ is the raw score of the $j$th neuron, and $N$ is the total number of neurons. The Softmax function transforms each raw score into a non-negative real number $p_j$ representing the probability that the input sample belongs to the $j$th class, and satisfies:

    $$ \sum_{j=1}^{N} p_j = p_1 + p_2 + \cdots + p_N = 1 \tag{4.6} $$
    Figure 5.  The filter principle of the background traffic.

    In general, for any given input, the value of one node should be higher than the values of other output nodes. As shown in Figure 5, the node with the highest probability is denoted as Pmax. The decision to convert the predicted probabilities into class labels is controlled by a parameter called the confidence threshold. If Pmax is less than the threshold, the input traffic is treated as an unknown sample. Among all predicted results, the samples with confidence above the threshold are retained, while the samples below the threshold are filtered out. Using the confidence threshold to filter samples, the interference of background traffic and abnormal data can be effectively reduced, and the stability and generalization ability of the model can be improved. Furthermore, training efficiency can be improved, and training costs can be reduced. In order to test the impact of the threshold on the model performance, a series of thresholds were selected and tested, and the threshold parameter with the highest classification accuracy was ultimately selected, as described in Section 5.5.2.
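
    The filtering rule described above can be sketched as follows. The function names `softmax` and `classify_with_threshold` are hypothetical, and the default threshold of 0.965 is the value selected in the threshold experiments reported in this paper; subtracting the maximum score before exponentiation is an added numerical-stability step that does not change the result of Eq (4.5).

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities, Eq (4.5)."""
    m = max(scores)  # shift by the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_threshold(scores, labels, threshold=0.965):
    """Return (label, P_max); samples below the threshold become 'Unknown'."""
    probs = softmax(scores)
    p_max = max(probs)
    if p_max < threshold:
        return "Unknown", p_max          # filtered as background traffic
    return labels[probs.index(p_max)], p_max
```

    A sample whose raw scores are nearly equal across classes yields a low maximum probability and is filtered out, while a sample with one clearly dominant score keeps its predicted label.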

    The algorithm first utilizes CNN to extract spatial features from traffic samples and then employs GRU to learn the temporal features of network traffic data. Finally, it integrates these features for user behavior classification. Through this combined approach, the algorithm accurately extracts spatial and temporal features from the data to better classify user behavior. Next, we will assess the algorithm's complexity from the perspectives of time complexity and model parameter count.

    In terms of time complexity, a CNN involves operations like convolution, pooling and activation functions. Typically, for a CNN with $N$ layers and $M$ filters, the time complexity per sample is often denoted as $O(NMHW)$, where $H$ and $W$ represent the height and width of the feature maps. The time complexity of a GRU primarily depends on its matrix multiplication, element-wise operations and non-linear activation functions. For time series data of length $T$, the time complexity of a GRU is usually $O(TD^2)$, where $D$ represents the GRU unit's dimension.

    Regarding the model parameter count, the parameter quantity in a CNN model relates to its number of layers, filter count and the number of neurons in each layer. Typically, the parameter count of a CNN model is roughly $O(NML)$, where $L$ represents the number of neurons. The parameter count of a GRU model is determined by the number of units and the input dimension. For a model with $N$ GRU units, the parameter count is approximately $O(ND^2)$, where $D$ is the input dimension.

    The experiments in this paper were run on a 64-bit Windows 10 operating system with an Intel Core i7-9750H/2.60 GHz CPU and 16 GB of memory, using Keras as the deep learning platform, TensorFlow 1.8.0 as the deep learning backend and Python 3.6.2 as the development environment. Cross-entropy, which measures the performance of a classification model whose outputs are probability values between 0 and 1, was used as the loss function during training of the deep neural network, as defined in Eq (5.1).

    $$ H(y, p) = -\sum_{i=1}^{M} y_i \log(p_i) \tag{5.1} $$

    where $M$ is the number of mobile applications, $y_i$ is the true label indicator for the $i$th application and $p_i$ is the predicted probability for the $i$th application computed by the Softmax function. The model was optimized using the gradient descent method, with a mini-batch size of 64 and a learning rate of 0.01, and trained for approximately 50 epochs. Neural networks are prone to overfitting, so it is important to validate and test the trained model.
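
    For a single sample, Eq (5.1) can be evaluated as in the sketch below. The function name `cross_entropy` is hypothetical, the natural logarithm is assumed, and the small `eps` clamp is an added safeguard against taking the logarithm of zero rather than part of the definition.

```python
import math


def cross_entropy(y_true, p_pred, eps=1e-12):
    """Cross-entropy of Eq (5.1) for one sample.

    y_true: one-hot list marking the true application.
    p_pred: Softmax probabilities, one per application.
    """
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, p_pred))
```

    With a one-hot label, only the probability assigned to the true application contributes, so the loss approaches 0 as that probability approaches 1.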

    Ten-fold cross-validation was used in this study to reduce the influence of randomness and chance on evaluation results obtained from a single test set. This method divides the original dataset into 10 mutually exclusive subsets and, in each run, selects one subset as the validation set and the remaining 9 subsets as the training set. This process is repeated 10 times, with each subset taking a turn as the validation set. Finally, the results of these 10 runs are averaged to obtain the final result. Averaging over folds makes the evaluation more reliable and less dependent on one particular split, and thus gives a better estimate of the model's performance on unknown data.
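
    The splitting procedure can be sketched with the standard library alone. The generator name `k_fold_indices`, the shuffling step and the fixed seed are assumptions made for illustration; the paper does not state how samples were assigned to folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # assumed: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]     # k mutually exclusive subsets
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation
```

    Each sample index appears in exactly one validation set, and the per-fold scores would then be averaged to obtain the final result.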

    The experimental dataset in this study was established based on a laboratory environment using the ShadowSocksR (SSR) proxy software to create an encrypted channel. SSR establishes an encrypted channel between the local device and a remote server to encrypt network traffic. This encrypted channel ensures the confidentiality of the data, thereby preventing sensitive information from being intercepted and monitored by third parties. The environment and process for constructing the dataset are illustrated in Figure 6. A laptop was used to create a mobile hotspot, simulating a local area network environment. In this setup, the laptop acts as the gateway listening device, and all traffic generated by smartphones connected to the mobile hotspot passes through the laptop. Wireshark software was used to capture all the traffic in this scenario.

    Figure 6.  Creation process of the mobile application traffic dataset under an encrypted channel.

    The specific information about the experimental devices used in the study is presented in Table 3. Each smartphone used in the experiments had the SSR client (version 3.6.0) installed and was configured in global proxy mode. This means that all the traffic generated by the Android devices was forwarded through the SSR server. In addition, eight popular mobile applications were selected for the experiments: IQiyi, Douyin, JD, Toutiao, NetEase Cloud Music, Instagram, Twitter and YouTube.

    Table 3.  Specific information about experimental devices.
    | Device | Device brand | Device model | System version |
    |---|---|---|---|
    | Smartphone 1 | vivo | vivo Z6 | Android 10.0 |
    | Smartphone 2 | Xiaomi | Xiaomi 8 | Android 8.0 |
    | Smartphone 3 | OPPO | OPPO A9 | Android 8.1 |
    | Notebook computer | Dell | Inspiron 5501-R1625D | Windows 10 |


    The data collection process involved manually operating each selected mobile application on the smartphone devices and capturing approximately 30 minutes of traffic data for each application. This process was repeated for a total duration of about 4 hours. During the data collection, only one selected application was installed on each device, and no other applications were running in the background. The internet access permissions of other applications on the Android devices were restricted to minimize background network traffic. This ensured that the captured traffic data represented the pure traffic of a specific mobile application under the encrypted channel, facilitating subsequent experiments.

    On the laptop side, Wireshark software (version 3.0.14) was installed to monitor and capture the traffic data packets of each mobile application. During the capturing process, damaged and retransmitted packets were filtered out. The captured mobile application traffic was saved in the .pcap file format using Wireshark and labeled with the application source. The number of traffic data packets collected for each mobile application is presented in Table 4.

    Table 4.  Packet count statistics for each application.
    | APP | Packet quantity | Proportion |
    |---|---|---|
    | IQiyi | 74,659 | 9.96% |
    | Douyin | 129,155 | 17.23% |
    | JD | 95,083 | 12.69% |
    | Toutiao | 112,520 | 15.01% |
    | NetEase Cloud Music | 82,535 | 11.01% |
    | Instagram | 79,462 | 10.61% |
    | Twitter | 79,222 | 10.57% |
    | YouTube | 96,843 | 12.92% |
    | Total | 749,479 | 100.0% |


    In this paper, we use Accuracy, Precision, Recall, F1-Score and the Confusion Matrix to evaluate classification models. Accuracy describes the overall performance of the classifier, while Precision and Recall evaluate the classification effect for each category in the classification problem. The F1-Score combines Precision and Recall into a single measure of classifier performance. The confusion matrix can be used to observe the classification of each category in detail. TP means that a positive sample is correctly identified as a positive sample. TN means that a negative sample is correctly identified as a negative sample. FP means that a negative sample is misidentified as a positive sample. FN means that a positive sample is misidentified as a negative sample.

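    $$ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{5.2} $$
    $$ \text{Precision} = \frac{TP}{TP + FP} \tag{5.3} $$
    $$ \text{Recall} = \frac{TP}{TP + FN} \tag{5.4} $$
    $$ \text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5.5} $$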
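
    For a multi-class problem, Eqs (5.2)–(5.5) are typically applied per class in a one-vs-rest fashion. The sketch below assumes a confusion matrix whose rows are true classes and whose columns are predicted classes; the function name `per_class_metrics` and the example matrix are hypothetical.

```python
def per_class_metrics(cm, k):
    """Accuracy, precision, recall and F1 for class k, Eqs (5.2)-(5.5).

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                      # class k samples predicted as others
    fp = sum(row[k] for row in cm) - tp       # other samples predicted as class k
    tn = sum(map(sum, cm)) - tp - fn - fp
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Illustrative two-class matrix (made-up counts, not experimental data).
example = [[8, 2],
           [1, 9]]
acc, prec, rec, f1 = per_class_metrics(example, 0)
```

    Averaging the per-class values over all classes gives macro-averaged scores.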

    In this paper, we adopt a combined approach of various deep learning models. During the training and testing validation processes, issues related to the selection of hyperparameters such as burst threshold setting, number of bytes for packet truncation, convolutional kernel sizes and Dropout values were addressed. Comparative experiments were designed to determine optimal parameters for these factors.

    In this study, the Burst-flow method is employed to process the encrypted mobile application traffic dataset. The traffic data is discretized into burst-form network traffic blocks based on a predefined burst threshold. If the time interval between two data packets exceeds the burst threshold, they are separated into two different bursts. This process helps to prepare the burst-form traffic for further feature extraction. Previous research by Falaki et al. [27] observed that most data packets on smart mobile devices are sent or received within 4.5 s of the previous packet. Similarly, Taylor et al. [13] suggested that setting the burst threshold to 1 s slightly increases the number of bursts in the network, while providing near real-time performance.
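
    The burst separation rule can be sketched as follows, assuming packet arrival times are given in seconds and sorted in ascending order. The function name `split_into_bursts` is hypothetical, and the default threshold of 0.5 s is the value this paper adopts after comparing several thresholds.

```python
def split_into_bursts(timestamps, threshold=0.5):
    """Group sorted packet timestamps (seconds) into bursts.

    A new burst starts whenever the gap to the previous packet
    exceeds the burst threshold.
    """
    bursts = []
    current = []
    previous = None
    for t in timestamps:
        if previous is not None and t - previous > threshold:
            bursts.append(current)
            current = []
        current.append(t)
        previous = t
    if current:
        bursts.append(current)
    return bursts
```

    A smaller threshold splits the same capture into more, shorter bursts, which is why the number of samples per application depends on the threshold.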

    In this study, by observing the arrival time intervals of the captured encrypted mobile application traffic data, we found that the majority of data packets have arrival time intervals ranging from 0.005 to 0.5 s. This indicates that network performance (bandwidth and latency) has improved compared to earlier research. Therefore, this study re-explores the setting of the burst threshold in the traffic processing stage. In the experiments, burst thresholds of 0.005 s, 0.05 s, 0.5 s and 1 s are tested, and the impact of different burst thresholds on classification accuracy is evaluated based on experimental results. The number of data samples obtained for each application under different burst threshold settings is shown in Table 5. Additionally, the classification accuracy achieved by the dataset under different burst threshold settings is illustrated in Figure 7.

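    Table 5.  Sample count statistics for each application under different burst thresholds.
    | APP | Sample quantity (1 s) | Sample quantity (0.5 s) | Sample quantity (0.05 s) | Sample quantity (0.005 s) |
    |---|---|---|---|---|
    | IQiyi | 2122 | 2421 | 3288 | 7784 |
    | Douyin | 1352 | 1517 | 1680 | 7520 |
    | JD | 2458 | 2651 | 3111 | 6018 |
    | Toutiao | 1120 | 1166 | 1269 | 1567 |
    | NetEase Cloud Music | 1805 | 1921 | 2214 | 13,561 |
    | Instagram | 1088 | 1566 | 1674 | 3210 |
    | Twitter | 960 | 1025 | 1136 | 2876 |
    | YouTube | 1780 | 1997 | 2395 | 5638 |
    | Total | 12,685 | 14,264 | 16,767 | 48,174 |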

    Figure 7.  Classification accuracy under different burst thresholds.

    From the experimental results, it can be observed that the burst threshold has a significant impact on the final model's classification performance. When the burst threshold is set to 0.5 s, the model achieves the highest accuracy in classifying the dataset. Therefore, we choose a burst threshold of 0.5 s as the standard for flow separation in the traffic processing.

    To investigate the impact of truncating the original byte length of packets on the classification task of mobile application traffic under encrypted channels, different byte lengths of packets were truncated as input features for the classification model. Figure 8 presents the classification accuracy under different packet lengths.

    Figure 8.  Classification accuracy under different packet lengths.

    From Figure 8, it can be observed that as the truncated byte length increases from 100 to 700, the classification accuracy of the model improves significantly, which indicates that a longer byte length provides more feature information. However, the curve also shows that once the byte length reaches a certain threshold, the additional feature information saturates. Therefore, we truncate each packet to 784 bytes for further experiments. This length provides sufficient information while keeping data processing efficient.
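
    Fixing the input length to 784 bytes can be sketched as below. Only the truncation length comes from the experiments above; padding shorter packets with zero bytes and scaling byte values to the range [0, 1] are assumptions made for illustration, and the function name `to_fixed_length` is hypothetical.

```python
def to_fixed_length(payload, length=784):
    """Truncate or zero-pad raw packet bytes to a fixed-length feature vector."""
    clipped = payload[:length].ljust(length, b"\x00")   # assumed: zero padding
    return [b / 255.0 for b in clipped]                 # assumed: scale to [0, 1]
```

    Every packet then yields a vector of the same size regardless of its original length, which is what a fixed-size CNN input requires.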

    The size of the convolutional kernel can affect the ability to extract features. Using a kernel that is too large or too small can have an impact on feature extraction performance. To investigate the impact of convolutional kernel size on the performance of the classification model for mobile application traffic under encrypted channels, this section conducts comparative experiments using five different kernel sizes. The experimental results are shown in Figure 9.

    Figure 9.  Effect of convolution kernel size on classification performance.

    Figure 9 shows that the kernel size indeed affects the network's classification performance. Specifically, as the kernel size increased from smaller values to 3 × 3, the model's classification accuracy improved consistently. However, once the size exceeded 3 × 3, the accuracy started to decline. This rise and fall indicates a non-monotonic relationship between kernel size and model performance. Selecting an appropriate kernel size involves balancing feature extraction capability against model simplicity. Consequently, a 3 × 3 kernel is chosen as the optimal size, allowing for high performance while keeping model complexity manageable.

    As mentioned in Section 4.2, the Dropout method is used to address overfitting during model training. Dropout is a regularization technique that, in each training iteration, randomly deactivates a fraction of neurons by setting their outputs to zero with a certain probability. This reduces interdependency among neurons: each neuron learns more robust features instead of relying on the presence of specific other neurons, which improves the model's generalization and reduces overfitting to the training data. To determine the appropriate Dropout value, this section conducts comparative experiments with different Dropout values (0.3, 0.4, 0.5 and 0.6). The experimental results, shown in Figure 10, indicate that the model achieves the highest classification accuracy when the Dropout value is set to 0.5. Therefore, in this study, the Dropout parameter is set to 0.5.

    Figure 10.  Effect of different dropout settings on model performance.
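
    The Dropout mechanism described above can be illustrated with a short sketch. It uses the common "inverted dropout" variant, in which the surviving activations are rescaled by 1 / (1 - rate) during training so that no rescaling is needed at inference time; the function name `dropout` and its parameters are hypothetical and do not reproduce the Keras layer used in the experiments.

```python
import random


def dropout(activations, rate=0.5, training=True, rng=random):
    """Zero each activation with probability `rate` during training.

    Surviving activations are divided by the keep probability so that
    the expected value of each output stays unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)          # inference: pass values through
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

    Passing `rng=random.Random(seed)` makes the dropped positions reproducible, which can help when comparing runs with different Dropout values.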

    Because the optimized model suffers from neither overfitting nor underfitting, it achieves good classification performance on encrypted mobile application traffic. The accuracy and loss curves of the model training are shown in Figure 11. Regularization and the Dropout algorithm help alleviate overfitting and improve the model's generalization ability. From Figure 11, it can be observed that the optimized model continuously improves its accuracy on the validation set as the neural network is trained step by step. After 30 epochs of training, the model achieves an accuracy of 98.12% on the training set and 97.56% on the validation set.

    Figure 11.  Performance of the model after optimization.

    The experiments in this section include performance evaluation of classification, evaluation of resistance to background traffic interference, ablation experiments and comparative experiments. The specific experimental procedures and results analysis are described as follows.

    After selecting the experimental parameters in Section 5.4, to evaluate the classification performance of the model on mobile application traffic under encryption, this section conducted experiments on the experimental dataset. First, the classification performance of the proposed model without introducing background traffic interference was validated. The experimental results are shown in Figure 12 and Table 6.

    Figure 12.  Confusion matrix distribution diagram.
    Table 6.  The performance of the model on the evaluation index.
    | APP | Precision | Recall | F1-Score |
    |---|---|---|---|
    | IQiyi | 0.990 | 0.982 | 0.986 |
    | Douyin | 0.991 | 0.962 | 0.978 |
    | JD | 0.994 | 0.957 | 0.975 |
    | Toutiao | 0.977 | 1.000 | 0.987 |
    | NetEase Cloud Music | 1.000 | 0.963 | 0.981 |
    | Instagram | 0.957 | 0.985 | 0.971 |
    | Twitter | 0.962 | 0.985 | 0.973 |
    | YouTube | 0.937 | 0.966 | 0.951 |


    Figure 12 represents the confusion matrix of the classification accuracy for mobile application traffic under encryption. The closer the elements on the diagonal of the matrix are to 100% and the closer the other elements are to 0, the better the classification performance of the algorithm model for mobile application traffic. Table 6 presents the precision, recall and F1-score of the model for the identification of the 8 mobile applications. By considering the experimental results from Figure 12 and Table 6, it can be observed that the model achieves the highest classification accuracy for the 'Toutiao' application under encryption, and the recognition accuracy for all 8 applications is above 95%. Therefore, the designed classification model in this study demonstrates good recognition performance without the introduction of background traffic interference. The next section will evaluate the performance of the proposed method in resisting background traffic interference by introducing such interference.

    To evaluate the performance of the background traffic filtering module, multiple experiments were conducted in this section. The experiments considered all eight classes of applications in the collected dataset and created two separate subsets: a training set and a test set. In each experiment, one application was selected and held out as an unknown class that appears only in the test set, labeled as 'Unknown'. The remaining seven applications were used as known class samples in the training set, labeled as 'A', 'B', 'C', 'D', 'E', 'F' and 'G'. The model was trained on the training set and then evaluated on the test set samples from all eight classes. To show the performance of the model in filtering background traffic, Figure 13 presents the confusion matrices before and after introducing the background traffic filtering module.

    Figure 13.  Confusion matrix before and after the background traffic filtering module is introduced.

    From Figure 13(a), it can be observed that before the introduction of the background traffic filtering module, the model achieved high accuracy in classifying the known classes 'A', 'B', 'C', 'D', 'E', 'F' and 'G'. However, the samples of the unknown class 'Unknown' were misclassified by the classifier into classes 'B', 'C' and 'G', indicating that the model had weak classification ability for unknown classes and some generalization performance issues.

    From Figure 13(b), it can be seen that after introducing the background traffic filtering module, the proposed model was able to recognize and filter out 90.67% of the unknown class samples. This improvement significantly enhanced the model's classification ability and generalization performance for unknown classes. After filtering, the model only focused on more reliable samples, allowing for more accurate classification of unknown classes and avoiding misclassification of unknown class samples into known classes.

    However, the results also show that the model has a combined misclassification rate of 14.19%: 2.44% of class 'A' samples, 0.79% of class 'B' samples, 4.21% of class 'C' samples, 4.31% of class 'E' samples and 2.44% of class 'G' samples were misclassified by the classifier as unknown class samples. According to the analysis, the main reason for these misclassifications is that the confidence threshold was not set precisely enough. To achieve the optimal balance between the background traffic detection rate and the misclassification rate, this section further analyzed the confidence distribution of each sample class, as shown in Figure 14.

    Figure 14.  Confidence distribution map of various samples.

    Figure 14(a)–(g) represent the confidence distributions of known class samples, while (h) represents the confidence distribution of unknown class samples, i.e., background traffic samples. Analyzing the confidence distribution graph, it can be observed that the majority of known class samples have confidence values distributed in the higher probability range. Specifically, most of the known class samples have confidence values of 0.97 and above. This indicates that the model has high confidence and accuracy in classifying these known class samples. On the other hand, the confidence distribution of unknown background traffic samples exhibits a scattered trend, with lower confidence values, mostly located in the lower probability range. To achieve the optimal balance between the background traffic detection rate and the misclassification rate, this section conducted comparative experiments using multiple confidence values (0.95, 0.955, 0.96, 0.965, 0.97, 0.975 and 0.98) as filtering thresholds. The experimental results are shown in Figure 15.

    Figure 15.  Confidence threshold comparison experiment.

    According to the experimental results in Figure 15, as the confidence threshold increases, both the misclassification rate of known traffic and the background traffic detection rate gradually increase. Therefore, selecting an appropriate confidence threshold requires finding a balance between the two. When the confidence threshold increases from 0.96 to 0.965, the misclassification rate of known traffic increases slightly, but the background traffic detection rate improves significantly. However, when the confidence threshold continues to increase to 0.97, the misclassification rate of known traffic increases sharply, while the growth of the background traffic detection rate slows down. This indicates that at this threshold, too many target mobile application traffic samples are misclassified as unknown background traffic. Therefore, 0.965 is chosen as the final confidence threshold to achieve the best balance.

    It is important to note that when setting the confidence threshold, specific application scenarios and requirements should be taken into account. If a higher background traffic detection rate is required, the confidence threshold can be appropriately increased to increase the detection rate. If a lower misclassification rate of known class mobile application traffic samples is desired, the confidence threshold can be appropriately lowered to reduce the misclassification rate.
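
    The trade-off discussed above can be explored with a simple threshold sweep. In this sketch, the detection rate is taken to be the fraction of unknown samples whose maximum Softmax probability falls below the threshold, and the misclassification rate is the fraction of known samples rejected in the same way; these definitions are our reading of the text, and the function name `sweep_thresholds` and the confidence values in the example are made up for illustration.

```python
def sweep_thresholds(known_conf, unknown_conf, thresholds):
    """Return (threshold, detection_rate, misclassification_rate) tuples.

    known_conf / unknown_conf hold the maximum Softmax probability of each
    known-class and unknown-class sample, respectively.
    """
    rows = []
    for t in thresholds:
        detection = sum(c < t for c in unknown_conf) / len(unknown_conf)
        misclassified = sum(c < t for c in known_conf) / len(known_conf)
        rows.append((t, detection, misclassified))
    return rows


# Made-up confidence values, for illustration only.
known = [0.99, 0.98, 0.97, 0.96]
unknown = [0.50, 0.90, 0.962, 0.99]
results = sweep_thresholds(known, unknown, [0.95, 0.965, 0.98])
```

    Raising the threshold can only keep or increase both rates, so the choice comes down to where the gain in detection no longer justifies the additional rejected known samples.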

    The proposed method includes multiple components that contribute to the classification performance. To evaluate the contribution of each component to the final recognition performance of the model, ablation experiments were conducted on both the original dataset and the dataset with background traffic. The compared models were a standalone CNN model, a standalone GRU model, the combined CNN and GRU model, and the combined model with the additional background traffic filtering module. The comparison showed improvements in the classification performance metrics each time a performance gain module was introduced.

    According to Tables 7 and 8:

    Table 7.  Results of evaluation indexes of ablation experimental model in the original dataset.
    | Model | Accuracy | Precision | Recall | F1-Score |
    |---|---|---|---|---|
    | CNN | 0.924 | 0.933 | 0.914 | 0.925 |
    | GRU | 0.888 | 0.896 | 0.873 | 0.886 |
    | CNN+GRU | 0.975 | 0.976 | 0.976 | 0.975 |

    Table 8.  Results of evaluation indexes of ablation experimental model with background flow dataset.
    | Model | Accuracy | Precision | Recall | F1-Score |
    |---|---|---|---|---|
    | CNN | 0.761 | 0.746 | 0.782 | 0.764 |
    | GRU | 0.724 | 0.714 | 0.745 | 0.727 |
    | CNN+GRU | 0.812 | 0.808 | 0.816 | 0.812 |
    | CNN+GRU+Background flow filtering | 0.954 | 0.959 | 0.961 | 0.961 |


    1) Under the original dataset, the CNN and GRU models have relatively lower classification performance but exhibit some level of classification ability. The combination of CNN and GRU effectively utilizes their respective advantages. CNN can extract local features through operations such as convolution and pooling, while GRU, with its recurrent neural network structure, can model and classify long-term dependencies. This feature extraction and combination approach leads to improved model performance.

    2) Under the dataset with background traffic, the CNN, GRU and CNN+GRU models exhibit relatively poorer classification performance. This indicates that these models have weaker resistance to interference when dealing with datasets containing background traffic. The addition of the background traffic filtering module significantly improves the performance of the CNN+GRU+background traffic filtering model compared to the other models. This suggests that the model effectively reduces the impact of background traffic on classification performance, enhances its resistance to interference and improves robustness when dealing with datasets containing background traffic.

    In order to demonstrate the superiority of the proposed framework, comparative experiments were conducted with existing state-of-the-art frameworks. The baseline methods compared are described as follows:

    Reference [19] proposed the CLD-Net model, which utilizes the capability of Convolutional Neural Networks (CNNs) to classify image categories. It learns and classifies preprocessed grayscale images of original flows and further enhances the model's classification ability using Long Short-Term Memory (LSTM) networks for temporal sequence data. Experimental results show that the model can differentiate between VPN and non-VPN network traffic on the publicly available ISCX dataset and accurately identify specific traffic types (chat, audio, or file) of Facebook and Skype applications.

    Reference [28] proposed a network traffic classification model called ABL-TC, which introduces an attention mechanism to improve the LSTM model. Based on the experimental results on the publicly available ISCX VPN-nonVPN and Tor-nonTor datasets, the model performs well in 18 classification tasks involving regular encryption, VPN and Tor traffic, with average precision, recall and F1-score all exceeding 99.6%.

    Reference [29] introduces the Transformer model. For this baseline, we use 32-dimensional embedding vectors to represent each input token. The model incorporates 6 attention heads to concurrently capture correlations within different feature subspaces. Additionally, the internal feedforward neural network comprises hidden layers of 32 dimensions. These parameter selections aim to strike a balance between model capacity and computational efficiency.

    Although existing works have shown good recognition performance, none of the three mentioned papers considered the presence of background traffic generated by unknown applications in real-world scenarios. Existing works focus on classifying network traffic in closed testing environments where the training and testing sets contain the same traffic classes. This results in significantly reduced classification accuracy when faced with background traffic interference.

    In this section, the above three methods were first applied to the laboratory-collected pure original dataset for classification experiments of mobile application traffic under encrypted channels. A comparison was made with the proposed method, and the classification performance of different methods for each application traffic is shown in Table 9.

    Table 9.  The effect of different methods on traffic classification for different applications.
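    | APP | Reference [19] A | Reference [19] P | Reference [19] R | Reference [19] F1 | Reference [28] A | Reference [28] P | Reference [28] R | Reference [28] F1 |
    |---|---|---|---|---|---|---|---|---|
    | IQiyi | 0.911 | 0.938 | 0.911 | 0.924 | 0.936 | 0.948 | 0.936 | 0.942 |
    | Douyin | 0.948 | 0.993 | 0.948 | 0.970 | 0.896 | 0.995 | 0.896 | 0.943 |
    | JD | 0.958 | 1.000 | 0.958 | 0.979 | 0.926 | 0.957 | 0.926 | 0.941 |
    | Toutiao | 0.99 | 0.915 | 0.990 | 0.951 | 0.975 | 0.940 | 0.975 | 0.957 |
    | NetEase Cloud Music | 0.963 | 0.976 | 0.963 | 0.970 | 0.928 | 0.944 | 0.928 | 0.936 |
    | Instagram | 0.993 | 0.944 | 0.993 | 0.968 | 0.990 | 0.911 | 0.990 | 0.949 |
    | Twitter | 0.965 | 0.976 | 0.965 | 0.971 | 0.910 | 0.938 | 0.910 | 0.924 |
    | YouTube | 0.923 | 0.912 | 0.923 | 0.912 | 0.959 | 0.886 | 0.959 | 0.921 |
    | Macro average | 0.955 | 0.957 | 0.955 | 0.956 | 0.939 | 0.940 | 0.939 | 0.939 |

    | APP | Reference [29] A | Reference [29] P | Reference [29] R | Reference [29] F1 | Our proposed A | Our proposed P | Our proposed R | Our proposed F1 |
    |---|---|---|---|---|---|---|---|---|
    | IQiyi | 0.969 | 0.968 | 0.993 | 0.971 | 0.982 | 0.990 | 0.982 | 0.986 |
    | Douyin | 0.972 | 0.987 | 0.977 | 0.976 | 0.966 | 0.991 | 0.962 | 0.978 |
    | JD | 0.989 | 0.969 | 0.957 | 0.982 | 0.957 | 0.994 | 0.957 | 0.975 |
    | Toutiao | 0.966 | 0.991 | 0.965 | 0.976 | 1.000 | 0.977 | 1.000 | 0.987 |
    | NetEase Cloud Music | 0.986 | 0.971 | 0.989 | 0.984 | 0.963 | 1.000 | 0.963 | 0.981 |
    | Instagram | 0.991 | 0.969 | 0.971 | 0.986 | 0.985 | 0.957 | 0.985 | 0.971 |
    | Twitter | 0.986 | 0.968 | 0.961 | 0.949 | 0.985 | 0.962 | 0.985 | 0.973 |
    | YouTube | 0.925 | 0.963 | 0.976 | 0.963 | 0.966 | 0.937 | 0.966 | 0.951 |
    | Macro average | 0.973 | 0.973 | 0.974 | 0.973 | 0.975 | 0.976 | 0.976 | 0.975 |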


    The experimental results show that, in the absence of background traffic interference, the proposed method performs well in classifying mobile application traffic under encrypted channels. In terms of macro-average accuracy, Reference [19] achieves 0.955, Reference [28] achieves 0.939 and Reference [29] achieves 0.973, while the proposed method achieves 0.975. This is 2.0 and 3.6 percentage points higher than References [19] and [28], respectively, and slightly higher than Reference [29]. In precision, recall and F1-score, the proposed method likewise exceeds References [19] and [28] by roughly 2 to 4 percentage points. Overall, the proposed method exhibits good classification ability for mobile application traffic under encrypted channels.
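    The macro averages in Table 9 are unweighted means of the per-application scores. The following is a minimal sketch of that computation, not the evaluation code used in this paper: the 3-class confusion matrix `toy_confusion` is a hypothetical example, and only the per-application precision values of the proposed method are taken from Table 9.

```python
def per_class_metrics(confusion):
    """Return (precision, recall, f1) lists from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    precision, recall, f1 = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(f)
    return precision, recall, f1


def macro_average(values):
    """Unweighted mean over classes: every application counts equally."""
    return sum(values) / len(values)


# Hypothetical confusion matrix for three applications (illustration only).
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 0, 10],
]
precision, recall, f1 = per_class_metrics(toy_confusion)

# Per-application precision of the proposed method, copied from Table 9.
proposed_precision = [0.990, 0.991, 0.994, 0.977, 1.000, 0.957, 0.962, 0.937]
table9_macro_precision = round(macro_average(proposed_precision), 3)
print(table9_macro_precision)
```

    Because the macro average weights every application equally, an application with few samples influences the reported score as much as one with many samples.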

    In real identification scenarios, target traffic and background traffic often coexist, and the background traffic can interfere with the classifier. To highlight the advantage of the proposed method in resisting this interference, we complement the experiments on the laboratory-collected pure original dataset with a second dataset that includes background traffic, constructed according to the method described earlier. On this dataset, the proposed method is compared with the three methods from References [19,28,29]. The performance of the different methods on both datasets is shown in Table 10.

    Table 10.  Performance of different methods on experimental datasets.
    Reference       Model                              Original dataset  Including background
                                                       accuracy          traffic dataset accuracy
    Reference [19]  CNN+LSTM                           0.955             0.813
    Reference [28]  Attention-based LSTM               0.939             0.782
    Reference [29]  Transformer                        0.973             0.842
    This paper      CNN+GRU+Background flow filtering  0.975             0.954


    The experimental results indicate that when the generated dataset containing background traffic is used for classifying mobile application traffic under encrypted channels, the accuracy of the three compared models decreases significantly, by 14.2, 15.7 and 13.1 percentage points, respectively, whereas the accuracy of the proposed method decreases by only 2.1 percentage points (from 0.975 to 0.954). This is because the compared methods are trained and evaluated on the same set of traffic classes, so they perform well only when tested in an ideal closed environment. When traffic from unknown applications is present, their classification accuracy suffers. Therefore, when these models are deployed in real network environments, their ability to resist background traffic interference is significantly weaker than that of the proposed method. In summary, the proposed method exhibits higher robustness and practicality in classifying mobile application traffic under encrypted channels.
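    The accuracy reductions quoted above follow directly from Table 10. As a small worked check, the sketch below recomputes them as differences in percentage points; the accuracies are copied from Table 10, while the dictionary layout and function name are illustrative choices only.

```python
# (original-dataset accuracy, accuracy with background traffic) from Table 10.
table10 = {
    "Reference [19]": (0.955, 0.813),
    "Reference [28]": (0.939, 0.782),
    "Reference [29]": (0.973, 0.842),
    "This paper": (0.975, 0.954),
}


def accuracy_drop_points(original, with_background):
    """Absolute accuracy drop, expressed in percentage points."""
    return round((original - with_background) * 100, 1)


drops = {name: accuracy_drop_points(*acc) for name, acc in table10.items()}
for name, drop in drops.items():
    print(f"{name}: -{drop} percentage points")
```

    These are absolute differences between two accuracies; a relative reduction (the difference divided by the original accuracy) would give slightly larger figures.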

    In this work, we propose a novel method for mobile application recognition in encrypted channels. We process the traffic through the encrypted channel using a slicing method and have devised a neural network model that combines CNN and GRU. This model leverages CNN for extracting spatial features from network traffic and employs GRU for modeling the temporal sequences. This approach effectively characterizes the spatiotemporal features of mobile application traffic over encrypted channels. It enables the extraction of features from mobile application traffic under encrypted channels and employs comprehensive analysis based on probabilistic outputs to filter out low-confidence background traffic interference. Relevant experiments demonstrate that the proposed method exhibits a high recognition accuracy and robust interference resistance. This approach presents a novel perspective and method for addressing the challenge of identifying mobile application traffic under encrypted channels, offering significant practical application potential.
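    The idea of filtering low-confidence background traffic from probabilistic outputs can be illustrated with a simple confidence threshold. This is a hedged sketch rather than the paper's filtering procedure: the threshold of 0.9, the `"background"` label, the function names and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_rejection(logits, labels, threshold=0.9):
    """Return (label, confidence); reject low-confidence flows as background.

    threshold is a hypothetical value; in practice it would be tuned on
    validation data that contains background traffic.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "background", probs[best]
    return labels[best], probs[best]


labels = ["IQiyi", "Douyin", "JD"]

# One class dominates: the flow is accepted as a known application.
confident = classify_with_rejection([6.0, 0.5, 0.2], labels)

# Nearly uniform scores: the flow is treated as background traffic.
ambiguous = classify_with_rejection([1.1, 1.0, 0.9], labels)

print(confident[0], ambiguous[0])
```

    A higher threshold rejects more background flows but also discards more genuine target traffic, so the value trades interference resistance against recall on the known applications.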

    Although this paper does not compare with multimodal models, it does compare with other models (CNN, GRU, CNN+GRU). Additionally, we focus on evaluating the performance against background interference. Within the current field, it is noted that many advanced techniques often overlook interference issues, while our work aims to address this research gap. At this stage, we have chosen to concentrate the paper's emphasis on the model performance comparison before and after introducing the anti-interference module. We believe this decision helps highlight the specific contribution of our research. Nonetheless, in future work, we plan to conduct further comparisons with state-of-the-art techniques, such as multimodal networks, to ensure readers understand the significance of our current design. Moreover, the work conducted in this paper was based on a dataset constructed in a laboratory environment. With the continuous growth of the internet environment and the number of applications, these data have certain limitations. In the future, expanding the dataset by incorporating more devices and a wider range of application data could lead to better improvements in traffic recognition solutions on a larger scale. Finally, leveraging Explainable Artificial Intelligence (XAI) techniques is also a potential avenue for further research, as discussed by Nascita et al., to explicate and strengthen our proposed method [30]. Through XAI technology, we can delve into understanding the decision-making logic of the model in identifying applications within encrypted channels, thereby enhancing the method's transparency and interpretability. This not only enhances the performance and robustness of our method but also provides deeper insights and avenues for improvement within the realm of identifying mobile applications in encrypted channels. 
    Future research could further explore identification techniques in encrypted scenarios, methods to enhance security, and customized identification models tailored to different application types, thereby advancing development and innovation in this field.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work was supported by the National Natural Science Foundation of China (Grant Nos. 61931004 and 62072250), the National Key Research and Development Program of China (Grant No. 2021QY0700), and the Key Laboratory of Intelligent Support Technology in Complex Environment.

    The authors declare that they have no conflicts of interest to report regarding the present study.



    [1] J. R. Fernández, R. Quintanilla, Analysis of a higher order problem within the second gradient theory, Appl. Math. Lett., 154 (2024), 109086. https://doi.org/10.1016/j.aml.2024.109086 doi: 10.1016/j.aml.2024.109086
    [2] D. Iesan, R. Quintanilla, A second gradient theory of thermoelasticity, J. Elasticity, 154 (2023), 629–643. https://doi.org/10.1007/s10659-023-10020-1 doi: 10.1007/s10659-023-10020-1
    [3] J. K. Knowles, On Saint-Venant's principle in the two dimensional linear theory of elasticity, Arch. Rational Mech. Anal., 21 (1966), 1–22. https://doi.org/10.1007/BF00253046 doi: 10.1007/BF00253046
    [4] J. K. Knowles, An energy estimate for the biharmonic equation and its application to saint-venant's principle in plane elastostatics, Indian J. Pure Appl. Math., 14 (1983), 791–805.
    [5] J. N. Flavin, On Knowles' version of Saint-Venant's principle in two-dimensional elastostatics, Arch. Rational Mech. Anal., 53 (1974), 366–375. https://doi.org/10.1007/BF00281492 doi: 10.1007/BF00281492
    [6] J. N. Flavin, R. J. Knops, Some convexity considerations for a two-dimensional traction problem, Z. Angew. Math. Phys., 39 (1988), 166–176. https://doi.org/10.1007/BF00945763 doi: 10.1007/BF00945763
    [7] C. O. Horgan, Decay estimates for the biharmonic equation with applications to saint-venant principles in plane elasticity and stokes flows, Q. Appl. Math., 42 (1989), 147–157.
    [8] C. Lin, Spatial decay estimates and energy bounds for the stokes flow equation, SAACM, 2 (1992), 249–262.
    [9] R. J. Knops, C. Lupoli, End effects for plane Stokes flow along a semi-infinite strip, Z. Angew. Math. Phys., 48 (1997), 905–920. https://doi.org/10.1007/s000330050072 doi: 10.1007/s000330050072
    [10] J. C. Song, Improved decay estimates in time-dependent Stokes flow, J. Math. Anal. Appl., 288 (2003), 505–517. https://doi.org/10.1016/j.jmaa.2003.09.007 doi: 10.1016/j.jmaa.2003.09.007
    [11] J. C. Song, Improved spatial decay bounds in the plane Stokes flow, Appl. Math. Mech., 30 (2009), 833–838. https://doi.org/10.1007/s10483-009-0703-z doi: 10.1007/s10483-009-0703-z
    [12] Y. Liu, C. Lin, Phragmén-Lindelöf type alternative results for the stokes flow equation, Math. Inequal. Appl., 9 (2006), 671–694.
    [13] C. H. Lin, L. E. Payne, A Phragmén-Lindelöf alternative for a class of quasilinear second order parabolic problems, Differ. Integral Equations, 8 (1995), 539–551. https://doi.org/10.57262/die/1369316504 doi: 10.57262/die/1369316504
    [14] C. O. Horgan, L. E. Payne, Phragmén-lindelöf type results for harmonic functions with nonlinear boundary conditions, Arch. Rational Mech. Anal., 122 (1993), 123–144. https://doi.org/10.1007/BF00378164 doi: 10.1007/BF00378164
    [15] Y. Li, X. Chen, Phragmén-Lindelöf alternative results and structural stability for Brinkman fluid in porous media in a semi-infinite cylinder, Open Math., 20 (2022), 1665–1684. https://doi.org/10.1515/math-2022-0531 doi: 10.1515/math-2022-0531
    [16] P. Zeng, D. Li, Y. Li, The growth or decay estimates for nonlinear wave equations with damping and source terms, Math. Biosci. Eng., 20 (2023), 13989–14004. https://doi.org/10.3934/mbe.2023623 doi: 10.3934/mbe.2023623
    [17] J. Jiménez-Garrido, J. Sanz, G. Schindl, A Phragmén-Lindelöf theorem via proxi-mate orders, and the propagation of asymptotics, J. Geom. Anal., 30 (2020), 3458–3483. https://doi.org/10.1007/s12220-019-00203-5 doi: 10.1007/s12220-019-00203-5
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)