
In this paper, the invariance of separation in covering approximation spaces is discussed. This paper proves that some separations in covering approximation spaces are invariant to reducts of coverings, invariant to covering approximation subspaces, and invariant under CAP-transformations of covering approximation spaces. These results deepen and enrich the theory of separations in covering approximation spaces, which is helpful for further research and applications of Pawlak rough set theory in information sciences.
Citation: Qifang Li, Jinjin Li, Xun Ge, Yiliang Li. Invariance of separation in covering approximation spaces[J]. AIMS Mathematics, 2021, 6(6): 5772-5785. doi: 10.3934/math.2021341
Person re-identification (re-ID) aims to find a person-of-interest across different cameras in a gallery, given a query image of that person. With the development of surveillance equipment and the increasing demand for public safety, camera networks have been installed in many public places such as theme parks, airports, streets, and university campuses. There is therefore an urgent need for intelligent technology for analyzing surveillance images. Re-ID is now widely used in the security field and has become a focus of academic research. However, person re-ID still faces challenges such as variations in lighting, pose, viewpoint, and camera. In early work on person re-ID, only a few small datasets were available, and hand-crafted features such as texture and color features were used. In recent years, with the development of deep Convolutional Neural Networks (CNNs) and the emergence of large-scale datasets, CNN-based deep learning models have achieved great success in re-ID. Most person re-ID methods focus on supervised learning [1,2,3,4,5,6,7]. These methods rely on labeled datasets, that is, each pedestrian in the training data has an identity label. However, because labeling large-scale datasets is costly, semi-supervised person re-ID has been proposed.
Semi-supervised person re-ID methods use part of the data labeled and the rest unlabeled for training. We focus on the one-example setting, i.e., each identity has only one labeled example. In this setting, the labeled data can be used to estimate pseudo-labels for the unlabeled data by computing feature distances between labeled and unlabeled samples. Reliable pseudo-labeled data are then selected and added to the labeled data to train the model together with the remaining unlabeled data. However, due to differences in camera frames, lighting, and camera locations, the training data in the one-example setting suffer from style variation between cameras, which reduces model performance. This problem can be alleviated by style transfer.
Style transfer methods use a Generative Adversarial Network (GAN) to transfer the style of labeled pedestrian images toward that of the unlabeled test data, and use the transferred images for training. This mitigates the performance degradation that occurs when training and testing on different datasets, or on different cameras within the same dataset. At the same time, it avoids the tedious and expensive work of annotating the unlabeled test data, and it is widely used in supervised and semi-supervised person re-ID. Figure 1 shows style transfer from random cameras to other cameras in the same dataset; the label of a generated image is the same as that of the original image. We therefore propose a random style transfer strategy that effectively addresses camera style variation in the one-example setting.
In the one-example person re-ID task, Wu et al. [8] use part of the person images as labeled and the rest as unlabeled for training. For each identity, one image is randomly selected as the labeled example from the camera with the smallest index, and the rest are used as unlabeled data; both kinds of data are then used for training. However, the method of [8] cannot eliminate the domain difference between cameras in the training phase or in the label estimation phase. Our random style transfer strategy randomly transforms the camera style of labeled and unlabeled images to other cameras' styles during training, without changing the identity label. This addresses camera style variation within the same training dataset and thereby improves performance. In addition, when estimating pseudo-labels, Wu et al. [8] use the features of the labeled data as the evaluation features, whereas we average the features of each labeled image and its camera-style-transferred versions and use this average as the final evaluation feature, which yields a further improvement. Our method obtains a more robust initialized CNN model and, in subsequent iterations, achieves better performance.
Our contributions are summarized as follows:
● We propose a random style transfer strategy to transform the camera style of the training data, which eliminates the style variation between different cameras in the same dataset and yields a better initialized CNN model.
● We adopt an average feature strategy to estimate the pseudo-labels of unlabeled data, and the estimation results are improved.
● Our method has achieved good results on commonly used datasets.
In recent years, with the advent of CNN models [9] and the emergence of large datasets, deep learning methods have been widely used in computer vision tasks [10,11,12], including person re-ID. Many methods [13,14,15,16,17,18,19] have been applied to person re-ID and achieve good performance. Deng et al. [20] use triplets as input to a siamese model. Zheng et al. [21] use a pair of images as the network input and indicate whether the two images show the same person according to the output similarity score. Ahmed et al. [22] propose a joint learning framework that combines end-to-end re-ID learning and data generation to make better use of the generated data. Zheng et al. [18] apply a conventional fine-tuning approach called the Identity Discriminative Embedding (IDE) on the Market-1501 dataset and obtain competitive results. Zheng et al. [21] combine a verification model and a recognition model to learn more discriminative pedestrian descriptors from pairs of training images. Huynh-The et al. [23] learn the full gait information of an individual from 3D human skeleton data with a deep learning-based identifier.
Semi-supervised learning methods [24,25,26,27] use partially labeled and partially unlabeled data to solve a given task. In recent years, some semi-supervised person re-ID work [28] has used the Progressive Cross-camera Soft Label Learning (PCSL) framework. Kipf et al. [29] use a scalable semi-supervised learning method on graph-structured data. Yu et al. [30] propose an asymmetric metric clustering to discover potential labels in unlabeled target data. Liu et al. [31] use K-Nearest Neighbor (KNN) to update the classifier. Ye et al. [32] adopt a Dynamic Graph Matching (DGM) method to iteratively update graph matching and label estimation. Liu et al. [33] propose a new Semantics-Guided Clustering with Deep Progressive Learning (SGC-DPL) framework to gradually enhance the labeled training data.
In this paper, we follow the one-example person re-ID setting of [8]. It assumes that only one image of each person in the training set is labeled, while the rest of the training data is unlabeled. In [8], a progressive sampling strategy is proposed to increase the number of selected pseudo-labeled candidates step by step. However, it does not account for camera variation when estimating pseudo-labels; this can be addressed with style transfer.
Unsupervised domain adaptation re-ID uses an auxiliary labeled source dataset to label an unlabeled target dataset. Recently, several cross-domain learning methods [34,35,36,37,38] have emerged. Peng et al. [34] propose an asymmetric multi-task dictionary learning model to learn a discriminative representation for target data. Fan et al. [35] propose a learning-via-translation framework to reduce the performance drop after dataset transfer, together with unsupervised self-similarity and domain-dissimilarity constraints to preserve potential identity information. Zhong et al. [38] propose to learn camera invariance and domain connectedness simultaneously to improve the generalization of re-ID models on the target test set. Xiang et al. [39] propose an unsupervised re-ID method based on domain adaptation that uses synthetic data to avoid heavy data annotation and improves re-identification performance in a completely unsupervised way. Ge et al. [40] propose an unsupervised framework, Mutual Mean-Teaching (MMT), to reduce the label noise inevitably introduced by the clustering procedure.
Style transfer refers to transferring the style of image A onto image B to obtain a new image that contains the content of B and the style of A; the conversion is achieved through a GAN. Since Goodfellow et al. proposed GANs [41], many variants [42,43,44,45,46] have been developed for different tasks such as natural style conversion, super-resolution, and image-to-image translation. Zheng et al. [47] use a GAN to generate new samples for data augmentation in person re-ID, an early application of GANs to person re-ID. Isola et al. [48] propose a conditional adversarial network to learn the mapping from input to output images; however, this method requires paired training data, which is difficult to obtain in many tasks. For unpaired image-to-image translation, Zhu et al. [49] propose a cycle consistency loss to train on unpaired data. The Person Transfer GAN (PTGAN) [50] proposed by Wei et al. is similar to CycleGAN [49] and also performs image-to-image translation; the difference is that additional constraints are imposed on person identity so that the transferred images remain usable for model training. In CamStyle [6], CycleGAN is used to transfer the style of labeled training images to each camera, and the generated images form an augmented training set together with the original samples. Zhang et al. [51] propose a novel semi-supervised re-ID method based on Similarity-Embedded Cycle GANs (SECGAN), which learns cross-view features with limited labeled data. Liu et al. [52] propose a UnityStyle adaptation method to reduce image artifacts when images taken by different cameras differ greatly. Chong et al. [53] propose an unsupervised domain-adaptive person re-identification method based on style transfer (STReID) to handle image differences between domains. Our work focuses on using style transfer in the one-example setting to eliminate camera variation.
This section consists of five parts. We first describe the process of using CycleGAN [49] to generate camera-conversion data in subsection 3.1, and then introduce the preliminaries in subsection 3.2. The random style transfer (RST) strategy and the average feature estimation (Avg) strategy are described in subsections 3.3 and 3.4, respectively. The last subsection presents the overall progressive iteration strategy. The framework is shown in Figure 2. In each iteration: (1) In the training phase, we use the style transfer dataset (CamStyle set) to perform random style transfer on labeled data, pseudo-labeled data, and unlabeled data, and train the CNN model; labeled and pseudo-labeled data are trained with the cross-entropy loss, and unlabeled data with the exclusive loss. (2) In the label estimation phase, we first use the style transfer data to average the features of the labeled data; then, according to distances in the feature space, some reliable pseudo-label candidates are selected from the unlabeled data U. Nodes with different colors in the feature-space box represent samples of different identities.
Given two datasets $\{x_i\}_{i=1}^{M}$ and $\{y_j\}_{j=1}^{N}$ collected from domain $A$ and domain $B$, where $\{x_i\}_{i=1}^{M}$ belongs to $A$ and $\{y_j\}_{j=1}^{N}$ belongs to $B$, the goal of a GAN is to learn a generator $G$ and a discriminator $D$: $G: A \rightarrow B$ converts images from domain $A$ to domain $B$, i.e., $G(A) \approx B$, and $D$ identifies whether an image has been converted from the other domain. CycleGAN contains two mapping functions $G_{A2B}: A \rightarrow B$ and $G_{B2A}: B \rightarrow A$, and two adversarial discriminators $D_A$ and $D_B$. The overall loss function of CycleGAN is as follows:
$$\mathcal{L}(G_{A2B}, G_{B2A}, D_A, D_B) = \mathcal{L}_{G}(G_{A2B}, D_B, A, B) + \mathcal{L}_{G}(G_{B2A}, D_A, B, A) + \lambda \,\mathcal{L}_{cyc}(G_{A2B}, G_{B2A}, A, B) \tag{3.1}$$
where $\mathcal{L}_{G}(G_{A2B}, D_B, A, B)$ and $\mathcal{L}_{G}(G_{B2A}, D_A, B, A)$ are the adversarial losses for the mapping functions $G_{A2B}$ and $G_{B2A}$ with the discriminators $D_B$ and $D_A$, and $\mathcal{L}_{cyc}(G_{A2B}, G_{B2A}, A, B)$ is the cycle consistency loss, which requires that each image be reconstructed after a cycle of mappings. $\lambda$ weighs the importance of $\mathcal{L}_{G}$ and $\mathcal{L}_{cyc}$. More details about CycleGAN can be found in [49].
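To make the structure of (3.1) concrete, the following is a minimal PyTorch sketch of the generator-side objective. It assumes the generator and discriminator networks are defined elsewhere and uses the least-squares adversarial loss; it is an illustration under these assumptions, not the exact implementation used in our experiments.

```python
import torch
import torch.nn.functional as F

def cyclegan_loss(G_A2B, G_B2A, D_A, D_B, real_A, real_B, lam=10.0):
    """Total CycleGAN objective of Eq. (3.1): two adversarial terms plus a
    cycle-consistency term weighted by lam."""
    fake_B = G_A2B(real_A)          # A -> B translation
    fake_A = G_B2A(real_B)          # B -> A translation

    # Least-squares adversarial losses (generator side): fakes should fool D.
    pred_B = D_B(fake_B)
    pred_A = D_A(fake_A)
    loss_gan = (F.mse_loss(pred_B, torch.ones_like(pred_B)) +
                F.mse_loss(pred_A, torch.ones_like(pred_A)))

    # Cycle consistency: translating to the other domain and back should
    # reconstruct the original image (L1 distance).
    loss_cyc = (F.l1_loss(G_B2A(fake_B), real_A) +
                F.l1_loss(G_A2B(fake_A), real_B))

    return loss_gan + lam * loss_cyc
```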
CycleGAN is employed to transform the style of one domain into the style of another domain to realize the generation of different camera data. Given a re-ID dataset collected from C cameras, the styles between different cameras are regarded as different domains and CycleGAN is used to learn the image-image conversion model of each camera pair to realize the generation of data of different camera styles.
In this paper, we need to generate camera style transfer data. There are many deep learning models that can realize style transfer data generation. This paper only uses CycleGAN as an example.
Using the learned CycleGAN models, for each training image collected from a specific camera we generate $C-1$ new training samples. Figure 3 shows images with other camera styles generated from the Market-1501 dataset with the help of CycleGAN. For example, an image from cam1 is converted by CycleGAN into a cam2-style image; cam1to2 retains the pedestrian of cam1 but changes the camera style from cam1 to cam2. The Market-1501 dataset has 6 cameras, so each image generates 5 other camera-style images. Compared with the original image, a generated image has the same identity label; only the camera style changes. Each style-converted image thus retains the content of the original image and shares its identity. We call this data CamStyle data.
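A minimal sketch of this generation step is shown below. It assumes trained CycleGAN generators are stored in a dictionary keyed by camera pairs and that `train_items` is a list of (image_path, camera, identity) tuples; the function name, file-naming scheme, and image size are illustrative assumptions.

```python
from pathlib import Path

import torch
from PIL import Image
from torchvision import transforms
from torchvision.utils import save_image

to_tensor = transforms.Compose([transforms.Resize((256, 128)), transforms.ToTensor()])

@torch.no_grad()
def build_camstyle_set(train_items, generators, out_dir, num_cams):
    """For each training image taken by camera `cam`, generate one image in every
    other camera style with the trained CycleGAN generators; the identity label
    stays unchanged. `generators[(c, t)]` maps camera-c style to camera-t style."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for img_path, cam, pid in train_items:
        x = to_tensor(Image.open(img_path).convert("RGB")).unsqueeze(0)
        for target in range(num_cams):
            if target == cam:
                continue
            fake = generators[(cam, target)](x)                 # only the style changes
            name = f"{pid}_cam{cam}to{target}_{Path(img_path).stem}.jpg"
            save_image(fake, out_dir / name, normalize=True)    # label stays pid
```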
We first introduce the necessary notation for the one-example re-ID task. Let $x$ and $y$ denote a person image and an identity label, respectively. For training, we have a labeled dataset $L = \{(x_1, y_1), \ldots, (x_{n_l}, y_{n_l})\}$, an unlabeled dataset $U = \{x_{n_l+1}, \ldots, x_{n_l+n_u}\}$, and a CamStyle dataset $Z$. In the training phase, these data are used to train the re-ID model $\phi(\theta; \cdot)$ in the form of identity classification. In the evaluation phase, the trained CNN model $\phi$ embeds the query and gallery data in the feature space; the query result is a ranking of all gallery data by the Euclidean distance between the query and each gallery image, i.e., $\lVert \phi(\theta; x_q) - \phi(\theta; x_g) \rVert$, where $x_q$ and $x_g$ denote the query and gallery image, respectively. We denote by $S_t$ and $M_t$ the pseudo-labeled and unlabeled datasets at step $t$.
To utilize the abundant unlabeled data, we use the joint training method [8] in the training phase to perform joint training of labeled data, pseudo-labeled data, and unlabeled data. The objective function is as follows:
$$\min_{\theta,\omega} \; \lambda \sum_{i=1}^{n_l} \ell_{CE}\big(f(\omega; \phi(\theta; x_i)), y_i\big) + \lambda \sum_{i=n_l+1}^{n_l+n_u} s_i^{t-1} \, \ell_{CE}\big(f(\omega; \phi(\theta; x_i)), \hat{y}_i\big) + (1-\lambda) \sum_{i=n_l+1}^{n_l+n_u} \big(1 - s_i^{t-1}\big) \, \ell_{e}\big(V; \tilde{\phi}(\theta; x_i)\big) \tag{3.2}$$
where the first term $\sum_{i=1}^{n_l} \ell_{CE}\big(f(\omega; \phi(\theta; x_i)), y_i\big)$ optimizes over the labeled dataset $L$, the second term $\sum_{i=n_l+1}^{n_l+n_u} s_i^{t-1} \ell_{CE}\big(f(\omega; \phi(\theta; x_i)), \hat{y}_i\big)$ optimizes over the pseudo-labeled dataset $S_t$, and the third term $\sum_{i=n_l+1}^{n_l+n_u} (1 - s_i^{t-1}) \ell_{e}\big(V; \tilde{\phi}(\theta; x_i)\big)$ optimizes over the unlabeled dataset $M_t$. Here $f(\omega; \cdot)$ is an identity classifier, parameterized by $\omega$, that maps the embedded feature $\phi(\theta; x_i)$ to a $k$-dimensional confidence estimate; $s_i \in \{0,1\}$ is the selection indicator for the unlabeled sample $x_i$; $\ell_{CE}$ and $\ell_{e}$ denote the identity classification loss and the exclusive loss, respectively; and $\lambda$ is a hyper-parameter that balances their contributions. More details about the joint training method can be found in [8].
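The sketch below illustrates how one batch of (3.2) could be computed. It is a simplification under assumptions: the exclusive loss is written as a non-parametric softmax over a feature memory bank in the spirit of [8], with a simple running memory update, and the batch layout (`kind`, `idx`) is hypothetical.

```python
import torch
import torch.nn.functional as F

def joint_loss(model, classifier, memory_V, batch, lam=0.8, temp=0.1):
    """One-batch sketch of Eq. (3.2): cross-entropy on labeled and pseudo-labeled
    samples, a memory-based exclusive loss on the remaining unlabeled samples."""
    x, y, kind, idx = batch          # kind: 0 labeled, 1 pseudo-labeled, 2 unlabeled
    feat = model(x)                  # embedded features phi(theta; x)
    loss = feat.new_zeros(())

    sup = (kind == 0) | (kind == 1)
    if sup.any():
        logits = classifier(feat[sup])                  # f(omega; phi(theta; x))
        loss = loss + lam * F.cross_entropy(logits, y[sup])

    unl = kind == 2
    if unl.any():
        f = F.normalize(feat[unl], dim=1)               # l2-normalized feature
        sim = f @ F.normalize(memory_V, dim=1).t() / temp
        # Exclusive loss: each unlabeled sample should match only its own memory slot.
        loss = loss + (1.0 - lam) * F.cross_entropy(sim, idx[unl])
        with torch.no_grad():                           # simple running memory update
            memory_V[idx[unl]] = 0.5 * memory_V[idx[unl]] + 0.5 * feat[unl]
    return loss
```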
We first introduce the data used in training. In the $t$th iteration, we use four kinds of data: labeled data $L$, pseudo-labeled data $S_t$, unlabeled data $M_t$, and CamStyle data $Z$. Before training starts, we need to generate the CamStyle dataset. We use the CycleGAN described in subsection 3.1 to generate, for each image in the training set, one corresponding image in every other camera style; that is, we keep the person label and change only the camera style.
The random style transfer strategy is to perform a random conversion of the camera style with the help of CamStyle data for each piece of labeled data, pseudo-labeled data, and unlabeled data during the training process:
$$x_k \rightarrow \tilde{x}_k, \qquad \tilde{x}_k \in \{x_k\} \cup \{x \mid x \in Z_k\} \tag{3.3}$$
where $x_k$ is an image in the dataset, $Z_k$ is the set of images in the CamStyle dataset corresponding to $x_k$ under the other cameras, and $\tilde{x}_k$ is a randomly chosen camera-style version of $x_k$, which may be $x_k$ itself.
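A minimal data-level sketch of (3.3) is shown below; the class name and the `camstyle_lookup` mapping (image path to the paths of its pre-generated camera-style copies) are assumed, not part of the released code.

```python
import random

class RandomStyleTransfer:
    """Return either the original image path or the path of one of its CamStyle
    versions, chosen uniformly at random; the identity label never changes."""

    def __init__(self, camstyle_lookup):
        self.camstyle_lookup = camstyle_lookup

    def __call__(self, img_path):
        candidates = [img_path] + list(self.camstyle_lookup.get(img_path, []))
        return random.choice(candidates)        # x_k -> tilde{x}_k
```

A dataset's `__getitem__` can call this transform before loading the image, so every epoch sees a freshly randomized mix of camera styles.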
After adopting the random style transfer strategy for training, (3.2) becomes:
$$\min_{\theta,\omega} \; \lambda \sum_{i=1}^{n_l} \ell_{CE}\big(f(\omega; \phi(\theta; \tilde{x}_i)), y_i\big) + \lambda \sum_{i=n_l+1}^{n_l+n_u} s_i^{t-1} \, \ell_{CE}\big(f(\omega; \phi(\theta; \tilde{x}_i)), \hat{y}_i\big) + (1-\lambda) \sum_{i=n_l+1}^{n_l+n_u} \big(1 - s_i^{t-1}\big) \, \ell_{e}\big(V; \tilde{\phi}(\theta; \tilde{x}_i)\big) \tag{3.4}$$
where the first, second, and third terms are the optimized parts of the labeled dataset $L$, the pseudo-labeled dataset $S_t$, and the unlabeled dataset $M_t$ after random style transfer, respectively.
Therefore, the domain difference among cameras in the training data is reduced, which makes the initial training model more robust and the subsequent training process more effective.
Previous works use the distance between labeled and unlabeled data in the feature space as a measure of pseudo-label reliability: a nearest neighbor (NN) classifier assigns a pseudo-label to each unlabeled sample according to its nearest labeled neighbor in the feature space [8]. However, using only the feature of a single labeled image to measure distances to the unlabeled data makes it difficult to eliminate the domain difference caused by different cameras. To solve this problem, we propose an average feature estimation strategy: for label estimation, we find the images in the CamStyle data that correspond to the labeled image under other camera styles, take the average of all their features, and use the distance between this average feature and each unlabeled sample as the measure of pseudo-label reliability.
We estimate the pseudo-label of each unlabeled sample $x_i \in U$ by:
$$x^*, y^* = \arg\min_{(x_l, y_l)} \big\lVert \phi(\theta; x_i) - \phi(\theta; \tilde{x}_l) \big\rVert \tag{3.5}$$
$$d(\theta; x_i) = \big\lVert \phi(\theta; x_i) - \phi(\theta; x^*) \big\rVert \tag{3.6}$$
$$\hat{y}_i = y^* \tag{3.7}$$
where $\tilde{x}_l$ denotes the average of the labeled image $x_l \in L$ and its corresponding other-camera images in the CamStyle data, and $d(\theta; x_i)$ is the dissimilarity cost of the label estimation. To select candidates at iterative step $t$, we sample pseudo-labeled candidates into training by setting the selection indicators as follows:
$$s^t = \arg\min_{\lVert s^t \rVert_0 = m_t} \; \sum_{i=n_l+1}^{n_l+n_u} s_i \, d(\theta; x_i) \tag{3.8}$$
where $m_t$ denotes the size of the selected pseudo-labeled set and $s^t$ is the vertical concatenation of all $s_i$. Equation (3.8) selects, at iteration step $t$, the $m_t$ unlabeled samples that are nearest to the labeled data.
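A compact sketch of (3.5)-(3.8) is given below. It assumes the style-transferred copies of each labeled image are already available as a stacked tensor and that all unlabeled images fit in one forward pass (they would be batched in practice); the function name and data layout are illustrative.

```python
import torch

@torch.no_grad()
def estimate_pseudo_labels(model, labeled_imgs, labeled_camstyle, labeled_ids,
                           unlabeled_imgs, m_t):
    """Average the feature of each labeled image with the features of its CamStyle
    versions, assign each unlabeled image the identity of its nearest averaged
    feature, and select the m_t most confident candidates."""
    # Averaged evaluation feature for every labeled example.
    avg_feats = []
    for img, styles in zip(labeled_imgs, labeled_camstyle):
        feats = model(torch.cat([img.unsqueeze(0), styles], dim=0))  # original + transfers
        avg_feats.append(feats.mean(dim=0))
    avg_feats = torch.stack(avg_feats)                               # [n_l, d]

    u_feats = model(unlabeled_imgs)                                  # [n_u, d]
    dists = torch.cdist(u_feats, avg_feats)                          # Euclidean distances
    d_min, nn_idx = dists.min(dim=1)                                 # Eqs. (3.5)-(3.6)
    pseudo_labels = labeled_ids[nn_idx]                              # Eq. (3.7)

    # Eq. (3.8): keep only the m_t unlabeled samples with the smallest cost.
    selected = torch.zeros(len(unlabeled_imgs), dtype=torch.bool)
    selected[torch.argsort(d_min)[:m_t]] = True
    return pseudo_labels, d_min, selected
```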
We train the CNN model iteratively. In the initial iteration, we only use the labeled data with a random style transfer strategy to initialize the model. Then in each subsequent iteration, we first optimize the model through (3.4). Then we use (3.7) to estimate the pseudo label of the unlabeled data, and select some reliable pseudo-label data by applying the trained model on (3.8).
When selecting pseudo-labels, we adopt a dynamic sampling strategy to ensure the reliability of the selected pseudo-labeled samples: it starts with a small amount of pseudo-labeled data in the initial stage and merges more samples in the following stages. We set the sampled pseudo-labeled size $m_0 = 0$ and the unlabeled data $M_0 = U$ at the beginning. In subsequent iterations, we gradually increase the size of the selected pseudo-labeled candidate set $S_t$. At iterative step $t$, we expand the sampling size by setting $m_t = m_{t-1} + p \cdot n_u$, where $p \in (0, 1)$ is the selection factor, which controls the speed at which the candidate set is enlarged during iteration. The overall procedure is described in Algorithm 1.
Algorithm 1 The proposed method |
Require: Labeled data $L$, unlabeled data $U$, selection factor $p \in (0,1)$, initialized CNN model $\theta_0$.
Ensure: The best CNN model $\theta^*$.
1: Initialize the selected pseudo-labeled data $S_0 \leftarrow \emptyset$, sampling size $m_1 \leftarrow p \cdot n_u$, iteration step $t \leftarrow 0$, best validation performance $V^* \leftarrow 0$.
2: while $m_{t+1} \leq \lVert U \rVert$ do
3:   $t \leftarrow t + 1$
4:   Update the model $(\theta_t, \omega_t)$ on $L$, $S_t$ and $M_t$ after random style transfer via (3.4).
5:   Estimate pseudo-labels for $U$ via (3.7).
6:   Generate the selection indicators $s^t$ via (3.8).
7:   Update the sampling size: $m_{t+1} \leftarrow m_t + p \cdot n_u$.
8: end while
9: for $i \leftarrow 1$ to $T$ do
10:   Evaluate $\theta_i$ on the validation set → performance $V_i$.
11:   if $V_i > V^*$ then
12:     $V^*, \theta^* \leftarrow V_i, \theta_i$
13:   end if
14: end for
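The following compact Python rendering of Algorithm 1 is a sketch; the heavy lifting lives in the three callbacks, whose names and signatures are assumptions (`train_step` performs joint training with random style transfer via (3.4), `estimate` returns the selection indicators via (3.5)-(3.8), and `evaluate` scores a model on the validation set).

```python
def progressive_training(L, U, p, train_step, estimate, evaluate):
    """Progressive one-example training: enlarge the pseudo-labeled candidate set
    by p * n_u per iteration and keep the model with the best validation score."""
    n_u = len(U)
    step = int(p * n_u)                 # p * n_u new candidates per iteration
    selected = [False] * n_u            # S_0 is empty
    m_next, models = step, []           # m_1 = p * n_u
    while m_next <= n_u:
        model = train_step(L, U, selected)          # joint training with RST
        selected = estimate(model, L, U, m_next)    # average-feature label estimation
        models.append(model)
        m_next += step                              # m_{t+1} = m_t + p * n_u
    scores = [evaluate(m) for m in models]
    return models[scores.index(max(scores))]        # best model on the validation set
```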
We first introduce the datasets and experimental settings, then show the results compared with state-of-the-art methods, and finally present ablation experiments and result analysis.
Market-1501 [18] contains 32,668 labeled images of 1,501 identities captured by 6 cameras. Among them, 12,936 images of 751 identities are used as the training set, and 19,732 images of 750 identities are used as the test set.
DukeMTMC-reID [21] is a re-ID dataset derived from the DukeMTMC dataset [54]. It contains 36,411 labeled images of 1,404 identities captured by 8 cameras. Among them, there are 16,522 training images of 702 identities, 2,228 query images of 702 identities, and 17,661 gallery images.
Camarket is a dataset of the Market-1501 training images rendered in the other 5 camera styles, generated by CycleGAN [49], with a total of 64,680 images. Its generation follows [6].
Camduke is a dataset of the DukeMTMC-reID training images rendered in the other 7 camera styles, generated by CycleGAN [49], with a total of 115,654 images. Its generation follows [6].
We employ the Cumulative Matching Characteristic (CMC) curve and the mean average precision (mAP) for re-ID evaluation. The CMC scores reflect retrieval accuracy, and we report Rank-1, Rank-5, and Rank-10 scores to represent the CMC curve. The Rank-1 recognition rate is the proportion of queries for which the top-ranked gallery image, under the similarity matching rule, has the correct identity; Rank-5 allows a correct match anywhere in the top five candidates. The mAP is the mean of the Average Precision (AP) over all queries, where the AP of a query is computed from its precision values at the ranks of its correct matches.
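For concreteness, a NumPy sketch of this evaluation is given below; it assumes a precomputed distance matrix and the usual re-ID convention of excluding gallery images that share both identity and camera with the query, and it is not the exact evaluation script used here.

```python
import numpy as np

def evaluate_rank(dist, q_ids, q_cams, g_ids, g_cams, topk=(1, 5, 10)):
    """Compute CMC scores (Rank-k) and mAP. `dist[i, j]` is the distance
    between query i and gallery j."""
    cmc = np.zeros(max(topk))
    aps, num_valid = [], 0
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])
        keep = ~((g_ids[order] == q_ids[i]) & (g_cams[order] == q_cams[i]))
        matches = (g_ids[order][keep] == q_ids[i]).astype(np.float32)
        if matches.sum() == 0:
            continue                                  # no valid ground truth for this query
        num_valid += 1
        first_hit = int(np.argmax(matches))           # rank of the first correct match
        cmc[first_hit:] += 1                          # CMC counts a hit at every rank >= first_hit
        precision = np.cumsum(matches) / (np.arange(len(matches)) + 1.0)
        aps.append(float((precision * matches).sum() / matches.sum()))  # AP of this query
    ranks = {f"Rank-{k}": cmc[k - 1] / num_valid for k in topk}
    return ranks, float(np.mean(aps))                 # CMC scores and mAP
```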
For the one-example experiments, we use the same protocol as [8]. In each dataset, for every identity we randomly select one image from camera 1 as the labeled example; if camera 1 did not record an identity, we randomly select a sample from the next camera, so that every identity has exactly one labeled sample. All other data are used as unlabeled data. Before training, we build, for each image, the set containing the image and its corresponding other-camera-style images in the Camarket/Camduke data. During training, labeled and unlabeled images are randomly converted to one image from their corresponding sets. In this paper, we only use single images (single-shot) as input, not videos (multi-shot).
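A small sketch of this labeled-data selection protocol follows; the tuple layout of `train_items` and the function name are assumptions made for illustration.

```python
import random
from collections import defaultdict

def select_one_example(train_items):
    """For every identity, randomly pick one image from camera 1 (falling back to
    the lowest-numbered camera that recorded that identity); everything else
    becomes unlabeled data. `train_items`: list of (image_path, identity, cam_id)."""
    by_id = defaultdict(list)
    for item in train_items:
        by_id[item[1]].append(item)
    labeled, unlabeled = [], []
    for pid, items in by_id.items():
        lowest_cam = min(it[2] for it in items)                       # camera 1 if available
        chosen = random.choice([it for it in items if it[2] == lowest_cam])
        labeled.append(chosen)
        unlabeled.extend(it for it in items if it is not chosen)
    return labeled, unlabeled
```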
We use ResNet-50 (with the last classification layer removed) as the feature embedding model for all experiments, initialized with an ImageNet [57] pre-trained model. We set the temperature scalar $\tau$ to 0.1; the setting of $\lambda$ is discussed in subsection 4.1. In each model update step, Stochastic Gradient Descent (SGD) with a momentum of 0.5 and a weight decay of 0.0005 is used to optimize the parameters for 70 epochs with a batch size of 16. The learning rate is initialized to 0.1; in the last 15 epochs, to stabilize training, we reduce the learning rate to 0.01 and set $\lambda = 1$.
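The setup above could be written as follows in PyTorch (a sketch using the recent torchvision weights API, with values taken from the text; the exact schedule in the released code may differ).

```python
import torch
import torch.nn as nn
from torchvision import models

# ResNet-50 backbone initialized from ImageNet, with the classifier removed.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()            # drop the last classification layer -> 2048-d features

optimizer = torch.optim.SGD(backbone.parameters(), lr=0.1,
                            momentum=0.5, weight_decay=5e-4)
# Learning rate drops from 0.1 to 0.01 for the last 15 of the 70 epochs.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[55], gamma=0.1)
```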
Recent work on one-example person re-ID does not eliminate the domain variation between different cameras; our method addresses this problem. The re-ID performance of our method on the two large-scale re-ID datasets is summarized in Table 1. We use a selection factor of $p = 0.05$. The Baseline ($\lambda = 0.8$) is the model trained on the one-example labeled data. Ours(RST, 0.8) is the result after using the random style transfer strategy with $\lambda = 0.8$; Ours(RST+Avg, 0.8) additionally adopts the average feature estimation strategy with $\lambda = 0.8$; Ours(RST+Avg, 0.9) is the best result after tuning the hyperparameter to $\lambda = 0.9$. Even with only one labeled example per identity, our method achieves strong performance on image-based re-ID tasks: we obtain 10.3% and 8.4% Rank-1 accuracy improvements over the Baseline (one-example) on Market-1501 and DukeMTMC-reID, respectively. The performance on the two large-scale datasets demonstrates the effectiveness of our method.
Methods | Market-1501 | DukeMTMC-reID | ||||||
Rank-1 | Rank-5 | Rank-10 | mAP | Rank-1 | Rank-5 | Rank-10 | mAP | |
LOMO [7] | 27.2 | 41.6 | 49.1 | 8.0 | 12.3 | 21.3 | 26.6 | 4.8 |
BoW [18] | 35.8 | 52.4 | 60.3 | 14.8 | 17.1 | 28.8 | 34.9 | 8.3 |
UDML [34] | 34.5 | - | - | 12.4 | 18.5 | - | - | 7.3 |
ISR [55] | 40.3 | 62.2 | - | 14.3 | - | - | - | - |
CAMEL [30] | 54.5 | 73.1 | - | 26.3 | 40.3 | 57.6 | - | 19.8 |
PUL [35] | 45.5 | 60.7 | 66.7 | 20.5 | 30.0 | 43.4 | 48.5 | 16.4 |
TJ-AIDL [36] | 58.2 | 74.8 | 81.1 | 26.5 | 44.3 | 59.6 | 65.0 | 23.0 |
PTGAN [50] | 38.6 | 57.3 | - | 15.7 | 27.4 | 43.6 | - | 13.5 |
SPGAN [37] | 51.5 | 70.1 | 76.8 | 22.8 | 41.1 | 56.6 | 63.0 | 22.3 |
SPGAN+LMP [37] | 57.7 | 75.8 | 82.4 | 26.7 | 46.4 | 62.3 | 68.0 | 26.2 |
HHL [38] | 62.2 | 78.8 | 84.0 | 31.4 | 46.9 | 61.0 | 66.7 | 27.2 |
Rank [56] | 26.0 | 41.4 | 49.2 | 9.0 | 16.4 | 27.9 | 32.8 | 6.8 |
Baseline [8] | 55.8 | 72.3 | 78.4 | 26.2 | 48.8 | 63.4 | 68.4 | 28.5 |
Ours(RST, 0.8) | 63.3 | 78.8 | 83.5 | 30.4 | 54.6 | 66.4 | 71.2 | 30.0 |
Ours(RST+Avg, 0.8) | 64.6 | 78.5 | 83.0 | 31.6 | 55.5 | 68.1 | 72.9 | 31.0 |
Ours(RST+Avg, 0.9) | 66.1 | 80.0 | 84.2 | 32.7 | 57.2 | 69.7 | 74.4 | 32.5 |
We compare our method with state-of-the-art image-based person re-ID approaches, including four hand-crafted feature representation methods (LOMO [7], BoW [18], UDML [34], ISR [55]) and multiple deep-learning-based methods. The latter include three recent pseudo-label-learning-based methods (CAMEL [30], PUL [35], TJ-AIDL [36]), three domain-adaptation-based methods (PTGAN [50], SPGAN [37], HHL [38]), and two recent one-example-based methods (Rank [56], Baseline [8]). The results show that our method performs best among all compared methods on both datasets. The gap with the hand-crafted feature methods arises mainly because most early works are based on heuristic designs and therefore cannot learn the most discriminative features. Our method is superior to the unsupervised re-ID methods based on pseudo-label learning, mainly because our average feature strategy with the nearest-neighbor classifier assigns better pseudo-labels to unlabeled data; in contrast, the pseudo-label learning of [30,35] directly compares visual features, ignoring potential label information and camera variation. Compared with domain-adaptation-based approaches, our approach also achieves superior performance, a key reason being that we adopt style transfer and information mining for the unlabeled data. After adopting the random style transfer strategy (Ours(RST, 0.8)), Rank-1 and mAP increase over Baseline [8] by 7.5% and 4.2% on Market-1501 and by 4.8% and 1.5% on DukeMTMC-reID, respectively. On this basis, adding the average feature estimation strategy (Ours(RST+Avg, 0.8)) improves Rank-1 and mAP again on both datasets. After tuning the hyperparameter to $\lambda = 0.9$, we obtain the best performance: Rank-1 and mAP increase over Baseline [8] by 10.3% and 6.5% on Market-1501 and by 8.4% and 4.0% on DukeMTMC-reID, respectively. These comparisons show that our method outperforms existing image-based person re-ID methods; the specific process of tuning the hyperparameters is described in the next subsection.
$p$ is a key parameter in our framework; it controls the speed at which pseudo-labeled data are selected during the iterative process. A smaller selection factor means a lower selection speed and therefore more iteration steps and longer training time. The results with different selection factors are shown in Figure 4: Figure 4a shows Rank-1 with different selection factors on Market-1501, and Figure 4b shows Rank-1 with different selection factors on DukeMTMC-reID. The x-axis represents the ratio of selected data to the entire unlabeled dataset. Here we set $\lambda = 0.8$. In our experiments, a smaller selection factor produces better performance; an important reason is that each selection step is more cautious, so the label estimation is more accurate. We also find that the gap between the five curves is relatively small in the first few iterations and gradually increases in subsequent iterations, which indicates that estimation errors accumulate during the iteration.
The hyperparameter $\lambda$ is the parameter that adjusts the proportions of labeled data, pseudo-labeled data, and unlabeled data in the optimization; it represents the contribution of labeled and pseudo-labeled data during training. We test three values, $\lambda = 0.7$, $\lambda = 0.8$ (the value used in the baseline), and $\lambda = 0.9$; the results are shown in Figure 5. Figure 5a shows the Rank-1 results for different $\lambda$ on Market-1501, and Figure 5b shows the Rank-1 results for different $\lambda$ on DukeMTMC-reID. The x-axis represents the ratio of data selected from the entire unlabeled dataset. Here we set $p = 0.05$. As seen, $\lambda = 0.9$ achieves the best performance under all selection factors. An important reason is that the random style transfer strategy eliminates camera variation and the average feature estimation strategy makes label estimation more accurate, so the labeled and pseudo-labeled data become more reliable.
The effectiveness of the average feature estimation strategy. The average feature estimation strategy uses the average feature of the labeled data and its corresponding style-transferred data to estimate pseudo-labels. Its results under different selection factors are shown in Table 2; here we only take $p \in \{0.05, 0.10, 0.15\}$. Baseline(0.8) and Avg(0.8) denote the baseline result with $\lambda = 0.8$ and the result of adopting only the average feature estimation strategy with $\lambda = 0.8$, respectively. As seen, the average feature estimation strategy produces better performance; an important reason is that the average feature reduces the impact of camera style variation better than the original feature, so the pseudo-label estimation is more accurate.
Selection factor | Methods | Market-1501 | DukeMTMC-reID | ||||||
Rank-1 | Rank-5 | Rank-10 | mAP | Rank-1 | Rank-5 | Rank-10 | mAP | ||
p=0.05 | Baseline(0.8) | 55.8 | 72.3 | 78.4 | 26.2 | 48.8 | 63.4 | 68.4 | 28.5 |
Avg(0.8) | 59.5 | 74.9 | 80.7 | 30.1 | 51.7 | 66.2 | 70.7 | 29.1 | |
p=0.10 | Baseline(0.8) | 51.5 | 66.8 | 73.6 | 23.2 | 40.5 | 53.9 | 60.2 | 21.8 |
Avg(0.8) | 60.1 | 75.0 | 81.0 | 27.4 | 53.1 | 64.6 | 69.2 | 28.3 | |
p=0.15 | Baseline(0.8) | 44.8 | 61.8 | 69.1 | 19.2 | 35.1 | 49.1 | 54.3 | 18.2 |
Avg(0.8) | 51.0 | 68.1 | 74.1 | 23.6 | 51.3 | 63.1 | 67.8 | 26.4 |
The random style transfer strategy randomly changes the camera style of labeled and unlabeled data in the training phase. As shown in Table 3, with $p$ set to 0.05, Baseline-start(0.8) denotes the model after the first (initialization) iteration without the random style transfer strategy, and Ours-start(RST, 0.8) denotes the model after the first iteration with the random style transfer strategy. As seen, the random style transfer strategy achieves better performance, because it eliminates the style variation between cameras and therefore yields a better initialization model.
Methods | Market-1501 | DukeMTMC-reID | ||||||
Rank-1 | Rank-5 | Rank-10 | mAP | Rank-1 | Rank-5 | Rank-10 | mAP | |
Baseline-start(0.8) | 28.3 | 43.6 | 51.0 | 10.1 | 17.2 | 28.1 | 34.1 | 7.1 |
Ours-start(RST, 0.8) | 34.0 | 53.0 | 61.6 | 12.4 | 32.9 | 48.1 | 54.4 | 13.7 |
In this paper, we propose a random style transfer strategy and an average feature estimation strategy. In the training process, we adopt the random style transfer strategy to randomly change the styles of labeled and unlabeled data. In the pseudo-label estimation process, we adopt the average feature estimation strategy to estimate pseudo-labels for unlabeled data more accurately. These two strategies eliminate the camera style variation during training and during pseudo-label estimation. The clear performance improvement demonstrates the effectiveness of our method.
In the future, we hope to use these two strategies to improve the accuracy of video-based person re-ID tasks.
The work is partially supported by the National Natural Science Foundation of China (Nos. U1836216, 61772322, 62076153), the major fundamental research project of Shandong, China (No. ZR2019ZD03), and the Taishan Scholar Project of Shandong, China (No. ts20190924).
The authors declare there is no conflict of interest.