The paper proposes a label estimation method: a novel Robust AnChor Embedding (RACE) framework.

Proposed Method


[Figure not preserved. Source: Paper Reading Notes (23) [ECCV 2018]: Robust Anchor Embedding for Unsupervised Video Person Re-Identification in the Wild]




Randomly select m anchor sequences [formula] and feed them into an ImageNet-pretrained model. Each anchor represents a different person, i.e. [formula], where [formula] denotes the set of frame-level feature vectors and l denotes the corresponding initial label.
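The anchor initialization above can be sketched roughly as follows. This is only an illustration: the function name `init_anchors`, the `seed` argument, and the choice of labels 0..m-1 are assumptions, and the sketch simply treats every selected sequence as a distinct person.

```python
import numpy as np

def init_anchors(num_sequences, m, seed=0):
    """Pick m distinct sequence indices as anchors and give each its own initial label."""
    rng = np.random.default_rng(seed)
    # Sample without replacement so that no sequence is chosen twice.
    anchor_idx = rng.choice(num_sequences, size=m, replace=False)
    # Assumption: one pseudo-identity per anchor, labelled 0..m-1.
    anchor_labels = np.arange(m)
    return anchor_idx, anchor_labels

anchor_idx, anchor_labels = init_anchors(num_sequences=100, m=10)
```

The frame-level features of each selected sequence would then be extracted with the pretrained model; that step is omitted here.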

This paper adopts the classification loss (proposed in "Person re-identification: Past, present and future") as the baseline training structure. [To read]


① Robust anchor embedding:

Define the unlabeled video sequences as: [formula]. The initial set of frame-level feature vectors is converted into a single feature vector by average pooling or max pooling. Because some frames suffer from tracking errors, which produces outlier frames, the authors adopt the regularized affine hull (RAH, proposed in "From point to set: Extend the learning of distance metrics") [to read]. It can be understood as weighting the frames to obtain a d-dimensional feature vector, i.e.:

[Equation image not preserved: RAH-weighted sequence feature]
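To make the difference between the pooling options concrete, here is a minimal sketch. The function `pool_sequence` and the hand-picked weights are assumptions for illustration only. The "weighted" mode just shows the idea of down-weighting outlier frames; it is not the RAH solver from the cited paper.

```python
import numpy as np

def pool_sequence(frames, weights=None, mode="mean"):
    """Collapse an (n_frames, d) array of frame features into one d-dimensional vector."""
    frames = np.asarray(frames, dtype=float)
    if mode == "mean":
        return frames.mean(axis=0)
    if mode == "max":
        return frames.max(axis=0)
    if mode == "weighted":
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()          # normalise so the weights sum to 1
        return w @ frames        # weighted combination of frames
    raise ValueError(f"unknown mode: {mode}")

# Toy sequence: the third frame plays the role of an outlier (tracking error).
frames = [[1.0, 2.0], [3.0, 4.0], [100.0, -50.0]]
mean_feat = pool_sequence(frames, mode="mean")
max_feat = pool_sequence(frames, mode="max")
weighted_feat = pool_sequence(frames, weights=[0.5, 0.5, 0.0], mode="weighted")
```

In this toy example both mean and max pooling are pulled toward the outlier frame, while the weighted version, with the outlier's weight set to zero by hand, stays at [2, 3].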

For label estimation, first learn an embedding vector w that measures the relationship between an unlabeled sequence feature [formula] and the anchor set [formula]. For the i-th unlabeled sequence, find its k nearest anchors, i.e. [formula], where k is much smaller than m, and use these k anchors to jointly represent the unlabeled sequence. This gives the following coefficient learning problem (the Robust AnChor Embedding problem, RACE):

[Equation image not preserved: RACE objective]

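The first term is the embedding term, which limits the discrepancy between the unlabeled sequence and the anchors.

The second term is the smoothing term, which encourages anchors with larger weights to be closer. Here d is the similarity, understood as the distance to each anchor, and ⊙ denotes element-wise multiplication. It is computed as: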

[Equation image not preserved: computation of d]



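The coefficient learning step can be sketched as below. Since the original equations are not available here, the exact form is an assumption: the sketch minimises a squared reconstruction error plus λ‖d ⊙ w‖² under a sum-to-one constraint, takes d to be the plain Euclidean distance to each anchor, and does not enforce non-negative weights. The name `race_weights` and the small `ridge` term are also hypothetical. Under the sum-to-one constraint the problem has a closed-form solution, the same trick used in locally linear embedding.

```python
import numpy as np

def race_weights(x, anchors, k=3, lam=0.1, ridge=1e-8):
    """Sketch: minimise ||x - sum_j w_j c_j||^2 + lam * ||d * w||^2, s.t. sum_j w_j = 1,
    over the k nearest anchors of x. Returns (anchor indices, weights)."""
    x = np.asarray(x, dtype=float)
    anchors = np.asarray(anchors, dtype=float)
    dist = np.linalg.norm(anchors - x, axis=1)   # distance from x to every anchor
    idx = np.argsort(dist)[:k]                   # indices of the k nearest anchors
    d = dist[idx]
    Z = anchors[idx] - x                         # anchors shifted so that x is the origin
    # With sum(w) = 1 the objective equals w^T G w for this matrix G.
    G = Z @ Z.T + lam * np.diag(d ** 2) + ridge * np.eye(len(idx))
    u = np.linalg.solve(G, np.ones(len(idx)))
    w = u / u.sum()                              # rescale so the weights sum to 1
    return idx, w

x = [0.1, 0.0]
anchors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
idx, w = race_weights(x, anchors, k=3, lam=0.1)
```

In this toy case the anchor at the origin, which is closest to x, receives by far the largest weight, matching the intent of the smoothing term.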


(4) Top-k count label estimation:

If two video sequences belong to the same person, they should be very close under different measures. Specifically, if an unlabeled sequence x belongs to person [formula], two conditions must hold:

① [formula] should be one of the nearest anchors of x, defined as: [formula]

② [formula] should be large enough.


[Equation image not preserved: label prediction rule]

where [formula] denotes the rank of [formula] in [formula].
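The two conditions can be sketched as a rank check. This is one possible reading, not the paper's exact rule: it assumes the second condition refers to the learned embedding weight, and it accepts an anchor only if it ranks within the top k' both by distance and by weight. The function `estimate_label` and the `unlabeled=-1` convention are hypothetical.

```python
import numpy as np

def estimate_label(dist_knn, weights, labels_knn, k_prime=1, unlabeled=-1):
    """Return the label of an anchor that is in the top k' by distance AND by weight."""
    dist_knn = np.asarray(dist_knn, dtype=float)
    weights = np.asarray(weights, dtype=float)
    dist_rank = np.argsort(np.argsort(dist_knn))      # 0 = nearest anchor
    weight_rank = np.argsort(np.argsort(-weights))    # 0 = largest weight
    ok = (dist_rank < k_prime) & (weight_rank < k_prime)
    if not ok.any():
        return unlabeled                              # no anchor satisfies both conditions
    # Among the accepted anchors, take the one with the largest weight.
    j = int(np.argmax(np.where(ok, weights, -np.inf)))
    return labels_knn[j]

agree = estimate_label([0.1, 0.9, 1.0], [0.9, 0.09, 0.01], [7, 3, 5], k_prime=1)
disagree = estimate_label([0.1, 0.9, 1.0], [0.1, 0.8, 0.1], [7, 3, 5], k_prime=1)
relaxed = estimate_label([0.1, 0.9, 1.0], [0.1, 0.8, 0.1], [7, 3, 5], k_prime=2)
```

When the nearest anchor and the highest-weight anchor disagree, the sequence stays unlabeled for k' = 1, and a larger k' relaxes the check.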

[Question: [formula] are already the k nearest anchors, so why check again whether an anchor is among the k' nearest?]

Experimental Results


① Datasets: PRID-2011, iLIDS-VID, MARS;

② Parameter settings: dropout = 0.5; images resized to 128×256; learning rate (MARS) = 0.003, learning rate (PRID-2011, iLIDS-VID) = 0.01, decayed by a factor of 0.1 every 20 epochs; k = 15, k' = 1; λ = 0.1.
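The learning-rate schedule can be written as a small helper. This reads "decayed by 0.1 every 20 epochs" as multiplying the rate by 0.1 at each 20-epoch boundary, which is an interpretation; the function name `step_lr` is hypothetical.

```python
def step_lr(base_lr, epoch, step=20, gamma=0.1):
    """Step decay: multiply the base learning rate by gamma once every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

lr_prid_start = step_lr(0.01, epoch=0)     # PRID-2011 / iLIDS-VID, first 20 epochs
lr_prid_later = step_lr(0.01, epoch=20)    # after the first decay
lr_mars_later = step_lr(0.003, epoch=45)   # MARS, after two decays
```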


[Experimental result images not preserved]

