我制作的工程代码在这：Tiny_Face_Recognition

ArcFace论文：https://arxiv.org/abs/1801.07698

参考代码：https://github.com/deepinsight/insightface/blob/master/src/eval/verification.py

参考代码：https://github.com/auroua/InsightFace_TF

ArcFace test流程

上上一篇的训练流程的分析中，我去除了测试流程的分析，今天主要做一下测试相关。

源代码中，经过Batch中的数据经过Resnet后，又经过了ArcFace Loss，再输入inference loss作cross_entropy，代码如下：

net = get_resnet(images, args.net_depth, type='ir', w_init=w_init_method, trainable=True, keep_rate=dropout_rate)
logit = arcface_loss(embedding=net.outputs, labels=labels, w_init=w_init_method, out_num=args.num_output)
inference_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logit, labels=labels))

当然这是训练相关的，在测试的时候，我们也需要一个Tensor，能够计算测试图片通过ResNet之后得到的embeddings,结构和训练的是一样的，要注意这个结果并不通过ArcFace Loss，把这个run出来的话不需要输入标签，也就是说要run的话feed_dict只需要x和dropout就行：

test_net = get_resnet(images, args.net_depth, type='ir', w_init=w_init_method, trainable=False, reuse=True, keep_rate=dropout_rate)
embedding_tensor = test_net.outputs

然后做一个函数ver_test，输入测试的数据，输出一个准确率(accuracy)，即可方便在训练中途的时候调用：

def ver_test(
    ver_list=ver_list,    # 测试用的数据集，可能包含lfw, cfp_ff, cfp_fp, agedb_30等等，我们这里默认只考虑lfw
    ver_name_list=ver_name_list,  # 测试用的数据集名称集合
    nbatch=count,                 # 第几个batch
    sess=sess,                    # 当前的会话sess
    embedding_tensor=embedding_tensor,  # 上述的test张量
    batch_size=args.batch_size,         # batch size
    feed_dict=feed_dict_test,           # feed_dict
    input_placeholder=images,           # placeholder和image一致
)

那么这个ver_test的内部细节是如何的？

由于ver_test可能包含了一系列的待验证的测试集（输入是一个ver_list），所以我们再做一个test函数，输入单独的一个测试集，而ver_test就相当于test多运行几遍而已，即：

def test(
    data_set,         # data_set包含测试用数据和issame_list
    sess,
    embedding_tensor,
    batch_size,
    label_shape,
    feed_dict,
    input_placeholder
)

test函数的流程大致如下：

迭代data_set中的每个batch(按照batch_size)
对每个batch进行 batch -= 127.5 和 batch *= 0.0078125的操作
dropout默认设置为1，x设置为batch，一起作为feed_dict，run出来embeddings(一个batch的embedding)
一串奇怪的正则操作
将embeddings和issame_list输入evaluate函数中得到accuracy，val, val_std(标准差)和far

evaluate的输入是embeddins拆成两部分emb1和emb2，函数包含两部分：

calculate_roc(thresholds, emb1, emb2, issame, n_fold=10, pca=0)
calculate_val(thresholds, emb1, emb2, issame, n_fold=10)

第一个是算阈值在range(0, 4, 0.01)的序列中的Roc，即返回tpr, fpr和accuracy，均是一一对应阈值Threshold

第二个是算阈值在range(0, 4, 0.001)的序列中的value，返回val(均值),val_std(标准差)和far(均值)

这俩操作也很容易理解。

roc就是先算两个emb之间的距离，然后和当前Threshold进行对比，KFold中更新几个参数。ROC就更新tpr($\frac{TP}{TP+FN}$)和fpr($\frac{FP}{FP + TN}$)

val就是也算两个emb之间的距离，然后和当前Threshold进行对比，计算true_accept和false_accept，计算n_same(np.sum(issame))和n_diff(np.sun(np.logical_not(issame)))，val就是true_accept/n_same，far就是false_accept/n_diff，最终输出val的均值和标准差和far的均值

ArcFace 评测结果

论文作者在三大评测数据集上得到的评测结果如下图：

这里面告诉了我们五个结论：

$W_j$ is nearly synchronised with embeddig feature centre for ArcFace(14.29°)，but there is an obvious deviation(44.26°) between $W_j$ and the embedding feature centre for Norm-Softmax. Therefore, the angles between $W_j$ cannot absolutely represent the inter-class discrepancy on training data. Alternatively, the embedding feature centres calculated by the trained network are more representative.

大体意思就是Norm-Softmax有个比较大的偏差。用Angle算出来的特征更具有代表性。

Intra-Loss can effectively compress intra-class variations but also brings in smaller inter-class angles.

加了Intra-Loss的能更有效率的压缩类内的距离，但也会让类间的角度更小一些

Inter-Loss can slightly increase inter-class discrepancy on both W(directly) and the embedding network(indirectly), but also raises intra-class angles.

和上面的Intra-Loss对应，说明了优点和缺点。

ArcFace already has very good intra-class compactness and inter-class discrepancy.

ArcFace有很好的高内聚低耦合的效果了。

Triplet-Loss has similar intra-class compactness but inferior inter-class discrepancy compared to ArcFace. In addition, ArcFace has a more distinct margin than Triplet-Loss on the test set as illustrated in 下图.

我们可以看下面这图，Triplet-Loss之间有交叉，说明，不同类之间的差距并没有达到很好，而ArcFace达到了高内聚低耦合的效果。

在六个不通过测试集上的角度分布图如下：

在Mega Face上的CMC和ROC曲线如下图（与别的情况相比），可以看到ArcFace的面积一直都是比较高的

其他的测试结果都大同小异，那么我自己的测试结果由于时间原因，自然是：

//TODO