recsys
    
    
Loss Functions in Deep Recommender Systems
      
        YeeKal
      
      
  
        "#recsys"
      
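sampled softmax loss: approximates the full softmax by building the denominator from a sampled subset of negatives instead of summing over all classes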
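The idea can be sketched in plain Python. This is only an illustration under assumptions: the scores are taken as already computed, `sampled_softmax_loss` and its arguments are hypothetical names, and the sampling-probability correction that practical implementations often apply is left out.

```python
import math
import random

def sampled_softmax_loss(pos_score, neg_scores, num_sampled, rng):
    """Cross-entropy over the positive plus `num_sampled` sampled negatives.

    Hypothetical helper: the full softmax would use all of `neg_scores`.
    """
    sampled = rng.sample(neg_scores, num_sampled)
    logits = [pos_score] + sampled
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - pos_score

rng = random.Random(0)
all_neg_scores = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
# 20 sampled negatives versus the full set of 1000 negatives.
loss_sampled = sampled_softmax_loss(2.0, all_neg_scores, num_sampled=20, rng=rng)
loss_full = sampled_softmax_loss(2.0, all_neg_scores, num_sampled=1000, rng=rng)
print(loss_sampled, loss_full)
```

Because the sampled denominator contains fewer terms, the sampled loss underestimates the full loss, which is one reason a correction term is commonly added in practice.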
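Bayesian Personalized Ranking (BPR) loss: a pairwise loss that pushes the score of an observed item above the score of an unobserved item for the same user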
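A minimal sketch of the per-pair BPR term, $-\ln \sigma(s_+ - s_-)$, assuming the two scores are already computed. The function name `bpr_loss` is hypothetical and the regularization term is omitted.

```python
import math

def bpr_loss(pos_score, neg_score):
    """-log(sigmoid(pos_score - neg_score)), written as softplus(-diff)."""
    diff = pos_score - neg_score
    # softplus(-diff) = log(1 + exp(-diff)); branch to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

print(bpr_loss(3.0, 1.0))  # positive ranked above negative: small loss
print(bpr_loss(1.0, 3.0))  # wrong order: larger loss
```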
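negative sampling
- With only a single embedding layer, i.e. when an embedding can be looked up directly from its label, negatives can be selected online
- feed the labels of the negative samples as input
- use the other samples in the batch as negatives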
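The last option, in-batch negatives, might look like the following plain-Python sketch. It assumes dot-product scores and that row `i` of `item_vecs` is the positive item for user `i`. The function name and the toy vectors are hypothetical.

```python
import math

def in_batch_softmax_loss(user_vecs, item_vecs):
    """Row i of item_vecs is the positive for user i; other rows act as negatives."""
    losses = []
    for i, u in enumerate(user_vecs):
        # Score user i against every item in the batch.
        logits = [sum(a * b for a, b in zip(u, v)) for v in item_vecs]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Softmax cross-entropy with the diagonal entry as the target.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

users = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aligned_items = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
shuffled_items = [[0.0, 2.0], [2.0, 0.0], [2.0, 2.0]]
loss_aligned = in_batch_softmax_loss(users, aligned_items)
loss_shuffled = in_batch_softmax_loss(users, shuffled_items)
print(loss_aligned, loss_shuffled)
```

No extra negative labels need to be fed in: each user's negatives come for free from the other rows of the same batch.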
 
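triplet loss
- The goal is to pull the anchor closer to the positive sample and push it away from the negative sample
- easy triplets: the positive is already close enough ($d(q,d_+) + \text{margin} < d(q,d_-)$, so the loss is 0); there is nothing to optimize, or little to gain from optimizing
- hard triplets: $d(q,d_+) > d(q,d_-)$, the positive is even farther away than the negative
- semi-hard triplets: the distance is in between, $d(q,d_+) < d(q,d_-) < d(q,d_+) + \text{margin}$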
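The three categories can be told apart from the two distances and the margin. The sketch below uses a hypothetical helper, `triplet_kind`, and treats the boundary case (loss exactly 0) as easy, which is a choice made here rather than something the text specifies.

```python
def triplet_kind(d_pos, d_neg, margin):
    """Classify a triplet by where d_neg falls relative to d_pos and d_pos + margin."""
    if d_neg >= d_pos + margin:
        return "easy"       # max(0, margin + d_pos - d_neg) is already 0
    if d_neg < d_pos:
        return "hard"       # negative is closer to the anchor than the positive
    return "semi-hard"      # negative is farther than the positive, but within the margin

print(triplet_kind(0.2, 1.0, 0.5))
print(triplet_kind(1.0, 0.4, 0.5))
print(triplet_kind(0.5, 0.7, 0.5))
```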
 

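In face recognition, the anchor and the negative are the same kind of object: both are faces. In search and recommendation, the anchor is usually a user, while the positive and negative samples are items. The dataset is therefore constructed slightly differently.
There are several ways to implement it: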
- online: compute and select positives and negatives on the fly within each batch
- offline: select positives and negatives by hand ahead of training

Two strategies for online selection, for a batch of $P$ classes with $K$ samples each ($PK$ embeddings in total):
- batch all: select all the valid triplets, and average the loss over the hard and semi-hard triplets.
  - the crucial point is to leave out the easy triplets (those with loss 0), since averaging over them would make the overall loss very small
  - this produces a total of $PK(K-1)(PK-K)$ triplets ($PK$ anchors, $K-1$ possible positives per anchor, $PK-K$ possible negatives)
- batch hard: for each anchor, select the hardest positive (largest distance $d(a, p)$) and the hardest negative (smallest distance $d(a, n)$) among the batch
  - this produces $PK$ triplets
  - the selected triplets are the hardest among the batch
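Both strategies can be sketched in plain Python on a toy batch. This is only an illustration under assumptions: squared Euclidean distance is used, and the function names and toy embeddings are hypothetical.

```python
def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def batch_all_and_hard(embeddings, labels, margin):
    """Return (batch-all loss, batch-hard loss, number of valid triplets)."""
    n = len(embeddings)
    dist = [[squared_distance(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]
    all_losses, hard_losses = [], []
    for a in range(n):
        positives = [p for p in range(n) if p != a and labels[p] == labels[a]]
        negatives = [k for k in range(n) if labels[k] != labels[a]]
        # Batch all: every valid (anchor, positive, negative) triplet.
        for p in positives:
            for k in negatives:
                all_losses.append(max(0.0, margin + dist[a][p] - dist[a][k]))
        # Batch hard: farthest positive and closest negative for this anchor.
        if positives and negatives:
            hardest_pos = max(dist[a][p] for p in positives)
            hardest_neg = min(dist[a][k] for k in negatives)
            hard_losses.append(max(0.0, margin + hardest_pos - hardest_neg))
    # Average only over triplets with non-zero loss, i.e. skip the easy ones.
    active = [x for x in all_losses if x > 0.0]
    batch_all = sum(active) / len(active) if active else 0.0
    batch_hard = sum(hard_losses) / len(hard_losses) if hard_losses else 0.0
    return batch_all, batch_hard, len(all_losses)

# P = 2 classes, K = 3 samples each, so PK = 6 anchors.
embeddings = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0],
              [1.1, 0.0], [2.0, 0.0], [2.1, 0.0]]
labels = [0, 0, 0, 1, 1, 1]
batch_all, batch_hard, num_triplets = batch_all_and_hard(embeddings, labels, margin=0.5)
print(num_triplets)  # PK(K-1)(PK-K) = 6 * 2 * 3 = 36
```

The nested loops are written for readability. A real implementation would normally vectorize the distance matrix and the masks.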
 
 
 
Implementation
offline:
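```python
import tensorflow as tf

# Embeddings of the pre-selected (offline) triplets.
anchor_output = ...    # shape [None, 128]
positive_output = ...  # shape [None, 128]
negative_output = ...  # shape [None, 128]

# Squared Euclidean distances, one value per triplet.
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), axis=1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), axis=1)
# Hinge on margin + d_pos - d_neg; `margin` is a scalar hyperparameter defined elsewhere.
loss = tf.maximum(0.0, margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)
```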
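In equation form, the offline snippet averages a hinge on squared Euclidean distances over the $N$ triplets in the batch. The symbols $a_i$, $p_i$, $n_i$ (anchor, positive and negative embeddings) and $m$ (the margin) are notation introduced here for illustration:

$$
L = \frac{1}{N} \sum_{i=1}^{N} \max\left(0,\; m + \lVert a_i - p_i \rVert_2^2 - \lVert a_i - n_i \rVert_2^2\right)
$$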
online:
- Triplet loss in TensorFlow
- [tensorflow semihard]