This had been bugging me for quite a while; I finally found the cause, so here are my notes.

LunaticGhoulPiano/Why-loss-is-greater-than-1-

tensorflow.keras.losses.categorical_crossentropy

We are using an LSTM model for five-class classification over the labels [1, 2, 3, 4, 5],
with the combination activation function = softmax and loss function = categorical_crossentropy.

Formula

$loss = - \frac{1}{n} \sum_{i=1}^{n} \sum_{k} y_{true}^{(i,k)} \ln(y_{pred}^{(i,k)})$
where n is the total number of samples in the dataset, y_{true} is the one-hot ground-truth vector, and y_{pred} is the predicted probability vector.

Example

After one-hot encoding, [1, 2, 3, 4, 5] becomes:
[[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1]]
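As a quick sketch, the encoding above can be reproduced with plain NumPy (used here instead of Keras; `tf.keras.utils.to_categorical` produces the same matrix for 0-based labels):

```python
import numpy as np

# One-hot encode the labels 1..5: shift to 0-based indices,
# then index into an identity matrix.
labels = np.array([1, 2, 3, 4, 5])
one_hot = np.eye(5)[labels - 1]
print(one_hot)
```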

The entire dataset is:
[3, 5, 2, 2, 3, 4]
After one-hot encoding:
[[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.]]
At the end of this epoch, the predicted values are:
[[0.6268896, 0.25881532, 0.09150913, 0.02037693, 0.00240903],
[0.09165806, 0.3015705, 0.2891071, 0.26664162, 0.05102272],
[0.08631954, 0.11195231, 0.28462902, 0.30316573, 0.21393338],
[0.11869003, 0.18319334, 0.18643036, 0.3534843, 0.15820198],
[0.10678157, 0.18504941, 0.25989717, 0.26833403, 0.17993775],
[0.11031441, 0.15937562, 0.18078934, 0.4133129, 0.13620767]]

Then each sample has a corresponding loss:
loss 1 = - (0 * ln(0.6268896) + 0 * ln(0.25881532) + 1 * ln(0.09150913) + 0 * ln(0.02037693) + 0 * ln(0.00240903)) = -ln(0.09150913) = 2.39131654, where the probability assigned to true class 3 is 0.09150913
loss 2 = - (0 * ln(0.09165806) + 0 * ln(0.3015705) + 0 * ln(0.2891071) + 0 * ln(0.26664162) + 1 * ln(0.05102272)) = -ln(0.05102272) = 2.97548426, where the probability assigned to true class 5 is 0.05102272
loss 3 = ... = 2.18968228, where the probability assigned to true class 2 is 0.11195231
loss 4 = ... = 1.69721319, where the probability assigned to true class 2 is 0.18319334
loss 5 = ... = 1.34746916, where the probability assigned to true class 3 is 0.25989717
loss 6 = ... = 0.88355029, where the probability assigned to true class 4 is 0.4133129

So the loss for this epoch is (2.39131654 + 2.97548426 + 2.18968228 + 1.69721319 + 1.34746916 + 0.88355029) / 6 = 1.9141192850880648
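The per-sample losses and their mean can be recomputed with plain NumPy (a minimal sketch of what `categorical_crossentropy` computes, ignoring Keras's internal clipping and label smoothing):

```python
import numpy as np

# Ground truth [3, 5, 2, 2, 3, 4], one-hot encoded via an identity matrix.
y_true = np.eye(5)[np.array([3, 5, 2, 2, 3, 4]) - 1]

# Predicted probabilities at the end of the epoch (from the example above).
y_pred = np.array([
    [0.6268896, 0.25881532, 0.09150913, 0.02037693, 0.00240903],
    [0.09165806, 0.3015705, 0.2891071, 0.26664162, 0.05102272],
    [0.08631954, 0.11195231, 0.28462902, 0.30316573, 0.21393338],
    [0.11869003, 0.18319334, 0.18643036, 0.3534843, 0.15820198],
    [0.10678157, 0.18504941, 0.25989717, 0.26833403, 0.17993775],
    [0.11031441, 0.15937562, 0.18078934, 0.4133129, 0.13620767],
])

# Per-sample cross-entropy: -sum_k y_true * ln(y_pred); the one-hot mask
# keeps only the log-probability of the true class.
per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
print(per_sample)         # loss 1 .. loss 6
print(per_sample.mean())  # ≈ 1.9141
```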

The case loss = 1

Because the natural logarithm is used, loss = -ln(e^(-1)) = 1, so a loss of exactly 1 means the probability assigned to the true class is e^(-1) ≈ 0.368 for every sample.
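A one-line sanity check of this fact with Python's `math` module:

```python
import math

# If the true class gets probability e**-1, the per-sample loss is exactly 1.
p = math.exp(-1)     # ≈ 0.368
print(-math.log(p))  # 1.0
```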

The case loss > 1

So, following the loss = 1 case above: if across all 6 samples the true-class probabilities are at most e^(-1), and at least one of them is strictly below e^(-1), then the mean loss will be > 1.
For example:
probabilities = [e^(-1) - 0.0001, e^(-1), e^(-1), e^(-1), e^(-1), e^(-1)]
losses = [1.0002718651348228, 1, 1, 1, 1, 1]
-> final loss = (1.0002718651348228 + 1.0 + 1.0 + 1.0 + 1.0 + 1.0) / 6 = 1.0000453108558038
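This example, recomputed with Python's `math` module:

```python
import math

# Six true-class probabilities: one slightly below e**-1, five exactly at it.
probs = [math.exp(-1) - 0.0001] + [math.exp(-1)] * 5
losses = [-math.log(p) for p in probs]
final_loss = sum(losses) / len(losses)
print(losses[0])   # ≈ 1.000272
print(final_loss)  # slightly above 1
```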
