PyTorch Deep Learning Practice <http://www.ituring.com.cn/book/2456>

4　Loss Functions

A loss function, also called an objective function, is one of the two parameters required to compile a neural network model. The other essential parameter is the optimizer.

The loss function computes the difference between the label value and the predicted value. In machine learning there are many loss functions to choose from; typical examples are distance-based and absolute-value-based measures.

The figure above is a schematic of a linear equation being learned automatically. The thick line is the true linear equation, and the dotted lines illustrate the iterative process: w1 is the weight after the first iteration, w2 after the second, and w3 after the third. As the number of iterations increases, the goal is for wn to get arbitrarily close to the true value.

So how does w get arbitrarily close to the true value? That is precisely the job of the loss function and the optimizer. The labels 1/2/3 in the figure mark the difference between the predicted Y value and the true Y value in each of the three iterations. That difference is what the loss function measures; of course, there are many formulas for computing it in practice. The figure uses the absolute difference, but in multidimensional space there are also the squared difference, mean squared error, and other distance formulas. That is all a loss function is.

This is the one-dimensional case; use your imagination and extend it to many dimensions, and you have the essence of deep learning.
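To make the picture concrete, here is a minimal sketch of the loop the figure describes: a loss function measures the gap between the predicted and true Y, and an optimizer nudges w toward the true value each iteration. It uses the modern PyTorch API (tensors with requires_grad rather than Variable); the true slope 3 and the learning rate are made up for illustration.

```python
import torch

# True line: y = 3 * x (the "real linear equation" in the figure)
x = torch.arange(1.0, 5.0)      # inputs [1, 2, 3, 4]
y_true = 3.0 * x                # targets on the true line

w = torch.zeros(1, requires_grad=True)     # start far from the true weight
optimizer = torch.optim.SGD([w], lr=0.01)
criterion = torch.nn.MSELoss()             # the loss function

for step in range(200):
    optimizer.zero_grad()
    loss = criterion(w * x, y_true)  # difference between predicted and true Y
    loss.backward()                  # gradient of the loss w.r.t. w
    optimizer.step()                 # move w a little closer to 3

print(w.item())  # w has converged very close to the true slope 3
```

Each iteration plays the role of w1, w2, w3 in the figure: the loss shrinks and w approaches the true value.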

The following introduces several common loss functions. PyTorch predefines many kinds of loss functions, so it is fine to look up the exact formula only when you actually need one.

Let's define two 2×2 tensors, then use different loss functions to compute the loss between them.

```python
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F

sample = Variable(torch.ones(2, 2))
a = torch.Tensor(2, 2)
a[0, 0] = 0
a[0, 1] = 1
a[1, 0] = 2
a[1, 1] = 3
target = Variable(a)
```

(Note: in PyTorch 0.4 and later, Variable is deprecated; plain tensors can be passed to loss functions directly.)

The value of sample is: [[1, 1], [1, 1]].

The value of target is: [[0, 1], [2, 3]].

4.1　nn.L1Loss

L1Loss is very simple to compute: take the mean of the absolute differences between the predicted values and the true values.
```python
criterion = nn.L1Loss()
loss = criterion(sample, target)
print(loss)
```

The result is: 1.

Its computational logic is as follows:

First sum the absolute differences: |1−0| + |1−1| + |1−2| + |1−3| = 4;

then average: 4/4 = 1.
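As a quick check, the same number can be computed by hand (a standalone sketch that redefines the tensors using plain modern PyTorch):

```python
import torch

sample = torch.ones(2, 2)
target = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

# L1Loss is just the mean absolute difference
manual = (sample - target).abs().mean()
builtin = torch.nn.L1Loss()(sample, target)
print(manual.item(), builtin.item())  # both are 1.0
```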

4.2　nn.SmoothL1Loss

SmoothL1Loss is also called Huber loss: when the error is inside (−1, 1) it is a squared loss, otherwise it is an L1 loss. For an element-wise difference d, the loss is 0.5·d² when |d| < 1 and |d| − 0.5 otherwise.

```python
criterion = nn.SmoothL1Loss()
loss = criterion(sample, target)
print(loss)
```

The result is: 0.625.
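The 0.625 can be verified from the piecewise definition above (a standalone sketch redefining the tensors):

```python
import torch

sample = torch.ones(2, 2)
target = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

# Piecewise Huber form: 0.5 * d**2 where |d| < 1, else |d| - 0.5
d = (sample - target).abs()          # differences: 1, 0, 1, 2
manual = torch.where(d < 1, 0.5 * d ** 2, d - 0.5).mean()  # (0.5 + 0 + 0.5 + 1.5) / 4
builtin = torch.nn.SmoothL1Loss()(sample, target)
print(manual.item(), builtin.item())  # both are 0.625
```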

4.3　nn.MSELoss

The squared loss function: the mean of the squared differences between the predicted values and the true values.

```python
criterion = nn.MSELoss()
loss = criterion(sample, target)
print(loss)
```

The result is: 1.5.
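Again, the value follows directly from the formula (standalone sketch):

```python
import torch

sample = torch.ones(2, 2)
target = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

manual = ((sample - target) ** 2).mean()   # (1 + 0 + 1 + 4) / 4
builtin = torch.nn.MSELoss()(sample, target)
print(manual.item(), builtin.item())  # both are 1.5
```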

4.4　nn.BCELoss

Cross entropy for binary classification. The formula is more involved; it is introduced here mainly as a concept and is not used this way in practice.

```python
criterion = nn.BCELoss()
loss = criterion(sample, target)
print(loss)
```

The result is: −13.8155. Note, however, that BCELoss expects both its inputs and targets to be probabilities in [0, 1]; the tensors here violate that, so this number is not meaningful, and recent PyTorch versions reject such targets with an error.
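For reference, here is a minimal well-posed call (the probabilities and labels are made up): the inputs are sigmoid-style probabilities and the targets are 0/1 labels, and the result matches the textbook binary cross-entropy formula.

```python
import torch

# Predicted probabilities (e.g. the output of a sigmoid) and binary labels
probs = torch.tensor([0.9, 0.2, 0.8])
labels = torch.tensor([1.0, 0.0, 1.0])

loss = torch.nn.BCELoss()(probs, labels)
# The same quantity by hand: mean of -[t*log(p) + (1-t)*log(1-p)]
manual = -(labels * probs.log() + (1 - labels) * (1 - probs).log()).mean()
print(loss.item(), manual.item())  # identical small positive values
```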

4.5　nn.CrossEntropyLoss

The cross-entropy loss function.

This loss is widely used; for example, it appears constantly in image-classification models.

```python
criterion = nn.CrossEntropyLoss()
loss = criterion(sample, target)
print(loss)
```

The result: an error. It cannot be used directly this way!

Consulting the documentation shows that nn.CrossEntropyLoss is used for classification tasks such as image recognition and places specific requirements on its inputs. It is only a concept here; the chapter on image recognition shows the correct way to use it.
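For reference, here is a minimal working call (the logits and labels are made up): CrossEntropyLoss expects raw scores ("logits") of shape (N, C) and integer class indices of shape (N,), not two tensors of the same shape as in the example above.

```python
import torch

# Raw per-class scores for 2 samples over 3 classes
logits = torch.tensor([[2.0, 0.5, 0.1],    # sample 0: class 0 most likely
                       [0.2, 0.1, 3.0]])   # sample 1: class 2 most likely
labels = torch.tensor([0, 2])              # true class index per sample

loss = torch.nn.CrossEntropyLoss()(logits, labels)
print(loss.item())  # a small positive number, since both predictions are right
```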

4.6　nn.NLLLoss

The negative log-likelihood loss function (Negative Log Likelihood).

Preceded by a LogSoftmax layer, it is equivalent to the cross-entropy loss. Note that the input x here differs from the cross-entropy loss above: it must already be log-probabilities, i.e. values after a log operation. This loss function is also used in image-recognition models.
```python
loss = F.nll_loss(sample, target)
print(loss)
```

The result: an error again; it cannot be used on these tensors directly! (Note also that F.nll_loss is a function, not a class, so the original line criterion = F.nll_loss() would fail even before the shape mismatch.)

nn.NLLLoss and nn.CrossEntropyLoss are very similar! Both are typically used in multi-class models, and in practice NLLLoss is used quite a lot.

4.7　nn.NLLLoss2d

It is similar to the loss above, but with extra spatial dimensions; it is usually applied to images:

* input: (N, C, H, W)
* target: (N, H, W)

For example, when a fully convolutional network does classification, each point in the image finally predicts a class label.
```python
criterion = nn.NLLLoss2d()
loss = criterion(sample, target)
print(loss)
```

The result: an error again; it cannot be used directly this way! (Note also that nn.NLLLoss2d is deprecated in recent PyTorch versions; nn.NLLLoss itself now accepts 4-dimensional input.)
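A minimal per-pixel sketch with the shapes listed above (the scores and labels are random, made-up values; it uses F.nll_loss, which in modern PyTorch accepts 4-d input directly, so a separate NLLLoss2d is no longer needed):

```python
import torch
import torch.nn.functional as F

# A fully convolutional net would output per-pixel class scores:
# (N, C, H, W) = (1 image, 3 classes, 2x2 spatial grid)
scores = torch.randn(1, 3, 2, 2)
# Per-pixel true class labels: shape (N, H, W), values in [0, C)
labels = torch.randint(0, 3, (1, 2, 2))

log_probs = F.log_softmax(scores, dim=1)  # log-probabilities over the class dim
loss = F.nll_loss(log_probs, labels)      # averaged over all pixels
print(loss.item())
```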