Computer Vision Applications of Convolutional Neural Networks (Part Two)
Before starting the second part on convolutional neural networks, let's review two points from the previous part: what determines the number of channels in a convolution kernel, and what is determined by the number of convolution kernels in a layer? If either point is still unclear, it is worth going back over the previous part before continuing.
A typical layer in a convolutional network consists of three stages. In the first stage, the layer computes several convolutions in parallel to produce a set of linear activations. In the second stage, each linear activation passes through a nonlinear activation function, such as the rectified linear activation function (ReLU). This stage is sometimes called the detector stage. In the third stage, a pooling function further adjusts the output of the layer.
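The three stages can be sketched in a few lines of plain Python. This is only an illustrative sketch, not how a real framework implements a layer: it assumes a single-channel square image, a single kernel, no padding, and a stride of 1 for the convolution, and the image and kernel values are made up for the example.

```python
def conv2d_valid(image, kernel):
    """Stage 1: slide the kernel over the image (no padding, stride 1).

    This is the cross-correlation that deep learning libraries call convolution.
    """
    n, f = len(image), len(kernel)
    out = n - f + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(f) for b in range(f))
            for j in range(out)
        ]
        for i in range(out)
    ]


def relu(feature_map):
    """Stage 2 (detector stage): replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, f=2, s=2):
    """Stage 3: keep the maximum of each f*f window, moving s pixels at a time."""
    out = (len(feature_map) - f) // s + 1
    return [
        [
            max(feature_map[i * s + a][j * s + b] for a in range(f) for b in range(f))
            for j in range(out)
        ]
        for i in range(out)
    ]


# Hypothetical 5*5 image and 2*2 kernel, chosen only for illustration.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [2, 0, 1, 0, 2],
    [1, 3, 0, 2, 1],
    [0, 1, 2, 1, 0],
]
kernel = [[1, -1], [-1, 1]]

linear = conv2d_valid(image, kernel)   # 4*4 linear activations
detected = relu(linear)                # 4*4, negatives set to 0
pooled = max_pool(detected)            # 2*2 summary
print(pooled)  # [[3, 4], [4, 3]]
```

A real layer computes many such kernels in parallel over multi-channel inputs; the single kernel here just keeps the three stages easy to follow.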
A pooling function replaces the output of the network at a given location with a summary statistic of the nearby outputs. For example, max pooling returns the maximum value within a neighboring region. Other commonly used pooling functions include the average over a neighboring region and a weighted average based on the distance from the central pixel. Let's walk through a small example of max pooling.
As shown in the figure above, we have a 4*4 input image and apply max pooling with a 2*2 window and a stride of 2. The output has four values, shown in four colors; each one is the maximum of the input region with the same color and represents that region. The output size follows the same formula as convolution: for an n*n input image, an f*f window, and a stride of s, the output is ((n-f)/s+1)*((n-f)/s+1), with the division rounded down when it is not exact. Here that gives (4-2)/2+1 = 2, so the output is 2*2. Unlike a convolution kernel, whose values are all parameters learned during training, pooling has no learnable parameters: the window size and stride are fixed in advance, and the operation simply summarizes the features at each position.
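As a rough illustration of the pooling example and the size formula, the sketch below pools a 4*4 image with a 2*2 window and a stride of 2. The pixel values are hypothetical (the values in the figure are not reproduced here), and the helper names `pooled_size`, `pool_2d`, and `mean` are made up for this example. The same helper handles max pooling and average pooling by swapping the summary function.

```python
def pooled_size(n, f, s):
    """Output side length for an n*n input, f*f window, stride s."""
    return (n - f) // s + 1


def pool_2d(image, f, s, summarize=max):
    """Summarize each f*f window of a square image, moving s pixels per step."""
    out = pooled_size(len(image), f, s)
    return [
        [
            summarize([image[i * s + a][j * s + b] for a in range(f) for b in range(f)])
            for j in range(out)
        ]
        for i in range(out)
    ]


def mean(values):
    return sum(values) / len(values)


# Hypothetical 4*4 input; each 2*2 quadrant plays the role of one colored region.
image = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]

print(pooled_size(4, 2, 2))                    # 2
print(pool_2d(image, f=2, s=2))                # [[6, 5], [7, 9]]
print(pool_2d(image, f=2, s=2, summarize=mean))  # [[3.5, 2.0], [2.5, 6.0]]
```

Note that nothing in `pool_2d` is learned: the only choices are the window size, the stride, and the summary function.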
Now that we know the convolution and pooling operations, a problem becomes apparent: each convolution and each pooling step makes the image smaller. This is a serious problem for deep learning, because a deep network can be hundreds of layers deep and every layer performs convolution and pooling, so within a few layers the image shrinks to 1*1. This is clearly not what we want. To solve this problem, we introduce padding.
Here we introduce same padding, which is common in convolutional networks. As the name suggests, with same padding the image size stays the same after convolution. As shown in the figure below, we add a border of zero-valued pixels around an original 6*6 image to turn it into an 8*8 image, and then convolve it with a 3*3 kernel. According to the formula we have learned, with a stride of 1 the output size is 8-3+1 = 6, that is, 6*6. The convolution leaves the image size unchanged, so it can be used repeatedly in a deep network. Finally, we generalize the formula: for an n*n original image, padding p, an f*f convolution kernel, and a stride of s, the output image size is ((n+2p-f)/s+1)*((n+2p-f)/s+1).
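The generalized formula and the zero-padding step can be checked with a short sketch. The helper names `conv_output_size`, `same_padding`, and `zero_pad` are hypothetical, and `same_padding` assumes a stride of 1 and an odd kernel size, which is the case in the 6*6 example.

```python
def conv_output_size(n, f, p=0, s=1):
    """Side length of the output: (n + 2p - f) / s + 1, rounded down."""
    return (n + 2 * p - f) // s + 1


def same_padding(f):
    """Padding that keeps the size unchanged (assumes stride 1 and odd f)."""
    return (f - 1) // 2


def zero_pad(image, p):
    """Surround a 2-D image with a border of p zero-valued pixels."""
    width = len(image[0]) + 2 * p
    border = [[0] * width for _ in range(p)]
    middle = [[0] * p + list(row) + [0] * p for row in image]
    return border + middle + [[0] * width for _ in range(p)]


image = [[1] * 6 for _ in range(6)]   # hypothetical 6*6 image of ones
p = same_padding(3)                   # 1 for a 3*3 kernel
padded = zero_pad(image, p)

print(len(padded), len(padded[0]))          # 8 8
print(conv_output_size(6, 3))               # 4 (no padding: the image shrinks)
print(conv_output_size(6, 3, p=p, s=1))     # 6 (same padding: size preserved)
```

Without padding, each 3*3 convolution removes two pixels per side length (6 to 4 here), which is the shrinkage that same padding avoids.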
4. Handwriting Recognition
With this background, let's see how a convolutional neural network recognizes a handwritten image. As shown in the figure below:
The original input is a 32*32*3 color image of a handwritten digit. It first passes through a convolution with 6 kernels of size 5*5*3, which gives a 28*28*6 output, followed by 2*2 max pooling with a stride of 2. This completes the first layer, whose output is 14*14*6. The output of the first layer is the input to the second layer, which applies a convolution with 16 kernels of size 5*5*6, giving 10*10*16, followed again by 2*2 max pooling with a stride of 2. The final output is 5*5*16. We flatten it into a one-dimensional array of 5*5*16 = 400 values and then apply fully connected layers (the next chapter explains the fully connected operation). After two fully connected layers we obtain an output, and applying the softmax function to it yields the probabilities of the ten digits 0-9. We take the digit with the highest probability as the result. This completes a full handwriting recognition example using a convolutional neural network.
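The sizes quoted above can be traced step by step with the output-size formula. The sketch below only tracks shapes; it does not implement the fully connected layers, and the ten scores passed to softmax are hypothetical stand-ins for the output of the last fully connected layer. The helper names are made up for this example, and the convolutions are assumed to use no padding and a stride of 1, which is what the quoted sizes imply.

```python
import math


def conv_shape(shape, num_kernels, f, p=0, s=1):
    """Shape after convolving an (h, w, channels) input with num_kernels f*f kernels."""
    h, w, _channels = shape
    return ((h + 2 * p - f) // s + 1, (w + 2 * p - f) // s + 1, num_kernels)


def pool_shape(shape, f, s):
    """Shape after f*f pooling with stride s; the channel count is unchanged."""
    h, w, c = shape
    return ((h - f) // s + 1, (w - f) // s + 1, c)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


shape = (32, 32, 3)
shape = conv_shape(shape, num_kernels=6, f=5)    # (28, 28, 6)
shape = pool_shape(shape, f=2, s=2)              # (14, 14, 6)
shape = conv_shape(shape, num_kernels=16, f=5)   # (10, 10, 16)
shape = pool_shape(shape, f=2, s=2)              # (5, 5, 16)
flat = shape[0] * shape[1] * shape[2]            # 400 values after flattening

# Hypothetical scores for digits 0-9, standing in for the fully connected output.
scores = [0.1, 1.2, -0.3, 2.5, 0.0, 0.7, -1.1, 3.9, 0.4, 1.8]
probs = softmax(scores)
predicted = probs.index(max(probs))              # digit with the highest probability
print(shape, flat, predicted)  # (5, 5, 16) 400 7
```

Each kernel's depth matches the number of input channels (3, then 6), and the number of kernels sets the number of output channels (6, then 16), which is why the shapes line up from one layer to the next.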