image segmentation — Image segmentation based on graph （Graph-BasedImageSegmentation）
image segmentation — Image segmentation based on graph （Graph-Based Image Segmentation）
Efficient Graph-Based Image Segmentation,IJCV 2004,MIT Code
Graph-Based Segmentation It is a classic image segmentation algorithm , author Felzenszwalb It is also proposed DPM Big cow of algorithm . The algorithm is based on graph greedy clustering
algorithm , Easy to implement , Very fast , The accuracy is OK . however , At present, it should be less directly used for segmentation , After all 99 Cross century veteran , But many algorithms use it as a stepping stone , such as Object
Propose The beginning of the mountain 《Segmentation as Selective Search for Object
Recognition》 Use it to create over segmentation （oversegmentation）. There are also semantic segmentation (senmatic segmentation
) The algorithm uses it to generate super pixels （superpixels） Specifically, I forgot ……
<> Basic concepts of Graphs
Because the algorithm abstracts the image with weighted graph , Therefore, some basic concepts of graph are supplemented .
A graph is composed of a set of vertices （vertices） And edge set （edges） form , Expressed as , vertex , In this paper, it is a single pixel , Edges that connect a pair of vertices have weights , The meaning in this paper is not between vertices
Similarity , The undirected graph is used .
tree ： Special Graphs , Any two vertices in a graph , There are paths connected , But there is no loop . As shown in the figure above, the bold edges are connected . If you look at it as a ball of random beads , Keep only
Beads and wires in trees , So pick a bead , Can lift all the beads in this tree . If ,i and h This side is also preserved , that h,I,c,f,g It's a loop .
minimum spanning tree （MST, minimum spanning tree
<http://en.wikipedia.org/wiki/Minimum_spanning_tree>）： Special trees , Given the vertices to be connected , Select the tree with the minimum sum of edge weights .
The picture above shows a tree MST
In this paper , Each pixel is a vertex when initialized , Then gradually merge to get a region , It's the one that connects pixels in this area MST. As shown in the figure , The brown circle is the apex , The line segment is an edge , Generated by merging Brown vertices MST, The corresponding is a segmentation region . The result of the partition is actually the forest .
Since it is a clustering algorithm , What rules should be used to determine when the two should be combined into one , When should we continue to draw the line ?
For two isolated pixels , The difference is color , Nature uses the distance of color to measure the similarity of two points , In this paper, we use RGB Distance of , Namely
You can use it, of course perceptually
uniform Of Luv perhaps Lab Color space , For grayscale images, you can only use the brightness value , in addition , You can also use texture feature filtering first , Recalculate distance , such as , Do it first Census
Transform Recalculation Hamming distance <http://en.wikipedia.org/wiki/Hamming_distance> distance .
<> Global threshold à Adaptive threshold
As mentioned above, the difference between two pixels should be used to measure the difference between two pixels . For two regions （ Subgraph ） Or the similarity between an area and a pixel , The simplest method is to consider only the dissimilarity of the edges connecting them .
As shown in the figure , Two areas of brown and green have been formed , The purple edge is now used to determine whether the two regions are merged . So we can set a threshold
, When the difference between two pixels （ That is, they are not similar ） Less than this value , become . Iterative merging , And eventually it will merge into regions , This is the basic idea of regional growth ： sparks of fire , It can start a prairie fire .
obviously , The picture above should be condensed into what the picture on the right thinks 3 class , High frequency region h, Slope area s, Flat area p. If we set a global threshold
, So if h If you want to merge the districts into one , Then the threshold should be chosen to be large , But that would make p and s Areas are also included , The segmentation result is too coarse
. If p For reference , Then the threshold value should be very small , In that case ,p Areas will merge into one , however ,h The districts will merge into a very special number of small pieces , Like a broken mirror , The segmentation result is too fine .
obviously , The global threshold is not appropriate , So, of course, we have to use adaptive thresholds . about p The threshold should be very small ,s The area is slightly larger ,h The area is huge .
For two regions （ It is called Component, It's essentially a MST, A single pixel can also be regarded as a region ）, This article uses the very intuitive , But the anti-interference method is not strong . Let's start with two definitions , According to these two additional information, the adaptive threshold is obtained .
Intra class differences in a region ：
It can be approximately understood as the maximum luminance difference value within an area , The definition is MST The edge with the largest dissimilarity in .
The differences between the two regions ：
That is to connect all the edges of the two regions , The dissimilarity of the edge with the least dissimilarity , That is, the difference between the most similar parts of the two regions .
Then the intuitive judgment of whether the merger criteria ：
condition of equivalence
explain ： , These are the biggest differences that regions and can tolerate , When both can tolerate the current difference , You love me , fit in easily with , As long as one side doesn't want to , You can't force it .
exceptional case , When both are isolated pixel values ,
, All pixels are " Zero tolerance " Only if the pixel values are exactly the same can they be merged , Nature can lead to over segmentation . So in the beginning , A tolerable range should be set for each pixel , When it grows to a certain extent , This initial tolerance value should be removed . The original conditions are as follows
Where is the number of pixels contained in the region , such , As the region expands , The effect of this item is getting smaller and smaller , In the end, it's almost negligible . So it's the size of the area that you can control , If ,
that , Almost every pixel becomes a separate region , If , Obviously, the whole picture will come together . therefore , The bigger it is , The larger the image after segmentation .
of course , The median can be used to deal with overshoot , But it became a NP Difficult problem , See original for proof
<> Similar in shape
As mentioned above, we use color information to cluster , Revision of similarity measures , Can be clustered into specific shapes we want . For example, we want to get a lot of long areas , Then we can use the region formed by clustering the measure of area / Perimeter
+ Difference of brightness value Measure the similarity between two subgraphs or two pixels . Because of the area of the strip / The perimeter will be smaller .
<> Algorithm steps
Step 1: Each pixel is calculated 8 Neighborhood or 4 Neighborhood dissimilarity .
As shown on the left , The solid line is only calculated 4 field , Add the dotted line to the calculation 8 Neighborhood , Because it is an undirected graph , From left to right , From top to bottom , Just calculate the gray line in the right figure .
Step 2: According to the dissimilarity of edges non-decreasing array （ from small to large ） Sort it out .
Step 3: choice
Step 4: Merge the currently selected edges . Let the connected vertex be . If the merger conditions are met ：
（1） Different regions ;
（2） The dissimilarity is not greater than the internal dissimilarity . Then execute Step 4. Otherwise, it will be executed Step 5
Step 5: Update threshold and class label .
Update class label ： Unify the class label of to the label of .
Update the dissimilarity threshold of this class to ：.
be careful ： The edges with small similarity are merged first , therefore , It is the largest edge of the current merged region , Namely .
Step 6: If , Then follow the order , Select the next edge to execute Step 4, Otherwise it's over .
Segmentation parameters: sigma = 0.5, k= 500, min = 50.
Sigma： Firstly, the original image is denoised by Gaussian filtering ,sigma That is the Gaussian kernel
k: Controls the size of the merged region , See above
min: Post processing parameters , After segmentation, there will be many small areas , When the number of pixels in the region is less than min Time , Select the region with the smallest difference, that is to say .
<> Nature discussion
The results were not very good , But it has good global properties , The conclusion is interesting , Those who are interested can have a look .
The first thing to note is that , For any image , There is always a segmentation method , The result of segmentation is not too fine , It's just rough . But it's not the only one .
If step 4 Time ,, But there was no merger , Namely , So there must be an area that's already segmented , such as , Then the scope of the region will not increase , It will become a region in the final segmentation region .
hypothesis ,, Because the edge is in accordance with non-decreasing sort , So the remaining edges of the connection are not less than , Not the smallest side , Naturally, the rest of the side is standing on the side .
however , The original said that only one has been separated , But I think there is another situation , also , Then the two districts should be divided .
<>Not Too fine
The segmentation is too fine , That is, the area that should not have been separated was cut off by the waist , But this algorithm can ensure that lovers get married , I will never do anything to break up a couple .
The method of proof to the contrary ： As shown above . It shouldn't have been divided , The conditions should be met . If it's separated , Then there must be an edge
As a result, they did not merge , Well, from the previous lemma , There must be a region that becomes a part of the final segmentation result , Suppose it is A part , Go back to the time of judging this side , There must be ,, thus
, Because of the non-decreasing order , therefore A Partial sum B Part of the smallest side is , So it's inconsistent with the assumptions .
<>Not Too coarse
The segmentation is too thick , That is, the areas that should have been separated are not separated . However, this algorithm can ensure that when it is broken, it will be broken , It can't be broken .
The method of proof to the contrary ： As shown above . It should have been divided , Then the conditions should be met . Suppose or not , For connection A,B The smallest side . If the merger , because , And it is non-decreasing order , So in the judgment side
before A The region has been formed . If the segmentation is too thick , Then it is determined that the smallest edge satisfies the , Then they must merge . Conflict with conditions .
<> The influence of the order of equal weight edge processing
If there are two sides , The weight of , So when sorting , Who is at the top , Does it matter who falls behind ? The conclusion is that there is no wood .
Case1：, The connected areas are the same , Namely , It's all regions that connect , So it doesn't matter who's ahead .
Case2：, The connected areas are completely different , For example, the connection area ,, Connection area , So who comes first and who comes after , It does not affect whether to merge or not , And it doesn't affect whether to merge or not .
Case3： connect , connect
Case3-1： First , After , also , To cause to merge , Exchange the two processing order , Deal with it first , Post processing . If not merged , That doesn't affect the merger ; If the merger , So after the merger , Merge as usual .
Case3-2： First , After , also , No consolidation , Exchange the two processing order , Deal with it first , Post processing . If it is . Is that a merger , Will not make the merger ; If , That's the same thing , It doesn't matter either .
<> supplement ：
<> Color picture
For color pictures , The above is about R,G,B As distance , The whole image is segmented only once , In the original text, each channel is divided once , Finally, the intersection of the results is taken , In other words, the two points in the picture should be divided into the same area , Then in the R,G,B In the results of three channel partition , They have to be in the same area all the time . The effect is better …… But his procedure uses a split .
<>Nearest Neighbor Graphs
In the previous paper, only the spatial location is used to connect the component diagram , The disadvantages are obvious , Space is not adjacent , The colors are exactly the same , So if there is a slight break in the middle, it will be divided into many parts . So the other is more equal
The strategy is to consider the two together , Map to feature space first , Reconstruction map . At this time, the connection is not necessarily 4/8 Neighborhood , Because of
Opposite side , So if we consider the connection of all edges , It's horrible ! The original text is to find each pixel 10 The nearest Euclidean distance is 10 Nearest neighbor , Construction map , of course , Another way is not to fix the number of neighbors , It's about limiting the distance .
Then the explanation of the distance within the class is intuitive , The shortest distance in a class , Then we'll take this side as the radius , A hypersphere is formed in the feature space , But I'll get to know someone else .
It is also the shortest distance between the two classes .
look for 10-NN It's also quite tiring , The original text uses approximate algorithm ANN《Approximate nearest neighbor searching》 Come and find it 10 a near neighbor , fast .
The rest is the same as above , But there's one thing I don't understand , That's the update , Take the picture above , It must have been updated with the green line , Then the meaning is no longer the shortest radius of all the points in the set , solve ?