image segmentation — Image segmentation based on graph (Graph-Based Image Segmentation)


Efficient Graph-Based Image Segmentation,IJCV 2004,MIT Code


Graph-Based Segmentation It is a classic image segmentation algorithm , author Felzenszwalb It is also proposed DPM Big cow of algorithm . The algorithm is based on graph greedy clustering
algorithm , Easy to implement , Very fast , The accuracy is OK . however , At present, it should be less directly used for segmentation , After all 99 Cross century veteran , But many algorithms use it as a stepping stone , such as Object
Propose The beginning of the mountain 《Segmentation as Selective Search for Object
Recognition》 Use it to create over segmentation (oversegmentation). There are also semantic segmentation (senmatic segmentation
) The algorithm uses it to generate super pixels (superpixels) Specifically, I forgot ……

<> Basic concepts of Graphs

Because the algorithm abstracts the image with weighted graph , Therefore, some basic concepts of graph are supplemented .

A graph is composed of a set of vertices (vertices) And edge set (edges) form , Expressed as , vertex , In this paper, it is a single pixel , Edges that connect a pair of vertices have weights , The meaning in this paper is not between vertices
Similarity , The undirected graph is used .

tree : Special Graphs , Any two vertices in a graph , There are paths connected , But there is no loop . As shown in the figure above, the bold edges are connected . If you look at it as a ball of random beads , Keep only
Beads and wires in trees , So pick a bead , Can lift all the beads in this tree . If ,i and h This side is also preserved , that h,I,c,f,g It's a loop .

minimum spanning tree (MST, minimum spanning tree
<>): Special trees , Given the vertices to be connected , Select the tree with the minimum sum of edge weights .
The picture above shows a tree MST

In this paper , Each pixel is a vertex when initialized , Then gradually merge to get a region , It's the one that connects pixels in this area MST. As shown in the figure , The brown circle is the apex , The line segment is an edge , Generated by merging Brown vertices MST, The corresponding is a segmentation region . The result of the partition is actually the forest .


<> Similarity

Since it is a clustering algorithm , What rules should be used to determine when the two should be combined into one , When should we continue to draw the line ?

For two isolated pixels , The difference is color , Nature uses the distance of color to measure the similarity of two points , In this paper, we use RGB Distance of , Namely

You can use it, of course perceptually
uniform Of Luv perhaps Lab Color space , For grayscale images, you can only use the brightness value , in addition , You can also use texture feature filtering first , Recalculate distance , such as , Do it first Census
Transform Recalculation Hamming distance <> distance .

<> Global threshold à Adaptive threshold

As mentioned above, the difference between two pixels should be used to measure the difference between two pixels . For two regions ( Subgraph ) Or the similarity between an area and a pixel , The simplest method is to consider only the dissimilarity of the edges connecting them .

As shown in the figure , Two areas of brown and green have been formed , The purple edge is now used to determine whether the two regions are merged . So we can set a threshold
, When the difference between two pixels ( That is, they are not similar ) Less than this value , become . Iterative merging , And eventually it will merge into regions , This is the basic idea of regional growth : sparks of fire , It can start a prairie fire .

obviously , The picture above should be condensed into what the picture on the right thinks 3 class , High frequency region h, Slope area s, Flat area p. If we set a global threshold
, So if h If you want to merge the districts into one , Then the threshold should be chosen to be large , But that would make p and s Areas are also included , The segmentation result is too coarse
. If p For reference , Then the threshold value should be very small , In that case ,p Areas will merge into one , however ,h The districts will merge into a very special number of small pieces , Like a broken mirror , The segmentation result is too fine .

obviously , The global threshold is not appropriate , So, of course, we have to use adaptive thresholds . about p The threshold should be very small ,s The area is slightly larger ,h The area is huge .

For two regions ( It is called Component, It's essentially a MST, A single pixel can also be regarded as a region ), This article uses the very intuitive , But the anti-interference method is not strong . Let's start with two definitions , According to these two additional information, the adaptive threshold is obtained .

Intra class differences in a region :

It can be approximately understood as the maximum luminance difference value within an area , The definition is MST The edge with the largest dissimilarity in .

The differences between the two regions :

That is to connect all the edges of the two regions , The dissimilarity of the edge with the least dissimilarity , That is, the difference between the most similar parts of the two regions .

Then the intuitive judgment of whether the merger criteria :

condition of equivalence

     explain : , These are the biggest differences that regions and can tolerate , When both can tolerate the current difference , You love me , fit in easily with , As long as one side doesn't want to , You can't force it .

     exceptional case , When both are isolated pixel values ,
, All pixels are " Zero tolerance " Only if the pixel values are exactly the same can they be merged , Nature can lead to over segmentation . So in the beginning , A tolerable range should be set for each pixel , When it grows to a certain extent , This initial tolerance value should be removed . The original conditions are as follows

     Additions :

Where is the number of pixels contained in the region , such , As the region expands , The effect of this item is getting smaller and smaller , In the end, it's almost negligible . So it's the size of the area that you can control , If ,
that , Almost every pixel becomes a separate region , If , Obviously, the whole picture will come together . therefore , The bigger it is , The larger the image after segmentation .

of course , The median can be used to deal with overshoot , But it became a NP Difficult problem , See original for proof

<> Similar in shape

As mentioned above, we use color information to cluster , Revision of similarity measures , Can be clustered into specific shapes we want . For example, we want to get a lot of long areas , Then we can use the region formed by clustering   the measure of area / Perimeter
+ Difference of brightness value   Measure the similarity between two subgraphs or two pixels . Because of the area of the strip / The perimeter will be smaller .

<> Algorithm steps

Step 1: Each pixel is calculated 8 Neighborhood or 4 Neighborhood dissimilarity .

As shown on the left , The solid line is only calculated 4 field , Add the dotted line to the calculation 8 Neighborhood , Because it is an undirected graph , From left to right , From top to bottom , Just calculate the gray line in the right figure .

Step 2:  According to the dissimilarity of edges non-decreasing array ( from small to large ) Sort it out .

Step 3:  choice

Step 4:  Merge the currently selected edges . Let the connected vertex be . If the merger conditions are met :

(1) Different regions ;

(2) The dissimilarity is not greater than the internal dissimilarity . Then execute Step 4. Otherwise, it will be executed Step 5

Step 5:  Update threshold and class label .

Update class label : Unify the class label of to the label of .

Update the dissimilarity threshold of this class to :.

be careful : The edges with small similarity are merged first , therefore , It is the largest edge of the current merged region , Namely .

Step 6:  If , Then follow the order , Select the next edge to execute Step 4, Otherwise it's over .

<> result


                  Segmentation parameters: sigma = 0.5, k= 500, min = 50.

Sigma: Firstly, the original image is denoised by Gaussian filtering ,sigma That is the Gaussian kernel

k:  Controls the size of the merged region , See above

min:  Post processing parameters , After segmentation, there will be many small areas , When the number of pixels in the region is less than min Time , Select the region with the smallest difference, that is to say .

<> Nature discussion

The results were not very good , But it has good global properties , The conclusion is interesting , Those who are interested can have a look .

The first thing to note is that , For any image , There is always a segmentation method , The result of segmentation is not too fine , It's just rough . But it's not the only one .

<> lemma

If step 4  Time ,, But there was no merger , Namely , So there must be an area that's already segmented , such as , Then the scope of the region will not increase , It will become a region in the final segmentation region .



hypothesis ,, Because the edge is in accordance with non-decreasing sort , So the remaining edges of the connection are not less than , Not the smallest side , Naturally, the rest of the side is standing on the side .

however , The original said that only one has been separated , But I think there is another situation ,  also  , Then the two districts should be divided .

<>Not Too fine

The segmentation is too fine , That is, the area that should not have been separated was cut off by the waist , But this algorithm can ensure that lovers get married , I will never do anything to break up a couple .





The method of proof to the contrary : As shown above . It shouldn't have been divided , The conditions should be met . If it's separated , Then there must be an edge
As a result, they did not merge , Well, from the previous lemma , There must be a region that becomes a part of the final segmentation result , Suppose it is A part , Go back to the time of judging this side , There must be ,, thus
, Because of the non-decreasing order , therefore A Partial sum B Part of the smallest side is , So it's inconsistent with the assumptions .

<>Not Too coarse

The segmentation is too thick , That is, the areas that should have been separated are not separated . However, this algorithm can ensure that when it is broken, it will be broken , It can't be broken .

The method of proof to the contrary : As shown above . It should have been divided , Then the conditions should be met . Suppose or not  , For connection A,B The smallest side . If the merger , because , And it is non-decreasing order , So in the judgment side
before A The region has been formed . If the segmentation is too thick , Then it is determined that the smallest edge satisfies the , Then they must merge . Conflict with conditions .

<> The influence of the order of equal weight edge processing

If there are two sides , The weight of , So when sorting , Who is at the top , Does it matter who falls behind ? The conclusion is that there is no wood .





Case1:, The connected areas are the same , Namely , It's all regions that connect , So it doesn't matter who's ahead .

Case2:, The connected areas are completely different , For example, the connection area ,, Connection area , So who comes first and who comes after , It does not affect whether to merge or not , And it doesn't affect whether to merge or not .

Case3: connect , connect

Case3-1: First , After , also , To cause to merge , Exchange the two processing order , Deal with it first , Post processing . If not merged , That doesn't affect the merger ; If the merger , So after the merger , Merge as usual .

Case3-2: First , After , also , No consolidation , Exchange the two processing order , Deal with it first , Post processing . If it is . Is that a merger , Will not make the merger ; If , That's the same thing , It doesn't matter either .

<> supplement :

<> Color picture

For color pictures , The above is about R,G,B As distance , The whole image is segmented only once , In the original text, each channel is divided once , Finally, the intersection of the results is taken , In other words, the two points in the picture should be divided into the same area , Then in the R,G,B In the results of three channel partition , They have to be in the same area all the time . The effect is better …… But his procedure uses a split .

<>Nearest Neighbor Graphs

In the previous paper, only the spatial location is used to connect the component diagram , The disadvantages are obvious , Space is not adjacent , The colors are exactly the same , So if there is a slight break in the middle, it will be divided into many parts . So the other is more equal
The strategy is to consider the two together , Map to feature space first , Reconstruction map . At this time, the connection is not necessarily 4/8 Neighborhood , Because of
Opposite side , So if we consider the connection of all edges , It's horrible ! The original text is to find each pixel 10 The nearest Euclidean distance is 10 Nearest neighbor , Construction map , of course , Another way is not to fix the number of neighbors , It's about limiting the distance .

Then the explanation of the distance within the class is intuitive , The shortest distance in a class , Then we'll take this side as the radius , A hypersphere is formed in the feature space , But I'll get to know someone else .

It is also the shortest distance between the two classes .

look for 10-NN It's also quite tiring , The original text uses approximate algorithm ANN《Approximate nearest neighbor searching》 Come and find it 10 a near neighbor , fast .

The rest is the same as above , But there's one thing I don't understand , That's the update , Take the picture above , It must have been updated with the green line , Then the meaning is no longer the shortest radius of all the points in the set , solve ?