hashmap Is a very common data structure, It is easy to use, Next, I will give you a deep analysis of this data structure, Let us know when we use it, I know why.


First, From the most basic point of view, Let's meet firstmap What is it. When we write programs, we often encounter data retrieval and other operations, For small programs with hundreds of data, The storage method or retrieval strategy of the data has little impact, But for big data, Efficiency will be far from good. Let's talk about it.

1. Linear search:

Linear retrieval is the most straightforward method, Go through all the data, And find the data you need. Its corresponding data structure is array, Linear structure such as linked list, This method is extremely inefficient for big data, Its time complexity isO(n).

2. Two point search:

Binary search is an improvement of linear search, For example, for【1,2,3,4,5,6,7,8】, I want to search for a number( Assuming that2), I'll first compare this number with4( Generally, the number of selected digits is better) compare, less than4 While in4 Left side【1,2,3】 Search in, Again with2 compare, Equal, We found it, The advantage of this method is that it can save a lot of unnecessary searches, Find only half of the elements in the collection at a time. Its time complexity isO(logn). But there are limits, His number arrangement itself needs to be orderly.

3.Hash Find in table:

Okay, The key is coming.Hash Watch comes on stage, This is a time complexity ofO(1) Search, That is, no matter how much data you have, you only need to check it once to find the target data. Isn't it amazing?? Well, it's actually retarded. Let's see the picture below.

You can see that the value in this array is equal to its subscript, For example, I want to save11, I'll keep ita[11] inside, So when I want to find a number, I can directly correspond to its subscript. It's actually a way of sacrificing space for time, This will take up a lot of memory, But the retrieval speed is very fast, You only need to search once to find the target data.

4.Hash Table changes

Look at the topHash You must want to ask, If I save only one number10000, Then I don't want to exista[10000], Isn't the rest of the space wasted, OK, Nonexistent.Hash The table already has its solution, That isHash function.Hash The essence of a watch is that it can passvalue The feature itself locates to the element subscript of the search set, To quickly find. GeneralHash Function is: Number to deposit
mod( Seek surplus) Hash Array length. For example, for the length above9 Array,12 The location is12 mod 9=3, Namely existencea3
<>, In this way, large data can be placed.

5.Hash Conflict resolution strategy

Read the above explanation, You must have found a problem, The address from the remainder may be the same. This is what we callHash conflict, If the data volume is large andHash The bucket is smaller. This kind of conflict is very serious. We have taken the following measures to resolve the conflict.

We can see12 and0 's location conflicts, Then we change every element of the array into a chain header, Conflicting elements are placed in the linked list, In this way, after finding the corresponding chain header, it will follow the chain list, As for why to use linked list, To save space, Linked list is not stored continuously in memory, So we can make full use of memory.

Java ofHashMap

It says so much, That's what we're talking about todayHashMap What does it matter?? All right, basin friends, no side, Get to the point. We knowHashMap The values in arekey,value Right? Actually, the storage here is very similar to the one above,key Will be mapped to the address where the data is located, andvalue It's in the linked list with this address as the head, This kind of data structure can be acquired very quickly. But the problem here is that ifhash Pail is smaller, Large amount of data, It will lead to a very long list. For example, the one above is11 I want to put1000 Number, whetherHash How to refine functions, The linked list at the back will be very long, suchHash The advantages of the watch no longer exist, On the contrary, it tends to be linear retrieval. Okay, Red and black trees show up.

Red black tree

stayjdk1.8 After version,java YesHashMap Improvements have been made. The length of linked list is greater than8 When, Store the following data in the red black tree, To speed up retrieval, Let's talk about the red and black trees.

avl tree

To understand the red black tree, First we need to know.avl tree, Need to knowavl tree, The first thing to know is binary tree, It's very simple, A binary tree is one or two child nodes under each parent node, Roughly as shown in the figure below.

When we store data in a binary tree, we put the number larger than the parent node on the right node, Place a number smaller than the parent on the left node, In this way, we only need to compare a number with the parent node when looking for it, Large, go to the right and call recursively, Small, then left recursion. But it has some shortcomings, If it's bad luck, my data is in order, such as【1,2,3,4,5,6,7】, This will lead to tree imbalance, Binary trees will degenerate into linked lists. So we launchedavl tree.

avl Tree is balance tree, He improved the binary tree, Every time we insert a node, It must be ensured that the height difference between the left subtree and the right subtree corresponding to each node does not exceed1. If it exceeds, adjust the balance, The specific balancing operation is not mentioned here, Four operations—— Levo, Left and right, Right and left. Finally, the height of the left and right sides of the binary tree is similar, In this way, when we search, we can search by binary search, It will not degenerate into a linked list.

Two or three tree

There are a lot of articles about red black trees on the Internet, There are all kinds of explanations, But bloggers like to put black and red trees together with two or three trees. Let's take a look at the two or three trees.

notes: This picture is from Baidu

Actually, it's easy to understand, The difference between a binary tree and a binary tree is that it has two nodes and three nodes. There are two sub nodes under two nodes, Two nodes can hold a value, There are three sub nodes under the three nodes, Three nodes can hold two values. Now let's talk about the construction of two or three numbers.

notes: The picture still comes from Baidu, Bloggers' pictures are rubbish.

In fact, the construction of two or three trees is very simple, As shown in the figure, In the pictureM Node is a two node,M SinisterEJ Node is a three node. Still big data on the right, Small data to the left. At this time, we will weight the tree if the number can be directly put into two nodes, Go straight in, But if it happens to be in three nodes, Like in the picture,Z It's just going to beSX in. Then we need to split the node into two nodes, And the number in the middle is referred to the parent node, Like in the pictureX Put it in.R Side. Of course, when the child node is mentioned to the parent node, the number of the parent nodes exceeds two, Just keep going up, Until it's satisfied.

Red black tree

The red black tree is very similar to the two or three trees, Basically, it's a deformation of two or three trees.
The traditional definition of red black tree is to meet the following five characteristics:
(1) Every node or black, Or red.
(2) The root node is black.
(3) Each leaf node(NIL) It's black.. [ Be careful: Here is the leaf node, Empty space(NIL orNULL) Leaf node of!]
(4) If a node is red, Then its child nodes must be black.
(5) All paths from a node to its children contain the same number of black nodes.
Its characteristic is to add color attribute to each node of the number, Balance by color transformation and node rotation during insertion. In fact, bloggers don't like the above definition very much, Another perspective is to compare it with two or three trees.

Of course, the picture above was also searched.
Red black trees can also be described as:
⑴ Red links are left links.
⑵ No node is connected to two red links at the same time.
⑶ The tree is perfectly black balanced, That is to say, the number of black links on the path of any empty link to the root node is the same.
Here, the connection between nodes is divided into red connection and black connection, Replaced the definition of red node and black node( The essence is the same), Define previous black height equality as equal number of black connections. More intuitive.
As shown in the figure, In fact, every step of the operation of the red black tree corresponds to the operation of the two or three trees, If it's two nodes, it's black connection, If there are three nodes, the two numbers are connected in red.

Advantages of Mangrove

Red and black treesavl tree, The efficiency is almost the same when searching, It's all about finding two points by balancing. But the operation efficiency of insertion and deletion is greatly improved. The red and black trees are not likeavl Tree like pursuit of absolute balance, He allows for a small partial imbalance, This has little effect on efficiency, But a lot of unnecessary balancing operations are omitted,avl Tree tuning balance sometimes costs a lot, So it's not as efficient as the red black tree, In many places, there are red and black trees at the bottom~


HashMap It's a structure of linked list and red black tree, In this way, the memory utilization of linked list and the efficient retrieval of red black tree are used, It is very kind.happy Data structure of.

Small welfare at the end of the article

I used to useC++ Handwrittenavl Implementation of tree, For those who are a little confused about the course design of data structure in sophomore year, please refer to.

Next notice