Trie tree( Dictionary tree)

One, Introduce

What is a dictionary for? Search word.

The dictionary tree also plays the role of searching. What to look for? Word.

Look at the following questions:

1, Given Word andm Asking about, Ask one word at a time, Answer if the word appears in the word list.

answer: simple!map, Not of imposing stature but strong and capable.

good. Next

2, Given Word andm Asking about, One prefix at a time, How many words are prefixes for answering questions.

answer:map, Take each word apart.


This requires a high-level data structure——Trie tree( Dictionary tree)

Two, principle

In this article, Suppose all words are made up of lowercase letters only

Yescat,cash,app,apple,aply,ok Build a dictionary tree, After completion, as shown in the figure below

From this we can see that:

1, The dictionary tree represents letters with edges

2, Word common prefix nodes with the same prefix, Then we can get that each node has at most26 Child node( When a word contains only lowercase letters)

3, The root node of the whole tree is empty. Why?? Easy to insert and find, This will be explained later.

4, Each word ends with a special character, Used in figures‘$’, Then from the root node to any one‘$’ All the letters on the edge indicate a word.

Three, basic operation

A,insert, Insert a word

1. thinking

  It can be seen directly from the figure, Scan the word from left to right, If the letter does not appear under the corresponding root node, Just insert the letter; Otherwise, go down the dictionary tree, Look at the next letter of the word.

  That raises a question: Where to insert? The computer will not choose its own location, We need to give it a location, You need to number each letter.

  Let's set the arraytrie[i][j]=k, Indicates Noi Of nodesj The children are numberedk Node.

  What do you mean?

  Here you are2 Species number, One isi,k Indicates the location number of the node, This is relative to the whole tree; The other isj, Representation nodei The firstj Children, This is the relative nodei In terms of.

  Incomprehension? Look at the picture

  Or word?cat,cash,app,apple,aply,ok 

  We'll number them in the first order, Red
Indicates number result. Because it was entered firstcat, thereforec,a,t Namely1,2,3, And the input iscash, becausec,a Is a common prefix, So froms Start editing,s yes4, And so on.

Note that the numbers of the same letters may be different here


  Second number, Number of relative nodes, Purple indicates number result.

Because each node has at most26 Child node, We can follow their dictionary order from0——25 number, That's theirASCLL code-a OfASCLL code.

Note that the numbers of the same letters are the same here

  In fact, the children of each node should0 Compile to——25, But you'll find a lot of things that don't work at all. For example, the root node in the figure above should be separated26 Fork. Save space, Which points to use.

  What's the use of numbering like this?

Back to arraytrie[i][j]=k.  arraytrie[i][j]=k, Indicates Noi Of nodesj The children are numberedk Node.

Then the second number isj, The first number isi,k

2, Code
void insert()// Insert words { len=strlen(s);// Words Length root=0;// The root node number is0 for(int i=0
;i<len;i++) { int id=s[i]-'a';// Second number if(!trie[root][id])// If you didn'troot reachid Prefix
trie[root][id]=++tot;// insert,tot Is the first number root=trie[root][id];// Go down the dictionary tree } }
B,search, lookup

There are many ways to find, You can find a prefix, You can also find the whole word.

Again, let's take looking up whether a prefix has appeared as an example

1, thinking

  Scan each letter from left to right, Look down the dictionary tree, Can find the letter, Go down, Otherwise, end the search, There is no such prefix; Prefix scanning finished, Indicates that there is this prefix.

2, Code
bool find() { len=strlen(s); root=0;// Starting from the root node for(int i=0;s[i];i++) { int
x=s[i]-'a';// if(trie[root][x]==0) return false;// withroot For head nodex Letter does not exist, Return0
root=trie[root][x];// Prepare for next letter query, Go down } return true;// Eureka }
3, If you are looking for a word, We usebool variable v[i] Representation nodei Is it a sign of the end of a word.

    Then finallyreturn Yes.v[root], So after inserting each word in the insert operation is, For the last letter of the wordv[i] Set astrue, Everything elsefalse

4, If it's the number of times the query prefix appears, That's onesum[], Indicated positioni Number of visits,

    Then finallyreturn Yes.sum[root], Every node accessed in the insert operation, We have to let himsum++

    Here, the number of prefixes is marked at the position after the last letter of the prefix.

  such as: prefixabc The number of occurrences is marked atc In the next position,


Four, Complete code

1, Whether the query appears
/* trie tree Storage mode of: Save letters on edge, The node of an edge connects the letters to it
trie[rt][x]=tot:rt Is the last node number,x It's the letter.tot Is the next node number*/ #include<cstdio> #include<iostream>
#include<algorithm> #include<cstring> #define maxn 2000010 using namespace std;
int tot=1,n; int trie[maxn][26]; //bool isw[maxn]; To query the whole word void insert(char *s,
int rt) { for(int i=0;s[i];i++) { int x=s[i]-'a'; if(trie[rt][x]==0)//
The inserted letter does not appear at the same node before { trie[rt][x]=++tot;// Insert letters in a new position, Otherwise, we will not deal with it } rt=trie[rt][x];
// Prepare for the insertion of the next letter } /*isw[rt]=true; Mark the end of the last letter of the word, Used when querying the whole word*/ } bool find(char *s,
int rt) { for(int i=0;s[i];i++) { int x=s[i]-'a'; if(trie[rt][x]==0)return false
;// withrt For head nodex Letter does not exist, Return0 rt=trie[rt][x];// Prepare for next letter query } return true; //
When querying the whole word, Shouldreturn isw[rt] } char s[22]; int main() { tot=0; int rt=1; scanf("%d"
,&n); for(int i=1;i<=n;i++) { cin>>s; insert(s,rt); } scanf("%d",&n); for(int i=
1;i<=n;i++) { cin>>s; if(find(s,rt))printf("YES\n"); else printf("NO\n"); }
return 0; } Array simulation
2, Number of occurrences of query prefix
#include<iostream> #include<cstring> #include<cstdio> #include<algorithm> using
namespace std; int trie[400001][26],len,root,tot,sum[400001]; bool p; int n,m;
char s[11]; void insert() { len=strlen(s); root=0; for(int i=0;i<len;i++) { int
id=s[i]-'a'; if(!trie[root][id]) trie[root][id]=++tot; sum[trie[root][id]]++;//
Prefix preservation root=trie[root][id]; } } int search() { root=0; len=strlen(s); for(int i=0
;i<len;i++) { int id=s[i]-'a'; if(!trie[root][id]) return 0; root=
trie[root][id]; }//root After this cycle, it becomes the position of the last letter of the prefix return sum[root]; } int main() {
scanf("%d",&n); for(int i=1;i<=n;i++) { cin>>s; insert(); } scanf("%d",&m); for(
int i=1;i<=m;i++) { cin>>s; printf("%d\n",search()); } } Array simulation Array simulation
#include<cstdio> #include<cstring> #include<iostream> #include<algorithm> using
namespace std; char s[11]; int n,m; bool p; struct node { int count; node *
next[26]; }*root; node * build() { node * k=new(node); k->count=0; memset(k
->next,0,sizeof(k->next)); return k; } void insert() { node * r=root; char *
word=s; while(*word) { int id=*word-'a'; if(r->next[id]==NULL) r->next[id]=
build(); r=r->next[id]; r->count++; word++; } } int search() { node * r=root;
char * word=s; while(*word) { int id=*word-'a'; r=r->next[id]; if(r==NULL)
return 0; word++; } return r->count; } int main() { root=build(); scanf("%d",&
n);for(int i=1;i<=n;i++) { cin>>s; insert(); } scanf("%d",&m); for(int i=1
;i<=m;i++) { cin>>s; printf("%d\n",search()); } } Pointer writing
Five, Template problem

hud 1251 Statistical puzzles

codevs 4189 Dictionaries <>