-
Notifications
You must be signed in to change notification settings - Fork 15
/
Trie_Tutorial.txt
295 lines (230 loc) · 12.7 KB
/
Trie_Tutorial.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
Trie
====
- Introduction:
===============
A Trie, also known as a prefix tree or digital tree, is a type of search tree used to store a dynamic set or associative
array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with
that node; instead, its position in the tree defines the key with which it is associated.
* Characteristics of a Trie:
----------------------------
1. Structure: Each node represents a single character of a key. The root node is associated with an empty string.
Each edge represents a character in the key.
2. Keys: Keys are typically strings. A path from the root to a particular node corresponds to a prefix of some of
the keys.
3. Search Efficiency: Tries are highly efficient for searching and storing strings. The time complexity for
search, insert, and delete operations is O(L), where L is the length of the key.
* Key Operations:
------------------
- Insert: Adding a word involves traversing the Trie according to the characters of the word and creating nodes
as necessary.
- Search: To search for a word, traverse the Trie according to the characters of the word. If you reach the end
of the word and the corresponding node is a terminal node, the word exists in the Trie.
- Delete: Removing a word involves finding the node corresponding to the last character of the word and removing
it, adjusting the tree as necessary to maintain the structure.
* Example:
----------
Let's consider inserting the words "cat", "cap", and "dog" into a Trie:
1. Start with an empty root node.
2. Insert "cat":
- Add 'c' as a child of the root.
- Add 'a' as a child of 'c'.
- Add 't' as a child of 'a' and mark it as the end of the word.
3. Insert "cap":
- Traverse to 'c'.
- Traverse to 'a'.
- Add 'p' as a child of 'a' and mark it as the end of the word.
4. Insert "dog":
- Add 'd' as a child of the root.
- Add 'o' as a child of 'd'.
- Add 'g' as a child of 'o' and mark it as the end of the word.
The resulting Trie structure would look something like this:
(root)
/ \
c d
/ \ \
a a o
/ / \
t p g
(E) (E) (E)
Here, (E) indicates the end of a word.
* Advantages:
-------------
- Efficient Prefix Searches: Ideal for applications where many keys share common prefixes.
- Predictive Text: Useful in autocomplete features.
- Spell Checking: Can be used to implement fast spell-checking algorithms.
* Disadvantages:
----------------
- Space Consumption: Can use a lot of memory, especially if the set of keys stored is sparse.
- Complex Implementation: More complex to implement compared to other data structures like hash tables.
Tries are commonly used in scenarios where we need to handle a large set of strings, such as dictionaries, phonebooks,
and search engines.
- High-Level Pseudocode for Trie:
==================================
Here's a high-level pseudocode for a Trie that captures the essential logic of insert, search, and delete operations without
going into the specifics of class definitions and internal structure details.
* Trie Operations:
------------------
1. Initialize Trie
function initializeTrie():
root = createNode()
2. Insert a Word
function insert(word):
node = root
for each character in word:
if character is not in node's children:
add a new node for character in node's children
move to the child node corresponding to character
mark the current node as end of word
3. Search for a Word
function search(word):
node = root
for each character in word:
if character is not in node's children:
return False
move to the child node corresponding to character
return True if current node is end of word, else False
4. Delete a Word
function delete(word):
function deleteRecursively(node, word, depth):
if depth == length of word:
if node is not end of word:
return False
unmark the node as end of word
return True if node has no children
character = word[depth]
if character is not in node's children:
return False
shouldDeleteChild = deleteRecursively(node's child corresponding to character, word, depth + 1)
if shouldDeleteChild:
remove character from node's children
return True if node has no children and is not end of another word
return False
deleteRecursively(root, word, 0)
* Explanation:
--------------
1. initializeTrie:
- Creates the root node of the Trie.
2. insert(word):
- Starts from the root node.
- For each character in the word, if it doesn't exist in the current node's children, add a new node.
- Move to the corresponding child node for each character.
- Mark the last node as the end of the word after processing all characters.
3. search(word):
- Starts from the root node.
- For each character in the word, check if it exists in the current node's children.
- If a character is missing, return `False`.
- After processing all characters, return `True` if the current node is marked as the end of a word.
4. delete(word):
- Uses a recursive helper function `deleteRecursively` to delete a word from the Trie.
- The helper function checks if the current depth matches the length of the word.
- If at the end of the word, unmark the node as the end of the word and check if the node has no children.
- If not at the end, continue to the next character and call the function recursively.
- If a child node should be deleted, remove the character from the current node's children.
- Return whether the current node should be deleted based on the presence of children and end-of-word status.
This high-level pseudocode focuses on the core logic of the Trie operations, providing a clear and concise overview
without delving into the implementation specifics.
- Trie Example: LeetCode Problem 1268. Search Suggestions System
=================================================================
To solve this problem, we can use a Trie data structure to efficiently store and search for product names with a common prefix.
We'll implement the Trie with methods to insert words and to get suggestions based on a prefix. For each character of `searchWord`,
we'll retrieve at most three lexicographically smallest product names that have the current prefix. (ChatGPT coded the solution 🤖)
* Here's a step-by-step approach and the corresponding Java code:
-----------------------------------------------------------------
1. Define the Trie Node: Each node contains an array for child nodes and a list of suggestions.
2. Insert Products into Trie: Insert each product into the Trie, maintaining the list of suggestions at each node.
3. Search for Suggestions: For each prefix formed by the characters of `searchWord`, search the Trie to get suggestions.
* Trie Node Definition:
-----------------------
class TrieNode {
TrieNode[] children;
List<String> suggestions;
public TrieNode() {
children = new TrieNode[26]; // For each letter in the alphabet
suggestions = new ArrayList<>();
}
}
* Trie Definition:
------------------
class Trie {
private TrieNode root;
public Trie() {
root = new TrieNode();
}
public void insert(String word) {
TrieNode node = root;
for (char c : word.toCharArray()) {
int index = c - 'a';
if (node.children[index] == null) {
node.children[index] = new TrieNode();
}
node = node.children[index];
addSuggestion(node.suggestions, word);
}
}
private void addSuggestion(List<String> suggestions, String word) {
suggestions.add(word);
Collections.sort(suggestions);
if (suggestions.size() > 3) {
suggestions.remove(suggestions.size() - 1); // Keep only the 3 smallest lexicographically
}
}
public List<String> getSuggestions(String prefix) {
TrieNode node = root;
for (char c : prefix.toCharArray()) {
int index = c - 'a';
if (node.children[index] == null) {
return new ArrayList<>();
}
node = node.children[index];
}
return node.suggestions;
}
}
* Solution Class:
-----------------
import java.util.*;
class Solution {
public List<List<String>> suggestedProducts(String[] products, String searchWord) {
Trie trie = new Trie();
for (String product : products) {
trie.insert(product);
}
List<List<String>> result = new ArrayList<>();
String prefix = "";
for (char c : searchWord.toCharArray()) {
prefix += c;
result.add(trie.getSuggestions(prefix));
}
return result;
}
}
* Explanation:
--------------
1. TrieNode Class:
- children: An array representing 26 possible children nodes for each character (a to z).
- suggestions: A list to keep track of up to three lexicographically smallest products with the current prefix.
2. Trie Class:
- insert(String word): Inserts a word into the Trie, updating the suggestions list at each node.
- addSuggestion(List<String> suggestions, String word): Adds a word to the suggestions list and ensures the
list contains at most three lexicographically smallest words.
- getSuggestions(String prefix): Returns the list of suggestions for a given prefix by traversing the Trie.
3. Solution Class:
- suggestedProducts(String[] products, String searchWord): Initializes the Trie with the given products, then
iteratively forms prefixes from `searchWord` and retrieves suggestions for each prefix.
* Time and Space Complexity Analysis:
-------------------------------------
* Building the Trie:
--------------------
- Time Complexity: Inserting each product into the Trie takes O(L) time, where L is the average length of the
product names. If there are N products, the total time to build the Trie is O(N * L).
- Space Complexity: The space complexity for storing the Trie is O(T), where T is the total number of characters
in all the products.
* Searching for Suggestions:
----------------------------
- Time Complexity: For each character in the search word, we traverse the Trie from the root to the current character,
which takes O(L) time in the worst case. After reaching the node, performing a DFS to collect suggestions takes O(3)
time for each node visited, since we limit suggestions to three. Thus, for a search word of length M, the time
complexity is O(M * (L + 3)).
- Space Complexity: The space complexity for the DFS stack is O(L), where L is the depth of the Trie. Additionally,
storing the results requires O(M * 3), as we store up to three suggestions for each prefix of the search word.
This approach efficiently handles the problem requirements using a Trie for prefix-based search and suggestion management.