Algorithms R OBERT S EDGEWICK | K EVIN W AYNE 3.1 S YMBOL T ABLES ‣ API ‣ elementary implementations Algorithms F O U R T H E D I T I O N R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ ordered operations Data structures “ Smart data structures and dumb code works a lot better than the other way around. ” – Eric S. Raymond 2 3.1 S YMBOL T ABLES ‣ API ‣ elementary implementations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ ordered operations Symbol tables Key-value pair abstraction. ・Insert a value with specified key. ・Given a key, search for the corresponding value. Ex. DNS lookup. ・Insert domain name with specified IP address. ・Given domain name, find corresponding IP address. domain name IP address www.cs.princeton.edu 128.112.136.11 www.princeton.edu 128.112.128.15 www.yale.edu 130.132.143.21 www.harvard.edu 128.103.060.55 www.simpsons.com 209.052.165.60 key value 4 Symbol table applications application purpose of search key value dictionary find definition word definition book index find relevant pages term list of page numbers file share find song to download name of song computer ID financial account process transactions account number transaction details web search find relevant web pages keyword list of page names compiler find properties of variables variable name type and value routing table route Internet packets destination best route DNS find IP address domain name IP address reverse DNS find domain name IP address domain name genomics find markers DNA string known positions file system find file on disk filename location on disk 5 Symbol tables: context Also known as: maps, dictionaries, associative arrays. Generalizes arrays. Keys need not be between 0 and N – 1. Language support. ・External libraries: C, VisualBasic, Standard ML, bash, ... ・Built-in libraries: Java, C#, C++, Scala, ... ・Built-in to language: Awk, Perl, PHP, Tcl, JavaScript, Python, Ruby, Lua. every array is an associative array every object is an associative array table is the only primitive data structure hasNiceSyntaxForAssociativeArrays["Python"] = True hasNiceSyntaxForAssociativeArrays["Java"] = False legal Python code 6 Basic symbol table API Associative array abstraction. Associate one value with each key. public class ST<Key, Value> ST() void put(Key key, Value val) Value get(Key key) boolean contains(Key key) void delete(Key key) boolean isEmpty() int size() Iterable<Key> keys() create an empty symbol table put key-value pair into the table value paired with key a[key] = val; a[key] is there a value paired with key? remove key (and its value) from table is the table empty? number of key-value pairs in the table all the keys in the table 7 Conventions ・Values are not null. ・Method get() returns null if key not present. ・Method put() overwrites old value with new value. Java allows null value Intended consequences. ・Easy to implement contains(). public boolean contains(Key key) { return get(key) != null; } ・Can implement lazy version of delete(). public void delete(Key key) { put(key, null); } 8 Keys and values Value type. Any generic type. specify Comparable in API. Key type: several natural assumptions. ・Assume keys are Comparable, use compareTo(). ・Assume keys are any generic type, use equals() to test equality. ・Assume keys are any generic type, use equals() to test equality; use hashCode() to scramble key. built-in to Java (stay tuned) Best practices. Use immutable types for symbol table keys. ・Immutable in Java: Integer, Double, String, java.io.File, … ・Mutable in Java: StringBuilder, java.net.URL, arrays, ... 9 Equality test All Java classes inherit a method equals(). Java requirements. For any references x, y and z: ・Reflexive: ・Symmetric: ・Transitive: ・Non-null: x.equals(x) is true. x.equals(y) iff y.equals(x). equivalence relation if x.equals(y) and y.equals(z), then x.equals(z). x.equals(null) is false. do x and y refer to the same object? Default implementation. (x == y) Customized implementations. Integer, Double, String, java.io.File, … User-defined implementations. Some care needed. 10 Implementing equals for user-defined types Seems easy. public class Date implements Comparable<Date> { private final int month; private final int day; private final int year; ... public boolean equals(Date that) { if (this.day != that.day ) return false; if (this.month != that.month) return false; if (this.year != that.year ) return false; return true; check that all significant fields are the same } } 11 Implementing equals for user-defined types Seems easy, but requires some care. typically unsafe to use equals() with inheritance (would violate symmetry) public final class Date implements Comparable<Date> { private final int month; private final int day; private final int year; ... public boolean equals(Object y) { if (y == this) return true; must be Object. Why? Experts still debate. optimize for true object equality if (y == null) return false; check for null if (y.getClass() != this.getClass()) return false; objects must be in the same class (religion: getClass() vs. instanceof) Date that = (Date) y; if (this.day != that.day ) return false; if (this.month != that.month) return false; if (this.year != that.year ) return false; return true; cast is guaranteed to succeed check that all significant fields are the same } } 12 Equals design "Standard" recipe for user-defined types. ・Optimization for reference equality. ・Check against null. ・Check that two objects are of the same type; cast. ・Compare each significant field: – if field is a primitive type, use == but use Double.compare() with double (to deal with -0.0 and NaN) – if field is an object, use equals() apply rule recursively – if field is an array, apply to each entry can use Arrays.deepEquals(a, b) but not a.equals(b) Best practices. e.g., cached Manhattan distance ・No need to use calculated fields that depend on other fields. ・Compare fields mostly likely to differ first. ・Make compareTo() consistent with equals(). x.equals(y) if and only if (x.compareTo(y) == 0) 13 ST test client for traces Build ST by associating value i with ith string from standard input. public static void main(String[] args) { ST<String, Integer> st = new ST<String, Integer>(); for (int i = 0; !StdIn.isEmpty(); i++) keys S E A R C H { values 0 1 2 3 4 5 String key = StdIn.readString(); st.put(key, i); output for basic symbol table } (one possibility) for (String s : st.keys()) L 11 StdOut.println(s + " " + st.get(s)); } P 10 M X keys S E A R C H E X A M values 0 1 2 3 4 5 6 7 8 9 output for basic symbol table (one possibility) H C P L ER 10 11 12A E S output for ordered symbol table A 8 9 7 5 4 3 8 12 0 E X A M P L 6 7 8 9 10 11 1 output outputfor ordered symbol table A 8 C E H L M P R S 4 12 5 11 9 10 3 0 X 7 Keys, values, and output for test client 14 ST test client for analysis Frequency counter. Read a sequence of strings from standard input and print out one that occurs with highest frequency. % more it was it was it was it was it was it was it was it was it was it was tinyTale.txt the best of times the worst of times the age of wisdom the age of foolishness the epoch of belief the epoch of incredulity the season of light the season of darkness the spring of hope the winter of despair % java FrequencyCounter 1 < tinyTale.txt it 10 tiny example (60 words, 20 distinct) % java FrequencyCounter 8 < tale.txt business 122 real example (135,635 words, 10,769 distinct) % java FrequencyCounter 10 < leipzig1M.txt government 24763 real example (21,191,455 words, 534,580 distinct) 15 Frequency counter implementation public class FrequencyCounter { public static void main(String[] args) { int minlen = Integer.parseInt(args[0]); ST<String, Integer> st = new ST<String, Integer>(); while (!StdIn.isEmpty()) { String word = StdIn.readString(); ignore short strings if (word.length() < minlen) continue; if (!st.contains(word)) st.put(word, 1); else st.put(word, st.get(word) + 1); } String max = ""; st.put(max, 0); for (String word : st.keys()) if (st.get(word) > st.get(max)) max = word; StdOut.println(max + " " + st.get(max)); } } create ST read string and update frequency print a string with max freq 16 3.1 S YMBOL T ABLES ‣ API ‣ elementary implementations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ ordered operations Sequential search in a linked list Data structure. Maintain an (unordered) linked list of key-value pairs. Search. Scan through all keys until find a match. Insert. Scan through all keys until find a match; if no match add to front. key value first S 0 S 0 red nodes are new E 1 E 1 S 0 A 2 A 2 E 1 S 0 R 3 R 3 A 2 E 1 S 0 C 4 C 4 R 3 A 2 E 1 S 0 H 5 H 5 C 4 R 3 A 2 E 1 S 0 E 6 H 5 C 4 R 3 A 2 E 6 S 0 X 7 X 7 H 5 C 4 R 3 A 2 E 6 S 0 A 8 X 7 H 5 C 4 R 3 A 8 E 6 S 0 M 9 M 9 X 7 H 5 C 4 R 3 A 8 E 6 gray nodes are untouched S 0 P 10 P 10 M 9 X 7 H 5 C 4 R 3 A 8 E 6 S 0 L 11 L 11 P 10 M 9 X 7 H 5 C 4 R 3 A 8 E 6 S 0 E 12 L 11 P 10 M 9 X 7 H 5 C 4 R 3 A 8 E 12 S 0 black nodes are accessed in search circled entries are changed values Trace of linked-list ST implementation for standard indexing client 18 Elementary ST implementations: summary guarantee average case operations on keys implementation sequential search (unordered list) search insert search hit insert N N N N equals() Challenge. Efficient implementations of both search and insert. 19 Binary search in an ordered array Data structure. Maintain an ordered array of key-value pairs. Rank helper function. How many keys < k ? keys[] 0 1 2 3 4 5 6 7 8 9 successful search for P keys[] lo hi m A C E H L M P R S X successful search for for P entries in black 0 4 M 5 P 6 7 9 successful 0search 9 4P A 1 C 2 E 3 H keys[] L R 8 S X are a[lo..hi] lo hi m 5 9 7 A 0 C 1 E 2 H 3 L 4 M 5 P 6 R 7 S 8 X 9 successful search for P entries in black 0 A R S S X 5 9 6 4 5 A C C E E H H L L M M P P R X are a[lo..hi] lo hi m entry in red is a[m] 5 9 7 A C E H L M P R S X entries in black 6 A 0 6 9 6 4 A C C E E H H L L M M P P R R S S X X are a[lo..hi] 5 search 6 5 A C E H L M P R S exits X keys[m] = P: return 6 unsuccessful 5 9 7for Q A C E H L M P R loop S X with entry in red is a[m] 6 6 6 A C E H L M P R S X lo 5 hi 6 m5 A C E H L M P R S X exits with keys[m] P: return 6 entry in red is=a[m] unsuccessful search4for Q A C E H L M P R loop 0 6 9 6 6 A C E H L M P R S S X X lo 5 hi 9 form7Q A C E H L M P R loop S exits X with keys[m] = P: return 6 unsuccessful search unsuccessful search for Q 0 A R S S X 5 9 6 4 5 A C C E E H H L L M M P P R X lo hi m 5 9 7 A C E H L M P R S X 7 A 0 6 9 6 4 A C C E E H H L L M M P P R R S S X X 5 6 5 A C E H L M P R S X 5 9 7 loop exits A with C Elo >H hi: L return M P7 R S X 7 6 6 A C E H L M P R S X 5 6 5 A C E H L M P R S X Trace of binary searchreturn for rank in an ordered array 7 6 6 loop exits A with C Elo >H hi: L M P7 R S X loop exitsof with lo >search hi: return Trace binary for rank7 in an ordered array Trace of binary search for rank in an ordered array 20 Binary search: Java implementation public Value get(Key key) { if (isEmpty()) return null; int i = rank(key); if (i < N && keys[i].compareTo(key) == 0) return vals[i]; else return null; } number of keys < key private int rank(Key key) { int lo = 0, hi = N-1; while (lo <= hi) { int mid = lo + (hi - lo) / 2; int cmp = key.compareTo(keys[mid]); if (cmp < 0) hi = mid - 1; else if (cmp > 0) lo = mid + 1; else if (cmp == 0) return mid; } return lo; } 21 Elementary symbol tables: quiz 1 Implementing binary search was A. Easier than I thought. B. About what I expected. C. Harder than I thought. D. Much harder than I thought. E. I don't know. (Well, you should!) 22 FIND THE FIRST 1 Problem. Given an array with all 0s in the beginning and all 1s at the end, find the index in the array where the 1s start. input 0 0 0 0 0 … 0 0 0 0 0 1 1 1 … 1 1 1 Variant 1. You are given the length of the array. Variant 2. You are not given the length of the array. 23 Binary search: trace of standard indexing client Problem. To insert, need to shift all greater keys over. key value 0 1 2 keys[] 3 4 5 6 7 8 9 N 0 1 0 1 2 2 2 0 1 1 4 0 3 1 0 3 0 2 2 2 8 8 4 4 4 4 4 1 6 6 6 6 5 5 5 5 5 3 3 3 3 9 8 8 8 8 S E A R C 0 1 2 3 4 S E A A A S E E C S R E S R H E X A M 5 6 7 8 9 A A A A A C C C C C E E E E E H H H H H 1 2 entries in red 3 were inserted 4 S entries in gray 5 did not move 6 R S R S 6 R S X 7 R S X 7 M R S X 8 P L E 10 11 12 A A A C C C E E E H H H M L L P M M R P P S R R X S S X X A C E H L M P R S X 9 10 10 2 vals[] 3 4 5 6 7 8 9 entries in black moved to the right 0 0 0 0 3 circled entries are changed values 7 7 0 7 4 6 4 6 4 12 5 9 10 3 5 11 9 10 5 11 9 10 0 3 3 7 0 0 7 7 4 12 5 11 3 0 7 9 10 Trace of ordered-array ST implementation for standard indexing client 24 Elementary ST implementations: summary guarantee average case operations on keys implementation search insert search hit insert sequential search (unordered list) N N N N equals() binary search (ordered array) log N N log N N compareTo() Challenge. Efficient implementations of both search and insert. 25 3.1 S YMBOL T ABLES ‣ API ‣ elementary implementations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ ordered operations Examples of ordered symbol table API min() get(09:00:13) floor(09:05:00) select(7) keys(09:15:00, 09:25:00) ceiling(09:30:00) max() keys values 09:00:00 09:00:03 09:00:13 09:00:59 09:01:10 09:03:13 09:10:11 09:10:25 09:14:25 09:19:32 09:19:46 09:21:05 09:22:43 09:22:54 09:25:52 09:35:21 09:36:14 09:37:44 Chicago Phoenix Houston Chicago Houston Chicago Seattle Seattle Phoenix Chicago Chicago Chicago Seattle Seattle Chicago Chicago Seattle Phoenix size(09:15:00, 09:25:00) is 5 rank(09:10:25) is 7 Examples of ordered symbol-table operations 27 Ordered symbol table API public class ST<Key extends Comparable<Key>, Value> ... Key min() smallest key Key max() largest key Key floor(Key key) Key ceiling(Key key) largest key less than or equal to key smallest key greater than or equal to key int rank(Key key) number of keys less than key Key select(int k) key of rank k void deleteMin() delete smallest key void deleteMax() delete largest key int size(Key lo, Key hi) Iterable<Key> keys() Iterable<Key> keys(Key lo, Key hi) number of keys between lo and hi all keys, in sorted order keys between lo and hi, in sorted order 28 Binary search: ordered symbol table operations summary sequential search binary search search N log N insert N N min / max N 1 floor / ceiling N log N rank N log N select N 1 ordered iteration N log N N order of growth of the running time for ordered symbol table operations 29 Algorithms R OBERT S EDGEWICK | K EVIN W AYNE 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms F O U R T H E D I T I O N R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion Binary search trees Definition. A BST is a binary tree in symmetric order. root a left link A binary tree is either: a subtree ・Empty. ・Two disjoint binary trees (left and right). right child of root null links Anatomy of a binary tree Symmetric order. Each node has a key, and every node’s key is: ・ ・Smaller than all keys in its right subtree. Larger than all keys in its left subtree. parent of A and R left link of E S E X A R C keys smaller than E key H 9 value associated with R keys larger than E Anatomy of a binary search tree 3 Binary search tree demo Search. If less, go left; if greater, go right; if equal, search hit. successful search for H S E X A R C H M 4 Binary search tree demo Insert. If less, go left; if greater, go right; if null, insert. insert G S E X A R C H G M 5 BST representation in Java Java definition. A BST is a reference to a root Node. A Node is composed of four fields: ・A Key and a Value. ・A reference to the left and right subtree. smaller keys larger keys private class Node { private Key key; private Value val; private Node left, right; public Node(Key key, Value val) { this.key = key; this.val = val; } } BST Node key left BST with smaller keys val right BST with larger keys Binary search tree Key and Value are generic types; Key is Comparable 6 BST implementation (skeleton) public class BST<Key extends Comparable<Key>, Value> { private Node root; private class Node { /* see previous slide */ root of BST } public void put(Key key, Value val) { /* see next slides */ } public Value get(Key key) { /* see next slides */ } public void delete(Key key) { /* see next slides */ } public Iterable<Key> iterator() { /* see next slides */ } } 7 BST search: Java implementation Get. Return value corresponding to given key, or null if no such key. public Value get(Key key) { Node x = root; while (x != null) { int cmp = key.compareTo(x.key); if (cmp < 0) x = x.left; else if (cmp > 0) x = x.right; else if (cmp == 0) return x.val; } return null; } Cost. Number of compares = 1 + depth of node. 8 BST insert Put. Associate value with key. inserting L S E X A H C Search for key, then two cases: ・Key in tree ⇒ reset value. ・Key not in tree ⇒ add new node. R M P search for L ends at this null link S E X A R H C M create new node P L S E X A R C H M reset links on the way up L P Insertion into a BST 9 BST insert: Java implementation Put. Associate value with key. public void put(Key key, Value val) { root = put(root, key, val); } concise, but tricky, recursive code; read carefully! private Node put(Node x, Key key, Value val) { if (x == null) return new Node(key, val); int cmp = key.compareTo(x.key); if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; return x; } Cost. Number of compares = 1 + depth of node. 10 S C Tree shape R E A X ・Many BSTs correspond to same set of keys. typical case case search/insert = 1 + depth of node. ・Number of compares for best H E S C best case R X A typical case X A R H C E H R H S X S E worst case C R C H C X X R A S E S C E X typical case H A R E A A S A worst case C BST possibilities E H R S BottomAline. Tree depends on order of X insertion. worstshape case C E BST possibilities H 11 BST insertion: random order visualization Ex. Insert keys in random order. 12 Binary search trees: quiz 1 In what order does the traverse(root) code print out the keys in the BST? private void traverse(Node x) { if (x == null) return; traverse(x.left); StdOut.println(x.key); traverse(x.right); } A. A C E H M R S X B. A C E R H M X S C. S E A C R H M X D. C A M H R E X S E. S E X A R C H M I don't know. 13 Inorder traversal ・Traverse left subtree. ・Enqueue key. ・Traverse right subtree. public Iterable<Key> keys() { Queue<Key> q = new Queue<Key>(); inorder(root, q); return q; } private void inorder(Node x, Queue<Key> q) { if (x == null) return; inorder(x.left, q); q.enqueue(x.key); inorder(x.right, q); } BST key left val right BST with larger keys BST with smaller keys smaller keys, in order key larger keys, in order all keys, in order Property. Inorder traversal of a BST yields keys in ascending order. 14 Binary search trees: quiz 2 What is the name of this sorting algorithm? 1. Shuffle the keys. 2. Insert the keys into a BST, one at a time. 3. Do an inorder traversal of the BST. A. Insertion sort. B. Mergesort. C. Quicksort. D. None of the above. E. I don't know. 15 Correspondence between BSTs and quicksort partitioning P H T D O E A I S U Y C M L Remark. Correspondence is 1–1 if array has no duplicate keys. 16 BSTs: mathematical analysis Proposition. If N distinct keys are inserted into a BST in random order, the expected number of compares for a search/insert is ~ 2 ln N. Pf. 1–1 correspondence with quicksort partitioning. Proposition. [Reed, 2003] If N distinct keys are inserted into a BST in random order, the expected height is ~ 4.311 ln N. expected depth of function-call stack in quicksort How Tall is a Tree How Tall is a Tree? Bruce Reed Bruce Reed CNRS, Paris, France CNRS, Paris, France reed@moka.ccr.jussieu.fr reed@moka.ccr.jussieu.fr ABSTRACT But… Worst-case height is N – 1. purpose of th ABSTRACT have: purpose this note that for /3 -- ½ Let H~ be the height of a random binaryof search tree toonprove n nodes. Wesearch show tree that on there constants a = 4.31107... Let H~ be the height of a random binary n existshave: THEOREM a n d / 3 = 1.95... such that E(H~) = c~logn - / 3 1 o g l o g n + nodes. We show that there exists constants a = 4.31107... Var(Hn) We also O(1). 1. E(H~) = ~ l o g n - / 3 1 o g l=o gOn a n d / 3 = 1.95... such that E(H~)O(1), = c~logn - / 3show 1 o g l othat g n +Var(H~) =THEOREM Var(Hn) = O(1) . O(1), We also show that Var(H~) = O(1). R e m a r k By Categories and Subject Descriptors nition given i R e m a r k By the definition of a, /3 = 7"g~" 3~ T Categories and Subject Descriptors we this will see E.2 [ D a t a S t r u c t u r e s ] : Trees nition given is more suggestive ofaswhy val For more inf as we will see. E.2 [ D a t a S t r u c t u r e s ] : Trees consult For more information on randommay binary searc[ 1. THE RESULTS R e m a r k may consult [6],[7], [1], [2], [9], [4], and [8]. Afte A binary search tree is a binary tree to each node of which 1. THE RESULTS an R e m a r k After I announced these developed results, Drmo we have associated key; these keys axe drawn from some O(1) using A binary search tree is a binary tree to each node of awhich developed an alternative proof of the fact thatc totally ordered set and the key at v cannot be larger than illumin we have associated a key; these keys axe drawn from some O(1) using completely different proofs techniques. 17 the key at its right child nor smaller than the key at its left decided su totally ordered set and the key at v cannot be larger than proofs illuminate different aspects of the to probl [ exponentially small chance when keys are inserted in random order ] ST implementations: summary guarantee average case implementation operations on keys search insert search hit insert sequential search (unordered list) N N N N equals() binary search (ordered array) log N N log N N compareTo() BST N N log N log N compareTo() Why not shuffle to ensure a (probabilistic) guarantee of log N? 18 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion Minimum and maximum Minimum. Smallest key in table. Maximum. Largest key in table. max()max S min() E min X A R C H M Examples of BST order queries Q. How to find the min / max? 20 Floor and ceiling Floor. Largest key ≤ a given key. Ceiling. Smallest key ≥ a given key. floor(G) max() S min() E X A R C H ceiling(Q) M floor(D) Examples of BST order queries Q. How to find the floor / ceiling? 21 Computing the floor finding finding floor(S) floor(S) Floor. Find largest key ≤ k ? Case 1. [ key in node x = k ] E E A A The floor of k is in the left subtree of x. E C AC R R H H M HM M C The floor of k is k. Case 2. [ key in node x > k ] S S finding floor(S) R finding finding floor(G) floor(G) S S finding floor(G) E E A A E C AC H H R R on the left The floor of k can't be in left subtree of x: it is either in the right subtree of x or it is the key in node x. X SX X R greater than G so S S is is greater than G so M floor(G) H M floor(G) must must be be S is greater than G so on the left on the left M floor(G) must be C Case 3. [ key in node x < k ] X SX X A A C AC S S E E E H H M CG so H M is less than E E is less than G so floor(G) M floor(G) could could be be is less Eon the right on the than right G so floor(G) could be R R X SX X R 22 Computing the floor finding floor(G) public Key floor(Key key) { Node x = floor(root, key); if (x == null) return null; return x.key; } private Node floor(Node x, Key key) { if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp == 0) return x; if (cmp < 0) return floor(x.left, key); Node t = floor(x.right, key); if (t != null) return t; else return x; S E X A R H C M G is less than S so floor(G) must be on the left S E X A R H C G is greater than E so floor(G) could be M on the right S E X A R H C M floor(G)in left subtree is null S } E A C result X R H M 23 Rank and select Q. How to implement rank() and select() efficiently? A. In each node, store the number of nodes in its subtree. subtree count 8 S 6 E 2 A 3 1 2 C H 1 X R 1 M 24 BST implementation: subtree counts private class Node { private Key key; private Value val; private Node left; private Node right; private int count; } public int size() { return size(root); } private int size(Node x) { if (x == null) return 0; ok to call return x.count; when x is null } number of nodes in subtree initialize subtree private Node put(Node x, Key key, Value val) count to 1 { if (x == null) return new Node(key, val, 1); int cmp = key.compareTo(x.key); if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; x.count = 1 + size(x.left) + size(x.right); return x; } 25 Computing the rank Rank. How many keys < k ? node count 8 6 S Case 1. [ key in node = k ] All keys in left subtree < k; no key in right subtree < k. Case 2. [ key in node x > k ] E 2 A 3 1 2 C H 1 X R 1 M No key in right subtree < k; recursively compute rank in left subtree. Case 3. [ key in node x < k ] All keys in left subtree < k; some keys in right subtree may be < k. 26 Rank Rank. How many keys < k ? node count 8 6 S Easy recursive algorithm (3 cases!) E 2 A 3 1 2 C H 1 X R 1 M public int rank(Key key) { return rank(key, root); } private int rank(Key key, Node x) { if (x == null) return 0; int cmp = key.compareTo(x.key); if (cmp < 0) return rank(key, x.left); else if (cmp > 0) return 1 + size(x.left) + rank(key, x.right); else if (cmp == 0) return size(x.left); } 27 BST: ordered symbol table operations summary sequential search binary search BST search N log N h insert N N h min / max N 1 h floor / ceiling N log N h rank N log N h select N 1 h ordered iteration N log N N N h = height of BST (proportional to log N if keys inserted in random order) order of growth of running time of ordered symbol table operations 28 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion ST implementations: summary guarantee average case ordered ops? implementation key interface search insert delete search hit insert delete sequential search (unordered list) N N N N N N binary search (ordered array) log N N N log N N N ✔ compareTo() BST N N N log N log N ? ✔ compareTo() equals() Next. Deletion in BSTs. 30 BST deletion: lazy approach To remove a node with a given key: ・Set its value to null. ・Leave key in tree to guide search (but don't consider it equal in search). E E A S A S delete I C H ☠ C I R N H tombstone R N Cost. ~ 2 ln N' per insert, search, and delete (if keys in random order), where N' is the number of key-value pairs ever inserted in the BST. Unsatisfactory solution. Tombstone (memory) overload. 31 Deleting the minimum To delete the minimum key: ・Go left until finding a node with a null left link. ・Replace that node by its right link. ・Update subtree counts. go left until reaching null left link A S X E R H C M return that node’s right link S X E R A public void deleteMin() { root = deleteMin(root); H C } private Node deleteMin(Node x) { if (x.left == null) return x.right; x.left = deleteMin(x.left); x.count = 1 + size(x.left) + size(x.right); return x; } M available for garbage collection update links and node counts after recursive calls 7 S 5 X E R C H M Deleting the minimum in a BST 32 Hibbard deletion To delete a node with key k: search for node t containing key k. Case 0. [0 children] Delete t by setting parent link to null. deleting C update counts after recursive calls S S E X A R C E R M node to delete replace with null link 1 E X A R H H H S 5 X A 7 M M available for garbage collection C 33 Hibbard deletion To delete a node with key k: search for node t containing key k. Case 1. [1 child] Delete t by replacing parent link. deleting R update counts after recursive calls S S E X A R C H M node to delete E C X E M X H A replace with child link S 5 H A 7 C M available for garbage collection R 34 Hibbard deletion deleting E node to delete S To delete a node with key k: search for node tEcontainingX key k. A R H C M Case 2. [2 children] ・ ・Delete the minimum in t's right subtree. A C ・Put x in t's spot. Find successor x of t. t S E node to delete S E X H C M x search for key E successor M go right, then go left until reaching null left link min(t.right) X deleteMin(t.right) X M 7 S 5 H A X R C S E E R R H C min(t.right) S H X x M C S E x t.left still a BST successor H A t A R go right, then go left until reaching null left link R x has no left child but X don't garbage collect x x deleting E A search for key E M update links and node counts after recursive calls Deletion in a BST 35 Hibbard deletion: Java implementation public void delete(Key key) { root = delete(root, key); } private Node delete(Node x, Key key) { if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp < 0) x.left = delete(x.left, key); else if (cmp > 0) x.right = delete(x.right, key); else { if (x.right == null) return x.left; if (x.left == null) return x.right; Node t = x; x = min(t.right); x.right = deleteMin(t.right); x.left = t.left; } x.count = size(x.left) + size(x.right) + 1; return x; search for key no right child no left child replace with successor update subtree counts } 36 Hibbard deletion: analysis Unsatisfactory solution. Not symmetric. Surprising consequence. Trees not random (!) ⇒ √ N per op. Longstanding open problem. Simple and efficient delete for BSTs. 37 ST implementations: summary guarantee average case ordered ops? implementation key interface search insert delete search hit insert delete sequential search (unordered list) N N N N N N binary search (ordered array) log N N N log N N N ✔ compareTo() BST N N N log N log N √N ✔ compareTo() equals() other operations also become √N if deletions allowed Next lecture. Guarantee logarithmic performance for all operations. 38
© Copyright 2024