Cracking the Coding Interview [Ch.1: Arrays and Strings]
Introduction
Hash tables are essential data structures known for efficient key-value mapping and quick lookup operations. This chapter delves into a common implementation of hash tables using arrays of linked lists and hash code functions. We will explore concepts such as insertion, retrieval, and collision handling, and analyze time complexity. Additionally, the chapter discusses resizable arrays (ArrayList in Java), and the StringBuilder class, and includes key interview questions on strings and data structures.
Hash Tables
A hash table is a data structure that maps keys to values for highly efficient lookup. There are many ways of implementing this. Here, we will describe a simple but common implementation.
In this simple implementation, we use an array of linked lists and a hash code function.
Insert
To insert a key (which might be a string or essentially any other data type) and value, we do the following:
First, compute the key's hash code, which will usually be an int or long. (Note that two different keys could have the same hash code, as there may be an infinite number of keys and a finite number of ints).
Then, map the hash code to an index in the array. This could be done with something like hash (key) % array_length. (Note two different hash codes could, of course, map to the same index).
At this index, there is a linked list of keys and values. Store the key and value in this index. We must use a linked list because of collisions.
Retrieve
To retrieve the value pair by its key, you repeat this process. Compute the hash code from the key, and then compute the index from the hash code. Then, search through the linked list for the value with this key.
Time complexity
If the number of collisions is very high, the worst-case runtime is O(N), where N is the number of keys. However, we generally assume a good implementation that keeps collisions to a minimum, in which case the lookup time is 0(1).
Example
Let's create a hash map for these key-value pairs ("Mohamed", 20), ("Ahmed", 22), ("Youssef", 24), ("Omar", 26) and see how it would be stored.
1- Compute the key's hash code
Using any simple hash function let's assume these are the hash codes of the map keys:
"Mohamed" hashed to
699
"Ahmed" hashed to
479
"Youssef" hashed to
750
"Omar" hashes to index
399
2- Map the hash code to an index
"Mohamed" hashes to index
699 % 4 = 3
"Ahmed" hashes to index
479 % 4 = 3
(collision with "Mohamed")"Youssef" hashes to index
750 % 4 = 2
"Omar" hashes to index
399 % 4 = 3
(collision with "Mohamed" and Ahmed)
3- Store the key and value in this index
Index 0: NULL
Index 1: NULL
Index 2: (Youssef, 24)
Index 3: (Mohamed, 20) -> (Ahmed, 22) -> (Omar, 26)
ArrayList & Resizable Arrays
In some languages, arrays (often called lists in this case) are automatically resizable. The array or list will grow as you append items. In other languages, like Java, arrays are fixed length. The size is defined when you create the array.
When you need an array-like data structure that offers dynamic resizing, you would usually use an Arraylist. An Arraylist is an array that resizes itself as needed while still providing 0(1) access. A typical implementation is that when the array is full, the array doubles in size. Each doubling takes 0(n) time but happens so rarely that its amortized insertion time is still O(1)
Why is the insertion runtime 0(1)?
suppose you have an array of size N. We can work backwards to compute how many elements we copied at each capacity increase. Observe that when we increase the array to K elements, the array was previously half that size. Therefore, we needed to copy k/2 elements.
final capacity increase: n/2 elements to copy
previous capacity increase: n/4 elements to copy
previous capacity increase: n/8 elements to copy
and so on ...
second capacity increase: 2 elements to copy
first capacity increase: 1 element to copy
Therefore, the total number of copies to insert N elements is roughly n/2 + n/4+ n/8+ ... + 2 + 1, which is just less than N.
Therefore, inserting N elements takes O(N) work total. Each insertion is 0(1) on average, even though some insertions take O ( N) time in the worst case.
StringBuilder
Imagine you were concatenating a list of strings, as shown below. What would the running time of this code be? For simplicity, assume that the strings are all the same length (call this x) and that there are n strings.
String joinWords(String[] words) {
String sentence = "";
for (String w: words) {
sentence = sentence + w;
}
return sentence;
}
On each concatenation, a new copy of the string is created, and the two strings are copied over, character by character. The first iteration requires us to copy x characters. The second iteration requires copying 2x characters. The third iteration requires 3x, and so on. The total time therefore is O( x + 2x + . . . + nx). This reduces to O(xn^2)
StringBuilder can help you avoid.this problem. StringBuilder simply creates a resizable array of all the strings, copying them back to a string only when necessary.
String joinWords(String[] words) {
StringBuilder sentence new StringBuilder();
for (String w: words) {
sentence.append(w);
}
return sentence.toString();
}
Interview Questions
Is Unique: Implement an algorithm to determine if a string has all unique characters. What if you cannot use additional data structures?
Check Permutation: Given two strings, write a method to decide if one is a permutation of the other
URLify: Write a method to replace all spaces in a string with '%20'. You may assume that the string has sufficient space at the end to hold the additional characters, and that you are given the "true" length of the string. (Note: If implementing in Java, please use a character array so that you can perform this operation in place.) EXAMPLE Input: "Mr John Smith ", 13 Output: "Mr%20John%20Smith"
Palindrome Permutation: Given a string, write a function to check if it is a permutation of a palindrome. A palindrome is a word or phrase that is the same forwards and backwards. A permutation is a rearrangement of letters. The palindrome does not need to be limited to just dictionary words. 1.5 1.6 EXAMPLE Input: Tact Coa Output: True (permutations: "taco cat", "atco eta", etc.)
One Away: There are three types of edits that can be performed on strings: insert a character, remove a character, or replace a character. Given two strings, write a function to check if they are one edit (or zero edits) away. EXAMPLE pale, ple -> true pales, pale -> true pale, bale -> true pale, bake -> false
String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. If the "compressed" string would not become smaller than the original string, your method should return the original string
Rotate Matrix: Given an image represented by an NxN matrix, where each pixel in the image is 4 bytes, write a method to rotate the image by 90 degrees. Can you do this in place?
Zero Matrix: Write an algorithm such that if an element in an MxN matrix is 0, its entire row and column are set to 0.
String Rotation: Assume you have a method isSubstringwhich checks if one word is a substring of another. Given two strings, s1 and s2, write code to check if s2 is a rotation of s1 using only one call to isSubstring (e.g., "waterbottle" is a rotation of"erbottlewat").
Summary:
Hash tables offer fast lookup times by mapping keys to indices in an array using hash codes. Collisions, when different keys hash to the same index, are handled using linked lists. The time complexity for lookups is typically O(1) on average, assuming a good implementation with minimal collisions. Resizable arrays like ArrayList provide dynamic sizing with amortized O(1) insertion times. StringBuilder optimizes string concatenation to avoid quadratic time complexities. The provided interview questions cover various string manipulation and data structure concepts, offering practical challenges to reinforce understanding and problem-solving skills.
If you have any questions or need further clarification on any topic discussed here, please feel free to reach out to me. I'm here to help!
See you in the next chapter ๐