Comes in Paper 2, not paper 1.
A hash table is a form of key, value table with the goal to immediately find an item. A hashing function is used to calculate the position of an item in a hash table.
Big O Notation: O(1), however worse case may be higher depending on the collision handling
Implementation
class HashTable:
def __init__(self, size: int):
self.table = [None for x in range(size)]
self.size = size
def get(self, key: str):
return self.table[self.__hash__(key)]
def put(self, key: str, data: str):
self.table[self.__hash__(key)] = data
def __hash__(self, data: str):
tol = 0
for char in data:
tol += ord(char)
return tol % self.size
ht = HashTable(100)
ht.put("gurt", "yo")
ht.put("darragh", "hi")
ht.put("so", "tuff")
ht.put("tuff", "tuff x2")
print(ht.get("gurt"))
print(ht.get("darragh"))
print(ht.get("so"))
print(ht.get("tuff"))
Linear Probing
class HashTable:
def __init__(self, size: int):
self.table = [None for _ in range(size)]
self.size = size
def get(self, key: str):
index = self.__hash__(key)
start_index = index
while self.table[index] is not None:
stored_key, stored_value = self.table[index]
if stored_key == key:
return stored_value
index = (index + 1) % self.size
if index == start_index:
break # full loop, key not found
return None
def put(self, key: str, data: str):
index = self.__hash__(key)
start_index = index
while self.table[index] is not None:
stored_key, _ = self.table[index]
if stored_key == key:
# update existing key
self.table[index] = (key, data)
return
index = (index + 1) % self.size
if index == start_index:
raise Exception("Hash table is full")
self.table[index] = (key, data)
def __hash__(self, key: str):
total = 0
for char in key:
total += ord(char)
return total % self.size
ht = HashTable(100)
ht.put("gurt", "yo")
ht.put("darragh", "hi")
ht.put("so", "tuff")
ht.put("tuff", "tuff x2")
print(ht.get("gurt"))
print(ht.get("darragh"))
print(ht.get("so"))
print(ht.get("tuff"))
Hashing Function
One-way algorithm which produces a hashed result. The hashing function is applied to an item to determine a hash value.
A good hashing function should:
- Be calculated quickly
- Result in minimal collisions
- Use minimal memory
Examples:
- MD5
- SHA512, SHA256, SHA128
Resolving Collisions
Collisions: When an input gets the same hash value.
Linear Probing
Repeatedly checking the next space in a hash table until an empty position is found, iterating in a given interval.
Requires a linear search to be performed when inserting but also searching for the item.
Chaining
When it is possible for a multitude of items to be placed at the same position.
Examples:
- Overflow table
- Linked list
- Two-dimensional table
Multiple Hash Functions
Iterating through various hash functions to find an empty position for an item.
Exam Example: Having multiple functions will reduce the likelihood of two or more inputs having the same hash value.
Usage
Applications
- File systems (file name → file path)
- Identifying keywords (token → token class, data type, etc.)
Operations
- Add
- Delete
- Retrieve