This is entry 1/150
in the NeetCode150 Challenge.
What seems like a trivial problem to solve actually led me down a blackhole of Python efficiencies. The problem states:
Given an integer array
nums
, return true if any value appears at least twice in the array, and return false if every element is distinct.
Now obviously the problem isn’t data type specific since we are using Python. There are two methods which I impemented for this, with the first being the naive one:
- Make an empty dict
- If the current element IS NOT in the dict, insert it. Else, the current element IS in the dict, break the loop and return True.
- Return False if we don’t find a value in the dict at all after all elements have been iterated through.
The second implementation is quite simple.
- Convert the list into a set
- If the length of the list is the same as the length of the set, there are no duplicates, return False.
- Else: if they differ, return True.
Here is the code I implemented for both functions outside of Leetcode (I tested both and each pased all taste cases and counted as a submission):
1def contains_duplicate(nums):
2 dupe_found = False
3 results = {}
4 for number in nums:
5 if number in results:
6 dupe_found = True
7 print(dupe_found)
8 return dupe_found
9 else:
10 results[number] = "Found"
11 print(dupe_found)
12 return dupe_found
13
14
15def contains_duplicate_with_set(nums):
16 if len(nums) == len(set(nums)):
17 print("No Duplicates")
18 return False
19 else:
20 print("Print Duplicate found")
21 return True
22
23
24# Should return True
25nums_case1 = [1, 2, 3, 1]
26# Should return False
27nums_case2 = [1, 2, 3, 4]
28# Should return True
29nums_case3 = [1, 1, 1, 3, 3, 4, 3, 2, 4, 2]
30
31# run test cases
32contains_duplicate(nums_case1)
33contains_duplicate(nums_case2)
34contains_duplicate(nums_case3)
35contains_duplicate_with_set(nums_case1)
36contains_duplicate_with_set(nums_case2)
37contains_duplicate_with_set(nums_case3)
The set execution ran more quickly on Leetcode (non premium account). I have no idea if this matters, but as I was coding the second implementation I started to wonder whether converting to a set was actually more efficient.
A list -> set
conversion relies on O(N)
iterations, which is essentially what the basic insert does. However, this doesn’t account for SPACE complexity. I started to research this and it appears sets do take up quite a bit more memory than lists with the most inefficient conversions being based around lists with a small set of numbers. The source I used was the following:
is-python-set-more-space-efficient-than-list
Let’s use the space complexity defintion from WikiPedia:
The space complexity of an algorithm or a data structure is the amount of memory space required to solve an instance of the computational problem as a function of characteristics of the input.
So thinking of hash tables vs. sets on the memory level, which is more efficient? I don’t think it matters very much in this case because if we have a collision on the underlying structure we will automatically exit the program and return True
. This isn’t a practical problem to be worrying about but it makes good practice to start thinking about how implementations of algorithms will look on underlying hardware.
I will try in the future to make my solutions not rely on external libraries unless absolutely necessary.
I stream these Neetcode problems on Twitch and have the recordings on YouTube. You can watch me attempt this module or follow me on any of the links below.
👇