This is entry 4/150
in the NeetCode150 Challenge.
This was the first problem where I did not reach a solution at all. The problem states:
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Originally my thinking was as follows:
- Iterate through each
word
via loop - Itereate through each letter in
word
- Check the next word to ensure the length matches and all letters are present
- If they do, add to a temp list
- If not, move on to the next word in the nested loop to
- repeat checks on current work
- Move onto next word
Just typing out the above algorithm made me wince. This LeetCode medium appeared to be easy, because as a human, the operation IS easy. Complexity from the above algorithm requires a nested loop, and I need to iterate through EACH entry. so I am already at O(nĀ²)
. Regardless, I try to code a naive solution prior to optimizing. Eventually I came up with this (after three hours):
1def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
2 sol = []
3 print(strs)
4
5 # get empties and reals
6 empty = 0
7 real_words = []
8 for word in strs:
9 if not word:
10 empty = empty + 1
11 else:
12 real_words.append(word)
13
14 # add proper pair to solution
15 print(empty)
16 empty_list = []
17 for x in range(empty):
18 print("adding empty")
19 empty_list.append("")
20
21 # add to sol
22 if empty != 0:
23 sol.append(empty_list)
24
25 for x in range(len(real_words)):
26 word_list = []
27 print(f"start of word list: {word_list}")
28 # get cur
29 cur = real_words[x]
30 print(f"STARTING WORD = {cur}")
31 word_in_sol = False
32 if x != 0:
33 # check for word
34 for group in sol:
35 if cur in group:
36 print(f"FOUND! SKIPPING")
37 word_in_sol = True
38 break
39
40 if word_in_sol:
41 continue
42 else:
43 print(f"{cur} not in SOL ... checking ...")
44 # process word
45 cur_dict = {}
46 for letter in cur:
47 if letter not in cur_dict:
48 cur_dict[letter] = 1
49 else:
50 cur_dict[letter] = cur_dict.get(letter) + 1
51 # we have a dict. Check remaining words
52 for y in range(x+1, len(real_words)):
53 # create dict from y
54 temp_dict = {}
55 for letter in real_words[y]:
56 if letter not in temp_dict:
57 temp_dict[letter] = 1
58 else:
59 temp_dict[letter] = temp_dict.get(letter) + 1
60 if cur_dict == temp_dict:
61 print("Word is an anagram")
62 print(f"adding {real_words[y]} to word list")
63 word_list.append(real_words[y])
64 # always add original word
65 word_list.append(cur)
66 print(f"adding {word_list} to SOL")
67 sol.append(word_list)
68 print(f"SOL IS: {sol}")
69 return sol
I received a “Time Limit Exceeded” message and called it quits. I wasn’t going to get the answer in the allotted rules for my challenge. Also, the rules apply to a naive version and not getting even one solution working bothered me quite a bit.
I took a day to reflect and with no other ideas I looked up the optimal solution. The requirements are:
- Create an empty dict
- Sort each word
- Check if sorted word is in the dict
- If the word is the dict, add the current word to the existing list that is currently the
Value
part of theKey:Value
pair where theKey
is the sorted word - If the word is
NOT
in the dict, we need to add it. TheKey:Value
pair that needs to be inserted is:CurrentWord-Sorted: ["CurrentWord"]
- If the word is the dict, add the current word to the existing list that is currently the
I referenced this page for the optimal lookup:
Group Anagrams Problem (with Solution in Java, C++ & Python)
This is actually a great approach. A sorted word will be the same everytime and negates the need for a length check. Using it as a key makes lookups on the resulting string extremely quick. Since our Value
can be set to anything, we can just simply create a list and append, this giving us all the groupings.
Here is the code I wrote:
1def group_anagrams_optimized(strs):
2 results = {}
3 for word in strs:
4 print(f"WORD: {word}")
5 key = ''.join(sorted(word))
6 print(f"KEY: {key}")
7 if key in results:
8 # We have found the key. Append the actual word to the list in value
9 results[key] += [word]
10 else:
11 # we do not have the key in our dict, add the key:value pair
12 results[key] = [word]
13 # now we return a list of lists
14 sol = []
15 for value in results.values():
16 sol.append(value)
17 print(sol)
18 return sol
This passes all test cases. Here are the stats:
1# time
2107ms Beats 12.50% of users with Python3
3# space
419.70 MB Beats 78.37%of users with Python3
I also tested the list comprehension that the source material used and received much better stats:
1# time
287 ms Beats 69.91% of users with Python3
3# space
419.28 MB Beats 99.29% of users with Python3
It appears the list comprehension approach saves about 50MB and cuts down on runtime by ~19%
(-18.69158878504673
). This is substantial for me. I will be sure to see if we can use any other list comprehensions from here on out.
I have a few thoughts on this challenge after failing my first module:
- I seriously thought I would be blazing through these. Turns out I was wrong.
- I recorded about three hours of footage prior to quitting. I read online if you don’t solve these type of problems within an hour to just look up the optimal solution. After experiencing this, I agree with this opinion. All streams/recordings surrounding a Leetcode/Neetcode problem will be capped at one module and one hour. No one wants to watch three hours of failures.
- There is a pattern to all of these problems. I need to
git gud
at recognizing these patterns. - The one video, one module limitation will ensure I do not lose any proof of this challenge. The viewers will have all of the receipts.
I stream these Neetcode problems on Twitch and have the recordings on YouTube. You can watch me attempt this module or follow me on any of the links below.
š