davido.io | Neetcode 4/150

Neetcode 4/150 - Group Anagrams

May 1, 2024

This is entry 4/150 in the NeetCode150 Challenge.

This was the first problem where I did not reach a solution at all. The problem states:

Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Originally my thinking was as follows:

Iterate through each word via loop
Itereate through each letter in word
Check the next word to ensure the length matches and all letters are present
- If they do, add to a temp list
- If not, move on to the next word in the nested loop to
- repeat checks on current work
Move onto next word

Just typing out the above algorithm made me wince. This LeetCode medium appeared to be easy, because as a human, the operation IS easy. Complexity from the above algorithm requires a nested loop, and I need to iterate through EACH entry. so I am already at O(n²). Regardless, I try to code a naive solution prior to optimizing. Eventually I came up with this (after three hours):

 1def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
 2    sol = []
 3    print(strs)
 4
 5    # get empties and reals
 6    empty = 0
 7    real_words = []
 8    for word in strs:
 9        if not word:
10            empty = empty + 1
11        else:
12            real_words.append(word)
13
14    # add proper pair to solution
15    print(empty)
16    empty_list = []
17    for x in range(empty):
18        print("adding empty")
19        empty_list.append("")
20
21    # add to sol
22    if empty != 0:
23        sol.append(empty_list)
24
25    for x in range(len(real_words)):
26        word_list = []
27        print(f"start of word list: {word_list}")
28        # get cur
29        cur = real_words[x]
30        print(f"STARTING WORD = {cur}")
31        word_in_sol = False
32        if x != 0:
33            # check for word
34            for group in sol:
35                if cur in group:
36                    print(f"FOUND! SKIPPING")
37                    word_in_sol = True
38                    break
39
40        if word_in_sol:
41            continue
42        else:
43            print(f"{cur} not in SOL ... checking ...")
44        # process word
45        cur_dict = {}
46        for letter in cur:
47            if letter not in cur_dict:
48                cur_dict[letter] = 1
49            else:
50                cur_dict[letter] = cur_dict.get(letter) + 1
51        # we have a dict. Check remaining words
52        for y in range(x+1, len(real_words)):
53            # create dict from y
54            temp_dict = {}
55            for letter in real_words[y]:
56                if letter not in temp_dict:
57                    temp_dict[letter] = 1
58                else:
59                    temp_dict[letter] = temp_dict.get(letter) + 1
60            if cur_dict == temp_dict:
61                print("Word is an anagram")
62                print(f"adding {real_words[y]} to word list")
63                word_list.append(real_words[y])
64        # always add original word
65        word_list.append(cur)
66        print(f"adding {word_list} to SOL")
67        sol.append(word_list)
68    print(f"SOL IS: {sol}")
69    return sol

I received a “Time Limit Exceeded” message and called it quits. I wasn’t going to get the answer in the allotted rules for my challenge. Also, the rules apply to a naive version and not getting even one solution working bothered me quite a bit.

I took a day to reflect and with no other ideas I looked up the optimal solution. The requirements are:

Create an empty dict
Sort each word
Check if sorted word is in the dict
- If the word is the dict, add the current word to the existing list that is currently the Value part of the Key:Value pair where the Key is the sorted word
- If the word is NOT in the dict, we need to add it. The Key:Value pair that needs to be inserted is: CurrentWord-Sorted: ["CurrentWord"]

I referenced this page for the optimal lookup:

Group Anagrams Problem (with Solution in Java, C++ & Python)

This is actually a great approach. A sorted word will be the same everytime and negates the need for a length check. Using it as a key makes lookups on the resulting string extremely quick. Since our Value can be set to anything, we can just simply create a list and append, this giving us all the groupings.

Here is the code I wrote:

 1def group_anagrams_optimized(strs):
 2    results = {}
 3    for word in strs:
 4        print(f"WORD: {word}")
 5        key = ''.join(sorted(word))
 6        print(f"KEY: {key}")
 7        if key in results:
 8            # We have found the key. Append the actual word to the list in value
 9            results[key] += [word]
10        else:
11            # we do not have the key in our dict, add the key:value pair
12            results[key] = [word]
13    # now we return a list of lists
14    sol = []
15    for value in results.values():
16        sol.append(value)
17    print(sol)
18    return sol

This passes all test cases. Here are the stats:

1# time
2107ms Beats 12.50% of users with Python3
3# space
419.70 MB Beats 78.37%of users with Python3

I also tested the list comprehension that the source material used and received much better stats:

1# time
287 ms Beats 69.91% of users with Python3
3# space
419.28 MB Beats 99.29% of users with Python3

It appears the list comprehension approach saves about 50MB and cuts down on runtime by ~19% (-18.69158878504673). This is substantial for me. I will be sure to see if we can use any other list comprehensions from here on out.

I have a few thoughts on this challenge after failing my first module:

I seriously thought I would be blazing through these. Turns out I was wrong.
I recorded about three hours of footage prior to quitting. I read online if you don’t solve these type of problems within an hour to just look up the optimal solution. After experiencing this, I agree with this opinion. All streams/recordings surrounding a Leetcode/Neetcode problem will be capped at one module and one hour. No one wants to watch three hours of failures.
There is a pattern to all of these problems. I need to git gud at recognizing these patterns.
The one video, one module limitation will ensure I do not lose any proof of this challenge. The viewers will have all of the receipts.

I stream these Neetcode problems on Twitch and have the recordings on YouTube. You can watch me attempt this module or follow me on any of the links below.

👇

Neetcode 4/150 - Group Anagrams

Twitch

YouTube