This is entry 2/150
in the NeetCode150 Challenge.
The task is as follows:
Given two strings
s
andt
, returntrue
ift
is an anagram ofs
, andfalse
otherwise. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
I realized that the code needed to conform to the following criteria:
- The strings must be the same length.
- The strings must use all individual letters.
We managed to get the naive implementation with the following code (function name edited for clarification):
1def is_anagram_naive(s: str, t: str) -> bool:
2 temp_word = t
3 if len(s) == len(t):
4 for letter in s:
5 if letter in temp_word:
6 temp_word = temp_word.replace(letter, '', 1)
7 else:
8 print("False")
9 return False
10 print("True")
11 return True
12 else:
13 print("False")
14 return False
There is massive room for optimization here. When I looked up the solution for this problem, there were numerous solutions but what it boils down to is limiting all operations to O(n)
capacity. Well how would we do this?
My original implementation was to use create two dicts and use them as hashmaps, iterate through each string, and perform a lookup to see if the current letter is present in the hashmap. If it is, increment the value by 1`` or if it isn't, insert the letter as the key and set the value to
1`. We would then repeat with the second string and compare the hashmaps.
This would require at most, two iterations on said strings and the comparison lookup. I used this StackOverflow page to confirm the complexities:
What is the time complexity of comparing 2 dictionaries in python
Since they have the same number of letters confirmed, we would not run into an edge case where it would take more than O(n)
lookups.
Before I started coding this hashmap solution I also read an implementation in Java that uses a frequency array. This is essentially the fancy version of what I was originally wanting to use. The implementation I found is located here:
Valid Anagram — LeetCode Java Solution
The only difference I wanted to make in Python was instead of checking to see if everything was zero, we could simply create two of these frequency arrays and compare them, rather than check for 0
value. We won’t need to hold ALL the values, but only the letters that are found in the string. So how do I create a frequency array as quickly as possible? The answer is in the collections
library.
Normally I’d hesitate to use additional libraries, especially since we have the hashmap to fall back on. However, there are some additional data structures that I have had little exposure to, so I figured this would be an excellent exercise. Here is the code I used:
1def is_anagram_optimized(s: str, t: str) -> bool:
2 # iterate through s string
3 s_fre = collections.Counter(s)
4 t_fre = collections.Counter(t)
5 if s_fre == t_fre:
6 print("Is anagram")
7 return True
8 else:
9 print("Is not anagram")
10 return False
This is a dict
subclass so it should just work, and it did. Here are both implementations checking the first few test cases:
1import collections
2
3
4def is_anagram_naive(s: str, t: str) -> bool:
5 temp_word = t
6 if len(s) == len(t):
7 for letter in s:
8 if letter in temp_word:
9 temp_word = temp_word.replace(letter, '', 1)
10 # print(f"""Word left: {temp_word}""")
11 else:
12 print("False")
13 return False
14 print("True")
15 return True
16 else:
17 print("False")
18 return False
19
20
21def is_anagram_optimized(s: str, t: str) -> bool:
22 # iterate through s string
23 s_fre = collections.Counter(s)
24 t_fre = collections.Counter(t)
25 if s_fre == t_fre:
26 print("Is anagram")
27 return True
28 else:
29 print("Is not anagram")
30 return False
31
32
33# case 1/2
34x1 = "anagram"
35x2 = "nagaram"
36y1 = "rat"
37y2 = "car"
38is_anagram_naive(x1, x2)
39is_anagram_naive(y1, y2)
40is_anagram_optimized(x1, x2)
41is_anagram_optimized(y1, y2)
The runtime was vastly improved. According to LeetCode, the stats for my first solution vs. the second are as follows:
1# First Runtime
2196ms
3Beats 5.24% of users with Python3
4
5# First Memory
616.94MB
7Beats 64.75% of users with Python3
8
9# Second Runtime
1040ms
11Beats 91.85% of users with Python3
12
13# Second Memory
1416.95MB
15Beats 64.75% of users with Python3
Obviously, the next question I have is, “Which metric should I be focusing on?”. Some rumblings on Reddit say to ignore these numbers and only focus on the actual complexity. In this case, I did manage to come to the optimized solution on my own, with a confirmation. For now, if we get to the optimized solution somewhat, we can notate and add it to our toolbox.
I stream these Neetcode problems on Twitch and have the recordings on YouTube. You can watch me attempt this module or follow me on any of the links below.
👇