Group Anagrams - Categorize Strings by Count - Leetcode 49

preview_player
Показать описание


0:00 - Drawing solution
4:35 - Coding solution

leetcode 49

#string #anagram #python
Рекомендации по теме
Комментарии
Автор

For those who don't know what is defaultdict(list). It means "Defining a dictionary with values as a list
". Also it will give you error so dont forget to use this in your code.
from collections import defaultdict

TheFazilaashraf
Автор

I went with a O(m*n*log(n)) solution. This one seems super advanced to come up with on the spot, but I would have never thought of sorting words and key them in that way. Both solutions are super smart.

bosteador
Автор

Always love the direct and no nonsense explanations! Great job!

QuadQuadQuadQuad
Автор

If you don't want to use defaultdict() from collections, you may do this:

def groupAnagrams(strs):
res = {}
for s in strs:
count = [0] * 26 # a .... z
for c in s:
count[ord(c) - ord("a")] += 1
key = tuple(count)
if key in res:
res[key].append(s)
else:
res[key] = [s]

return list(res.values())

AbhishekMishraiitkgp
Автор

Just did this problem with O(m . nlog(n)) yay, just something to note:
- The time complexity between sorting and using a map for character frequency is very similar (if not equal) depending on the average length of the strings. The problem in LeetCode specifies that the string lengths vary from 0 to 100, making it so the sorting time isn't as impactful on O. (Tested multiple times both solutions runtime as well)

- The algorithm will take more space on the solution of frequency map than the sorting solution if the strings do not have much characters (depending on the sorting algorithm you use as well) as it will take O(n) space complexity for the algorithm to sort the string. This is because the strings may be way less longer than 26 characters, which is the size of the frequency map, so it's a thing to take into consideration for sure.

So, in other words, if you see that the string length is specified as having the possibility of being very large then the frequency map is the way to go, otherwise the sorting algorithm is the best solution.

Pretty sure you won't read this Neet but I just had a meeting with you and Pooja Dutt, both of you are awesome.

theornament
Автор

Wow! Very different than my soultion. In my head I went about it by grabbing the word at strs[0], check to see if every char in that word exitists in any of the other words, and if so append all matching words to a group. After all possible words have been found we remove them from the strs list, so the element at strs[0] is now a new word that is not any of the anagrams of the orginal strs[0].

etchris
Автор

This one was really difficult for me for some reason.. I need to come back and solve this again without looking at the solution.
Thanks man, liked and comment for support.

symbol
Автор

For anyone wanting to write exact same solution in Javascript, here it is:
var groupAnagrams = function (strs) {
let res = {}; //mapping charCount to list of anagrams i.e "1e, 1a, 1t": ["eat", "tea", "ate"]
for(let s of strs) {
//To count each char:
let count = new Array(26).fill(0); //for a....z
//We'll index a to 0 till z to idx 25
for(let c of s) {
count[c.charCodeAt(0) - "a".charCodeAt(0)] += 1;
}
let commaSeparatedCount = count.join(", ");
if(commaSeparatedCount in res) {

} else {
res[commaSeparatedCount] = [s];
}
//console.log("res: ", res);
", Object.values(res))
}
return Object.values(res);
}

Basically, I don't know Python, so I typed out line by line Neet's python solution to understand the output of each of the lines to reproduce this solution in Javascript.
Thank you Neetcode!

sahildhawan
Автор

@neetcode I think the O(m.n.logn) solution will always be optimum as log(n) will always be smaller than 26 in all of the cases

AbhishekKumar-cxmo
Автор

According to the question, each string has only 100 chars => log(100) < 7 => O(mnlogn) < (7mn). Therefore, the first solution is much faster than the second solution, which is O(26mn).
O(26mn) is better than O(m nlogn) only when the average string length is 67.108.864 chars.

nguyen-dev
Автор

first, thank you @NeetCode for such great videos
so I am doing these problems on the Jupyter Notebook. Here are the minor changes to get the code to work:
1. add "from collections import defaultdict"
2. lowercase the word "List"
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
3. change "return res.values()" to "list(res.values)"

codingyearold
Автор

Thanks. Great work!
Here is a solution without using defualtdict . I also used pretty print here to see the result .
import pprint
strs = ["eat", "tea","tan","ate","nat","bat"]
result = {}

for i in strs:
key = [0] * 26
for c in i:
key[ord(c) - ord('a')] += 1
if tuple(key) in result:
result[tuple(key)].append(i)
else:
result[tuple(key)] = [i]
pprint.pprint(result)

rishabhjain
Автор

For practical scenarios, I found that it's faster to sort each word. I actually got a better result on leetcode by sorting the words and then checking if they're in a dictionary.

tomdarank
Автор

Such a clever solution, I wonder if I will get to this level of problem solving to come up with something similar on my own ._.

Mustafa-
Автор

Thank you @NeetCode for all the great videos! I just want to point out that the optimal approach is great for a large n but in case of this particular problem because n is limited to max of 100 (0 <= strs[i].length <= 100), log(n) would be 6.6 which makes m*n*log n faster than m*n*26

imang
Автор

Thank you! Never would have thought of making the keys the ASCII character count, that is pretty clever

vaiterius
Автор

First of all, thanks for your work.

I coded first and second solutions in python3, solution 1 (using sorted strings) seems to perform better in speed use and in memory use on leetcode test cases.

lgami
Автор

awesome but I wish you went through one example step by step. i bet it's simple but it takes me an hour to fully understand the problem :(

dimakavetskyy
Автор

I went with a different approach where I just check whether sorted(string) is in the dict, if not then just store an array against that sorted string in the dict

hassansuhaib
Автор

more efficient time space way would be to sort each str ele in strs and put the original form in a hashmap with hashmap values as lists of similar anagrams (ie map = { 'aet':['tea', 'ate'], 'abt':['bat', 'tab'], ... } then you just output the hashmaps values

slayerzerg