Given a non-empty array of integers, return the k most frequent elements.
For example, Given [1,1,1,2,2,3] and k = 2, return [1,2].
Note: You may assume k is always valid, 1 ≤ k ≤ number of unique elements. Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size.
Solution:
In the problem description, it ask for an algorithm that has time complexity better than O(n log n). We can use bucket sort.
Create a map to count the element appearance in array, use elements value as key, which takes O(n)
Then create a 2d array to classify elements with item’s frequency
Search from the highest to kth frequently item.
1 2 3 4 5 6 7 8 9 10
classSolution: deftopKFrequent(self, nums: List[int], k: int) -> List[int]: buckets = [[] for i in range(len(nums))] for num,freq in collections.Counter(nums).items(): print("num ", num, ", freq ", freq) buckets[-freq].append(num) print(buckets) return list(itertools.chain(*buckets))[:k]
classSolution { public: vector<int> topKFrequent(vector<int>& nums, int k) { unordered_map<int, int> map; for(int n: nums) ++map[n]; //put elements in map, increase the frequency vector<vector<int>> buckets(nums.size()+1); //size()+1 is because the sequence might full with the same element for(auto p: map) buckets[p.second].push_back(p.first); //use frequency as index, put elements with same frequency into the same box vector<int> ans; for(int i= buckets.size()-1; i>=0 && ans.size() < k; --i){ //start from right side, because we want to find top k frequent for(int num: buckets[i]){ ans.push_back(num); if(ans.size() == k) break; } } return ans; } };
time complexity: O(n) space complexity: O(n)
Solution 2 Heap
Thought process: Idea is similar to bucket sort. Build frequency map. Then use heap to make the highest frequency element on heap.peek.
classSolution: deftopKFrequent(self, nums: List[int], k: int) -> List[int]: # build frequency map dic = {} for i in nums: if i in dic: dic[i] -= 1 else: dic[i] = -1 # use heap to sort by frequency hq = [] for key in dic: heapq.heappush(hq, (dic[key], key)) # get the top kth res = [] for _ in range(k): res.append(heapq.heappop(hq)[1]) return res
time: O(NlogN) space: O(N)
Improvment:
We don’t have to keep all the elements in heap, just the top K ones. So we can pop out elements when there’s a higher frequency appeared.
classSolution: deftopKFrequent(self, nums: List[int], k: int) -> List[int]: # build frequency map dic = {} for i in nums: if i in dic: dic[i] += 1 else: dic[i] = 1 # use heap to sort by frequency hq = [] for key in dic: heapq.heappush(hq, (dic[key], key)) if len(hq) > k: heapq.heappop(hq) # get the top kth res = [] while hq: tmp = heapq.heappop(hq) res.append(tmp[1]) return res