Problem description:

Given a non-empty array of integers, return the k most frequent elements.

For example,
Given [1,1,1,2,2,3] and k = 2, return [1,2].

Note:
You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size.

Solution:

In the problem description, it ask for an algorithm that has time complexity better than O(n log n). We can use bucket sort.

  1. Create a map to count the element appearance in array, use elements value as key, which takes O(n)
  2. Then create a 2d array to classify elements with item’s frequency
  3. Search from the highest to kth frequently item.
1
2
3
4
5
6
7
8
9
10
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
buckets = [[] for i in range(len(nums))]

for num,freq in collections.Counter(nums).items():
print("num ", num, ", freq ", freq)
buckets[-freq].append(num)
print(buckets)

return list(itertools.chain(*buckets))[:k]
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
class Solution {
public:
vector<int> topKFrequent(vector<int>& nums, int k) {
unordered_map<int, int> map;
for(int n: nums)
++map[n]; //put elements in map, increase the frequency

vector<vector<int>> buckets(nums.size()+1); //size()+1 is because the sequence might full with the same element
for(auto p: map)
buckets[p.second].push_back(p.first); //use frequency as index, put elements with same frequency into the same box

vector<int> ans;
for(int i= buckets.size()-1; i>=0 && ans.size() < k; --i){ //start from right side, because we want to find top k frequent
for(int num: buckets[i]){
ans.push_back(num);
if(ans.size() == k)
break;
}
}
return ans;
}
};

time complexity: O(n)
space complexity: O(n)

Solution 2 Heap

Thought process:
Idea is similar to bucket sort. Build frequency map. Then use heap to make the highest frequency element on heap.peek.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
# build frequency map
dic = {}
for i in nums:
if i in dic:
dic[i] -= 1
else:
dic[i] = -1

# use heap to sort by frequency
hq = []
for key in dic:
heapq.heappush(hq, (dic[key], key))

# get the top kth
res = []
for _ in range(k):
res.append(heapq.heappop(hq)[1])
return res

time: O(NlogN)
space: O(N)

Improvment:

We don’t have to keep all the elements in heap, just the top K ones. So we can pop out elements when there’s a higher frequency appeared.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
# build frequency map
dic = {}
for i in nums:
if i in dic:
dic[i] += 1
else:
dic[i] = 1

# use heap to sort by frequency
hq = []
for key in dic:
heapq.heappush(hq, (dic[key], key))
if len(hq) > k:
heapq.heappop(hq)

# get the top kth
res = []
while hq:
tmp = heapq.heappop(hq)
res.append(tmp[1])
return res

time: O(NlogK)
space: O(N)