Problem description:

Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one.

Example 1:

Input: [1,3,4,2,2]
Output: 2

Example 2:

Input: [3,1,3,4,2]
Output: 3

Note:

  • You must not modify the array (assume the array is read only).
  • You must use only constant, O(1) extra space.
  • Your runtime complexity should be less than O(n^2).
  • There is only one duplicate number in the array, but it could be repeated more than once.

Solution:

1. Floyd’s Tortoise and Hare Algorithm

The first while loop finds a meeting point inside the cycle that contains the duplicate. For example:
index = [0 1 2 3 4 5 6 7]; nums = [5 2 1 3 5 7 6 4].
Written as (index)value, slow visits (0)5 (5)7 (7)4 (4)5, and fast, moving two steps at a time, visits (0)5 (7)4 (5)7 (4)5. When they meet at (idx=4)(value=5), you know there is a cycle.

Take a look at the cycle by the indices and values:

idx: 0 -> 5 -> 7 -> 4 -> (goes back to idx=5)

val: 5 -> 7 -> 4 -> 5 -> (goes back to val=7)

The second while loop starts with fast at index 0 and slow at index 4. Both indices hold the value 5, the duplicate number, so after one step both pointers land on index 5 and the loop stops. Two different indices holding the same value is exactly why the two paths merge at one index. In general, the second loop stops when both pointers reach the first item of the cycle, and the index of that item is the duplicate number.
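The trace of the first loop in the example above can be reproduced with a small sketch. The helper name tracePhaseOne is hypothetical and exists only for illustration; it stores every step, so it deliberately ignores the O(1) space requirement of the real solution.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (illustration only): records the (slow, fast) index
// pair after every step of the first loop, ending with the meeting pair.
std::vector<std::pair<int, int>> tracePhaseOne(const std::vector<int>& nums) {
    assert(nums.size() > 1);
    std::vector<std::pair<int, int>> steps;
    int slow = nums[0];        // one step away from index 0
    int fast = nums[nums[0]];  // two steps away from index 0
    steps.emplace_back(slow, fast);
    while (slow != fast) {
        slow = nums[slow];        // tortoise: one step
        fast = nums[nums[fast]];  // hare: two steps
        steps.emplace_back(slow, fast);
    }
    return steps;
}
```

For nums = [5 2 1 3 5 7 6 4] the recorded pairs are (5,7), (7,5), (4,4), so both pointers meet at index 4, which matches the trace above.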

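Proof of the second step:

Treat the array as a linked list in which index i points to index nums[i]. Let x be the distance from the start to the first node of the cycle, y the distance from that node to the meeting point, and z the distance from the meeting point back to the first node of the cycle.

Distance traveled by the tortoise when they meet = x + y
Distance traveled by the hare when they meet = (x + y + z) + y = x + 2y + z
Since the hare travels at double the speed of the tortoise,
2(x + y) = x + 2y + z => 2x + 2y = x + 2y + z => x = z

If the hare completes more than one extra lap before the meeting, x equals z plus a whole number of laps, and the conclusion is unchanged.

Hence, after moving one of the animals back to the start of the linked list and making both move one node at a time, they have the same distance to cover.
They meet at the node where the loop starts in the linked list.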
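The distances in the proof can be measured directly on a concrete array. The following is a minimal sketch, assuming the same pointer moves as the solution; the struct Distances and the function measure are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type: the three distances used in the proof.
struct Distances {
    int x;  // steps from index 0 to the cycle entry
    int y;  // steps from the cycle entry to the meeting point
    int z;  // steps from the meeting point back to the cycle entry
};

Distances measure(const std::vector<int>& nums) {
    assert(nums.size() > 1);
    // Phase 1: find the meeting point inside the cycle.
    int slow = nums[0];
    int fast = nums[nums[0]];
    while (slow != fast) {
        slow = nums[slow];
        fast = nums[nums[fast]];
    }
    const int meeting = slow;

    // Phase 2: walk from index 0 and from the meeting point in lockstep;
    // the number of steps until they coincide is x.
    int entry = 0;
    int x = 0;
    while (entry != slow) {
        entry = nums[entry];
        slow = nums[slow];
        ++x;
    }

    // Count the steps from the entry to the meeting point (y)
    // and from the meeting point back to the entry (z).
    int y = 0;
    for (int p = entry; p != meeting; p = nums[p]) ++y;
    int z = 0;
    for (int p = meeting; p != entry; p = nums[p]) ++z;

    return {x, y, z};
}
```

For nums = [5 2 1 3 5 7 6 4] this gives x = 1, y = 2, z = 1, so x = z as in the proof. For Example 1, [1,3,4,2,2], it gives x = 3, y = 1, z = 1: the hare made an extra lap, and x exceeds z by exactly one cycle length (y + z = 2).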

#include <vector>
using std::vector;

class Solution {
public:
    int findDuplicate(vector<int>& nums) {
        if (nums.size() > 1)
        {
            // Phase 1: slow moves one step and fast moves two
            // until they meet somewhere inside the cycle.
            int slow = nums[0];
            int fast = nums[nums[0]];
            while (slow != fast)
            {
                slow = nums[slow];
                fast = nums[nums[fast]];
            }

            // Phase 2: restart fast at index 0 and move both one step
            // at a time; they meet at the cycle entry, the duplicate.
            fast = 0;
            while (fast != slow)
            {
                fast = nums[fast];
                slow = nums[slow];
            }
            return slow;
        }
        return -1;  // fewer than two elements: no duplicate possible
    }
};

2. Binary Search

This solution is based on binary search.

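At first the search space is the numbers from 1 to n. Each time I select the middle number mid of the search space and count all the numbers in the array that are less than or equal to mid. If the count is more than mid, the search space becomes [low mid]; otherwise it becomes [mid+1 high]. I repeat this until the search space is a single number. Each count is one O(n) pass over the array and the search space halves every time, so the total running time is O(n log n).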

Let’s say n=10 and I select mid=5. Then I count all the numbers in the array that are less than or equal to mid. If more than 5 numbers are less than or equal to 5, then by the Pigeonhole Principle (https://en.wikipedia.org/wiki/Pigeonhole_principle) one of them has occurred more than once. So I shrink the search space from [1 10] to [1 5]. Otherwise the duplicate number is in the second half, so for the next step the search space is [6 10].
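The counting step on its own can be sketched as follows. The helper name countAtMost is hypothetical and not part of the solution; it uses std::count_if from the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: number of entries in nums that are <= mid.
int countAtMost(const std::vector<int>& nums, int mid) {
    return static_cast<int>(
        std::count_if(nums.begin(), nums.end(),
                      [mid](int v) { return v <= mid; }));
}
```

On Example 1, [1,3,4,2,2] with n=4, the first mid is 2 and countAtMost returns 3. Since 3 > 2, the search space shrinks to [1 2]. The next mid is 1 with a count of 1, which is not more than 1, so the search space becomes [2 2] and the answer is 2.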

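#include <vector>
using std::vector;

class Solution {
public:
    int findDuplicate(vector<int>& nums) {
        int n = static_cast<int>(nums.size()) - 1;
        int low = 1;
        int high = n;
        while (low < high) {
            // Midpoint of the current search space, written to avoid overflow.
            int mid = low + (high - low) / 2;

            // Count the entries that are less than or equal to mid.
            int count = 0;
            for (int num : nums) {
                if (num <= mid) count++;
            }
            //printf("low=%d, high=%d, mid=%d, count=%d\n", low, high, mid, count);
            if (count > mid) high = mid;  // duplicate is in [low mid]
            else low = mid + 1;           // duplicate is in [mid+1 high]
        }
        return low;
    }
};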