Find the Duplicate Number

LeetCode Problem #287 (Medium)

Problem Statement

Given an array, nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one.

Note:

You must not modify the array (assume the array is read-only).
You must use only constant, O(1) extra space.
Your runtime complexity should be less than O(n^2).
There is only one duplicate number in the array, but it could be repeated more than once.

Testcases

Example 1:

Input: [1,3,4,2,2]
Output: 2

Example 2:

Input: [3,1,3,4,2]
Output: 3

Prerequisite Knowledge

Floyd's Cycle Detection Algorithm

Floyd's "Tortoise and Hare" Cycle Detection Algorithm is an algorithm which is conventionally used to detect a cycle in a linked list. However, the logic of this approach can also be applied in this context to detect the presence of a duplicate element in an array.

Cycle Formation

The elements in the array can be represented as:

x, nums[x], nums[nums[x]], nums[nums[nums[x]]], ...

In effect, we are creating a sequence of elements which are indexed based on the value of the previous element. If a sequence of elements contains a duplicate element, then it would create a cycle within the sequence.

In the illustration below, we can see that the arrangement of elements is dependent on the value of the previous element.

nums[0] = 2
nums[1] = nums[nums[0]] = nums[2] = 4
nums[2] = nums[nums[2]] = nums[4] = 3
nums[3] = nums[nums[4]] = nums[3] = 1-> Duplicate
nums[4] = nums[nums[3]] = nums[1] = 6
nums[5] = nums[nums[1]] = nums[6] = 5
nums[6] = nums[nums[6]] = nums[5] = 1-> Duplicate

Approach

This approach utilises two pointers—fast pointer (hare), slow pointer (tortoise). The fast pointer moves two elements at a time while the slow pointer moves one element at a time.

Phase One

In the first phase, the hare moves twice as fast as the tortoise. Since the hare moves faster than the tortoise, it would be the first to enter the cycle, and would eventually catch up to the tortoise. At this point of intersection, phase one is complete.

Phase Two

In the second phase of this algorithm, we slow down the hare such that it traverses the array one element at a time. We move the hare back to the starting point and allow the tortoise to continue its traverse the array from the point of intersection from phase one.

Now when the hare and tortoise meet again, this point of intersection would be the entry point of the cycle or the duplicate element.

Algorithm

Hash Map Approach

In this approach, we simply count the number of occurrences of each element until we reach an element which occurs twice.

Although this solution is accepted by LeetCode, it does not satisfy the condition that the space complexity must by O(1).

Initialize an unordered map to store the count of each element.
Traverse through the array.
a. If the element is already in the map, return that element since it is a duplicate.
b. If the element is not found, add it to the map.

Time Complexity: O(n)

Space Complexity: O(n)

Sorting Approach

In this approach, the array elements are sorted before the process of traversal and element comparison, so no additional space is required. This means that the space complexity is O(1) and satisfies the condition.

Although this solution is accepted by LeetCode, we cannot consider it to be the correct solution because it does not satisfy the condition that the original array cannot be modified.

Sort the elements of the array in-place.
If adjacent elements in the array are equal, then duplicates exist. Return the element.

Time Complexity: O(nlogn)

Space Complexity: O(1)

Cycle Detection Approach

In this approach, we try to identify the point at which a cycle begins within the array due to the presence of a duplicate element. There are two pointers; a slow pointer and a fast pointer. The fast pointer traverses the array at twice the speed of the slow pointer. The traversal happens within the original array, with no modifications being made to the array.

Therefore, the time complexity is less than O(n^2) and the space complexity is O(1), which satisfies the constraints.

Initialize two pointers, tortoise and hare. The hare pointer would traverse the array twice as fast as the tortoise pointer.
Until the hare and tortoise pointer meet, make the hare pointer advance two elements at a time and the tortoise pointer advance one element at a time.
Reinitialize the hare pointer to 0.
Slow down the speed of the hare pointer such that it advances only one element at a time.
The point at which the hare and tortoise meet in this second traversal is the entry point of the cycle.
Return the entry point of the cycle as the repeated element.

Time Complexity: O(n)

Space Complexity: O(1)

Code

Hash Map Approach

int findDuplicate(vector<int>& nums) {
    unordered_map<int, int> m;
    for(int i = 0; i < nums.size(); i++) {
        // If the number is already in the map, it is a duplicate
        if(m.find(nums[i]) != m.end()) {
            return nums[i];
        }
        // If the number is not found, add it to the map
        else {
            m[nums[i]]++;
        }
    }
    return -1;
}

Sorting Approach

int findDuplicate(vector<int>& nums) {
    // Sort the array - O(1) 
    sort(nums.begin(), nums.end());
    for(int i = 1; i < nums.size(); i++) {
        // If the adjacent numbers are equal, it is the duplicate number
        if(nums[i] == nums[i-1]) {
            return nums[i];
        }
    }
    return -1;
}

Cycle Detection Approach

int findDuplicate(vector<int>& nums) {
    int tortoise = nums[0];
    int hare = nums[nums[0]];
    // Phase 1:
    // The hare moves twice as fast as the tortoise
    // Exit the loop when the tortoise and hare meet
    while(tortoise != hare) {
        tortoise = nums[tortoise];
        hare = nums[nums[hare]];
    }
    // Phase 2:
    // Slow down the hare
    // The intersection point of the tortoise and hare is the entry point of the cycle
    hare = 0;
    while(tortoise != hare) {
        hare = nums[hare];
        tortoise = nums[tortoise];
    }
    // Return the intersection point
    return tortoise;
}

PreviousUnique Binary Search Trees NextSum Root to Leaf Numbers

Last updated 4 years ago

Was this helpful?

Find the Duplicate Number

LeetCode Problem #287 (Medium)

Problem Statement

Note:

You must not modify the array (assume the array is read-only).
You must use only constant, O(1) extra space.
Your runtime complexity should be less than O(n^2).
There is only one duplicate number in the array, but it could be repeated more than once.

Testcases

Example 1:

Input: [1,3,4,2,2]
Output: 2

Example 2:

Input: [3,1,3,4,2]
Output: 3

Prerequisite Knowledge

Floyd's Cycle Detection Algorithm

Cycle Formation

The elements in the array can be represented as:

x, nums[x], nums[nums[x]], nums[nums[nums[x]]], ...

In the illustration below, we can see that the arrangement of elements is dependent on the value of the previous element.

nums[0] = 2
nums[1] = nums[nums[0]] = nums[2] = 4
nums[2] = nums[nums[2]] = nums[4] = 3
nums[3] = nums[nums[4]] = nums[3] = 1-> Duplicate
nums[4] = nums[nums[3]] = nums[1] = 6
nums[5] = nums[nums[1]] = nums[6] = 5
nums[6] = nums[nums[6]] = nums[5] = 1-> Duplicate

Approach

This approach utilises two pointers—fast pointer (hare), slow pointer (tortoise). The fast pointer moves two elements at a time while the slow pointer moves one element at a time.

Phase One

Phase Two

Now when the hare and tortoise meet again, this point of intersection would be the entry point of the cycle or the duplicate element.

Refer to the for an animated illustration of the images above and a more detailed explanation of the working of this algorithm.

Algorithm

Hash Map Approach

In this approach, we simply count the number of occurrences of each element until we reach an element which occurs twice.

Although this solution is accepted by LeetCode, it does not satisfy the condition that the space complexity must by O(1).

Initialize an unordered map to store the count of each element.
Traverse through the array.
a. If the element is already in the map, return that element since it is a duplicate.
b. If the element is not found, add it to the map.

Time Complexity: O(n)

Space Complexity: O(n)

Sorting Approach

Although this solution is accepted by LeetCode, we cannot consider it to be the correct solution because it does not satisfy the condition that the original array cannot be modified.

Sort the elements of the array in-place.
If adjacent elements in the array are equal, then duplicates exist. Return the element.

Time Complexity: O(nlogn)

Space Complexity: O(1)

Cycle Detection Approach

Therefore, the time complexity is less than O(n^2) and the space complexity is O(1), which satisfies the constraints.

Initialize two pointers, tortoise and hare. The hare pointer would traverse the array twice as fast as the tortoise pointer.
Until the hare and tortoise pointer meet, make the hare pointer advance two elements at a time and the tortoise pointer advance one element at a time.
Reinitialize the hare pointer to 0.
Slow down the speed of the hare pointer such that it advances only one element at a time.
The point at which the hare and tortoise meet in this second traversal is the entry point of the cycle.
Return the entry point of the cycle as the repeated element.

Time Complexity: O(n)

Space Complexity: O(1)

Code

Hash Map Approach

int findDuplicate(vector<int>& nums) {
    unordered_map<int, int> m;
    for(int i = 0; i < nums.size(); i++) {
        // If the number is already in the map, it is a duplicate
        if(m.find(nums[i]) != m.end()) {
            return nums[i];
        }
        // If the number is not found, add it to the map
        else {
            m[nums[i]]++;
        }
    }
    return -1;
}

Sorting Approach

int findDuplicate(vector<int>& nums) {
    // Sort the array - O(1) 
    sort(nums.begin(), nums.end());
    for(int i = 1; i < nums.size(); i++) {
        // If the adjacent numbers are equal, it is the duplicate number
        if(nums[i] == nums[i-1]) {
            return nums[i];
        }
    }
    return -1;
}

Cycle Detection Approach

int findDuplicate(vector<int>& nums) {
    int tortoise = nums[0];
    int hare = nums[nums[0]];
    // Phase 1:
    // The hare moves twice as fast as the tortoise
    // Exit the loop when the tortoise and hare meet
    while(tortoise != hare) {
        tortoise = nums[tortoise];
        hare = nums[nums[hare]];
    }
    // Phase 2:
    // Slow down the hare
    // The intersection point of the tortoise and hare is the entry point of the cycle
    hare = 0;
    while(tortoise != hare) {
        hare = nums[hare];
        tortoise = nums[tortoise];
    }
    // Return the intersection point
    return tortoise;
}

PreviousUnique Binary Search Trees NextSum Root to Leaf Numbers

Last updated 4 years ago

Was this helpful?