H-Index II

LeetCode Problem #275 (Medium)

Problem Statement

Given an array of citations sorted in ascending order (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.

Note:

If there are several possible values for h, the maximum one is taken as the h-index.

Testcase

Input: citations = [0,1,3,5,6]
Output: 3 
Explanation: 
[0,1,3,5,6] means the researcher has 5 papers in total and each of them had 
received 0, 1, 3, 5, 6 citations respectively. 
Since the researcher has 3 papers with at least 3 citations each and the remaining 
two with no more than 3 citations each, her h-index is 3.

Problem Explanation

According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N - h papers have no more than h citations each."

In other words, we must find the first element such that the value of that element is greater than or equal to the number of elements remaining in the array. We may not find an element in the array with such a value, but we can calculate the h-index value as len - mid.

The reason for this is because we always need to find a value h wherein, there are at least h number of values from the array that are the greater than or equal to h. Calculating len - mid will give us the optimized version of this value.

There may also be more than one value which satisfies the condition, but we are interested in the maximum value among these values.

Prerequisite Knowledge

Binary Search

Refer to the Prerequisite Knowledge section in Random Pick with Weight:

Random Pick with Weight

Algorithm

Binary Search Approach

In this approach, we need to traverse through the array to find the value of the first array element that is greater than or equal to the number of elements that remain in the array. The conventional approach would be to traverse the array in linear time, but since the array has already been sorted, we can make use of binary search to achieve the same result.

Initialize the high and low pointers to point to the extreme ends of the sorted array.
Calculate the mid-value.
Check if the condition to find the h-index is satisfied by the mid element:
citations[index] >= length(citations) - index
Store the value of the remaining elements in a variable and continue the process to find the optimum value.
When the first optimal value that satisfies the condition is found, return the value.

Time Complexity: O(logn)

Example :

len = 5, low = 0, high = 4, mid = 2

[1, 2, 3, 6, 7]
 0  1  2  3  4
 l     m     h

The mid-value is currently 3. There must be at least 3 papers with a value greater than or equal to 3.

len - mid = 5 - 2 = 3

3 >= 3 -> true

h_index = len - mid = 5 - 2 = 3

There are 3 (h) papers that have an h-index of at least 3 and 2 (N-h) papers that have an h-index of no more than 3. This satisfies the stipulations of the given problem.

Code

int hIndex(vector<int>& citations) {
    int len = citations.size();
    int low = 0;
    int high = len - 1;
    int h_index = 0;
    if(len == 0) 
        return 0;
    while(low <= high) {
        int mid = low + ((high - low) / 2);
        // Condition to find the h-index: 
        // citations[index] >= length(citations) - index
        if(citations[mid] >= len - mid) {
            h_index = len - mid;
            high = mid - 1;
        }
        else {
            low = mid + 1;
        }
    }
    return h_index;
}

PreviousSurrounded Regions NextLongest Duplicate Substring

Last updated 5 years ago

Was this helpful?