Interview Questions

Find all duplicates in an array in linear time (v1)

This is a common interview question where you need to write a program to find all duplicates in an array where the numbers in the array are in the range of 0 to n-1 where n is the size of the array. For example: [1, 2, 3, 3] is okay but [1, 2, 6, 3] is not. In this version of the challenge there can be multiple duplicate numbers as well.
The algorithm below is commented to explain what each piece of code does, but the general algorithm is:
(1) Loop through the array
(2) For each element, find array[absolute(array[i])] in the array and set its value to negative if positive
(3) If in step 2 you encounter a negative number, then it means the element at index i in the array is a duplicate
## An example

arr = [1, 2, 2, 3, 1]
At i = 0
arr[absolute(arr[0])] = 2 which is positive so make it negative
arr = [1, -2, 2, 3, 1]
At i = 1
arr[absolute(arr[1])] = 2 which is positive so make it negative
arr = [1, -2, -2, 3, 1]
At i = 2
arr[absolute(arr[2])] = -2 which is negative which means the element originally at index 2 is a duplicate
duplicates = [2]
At i = 3
arr[absolute(arr[3])] = 3 which is positive so make it negative
arr = [1, -2, -2, -3, 1]
At i = 4
arr[absolute(arr[4])] = -2 which is negative which means the element originally at index 4 is a duplicate
duplicates = [2, 1]
## Running time

This algorithm runs in O(*n*) time because it only needs to loop through the array once. A problem though, is that this algorithm modifies the original array which may not be what you want. You can always copy the entire array before running the algorithm, which would add to the running time. Copying the entire array would run in linear time though, so the final running time would be O(2*n*) which is still O(*n*).
## Sources

http://www.careercup.com/question?id=10255343
https://www.glassdoor.ca/Interview/Given-an-input-array-remove-all-any-duplicate-occurrences-and-return-the-array-QTN_1258495.htm
http://www.glassdoor.com/Interview/Determine-if-an-array-from-1-n-has-a-duplicate-in-constant-time-and-space-QTN_504470.htm

function duplicates(arr) { // where we will store our duplicate values var dups = []; for (var i = 0; i < arr.length; i++) { // get element in array var el = arr[Math.abs(arr[i])]; // element in array is positive so make it negative if (el > 0) { arr[Math.abs(arr[i])] = -arr[Math.abs(arr[i])]; } // special case if element is zero // we set the value at this index to -arr.size as not to // mix this number up with the others because we know the // numbers are all in the range of 0 to n-1 else if (el === 0) { arr[Math.abs(arr[i])] = -arr.length; } // element is negative so it is a duplicate else { if (Math.abs(arr[i]) === arr.length) { dups.push(0); } else { dups.push(Math.abs(arr[i])); } } } return dups; } duplicates([0,2,0,1,3,3]);

def duplicates(arr): # where we will store our duplicate values dups = [] for i in range(0, len(arr)): # get element in array and check if array is in bounds if abs(arr[i]) == len(arr): el = -1 else: el = arr[abs(arr[i])] # element in array is positive so make it negative if el > 0: arr[abs(arr[i])] = -arr[abs(arr[i])] # special case if element is zero # we set the value at this index to -arr.size as not to # mix this number up with the others because we know the # numbers are all in the range of 0 to n-1 elif el == 0: arr[abs(arr[i])] = -len(arr) # element is negative so it is a duplicate else: if abs(arr[i]) == len(arr): dups.append(0) else: dups.append(abs(arr[i])) return dups print duplicates([0, 2, 0, 1, 3, 3])

mrdaniel
published this on 11/11/15 **|**

2

Since JavaScript has the concept of -0, the code can be significanlty simplified. The test for -0 is 1/x > 0. Evaluates to false if x is -0 and true if normal +0.

var abs = (x) => Math.abs(arr[x]); for (var i = 0; i < arr.length; i++) { // get element in array var el = arr[abs(i)]; // element in array is positive so make it negative; handle -0 if (el >= 0 && 1/el > 0) arr[abs(i)] = -arr[abs(i)]; // element is negative so it is a duplicate else dups.push(Math.abs(arr[i])); }

numbergames

commented on 09/28/16

1

@numbergames
Don't see how to edit, but need to update the test in my code -- don't need two tests, so

if (el >= 0 && 1/el > 0)should be:

if (1/el > 0)

numbergames

commented on 09/28/16

1

@karn09 This specific solution solves the challenges of finding duplicates in an array with numbers *only* in the range 0 to n-1 where n is the size of the array.
So for [1,2,1,3,6, 5] it wouldn't work because the size of the array is 6 so the range must be 0 to 5, and there is a 6 in the array. Same goes for the second example.
The v2 algorithm gives a solution for the more general challenge of finding duplicates in an array of any range of numbers.

mrdaniel

commented on 03/14/16

0

Sorry I just realized this was building up to using a hash table.

camsbury

commented on 06/18/16

0

I did this using a hash table in Python - seems to still run in O(n) time (and with less array manipulation), and works for any numbers:

def get_dups(array): ht = {} dups = [] for x in array: if x in ht: dups.append(x) else: ht[x] = x return list(set(dups))

camsbury

commented on 06/18/16

0

Very good point. I will need to be more careful when reading over the cases. Thank you.

karn09

commented on 03/15/16

0

There seems to be a problem with this solution. In certain cases it returns a duplicate multiple times and in others will return non-duplicated numbers.
duplicates([1,2,1,3,6, 5]); => [1,0]
duplicates([0,2,1,3,6, 6]); => [0,0]
I think part of the issue is with setting the special case 0 to be equal to the length where arrays contain duplicates that are equal to the length.

karn09

commented on 03/14/16

Login to submit a comment