Interview Questions

Find all duplicates in an array in linear time (v2)
This is a common interview question where you need to write a program to find all duplicates in an array. The elements in the array have no restrictions, but in this algorithm we'll work specifically with integers. Finding duplicates in an array can be solved in linear time by using a hash table to store each element as we pass through the array. The general algorithm is:
(1) Loop through the array (2) At each element check if it exists in the hash table, which has a lookup of O(1) time (3) If the element exists in the hash table then it is a duplicate, if it doesn't exist, insert it into the hash table, also O(1)

## Solution

```function duplicates(arr) {

// our hash table to store each element
// in the array as we pass through it
var hashTable = [];

// store duplicates
var dups = [];

// check each element in the array
for (var i = 0; i < arr.length; i++) {

// if element does not exist in hash table
// then insert it
if (hashTable[arr[i].toString()] === undefined) {
hashTable[arr[i].toString()] = true;
}

// if element does exist in hash table
// then we know it is a duplicate
else { dups.push(arr[i]); }

}

return dups;

}

duplicates([1, 21, -4, 103, 21, 4, 1]);
```
```def duplicates(arr):

# our hash table to store each element
# in the list as we pass through it
hashTable = {}

# store duplicates
dups = []

# check each element in the array
for i in range(0, len(arr)):

# if element does not exist in hash table
# then insert it
if arr[i] not in hashTable:
hashTable[arr[i]] = True

# if element does exist in hash table
# then we know it is a duplicate
else:
dups.append(arr[i])

return dups

print duplicates([1, 21, -4, 103, 21, 4, 1])
```

## Running time

This algorithm runs in O(n) time because it loops through the array only once, each time checking if the element exists in the hash table and inserting it, which are both constant operations (on average) running in O(1) time.

## Sources

http://www.careercup.com/question?id=10255343 https://www.glassdoor.ca/Interview/Given-an-input-array-remove-all-any-duplicate-occurrences-and-return-the-array-QTN_1258495.htm
mrdaniel published this on 11/11/15 |
• +
• 2
• -
• duplicates([1, 21, -4, 103, 21, 4, 1,1]); => [21, 1, 1]; should we expect [21, 1]?
• +
• 1
• -
• Why do you use ` for i in range(0, len(arr)): ` instead of ` for i in arr ` - and reference i instead of arr[i] in the following code?
• +
• 1
• -
• This code is not completely correct because it returns duplicates, eg if you try it with the array [0, 0, 1, 0], it will return [0, 0].It should return each duplicated value only once.
• +
• 1
• -
• Solution for Java ```package main; import java.io.*; import java.util.*; import java.text.*; import java.math.*; import java.util.regex.*; import java.util.Scanner; import java.util.ArrayList; public class MainCls { public static void main(String[] args){ Scanner inputs = new Scanner(System.in); String unos = inputs.next(); String[] niz = unos.split(","); ArrayList<String> duplikati = new ArrayList <String>(); ArrayList<String> druginiz = new ArrayList <String> (); for(String k: niz){ if(!druginiz.contains(k)){ druginiz.add(k); } else { if(duplikati.contains(k)) { continue; } else duplikati.add(k); } } System.out.print(duplikati); } } ```
• +
• 0
• -
• @jaychiarella, Yes, in this case we can actually use the second line of code you posted because we are not doing anything in particular with the index i. Both lines of code you posted work, but maybe the second one is more readable and easier to understand :)
• +
• 0
• -
• This code will be returning duplicate if multiple numbers of duplicate in array[0,1,2,0,2,3,0,4,5,0,2,4,0,5,6,0,3,4].
• +
• 0
• -
• @montekaka... try this ``` function duplicateArray( argArray ){ hashTable = []; duplicate = []; for( var i=0; i < argArray.length; i++){ //console.log( hashTable[ argArray[i].toString()] ); if( hashTable[ argArray[i].toString() ] === undefined ){ hashTable[ argArray[i].toString() ] = true; }else{ if( duplicate.indexOf(argArray[i]) == -1 ){ duplicate.push( argArray[i] ); } } } return duplicate; } ```
• +
• 0
• -
• @montekaka Since I am still afraid of the indexOf's, I would prefer to define 'dups' as a Set like below: ```var findDupplicate = (arr) => { var hashTable = []; // store duplicates var dups = new Set(); // check each element in the array for (var i = 0; i < arr.length; i++) { // if element does not exist in hash table // then insert it if (hashTable[arr[i].toString()] === undefined) { hashTable[arr[i].toString()] = true; } // if element does exist in hash table // then we know it is a duplicate else { dups.add(arr[i]); } } return Array.from(dups); }```