Tuesday, October 30, 2012

Interviewstreet Amazon India - Meeting Schedule

Given M busy-time slots of N people, You need to print all the available time slots when all the N people can schedule a meeting for a duration of K minutes.
Event time will be of form HH MM ( where 0 <= HH <= 23 and 0 <= MM <= 59 ), K will be in the form minutes
An event time slot is of form [Start Time, End Time ) . Which means it inclusive at start time but doesn’t include the end time.

Sample Input:                                                       Sample Output:
5 120                                                                    00 00 09 00
16 00 17 00                                                          17 00 20 45
10 30 15 30
20 45 22 15
10 00 13 25
09 00 11 00

The time is expressed as hh mm, and we have 1440 minutes in a day. So we can create a integer array of size 1440, with elements initialized to 0. Now read each value of the busy-time and set the corresponding slot in the integer array to 1s. For instance 01:28 is at array position 88. Since its 1 hour and 28 minutes. 
Once all the busy-time slots are mapped to the integer array, then you can find the free slot of given duration. So to find a free slot, start from integer array position 0 and search for consecutive 0s of length atleast 'duration'. All such durations are the free durations. 


Monday, October 29, 2012

Interview Street - pairs of numbers that have a difference of K

Given N numbers , [N<=10^5] we need to count the total pairs of numbers that have a difference of K

Input Format:
1st line contains N & K (integers).
2nd line contains N numbers of the set. All the N numbers are assured to be distinct.
Output Format:
One integer saying the no of pairs of numbers that have a diff K.

Sample Input #00:
5 2
1 5 3 4 2

Sample Output #00:3

This problem can be solved with hashing or by sorting.I have taken the sorting approach.
Sort the contents of the array in (n log n) time. Then for each number in the array, do a binary search to see if the num+k exists in the array.

In hashing approach, maintain a hashtable. As you iterate each number X, check if X+k exists in the hashtable with 1 value. If its there then you have a pair. Else put the key (X+k) and value '0'. 

Interview Street - Indian Startup - Matrix Multiplication

Mr. Evan has given to his students an assignment where they need to multiply two boolean matrices and write the resultant matrix. Boolean matrix multiplication is done in the same way as standard matrix multiplication except for in the resultant matrix any entry which is not zero is treated as one.
Mr. Evan is tired of checking the correctness of their results, so he has asked you for help.
Given three N X N boolean matrices A, B and C, you need to write a code to determine whether A x B = C.
NOTE: There need not be a deterministic algorithm so you need to come up with a better probabilistic algorithm to get accepted.
Matrix A
0 0 0 0 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1

Matrix B 
1 0 0 0 0
0 0 0 0 0
0 1 0 0 0
1 0 0 0 0
0 0 0 0 0

Matrix C 
0 0 0 0 0
1 0 0 0 0
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0

Here we don't need to do actual multiplication. We can identify all matrix A elements having 1's and matrix B elements having 1's. So in the above - A has 1 in (4, 3) and B in (3, 2). Here A col and B row are same (3). So the resultant matrix C would have 1 in (4, 2). 


InterviewStreet Puzzle for Startups India

This question was posted in the InterviewStreet for India Startups. InterviewStreet regularly runs contests were the topics are related to coding, algorithms and puzzles. The toppers of the contest get a chance to apply for a job with the startups. This question is taken from the Indian startups round.

Alice and Bob are playing a game. This game starts with two piles of n1 and n2 chips. They play alternatively and Alice starts first.
In his/her turn a person has to remove one of the piles and split the other pile into two piles, these two new piles need not be of same size. The person who cannot make a move in his turn loses.
Write a program which given n1 and n2 finds the winner. Assume both the players play optimally.

Visualizing this kind of a puzzle is easy with Mathematical Induction. So take a smaller value and gradually build up. 

Example 1:-
Start with (2, 1).
Alice keeps 1, and splits the 2.
Bob gets (1,1). Since he can keep 1, but cannot keep the other 1, he looses.

Example 2:-
Start with (2, 2)
Alice keeps 2, and splits the other 2.
Bob gets (1,1). Since he can keep 1, but cannot keep the other 1, he looses.

Example 3:-
Start with (3, 2)
Alice keeps 3, and splits the 2.
Bob gets (1,1). Since he can keep 1, but cannot keep the other 1, he looses.

Example 4:-
Start with (3, 3)
Alice keeps 3, and splits the other 3.
Bob gets (2,1). Bob keeps 1 and splits 2.
Alice gets (1,1) she looses.

Example 5:-
Start with (8, 1)
Alice keeps 1, and splits the 8
Bob gets (5, 3). He cannot keep 5 and split 3, since in that case he will have to provide (2, 1) in which case he will loose immediately. So he splits 5. To split 5, he cannot split it to (3, 2) as Alice will keep 3 and split the 2 as (1,1). In this case again Bob will loose. So to keep the game going he will split 5 as (4,1).
Alice gets (4,1). She keeps 1 and splits 4. There are 2 options (2, 2) and (3,1). But if she gives (2,2), in the next move Bob will provide her (1,1) and she will loose. So she splits (3,1).
Bob gets (3,1). He keeps 1 and has to split (2,1). So again he looses.

In example 5, instead of Alice splitting (5, 3), what if she split (4,4) or (6, 2). Well (6,2) is not possible, as she will loose immediately. Assume she splits (4, 4)
Bob gets (4, 4). He can split as (2, 2) which he won't do. So he will split (3, 1).
Alice gets (3,1). She keeps 1 and splits ( 2, 1). She will loose.

So splitting 8 as (5, 3) was a better move.

Example 6:-
Start with (10, 1).
Bob (5, 5)
Alice ( 4, 1)
Bob (3, 1)
Alice ( 2, 1) and wins

Start with (10, 1)
Bob (6, 4)
Alice (3, 1)
Bob (2, 1) and wins

Start with (10, 1)
Bob (6, 4)
Alice (3, 3)
Bob (2, 1) and wins

I have provided a solution which ensures the highest chances for Alice to win. Alice winning chances are when she gets atleast one input as even. For instance (10, 1) here 10 was even. She will split it into 2 odds - 5 and 5. This way she can ensure her winning. If she got (8, 1) then she can split (5, 3). If she got (8, 12) still she can split (5, 3). This way she can ensure the game is finished early.
If she gets (13, 1) that is the only option available is an odd, then Bob stands a winning chance. Still she can keep the game on by splitting into (12, 1).
The program prints the most optimized path favoring Alice.

Saturday, October 27, 2012

Facebook Question - Find the longest increasing subsequence

You are given an unsorted array of integers. In that array, find the longest increasing subsequence.
For eg. {1, 5, 3, 4, 6, 1, 2, 4, 8, 5 }
In the above, there are multiple increasing sequences:
Of the above, last one is the longest sequence.
Write a program to solve the above. It should be linear O(N) time.

Initialize an array of same length as input. Call it current array. Read the first element and store it in this array. Keep navigating the input array and if the current element is greater than the last element in the current array, then put the value into the current array. If the value is less than the last current array element, then we need to start again. So we need to reset the current array. But hold on, the current array that we have might be the  longest subsequence. So before resetting the array check if the current length of current array is the biggest we found ever. If so copy the array elements into the result array.

current array: {1, 0, 0, 0, 0, 0, 0, 0, 0, 0}
length: 1

i=1, a[1] = 5 (increasing)
current array: {1, 5, 0, 0, 0, 0, 0, 0, 0, 0}
length: 2
maxlength: 2

i=2, a[2] = 3 (decreasing)
current array: {3, 0, 0, 0, 0, 0, 0, 0, 0, 0}
result: {1, 5}
length: 1
maxlength: 2

i=3, a[3] = 4 (increasing)
current array: {3,4, 0, 0, 0, 0, 0, 0, 0, 0}
length: 2


Friday, October 26, 2012

FaceBook Question - Find a missing number in 1 Billion numbers

Problem -
Facebook has over 1 Billion users and each user is assigned a unique FacebookID. Assume that each FacebookID is a valid int. These FacebookIDs are written in a text file sequentially. When an user de-activates his account on Facebook, the corresponding FacebookID is deleted from the file.
Write a program to find a user who has deactivated his account.

Approach -
Assuming int is 32 bits the total number of positive values supported are - 4,294,967,296. So it can support 4.2 billion numbers. 

1. Bit Map approach - We use a BitMap (or a boolean array in java) with values initialized to 0 (or false). The size of the bitMap is 2^32. Read each FacebookID and set the corresponding bitMap index to 1 (or true). Once all the 1 billion numbers are read and the bitMap updated - search for a bitMap value which is 0 and that is the missing number. 
So lets analyze the memory requirement for this approach. To store a bitMap of 2^32 size means 2^29 bytes (since 1 byte is 2^3 bits). This 2^29 bytes translates to 512 MB RAM. How did I arrive at this number?  2^29 = 536,870,912
or 536,870,912/1024 KB = 524,288 KB
or 524,288/1024 MB = 512 MB
So what if we didn't have this much RAM capability?

2. Bucket Partition Method - Take square-root of 2^32, which translates to 2^16 (=65,536). Create an int[] of size 65,536. Lets call these buckets. So there are 65,536 buckets. Each bucket is responsible for 65,536 values. So the bucket0 manages values (0,1,2... 65535). And bucket1 manages (65536, 65537 .... 131,071). and so on. This way the last bucket which is bucket65535 manages the last set of numbers.
Read each FacebookID and find the corresponding bucket. So for instance if the id is 9 it translates to bucket0. Do not store the value in the bucket, just increment the bucket value. So essentially each bucket just holds a count of numbers that it encountered in its limit. Once all the facebookIDs are read and the bucket counts set, identify a bucket which is not equal to 65,536. Lets say the bucket number is 1. Which means, the missing number is between 65536 to 131071. Lets call this TargetBucket.
Now reset the bucket values to 0. Next, read the faceBookIDs once again, but this time check if the value is between 65536 to 131071. If it is set the corresponding bucket to 1. This time our bucket0 manages 65536, bucket1 manages 65537, bucket2 manages 65538 and so on. Once all the values are read and the buckets populated, find a bucket which has 0. That is the missing number.
So what was the memory requirement for this one? 
2^16 bits or 2^13 bytes or 8,192 bytes or 8 KB. Woow, amazingly less!!

I have created a program which basically simulates the bucket partition method but limits itself to 16 Million numbers (16,777,216 or 2^24). It generates a text file of 16 million numbers with a few numbers skipped. So my bucket size is 4096 (or 2^12).

Page Hits for WebSites and Blogs

Assume that you are building an Analytics Tool for a very popular site (like facebook). And the tool should capture the real time hits for the site and maintain a cumulative frequency. The site will be hit every second and you have to capture the details and show a summary like last 1 month 45,000 hits, last 1 week 20,000 hits, yesterday 1,200 hits and so on. You should also support range queries like 22,000 hits between 02/10/2012 to 21/02/2012.

Assume that you have stored the hits sequentially in an array. For eg.
3, 2, 0, 9, 2, 5, 1 
The above is the page hits for a 1 week period. The most natural way of calculating the total cumulative frequency is to sum up. It becomes-
3, 5, 5, 14, 16, 21, 22
So we know we received 22 hits in 1 week. Lets say we have data for last every hour like this and the data is stored for last 5 years. This is pretty huge data. Now if for some reason the frequency for day 2 is updated from 2 to 5. This impacts our frequency results table, so we have to recalculate the cumulative values for the 2nd day to 7th day. 

A better solution for this is Fenwick Tree (or Binary Indexed Tree). In this data structure each index position is expressed as a binary number. Lets consider 12. Its represented in binary as 0000 1100. The last set bit (from right to left) is at position 2 (with 0-indexed). Lets call this last set index of idx as lsi(idx). Each idx manages in the range [ idx to idx - 2^lsi(idx) + 1]. So in case of idx 12, the lsi(12) was 2. So it manages 9..12. The following chart provides the indexes and ranges.


For the below input:

The output tree is:
[1, 1, 2, 4, 1, 4, 0, 12, 2, 7, 2, 11, 3]

So if you observe, 12 is managing 9 to 12. So the values its managing are 2,5,2,2 (The tree is 1-indexed and not 0-indexed). The total sum is 11. Similarly, index 6 manages 5..6. So its values are 1,3 summing up 4.
There is an easy way to find this - which is bit manipulation ( i have used left shift and right shift ).
Updating an index value is simple. Identify the impacted indexes and add the value.

To calculate the range values, use Tree[j] - Tree[i]. Please see the code snippet below.

Thursday, October 25, 2012

Facebook - Calculate Birthday's this week

Facebook provides a notification service to inform users that one or more of their friend's birthday is lined up  this week. Design a data structure to efficiently fetch the birthday's in the given range.
There is another Facebook interview question similar to this - Facebook maintains a log of when somebody logged in and when somebody logged out. Read this data and store it in a data structure. Now find how many users were logged into Facebook at a given time (say 10:25 am today).

I have used a Segment Tree. It is very efficient in performing range based queries. For the birthday problem, i have used a range of 1 to 366 (total number of days in a year). The Segment tree is similar to a binary tree, but it stores range values instead of single value. Each node will contain a range [i, j]. In that case the left node would contain range [i, (i+j)/2]. The right node will contain [(i+j)/2+1, j]. This way it would go on. Once the Segment tree is created, reach each user and store it in the Segment tree. Search is very efficient as it only have to match the given range value.

The same data structure can be used to create bar charts. In  a few bar charts that we see, the bar chart range can be gradually decreased. For instance, it shows people between 1-100 years old, then click on it to show 1-50 range and 51-100 range. Then click again to show 1-25, 26-50, 51-75, 76-100...so on. It can go upto single level values. This is an ideal use case for Segment Tree.

Tuesday, October 23, 2012

Microsoft: Find biggest number that can be formed from a list of unsorted numbers

  Microsoft Interview - Given an array of unsorted numbers (1 or 2 digits),
  Find the biggest number that can be created using these numbers
  For e.g. {6, 83, 12, 7, 16, 35}
  The biggest number that can be made is 83,7,6,35,16,12
  The idea is to create 2 maxheaps - one for one digit numbers
  The other for 2 digit numbers. Then compare and create final string.
  Extending this to any arbitary length is simple. Instead of using 2 maxheaps,
  use n maxheaps, where n is the length of the number with the maximum length.
  Compare and create can use N-way merge then, instead of simple if-else.

Sum of elements in unsorted array

 Given an unsorted array.
 With each number, we can associated a sum which is equal to the sum of all the numbers, less than the   current number.
  We've to find the total sum of all those numbers.
  e.g. unsorted array :1, 5, 3, 6, 4.
  for a[0]=1, sum[0]=0
  for a[1]=5, sum[1]=1
  for a[2]=3, sum[2]=1
  for a[3]=6, sum[3]=1+5+3
  for a[4]=4, sum[4]=1+3
  total sum =sum[0]+sum[1]+sum[2]+sum[3]+sum[4] = 15

Read each element in the array and put it into a Binary Search Tree. Each node of the BST will hold the value of the element,  and also the sum of the left subtree. As the elements are inserted, calculate the sum of the left subtree. At the end of inserting all the elements, the total sum is calculated

Least Common Ancestor in Binary Tree

Given a Binary Tree (not Binary Search Tree), and two nodes A and B. Find the closest common ancestor for both the nodes.


Binary Tree - Microsoft Interview Questions

This one contains 3 different problems on a complete Binary Tree.
1. Create and Print a Binary Tree in the same input order.
2. Print the tree in a zig-zag fashion. Even levels right to left and Odd levels left to right
3. Take a snapshot of the tree and put it on a co-ordinate system with left-most node as (0,0)


Sort array of 0,1,2

Given an array of 0s, 1s and 2s only in a mixed fashion. The task is to seperate out them such that 0s are on the left, 1s in the middle, and 2s on the right. The algorithm should run in linear time.

This problem is called "Dutch National Flag". There are lot of puzzles asked in the interview based on this.


Friday, October 19, 2012

Google Search - Prototype

Implement a basic web search. It should load a document list into memory. Each document contains text data of variable length. When the user searches for a particular text, it should find all matching patterns for the complete word list (in the same sequence) he entered.
For. e.g
If my documents are -

"where is the house",
"my house is where the mountain is",
"it is my house",
"my dog is a pet",
"where is the horse"
And I search for - "where is the house". The exact pattern should be matched. It should only return "where is the house". It should not return the 2nd string "my house is where the mountain is", even though it contains all the 4 words (where, is, the, house) in a jumbled fashion.
The user should be able to search for substring as well e.g. "my house". It will return "my house is where the mountain is" and also "it is my house".

If we did not have to support substring search, it could have been a scenario for storing key-value pair. 

With this requirement, we can only implement it with a Tree or a TRIE. I have choosen a Binary Search Tree since I can search for words in Log(N). 
Read documents one by one. For every document read the words. Check if the word is in the BST. If not add a node to the BST with value as the word. The index for it would be a linked list with one key-value pair (key is document-id and value is word-id in the document). If there is a node already with the word, then append the key-value index to the end of the linked-list. This way, the document1's index would be stored before the document-2's index. 

Search - "x y z".
First search for word "x". Get the indexes for the word. Lets say we got (1,4), (3,9). Then search for word "y". If it returned (1,5), (2,1), (3,11). Then we are only interested in document 1. Since neither 3 nor 2 fit in our criteria. It is because in document 1, word x was at position 4 and word y is at position 5. We continue the same with word "z". But this time our focus is only on index (1,5).

As always, this is just a prototype i came up with and it can have a lot of additional features like document ranking, suggestion, completion, etc. In an actual web search engine, there would be a complex algorithm and processing involved.


Tour a city - Puzzle

 There is a one-way circular route from City A to City X. In between these 2 cities,
  there are a number of cities separated by any distance. Each city has a petrol pump
  and holds a different storage capacity. The total petrol available in all the cities
  put together is say 'Z' litres. The total distance between City A to City X will be
  'Z' kilo meters. Assume the bike has a mile-age of 1 litre per KM.
  Your goal is to around the circle, starting with no petrol. Find which city you will
  start first, to ensure you wont run out of petrol mid-way.  

Book Store: Auto Suggest and Auto Completion

Implement a Book Store to maintain a catalog of Books. Currently i am only storing the title of the book. Users should be able to add a book, and search . The search should be exact or partial. Implement auto-suggest and auto-completion as well. The search operation is user centric, so it should be extremely fast. 
Usually 90% of spelling correction is done for 2 edits. An edit is one of the following - missing letter, misplaced letter, extra letter. This algorithm can be tuned for further spell check.


Thursday, October 18, 2012

Implement Mobile T9 using TRIE

T9's objective is to make it easier to type text messages. It allows words to be entered by a single keypress for each letter. T9's objective is to make it easier to type text messages. It allows words to be entered by a single keypress for each letter.  For instance, in English, 4663 matches "good", "home", "gone", "hood", etc. 

This program takes in a dictionary file as input, reads it and creates a Ternary Search Trie (TST). Once the Trie is created, any word can be searched. The matching words for a given sequence are stored in ascending order (instead of word ranking). This code can be modified to accomplish word ranking and machine learning. 

The above code can be implemented with HashMap. However hashing does not provide efficient searching. 
For the purpose of benchmarking, I have enabled the runtime stats in this class. I have tested with 3 types of dictionaries. The biggest dictionary I could get hold off was containing 3.7 million words. The program currently accepts only alphabets. 


Unimodal array has an increasing sequence followed by decreasing sequence, find the number at which the transition happens

  Unimodal array contains a list of 'N' integers. There exists M < N, such that
  a[0] to a[M] are in increasing series
  a[M+1] to a[N] are in decreasing series
  The problem is to find a[M] in O(Log N)
  E.g. {2, 4, 8, 10, 13, 18, 12, 7, 5, 4}. Here M is 18 as the switch happens after it


Pairs of numbers whose sum is "S"

In a given sorted array of integers, print the pairs of numbers which give a sum "S" in O(N) time.

The problem like this can be solved with 2 pointers.


Suffix Tree - Common pattern in 'N' strings

Given 'n' number of strings, find the common pattern among all of them in O(N) time.
For e.g. "sparrows", "tomorrow", "borrower". In the above 3 strings, the common pattern is "rrow".
Problems like this can be solved in linear time. The main point to note it is, this can be achieved at the cost of additional space. Searching will be linear time.


Wednesday, October 17, 2012

Kth Largest using MaxHeap

Find the Kth largest element in an unsorted array of integers, using MaxHeap


Median of Medians to find Kth Smallest

Given an array of unsorted integers find the Kth smallest element using "Median of Medians".
This method is guaranteed to work in max linear time


Kth smallest element in an unsorted array

Given an array of unsorted numbers, find the Kth smallest element in the array.
Input:  10, 8, 5, 7, 14, 3
K = 4
Output: 8


Count occurrence of an element in a Sorted Array

Given an array of sorted elements, count the number of times a given element is repeated. The solution cannot be linear.

Solution:- Have given 2 solutions, both are modifications to binary search