Most Often Substring / Most Common Substring


Came across an interesting question on hackerrank which goes as follows -
We have a string of length N. Can you figure out the number of occurrences of the most frequent substring in this string? We are only interested in substring of length from K to L and in each substring the number of distinct characters must not exceed M. The string contains only lower-case letters(a-z). 
Constraints:
2<=N<=100000
2<=K<=L<=26,L<N
2<=M<=26

Sample Input :-
5
2 4 26 // value of K L M
abcde 
Sample Output :-
1
The solution to the above problem is
-Any substring between length K and L will be inserted into the trie.  
-Each trieNode also has a frequency counter which will be incremented when an already existing substring is inserted.
-While inserting, we need to check for the number of distinct characters in the given substring and not insert any substring which has distinct characters greater than m. In case, the number of distinct characters is greater than m, we return -1 as frequency from insertTrie function so as to know that the given substring was invalid and not inserted into trie.
- While inserting, we can keep track of the current maximum frequency and return that at the end as the answer.

Implementation in java is as below :-
New Document
import java.util.*;

public class MostOftenSubstring {
    TrieNode root;
    Set distinctCharacters = new HashSet();
 
    public int getHighestFrequencyOfSubstring(int n, int k, int l, int m, String s) {
        if (s == null || n == 0) {
            return 0;
        }
        if (k < 0 || l < 0 || m <= 0) {
            return 0;
        }
        int maxFrequency = 1;
        root = new TrieNode();
        for (int i = 0; i < n; ++i) {
            for (int j = k; j <= l && j <= n; ++j) {
                String substring = s.substring(i,j);
                // return -1 if distincts more than m
                int currentFrequency = insertIntoTrie(substring, m);
                if (currentFrequency == -1) 
                    break;
                maxFrequency = (currentFrequency > maxFrequency) ? currentFrequency : maxFrequency;                 
            }
            k += 1;
            l += 1;
        }
        return maxFrequency;
    }
     
     public int insertIntoTrie(String word, int m) {
         distinctCharacters.clear();
         TrieNode parent = root;
         for(char c : word.toCharArray()) {
             distinctCharacters.add(c);
             if (distinctCharacters.size() > m) {
                 return -1;
             }
             int i = c - 'a';
             if(parent.next[i] == null) 
                 parent.next[i] = new TrieNode();
             parent = parent.next[i];
         }
         if (parent.word != null) {
            parent.frequency += 1;
         } else {
             parent.word = word;
             parent.frequency = 1;
         }
         return parent.frequency;
     }

     class TrieNode {
         TrieNode[] next = new TrieNode[26];
         String word;
         int frequency = 0;
     }
}

Time Complexity is O(N.(L-K)) without considering trie insertion and O(N. (L-K)2) if considering trie insertion.

1 comments :

PixBlog 6 - The land of 14ers [Colorado]

  In the midst of the DeCaLiBron - Mt. Democrat to the right, Mt. Bross to the far left[Alma, Colorado]

  Moment of contemplation - left (Mt.Democrat) or right(Mt.Cameron) [Cameron-Democrat saddle]

Heavy legs and tired lungs - Summit of Mt. Democrat(14154 feet !!)

Football field at 14k feet !! - Summit of Mt. Cameron

A water-filled kite on the ground - Looking down at Kite Lake Trailhead [Alma, Colorado]

Summit Lake below Mt.Evans

Looking down Mt.Evans Road and Scenic Byway [Summit of Mt. Evans]

The scramble awaits - Looking up Mt. Bierstadt summit [Clear Creek, Colorado]

Sun-lit sawtooth ridge - Looking down from summit of Mt. Bierstadt

Of fleeting lakes and perennial ridges - Summit of Mt. Bierstadt

Scary Sawtooth ridge from the side 

For some lines are drawn pretty fine - The Continental Divide

Road to Aspen aint that way !! [Independence Pass, Colorado]

El famous Maroon Bells - [Maroon Bells, Colorado]

Beautiful Breck - [Main St, Breckenridge, Colorado]


0 comments :