Thursday, October 8, 2015

LeetCode - Wildcard Matching

Link
https://leetcode.com/problems/wildcard-matching/

Problem
Implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "*") → true
isMatch("aa", "a*") → true
isMatch("ab", "?*") → true
isMatch("aab", "c*a*b") → false

Solution

public class Solution {
// Stores failed matches for given pattern and string. Used for memoization.
private Map<Integer, Set<String>> map;
public Solution() {
map = new HashMap<Integer, Set<String>>();
}
public boolean isMatch(String s, String p) {
return isMatch2(s, cleanPattern(p));
}
public boolean isMatch2(String s, String p) {
int sPtr = 0;
int pPtr = 0;
char sChar = ' ';
char pChar = ' ';
while (sPtr < s.length() && pPtr < p.length()) {
sChar = s.charAt(sPtr);
pChar = p.charAt(pPtr);
if (pChar == sChar || pChar == '?') {
// Continue
sPtr += 1;
pPtr += 1;
}
else if (pChar == '*') {
// Check if a previous attempt to match string and pattern has already failed.
Set<String> failed = map.get(pPtr);
// If it doesn't match, we don't need to go down the same rabbit hole and waste computation cycles.
if (failed == null || !failed.contains(s.substring(sPtr))) {
boolean isMatch = isMatch(s.substring(sPtr), p.substring(pPtr + 1));
if (isMatch) {
return true;
}
if (failed == null) {
failed = new HashSet<String>();
}
}
// Record failed match so we can look it up later.
failed.add(s.substring(sPtr));
map.put(pPtr, failed);
sPtr += 1;
}
else {
return false;
}
}
// Consume all wildcards
while (pPtr < p.length() && p.charAt(pPtr) == '*') {
pPtr++;
}
if (sPtr == s.length() && pPtr == p.length()) {
// All characters consumed
return true;
}
if (sPtr < s.length() && pPtr == p.length()) {
// Remaining characters in s
return false;
}
if (sPtr == s.length() && pPtr < p.length()) {
// Remaining characters in p
return false;
}
return false;
}
// Remove contiguous wildcard characters from string.
// For example, passing in "aa**bb" will return "aa*bb"
private String cleanPattern(String s) {
StringBuilder copy = new StringBuilder();
char lastChar = ' ';
for (int i = 0; i < s.length(); i++) {
if (s.charAt(i) == '*' && lastChar == '*') {
continue;
}
lastChar = s.charAt(i);
copy.append(lastChar);
}
return copy.toString();
}
}
view raw Solution.java hosted with ❤ by GitHub
Approaches to the Problem

#1

The general approach is to loop through each character in s and p and compare them.  Break out of the loop once the end is reached for either s or p.

- If the characters match, then move on to the next set of characters.

- If the pattern character is '?', then move on to the next set of characters.

- If the pattern character is a wildcard '*', then call the function recursively with the wildcard character removed.  If the recursive call returns true, then the candidate string and pattern matches.  If the recursive call returns false, then move on to the next character in the candidate string.

- If the characters don't match, then return false.

Finally, return true if the end has been reached for both s and p.  Otherwise, return false if the end has not been reached for either s or p.

#2

The first approach works, but has an incredibly inefficient runtime complexity of O(2^n).  This is caused by the fact that whenever a wildcard is encountered, it performs a recursive call and possibly recomputes the same exact s and p multiple times.  Due to the O(2^n) time complexity, LeetCode threw a 'Time limit exceeded" error.

This problem is analogous to the fibonacci problem where the same computation is performed many times.

To solve the issue of duplicate computations, we can use a technique called memoization.  By creating a lookup table to store failed matches, we can avoid going down the same rabbit hole by checking if s and p has already failed to match.

#3

The second approach significantly improves runtime complexity, but LeetCode still threw a 'Time limit exceeded" error on the inputs:

s = "aaaaabbbbbababbabbbbaaabbaabaaabaabbbbabaaaabaaabbbbbbabbaaaaabbbaaababbaaabaabbaabbbbaaabaabbbaabbbbabbabaabbabbbbaabbaabbabaababababbaabbabbbaaabaaababaaaabbaaaaabbbaaaabbbababbabaaaabbbaabaabbaaaabaaab"

p = "*aab***aaab**a*b**b*aa***baabababb*abbbbb**a**ba******b**b*a*ab*b***aa*****a*aba*a***aa*b****a*a*****"

Upon closer inspection, we can infer that the contiguous wildcards was causing the algorithm to branch out wildly.  To solve this issue, we can remove contiguous wildcards from p so it looks like

p = "*aab*aaab*a*b*b*aa*baabababb*abbbbb*a*ba*b*b*a*ab*b*aa*a*aba*a*aa*b*a*a*"

Conclusion

By combining the three approaches above, LeetCode accepted my submission.


This was one of the more challenging problems I have encountered, and justifiably so because the solution acceptance rate is 15.8% (13th lowest out of 273 problems on the site) and its difficulty is rated 'Hard'.



The real world application of the wildcard matching problem includes anything that uses text search, some of which are: Google search, Microsoft Word, text editors, etc.

Saturday, May 9, 2015

Computational Time Complexity of Algorithms

Computer science problems often have multiple solutions/algorithms for a given problem.  For example, what algorithm would you use to find the largest number in an array of integers?  You could sort the array and get the largest number in O(n log(n)) time, or simply iterate through the array and maintain a variable that keeps track of the largest number in O(n) time.

We usually want to use the optimal (most efficient) algorithm because it makes our software run faster.  In order to find the optimal algorithm, it helps to understand the major time complexity 'classes' and the common problems and algorithms that fit within each class.

O(1)
  • Single operation, e.g. setting a value or performing a math operation
  • Lookup/delete nth element from an array
  • Lookup/insert/delete element from a hash table (average case, where there are few collisions)
O(log n)
  • Continuously splitting a data set or number of operations in half
  • Binary search
  • Search/insert/delete element from a binary search tree (average case, where tree is balanced)
  • Search/insert/delete element from a min/max heap
O(n)
  • Iterating through an entire data set
  • Search/insert/delete from a linked list
O(n log n)
  • Merge sort, quick sort, heap sort
O(n^2)
  • Looking at all pairs of items
  • Selection sort, insertion sort, bubble sort
  • 0/1 knapsack problem using dynamic programming
O(2^n)
  • Continuously doubling a data set or number of operations
  • Naive fibonacci
  • Travelling salesman problem using dynamic programming
O(n!)
  • Generating all permutations of a set
  • Naive travelling salesman problem

Sunday, February 22, 2015

Objection Orientation: Polymorphism, Abstraction, Encapsulation, and Inheritance

Developers who are new to object-oriented programming often struggle to understand the long-winded explanations that are scattered across the interwebs.  The intent of this post is to ease the pain (hopefully) by summarizing the main concepts of object orientation in a concise and precise format by providing a definition, two examples, and a brief explanation of its benefits.

Polymorphism

Definition:  Polymorphism is the concept where a parent type can have many sub-types.

Example #1:  ArrayList and LinkedList are sub-types of the parent type List.

List<Object, Object> myList = new ArrayList<Object, Object>();
List<Object, Object> myList = new LinkedList<Object, Object>();
view raw List hosted with ❤ by GitHub
Example #2:  Car, Motorcyle, and Bus are sub-types of the parent type Vehicle.

Vehicle myVehicle = new Car();
Vehicle myVehicle = new Motorcycle();
Vehicle myVehicle = new Bus();
view raw Vehicle hosted with ❤ by GitHub
Benefits:  Polymorphism allows you to use a single parent type to interact with all possible sub-types.  It also helps to decouple your code.  When you are working with a List object, you can freely swap out the underlying implementation from an ArrayList to a LinkedList and vice-versa.

Abstraction

Definition:  Abstraction is the generalization of common functionality across different objects.

Example #1:  List is an abstraction of ArrayList and LinkedList because it generalizes the common functionality of adding and removing elements.

public interface List<Object> {
boolean add(Object element);
boolean remove(Object element);
}
view raw List hosted with ❤ by GitHub
Example #2:  Vehicle is an abstraction of Car, Motorcyle, and Bus because it generalizes the common functionality of accelerating and braking.

public interface Vehicle {
void accelerate(int distance, int time);
void brake(int distance, int time);
}
view raw Vehicle hosted with ❤ by GitHub
Benefits:  Abstraction allows you to interact with different objects via a common interface.

Encapsulation

Definition:  Encapsulation is the packing (hiding) of data and implementation details into a class or function.

Example #1:  Data and implementation details about a car are encapsulated (packed, hidden) into a Car class.  Access to its data can be controlled via public, protected, and private modifiers.

public class Car {
// Hidden data
private int velocity;
private int weight;
public void accelerate(int distance, int time)
{
// Hidden implementation
}
public void brake(int distance, int time)
{
// Hidden implementation
}
}
view raw Class hosted with ❤ by GitHub
Example #2:  The functionality to accelerate and brake a car is encapsulated (packed, hidden) into functions.  Access to these functions can be controlled via public, protected, and private modifiers.

public void accelerate(int distance, int time)
{
this.velocity = distance / time;
}
public void brake(int distance, int time)
{
this.velocity = distance / time;
}
view raw Functions hosted with ❤ by GitHub
Benefits:  Encapsulation allows you to hide implementation details and protect against mis-use by other classes.  It also helps to decouple your code.  For example, I could modify the formula in the accelerate function without breaking any code dependencies on that function.

Inheritance

Definition:  Inheritance is the re-use of a base class's functionality by a sub-class.

Example #1:  Car is a sub-class that inherits the functionality of its base class Vehicle, including the data (currentHeading, velocity, weight) and functions (turn, accelerate, brake).

public abstract class Vehicle {
int currentHeading;
int velocity;
int weight;
void turn(int degrees) {
this.currentHeading += degrees;
}
abstract void accelerate(int distance, int time);
abstract void brake(int distance, int time);
}
public class Car extends Vehicle {
@Override
public void accelerate(int distance, int time) {
}
@Override
public void brake(int distance, int time) {
}
}
view raw Abstract Class hosted with ❤ by GitHub
Example #2:  A Car can turn because it inherited the functionality from its base class Vehicle.

Vehicle myCar = new Car();
myCar.turn(90);
view raw Car turn hosted with ❤ by GitHub
Benefits:  Inheritance allows you to re-use functionality and reduce duplicate code.

Polymorphism, abstraction, encapsulation, and inheritance are distinctly different concepts, but together they support the paradigm of object orientation.  I hope this post cleared up any misunderstandings of these four pillars object orientation!