Theory of Dynamic Programming, Matrix Chain Multiplication and Edit Distance

Dynamic Programming:

Dynamic Programming (or DP) is a programming method that can dramatically decreases the runtime of some algorithms (however not all problem has DP features) from exponential to polynomial. Most of the (and still rising) real world problems are only solvable in reasonable time by using DP.

To be capable to use DP, the original problem must encompass:

A) Optimal sub-structure property: The optimal solution to problem contains in it optimal solutions to sub-problems

B) Overlapping sub-problems property: We accidentally re-compute the same problem two times or more.

There are two kinds of DP: We can either build up the solutions of sub-problems from small to big (bottom up) or we can save outcomes of solutions of sub-problems in a table (top down + memorization).

Let's begin with a sample of Dynamic Programming (DP) method. We will observe the simplest form of overlapping sub-problems. Remember Fibonacci? A popular problem that makes a lot of redundancy when you use standard recursion fn = fn-1 + fn-2.

Top-down Fibonacci DP answer will record the each Fibonacci computation in a table therefore it won't have to re-calculate the value again whenever you require it, a simple table-lookup is sufficient (memorization), while Bottom-up DP solution will make the solution from the smaller numbers.

Now let's see the comparison among Non-DP solution vs. DP solution (both bottom-up and top-down), given in the C source code shown below, all along with the suitable comments:

#include <stdio.h>
#define MAX 20 // to test with bigger number, adjust this value
int memo[MAX]; // array to store the prior calculations
// the slowest, needless computation is repeated
int Non_DP(int n) {
  if (n==1 || n==2)
    return 1;
else
    return Non_DP(n-1) + Non_DP(n-2);
}
// top down DP
int DP_Top_Down(int n) {
  // base case
  if (n == 1 || n == 2)
    return 1;
  // instantly return the formerly computed outcome
  if (memo[n] != 0)
    return memo[n];
  // or else, do the same as Non_DP
  memo[n] = DP_Top_Down(n-1) + DP_Top_Down(n-2);
  return memo[n];
}
// fastest DP, bottom up, store the prior oitcomes in array
int DP_Bottom_Up(int n) {
  memo[1] = memo[2] = 1; // default values for DP algorithm
  // from 3 to n (we already know that fib(1) and fib(2) = 1
  for (int i=3; i<=n; i++)
    memo[i] = memo[i-1] + memo[i-2];
  return memo[n];
}
void main() {
  int z;
  // this will be the slowest
  for (z=1; z<MAX; z++) printf("%d-",Non_DP(z));
  printf("\n\n");
  // this will be faster than the first
  for (z=0; z<MAX; z++) memo[z] = 0;
  for (z=1; z<MAX; z++) printf("%d-",DP_Top_Down(z));
  printf("\n\n");
  /* this generally will be the fastest */
  for (z=0; z<MAX; z++) memo[z] = 0;
  for (z=1; z<MAX; z++) printf("%d-",DP_Bottom_Up(z));
  printf("\n\n");
}

Matrix Chain Multiplication (MCM):

Let's begin by analyzing the cost of multiplying 2 matrices:

Matrix-Multiply(A,B):
  if columns[A] != columns[B] then
    error "incompatible dimensions"
  else
    for i = 1 to rows[A] do
      for j = 1 to columns[B] do
        C[i,j]=0
        for k = 1 to columns[A] do
          C[i,j] = C[i,j] + A[i,k] * B[k,j]
  return C


Time complexity = O(pqr),  here |A|=p x q and |B| = q x r
|A| = 2 * 3, |B| = 3 * 1, so to multiply such 2 matrices, we require O(2*3*1)=O(6) scalar multiplication. The outcome is matrix C with |C| = 2 * 1

Matrix Chain Multiplication Problem:

Input: Matrices A1, A2,...An, each Ai of size Pi-1 x Pi
Output: Completely parenthesized product A1A2...An which minimizes the number of scalar multiplications

The product of matrices is completely parenthesized if it is either:
A) a single matrix
B) the product of two completely parenthesized matrix products enclosed by parentheses

Illustration of MCM problem:

We have three matrices and the size of each matrix:
A1 (10 x 100), A2 (100 x 5), A3 (5 x 50)

We can completely parenthesized them in two manners:

1. (A1 (A2 A3)) = 100 x 5 x 50 + 10 * 100 * 50 = 75000
2. ((A1 A2) A3) = 10 x 100 x 5 + 10 x 5 x 50 = 7500 (that is, 10 times better)

We can see how the cost of multiplying such 3 matrices differs appreciably. The cost truly based on the choice of the fully parenthesization of the matrices. Though, exhaustively checking all possible parenthesizations takes exponential time.

Step1: characterize the optimal sub-structure of this problem:

Let Ai..j (i < j) represent the outcome of multiplying AiAi+1..Aj.  Ai..j can be obtained by divide it into Ai..k and Ak+1..j and then multiplying the sub-products. There are j-i possible divides (that is, k=i,...,j-1)

In the optimal parenthesization of Ai..j:

(i) the parenthesization of Ai..k ought to be optimal
(ii) the parenthesization of Ak+1..j should be optimal

Since if they are not optimal, then there exist another split which is better, and we must select that split and not this split.

Step 2: Recursive formulation

Need to find A1..n
Let m[i,j] = minimum number of scalar multiplications required to calculate Ai..j
As Ai..j can be received by breaking it into Ai..k Ak+1..j, we encompass

m[i,j] = 0, if i=j
        = min i<=k<j { m[i,k]+m[k+1,j]+pi-1pkpj }, if i<j


Assume s[i,j] be the value k, where the optimal split takes place.

Step 3: Computing the Optimal Costs:

Matric-Chain-Order(p)
  n = length[p]-1
for i = 1 to n do
    m[i,i] = 0
  for l = 2 to n do
    for i = 1 to n-l+1 do
      j = i+l-1
      m[i,j] = infinity
      for k = i to j-1 do
        q = m[i,k] + m[k+1,j] + pi-1*pk*pj
        if q < m[i,j] then
          m[i,j] = q
          s[i,j] = k
  return m and s


Step 4: Constructing an Optimal Solution:

Print-MCM(s,i,j)
  if i=j then
    print Ai
  else
    print "(" + Print-MCM(s,1,s[i,j]) + "*" + Print-MCM(s,s[i,j]+1,j) +
")"

Longest Common Subsequence (LCS):

Input: Two sequences
Output: A longest common sub-sequence of such two sequences, see the details shown below.

A sequence Z is a subsequence of X <x1,x2,...,xm>, when there exists a strictly rising sequence <i1,i2,...,ik> of indices of X such that for all j = 1,2,..,k, we have xi=zj. Illustration: X=<B,C,A,D> and Z=<C,A>.

A sequence Z is termed as common subsequence of sequence X and Y when Z is a subsequence of both X and Y.

Longest common subsequence (or LCS) is just the longest "common subsequence" of the two sequences.

A brute force approach of finding LCS like enumerating all the subsequences and finding longest common one takes too much time. Though, Computer Scientist has found a Dynamic Programming solution for LCS trouble, we will just write the final code here, written in C, ready to employ. Note that, this code is slightly changed and we utilize global variables.

#include <stdio.h>
#include <string.h>
#define MAX 100
char X[MAX],Y[MAX];
int i,j,m,n,c[MAX][MAX],b[MAX][MAX];
int LCSlength() {
 
  m=strlen(X);
  n=strlen(Y);
 
  for (i=1;i<=m;i++) c[i][0]=0;
  for (j=0;j<=n;j++) c[0][j]=0;
 
  for (i=1;i<=m;i++)
    for (j=1;j<=n;j++) {
      if (X[i-1]==Y[j-1]) {
        c[i][j]=c[i-1][j-1]+1;
        b[i][j]=1; /* from north west */
      }
 
      else if (c[i-1][j]>=c[i][j-1]) {
      c[i][j]=c[i-1][j];
       b[i][j]=2; /* from north */
      }
      else {
        c[i][j]=c[i][j-1];
        b[i][j]=3; /* from west */
      } 
    }
  return c[m][n];
}
void printLCS(int i,int j) {
  if (i==0 || j==0) return;
 
  if (b[i][j]==1) {
    printLCS(i-1,j-1);
   printf("%c",X[i-1]);}
  else if (b[i][j]==2)
    printLCS(i-1,j);
  else
    printLCS(i,j-1);
}
void main() {
  while (1) {
    gets(X);
    if (feof(stdin)) break; /* press ctrl+z to terminate */
    gets(Y);
    printf("LCS length -> %d\n",LCSlength()); /* count length */
    printLCS(m,n); /* reconstruct LCS */
    printf("\n");
  }
}

Edit Distance:

Input: Given two strings, Cost for insertion, deletion and replace.
Output: Give the minimum actions required to transform first string into the second one.

Edit Distance problem is a bit alike to LCS. DP Solution for this problem is very helpful in Computational Biology like for comparing DNA.

Assume that d(string1,string2) be the distance between such 2 strings.

The two-dimensional matrix, m[0..|s1|,0..|s2|] is employed to hold the edit distance values, in such a way that m[i,j] = d(s1[1..i], s2[1..j]).

m[0][0] = 0;
for (i=1; i<length(s1); i++) m[i][0] = i;
for (j=1; j<length(s2); j++) m[0][j] = j;
for (i=0; i<length(s1); i++)
  for (j=0; j<length(s2); j++) {
    val = (s1[i] == s2[j]) ? 0 : 1;
    m[i][j] = min( m[i-1][j-1] + val,
                   min(m[i-1][j]+1 , m[i][j-1]+1));
  }

To output trace, use the other array to store our action all along the way. Trace back such values later.

Longest Increasing or Decreasing Subsequence (LIS/LDS):

Input: Given a sequence
Output: The longest subsequence of the given sequence is that all values in this longest subsequence are strictly increasing or decreasing.

O(N^2) DP solution for LIS problem (this code check for rising values):

for i = 1 to total-1
  for j = i+1 to total
    if height[j] > height[i] then
      if length[i] + 1 > length[j] then
        length[j] = length[i] + 1
        predecessor[j] = i

Is O(n^2) is the best algorithm to solve LIS/LDS ?

Luckily, the answer is ‘No’.

There exist an O(n log k) algorithm to calculate LIS (for LDS, this is merely a reversed-LIS), where k is the size of the real LIS.

This algorithm employs some invariant, where for each longest sub-sequence with length l, it will finish with value A[l]. (Notice that, by maintaining this invariant, array A will naturally sorted.) Subsequent insertion (that is, you will just do n insertions, one number at one time) will utilize binary search to find the suitable position in this sorted array A

0  1  2  3  4  5  6  7  8
a    -7,10, 9, 2, 3, 8, 8, 1
A -i  i, i, i, i, i, i, i, i (iteration number, i = infinity)
A -i -7, i, i, i, i, i, i, i (1)
A -i -7,10, i, i, i, i, i, i (2)
A -i -7, 9, i, i, i, i, i, i (3)
A -i -7, 2, i, i, i, i, i, i (4)
A -i -7, 2, 3, i, i, i, i, i (5)
A -i -7, 2, 3, 8, i, i, i, i (6)
A -i -7, 2, 3, 8, i, i, i, i (7)
A -i -7, 1, 3, 8, i, i, i, i (8)

We can see that the length of LIS is 4, which is accurate. To rebuild the LIS, at each step, store the predecessor array as in the standard LIS + this time keep in mind the real values, as array A only store the last element in subsequence, not the real values.

Zero-One Knapsack:

Input: N items, each with different Vi (Value) and Wi (Weight) and with max Knapsack size MW.
Output: Maximum value of items which one can carry, if he can either take or not-take a specific item.

Let C[i][w] be the maximum value when the available items are {X1,X2,...,Xi} and the knapsack size is w.

a) if i == 0 or w == 0 (when no item or knapsack full), we cannot take anything C[i][w] = 0
b) if Wi > w (this item much heavy for our knapsack),skip this item C[i][w] = C[i-1][w];
c) if Wi <= w, take the maximum of "not-take" or "take" C[i][w] = max(C[i-1][w] , C[i-1][w-Wi]+Vi);
d) The solution can be found in C[N][W];

for (i=0;i<=N ;i++) C[i][0] = 0;
for (w=0;w<=MW;w++) C[0][w] = 0;
 
for (i=1;i<=N;i++)
  for (w=1;w<=MW;w++) {
    if (Wi[i] > w)
      C[i][w] = C[i-1][w];
    else
      C[i][w] = max(C[i-1][w] , C[i-1][w-Wi[i]]+Vi[i]);
  }

output(C[N][MW]);

Counting Change:

Input: A list of denominations and a value N to be modified with such denominations.
Output: Number of ways to change N

Assume that you have coins of 1 cent, 5 cents and 10 cents. You are requested to pay 16 cents, so you have to give 1 one cent, 1 five cents, and 1 ten cents. The counting Change algorithm can be employed to find out how many ways you can employ to pay an amount of money.

Number of ways to change the amount A by using N kinds of coins equivalents to:

A) Number of ways to change the amount A by using all however the first kind of coins, +
B) Number of ways to change the amount A-D by using all N kinds of coins, where D is the denomination of the initial kind of coin.

The tree recursive method will gradually decrease the value of A, and then by employing this rule, we can find out how many ways to change coins.

i) If A is exactly 0, we must count that as 1 way to make the change.
ii) If A is less than 0, we must count that as 0 ways to make the change.
iii) If N kinds of coins are 0, we must count that as 0 ways to make the change.

#include <stdio.h>
#define MAXTOTAL 10000
long long nway[MAXTOTAL+1];
int coin[5] = { 50,25,10,5,1 };
void main() 
{
  int i,j,n,v,c;
  scanf("%d",&n);
  v = 5;
  nway[0] = 1;
  for (i=0; i<v; i++) {
    c = coin[i];
    for (j=c; j<=n; j++)
      nway[j] += nway[j-c];
  }
  printf("%lld\n",nway[n]);
}

Latest technology based Programming Languages Online Tutoring Assistance

Tutors, at the www.tutorsglobe.com, take pledge to provide full satisfaction and assurance in Programming Languages help via online tutoring. Students are getting 100% satisfaction by online tutors across the globe. Here you can get homework help for Programming Languages, project ideas and tutorials. We provide email based Programming Languages help. You can join us to ask queries 24x7 with live, experienced and qualified online tutors specialized in Programming Languages. Through Online Tutoring, you would be able to complete your homework or assignments at your home. Tutors at the TutorsGlobe are committed to provide the best quality online tutoring assistance for Programming Languages Homework help and assignment help services. They use their experience, as they have solved thousands of the Programming Languages assignments, which may help you to solve your complex issues of Programming Languages. TutorsGlobe assure for the best quality compliance to your homework. Compromise with quality is not in our dictionary. If we feel that we are not able to provide the homework help as per the deadline or given instruction by the student, we refund the money of the student without any delay.