Skip to main content

Minimum Edit Distance - Dyanamic Programming

Problem Statement : - 

  • Given two strings str1 and str2 and below operations that can performed on str1. Find minimum number of edits (operations) required to convert ‘str1’ into ‘str2’.
  • Operations and their Cost which we can perform to achieve this goal
    • Insert - 1
    • Delete - 1
    • Replace - 2 (Note : Delete The Element + Insert New Element)
Example 1 : -
  Input       : s1= Hogward , s2= Hogwartz
  Output      : 4
  Explanation : 1. Insert 'g' at index 3 ; cost = 1
                2. Replace 'd' with 't' at index 7; cost = 2
                3. Insert 'z' at index 8 = 1; cost=1
                Total cost = 1+2+1 = 4
Idea : -
  1. We create array MED of  Size M * N where M= length of s1 N= length of s2
  2. To Solve This Problem We use Dynamic Approach.
  3. We solve All sub problem which is |sub1|=m , |sub2|=n  where m <=|s1| and n<=|s2|
  4. If last characters of two strings are same, nothing need to do. So we use Previous Solution and Recur for  (MED(i-1,j-1)).
  5. Else (If last characters are not same), we consider all operations on ‘str1’, consider all three operations on last character of first string, recursively compute minimum cost for all three operations and take minimum of three values.
    1. Insert: Recur for MED(m, n-1)
    2. Remove: Recur for MED(m-1, n)
    3. Replace: Recur for MED(m-1, n-1)
Switch to Desktop View For Better Experience

Base Conditions : -
  • for |s1|=0 and |s2|=0 
    • MED[0][0]=0
Sub Problems : -
  • for |s1|=0 and |s2| > 0 we required to insert each element of s2 in s1.
    • MED[0][1]=1
    • MED[0][2]=2
  • for |s2|=0 and |s1| > 0 we required to delete each element of s1.
    • MED[1][0]=1
    • MED[2][0]=2
  • otherwise
    • MED[i][j]=min{MED[i-1][j]+1,MED[i][j-1]+1,MED[i-1][j-1]+2}   if(s1[i]!=s2[i])
    • MED[i][j]=MED[i-1][j-1]   if(S1[i]=s2[j])
Example 2  :-
  S1 = ALIBABA
  S2 = ALLADIN
MED Array : -
S2 ^ A L I B A B A
        
S1 i\j 0    1 2 3 4 5 6 7
^ 0        0     1 2 3 4 5 6 7
A 1 1 0 1 2 3 4 5 6
L 2 2 1 0 1 2 3 4 5
L 3 3 2 1 2 3 4 5 6
A 4 4 3 2 3 4 3 4 5
D 5 5 4 3 4 5 4 5 6
I 6 6 5 4 3 4 5 6 7
N 7 7 6 5 4 5 6 7 8

Algo : -
for(j=0 to M)
   MED[0][j]=j
for(i=0 to N)
   MED[i][0]=i;
for(i=1 to N)
   for(j=1 to M)
      if(s1[i]==s2[i])
        MED[i][j]=MED[i-1][j-1]
      else
        MED[i][j]=min{MED[i-1][j]+1,MED[i][j-1]+1,MED[i-1][j-1]+2}
Programme : -
Download Programme
#include<stdio.h>
#include<string.h>
int min(int a,int b,int c)
{
    if(a<b)
    {
        return(c<a?c:a);
    }
    else
    {
        return(c<b?c:b);
    }
}
main()
{
    int i,j;
    char s1[20],s2[20];
    printf("ENTER THE FIRST STRING : -");
    scanf("%s",s1);
    printf("ENTER THE SECOND STRING : -");
    scanf("%s",s2);
    int M,N;
    M=strlen(s1);
    N=strlen(s2);
    int MED[M+1][N+1];
    for(i=0;i<=N;i++)
    {
        MED[i][0]=i;
    }
    for(j=0;j<=M;j++)
    {
        MED[0][j]=j;
    }
    for(i=1;i<=N;i++)
    {
        for(j=1;j<=M;j++)
        {
            if(s1[i-1]==s2[j-1])
            {
                MED[i][j]=MED[i-1][j-1];
            }
            else
            {
                MED[i][j]=min((MED[i-1][j]+1),(MED[i][j-1]+1),(MED[i-1][j-1]+2));
            }

        }

    }
    printf("   ^  ");
    for(i=0;i<M;i++)
    {
        printf("%c  ",s2[i]);
    }
    printf("\n");
    printf("^  ");
    for(i=0;i<=N;i++)
    {

        for(j=0;j<=M;j++)
        {
            printf("%d  ",MED[i][j]);
        }

         printf("\n");
            printf("%c  ",s1[i]);
    }

    printf("\nCOST OF CONVERTING '%s' TO '%s' IS  : %d",s1,s2,MED[N][M]);

}


Comments

Popular Posts

Adaptive Software Development (ASD)

Adaptive Software Development (ASD) :- Adaptive Software Development(ASD) has been proposed by Jim High smith as technique for building complex software and system. ASD focus on human collaboration and team self-organization. Jim High smith defines an ASD "Life Cycle" that consist of three phases  Speculation Collaboration learning  During speculation, the project is initiated and adaptive cycle planning is conducted. Adaptive cycle planning uses project Initiation information,the customer's mission statement ,basic requirements and project constraint to define the set of release cycles that will be required for the project. Collaboration encompasses communication and team-work but it also enphasizes individualism because individual creativity plays an important role in collaborative thinking.It is all above all,a matter of trust. people working together must trust one another to comment without war help without irritation work as hard as ...

Dijkstra's algorithm

Djikstra's algorithm   solves the problem of finding the shortest path from a point in a graph (the  source ) to a destination. It turns out that one can find the shortest paths from a given source to  all  points in a graph in the same time, hence this problem is sometimes called the  single-source shortest paths  problem. This problem is related to the spanning tree one. The graph representing all the paths from one vertex to all the others must be a spanning tree - it must include all vertices. There will also be no cycles as a cycle would define more than one path from the selected vertex to at least one other vertex. For a graph, G = (V,E) where V is a set of vertices and E is a set of edges. Dijkstra's algorithm keeps two sets of vertices: S    the set of vertices whose shortest paths from the source have already been determined  and V-    S :   the remaining v...

Kruskal’s Minimum Spanning Tree Algorithm & Programme

PROBLEM :                  Kruskal's algorithm is a minimum-spanning-tree algorithm which finds an edge of the least possible weight that connects any two trees in the forest. It is a greedy algorithm in graph theory as it finds a minimum spanning tree for a connected weighted graph adding increasing cost arcs at each step. To understand Kruskal's algorithm let us consider the following example − Step 1 - Remove all loops and Parallel Edges Remove all loops and parallel edges from the given graph. In case of parallel edges, keep the one which has the least cost associated and ove all others. Step 2 - Arrange all edges in their increasing order of weight The next step is to create a set of edges and weight, and arrange them in an ascending order of weightage (cost). Step 3 - Add the edge which has the least weightage Now we start adding edges to the graph beginning from the one which has the least weight. Th...