如何快速查找两个字符串中的最大公字串?


如"abcecade","dgecadde"的最大子串为"ecad"

python 算法 C++ JavaScript

德國的榮耀 11 years, 6 months ago

把字符串1(长度m)横排,串2(长度n)竖排,得到一个m×n的矩阵c,矩阵的每个元素的值如下,如果m[i]=n[j],则c[j][i]=1,否则,c[j][i]=0。然后找出矩阵中连续是1的对角线最长的一个,则对角线的长度就是公共子串的长度.
经过改进,可以不需要构造矩阵,因为第i行如果有字母匹配,其取值仅与第i-1行相关,若m[i]=n[j],则c[j][i] = c[j-1][i-1] + 1,这样仅需要记录一个长度为m的一维数组就可以了。

   
  #include <stdio.h>
  
#include <stdlib.h>

char * StringSearch( char * str1, char * str2 )
{
int i;
int j;
char* ptempBuffer1;
char* ptempBuffer2;
char* pwork;
char* plast;
char* ptemp;
char* retstr;
int resultIndex = 0;
int resultLength = 0;
int str1Size = 0;
int str2Size = 0;
ptempBuffer1 = str1;
while( *ptempBuffer1 != '\0' )
{
ptempBuffer1++;
str1Size++;
}
ptempBuffer2 = str2;
while( *ptempBuffer2 != '\0' )
{
ptempBuffer2++;
str2Size++;
}

ptempBuffer1 = ( char * ) malloc( str1Size );
pwork = ptempBuffer1;
memset( pwork, 0, str1Size );
ptempBuffer2 = ( char * ) malloc( str1Size );
plast = ptempBuffer2;
memset( plast, 0, str1Size );
for( i = 0; i < str2Size; i++ )
{
for( j = 0; j < str1Size; j++ )
{
if( *( str1 + j ) == *( str2 + i ) )
{
if( j == 0 )
{
*( pwork + j ) = 1;
}
else
{
*( pwork + j ) = *( plast + j - 1 ) + 1;
}
if( resultLength < *( pwork + j ) )
{
resultIndex = j;
resultLength = *( pwork + j );
}
}
else
{
*( pwork + j ) = 0;
}
}
ptemp = pwork;
pwork = plast;
plast = ptemp;
}
retstr = ( char * ) malloc( resultLength + 1 );
memcpy( retstr, str1 + resultIndex - resultLength + 1, resultLength );
*( retstr + resultLength ) = '\0';
printf( "resultIndex = %d, resultLength = %d\n", resultIndex, resultLength );
free( ptempBuffer1 );
free( ptempBuffer2 );
return retstr;
}

int main(int argc, char *argv[])
{
char* ret = NULL;
ret = StringSearch( "adbccadebbca", "edabccadece" );
printf( "result String is %s\n", ret );
free( ret );
system("PAUSE");
return 0;
}

为了方便,采用了两个容量为m的一维数组来保存运行中的结果,空间复杂度为m+n+2 m(保存打印输出的结果字符串可以不需要),也就是O(m+n)。由于需要事先遍历字符串得到长度,算法复杂度为m n + m + n,O(m*n)级别。

引用地址

darling answered 11 years, 6 months ago

Your Answer