A
A
Algorithm
Search…

# 一、匹配

## 1.1 字符串之间的匹配

### 1.1.1 KMP

Could not load image

next数组的优化 考虑到匹配时，模式串若出现p[j] = p[next[j]]的情况，则必定发生再次回溯。因此可在求next数组时，即令next[j] = next[k]回溯一次。

### 1.1.2 BM算法

BM算法基于坏字符规则和好后缀规则，效率更高。详细描述可参见这里

(1) "好后缀"的位置以最后一个字符为准。假定"ABCDEF"的"EF"是好后缀，则它的位置以"F"为准，即5（从0开始计算）。 (2) 如果"好后缀"在搜索词中只出现一次，则它的上一次出现位置为 -1。比如，"EF"在"ABCDEF"之中只出现一次，则它的上一次出现位置为-1（即未出现）。 (3) 如果"好后缀"有多个，则除了最长的那个"好后缀"，其他"好后缀"的上一次出现位置必须在头部。比如，假定"BABCDAB"的"好后缀"是"DAB"、"AB"、"B"，请问这时"好后缀"的上一次出现位置是什么？回答是，此时采用的好后缀是"B"，它的上一次出现位置是头部，即第0位。这个规则也可以这样表达：如果最长的那个"好后缀"只出现一次，则可以把搜索词改写成如下形式进行位置计算"(DA)BABCDAB"，即虚拟加入最前面的"DA"。

Could not load image
Describe BM Algorithm in English
The Boyer-Moore string-search algorithm is an efficient string-searching algorithm that is the standard benchmark for practical string-search literature. The algorithm preprocesses the pattern string instead of the string being searched in. The BM algorithm uses information during preprocess and it can skip along the text in jumps of multiple characters rather than searching every single character in the text. Specifically speaking, the start of pattern string will be aligned with the start of text string at the beginning. Characters are then compared from the end of pattern string, moving backward. If a mismatch occurs, the alignment will be shifted forward according to the maximum value permitted by several rules. The comparisons will be performed again at the new alignment. A shift is calculated by applying two rules: the bad character rule and the good suffix rule. The bad character rule considers the character at which the comparision process failed. The next occurrence of that character to the left is found, and a shift which brings that occurrence with the mismatched occurrence is proposed. Whereas the good suffix rule considers the suffix that has been matched and a shift which brings the occerrence with the matched occurrence is proposed. The BM algorithm has worst-case running time of O(nm), and the best-case running time is O(n/m). Generally speaking, the algorithm run faster as the pattern length increases.

### 1.1.3 Sunday算法

md Sunday算法是真的厉害
Sunday算法在开始匹配的时候仍然是将头部对齐，但从前端开始匹配，若出现匹配失败，要考虑文本串中参与匹配的最末尾字符的下一位字符在模式串中的位置，将两者对齐再进行匹配。
Sunday算法易于理解且速度快，但容易出现O(mn)的情况。
Could not load image

# 二、子串

## 2.1 最长不重复子串

Leetcode第三题 直接穷举的话复杂度肯定会很高，而初次写就想到使用Map也不是很容易，尤其是Map的value可以不断变化的想法。
public int lengthOfLongestSubstring(String s) {
int result = 0;
int currentStart = 0;
HashMap<Character, Integer> infoMap = new HashMap<>();
for (int i = 0; i < s.length(); i++) {
if (infoMap.containsKey(s.charAt(i))) {
currentStart = Math.max(infoMap.get(s.charAt(i)) + 1, currentStart);
}
infoMap.put(s.charAt(i), i);
result = Math.max(result, i - currentStart + 1);
}
return result;
}

## 2.2 字符串数组中最长公共前缀

Leetcode之所以设为easy难度大抵在于O(nm)也可以过。
Could not load image

## 2.4 最长回文子串/子序列

$F[i][j] = \begin{cases} true, \quad i=j \\\\ c[i] == c[j], \quad j=i+1 \\\\ c[i] == c[j]\ and\ f(i+1, j-1), \quad j>i+1 \end{cases}$
// 最长回文子串
int maxPalin(string str)
{
vector<vector<bool>> f(str.size(), vector<bool>(str.size(), false));
int resultMax = 0;
f = true;
for(int j = 1; j < str.size(); j++)
{
// 单个字符一定是回文串
f[j][j] = true;
for(int i = 0; i < j; i++)
{
f[i][j] = (str[i] == str[j] && (j == i + 1 || f[i + 1][j - 1]));
if(f[i][j])
resultMax = max(resultMax, j - i + 1);
}
}
return resultMax;
}

$F[i][j] = \begin{cases} 1, \quad i=j \\\\ 2, \quad j=i+1\ and\ c[i] == c[j] \\\\ f(i+1, j-1) + 2, \quad j>i+1\ and\ c[i] == c[j] \\\\ max(f(i+1, j),\ f(i, j-1)),\quad c[i]\ !=\ c[j] \end{cases}$
int maxPalinSubSeq(string str)
{
vector<vector<int>> f(str.size(), vector<int>(str.size(), 1));
for(int i = str.size() - 1; i >= 0; i--)
{
for(int j = i + 1; j < str.size(); j++)
{
if(j == i + 1 && str[i] == str[j])
f[i][j] = 2;
else if(str[i] == str[j])
f[i][j] = f[i + 1][j - 1] + 2;
else
f[i][j] = max(f[i + 1][j], f[i][j - 1]);
}
}
return f[str.size() - 1];
}

# 三、变换

## 3.1 字符串的排列子串问题

// 3.1 字符串的排列是否是另一字符串的子串
bool permutationInStr(string pattern, string text)
{
if(pattern.size() > text.size() || pattern == "") {
return false;
}
vector<int> patternCountMap(128, 0);
for(char& eachC : pattern) {
patternCountMap[eachC]++;
}
vector<int> textCurrentCountMap(128, 0);
unsigned int left = 0;
unsigned int right = pattern.size() - 1;
for(int i = left; i <= right; i++) {
textCurrentCountMap[text[i]]++;
}
if(patternCountMap == textCurrentCountMap) {
return true;
}
while(right + 1 < text.size()) {
textCurrentCountMap[text[left]]--;
left++;
right++;
textCurrentCountMap[text[right]]++;
if(patternCountMap == textCurrentCountMap) {
return true;
}
}
return false;
}

## 3.2 求字符串的全排列

// 字符串的全排列
vector<string> result;
void allPermutationHelper(string str, unsigned int currentStart);
void allPermutation(string str) {
result.clear();
allPermutationHelper(str, 0);
sort(result.begin(), result.end());
copy(result.begin(), result.end(), ostream_iterator<string>{std::cout, "\n"});
}
void allPermutationHelper(string str, unsigned int currentStart) {
if (currentStart == str.size() - 1) {
result.push_back(str);
return;
}
for (unsigned int j = currentStart; j < str.size(); j++) {
if(j != currentStart && str[currentStart] == str[j]) {
continue;
}
swap(str[j], str[currentStart]);
allPermutationHelper(str, currentStart + 1);
swap(str[j], str[currentStart]);
}
}

## 3.3 翻转字符串中的全部单词

// 3.3 翻转字符串中的全部单词
string reverseWords(string s) {
string result;
for(int i = 0;i < s.size(); i++) {
if(s[i] == ' ') {
continue;
}
int nextPos = s.find(" ", i);
if(nextPos != string::npos) {
// 注意起始的空格
// 其实可以考虑最后移掉多余的空格
if(result.size() > 0) {
result = s.substr(i, nextPos - i) + " " + result;
} else {
result = s.substr(i, nextPos - i);
}
i = s.find(" ", i);
} else {
if(result.size() > 0) {
result = s.substr(i, nextPos - i) + " " + result;
} else {
result = s.substr(i, nextPos - i);
}
break;
}
}
return result;
}

## 3.4 旋转字符串

// 3.4 旋转字符串
bool rotateString(string A, string B) {
return A.size() == B.size() && ((A + A).find(B) != string::npos);
}

1.1 字符串之间的匹配
1.2 简单正则表达式的匹配

2.1 最长不重复子串
2.2 字符串数组中最长公共前缀
2.3 最长回文串
2.4 最长回文子串/子序列

3.1 字符串的排列子串问题
3.2 求字符串的全排列
3.3 翻转字符串中的全部单词
3.4 旋转字符串
3.5 parseInt