Example 0
Input:
ilovechina
i,ilove,lo,love,ch,china,lovechina
Output:
ilove,china
Example 1
Input:
ilovechina
i,love,china,ch,na,ve,lo,this,is,the,word
Output:
i,love,china
Explanation:
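Example 2
Input:
iat
i,love,china,ch,na,ve,lo,this,is,the,word,beauti,tiful,ful
Output:
i,a,t
Explanation: Letters that are not in the dictionary and cannot be combined into a dictionary word are output one letter at a time.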
Example 3
Input:
ilovechina,thewordisbeautiful
i,love,china,ch,na,ve,lo,this,is,the,word,beauti,tiful,ful
Output:
i,love,china,the,word,is,beauti,ful
Explanation: The punctuation marks are English (half-width) punctuation marks.
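Example 3 suggests that punctuation acts as a word boundary and is never emitted. The sketch below makes that step explicit by splitting the input into runs of letters before any dictionary matching takes place. It assumes that every non-letter character counts as a separator, which the examples imply but do not state; the class name `LetterRunsSketch` and the method `letterRuns` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class LetterRunsSketch {
    // Splits the input into maximal runs of ASCII letters. Every other
    // character (comma, period, semicolon, ...) is treated as a separator
    // and dropped. This separator rule is an assumption drawn from Example 3.
    static List<String> letterRuns(String input) {
        List<String> runs = new ArrayList<>();
        for (String run : input.split("[^a-zA-Z]+")) {
            // A separator at the very start produces one empty token; skip it.
            if (!run.isEmpty()) {
                runs.add(run);
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(letterRuns("ilovechina,thewordisbeautiful"));
        // prints [ilovechina, thewordisbeautiful]
    }
}
```

Each run can then be segmented independently, since no dictionary word can span a punctuation mark.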
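import java.util.Arrays;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class 中文模拟分词器2 {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        String input = in.nextLine();
        // A HashSet gives constant-time dictionary lookups.
        Set<String> dict = new HashSet<>(Arrays.asList(in.nextLine().split(",")));
        int len = input.length();
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < len) {
            // Forward maximum matching: try the longest candidate first.
            int j = len;
            boolean found = false;
            while (j > i) {
                String s = input.substring(i, j);
                // Accept a letters-only dictionary word, or fall back to a single letter.
                if (s.matches("[a-zA-Z]+") && (dict.contains(s) || s.length() == 1)) {
                    sb.append(s).append(",");
                    found = true;
                    i = j;
                    break;
                }
                j--;
            }
            // No match at all means input.charAt(i) is punctuation: skip it.
            if (!found) {
                i++;
            }
        }
        // Drop the trailing comma; the guard covers input that contains no letters.
        if (sb.length() > 0) {
            sb.setLength(sb.length() - 1);
        }
        System.out.println(sb);
    }
}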
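To check the solution above against the four examples without feeding it standard input, the same forward-maximum-matching logic can be wrapped in a pure function. The following is only a sketch under stated assumptions: the class name `SegmenterSketch` and the method `segment` are hypothetical, dictionary words are assumed to contain only ASCII letters, and the cap on candidate length by the longest dictionary word is an added optimization rather than part of the original solution.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SegmenterSketch {
    // Forward maximum matching: at each letter, take the longest dictionary
    // word starting there; if none matches, emit the single letter.
    // Non-letter characters are skipped and act as word boundaries.
    static String segment(String input, String dictLine) {
        Set<String> dict = new HashSet<>(Arrays.asList(dictLine.split(",")));
        int maxLen = 1; // longest dictionary word (assumed optimization)
        for (String w : dict) {
            maxLen = Math.max(maxLen, w.length());
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            if (!isLetter(input.charAt(i))) {
                i++; // punctuation is never output
                continue;
            }
            // Extend the candidate to the end of the letter run, capped at maxLen.
            int limit = Math.min(input.length(), i + maxLen);
            int end = i + 1;
            while (end < limit && isLetter(input.charAt(end))) {
                end++;
            }
            // Shrink from the longest candidate until a dictionary word is
            // found; stopping at i + 1 is the single-letter fallback.
            int j = end;
            while (j > i + 1 && !dict.contains(input.substring(i, j))) {
                j--;
            }
            if (out.length() > 0) {
                out.append(',');
            }
            out.append(input, i, j);
            i = j;
        }
        return out.toString();
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        System.out.println(segment("ilovechina", "i,ilove,lo,love,ch,china,lovechina"));
        // prints ilove,china
    }
}
```

Building the separator lazily (a comma before every word except the first) avoids trimming a trailing comma, so input without any letters simply yields an empty string.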