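Posted by: sioungiep (老实的苦力熊), Board: SearchEngineTech
Subject: Re: Looking for a method to compute text similarity at large scale
Posted from: 水木社区 (Fri Feb 15 10:16:18 2008), on-site

A former Baidu employee told me that Baidu's method for de-duplicating news is very simple. Take two news articles, find the longest sentence in each, and compare them; if the two sentences are identical, the two articles are treated as the same. The method is said to be quite fast and also quite accurate.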
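The idea in the post can be sketched in a few lines of C++. This is only an illustration of the description above, not Baidu's actual implementation: the function names (`terminatorLength`, `trim`, `longestSentence`, `looksDuplicate`) are hypothetical, and the sentence splitting rule (ASCII `.`, `!`, `?` plus the UTF-8 full-width 。！？), the byte-based length measure, and the exact-match comparison are all assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Assumption: a sentence ends at ASCII . ! ? or at UTF-8 encoded 。！？
// Returns the terminator's length in bytes at text[pos], or 0 if none starts there.
static std::size_t terminatorLength(const std::string& text, std::size_t pos) {
    static const char* const kTerminators[] = {
        ".", "!", "?", "\xE3\x80\x82", "\xEF\xBC\x81", "\xEF\xBC\x9F"
    };
    for (const char* t : kTerminators) {
        const std::string term(t);
        if (text.compare(pos, term.size(), term) == 0) return term.size();
    }
    return 0;
}

// Strips leading and trailing ASCII whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Returns the longest sentence of text, measured in bytes (first one wins on ties).
std::string longestSentence(const std::string& text) {
    std::string best, current;
    std::size_t i = 0;
    while (i < text.size()) {
        const std::size_t n = terminatorLength(text, i);
        if (n > 0) {
            const std::string s = trim(current);
            if (s.size() > best.size()) best = s;
            current.clear();
            i += n;
        } else {
            current += text[i];
            ++i;
        }
    }
    const std::string s = trim(current);  // text may end without a terminator
    if (s.size() > best.size()) best = s;
    return best;
}

// Two articles are flagged as duplicates when their longest sentences match exactly.
bool looksDuplicate(const std::string& a, const std::string& b) {
    const std::string la = longestSentence(a);
    return !la.empty() && la == longestSentence(b);
}
```

At scale one would presumably hash each article's longest sentence and look the hash up in a table instead of comparing articles pairwise, but the post does not say how this was done.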
Archives for February 2008
Five search engine systems I found

Five search engine systems I came across today:
龙蛛搜索 http://www.51kdv.com/
深蓝搜索引擎 http://www.darkbluesoft.com
K风网页搜索引擎系统 (K-PageSearch Engine) http://www.kwindsoft.com/
射日搜索 http://www.shootsoft.net/
Sphider, a lightweight search engine http://www.smf.cn/bbs/index.php/topic,114.0.html http://www.sphider.eu/
ZOJ 1151 Word Reversal
C++ code

/*
Title: Word Reversal
Problem URL: http://acm.zju.edu.cn/show_problem.php?pid=1151
Author: moqi
Date: 2008-02-08
Description: Accepted 1151 C++ 00:00.04 392K with Presentation Error * 4
*/
#include <stdio.h>
#include <string.h>

int i, lines;
int total, now;
char s[250];
int c; /* int rather than char, so the comparison with EOF is reliable */

int main() {
#ifndef ONLINE_JUDGE
    /* read from a local file when not running on the judge */
    freopen("1151.txt", "r", stdin);
#endif
    scanf("%d", &total);
    for (now = 0; now < total; now++) {
        scanf("%d\n", &lines); /* IMPORTANT: the trailing \n consumes the line break */
        while (lines-- >= 0) {
            i = 0;
            while ((c = getchar()) != EOF) […]
ZOJ 1067 Color Me Less
C++ code

/*
Title: Color Me Less
Problem URL: http://acm.zju.edu.cn/show_problem.php?pid=1067
Author: moqi
Date: 2008-02-08
Description: Accepted 1067 C++ 00:00.00 388K
*/
#include <stdlib.h>
#include <stdio.h>

int t[16][3];
int now[3];
int i, j;
long min, d;

int main() {
#ifndef ONLINE_JUDGE
    /* read from a local file when not running on the judge */
    freopen("1067.txt", "r", stdin);
#endif
    /* the first 16 lines are the target colors */
    for (i = 0; i < 16; i++) {
        scanf("%d %d %d", &t[i][0], &t[i][1], &t[i][2]);
    }
    while (scanf("%d %d %d", &now[0], &now[1], &now[2]) != EOF) {
        if (now[0] == -1) break; /* -1 marks the end of the input */
        […]
ZOJ 1051 A New Growth Industry
C++ code

/*
Title: A New Growth Industry
Problem URL: http://acm.zju.edu.cn/show_problem.php?pid=1051
Author: fairylan
Date: 2008-02-07
Reference: http://blog.csdn.net/fairylan/archive/2006/07/10/900817.aspx
Description: Accepted 1051 C++ 00:00.00 392K
*/
#include <stdio.h>
#include <string.h>

#define MAXN 20

char ch[] = ".!X#";
int dish[MAXN][MAXN], res[MAXN][MAXN];
int day, d[16];

void solve() {
    int i, j, k;
    scanf("%d", &day);
    for (k = 0; k < 16; ++k)
        scanf("%d", &d[k]);
    for (i = 0; i < MAXN; ++i)
        for (j = 0; j < MAXN; ++j)
            scanf("%d", &dish[i][j]);
    while (day--) {
        for (i = 0; i < MAXN; ++i)
            for (j = 0; j < MAXN; ++j) {
                /* k accumulates the cell's own value plus its neighbors */
                k = dish[i][j];
                if (i - 1 >= 0) k += dish[i - 1][j];
                […]