The good results of the previous section show that linguistic information plays
a key role in finding LS queries. Rada Mihalcea's study of keyword extraction
and paragraph summarization for academic papers uses graph-based ranking
algorithms. Three of them, undirected graph sentence rank, forward graph
sentence rank and backward graph sentence rank, have proved effective in plain
text retrieval, and all three are applied and tested in this project on all 225 pages.
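The three variants can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a TextRank-style word-overlap similarity between sentences, a damping factor of 0.85, and that "forward" means edges point from a sentence to later sentences while "backward" means edges point to earlier ones. The names `tokenize`, `similarity` and `sentence_rank` are hypothetical.

```python
import re
from math import log


def tokenize(sentence):
    """Lower-case a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Word overlap normalised by the log sizes of both token sets (assumed measure)."""
    if not a or not b:
        return 0.0
    denom = log(len(a)) + log(len(b))
    return len(a & b) / denom if denom > 0 else 0.0


def sentence_rank(sentences, direction="undirected", damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank-style iteration.

    direction: "undirected" links every pair both ways,
               "forward" keeps only edges i -> j with j after i,
               "backward" keeps only edges i -> j with j before i.
    """
    words = [tokenize(s) for s in sentences]
    n = len(sentences)
    # weight[i][j] is the weight of the edge from sentence i to sentence j
    weight = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if direction == "forward" and j < i:
                continue
            if direction == "backward" and j > i:
                continue
            weight[i][j] = similarity(words[i], words[j])

    out_sum = [sum(row) for row in weight]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            # Each linking sentence passes on a share of its score
            incoming = sum(
                weight[i][j] / out_sum[i] * scores[i]
                for i in range(n)
                if weight[i][j] > 0
            )
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


page = [
    "The lost web page described graph ranking of sentences.",
    "Graph ranking assigns a score to each sentence in the page.",
    "Search engines receive the top ranked sentence as a query.",
    "Unrelated filler text about weather.",
]
for mode in ("undirected", "forward", "backward"):
    ranked = sentence_rank(page, direction=mode)
    best = max(range(len(page)), key=lambda k: ranked[k])
    print(mode, "->", page[best])
```

Under these assumptions, the highest-scoring sentence of a page would be the candidate submitted to the search engine as a query; only the edge direction differs between the three variants.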
SentenceRank | Google pages | Google rate | Yahoo pages | Yahoo rate
3 | 87.00 | 38.67% | 93.00 | 41.33%
4 | 113.00 | 50.22% | 122.00 | 54.22%
5 | 133.00 | 59.11% | 136.00 | 60.44%
6 | 146.00 | 64.89% | 151.00 | 67.11%
7 | 152.00 | 67.56% | 155.00 | 68.89%
8 | 157.00 | 69.78% | 157.00 | 69.78%
9 | 159.00 | 70.67% | 161.00 | 71.56%
10 | 162.00 | 72.00% | 157.00 | 69.78%
11 | 166.00 | 73.78% | 158.00 | 70.22%
12 | 165.00 | 73.33% | 160.00 | 71.11%
13 | 165.00 | 73.33% | 161.00 | 71.56%
14 | 164.00 | 72.89% | 166.00 | 73.78%
15 | 167.00 | 74.22% | 167.00 | 74.22%
Average | 148.92 | 66.19% | 149.54 | 66.46%
Table 4.25
Figure 4.27 (a), (b): Number of successfully retrieved pages out of 225 and the
corresponding percentages for undirected graph sentence rank.
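The percentage and average rows of Table 4.25 follow directly from the page counts. The short check below recomputes them for the Google column. The counts are copied from the table; the names `TOTAL_PAGES` and `success_rate` are illustrative only.

```python
TOTAL_PAGES = 225  # size of the test collection

# Google page counts from Table 4.25, rows 3 to 15
google_counts = [87, 113, 133, 146, 152, 157, 159, 162, 166, 165, 165, 164, 167]


def success_rate(count, total=TOTAL_PAGES):
    """Return the share of successfully retrieved pages as a percentage."""
    return 100.0 * count / total


rates = [round(success_rate(c), 2) for c in google_counts]
average_count = sum(google_counts) / len(google_counts)
average_rate = success_rate(average_count)

print(rates)
print(round(average_count, 2), round(average_rate, 2))
```

The same calculation applies to the Yahoo column and to the forward and backward tables.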
Forward | Google pages | Google rate | Yahoo pages | Yahoo rate
3 | 85.00 | 37.78% | 93.00 | 41.33%
4 | 105.00 | 46.67% | 127.00 | 56.44%
5 | 139.00 | 61.78% | 144.00 | 64.00%
6 | 155.00 | 68.89% | 155.00 | 68.89%
7 | 161.00 | 71.56% | 160.00 | 71.11%
8 | 165.00 | 73.33% | 159.00 | 70.67%
9 | 167.00 | 74.22% | 163.00 | 72.44%
10 | 169.00 | 75.11% | 168.00 | 74.67%
11 | 172.00 | 76.44% | 167.00 | 74.22%
12 | 172.00 | 76.44% | 168.00 | 74.67%
13 | 170.00 | 75.56% | 169.00 | 75.11%
14 | 169.00 | 75.11% | 171.00 | 76.00%
15 | 173.00 | 76.89% | 174.00 | 77.33%
Average | 154.00 | 68.44% | 155.23 | 68.99%
Table 4.26
Figure 4.28 (a), (b): Number of successfully retrieved pages out of 225 and the
corresponding percentages for forward graph sentence rank.
Backward | Google pages | Google rate | Yahoo pages | Yahoo rate
3 | 79.00 | 35.11% | 87.00 | 38.67%
4 | 109.00 | 48.44% | 105.00 | 46.67%
5 | 128.00 | 56.89% | 117.00 | 52.00%
6 | 147.00 | 65.33% | 136.00 | 60.44%
7 | 154.00 | 68.44% | 135.00 | 60.00%
8 | 158.00 | 70.22% | 153.00 | 68.00%
9 | 159.00 | 70.67% | 151.00 | 67.11%
10 | 164.00 | 72.89% | 152.00 | 67.56%
11 | 165.00 | 73.33% | 162.00 | 72.00%
12 | 164.00 | 72.89% | 166.00 | 73.78%
13 | 164.00 | 72.89% | 155.00 | 68.89%
14 | 168.00 | 74.67% | 153.00 | 68.00%
15 | 170.00 | 75.56% | 160.00 | 71.11%
Average | 148.38 | 65.95% | 140.92 | 62.63%
Table 4.27
Figure 4.29 (a), (b): Number of successfully retrieved pages out of 225 and the
corresponding percentages for backward graph sentence rank.
Besides the higher success retrieval rate achieved by sentence rank, which
reaches about 75% and peaks at 77.33% for forward sentence rank on Yahoo, it is
worth mentioning that the results from Google and Yahoo are very close to each
other, unlike the results of all the previous sections. The average results are
shown in Figure 4.30; for easy comparison, the Title method is also included.
The benefits of sentence rank cannot be disregarded.
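How close the two engines are can be checked with a small calculation of the absolute gap between the average Google and Yahoo rates. The averages are taken from Tables 4.25 to 4.27 and, for the Title method, from Table 4.28. The dictionary layout and variable names are illustrative only.

```python
# Average success rates in percent: (Google, Yahoo)
averages = {
    "Title": (50.05, 44.00),
    "Sentence Rank": (66.19, 66.46),
    "Forward Sentence Rank": (68.44, 68.99),
    "Backward Sentence Rank": (65.95, 62.63),
}

# Absolute difference between the two engines, in percentage points
gaps = {
    method: round(abs(google - yahoo), 2)
    for method, (google, yahoo) in averages.items()
}

for method, gap in gaps.items():
    print(f"{method}: {gap:.2f} percentage points")
```

On these figures, the undirected and forward variants differ by well under one percentage point between the two engines, while the Title method differs by about six.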
Figure 4.30 Comparison of all sentence rank methods with the Title method.
Finally, a comprehensive chart of the average success retrieval rates of all
methods is shown in Figure 4.31.
Figure 4.31 Comparison of all methods.
x-axis | Method | Google | Yahoo
1 | Title | 50.05% | 44.00%
2 | TF | 60.24% | 51.52%
3 | DF | 71.35% | 58.36%
4 | TFIDF | 66.36% | 55.62%
5 | PW | 60.55% | 49.94%
6 | TF3DF2 | 71.38% | 61.09%
7 | TF4DF1 | 66.84% | 55.90%
8 | TF5DF5 | 71.25% | 61.23%
9 | TFIDF3DF2 | 71.62% | 63.08%
10 | TFIDF4DF1 | 69.16% | 58.63%
11 | TFIDF5DF5 | 71.69% | 62.12%
12 | Word Rank | 55.79% | 49.57%
13 | Nouns & Verbs Rank | 47.32% | 44.55%
14 | WordRank3DF2 | 71.69% | 59.90%
15 | WordRank4DF1 | 65.95% | 56.79%
16 | WordRank5DF5 | 71.52% | 60.72%
17 | WordRank3TFIDF2 | 53.64% | 44.79%
18 | WordRank4TFIDF1 | 54.87% | 47.90%
19 | WordRank5TFIDF5 | 51.38% | 43.01%
20 | Random Sentence Pick | 67.97% | 64.14%
21 | Sentence Rank | 66.19% | 66.46%
22 | Forward Sentence Rank | 68.44% | 68.99%
23 | Backward Sentence Rank | 65.95% | 62.63%
Table 4.28