Word-rank is one
implementation of weighted graph ranking algorithm including undirected
weighted (on edges) graph and directed weighted (on edges) graph when a single
word/term is considered as a vertex and all content is a graph. A window size
parameter ‘w’ is introduced for
implementing connection among vertices. In undirected weighted graph, each word
has connection with other words only in the window size distance, including previous
w words and following w words. In directed weighted graph,
each word has connection with the following words only in the window size
distance. Take Figure2.14 as an example and set window size to 2, ‘packing’
has connections with ‘ferocious’, ‘storm’, ‘freezing’ and ‘rain’ in undirected
weighted graph, while it only has connections with ‘freezing’ and ‘rain’ in
directed weighted graph. The score associated with each vertex is set to an
initial value of 1 and ranking algorithm, 2-6 for undirected weighted graph and 2-7 for directed weighted graph, is run on graph repeatedly
until it converges – usually for 20-30 iterations, at a threshold of 0.0001 [9].
Figure2.14
The expected end
result for this application is a set of words or phrases that are
representative for a natural language text. The terms to be ranked are
therefore sequences of one or more lexical units extracted from text, and these
represent the vertices that are added to the text graph. If more than 1 term
happened to be neighbors, they can be connected as a key phrase. Thus, the
language consistency in the content is preserved. Rada and Paul in their paper “TextRank:
Bring Order into Texts” gave a clear view of this passage “Compatibility of
systems of linear constraints over the set of natural numbers. Criteria of
compatibility of a system of linear Diophantine equations, strict inequations,
and nonstrict inequations are considered. Upper bounds for components of a
minimal set of solutions and algorithms of construction of minimal generating
sets of solutions for all types of systems are given. These criteria and the
corresponding algorithms for constructing a minimal supporting set of solutions
can be used in solving all the considered types systems and systems of mixed
types.” and extracted the following terms as results: “linear constraints”, “linear
Diophantine equations”, “natural
numbers”, “nonstrict inequations”,
“strict inequations” and “uper bounds” [9].