Google Blogoscoped


awkward page and entry count

Hadas Velan [PersonRank 0]

Thursday, January 15, 2009

When we search for a term (for example, "Hadas Velan"), the first results page shows 8 pages of results and a count of 109 web pages. But when I go to the last page, there are suddenly only 5 pages and only 46 web pages. We encounter the same problem in Maltese and in Hebrew.

Roger Browne [PersonRank 10]

15 years ago

In a world of finite data storage, it's almost inevitable that the counts for multi-word searches are approximate.

Google's indexes hold the result count for every individual word. It's not practical to also do this for every multi-word phrase – the combinatorial explosion of possibilities is just too large.
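To get a feel for that explosion, here is a rough back-of-the-envelope sketch. The vocabulary size is a made-up round number chosen only for illustration, not a figure from Google, and `phrase_count` is a hypothetical helper.

```python
# HYPOTHETICAL vocabulary size, chosen only to illustrate the growth rate.
VOCABULARY_SIZE = 1_000_000


def phrase_count(vocabulary_size: int, phrase_length: int) -> int:
    """Number of distinct ordered phrases of the given length."""
    return vocabulary_size ** phrase_length


# One stored count per single word is manageable; one per phrase is not.
for length in (1, 2, 3):
    print(f"{length}-word phrases: {phrase_count(VOCABULARY_SIZE, length):,}")
```

Each extra word multiplies the number of possible phrases by the vocabulary size, which is why storing an exact count for every phrase quickly becomes impractical.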

Suppose you search for two words together. Google's index reveals that the first word occurs in, say, 0.0001% of all web pages, and the second in, say, 0.003% of all web pages. From this, Google estimates the percentage of web pages that would contain both search terms, for instance by multiplying the two frequencies as if the words occurred independently of each other.

But some search terms are more (or less) likely to occur together than other search terms, so the front page estimate can never be more than an estimate.
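A minimal sketch of that kind of estimate, assuming the simplest possible model in which the two words occur independently, so their frequencies are multiplied. This is not Google's actual algorithm. The index size, the toy corpus, and the function names are all hypothetical; only the two percentages come from the example above.

```python
from fractions import Fraction


def independent_estimate(total_pages, *term_fractions):
    """Estimated number of pages containing every term,
    ASSUMING the terms occur independently of one another."""
    joint = Fraction(1)
    for fraction in term_fractions:
        joint *= fraction
    return total_pages * joint


# Per-word frequencies from the example: 0.0001% and 0.003% of all pages.
first = Fraction(1, 1_000_000)
second = Fraction(3, 100_000)

# HYPOTHETICAL index size, used only to turn the fraction into a count.
TOTAL_PAGES = 10**12
estimate = independent_estimate(TOTAL_PAGES, first, second)

# Toy corpus (made up) in which "new" and "york" tend to appear together.
toy_pages = [
    {"new", "york"}, {"new", "york"}, {"new"}, {"old"}, {"old"},
    {"town"}, {"town"}, {"city"}, {"city"}, {"road"},
]


def fraction_containing(pages, word):
    """Fraction of pages that contain the given word."""
    return Fraction(sum(word in page for page in pages), len(pages))


toy_estimate = independent_estimate(
    len(toy_pages),
    fraction_containing(toy_pages, "new"),
    fraction_containing(toy_pages, "york"),
)
# Exact count: pages that really contain both words.
toy_actual = sum({"new", "york"} <= page for page in toy_pages)

print(f"estimated: {float(toy_estimate)}, actual: {toy_actual}")
```

In the toy corpus the independence estimate comes out below one page while two pages actually match, because the two words are correlated. A mismatch in the other direction is just as possible, which would fit a front-page count that shrinks once the real results are enumerated.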

No doubt Google's actual algorithms are more refined than I've described, but this is how I've seen multi-word indexing implemented on several software projects. Computationally, it's intractable to obtain and store the result count for every phrase.
