If I understand it right, the term frequency-inverse document frequency (tf-idf) model of text retrieval that most search engines use implies that I should ideally not use the name of the city on ANY of the site's pages except the home page.
What I understand about tf-idf is that the more a word is used throughout the entire corpus (the site), the lower its IDF and hence the lower its overall relevancy score.
Is this a correct line of thinking?
Here's a basic paper that gives a clear definition of IDF and covers a few other related concepts.
What I understand about tf-idf is that the more a word is used throughout the entire corpus (the site), the lower its IDF and hence the lower its overall relevancy score.
In this case, the corpus would be the Web as indexed by that engine, not your site alone.
So in an engine using IDF, if you searched for [widgets in cityname], whichever of widgets or cityname appears in fewer indexed documents would be given more weight when matching documents.
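To make that concrete, here is a rough sketch in Python of the textbook IDF formula, log(N / df), applied to a tiny made-up index. The five documents, the function names idf and tf_idf, and the use of raw term counts are all assumptions for illustration; no real engine is claimed to score pages exactly this way.

```python
import math

# Hypothetical toy index standing in for "the Web as indexed by that engine".
docs = [
    "widgets for sale in cityname",
    "cityname restaurants and hotels",
    "cityname council news",
    "history of cityname",
    "weather forecast for the weekend",
]

def idf(term, corpus):
    """Textbook IDF: log(N / df), where df is the number of documents containing term."""
    n_docs = len(corpus)
    df = sum(1 for doc in corpus if term in doc.split())
    return math.log(n_docs / df) if df else 0.0

def tf_idf(term, doc, corpus):
    """Raw count of term in doc, multiplied by the corpus-wide IDF."""
    return doc.split().count(term) * idf(term, corpus)

# "widgets" appears in 1 of 5 documents, "cityname" in 4 of 5.
idf_widgets = idf("widgets", docs)
idf_city = idf("cityname", docs)

# Per-term contribution to the score of the first document.
score_widgets = tf_idf("widgets", docs[0], docs)
score_city = tf_idf("cityname", docs[0], docs)
```

In this toy index widgets is the rarer term, so it ends up with the larger IDF and contributes more to the first document's score than cityname does. Note that the document frequency is counted over the whole index, so repeating the city name on your own pages barely moves it.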
I think most people would agree that mentioning the city name on pages about that city helps search engines.