rainborick - 3:48 pm on Sep 25, 2007 (gmt 0)
The search engines' behavior in this area has remained unchanged for at least 2 years. They check for a CC TLD first, and failing that they refer to the IP address of the server. It is easy to see why they chose these methods. It's dirt simple to implement because it doesn't require a lot of ongoing processing, and it's reasonably reliable in terms of the Web overall. Routine updates of the IP address data for sites on generic TLDs are all that's required. No convoluted analysis of <meta> tags, content, or links is involved.

Unfortunately, the search engines' policies are essentially invisible to users. I suspect that they don't document this well because it's wrapped up in the ranking algorithm. They won't give away any part of the secret sauce recipe even when it's obviously just 1000 Island Dressing. And because it's kept secret, it bites small websites that rely on inexpensive hosting in the US and also happen to have chosen a .com or .net domain name without any idea of the impact it may have.

Language is a separate issue that I've been curious about but have yet to investigate. Here again, the search engines don't seem to explicitly outline their methods for determining language. It would be interesting to try a few language-restricted searches and analyze the pages that appear in the results. You'd have to record at least the server response header, the <meta> tags, and the 'lang' attributes for each page and see which ones seem to work.
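If anyone wants to try that language experiment, here is a rough Python sketch of the recording step. It only pulls out the three signals mentioned above (the Content-Language response header, a Content-Language <meta> tag, and 'lang' attributes) from a page you have already fetched. The function and class names are my own invention for illustration, and nothing here reflects how any search engine actually does it.

```python
from html.parser import HTMLParser


class LangSignalParser(HTMLParser):
    """Collects language hints found in the markup itself (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.html_lang = None        # value of <html lang="...">
        self.meta_lang = None        # <meta http-equiv="Content-Language" content="...">
        self.other_lang_attrs = []   # (tag, lang) for any other element with a lang attribute

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        lang = attrs.get("lang")
        if tag == "html":
            self.html_lang = lang
        elif lang:
            self.other_lang_attrs.append((tag, lang))
        http_equiv = (attrs.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "content-language":
            self.meta_lang = attrs.get("content")


def collect_language_signals(headers, html):
    """Return the language hints for one page as a plain dict.

    headers: mapping of response header names to values
    html:    the page source as a string
    """
    parser = LangSignalParser()
    parser.feed(html)

    header_lang = None
    for name, value in headers.items():
        if name.lower() == "content-language":  # header names are case-insensitive
            header_lang = value

    return {
        "header": header_lang,
        "html_lang": parser.html_lang,
        "meta": parser.meta_lang,
        "other": parser.other_lang_attrs,
    }
```

Run it over each page that shows up in a language-restricted search, dump the dicts into a spreadsheet, and compare which signals are consistently present on the pages that make it into the results.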
I had to check into geo-location for a client about 2 or 3 years ago. It took me several hours of scouring the online docs for both the US and non-US versions of the search engines' websites to find any information at all about geo-location. In recent months, I've tried to find all of that documentation again because of all of the online discussions I've been involved in, and I've been unable to find any definitive statements from any of the search engines. At the moment, the only public declaration I know of is from Google, where they address the issue from a user's standpoint in explaining country-specific searches. It mentions the CC TLD factor and indirectly refers to the IP address factor. Google also used to have advice for webmasters that said they would sometimes refer to the domain name registration data. If that advice is still out there, I can't find it (and I've never seen any evidence of their ever using it, anyway). So I've been relying on what I had found in the past and what I've experienced in the meantime.
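For what it's worth, the two-step check as I understand it boils down to something like the Python sketch below. This is my reading of the observed behavior, not anything a search engine has published. The function name is made up, the IP-to-country lookup is a placeholder you would have to supply yourself (from whatever GeoIP data you have), and treating every two-letter TLD as a country code is a simplifying assumption.

```python
import socket


def guess_site_country(hostname, ip_to_country, resolve=socket.gethostbyname):
    """Guess a site's country: CC TLD first, server IP address second.

    hostname:      e.g. "example.de" or "example.com"
    ip_to_country: caller-supplied function mapping an IP string to a country
                   (placeholder; no real GeoIP source is assumed here)
    resolve:       function mapping a hostname to an IP string
    """
    # Step 1: a two-letter final label is treated as a country-code TLD.
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return ("cctld", tld)

    # Step 2: generic TLD (.com, .net, ...), so fall back to where the server is.
    ip = resolve(hostname)
    return ("ip", ip_to_country(ip))
```

With a lookup that says the server is in the US, "example.de" comes back as ("cctld", "de") and "example.com" comes back as ("ip", "US"), which is exactly the trap for a non-US site on a .com with cheap US hosting.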