I noticed the following problem with the FAST, AV, and Inktomi crawlers:
We have a multilingual website running under two domains, www.ourdomain.com and www.ourdomain.de. The .com domain serves the English site and the .de domain serves the German site.
Both domains run via virtual hosting on the same IP address (shared with about 25 other domains, all ours, all with rich content, with only a little cross-linking between the ones that are topic-related; none is spammy).
Now FAST (and Inktomi and AV) has started to check the IP address behind our domains, probably to detect domain spam and the like. We are now listed at Fastsearch with our .de domain, with the correct title, the correct description, etc. But the listing for our .com domain shows only the default page that is served when the bare IP address is requested (without a host name specified). There is no title and no description; it is simply the wrong page.
Since we use virtual hosting for all our domains, I'm afraid that, step by step, they will all be replaced in the FAST index by the default page.
I could set up the default master page (for the IP) with meta tags telling robots not to index it. But this doesn't solve the problem, and couldn't it result in none of the domains running under this IP being indexed?
Is there perhaps an HTTP status code that could be sent to robots to tell them to specify a host name instead of the IP?
Does anybody have a solution or a workaround other than moving to dedicated IPs or changing the domain/language structure?
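For illustration, here is a minimal Python sketch of how name-based dispatch on the Host header could treat a request that arrives for the bare IP. Only the two domain names come from the post; the `SITES` mapping, the function names, the example address 192.0.2.10, and the choice of a 400 response are assumptions made for the sketch. HTTP/1.1 does require clients to send a Host header, but whether any particular crawler reacts to a 400 by retrying with a host name is not something this sketch can confirm.

```python
from typing import Optional, Tuple

# Hypothetical mapping of host names to sites. Only the two domain
# names are taken from the post; the page contents are placeholders.
SITES = {
    "www.ourdomain.com": "English site",
    "www.ourdomain.de": "German site",
}


def normalize_host(host_header: Optional[str]) -> Optional[str]:
    """Lower-case the Host value and drop an optional port and trailing dot."""
    if not host_header:
        return None
    host = host_header.strip().lower()
    # Strip ":port" if present (IPv6 literals are ignored in this sketch).
    host = host.split(":", 1)[0]
    return host.rstrip(".") or None


def respond(host_header: Optional[str]) -> Tuple[int, str]:
    """Return (status, body) for a request carrying the given Host header."""
    host = normalize_host(host_header)
    if host in SITES:
        return 200, SITES[host]
    # Missing Host, bare IP, or unknown name: refuse instead of serving
    # a default page that a crawler could index under the wrong domain.
    return 400, "Bad Request: please request this server by host name"
```

With this kind of dispatch, a refusal (or a noindex tag) applies only to requests that do not name a known host, so pages requested as www.ourdomain.com or www.ourdomain.de would be unaffected.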
Bandwidth costs money, for webmasters and search engines alike. Should spiders limit the amount of bandwidth they use on any given website? The answer seems to be yes: any page they index but don't send a visitor to before the next index is a waste.
What should be considered? Subdomains on the same IP address? Should all IP addresses be limited to the same bandwidth?
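To make the question concrete, here is a rough Python sketch of one possible policy: a token bucket that gives each key its own byte allowance. Everything in it is an assumption for discussion (the class name `ByteBudget`, the rates, the example addresses); it is not how any particular spider works. Keying the budget by IP address means all virtual hosts on one IP share a single allowance, while keying by host name would give each domain its own.

```python
from collections import defaultdict


class ByteBudget:
    """Per-key byte allowance that refills at a fixed rate (a token bucket)."""

    def __init__(self, bytes_per_second: float, burst_bytes: float):
        self.rate = bytes_per_second
        self.burst = burst_bytes
        # Every new key starts with a full bucket.
        self.tokens = defaultdict(lambda: burst_bytes)
        self.last_seen = {}

    def allow(self, key: str, size: int, now: float) -> bool:
        """Return True if fetching `size` bytes for `key` fits the budget."""
        last = self.last_seen.get(key, now)
        # Refill for the elapsed time, capped at the burst size.
        refilled = self.tokens[key] + (now - last) * self.rate
        self.tokens[key] = min(self.burst, refilled)
        self.last_seen[key] = now
        if size <= self.tokens[key]:
            self.tokens[key] -= size
            return True
        return False


# Example: the key is the server's IP address (hypothetical values).
budget = ByteBudget(bytes_per_second=1000, burst_bytes=5000)
first = budget.allow("192.0.2.10", 4000, now=0.0)
second = budget.allow("192.0.2.10", 2000, now=0.0)
```

Here the second fetch is refused because only 1000 bytes of allowance remain for that IP until the bucket refills.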