Yahoo spiders virtual hostnames by IP Address

yahoo spider

         

frankray

6:14 pm on Apr 2, 2004 (gmt 0)

10+ Year Member



In order to save a few bucks on our hosting, we have multiple domain names hosted on one IP address.

Pretty standard stuff, really.

We have NO duplicate content, and we are not playing any other search engine games.

But the Yahoo robot seems to take what it learns about one website's structure and try to access those pages through the other domain names hosted on the IP address.

I do not understand.

The only links to our pages are by the appropriate domain name.

Here is an example:

domainA.com/contentA.html
domainB.com/contentB.html

domainA is never linked to contentB.

But Yahoo will try to look up all of the following permutations:

domainA.com/contentA.html (OK)
domainA.com/contentB.html (NOT OK) !
domainB.com/contentB.html (OK)
domainB.com/contentA.html (NOT OK) !
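For anyone who wants to audit their own setup, the permutations above can be enumerated with a short script. This is only a sketch: the `OWNED_PATHS` mapping and the function name are made up for illustration, and you would fill in your own hostnames and paths.

```python
from itertools import product

# Hypothetical mapping of each hostname to the paths it is meant to serve.
OWNED_PATHS = {
    "domainA.com": {"/contentA.html"},
    "domainB.com": {"/contentB.html"},
}


def cross_host_permutations(owned_paths):
    """Return (host, path, is_legitimate) for every host/path combination."""
    # Pool every known path, then pair each one with every hostname.
    all_paths = sorted(set().union(*owned_paths.values()))
    return [
        (host, path, path in owned_paths[host])
        for host, path in product(sorted(owned_paths), all_paths)
    ]


if __name__ == "__main__":
    for host, path, ok in cross_host_permutations(OWNED_PATHS):
        print(f"{host}{path} ({'OK' if ok else 'NOT OK'})")
```

Requesting each of the "NOT OK" combinations against your own server shows which cross-domain URLs actually return content.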

I am blocking these requests, but this is making me very angry.

We are on IIS, not that it should matter...

Is anyone else experiencing this offending Yahoo spider behavior?

I see a message here:

[help.yahoo.com...]

That says:

"Sites with numerous, unnecessary virtual hostnames" are considered unwanted.

I wonder too if this is somehow related?

Thanks

billygg

6:15 pm on Apr 3, 2004 (gmt 0)

10+ Year Member



Hey frankray,
I see what you have posted, and I too have a question about this. I work for a company that owns over 400 websites, and for as long as I have worked there we have hosted them off one IP address as well. We never had a problem, but recently, when Yahoo did its update, we lost many sites. What's funny is that all the sites we lost were CNAME sites. For example, our main site is www.-----.com, and we have maybe 50 CNAMEs: ----.----.com. As of the update, all our CNAME sites disappeared, every single one of them. I'm thinking that for some reason they filtered out the CNAME sites based on our main domain site. All the sites have great content, with no spamming, and have always done great in Yahoo. I'm wondering if the spider problem you see is why we got filtered. I might have to set up a couple of different IP addresses and test a couple of sites. Interesting.

frankray

7:53 pm on Apr 3, 2004 (gmt 0)

10+ Year Member



If you do not have any duplicate content, then I do not know what the problem is.

I am surprised that no one else has chimed in by now...

dhatz

11:53 am on Apr 4, 2004 (gmt 0)

10+ Year Member



Could it be the same as the issue I wrote about in

[webmasterworld.com]

Are you perhaps using the same naming convention for HTML files on both sites, so that what you perceived as a Slurp mixup of IPs could be just those 404 probes?

In my case (see posts in the abovementioned link) the ratio of "probes" by Yahoo-Slurp for nonexistent URLs to "legit" requests was 1:10, which raised a red flag.
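A rough way to measure that ratio on your own logs is sketched below. It assumes the log has already been parsed into (user agent, status code) pairs and that the crawler can be recognised by the substring "Slurp" in its user agent. The function name and sample data are made up, and the sample user-agent strings are shortened placeholders.

```python
def probe_ratio(requests, agent_substring="Slurp"):
    """Count 404 "probes" versus other requests for one crawler.

    requests: iterable of (user_agent, status_code) pairs.
    Returns a (probes, legit) tuple.
    """
    probes = legit = 0
    for user_agent, status in requests:
        if agent_substring not in user_agent:
            continue  # ignore traffic from other clients
        if status == 404:
            probes += 1
        else:
            legit += 1
    return probes, legit


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        ("Yahoo! Slurp", 200),
        ("Yahoo! Slurp", 404),
        ("Yahoo! Slurp", 200),
        ("SomeOtherBot", 404),
    ]
    probes, legit = probe_ratio(sample)
    print(f"{probes} probes to {legit} legit requests")
```

Treating every non-404 response as "legit" is a simplification; redirects and errors may deserve their own buckets.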

It cost me a couple of hours of debugging a problem that didn't exist.

Dimitris

frankray

11:53 pm on Apr 4, 2004 (gmt 0)

10+ Year Member



From researching the above threads as reference (thanks, by the way), it seems that my expectations are too high for the Yahoo spider.

There seem to be considerable bugs with the spider.

Some acknowledged and some still undisclosed.

This problem that I highlighted is in fact real because the pages are now making it into the Yahoo index. I just checked today.

So we will probably have the "duplicate content" penalty on some of our sites.

So I have to move all of the sites off this virtual hostnames approach.

Thanks Yahoo.

billygg

2:58 am on Apr 5, 2004 (gmt 0)

10+ Year Member



I have no duplicate content at all on the sites. The basic layout is the same per domain, but that's just because we try to keep the company logo consistent. As for actual text content, every page of every site is different. I think it could be this virtual host thing as well. I'm going to do some IP moving and see if there are any changes. Also, like frankray said, I see some of our sites are slowly starting to move back up, almost as if it was a glitch in the spider :S Who knows...

outland88

3:28 am on Apr 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That is a very interesting theory Frankray. I'd be very interested in hearing if somebody switched an IP address and their site came back.

billygg

1:22 pm on Apr 5, 2004 (gmt 0)

10+ Year Member



I'm working on switching one site today to see if this makes any difference. I'll post back when we see :)

treeline

2:13 pm on Apr 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



With so many hosting services sharing IP addresses, it seems it would be very unfair to penalize people for being on shared IPs. Could also lead to exhausting the number of available IPs a lot faster...

soapystar

4:50 pm on Apr 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Until someone can show Yahoo is not penalising sites simply on the basis of inbound links, I don't see the difference between that and penalising any site on the same IP. I just see the overall goal from Yahoo as being to reduce their idea of spam, and if doing that means huge collateral damage then so be it. This may be why they are now giving a route back in.

mfishy

5:53 pm on Apr 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We have many domains hosted on shared hosts and have not experienced this problem at all.

dhatz

6:43 pm on Apr 5, 2004 (gmt 0)

10+ Year Member



"Until someone can show Yahoo is not penalising sites simply on the basis of inbound links, I don't see the difference between that and penalising any site on the same IP."

What do you mean by "penalising sites on basis of inbound links"? "Many links" or "bad links" (i.e. bad neighbourhoods)? Any references to back this? I believe Yahoo team people here said that they view sites with many inbound links favorably.

Also IPs get re-allocated by hosting companies. How am I supposed to know if the IP I'm going to get for the next site isn't blacklisted due to previous owners?

"I just see the overall goal from Yahoo as being to reduce their idea of spam, and if doing that means huge collateral damage then so be it. This may be why they are now giving a route back in."

You mean, after excluding 1000s of sites from SERPs via algo, giving a manual (via humans) override? Only the very wealthy, or determined professional webmasters (and with lots of time on their hands to frequent here) would take that route.

None of this makes much sense to me.

soapystar

7:06 pm on Apr 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A route back in if the original reason for the penalty is removed. This could have been an algo or a human review. The point is that a penalty is a lifetime ban unless removed by review. Human reviews are taking out sites that are not being removed by the filter, so I would suggest this is a subjective judgement; otherwise it would be defined and filtered for.

It has been suggested before that Yahoo are penalising on inbound links. I haven't seen a denial of this. It could be anything from a single bad link from a bad neighbourhood, as defined by Yahoo, to a level of links that trips a filter. I have no idea. All I am saying is that some guys have suggested inbound links can take you down, and I've not seen it refuted. It may have been, but I haven't seen it.

frankray

4:10 am on Apr 6, 2004 (gmt 0)

10+ Year Member



To clarify.

I am not talking about shared hosting.

I am talking about having multiple domain names on the same IP and in the *same* web hosting account.

Kinda like domain pointers. Or even domain aliases.

Each domain pulls up unique content through the "magic" of my favorite scripting language.

And all sub links point to unique directories.

One domain name does _not_ share content with another, but if you were to type in a directory for another domain, the content would show.

This is what Yahoo is doing.

They are not following links. They are "guessing" quite correctly based on the shared IP.
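One way to stop the cross-domain guesses from returning content is to check the Host header against the directory being requested before serving anything. The sketch below is only an illustration in Python, not what we actually run on IIS. The `HOST_DIRECTORIES` mapping, the directory names, and the function name are all made up.

```python
# Hypothetical mapping: each hostname owns exactly one top-level directory.
HOST_DIRECTORIES = {
    "domaina.com": "/sitea/",
    "domainb.com": "/siteb/",
}


def status_for(host_header, path):
    """Return the HTTP status to send for a Host header and request path."""
    # Normalise the host: lower-case, drop any port, drop a leading "www.".
    host = host_header.lower().split(":")[0]
    if host.startswith("www."):
        host = host[4:]

    prefix = HOST_DIRECTORIES.get(host)
    if prefix is None:
        return 404  # unknown hostname
    if path == "/" or path.startswith(prefix):
        return 200  # the home page or this domain's own directory
    return 404  # a directory that belongs to another domain


if __name__ == "__main__":
    print(status_for("domainA.com", "/sitea/page.html"))
    print(status_for("domainA.com", "/siteb/page.html"))
```

Answering with a 404 rather than the other domain's page means a crawler that guesses the URL has nothing to index under the wrong hostname.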

Now that I have thought about this, it is not the best way to save a few bucks, but Yahoo is the only search engine that has forced my hand.

No other search engine "guesses" links like this.

And I think it is wrong.

Oh well. It is a lot of work ripping out domain names and their content. But our fault I guess...