I would like to deliver content to a visitor that differs depending on the HTTP referrer header. What I would like to be assured of is that some of the content is NEVER displayed to a search engine bot.
My first question is: do search engine bots carry HTTP referrer data when they arrive from another site? Namely, if I have a script that, in pseudocode, does the following:
If HTTP referrer is blank, deliver content A; elseif HTTP referrer is not blank, deliver content B; endif;
Will the search engine robot ever be delivered content B? This is what I'd like to avoid...
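In rough code, the branch above might look like this (a Python sketch of the logic only; the real script could be in any server-side language, and the function name is just for illustration):

```python
def choose_content(referrer):
    """Deliver content A when the referrer is blank, content B otherwise.

    `referrer` is the raw HTTP Referer header value, or None/"" when the
    header is absent.
    """
    if not referrer:
        return "content A"  # direct visits -- and, usually, search engine bots
    return "content B"      # visitors who followed a link from another site
```

The question, then, is whether a bot ever arrives with that header populated.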
A Turing test isn't exactly what I need, however, as I still want to deliver some content to the bot. The Turing tests I've seen most often on the internet (and described in that article) are based on blocking robots from proceeding. I suppose I could do a time-delayed redirect and hope that I've given the human enough time to complete the Turing test, while also suggesting to the bot not to index the human content. (Without going into detail, I don't want to include bot exclusion for the page.)
Doing some reading at WebmasterWorld, it also seems clear that search engines actively guard against sites delivering different content depending on whether a visitor is a robot or a human. I really don't want to get into a non-stop redirect battle and risk getting banned if my attempts are mistaken for malicious cloaking.
Can you suggest any other resources that might be helpful?
Are there any legitimate ways to, in pseudocode, do: If you're a robot: you get content B; elseif you're a human: you get content A; endif;
I fully agree; the second portion using HTTP referrers will be quite easy for me. If a bot accidentally slipped through and started spamming random referrers, however, this would be problematic.
Do your cloaking the traditional way: based on IP and/or UA (user agent).
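As a rough sketch of the UA-based half (a Python illustration; the crawler names shown are just examples, and a production list would be much longer and kept current):

```python
import re

# Sample substrings seen in common crawler user-agent strings
# (illustrative only; real lists need ongoing maintenance).
BOT_UA_PATTERN = re.compile(r"googlebot|scooter|slurp|msnbot", re.IGNORECASE)

def looks_like_bot(user_agent):
    """Return True when the UA string matches a known crawler name."""
    return bool(user_agent and BOT_UA_PATTERN.search(user_agent))

def serve(user_agent):
    # Bots get content B, everyone else gets content A.
    return "content B" if looks_like_bot(user_agent) else "content A"
```

A UA string is trivially forged, which is why IP-based checks against known crawler address ranges are usually used alongside (or instead of) UA matching.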
This is something that's entirely new to me. I've printed off a few threads here, so will get cracking on learning some more.
AltaVista frequently dumps funny URLs in the referer field.
I took a quick look at my logs, but most of the time found Scooter was behaving itself. In one record I did find a Scooter referrer of "http://www.root.mysite.com/category....." which doesn't actually exist. Is this the type of thing you found, and have you noticed any patterns?
At the same time, I'm still curious about the HTTP referrer junk that bots send. I've periodically checked my logs and, empirically, I have to agree that at this time Googlebot doesn't send referrers.
Does anyone know if Scooter follows any kind of pattern with its referrer junk dumps? I could then just add an ereg() check to dump any referrers that contain a known junk pattern...
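For instance, if the junk always pointed at that non-existent root.* host, something like the following would do (a Python sketch; the pattern itself is a guess based on the one log entry above, not a confirmed Scooter behaviour):

```python
import re

# Hypothetical junk pattern, modelled on the bogus
# "http://www.root.mysite.com/..." referrer seen in the logs.
JUNK_REFERRER = re.compile(r"^https?://www\.root\.", re.IGNORECASE)

def clean_referrer(referrer):
    """Blank out referrers that match a known junk pattern,
    so they fall through to the no-referrer branch."""
    if referrer and JUNK_REFERRER.search(referrer):
        return ""
    return referrer
```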