
1 robots.txt file across multiple domains

Unable to implement individual robots.txt files - can I just use 1

     

deadsetchamp

10:50 pm on Feb 26, 2009 (gmt 0)




I have a retail site targeting different countries. Unfortunately it is basically the same content, just with different prices. We have different country-specific domains, but they are all on one server and we are unable to implement a different robots.txt for each. I just want to block all of the sites except the US one so we don't get penalised for duplicate content.

So is it possible to have a robots.txt file use the following code:

User-agent: *
Disallow: <our UK domain>/
Disallow: <our AU domain>/
Disallow: <our CA domain>/

Or does robots.txt ignore any domain information and just look at what comes after the /? It's very important we don't ruin our US rankings.

jdMorgan

11:03 pm on Feb 26, 2009 (gmt 0)




> does robots.txt ignore any domain information and just look at what comes after the /?

Yes, only the server-local URL-paths can be specified.
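
To make that concrete: a robots.txt file can only contain path-only rules, so the strongest thing a single shared file could say is a sketch like this:

User-agent: *
Disallow: /

But every domain on the server would return that same file from its own /robots.txt URL, so it would block the US site just as thoroughly as the UK, AU and CA ones; there is no way to name a domain inside the file itself.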

If you have the technology to "change the prices" between domains, you likely also have the technology to serve a different robots.txt per domain... I suspect the right questions are not being asked.

Jim

deadsetchamp

11:06 pm on Feb 26, 2009 (gmt 0)




Thanks Jim,

They said that they can't do individual files, but they might be able to use the meta robots tag (spider restriction) method instead.
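
For reference, the meta method means putting a robots meta tag into the <head> of every page on the domains that should stay out of the index; a minimal sketch looks like this ("noindex, follow" keeps the page out of the index while still letting crawlers follow its links):

<meta name="robots" content="noindex, follow">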

Cheers for the quick reply.

jdMorgan

11:23 pm on Feb 26, 2009 (gmt 0)




> they can't do individual files

I'll bet you can easily find someone who *can* do individual files -- Good help is cheap in an economic downturn, something that "no-can-do" people should bear in mind... ;)

Use mod_rewrite or ISAPI Rewrite to internally rewrite robots.txt URL requests to different files, based on the Host header sent with the client HTTP request.

Or use a rewrite engine to pass all robots.txt requests to a Perl or PHP script which can generate different robots.txt content, again based on the Host header sent with the HTTP request.

Or build this function into the script you use to generate your custom 404 error page contents, and let robots.txt requests activate that script as well, with that script producing the robots.txt content (and a proper 200-OK server status response).

There are many ways to do it.
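
A rough sketch of the first approach, assuming Apache and a root .htaccess file (the hostnames and per-country filenames are placeholders, not anything confirmed in this thread):

# Sketch: map each domain's robots.txt request to its own static file,
# based on the Host header. example.co.uk, robots-uk.txt etc. are placeholders.
RewriteEngine On

RewriteCond %{HTTP_HOST} ^(www\.)?example\.co\.uk$ [NC]
RewriteRule ^robots\.txt$ /robots-uk.txt [L]

RewriteCond %{HTTP_HOST} ^(www\.)?example\.com\.au$ [NC]
RewriteRule ^robots\.txt$ /robots-au.txt [L]

RewriteCond %{HTTP_HOST} ^(www\.)?example\.ca$ [NC]
RewriteRule ^robots\.txt$ /robots-ca.txt [L]

# Any other host (e.g. the US domain) falls through to the physical /robots.txt

The script-based variant is just as short. A sketch in PHP (again with placeholder hostnames), which a rule such as RewriteRule ^robots\.txt$ /robots.php [L] could point every robots.txt request at:

<?php
// robots.php (hypothetical name): generate robots.txt content per Host header.
header('Content-Type: text/plain');

$host = strtolower($_SERVER['HTTP_HOST']);

if ($host === 'example.com' || $host === 'www.example.com') {
    // US domain: allow normal crawling
    echo "User-agent: *\nDisallow:\n";
} else {
    // Every other country domain: block everything
    echo "User-agent: *\nDisallow: /\n";
}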

Jim

deadsetchamp

1:15 am on Feb 27, 2009 (gmt 0)




Thanks for this!

I know what you mean about getting 'can-do' people. I will pass this on to them and it might kick start their imagination.

g1smd

9:29 am on Mar 16, 2009 (gmt 0)




What you need can be done in just a couple of lines of code, as jd has outlined above.

WebmasterWorld uses a similar system to serve a different robots.txt file to different bots.
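
The "different bots" version is the same couple of lines, keyed on the User-Agent header instead of the Host header; a hypothetical sketch (the filename is a placeholder):

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule ^robots\.txt$ /robots-googlebot.txt [L]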

 
