Forum Moderators: Robert Charlton & goodroi


Does blocking URLs in robots.txt create a problem for Googlebot?

         

gunjanp

8:12 am on Jul 23, 2008 (gmt 0)

10+ Year Member



Hello,

I am using robots.txt to block the affiliate pages of my site.

Do you think that blocking more and more pages in robots.txt for Googlebot could create a problem for my site?

Because of this, is it possible that my ranking in Google will go down?

The number of affiliate URLs keeps increasing, so to protect my site from a duplicate content penalty I have blocked all affiliate URLs in robots.txt, and any new affiliate URL that is generated will also be blocked there.

I am using these rules to block my affiliate URLs:

User-agent: *
Disallow: /*?a_aid
Disallow: /*?

Before implementing these rules in robots.txt my ranking was very good, but after blocking these kinds of pages, I found that my ranking went down.

So what should I do now to get my ranking back up?

Let me know your suggestions.

Thanks

tedster

8:34 am on Jul 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The last line in your set of rules disallows ANY url that contains a question mark. This is fine if you are not using query strings in the urls of any important pages, but just be sure that is what you want.

Blocking urls in robots.txt does not cause problems with Google just because you do it. But if your rules do something other than what you intended, then there can be big trouble. If that last Disallow rule I pointed out is really what you need, then the drop in your rankings may have a different cause.

At the same time, if Google previously passed link juice through all those backlinks and you sort of chopped them off from receiving and circulating PR - that could create some drops. Another approach is to take any affiliate ID, store the information you need to do right by your affiliate, and then serve a 301 redirect that removes the query string. This way you allow Google to give you the credit for those backlinks and to pass along any link juice.
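That capture-then-redirect step can be sketched in Python (a minimal illustration only, not the poster's actual code; `strip_affiliate_param` is a hypothetical helper, and `a_aid` is the affiliate parameter from the rules above):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_affiliate_param(url, param="a_aid"):
    """Split the affiliate ID out of a URL; return (affiliate_id, clean_url).

    Hypothetical helper: the caller would store affiliate_id somewhere
    (cookie, session, database) and then 301-redirect to clean_url.
    """
    parts = urlsplit(url)
    affiliate_id = None
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == param:
            affiliate_id = value  # remember it for order tracking
        else:
            kept.append((key, value))  # keep any non-affiliate parameters
    clean_url = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    return affiliate_id, clean_url

# Capture the ID first, then serve "301 Moved Permanently" to clean_url.
aid, clean = strip_affiliate_param(
    "http://www.example.com/widgets?a_aid=partner42&color=red"
)
```

The part that stores `affiliate_id` and issues the actual 301 response depends on the server-side technology in use.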

Marcia

8:43 am on Jul 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



WMT doesn't seem too happy over URLs disallowed in robots.txt

gunjanp

9:44 am on Jul 23, 2008 (gmt 0)

10+ Year Member



Thanks Tedster for the great suggestion.

The robots.txt is working well, and it blocks only those URLs which contain "?".

And as you suggest, if I 301 redirect all the URLs which contain "?", then I will get the benefit of the link juice.

But I am getting confused: if I put a 301 redirect on the affiliate URLs, is affiliate tracking still possible, or will it create a problem with affiliate order tracking?

Let me know...

Lord Majestic

10:51 am on Jul 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is not standard, valid robots.txt - wildcards are only supported in the User-agent part. If you are targeting Googlebot specifically because it understands them, then I think you really should mention it there explicitly.

g1smd

12:44 pm on Jul 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Disallow: /*?a_aid
Disallow: /*?

The first rule is redundant.

tedster

1:33 pm on Jul 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is not standard, valid robots.txt... if you are targeting Googlebot specifically because it understands them, then I think you really should mention it there explicitly.

Yahoo and MSN also support wildcard pattern matching in robots.txt.

tedster

1:35 pm on Jul 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if i am giving 301 redirect on affiliate urls, then affiliate tracking is possible

You can script this with your server-side technology. Capture and store the tracking info before you do the redirect that strips the affiliate parameter.

Lord Majestic

1:36 pm on Jul 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yahoo and MSN also support wildcard pattern matching in robots.txt.

Sure, and so they should also be added to the list of explicit bots in that robots.txt. If those are all the site owner cares about, then it is better to disallow all other bots than to suffer the unavoidable anger when some other bot that is perfectly robots.txt-compliant won't obey these non-standard directives.
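That suggestion would look something like this (a sketch only; `Slurp` and `msnbot` were the Yahoo and MSN crawlers of the time, and whether to shut out all other bots entirely is the site owner's call):

```
# Wildcard Disallow rules only for the engines known to support them
User-agent: Googlebot
Disallow: /*?

User-agent: Slurp
Disallow: /*?

User-agent: msnbot
Disallow: /*?

# All other bots: no wildcard support assumed, so disallow them explicitly
User-agent: *
Disallow: /
```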

skweb

1:51 pm on Jul 23, 2008 (gmt 0)

10+ Year Member



<<do you think that Blocking more and more pages in robots.txt for Googlebot, is it creating a problem for my site?>>

Hell, no. I do it for a variety of reasons on many pages and it works just fine. I mean, do I really want Google to index the page that I serve the alternate ads from? It has no content except the code for Amazon ads.

<<Due to this problem, is it possible that my ranking in Google will be gone down?>>

No, absolutely not. Actually, Google only wants to index pages that make sense to humans. The page I have blocked, which contains nothing but Amazon code, is really not meant for a human visitor directly.

gunjanp

6:19 am on Jul 24, 2008 (gmt 0)

10+ Year Member



I want to block those affiliate URLs in all search engines, not only in Google, so I am using "*".

But my main targeted SE is Google.