Forum Moderators: Robert Charlton & goodroi

How to block "index.php?something" via robots.txt

         

realmaverick

2:11 am on Mar 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My CMS is constantly causing errors in Google WMT. We are talking tens of thousands.

Each page has links like "report this file" etc., but when Googlebot follows them, they lead to an error page.

There are tons of issues like this, which I am going to go through one at a time and fix.

However, all of these URLs are index.php?something

I want to block Google from accessing all URLs that contain index.php?

What is the safest way to do this?

User-agent: *
Disallow: /index.php?


Will this work and not inadvertently block / somehow?

realmaverick

2:58 am on Mar 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've added

User-agent: *
Disallow: /*?

Which felt a bit safer. I don't think that should affect anything I want indexed.

tedster

3:02 am on Mar 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As long as you want NO query strings at all to be crawled with index.php - yes, that works. I've used it in several cases with no problem.
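For anyone who wants to check the two rules from this thread against sample URLs before deploying them, here is a minimal Python sketch. It assumes the wildcard conventions Google documents for robots.txt ("*" matches any run of characters, a trailing "$" anchors the end, everything else is a prefix match from the start of the path). The function name rule_matches and the sample URLs are made up for illustration; this is not Googlebot's actual code.

```python
import re

def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a robots.txt path pattern matches url_path.

    Assumed conventions: '*' matches any run of characters, a trailing
    '$' anchors the end of the URL, and the rest is a literal prefix
    match starting at the first character of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces so '?' and '.' are not read as regex syntax.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the string (prefix match).
    return re.match(regex, url_path) is not None

# Hypothetical sample URLs (path plus query string).
samples = [
    "/",
    "/index.php",
    "/index.php?do=report",
    "/files/file/12-name/",
    "/files/?sort=date",
]

for rule in ("/index.php?", "/*?"):
    blocked = [u for u in samples if rule_matches(rule, u)]
    print(rule, "blocks", blocked)
```

With these samples, "/index.php?" catches only /index.php?do=report, "/*?" also catches /files/?sort=date, and neither rule matches "/" or a bare /index.php. Note that "/*?" covers query strings on every path, not just index.php.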

realmaverick

1:15 pm on Mar 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Tedster. I am hoping this will clear the errors in WMT and move them over to the restricted tab.

ScreamingFrog is reporting no issues.

I'm a little worried about the flow of link juice now, as the login and signup links on every page use index.php?

Do you think I should be concerned?

realmaverick

1:43 pm on Mar 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Already found an issue, as my sitemaps are generated with index.php?

Gah!
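One possible way around the sitemap problem is an Allow rule that is more specific than the Disallow. Google documents that when rules conflict, the one with the longest path wins, and Allow wins a tie; other crawlers may behave differently. The sketch below models that precedence in Python so a rule set can be tried out offline. The sitemap URL /index.php?app=core&module=sitemap is purely hypothetical (the thread does not give the real one), and is_allowed is an illustrative helper, not any crawler's real implementation.

```python
import re

def _to_regex(pattern):
    """Translate a robots.txt path pattern ('*' and trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(p) for p in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(rules, url_path):
    """rules: list of ("allow" | "disallow", pattern) for one user-agent group.

    Assumed precedence: the longest matching pattern wins, and on equal
    length an allow rule beats a disallow rule.
    """
    best_len, allowed = -1, True  # no matching rule means crawling is allowed
    for kind, pattern in rules:
        if _to_regex(pattern).match(url_path):
            longer = len(pattern) > best_len
            tie_allow = len(pattern) == best_len and kind == "allow"
            if longer or tie_allow:
                best_len, allowed = len(pattern), kind == "allow"
    return allowed

rules = [
    ("disallow", "/*?"),
    ("allow", "/index.php?app=core&module=sitemap"),  # hypothetical sitemap URL
]

for path in ("/", "/index.php?do=report",
             "/index.php?app=core&module=sitemap&page=1"):
    print(path, "->", "allowed" if is_allowed(rules, path) else "blocked")
```

Because the Allow pattern is a prefix match, it only helps if the sitemap URLs always start with the same parameters in the same order.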

g1smd

7:56 pm on Mar 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Document all the different types of URLs the site has, their format and variations, so you have a solid plan to work from. Compile it as a list or on a spreadsheet.
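As a rough starting point for that inventory, a short Python script can group a list of URLs by path and parameter names, so each URL "type" shows up once with a count. This is only a sketch: the URLs below are invented examples, and url_shape is a made-up helper name. In practice the list would come from a crawl export or the server logs.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl

def url_shape(url: str) -> str:
    """Reduce a URL to its path plus sorted parameter names (values dropped)."""
    parts = urlsplit(url)
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return parts.path + ("?" + "&".join(names) if names else "")

# Hypothetical URLs, e.g. pulled from a crawl export or server log.
urls = [
    "http://example.com/index.php?app=core&module=reports&id=12",
    "http://example.com/index.php?app=core&module=reports&id=57",
    "http://example.com/index.php?app=core&module=global&section=login",
    "http://example.com/files/",
]

inventory = Counter(url_shape(u) for u in urls)
for shape, count in inventory.most_common():
    print(count, shape)
```

The output can be pasted straight into a spreadsheet, with one row per URL shape and a column for whether it should be crawled.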

klark0

9:12 pm on Mar 9, 2012 (gmt 0)

10+ Year Member



How about Webmaster Tools > Site configuration > URL parameters?

I use that to tell Googlebot what to do with my mobile switching parameters.

If you visit the page, it should show what parameters Googlebot is already detecting.