robots.txt need help

     
5:03 pm on Jan 11, 2013 (gmt 0)



Hello.

I need help. I see that Google is crawling some bad links that do not exist on my website,

like

http://example.com/test%p%12%5%.html

I want to block the URLs that contain %.

If I add this to my robots.txt file:

Disallow: %

will this work correctly, or will it disturb other URLs that do not contain %?

Please give your suggestions and feedback.

Thanks a lot.
9:52 am on Jan 15, 2013 (gmt 0)



Hi,

Robots.txt is a text file you put on your site to tell search robots which pages you would like them not to visit. Search engines are not obliged to honor it, but the major ones generally obey what they are asked not to do.

A basic robots.txt file contains only:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
User-agent: Googlebot

[edited by: engine at 9:30 am (utc) on Jan 16, 2013]
[edit reason] see WebmasterWorld TOS [/edit]
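
To see how rules like the ones above are applied, here is a minimal sketch using Python's standard `urllib.robotparser`. It only covers the `User-agent: *` group with its two `Disallow` lines; the `allowed` helper and the example.com paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The "User-agent: *" group from the post, as it would appear in robots.txt
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

def allowed(path, agent="Googlebot"):
    """Hypothetical helper: may `agent` crawl this path on example.com?"""
    return parser.can_fetch(agent, "http://example.com" + path)

print(allowed("/tmp/cache.html"))   # False: path starts with /tmp/
print(allowed("/cgi-bin/form.pl"))  # False: path starts with /cgi-bin/
print(allowed("/index.html"))       # True: no rule matches
```

Each Disallow line is a prefix match against the URL path, which is why everything under /tmp/ and /cgi-bin/ is blocked and nothing else is.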

10:17 pm on Jan 15, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



You are looking in the wrong place.

The problem is not that Google is crawling URLs you don't want it to know about. The problem is that it is inventing URLs that don't exist. If it only does this now and then, it is normal: Google is testing for "soft 404" responses. But if it is happening very often, you need to figure out where it is getting these imaginary URLs. There are other threads discussing this problem.
10:20 pm on Jan 15, 2013 (gmt 0)

g1smd (WebmasterWorld Senior Member)



Disallow: /*%

will disallow crawling of URLs with a % in them.
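
To illustrate why the leading /* matters, here is a rough sketch of wildcard matching as Google documents it: a rule is matched from the start of the URL path, * stands for any run of characters, and a trailing $ anchors the end. The function name `rule_matches` is made up for this example, and this is a simplified model, not the crawler's real code.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Simplified model: does a Disallow pattern match this URL path?"""
    # A trailing '$' anchors the pattern to the end of the path
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; every other character is literal
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the path (prefix matching)
    return re.match(regex, path) is not None

print(rule_matches("/*%", "/test%p%12%5%.html"))   # True
print(rule_matches("/*%", "/products/list.html"))  # False
print(rule_matches("%", "/test%p%12%5%.html"))     # False
```

Under this model a bare % never matches, because every path begins with /, while /*% matches any path with a % somewhere in it and leaves paths without a % alone.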

However, as stated above, that's the right answer to the wrong question.


If the requests are met with a 404 or 410 response then there is nothing further to do.

If your server is returning '200 OK' then you have a bigger problem to fix.
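
One way to check which case applies is to request one of the bad URLs and look at the status code. A minimal sketch using only the Python standard library; `fetch_status` and `verdict` are names made up for this example, and some servers answer HEAD requests differently from GET, so treat the result as a first check only.

```python
import urllib.error
import urllib.request

def fetch_status(url: str) -> int:
    """Return the HTTP status code the server sends for url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects, so this is the status of the final page
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; the code is on the error
        return err.code

def verdict(status: int) -> str:
    """Interpret the status code for a URL that should not exist."""
    if status in (404, 410):
        return "fine: the server says the page does not exist"
    if status == 200:
        return "problem: a nonexistent URL is served as a real page"
    return "check manually"

# Example call (needs network access, so it is left commented out):
# print(verdict(fetch_status("http://example.com/test%p%12%5%.html")))
```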