Here are the relevant parts of the file:
User-agent: *
Disallow: /*.jpg$
Disallow: /*.gif$
Disallow: /uploads/
Disallow: /uploads/textareas/
Disallow: /uploads/textareas/image/
I'm trying to block http://www.example.com/uploads/textareas/image/image.jpg
Webmaster Tools confirms
Blocked by line 12: Disallow: /uploads/textareas/image/
But the image shows in my 'top queries' and I can see it in Image Search in Google!
Originally I just had the top-level folder in the robots.txt file. Then I added the subfolders and the wildcard matches. I even blocked Googlebot-Image from the whole site. Nothing has made any difference.
Just today I've re-added
User-agent: Googlebot-Image
Disallow: /
to the file as a last resort. Perhaps I need to keep it out of the pages the images are called from as well as the folders they reside in?
Anyone had/having this problem?
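For anyone who wants to double-check which rule catches that image URL, here is a minimal Python sketch of Google-style pattern matching (`*` matches any run of characters, a trailing `$` anchors the end of the URL, everything else is a literal prefix match). The helper names `pattern_to_regex` and `matching_rules` are made up for this example, and sorting by pattern length is only a rough stand-in for the "most specific rule wins" behaviour, not Google's actual code.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is literal, and matching starts at the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def matching_rules(path, disallow_patterns):
    """Return the patterns that match path, longest (most specific) first."""
    hits = [p for p in disallow_patterns if pattern_to_regex(p).match(path)]
    return sorted(hits, key=len, reverse=True)

# The Disallow values from the file quoted above.
RULES = [
    "/*.jpg$",
    "/*.gif$",
    "/uploads/",
    "/uploads/textareas/",
    "/uploads/textareas/image/",
]

hits = matching_rules("/uploads/textareas/image/image.jpg", RULES)
print(hits[0])  # prints /uploads/textareas/image/
```

Four of the five rules match the URL (everything except the .gif rule), and the longest one is the same line Webmaster Tools reports, so the rules themselves do cover the image.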
Google keeps indexing pages which are blocked by robots.txt.
/top-du-top/?c=0&an=0&mo=8&au=3- Blocked by line 46: Disallow: /*?*
I already did a big cleanup last month, but it never stops; I now have 300 pages indexed that should not be there.
Are there any tools to make sure that my robots.txt is correct?
Are we the only ones having this nightmare?
There is a whole section of Google's Webmaster Tools dedicated to helping you check your robots.txt file.
It can take a long time for changes to follow through in the SERPs.
You don't need the final * in your rule.
Disallow: /*?
already says "disallow URLs that start with anything, followed by a question mark".
Because robots.txt rules are prefix matches, that rule disallows anything that ends with a question mark, as well as URLs that have something, anything, after the question mark.
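The point about the redundant trailing * can be checked with a short Python sketch. It assumes Google-style matching, where * matches any run of characters and a rule only has to match the beginning of the URL path. The function `robots_match` and the sample paths are hypothetical, written just for this comparison. The first path is the one from the post above, and the rest are invented.

```python
import re

def robots_match(pattern, path):
    """Return True if a robots.txt pattern matches the start of path.

    Supports '*' (any run of characters) and a trailing '$' (end of URL).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives prefix semantics.
    return re.match(regex, path) is not None

SAMPLE_PATHS = [
    "/top-du-top/?c=0&an=0&mo=8&au=3",  # query string after the '?'
    "/top-du-top/?",                    # ends with the '?'
    "/top-du-top/",                     # no '?' at all
    "/?",                               # '?' right after the root slash
]

for path in SAMPLE_PATHS:
    short = robots_match("/*?", path)
    long_ = robots_match("/*?*", path)
    print(path, short, long_)
```

Under these assumptions both patterns give the same answer for every sample path: blocked whenever the path contains a question mark, allowed otherwise.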