Google and Yahoo Both Ignoring Robots.txt?



4:30 pm on Feb 1, 2011 (gmt 0)

Hey All,

I am back for some more expert advice, thanks in advance!

I have a personal blog (WordPress) and have decided to disallow indexing of tags and categories through an SEO plugin in the CMS. In my robots.txt I have also disallowed comment pages, feed pages, and basically anything that's not the actual post.

My problem is that for some reason both Google and Yahoo are indexing some of the tag pages as well as some of the feed pages.

For example:

I have these all disallowed in my robots.txt file. Is there any other reason these would be indexed as pages? They aren't linked from anywhere, either on my site or anyone else's.

Interestingly, Bing is the only one following my instructions...



6:33 pm on Feb 1, 2011 (gmt 0)

tedster (WebmasterWorld Senior Member)

A robots.txt disallow rule stops spidering, but URLs can still be indexed even though their content isn't crawled. The information shown in the listing can come from internal or external backlinks, DMOZ, etc.

However, seeing the feed indexed on a WordPress site seems odd. Are you seeing these URLs only in a site: operator result, or are they ranking for some kind of regular query?
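
To illustrate the point about disallow rules, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the example.com URLs are hypothetical, since the poster's actual robots.txt is not shown in the thread. It only shows what a disallow rule governs: whether a compliant crawler may fetch a URL. It cannot say anything about whether the URL itself ends up listed in an index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not the poster's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /category/
Disallow: /feed/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawl_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules let this user agent fetch the URL.

    This answers a crawling question only. A blocked URL can still be
    indexed from links that point at it.
    """
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for url in (
        "https://example.com/tag/travel/",
        "https://example.com/feed/",
        "https://example.com/2011/02/my-post/",
    ):
        print(url, is_crawl_allowed(parser, "Googlebot", url))
```

One consequence worth noting: a page-level noindex directive, such as the one an SEO plugin adds, can only be seen by a crawler that is allowed to fetch the page, so blocking the same URLs in robots.txt may keep that directive from ever being read.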


6:46 pm on Feb 1, 2011 (gmt 0)

@tedster: Just the "site:" operator; nothing is showing up in any SERPs. It's just annoying to see it, and I was wondering if there is any specific reason this could be happening.

And as I mentioned, Bing's "site:" operator shows exactly what I want it to show, but Yahoo's and Google's do not. Weird.
