Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

Why are blocked pages showing in site: search results?


cheemo

3:39 pm on May 5, 2008 (gmt 0)

10+ Year Member



Whenever I do a site:mysite.com search, page through all the results, and click 'repeat the search with the omitted results included,' I see hundreds of unindexed results.

An unindexed result shows up like this:
Page Title
Similar page - Note this

Many of them even display '23 hours ago' or whenever the last crawl date was. The thing I can't understand is why Google is crawling these when I have them blocked in my robots.txt.

I have the following setup to block page-1234.html, page-1235.html, etc.

Disallow: /page
Disallow: /page*
Disallow: /page$

According to Google Webmaster Tools, these pages should be blocked. Why are they showing up in search results?
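[For later readers: a minimal robots.txt sketch for this case. Note that the first rule alone already covers the other two, since a Disallow path is a prefix match. The * wildcard and $ end-of-URL anchor are Google extensions, not part of the original robots.txt convention, so not every crawler honors them.]

```
User-agent: *
# Prefix match: blocks /page-1234.html, /page-1235.html, /page/anything, etc.
Disallow: /page
```

Keep in mind this only blocks crawling, not indexing, which is the point of the replies below.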

tedster

3:46 am on May 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google will index a URL even if they know about it only through links. If Googlebot is blocked from crawling via robots.txt, the title and description they display will be just as you describe.

If you want Google to stop even this level of indexing, you can use their URL Removal Request.

ecmedia

1:41 pm on May 6, 2008 (gmt 0)

10+ Year Member



Plus you will need to include a noindex tag on each page to make sure that G never indexes it.

jimbeetle

2:56 pm on May 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Plus you will need to include a noindex tag on each page to make sure that G never indexes it.

Yeah, the meta noindex is the most effective way to keep pages out of the index. But if you use the noindex, be sure not to block the pages with robots.txt. The bots have to be able to fetch the pages in order to see and obey the noindex.
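[For later readers: a sketch of the tag jimbeetle is describing. It goes in the head of each page you want kept out; "robots" addresses all compliant bots. The page must NOT be disallowed in robots.txt, or the crawler can never fetch the page to see the tag.]

```
<head>
  <!-- Tells compliant crawlers not to include this page in their index -->
  <meta name="robots" content="noindex">
</head>
```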