My robots.txt file contains:
User-agent: *
Disallow: /bin/
Disallow: /cgi-bin/
Disallow: /config/
Disallow: /docs/
Disallow: /extensions/
Disallow: /includes/
Disallow: /languages/
Disallow: /local/
Disallow: /maintenance/
Disallow: /math/
Disallow: /serialized/
Disallow: /skins/
Disallow: /t/
Disallow: /tests/
When I run a site: search, the /skins/ folder IS indexed, and so are all the subdirectories inside it. Out of all the Disallow lines above, only /skins/ has this problem. Any ideas where to look? Thanks.
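A quick way to sanity-check the directives themselves is to feed them to a robots.txt parser and ask whether /skins/ is blocked. This is only a minimal sketch using Python's standard urllib.robotparser; the rules are abbreviated from the file quoted above and the test paths are made-up examples:

import urllib.robotparser

# Abbreviated copy of the rules quoted above; paste the full live file here.
rules = """\
User-agent: *
Disallow: /bin/
Disallow: /skins/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# can_fetch() returns False when the path is disallowed for the given agent.
print(rp.can_fetch("*", "/skins/"))              # expected: False
print(rp.can_fetch("*", "/skins/common/a.css"))  # hypothetical sub-path, expected: False
print(rp.can_fetch("*", "/index.php"))           # hypothetical allowed page, expected: True

If those come back as expected, the file parses the way you intend, which points at crawling and refresh behaviour rather than the directives.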
This often happens with content that has few or no external links pointing to it (which is likely the case with your /skins/ folder). Such listings can hang around for months and months, since Googlebot never revisits them.
If it's important to get the files removed, you can use the URL removal tool in Webmaster Tools; otherwise it's just a case of waiting. Certainly, there doesn't appear to be any problem with your robots.txt directives.
No, the whole robots.txt file was created in Feb 2008. I have just started working on it again, and to see how it was doing in the SERPs I ran a site: search and found the /skins/ folder indexed, along with every folder within it, which I thought was strange.
edit: there's no cache date, only the "Similar pages" and "Note this" links
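Another place worth looking (again only a sketch, with www.example.com as a placeholder for the real host) is whether the live robots.txt is actually served from the site root and still contains those Disallow lines; a redirect, a 404, or an out-of-date copy would let crawlers straight in even though the text quoted above looks correct:

import urllib.request
import urllib.robotparser

robots_url = "http://www.example.com/robots.txt"  # placeholder host

# urlopen raises an HTTPError if the file is missing; otherwise print the
# status code and the final URL in case of a redirect.
with urllib.request.urlopen(robots_url) as resp:
    print(resp.status, resp.geturl())

# Parse the file exactly as a crawler would fetch it and re-test /skins/.
rp = urllib.robotparser.RobotFileParser(robots_url)
rp.read()
print(rp.can_fetch("*", "http://www.example.com/skins/"))  # expected: False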
Is there a snippet underneath the listing, or do you just see the URLs? If it's just a URL, then this is quite common: files excluded in robots.txt often appear in Google listings in that way.
Excluded files can hang around in this way for a long time (forever?), and while they make a mess of site: search results, in my experience there isn't usually any impact on performance.
Indexed pages that are disallowed by robots.txt [webmasterworld.com]