This looks like you're blocking all bots.
Maybe I'm wrong and it's skipping over the Disallows for the other bots... I'd do this, though:
|User-agent: Googlebot |
EDIT: I've heard that blocking a lot of pages in robots.txt isn't good so maybe you just shocked googlebot a bit with the new htaccess. We blocked something last month to get rid of some pages and took a dip for about 4 days; could have been unrelated though.
Read this post by Matt Cutts; maybe it can help you.
the initial group (*) is empty, and the group of Disallows at the end is also useless: bots match paths left-to-right from the start of the URL path, so each rule must begin with a leading '/'.
in other words it appears there is no exclusion rule that applies to any of google's bots.
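To illustrate the leading-slash point, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It is only one parser and says nothing definite about how Googlebot itself evaluates rules; the `example.com` URL, the `private/` path and the `allowed` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules: the same path with and without the leading '/'.
NO_SLASH = "User-agent: *\nDisallow: private/\n"
WITH_SLASH = "User-agent: *\nDisallow: /private/\n"
URL = "http://example.com/private/page.html"

no_slash_result = allowed(NO_SLASH, "Googlebot", URL)
with_slash_result = allowed(WITH_SLASH, "Googlebot", URL)
print(no_slash_result, with_slash_result)
```

With this parser the rule without the slash never matches, because the request path it is compared against always starts with '/'.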
what is GWT telling you?
have you tried "fetch as googlebot"?
You MUST include a blank line after each record, i.e. before the next User-agent.
"Allow" is Non-Standard. Use
to allow all.
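For reference, the form the original standard documents for allowing everything is a `Disallow:` line with an empty value. A small sketch with Python's standard-library `urllib.robotparser` (one parser among many, so treat it as an illustration only; the URL and the `allowed` helper are hypothetical):

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# An empty Disallow value excludes nothing; "Disallow: /" excludes everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
URL = "http://example.com/any/page.html"  # hypothetical URL

allow_all_result = allowed(ALLOW_ALL, "AnyBot", URL)
block_all_result = allowed(BLOCK_ALL, "AnyBot", URL)
print(allow_all_result, block_all_result)
```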
Presumably this is a cut-and-paste error. I assume the " User-agent: * " line belongs just after the # mark, as those Disallow: rules have no preceding User-agent definition.
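A quick way to see why Disallow rules with no preceding User-agent line are a problem is to feed both layouts to a parser. This sketch uses Python's standard-library `urllib.robotparser`; other robots may behave differently, and the `/cgi-bin/` and `/tmp/` paths, the URL and the `allowed` helper are assumptions, not the poster's actual file.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rules with no User-agent line in front of them.
ORPHANED = "#\nDisallow: /cgi-bin/\nDisallow: /tmp/\n"
# The same rules with the User-agent line placed just after the # mark.
GROUPED = "#\nUser-agent: *\nDisallow: /cgi-bin/\nDisallow: /tmp/\n"
URL = "http://example.com/tmp/cache.html"

orphaned_result = allowed(ORPHANED, "AnyBot", URL)
grouped_result = allowed(GROUPED, "AnyBot", URL)
print(orphaned_result, grouped_result)
```

This particular parser silently drops the orphaned rules, so the URL stays fetchable until the User-agent line is put back.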
@g1smd does this only apply to Google? [developers.google.com...] - way down the page they have:
|disallow - The disallow directive specifies paths that must not be accessed by the designated crawlers. When no path is specified, the directive is ignored. |
|allow - The allow directive specifies paths that may be accessed by the designated crawlers. When no path is specified, the directive is ignored. |
I have to agree, [robotstxt.org...] does mention it:
|To exclude all files except one - This is currently a bit awkward, as there is no "Allow" field. The easy way is to put all files to be disallowed into a separate directory, say "stuff", and leave the one file in the level above this directory: |
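The "separate directory" workaround quoted above can be sketched with nothing but Disallow. The layout here (`/index.html` kept one level above a `/stuff/` directory), the URL and the bot name are assumptions for illustration, and Python's `urllib.robotparser` stands in for a generic compliant robot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: everything to be excluded lives under /stuff/,
# and the one file to keep crawlable sits in the level above it.
RULES = "User-agent: *\nDisallow: /stuff/\n"

parser = RobotFileParser()
parser.parse(RULES.splitlines())

kept_file = parser.can_fetch("AnyBot", "http://example.com/index.html")
hidden_file = parser.can_fetch("AnyBot", "http://example.com/stuff/draft.html")
print(kept_file, hidden_file)
```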
We use "allow" and "disallow" with no problems that are apparent.
|We use "allow" and "disallow" with no problems that are apparent. |
That's great for Google, what about other bots?
No problem with Bing or Yahoo. We get plenty of other bot traffic to it as well.
EDIT: By "apparent" I meant that we had seen no bot issues of any kind.
A compliant robot MUST honor the "disallow" directive.
A compliant robot may CHOOSE to honor non-standard directives such as "allow".
The word "compliant" means everything.
There are no numbered versions of the robots.txt standard.
|The /robots.txt standard is not actively developed. |
|Also, you may not have blank lines in a record, as they are used to delimit multiple records. |
|You MUST include a blank line after each record, i.e. before the next User-agent. |
that's true according to the robots exclusion protocol.
|you may not have blank lines in a record, as they are used to delimit multiple records. |
however according to the (non-standard) google documentation...
|Note the optional use of white-space (an empty line) to improve readability. |
as jimbeetle says:
|That's great for Google, what about other bots? |
given that the only bots that have a chance of honoring your robots.txt are non-google, you might want to add the blank lines between groups.
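To make the blank-line discussion concrete, here is a rough sketch of how one strict-ish parser, Python's standard-library `urllib.robotparser`, treats blank lines. It is not a statement about Google or any other crawler; the `/a/`, `/b/` and `/private/` paths, the URLs and the `OtherBot` name are invented for the example.

```python
from urllib.robotparser import RobotFileParser


def parse(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text held in a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# A blank line INSIDE a record: this parser ends the record at the blank
# line, so the Disallow below it is left without a User-agent.
SPLIT_RECORD = "User-agent: *\n\nDisallow: /private/\n"
split_result = parse(SPLIT_RECORD).can_fetch(
    "OtherBot", "http://example.com/private/page.html")

# A blank line BETWEEN records: each group keeps its own rules.
SEPARATED = (
    "User-agent: Googlebot\n"
    "Disallow: /a/\n"
    "\n"
    "User-agent: *\n"
    "Disallow: /b/\n"
)
separated = parse(SEPARATED)
googlebot_a = separated.can_fetch("Googlebot", "http://example.com/a/x.html")
googlebot_b = separated.can_fetch("Googlebot", "http://example.com/b/x.html")
otherbot_b = separated.can_fetch("OtherBot", "http://example.com/b/x.html")
print(split_result, googlebot_a, googlebot_b, otherbot_b)
```

Here the stray blank line inside the record costs you the rule entirely, while the blank line between groups is harmless, which is one reason to keep blank lines between records and nowhere else.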