Msg#: 3252492 posted 5:45 pm on Feb 14, 2007 (gmt 0)
Robots.txt takes effect as soon as you upload it. Every time googlebot visits your site, the first thing it does is fetch and read your robots.txt file. Googlebot then looks to see whether you have instructions for its specific user agent. If you list specific instructions for the user agent googlebot, googlebot will ignore the generic instructions and follow only the specific ones.
Your current robots.txt has several problems. Disallow: /topic should be Disallow: /topic/ (note the trailing slash). That is why googlebot is indexing www.example.com/topic/heating.
I assume you do not want googlebot to crawl your /admin/ folder, so you should place a copy of the generic instructions under the googlebot line.
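A minimal sketch of what that could look like (assuming /admin/ and /topic/ are the folders you want blocked; substitute your actual paths):

User-agent: Googlebot
# googlebot reads only this record, so every rule must be repeated here
Disallow: /admin/
Disallow: /topic/

User-agent: *
Disallow: /admin/
Disallow: /topic/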
To block pages based on URL wildcards (aka pattern matching), add this line: Disallow: /*blog?* (this will block all URLs that contain "blog?"). For more information, see Google's robots.txt documentation [google.com].
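For illustration (hypothetical URLs), that pattern matches any path containing "blog?":

www.example.com/blog?page=2 - blocked
www.example.com/archive/blog?id=7 - blocked
www.example.com/blog/post.html - not blocked (no "blog?" in the URL)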
Also, when in doubt you can test your robots.txt with Googlebot's robots.txt validator. It is located within Google Sitemaps [google.com].
Msg#: 3252492 posted 6:02 pm on Feb 14, 2007 (gmt 0)
A blank line is required after each record (that is, after a record's Disallow lines and before the next User-agent line), and also at the end of the file.
While Google and some of the other major search engines will look for the most specific User-agent record that applies to them, many search engines will not. All that is required by the robots.txt Standard is that a robot accept the first record which matches (or partially matches) its user-agent name, or a User-agent record specifying "*", whichever comes first.
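Under that reading, a hypothetical minimal robot identifying itself as googlebot would take the "*" record in the file below and never reach the Googlebot record after it, so the safest ordering puts specific records above the catch-all:

User-agent: *
Disallow: /admin/

User-agent: Googlebot
Disallow: /admin/
Disallow: /topic/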