need to block. and don't know - pls help - Sitemaps, Meta Data, and robots.txt forum at WebmasterWorld

Forum Moderators: goodroi



need help with blocking - Malware Ref

Reilly

8:05 pm on Mar 16, 2008 (gmt 0)

66.249.72.40 - - [16/Mar/2008:18:33:53 +0100] "GET /file.html?ref=example.com HTTP/1.1" 304 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

How can I block Google from crawling only the URL with ref=example.com?

example.com is a malware domain, and I don't know why my server responds with a 304. Please, I have to stop Google from crawling that page with that malware parameter...

Disallow: example.com

Does this work?

Thanks in advance...

Samizdata

9:10 pm on Mar 16, 2008 (gmt 0)

Assuming you are on an Apache server, have a look at the posts in the WebmasterWorld Apache Web Server forum on how to deal with unwanted query strings using mod_rewrite (if on Windows try that forum).

There is nothing you can do about this with a robots.txt file.

The 304 response means that the content has not changed since last requested.

[edited by: Samizdata at 9:10 pm (utc) on Mar. 16, 2008]
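To illustrate the 304 point above, here is a minimal sketch of how a server might decide between 200 and 304 on a conditional GET. It is an assumption-laden simplification, not how any particular server is implemented: the function name `status_for` and the dates are hypothetical, and only the If-Modified-Since check is modelled (no ETags).

```python
from email.utils import parsedate_to_datetime


def status_for(last_modified_http, if_modified_since_http=None):
    """Return 200 or 304 for a GET, based only on If-Modified-Since.

    Both arguments are HTTP date strings; the names are hypothetical.
    """
    if if_modified_since_http is None:
        # Unconditional request: always send the full page.
        return 200
    last_modified = parsedate_to_datetime(last_modified_http)
    since = parsedate_to_datetime(if_modified_since_http)
    # Unchanged since the client's copy was fetched: 304 Not Modified.
    return 304 if last_modified <= since else 200


# Hypothetical dates: file last changed the day before the crawler's cached copy.
print(status_for("Sat, 15 Mar 2008 10:00:00 GMT",
                 "Sun, 16 Mar 2008 17:33:53 GMT"))
```

A static file server usually ignores a query string it does not use, so file.html?ref=example.com is likely answered from the same file.html on disk, which would explain why the crawler gets a 304 for it.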

Reilly

8:25 am on Mar 17, 2008 (gmt 0)

I need to prevent Google from spidering that query.

file.html is not the same as file.html?ref=example.com

I do not have mod_rewrite :( - please, what is the syntax to prevent Googlebot from spidering that fake query?

Do I need Disallow: ref=

or what? I do not use queries with ?ref=, so it should be easy to stop Googlebot from spidering that file with this query?

Reilly

7:55 pm on Mar 17, 2008 (gmt 0)

I have found this:

Disallow: /*ref=*

Does this stop Google from spidering that URL?

[edited by: Reilly at 7:56 pm (utc) on Mar. 17, 2008]
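One way to sanity-check a wildcard rule like the one above is to emulate the matching locally. The sketch below assumes the Googlebot-style extensions to robots.txt, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names `pattern_to_regex` and `is_blocked` are hypothetical, and Allow rules and rule precedence are ignored, so treat it as an approximation rather than Google's actual matcher.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Googlebot-style Disallow pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url_path, disallow_patterns):
    """True if any pattern matches from the start of the path plus query."""
    return any(pattern_to_regex(p).match(url_path) for p in disallow_patterns)


print(is_blocked("/file.html?ref=example.com", ["/*ref=*"]))      # True
print(is_blocked("/file.html", ["/*ref=*"]))                      # False
print(is_blocked("/file.html?ref=example.com", ["example.com"]))  # False
```

Under these assumptions, `Disallow: /*ref=*` would cover the malware URL and leave plain file.html crawlable, while `Disallow: example.com` would match nothing because rules are compared against the path, which starts with `/`. The trailing `*` is redundant, and the rule would also catch any parameter whose name merely ends in "ref", such as ?pref=1.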