
Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Need to block a URL and don't know how - please help
Need help with blocking - malware ref
Reilly

5+ Year Member



 
Msg#: 3602301 posted 8:05 pm on Mar 16, 2008 (gmt 0)

66.249.72.40 - - [16/Mar/2008:18:33:53 +0100] "GET /file.html?ref=example.com HTTP/1.1" 304 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

How can I block only the file with ref=example.com from being crawled by Google?

example.com is a malware domain, and I don't know why my server responds with a 304. Please, I have to stop Google from crawling that page with the malware parameter...

Disallow: example.com

Does this work?

thanks in advance...
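
A minimal sketch of why that rule is unlikely to help, assuming the crawler treats a Disallow value as a prefix of the requested path plus query string (the usual reading of robots.txt). The function name `is_disallowed` and the second rule are illustrative only and do not come from the thread.

```python
def is_disallowed(path_and_query, disallow_rules):
    """Return True if any non-empty rule is a prefix of the requested path."""
    # Assumption: rules are compared from the first character of the path,
    # so a rule that does not start with "/" can never match a real request.
    return any(rule and path_and_query.startswith(rule) for rule in disallow_rules)


requested = "/file.html?ref=example.com"

# The rule from the post: it is a host name, not a path prefix.
print(is_disallowed(requested, ["example.com"]))

# A hypothetical rule that spells out the path and the start of the query.
print(is_disallowed(requested, ["/file.html?ref="]))

# The clean URL is not affected by the hypothetical rule.
print(is_disallowed("/file.html", ["/file.html?ref="]))
```

Under this reading, "Disallow: example.com" never matches, while a rule beginning with the actual path can single out the ?ref= variant without touching file.html itself.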

 

Samizdata

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3602301 posted 9:10 pm on Mar 16, 2008 (gmt 0)

Assuming you are on an Apache server, have a look at the posts in the WebmasterWorld Apache Web Server forum on how to deal with unwanted query strings using mod_rewrite (if on Windows try that forum).
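
For anyone who cannot use mod_rewrite, the same idea can be sketched in application code. This is only an illustration: the parameter set `UNWANTED_PARAMS`, the function name `handle_request`, and the choice of a 301 redirect to the clean URL are all assumptions, not something taken from the Apache forum posts.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical: query parameters this site never uses itself.
UNWANTED_PARAMS = {"ref"}


def handle_request(url):
    """Return (status, location) for a requested URL.

    URLs without unwanted parameters are served as-is (200).
    URLs carrying one are redirected (301) to the same URL without it.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if not any(key in UNWANTED_PARAMS for key, _ in pairs):
        return 200, url
    kept = [(key, value) for key, value in pairs if key not in UNWANTED_PARAMS]
    clean = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    return 301, clean


print(handle_request("/file.html?ref=example.com"))
print(handle_request("/file.html"))
```

A 404 or 410 response would be an equally plausible choice here; the point is only that the server stops answering the ?ref= variant as if it were a normal page.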

There is nothing you can do about this with a robots.txt file.

The 304 response means that the content has not changed since last requested.
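
As a rough sketch of where a 304 comes from: the client sends an If-Modified-Since header and the server compares it with the file's modification time. The function name `status_for` and the dates below are made up for illustration; a real server also handles ETags and other details not shown here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for(last_modified, if_modified_since):
    """Return 304 if the client's copy is still current, otherwise 200."""
    if if_modified_since is None:
        return 200  # unconditional request: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


# Hypothetical modification time of file.html.
modified = datetime(2008, 3, 1, 12, 0, 0, tzinfo=timezone.utc)

print(status_for(modified, "Sun, 16 Mar 2008 17:33:53 GMT"))
print(status_for(modified, None))
```

A static file server typically ignores the query string when it picks the file, so file.html?ref=example.com is answered from the same unchanged file.html, which would explain the 304 in the log.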

[edited by: Samizdata at 9:10 pm (utc) on Mar. 16, 2008]

Reilly

5+ Year Member



 
Msg#: 3602301 posted 8:25 am on Mar 17, 2008 (gmt 0)

I need to prevent Google from spidering that query.

file.html is not the same as file.html?ref=example.com

I do not have mod_rewrite :( - so what is the syntax to prevent Googlebot from spidering that fake query?

Do I need Disallow: ref=

or what? I do not use queries with ?ref= anywhere, so it should be easy to stop Googlebot from spidering that file with this query?

Reilly

5+ Year Member



 
Msg#: 3602301 posted 7:55 pm on Mar 17, 2008 (gmt 0)

I have found this:

Disallow: /*ref=*

Does this stop Google from spidering that URL?

[edited by: Reilly at 7:56 pm (utc) on Mar. 17, 2008]
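
A small sketch for checking a rule like this before relying on it, assuming Google-style pattern matching in which * matches any run of characters and a trailing $ anchors the end of the URL. The helper names `rule_to_regex` and `rule_matches` are hypothetical, and this is not Googlebot's actual implementation. Note that such a rule only affects compliant crawlers and only stops crawling; it does not change how the server answers the request.

```python
import re


def rule_to_regex(rule):
    """Translate a wildcard Disallow value into a compiled regular expression."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def rule_matches(rule, path_and_query):
    """Return True if the rule matches from the start of the path."""
    return rule_to_regex(rule).match(path_and_query) is not None


print(rule_matches("/*ref=*", "/file.html?ref=example.com"))
print(rule_matches("/*ref=*", "/file.html"))

# Caveat: "ref=" also occurs inside other parameter names such as "pref=".
print(rule_matches("/*ref=*", "/page.html?pref=1"))
print(rule_matches("/*?ref=", "/page.html?pref=1"))
```

Under these assumptions the trailing * is redundant, and a narrower pattern such as /*?ref= avoids catching unrelated parameters whose names merely end in "ref".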

