

Webmaster General Forum

    
Concern with spammy search words
el_roboto

5+ Year Member



 
Msg#: 3389310 posted 8:24 am on Jul 9, 2007 (gmt 0)

I've been using AWStats to monitor traffic to my website.

For at least the past 3 months I've noticed that a number of spammy search terms are appearing in the search keywords/phrases logs.

I'm wondering how this could be happening. I'm on a VPS, so might someone else on the server be spamming somehow?

 

DXL

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3389310 posted 12:52 pm on Jul 9, 2007 (gmt 0)

This happens to just about every site that I manage. I generally get keywords related to prescription drugs even though none of their names appear on any of the sites that I maintain.

el_roboto

5+ Year Member



 
Msg#: 3389310 posted 12:57 pm on Jul 9, 2007 (gmt 0)

It's crazy!

Fioricet and gambling are the two that appear every month without fail. Likewise, I can't see anything that's been hacked into my source anywhere.

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3389310 posted 2:42 pm on Jul 9, 2007 (gmt 0)

If these are like the ones I've seen, they're mostly (or all) coming from the "Tide" proxy servers at Microsoft. As such, they're fairly easy to block based on REMOTE_ADDR, REMOTE_HOST, and the Via request header.

You should be able to confirm this by reviewing your server access log file, and searching for the 'spammy' terms.
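
For anyone who would rather script that search than eyeball the log, here is a minimal Python sketch. It assumes the access log is in Apache's combined format and that the search engine passes the query in a "q" parameter of the referrer URL; the function name and the term list are made up for illustration.

```python
import re
from urllib.parse import urlparse, parse_qs

# Combined log format: IP, identity, user, [time], "request", status, bytes, "referrer", ...
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)"'
)

# Hypothetical list; substitute the terms that show up in your own stats.
SPAMMY_TERMS = {"fioricet", "gambling"}


def spammy_hits(lines, terms=SPAMMY_TERMS):
    """Yield (ip, phrase) for log lines whose referrer search query contains a listed term."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that are not in combined log format
        query = parse_qs(urlparse(match.group("referrer")).query)
        for phrase in query.get("q", []):
            if any(term in phrase.lower() for term in terms):
                yield match.group("ip"), phrase
```

Feeding it an open log file (for example, iterating over `spammy_hits(open("access.log"))`, where the path is whatever your host uses) lists the client IP next to each offending search phrase, which makes it easy to see whether the hits all come from one address range.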

Jim

el_roboto

5+ Year Member



 
Msg#: 3389310 posted 2:55 pm on Jul 9, 2007 (gmt 0)

Hah! There it is:

131.107.0.96 - - [08/Jul/2007:06:52:15 +0100] "GET / HTTP/1.1" 200 2290 "http://search.live.com/result.aspx?q=fioricet&mrt=en-us&FORM=LVSP" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; Win64; x64; SV1; .NET CLR 2.0.50727)"

With the information above, how would I go about blocking these requests?

physics

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3389310 posted 11:51 pm on Jul 9, 2007 (gmt 0)

jdMorgan is probably right. However, if I were you I would do a search like:
site:yoursite.com gambling
on Google to make sure that none of your pages have those terms on them.

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3389310 posted 12:43 am on Jul 10, 2007 (gmt 0)

Yes, that IP address resolves to tide526.microsoft.com.

I don't know what these people (or robots) are up to, or whether blocking these accesses will actually accomplish anything, but if you're on Apache, something like this in .htaccess would return a 403-Forbidden response for these requests:

RewriteCond %{REMOTE_ADDR} ^131\.107\.0\.(6[4-9]|[789][0-9]|1[01][0-9]|12[0-7])$
RewriteCond %{HTTP:Via} SEA-PRXY
RewriteRule .* - [F]

This snippet assumes that you already have other working mod_rewrite rules. If not, you'll need to "set up" mod_rewrite before using this code.
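
The last-octet alternation in the REMOTE_ADDR pattern is easy to get wrong when editing by hand. It is meant to cover 64 through 127, which corresponds to the 131.107.0.64/26 block. As a rough sanity check (a sketch only, with illustrative function names, not part of the .htaccess code), the same pattern can be compared against that block in Python:

```python
import ipaddress
import re

# Same pattern as the first RewriteCond, with solid pipe characters.
RULE_PATTERN = re.compile(
    r"^131\.107\.0\.(6[4-9]|[789][0-9]|1[01][0-9]|12[0-7])$"
)

# The CIDR block the pattern is intended to cover (last octet 64 through 127).
TIDE_BLOCK = ipaddress.ip_network("131.107.0.64/26")


def matches_rule(ip):
    """True if the address string matches the RewriteCond pattern."""
    return RULE_PATTERN.match(ip) is not None


def in_block(ip):
    """True if the address falls inside the intended CIDR block."""
    return ipaddress.ip_address(ip) in TIDE_BLOCK
```

If the two functions ever disagree for an address in 131.107.0.0 through 131.107.0.255 after you change the pattern, the regular expression no longer covers the range you think it does.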

An alternative rule, if your host allows/supports reverse DNS lookups, might be:

RewriteCond %{HTTP_REFERER} ^http://search\.live\.com/result\.aspx\?q=
RewriteCond %{REMOTE_HOST} ^tide[0-9]+\.microsoft\.com
RewriteRule .* - [F]

This second version, based on the tide.microsoft.com hostnames, eliminates the possible need to maintain the IP address "list" for the REMOTE_ADDR check in the first code snippet.

In the second version, the RewriteCond examining the referrer isn't strictly necessary, but I included it to limit the number of reverse DNS lookups, which are by nature 'expensive' in terms of CPU cycles and waiting server threads. You can make it even more specific by including the search terms that you find in the requests to your server, as in:

RewriteCond %{HTTP_REFERER} ^http://search\.live\.com/result\.aspx\?q=(widgets|wodgets|whatever)

Note that posting on this forum converts solid pipe characters into broken pipe "¦" characters; if any appear in the code above, replace them with solid pipe "|" characters before use.

These are just two example routines; adjust as desired to suit your needs.
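
To illustrate why the referrer test comes first in the second version, here is a small Python sketch of the same ordering: the cheap string match runs first, and the reverse DNS lookup only happens for requests that pass it. This is not how Apache implements it internally; the function names and the injectable lookup parameter are assumptions made for the example.

```python
import re
import socket

# Same patterns as the two RewriteCond lines in the second version.
REFERRER_PATTERN = re.compile(r"^http://search\.live\.com/result\.aspx\?q=")
HOST_PATTERN = re.compile(r"^tide[0-9]+\.microsoft\.com")


def reverse_lookup(ip):
    """Return the reverse-DNS hostname for ip, or an empty string if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ""


def should_block(ip, referrer, lookup=reverse_lookup):
    """Mirror the second rule: test the referrer first, then the hostname."""
    # Cheap string test first; only matching requests pay for a DNS lookup.
    if not REFERRER_PATTERN.match(referrer):
        return False
    return HOST_PATTERN.match(lookup(ip)) is not None
```

Passing a different lookup function makes it possible to try the logic against hostnames taken from your logs without touching DNS at all.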

Jim

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved