
Webmaster General Forum

Concern with spammy search words

 8:24 am on Jul 9, 2007 (gmt 0)

I've been using AWStats to monitor traffic to my website.

For at least the past 3 months I've noticed that a number of spammy search terms are appearing in the search keywords/phrases logs.

I'm wondering how this could be happening. Might someone else on the server be spamming somehow, as I'm on a VPS?



 12:52 pm on Jul 9, 2007 (gmt 0)

This happens to just about every site that I manage. I generally get keywords related to prescription drugs even though none of their names appear on any of the sites that I maintain.


 12:57 pm on Jul 9, 2007 (gmt 0)

It's crazy!

Fioricet and gambling are the two that appear every month without fail. Likewise, I can't see anything that's been hacked into my source anywhere.


 2:42 pm on Jul 9, 2007 (gmt 0)

If these are like the ones I've seen, they're mostly (or all) coming from the "Tide" proxy servers at Microsoft. As such they're fairly easy to block based on their REMOTE_ADDR and REMOTE_HOST values and their Via header.

You should be able to confirm this by reviewing your server access log file, and searching for the 'spammy' terms.



 2:55 pm on Jul 9, 2007 (gmt 0)

hah! there it is - - [08/Jul/2007:06:52:15 +0100] "GET / HTTP/1.1" 200 2290 "http://search.live.com/result.aspx?q=fioricet&mrt=en-us&FORM=LVSP" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; Win64; x64; SV1; .NET CLR 2.0.50727)

With the information above, how would I go about blocking these requests?


 11:51 pm on Jul 9, 2007 (gmt 0)

jdMorgan is probably right. However, if I were you I would do a search like:
site:yoursite.com gambling
on Google to make sure that none of your pages have those terms on them.


 12:43 am on Jul 10, 2007 (gmt 0)

Yes, that IP address resolves to tide526.microsoft.com

I don't know what these people (or robots) are up to, or whether blocking these accesses will actually accomplish anything, but if you're on Apache, something like this in .htaccess would return a 403-Forbidden response for these requests:

RewriteCond %{REMOTE_ADDR} ^131\.107\.0\.(6[4-9]|[789][0-9]|1[01][0-9]|12[0-7])$
RewriteCond %{HTTP:Via} SEA-PRXY
RewriteRule .* - [F]

This snippet assumes that you already have other working mod_rewrite rules. If not, you'll need to "set up" mod_rewrite before using this code.
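
For reference, a minimal sketch of that setup in .htaccess typically looks like the following. It assumes your host has mod_rewrite loaded and allows these directives to be set in .htaccess; if not, you'll need to ask them to enable it.

```apache
# Symlink following must be enabled for mod_rewrite to work in .htaccess
Options +FollowSymLinks
# Turn on the rewrite engine for this directory
RewriteEngine On
# ...RewriteCond/RewriteRule lines go below this point
```

These lines only need to appear once, ahead of the first rule in the file.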

An alternative rule, if your host allows/supports reverse DNS lookups, might be:

RewriteCond %{HTTP_REFERER} ^http://search\.live\.com/result\.aspx\?q=
RewriteCond %{REMOTE_HOST} ^tide[0-9]+\.microsoft\.com
RewriteRule .* - [F]

This second version, based on the tide.microsoft.com hostnames, eliminates the possible need to maintain the IP address "list" for the REMOTE_ADDR check in the first code snippet.

In the second version, the RewriteCond examining the referrer isn't strictly necessary, but I included it to limit the number of reverse DNS lookups, which are by nature 'expensive' in terms of CPU cycles and waiting server threads. You can make it even more specific by including the search terms that you find in the requests to your server, as in:

RewriteCond %{HTTP_REFERER} ^http://search\.live\.com/result\.aspx\?q=(widgets|wodgets|whatever)

Note that posting on this forum changes solid pipe characters into broken pipe "¦" characters, so replace any broken pipes in code copied from this thread with solid pipe "|" characters before use.

These are just two example routines; adjust as desired to suit your needs.
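
As one example of such an adjustment, here is a sketch of the second routine narrowed to the two search terms reported earlier in this thread (fioricet and gambling). The choice of terms is only an assumption for illustration; substitute whatever shows up in your own access log. Like the second version, it relies on your host supporting reverse DNS lookups.

```apache
# Only consider requests whose referrer is a Live Search query for the listed terms
# (the terms here are examples taken from this thread, not a definitive list)
RewriteCond %{HTTP_REFERER} ^http://search\.live\.com/result\.aspx\?q=(fioricet|gambling)
# ...and whose remote host resolves to one of the tide*.microsoft.com proxies
RewriteCond %{REMOTE_HOST} ^tide[0-9]+\.microsoft\.com
# Return 403 Forbidden for any URL requested this way
RewriteRule .* - [F]
```

Because the referrer check comes first, the hostname condition (and its reverse DNS lookup) is only evaluated for requests that already match one of the listed terms.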

