|
Possible Bot or Spammer?
Or is this Live bot? |
EarleyGirl
#:3265933
| 10:09 pm on Feb. 27, 2007 (utc 0) |
I see something strange in my access log. At first glance, it appears someone came in from MSN Live on a search for "airlines" to my site which has nothing to do with airlines. That's the first red flag. The log file shows this:
tide526.microsoft.com - - [11/Feb/2007:02:10:22 -0500] "GET / HTTP/1.1" 200 3612 "http://search.live.com/result.aspx?q=airlines&mrt=en-us&FORM=LVSP" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; WOW64; SV1)"
Yet loading that referrer URL on Live brings up an error page. A true search URL would look like this: search.live.com/results.aspx?q=airlines&mkt=en-us&FORM=LIVSOP
Notice the differences: results.aspx (the log has result.aspx), mkt (the log has mrt) and LIVSOP (the log has LVSP). Looks awfully suspect to me.
Can someone shed some light? Is this really a Microsoft bot? If it really is from microsoft.com, why would they be performing fake searches?
Edit: I just checked January's log file. They used the keyword "hydrocodone" last month (again, nothing to do with my site). What is up?
[edited by: EarleyGirl at 10:28 pm (utc) on Feb. 27, 2007]
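For anyone who wants to script the comparison above, here is a minimal Python sketch. It assumes the access log is in Apache combined format (as the line quoted above appears to be) and treats the URL shape from this post (`/results.aspx` with `q`, `mkt` and `FORM`) as the expected one. That expected shape is only an assumption taken from the example here, not an official Live specification, and `referer_mismatches` is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Apache combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Assumption: expected shape copied from the working search URL in the post.
EXPECTED_PATH = "/results.aspx"
EXPECTED_PARAMS = {"q", "mkt", "FORM"}


def referer_mismatches(log_line):
    """Return a list of ways the referer differs from the expected shape."""
    match = LOG_PATTERN.match(log_line.strip())
    if match is None:
        raise ValueError("line is not in combined log format")

    parts = urlsplit(match.group("referer"))
    params = set(parse_qs(parts.query))

    problems = []
    if parts.path != EXPECTED_PATH:
        problems.append(f"path {parts.path!r} differs from {EXPECTED_PATH!r}")
    for name in sorted(EXPECTED_PARAMS - params):
        problems.append(f"missing parameter {name!r}")
    for name in sorted(params - EXPECTED_PARAMS):
        problems.append(f"unexpected parameter {name!r}")
    return problems
```

Run against the logged line, this reports three mismatches: the path, the missing `mkt`, and the unexpected `mrt`. It deliberately does not compare `FORM` values, since the set of valid codes is not known from this thread.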
|
Brett_Tabke
#:3265966
| 10:41 pm on Feb. 27, 2007 (utc 0) |
First, that user and host is ms corporate. I would say you are being visited by a human checking qc on selected keywords.
|
wilderness
#:3266065
| 12:30 am on Feb. 28, 2007 (utc 0) |
tide526.microsoft.com equals 207.46.18.30
Dan's page "DID" stay current on IP ranges of all the major bots. His site, however, has changed. An old link from Archive.org:
http://web.archive.org/web/20060603044348/http://joseluis.pellicer.org/ua/
MS and all the major SEs are offering a variety of tools, toolbars, plug-ins and such, which use a variety of IP ranges not formerly utilized.
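One common way to double-check a hostname/IP pairing like the one above is a forward-confirmed reverse DNS lookup: resolve the IP to a name, then resolve that name back and see whether the original IP is among the results. A minimal Python sketch follows. `is_forward_confirmed` is a hypothetical helper, and the `reverse`/`forward` parameters are added here only so the lookups can be swapped out for testing. Results depend on whatever DNS returns at the time you run it.

```python
import socket


def is_forward_confirmed(ip, domain_suffix,
                         reverse=socket.gethostbyaddr,
                         forward=socket.gethostbyname_ex):
    """True if ip reverse-resolves into domain_suffix and resolves back to ip."""
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        hostname = reverse(ip)[0]
    except OSError:
        return False

    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith("." + domain_suffix.lower()):
        return False

    try:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        addresses = forward(hostname)[2]
    except OSError:
        return False

    return ip in addresses
```

Usage would be along the lines of `is_forward_confirmed("207.46.18.30", "microsoft.com")`. A hostname in a log is only as trustworthy as the reverse record behind it, which is why the forward step matters.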
|
EarleyGirl
#:3266985
| 7:30 pm on Feb. 28, 2007 (utc 0) |
| First, that user and host is ms corporate. I would say you are being visited by a human checking qc on selected keywords. |
Odd that hydrocodone (a drug) would be used as a keyword for checking qc. My site has nothing to do with airlines or drugs, nor would it appear in a search using those keywords.
Also, is the search being done from a desktop or a different application? It doesn't work from a browser when I try it. It brings up an error; the URL isn't right. It might be from Microsoft, but I don't think they were coming in from a search. It just appears that way. I wonder why?
|
Brett_Tabke
#:3266988
| 7:33 pm on Feb. 28, 2007 (utc 0) |
That doesn't mean your site wasn't seen in those kw's - and hence the qc check behind the scenes...
|
volatilegx
#:3267046
| 8:04 pm on Feb. 28, 2007 (utc 0) |
My site tracks specifically search engine spiders. I've never seen 207.46.18.30 displaying spider-like behavior.
|
EarleyGirl
#:3267075
| 8:36 pm on Feb. 28, 2007 (utc 0) |
| That doesn't mean your site wasn't seen in those kw's - and hence the qc check behind the scenes... |
You've lost me on that one. I don't see how that's possible. I don't have a site search (or a wiki or comments or a forum for that matter) so that wasn't used by a spammer. Up until this month, it was a Flash site, nothing but an swf - no words to pick up.
Also, for the months of November and December, "www" was used as the keyword from tide525.microsoft.com. Why would that need a qc check?
[edited by: EarleyGirl at 8:38 pm (utc) on Feb. 28, 2007]
|
Brett_Tabke
#:3267099
| 9:06 pm on Feb. 28, 2007 (utc 0) |
They were checking THEIR index behind the scenes before pushing it live.
|
EarleyGirl
#:3267106
| 9:17 pm on Feb. 28, 2007 (utc 0) |
Thanks Brett. My apologies if I seem slow to catch on. I was just trying to understand it.
|
Brett_Tabke
#:3267121
| 9:35 pm on Feb. 28, 2007 (utc 0) |
I know it is all complicated. What happens is:
- they build an index.
- it has errors in it.
- they run a util on high value kws that flags possible problems for a hand check.
- they do the hand check and delete obvious mistakes.
|
EarleyGirl
#:3267129
| 9:49 pm on Feb. 28, 2007 (utc 0) |
Thanks Brett. That clarifies things.
|