
Bing Search Engine News Forum

This 135 message thread spans 5 pages.
Strange Referrer Activity
live
confuscius




msg:3424478
 8:29 am on Aug 17, 2007 (gmt 0)

I am getting thousands of hits where the items in my log show the referrer as follows;

http://search.live.com/result.aspx?q=KEYWORD&mrt=en-us&FORM=LVSP

When I load the referring page I am told that there are no results. Also, there is no relationship between the keyword and the page requested. The keywords are single words and seem to be mainly concerned with the normal spam areas.

I have scoured Live to try and find form 'LSVP', searched everywhere that I can think of.

Can anyone enlighten me as to what the heck form LSVP is? Have the spammers found another flaw? I am based in the UK.

Thanks in advance.

[edited by: engine at 10:30 am (utc) on Aug. 18, 2007]
[edit reason] delinked [/edit]

 

Receptional Andy




msg:3424484
 9:17 am on Aug 17, 2007 (gmt 0)

I've seen a few of these, with searches for random keywords (one was 'sex' bizarrely) that could not have found the site in question. In each case, the IP of the visitor was owned by Microsoft.

I think it's some type of internal testing, although I haven't figured out of what. I think the actual referrer is faked.

Achernar




msg:3424532
 10:36 am on Aug 17, 2007 (gmt 0)

I've been seeing this since yesterday. Half of them on honeypot pages. When will they learn that links matching robots.txt rules are a no-no? Not even as a url-only reference in the SERP. Apparently this set of robots is using URLs from their index database. That's what I call eating one's own s..t :)
Once again a broken "technology" from Microsoft.

[edited by: Achernar at 10:37 am (utc) on Aug. 17, 2007]

The Contractor




msg:3424570
 11:31 am on Aug 17, 2007 (gmt 0)

In each case, the IP of the visitor was owned by Microsoft.

I have found they are all owned by MS also. This has been going on for a long time and yes, the terms are often 'sex related'. Why, I have no idea, and I'm sure they get a kick out of giving this type of referrer spam.

Achernar




msg:3424750
 2:44 pm on Aug 17, 2007 (gmt 0)

Yes, it has been going on for some time now. But it was once or twice a week. Currently it's 3 or 4 per hour.

confuscius




msg:3425406
 8:14 am on Aug 18, 2007 (gmt 0)

Thanks for the feedback. I have now had several thousand requests over multiple sites in the last few days, so I have a clearer picture of what is happening: mainly single-word requests, most of them drug related. The requests call pages carrying AdSense, with the result that not only are my website logs completely messed up but so are my AdSense statistics, because of all the 'false' visitors. Conspiracy theory time!

Anyways, the requests all seem to come from the IP range 65.55.165.* so I have denied them through .htaccess as follows:

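# Deny the 65.55.165.* range; with "Order Allow,Deny" a matching Deny overrides Allow.
# Note: <Limit GET> covers GET (and HEAD) only; other methods are not restricted.
<Limit GET>
Order Allow,Deny
Allow from all
Deny from 65.55.165.
</Limit>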
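
For anyone who wants to measure the scale before blocking, here is a minimal Python sketch that counts log lines coming from that range. It assumes the range is the full 65.55.165.0/24 block and that the client IP is the first whitespace-separated field on each line (as in Apache's common and combined log formats). The function names and the sample lines are made up for illustration.

```python
import ipaddress

# Assumption: the whole /24 is involved, matching "deny from 65.55.165."
SUSPECT_NET = ipaddress.ip_network("65.55.165.0/24")

def is_suspect(ip_text):
    """Return True if ip_text parses as an address inside SUSPECT_NET."""
    try:
        return ipaddress.ip_address(ip_text) in SUSPECT_NET
    except ValueError:
        return False  # not an IP address at all (e.g. a hostname)

def count_suspect_hits(log_lines):
    """Count lines whose first field is an IP inside SUSPECT_NET."""
    count = 0
    for line in log_lines:
        fields = line.split()
        if fields and is_suspect(fields[0]):
            count += 1
    return count

# Hypothetical sample lines, not taken from a real log
sample = [
    '65.55.165.100 - - [18/Aug/2007:08:14:00 +0000] "GET /page.html HTTP/1.1" 200 512',
    '192.0.2.7 - - [18/Aug/2007:08:15:00 +0000] "GET /page.html HTTP/1.1" 200 512',
]
print(count_suspect_hits(sample))
```

To run it against a real log, pass an open file object instead of the sample list, since the function only needs an iterable of lines.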

I have also emailed Live support - let's see if I get a response!

Any more cases of this out there?

Paul

gosman




msg:3425586
 4:29 pm on Aug 18, 2007 (gmt 0)

I'm experiencing the same. All to a subdirectory that is excluded in my robots.txt file. Most of the search terms are pharmacy related and my site is travel related.

Weird.

BillyS




msg:3425599
 4:51 pm on Aug 18, 2007 (gmt 0)

Yes, I've seen this too. Just another sign of the mess at Microsoft.

Romeo




msg:3425762
 10:37 pm on Aug 18, 2007 (gmt 0)

... they are using 'strange' key words from the grey side : cartier volkswagen codeine zithromax lodging cell+phone mazda ringtone pron make+money+online mazda ... etc.

It is offensive to run such crap against my sites, which are not about such a thing as 'ringtones'.

It is a parallel operation. It looks automated --> a bot. It is not looking for /robots.txt. It does not show a robot UA either, but hides behind a faked browser USER_AGENT. And furthermore, they are not even able to have valid rDNS PTR records: a thing like 'bl2sch1082116.phx.gbl' is not a valid domain name and as such is unprofessional.
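
The PTR complaint can be checked with a forward-confirmed reverse DNS lookup: resolve the IP to a name, then resolve that name back and see whether the original IP is among the results. A minimal Python sketch follows, using only the standard `socket` module. The `reverse` and `forward` parameters are an addition of this sketch so the logic can be tried offline. The stand-in resolvers and their data are hypothetical, and they assume that an internal-looking name such as the one quoted does not resolve publicly.

```python
import socket

def forward_confirmed_rdns(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the PTR hostname if it resolves back to ip, else None.

    reverse and forward default to the standard socket lookups; they are
    parameters only so the check can be exercised without network access.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup
        addresses = forward(hostname)[2]   # address lookup on the PTR name
    except OSError:
        return None  # no PTR record, or the PTR name does not resolve
    return hostname if ip in addresses else None

# Offline demonstration with stand-in resolvers (hypothetical data)
def fake_reverse(ip):
    return ("bl2sch1082116.phx.gbl", [], [ip])

def fake_forward(name):
    raise socket.gaierror("name does not resolve")

print(forward_confirmed_rdns("65.55.165.100", fake_reverse, fake_forward))
```

Called with just an IP, the function performs real DNS lookups. A result of `None` only says the reverse record could not be confirmed, not who operates the address.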

This is just rude and nasty and showing poor recognition of basic netiquette. Bad manners.
What do they think? Not much, probably.

So this dubious operation is excluded from my sites now, which -- ironically -- are on Linux systems.

Are we having fun yet?
Kind regards,
R.

confuscius




msg:3425949
 8:41 am on Aug 19, 2007 (gmt 0)

I have received an email from Microsoft Support requesting additional information, so I have pointed them at this thread and asked them to comment.

Consequently, if anyone can add further details and examples then this would be appreciated, particularly information about the scale of the issue and its effects.

Thanks in advance.

Paul

AussieWebmaster




msg:3426063
 1:52 pm on Aug 19, 2007 (gmt 0)

Is it possible it is through one of the filters.... are people using any of the demographic filters to cause this?

Is this organic traffic? Or are the sites in question buying traffic from MS and they are testing some bizarre behavioral or demographic filter?

The Contractor




msg:3426364
 11:07 pm on Aug 19, 2007 (gmt 0)

Is this organic traffic? Or are the sites in question buying traffic from MS and they are testing some bizarre behavioral or demographic filter?

Nope, never paid for any traffic myself and this is coming from inside MS.

BillyS




msg:3426736
 12:27 pm on Aug 20, 2007 (gmt 0)

Based on my experience it's not paid or anything else. The one I noticed had to do with a prescription diet pill - which is a long way from my topic (finance).

I thought I read something about MS checking new results with this type of query - if that's true they've got big problems.

antikva




msg:3426800
 1:21 pm on Aug 20, 2007 (gmt 0)

I'm also getting these strange search strings with no correlation to my site or any of my clients. It's been happening for months now.

All have Microsoft IPs. They were related to drugs; now they've switched to fairly offensive sex terms. I checked the referring pages and there aren't any site results showing at all for the search terms used today.

I did notice that I had another Microsoft IP in that range show up with the same Agent string but with no referral info.

Made me curious enough to see if anyone else was seeing the same thing.

Receptional Andy




msg:3426841
 2:09 pm on Aug 20, 2007 (gmt 0)

I had a more detailed look into this for one site. Here's some info about this MS 'visitor':

- Landing pages appear to be pseudo-random (seemed to request 'batches' of related pages)
- IPs changeable in the range 65.55.165.*
- Spoofed referrers are search.live.com/result.aspx?q=[KEYWORD]&mrt=en-us&FORM=LVSP. However, the bot also makes requests without a referrer, sometimes in the course of the same visit
- Keywords are usually single words, varying from obvious commercial (but inoffensive) to drug names and pr0n-related words
- System details are identical for each 'visitor': MSIE 7.0, Windows 2003, 32 bit colour res, 800x600 screen res, both cookies and javascript accepted

I'm not seeing any flood of requests, but this bot appears to visit on a pattern of a number of pages a day. It then requests a handful of pages within a short time frame, but from different IP addresses and with cookies reset.
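
The referrer part of that fingerprint can be turned into a rough log filter. This is only a sketch: it assumes the referrer has already been extracted from the log line, and the function name and the `SUSPECT_FORMS` set are illustrative names, seeded with the one FORM code listed above.

```python
from urllib.parse import urlsplit, parse_qs

# FORM codes to flag; seeded with the one listed above, extend as needed
SUSPECT_FORMS = {"LVSP"}

def looks_like_spoofed_live_referrer(referrer):
    """Return True if the referrer is search.live.com/result.aspx with a
    q= keyword and a FORM code found in SUSPECT_FORMS."""
    parts = urlsplit(referrer)
    if parts.netloc.lower() != "search.live.com":
        return False
    if parts.path.lower() != "/result.aspx":
        return False
    query = parse_qs(parts.query)
    forms = query.get("FORM", [])
    return "q" in query and any(f in SUSPECT_FORMS for f in forms)

print(looks_like_spoofed_live_referrer(
    "http://search.live.com/result.aspx?q=KEYWORD&mrt=en-us&FORM=LVSP"))
```

Since the bot also makes requests with no referrer at all, a filter like this will undercount. Combining it with the IP range gives a fuller picture.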

Now, this isn't a huge problem, but as it seems extremely likely this is automated traffic, it really isn't good for MS to make it appear human, and thus skew website stats. In addition, some of the keywords are adult, and should not be showing up indiscriminately in non-adult sites' logfiles.

I am definitely interested in what the MS justification/explanation of this is.

[edited by: Receptional_Andy at 2:24 pm (utc) on Aug. 20, 2007]

exposure




msg:3426883
 2:44 pm on Aug 20, 2007 (gmt 0)

I've seen this also, on a small scale on a small, mostly personal website. I assumed these were spoofed IPs when I first started seeing the referrers since the referring SERP is surely bogus. I haven't checked logs on any larger sites, but I can share this:

Several referrals with a search query term that is irrelevant to site and landing page. Always generic search term. Things like "cash" and "payday advance" and "airline" and "nokia" are all terms this little website shouldn't and will never rank for.

Two IP ranges:
In August, all hits were from 65.55.165.*
In July, hits were from 131.107.0.*
Both of which are supposedly inside Microsoft Redmond.

And, of course, as reported here, none of the referring pages actually work. Page Not Found messages.

Here are examples:

on 8/18
[search.live.com...]
ip: 65.55.165.100

on 8/1
[search.live.com...]
ip: 131.107.0.95

Receptional Andy




msg:3426901
 2:58 pm on Aug 20, 2007 (gmt 0)

spoofed IPs

That's possible, if the bot doesn't need to retrieve any information. However this bot/whatever it is does request files linked from a particular page like stylesheets and images, so it seems unlikely that the IP is fake, unless it is collecting data from another source/a real IP.

I can confirm the alternative IP range for July (different U/A then too).

Receptional Andy




msg:3426907
 3:03 pm on Aug 20, 2007 (gmt 0)

One other note, I did find an older thread which seems to be about the same bot:

Possible Bot or Spammer? [webmasterworld.com]

I'm not sure the QC explanation will work here, since I can't see how the sites in question could ever appear for the sorts of words in the referrers (since they don't mention any of the words in question at all).

mvandemar




msg:3426990
 4:18 pm on Aug 20, 2007 (gmt 0)

this bot/whatever it is does request files linked from a particular page like stylesheets and images

To add to it, they started hammering my poetry site on Thursday, and they were also downloading my AdSense blocks, completely inflating my stats. MyBlogLog, however, didn't recognize them as visitors, so the discrepancy was immediately obvious.

I blocked the IP range as soon as I saw them, but am concerned that there might at some point be legit MSN traffic from that range. If anyone knows for sure, or if an MS rep could reply to this thread that would be great. :)

confuscius




msg:3427778
 11:46 am on Aug 21, 2007 (gmt 0)

Live Search Technical Support have replied to my second email with the following:

"Thank you for writing back to Live Search Technical Support. This is Marichu and I understand that you would like to know what is form LSVP and the purpose of this Microsoft activity. I realize the importance of this matter.

I have forwarded your concern to the Live Search Product Specialist Team so it may be given due attention. I understand the importance of this issue. Rest assured that we are doing everything within our means to remedy the situation.

We appreciate your continued support as we strive to provide you with the highest quality service available. Thank you for using Live Search."

Looks like a pass the parcel response to me or are they recognising a 'situation'? I will wait for a response but I am not hopeful of receiving one. Time will tell.

Paul

tim222




msg:3430601
 9:14 pm on Aug 23, 2007 (gmt 0)


[search.live.com...]
...
I have scoured Live to try and find form 'LSVP', searched everywhere that I can think of.

First of all, you mixed up some of the letters. Your URL says "LVSP" but you are searching for "LSVP"

Next, LVSP is an acronym for Linux Virtual Server Project

[ntua.gr...]

So I did a search on Netcraft, and found that search.live.com is running Linux.

Is Microsoft running Linux? *gasp* Is that the TRUE conspiracy here? NOT. Actually, they've been using Linux for years.

So my conclusion from all of this is that MSN is experimenting with Linux Virtual Server Project, and well, computers being computers, things aren't going as planned.

mvandemar




msg:3431958
 2:53 pm on Aug 25, 2007 (gmt 0)

So my conclusion from all of this is that MSN is experimenting with Linux Virtual Server Project, and well, computers being computers, things aren't going as planned.

That does nothing to explain the referrer spam for terms that none of us rank for though, or why the bot would be downloading AdSense scripts from Google.

-Michael

jdMorgan




msg:3431988
 3:52 pm on Aug 25, 2007 (gmt 0)

The IP addresses beginning with 131.107 indicate the same exploit discussed in this thread [webmasterworld.com], where someone is using MSN's "tide" proxy servers to make it appear that these requests are coming from Microsoft when in fact they're proxied requests. These can easily be blocked by denying access if the REMOTE_HOST contains "tideNNN.microsoft.com", where NNN is a series of numbers. This is discussed in the previous thread.

The 65.55.165.* address range resolves back to MSN as well, but no PTR records exist for this range so it's not possible to tell if these are also proxy servers at MSN. However, it may still be possible to block these requests by looking for Via or X-Forwarded-For headers on the requests.
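
As a rough illustration of both checks, here is a Python sketch that flags a request when the resolved host name matches the "tide" pattern or when a proxy header is present. It is a sketch under assumptions, not the rule set from the earlier thread: the regular expression is one reading of "tideNNN.microsoft.com", and `looks_proxied` with its arguments is a hypothetical helper that expects the host name and headers to be supplied by whatever server or log parser you use.

```python
import re

# Assumption: "tideNNN.microsoft.com" means "tide", digits, then the domain
TIDE_HOST = re.compile(r"(^|\.)tide\d+\.microsoft\.com$", re.IGNORECASE)

# Headers that proxies commonly add to forwarded requests
PROXY_HEADERS = ("via", "x-forwarded-for")

def looks_proxied(remote_host, headers):
    """Return True if the resolved host name matches the tide pattern or
    the request carries a Via / X-Forwarded-For header.

    remote_host may be None when no reverse lookup result is available;
    headers is a mapping of header name to value.
    """
    if remote_host and TIDE_HOST.search(remote_host):
        return True
    present = {name.lower() for name in headers}
    return any(h in present for h in PROXY_HEADERS)

print(looks_proxied("tide123.microsoft.com", {}))
print(looks_proxied(None, {"Via": "1.1 proxy.example"}))
```

Header names are compared case-insensitively because HTTP header names are case-insensitive. Keep in mind that a proxy is free to omit both headers, so their absence proves nothing.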

Jim

confuscius




msg:3432612
 5:34 pm on Aug 26, 2007 (gmt 0)

I am now seeing lots of the following type using a different form but emanating from the previously identified IP address range:

[search.live.com...]

So a new form and a new set of keywords but the same IP range.

Still no reply from Microsoft - cannot say that I am surprised.

Paul

exposure




msg:3433863
 2:00 am on Aug 28, 2007 (gmt 0)

I see the same now, now it's LIVSOP

chance1376




msg:3434298
 1:25 pm on Aug 28, 2007 (gmt 0)

I am getting the same thing with the terms "online" and then "support", mostly "support". This morning I checked and found cc.msnscache.com/... as a referrer 112 times. The URL leads to a cache of a photo gallery that we removed from the site.

justageek




msg:3434422
 2:52 pm on Aug 28, 2007 (gmt 0)

I'll bet it is from the live family safety beta.

JAG

incrediBILL




msg:3434530
 4:40 pm on Aug 28, 2007 (gmt 0)

Based on my experience I'm guessing it's not an MS test gone awry but a proxy service being abused.

I've seen similar crap happen with Yahoo and Google and it's usually the result of someone running a scraping operation via one of their proxy services, something like a translator, web accelerator, wireless services, etc.

This too looks like a scrape attack to me because the keyword in the query couldn't possibly resolve to the landing page being accessed, at least not based on the current Live SERPs, so I'm thinking the actual QUERY string is what's being faked just to throw people off the trail of what's really happening.

Anyone know of a proxy service Live runs that could be used in such a manner?

Whois for the 65.55.165.* range shows MS but the reverse DNS resolved to something like bl2sch0000000.phx.gbl whatever that is.

The user agent is always:
"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"

Is that the same user agent you all see?

blend27




msg:3434580
 5:55 pm on Aug 28, 2007 (gmt 0)

in my case the user agent is always:
"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"

All requests use method GET. NONE of the requests went past the landing page, and all requests are made to distinct URIs. No images, no CSS, no JavaScript.

33 requests from 25 distinct IPs since Aug 15th, 2007
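
A tally like that can be produced with a short Python sketch along these lines. It assumes Apache's "combined" log format with no escaped quotes inside the quoted fields. The `summarize` helper and the sample lines are hypothetical. Note that this user agent is also what a real MSIE 7 browser on Windows Server 2003 would send, so matching on it alone could count genuine visitors.

```python
import re

# Apache "combined" log format; assumes no escaped quotes inside the fields
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

BOT_AGENT = ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; "
             ".NET CLR 1.1.4322)")

def summarize(log_lines, agent=BOT_AGENT):
    """Return (request_count, distinct_ip_count) for lines with this agent."""
    ips = []
    for line in log_lines:
        match = COMBINED.match(line.strip())
        if match and match.group("agent") == agent:
            ips.append(match.group("ip"))
    return len(ips), len(set(ips))

# Hypothetical sample lines, not taken from a real log
REF = "http://search.live.com/result.aspx?q=KEYWORD&mrt=en-us&FORM=LVSP"
STAMP = "[18/Aug/2007:08:14:00 +0000]"
sample = [
    f'65.55.165.100 - - {STAMP} "GET /a.html HTTP/1.1" 200 512 "{REF}" "{BOT_AGENT}"',
    f'65.55.165.101 - - {STAMP} "GET /b.html HTTP/1.1" 200 512 "-" "{BOT_AGENT}"',
    f'65.55.165.100 - - {STAMP} "GET /c.html HTTP/1.1" 200 512 "{REF}" "{BOT_AGENT}"',
    f'192.0.2.7 - - {STAMP} "GET /a.html HTTP/1.1" 200 512 "-" "SomeOtherBrowser/1.0"',
]
requests, distinct_ips = summarize(sample)
print(f"{requests} requests from {distinct_ips} distinct IPs")
```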

I say zisis a scraypa

chance1376




msg:3434754
 8:03 pm on Aug 28, 2007 (gmt 0)

"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)" <<<yep, that's the one.
