When I load the referred page I am told that there are no results. Also, there is no relationship between the keyword and the page requested. The keywords are single words and seem to be mainly concerned with the normal spam areas.
I have scoured Live to try and find form 'LSVP', and searched everywhere that I can think of.
Can anyone enlighten me as to what the heck form LSVP is? Have the spammers found another flaw? I am based in the UK.
Thanks in advance.
[edited by: engine at 10:30 am (utc) on Aug. 18, 2007]
[edit reason] delinked [/edit]
Yes, it is OK to do quality checks. No problem with that.
However, if it is done by a bot, the bot has to identify as such:
it must have a name, so we can put it into /robots.txt, and a legitimate bot should obey the /robots.txt standard.
Hiding behind a user browser UA is, yes, just lying. And not respecting robots.txt is plainly rude.
Furthermore, there is no need to send an artificially crafted fake REFERER string and randomly attach crappy words to it -- or are you saying that your search index database is that bad, that my sites about none of those crappy words are associated with just exactly these in your search index?
Exactly what quality are you checking -- your own? Well, it surely can't be mine.
Others said that the bot is triggering AdSense ads. So AdSense would see those badly spoofed crappy referer strings, too, wouldn't it? Will this have a negative impact on my AdSense profile and reputation?
So I stand by what I already said before:
This is just rude and nasty and showing poor recognition of basic netiquette. Bad manners.
What do they think? Not much, probably.
Could Msndude please advise further on what is going on - I do not really want to block the spider but see little option at the moment but fear that I may be risking my rankings.
Anyone else seeing increased activity?
Thanks in advance.
It seems to me that they are pulling words off the page or other pages on my site(s) just to find a relevant word to use - but these aren't terms my site ranks for and are basically just a more stealthy way to lie.
And it's an attempt to disguise the issue from people who may not have become aware of this situation yet.
So, I thought, a pleasant Saturday morning of looking over search live for the phrases that point to my site. (Dreams of a big new traffic source dancing in my head.) Nope! Nothing at search live seems to point to my site. Check the logs...what is that?...that makes no sense...?
So here I am too. Hi folks.
MSN - Thanks for a big waste of time. I thought you had more style and skill than that.
Can't understand it...I took MSNDude's advice last year for page SEO and showed top billing in all my KW SERPs...Now gone...bummer.
80% of my pages haven't even been indexed yet with this new Algo.
Now what? I'm testing 20 or so pages with different KW densities etc. Will post my findings if anyone is interested.
If anyone is experiencing the same, please post your comments, opinions or advice. I can't be the only one...
Am I jumping the gun, before they get the bugs out?
Scratching my head,
Please reply. The most important issue, in my mind, is the fact that the bot is processing AdSense scripts. Since you changed the behavior to disguise the activity (by making the fake-referrer strings look more relevant), did you perhaps have the courtesy to also cause it to stop screwing with our AdSense accounts?
I'll see how my tests come out and then wait awhile before making changes. You never know what they're going to do next...change this, change that...GEZZZZ!
Thanks for the advice.
They are coming from some excellent serp rankings for my most important internal pages - the ones that earn the money. I would have noticed that activity. This is new since the weird "quality checks" began.
Just a coincidence? Due to site specific factors? Anyone else seeing anything similar?
Now I found this thread and I'm bummed. Fake traffic? Quality testing?
Figures. I wish indeed they would at least name their user agent as a bot or something and not disguise themselves as users.
And oh. Please don't visit pages listed in the robots.txt file. I got a set up that'll block you automatically for doing that.
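For anyone who wants to check their own logs before setting up an automatic block, here is a minimal Python sketch of the idea: it reports which requests touched paths that robots.txt disallows for all agents. The robots.txt rules, IP addresses and paths are made-up samples, and `disallowed_hits` is a hypothetical helper name, not the poster's actual setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; substitute the contents of your own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /css/
"""


def disallowed_hits(robots_txt, requests):
    """Return the (ip, path) pairs whose path robots.txt disallows for all agents."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [(ip, path) for ip, path in requests if not parser.can_fetch("*", path)]


# Sample (ip, path) pairs as they might be pulled from an access log.
sample_requests = [
    ("65.55.165.10", "/index.html"),
    ("65.55.165.10", "/private/report.html"),
    ("192.0.2.7", "/css/site.css"),
]

offenders = disallowed_hits(ROBOTS_TXT, sample_requests)
for ip, path in offenders:
    print(ip, "requested disallowed path", path)
```

What to do with the offending addresses (block, log, ignore) is a separate decision; the sketch only identifies them.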
I've already blocked three servers from the MSN bots. This morning I find several hundred website logs on servers I rent space on filled with garbage from MSN. The log files I depend on are almost useless. Do I have to spend all next week going through hundreds of robots.txt files to block MSN? This is the sort of behaviour I expect from a third world comment spammer, not a 'respectable' company.
Can someone from MSN please tell us how much longer we have to put up with this?
If only this matter was receiving wider negative coverage then I am sure that we would see some sort of response from MSN.
The traffic you are seeing is part of a quality check we run on selected pages.
I haven't seen any update on what this really is. Could you, please, come back after that long silence, Msndude, to enlighten us about any 'good intention' behind this whole thing that we may be too ignorant to see at our end, and explain to us, what bad quality on your end you are checking by sending fake traffic to swamp and pollute our logs with worthless referrals to let that search.live thingie rise from 0.0x% to some more remarkable 1.x% in our sites' web statistics.
To my surprise, I am now finding some new legitimate traffic to my site from live search, mixed in with the funny stuff.
Yes, I see that too, it looks more targeted, but it is still generated meaningless fake traffic to screw up our logs.
This is the sort of behaviour I expect from a third world comment spammer, not a 'respectable' company.
Now I've just read on other threads that their bot sometimes ignores robots.txt.
If you are running Apache you can block the whole IP range using .htaccess. I have a block of code for matching IPs or user-agents, and I just added the offending block to it when I discovered what it was doing:
# Flag any request coming from 65.55.165.x
SetEnvIfNoCase Remote_Addr "^65\.55\.165\." bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
I think this link is relevant, and no, it's not my site nor do I know the blogger in question. If we can't have the link, here's an excerpt:
Any thoughts on the accuracy of this theory? It seems plausible, because the referrer always contains ONE word, and the word is one which matches the site content on my sites.
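To see how many log entries fit the one-word pattern, something like the following Python sketch could be run over the referrer field. It assumes the referrer is a live.com search URL carrying the search term in a `q` parameter; that URL shape and the sample referrers are assumptions, and `is_suspect_referrer` is a hypothetical helper. Real visitors also search for single words, so treat a match as a hint rather than proof.

```python
from urllib.parse import urlparse, parse_qs


def is_suspect_referrer(referrer):
    """True when the referrer is a live.com search whose query is one bare word."""
    parts = urlparse(referrer)
    host = parts.hostname or ""
    if host != "live.com" and not host.endswith(".live.com"):
        return False
    # parse_qs decodes "+" and "%20" into spaces, so split() counts the words.
    terms = parse_qs(parts.query).get("q", [""])[0].split()
    return len(terms) == 1


# Sample referrers; the exact URL layout is an assumption.
samples = [
    "http://search.live.com/results.aspx?q=poker&FORM=LSVP",
    "http://search.live.com/results.aspx?q=blue+widgets",
    "http://www.example.com/links.html",
]
for referrer in samples:
    print(is_suspect_referrer(referrer), referrer)
```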
I feel that log files and referral info are basically/technically sacred information and should not be faked or otherwise F'ed with.
Microsoft is doing this on purpose! We should all do everything that we can to make the general public aware of this and encourage/force MS to re-consider what they are doing!
[edited by: engine at 11:54 am (utc) on Oct. 17, 2007]
MSN's fake referrer is hitting our site at a rate of 40+ per hour for stupidly generic keywords that are on-topic but way too competitive to rank for.
This whole episode is beginning to stink like a rotten egg, and creating very bad PR.
I've always been an advocate for MSN (strangely enough), but their lack of transparency and honesty is changing this rapidly.
Bad, bad, bad.
[edited by: Chris_H at 10:56 am (utc) on Oct. 17, 2007]
The pseudo UA: "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)" is following each access of msnbot within one minute (usually less). It re-reads the page that msnbot just accessed (still without compression) and then takes all linked .CSS and .JS files (without cache, even if it read them a few seconds earlier) - but absolutely no image files.
It does this even if the page (which has just been read by msnbot) contains "NOINDEX, NOFOLLOW".
It takes the .css and .js stuff despite it being off limits in robots.txt.
It doesn't duplicate every msnbot access, but about half of them: certain areas of the site seem to interest it more than others.
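The pattern described above (the browser-like UA re-reading a page within a minute of msnbot) can be checked for in a log with a short Python sketch like this one. It assumes the log has already been parsed into (epoch seconds, user agent, path) tuples in chronological order; the sample hits and the `shadow_fetches` helper are hypothetical, and the 60-second window simply mirrors the "within one minute" observation.

```python
FAKE_UA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"
WINDOW_SECONDS = 60  # "within one minute" of the msnbot access


def shadow_fetches(hits):
    """Return (path, delay) for each FAKE_UA request that follows an msnbot
    request for the same path by at most WINDOW_SECONDS.

    hits: iterable of (epoch_seconds, user_agent, path) in chronological order.
    """
    last_bot_visit = {}  # path -> time of the most recent msnbot request
    flagged = []
    for when, agent, path in hits:
        if "msnbot" in agent.lower():
            last_bot_visit[path] = when
        elif agent == FAKE_UA:
            seen = last_bot_visit.get(path)
            if seen is not None and when - seen <= WINDOW_SECONDS:
                flagged.append((path, when - seen))
    return flagged


# Made-up sample data for illustration only.
MSNBOT_UA = "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
sample_hits = [
    (1000, MSNBOT_UA, "/widgets.html"),
    (1020, FAKE_UA, "/widgets.html"),  # 20 s after msnbot: flagged
    (1025, FAKE_UA, "/site.css"),      # msnbot never fetched this path
    (2000, FAKE_UA, "/about.html"),    # no earlier msnbot visit
    (5000, MSNBOT_UA, "/about.html"),
]
print(shadow_fetches(sample_hits))
```

Because it matches on the same path only, the sketch will not catch the follow-up .CSS and .JS requests; those would need to be tied back to the page by IP address or timing.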
-- iptables, obviously.