|You may be assured that all WebmasterWorld members claiming to be employees of well-known corporations are checked out thoroughly here. |
Although I am saying it jokingly, it is actually a valid concept that employment situations change. :P
This so-called 'quality check' is getting ridiculous. It's really gone into overdrive today; logs are showing clusters of totally irrelevant searches every few minutes.
Regardless of msndude's advice to not block the IP range, it's now banned. I don't need this c$&p.
Thanks for the explanation, Msndude.
Yes, it is OK to do quality checks. No problem with that.
However, if it is done by a bot, the bot has to identify as such:
it must have a name, so we can put it into /robots.txt, and a legitimate bot should obey the /robots.txt standard.
Hiding behind a user browser UA is, yes, just lying. And not respecting robots.txt is plainly rude.
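To illustrate why the name matters, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `allowed` helper and the sample URL are made up for the example; it only shows how a rule keyed on a bot name never matches a client that presents a browser-style UA.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out one named crawler entirely.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
"""


def allowed(user_agent: str, url: str = "/page.html") -> bool:
    """Return True if ROBOTS_TXT permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A client that announces itself by name is covered by the rule.
print(allowed("msnbot/1.0"))
# A client posing as a browser matches no rule, so nothing excludes it.
print(allowed("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2)"))
```

The only way to write a robots.txt rule for the second client would be to target every browser-like UA, which is exactly why a bot without its own name cannot be handled there.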
Furthermore, there is no need to send an artificially crafted fake REFERER string with crappy words randomly attached to it -- or are you saying that your search index is so bad that my sites, which are about none of those crappy words, are associated with exactly those words in it?
Exactly what quality are you checking -- your own? Well, it surely can't be mine.
Others said that the bot is triggering AdSense ads. So AdSense would see those badly spoofed, crappy referer strings too, wouldn't it? Will this have a negative impact on my AdSense profile and reputation?
So I stand by what I said before:
This is just rude and nasty, and it shows poor recognition of basic netiquette. Bad manners.
What do they think? Not much, probably.
Since the 'new' MSN Live arrived a few days ago I am seeing a substantial increase in the LISVOP spider activity (still all 403ed for now), BUT it looks like the queries are a little less spammy and more related to the site content.
Could Msndude please advise further on what is going on? I do not really want to block the spider, but I see little option at the moment, and I fear that I may be risking my rankings.
Anyone else seeing increased activity?
Thanks in advance.
The query values are more relevant now, but they still aren't legit.
It seems to me that they are pulling words off the page or other pages on my site(s) just to find a relevant word to use - but these aren't terms my site ranks for and are basically just a more stealthy way to lie.
And it's an attempt to disguise the issue from people who may not have become aware of this situation yet.
MSN's efforts at search have never generated enough traffic for me to pay much attention to them. However, this morning they sit at the top of my list of referrers. Never seen that before!
So, I thought, a pleasant Saturday morning of looking over search live for the phrases that point to my site. (Dreams of a big new traffic source dancing in my head.) Nope! Nothing at search live seems to point to my site. Check the logs...what is that?...that makes no sense...?
So here I am too. Hi folks.
MSN - Thanks for a big waste of time. I thought you had more style and skill than that.
Relevancy still sucks pond water in my sector. Probably because I'm not showing in the SERPs any longer... :-) except for my index page...
Can't understand it... I took Msndude's advice last year for page SEO and showed top billing in all my KW SERPs... Now gone... bummer.
80% of my pages haven't even been indexed yet with this new Algo.
Now what? I'm testing 20 or so pages with different KW densities etc. Will post my findings if anyone is interested.
If anyone is experiencing the same, please post your comments, opinions or advice. I can't be the only one...
Am I jumping the gun, before they get the bugs out?
Scratching my head,
Fish Texas - I am absolutely not an expert on search live, but I would advise you not to invest much time looking for useable logic in their serps. There is none, hasn't been anything worth learning there for a long time.
Please reply. The most important issue, in my mind, is the fact that the bot is processing AdSense scripts. Since you changed the behavior to disguise the activity (by making the fake-referrer strings look more relevant), did you perhaps have the courtesy to also cause it to stop screwing with our AdSense accounts?
Skier, I totally agree. It's been that way since they released their own Algo last year.
However, MSN was sending half my traffic, so I gotta get it right.
I'll see how my tests come out and then wait awhile before making changes. You never know what they're going to do next...change this, change that...GEZZZZ!
Thanks for the advice.
That’s it! I just cannot sit back and do nothing anymore. As of today I have blocked Microsoft traffic from the IP range 65.55.165.* on 200 commercial sites.
They have been banging my sites with useless referrals for over 3 weeks now. Funny thing is that some people on other forums think that MSN is sending them traffic.
|some people on other forums think that MSN is sending them traffic. |
That's one way to make it look like someone is using their SE.
To my surprise, I am now finding some new legitimate traffic to my site from live search, mixed in with the funny stuff.
They are coming from some excellent serp rankings for my most important internal pages - the ones that earn the money. I would have noticed that activity. This is new since the weird "quality checks" began.
Just a coincidence? Due to site specific factors? Anyone else seeing anything similar?
I am getting severely hacked off with this. My logs are the most important tool I've got and they are being wrecked. Another three servers full of sites are about to be blocked from the MS spider; life's too short to put up with this constantly.
I was happy in the past week or so to notice that I was finally getting some MSN traffic. I looked at the referer, which in my case contained keywords relevant to my sites, so I was happy to think I was actually coming up in their SERPs for high-traffic keywords. But I could not find the site in question in the results.
Now I found this thread and I'm bummed. Fake traffic? Quality testing?
Figures. I wish indeed they would at least name their user agent as a bot or something and not disguise themselves as users.
And oh. Please don't visit pages listed in the robots.txt file. I've got a setup that'll block you automatically for doing that.
How much longer is this going to go on for?
I've already blocked three servers from the MSN bots. This morning I find several hundred website logs on servers I rent space on filled with garbage from MSN. The log files I depend on are almost useless. Do I have to spend all next week going through hundreds of robots.txt files to block MSN? This is the sort of behaviour I expect from a third world comment spammer, not a 'respectable' company.
Can someone from MSN please tell us how much longer we have to put up with this?
The longer MSN fails to respond positively to this issue, the more I think that it may have nothing to do with quality issues and more to do with messing with AdWords/AdSense impression and click rates - anyone fancy telling Google (Matt Cutts?) that this MSN spambot is messing with its figures?
If only this matter was receiving wider negative coverage then I am sure that we would see some sort of response from MSN.
Msndude said on 2007-09-05, more than 1 month ago:
|The traffic you are seeing is part of a quality check we run on selected pages. |
I haven't seen any update on what this really is. Msndude, could you please come back after that long silence and enlighten us about any 'good intention' behind this whole thing that we may be too ignorant to see at our end? And could you explain to us what bad quality on your end you are checking by sending fake traffic that swamps and pollutes our logs with worthless referrals, letting that search.live thingie rise from 0.0x% to a more remarkable 1.x% in our sites' web statistics?
|To my surprise, I am now finding some new legitimate traffic to my site from live search, mixed in with the funny stuff. |
Yes, I see that too. It looks more targeted, but it is still generated, meaningless fake traffic that screws up our logs.
|This is the sort of behaviour I expect from a third world comment spammer, not a 'respectable' company. |
I don't think it is fair to associate a harmless "third world comment spammer" with that kind of arrogant rudeness. ;-)
One of my female staff complained this afternoon about a particularly offensive search term in the logs of one of our sites - that term about increasing a certain size. I gave instructions for several hundred robots.txt files to be altered over the next weekend to block the spider. Now I've just read on other threads that their bot sometimes ignores robots.txt. Quality control? What has that term got to do with a mortgage site? I cannot express my disgust.
|Now I've just read on other threads that their bot sometimes ignores robots.txt. |
If you are running Apache you can block the whole IP range using .htaccess. I have a block of code for matching IPs or user-agents, and I just added the offending block to it when I discovered what it was doing:
# Flag requests from 65.55.165.* (anchored so e.g. 165.55.165.x does not match)
SetEnvIfNoCase Remote_Addr "^65\.55\.165\." bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
Some of the sites are on linux servers, some on Windoze. I've already blocked the sites on my own servers using iptables - one tiny change to a script and 300 sites blocked. The problem is that I've got several hundred sites on other people's servers so I'm stuck with individually altering either robots.txt or .htaccess files for every one of them - that is one whole lot of work. Not happy.
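For several hundred sites where only .htaccess is available, a small script could append the block to each file instead of editing them by hand. A minimal sketch in Python, assuming every site sits in its own directory under one common root. The `sites_root` layout, the marker comment and the function names are assumptions for illustration, and the rules assume Apache 2.2-style `Order`/`Allow`/`Deny` access control.

```python
from pathlib import Path

# Marker comment (hypothetical) so a second run does not add the rules twice.
MARKER = "# block 65.55.165.* (fake-referrer traffic)"

RULES = "\n".join([
    MARKER,
    'SetEnvIfNoCase Remote_Addr "^65\\.55\\.165\\." bad_bot',
    "<Limit GET POST HEAD>",
    "Order Allow,Deny",
    "Allow from all",
    "Deny from env=bad_bot",
    "</Limit>",
]) + "\n"


def add_block(htaccess: Path) -> bool:
    """Append RULES to one .htaccess file; return False if already present."""
    existing = htaccess.read_text() if htaccess.exists() else ""
    if MARKER in existing:
        return False
    # Make sure the rules start on a fresh line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    htaccess.write_text(existing + separator + RULES)
    return True


def add_block_everywhere(sites_root: Path) -> int:
    """Run add_block on <sites_root>/<site>/.htaccess; return files changed."""
    changed = 0
    for site_dir in sorted(p for p in sites_root.iterdir() if p.is_dir()):
        if add_block(site_dir / ".htaccess"):
            changed += 1
    return changed
```

Calling `add_block_everywhere(Path("/path/to/sites"))` (path hypothetical) creates a .htaccess where none exists and leaves files that already contain the marker untouched. Worth trying on a copy first.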
I gave it a few days to settle down, then had a look again this morning.
-Live search is still distorting my stats. Hate that!
-Most of the traffic from LS is still fraudulent. Speaks volumes about the ethics and values at M$.
-Looking over the SERPs at live search I still see nothing but crap.
Here's a blogger who thinks that this may be due to an errant client-side script used on search.live.com, which is including only one keyword from the original search in the referrer string:
I think this link is relevant, and no, it's not my site nor do I know the blogger in question. If we can't have the link, here's an excerpt:
Any thoughts on the accuracy of this theory? It seems plausible, because the referrer always contains ONE word, and the word is one which matches the site content on my sites.
Inactivist, your blogger does not know what he is talking about. MSN admitted that they are doing it on purpose!
My opinion is simple, as I've stated here in WebmasterWorld for two months also, that Microsoft is intentionally screwing up our log files for their own gain.
I feel that log files and referral info are basically/technically sacred information and should not be faked or otherwise F'ed with.
Microsoft is doing this on purpose! We should all do everything that we can to make the general public aware of this and encourage/force MS to re-consider what they are doing!
[edited by: engine at 11:54 am (utc) on Oct. 17, 2007]
Producing weekly reports for our company's management is becoming almost impossible for us with this ridiculous MSN experiment.
MSN's fake referrer is hitting our site at a rate of 40+ per hour for stupidly generic keywords that are on-topic but way too competitive to rank for.
This whole episode is beginning to stink like a rotten egg, and creating very bad PR.
I've always been an advocate for MSN (strangely enough), but their lack of transparency and honesty is changing this rapidly.
Bad, bad, bad.
[edited by: Chris_H at 10:56 am (utc) on Oct. 17, 2007]
Agreed - I want MSN/Live to become stronger too - competition for the others is a good thing and who knows, maybe their product will someday even be better than the current leader... but this is the wrong way to go about getting people on their side.
Just saw the same thing in my logs today. It's really annoying.
Forgive me if this has been mentioned already:
The pseudo UA: "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)" is following each access of msnbot within one minute (usually less). It re-reads the page that msnbot just accessed (still without compression) and then takes all linked .CSS and .JS files (without cache, even if it read them a few seconds earlier) - but absolutely no image files.
It does this even if the page (which has just been read by msnbot) contains "NOINDEX, NOFOLLOW".
It takes the .css and .js stuff despite it being off limits in robots.txt.
It doesn't duplicate every msnbot access, but about half of them: certain areas of the site seem to interest it more than others.
-- iptables, obviously.
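For anyone who wants to check their own logs for the same pattern, the follow-up behaviour described above can be sketched roughly like this in Python. It assumes the log has already been parsed into (time, path, user agent) records. The `Hit` type, the `shadow_hits` name and the one-minute default are illustrative choices, and any UA containing "msnbot" is treated as the real crawler.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Hit(NamedTuple):
    when: datetime
    path: str
    user_agent: str


# The browser-style UA quoted in the post above.
FAKE_UA = ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; "
           ".NET CLR 1.1.4322)")


def shadow_hits(hits, window=timedelta(minutes=1)):
    """Return FAKE_UA hits that re-read a page msnbot fetched within `window`."""
    last_bot_visit = {}  # path -> time of the most recent msnbot request
    flagged = []
    for hit in sorted(hits, key=lambda h: h.when):
        if "msnbot" in hit.user_agent.lower():
            last_bot_visit[hit.path] = hit.when
        elif hit.user_agent == FAKE_UA:
            seen = last_bot_visit.get(hit.path)
            if seen is not None and hit.when - seen <= window:
                flagged.append(hit)
    return flagged
```

This only pairs page requests. The follow-up fetches of .css and .js files would need a separate check, since msnbot itself did not request those paths.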
This looks like old fashioned referrer spam to me.
They are hitting my sites mostly with keywords that appear on the site.
(With the exception of the porn keywords that are bad enough that I won't list them here.)
Sending porn referrers to Web sites on a massive scale cannot seriously be considered a "quality check", unless we are talking about the low quality of the concept.
My thought is that Microsoft wants webmasters to click through on the referrer links which will boost live.com's traffic. Microsoft might be thinking, "if we send them keywords that they are interested in, maybe they will start using Live.com Search".