Forum Moderators: DixonJones
We normally get around 1000 - 1700 page views a day on one of our web sites - imagine my shock on looking at the stats this morning to find that yesterday we had 11,000-plus page views in one day!
9700 of these were attributed to:
Microsoft URL Control - 6.00.8169
I've seen some other posts on this topic by searching the forum - but nothing telling me what I should do about it. Is it a good thing? Should I let it keep crawling my site like this - or should I set up a robots.txt exclusion? What entry would I need to put in the robots.txt (if indeed I should be blocking this bot/site)?
Any thoughts would be gratefully received.
Best wishes
Richard Thomas
So I think this might be a case where banning the bot is fully justified. Where is wilderness when you need him? ;)
I think I've got the right lines - I just save this as robots.txt in the root dir?
User-agent: Microsoft URL Control - 6.00.8169
Disallow: /
All other bots will just ignore this and index the rest ok?
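One rough way to sanity-check that question offline is to feed the two lines above to Python's standard urllib.robotparser and ask it what each agent may fetch. This is only a sketch: the path /products.html and the Googlebot agent string are made up for illustration, and it shows how a parser that honours robots.txt reads the rule, not what the URL Control client will actually do.

```python
from urllib.robotparser import RobotFileParser

# The rule exactly as posted above.
ROBOTS_TXT = """\
User-agent: Microsoft URL Control - 6.00.8169
Disallow: /
"""

def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a compliant parser would let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    # /products.html is a hypothetical path, not one from the site in question.
    for agent in ("Googlebot/2.1", "Microsoft URL Control - 6.00.8169"):
        print(agent, "->", allowed(agent, "/products.html"))
```

A well-behaved crawler that is not named in the file falls through to "allowed", so the rest of the site stays indexable; the rule only binds a client that bothers to read it.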
If it turns out that the URL Control bot ignores the robots.txt, what can I do to stop it? I've had yet another day of 10,000 page views. It appears that one page in particular is being targeted: a product listing page with ten products on it and no e-mail addresses to be found. All other pages appear to be seeing a normal level of activity.
Cheers
Richard
I highly doubt that this code supports robots.txt by default, however - I believe (but could be wrong) it's just a simple library which fetches a URL as instructed - extra code would be necessary to support robots.txt. If they are spammers then they would not care, or even know what to do.
Oi, in fact - check the logs: if this control did not even TRY to request robots.txt (it will show as a 404 error if the file is not present), then they don't support it!
G and Slurp have already been through with several hundred hits apiece, and I've never bothered with robots.txt before, so I'm assuming they're mainly the ones that created the 404s.
So I guess the Microsoft URL Control doesn't bother with robots.txt. If that's the case, how do I stop this bot from hammering me? It's completely screwing up my analysis, and sales from this site have dropped off completely in the last 48 hours. I'm wondering whether the bot is slowing page load times, since the server is busy dealing with the bot's requests rather than those of "real" people.
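The log check suggested above could be scripted along these lines. It is a minimal sketch that assumes the server writes Apache-style combined log lines (request, status, size, referer, user-agent, in that order); the function name, the regular expression and the access.log filename are my own assumptions, so adjust them to whatever your host actually produces.

```python
import re

# Assumed layout: ... "METHOD /path PROTO" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"'
)

def robots_requests(log_lines, agent_substring):
    """List (status, user_agent) for robots.txt requests made by matching agents."""
    wanted = agent_substring.lower()
    hits = []
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # line is not in the assumed format; skip it
        path, status, agent = match.groups()
        if path.split("?")[0] == "/robots.txt" and wanted in agent.lower():
            hits.append((int(status), agent))
    return hits

if __name__ == "__main__":
    # "access.log" is a placeholder filename.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        found = robots_requests(log, "Microsoft URL Control")
    print(len(found), "robots.txt request(s) from that agent")
```

An empty result for "Microsoft URL Control" alongside non-empty results for the search engine spiders would point to the control never asking for robots.txt at all.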
Arrrgh
You also can't block 'rogue' robots using robots.txt - by definition they will ignore it.
You should block "Microsoft URL Control" using .htaccess on Apache or (I think) browsecap.ini on Windoze. Plenty of examples in WW.