It grabbed many pages at 3-second intervals,
all without requesting robots.txt or any images.
I couldn't care less whether it's a valid MSN bot or not.
MSN has far too many bots spidering already.
This one doesn't crawl in an accepted way and ignores the standards MSN's own bots normally comply with.
FWIW, after I saw my first fake msnbot, I restricted the UA to MS-only hosts:
RewriteCond %{HTTP_USER_AGENT} ^msnbot
RewriteCond %{REMOTE_HOST} !^[^.]+\.search\.msn\.com$
RewriteCond %{REMOTE_HOST} !^[^.]+\.msn\.com$
RewriteCond %{REMOTE_HOST} !^[^.]+\.phx\.gbl$
# also exempt the hotmail range: 64.4.0.0 - 64.4.63.255
RewriteCond %{REMOTE_ADDR} !^64\.4\.([0-9]|[1-5][0-9]|6[0-3])\.
RewriteRule ^.*$ - [F]
Notes:
The preceding code works for me, but my code can always be streamlined and/or written more properly/efficiently :)
If copy-pasting, be sure there's a space before each exclamation mark -- this forum's programming strips them out.
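If you'd rather sanity-check the patterns before putting them on a live server, here's a rough Python sketch of the same decision. It is not what Apache runs, just my reading of the rules: the function name should_block is made up for illustration, and I'm assuming the 64.4 line is meant to exempt the hotmail range (written here as 64.4.0.0/18, which covers 64.4.0.0 - 64.4.63.255). The example hostnames and IPs in the usage lines are invented test values.

```python
import ipaddress
import re

# Hostname patterns copied from the RewriteCond lines in the post.
ALLOWED_HOSTS = [
    re.compile(r"^[^.]+\.search\.msn\.com$"),
    re.compile(r"^[^.]+\.msn\.com$"),
    re.compile(r"^[^.]+\.phx\.gbl$"),
]

# Assumption: 64.4.0.0 - 64.4.63.255 is treated as an allowed range.
HOTMAIL_NET = ipaddress.ip_network("64.4.0.0/18")


def should_block(user_agent, remote_host, remote_addr):
    """Return True if a request would be refused (403) under these rules."""
    # Only requests whose UA starts with "msnbot" are checked at all.
    if not user_agent.startswith("msnbot"):
        return False
    # A reverse-DNS name matching any allowed pattern passes.
    if any(p.search(remote_host) for p in ALLOWED_HOSTS):
        return False
    # So does an address inside the hotmail range.
    if ipaddress.ip_address(remote_addr) in HOTMAIL_NET:
        return False
    # Claims to be msnbot but comes from somewhere else: block it.
    return True
```

For example, should_block("msnbot/1.0", "fake.example.com", "203.0.113.7") returns True, while the same UA from "crawler1.search.msn.com" returns False. Note that each pattern allows exactly one label in front of the domain, so a host like a.b.msn.com is still blocked.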