
Forum Moderators: Ocean10000 & incrediBILL


WASALive

     
10:45 pm on Oct 8, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Hit two different sites and never requested robots.txt. Note space before middle semi-colon:

dev.wasalive.com
Mozilla/5.0 (compatible; WASALive Bot ; http://blog.wasalive.com/wasalive-bots/)

robots.txt? NO

dev.wasalive.com = 94.23.239.127 = OVH France

Apparently they have different bots/UAs for different purposes. Have only seen this one. Anyone else?
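For anyone who wants to run the same "robots.txt? NO" check against their own logs, here is a minimal sketch. It assumes Apache "combined" format access logs; the function name `robots_report` and the default `ua_fragment` are made up for illustration and are not anything the bot operator publishes.

```python
import re

# Apache "combined" log format (an assumption about the logs being checked).
LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)

def robots_report(lines, ua_fragment="WASALive"):
    """For each client whose UA contains ua_fragment, record whether it
    ever requested /robots.txt (True) or never did (False)."""
    report = {}
    for line in lines:
        m = LINE.match(line)
        if not m or ua_fragment not in m.group("ua"):
            continue  # unparseable line, or a different visitor
        host = m.group("host")
        # Ignore any query string when comparing the requested path.
        asked = m.group("path").split("?", 1)[0] == "/robots.txt"
        report[host] = report.get(host, False) or asked
    return report
```

Feed it an open log file, e.g. `robots_report(open("access.log"))`; any host mapped to `False` fetched pages without ever asking for robots.txt.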
2:26 am on Oct 9, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 (Top Contributor of All Time; Top Contributor of the Month)



"Note space before middle semi-colon:"

I'd always assumed in a vague sort of way that spurious space = useless robot. Shove in a BrowserMatch looking for space followed by [;:,)] et cetera and you can forget about it. But after applying some brute force and a Regular Expression* I've had to conclude that 'tain't necessarily so.**

SV1) ;
Configuration/CLDC-1.1 )
U; ;

all appear to be legitimate. (The third one shows up in some rare ex-Soviet-bloc UAs, but seems to be human.)

On the other hand, most of the rest are no-brainers:

"GeoHasher/Nutch-1.0 (GeoHasher Web Search Engine; geohasher.gotdns.org; geo_hasher at yahoo * com)"
(This only turned up because there was no reason to exclude asterisk from the search)

"Mozilla/5.0 (compatible; spbot/3.0; +http://www.seoprofiler.com/bot )"
(Really, I don't think we need the extra space to give us any information here!)

"^Mozilla/4.0 \\(compatible; MSIE 8.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727\\)$"
(I'm not kidding. That's from raw logs, not from an .htaccess file. Maybe they pasted it in from someone else's htaccess. Or even their own.)

"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts; .NET CLR 1.1.4322; &id;)"
(As above: didn't exclude &. What is &id; anyway? It's not an HTML entity. No! BAD smiley! Get out of there!)

"Lotus-Notes/4.5 ( Windows-NT )"
(Really? You think it might be a robot?)

Phooey. Haha. Another good idea down the drain.
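A rough Python sketch of the same "space followed by punctuation" search, for anyone who wants to repeat the experiment. Python's `re` has no `\p{Punct}` or `&&` class intersection, so this builds the character class by hand from ASCII punctuation. The exempt set (hyphen, slash, period, opening brace, parenthesis and bracket) is my reading of the footnoted pattern and may be incomplete; `has_spurious_space` is a hypothetical helper name.

```python
import re
import string

# Characters allowed to follow a space without raising suspicion.
# Assumption: this mirrors the exclusions in the footnoted pattern.
EXEMPT = set("-/.{([")

# Every other ASCII punctuation mark counts as a suspect.
SUSPECT = "".join(c for c in string.punctuation if c not in EXEMPT)

# A literal space followed by one suspect punctuation character.
SPACE_BEFORE_PUNCT = re.compile(" [" + re.escape(SUSPECT) + "]")

def has_spurious_space(user_agent: str) -> bool:
    """True if the UA contains a space directly before suspect punctuation."""
    return SPACE_BEFORE_PUNCT.search(user_agent) is not None
```

As the examples in this post show, a hit is only a hint: the pattern flags the WASALive and spbot strings, but it flags some apparently legitimate UAs too, so it is better suited to picking out log lines for a closer look than to blocking outright.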


* [\p{Punct}&&[^-/.{(\[quote]] (with leading space) applied to raw log files.
** Like those scientific surveys where they investigate something everyone already knows. Huge waste of money if it turns out everyone was right all along-- but infuriating when it turns out that "common knowledge" is wrong.
5:19 am on Oct 9, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Erm... Seen any WASAlive bots, lucy? :)
6:33 pm on Nov 16, 2011 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Just noting another Hostname:

bot.45.wasalive.com
Mozilla/5.0 (compatible; WASALive Bot ; http://blog.wasalive.com/wasalive-bots/)

robots.txt? NO

bot.45.wasalive.com = 94.23.251.171 = OVH France
94.23.192.0 - 94.23.255.255 = 94.23.192.0/18 (part of 94.23.0.0/16)
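
The range arithmetic can be double-checked with Python's standard `ipaddress` module. This is only a sketch; the variable names are arbitrary. The listed start and end addresses collapse to a single /18, which sits inside 94.23.0.0/16, and both WASALive addresses seen so far fall within it.

```python
import ipaddress

first = ipaddress.ip_address("94.23.192.0")
last = ipaddress.ip_address("94.23.255.255")

# Collapse the start-end range into the smallest set of CIDR blocks.
blocks = list(ipaddress.summarize_address_range(first, last))

# The two bot addresses reported in this thread.
seen = ["94.23.239.127", "94.23.251.171"]

def in_blocks(ip: str) -> bool:
    """True if ip falls inside any of the summarized blocks."""
    return any(ipaddress.ip_address(ip) in block for block in blocks)

for block in blocks:
    print(block)
for ip in seen:
    print(ip, in_blocks(ip))
```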
 
