Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
chlooe
SEO service
keyplyr




msg:4488080
 6:50 pm on Aug 24, 2012 (gmt 0)

UA: bot-pge.chlooe.com/1.0.0 (+http://www.chlooe.com/)
Referrer: no
Robots.txt: no

Dedibox, FR
88.190.40.0 - 88.190.40.255
88.190.40.0/24

Blocked by range. Hit index page 4 times in one second.

Other Dedibox ranges I have blocked are:

88.190.16.0 - 88.190.16.255
88.190.16.0/24

88.190.18.0 - 88.190.18.255
88.190.18.0/24

88.190.244.0 - 88.190.244.255
88.190.244.0/24

88.191.131.0 - 88.191.131.255
88.191.131.0/24
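For anyone keeping a block list in both start-end and CIDR form, the two notations can be cross-checked mechanically. The following is only a minimal sketch using Python's standard ipaddress module; the helper names range_to_cidrs and is_blocked are hypothetical, and the ranges are simply the ones listed above.

```python
import ipaddress

def range_to_cidrs(first, last):
    """Return the CIDR blocks that exactly cover first..last (inclusive)."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]

# Ranges from the post above, in start-end form.
BLOCKED_RANGES = [
    ("88.190.40.0", "88.190.40.255"),
    ("88.190.16.0", "88.190.16.255"),
    ("88.190.18.0", "88.190.18.255"),
    ("88.190.244.0", "88.190.244.255"),
    ("88.191.131.0", "88.191.131.255"),
]

# Flatten every range into network objects for membership tests.
BLOCKED_NETS = [
    ipaddress.ip_network(cidr)
    for first, last in BLOCKED_RANGES
    for cidr in range_to_cidrs(first, last)
]

def is_blocked(ip):
    """True if ip falls inside any of the blocked networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETS)

print(range_to_cidrs("88.190.40.0", "88.190.40.255"))
print(is_blocked("88.190.40.44"))
```

Each start-end pair above should collapse to a single /24, which matches the CIDR line given under it.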

 

keyplyr




msg:4488135
 10:23 pm on Aug 24, 2012 (gmt 0)

I did find a request for robots.txt _long_ after the other GETs:

88.190.40.44 - - [23/Aug/2012:13:18:25 -0700] "GET example.com/robots.txt HTTP/1.1" 200 314 "-" "bot-pge.chlooe.com/1.0.0 (+http://www.chlooe.com/)"



Also, riding shotgun from same IP:

88.190.40.44 - - [23/Aug/2012:13:18:25 -0700] "GET example.com/ HTTP/1.1" 403 1060 "-" "SEOstats 2.1.0 https://github.com/eyecatchup/SEOstats"

Note: the forum software removes blank spaces, but there are 10 spaces between "SEOstats 2.1.0 ...and... https://github.com/eyecatchup/SEOstats"
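Spotting a second user agent "riding shotgun" on the same IP can be automated. This is a rough sketch, assuming log lines in the combined format shown above; the regular expression and the function name agents_by_ip are illustrative assumptions, not anything the server or forum provides.

```python
import re
from collections import defaultdict

# Assumed layout: ip ident user [time] "request" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# The two log lines quoted in the post above.
SAMPLE = [
    '88.190.40.44 - - [23/Aug/2012:13:18:25 -0700] '
    '"GET example.com/robots.txt HTTP/1.1" 200 314 "-" '
    '"bot-pge.chlooe.com/1.0.0 (+http://www.chlooe.com/)"',
    '88.190.40.44 - - [23/Aug/2012:13:18:25 -0700] '
    '"GET example.com/ HTTP/1.1" 403 1060 "-" '
    '"SEOstats 2.1.0 https://github.com/eyecatchup/SEOstats"',
]

def agents_by_ip(lines):
    """Map each client IP to the set of user-agent strings it sent."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:  # silently skip lines that do not fit the assumed layout
            seen[match.group("ip")].add(match.group("agent"))
    return seen

for ip, agents in agents_by_ip(SAMPLE).items():
    if len(agents) > 1:
        print(ip, "sent", len(agents), "different user agents")
```

Because the user agent is captured verbatim between the quotes, any run of spaces in the raw log survives, unlike in the forum display.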



lucy24




msg:4488216
 8:09 am on Aug 25, 2012 (gmt 0)

the forum software removes blank spaces

The remedy is the same as in html. Use non-breaking spaces.

Incidentally, I thought I could block UA's with multi-spaces as a guaranteed Mark of the Robot. I promptly ran into a human with two spaces. Drat. There goes another bright idea.

88.190, what is that? Proxad? I always assume a RIPE visitor is Ukrainian until I see evidence to the contrary.

keyplyr




msg:4488221
 9:01 am on Aug 25, 2012 (gmt 0)


The remedy is the same as in html. Use � � � � � non-breaking � � � � � spaces.


FYI - Your example displays as spaces in IE9, Chrome 21.0.1180.83 and Safari 5.1.7 but not my copy of Firefox 14.0.1... they display as unknown icons (little question marks inside of a black diamond.)

But thanks for the suggestion. I'll do that next time and bear with Firefox.



lucy24




msg:4488331
 8:49 pm on Aug 25, 2012 (gmt 0)

<begin topic drift>

Ooh, what fun. On my screen, the Unicode Replacement Character (the black diamond) turns into ï¿½ (i-umlaut, inverted question mark, one-half) because my browser has already decided the page is in Latin-1, and it can't change its mind halfway.

It means that your copy of FF defaults to UTF-8 encoding while your other browsers default to Latin-1. The difference only becomes noticeable when someone enters a non-ASCII character.

:: detour to confirm ::

Ay-yup. The browser doesn't know what to do with A0 alone-- one-byte characters stop at 7F-- so it shows each one as EF BF BD-- which unpack in the other direction to your three Latin-1 characters.
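The byte-level round trip described here can be reproduced in a few lines. A small sketch, assuming the non-breaking space arrived as the single Latin-1 byte A0 and was then read as UTF-8; the variable names are only for illustration.

```python
# Non-breaking space as a single Latin-1 byte.
nbsp_latin1 = b"\xa0"

# Read as UTF-8, a lone 0xA0 is a continuation byte with no lead byte,
# so a lenient decoder substitutes U+FFFD, the replacement character.
as_utf8 = nbsp_latin1.decode("utf-8", errors="replace")

# U+FFFD itself is stored in UTF-8 as the three bytes EF BF BD.
replacement_bytes = as_utf8.encode("utf-8")

# Viewed as Latin-1, those three bytes become three separate characters.
mojibake = replacement_bytes.decode("latin-1")

print(hex(ord(as_utf8)))        # 0xfffd
print(replacement_bytes.hex())  # efbfbd
print(mojibake)                 # ï¿½
```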

And that's why so many people insist on using entities for anything outside the vanilla ASCII range. A simple <charset> declaration will...

n/m. Been there already.

</end topic drift>

... sort of, because every time I see "chlooe dot com" I wonder if the site owners really wanted to say Chloe with dieresis-- which, ahem, I won't attempt here, see above-- but something got garbled in transit.


But Seriously:
Is that a French server farm, or is someone running a robot from the middle of a "real" ISP?


Footnote: And then, for reasons I won't try to go into, my Latin-1 preview turned into UTF-8 as soon as I posted for real. I believe this has happened before. It confuses me horribly. Especially the part where it toggled back in the other direction when I added this paragraph.

keyplyr




msg:4488335
 9:17 pm on Aug 25, 2012 (gmt 0)


Is that a French server farm, or is someone running a robot from the middle of a "real" ISP?


My original thinking was that Dedibox was the web site hosting (server farm) subsidiary inside the Proxad ISP company. I've had all known Dedibox ranges blocked for a long time; however, I could be completely wrong and be blocking French users. It would be nice to know for sure.

dstiles




msg:4488336
 9:26 pm on Aug 25, 2012 (gmt 0)

I have the full FR range 88.190.0.0/15 blocked for dedibox. 88.160/11 is assigned to Proxad (France) but mostly broadband (it says).
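How these blocks nest can be checked with Python's standard ipaddress module. This is only a sketch: the shorthand 88.160/11 is written out as 88.160.0.0/11 here because ip_network expects all four octets, and the variable names are hypothetical.

```python
import ipaddress

dedibox = ipaddress.ip_network("88.190.0.0/15")  # the full range blocked above
proxad = ipaddress.ip_network("88.160.0.0/11")   # "88.160/11" written in full

# The individual /24 blocks listed earlier in the thread.
small_blocks = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "88.190.40.0/24",
        "88.190.16.0/24",
        "88.190.18.0/24",
        "88.190.244.0/24",
        "88.191.131.0/24",
    )
]

# First and last address covered by the /15.
print(dedibox[0], "-", dedibox[-1])

# Every /24 is already covered by the /15 ...
print(all(net.subnet_of(dedibox) for net in small_blocks))

# ... and the /15 itself sits inside the larger /11 allocation.
print(dedibox.subnet_of(proxad))
```

So blocking the /15 makes the separate /24 entries redundant, while blocking the whole /11 would also catch the broadband addresses mentioned above.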

Leosghost




msg:4488337
 9:40 pm on Aug 25, 2012 (gmt 0)

Dedibox is the server farm run by "online", who are a subsidiary of Iliad. "Proxad" does indeed cover some Iliad broadband customers. I'm one of them :)

wilderness




msg:4488340
 10:27 pm on Aug 25, 2012 (gmt 0)

FWIW the non-breaking space character in windoze is ALT-0160

lucy24




msg:4488409
 10:22 am on Aug 26, 2012 (gmt 0)

Now set your browser's encoding manually to UTF-8 and watch the dramatic effect :)

For me it's opt-space. Except in phpBB forums, where the literal character collapses just like a plain space, so you have to use the decimal entity &#160; (At least in phpBB2. It may have changed in 3.)

wilderness




msg:4488434
 1:39 pm on Aug 26, 2012 (gmt 0)

Thanks.

Now set your browser's encoding manually to UTF-8


It is increasingly common for multilingual websites and websites in non-Western languages to use UTF-8,


Makes perfect sense to me (NO).
I rarely visit non-North American websites.
I have most every Region of the world that is non-North American denied access to my websites.

