Forum Library : Charter : Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 42 message thread spans 2 pages: 1 [2]
UNTRUSTED in Nokia User Agent
lucy24




msg:4440897
 6:35 pm on Apr 14, 2012 (gmt 0)

Nokia6300/2.0 (06.01) Profile/MIDP-2.0 Configuration/CLDC-1.1 nokia6300/UC Browser8.0.3.107/69/444 UNTRUSTED/1.0

Took the words right out of my mouth.

The "MIDP-2.0" element has apparently been around for a while-- it goes with mobiles-- but you have to give ChinaCache (65.255.37.nn) points for honesty. Can we look forward to a long line of UNTRUSTED versions?

 

keyplyr




msg:4539555
 9:04 pm on Jan 26, 2013 (gmt 0)


I have blocked all UAs containing "nutch" for well over 10 years without any adverse effect.

This generic bot can be used by any unaccountable agent for any unknown purpose, and the accountable ones should customize and rename it so their bot UA reflects that they are on the level, IMO.
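
For illustration only, the substring rule described above could be sketched in Python as follows. The function name `is_blocked_ua`, the `needles` parameter and the sample UA string are assumptions made up for this sketch; the post does not say how the block is actually implemented (on most servers it would more likely be a server config rule).

```python
def is_blocked_ua(user_agent: str, needles=("nutch",)) -> bool:
    """Return True if the User-Agent contains any blocked substring.

    The comparison is case-insensitive, so "Nutch", "NUTCH" and "nutch"
    are all caught. `needles` is a hypothetical parameter name.
    """
    ua = user_agent.lower()
    return any(needle.lower() in ua for needle in needles)


# Made-up UA string, used only to exercise the check.
print(is_blocked_ua("ExampleCrawler/Nutch-1.0 (hypothetical)"))
```

A plain substring match is deliberately blunt: it also catches renamed bots that leave "nutch" somewhere in the string, which seems to be the intent of the rule.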

blend27




msg:4540988
 5:14 pm on Jan 31, 2013 (gmt 0)

Note to self... if I ever write a scraper, name it after something universally popular, like a record-breaking Ferrari.

Scrapers are one thing but when RDNS points to that...
67.18.54.176 (ferrari.websitewelcome.com)

And as always: AS21844 67.18.0.0/15 ThePlanet.com Internet Services, Inc.

lucy24




msg:4542833
 6:33 am on Feb 6, 2013 (gmt 0)

Verbatim:

Mozilla/5.0 (Windows;) NimbleCrawler 1.12 obeys UserAgent NimbleCrawler For problems contact: crawler@health

Mmmwell... For a given definition of "nimble", anyway ;)

lucy24




msg:4544027
 2:38 am on Feb 9, 2013 (gmt 0)

Verbatim-- or rather, litteratim-- again:

204.236.138.148 - - [08/Feb/2013:00:39:40 -0800] "GET /robots.txt HTTP/1.0" 200 1005 "-" "Web front page analyser. robots.txt complaint (norw.acd.inst@gmail.com)"

I can't decide whether I do, or do not, want that to be a typo :(

dstiles




msg:4544242
 10:34 pm on Feb 9, 2013 (gmt 0)

Well, 204.236.128/17 is amazon aws and anything with a gmail address is automatically suspicious in my book... Kill it. :)
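
As a rough sketch of the range check implied here, Python's standard `ipaddress` module can test whether the logged address falls inside that /17. The shorthand "204.236.128/17" is written out as 204.236.128.0/17; the names `BLOCKED_RANGE` and `in_range` are assumptions for this example, and the ownership of the range is taken from the post, not verified.

```python
import ipaddress

# "204.236.128/17" from the post, written out in full.
BLOCKED_RANGE = ipaddress.ip_network("204.236.128.0/17")


def in_range(ip: str, network=BLOCKED_RANGE) -> bool:
    """Return True if the dotted-quad address lies inside the network."""
    return ipaddress.ip_address(ip) in network


# The address from the log line quoted earlier in the thread.
print(in_range("204.236.138.148"))
```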

keyplyr




msg:4544249
 11:09 pm on Feb 9, 2013 (gmt 0)




Well, 204.236.128/17 is amazon aws and anything with a gmail address is automatically suspicious in my book... Kill it. :)

It requested robots.txt. I allow *almost* everything to get robots.txt, even the Amazon ranges, so when I looked at this post yesterday, I figured she did also.
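
The policy described above (blocked ranges may still fetch robots.txt, and nothing else) might be sketched like this. The function `status_for` and its `ip_is_blocked` argument are hypothetical, and the sketch ignores the "almost" in the post: it exempts every client, where the poster evidently keeps a few exceptions.

```python
def status_for(path: str, ip_is_blocked: bool) -> int:
    """Return the HTTP status a request would get under the sketched policy."""
    # robots.txt is served to everyone, so even blocked bots can read the rules.
    if path == "/robots.txt":
        return 200
    # Everything else is refused for blocked clients.
    return 403 if ip_is_blocked else 200


print(status_for("/robots.txt", ip_is_blocked=True))
print(status_for("/index.html", ip_is_blocked=True))
```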

lucy24




msg:4544284
 10:09 am on Feb 10, 2013 (gmt 0)

The question is academic, because it didn't ask for anything else after robots.txt. (I checked. I do have the range blocked.) And I didn't hear any complaints about it either.

dstiles




msg:4544348
 7:34 pm on Feb 10, 2013 (gmt 0)

Print and frame it! An AWS bot that obeys robots.txt! :)

blend27




msg:4544401
 12:56 am on Feb 11, 2013 (gmt 0)

Print and frame it! An AWS bot that obeys robots.txt! :)

Chances are that my cat's talking to me will make sense, which to this day sounds like MYAU to me.

BTW, has anybody heard of a reliable MYAU translator web service?

lucy24




msg:4573348
 9:51 pm on May 12, 2013 (gmt 0)

109.78.198.49 - - [12/May/2013:07:39:51 -0700] "GET /hovercraft/images/wormapple.jpg HTTP/1.1" 200 32565 "-" "rarely used"

I expect this is perfectly true.

:: detour to raw logs ::

75.108.158.236 - - [11/May/2013:19:20:16 -0700] "GET /rats/images/ourhouse/LivRm5.jpg HTTP/1.1" 301 600 "-" "rarely used"
75.108.158.236 - - [11/May/2013:19:20:16 -0700] "GET /boilerplate/sorry.html HTTP/1.1" 200 1441 "-" "rarely used"


Huh. Fancy that.

:: detour to confirm hunch that these are Ukrainian IPs ::

Nope. They're not even the same country. What gives?
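
The "detour to raw logs" above amounts to grouping requests by User-Agent and seeing which IPs share one. A minimal Python sketch, assuming the logs are in the Apache combined format shown in this thread and that no field contains an escaped double quote; `LOG_RE` and `ips_by_user_agent` are names invented for the example:

```python
import re
from collections import defaultdict

# Apache "combined" log format, as in the lines quoted in this thread.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)


def ips_by_user_agent(lines):
    """Map each User-Agent string to the set of IPs that sent it."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match:  # lines that do not parse are silently skipped
            seen[match.group("ua")].add(match.group("ip"))
    return seen


# The three log lines from the post above.
sample = [
    '109.78.198.49 - - [12/May/2013:07:39:51 -0700] "GET /hovercraft/images/wormapple.jpg HTTP/1.1" 200 32565 "-" "rarely used"',
    '75.108.158.236 - - [11/May/2013:19:20:16 -0700] "GET /rats/images/ourhouse/LivRm5.jpg HTTP/1.1" 301 600 "-" "rarely used"',
    '75.108.158.236 - - [11/May/2013:19:20:16 -0700] "GET /boilerplate/sorry.html HTTP/1.1" 200 1441 "-" "rarely used"',
]

for ua, ips in ips_by_user_agent(sample).items():
    print(ua, sorted(ips))
```

Two unrelated IPs under one odd UA is the pattern that prompted the question; the grouping only surfaces it and says nothing about who is behind it.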

dstiles




msg:4573663
 7:37 pm on May 13, 2013 (gmt 0)

109.78.198.49 is Vodafone Ireland.

75.108.158.236 is Suddenlink US - all 75.n.n.n are (basically) ARIN (USA, Canada etc).

So likely compromised machines on DSL lines running a scan of some kind.

lucy24




msg:4573718
 9:04 pm on May 13, 2013 (gmt 0)

Yeah, the 75.108 threw me because I personally know people there; it's one of the local ISPs. But the UA is, uhm, rarely seen ;)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld ® and PubCon ® are Registered Trademarks of Pubcon Inc.
© Pubcon Inc. 1996-2012 all rights reserved