Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
goo goo GET GET
lucy24




msg:4539232
 12:31 pm on Jan 25, 2013 (gmt 0)

Is this the world's most boring robot or what? I looked them up and all I could find were threads dating from 2005-06.

DoCoMo/2.0 P900i(c100;TB;W24H11) (compatible; ichiro/ mobile goo; +http://search.goo.ne.jp/option/use/sub4/sub4-1/)
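
For anyone who wants to pick this visitor out of their logs, here is a minimal sketch of a check against the user-agent string quoted above. The function name `classify_goo_ua` and the returned labels are hypothetical, and the assumption that any UA containing "ichiro" belongs to goo's crawler comes only from this one string, not from goo's documentation.

```python
# The user-agent string exactly as quoted in the post above.
UA = ("DoCoMo/2.0 P900i(c100;TB;W24H11) (compatible; ichiro/ mobile goo; "
      "+http://search.goo.ne.jp/option/use/sub4/sub4-1/)")


def classify_goo_ua(user_agent):
    """Return a rough label for goo's crawler, or None if it does not match.

    Assumption: the tokens "ichiro" and "mobile goo" identify the bot and
    its mobile variant. The labels are made up for illustration.
    """
    lowered = user_agent.lower()
    if "ichiro" not in lowered:
        return None
    if "mobile goo" in lowered:
        return "ichiro-mobile"
    return "ichiro"


print(classify_goo_ua(UA))
```

A user-agent string is trivial to fake, so a match here only says what the visitor claims to be.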

Does anyone hereabouts read Japanese? I sure hope the "meta tag" section means "these are the commands we honor" because I found them snuffling around the no-indexed Panda Page one day ;) I don't suppose any amount of Japanese would shed light on that /sub4/ business* though.

They caught my notice recently because they gobbled up six consecutive pages-- plus robots.txt-- in a mere eight minutes. (The first six pages in one directory. They'd picked up the index page a couple days earlier. Did it take the computer two days to decide that there might possibly be something interesting here?) For goo, that counts as a rampage. Normally it's robots.txt, one page, don't see them again for a week.
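
A burst like the one described above (seven requests in eight minutes) could be spotted with a sliding-window count over one visitor's request times. This is only a sketch: the function name `max_hits_in_window` is hypothetical, and the timestamps are invented sample data, not taken from any real log.

```python
from datetime import datetime, timedelta


def max_hits_in_window(timestamps, window):
    """Return the largest number of requests falling within any span of
    length `window` (inclusive), using a two-pointer sliding window."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end, ts in enumerate(ordered):
        # Drop requests from the front until the span fits in the window.
        while ts - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


# Invented sample data: one lone request, then seven requests two days later
# spread over eight minutes.
burst_start = datetime(2013, 1, 25, 12, 0)
sample_hits = [burst_start - timedelta(days=2)]
sample_hits += [burst_start + timedelta(minutes=m) for m in (0, 1, 2, 4, 5, 7, 8)]

print(max_hits_in_window(sample_hits, timedelta(minutes=8)))
```

What counts as a "rampage" is a per-robot judgment, so the window length is left as a parameter rather than fixed.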


* They've either got some meticulous rewriting or those are real, physical directories. I navigated back upward one by one.

 

incrediBILL




msg:4539393
 11:07 pm on Jan 25, 2013 (gmt 0)

Does anyone hereabouts read Japanese?


Chrome does, Google Translate does, and there are some FF add-ons that add translation to FF, so I'd try one of those. I use them daily, they work great :)

keyplyr




msg:4539407
 1:07 am on Jan 26, 2013 (gmt 0)

I get a small but measurable amount of traffic from them, mostly mobile (as the name implies) but some non-mobile as well.

lucy24




msg:4539491
 1:50 pm on Jan 26, 2013 (gmt 0)

I use them daily, they work great

How do you know? Have you checked them repeatedly against languages you know well? That goes double for non-Indo-European languages where they can't simply go word-for-word. You need only look at Google In Your Language to see the level of cluelessness they're working with.

I'm prepared to take g###'s word for it that such-and-such Russian word means "rat", because it fits what I find in logs. But in general I wouldn't trust a machine translation unless I had some independent evidence that they're reliable.

dstiles




msg:4539569
 10:36 pm on Jan 26, 2013 (gmt 0)

I've allowed Japanese search engine ichiro for several years. At some point I found an English version of their bot text.

lucy24




msg:4539579
 12:12 am on Jan 27, 2013 (gmt 0)

At some point I found an English version of their bot text.

I know the feeling. I once found the English version of the toscrawler's text-- but it definitely wasn't done by clicking a flag icon on the page referenced in their UA :)

They seem to be harmless anyway, so I've added them to the Ignore list.

keyplyr




msg:4539589
 4:01 am on Jan 27, 2013 (gmt 0)




They seem to be harmless anyway, so I've added them to the Ignore list.

It's what's done with the data after they sell it that I worry about.

incrediBILL




msg:4539596
 7:50 am on Jan 27, 2013 (gmt 0)

How do you know? Have you checked them repeatedly against languages you know well?


They display English that I can read vs. symbols I can't comprehend, so that's pretty good IMO. Translations are often grammatically poor but you get the basic information which is all I need to figure out what the bot is all about.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved