Gowikibot

         

keyplyr

2:40 am on Oct 15, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



UA: Mozilla/5.0 (compatible; Gowikibot/1.0; +http://www.gowikibot.com)
Protocol: HTTP/1.1
Robots.txt: Not requested on this visit; possibly fetched on an earlier one.
Host: gowikibot.com
63.224.101.56 - 63.224.101.63
63.224.101.56/29
Host: Qwest ISP (centurylink.com)
67.0.0.0 - 67.7.255.255
67.0.0.0/13

Odd that this upstart SE (gowiki.com) would crawl from an ISP range, but I'm willing to give it a pass while it's in beta.
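
For anyone who wants to double-check the range-to-CIDR pairs listed above before adding them to a block list or allow list, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `range_to_cidrs` is only an illustration and is not part of any tool mentioned in this thread.

```python
import ipaddress


def range_to_cidrs(first: str, last: str) -> list[str]:
    """Return the CIDR blocks that exactly cover first..last (inclusive)."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    # summarize_address_range yields the minimal list of networks for the span.
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


if __name__ == "__main__":
    # The two ranges reported in the post above.
    reported = [
        ("63.224.101.56", "63.224.101.63"),
        ("67.0.0.0", "67.7.255.255"),
    ]
    for first, last in reported:
        print(first, "-", last, "=>", range_to_cidrs(first, last))
```

A range that does not align to a single block would come back as several CIDRs, which is a quick way to spot a mistyped boundary.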

dstiles

10:39 am on Oct 15, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just got this, first time, from 67.2.242.*** which looks to be a dynamic IP - 67.2.242.***.slkc.qwest.net. I'm in the UK. It hit 22 times in 2 minutes.

Is this really a distributed bot?

Looking at its source, github, it appears to be something anyone can use.

[edited by: keyplyr at 6:02 pm (utc) on Oct 15, 2017]
[edit reason] obscured private ip address [/edit]

keyplyr

10:50 pm on Oct 15, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



dstiles wrote: "Looking at its source, github, it appears to be something anyone can use."

Are you looking at the golang github page which has a wiki defining the various elements?
"Welcome to the Go wiki, a collection of information about the Go Programming Language. Awesome Go is another great resource for Go programmers, curated by the Go community."
Source: github.com/golang/go/wiki


If so, IMO that's a different animal from the link supplied in the UA string which points to:
Gowikibot is the web crawler for the Gowiki search engine which is currently in development. Our bots are identified by the following user-agent:

Mozilla/5.0 (compatible; Gowikibot/1.0; +http://www.gowikibot.com)

Our bots respect the robots.txt instructions, the meta robots tag, and the link rel follow/nofollow directives.

In the robots.txt file, our bots will respond to either User-agent: Gowikibot or User-agent: gowikibot or, if neither is present, User-agent: *.

We attempt to not burden web servers by only crawling a few pages of a site at a time and by using a reasonable crawl delay between each request.

At this time, we only crawl html and pdf files, as well as robots.txt files. We appreciate your understanding and you continuing to allow access to our web crawlers.
gowikibot.com

Although... this upstart SE may in fact be written in the Go programming language.
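
The robots.txt behaviour quoted above (a Gowikibot-specific group if present, otherwise the `*` group) can be tried out offline before relying on it. This is a rough sketch using Python's standard `urllib.robotparser`; it shows how a standards-following parser would read a rule set, not how Gowikibot itself is implemented. The sample rules, the `example.com` URLs, and the `allowed` helper are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Gowikibot entirely, and keep every
# other crawler out of /private/ only.
RULES = """\
User-agent: Gowikibot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Pass the bot's name token, not the full "Mozilla/5.0 (...)" UA string.
    print(allowed(RULES, "Gowikibot", "http://example.com/page.html"))
    print(allowed(RULES, "SomeOtherBot", "http://example.com/page.html"))
    print(allowed(RULES, "SomeOtherBot", "http://example.com/private/x.html"))
```

Whether the live bot honours the file is a separate question that only the access logs can answer.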

TorontoBoy

6:26 pm on Oct 16, 2017 (gmt 0)

5+ Year Member Top Contributors Of The Month



This scraper just hit me using 71.219.108.***. The info page offers no explanation of its intent.
Protocol: HTTP/1.1
71.208.0.0 - 71.223.255.255
CIDR: 71.208.0.0/12
NetName: QWEST-INET-118
NetHandle: NET-71-208-0-0-1
Parent: NET71 (NET-71-0-0-0-0)
NetType: Direct Allocation
OriginAS:
Organization: Qwest
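
To confirm that the obscured /24 blocks mentioned in this thread sit inside the Qwest allocations quoted, a small check with Python's standard `ipaddress` module may help. This is only a sketch: `QWEST_RANGES` holds just the two ISP ranges posted here, and `covering_range` is a hypothetical helper name.

```python
import ipaddress

# ISP ranges reported in this thread (not a complete list of Qwest space).
QWEST_RANGES = [
    ipaddress.ip_network("67.0.0.0/13"),
    ipaddress.ip_network("71.208.0.0/12"),
]


def covering_range(block: str):
    """Return the reported range containing `block`, or None if none does."""
    net = ipaddress.ip_network(block)
    for parent in QWEST_RANGES:
        if net.subnet_of(parent):
            return str(parent)
    return None


if __name__ == "__main__":
    # The /24s correspond to the obscured addresses 67.2.242.*** and 71.219.108.***.
    for block in ("67.2.242.0/24", "71.219.108.0/24", "63.224.101.56/29"):
        print(block, "->", covering_range(block))
```

The bot's own 63.224.101.56/29 block falls outside both ISP ranges, so it needs its own entry in any rule set.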

[edited by: keyplyr at 6:42 pm (utc) on Oct 16, 2017]
[edit reason] obscured private IP address [/edit]

keyplyr

6:45 pm on Oct 16, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Mod Note: Please only post ranges, not specific IP addresses.

dstiles

6:13 pm on Oct 17, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



keyplyr : github[.]com[/]jmoiron[/]gowiki/
(my brackets)

TorontoBoy

6:35 pm on Oct 17, 2017 (gmt 0)

5+ Year Member Top Contributors Of The Month



github(.)com[/]golang[/]go[/]wiki: This is a compendium of information about the Go language
Welcome to the Go wiki, a collection of information about the Go Programming Language. Awesome Go is another great resource for Go programmers, curated by the Go community.

github(.)com[/]jmoiron[/]gowiki: This is wiki, encyclopedia, or website software with its own web server.
Gowiki is a single-file single-executable wiki which runs its own webserver.

Neither of these includes a bot that will go out and scrape content. I don't think this bot is associated with either of these GitHub repositories.

keyplyr

9:39 am on Oct 18, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@dstiles - yes that seems to be the one.

Crawls like a SE would, not like a scraper.