
Forum Moderators: Ocean10000 & incrediBILL


Wikimpress - dumbest scraper project ever

     

DeeCee

1:34 am on Mar 15, 2012 (gmt 0)



I just started being scraped by a bot with this Agent string:

"Mozilla/5.0 (compatible; U; Linux i686 (x86_64); de-DE; <a href=http://wikimpress.org/>Wikimpress</a>) Wikimpress/1.0"

I checked out the wikimpress.org site, and this is what I found. Wikimpress.org is a totally foolish, illegal content-scraper project that, per the German site's description, plans to gather the whole WWW, and especially social-media content, into a single wiki.

They state it is a commercial project (+ they plan to show ads) and that all content will be released under a Creative Commons license. (Huh? Steal copyrighted content all over the world, and re-release it under CC?)

Among other things, they also state:

Wikimpress is not related to Wikipedia. But we're using Wikipedia information under the CC-BY-SA license.


They are also scraping Wikipedia, and I see from their "random page" functionality that Wikipedia is already well represented.

Blocked, Blocked, Blocked.
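
For anyone who wants to do the same at the application level, here is a minimal sketch of matching on the agent string. It assumes you can read the User-Agent header somewhere in your own request handling; the names `BLOCKED_AGENT` and `is_blocked_agent` are made up for illustration, and a bot can of course change its agent string at any time.

```python
import re

# Hypothetical pattern: matches any agent string that mentions "Wikimpress",
# regardless of case. Adjust to taste; this is not an official signature.
BLOCKED_AGENT = re.compile(r"wikimpress", re.IGNORECASE)


def is_blocked_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header looks like the Wikimpress bot."""
    return bool(BLOCKED_AGENT.search(user_agent or ""))
```

A request handler could call `is_blocked_agent(...)` on the incoming header and answer with a 403 when it returns True.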

Marshall

2:38 am on Mar 15, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I see they have no category for copyright law.

Marshall

keyplyr

9:11 am on Mar 15, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month





Did they come from 188.138.104.220?

Plusserver, Germany
188.138.0.0 - 188.138.127.255
188.138.0.0/17
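
If you want to double-check that the start/end range and the /17 describe the same block, and that the bot's IP falls inside it, a quick sketch with Python's standard `ipaddress` module might look like this. The helper name `in_block` is just for illustration.

```python
import ipaddress

# The range as listed above.
first = ipaddress.ip_address("188.138.0.0")
last = ipaddress.ip_address("188.138.127.255")

# Collapse the start/end range into the smallest list of CIDR networks.
networks = list(ipaddress.summarize_address_range(first, last))

# The single CIDR block quoted above.
block = ipaddress.ip_network("188.138.0.0/17")


def in_block(ip: str) -> bool:
    """Return True if the given IPv4 address lies inside the /17 block."""
    return ipaddress.ip_address(ip) in block


if __name__ == "__main__":
    print(networks)
    print(in_block("188.138.104.220"))
```

The same `in_block` check could sit in front of a request handler, though blocking at the firewall or web-server level is usually cheaper.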

DeeCee

9:34 am on Mar 15, 2012 (gmt 0)



IP: 188.138.104.220
Pointing to ww7.netznutz.net, which might indicate that there is more than the one IP I have seen so far.

NetzNutz GmbH (translation: Net Nuts) is the supposed company that owns this site, along with a long list of other domains. Many of them lead to a for-sale page, while others carry various junk content.

I don't know if there is one useful site among them, but I did not see anything that wasn't taken from somewhere else.
Examples: a CD sales affiliate site (of the clone type), a CD wiki site with imported information about CDs, the Wikimpress site, a site with some aerial pictures of Germany, and similar stuff.

Basically just random content places to either gather ads or advertise the domains for sale, but without having to create any content himself.

This is similar to how the Wikimpress site plans to gather the pages from all our sites (the whole WWWW and social media. :) ) and then, per the Wikimpress site's description, put ads around it all along the way.

MxAngel

9:32 am on Mar 16, 2012 (gmt 0)

5+ Year Member



I'm blocking the whole CIDR range.

Another nasty content-theft (scraping) and image-theft / hotlinking site is www.thesearchengine.net (188.138.118.19), which I caught in my hotlinking script.

The IP is cited in malware reports too:

[malc0de.com...]
[threatexpert.com...]
 
