Wikimpress - dumbest scraper project ever

1:34 am on Mar 15, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Dec 1, 2011
posts: 192
votes: 0

I just started being scraped by a bot with this Agent string:

"Mozilla/5.0 (compatible; U; Linux i686 (x86_64); de-DE; <a href=http://wikimpress.org/>Wikimpress</a>) Wikimpress/1.0

I checked out the wikimpress.org site, and this is what I found out. Wikimpress.org is a totally foolish, illegal content scraper project that, per the German site's description, plans to gather the whole WWW, and especially social media related content, into a single wiki.

They state it is a commercial project (and they plan to show ads), and that all content will be released under a Creative Commons license. (Huh? Steal copyrighted content from all over the world, and re-release it under CC?)

Among other things, they also state that:

Wikimpress is not related to Wikipedia. But we're using Wikipedia information under the CC-BY-SA license.

They are also scraping Wikipedia, and I can see from their "random page" feature that Wikipedia is already well represented.

Blocked, Blocked, Blocked.
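
For anyone who would rather do this in application code than in the server config, here is a minimal sketch of a user-agent check. The function name `is_blocked_user_agent` and the token list are my own illustration, not anything taken from a particular server or framework; the only thing it relies on is that the agent string above contains "Wikimpress".

```python
def is_blocked_user_agent(user_agent, blocked_tokens=("wikimpress",)):
    """Return True if the User-Agent header contains any blocked token.

    Hypothetical helper: the comparison is case-insensitive, and a
    missing header (None or empty string) is treated as not blocked.
    """
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in blocked_tokens)


# Example with the agent string quoted in this post.
wikimpress_ua = (
    "Mozilla/5.0 (compatible; U; Linux i686 (x86_64); de-DE; "
    "<a href=http://wikimpress.org/>Wikimpress</a>) Wikimpress/1.0"
)
if is_blocked_user_agent(wikimpress_ua):
    print("deny request")  # in a real handler this would be a 403 response
```

Keep in mind that a user-agent string is trivial to change, so a check like this only stops the bot for as long as it keeps identifying itself.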
2:38 am on Mar 15, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 4, 2001
votes: 29

I see they have no category for copyright law.

9:11 am on Mar 15, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
votes: 332

Did they come from

Plusserver, Germany -
9:34 am on Mar 15, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Dec 1, 2011
posts: 192
votes: 0

Pointing to ww7.netznutz.net, which might indicate that there is more than the one IP I have seen so far.

NetzNutz GmbH (translation: Net Nuts) is supposedly the company that owns this site, along with a long list of other domains. Many of them lead to a for-sale page, while others carry various junk content.

I don't know if there is one useful site among them, but I did not see anything that was not taken from somewhere else.
Examples: a CD sales affiliate site (of the clone type), a CD wiki site with imported information about CDs, the Wikimpress site, a site with some aerial pictures of Germany, and similar stuff.

Basically they are just random content sites, there either to carry ads or to advertise the domains for sale, without the owner having to create any content himself.

This is similar to how the Wikimpress site plans to gather the pages from all our sites (the whole WWWW and social media. :) ) and then, per the Wikimpress site's description, put ads around it all along the way.
9:32 am on Mar 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Aug 18, 2010
posts: 49
votes: 0

I'm blocking the whole CIDR range.
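
A range block like this can be sketched in a few lines with Python's standard `ipaddress` module. The network below is a placeholder from the reserved documentation range, since the actual range is not given in this thread, and `is_blocked_ip` is a hypothetical helper name.

```python
import ipaddress

# Placeholder only: 192.0.2.0/24 is a reserved documentation range,
# NOT the scraper's real range, which is not stated in this thread.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked_ip(remote_addr, networks=BLOCKED_NETWORKS):
    """Return True if remote_addr falls inside any blocked CIDR range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address; leave the decision to other checks.
        return False
    return any(addr in net for net in networks)
```

Blocking a whole range also catches any other IPs the operator moves to within it, at the cost of possibly blocking unrelated customers of the same hosting company.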

Another nasty content theft (scraping) and image theft / hotlinking site is www.thesearchengine.net, which I caught with my hotlinking script.
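
The script itself is not shown here, but the usual idea behind hotlink detection is a check of the Referer header on image requests. The following is only a rough sketch of that idea under my own assumptions: `ALLOWED_HOSTS` and `is_hotlink` are hypothetical names, and example.com stands in for your own domain.

```python
from urllib.parse import urlsplit

# Hypothetical: the hosts that are allowed to embed your images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True if an image request was referred by a foreign site.

    Requests with no Referer header are allowed, because many browsers
    and proxies strip it on perfectly legitimate visits.
    """
    if not referer:
        return False
    try:
        host = (urlsplit(referer).hostname or "").lower()
    except ValueError:
        # Malformed referer; treat it as foreign.
        return True
    return host not in allowed_hosts
```

A request for an image with a Referer of `http://www.thesearchengine.net/...` would be flagged, while one coming from a page on your own site would not.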

The IP is cited in malware reports too: