
Search Engine Spider and User Agent Identification Forum

Wikimpress - dumbest scraper project ever

Msg#: 4429323 posted 1:34 am on Mar 15, 2012 (gmt 0)

I just started being scraped by a bot with this Agent string:

Mozilla/5.0 (compatible; U; Linux i686 (x86_64); de-DE; <a href=http://wikimpress.org/>Wikimpress</a>) Wikimpress/1.0

I checked out the wikimpress.org site, and this is what I found out. Wikimpress.org is a totally foolish, illegal content scraper project that, per the German site's description, plans to gather the whole WWW, and especially social media related content, into a single Wiki.

They state it is a commercial project (+ they plan to show ads) and that all content will be released under a Creative Commons license. (Huh? Steal copyrighted content all over the world, and re-release it under CC?)

Among other things, they also state that:

Wikimpress is not related to Wikipedia. But we're using Wikipedia information under the CC-BY-SA license.

They are also scraping Wikipedia, and I see from their "random page" functionality that Wikipedia is already well represented.

Blocked, Blocked, Blocked.
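
For anyone who wants to do the same, here is a minimal sketch of a user-agent block, assuming that matching the "Wikimpress" token anywhere in the User-Agent header is enough to identify this bot. The function name and pattern are made up for illustration, not taken from any real blocking setup in this thread.

```python
import re

# Assumption: the "Wikimpress" token in the User-Agent header identifies
# the bot; the pattern and names here are illustrative only.
BLOCKED_AGENT = re.compile(r"wikimpress", re.IGNORECASE)


def is_blocked_agent(user_agent):
    """Return True when the User-Agent header contains the blocked token."""
    # A missing or empty header is not treated as a match.
    return bool(user_agent) and BLOCKED_AGENT.search(user_agent) is not None
```

A bot can change its agent string at any time, so a check like this is usually paired with an IP-based block.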



WebmasterWorld Senior Member 10+ Year Member

Msg#: 4429323 posted 2:38 am on Mar 15, 2012 (gmt 0)

I see they have no category for copyright law.



WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month

Msg#: 4429323 posted 9:11 am on Mar 15, 2012 (gmt 0)

Did they come from

Plusserver, Germany -


Msg#: 4429323 posted 9:34 am on Mar 15, 2012 (gmt 0)

Pointing to ww7.netznutz.net, which might indicate that there is more than the one IP I have seen so far.

NetzNutz GmbH (translation: Net Nuts) is the supposed company that owns this site, along with a long list of other domains. Many of them lead to a for-sale page, while others carry various junk content.

I don't know if there is one useful site among them, but I did not see anything that wasn't taken from somewhere else.
Examples: a CD sales affiliate site (of the clone type), a CD Wiki site with imported information about CDs, the Wikimpress site, a site with some aerial pictures of Germany, and similar stuff.

Basically they are just random content sites, used either to carry ads or to advertise the domains for sale, without the owner having to create any content himself.

This is similar to how the Wikimpress site plans to gather the pages from all our sites (the whole WWW and social media. :) ) and then, per the site's own description, put ads around it all.


Msg#: 4429323 posted 9:32 am on Mar 16, 2012 (gmt 0)

I'm blocking the whole CIDR range.
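
A rough sketch of what a CIDR check can look like in a script, using Python's standard ipaddress module. The range below is a placeholder: 203.0.113.0/24 is a reserved documentation block standing in for the real range, which is not given in this thread, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical range: 203.0.113.0/24 is a reserved documentation block
# (TEST-NET-3) used as a stand-in; substitute the range you actually block.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked_ip(remote_addr):
    """Return True when remote_addr falls inside any blocked CIDR range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a parseable IP address; treat as not blocked
    return any(addr in net for net in BLOCKED_NETWORKS)
```

Blocking the whole range rather than a single address covers the case where the bot moves to a neighbouring IP at the same host.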

Another nasty content theft (scraping) and image theft / hotlinking site is www.thesearchengine.net - which I caught in my hotlinking script.
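
The hotlinking script mentioned above is not shown, so the following is only a guess at the general idea: compare the Referer header on image requests against your own hostnames. The host list ("example.com") and the function name are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical values: "example.com" stands in for the site being protected.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer):
    """Return True when an image request's Referer names a foreign host."""
    if not referer:
        return False  # the header is often omitted; let such requests pass
    host = urlparse(referer).hostname
    # hostname is None when the header is not a parseable absolute URL.
    return host is not None and host not in ALLOWED_HOSTS
```

Logging the foreign Referer values that a check like this flags is one way a site such as the one above would show up.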

The IP is cited in malware reports too:


