Forum Moderators: Ocean10000 & incrediBILL

PediaSearch.com Crawler

     
10:35 pm on Jul 27, 2006 (gmt 0)

Senior Member from US 

keyplyr (WebmasterWorld Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month)

joined:Sept 26, 2001
posts:5805
votes: 64


While I do get traffic from Wikipedia articles that reference my webpage content, this has to be a violation of my copyright. From what I understand, this bot follows those links, scrapes your site's content, compiles it into a book, and sells the book to the user.

85.214.51.184 - - [27/Jul/2006:10:53:39 -0400] "GET / HTTP/1.0" 200 10956 "-" "PediaSearch.com Crawler"
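The log line above is in Apache's standard "combined" format, where the user-agent is the last quoted field. A minimal sketch (assuming combined-format logs; the function name is mine) for flagging requests from this crawler:

```python
import re

# Apache "combined" log format: IP, identd, user, [time], "request",
# status, size, "referer", "user-agent".
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def is_pediasearch(line):
    """True if a combined-format log line carries the PediaSearch user-agent."""
    m = LOG_RE.match(line)
    return bool(m) and "PediaSearch.com Crawler" in m.group("agent")

line = ('85.214.51.184 - - [27/Jul/2006:10:53:39 -0400] '
        '"GET / HTTP/1.0" 200 10956 "-" "PediaSearch.com Crawler"')
print(is_pediasearch(line))  # True
```

The same pattern also matches log lines where the first field is a resolved hostname (as in the pediapress.com entries later in this thread), since `\S+` accepts either form.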

6:13 pm on July 29, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:June 3, 2002
posts:566
votes: 0


and it does not fetch robots.txt

deny from 85.214.
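A fuller Apache sketch of that block, pairing the partial-IP deny with a user-agent match (assuming an Apache 1.3/2.0-era `.htaccess` where `Order`/`Deny` and `SetEnvIf` are allowed; the `bad_bot` variable name is mine):

```apache
# Block the 85.214.x.x range by partial IP
Order Allow,Deny
Allow from all
Deny from 85.214

# Also refuse anything identifying itself as the PediaSearch crawler
SetEnvIfNoCase User-Agent "PediaSearch" bad_bot
Deny from env=bad_bot
```

The user-agent rule catches the crawler even if it moves to a different IP range, while the IP rule catches it even if it changes its user-agent string.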

10:01 pm on July 29, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Sept 21, 2005
posts:379
votes: 0


It did ask for robots.txt when it visited one of our sites, but whether it honours it remains to be seen; I'd never heard of it previously.

pediapress.com - - [24/Jul/2006:03:45:48 +1000] "GET /robots.txt HTTP/1.0" 200 1930 "-" "PediaSearch.com Crawler"
pediapress.com - - [24/Jul/2006:03:45:50 +1000] "GET / HTTP/1.0" 200 9489 "-" "PediaSearch.com Crawler"
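If the crawler does honour robots.txt, a disallow rule would be the politer first line of defence. The user-agent token it obeys isn't documented anywhere in this thread, so the token below is an assumption based on the User-Agent string it sends:

```
# robots.txt sketch; token is a guess, taken from the bot's
# reported User-Agent string "PediaSearch.com Crawler"
User-agent: PediaSearch.com Crawler
Disallow: /
```

Since compliance is unverified, this is best treated as a supplement to, not a replacement for, a server-side deny rule.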

10:57 pm on July 29, 2006 (gmt 0)

Senior Member

wilderness (WebmasterWorld Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month)

joined:Nov 11, 2001
posts:5408
votes: 2


and it does not fetch robots.txt

deny from 85.214.

Yo Jan!
Get with the program!

deny from 85. ;)

Have you changed locales yet and did you purchase that "dingy" I suggested for local transportation?

Best Don

10:59 pm on July 29, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 1, 2004
posts:607
votes: 0


According to the website, only genuine Wikipedia articles are printed, which is perfectly legal provided they follow the GFDL. However, it does say "PediaPress suggests further articles based on what's in your book" - maybe they scrape the external links from Wikipedia articles?

The company behind it is located in Germany.

11:20 pm on July 29, 2006 (gmt 0)

Senior Member

wilderness (WebmasterWorld Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month)

joined:Nov 11, 2001
posts:5408
votes: 2


only genuine Wikipedia articles are printed

Isn't that an oxymoron?
It's a user-based forum.

I've seen some of these pages that provide "not so accurate" information.

You, I, or anybody may go to any page and add inaccuracies should we desire, and there are no controls to prevent such a thing (except another user correcting the change later).

11:47 pm on July 29, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 1, 2004
posts:607
votes: 0


Wilderness, I mean "genuine" in the sense that the content printed is (according to the bot's site) genuine, original and legal Wikipedia content.

Whether that content is always accurate is another question; I think there's an ongoing thread elsewhere on WW about the rights and wrongs of Wikipedia.

1:47 am on July 30, 2006 (gmt 0)

Senior Member

wilderness (WebmasterWorld Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month)

joined:Nov 11, 2001
posts:5408
votes: 2


zCat,
I realized what you meant and was just yanking your chain.

Don

7:19 am on July 30, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 1, 2004
posts:607
votes: 0


Ah, the forum software must have swallowed your smileys ;-).

I used to be quite active in Wikipedia, but I think it's getting out of control.