Thumbshots.org provides thumbnail screenshots ("thumbshots") for websites listed in DMOZ. I think this feature is very cool.
I want to do the same thing for my link indexing site. Many websites in my directory are not in DMOZ, so I cannot rely on thumbshots.org's service alone.
I wonder if Perl can do this: visit websites automatically and save the homepage of each destination website as an image.
I think I first need to have Perl visit those sites; I know that can be done with LWP.
Next, I wonder if I should have Perl save the homepage locally and then convert the HTML page into an image format.
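For what it's worth, the fetch-and-save step could be sketched roughly like this. It is written in Python with only the standard library, purely as an illustration; in Perl, LWP::UserAgent would play the role that urllib plays here. The names local_name and save_homepage and the "pages" directory are made up for this example.

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name(url: str) -> str:
    """Turn a URL into a safe file name based on its host part."""
    host = urlparse(url).netloc or "page"
    # Replace anything that is not a letter, digit, dot or hyphen.
    return re.sub(r"[^A-Za-z0-9.-]", "_", host) + ".html"


def save_homepage(url: str, out_dir: str = "pages", timeout: float = 15.0) -> Path:
    """Download the raw HTML at `url` and write it under `out_dir`."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    path = target / local_name(url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        path.write_bytes(response.read())
    return path


# Example call (needs network access, so it is left commented out):
# save_homepage("http://www.example.com/")
```

Note that this only saves the HTML source. Images, stylesheets and scripts referenced by the page are not downloaded, which is part of why turning the saved file into a faithful picture is the hard step.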
Regarding the second step, I have no idea how to achieve that. I searched the Internet and did not find any useful information about converting HTML to an image on Linux.
I know there are some programs that can convert HTML to PDF on Linux. Can anybody give me a hint on how to convert HTML to an image (GIF, PNG, JPEG) on Linux?
Thanks a lot!
But I have come to realize that this is rather hard to do. As jk3210 and Josk said, launching a browser (on either Windows or Linux) and then capturing ("printing") the screen would work, but it may be slow and resource-intensive.
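The browser approach could look something like the sketch below, again in Python for illustration. It assumes a Chromium-based browser with a headless mode is installed and reachable as "chromium" (the binary name differs between systems), and it relies on Chromium's --headless, --screenshot, --window-size and --hide-scrollbars command-line switches. The function names are hypothetical, and I have not measured how slow this is in practice.

```python
import shutil
import subprocess


def screenshot_command(url, png_path, browser="chromium", width=1024, height=768):
    """Build the command line that asks a headless browser for a PNG."""
    return [
        browser,
        "--headless",
        f"--screenshot={png_path}",
        f"--window-size={width},{height}",
        "--hide-scrollbars",
        url,
    ]


def take_screenshot(url, png_path, browser="chromium", timeout=60):
    """Run the browser once per page; raises if the browser is missing or fails."""
    if shutil.which(browser) is None:
        raise FileNotFoundError(f"browser executable not found: {browser}")
    subprocess.run(
        screenshot_command(url, png_path, browser=browser),
        check=True,
        timeout=timeout,
    )


# Example call (needs the browser and network access, so it is commented out):
# take_screenshot("http://www.example.com/", "example.png")
```

Each call starts a full browser process, which is exactly the cost mentioned above, so for a large directory the pages would probably have to be processed in a queue rather than on demand.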
Someone else told me that if I don't want to launch a browser, I would have to write my own HTML parser to render the page, reading the HTML and converting it directly into an image. If that's true, then I think I'd better forget it.
Alexa.com and Thumbshots.org both provide thumbshots for millions of websites. Maybe they have their own HTML parser?