|Script to Determine The Significant Web Page Image|
| 1:53 am on Mar 3, 2008 (gmt 0)|
I generate a list of web pages using a script I wrote.
What I need to do now is grab the content of these pages and somehow determine which image on each page is the most significant.
This will allow me to grab and resize the image for a thumbnail for a link to the site.
For example, if they were news articles, it would be the story's picture, not the newspaper's logo, advertisements, or other images.
I'm dealing with a list of different and changing websites.
Are there any scripts out there that do this? Does anybody have any recommendations?
| 2:40 pm on Mar 3, 2008 (gmt 0)|
I don't know of any, but you could try the usual suspects: HotScripts, SourceForge, Freshmeat.
It does sound fairly specific, though, and deciding which image to pick is mildly complex.
| 2:45 pm on Mar 3, 2008 (gmt 0)|
|and to somehow determine the most significant image is on the page. |
You will need to define exactly what makes any image on any given page the most significant.
I believe you are going to end up writing some custom code for this particular project.
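As a rough illustration of what such a definition might look like, here is a minimal Python sketch. Everything in it is an assumption rather than a known-good rule: it treats an image as "significant" if it declares a large width and height, is not extremely wide or tall, and has no logo-like or ad-like word in its src. The hint words, the 3:1 ratio cutoff, and the function names are all made up for the example, and images that do not declare their size in the HTML are simply ignored.

```python
from html.parser import HTMLParser

# Hypothetical hint words; tune these for your own list of sites.
SKIP_HINTS = ("logo", "banner", "advert", "sprite", "icon", "spacer")


class ImageCollector(HTMLParser):
    """Collect the attributes of every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs))


def to_int(value):
    """Return a declared width/height as an int, or 0 if missing or not numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def score(img):
    """Assumed definition of 'significant': big, not elongated, not logo-like."""
    width, height = to_int(img.get("width")), to_int(img.get("height"))
    if width <= 0 or height <= 0:
        return 0
    src = (img.get("src") or "").lower()
    if any(hint in src for hint in SKIP_HINTS):
        return 0
    ratio = max(width, height) / min(width, height)
    if ratio > 3:  # very wide or very tall images are usually banners
        return 0
    return width * height


def most_significant_image(html):
    """Return the src of the best-scoring image, or None if nothing qualifies."""
    collector = ImageCollector()
    collector.feed(html)
    best = max(collector.images, key=score, default=None)
    if best is None or score(best) == 0:
        return None
    return best.get("src")
```

A heuristic like this will still pick the wrong image on plenty of pages, which is why the definition has to come from you and from the sites you actually care about.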
| 7:43 pm on Mar 3, 2008 (gmt 0)|
I think this is going to be a really complex thing to code, and it won't always work however well you code it. The only people coding stuff like that are programmers at Google, Yahoo, MSN, and other search engine companies, and even their code is far from perfect.
If you only need a picture that tells you what the website is about, I'd suggest you take thumbnails of the whole pages instead, which is easy to code.
| 2:08 am on Mar 5, 2008 (gmt 0)|
It would be really complex, something way beyond me. That's why I was hoping for a script somebody might have already developed.
I decided to take a different approach and write my own script that will only work for the ten most frequent sites on my list. They provide most of the results anyway.
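A minimal sketch of how a per-site script like that could be organized in Python, assuming each site's story images can be recognized by a fragment of their src path. The hostnames and path markers below are hypothetical placeholders, not real rules for any site, and the function name is made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical rules: hostname -> substring found in the story image's src.
SITE_RULES = {
    "news.example.com": "/photos/",
    "blog.example.org": "/uploads/",
}


class SrcCollector(HTMLParser):
    """Collect the src of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def story_image(page_url, html):
    """Return the first image src matching the site's rule, or None."""
    host = urlparse(page_url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    marker = SITE_RULES.get(host)
    if marker is None:
        return None  # site is not one of the covered ones
    collector = SrcCollector()
    collector.feed(html)
    for src in collector.sources:
        if marker in src:
            return src
    return None
```

Pages from sites outside the table return None, so they can fall back to no thumbnail or to a default image.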