Forum Moderators: open


Search Spider to search inside pdf files?

Any free spider to search the PDF files of a website?

         

financialhost

11:55 am on Oct 27, 2006 (gmt 0)

10+ Year Member



This seems like the most appropriate forum for this kind of question.

I am looking for a free spider I could use to scan a website's PDF files for the text "widgets".

Does anyone know of one like this?

Any help appreciated.

Thanks

wilderness

3:13 pm on Oct 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your reference to "their" and NOT "yours" in this thread:
[webmasterworld.com...]

gives the impression that it's another's website?

Perhaps the other website's PDFs are not indexed by Google because the site owner doesn't want them indexed. (My own PDFs are stored in an image folder and excluded in robots.txt. Anybody who begins crawling them is denied access.)
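
If you want to see whether a site's robots.txt excludes its PDFs before pointing anything at it, something along these lines might do. It is only a rough sketch using Python's standard urllib.robotparser; the function name is_allowed, the example.com URLs, and the rules shown are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, user_agent="*"):
    """Return True if the given robots.txt lines permit user_agent to fetch url.

    robots_lines is the robots.txt content split into lines. This helper is
    hypothetical; in practice you would download the site's robots.txt first.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules without any network access
    return parser.can_fetch(user_agent, url)


# Hypothetical rules resembling "PDFs kept in an excluded image folder".
example_rules = [
    "User-agent: *",
    "Disallow: /images/",
]

print(is_allowed(example_rules, "https://example.com/images/report.pdf"))
print(is_allowed(example_rules, "https://example.com/docs/report.pdf"))
```

Keep in mind that robots.txt only states the owner's wishes. A site can still deny access to crawlers by other means, as described above.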

PDFs come in a variety of formats (text and/or image).
I archive some both ways. It really depends on what I'm attempting to accomplish.
Text saved as a PDF image is generally not searchable, because the file holds a picture of the words rather than the words themselves.

The best solution may be to download all the PDFs from the other website and search them offline. There are many tools for this.
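
As one possible way to do the offline search, here is a minimal Python sketch. It assumes the PDFs are already saved in a local folder and that the third-party pypdf package is installed. The names find_pdfs_containing and extract_text_with_pypdf, and the folder name in the usage comment, are invented for this example. Image-only PDFs will normally yield no text, so they will not match unless they are run through OCR first.

```python
from pathlib import Path


def extract_text_with_pypdf(path):
    """Extract text from one PDF. Assumes the third-party pypdf package."""
    from pypdf import PdfReader  # imported lazily so the search logic has no hard dependency

    reader = PdfReader(str(path))
    # extract_text() may return an empty result for image-only pages
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def find_pdfs_containing(folder, term, extract=extract_text_with_pypdf):
    """Return sorted paths of PDFs under folder whose text contains term (case-insensitive)."""
    hits = []
    for pdf_path in sorted(Path(folder).rglob("*.pdf")):
        try:
            text = extract(pdf_path)
        except Exception:
            # Skip files that are damaged, encrypted, or otherwise unreadable
            continue
        if term.lower() in text.lower():
            hits.append(pdf_path)
    return hits


# Example usage (hypothetical folder name):
#   for hit in find_pdfs_containing("downloaded_pdfs", "widgets"):
#       print(hit)
```

The extract parameter is there so a different text extractor can be swapped in without touching the search loop.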