Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum


 9:32 pm on Oct 19, 2000 (gmt 0)

who owns this little sucker?



 1:03 am on Oct 20, 2000 (gmt 0)

It could be anybody. It's a personal link validator that can scan pages for a multitude of things, including 404s, JavaScript, dead links, etc.

mark roach

 11:06 am on Oct 20, 2000 (gmt 0)

Has anyone got an example of a bit of Perl that might do this? I would like to feed a program a list of URLs, firstly to check that they exist and secondly to check whether they have a link back to me. A bit like a cut-down version of Brett's Sim Spider :)


 1:36 pm on Oct 20, 2000 (gmt 0)

thanks air! at least it isn't a download tool! ;)


 1:57 am on Oct 21, 2000 (gmt 0)

Mark, are you looking for a bot that will go through an entire web site or just a script that will extract links from one page?


 10:26 am on Oct 21, 2000 (gmt 0)

The script would take a list of URLs from a database and first verify that they exist. Second, it would check that there is a link back to my site on each page.

I have found a snippet of code that will nearly do the job and am currently coding up the solution. I will post it up here if anyone else is interested.
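
For anyone who wants the shape of such a checker, here is a minimal sketch of the two steps described above (does the URL exist, and does the page link back). It is written in Python rather than Perl, using only the standard library, and it is not the code referred to in this post. The names `LinkCollector`, `fetch`, `links_back`, `check_urls`, and the `my_host` parameter are all made up for illustration, and a plain list stands in for the database.

```python
from html.parser import HTMLParser
from urllib.error import URLError
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href targets of <a> tags as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def fetch(url, timeout=10):
    """Return the page HTML, or None if the URL cannot be retrieved."""
    try:
        with urlopen(url, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except (URLError, ValueError, OSError):
        # Covers HTTP errors such as 404, DNS failures, and malformed URLs.
        return None


def links_back(html, page_url, my_host):
    """True if any <a href> on the page points at my_host."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    wanted = my_host.lower()
    return any(urlparse(link).hostname == wanted for link in collector.links)


def check_urls(urls, my_host, fetcher=fetch):
    """Map each URL to 'missing', 'links back', or 'no link back'."""
    results = {}
    for url in urls:
        html = fetcher(url)
        if html is None:
            results[url] = "missing"
        elif links_back(html, url, my_host):
            results[url] = "links back"
        else:
            results[url] = "no link back"
    return results
```

The `fetcher` argument is there so the logic can be tried against canned HTML without touching the network. Comparing host names is only one possible definition of "a link back"; matching a specific URL or path would be an equally reasonable choice.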


 2:53 pm on Oct 21, 2000 (gmt 0)

Okay, if you are on it that's cool. Otherwise, I have code that will do that. I'd have to combine it in the right way for you, though. That is why I asked if you were looking for something that would crawl a site (bit more coding) or just check a single URL from each site. It is always nice to look at someone else's code, so if you are interested in posting it I'd be interested in seeing it.

Are you using HTML::LinkExtor or doing it by parsing the HTML yourself?
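
To illustrate why crawling a whole site is "a bit more coding" than checking a single URL, here is a rough sketch of a breadth-first crawl that stays on one host. It is Python with the standard library only, not Perl and not HTML::LinkExtor, and it is not the code mentioned in this post. `AnchorParser`, `crawl_site`, the `fetcher` callable (expected to return HTML text or None), and the `max_pages` cap are all assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class AnchorParser(HTMLParser):
    """Collect raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl_site(start_url, fetcher, max_pages=50):
    """Breadth-first walk of pages on start_url's host.

    fetcher(url) must return the page HTML or None.
    Returns a dict mapping each fetched URL to its HTML.
    """
    host = urlparse(start_url).hostname
    queue = deque([start_url])
    seen = {start_url}  # URLs already queued, so no page is fetched twice
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetcher(url)
        if html is None:
            continue  # dead link: nothing to parse
        pages[url] = html
        parser = AnchorParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Make the link absolute and drop any #fragment.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).hostname == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return pages
```

The extra work compared with a single-page check is the queue, the set of already-seen URLs, and the same-host test. A real crawler would also want to honour robots.txt and pause between requests, which this sketch leaves out.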


 2:08 pm on Oct 23, 2000 (gmt 0)

I used HTML::LinkExtor (when I finally got it to work!). This was my first attempt at writing a spider. I reckon it would only take a couple more lines of code to get it to trawl the whole web looking for links to my site. Get ready to add "Champdogs Link-Validator" to your robots.txt :)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved