I am trying to write a reciprocal link checker for my website, and I am having a hard time figuring out how to pull URLs out of a file. I can fetch the webpage I need to check and search through it, but I need to extract the URLs so that I can do recursive checks on sub-pages.
Any ideas?
Using preg_split I can split the file on '/http:\/\//i' to get to the links, but I cannot figure out how to strip everything after the end of each URL while keeping the link itself.
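To make it concrete, here is a rough sketch of the kind of result I am after. It assumes preg_match_all instead of preg_split, and it assumes a URL ends at the first whitespace, quote, or angle bracket. The function name extract_urls() and the example.com addresses are made up for illustration only.

```php
<?php
// Hypothetical helper: extract_urls() is an illustrative name, not a built-in.
function extract_urls(string $html): array
{
    // Assumption: a URL starts with http:// or https:// and runs until
    // the first whitespace, quote, or angle bracket.
    $pattern = '~https?://[^\s"\'<>]+~i';
    preg_match_all($pattern, $html, $matches);

    // $matches[0] holds the full matches; drop duplicates and reindex.
    return array_values(array_unique($matches[0]));
}

// Made-up sample markup, just to show the call.
$page = '<a href="http://example.com/a.html">A</a> '
      . "<a href='http://example.com/b.html'>B</a>";
print_r(extract_urls($page));
```

This would only catch absolute links, not relative ones like href="page.html", and a URL written in plain text right before a full stop or comma would keep that trailing punctuation. Parsing the page with DOMDocument and reading the href attributes might be more reliable, but I have not tried that.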
Thanks in advance,
RC