Forum Moderators: coopster


Class for resolving URIs

         

mincklerstraat

11:27 am on Mar 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm beginning to write (and may or may not complete) a script which lets you browse documentation when you're offline - the idea is to allow a lot of extra fancy functionality to be added on top of the original HTML source. First target is the PHP manual (of course), but I'd like to make it flexible enough to handle other manual-like content that's freely available on the web. So I need it to be able to grab the images on a page and cache them, and their paths may be modified by a <base href> tag.

So I'm looking for this tag, splitting it with explode, accounting for whether it begins with [,...] or '/', and it occurs to me: this is a whole lot of work for what is pretty standard HTTP stuff, and anyone who had put real time into writing a class like this would have done a much better job than I can in the limited time I have.

Has anyone come across a class like this, or used one? Basically a class that can resolve relative URIs, or take multiple URIs and put them together into the 'real' URI - like your basic page URL plus the relative URIs for the links, images, etc.

ergophobe

1:00 am on Mar 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For relative URIs can't you get them by subtracting $_SERVER['DOCUMENT_ROOT'] out of the value returned by realpath()?

Anyway, I thought there was a PEAR class for doing complicated URI stuff, but I assume you've looked there already?

mincklerstraat

11:27 am on Mar 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



For relative URIs can't you get them by subtracting $_SERVER['DOCUMENT_ROOT'] out of the value returned by realpath()?

You can for files on your own box - but realpath() doesn't work for remote files. And just subtracting won't handle directory traversal (../) in the URI.
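
To illustrate the directory-traversal point: a minimal sketch, assuming an absolute path (one starting with '/'), of collapsing '.' and '..' segments by plain string handling, without touching the filesystem. The function name removeDotSegments is hypothetical; it is not part of PHP or PEAR.

```php
<?php
// Hypothetical helper (not a PHP or PEAR function): collapse "." and ".."
// segments in an absolute URI path using string operations only.
function removeDotSegments(string $path): string
{
    $segments = explode('/', $path);
    $output = [];

    foreach ($segments as $segment) {
        if ($segment === '.') {
            continue; // "." means "this directory": drop it
        }
        if ($segment === '..') {
            // Step up one level, but never above the root. $output[0] is the
            // empty string that precedes the leading "/".
            if (count($output) > 1) {
                array_pop($output);
            }
            continue;
        }
        $output[] = $segment;
    }

    // A path ending in "." or ".." names a directory, so keep a trailing "/".
    $last = end($segments);
    if ($last === '.' || $last === '..') {
        $output[] = '';
    }

    return implode('/', $output);
}

echo removeDotSegments('/manual/en/../images/./logo.png'), "\n";
// prints /manual/images/logo.png
```

Unlike realpath(), this never checks whether the file exists, so it should work equally well on paths taken from remote URLs.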

I thought there was a PEAR class for doing complicated URI stuff...

Yeah, you'd think, wouldn't you? This would be one of the most obvious classes to have in PEAR. However, I've checked File System, HTML, File Formats, and Tools and Utilities and haven't really found anything answering the description, so I thought I'd ask here.

What I'd like is a function or class that lets me take the URL of a file, and then the URL it has in a <base href> tag, and put them together to come up with the base URL that I need to use when grabbing relative URIs in the page. The same function, or a related one, would take the URL of the page (or the URL as modified by <base href>) and a relative URI from a link or image on the page, and return the full URL. This probably wouldn't be so hard to write if you really had brain processor resources left to use, and time to think about the various possibilities - I'm low on both now, so I know anything I'd write would be slipshod compared to a well-thought-out solution.
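
The two steps described above could be sketched roughly as follows. This is a simplified illustration, not the PEAR implementation: the names collapseDots, resolveUri and effectiveBase are hypothetical, the base is assumed to be a well-formed absolute http(s) URL, and userinfo (user:pass@) in the base is ignored.

```php
<?php
// Sketch only: all function names here are hypothetical.

// Collapse "." and ".." segments in an absolute path.
function collapseDots(string $path): string
{
    $segments = explode('/', $path);
    $out = [];
    foreach ($segments as $seg) {
        if ($seg === '.') {
            continue;
        }
        if ($seg === '..') {
            if (count($out) > 1) { // never climb above the root
                array_pop($out);
            }
            continue;
        }
        $out[] = $seg;
    }
    $last = end($segments);
    if ($last === '.' || $last === '..') {
        $out[] = ''; // keep the trailing slash for a directory
    }
    return implode('/', $out);
}

// Resolve the reference $ref against the absolute URL $base.
function resolveUri(string $base, string $ref): string
{
    // A reference with its own scheme is already absolute.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $ref)) {
        return $ref;
    }
    $b = parse_url($base);
    // Scheme-relative reference such as //host/path.
    if (strncmp($ref, '//', 2) === 0) {
        return $b['scheme'] . ':' . $ref;
    }
    $root = $b['scheme'] . '://' . $b['host']
          . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';

    // Split the reference into its path and its "?query#fragment" tail.
    $pos = strcspn($ref, '?#');
    $refPath = substr($ref, 0, $pos);
    $suffix = substr($ref, $pos);

    if ($refPath === '') {
        // Empty or fragment-only reference: keep the base path and query.
        if ($suffix === '' || $suffix[0] === '#') {
            $suffix = (isset($b['query']) ? '?' . $b['query'] : '') . $suffix;
        }
        return $root . $basePath . $suffix;
    }
    if ($refPath[0] !== '/') {
        // Relative path: append it to the directory part of the base path.
        $refPath = substr($basePath, 0, strrpos($basePath, '/') + 1) . $refPath;
    }
    return $root . collapseDots($refPath) . $suffix;
}

// Step one: the base to use for a page, given its optional <base href> value.
function effectiveBase(string $pageUrl, ?string $baseHref): string
{
    return $baseHref === null ? $pageUrl : resolveUri($pageUrl, $baseHref);
}

$page = 'http://example.com/manual/en/index.php';
echo resolveUri(effectiveBase($page, '../fr/'), 'images/logo.png'), "\n";
// prints http://example.com/manual/fr/images/logo.png
```

Because effectiveBase() simply runs the <base href> value through resolveUri(), a relative <base href> needs no separate handling in this sketch.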

<edit: guess which page turns up first googling php class resolve uri? hint: it contains forum88 in the url>
<edit: ergophobe, you're right, I didn't check the big category 'Networking' in PEAR; there's a class there, Net_URL. I hope it isn't too big and bulky, since I might have to call it a lot in big loops.>

ergophobe

4:01 pm on Mar 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




big category 'networking' in PEAR, there's a class here Net_URL

I don't really frequent PEAR, but something in my addled brain was ringing there. Report back on what you find.

mincklerstraat

11:36 am on Mar 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The Net_URL class has a function for resolving URIs that contain directory traversal - this is something realpath() can do for local files, but not for arbitrary URIs. I'll just have to write a function which merges a URL and a URI to produce a URL, with any special cases for <base href> I can think of (at the moment I don't think there are any, except that <base href> can itself be a relative URI). The function is quite simple and straightforward; the main weight off my back is knowing that this is a PEAR package, and exists inside a framework of rigid quality control, so it should account for the various exceptions or special conditions that I might have overlooked.