Forum Moderators: coopster
O.k., here's what I'm trying to do:
We have a form where the client supplies a URL, among other things. The URL is really the point of the whole form, so I want to validate the submitted URL to ensure that it 1) is syntactically correct and 2) actually exists. 1) is not a problem - easily done.
What I've tried for 2):
Multiple PHP scripts to retrieve the header information returned by a request for the URL, etc. I've now even tried cURL/PHP techniques.
The problem:
When the (syntactically correct) URL clearly does not exist (e.g. pasting it into the address bar of a browser returns the correct 4xx or 5xx HTTP status codes - e.g. Server not Found, Document not found, etc.), the host of my website is returning a "200 OK" (or sometimes some 3xx codes followed by 200 OK) pointing to some error splash page: "Document/Server not found". This is precisely what I don't want - I want the 4xx or 5xx status code from the header of the requested URL so that we can prompt the client to correct their URL submission. I have contacted our host but, as yet, no solution is forthcoming.
I know this is a common problem, but I have yet to find a solution. Anyone have any ideas?
maybe look at this one
[zend.com...]
maybe take a look at fsockopen [php.net] for the second part, the example there looks like it may be just right.
you could look at get_headers [php.net] as well, though the whole URL Functions [php.net] portion of the manual is interesting
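For what it's worth, here is a minimal sketch of the get_headers idea. It assumes get_headers is left in its default mode, where it follows redirects and returns the raw header lines of every response as a plain list. The helper names status_chain and url_status_chain are made up for illustration and are not part of PHP.

```php
<?php
// Collect every HTTP status code found in a list of raw header lines,
// in arrival order (redirect hops first, final response last).
function status_chain(array $headerLines): array
{
    $codes = [];
    foreach ($headerLines as $line) {
        // Status lines look like "HTTP/1.1 404 Not Found" or "HTTP/2 200".
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m)) {
            $codes[] = (int) $m[1];
        }
    }
    return $codes;
}

// Hypothetical wrapper: fetch the headers for $url and return the status chain,
// or null when the request fails outright (e.g. the host name does not resolve).
function url_status_chain(string $url): ?array
{
    $headers = @get_headers($url);
    return $headers === false ? null : status_chain($headers);
}
```

Looking at the whole chain rather than only the last code may help here: a chain such as 302 followed by 200 shows that the final "200 OK" came from wherever the redirect pointed, not from the URL that was submitted.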
I'll repeat the guts of the problem (whether I use fsockopen(), get_headers, cURL/PHP or whatever):
When the (syntactically correct) URL clearly does not exist (e.g. pasting it into the address bar of a browser returns the correct 4xx or 5xx HTTP status codes - e.g. Server not Found, Document not found, etc.), the host of my website is returning a "200 OK" (or sometimes some 3xx codes followed by 200 OK) pointing to some error splash page: "Document/Server not found". This is precisely what I don't want - I want the 4xx or 5xx status code from the header of the requested URL so that we can prompt the client to correct their URL submission.
So basically, my host is hijacking some 4xx and 5xx header responses and returning a false 200 OK to my scripts. Usually these are for URLs like [thisdoesntexist.com...] (i.e. no path or document)
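One possible explanation, and it is only a guess from the symptoms described, is that the host's DNS resolver answers lookups for nonexistent domains with the address of its own splash page server. That would fit the detail that it mostly happens for bare domains that don't exist. A rough way to check from the server is to look up a name that should never resolve and see whether an address comes back anyway. The function name below is hypothetical; gethostbynamel is the standard PHP call, and the resolver is passed in as a parameter only so the check can be tried without touching the network.

```php
<?php
// Returns true when a host name that should never exist still resolves,
// which suggests the resolver is rewriting "not found" DNS answers.
// $resolve is injectable for testing; by default it is PHP's gethostbynamel(),
// which returns a list of IPv4 addresses or false on failure.
function resolver_rewrites_missing_hosts(?callable $resolve = null): bool
{
    $resolve = $resolve ?? 'gethostbynamel';
    // The .invalid top-level domain is reserved (RFC 2606), so a random
    // name under it should not resolve on an honest resolver.
    $probe = 'probe-' . bin2hex(random_bytes(6)) . '.invalid';
    $result = $resolve($probe);
    return is_array($result) && count($result) > 0;
}
```

If this returns true when run on the web host, the "200 OK" is probably coming from the resolver's splash server, and the status code alone can't be trusted for domain-level failures on that host.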