Not knowing the best way to check the headers returned by a URI on my own site, I tried fsockopen(). My custom 404 error page URL-decoded my test URI and offered the correct link. However, I realised that truly non-existent pages would create an endless loop, because the 404 error page would check the URI again, which would start another 404 error page, and so on...
(Don't I feel stupid! ;) )
I'm using mod_rewrite, so file_exists() is not an option. Is there a way to do what I'm trying to achieve without creating an endless loop?
I have a "virtual" URL (http://www.example.com/city-centre/banker's-draft/) which mod_rewrite rewrites to http://www.example.com/search.php?area=city-centre&pub=banker's-draft. The search script then searches the database based on the area and pub name provided.
That's all well and good. But a small number of spiders and browsers (e.g. Opera) URL encode the apostrophe and request [example...]
My mod_rewrite rule is set up to match a-z, 0-9, apostrophes, ampersands, full stops (periods) and hyphens, like so:
RewriteRule ^([a-z-]+)/([a-z'0-9&\.-]+)/$ search.php?area=$1&name=$2 [NC,L]
When a request is made that has been URL encoded, mod_rewrite fails to make a match and so a 404 error is generated.
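To show the mismatch in isolation, here is a minimal PHP sketch that applies the same character classes as a PCRE pattern to the plain and the encoded path. It only tests the pattern against the strings given; exactly what string Apache hands to mod_rewrite is an assumption here, and the variable names are made up for the example.

```php
<?php
// The RewriteRule pattern as a PCRE pattern; the [NC] flag becomes the "i" modifier.
$pattern = '~^([a-z-]+)/([a-z\'0-9&\.-]+)/$~i';

$plain   = "city-centre/banker's-draft/";   // apostrophe sent as-is
$encoded = "city-centre/banker%27s-draft/"; // apostrophe sent as %27

// The plain path matches: preg_match() returns 1.
var_dump(preg_match($pattern, $plain));

// The encoded path does not: "%" is not in the character class, so 0.
var_dump(preg_match($pattern, $encoded));

// Decoding first restores the apostrophe, so this matches again.
var_dump(preg_match($pattern, rawurldecode($encoded)));
```

This suggests the pattern itself is fine and that only the "%27" form of the request falls outside it.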
I thought that it would be a good idea to use a custom 404 error page (a PHP script) to URL decode the $REQUEST_URI and open a socket to see whether the decoded URL would be matched by mod_rewrite and return a 200 OK header.
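The header check described above could look roughly like the following. This is only a sketch, not my exact script: the function names are made up for illustration, and it assumes plain HTTP on port 80 and PHP 7.1 or later for the nullable return types.

```php
<?php
// Build a minimal HEAD request for $path on $host.
function build_head_request(string $host, string $path): string
{
    return "HEAD {$path} HTTP/1.0\r\n"
         . "Host: {$host}\r\n"
         . "Connection: close\r\n\r\n";
}

// Pull the numeric status out of a line such as "HTTP/1.1 200 OK".
function parse_status_code(string $statusLine): ?int
{
    return preg_match('~^HTTP/\d\.\d\s+(\d{3})~', $statusLine, $m)
        ? (int) $m[1]
        : null;
}

// Request the headers for $path and return the status code, or null on failure.
function probe_status(string $host, string $path, float $timeout = 5.0): ?int
{
    $fp = @fsockopen($host, 80, $errno, $errstr, $timeout);
    if ($fp === false) {
        return null; // could not connect
    }
    fwrite($fp, build_head_request($host, $path));
    $statusLine = fgets($fp); // the first response line carries the status
    fclose($fp);

    return $statusLine === false ? null : parse_status_code($statusLine);
}
```

A call such as probe_status('www.example.com', rawurldecode($uri)) would then return 200 when the decoded URL is served normally, or 404 when it is not.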
That proved successful for URLs that had been URL encoded (like in the example above). However, for truly non-existent files (e.g. http://www.example.com/load_of_junk.htm) a loop would be created. This was because of my faulty logic!
Upon requesting http://www.example.com/load_of_junk.htm a 404 would be returned and the 404 script would URL decode that URI and try to request the decoded URI again. As the page really doesn't exist, this would trigger another 404 error and another instance of the script would URL decode the address and request it again... And so this would theoretically go on forever.
All I really want to do is check whether a URL will be matched with mod_rewrite and return a 200 OK header without triggering an endless chain of 404s!
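One possible guard, offered as a sketch rather than a tested fix: only re-check the URI when decoding actually changes it. A truly non-existent page such as /load_of_junk.htm decodes to itself, so the 404 script can serve a plain 404 straight away instead of requesting it again. The helper name below is hypothetical, and it uses rawurldecode() on the assumption that a "+" in the path should stay a "+".

```php
<?php
// Return the decoded URI only if decoding changed something.
// Null means there was nothing encoded, so this is a genuine 404.
function decoded_candidate(string $requestUri): ?string
{
    $decoded = rawurldecode($requestUri);

    return $decoded === $requestUri ? null : $decoded;
}

// Intended use inside the 404 script (not executed here):
//
//   $candidate = decoded_candidate($_SERVER['REQUEST_URI']);
//   if ($candidate === null) {
//       // Nothing to decode: show the normal 404 page, make no new request.
//   } else {
//       // Check $candidate once, e.g. with a HEAD request over a socket.
//   }
```

If the check on the decoded URI itself ends up in the 404 script, that second run finds nothing left to decode and stops, so the chain should end after one extra request for a singly encoded URI.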
Sorry for the long post and I hope that it makes more sense now!