
Apache Web Server Forum

    
need help with .htaccess
jack13580




msg:4569350
 2:58 pm on Apr 30, 2013 (gmt 0)

I am making a URL parsing application and ran into a problem.

When I curl a fake domain (for example, 1.com) to test whether my application can tell it's fake, it says it's real and returns a 200 header code.

The conclusion I came up with is that the web server must be redirecting to a page-not-found page: if I put that same fake url into my browser it takes me to a Verizon search page.



Is there any way to catch this using .htaccess and redirect to my own custom error page?

 

lucy24




msg:4569457
 10:01 pm on Apr 30, 2013 (gmt 0)

"need help with .htaccess"

Snappy reply: Don't we all :)

Now then... Your question was a bit fuzzy, so I'm going to guess about what's happening.
"returns a 200 header code"

Do you mean
#1 the server sends out a 200
or
#2 the recipient gets a 200
?

Most of the time they are the same. But if the process involves sending a request to a preliminary file which then returns a header, your server will always show a 200. The obvious example is a php page which has to process some parameters before showing any content. If the parameters are bad, the resulting 404 or 403 comes from the php, not the server.

This in turn means the user won't get an error page, because the server does not know there has been an error. So you need to include the content of your error page. Do not redirect; show the content of the page at the originally requested URL.
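
To make the "do not redirect" advice concrete, here is a minimal sketch, written in Python rather than the PHP mentioned in the thread. It assumes a script that validates a query parameter itself. The parameter name `id`, the set `KNOWN_IDS`, and the error page markup are placeholders invented for illustration, not anything from the original posts. When the parameter is bad, the script sends the 404 status and the error page body in the same response, at the URL that was requested.

```python
from urllib.parse import parse_qs

# Placeholder error page; a real site would load its own error document here.
ERROR_PAGE = b"<html><body><h1>Page not found</h1></body></html>"

# Hypothetical set of valid values for the "id" query parameter.
KNOWN_IDS = {"1", "2", "3"}


def app(environ, start_response):
    """WSGI app that answers bad parameters with a 404 in place, not a redirect."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    item_id = params.get("id", [""])[0]

    if item_id not in KNOWN_IDS:
        # Status and error content go out together at the requested URL.
        # No Location header is sent, so the client never sees a 3xx or a 200.
        start_response("404 Not Found", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(ERROR_PAGE))),
        ])
        return [ERROR_PAGE]

    body = f"<html><body>Item {item_id}</body></html>".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The point of the design is that a robot checking status codes and a human reading the page both get the right answer from a single response.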

phranque




msg:4569513
 2:35 am on May 1, 2013 (gmt 0)

your .htaccess file can only have an effect on requests that go to your server.
it is unlikely that the request you describe will resolve to your own server.

i would expect something more like this request & response:
GET http://1.com/
500 Can't connect to 1.com:80 (Bad hostname)

lucy24




msg:4569521
 3:11 am on May 1, 2013 (gmt 0)

Oops. Did I read your question backward?

"if I put that same fake url into my browser it takes me to a Verizon search page"

If you can come up with a simple tool that can distinguish mechanically between

-- typo or alias domains that redirect to a different domain of the human owner's choice
and
-- parked domains that either redirect or show content from the dragon in residence

... well, you'll be able to retire to Aruba next year. Identifying domains that aren't registered at all is trivial by comparison: your robot will go the rounds of DNS and will come up empty-handed. But it sounds as if that isn't what you're looking for.
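
The "trivial" case can be sketched roughly as follows, in Python rather than the PHP the original poster uses. The function name `domain_resolves` and its `resolver` parameter are made up for this example; the parameter exists only so the lookup can be swapped out when testing. The sketch asks the system resolver for the hostname and treats a failed lookup as "does not exist". It says nothing about parked or alias domains, and it relies on the resolver reporting failures honestly.

```python
import socket


def domain_resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the resolver yields at least one address for hostname.

    `resolver` defaults to socket.getaddrinfo and is injectable for testing.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the lookup failed (for example, the name does not exist).
        # UnicodeError: the hostname could not be encoded as a valid DNS name.
        return False


if __name__ == "__main__":
    # Results depend on the network and resolver this runs on.
    for name in ("example.com", "no-such-host.invalid"):
        print(name, domain_resolves(name))
```

Running a check like this before making the HTTP request lets the application reject an unresolvable hostname without looking at a status code at all.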

jack13580




msg:4569560
 7:42 am on May 1, 2013 (gmt 0)

I'm going to re-explain this.

My application parses user-entered URLs and decides whether they are real based only on whether they return a 200 header code, using PHP and curl.


The problem comes when I use a fake URL such as 1.com.
My application gets a 200 header code back even though the site doesn't exist.
Through research I have determined that my web host might be redirecting the fake URL that curl requests to one of their error pages.

Would there be any way to catch this and have it handled by my own custom error page?

Another example of what my web host might be doing: when I enter that fake URL into my browser myself, I end up redirected to a Verizon search page saying the URL could not be found.

phranque




msg:4569711
 4:44 pm on May 1, 2013 (gmt 0)

if i had to guess it's not your web host doing this.
verizon is probably your ISP and when the DNS lookup fails they are hijacking your request and returning their search page.
do a search on "verizon NXDOMAIN response hijacking".
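
One way an application might notice this kind of hijacking is sketched below, in Python rather than the poster's PHP. The idea is to look up a random name that cannot exist and see whether the resolver returns an address anyway. The sketch uses the reserved `.invalid` top-level domain for the probe; the function names, the `resolver` parameter, and the `probe_suffix` parameter are all invented for this example. A resolver may only rewrite lookups under real TLDs, in which case a different suffix would be needed, so treat this as a heuristic rather than a reliable test.

```python
import socket
import uuid


def resolve_ips(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses for hostname, or an empty set on failure."""
    try:
        infos = resolver(hostname, None)
    except (socket.gaierror, UnicodeError):
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def looks_hijacked(hostname, resolver=socket.getaddrinfo, probe_suffix="invalid"):
    """Return True if hostname shares an address with a name that cannot exist."""
    probe = f"{uuid.uuid4().hex}.{probe_suffix}"
    bogus_ips = resolve_ips(probe, resolver)
    if not bogus_ips:
        # The probe failed to resolve, so failures are being reported honestly.
        return False
    return bool(resolve_ips(hostname, resolver) & bogus_ips)


if __name__ == "__main__":
    # Output depends entirely on the resolver of the machine running this.
    print(looks_hijacked("1.com"))
```

If the probe does resolve, any hostname that lands on the same address is probably being answered by the ISP's search page, so a 200 from it should not count as proof that the site exists.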

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved