
PHP Server Side Scripting Forum

    
PHP app causing a problem with 404 error reporting
using x-httpd-php5 in .htaccess file but it is allowing junk to be added
nstokes



 
Msg#: 4664774 posted 1:06 am on Apr 22, 2014 (gmt 0)

A long time ago I added to my .htaccess file
AddHandler application/x-httpd-php5 .php .html .htm .shtml

I did this so I could load my menus and footer with a simple piece of code in my web pages. This works great, but since moving to a new host I have noticed weird URLs like:
http://example.com/amsoil/filters/air-filters-main.html/7_3-powerstroke-air-filter.html

Notice this is actually showing two URLs? It also does not report a 404 error. As a result my site is dropping in traffic like a rock, and Google's daily impressions are dropping like a rock too.

I have tried to get support from the web host, but they say it is a problem with my .htaccess. The fact is, if I remove:
AddHandler application/x-httpd-php5 .php .html .htm .shtml
from the .htaccess I lose my menus and footers but I do get a 404 error for files listed like the one above. If I put the line back, I lose the correct 404 error again. In short, they are no help. To further complicate this, I am in no way a PHP kind of guy; it is Greek to me, but I desperately need this fixed.

Can anyone help me figure this out? This is my 1st post, so I hope I have supplied enough detail. Thanks,
Norm

 

lucy24




 
Msg#: 4664774 posted 1:38 am on Apr 22, 2014 (gmt 0)

Let's backtrack to what you want to do and then work out how to do it. Are the header and footer coded as php includes or as SSIs?

For posterity (because the link will probably be deleted soon):

The resulting URLs look like this:
http://example.com/dir1/dir2/pagename.html/other-file-name.html
If you want to study your logs for clues, look at the timestamp of this post and then scan for user-agent "Camino". That was me. See if there are any logged errors.

Is "other-file-name.html" -- the part after ".html/" -- named in the page code, for example in an SSI reference?

Quick edit: If it's a php include, you won't find a 404 error. Do you have access to your php logs? On shared hosting, you might not. In fact, even an ordinary SSI probably won't show up in regular access logs, though it should appear in error logs as two lines: one for "can't find file" and one for "unable to include".

nstokes



 
Msg#: 4664774 posted 1:53 am on Apr 22, 2014 (gmt 0)

What I have in my web pages is code such as:
<?php include($_SERVER['DOCUMENT_ROOT'].'/_inc/top_nav_main.shtml'); ?>
To make this work I have:
AddHandler application/x-httpd-php5 .php .html .htm .shtml
in my .htaccess file.
The file /_inc/top_nav_main.shtml is nothing more than an unordered list that is styled with CSS to make the menu at the top.

Google Webmaster Tools is now showing trash like:
Pages with duplicate meta description:
AMSOIL Air Filters Eaa provide the Absolute Air Filtration Available proividing clean air with max a
/amsoil/air-filter/
/amsoil/air-filter/Air-filters-main.html
/amsoil/air-filter/filter-main.htm
/amsoil/air-filter/mann-air-filters.htm
/amsoil/filters/air-filters-main.html/7_3-powerstroke-air-filter.html
/amsoil/filters/air-filters-main.html/Air-filters-main.html
/amsoil/filters/air-filters-main.html/Ea-Motorcycle-Air-Filters.html
/amsoil/filters/air-filters-main.html/mann-air-filters.htm
/amsoil/filters/air-filters-main.html/powercore-air-filters.html
/amsoil/filters/air-filters-main.html/racing_air_filters.html
/amsoil/filters/air-filters-main.html/twin-air.html

These are clearly invalid URLs, but if you go to http://example.com/amsoil/filters/air-filters-main.html/Ea-Motorcycle-Air-Filters.html
you will see that the page for the 1st URL (air-filters-main.html) loads and the status returned is 200 OK which is wrong. If I remove the AddHandler line from the .htaccess file I lose the menu, but the 404 error triggers correctly. Where this concatenation of URLs is coming from I do not know, but Google has been dropping me from its index fast since this started showing up.
Does this explanation help?

lucy24




 
Msg#: 4664774 posted 2:34 am on Apr 22, 2014 (gmt 0)

the status returned is 200 OK which is wrong

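Actually, it's right. It has to do with the way different handlers treat trailing path info (the part after the extension): by default Apache rejects it for plain files but generally accepts it for scripts, and your .html files are now being handled as PHP scripts. But that's no comfort to you here.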

To make this work I have:
AddHandler application/x-httpd-php5 .php .html .htm .shtml

Meaning that your files have .html extension but you're parsing them all as php? But the sole purpose of the php is to add those includes?
<tangent>
If so, is this really the most efficient way to do it? I'd have thought an SSI would use fewer resources.
</tangent>
if I remove:
AddHandler application/x-httpd-php5 .php .html .htm .shtml
from the .htaccess I lose my menus and footers but I do get a 404 error for files listed like the one above

If you remove the line, your .html pages are no longer being parsed as php and the path-info rules change. But that doesn't address the underlying problem.
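
The path-info behaviour can also be controlled on its own with Apache's AcceptPathInfo directive. The fragment below is only a minimal sketch: it assumes the host allows FileInfo overrides in .htaccess and that PHP is run in a way that honours the directive (it may behave differently when PHP runs as CGI). It has not been tested on this site.

```apache
# Keep parsing .html files as PHP, exactly as in the existing .htaccess
AddHandler application/x-httpd-php5 .php .html .htm .shtml

# Reject requests that carry trailing path info, for example
# /dir/file1.html/file2.html, with a 404 instead of serving file1.html
AcceptPathInfo Off
```

If this works on the host, /dir/file1.html should still load with its includes while /dir/file1.html/file2.html returns a 404. If the setting has no effect, or the server answers with a 500 error, the host probably does not permit it.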

I see three things:
the "real" or "base" URL
the added stuff after the .html extension
the material to be included

Some cursory experimenting (me again in your logs!) suggests that the added parts exist as filenames in their own right. So what you've got, legitimately, is
example.com/dir/file1.html
and
example.com/dir/file2.html
which is somehow turning into
example.com/dir/file1.html/file2.html

Who's requesting these bogus pages? Only the googlebot, or other visitors as well? One thing you can do is add a line to htaccess. Assuming you've already got mod_rewrite in place:

RewriteRule ^([^.]+\.html). http://www.example.com/$1 [R=301,L]

Don't cut and paste: that's one possible wording, and may not be optimal for your site. The idea is simply to grab any request with stuff after "html" and forcibly redirect to a form without the added stuff.
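
A slightly fuller sketch of the same idea follows. It spells out the slash after the extension and also covers .htm and .shtml. It assumes mod_rewrite is available, that the .htaccess file sits in the document root, and that www.example.com is the canonical hostname; all three are assumptions, and the pattern is one possible wording, not a tested rule for this site.

```apache
RewriteEngine On

# If a request continues past "<name>.html/", "<name>.htm/" or
# "<name>.shtml/", redirect permanently to the part up to the extension.
# Example: /dir/file1.html/file2.html  ->  /dir/file1.html
RewriteRule ^(.+?\.(?:html?|shtml))/ http://www.example.com/$1 [R=301,L]
```

The lazy `.+?` stops at the first extension followed by a slash, so a doubled-up request such as /dir/file1.html/file2.html/file3.html would also be sent back to /dir/file1.html.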

I don't see the connection between the included files and the garbage URLs. I kinda suspect there are two separate and unrelated issues. A php include of the kind you're using is pretty generic-- not the kind of thing you can blame on a different host using a different php version.

nstokes



 
Msg#: 4664774 posted 2:47 am on Apr 22, 2014 (gmt 0)

I am thinking it would be a valid idea to switch to SSI and drop the PHP include altogether. My 1st attempts to get SSI to work failed, but I am researching. As for who is requesting these pages, I have no clue at this point. I do appreciate the help. Back to the drawing board; I need to figure out a solution.
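
For reference, a minimal sketch of what an SSI setup in .htaccess might look like. It assumes mod_include is loaded and that the host allows Options and FileInfo overrides, neither of which is confirmed in this thread; the include path is the one quoted earlier in the thread.

```apache
# Allow server-side includes in this directory tree
Options +Includes

# Run the INCLUDES output filter on these extensions
AddOutputFilter INCLUDES .shtml .html .htm

# In each page, the PHP include line would then be replaced by an
# SSI directive such as:
#   <!--#include virtual="/_inc/top_nav_main.shtml" -->
```

For this to take over, .html, .htm and .shtml would also have to come out of the existing AddHandler application/x-httpd-php5 line, so the pages are no longer handed to PHP.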


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved