
Apache Web Server Forum

    
Extensionless URLs stopped working unexpectedly, returning a 404 error
w3bmastine




msg:4686145
 7:30 pm on Jul 8, 2014 (gmt 0)

Hi Everyone,

Today I ran into a problem: my .htaccess file, which had worked for years, stopped working with my hosting provider. I contacted them about this, but they tell me they cannot help. I assume my rules are outdated?!

What happens is that index.html gets redirected as in rule 3, and even rule 4 seems to work: http://example.com/test.html becomes http://example.com/test, but although test.html exists, I get a 404 error. This started today. Has anyone else run into this or a similar problem? Can anyone help?


# rewrite defaults
RewriteEngine on
RewriteBase /

#1 charsets
AddCharset UTF-8 .html

#2 trailing slashes
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^!/$ http://example.com/$1/ [R=301,L]

#3 redirect index.html to directory
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://example.com/$1 [R=301,L]

#4 force extensionless url's
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*[^/.]+\.html\ HTTP/
RewriteRule ^(([^/]+/)*[^/.]+)\.html$ http://example.com/$1 [R=301,L]

#5 canonical hostname
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

#6 prevent hotlinking
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?example\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^-?$
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F]

#7 caching
<IfModule mod_headers.c>
# YEAR
<FilesMatch "\.(ico|gif|jpg|jpeg|png|flv|pdf)$">
Header set Cache-Control "max-age=29030400"
</FilesMatch>
# WEEK
<FilesMatch "\.(js|css|swf)$">
Header set Cache-Control "max-age=604800"
</FilesMatch>
# 45 MIN
<FilesMatch "\.(html|htm|txt)$">
Header set Cache-Control "max-age=2700"
</FilesMatch>
</IfModule>

#8 compression
<ifModule mod_deflate.c>
SetOutputFilter DEFLATE
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary
Header append Vary User-Agent env=!dont-vary
</ifModule>

#9 deny request based on request method
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK|OPTIONS|HEAD)$ [NC]
RewriteRule ^.*$ - [F]

#10 blocking [403 error code]
order allow,deny
# hack-attempts
deny from 14.102.148.38
deny from 144.76.247.203
deny from 198.57.247.156
deny from 201.75.193.190
deny from 202.166.193.69
deny from 202.147.169.205
deny from 37.26.108.85
deny from 46.4.15.141
deny from 49.50.8.63
deny from 50.87.144.147
deny from 82.160.134.5
deny from 91.109.19.34
deny from 93.174.93.163
deny from 94.19.107.228
deny from 94.199.206.71
# ua-string: various
deny from 193.150.120.14
deny from 195.242.218.133
deny from 199.187.122.91
deny from 46.118.155.40
deny from 46.119.115.144
deny from 46.119.120.207
deny from 82.193.99.33
# ua-string: NerdyBot
deny from 107.178.209.42
deny from 107.178.217.202
# ua-string: Web Crawler
deny from 76.9.31.75
# ua-string: updown_tester
deny from 162.243.114.218
deny from 162.243.115.95
deny from 198.199.124.206
# ua-string: YisouSpider
deny from 42.156.136.54
deny from 42.156.139.110
# semalt.com bot
deny from 177.135.46.31
deny from 187.66.223.150
# seomon bot
deny from 176.9.101.134
# sogou.com
deny from 113.106.13.161
allow from all


Thank you in advance.

P.S.: I know my IP-blacklisting is not very sexy, but it worked. :) Please do not comment on these. Thank you.

 

phranque




msg:4686170
 8:36 pm on Jul 8, 2014 (gmt 0)

welcome to WebmasterWorld, w3bmastine!


have you checked for additional clues in the web server error log?
that should tell you the precise filepath apache is looking for.

lucy24




msg:4686172
 8:50 pm on Jul 8, 2014 (gmt 0)

Please do not comment on these.

If you don't want comments, why did you include these lines in the post? "Deny from" belongs to a different mod and therefore can't possibly affect RewriteRules.

:: now studying mod_rewrite section of htaccess to find the RewriteRule that rewrites extensionless URL to same URL plus ".html" ::

... and ...

I don't find it. Either you or someone else has deleted an essential rule.
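
For reference, the missing piece would normally be an internal rewrite along these lines. This is only a sketch, not the rule that was originally in the file: it assumes the .htaccess sits in the document root, that %{DOCUMENT_ROOT} points at the real filesystem path (not always true on shared hosting), and that the rule is placed after the external redirects (#3 to #5).

```apache
# Sketch only: internally map /test to /test.html when that file exists.
# Assumes this .htaccess lives in the document root.
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(([^/]+/)*[^/.]+)$ /$1.html [L]
```

Because there is no [R] flag, the visitor keeps seeing the extensionless URL. Rule #4 should not fire on the rewritten request, since its condition looks at %{THE_REQUEST}, which still holds the original extensionless request line.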

Now, as long as we're here:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^!/$ http://example.com/$1/ [R=301,L]

There was no capture, so what is $1 supposed to refer to? In any case the form
^!/$
makes no sense; cursory experimentation reveals that ! in a non-pattern-initial position is interpreted as a literal exclamation mark. Did you let your cat edit your htaccess file again?
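
If the intent was to add a trailing slash to directory requests (a guess, since the original pattern is too damaged to tell), a version with an actual capture might look like the sketch below. Note that mod_dir's DirectorySlash normally does this already, so the rule may not be needed at all.

```apache
# Sketch only, assuming the goal is "add a trailing slash to directories".
# The capture group gives $1 something to refer to.
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ http://example.com/$1/ [R=301,L]
```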

#6 prevent hotlinking
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?example\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^-?$
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F]

#9 deny request based on request method
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK|OPTIONS|HEAD)$ [NC]
RewriteRule ^.*$ - [F]

Rules ending in [F] go before all other RewriteRules. And they need to stay together, so you can find them later, instead of being mixed in with rules from other mods.

Edit: Why block HEAD requests? In fact I take the opposite tack and give them a free ride in mod_rewrite. All they're doing is checking whether the file exists-- for example a link checker when there's no fragment to follow up on.
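
Put together, the ordering described above could be sketched like this. The rules are the OP's own, only regrouped, with HEAD taken out of the method list; the comments marking where the other rules go are placeholders, not working directives.

```apache
RewriteEngine on

# 1. Access-control rules ([F]) first, kept together
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK|OPTIONS)$ [NC]
RewriteRule ^ - [F]

RewriteCond %{HTTP_REFERER} !^http://(.+\.)?example\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^-?$
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F]

# 2. External redirects ([R=301,L]) next: index.html, .html, hostname

# 3. Internal rewrites (no [R]) last
```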

encyclo




msg:4686195
 11:05 pm on Jul 8, 2014 (gmt 0)

Welcome to WebmasterWorld, w3bmastine. As mentioned above, there doesn't appear to be anything in the posted .htaccess that would make the extensionless URLs resolve.

My guess is that a previous configuration of Apache by the hosting company permitted extensionless URLs - possibly with content negotiation - and they have changed this accidentally or deliberately.

Whilst it isn't necessarily an ideal permanent solution, try adding this to the .htaccess:

Options +MultiViews

Does it help?

lucy24




msg:4686210
 1:02 am on Jul 9, 2014 (gmt 0)

Options +MultiViews

Damn, that's brilliant. I normally think of MultiViews as handling "wrong" extensions, but it also works on extensionless. But not an ideal approach, since it means that every single time there's a 404, the server has to do further work.

I checked to see whether the Options setting had any significant changes that I'd forgotten about. But although 2.2 and 2.4 are significantly different, the former default "All" explicitly omits MultiViews. So this alone wouldn't make any difference. Unless, of course, the host replaced the config file and forgot to restore Options +MultiViews. And the Options default doesn't seem to have changed between 1.3 and 2.2. Though if the host only just upgraded from 1.3 -- now there's a prospect to make the blood run cold! -- I should think you'd notice other stuff changing. Even from 2.2 to 2.4.

In any case, if the host has just upgraded their server, I doubt they'd be managing a 24-hour turnaround on non-upgrade-related support issues ;)

But the
^!/$
business points strongly to the idea that OP's cat has been messing about with the htaccess file.

encyclo




msg:4686213
 1:21 am on Jul 9, 2014 (gmt 0)

But not an ideal approach


Sure, but getting the site working again always comes first, optimizing comes later... :)

w3bmastine




msg:4686466
 7:36 pm on Jul 9, 2014 (gmt 0)

Just to let you know: Options +MultiViews fixed my problem. My provider even came back to me, telling me about changing this option.

Thank you for your help, encyclo, phranque, lucy24. And thank you for getting back to me.
---

@lucy24 / @encyclo - I want to come back to your ideas / suggestions but will not be able to do so this week. As I'm new to this board, should I open a new thread or can we discuss your ideas in this thread?

phranque




msg:4686473
 8:54 pm on Jul 9, 2014 (gmt 0)

start a new thread, posting just the relevant part of the .htaccess.

a more focussed thread title and discussion is always helpful for future searches related to your problem.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved