
Apache Web Server Forum

This 113 message thread spans 4 pages: 1 2 3 [4]
Trailing slash
qimqim



 
Msg#: 4647301 posted 11:47 pm on Feb 20, 2014 (gmt 0)

Very big, small problem... which I can't solve.

I have a login procedure on my site. From some of the pages, after a successful login the visitor is redirected to the page he was viewing. All works fine, except that when the page opens its URL has a slash at the end, and the page does not display properly until you manually delete the slash from the URL.

How can I remove the slash in htaccess?

I tried RewriteRule ^(.*)/+$ $1 [R=301,L]
but got this:
http://example.net/home3/pintotou/public_html/Asia/Indonesia/bali.php

instead of this, which is what I want:

http://example.net/Asia/Indonesia/bali.php

without the htaccess code I am getting

http://example.net/Asia/Indonesia/bali.php


Thank you for your help

[edited by: phranque at 9:30 am (utc) on Feb 21, 2014]
[edit reason] Please Use example.com [webmasterworld.com] [/edit]

 

qimqim



 
Msg#: 4647301 posted 6:55 am on Feb 27, 2014 (gmt 0)

Hi Lucy

I'm sorry, but I can't understand what we are trying to achieve now with this:

Here it would be
^([^/]+/)*
or rather-- ahem, oops, my bad--
^(([^/]+/)*)
so the whole thing gets captured. Or, if you have no literal periods anywhere,
^([^.]*)index\.html



Could you give me the whole line, please, as it should be?

You didn't answer my question regarding #bandwidth theft. We have now taken off the www before instructing later in the code to make it obsolete. I would have thought this instruction (#bandwidth...) should come after the index redirect.

Thanks

lucy24




 
Msg#: 4647301 posted 10:41 pm on Feb 27, 2014 (gmt 0)

Full rule:
RewriteRule ^(([^/]+/)*)index\.html http://www.example.com/$1 [R=301,L]

You can argue about whether the condition needs a full-scale
^[A-Z]{3,9}\ (([^/]+/)*)index\.html\ HTTP
or merely unanchored
index\.html
On at least some systems, you don't need the condition at all. All of mine simply use the [NS] flag.
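The difference between the two capture forms can be checked outside Apache. The following is only a sketch in Python's re module, on the assumption that it behaves like Apache's regex engine for patterns this simple; the path is made up for illustration.

```python
import re

# Hypothetical request path, as mod_rewrite sees it in .htaccess (no leading slash).
path = "Asia/Indonesia/index.html"

# Repeated group alone: the group keeps only its last repetition.
last_only = re.match(r"^([^/]+/)*index\.html", path)
print(last_only.group(1))   # Indonesia/

# Outer parentheses around the repetition capture the whole directory path.
whole = re.match(r"^(([^/]+/)*)index\.html", path)
print(whole.group(1))       # Asia/Indonesia/

# For a root-level index.html the outer group is empty, so $1 adds nothing.
root = re.match(r"^(([^/]+/)*)index\.html", "index.html")
print(repr(root.group(1)))  # ''
```

With the first form, a redirect target built from $1 would lose every directory except the last one, which is why the outer parentheses matter.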

I would have thought this instruction (#bandwidth...) should come after the index redirect.

Why, for heaven's sake? That means you're redirecting requests that will end up being locked out, so everyone's doing extra work. This is universally true when the rule has an [F] flag. But in the case of hotlinking rules, you would do it even if you're simply rewriting. Obviously duplicate content is not an issue. If they've hotlinked to the wrong form of your domain name, go ahead and serve up your "NO HOTLINKS" picture* wherever they asked for it.


* Mine's a garish black/cyan/magenta png that hurts my eyes just looking at it. The uglier the picture, the sooner people remove the hotlink.

qimqim



 
Msg#: 4647301 posted 4:03 am on Feb 28, 2014 (gmt 0)

Hi Lucy

I'm getting lost...

I suppose we are talking about #4a

#4a index redirect

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html
RewriteRule ^(.*)index.html$ http://example.net/$1 [R=301,L]

#4b domain-name canonicalization redirect

RewriteCond %{HTTP_HOST} !^(example\.net)?$ [NC]
RewriteRule ^(.*)$ http://example.net/$1 [R=301]



but the rule you sent me

RewriteRule ^(([^/]+/)*)index\.html http://www.example.com/$1 [R=301,L]


should be example.com and not www.example.com. No?

lucy24




 
Msg#: 4647301 posted 12:42 pm on Feb 28, 2014 (gmt 0)

Probably. You know your own domain name better than I do ;)

qimqim



 
Msg#: 4647301 posted 12:52 pm on Feb 28, 2014 (gmt 0)

Hi Lucy

Could you check if everything is as it should be, please?

I am going away tomorrow, you'll be pleased to hear, so this thread will be quiet for a good 10 days.

#Use PHP5.4 Single php.ini as default

AddHandler application/x-httpd-php54s .php

#
AddType text/x-component .htc


#Do not allow access to the directories -For security reasons, Option followsymlinks cannot be overridden.

Options -Indexes +SymLinksIfOwnerMatch
RewriteEngine on


#1 # block visitors referred from indicated domains

RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libwww-perl
RewriteRule .* - [F]

#2 bandwidth theft
RewriteCond %{HTTP_REFERER} !^http://example\.net/

RewriteRule .*\.(jpe?g|gif|png|bmp)$ - [F]

#3 redirects from file that changed name

#3a
RewriteRule ^Pinto/oldindex\.html http://example.net/Pinto/oldindex.php [R=301,L]

# 3b
RewriteRule ^Asia/Indonesia/bali\.html http://example.net/Asia/Indonesia/bali.php [R=301,L]

# 3c
RewriteRule ^Asia/Indonesia/indonesia\.html http://example.net/Asia/Indonesia/indonesia.php [R=301,L]


#4a index redirect

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html
RewriteRule ^(([^/]+/)*)index\.html http://example.com/$1 [R=301,L]

#4b domain-name canonicalization redirect

RewriteCond %{HTTP_HOST} !^(example\.net)?$ [NC]
RewriteRule ^(.*)$ http://example.net/$1 [R=301]

#5
# BEGIN EXPIRES
<IfModule mod_expires.c>
ExpiresActive On
ExpiresDefault "access plus 10 days"
ExpiresByType text/css "access plus 1 week"
ExpiresByType text/plain "access plus 1 month"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType application/x-javascript "access plus 1 month"
ExpiresByType application/javascript "access plus 1 week"
ExpiresByType application/x-icon "access plus 1 year"
</IfModule>
# END EXPIRES


<ifModule mod_deflate.c>
<filesMatch "\.(js|css|html|php)$">
SetOutputFilter DEFLATE
</filesMatch>
</ifModule>

Dideved



 
Msg#: 4647301 posted 9:18 pm on Mar 3, 2014 (gmt 0)

Honestly, you were better off back when you had ^(.*/)index\.html$

When suggesting her alternative, Lucy is careful to say, "If your URL paths don't contain periods," because if there's a period anywhere in the path of any of your URLs, then her alternative pattern will fail for that case.

(.*), on the other hand, will work correctly for *all* URLs, not just some URLs. Correctness should be our #1 concern. The performance impact would have to be super significant to make us skimp on correctness. But it isn't significant. The alternatives proposed aren't even always faster, and when they are, it's by single-digit nanoseconds. Even for a micro-optimization, that's awfully micro.

Correctness and robustness are more important. Stick with that.
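The failure case is easy to reproduce. This is a sketch in Python's re module rather than Apache itself (assumed to agree with Apache for these patterns); the directory name "v1.2" is hypothetical, chosen only because it contains a period.

```python
import re

# The period-free alternative and the (.*/) form discussed above.
dotless = re.compile(r"^([^.]*)index\.html")
greedy = re.compile(r"^(.*/)index\.html$")

plain_path = "Asia/Indonesia/index.html"
dotted_path = "v1.2/docs/index.html"   # hypothetical path containing a period

# Without periods in the path, both patterns capture the same directory prefix.
print(dotless.match(plain_path).group(1))   # Asia/Indonesia/
print(greedy.match(plain_path).group(1))    # Asia/Indonesia/

# With a period in a directory name, [^.]* cannot reach "index.html".
print(dotless.match(dotted_path))           # None
print(greedy.match(dotted_path).group(1))   # v1.2/docs/
```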

lucy24




 
Msg#: 4647301 posted 10:33 pm on Mar 3, 2014 (gmt 0)

RewriteRule .*\.(jpe?g|gif|png|bmp)$ - [F]


When there's no opening anchor, there is no need for .* All you need in this rule is
\.(jpe?g|gif|png|bmp)$
(bmp? Really? I'm not surprised you want to block hotlinkers; those things are huge.)

RewriteRule ^(.*)$ http://example.net/$1 [R=301]

Here, conversely, there's no need for anchors. By default a Regular Expression will start as soon as it can and go on as long as it can:

RewriteRule (.*) http://example.net/$1 [R=301,L]

All redirects require the [L] flag. It isn't implied with [R] the way it is for [F] and certain other flags.
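Both simplifications can be sanity-checked with a few made-up paths. A sketch in Python's re module, using re.search because mod_rewrite also looks for the pattern anywhere in the path; it is assumed, not verified against Apache, that the two engines agree here.

```python
import re

# Hypothetical request paths.
paths = ["images/bali.jpg", "Asia/map.png", "Asia/Indonesia/bali.php"]

with_dot_star = re.compile(r".*\.(jpe?g|gif|png|bmp)$")
bare = re.compile(r"\.(jpe?g|gif|png|bmp)$")

anchored = re.compile(r"^(.*)$")
unanchored = re.compile(r"(.*)")

for p in paths:
    # Both image patterns give the same verdict on whether this is an image request.
    same_verdict = bool(with_dot_star.search(p)) == bool(bare.search(p))
    # Both catch-all patterns capture the entire path.
    same_capture = anchored.search(p).group(1) == unanchored.search(p).group(1) == p
    print(p, same_verdict, same_capture)
```

Every line prints True twice: the leading .* and the anchors around (.*) change nothing about what matches or what is captured.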

phranque




 
Msg#: 4647301 posted 1:51 am on Mar 4, 2014 (gmt 0)

Dideved, rather than taking random and disruptive potshots at general principles, i would ask that you address the OP's problem in a specific manner.
and having done so, i would then encourage you to take some responsibility to follow the thread to its conclusion.

qimqim



 
Msg#: 4647301 posted 2:23 pm on Mar 4, 2014 (gmt 0)

hi Lucy

I'm travelling and will not have a chance to amend the file til I get back on the 10th.

I would appreciate your comments on Dideved's post as I'm not sure which way the wind is blowing...

Many thanks, all

qimqim



 
Msg#: 4647301 posted 10:05 am on Mar 8, 2014 (gmt 0)

Hi

I'm returning home tomorrow, but meanwhile I found that Google Search is returning
http://example.net/index.html/
and because of the trailing slash the url returns a 404 error.

How can the htaccess be modified to redirect these types of URLs with trailing slashes to ones without them?

Thank you

lucy24




 
Msg#: 4647301 posted 1:36 pm on Mar 8, 2014 (gmt 0)

This has been a long thread. Was it the one where I said something so catastrophically wrong that I had to go hide for several days? Did I at some point ask if your filepaths contain literal periods? Or have you got wonky URLs that might have slashes and further business after the extension?

With a normal URL it's straightforward:

RewriteRule ^([^.]+\.html). http://www.example.com/$1 [R=301,L]

meaning "if there's any stuff of any kind after the 'html' extension, redirect". That's a dot meaning "any character"; doesn't matter if there are more because you're already redirecting. A query string doesn't count.
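To see which requests that pattern would pick up, here is a small sketch in Python's re module (assumed to match Apache's behaviour for this pattern; the paths are hypothetical). The query string is left out because mod_rewrite does not include it in the path a RewriteRule pattern sees.

```python
import re

pattern = re.compile(r"^([^.]+\.html).")

# Hypothetical request paths, without the leading slash.
tests = [
    "index.html",                      # clean URL: nothing after the extension
    "index.html/",                     # trailing slash
    "Asia/Indonesia/bali.html/extra",  # slash plus further garbage
]

for path in tests:
    m = pattern.search(path)
    # A match means "redirect to the captured part"; no match means "leave alone".
    print(path, "->", m.group(1) if m else "no match")
```

Only the two paths with something after ".html" match, and in both cases the capture is the clean form of the URL.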

qimqim



 
Msg#: 4647301 posted 7:44 am on Mar 12, 2014 (gmt 0)

Hi Lucy

I am sorry, but as you probably gathered I am not conversant with htaccess Speak, and many of the posts are difficult for me to understand. For instance, I am not sure what you mean by "literal periods". Anyway, as far as I know, in my website code there are no "wonky URLs that might have slashes and further business after the extension". So I do not know why Google brings up a page like
http://example.net/index.html/
that returns a 404 error.

I am also still confused over your earlier post

Here, conversely, there's no need for anchors. By default a Regular Expression will start as soon as it can and go on as long as it can:

RewriteRule (.*) http://example.net/$1 [R=301,L]

All redirects require the [L] flag. It isn't implied with [R] the way it is for [F] and certain other flags.


Could you just tell me exactly what I should put/amend in the htaccess file, please?

Regards

lucy24




 
Msg#: 4647301 posted 9:51 pm on Mar 12, 2014 (gmt 0)

For instance, I am not sure what you mean by "literal periods".

period = full stop = . = dot

In Regular Expressions, a . represents "any one character, including a space". (There are special rules for line breaks, but they don't apply in Apache.) And "escape" means put a \ backslash before the character. Generically this means "I'm talking about the actual character, not its special RegEx meaning." So
^ = beginning of pattern
\^ = a caret
( = start capturing
\( = an opening parenthesis
[ = beginning of a bracketed group (a character class)
\[ = an opening bracket
and so on.

When we say "literal period" we mean a . as such-- the kind you see in
www.example.com
and
index.html

If you don't say
www\.example\.com
and
index\.html

then the dot could also mean
index,html
indexxhtml
index3html
and so on. Sometimes the difference is crucial. Other times there could hardly be anything but a period (full stop, .) in that location. But always escape \. as a matter of habit.

A . inside grouping brackets doesn't need to be escaped.
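The same comparison can be run directly. A sketch in Python's re module, which is assumed to treat the dot the same way Apache does in these patterns:

```python
import re

candidates = ["index.html", "index,html", "indexxhtml", "index3html"]

unescaped = re.compile(r"^index.html$")      # . means "any one character"
escaped = re.compile(r"^index\.html$")       # \. means a literal period
in_brackets = re.compile(r"^index[.]html$")  # inside [ ] the dot is already literal

unescaped_hits = [c for c in candidates if unescaped.search(c)]
escaped_hits = [c for c in candidates if escaped.search(c)]
bracket_hits = [c for c in candidates if in_brackets.search(c)]

print(unescaped_hits)  # all four candidates
print(escaped_hits)    # ['index.html']
print(bracket_hits)    # ['index.html']
```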

So i do not know why Google brings up a page like
http://example.net/index.html/
that returns a 404 error.

Well, I don't know either, but did we ever establish that it was your mistake to start with? Sometimes a search engine finds a malformed or mistyped link on someone else's site, and then they come by and ask for nonexistent pages. If it wasn't your mistake in the first place, you are welcome to ignore the issue and simply let the server return a 404.

qimqim



 
Msg#: 4647301 posted 4:32 pm on Mar 14, 2014 (gmt 0)

Hi Lucy

thank you for putting up with me...

I am finally back home.

The 404 error worries me because it is being displayed as one of the Google search results and I don't want one of my few visitors to end up with it. I have looked everywhere with dnGrep, and the only places where example.net has a slash afterwards, other than in subdirectories, are the htaccess file and the sitemap, which starts

<url>
<loc>http://example.net/</loc>
<lastmod>2014-02-20T12:24:06+00:00</lastmod>
</url>


Do you think this could be the reason, and if so how can I amend the htaccess (or maybe easier just to amend the sitemap)?

lucy24




 
Msg#: 4647301 posted 8:38 pm on Mar 14, 2014 (gmt 0)

Now, wait. A slash after the domain name should be completely irrelevant. Unlike everything else in an URL, it's supplied by the browser. So not even google can claim that
www.example.com
and
www.example.com/
are different URLs. And by definition both forms will reach your server as requests for
/
root.

But your original question wasn't about that. It was about forms like
blahblah/index.html/

This is in the category of "problems you don't have to deal with until they arise". At this point there are different approaches.

One is to simply ignore it-- really-- and let those silly requests get their well-earned 404. The only way
index.html
and
index.html/
could really be different pages is if you're parsing html as php and {buncha stuff that you don't want to read and that gives me a headache}.

Another is to fiddle with your AcceptPathInfo settings. Yawn.

Another approach looks like this:
RewriteRule ^([^.]+\.html). http://www.example.com/$1 [R=301,L]
OR
RewriteRule \.html. - [F]
That is: either redirect to the form that doesn't have garbage after ".html" or block them outright. The choice is entirely yours, depending on where the requests come from. You could theoretically attach conditions and handle them in different ways depending on who's asking. But that's an awful lot of trouble to go to for a request that wasn't your mistake in the first place.

Note again that
\.html.
means "anything at all after the element '.html'". That's again your distinction between a RegEx . "any character" dot and a \. literal period "exactly a period and nothing else".

qimqim



 
Msg#: 4647301 posted 8:50 pm on Mar 14, 2014 (gmt 0)

So, what you suggest is to insert

RewriteRule ^([^.]+\.html). http://www.example.com/$1 [R=301,L]

But is that additional or to replace something else?

Thanks

lucy24




 
Msg#: 4647301 posted 9:58 pm on Mar 14, 2014 (gmt 0)

It's a new rule to meet a new need. If you take the redirect route, the rule goes near the end of your existing redirects but before the final domain-name-canonicalization rule. If your existing index redirect has an end anchor in the pattern, remove it so it will work either way.

If you choose instead to block, the rule would go with your other [F] rules.

qimqim



 
Msg#: 4647301 posted 8:51 am on Mar 15, 2014 (gmt 0)

Right, so I called it #4c. Is this ok?
#4a index redirect

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html
RewriteRule ^(([^/]+/)*)index\.html http://example.com/$1 [R=301,L]

#4b domain-name canonicalization redirect

RewriteCond %{HTTP_HOST} !^(example\.net)?$ [NC]
RewriteRule ^(.*)$ http://example.net/$1 [R=301]

#4c index/ redirect
RewriteRule ^([^.]+\.html). http://www.example.com/$1 [R=301,L]

g1smd




 
Msg#: 4647301 posted 9:39 am on Mar 15, 2014 (gmt 0)

Swap 4c and 4b.

Delete www from rule target.

qimqim



 
Msg#: 4647301 posted 9:47 am on Mar 15, 2014 (gmt 0)

Thanks. Like this?

#4a index redirect

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html
RewriteRule ^(([^/]+/)*)index\.html http://example.com/$1 [R=301,L]

#4c index/ redirect
RewriteRule ^([^.]+\.html). http://example.com/$1 [R=301,L]

#4b domain-name canonicalization redirect

RewriteCond %{HTTP_HOST} !^(example\.net)?$ [NC]
RewriteRule ^(.*)$ http://example.net/$1 [R=301]

qimqim



 
Msg#: 4647301 posted 10:43 am on Mar 19, 2014 (gmt 0)

Hi

Could you confirm that the above is correct, please?

lucy24




 
Msg#: 4647301 posted 6:54 pm on Mar 19, 2014 (gmt 0)

The order is right, but the labels are a little inaccurate. The pattern for the index redirect has no closing anchor, so this rule includes requests for "index.html/" alongside the ordinary "index.html". Rule 4c then scoops up any remaining requests for "something-else.html/". Or, technically, .html with any kind of appended garbage; it just happens to be / here.
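How the two rules divide the work can be simulated. This is only a sketch in Python, not Apache: it assumes Python's re module agrees with Apache for these patterns, it ignores the RewriteCond attached to rule 4a, and the helper name first_redirect and the sample paths are made up for illustration.

```python
import re

# The two redirect patterns, in the order they appear in the htaccess file.
RULES = [
    ("4a index redirect", re.compile(r"^(([^/]+/)*)index\.html")),
    ("4c appended-garbage redirect", re.compile(r"^([^.]+\.html).")),
]

def first_redirect(path):
    """Return (rule label, redirect path) for the first matching rule, else None."""
    for label, pattern in RULES:
        m = pattern.search(path)
        if m:
            # [L] ends processing at the first redirect that matches.
            return label, "/" + m.group(1)
    return None

for path in ["index.html", "index.html/", "Asia/Indonesia/bali.html/",
             "Asia/Indonesia/bali.html"]:
    print(path, "->", first_redirect(path))
```

Both "index.html" and "index.html/" are handled by 4a and sent to "/", "bali.html/" falls through to 4c and is sent to the clean "bali.html", and the clean "bali.html" matches neither rule.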

Harrumph. Just because I can answer posts at 3:43 AM doesn't necessarily mean I do ...

qimqim



 
Msg#: 4647301 posted 7:02 pm on Mar 19, 2014 (gmt 0)

Hi Lucy

I'm sorry but I don't really understand what you mean. Could you amend the code so that I can just insert it in the htaccess file?

Thanks
