

Setting up Extensionless URLs with htaccess in MAMP

getting error


Lorel

6:17 pm on Oct 30, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I started a similar thread back in January on the same subject, but that website got put on hold. I'm now working on another website and recently started another thread on how to install MAMP.
I'm using some of the info from the original thread, plus what I learned from the MAMP installation thread.

I left the extensions on the file names.

All links are relative.

On the home page I took the extension off the link to one page I'm using for a test. I left the slash off the end, but also tested it with the slash on. When I mouse over the link in the browser it shows up extensionless, but it throws a "Server not found" error if I open the page.

I have MAMP installed and I can open pages of the website within MAMP. I also loaded htaccess so the includes would work (for header, etc.).

The only other thing I have in htaccess is for extensionless urls (see below). I've tried several settings for the rewrite URL, as mentioned in previous threads, and so far nothing is working.

here is what is currently in htaccess:

(I understand I need to change example.com to the domain once I load it onto the web server)

---------------------

ErrorDocument 404 /missing.htm
AddHandler server-parsed .htm
#
Options +Includes
Options +FollowSymLinks
RewriteEngine on
#
RewriteBase /
#RewriteBase /example.com/
# check for file not existing
RewriteCond %{REQUEST_FILENAME} !-f
# check for directory not existing
RewriteCond %{REQUEST_FILENAME} !-d
# check for filename not ending in .htm to avoid looping
RewriteCond %{REQUEST_FILENAME} !\.htm$
# if the above conditions are met of no matching file, dir or htm file
# then rewrite to a .htm file
RewriteRule ^(([^./]+/)*[^./])$ /$1.htm [L]

---------------------

I get ERROR MESSAGE: Server not found

It was suggested in the earlier thread that I should use Live HTTP Headers in Firefox. Here is the last output from trying to access the page.

---------------------

[ajax.googleapis.com...]

GET /ajax/libs/jquery/1.6.2/jquery.min.js HTTP/1.1
Host: ajax.googleapis.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:33.0) Gecko/20100101 Firefox/33.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
If-Modified-Since: Mon, 02 Apr 2012 18:24:28 GMT
Cache-Control: max-age=0

HTTP/1.1 304 Not Modified
Date: Thu, 30 Oct 2014 07:51:32 GMT
Expires: Fri, 30 Oct 2015 07:51:32 GMT
Age: 37970
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.01

---------------------

Can anyone see what might be wrong?
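
For reference, a minimal sketch of how the extensionless rule quoted above might be written so that it matches page names longer than one character. As posted, the last character class `[^./]` has no `+`, so only a single-character final segment can match. This reading of the intent is an assumption, and it is probably not the cause of "Server not found", which normally means the browser could not look up the host name at all (for example, a link pointing at a domain that does not resolve locally).

```apache
RewriteEngine On
RewriteBase /

# Only rewrite when the request is not an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Match one or more path segments that contain no dot. The "+" on the
# last class lets the final segment be longer than one character.
# Because the pattern excludes dots, a request already ending in .htm
# cannot match, so no separate loop check is needed.
RewriteRule ^((?:[^./]+/)*[^./]+)$ /$1.htm [L]
```

With this version a request for /page-name would be served from /page-name.htm, provided no real file or directory of that name exists.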

not2easy

6:29 am on Nov 7, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I was wondering why the WP htaccess doesn't use a rule to add a slash so I checked the live site and it has a slash on the end of each link.

I assume this is something set up within WP itself, because the host said to set up a temp directory to work on the site and to comment out the rules in the WP htaccess, since they are not needed when using WP as it has its rules within the program.

Yes, that is handled in the Settings for Permalinks. It can be changed there when the site goes live. They have examples to choose from for your permalink structure.

Lorel

4:05 pm on Nov 7, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm confused by what you said:

Internal links-- ones from one page to another page on the same site-- should not include the protocol-plus-domain anyway.


They don't.

When you make hard-coded html pages you can globally delete all occurrences of
http://www.example.com
(without the final slash, because you'll need to keep that) on all pages.


I assume you mean keep the slash after example.com, i.e., leave it there.

I see I didn't make myself clear as to which slash I was talking about. I'm wondering about the final slash after the file name, i.e., if it was in the original, should the new file name have one also to avoid duplication issues?

/page-name/

lucy24

10:36 pm on Nov 7, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm confused by what you said:
<snip>

I, in turn, was confused by the earlier
As a test I tried just adding slashes on the end of each link (hoping to not have to make more changes to htaccess) but it's redirecting to the live site.

I don't see how this could happen unless a link gives the full domain name. Otherwise you should end up getting a 404 from the MAMP site.

Within WordPress, you can choose to have your URLs either with slash or without slash. not2easy can tell you exactly which buttons to push. On a hand-coded site, you have to decide which form you want, and redirect/rewrite the others. (Redirect with/without, and rewrite to .htm.)

The complication comes from search engines. Once you've got an URL that looks like a directory
example.com/filename/

the search engine will occasionally ask for the two forms
example.com/filename
example.com/filename/index.html

As far as I can tell, it only asks for "index.html", not "index.php" and definitely not "index.htm". (I'm talking about direct observation on my own sites, which means we can pretty well exclude the possibility that the search engine is asking for URLs it found in other people's links ;) The only exception is one specific "/directory/index.html" URL that dates back to before I had an index redirect; I see those in human referers so I can't blame the search engine for asking.)

In the case of "index.html" the worst that can happen is a 404. But in the case of a missing final slash in a rewritten URL, you have to code the redirect. If it were a real, physical directory, mod_dir would take care of it; that's off the table in a made-up URL.

I don't personally know whether an extensionless URL will lead a search engine to ask for the identical URL with slash, and/or the identical URL with ".html" even though these URLs have never been used on the site. It's worth finding out.
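
As an illustration of "you have to code the redirect", here is a rough sketch for a hand-coded site that has settled on the with-slash form. It assumes each page lives in a file named like filename.htm under the document root, that www.example.com is the canonical host, and that %{DOCUMENT_ROOT} points at the real site root (on some shared hosts it does not). Treat it as a starting point, not a tested drop-in.

```apache
RewriteEngine On

# External redirect: /filename -> /filename/
# Applies only when a matching .htm file exists and the request is not
# a real directory (mod_dir already adds the slash for those).
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.htm -f
RewriteRule ^((?:[^./]+/)*[^./]+)$ http://www.example.com/$1/ [R=301,L]

# Internal rewrite: serve /filename/ from filename.htm
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.htm -f
RewriteRule ^((?:[^./]+/)*[^./]+)/$ /$1.htm [L]
```

If you prefer the no-slash form, the same two pieces apply in reverse: redirect the slash version to the bare one, then rewrite the bare one to .htm.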

Lorel

7:09 pm on Nov 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I left off the last slash on the links to the pages for now and will deal with that later if I need to.

I loaded the site on the domain and the extensionless URLs are working (showing up without the extension), as well as the redirect to www and the 404 errors.

I have a long list of 410s to add; however, they throw a 500 (internal server error) instead of a 410. I called the host and he can see it's throwing a 500, but they don't support helping with htaccess. I haven't had any problem with 410s on a new site before, so I suspect it has something to do with the extensionless URLs.

Here is a sample of one of the 410s:

RewriteRule ^file-name\.htm - [G}

here is the htaccess.

-------------

ErrorDocument 404 /missing.htm
AddHandler server-parsed .htm
DirectoryIndex index.htm
#
Options +Includes
Options +FollowSymLinks
RewriteEngine on
#
#
RewriteCond %{HTTP_HOST} !^(www\.EXAMPLE\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
#
# Extensionless rule - goes after all external redirects
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+[^./])$ /$1.htm [L]
#

-------------

Can anyone see what might be preventing 410s from working?

lucy24

9:04 pm on Nov 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteRule ^file-name\.htm - [G} 

Was that a direct cut-and-paste?

Lorel

6:59 pm on Nov 10, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I just happened to pick out the only 410 in my list with a typo. Thanks for pointing that out. I realized I also had to remove cookies and clear the cache occasionally; then the redirects worked (I checked them all individually to make sure).

I still have 14 redirects for pages that were moved to another site that aren't working.

However, before working on those I want to make sure I have the preferred order of items in the htaccess. I found a list last week and lost it.

Is this the correct order:

ErrorDocument 404 /missing.htm
AddHandler server-parsed .htm
#
Options +Includes
Options +FollowSymLinks
RewriteEngine on
#
Redirects from old page to new
#
401s
#
non-www to www
#
Extensionless URLS.
#

lucy24

9:17 pm on Nov 10, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When your fingers typed "401" did your brain mean "410"? I don't remember any authentication questions coming up.

I put [G] after [F] but before [R]. Sometimes you'll need to fiddle with rule ordering, for example if part of a directory was deleted while the rest gets redirected.

The domain-name-canonicalization redirect ("non-www to www") is the very last external redirect. Extensionless is in two pieces. The external redirect (from old "with" to new "without" URL), if needed, is near the end of your [R] list. The internal rewrite goes after all redirects.
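
Put together, that ordering might look roughly like the sketch below. The page names old-page and new-page are made up for illustration, and step 4 is only needed if the old with-extension URLs were ever public. You would probably also want to exempt index.htm and missing.htm from step 4.

```apache
RewriteEngine On

# 1. Access blocks [F] would go first (none shown here)

# 2. Pages that are gone for good: 410
RewriteRule ^file-name\.htm$ - [G]

# 3. Individual page redirects, old page to new page (hypothetical names)
RewriteRule ^old-page\.htm$ http://www.example.com/new-page [R=301,L]

# 4. External redirect from the old "with extension" URL to extensionless.
#    THE_REQUEST is checked so that only direct client requests match,
#    not requests already rewritten internally by step 6.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/[^?\s]*\.htm[?\s]
RewriteRule ^(.+)\.htm$ http://www.example.com/$1 [R=301,L]

# 5. Domain-name canonicalization: the last external redirect
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# 6. Internal rewrite, after all redirects
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+[^./])$ /$1.htm [L]
```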

A single lethal typo in htaccess is generally enough to create a 500-class error for all requests in the directory containing the htaccess. (A non-lethal typo would be something like redirecting to "wigdet.html" instead of "widget.html", or misplacing an anchor so a RewriteCond returns an unintended value.)
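
The 410 rule quoted earlier is a handy example of the lethal kind: the flag list opens with a square bracket and closes with a curly brace, so Apache cannot parse the line and every request in that directory fails. A short comparison, using the same placeholder file name:

```apache
# Lethal: mismatched flag delimiters, the whole htaccess fails with a 500
# RewriteRule ^file-name\.htm - [G}

# Working: flags enclosed in matching square brackets.
# The closing anchor "$" is optional but keeps the pattern from also
# matching longer names such as file-name.html.
RewriteRule ^file-name\.htm$ - [G]
```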

Lorel

10:18 pm on Nov 10, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sorry that 401 was a typo - should be 410. I need to "upgrade" my glasses.

I always check my htaccess changes right away in case they throw a 500 error and remove them quickly.

Thanks for your help Lucy.

I was confused re external vs internal redirects but found this very good explanation by JD morgan a few years back:

[webmasterworld.com...]