#301-redirect: page.html to page
RewriteRule ^([\w-]+)\.html$ http://www.example.de/$1 [R=301,L]

>>How can I keep the present URLs like http://www.example.de/page?

Each page in WP has several versions, so it is not one-for-one unless you want more work than you need to have. Another point: your example is not https:, but if you are not moving to https: you should probably not bother with rewriting or redirecting, because most browsers are (or soon will be) scaring visitors away.
the 301 is to redirect any calls to example.com/yadda.html to the extensionless version ... you can leave this in place; however, this is really to catch any links which point to the filename with the extension.
you then need to add a rewrite such as
RewriteRule ^([a-zA-Z0-9-_.]*)$ /$1.html [L]
which will internally serve the file with the extension when the extensionless file is requested.
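Tracing a hypothetical request for /page through that rule makes the mechanics visible (comments only; the pattern is the one quoted above):

# Request:  GET /page               (extensionless)
# Pattern:  ^([a-zA-Z0-9-_.]*)$    matches "page", so $1 = "page"
# Result:   internal rewrite to /page.html; the address bar still shows /page
# Caveat:   the character class includes "." so the rewritten "page.html"
#           would match again on the next per-directory pass - the pattern
#           gets tightened later in this thread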
In a situation where you are migrating from WP to static html, the URL taxonomy is quite different, and to redirect from the old URLs to the new static pages you will need to do more than keep the same appearance.

The WP URLs offer the same content at multiple URLs, so if your new static URLs are not going to have all those extra URLs, you should be sure that you have set up a way to capture all the old formats and rewrite them to your preferred URL. Hopefully that URL has had a canonical meta tag to let Google know which version of the "blue widgets" pages you wanted to have indexed. You would need to capture whatever /category/, /tag/ and /archive/ URLs have existed and rewrite those to their new replacement URLs.
For example, if you set up your WP URLs with a Permalink syntax that omitted category, tag, archive and other optional additions (such as /product/), and you set a canonical meta tag to link to your preferred URL, then that would be the choice to use for your new URLs. That way, old-to-new would go smoothly (or at least have a better chance). If you have not been using any canonical meta link for your content, then it is anyone's guess which version of your content has been selected for indexing. Look at either your log files or analytics to see where visitors are landing; Google does not always follow the canonical preference if others have been linking to your content using a different URL.

Start with your sitemap to get a list of all your posts and/or pages so you will know what needs to be changed, and see what you can get out of GSC for linked content to help you plan. Then you can start by setting up rules to send requests for https://www.example.de/blue-widgets/ and https://www.example.de/books/blue-widgets/ and https://www.example.de/calendars/blue-widgets/ all to https://www.example.de/blue-widgets/ - it is a downsizing.
Your rules want to capture old requests for all category names, tag names and the format of your archives, and send all of those versions to one page. This is not redirecting all pages to the home page for visitors to start over, so you need a map - a plan - before you begin. When you know all of the terms that need to be removed or altered, then you can set up rules. One rule is not likely to handle all cases, though some can be combined, as in the sketch below.
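As a rough sketch (using the example URLs above - /books/ and /calendars/ stand in for whatever category, tag and archive prefixes your own inventory turns up), the collapsing rules might look like:

# Sketch only: fold the old WP variants onto the one preferred URL.
# Add one alternative per prefix your sitemap/GSC list turns up.
RewriteRule ^(?:books|calendars|category/[\w-]+|tag/[\w-]+)/([\w-]+)/?$ https://www.example.de/$1/ [R=301,L]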
RewriteRule ^([a-zA-Z0-9-_.]*)$ /$1.html [L]
>>I don't think this is where you want to go, one rule would remove the .html and the other would add it back on.
no it doesn't; it's an internal rewrite: when extensionless is requested, the file with the extension is served. or at least that's how it works on my server.
i thought the OP was creating static html files, but wanted to keep the extensionless URIs
Either change can disrupt traffic for a time as your "new" URLs are crawled. If the old URLs will remain the same by simply removing the file extension then I would add that in at the same time for minimal disruptions. The rule for the file extensions should appear in your .htaccess file before the https: rules. I am not positive that you want to use any [L] (Last) flag on that .html remover rule because it will need further changes for https.
If you search here (upper right search for desktop) for "to https:" you will find literally hundreds (my guess is thousands) of discussions. The more recent are more convenient; the older discussions have more 'meat'.
#301-redirect: page.html to page
RewriteRule ^([\w-]+)\.html$ http://www.example.de/$1 [R=301,L]

...and says it works fine - but now will be changing to https URLs.
>>but now will be changing to https URLs

But that's just a matter of adding an "s" in the target, isn't it? (This thread has been somewhat meandering, so I may have missed a key piece.)
I fear I cannot explain it better; is it still confusing?

I guess your code does the opposite of what my goal is, because it adds the .html extension back rather than removing it.
First: What is the need to go from CMS to static?
Second: Is dropping the extension required (other than a dim hope of keeping what you've got)?
Third: SOMETIMES a clean start is the best thing in the world.
That said, you can find a number of methods to "go back" to static; .htaccess is your best friend in that regard. If you are moving from HTTP to HTTPS, that's a minor complication.
But I do have to ask, why change? What does WP not offer that makes the transition back to static necessary? A WP site, kept clean and with the fewest plugins (for security purposes), is actually a pretty fair platform for a website. (Me, never went there, but that's beside the point.)
Is there a compelling need to ditch WP?
Incidentally: complete brain fart on my part because I forgot it has to be
^[^.]+[^/]$
so you're not rewriting directory requests (including root). Oops.
RewriteRule ^([^.]+[^/])$ /$1.html [L]
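Spelled out, here is what the corrected pattern does and does not catch:

# What ^([^.]+[^/])$ matches:
#   /page        -> matched; internally rewritten to /page.html
#   /page.css    -> skipped (contains a dot: a real physical file)
#   /dir/        -> skipped (ends in a slash: a directory request)
#   /            -> skipped (root; the pattern needs at least two characters)
#   /dir/subdir  -> matched (no dot, no trailing slash) - which is exactly
#                   the robot complication discussed further down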
The rule

RewriteRule ^([a-zA-Z0-9-_.]*)$ /$1.html [L]

will add .html to requests without .html, and it is a 302 (temporary) rule without a 301 flag at the end - wouldn't it need [L,301]?
>>while i haven't tried [L,301] it should throw an error.

Yup: Test site confirms that it’s a solid 500--on any request, not just one that matches the pattern. Supplementary for-the-heck-of-it experimenting confirms that this will happen if you put anything in "flag" position that isn't a recognized flag, like some random letter of the alphabet.
>>the [L] flag without an [R] flag simply means an internal rewrite

Or no action at all, depending on whether there’s a target or merely a - filler. (I know that you know that; I’m throwing it in for archival purposes.)
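To keep the flag variants straight (the rules themselves are the ones already quoted in this thread, with the example.de placeholder host):

# [L] alone: internal rewrite - the visitor's address bar still shows /page
RewriteRule ^([^.]+[^/])$ /$1.html [L]

# [R=301,L]: external redirect - the browser is told to fetch the new URL
RewriteRule ^([\w-]+)\.html$ https://www.example.de/$1 [R=301,L]

# [L,301] is invalid: "301" alone is not a recognized flag name, and Apache
# answers every request with a 500 until it is corrected to R=301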
In summary: when using extensionless URLs, you need to do two things:

1. Redirect requests for with-extension URLs to the without-extension form, and
2. Rewrite requests for extensionless-anything to the physical file, which has an extension.

A sketch of the pair follows below.
...If your URLs have never had an extension, you shouldn’t need the redirect. This, in turn, means you don’t need a RewriteCond looking at THE_REQUEST so you don’t go around in circles.
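Put together, a minimal sketch of that pair might look like this (www.example.de is the thread's placeholder host; the THE_REQUEST condition is what keeps the two rules from chasing each other):

RewriteEngine On

# 1. External 301: strip .html, but only when the extension appeared in the
#    original request line. THE_REQUEST is never altered by rewrites, so the
#    internal .html added by rule 2 cannot re-trigger this redirect.
RewriteCond %{THE_REQUEST} \.html[?\s]
RewriteRule ^([\w-]+)\.html$ https://www.example.de/$1 [R=301,L]

# 2. Internal rewrite: map the clean URL back onto the physical .html file.
RewriteRule ^([^.]+[^/])$ /$1.html [L]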
#301-redirect: page.html to page
RewriteRule ^([\w-]+)\.html$ http://www.example.de/$1 [R=301,L]

But wait! A further nasty complication is that certain robots (looking especially at you, AppleBot, but there are others) are so fixated on extensionless URLs, they will routinely request
/dir/subdir
for real, physical directories whose URL has never been anything but
/dir/subdir/
In Apache, these are generally handled by mod_dir, which first adds the slash and then supplies the appropriate DirectoryIndex. But if you’ve got a rewrite to handle extensionless URLs, you also need to ensure that nothing is getting rewritten to the nonexistent
/dir/subdir.html
and preferably you want to do this without a server-intensive !-d RewriteCond. (The form -f is never needed, because real physical files will always have an extension.) Alternatives depend on the exact site; even a RewriteCond excluding a short list of named directories would be more efficient.
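For example (the directory names here are invented for illustration), a short exclusion list avoids touching the filesystem at all:

# Sketch: leave known physical directories alone so mod_dir can add the
# trailing slash, instead of rewriting /dir/subdir to /dir/subdir.html.
RewriteCond %{REQUEST_URI} !^/(ebooks|images|css)(/|$)
RewriteRule ^([^.]+[^/])$ /$1.html [L]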
#Redirect non https: and non www requests
#301-redirect: page.html to page
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{SERVER_PORT} 80 [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.de)?$
RewriteRule ^([\w-]+)\.html$ https://www.example.de/$1 [R=301,L]

On my own site I also have

RewriteRule ^ebooks/(\w+)$ https://example.com/ebooks/$1/ [R=301,L]
to prevent chained redirects that come in with the wrong protocol. This is site-specific: /ebooks/ happens to contain more subdirectories than the rest of the site put together, and all URLs in the directory are in the form /ebooks/title/ (or /ebooks/title/volume.html which isn’t affected by the rule).
# Disable xmlrpc.php
<Files "xmlrpc.php">
Require all denied
</Files>
# No access to wp-config and files that reveal the WP version
<FilesMatch "(wp-config\.php|liesmich\.html|readme\.html|liesmich\.txt|readme\.txt|licence\.txt)">
Require all denied
</FilesMatch>
# Additional htaccess password protection
<Files wp-login.php>
AuthType Basic
...
</Files>
# 301-redirect: page.html to page
RewriteRule ^([\w-]+)\.html$ http://www.example.de/$1 [R=301,L]
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
# BEGIN Prevent hotlinking
<IfModule mod_rewrite.c>
#RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.de(/.*)?$ [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?google\.[^/]+(/.*)?$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(.*)Googlebot(.*)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^Googlebot\-Image(.*)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^Googlebot\-Video(.*)$ [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?bing\.[^/]+(/.*)?$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(.*)Bingbot(.*)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(.*)MSNBot-Media(.*)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(.*)BingPreview(.*)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(.*)MSNBot(.*)$ [NC]
...
RewriteRule \.(jpe?g|png|gif|svg|pdf|mp3)$ - [NC,F]
</IfModule>
# END Prevent hotlinking
# Server-side deflate compression
<IfModule mod_filter.c>
<IfModule mod_deflate.c>
AddOutputFilterByType DEFLATE text/plain text/html text/xml text/css text/javascript text/rtf
AddOutputFilterByType DEFLATE application/javascript application/x-javascript application/msword application/ld+json
</IfModule>
</IfModule>
# Browser caching via mod_expires
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType text/html "access plus 1 month"
...
</IfModule>
#Redirect non https: and non www requests
#301-redirect: page.html to page
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{SERVER_PORT} 80 [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.de)?$
RewriteRule ^([\w-]+)\.html$ https://www.example.de/$1 [R=301,L]
That rule for port 80 is in case incoming old links are still using "http" syntax.
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

>>I don't have any subfolders or subdirectories at the moment and this won't change

If you have no subdirectories in your URLpaths--and no directories at all except ones containing with-extension supporting files--then you need not worry about that aspect. (Whew.) Yes, I realize it makes the whole thing now sound like a long and pointless digression--but it’s something that could become calamitous on sites where it does occur, so it’s good to get everything spelled out and unambiguous.
Those assorted rules should be higher up in the file; the hotlinking block and such should not come after the canonical rewrites and rules with the [L] flag. Having things in the wrong order can cause looping or 500 errors.
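Sketched as an outline (using the section names from the file above), the suggested order would be roughly:

# 1. <Files>/<FilesMatch> access blocks (xmlrpc, wp-config, wp-login)
# 2. Hotlinking conditions and rule
# 3. Canonical redirects: https/www and the page.html-to-page 301
# 4. The WordPress front-controller block, whose catch-all [L] rule comes last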
So it depends - do those various default URLs exist or not?
Since I myself don’t use extensionless URLs, I don’t know whether robots also like to take the opposite approach, requesting
/filename/
when the URL is correctly
/filename
and-that's-all.
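If it does turn out that some robots request /filename/ where the URL is correctly /filename, a guard like this (a sketch - it assumes no real directory shares those names) would put them right:

# Sketch: 301 a spurious trailing slash back to the extensionless URL,
# placed before the internal .html rewrite so the clean form wins.
RewriteRule ^([\w-]+)/$ https://www.example.de/$1 [R=301,L]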