.htaccess with mod rewrite or Wordpress-plugin?

URL rewriting, better choice in the long run

     
4:40 pm on Aug 4, 2014 (gmt 0)

Full Member

10+ Year Member Top Contributors Of The Month

joined:Feb 22, 2008
posts: 345
votes: 0


Hi,
let's say you have to solve a small URL-rewriting issue for a relaunch on WordPress. It has to do with file extensions like .php, .htm, .html etc.

Say there are two working approaches. Which would you choose with "the long run" of the site in mind, i.e. leaving as little room as possible for future issues, forced changes and dependencies:

- one of the available WordPress plugins that perform a more or less small rewrite within WP, supplementing the usual dynamic URL creation

- rewriting the URL internally on the server with mod_rewrite, without a 301 (the relaunch with WP doesn't actually change URLs, i.e. domain.de/bla.html --> domain.de/bla.html)

Considering the long run, and not wanting to face the more or less "usual" issues like plugin and WP updates, .htaccess seems preferable?
Dependency, security, speed? .htaccess is better again, I suppose?

This isn't about one particular site; it's more of a key question for me that will matter for other sites in the future too.

Thanks,


deeper
7:39 pm on Aug 30, 2014 (gmt 0)

deeper (Full Member)


Sounds compelling, thanks. I will check it and ask my hoster.

"If you have access to your server config (httpd.conf) you want your rewrites in your server config not in .htaccess... "

This would only concern the internal .html rewrite, i.e. only
RewriteRule ^((onedir|otherdir|thirdddir)/\w+)\.html /index.php?$1 [L]
and all the other code should stay in the .htaccess?

1. Performance
I guess it's difficult, but can you estimate how much it could improve the performance?

2. Yes, lucy24 addressed this. But as not2easy mentioned in a different thread, WP won't touch the htaccess as long as its standard code is left untouched:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress

WP says: "WordPress will play nice with an existing .htaccess and will not delete any existing RewriteRules or other directives. If you have other mod_rewrite rules, put yours before WordPress's."
9:30 pm on Aug 30, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15944
votes: 890


If you're on shared hosting, you do not have access to the config file, so stop thinking about it :)

General principle: If you do have access to the config file, don't use htaccess at all. Rare exceptions if you're changing a whole bunch of stuff and don't want to keep restarting the server every time you tweak one line. But that's temporary.
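
A minimal sketch of what the server-config variant could look like, assuming the site's files live under the hypothetical path /var/www/example and that the host allows editing httpd.conf at all. Inside a <Directory> section the rule is matched the same way as in .htaccess, so the pattern can be carried over without a leading slash. The path and the AllowOverride setting are illustrative and not taken from the thread.

```apache
# Hypothetical document root; replace with the real path of the site.
<Directory "/var/www/example">
    # With all rules kept here, .htaccess lookups can be switched off.
    AllowOverride None

    RewriteEngine On
    # Same per-directory pattern as in .htaccess: no leading slash.
    RewriteRule ^((?:onedir|otherdir|thirdddir)/\w+)\.html$ /index.php?$1 [L]
</Directory>
```

Changes in this file only take effect after the server is reloaded, which is the restart overhead mentioned above.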

WP won't touch the htaccess as long as its Standard code is not touched

Well, that's a pretty superfluous thing of WP to say, since the first thing anyone would do is delete the "IfModule" lines-- and if you've got existing RewriteRules, you wouldn't need to duplicate "RewriteEngine on".

put yours before WordPress's

Well, that depends on what exactly your own RewriteRules are for. If you're changing the site's innards so existing .html URLs are now handled by WordPress, you may need to put your own rules between the two WP rules.

Besides, how many non-page requests are handled by WP (or any other CMS)? For most people it's better to tweak the final WP rule so it only applies to page requests.
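
One way to read "only applies to page requests" is to add a condition that keeps common static file types away from the final rule. This is a sketch, not the official WordPress block: the extension list is an assumption and should be adjusted to what the site actually serves.

```apache
# Sketch only: do not hand requests for static assets to WordPress.
# The extension list is an assumption; extend or shorten as needed.
RewriteCond %{REQUEST_URI} !\.(?:css|js|png|jpe?g|gif|ico)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

With this in place a request for a missing image should get the server's plain 404 instead of starting up the CMS. Whether WP leaves an edited final rule alone is a separate question.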
4:09 pm on Aug 31, 2014 (gmt 0)

deeper (Full Member)


I remember your advice about deleting "IfModule" and "RewriteEngine On".

But if keeping them ensures that nothing is ever deleted by a WP update, don't you think it's worth keeping these 3 lines and using the kind of "code sections" I described above?

The WP standard "code section" would then always stay the same. In addition, the line with the .html rewrite would sit before the WP code as a second "code section".

The explicit www 301 that used to be a third section is obviously no longer needed with WP. According to not2easy, the standard WP code already takes care of it:
[webmasterworld.com...]

But as I have found out in the meantime, a third section will be needed for a different issue: the security risks of xmlrpc.php:

<Files "xmlrpc.php">
Order allow,deny
Deny from all
</Files>


Would you not recommend such a "three-section-htaccess"?
6:45 pm on Aug 31, 2014 (gmt 0)

lucy24 (Senior Member from US)


OK, technically your added rules will still work if you put them before the line that goes
blahblah index.php - [L]

It's not where they would normally belong, but it won't actually affect the workings of anything. If you can be certain that WP won't touch your htaccess if it finds the core block already present, then that may be a fair compromise.

<Files "xmlrpc.php">
Order allow,deny
Deny from all
</Files>


If nobody is allowed to access the file, who's it for?
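
Side note on syntax: Order/Deny is the Apache 2.2 form. On Apache 2.4 the same lockout is normally written with Require. A sketch that should behave the same on either version, assuming nothing about which one the host runs:

```apache
<Files "xmlrpc.php">
    <IfModule mod_authz_core.c>
        # Apache 2.4 and later
        Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
        # Apache 2.2
        Order allow,deny
        Deny from all
    </IfModule>
</Files>
```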
7:29 pm on Aug 31, 2014 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4569
votes: 367


There is some confusion, because WP only rewrites the "entire" htaccess file in a few, limited circumstances. The "Jetpack" plugin is one of the worst case scenarios. Other plugins can add lines within the wrapper/container. I would never add a plugin without checking after and having a backup copy on hand, just in case. But under normal circumstances, htaccess does not change unless you internally change some settings within WP.

WP uses the container so I would not remove that. Yes, it is stupid and redundant, but you will have problems if you remove the container that the CMS generates internally:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
whatever
</IfModule>

# END WordPress


I have never seen it overwrite an entire htaccess file, but I have seen it malfunction if you remove that wrapper. My thoughts are that WP uses the container lines to locate its instructions, but I do not have definitive information. I imagine that searching at wordpress.org would give you the full story.
11:59 pm on Aug 31, 2014 (gmt 0)

lucy24 (Senior Member from US)


I have seen it malfunction if you remove that wrapper

Heh. Anyone got a test site with WP installed? I don't, or I'd race off and do my own experimenting. When possible, I always prefer to test for myself. Sometimes the results aren't what the docs led you to believe.

There are a lot of permutations to check:

-- effect of comment lines (BEGIN/END WP), present or absent
-- effect of <IfModule> envelope (exact text vs. any type of <Envelope> vs. presence of any line with content et cetera)
-- effect of stuff within the outer envelope (comment lines) vs. within the inner envelope (<IfModule> wrapper)

And then the question is: What is the absolute worst that can happen? Overwriting entire htaccess vs. overwriting all mod_rewrite lines* vs. something less? And when does this worst-case scenario happen?


* Since all mod_rewrite directives happen to begin with
RewriteBlahblah

it is easy to excise the whole thing globally.
12:36 am on Sept 1, 2014 (gmt 0)

not2easy (Administrator from US)


Since my experiment was on a live site, it was very, very short (and unintended). I did not check any of those things. But I won't do that again. Mistakes like that tend to stay with you anytime you get near them again.
3:21 am on Sept 1, 2014 (gmt 0)

deeper (Full Member)


@lucy24:
xmlrpc.php is needed for remote working and pingbacks. I don't need it, but the file is sometimes misused as a gateway for attacks. Blocking all access to it should solve this.

Is the code for xmlrpc.php placed correctly at the end of the htaccess, as shown below?

Preferring the option without any risk of being overwritten by WP, my "3-section-htaccess" would look like this. Please tell me if something is wrong:

#internal .html-rewrite
RewriteRule ^((onedir|otherdir|thirdddir)/\w+)\.html /index.php?$1 [L]


# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress

# disabling xmlrpc.php
<Files "xmlrpc.php">
Order allow,deny
Deny from all
</Files>
7:44 am on Sept 1, 2014 (gmt 0)

lucy24 (Senior Member from US)


Just remember to replace
(onedir|otherdir|thirdddir)

with the actual directory names ;)

xmlrpc.php is needed for remote working and pingbacks.

Do you mean, access to the file is only by means other than HTTP? Or is it used internally? Make sure you're not blocking essential subrequests.

My personal preference is to collect all mod_rewrite stuff at the end of the htaccess file, simply because mod_rewrite tends to take up the most space. Put any quick and simple directives such as ErrorDocument lines at the beginning, followed by access control. Short <Files> or <FilesMatch> envelopes go near the beginning too.

Disclaimer: I've got two layers of htaccess files, so we are sometimes in "Do as I say, not as I do" mode. The first htaccess is in my userspace, covering access control for all domains plus a few very basic things like charset and noindex headers that are the same everywhere. There is no mod_rewrite stuff at this level. Then each domain has its own htaccess for site-specific things like redirects, plus a few things like Options directives that don't work on the userspace level even though they're always the same. (I tried.)
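
The ordering preference described above could be laid out roughly like this. It is a skeleton for readability only, reusing the thread's own snippets; the ErrorDocument path is a made-up placeholder.

```apache
# 1. Quick one-line directives (placeholder error page path)
ErrorDocument 404 /errors/404.html

# 2. Access control and short <Files> envelopes
<Files "xmlrpc.php">
Order allow,deny
Deny from all
</Files>

# 3. mod_rewrite last, since it tends to be the longest part
RewriteEngine On
RewriteRule ^((onedir|otherdir|thirdddir)/\w+)\.html /index.php?$1 [L]
# ... the WordPress block would follow here ...
```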
5:10 pm on Sept 1, 2014 (gmt 0)

deeper (Full Member)


Here are the details regarding xmlrpc.php:
[xmlrpc.scripting.com...]
As far as I can understand its functions, it shouldn't block any necessary subrequests.

There is one more thing, the last one (I promise :)), that I would like to add to my future .htaccess: a second password level for more security. You probably know this:

# Protect wp-login
<Files wp-login.php>
AuthUserFile ~/.htpasswd
AuthName "Private access"
AuthType Basic
require user mysecretuser
</Files>



Does this also go at the beginning?

So my final .htaccess would look as follows:


# disabling xmlrpc.php
<Files "xmlrpc.php">
Order allow,deny
Deny from all
</Files>

# Protect wp-login
<Files wp-login.php>
AuthUserFile ~/.htpasswd
AuthName "Private access"
AuthType Basic
require user mysecretuser
</Files>

#internal .html-rewrite
RewriteRule ^((onedir|otherdir|thirdddir)/\w+)\.html /index.php?$1 [L]

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress
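
One detail in the wp-login block is worth checking before going live: Apache resolves a non-absolute AuthUserFile path relative to ServerRoot, and a leading ~ is most likely not expanded the way a shell would expand it. A hedged variant with an absolute path follows; /home/youruser is a placeholder for whatever path the host reports, and the AuthName value uses plain straight quotes.

```apache
# Protect wp-login
<Files "wp-login.php">
# Placeholder path: use the absolute path supplied by your host.
AuthUserFile /home/youruser/.htpasswd
AuthName "Private access"
AuthType Basic
Require user mysecretuser
</Files>
```

Keeping the password file outside the web-accessible directory is the usual practice, so that it cannot be requested over HTTP.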
5:57 pm on Sept 1, 2014 (gmt 0)

not2easy (Administrator from US)


An extra password isn't necessarily your best option. Automated bot login attempts just keep on trying another password and waste your bandwidth while they keep your server busy. Read some of the many solutions being discussed by WordPress users here at the WordPress forum: [webmasterworld.com...]

Some related tips during setup: change the wp- default prefix for your sql tables to something else. You will need to have set up your php database before installation or else go in and download the sql, use find/replace to edit the tables and upload. When people try to hack the database they go for wp-whatever, not yxo-whatever. Your wp-config.php will need to have the same information.

Change file permissions on wp-config.php so it is not at the default.

On install set up your new Admin account with all privileges and delete the default Admin user account.

Much more information can be found on "hardening WordPress" and at wordpress.org.
7:59 pm on Sept 1, 2014 (gmt 0)

deeper (Full Member)


Yes, security is really an issue with WP. Since I have a fixed IP and a good provider with no bandwidth limits and its own security measures, and since I don't like tracking and adjusting IP denials or using plugins, this is a good solution for me.

Is the .htaccess o.k. as shown above, with wp-login.php (...) at the beginning?
12:17 am on Sept 2, 2014 (gmt 0)

lucy24 (Senior Member from US)


Here are the details

Well, that was all so much Hungarian to me ;)

it shouldn't block any necessary subrequests.

Heh, no, I meant the other way around: if the file is invoked by other html/php pages as a subrequest, you have to make sure it isn't blocked.

By default, any lockouts you create -- whether it's Deny from... or a RewriteRule with [F] flag or anything else leading to a 403 -- apply not only to external browser requests but also to internal requests. So, for example, if you wanted your error documents to use the same include files as all other pages, you'd have to override whatever it was that created the 403 in the first place.

But if the thingummy.php file is just running quietly in the background, not getting included in any way, then you don't have to worry about granting access to it. The "Deny from" line applies only to HTTP(S) requests; it won't prevent you from accessing the file by FTP or similar. Presumably someone has to access it, or it wouldn't be able to do anything.
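
To illustrate the error-document case in Apache 2.2 syntax: if part of the world is locked out, the 403 page itself has to be reachable again, or the blocked visitor gets a bare server error instead. The address range and file name below are placeholders, and the same kind of envelope would be needed for any include file the error page pulls in.

```apache
# Example lockout (192.0.2.0/24 is a documentation-only address range)
Order allow,deny
Allow from all
Deny from 192.0.2.0/24

# Let everyone, including blocked visitors, fetch the 403 page itself.
# The file name is a placeholder.
ErrorDocument 403 /forbidden.html
<Files "forbidden.html">
Order allow,deny
Allow from all
</Files>
```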
11:50 pm on Sept 2, 2014 (gmt 0)

deeper (Full Member)


Hungarian :)?

I hope I'm wrong, but a few hours ago an unsettling thought crossed my mind. I opened this thread to find the best way of managing a redesign with WP, given that WP pages cannot have URLs ending in .html. My URLs now have .html, for example site.com/page1.html, but WP only allows "site.com/page1".

We/you found that using the .htaccess with this code is the best option for me:
#internal .html-rewrite
RewriteRule ^((onedir|otherdir|thirdddir)/\w+)\.html /index.php?$1 [L]

But: This means, that in future all web user will visit my pages only with URLs like site.com/page1 (without .html - unlike now), due to this rewrite. Right? No one will see the original .html URLs in future.

Therefore future backlinks will point to site.com/page1 too, because that's the only URL people see. Right?

So I will collect backlinks to the URL "without .html" instead of the URL "with .html", as now and in the past.
--> The end result would be two different backlink URLs for each page, without a 301.

Is this pessimistic forecast right, so that the slower htaccess rewrite is unavoidable?
6:49 pm on Sept 3, 2014 (gmt 0)

lucy24 (Senior Member from US)


This means, that in future all web user will visit my pages only with URLs like site.com/page1 (without .html - unlike now), due to this rewrite. Right?

Wrong :( The essence of a rewrite is that it's internal; from the user's POV the originally requested URL does not change. (Technically: The browser does not make a new request to a new URL. Content is served from the old URL.)

Now, if newly created extensionless pages link to those older pages, and these new links use an extensionless URL, then you've got Duplicate Content. So if you want to keep the old URLs, make sure that any and all links use the .html form.

Always link to the URL you want people to see and use. A search engine that meets a link pointing straight to a 301 on the same site is not a happy search engine. It also means the server has to handle one extra request, and the user's initial load time is doubled. But in most situations, that's secondary.
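
The difference between the two mechanisms can be seen side by side. This is a sketch with the placeholder directory names dir1 and dir2 and the example.com host; only the target and the flags differ.

```apache
# Internal rewrite: the address bar keeps /dir1/page.html,
# while the content is produced by index.php.
RewriteRule ^(dir1/\w+)\.html$ /index.php?$1 [L]

# External redirect: the browser is told to go to a new URL
# and makes a second request for /dir2/page.
RewriteRule ^(dir2/\w+)\.html$ http://www.example.com/$1 [R=301,L]
```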
10:27 pm on Sept 3, 2014 (gmt 0)

deeper (Full Member)


Ah... sure, sorry, stupid question.

Sometimes things are too close to see. There is a German saying that expresses what I mean, but I cannot translate it.

No, there won't be extensionless backlinks in future. No danger so far.

The only scenario involving a 301 and going extensionless would be to drop the extensions completely. That is like changing the domain and using a 301, which is made exactly for this. In my case the domain would stay and just the extension would vanish. In both cases the URL would "permanently move", and then there is no choice between internal rewrite and 301. In this completely different scenario a 301 is a must.

But why should I do this? Someone told me to use my WP relaunch as an opportunity to drop .html, because it's outdated. So what. IMHO extensions like .html have no special value and no disadvantage either.
12:30 am on Sept 4, 2014 (gmt 0)

lucy24 (Senior Member from US)


My personal reaction to extensionless URLs is "Get back in the server and put some clothes on!" but I do realize this is a minority position.
1:36 am on Sept 4, 2014 (gmt 0)

not2easy (Administrator from US)


I would be concerned about how you can get WP to generate these .html URLs for all the ways it links to content. (Or did I miss that part?)
2:13 pm on Sept 4, 2014 (gmt 0)

deeper (Full Member)


@lucy24:
If I were creating a new site today I would go extensionless: no more hassle with URLs when using WP pages, and it's shorter. You know "Cool URLs don't change"? I think and hope that having extensions won't be a technical problem even in 20 years, but in the long run stable URLs are probably better off without extensions.

@noteasy24:
Can you give me an example? I don't understand what you mean. Do you mean writing pages or posts in WP that contain external links with extensions?
2:16 pm on Sept 4, 2014 (gmt 0)

deeper (Full Member)


Sorry, "not2easy"
3:39 pm on Sept 4, 2014 (gmt 0)

lucy24 (Senior Member from US)


not2easy, if WP is making links to existing URLs, wouldn't it just use whatever form of the URL you're telling it to link to?
5:13 pm on Sept 4, 2014 (gmt 0)

not2easy (Administrator from US)


wouldn't it just use whatever form of the URL you're telling it to link to?
Yes, of course: if the link is in the content, it will appear in every version of the content in the same form you wrote it. I was talking about the internal URL structure that WP generates.

In WP, you create a page or post and you give it a title. That title is its URL or you can set up permalink taxonomy in the settings to be something other than its title. WP then goes on to offer the same content under other URLs: /category/, /archive/, etc. So you may create a post called "example.com/my-trip-to-paris" and WP generates URLs like: "example.com/europe/my-trip-to-paris", "example.com/april/my-trip-to-paris", "example.com/travel/my-trip-to-paris".

So long as you only recreate existing pages of your site and all the links to them are incoming external links, they can be rewritten, but internally, WP creates its own URLs to link to its content. Any pages you create in the future would not have any extension. The internal navigation created within WP would have no extensions, your sitemap will have no extensions.

Edited to add, this is why I thought that this discussion could use some experienced feedback in the WordPress forum: [webmasterworld.com...] as to the what/how/why/why not of the rewrite question. It may be possible to alter extensionless URLs to look like existing URLs, but is it a good idea?
8:07 pm on Sept 4, 2014 (gmt 0)

deeper (Full Member)


You are talking about two potential problems:

1. Internal links automatically created by WP: sitemap, navigation

The custom menu feature in WP since 3.0 lets you create custom URLs without any limitations (btw, I always use absolute paths for internal links), i.e. extensions are possible.

So navigation and sitemap should not cause problems.
Are there any other internal links that WP creates automatically to my pages, which should have .html but would be missing the extension? Hm, at the moment I can't think of any.

2. New pages after relaunch
Right, they won't have .html. But there won't be many completely new pages. So in future some pages will be extensionless and most of them will have .html. Not nice, but not a real "no go" in my eyes.

Furthermore I could create new .html-pages with the help of a plugin.
8:16 pm on Sept 4, 2014 (gmt 0)

not2easy (Administrator from US)


I was not concerned about links in a custom navigation menu that you create, my concern was more about URLs created when people search. Your content can be served up several ways and they will not be the same URL but have the same content. You will need to be sure to noindex everything except your preferred URLs.
7:00 am on Sept 5, 2014 (gmt 0)

lucy24 (Senior Member from US)


My impression was that the original question was about pages which already exist, with URLs ending in .html, and how to fit these existing pages into a new WP install. I assumed-- if I was mistaken, deeper will tell me so-- that any newly created pages will fit the usual WP pattern.
2:00 pm on Sept 5, 2014 (gmt 0)

deeper (Full Member)


@lucy24:
Basically my question was about existing .html pages in a new WP install, right. However, not2easy's concerns are justified, regarding internal links as well as new pages.

WP will handle new pages according to whatever normal setting is chosen in its permalinks feature, i.e. without .html (that's unavoidable). As I said, I could give even new pages .html with the help of a plugin. But I don't like depending on a plugin for something as essential as URLs. Furthermore there could be complications or interactions with the internal rewrite in the .htaccess.

Therefore I will have some new pages without .html and about 50 with .html in future.

@not2easy:
Search, o.k. At the moment there is none, because it is a static site, but it might get an on-site blog at some point.

Anything else?

Noindex and canonical are the weapons I can use.
7:56 pm on Sept 5, 2014 (gmt 0)

lucy24 (Senior Member from US)


If it's an active, ongoing, productive site AND you expect to stay with WP (or similar CMS) for the foreseeable future, then it might be a better plan in the long run to go extensionless on your existing pages too. This is very simple: just put a RewriteRule before the existing WP block that says something like

RewriteRule ^((dir1|dir2|dir3)/\w+)\.html http://www.example.com/$1 [R=301,L]


In other words, the pattern would look exactly the same as the existing [L] rule we've been talking about. Only the target is different.
11:12 pm on Sept 5, 2014 (gmt 0)

deeper (Full Member)


Hm, difficult... will have to think about it. The coding may be almost the same, but it's a different game, because .html will vanish "officially" and completely, in the long run, for visitors and bots.
301 means a little loss of PR and a little risk, because sometimes sites don't come back after a 301 relaunch.

Regarding this part of the code: (dir1|dir2|dir3), which is the same in both cases, can you tell me what I have to insert if there are two kinds of URLs:

example.com/page1.html
example.com/dir/page1.html
6:58 am on Sept 6, 2014 (gmt 0)

lucy24 (Senior Member from US)


Do you mean that in addition to the named directories, you've got some top-level html pages? If there are not too many of them, it may be simplest to list them by name. So the pattern is

^((?:dir1|dir2|dir3)/\w+|page1|page2|page3)\.html


Note position of | pipes and / directory slash.

The no-capture ?: isn't functionally necessary and won't affect rule execution. It just saves the server a nano-instant of work because normally each set of parentheses is a separate capture.

I've been saying
\w+
on the assumption that all page names in these directories are made only of alphanumerics and lowlines. If any of them contain hyphens, you'd have to go to
[\w-]+
instead. And then any other specific characters like . or ~ if they actually occur in names of pages that already exist. (This is the one great advantage of lowlines. They count as \w so regular expressions are very simple.)
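
Putting the pieces together, the internal rewrite with hyphen-tolerant names and a few named top-level pages might look like the line below. The directory and page names are placeholders, and the closing $ anchor is an addition that is not in the pattern quoted above.

```apache
# dir1..dir3 and page1..page3 are placeholders for the real names.
RewriteRule ^((?:dir1|dir2|dir3)/[\w-]+|page1|page2|page3)\.html$ /index.php?$1 [L]
```

The same pattern could be reused unchanged in front of the 301 target if the extensionless route is chosen instead.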
2:55 pm on Sept 7, 2014 (gmt 0)

deeper (Full Member)


Yes, at the moment (without CMS, "ancient" static site), most of the pages are in the top-level folder.

There is the root-folder and it contains my main (top-level) folder. Most of the pages (usually about 30 - 40) are in this main folder.

Furthermore there is a folder in this main folder containing about 20 pages.

Therefore my URLs reflecting this show only two structures:
example.com/page1.html
example.com/dir/page1.html

Hyphens? Yes, many. Therefore

example.com/page1-with-text.html
example.com/dir/page-with-pics.html

would be more precise. Had no idea this could be important.
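
With 30 to 40 top-level pages, listing each one by name gets long. A possible alternative, offered only as a sketch, is to accept any single page name with an optional folder in front. Here "dir" stands for the real folder name, and the rule will also catch any other top-level .html name such as index.html, so it would need testing against the actual file list.

```apache
# Matches /page1-with-text.html and /dir/page-with-pics.html.
# "dir" is a placeholder for the real folder name.
RewriteRule ^((?:dir/)?[\w-]+)\.html$ /index.php?$1 [L]
```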
This 68 message thread spans 3 pages.