Please help - I'm dreadful with RegEx

My rules are having side effects



10:20 am on Jun 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Hi All,

I am trying to rebuild our company's CMS. I currently have five RegEx rules in place to do what I want; however, they are causing problems elsewhere: images, CSS and JS files aren't loading. If I disable the URL rewriting they load fine, but obviously everything else breaks.

Here are the rules and what I am trying to achieve with them.

RewriteRule ^/$ /cms/cms.asp?page=default

Any connections to www.domain.com should be redirected to www.domain.com/cms/cms.asp?page=default

RewriteRule ^/(.*)/$ /cms/cms.asp?page=$1-default

URLs like:

Should be redirected to:

RewriteRule (.*)\.html$ /cms/cms.asp?page=$1

URLs like:

Should be redirected to:

RewriteRule .*/(.*)-(.*)\.htm[^lL](.*) /asp/$1/$2.asp?$3

URLs like:

Should be redirected to:

RewriteRule ([^.]+[^/])$ /cms/cms.asp?page=$1-default

URLs like:

Should be redirected to:
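
A note on the side effects: the last rule above has no leading ^ anchor, so it can match just the tail of a URL, including the extension of an image or stylesheet. The sketch below transcribes that rule as a JavaScript regular expression purely to illustrate this. It assumes the rewriting component matches unanchored, Perl-style patterns much as JavaScript does. The sample paths and the anchored alternative are hypothetical and have not been tested against the component itself.

```javascript
// Rule 5 from the post, transcribed as a JavaScript regular expression.
const rule5 = /([^.]+[^\/])$/;

// Hypothetical tightening (an assumption, not a tested rule): anchor at the
// start so a URL containing a dot anywhere can never match.
const anchored = /^\/([^.]*[^.\/])$/;

// Return the first capture group, or null when the pattern does not match.
function capture(pattern, path) {
  const m = pattern.exec(path);
  return m ? m[1] : null;
}

// Hypothetical request paths, for illustration only.
for (const path of ["/images/logo.gif", "/css/site.css", "/folder"]) {
  console.log(path, "| rule 5:", capture(rule5, path),
              "| anchored:", capture(anchored, path));
}
// /images/logo.gif | rule 5: gif | anchored: null
// /css/site.css | rule 5: css | anchored: null
// /folder | rule 5: /folder | anchored: folder
```

If the component behaves the same way, rule 5 fires on every static file, which would explain why images, CSS and JS stop loading.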

Ok. Now the whole idea.

URLs to folders should be redirected to the default.html file within that folder (this includes the root of the website, URLs ending in a slash, and URLs not ending in a slash).

HTML files should then be redirected to their relevant CMS page. So:
www.domain.com/folder/folder2/file.html = www.domain.com/cms/cms.asp?page=folder-folder2-file

HTM files should be translated to their relevant ASP application. So:
www.domain.com/folder/app-file.htm?query=strings&remain=intact = www.domain.com/asp/app/file.asp?query=strings&remain=intact

All other files (e.g. images, css, js, docs, xls, etc.) should not be rewritten.
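
The mapping described above can be sketched as a single function, which may help when checking rewrite rules against expected results. This is only an illustration in JavaScript, not a rule set for the rewriting component. The function name rewriteUrl is hypothetical, and the sketch makes two assumptions: nested folders are joined with hyphens in the same way as the .html example, and query strings are dropped for CMS pages.

```javascript
// Hypothetical helper: returns the rewritten URL, or the input unchanged.
function rewriteUrl(url) {
  const q = url.indexOf("?");
  const path = q === -1 ? url : url.slice(0, q);
  const query = q === -1 ? "" : url.slice(q); // keeps the leading "?"

  // .htm application: /folder/app-file.htm -> /asp/app/file.asp (query kept)
  const htm = /\/([^\/-]+)-([^\/.]+)\.htm$/i.exec(path);
  if (htm) return "/asp/" + htm[1] + "/" + htm[2] + ".asp" + query;

  // .html page: /folder/folder2/file.html -> page=folder-folder2-file
  const html = /^\/(.+)\.html$/i.exec(path);
  if (html) return "/cms/cms.asp?page=" + html[1].split("/").join("-");

  // Any other file with an extension (images, css, js, ...) is left alone.
  if (/\.[^\/]*$/.test(path)) return url;

  // Folders, with or without a trailing slash, map to their default page.
  const folder = path.replace(/^\/+|\/+$/g, "");
  const page = folder === "" ? "default" : folder.split("/").join("-") + "-default";
  return "/cms/cms.asp?page=" + page;
}

console.log(rewriteUrl("/"));                          // /cms/cms.asp?page=default
console.log(rewriteUrl("/folder/folder2/file.html"));  // /cms/cms.asp?page=folder-folder2-file
console.log(rewriteUrl("/folder/app-file.htm?query=strings&remain=intact"));
// /asp/app/file.asp?query=strings&remain=intact
console.log(rewriteUrl("/images/logo.gif"));           // /images/logo.gif
```

Checking for a file extension before the folder case is what keeps static files out of the folder rule.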



1:39 pm on Jun 23, 2006 (gmt 0)

5+ Year Member

I assume this is link re-writing, so it should be looking for "<a href='...", right?

First, get all your href/locations and store the matches in an array:

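var myRE = /<a[^>]*href\s*=\s*['"]?[^'"\s>]*['"]?/ig;
//match() returns null when nothing is found, so fall back to an empty array
var results = allDocumentText.match(myRE) || [];
var newArry = new Array();

//loop through all results
for (var i = 0; i < results.length; i++) {

//try the matches in order
if (regexp1.test(results[i])) {

//if it's found, run its replaceString and store the new href value
newArry[i] = results[i].replace(regexp1, replaceString1);

} else if (regexp2.test(results[i])) {
newArry[i] = results[i].replace(regexp2, replaceString2);
} else if (regexp3.test(results[i])) {
newArry[i] = results[i].replace(regexp3, replaceString3);
} else if (regexp4.test(results[i])) {
newArry[i] = results[i].replace(regexp4, replaceString4);
} else {
//no rule matched, keep the original href
newArry[i] = results[i];
}
}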

Here are the regexps that you *may* be able to use (UNTESTED!)

First find all .html files under one directory:
var regexp1 = /(https?:\/\/)www\.domain\.com(?:\/([^\/]+)\/)([^.\/]+)[.]html/i;

Replace with:
var replaceString1 = '$1www.domain.com/cms/cms.asp?page=$2-$3';

Next find all .html files under two directories
var regexp2 = /(https?:\/\/)www\.domain\.com(?:\/([^\/]+)\/)([^.\/]+)\/([^.\/]+)[.]html/i;

Replace with:
var replaceString2 = '$1www.domain.com/cms/cms.asp?page=$2-$3-$4';

*You can continue adding in "([^.\/]+)\/" for each possible directory level

Next find HTM files:
var regexp3 = /(https?:\/\/)www\.domain\.com(?:\/[^\/]+\/)([^-]+)-([^.]+)[.]htm(\?[^\'"\s>;]+)?/i;

Replace with:
var replaceString3 = '$1www.domain.com/asp/$2/$3.asp$4';

Next find domain folders (www.domain.com/ordering/ www.domain.com/ordering)
var regexp4 = /(https?:\/\/)www\.domain\.com(?:\/([^\/]+)\/?)/i;

Replace with:
var replaceString4 = '$1www.domain.com/cms/cms.asp?page=$2-default';
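
A quick way to try the four patterns is to run them, in order, against a few sample hrefs. The sketch below is self-contained: it restates the patterns with the dots in the host name escaped and the replacement strings closed, and the helper name rewriteHref and the sample URLs are made up for illustration.

```javascript
// The four patterns and replacements, tried in the order given above.
const rules = [
  [/(https?:\/\/)www\.domain\.com(?:\/([^\/]+)\/)([^.\/]+)[.]html/i,
   "$1www.domain.com/cms/cms.asp?page=$2-$3"],
  [/(https?:\/\/)www\.domain\.com(?:\/([^\/]+)\/)([^.\/]+)\/([^.\/]+)[.]html/i,
   "$1www.domain.com/cms/cms.asp?page=$2-$3-$4"],
  [/(https?:\/\/)www\.domain\.com(?:\/[^\/]+\/)([^-]+)-([^.]+)[.]htm(\?[^'"\s>;]+)?/i,
   "$1www.domain.com/asp/$2/$3.asp$4"],
  [/(https?:\/\/)www\.domain\.com(?:\/([^\/]+)\/?)/i,
   "$1www.domain.com/cms/cms.asp?page=$2-default"],
];

// Hypothetical helper: apply the first rule that matches, else return as-is.
function rewriteHref(href) {
  for (const [pattern, replacement] of rules) {
    if (pattern.test(href)) return href.replace(pattern, replacement);
  }
  return href;
}

console.log(rewriteHref("http://www.domain.com/folder/file.html"));
// http://www.domain.com/cms/cms.asp?page=folder-file
console.log(rewriteHref("http://www.domain.com/folder/folder2/file.html"));
// http://www.domain.com/cms/cms.asp?page=folder-folder2-file
console.log(rewriteHref("http://www.domain.com/folder/app-file.htm?query=strings&remain=intact"));
// http://www.domain.com/asp/app/file.asp?query=strings&remain=intact
console.log(rewriteHref("http://www.domain.com/ordering/"));
// http://www.domain.com/cms/cms.asp?page=ordering-default
console.log(rewriteHref("http://www.domain.com/images/logo.gif"));
// http://www.domain.com/cms/cms.asp?page=images-defaultlogo.gif
```

The last case shows that the folder pattern is not anchored at the end, so it also catches links to static files. It would need a stricter ending before being used on real pages.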

Personally, I'd do this with a server side language once on all files.

Hope this helps. If any of the regexps do not work, let me know. I didn't test any of them :O

- JS


9:09 pm on Jun 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Thanks JS.

This is all for the Ionic URL Rewriting component (it's free so I think I can name it).

I'll try your rules.


7:12 am on Jun 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Hi JS,

I have looked over your post again and am confused. I am using server-side URL rewriting via an ASP component. You appear to be doing it via JavaScript. This isn't an option for me, as the pages don't actually exist at their original locations (only at the rewritten location).


