Forum Moderators: Ocean10000 & incrediBILL & phranque

Rewrite url with special characters

     
6:47 pm on Mar 2, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:July 15, 2015
posts:100
votes: 39


The old content management system I used created post links in this format:

example.com/readnews.asp?id=234345

WordPress will not let me simply change the post URLs to match the old ones. Using the Custom Permalinks plugin and entering readnews.asp?id=234345 just encodes the URL, so the exact URL I want returns a 404. I also tried changing the WordPress permalink setting to add "readnews.asp?id=", but this broke all links to posts.

Here's the code I am using to try to remove the readnews.asp?id= part, so that I can enter the post ID in WordPress myself, but it isn't working:

RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^id=([0-9]*)$
RewriteRule ^readnews\.asp$ /readnews/%1? [L]


Any help would be great. Thank you.
9:36 pm on Mar 2, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


Are you trying to rewrite (invisibly, behind the scenes) or redirect (tell the visitor to ask for a different URL)?

What's the significance of the "special characters" mentioned in the topic title? There don't seem to be any.

Is your code located before the WP section, but in the same htaccess file? (If not, it needs to be.)
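
A minimal sketch of what "before the WP section" looks like in practice. The WordPress block is the stock one a default install writes. The /readnews/ target is taken from the rule in the first post, and the rule is shown here as an external redirect, which is only one of the two options asked about above.

```apache
# Custom rules go first, so they run before WordPress takes over.
RewriteEngine On
RewriteBase /

# Old CMS URL: /readnews.asp?id=234345
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
RewriteRule ^readnews\.asp$ /readnews/%1? [R=301,L]

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

If the custom rule sat below the WordPress block, the catch-all rule to /index.php would match first and the custom rule would never get a chance to run.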
7:35 am on Mar 3, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:8289
votes: 331


What's the significance of the "special characters" mentioned in the topic title? There don't seem to be any
jambam is likely referring to the id parameter... those numbers (if they aren't just an example.)
11:51 am on Mar 3, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:July 15, 2015
posts:100
votes: 39


Hi, thanks for your replies.
Basically, I have lots of old URLs of the form example.com/readnews.asp?id=randomnumbers, and I want to re-upload these articles to my new content management system, WordPress, using the Custom Permalinks plugin to set /readnews.asp?id=3453465 as the URL. When I do this, my server/WordPress doesn't like the special/reserved characters in the URL I set and encodes them with percent signs and numbers. If you go to the non-encoded version, which all the links point to, it just shows a 404 error and doesn't redirect to the encoded version. Hope that makes sense. I'm really unsure whether it is htaccess or WordPress that is messing up my URLs, because I have done something like this on other servers using the Custom Permalinks plugin, and the URLs just worked as expected and weren't automatically encoded.
12:04 pm on Mar 3, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:8289
votes: 331


Along with the benefits of using a CMS [webmasterworld.com] also comes the downside. Wordpress does its own thing with file paths. You may need to manually rename these pages.
2:48 pm on Mar 3, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


My advice is to let your old CMS URLs go. Use title-based permalinks in WP and forget about preserving the old URLs. Search engines such as Google and Bing are very adept at reindexing sites, and they'll do it very quickly. The only search engine I've had difficulty with was Sogou (a Chinese search engine), which took about two weeks before it clued in.

Site owners change CMS all the time. Search engines are used to this behaviour. Your SERP should not suffer. A tech solution is sometimes not the best solution.
8:12 pm on Mar 3, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


lots of old urls that consist of example.com/readnews.asp?id=randomnumbers and want to reup these articles to my new content management system wordpress using the custom permalinks plugin to add /readnews.asp?id=3453465 as the url however when I do this my server/wordpress doesn't like the special/reserved characters in the url I

But, but, but, splutter ... what "special/reserved characters"? All your examples just say "random numbers", both in prose and with illustrations. It would help if you said what non-numeric characters are involved.

If it were an external redirect, you'd handle non-word characters with the [NE] tag. But that shouldn't apply in internal rewrites (because nothing is being sent to a browser).
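
For illustration, a sketch of where [NE] matters in an external redirect. The /anchor/ path and bigpage.html are hypothetical names, not part of this thread's site.

```apache
RewriteEngine On

# Without [NE], mod_rewrite escapes the "#" in the target to %23,
# so the browser never sees a fragment.
# With [NE], /anchor/section2 redirects to /bigpage.html#section2
RewriteRule ^anchor/(.+)$ /bigpage.html#$1 [NE,R=302,L]
```

The flag only affects how the redirect target is escaped on the way out. It does nothing about how an incoming request is matched.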

Your old URLs involve .asp. Is it possible that WP doesn't recognize the .asp extension, and therefore doesn't understand that ?blahblah is the query string?

:: grasping at straws ::
9:09 pm on Mar 3, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:July 15, 2015
posts:100
votes: 39


Sorry for my bad explanation (to be honest, I don't know whether it's WordPress or a server thing, as WordPress seems to handle its own /?p=123 just fine).
Basically, in WordPress I want to manually edit the URLs of old posts, for example so that one post reads readnews.asp?id=234345 and another readnews.asp?id=344367. However, if I use the Custom Permalinks plugin to change a permalink to the one I want, such as readnews.asp?id=234345, my server spits out an encoded URL instead, replacing ? with %3F and = with %3D. The non-encoded version, which all the links point to, doesn't redirect to the encoded version but goes to a 404 error page, because my server treats the encoded and unencoded URLs as different URLs.
Just to be clear, the post ID means nothing to WordPress anymore. All I care about is getting people who click the old links to the right page and not to a 404 error.

Thanks so much guys for trying to help me out :)
9:20 pm on Mar 3, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Would you need a large table mapping each asp?id=n to a WP permalink?

WordPress automatically enables the default permalink structure after you install WordPress. The number that is used in the default permalink advises WordPress where the content can be found in your database. To be more specific, the number refers to the ID of the table row in the wp_posts table of your WordPress database (the table prefix for your website will be different if you changed it during the installation process). For example, http://www.yourwebsite.com/?p=50 would refer to the 50th row in your website’s wp_posts table and http://www.yourwebsite.com/?page_id=100 would refer to the 100th.

[elegantthemes.com...]
10:01 pm on Mar 3, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


Try adding the [NE] flag. Just to eliminate all possibilities.
11:38 am on Mar 6, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:July 15, 2015
posts:100
votes: 39


Thanks. Is there a way, via htaccess, to use a regex to cut out readnews.asp?id= and permanently redirect to the shortened version?
For example:

http://example.com/readnews.asp?id=12345678 goes to http://example.com/12345678

and so on..

I think that would be easier.
1:14 pm on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


i think what you want is an external redirect to the canonical url, which you could do by changing your ruleset from an internal rewrite to an external redirect:
RewriteEngine On

RewriteCond %{QUERY_STRING} ^id=([0-9]*)$
RewriteRule ^readnews\.asp$ http://example.com/readnews/%1? [L,R=301]


however i wonder how your server will handle the subsequent request for http://example.com/12345678 - do you need to internally rewrite that request to an internal url?
i.e. a request for http://example.com/12345678 gets internally rewritten to /readnews.asp?id=12345678
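
A sketch of the redirect-plus-rewrite pairing being asked about here. It only applies if readnews.asp still exists on the server to receive the rewritten request, which is an assumption. On a WordPress site, the WP section of the .htaccess would handle the second step instead.

```apache
RewriteEngine On

# 1. External redirect, only when the client itself asked for the old URL.
#    THE_REQUEST holds the original request line ("GET /path?query HTTP/1.1"),
#    so requests rewritten internally by step 2 do not match: no loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/readnews\.asp\?id=([0-9]+)\s
RewriteRule ^readnews\.asp$ http://example.com/%1? [R=301,L]

# 2. Internal rewrite: map the short URL back to the old script.
RewriteRule ^([0-9]+)$ /readnews.asp?id=$1 [L]
```

Testing %{QUERY_STRING} in step 1 instead would match the internally rewritten request as well, and the two rules would chase each other.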

[edited by: phranque at 1:35 pm (utc) on Mar 6, 2017]

1:22 pm on Mar 6, 2017 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:July 15, 2015
posts:100
votes: 39


thanks ever so much phranque that code you gave works great. I can now just change the wordpress permalink structure in wordpress to readnews

You have saved me a load of stress! :)
1:36 pm on Mar 6, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


Still, how will you map the .asp based doc id to the WP-based p=999 id?
1:38 pm on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


that code you gave works great

you were close from the start.

everything lucy24 asked in her first post was important.
including the third question...
Is your code located before the WP section, but in the same htaccess file? (If not, it needs to be.)
1:41 pm on Mar 6, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


Still, how will you map the .asp based doc id to the WP-based p=999 id?

i believe the request will be passed to the wordpress script by the unseen WP section of the .htaccess referred to above and the wordpress script will handle it:
I can now just change the wordpress permalink structure in wordpress to readnews

i.e. WP won't see the /readnews.asp?id=12345678 request, it will see a request for /readnews/12345678 instead.
9:51 pm on Mar 6, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


Is there a way that I can via the htaccess use regex to cut out readnews.asp?id= and permanent redirect to the cut out version
for example

http://example.com/readnews.asp?id=12345678 goes to http://example.com/12345678
Did you mean this kind of thing?
RewriteCond %{QUERY_STRING} id=(\d+)
RewriteRule ^readnews\.asp http://example.com/%1 [R=301,L]

?
12:20 am on Mar 7, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


RewriteCond %{QUERY_STRING} id=(\d+)
RewriteRule ^readnews\.asp http://example.com/%1 [R=301,L]

the suggestion to use \d+ rather than [0-9]* is worth discussing.
however i would precede the id= with either a start anchor or a word boundary depending on the requirements of the application.

also, the substitution string in this case should end in the question mark (?) to prevent appending the query string from the request to the redirected url.
https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewriterule
Modifying the Query String

By default, the query string is passed through unchanged. You can, however, create URLs in the substitution string containing a query string part. Simply use a question mark inside the substitution string to indicate that the following text should be re-injected into the query string. When you want to erase an existing query string, end the substitution string with just a question mark. To combine new and old query strings, use the [QSA] flag.
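
The cases in that passage, sketched as alternative rules for a request to /readnews.asp?id=12345678. The QSD flag assumes Apache 2.4 or later, and the /page?src=old target in the last rule is a hypothetical example.

```apache
# Alternatives; use only one.

# (a) No "?" in the substitution: the original query string is passed through.
#     Result: http://example.com/12345678?id=12345678
RewriteCond %{QUERY_STRING} ^id=(\d+)$
RewriteRule ^readnews\.asp$ http://example.com/%1 [R=301,L]

# (b) Trailing "?": the original query string is erased.
#     Result: http://example.com/12345678
RewriteCond %{QUERY_STRING} ^id=(\d+)$
RewriteRule ^readnews\.asp$ http://example.com/%1? [R=301,L]

# (c) Same effect as (b) using the QSD flag (Apache 2.4+).
RewriteCond %{QUERY_STRING} ^id=(\d+)$
RewriteRule ^readnews\.asp$ http://example.com/%1 [QSD,R=301,L]

# (d) New query string combined with the old one via QSA.
#     Result: http://example.com/page?src=old&id=12345678
RewriteRule ^readnews\.asp$ http://example.com/page?src=old [QSA,R=301,L]
```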
1:39 am on Mar 7, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 81
votes: 4


RewriteCond %{QUERY_STRING} ^id=([0-9]*)$

^ start string
[0-9]: a number
*: repeat previous
$: end string
RewriteCond %{QUERY_STRING} id=(\d+)

(\d+): numeric, repeated

First one only allows id=numeric, nothing before or after, so is stricter. Second is shorter, and allows anything before or after the id=numeric. Is one better than the other?
4:15 am on Mar 7, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


[0-9]: a number
*: repeat previous

[0-9]* means zero or more numeric digits.

(\d+): numeric, repeated

\d+ means one or more numeric digits.

First one only allows id=numeric, nothing before or after, so is stricter. Second is shorter, and allows anything before or after the id=numeric. Is one better than the other?

depending on the requirements of the application

do you need to handle requests for urls with more than the id= parameter in the query string?
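
To make the * versus + difference concrete, a small sketch using the /readnews/ target from earlier in the thread:

```apache
# Alternatives; use only one.

# "*" also matches an empty value: /readnews.asp?id= redirects to
# http://example.com/readnews/ with nothing after the slash.
RewriteCond %{QUERY_STRING} ^id=([0-9]*)$
RewriteRule ^readnews\.asp$ http://example.com/readnews/%1? [R=301,L]

# "+" requires at least one digit, so /readnews.asp?id= does not match
# and falls through to whatever rules come next.
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
RewriteRule ^readnews\.asp$ http://example.com/readnews/%1? [R=301,L]
```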
5:54 am on Mar 7, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


Is one better than the other?

There is more than one difference between the two versions. \d vs. [0-9] is probably the most trivial difference; it's got nothing to do with any of the others. (But, hey, a savings of three bytes in your htaccess...)

I don't think you would want the capture to be \d* or [0-9]* with * instead of + because then you're allowing for URLs that say "id="nothing, which would redirect to, er, I guess the root. If you really anticipate seeing requests for URLs with nothing in the "id=" slot, find some other way to deal with it.

^ (opening anchor) is a good efficiency move if you can be 100% certain that the "id=" is the first thing in the query string. If "id" is your only parameter, then obviously it will come first. Otherwise replace the ^ with \b (word boundary). Or, as a last resort
(^|&)
but this is only needed if you have other parameters whose names end in "id" preceded by a non-word character. Pretty unlikely, right?

$ closing anchor isn't needed if you're going to throw away any other parameters, assuming they even exist. Anything after the series of numerals will be ignored anyway: \d+ or [0-9]+ means "capture for as long as the numbers continue, whether or not there is anything after them".
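
The anchoring options above, sketched side by side. The sample query strings in the comments (x=1, userid, post-id) are made up for illustration.

```apache
# Alternatives; use only one.

# Opening anchor: "id" must be the first parameter.
# Matches id=123 and id=123&x=1, but not x=1&id=123.
RewriteCond %{QUERY_STRING} ^id=(\d+)
RewriteRule ^readnews\.asp$ http://example.com/%1? [R=301,L]

# Word boundary: "id" may appear anywhere.
# Matches x=1&id=123 and rejects userid=123,
# but also accepts post-id=123, since "-" is a non-word character.
RewriteCond %{QUERY_STRING} \bid=(\d+)
RewriteRule ^readnews\.asp$ http://example.com/%1? [R=301,L]

# Start of string or "&": "id" must be a whole parameter name.
# The extra group shifts the digits to %2.
RewriteCond %{QUERY_STRING} (^|&)id=(\d+)
RewriteRule ^readnews\.asp$ http://example.com/%2? [R=301,L]
```

Note the backreference change in the last variant: the alternation is itself a capturing group, so the digits land in %2 rather than %1.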
 
