Forum Moderators: Ocean10000 & phranque

Rewrite image files with query string to new urls

     
3:51 pm on Aug 23, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 24, 2003
posts: 763
votes: 91


Hello, I was wondering if anyone might be able to assist with this. I am launching my new site and have uploaded my old site's images to a directory with the same name on the new site. Unfortunately the old site took each image and appended a query string, so now I am trying to come up with ONE redirect to match them all to the location on the new site. The directory they are located in is /d/ and the base image file name will remain the same on both sites.

The problem is that I had to truncate the image file names on the new server, so for example:
http://example.com/d/filename&g2_itemId=012345
now becomes:
http://example.com/d/012345-filename.jpg

So when a request is made for a file of any name in the /d/ directory, with a query string containing an ID of up to five digits, it should redirect to the /d/ directory on the new server and find the same image under its rewritten filename there. What I do not know how to do is transfer two parameters, one from the original filename and the other from the appended ID#. This is my attempt at the rewrite rule...Any help is appreciated!

<IfModule mod_rewrite.c>
Options +FollowSymlinks
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/d/([a-z0-9/-]+)$
RewriteCond %{QUERY_STRING} ^g2_itemId=([0-9]*)$
RewriteRule ^(.*)$ http://example.com/d/$2\-$1\.jpg [QSD,L,R=301]

</IfModule>

[edited by: ichthyous at 5:01 pm (utc) on Aug 23, 2016]

4:25 pm on Aug 23, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 24, 2003
posts: 763
votes: 91


I tested this and the second parameter ($2) is not transferring at all, it's just blank...
5:24 pm on Aug 23, 2016 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts: 328
votes: 24


http://example.com/d/filename&g2_itemId=012345


I assume that was just a typo... "&" should be "?"?

http://example.com/d/filename_.jpg


Your current redirect would have resulted in "http://example.com/d/-filename.jpg". (?)

...except that the second parameter ($2) is not transferring


You should be using %1, instead of $2, in the RewriteRule substitution as a backreference to the first captured group (ie. "([0-9]*)") in the last matched CondPattern. Backreferences that start with "$" reference the RewriteRule pattern (you don't have a second group in the RewriteRule pattern, hence the empty string substitution).

However, your ruleset can be optimised. You should try to match as much as possible in the RewriteRule pattern, rather than relying totally on the RewriteCond (REQUEST_URI variable), since the RewriteRule pattern is processed first. Did you intend to include a slash in your character class? Also, be as specific as possible with your patterns.

So, something like the following:


RewriteCond %{QUERY_STRING} ^g2_itemId=(\d{1,5})$
RewriteRule ^d/([a-z0-9-]+)$ http://example.com/d/%1-$1.jpg [QSD,L,R=301]


\d{1,5} matches a string of 1 to 5 digits (ie. "up to any five number combination")
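
To make the two kinds of backreference concrete, here is an annotated sketch of the same ruleset, assuming the target is the "ID-filename.jpg" form from the original question and that the rules live in the site's root .htaccess. The sample request in the comments (ID 12345) is hypothetical, and the QSD flag needs Apache 2.4 or later.

```apache
# Hypothetical request: /d/filename?g2_itemId=12345
# In .htaccess the RewriteRule pattern is matched against the path
# without its leading slash, i.e. "d/filename".
RewriteEngine On

# %1 = first captured group of the last matched RewriteCond -> "12345"
RewriteCond %{QUERY_STRING} ^g2_itemId=(\d{1,5})$

# $1 = first captured group of the RewriteRule pattern -> "filename"
# QSD drops the original query string from the redirect target.
RewriteRule ^d/([a-z0-9-]+)$ http://example.com/d/%1-$1.jpg [QSD,L,R=301]
```

With that request the substitution should come out as http://example.com/d/12345-filename.jpg. The redirected URL contains a dot and carries no query string, so neither pattern matches it again and the rule should not loop.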
5:38 pm on Aug 23, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 24, 2003
posts: 763
votes: 91


Yes, I meant to type a ?, not & sign. And YES this worked perfectly...you have no idea how much time you just saved me...there are thousands of images and I was going to enter them manually into htaccess one by one, but then I realized that the original developer sucked all the images from my live site with a script and that they have the query string parameter included in the title now. Actually, I don't even have to copy them to the /d/ directory as they are in wp-content/uploads/ as is...so I swapped out the directories and this works fine. Thanks so much for the help!
3:20 pm on Aug 25, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 24, 2003
posts: 763
votes: 91


Hello Whitespace, I spoke too soon...this code is working fine, except that I notice that about half of the images have been indexed with their session ID, which will need to be stripped out completely. This is the code I am using and which works to redirect image urls which have no session ID:

RewriteCond %{QUERY_STRING} ^g2_itemId=(\d{1,5})$
RewriteRule ^d/([a-z0-9-]+)$ http://example.com/wp-content/uploads/%1-$1.jpg [QSD,L,R=301]

I checked Google Search Console and about 50% of the images are indexed with the session ID, and 50% are not. The session ID takes the form of "&g2_serialNumber=7", with the number changing. So the first thing I would need to do is check if the query string contains two parameters and strip the second one, then redirect the url...for example:

Url with session ID: http://example.com/d/aerial-panorama-lower-manhattan?g2_itemId=3177&g2_serialNumber=6
Stripped session ID: http://example.com/d/aerial-panorama-lower-manhattan?g2_itemId=3177
Redirected URL (working already): http://example.com/wp-content/uploads/3177-aerial-panorama-lower-manhattan.jpg

Any advice on how to handle stripping the session ID from the URL first? Thanks!
4:22 pm on Aug 25, 2016 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts: 328
votes: 24


So first thing I would need to do is check if the string contains two parameters and strip the second one, then redirect the url...


That would seem to be overcomplicating matters. It seems you simply want to ignore the session ID (ie. "g2_serialNumber" URL param). Assuming this URL param always occurs after the "g2_itemId" param then you can simply remove the trailing anchor (ie. "$") on the CondPattern....


RewriteCond %{QUERY_STRING} ^g2_itemId=(\d{1,5})


This will now match any query string that starts with "g2_itemId=" followed by digits (eg. "g2_itemId=123"), rather than only a query string that is exactly that one parameter.
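
One side effect of dropping the "$" is that the pattern no longer checks what follows the digits, so a query string such as "g2_itemId=1234567" would also match and capture only "12345". If that matters, or if "g2_itemId" might not always come first, a stricter variant is possible. The following is only a sketch, not something tested against this site; it reuses the wp-content/uploads target mentioned earlier in the thread and assumes Apache 2.4 or later for QSD.

```apache
RewriteEngine On

# Match g2_itemId as a whole parameter, wherever it sits in the query string:
#   (?:^|&)  start of the query string, or just after another parameter
#   (?:&|$)  followed by another parameter, or the end of the query string
# Both groups are non-capturing, so %1 is still the 1-5 digit ID.
RewriteCond %{QUERY_STRING} (?:^|&)g2_itemId=(\d{1,5})(?:&|$)
RewriteRule ^d/([a-z0-9-]+)$ http://example.com/wp-content/uploads/%1-$1.jpg [QSD,L,R=301]
```

For "g2_itemId=3177&g2_serialNumber=6" this should still redirect to 3177-aerial-panorama-lower-manhattan.jpg, while an ID longer than five digits would simply not match.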
5:25 pm on Aug 25, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 24, 2003
posts: 763
votes: 91


I see, that did the trick, obviously I'm not that well versed in rewrite rules! The second parameter is being ignored now. The old site (still live) is from 2006 so it will be good to retire it and move to a platform that has better control of the urls...it's been an arduous design and optimization process to move such a big site with existing traffic. Thanks again for your help, hopefully others can benefit from this too...