Apache Web Server Forum

    
Rewriting 150,000+ image files into new directories
motorhaven




msg:4472372
 1:55 am on Jul 4, 2012 (gmt 0)

A web site I purchased has user uploads, and unfortunately they are not well organized for server speed. All 150,000 images are in the same directory (Linux box). Naturally this adds substantial overhead to each image fetch.

What I want to do is make 100 sub-directories in the forum:

1
1/1
1/2
1/3
1/4
1/5
1/6
1/7
1/8
1/9
2/1
2/2

and so on.

Every filename starts with two digits. So I would place each file in the appropriate sub-directory using those first 2 digits, and then use a RewriteRule so Apache can find it. I've banged my head against the wall trying to get the RewriteRules working.

Basically:

11ABCEFG.jpg would be rewritten to:

/1/1/11ABCEFG.jpg

94SJRDHEID.jpg would be rewritten to:

/9/4/94SJRDHEID.jpg

It should be simple. I've done a lot of work with mod_rewrite, but this one is eluding me!
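
For the file-moving half of the job, here is a minimal sketch in Python, not anything from the site's actual code. The names `shard_path` and `move_into_shards` are made up for illustration, and it assumes every image sits directly in one directory and that names not starting with two digits should be left alone. It defaults to a dry run so the planned moves can be inspected first.

```python
import os
import re
import shutil

# Names must start with two ASCII digits; anything else is left where it is.
LEADING_DIGITS = re.compile(r"^([0-9])([0-9])")


def shard_path(filename):
    """Return '<d1>/<d2>/<filename>', or None if the name lacks two leading digits."""
    m = LEADING_DIGITS.match(filename)
    if m is None:
        return None
    return "/".join((m.group(1), m.group(2), filename))


def move_into_shards(image_dir, dry_run=True):
    """Plan (and optionally perform) the moves into two-level sub-directories.

    Returns a list of (source, destination) pairs. With dry_run=True
    nothing on disk is touched.
    """
    moves = []
    for entry in os.scandir(image_dir):
        if not entry.is_file():
            continue  # skip the shard directories themselves
        rel = shard_path(entry.name)
        if rel is None:
            continue
        moves.append((entry.path, os.path.join(image_dir, *rel.split("/"))))
    if not dry_run:
        for src, dest in moves:
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.move(src, dest)
    return moves
```

Calling `move_into_shards("/path/to/images")` only lists what would happen; pass `dry_run=False` once the list looks right. The path here is a placeholder.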

 

matrix_jan




msg:4472375
 2:17 am on Jul 4, 2012 (gmt 0)

I did something similar years ago. Should be something similar to this one. Lucy might help you :)

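# Anchor at the start of the request line and escape the spaces, so that the redirected URL (/1/1/11ABCEFG.jpg) cannot match again and loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([0-9])([0-9])([^/?\ ]+)\.jpg\ HTTP/
RewriteRule . http://www.example.com/%1/%2/%1%2%3.jpg? [R=301,L]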
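
To see what a redirect of this kind does to a request line, here is a rough Python approximation. It is a sketch only: Python's `re` is not Apache's regex engine, `redirect_target` is a made-up helper, and it assumes the old images are requested from the site root. The point it illustrates is that the pattern has to be anchored so that a request for the new URL is not redirected a second time.

```python
import re

# Request line: method, a root-level name starting with two digits
# and ending in .jpg, then the protocol.
REQUEST_LINE = re.compile(r"^[A-Z]+ /([0-9])([0-9])([^/? ]+)\.jpg HTTP/")


def redirect_target(request_line, host="www.example.com"):
    """Return the redirect URL for an old-style request, or None if no redirect applies."""
    m = REQUEST_LINE.match(request_line)
    if m is None:
        return None
    d1, d2, rest = m.groups()
    return f"http://{host}/{d1}/{d2}/{d1}{d2}{rest}.jpg"


print(redirect_target("GET /11ABCEFG.jpg HTTP/1.1"))
print(redirect_target("GET /1/1/11ABCEFG.jpg HTTP/1.1"))
```

The first call gives the new location; the second gives `None`, because a path that already starts with `/1/1/` does not have two digits right after the first slash.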

g1smd




msg:4472420
 6:53 am on Jul 4, 2012 (gmt 0)

It's too soon to start coding. We're still discussing the requirements.


So you're going to move the physical files to new folders.

What happens to the URLs?

You want new URLs and will redirect requests for the old URLs to the new URLs? You'll need to update the internal links on all pages of the site too.

OR

You continue to use the old URLs and will rewrite the request to fetch the file from the new folder structure? Nothing else on the site will need to be edited.

Both are possible. The second is preferred.

lucy24




msg:4472436
 7:54 am on Jul 4, 2012 (gmt 0)

Do you have a doodad that will automatically shove all new uploads into the appropriate directory? If you don't change the physical file structure, you still have a directory containing 150,000 files; no combination of Rewrites and Redirects will change that.

Are the image files directly accessible by name, as in image-hosting sites, or can they only be called by pages? If they're not human-accessible, there's no need for a Rewrite. html pages don't know from pretty URLs.

motorhaven




msg:4472640
 10:59 pm on Jul 4, 2012 (gmt 0)

I have rewritten the code which places new uploads in the new directories, but I'm not deploying it until I have rewrites for the old image locations complete.

It's a forum, and rewriting all the existing posts that point to the old image locations isn't feasible just yet (it is planned). New posts will reference the correct new locations.

At some later point I'll write a script to update the old posts in the database, but for now the priority is to spread the images across multiple directories.

Plus we have a lot of traffic via Google images, so even if I rewrite how our site references the files I still need rewriterule(s) to be able to let Google know the new locations.

g1smd




msg:4472642
 11:09 pm on Jul 4, 2012 (gmt 0)

The whole idea of a rewrite (as opposed to a redirect) is that the URLs used "out there" on the web do NOT change. This avoids having to redirect requests and it avoids having to alter links on pages.

New posts should also continue to use the old URL scheme. So to your rewrite:

RewriteRule ^(([0-9])([0-9])[^/]+)\.jpg$ /$2/$3/$1.jpg [L]

Simples!
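
Before putting a rule of this shape live, it may be worth checking which existing filenames it would miss. A small Python sketch follows, under stated assumptions: the pattern is a Python stand-in rather than Apache's own matching, it is matched against the path without a leading slash (as in a per-directory context), and it allows any non-slash characters after the two digits so that names like 11ABCEFG.jpg from earlier in the thread are covered. `rewrite` and `unmatched` are illustrative names.

```python
import re

# Stand-in for the rule's pattern: two leading digits, then anything
# without a slash, then a lowercase .jpg extension.
PATTERN = re.compile(r"^(([0-9])([0-9])[^/]+)\.jpg$")


def rewrite(path):
    """Return the internal path the rule would produce, or the path unchanged."""
    m = PATTERN.match(path)
    if m is None:
        return path
    return f"/{m.group(2)}/{m.group(3)}/{m.group(1)}.jpg"


def unmatched(names):
    """List the names the pattern would not touch, so they can be handled separately."""
    return [n for n in names if PATTERN.match(n) is None]


print(rewrite("11ABCEFG.jpg"))
print(unmatched(["11ABCEFG.jpg", "9.jpg", "photo.jpg", "12345.JPG"]))
```

Feeding `unmatched` the output of `os.listdir` on the image directory would show up front whether things like uppercase extensions or single-digit names need their own handling.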

Leosghost




msg:4472644
 11:21 pm on Jul 4, 2012 (gmt 0)

RewriteRule ^(([0-9])([0-9])[^/]+)\.jpg$ /$2/$3/$1.jpg [L]

Simples!


Hah ..meerkats write regex..you are A.Orlov* and I claim my 5 rubles ..
*or possibly Sergei in which case 2.5 rubles and we'll call it queets :)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved