
Home / Forums Index / Code, Content, and Presentation / Content Management
Forum Library, Charter, Moderators: ergophobe

Content Management Forum

    
How do you Store Millions of Images on your Server?
NeedExpertHelp




msg:4278916
 4:24 pm on Mar 9, 2011 (gmt 0)

Hello,

I want to be able to scale to millions of user profile pics on my Server.

I currently store all images in one folder, which is a big no-no, so I want to spread them out into many folders and sub-folders (e.g. aa/bb/ etc...).

What is the best and most efficient way of doing that, especially if I do not want to have to call the DB to get the filename/path for that user's profile pic?

I'm thinking of hashing the username and using the first 4 characters of that hash to generate/locate the path for that user's profile pic. That way I wouldn't have to fetch anything extra from the DB, since I will always have the user's username. For example, if the first 4 characters of the user's username hash were "aabb", I would store that user's profile pic under aa/bb/username/profile.jpg. In theory that should let me scale to millions of users without adding anything to the DB, while spreading all the pics evenly across the two-level folder structure.
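To make the idea concrete, here is a minimal sketch of how such a path could be derived. The choice of SHA-256, the "images" root, the lowercasing step, and the function name profile_pic_path are all assumptions for illustration; any stable hash would do, since it is only used to spread files, not for security.

```python
import hashlib
from pathlib import PurePosixPath


def profile_pic_path(username: str, root: str = "images") -> PurePosixPath:
    """Build root/aa/bb/username/profile.jpg from a hash of the username."""
    # Assumption: usernames are case-insensitive, so normalise before hashing.
    name = username.strip().lower()
    # hexdigest() returns only the characters 0-9 and a-f.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    # First two hex chars pick the top-level dir, the next two the subdir.
    return PurePosixPath(root, digest[:2], digest[2:4], name, "profile.jpg")


print(profile_pic_path("Alice"))
```

Because the path depends only on the username, no DB lookup is needed to find the file. The username would still need to be safe to use as a directory name.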

Any ideas/input?

Thanks!

 

explorador




msg:4279019
 6:45 pm on Mar 9, 2011 (gmt 0)

Every user has a unique ID and also a unique nickname. You could use either of those to identify the images. If the ID is too long, I'd go with the nickname, after passing it through some filters to eliminate problematic characters or symbols. I think it would also help on the SEO side to name profile images this way, as nicknames won't change.
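A rough sketch of what such a filter might look like. The allowed character set, the "_" replacement, the "user" fallback, and the name safe_name are assumptions, not something the forum software provides.

```python
import re


def safe_name(nickname: str) -> str:
    """Reduce a nickname to lowercase letters, digits, '_' and '-'."""
    # Replace every run of other characters (spaces, slashes, dots, ...) with "_".
    cleaned = re.sub(r"[^a-z0-9_-]+", "_", nickname.strip().lower())
    # Trim leading/trailing underscores; fall back to a placeholder if nothing is left.
    return cleaned.strip("_") or "user"


print(safe_name("Jo/hn Doe!"))
```

Note that different nicknames can collapse to the same result (for example "a b" and "a_b"), so this only works safely if nicknames are already restricted or the collision is checked elsewhere.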

I'm using that on a product database where every image has the name of the unique id of the product.

NeedExpertHelp




msg:4279263
 1:20 am on Mar 10, 2011 (gmt 0)

Hi explorador, thanks for your input, but how would you organize the folder structure based on the username such that the millions of profile pics are evenly-spread without overloading any one folder?

Does anyone else have any other ideas/input?

Thanks!

NeedExpertHelp




msg:4279974
 1:45 am on Mar 11, 2011 (gmt 0)

Anyone? :)

ergophobe




msg:4282057
 6:05 pm on Mar 15, 2011 (gmt 0)

I think your hash idea is actually the best. Assuming a hex hash, you're looking at 16^2 (256) directories, each with 16^2 (256) subdirectories, so 65,536 in total. I think I read that when *nix systems perform certain directory/file operations, they have to read the whole directory into memory, so spreading the files out makes sense.

I wonder, though, whether simply using 16^3 directories (the first three characters) would be enough. 4 million users would give you 4,096 directories with roughly 1,000 entries each, which seems more manageable to me, though you'd have to look around for any performance implications.
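The arithmetic behind those numbers can be checked with a few lines. This is a sketch that assumes a hex hash (16 possible values per character) and a perfectly even spread; the function name is made up for illustration.

```python
def dirs_and_load(users: int, prefix_chars: int, alphabet: int = 16):
    """Return (number of leaf directories, average entries per directory)."""
    dirs = alphabet ** prefix_chars
    return dirs, users / dirs


# Three hex characters versus four, for 4 million users.
print(dirs_and_load(4_000_000, 3))  # (4096, 976.5625)
print(dirs_and_load(4_000_000, 4))  # (65536, 61.03515625)
```

So three characters gives just under 1,000 entries per directory, while four characters gives about 61.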

BradleyT




msg:4283140
 5:37 pm on Mar 17, 2011 (gmt 0)

Probably something that could be answered or asked at High Scalability.

brotherhood of LAN




msg:4283147
 5:55 pm on Mar 17, 2011 (gmt 0)

If it's your own server, I'd consider a separate partition with a filesystem format that best suits your needs. I don't have any particular filesystem recommendations, but there are enough differences between them to make it a proper consideration. For example, a smaller block size may help, since with lots of small files each file's size is 'rounded up' to a multiple of the filesystem's block size.
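To illustrate the rounding effect, here is a simplified sketch. The 5,000-byte file and the block sizes are made-up example values, and the model ignores inodes, metadata, and filesystem features such as tail packing or inline data.

```python
def on_disk_size(file_bytes: int, block_size: int = 4096) -> int:
    """Space a file occupies when rounded up to whole blocks (simplified model)."""
    blocks = -(-file_bytes // block_size)  # ceiling division
    return blocks * block_size


# A hypothetical 5,000-byte thumbnail on two different block sizes.
print(on_disk_size(5000, 4096))  # 8192
print(on_disk_size(5000, 1024))  # 5120
```

Multiplied across millions of small images, the difference in wasted space per file can add up.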

explorador




msg:4283758
 5:10 pm on Mar 18, 2011 (gmt 0)

"how would you organize the folder structure based on the username such that the millions of profile pics are evenly-spread without overloading any one folder?"

Hi, kinda late, but I would use alphabetical organization there.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved