Forum Moderators: phranque

Storing images as blobs - database or cloud? pros and cons?



2:49 am on Nov 6, 2009 (gmt 0)

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Until several projects ago, I always stored photos & product images as files. My web app usually had a root-level folder named "/images", and all my images would be neatly organized and named inside it.

On a site with over 10,000 little JPGs, that method gets cumbersome. Uploading, downloading, backing up, reverting & syncing the web project was often an 8-hour job that I'd run overnight. In the morning I'd check, and if the gods were benevolent, all those files would have transferred. But not always.

Then last year, I was building another project where users could upload a little photo as an avatar, and it struck me: why not store the photos in the database? So I did.

Then I used the same technique on another project, where users uploaded little thumbnail photos of nifty objects...

Sure, it's simple to create an <img> element that points directly to the image's physical location, but through the magic of URL rewriting (.htaccess) and a little PHP hocus pocus, using the built-in GD library or Imagick, delivering images at an artificial URL from a blob in a database isn't too difficult either.

Maybe I'll describe the technique in detail, in another post.

I'd like to get a good, thorough pile of pros and cons about storing images as blobs. I've done it on three projects now, and this latest one is potentially image-heavy (i.e. it's storing real photos, not just profile avatars).

Pros:
- your images are suddenly sortable, indexable, and easily retrievable using SQL commands. Usually I access images via a simple row identifier, but not always!
- it's more elegant (imho)
- it keeps my scripts and templates separate from my data. As I work on the back-end of my site, I can use the same methods on images as I use to keep my live databases in sync.
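
To make the "sortable, indexable" point concrete, here is a minimal sketch in Python, with SQLite standing in for whatever database the site really uses. The table name `images` and its columns (`owner`, `mime`, `uploaded`, `data`) are assumptions for illustration, not the schema from any of the projects described above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE images (
           id       INTEGER PRIMARY KEY,
           owner    TEXT NOT NULL,
           mime     TEXT NOT NULL,
           uploaded TEXT NOT NULL,  -- ISO date, so it sorts correctly as text
           data     BLOB NOT NULL
       )"""
)
# Metadata columns can be indexed like any other data.
conn.execute("CREATE INDEX idx_images_owner ON images (owner, uploaded)")

# Fake payloads; real rows would hold the uploaded image bytes.
rows = [
    ("alice", "image/jpeg", "2009-10-01", b"fake-jpeg-1"),
    ("bob", "image/png", "2009-10-15", b"fake-png-1"),
    ("alice", "image/jpeg", "2009-11-02", b"fake-jpeg-2"),
]
conn.executemany(
    "INSERT INTO images (owner, mime, uploaded, data) VALUES (?, ?, ?, ?)", rows
)


def newest_for_owner(owner, limit=10):
    """Return (id, uploaded, size_in_bytes) for an owner's newest images."""
    cur = conn.execute(
        "SELECT id, uploaded, length(data) FROM images "
        "WHERE owner = ? ORDER BY uploaded DESC LIMIT ?",
        (owner, limit),
    )
    return cur.fetchall()


alice_images = newest_for_owner("alice")
```

The query never touches the filesystem: filtering, sorting and sizing all happen in SQL, which is the convenience being described.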

Cons:
- it makes your database bigger. Quite a lot bigger, actually.
- response time is probably a little slower. I haven't measured anything; I just assume it is.
- processing overhead: I usually suck the blob into Imagick or GD before outputting it to HTTP. It may not be necessary to do that, but that's how I'm doing it now.
- scalability concerns: what happens when the number of images gets into the thousands? millions? My database might get so bulky that it impacts "regular" data performance.
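
The response-time question can be measured rather than guessed. Below is a rough timing sketch in Python that reads the same payload from a database row and from a file. Everything in it is an assumption for illustration (SQLite on local disk, a random 20 KB payload, 200 images), so the numbers it prints say little about MySQL over a network. Note that it hands back the stored bytes untouched, with no decode/re-encode step.

```python
import os
import sqlite3
import tempfile
import time

PAYLOAD = os.urandom(20_000)  # stand-in for a ~20 KB JPG
COUNT = 200                   # number of images in the test

workdir = tempfile.mkdtemp()
db = sqlite3.connect(os.path.join(workdir, "images.db"))
db.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, data BLOB NOT NULL)")

# Store every image twice: once as a blob row, once as a plain file.
for i in range(1, COUNT + 1):
    db.execute("INSERT INTO images (id, data) VALUES (?, ?)", (i, PAYLOAD))
    with open(os.path.join(workdir, f"{i}.jpg"), "wb") as fh:
        fh.write(PAYLOAD)
db.commit()


def read_from_db(image_id):
    """Fetch the raw blob bytes for one row."""
    row = db.execute(
        "SELECT data FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    return row[0]


def read_from_file(image_id):
    """Read the same image from the filesystem."""
    with open(os.path.join(workdir, f"{image_id}.jpg"), "rb") as fh:
        return fh.read()


def time_reads(reader):
    """Return (elapsed_seconds, total_bytes) for reading every image once."""
    start = time.perf_counter()
    total = sum(len(reader(i)) for i in range(1, COUNT + 1))
    return time.perf_counter() - start, total


db_seconds, db_bytes = time_reads(read_from_db)
file_seconds, file_bytes = time_reads(read_from_file)
print(f"database: {db_seconds:.4f}s  files: {file_seconds:.4f}s")
```

Running something like this against the real database and real image sizes would show whether "a little slower" matters for a given site.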

One idea I'm toying with is to keep my image blobs in cloud storage (like Amazon S3), instead of a database. Really it's just switching one storage medium for another; I'd still need a script that grabs the blob and streams it out with the right MIME type. But then I'm not as likely to run into database bloat problems.
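
If the storage medium really is interchangeable, the delivery script only needs a small "give me the bytes and MIME type for this key" interface. Here is a minimal sketch of that idea in Python. The names `BlobStore`, `InMemoryStore` and `serve` are hypothetical, and the in-memory backend is only a stand-in: a database-backed or S3-backed class would implement the same `get()` using its own client library.

```python
from typing import Dict, Optional, Protocol, Tuple


class BlobStore(Protocol):
    """Anything that can return (data, mime_type) for a key."""

    def get(self, key: str) -> Optional[Tuple[bytes, str]]:
        ...


class InMemoryStore:
    """Stand-in backend; a database or cloud backend would expose the same get()."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[bytes, str]] = {}

    def put(self, key: str, data: bytes, mime: str) -> None:
        self._items[key] = (data, mime)

    def get(self, key: str) -> Optional[Tuple[bytes, str]]:
        return self._items.get(key)


def serve(store: BlobStore, key: str):
    """Return (status, headers, body) for an image request."""
    found = store.get(key)
    if found is None:
        return 404, {"Content-Type": "text/plain"}, b"Not Found"
    data, mime = found
    headers = {"Content-Type": mime, "Content-Length": str(len(data))}
    return 200, headers, data


store = InMemoryStore()
store.put("avatar-1", b"fake-png-bytes", "image/png")
status, headers, body = serve(store, "avatar-1")
```

With this shape, moving from database blobs to cloud storage means writing one new backend class; `serve` and the URLs stay the same.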


8:35 pm on Nov 9, 2009 (gmt 0)

10+ Year Member

Storing Images in the Filesystem Versus a Database [webmasterworld.com]

This has already been discussed. I would definitely recommend reading what cooper has to say, as I agree with him wholeheartedly.



8:41 pm on Nov 9, 2009 (gmt 0)

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member

>>On a site with over 10,000 little JPGs, that method gets cumbersome.

surely you could arrange the jpgs into folders in such a way that you would only need to download the 'new' jpgs, as the old ones would never change? once you've backed them up, why would you want to back them up every day?
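
A rough sketch of that idea in Python, assuming images are named by a numeric id and never change once written. The folder layout and the helper names `shard_path`, `save` and `sync_new` are made up for illustration; in practice a tool like rsync does the "copy only what's missing" part.

```python
import os
import shutil
import tempfile


def shard_path(image_id: int) -> str:
    """Map an id to a nested relative path, e.g. 12345 -> '00/01/23/12345.jpg'."""
    padded = f"{image_id:08d}"
    return f"{padded[0:2]}/{padded[2:4]}/{padded[4:6]}/{image_id}.jpg"


def sync_new(src_root: str, dst_root: str):
    """Copy files found under src_root that are not yet under dst_root."""
    copied = []
    for folder, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.exists(dst):  # old images never change, so skip them
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)
                copied.append(rel)
    return sorted(copied)


# Demo with temporary folders standing in for the live site and the backup.
src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()


def save(image_id: int, data: bytes) -> None:
    """Write one image into its sharded folder under src."""
    path = os.path.join(src, *shard_path(image_id).split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)


save(12345, b"a")
save(12346, b"b")
first = sync_new(src, dst)   # first backup copies both images
save(20001, b"c")
second = sync_new(src, dst)  # second backup copies only the new one
```

The second run only transfers the one new file, which is the point: the nightly job shrinks to whatever was uploaded since the last backup.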


7:37 am on Nov 10, 2009 (gmt 0)

5+ Year Member

Just out of interest, does anyone know how well the images cache when storing them as a blob in the database compared to the filesystem?


3:57 pm on Nov 10, 2009 (gmt 0)

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member

>> does anyone know how well the images cache

It depends on the browser, and the URL.
If you use URL rewriting (which you almost certainly will with this technique) a URL like this one will cache nicely:

12345 is obviously a db row id

but if you employ a querystring, like this:

then browser caching will be fruity (inconsistent from one browser or proxy to the next).

The caching problem with querystringed URLs can be soothed using an "Expires" header, but I find that the first method is better & easier.

Make sure you validate that the id is an integer to prevent people from injecting crap into your SQL SELECT query, and also throw a 404 status header if SQL can't find that row, or if there are any problems retrieving the image blob.
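
Here is a minimal sketch of those checks, written in Python with SQLite for brevity rather than PHP. The URL scheme `/images/<id>.jpg`, the `images` table and the 30-day cache lifetime are all assumptions for illustration. The handler only accepts an all-digit id, uses a parameterised query as a second line of defence, returns 404 when the row is missing, and sets caching headers on success.

```python
import re
import sqlite3
import time
from email.utils import formatdate

# Hypothetical rewritten URL form: /images/12345.jpg
ID_PATTERN = re.compile(r"/images/([0-9]{1,10})\.jpg")
MAX_AGE = 30 * 24 * 60 * 60  # 30 days in seconds; pick what suits the site


def handle(conn, path, now=None):
    """Return (status, headers, body) for an image URL."""
    match = ID_PATTERN.fullmatch(path)
    if match is None:
        # Anything that is not a plain integer id never reaches SQL.
        return 404, {"Content-Type": "text/plain"}, b"Not Found"
    image_id = int(match.group(1))
    row = conn.execute(
        "SELECT mime, data FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return 404, {"Content-Type": "text/plain"}, b"Not Found"
    mime, data = row
    now = time.time() if now is None else now
    headers = {
        "Content-Type": mime,
        "Content-Length": str(len(data)),
        "Cache-Control": f"public, max-age={MAX_AGE}",
        "Expires": formatdate(now + MAX_AGE, usegmt=True),
    }
    return 200, headers, data


# Demo data: one fake image with id 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, mime TEXT NOT NULL, data BLOB NOT NULL)"
)
conn.execute(
    "INSERT INTO images (id, mime, data) VALUES (1, 'image/jpeg', ?)",
    (b"fake-jpeg",),
)

ok = handle(conn, "/images/1.jpg")
missing = handle(conn, "/images/999.jpg")
injected = handle(conn, "/images/1 OR 1=1.jpg")
```

In the demo, the existing row comes back with status 200 and cache headers, while both the unknown id and the injection attempt get a 404 without any unvalidated input reaching the query.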

