Msg#: 3290160 posted 8:18 pm on Mar 22, 2007 (gmt 0)
I wonder how large websites like Flickr, Photobucket, etc., which have millions of photos, maintain their database structure. Do they have one single table containing the records of all photos, or do they break it up at a certain size?
Does anyone have any opinions to share, or could you point me to some good technical documentation on designing such complex database structures?
Msg#: 3290160 posted 9:52 pm on Mar 22, 2007 (gmt 0)
Large binary objects are better held in a filesystem IMHO, since that's what filesystems are optimised to handle in many ways. My experience of filesystem vs RDBMS BLOBs is that the former was 1000x faster for me, and simpler to implement.
Keep your directories small, however, so introduce a sensible hierarchy (i.e. a tree structure).
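To make the hierarchy idea concrete, here is a minimal sketch in Python. It assumes a hash of the file name picks two levels of subdirectories. The function name photo_path, the root /data/photos and the two-level layout are illustrative choices of mine, not something any of these sites is known to use.

```python
import hashlib
from pathlib import PurePosixPath


def photo_path(filename: str, root: str = "/data/photos") -> PurePosixPath:
    """Map a file name to a nested path so no single directory grows huge.

    Hypothetical layout: root/ab/cd/filename, where "abcd" are the first
    four hex digits of the SHA-256 of the file name.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Two hex characters per level gives at most 256 entries per directory
    # at each of the two upper levels.
    return PurePosixPath(root) / digest[:2] / digest[2:4] / filename
```

With two levels of 256 directories each, a million files average out to roughly 15 per leaf directory, and the path can always be recomputed from the file name alone.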
Msg#: 3290160 posted 4:23 am on Mar 23, 2007 (gmt 0)
Well, they might not even store their photos in the database. They could have records pointing to the location of the photos on the file server.
Yes, I am aware of that. They maintain the photos on the filesystem, and there are simply records in the MySQL tables to reference them. I am, however, asking about the MySQL table structure. For example, if there are millions of photos, are the records for those millions held in one single table, or are they partitioned [dev.mysql.com]?
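For anyone unfamiliar with the idea, hash partitioning boils down to taking the key modulo the number of partitions. MySQL's built-in partitioning does this routing inside the server, behind a single table name. The Python sketch below does the same thing by hand across several tables, purely to show the principle. The table prefix photos_p and the count of 8 partitions are made-up values.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def partition_for(photo_id: int, num_partitions: int = 8) -> str:
    """Return the (hypothetical) table name that should hold this photo id."""
    return f"photos_p{photo_id % num_partitions}"


def group_by_partition(
    photo_ids: Iterable[int], num_partitions: int = 8
) -> Dict[str, List[int]]:
    """Bucket ids by target table, e.g. to batch one query per partition."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for pid in photo_ids:
        groups[partition_for(pid, num_partitions)].append(pid)
    return dict(groups)
```

A lookup by id then touches only one of the smaller tables, which is the main attraction compared with one very large table.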
Msg#: 3290160 posted 9:29 am on Mar 23, 2007 (gmt 0)
Speaking only for my mere 20,000+ case: I keep all metadata in memory, and the main table which maps from name to all the other details of a photo/video/sound is flat. With a decent hash of the key there is no reason not to, not even memory these days.
(I generally do not keep photo/thumbnail bulk data in memory.)
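As a rough illustration of a flat in-memory table keyed by name, here is a small Python sketch. The MediaInfo fields and the helper names add and lookup are invented for the example and are not the actual setup described above. A plain dict is used as the hash table.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MediaInfo:
    """Metadata only; the bulk image/video bytes stay on disk."""
    path: str    # where the file lives on the filesystem
    kind: str    # "photo", "video" or "sound"
    width: int
    height: int


# One flat hash table: name -> metadata. No hierarchy, no partitioning.
catalog: Dict[str, MediaInfo] = {}


def add(name: str, info: MediaInfo) -> None:
    catalog[name] = info


def lookup(name: str) -> Optional[MediaInfo]:
    # Average-case constant-time lookup, whatever the size of the catalog.
    return catalog.get(name)
```

Each entry here is only a few small fields, so tens of thousands of them should fit comfortably in memory on current hardware.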
Msg#: 3290160 posted 1:13 pm on Mar 23, 2007 (gmt 0)
The question is not an easy one to answer. I've been working with databases for many years and almost every project I've ever worked on has a different setup.
Do they use partitioning? It's hard to say, although it would work. They could very well be using a FULLTEXT-indexed text field to hold the tags. They could even just relate a tags table back to the image table. Any of those would work with millions of rows without any problems.
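The "relate the tags table back to the image table" option can be sketched as follows. This is only an illustration: it uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, and the table and column names (photos, tags, photo_tags) are invented for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a many-to-many photo/tag schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE photos (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
        CREATE TABLE tags   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE photo_tags (
            photo_id INTEGER NOT NULL REFERENCES photos(id),
            tag_id   INTEGER NOT NULL REFERENCES tags(id),
            PRIMARY KEY (photo_id, tag_id)
        );
        -- Index for the common query "all photos with tag X".
        CREATE INDEX photo_tags_by_tag ON photo_tags (tag_id, photo_id);
    """)
    return conn


def tag_photo(conn: sqlite3.Connection, photo_id: int, tag_name: str) -> None:
    """Attach a tag to a photo, creating the tag row if it is new."""
    conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag_name,))
    tag_id = conn.execute(
        "SELECT id FROM tags WHERE name = ?", (tag_name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO photo_tags (photo_id, tag_id) VALUES (?, ?)",
        (photo_id, tag_id),
    )


def photos_with_tag(conn: sqlite3.Connection, tag_name: str) -> list:
    """Return the paths of all photos carrying the given tag, ordered by id."""
    rows = conn.execute(
        """
        SELECT p.path
        FROM photos p
        JOIN photo_tags pt ON pt.photo_id = p.id
        JOIN tags t        ON t.id = pt.tag_id
        WHERE t.name = ?
        ORDER BY p.id
        """,
        (tag_name,),
    )
    return [row[0] for row in rows]
```

The composite index on (tag_id, photo_id) is what keeps the tag lookup cheap as the join table grows. The full-text alternative trades this extra table for a single indexed text column on the photo row.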
It all comes down to what the database folks there are comfortable with, as long as the results come back as fast as the requirements dictate.