| Linux ext2 filesystem and large numbers of files: any issues? |
sugarkane

msg:913659 | 11:15 am on Oct 21, 2002 (gmt 0) | One solution to a project I'm working on could involve creating a large number of very small files, in the region of a million or so. Are there any real or practical limitations on the number of files in a Linux filesystem? Any other issues to bear in mind, e.g. storage space efficiency?
|
martin

msg:913660 | 6:48 pm on Oct 21, 2002 (gmt 0) | See man mke2fs: the -i option sets the bytes/inode ratio. I was looking for something similar in tune2fs, but I couldn't find anything.
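To make the -i ratio concrete, here is a rough sketch (not from the thread) of the arithmetic. mke2fs creates roughly one inode per bytes-per-inode bytes of space, and each file needs an inode, so the inode count caps the number of files. The ratio is fixed when the filesystem is created, which would explain why tune2fs has no equivalent. The helper names `estimate_inodes` and `inode_usage` are made up for illustration; `os.statvfs` is the standard Python call for reading inode totals on a mounted filesystem. The count mke2fs actually picks may differ somewhat because of rounding.

```python
import os


def estimate_inodes(fs_bytes: int, bytes_per_inode: int) -> int:
    """Approximate inode count for a filesystem made with `mke2fs -i bytes_per_inode`.

    Hypothetical helper: one inode per `bytes_per_inode` bytes of space,
    ignoring any rounding mke2fs applies.
    """
    if bytes_per_inode <= 0:
        raise ValueError("bytes_per_inode must be positive")
    return fs_bytes // bytes_per_inode


def inode_usage(path: str = "/"):
    """Return (total_inodes, free_inodes) for the filesystem holding `path`."""
    st = os.statvfs(path)  # Unix-only
    return st.f_files, st.f_ffree


if __name__ == "__main__":
    ten_gib = 10 * 1024**3
    for ratio in (4096, 16384):
        print(f"-i {ratio}: about {estimate_inodes(ten_gib, ratio)} inodes")
    total, free = inode_usage("/")
    print(f"/ has {total} inodes, {free} free")
```

For a 10 GiB partition, a ratio of 4096 gives about 2,621,440 inodes, against 655,360 at a ratio of 16384, so a million small files would only fit with the smaller ratio.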
|
amoore

msg:913661 | 7:38 pm on Oct 21, 2002 (gmt 0) | It seems like once you get a couple of thousand files in a directory, it takes forever to ls it. I think that it has to stat all the files, and that is what takes so long. I don't think it's even a linear increase, since it starts to get really slow pretty quickly. I can't remember whether it also slows down the time to stat a single file given its name, or to stat the directory, though.
I would suggest a couple of things. First, test it out by making a directory with 100,000 or so files in it and see if it works well enough. Also, think about making the directory structure hierarchical in some way. You will notice that machines with many users do this with the first two characters of the username for /home. Something like:
/lotsafiles/aa/aasgd
/lotsafiles/aa/aaasdf
/lotsafiles/ab/abasdf
/lotsafiles/bb/bbfoobar
may work well for you. Finally, along the lines of martin's suggestion, you may want to tune the filesystem parameters (block size, bytes/inode ratio) if the files are all really small, so that you don't waste disk space. A relational database is probably starting to sound a little more attractive now, I bet.
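The two-character prefix layout described above can be sketched in a few lines. This is only an illustration under assumptions: the function names `sharded_path` and `store` are invented here, and the prefix width of 2 simply mirrors the /lotsafiles/aa/... example.

```python
from pathlib import Path


def sharded_path(root: str, name: str, width: int = 2) -> Path:
    """Map a file name to root/<first `width` chars>/<name>.

    Hypothetical helper mirroring the /lotsafiles/aa/aasgd layout.
    Names shorter than `width` use the whole name as the prefix.
    """
    if not name:
        raise ValueError("name must not be empty")
    return Path(root) / name[:width] / name


def store(root: str, name: str, data: bytes) -> Path:
    """Write `data` under the sharded layout, creating the subdirectory if needed."""
    path = sharded_path(root, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path


if __name__ == "__main__":
    for n in ("aasgd", "aaasdf", "abasdf", "bbfoobar"):
        print(sharded_path("/lotsafiles", n))
```

Note that a prefix taken straight from the name only spreads files evenly if the names themselves are evenly distributed; if most names start with the same letters, a few directories will still get large.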
|
sugarkane

msg:913662 | 8:07 pm on Oct 21, 2002 (gmt 0) | > structure hierarchical in some way
Yep, I was planning to have directories 0-255, each with subdirs 0-255 (i.e. a 16-bit number range mapped to the filesystem). Many directories would in fact be empty or maybe only contain a few files; some might contain a few thousand.
> A relational database is probably starting to sound a little more attractive
Hehe, actually I'm using one now, and it's starting to drag its feet a little. I think I can maybe cut down the search times by going this way instead... but my grand plan is a little hazy at the moment, I have to admit.
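A minimal sketch of the 0-255 / 0-255 scheme, splitting a 16-bit number into its high and low bytes. How the 16-bit number is obtained is not stated in the thread; deriving it from a CRC32 of the key, as `path_for_key` does here, is purely an assumption for illustration, and both function names are hypothetical.

```python
import zlib
from pathlib import Path
from typing import Tuple


def bucket_dirs(n: int) -> Tuple[int, int]:
    """Split a 16-bit number into (top-level dir, subdir), each in 0-255."""
    if not 0 <= n <= 0xFFFF:
        raise ValueError("n must fit in 16 bits")
    return n >> 8, n & 0xFF


def path_for_key(root: str, key: str) -> Path:
    """Place `key` under root/<hi>/<lo>/<key>.

    Assumption: the 16-bit number is the low 16 bits of the key's CRC32.
    Any other source of a 16-bit number would work with bucket_dirs().
    """
    n = zlib.crc32(key.encode("utf-8")) & 0xFFFF
    hi, lo = bucket_dirs(n)
    return Path(root) / str(hi) / str(lo) / key


if __name__ == "__main__":
    print(bucket_dirs(258))
    print(path_for_key("/lotsafiles", "example-key"))
```

With 65,536 leaf directories, a million files spread evenly would average about 15 per directory. As noted in the post, the real distribution may be far less even if the number is not hash-derived.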
|