a) The actual size of a file on disk is determined not only by the contents of that file, but also by the block size of the file system it is stored on. If your own box uses a block size of 1 KB and the server one of 2 KB, then the server wastes about 1 KB per file on average (half a block), twice as much as your own box. This has nothing to do with the OS used; it can be different on each machine with any OS. In general, large disks use larger block sizes than small ones.
b) Stuff you did on the server (even the uploading) may have left garbage in some temp directory, which isn't automatically cleared but still counts towards your quota.
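If you want to see the effect of (a) on a machine you have shell access to, you can compare the summed file lengths with what the file system has actually allocated. A rough sketch in Python, assuming a Unix-like system where os.lstat() reports st_blocks in 512-byte units (this field is not available on Windows). The function name disk_usage is made up for this example, and hard-linked files will be counted more than once:

```python
import os

def disk_usage(root):
    """Return (apparent_bytes, allocated_bytes, n_entries) for everything below root.

    Directories are counted as well, since they also occupy blocks.
    Assumes a Unix-like OS where st_blocks is given in 512-byte units.
    """
    apparent = allocated = count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            st = os.lstat(os.path.join(dirpath, name))
            apparent += st.st_size           # length of the contents
            allocated += st.st_blocks * 512  # space actually reserved on disk
            count += 1
    return apparent, allocated, count
```

Calling disk_usage() on your local copy of the site and dividing (allocated - apparent) by the number of entries gives the average slack per file on that particular file system.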
So I am thinking I have used about 57 MB? They say I have used 70.
The site contains just over 3500 files. Do you think the block size would add up to the remainder?
That would mean a loss of 3.7 KB per file. I just checked and found that my own /home partition of less than 10 GB uses a block size of 4 KB. If the partition on your server is significantly larger (e.g. 80+ GB), then a block size of 8 KB would be possible, which would explain the difference (the average loss is half the block size per file).
Of course, there's still the possibility that your log files are also counted, so you might want to ask them about that.
This has nothing to do with the operating system. The block size is determined by the file system, and should be variable with all up-to-date implementations, even under Windows. Good old FAT16 is a special case: its limit on the number of clusters forces the cluster size up with the partition size, to 32 KB on a 2 GB partition, which is really not very economical for lots of small files. With a decent system setup, the user can specify the desired block size during installation, although some dumbed-down formatting routines may try to make a guess without telling you about it.
The optimal block size depends on both the size of the partition and the typical (not average!) size of the files that will be stored there. If you make the blocks small, then you may get too many of them, so that the block management structures (inodes on Unix) take more space on the disk than you're gaining by reducing the slack per file. If they're too big, then each file will waste more space than necessary. There's no "one size fits all" here.
3500 items? Are they all in one folder? The folder itself takes up space as well.
While this is true, you shouldn't forget that directories/folders are also just files on a technical level, subject to the same block size boundaries. Storing large numbers of files into one directory may create other performance problems, but for our type of counting here, a directory is just another file that should be included with that number of 3500.
I also have MySQL, so the DB for that is another 5 MB.
This reduces the difference to 8 MB, and the average loss per file to 2.3 KB (ignoring the directories). Therefore, the block size of your server is most likely 4 KB, which sounds very reasonable. You'll just have to live with the fact that your files eat more space there than on your local disk.
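For anyone who wants to redo the arithmetic from this thread with their own numbers, here is a small sketch in Python. It takes 1 MB as 1000 KB (which is what gives the 3.7 and 2.3 figures above) and relies on the rule of thumb that the average loss is half a block per file. The function names and the list of candidate block sizes are only assumptions for the illustration:

```python
def avg_slack_kb(reported_mb, content_mb, n_files):
    """Average unexplained overhead per file in KB (1 MB taken as 1000 KB)."""
    return (reported_mb - content_mb) * 1000 / n_files

def likely_block_kb(slack_kb, candidates=(1, 2, 4, 8, 16)):
    """Candidate block size whose expected slack (half a block) is closest."""
    return min(candidates, key=lambda block: abs(block / 2 - slack_kb))

# 70 MB reported by the host, 57 MB of files, 3500 files
first = avg_slack_kb(70, 57, 3500)
# same again, but with the 5 MB MySQL database counted as content
second = avg_slack_kb(70, 57 + 5, 3500)

print(round(first, 1), likely_block_kb(first))    # 3.7 8
print(round(second, 1), likely_block_kb(second))  # 2.3 4
```

This is only an estimate: it ignores directories, log files and anything left in temp directories, all of which may also count towards the quota.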
If you have 3500 files that each consist of one character, you would expect them to use 3500 bytes of the hard disk. This is not true:
Each file is allocated at least one block. Windows in some circumstances may allocate a minimum of 16 KB per file. This means that the above files will use 3500 x 16 KB = 56000 KB, or about 55 MB (sounds bad!).
You may find that Linux is less economical in some circumstances and allocates a minimum of 32 KB per file, thus using double the amount above: about 110 MB.
On top of that, each file has a filename that has to be stored somewhere. If the average filename length is 10 bytes, that is another 34 KB of data. Then the file system has to store the times the file was created and updated (a minimum of 4 bytes each, more likely 8 bytes), the file length, and an index map of the various parts of the file. Files are not always stored in one long block; you find a bit here and a bit there, and the index points the file system in the right direction, but it takes up space too.
There is a lot to consider.
If the minimum block size is 4 KB, a one-byte file will take space in the directory index plus 4 KB. If the file were exactly 4 KB, it would take exactly the same amount of space. But if it were 4 KB + 1 byte, it would need an extra block, and it would take up space in the directory index plus 8 KB (2 x 4 KB).
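The rounding described here is easy to put into a few lines of Python. This is just a sketch of the principle: the function name is made up, the space used in the directory index is ignored, and an empty file is assumed to need no data block at all (real file systems differ in such details):

```python
def allocated_bytes(size, block=4096):
    """Space taken by `size` bytes of content when stored in whole blocks."""
    blocks = -(-size // block)  # ceiling division without floats
    return blocks * block

print(allocated_bytes(1))      # 4096: one byte still costs a full block
print(allocated_bytes(4096))   # 4096: fits exactly
print(allocated_bytes(4097))   # 8192: one byte too many, second block needed

# 3500 one-byte files on a file system with 16 KB blocks, in KB
total_kb = 3500 * allocated_bytes(1, block=16 * 1024) // 1024
print(total_kb)                # 56000
```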
When I mentioned 3500+ files, most of you would probably have realised that most of my site is DB driven. What I did was to compress the file size of my template HTML files. Generally I removed as much "white space" as I could, all <!--notes--> and unnecessary <tags>. Managed to reduce my template by about 10%. 3500 files all 10% smaller certainly adds up. I'm now 8 MB below allocation.
:)
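For anyone wanting to try the same trick, the comment and white space part can be roughly sketched in Python as below. This is not the script used above, just an illustration with a made-up function name. It will damage <pre> blocks, inline scripts that rely on line breaks and conditional comments, so test it on a copy of the template first:

```python
import re

def shrink_template(html):
    """Strip HTML comments and collapse runs of white space to one space."""
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)  # drop <!--notes-->
    html = re.sub(r"\s+", " ", html)                         # collapse white space
    return html.strip()

sample = "<p>\n    Hello   <!-- note to self -->\n    world\n</p>\n"
print(shrink_template(sample))  # <p> Hello world </p>
```

Keep in mind that a file only takes less room on disk if it shrinks across a block boundary, so the saving in the quota can be smaller (or larger) than the saving in bytes.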