
Linux, Unix, and *nix like Operating Systems Forum

    
need help with tar to backup entire folders
put the turkey leftovers down and help me out :-)
walkman



 
Msg#: 3512016 posted 6:25 pm on Nov 23, 2007 (gmt 0)

does this command seem right?
tar cvf - directory | gzip > directory.gz

The reason I am asking is that lately I cannot unzip the directory.gz on Windows no matter what program I use (I tried 6 of them). At most I get one much larger file instead of it extracting into the /folder/sub/file/html structure it always did. I have a series of scripts and use webmin to back up nightly, just in case, but I might have been fooled. I did not try this on my server... usually when I overwrite a config file, for example, I unzip it here and upload it back. For server failure I have a second HD that should back up nightly.

Does anyone have a better solution for backing up entire directories? Uploading them as they are is too time consuming.
My full command on the script I am using is:
cd $log_dir/$current_day
tar cvf - $web_dir/* | gzip > $log_dir/$current_day/so-www-hour-$datestamp.zip

where the variables are defined at the top of the script. This way there is no overwriting, since the dates and folders are different. The folders and dates work fine; the content is the issue. I can unzip .gz files from other sites (i.e. downloaded programs) on my PC too.
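For reference, the variables at the top of the script are set along these lines (the paths and date formats below are just placeholders, not my real ones):

web_dir=/home/site1/www # placeholder: tree to back up
log_dir=/backups # placeholder: where the archives go
current_day=`date +%Y-%m-%d` # one folder per day
datestamp=`date +%H%M` # so hourly runs don't overwrite each other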

Thanks in advance for any suggestions,

 

mcavic

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3512016 posted 8:27 pm on Nov 23, 2007 (gmt 0)

>> does this command seem right?

tar cvf - directory | gzip -c > directory.gz

To test the generated file, in Linux, try:
gunzip -c directory.gz | tar tvf -

And that should give you a list of files in the archive. This is my preferred method in Linux, though I don't have all that much experience with it in Windows. You might need to call it directory.tar.gz instead of directory.gz. That way, after it's decompressed in Windows, it'll be a tar file, which still needs to be extracted.
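If your tar is GNU tar (it is on most Linux distributions), the same check can be done in one step, letting tar run the decompression itself:

tar tzvf directory.tar.gz

Same idea, just less typing.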

DamonHD

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3512016 posted 10:37 pm on Nov 23, 2007 (gmt 0)

You want tar cf - directory | gzip -9 > directory.tar.gz

1) Don't use the v option on tar, else the verbose output may get mixed up with the tar output and corrupt the archive.

2) Use the -9 option on gzip if you want decent compression, which is probably good for backups.

Rgds

Damon

walkman



 
Msg#: 3512016 posted 11:16 pm on Nov 23, 2007 (gmt 0)

Thanks! Problem solved. Not sure why it stopped working, but...

One more problem: I also need to exclude some folders for a more comprehensive backup. Basically I need to back up /home minus a few (huge and redundant) folders. Sort of:
tar -cf - --exclude "/home/cronolog" --exclude "/home/site2/backup" --exclude "/home/site3/backup" /home | gzip -c > $dir/$day/all.tar.gz

I had it as tar -cf $dir/$day/all.tar.gz
--exclude "/home/cronolog"--exclude "/home/site2/backup" --exclude "/home/site3/backup" /home
and it worked, but there was no gzip, which makes the archives too large. Also, in nano the line gets too long, and that caused problems (too many exclusions add up).

how do I do this?
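For what it's worth, here is the same command split across lines with trailing backslashes, which should keep each line short enough for nano; the pipe into gzip is the part I'm guessing at:

tar -cf - \
 --exclude "/home/cronolog" \
 --exclude "/home/site2/backup" \
 --exclude "/home/site3/backup" \
 /home | gzip -c > $dir/$day/all.tar.gz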

Thanks again,

DamonHD

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3512016 posted 8:48 pm on Nov 24, 2007 (gmt 0)

I would be more inclined to use 'find' to collect the files, filter things out with its logic or with grep, and then pass the file list to tar.

One way of doing this might be:

tar cf - `find . -type f | egrep -v dir/name1` | gzip -9 > directory.tar.gz

Rgds

Damon

walkman



 
Msg#: 3512016 posted 10:31 pm on Nov 24, 2007 (gmt 0)

>> tar cf - `find . -type f | egrep -v dir/name1` | gzip -9 > directory.tar.gz
Thank you! That is good as well.
Two questions: is the "-type f" there to exclude something? If so, how do I add more directories to exclude? Say I want to exclude dir1, dir2, and dir3.

Is this much more server intensive?

Thanks a lot for your help,

DamonHD

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3512016 posted 9:58 am on Nov 25, 2007 (gmt 0)

The -type f includes only plain files, since you don't usually need the directory entries in your tar file, and it stops tar recursing into those directories and producing duplicate copies.

You can do more filtering in various ways, but the simplest is probably to replace the single egrep with a chain of them, eg:

tar cf - `find . -type f | egrep -v dir/name1 | egrep -v dir/name2 ...` ...

But if it's getting that complicated I'd break this into two steps: selecting the files and then tar-ing them up. Your tar probably has an option to archive a set of files listed in another file...
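With GNU tar that option is -T (--files-from), so the two-step version would look something like this (/tmp/filelist is just an example name, and watch out for filenames containing spaces):

find . -type f | egrep -v dir/name1 > /tmp/filelist
tar cf - -T /tmp/filelist | gzip -9 > directory.tar.gz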

Unless you actually see a performance problem, don't worry about it. If you do run into difficulties you could combine the multiple egrep stages into one with a more complex regex filter expression, eg:

egrep -v '(dir/name1)|(dir/name2)'
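i.e. the whole pipeline collapses to something like:

tar cf - `find . -type f | egrep -v '(dir/name1)|(dir/name2)'` | gzip -9 > directory.tar.gz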

Rgds

Damon

[edited by: DamonHD at 10:00 am (utc) on Nov. 25, 2007]

phranque

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 3512016 posted 12:49 pm on Nov 25, 2007 (gmt 0)

the -type f will also exclude symbolic links, which can be useful for tar.

if the multi pipe/egrep thing doesn't do it for you, you can try the -prune option for the find command, but i'm not familiar enough with the syntax to "fix" your command to use it.
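a rough sketch from the man page would be something along these lines (untested, so double-check it before trusting a backup to it):

find /home \( -path /home/cronolog -o -path /home/site2/backup -o -path /home/site3/backup \) -prune -o -type f -print

the -prune stops find descending into the matched directories, and the -print after the -o is what actually lists everything else.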
