Forum Moderators: phranque

Beyond backup: preserving your web data

What do you do? How do you do it?

6:09 am on Oct 11, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


Hi webmasters. Some still use tape, some download their data from time to time, some create local backups (on their own server), and some do nothing. Most people probably rely on their hosting company for backups, but that's too risky, not to mention slow in an emergency (and consider the terrible truth that not all companies perform frequent backups you can actually use, or they will only hand data over for one or two files, etc.).

My approach: for years I've managed one, two and even three main servers, so I wrote some scripts. On given days of the month, a script backs up the whole set of websites on a given server, then copies the backup to another server, so I have data redundancy. In an emergency I expand the files and I'm done. Server crashed? Expand the files, update DNS, done. I did this manually at first, then fully automated it, four times per month. It has saved me from terrible disasters.
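
A minimal sketch of that kind of routine, assuming a stock Linux box with key-based SSH between the two servers (paths, hosts and the schedule are placeholders):

    #!/bin/sh
    # Bundle every site, keep a local copy, push a second copy to the other server.
    set -e
    STAMP=$(date +%Y%m%d)
    tar -czf /backup/sites-$STAMP.tar.gz /var/www
    scp /backup/sites-$STAMP.tar.gz backup@other-server:/backup/

    # example cron entry, four runs per month:
    # 0 3 1,8,15,22 * * /usr/local/bin/site-backup.sh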

Then I tried something new... email. Yes, you read that right: every week the server would create a special backup, fragment it, and send it to my email. Why? You can get lots of mailbox space for free, something you can't get with FTP.

I'm tempted to do the same by uploading the data to Google Drive, Sky, Dropbox, etc. I've been researching, but there is no easy solution. 4shared offers FTP for paid accounts (the easiest option I can think of), but it would be even easier to buy some cheap hosting space with FTP (back to basics). The thing is, where's the fun in paying when we could do it for free? So, back to the cloud. The problem is that most services provide an API that they change from time to time, breaking your code. I've seen it; it's ugly.

So, no definitive solution, just options. What's your take?

Just in case you're wondering: yes, some websites are heavy, but I'm old school and I work hard to keep my websites optimized and fast, not as heavy as many today. This... has saved me from lots of disasters, because small is better: faster and easier to move around.
10:38 pm on Oct 27, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10090
votes: 548


I have a couple of projects on one account at RackMount. I have it set to do a complete backup twice a week.

I have another project at Polytechnic University, but the data it produces gets written to the above-mentioned RackMount account, so that's taken care of, and I have all the code saved on my local machine.

On my local machine, I manually copy my 3 sites' files once a week to an external SSD. I don't bother to back up these sites on the server since all 3 are on shared hosting and what they offer is pointless IMO.
2:19 am on Oct 29, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


Nice. Manual copy? That makes me nervous. You can forget for a few days, or be sick (or traveling, or whatever) and miss some copies; then disaster strikes and you lose a few days of valuable data.
3:13 am on Oct 29, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10090
votes: 548


RE: manual copy?

Kinda depends on the type of site you have. I don't have forums or other dynamic environments where files are created from user input or driven by a database. Almost all pages are static, with bits of dynamic programming here & there. I know what I've changed from day to day, week to week.

I do forget to do the weekly backups occasionally, but so far it hasn't been a reason for concern.
6:06 am on Oct 29, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11071
votes: 106


I'm tempted to do the same by uploading the data to Google Drive, Sky, Dropbox, etc. I've been researching, but there is no easy solution.

email backup files to a gmail account?

there are some ideas for automating a solution, or at least part of the process, in this thread:
Automatic backups on Linux/Apache [webmasterworld.com]
7:38 am on Oct 29, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7876
votes: 546


All my backups are based on the source machine (where the content is created). For various reasons (experience) I don't depend on SSD, though an SSD does duplicate the spinning rust, which is itself duplicated (rotation, swapping drives every week) to limit any loss to no MORE than 7 days. It's automated, and requires one other step:

My HD backup is 4 units, rotated weekly, and the most recent is RELOCATED from the server/dev machine to an off-premises location to ensure that if the building burned down the backups would not be lost.

It is tempting to think of a cloud solution for the same purpose (off premises), but I'm really old school in that I don't trust third parties to treasure my data the way I do, or not to hold it hostage for more moolah at some future time.

My sites are generally static and backed up incrementally (only what has changed recently gets backed up), so it's relatively quick and painless. I suspect most of us have gone through these questions many times, and those of us who actually do back up data have our own methods.

What should be taken away from the OP, and from backups in general, is: DO WHATEVER IT TAKES. It is that important!
11:39 am on Oct 29, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1567
votes: 213


I call my VPS provider's API on a weekly basis to take a live, full-server snapshot. Then, depending on how often the data on the server changes, I run a backup script to ZIP up the databases, site files and config files, and transfer them to a nearby cloud region, where they are automatically removed after X days. I get an e-mail if anything goes wrong in the process. I might replicate the cloud contents to a local server at some point; there are scenarios where you just have to pull a single file from a backup, and having to download a full backup for that is a little impractical.
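
For what it's worth, the nightly part of a routine like that could look something like this sketch (bucket name and paths are assumptions; any S3-compatible CLI would do, the provider's snapshot API call is separate, and the removal after X days is a lifecycle rule on the bucket, not part of the script):

    #!/bin/bash
    set -eo pipefail
    # the "e-mail if anything goes wrong" part:
    trap 'echo "backup failed on $(hostname)" | mail -s "backup FAILED" me@example.com' ERR
    STAMP=$(date +%Y%m%d)
    mysqldump --all-databases --single-transaction | gzip > /tmp/db-$STAMP.sql.gz
    zip -qr /tmp/files-$STAMP.zip /var/www /etc/nginx
    aws s3 cp /tmp/db-$STAMP.sql.gz s3://my-backup-bucket/
    aws s3 cp /tmp/files-$STAMP.zip s3://my-backup-bucket/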
7:44 pm on Oct 29, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


@keyplyr, I get it: static pages have the benefit of not really changing for quite some time unless you change them. I used to do manual backups, but sometimes I forgot, and sometimes I was away (traveling), and ugly things can happen.

@phranque, I used to do things like that (as in the referenced thread), then changed my mind. Email to Gmail? There are issues, but it's doable. Google Drive? Doable, but still issues. I'll post about this below.

@tangor, good for you, having physical access to the infrastructure. My sites are hosted far away, and while I have my backup routine, like any old-school webmaster I also need to have the data locally: on a hard drive, USB memory, whatever, but it has to be redundant. As for cloud solutions... I don't really trust them; things fail, and redundancy is key. There are no "free services" one can really trust for reliability or privacy, and the paid ones can be expensive. Some options cost around US$150 per year for quite decent space, which sounds cheap, but given the use I would get out of it... it's not cheap. Sure, I value my data at more than that, but the point is to be effective, and that includes being cost-effective.

@robzilla, that sounds really nice. Just an anecdote: I had my sites at one big hosting company where things went wrong, including the backups, and that marked me forever. Reading news of weather disasters in recent years made me think I wasn't so crazy. But that's just me; I went through some difficult episodes in the past.

Perhaps later I'll buy some space from a provider. The one I like most offers generous space and remote uploads, but I'm still not convinced. I also found a free service offering what I would call almost the perfect solution: you create an account, register your accounts at other file-hosting services, and then you can upload anything via a URL; once it finishes uploading to their server, it is pushed to every service you registered. The problem is... that's one of those services people use for mirroring content, and I never got an answer on privacy (what happens to the files you upload to their servers), so end of story (there are lots of details there).
7:53 pm on Oct 29, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


As for what I mentioned above, here is what I was going to post:

Google Drive, Dropbox, etc. are attractive solutions, and there are more like 4shared. They are free and have a real commitment to their users (especially the first two); the problem is uploading. There are no remote-upload options. You can create your own app using their API, but after days of reading and trying (coding) I gave up on that option. They change their API and there you have it: your backup solution stops working. Besides, their APIs don't work as solidly as you might expect, some options are unofficial, and some require uploading many megabytes of SDK to your server just to work. This is the short version of what I found; there are plenty of well-documented complaints on forums (official and unofficial). It just sucks. It's no surprise, because they designed their services with a human user at the end, not remote or bot use. I understand it's not an accident.

Transferring to another server has been my favorite solution; it's fast, really fast, and easy. But you have to pay for several servers at different hosting companies. Oh yes, on several occasions I had two servers at the same company, only to find that some "issues" are not as isolated as they want you to believe. Don't keep all your eggs in the same basket!

Back to email. Many email services offer great solutions and plenty of space, especially Microsoft and Google; the problem is the size of what you send. I've found hosting servers with a limit of 24 MB: you can't send anything bigger than that, and even then the receiving email service has its own limits. So I split the backup into several files, which I can then save to Dropbox or Google Drive. It's free, FAST, quite reliable, and it can happen while I'm sleeping. I rule out any option that requires my intervention; it's fantastic to be far away in a forest, on the road or in another country and receive an email saying "your backup is complete now". If I could manage to send files bigger than 24 MB, it would be perfect!
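
A sketch of that splitting step, assuming a mailx that supports attachments via -a (mutt works the same way); 20 MB pieces stay safely under the 24 MB limit mentioned above:

    #!/bin/bash
    # break the backup into mail-sized pieces and send each one as an attachment
    split -b 20m /backup/sites.tar.gz /tmp/sites-part-
    for part in /tmp/sites-part-*; do
        echo "backup piece ${part##*/}" | mailx -s "backup $(date +%F) ${part##*/}" -a "$part" me@example.com
    done
    # reassemble on the other end with: cat sites-part-* > sites.tar.gz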
7:59 pm on Oct 29, 2017 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Nov 25, 2003
posts:1080
votes: 248


Note: I run a straight info site; however, I do a lot of visitor fingerprinting -> personalisation -> contextual serving that is modified in real time.
Note: my business has become a rather larger operation than initially envisioned...
Note: no full backup is ever complete unless confirmed valid via a test restore comparison.
Note: I am ever mindful of Murphy.

1. Development versions are backed up:
* continuously byte-level deduplicated and backed up on development server.
* and to a dedicated backup server in my office.
* fully, every 24 hrs, to removable HDs in Faraday cases held in a fire-rated portable safe inside a rated safe in my office.
---HDs rotated as required.

2. A copy of the current production version of each site is maintained:
* on a dedicated backup server in my office.
* on removable HDs in Faraday cases in a fire-rated portable safe inside a rated safe in my office.
* on removable HDs in Faraday cases in a safety deposit box.

3. Live sites are:
* served via a CDN that, in toto, serves as a practical geo-failover.
* continuously byte-level deduplicated and backed up variously.

So far, so good; knocks head against desk in supplication.
Disclaimer: iamlost not paranoid. Really.
10:18 pm on Nov 1, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14624
votes: 392


Download entire site via FTP now and then. Download DB on a regular schedule plus have a backup emailed daily.
12:26 pm on Nov 2, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1567
votes: 213


Hope you're all using encrypted e-mail for those backups...
6:25 pm on Nov 2, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10090
votes: 548


Ditto on the encrypted email comment. Email has to be the most insecure platform out there.

Gmail has been secure for several years now, but personally, I still wouldn't trust it with anything highly sensitive.
10:08 pm on Nov 2, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1567
votes: 213


Unfortunately, both sender and receiver need to support TLS for the e-mail messages to securely travel across the wire.

Lots of internet traffic is still insecure. HTTP, FTP, POP3, IMAP, rsync, etc. The risk may not be great, but the potential is, and since the Web is essentially my livelihood, I encrypt pretty much all my data nowadays.
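
For anyone following that advice, encrypting the archive before it leaves the box is a one-liner with GnuPG; a sketch (for unattended runs the passphrase comes from a root-only file, and exact flags vary a little between GnuPG versions):

    # encrypt before mailing/uploading; writes sites.tar.gz.gpg
    gpg --batch --symmetric --cipher-algo AES256 \
        --passphrase-file /root/.backup-pass sites.tar.gz
    # decrypt on the restore side
    gpg --batch --passphrase-file /root/.backup-pass --decrypt sites.tar.gz.gpg > sites.tar.gz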
10:24 pm on Nov 2, 2017 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10090
votes: 548


My GF has been attempting to encrypt me for several months, but I remain a fan of open source.
5:15 pm on Nov 3, 2017 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:24715
votes: 608


I guess it depends on the type of site. A large dynamic site will have hugely different requirements to a forum, or a blog, or an ecommerce site.

I dare not rely on the ISP because, unless you actually pay for them to run backups, you cannot rely on their systems. Even so, I think I'd rather have my own copy backed up.

I rely on automated tools to make multiple copies off the server.
Those backups, site and database, etc., are then automatically duplicated on multiple hard drives, on and off-site.
Once a year I archive certain aspects of the sites to DVD, which is a manual process. It is a little tiresome, but it's only once per year.
The largest single files are usually log files, and they present their own storage challenge.
WordPress sites are automatically backed up and downloaded with the same frequency.
6:04 am on Nov 4, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7876
votes: 546


Backup, by most common sense, is a copy of what exists. The best form of that is a true copy maintained OFF SITE, whether on a local system or media (HD, SSD, tape), and DUPED to a secondary exterior location as well, in case of fire, flood or Acts of God (a legal term, don't imply anything else).

For MOST webmasters, the DEV (local) machine and the HOST (where the site is located) are backups of each other. For those who CREATE their sites on the HOST only, you must FTP (copy) the site to a LOCAL machine! And then make a copy of LOCAL and place that off premises.

What makes backup work IN ALL CASES is FREQUENCY. Fail that, and your most recent backup is WHERE YOU START AGAIN.

Most of my sites are on a 7-day cycle, as they don't see that many updates. One site I manage is on 24 hours, as it might have between 6 and 25 updates per day. Other sites, like major news outlets, are maintained on constant mirrors in different locations. That's the best, and most expensive, way of getting things done.
6:41 am on Nov 6, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


Nice, valid and valuable points. To me, a backup should also be like a magic wand, a magic command: that one little thing you do and the music begins. If you write some fresh content, you shouldn't have to wait a week for the usual weekly backup to run; it must be something you can call at will when you need it.

The first enemy of backups is not doing them often.

Another enemy, I don't know if #2 or perhaps #0, the mother of all problems: not having optimized sites, carrying gazillions of bytes of trash you don't use. Backups should be like traveling: fast, light when possible, easy to move. I remember a "webmaster" in charge of X site delivering the files on DVD to the sysadmin for upload. That site was so big; sure enough, it died very fast. Gone for good.
7:20 am on Nov 6, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7876
votes: 546


The key to everything is "incremental". Back up WHAT HAS CHANGED and leave the "static" core alone. This keeps backups down to SECONDS and MINUTES, not HOURS, and prevents extraordinary expense in ever-larger storage media costs.
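
In practice that can be a single rsync invocation: it compares the trees and only moves what changed (host and paths are placeholders):

    # incremental by nature: only changed files cross the wire
    rsync -az --delete /var/www/ backup@offsite.example.com:/backup/www/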
12:37 am on Nov 9, 2017 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 10, 2004
posts: 431
votes: 24


I use a G Suite Google Drive account with unlimited storage. It costs me $10/month.

My Linux servers each have a separate hard drive for daily incremental backups, weekly full backups and monthly full backups. Each morning rclone syncs them to Google Drive, encrypting them in the process. I'm at about 1.2 terabytes of storage there; can't beat the price.
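
The daily sync can be a one-liner, assuming an rclone "crypt" remote has been configured on top of the Google Drive remote so files are encrypted client-side before upload (remote names are placeholders):

    # /backup holds the incremental/weekly/monthly sets
    rclone sync /backup gdrive-crypt:server-backups --transfers 4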

In addition to the above, rsnapshot runs on each server, with 6 updates per day of new files, 10 days of dailies, 12 weeks of weeklies and 6 months of monthlies. Since rsnapshot only stores changes, it takes very little space to keep all these copies, even though each one looks like a complete mirror of the "live" file system.
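
That retention pattern maps onto rsnapshot's retain levels plus a few cron lines; a sketch (interval names must match the retain lines in rsnapshot.conf):

    # /etc/rsnapshot.conf (fields are tab-separated)
    # retain  hourly   6
    # retain  daily    10
    # retain  weekly   12
    # retain  monthly  6

    # /etc/cron.d/rsnapshot
    0 */4 * * *   root   /usr/bin/rsnapshot hourly
    30 3 * * *    root   /usr/bin/rsnapshot daily
    0 4 * * 1     root   /usr/bin/rsnapshot weekly
    30 4 1 * *    root   /usr/bin/rsnapshot monthly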

I also have Google Drive sync to a local server here, so there's a backup of my Google Drive backup.
7:16 am on Nov 10, 2017 (gmt 0)

Senior Member from ES 

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2005
posts:656
votes: 0


Daily backups of all my sites with the BackWPup plugin to a Dropbox account.
Local files (invoices, taxes, lists...) are copied every Friday to a pendrive that I always keep on my keychain.
10:39 pm on Nov 10, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1567
votes: 213


Literally a keychain? What if you lose your keys on the street? Can whoever finds them see your taxes?
11:39 pm on Nov 10, 2017 (gmt 0)

Senior Member from KZ 

WebmasterWorld Senior Member lammert is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 10, 2005
posts: 2936
votes: 24


I have never focused on making backups, always on restoring data. For that reason my backup schedule always creates full backups. Restoring data from an incremental set is a challenging job if you are under stress when your systems are down. One corrupted or missing backup file in the chain may make it impossible to restore at all.

But chances are small that I will need to restore my main web servers in the coming years. For a few years now, my main servers have run in a cluster configuration with five servers in four data centers in three countries. All data is synchronized in real time over encrypted channels. This may seem an expensive setup, but I now pay less for these five servers than I did for one "reliable" server in the past. When selecting hardware, the only things that count are network reliability and CPU speed. RAID and 24/7 tech support for reliability are not necessary. If a server fails or becomes unstable, I simply remove it from the cluster and terminate the lease contract. New servers can be commissioned and fully functional in the cluster within a few hours. All replication, fail-over and monitoring tasks are automated. The system sends me emails when cluster nodes fail. The remaining nodes automatically disconnect the failed node and continue working.

Three of the five nodes are set up to create backups of the database each night. For that task they may temporarily disconnect from the cluster to be sure their data set is consistent, create a compressed and encrypted backup, and send it to two cloud locations in two different countries. Backups of the three nodes are scheduled at different times to ensure that at any moment at least four synchronized nodes remain in the cluster backbone. Backups are named per weekday and overwritten every seven days, and every Sunday a complete backup is created for permanent storage. This produces a significant amount of backup data (the largest table currently contains 230 million records and grows by more than 100,000 records per day), but with current cloud storage costs it is doable.

The setup mentioned above runs on MariaDB and Galera Cluster with IP-based failover at the DNS provider level. To make it work with my sites I had to move all user session information and flat files like images into the SQL database; otherwise that information would not be available when a node fails and traffic is automatically rerouted to another node. Thanks to all the automation, maintenance is now just a few hours per month, mainly to update the server software with the latest patches.
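
The "temporarily disconnect" step has a standard knob in Galera; a sketch of one node's nightly job under a setup like that (paths and the passphrase file are placeholders, and gpg flags vary by version):

    #!/bin/bash
    set -eo pipefail
    # let this node lag behind the cluster while the dump runs, without leaving it
    mysql -e "SET GLOBAL wsrep_desync=ON;"
    mysqldump --all-databases --single-transaction --routines --triggers \
      | gzip \
      | gpg --batch --symmetric --passphrase-file /root/.backup-pass \
      > /backup/db-$(date +%a).sql.gz.gpg   # named per weekday, overwritten weekly
    mysql -e "SET GLOBAL wsrep_desync=OFF;"
    # then ship /backup/db-*.gpg to the two cloud locations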
2:47 am on Nov 11, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


Some really clever and complete routines here!

@Lexur, I don't know how BackWPup works; I've never used it. Does it run manually? Only while you are connected? While researching options for uploading directly to Dropbox, I found some nice tools, but it turned out they worked through my computer: there was no direct interaction between the origin server and the destination (Google Drive, Dropbox, etc.), and that includes the Chrome extensions I found for the same job. As soon as I closed the browser or turned my computer off, the process stopped without warning.

Pendrive? I suggest encryption. I used to keep a copy on a portable drive (not the whole server, just some scripts I was working on), and well... the pendrive part wasn't as robust as the rest of the process: it broke. On a previous occasion I simply lost one, so two incidents were enough to drop the practice. Those things are mostly decorative in this context; some security doors can ruin them, and walking through an airport scanner ruined one of mine too.
2:49 am on Nov 11, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


Lammert: Restoring data from an incremental set is a challenging job if you are under stress when your systems are down. One corrupted or missing backup file in the chain may make it impossible to restore at all.

True. I make myself use only full backups; space is not really a problem, and yes, emergencies can be a killer when you are restoring data.
9:26 am on Nov 12, 2017 (gmt 0)

Senior Member from ES 

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2005
posts:656
votes: 0


@robzilla, yes, I use a pendrive on my keychain. By the way, I always carry the keychain in my front right pocket, and it is tied to my wallet with a thin black cord, the wallet always in my back right pocket.
That way I never lose my keys (they can't get more than the sixty centimeters of the cord away from my body), and if somebody tries to steal my wallet while I'm commuting, I will feel the pull on my keys, so I can't lose the wallet either.

@explorador, for WordPress websites BackWPup works great. You can:
- schedule a daily backup at midnight and upload it to Dropbox (you can also run it manually, of course)
- in the same task, order a daily MySQL optimization
- keep as many backups as you want (usually the last 30)
- select which folders and files you want saved
- save only the main images (not the thousands of thumbnails generated by WP)
And remember, what I keep on the pendrive is a backup copy; the data lives on my local computer at the office.
7:18 pm on Nov 12, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2006
posts:1472
votes: 86


Sounds nice (the plugin), but think about what I wrote regarding the data transfer: if it needs you to be online, then the data travels through your local computer, and many things can go wrong in the middle (including a man in the middle). Data transfer should be as direct as possible.
9:23 pm on Nov 12, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 25, 2005
posts:1567
votes: 213


tied to my wallet with a thin black cord

I like it :-) I'd copy you but my wallet and keys don't fit in the same pocket.