Automated full-disk backup on Linux/Ubuntu

By Atomstar on Friday 4 October 2019 18:14 - Comments (5)
Categories: Linux, RaspberryPi, Views: 2.532

Now that I'm storing my valuable smart home data (;)) on a Raspberry Pi, I need a backup in case something goes wrong, most notably a power failure.

I settled on a full-disk rsnapshot incremental backup scheme, combined with an explicit InfluxDB backup, which seems to work nicely.

Goal

On my Raspberry Pi I have the following assets to protect:
  • Overall Raspbian config (which took a while to converge to)
  • InfluxDB with data
  • Grafana with dashboards
  • Worker scripts
Goals:
  • Protect against data corruption through power failure
  • Bonus: protect against user error (keep version history)
  • Bonus: protect against fire (offsite backup)

Alternatives considered

Besides a backup, I briefly considered a UPS to protect against power outages. This would be nicer, as I would never need to check the filesystem for corruption after a power failure. However, a UPS is more complex (needs extra wiring/soldering), more expensive, and of unclear reliability (testing it once is no guarantee that it will work, whereas backups can be verified to some extent). Most importantly, a UPS could actually increase the risk of fire, nullifying my UPS solution and causing worse problems :p

Ideas I had and decided not to use:
  • Use a power bank as UPS: pro: easy, reliable; con: needs a bank that can charge and discharge simultaneously (expensive); increased fire risk, as affordable power banks are not designed for continuous use; additional logic required to power down during longer outages
  • Custom Rai solution: pro: integrated solution, reliable? con: more complex, fire risk, additional logic required to power down for longer outages

Backup solution

I settled on using rsnapshot. I tried dd before, as recommended here and there, but this did not work for me: I want to back up a live filesystem, and dd copies raw blocks while they may still be changing, so the resulting image can be inconsistent.

For reference, this command did NOT work:
ionice -c 3 dd bs=4M if=/dev/mmcblk0p2 | gzip \
  > /media/backup/$(date +%Y%m%d).raspbian.img.gz

Backup influxdb

Since backing up a live database is prone to error (I don't know what would happen when backing up a database which is being written to), I back up InfluxDB separately using the following script, which runs daily at midnight:

#!/usr/bin/env bash

/usr/bin/influxd backup -portable \
  /home/pi/backup/influx_snapshot.db/$(date +%Y%m%d) && \
  /home/pi/.local/bin/rotate-backups \
  --daily 5 --weekly 4 --yearly "always" \
  /home/pi/backup/influx_snapshot.db/


This makes a backup using InfluxDB's own backup utility and rotates the backups using rotate-backups. Rotating has the advantage that I can roll back a few days in case I accidentally delete my data.
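
Restoring is the other half of this, as a commenter points out. A minimal, untested sketch, assuming InfluxDB 1.x (where `influxd restore -portable` is the counterpart of `influxd backup -portable`) and the directory layout used by the script above. The helper name `latest_backup` is hypothetical:

```shell
#!/usr/bin/env bash
# Print the newest dated backup directory. The names are YYYYMMDD, so a
# plain lexical sort is also a chronological sort.
# $1: directory holding the dated backups
latest_backup() {
  local newest
  newest=$(find "$1" -mindepth 1 -maxdepth 1 -type d -name '[0-9]*' | sort | tail -n 1)
  # Fail (non-zero exit, no output) when no backup directory exists.
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Usage (not run automatically):
#   influxd restore -portable "$(latest_backup /home/pi/backup/influx_snapshot.db)"
```

As far as I know, a portable restore refuses to overwrite a database that already exists; in that case the `-db` and `-newdb` options of `influxd restore` can restore into a new database name. Verify this against the documentation for your InfluxDB version before relying on it.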

Get USB stick

Get a reliable (and optionally slow, thus cheap) USB stick (USB 2.0 suffices for my ~3 GB RPi3B+ installation). Some options:
  • Transcend JetFlash 790 64GB Black, 15 EUR, 35 MB/s
  • Transcend JetFlash 790 128GB Black, 21 EUR, 35 MB/s
  • Intenso Speed Line 128GB Black, 18 EUR, 23 MB/s
  • Sandisk Ultra 64GB, 13 EUR, 22 MB/s
  • Intenso Speed Line 256GB Black, 35 EUR, 89 MB/s
  • Kingston DataTraveler 100 G3 64GB Black, 8 EUR, 15 MB/s (collapses for random write)
Based on this hardware.info review.

Add to /etc/fstab to automount:
UUID=6AC72A3C-8CAA-445F-83CE-35FF5D76BD01 /media/backup ext4 noatime,noexec,nosuid 0 0
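
To build that line for a different stick, the UUID can be read with `blkid`. A small sketch: `fstab_entry` is a hypothetical helper that only prints a line with the same options as above, and the device name `/dev/sda1` in the usage comment is an assumption (check with `lsblk`):

```shell
#!/usr/bin/env bash
# Print an /etc/fstab line for an ext4 backup stick, using the same
# mount options as the entry above.
# $1: filesystem UUID   $2: mount point
fstab_entry() {
  printf 'UUID=%s %s ext4 noatime,noexec,nosuid 0 0\n' "$1" "$2"
}

# Usage (as root; /dev/sda1 is an assumed device name):
#   uuid=$(blkid -s UUID -o value /dev/sda1)
#   fstab_entry "$uuid" /media/backup >> /etc/fstab
#   mkdir -p /media/backup && mount /media/backup && findmnt /media/backup
```

On systemd-based systems it may be worth adding `nofail` to the options so that booting does not stall when the stick is unplugged. That option is not part of the setup described here.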

Configure & schedule rsnapshot

As backup tool I use rsnapshot, which has been around for a while and is built on the robust rsync backend. I used the DigitalOcean guide and the linuxconfig guide.

I use the following config:
config_version	1.2

# Set backup target to usb stick
snapshot_root	/media/backup/rsnapshot/
# don't create root directory because it's already there
no_create_root	1

cmd_cp		/bin/cp
cmd_rm		/bin/rm
cmd_rsync	/usr/bin/rsync
cmd_logger	/usr/bin/logger
cmd_du		/usr/bin/du

# Keep 7 daily, 4 weekly and 4 monthly backups
retain	daily	7
retain	weekly	4
retain	monthly	4

verbose		2
loglevel	3
lockfile	/var/run/rsnapshot.pid

# Exclude
exclude	/media/
exclude	/dev/
exclude	/mnt/
exclude	/lost+found/
exclude	/proc/
exclude	/tmp/

# Back up the whole filesystem into the rpi3b/ subdirectory
backup	/		rpi3b/
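
With this config each snapshot is a plain directory tree, so restoring a file is an ordinary copy. A sketch of where a file ends up, assuming rsnapshot's default layout (`<snapshot_root>/<level>.<n>/<backup dir>/<original path>`). `snapshot_path` is a hypothetical helper, not part of rsnapshot:

```shell
#!/usr/bin/env bash
SNAPSHOT_ROOT=/media/backup/rsnapshot   # snapshot_root from the config
BACKUP_DIR=rpi3b                        # destination from the "backup" line

# Print the location of a file inside a given snapshot.
# $1: level (daily|weekly|monthly)  $2: age (0 = newest)  $3: absolute path
snapshot_path() {
  printf '%s/%s.%s/%s%s\n' "$SNAPSHOT_ROOT" "$1" "$2" "$BACKUP_DIR" "$3"
}

# Usage: restore a file from the second-newest daily snapshot
#   cp -a "$(snapshot_path daily 1 /etc/fstab)" /etc/fstab
```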


After setting up, test config and dry-run:
rsnapshot configtest
rsnapshot -t daily


And finally add to cron (as root):
# daily backup is run at 01:20 am to include stuff happening at midnight
20 01 * * * /usr/bin/rsnapshot daily
# weekly backup is run at 01:05 am on Sunday, just before running the daily backup that week
05 01 * * 7 /usr/bin/rsnapshot weekly
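
The config above also declares `retain monthly 4`, but rsnapshot only rotates a level when it is invoked for that level, so the monthly snapshots need their own cron entry. A possible line, with the time chosen (as an assumption) so that it runs before the weekly and daily jobs:

```
# monthly backup is run at 00:50 am on the 1st of each month, before the weekly and daily jobs
50 00 1 * * /usr/bin/rsnapshot monthly
```

The monthly and weekly runs only rotate existing snapshots; only the lowest level (daily) actually copies data with rsync.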

Conclusion & next steps

Using this backup scheme I largely cover two of the three goals:
  • Protect against data corruption - works, losing at most 1 day of data, which could be reduced by increasing the backup frequency. Open risk: recovering a crashed/corrupted system is fairly slow as I likely have to reconfigure stuff manually
  • Bonus: protect against user error (keep version history) - covered nicely via incremental backups
  • Bonus: protect against fire (offsite backup) - not covered, but if my home is gone, losing the home automation is OK
Some ideas to extend & improve:
  • Automatic method to check for data corruption upon hard system crash
  • Automatic method to check for invisible data corruption upon system crash after which system seems to work OK
  • Backup to offsite host (rsync works over ssh)
  • Run using ionice/nice to reduce backup load
  • Automatically unmount the USB stick when not in use
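
The ionice/nice idea could look like the wrapper below, which cron would call instead of rsnapshot directly. It also refuses to run when the stick is not mounted, which matters once the stick is unmounted between runs. This is a sketch, not the setup described above; the function name and the script path in the usage comment are hypothetical:

```shell
#!/usr/bin/env bash
# Run one rsnapshot level at the lowest CPU and I/O priority.
# $1: level (daily|weekly|monthly)
# $2: mount point of the backup stick (default: /media/backup)
run_backup() {
  local level="$1" mount_dir="${2:-/media/backup}"
  # Refuse to run if nothing is mounted at the backup location.
  if ! mountpoint -q "$mount_dir"; then
    echo "backup stick not mounted at $mount_dir, skipping $level" >&2
    return 1
  fi
  # nice 19 = lowest CPU priority; ionice class 3 = idle I/O class
  nice -n 19 ionice -c 3 /usr/bin/rsnapshot "$level"
}

# Usage: save this as a script (hypothetical path below) that ends with
#   run_backup "$@"
# and call it from root's crontab:
#   20 01 * * * /usr/local/bin/run-backup daily
```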


Comments


By Tweakers user ASP, Sunday 6 October 2019 13:21

Tip: how to restore!

By Tweakers user vanaalten, Monday 7 October 2019 08:01

Would it make sense to backup to a micro-SD card instead (somehow connected via USB)? Then you could consider to buy a 2nd rPI: in case your smart home controller breaks down, you can quickly replace by replacing the broken rPI with the backup unit & backup micro-SD card.

By Tweakers user stuiterveer, Monday 7 October 2019 09:46

vanaalten wrote on Monday 7 October 2019 @ 08:01:
Would it make sense to backup to a micro-SD card instead (somehow connected via USB)? Then you could consider to buy a 2nd rPI: in case your smart home controller breaks down, you can quickly replace by replacing the broken rPI with the backup unit & backup micro-SD card.
When talking about backups, I personally don't feel like this is an adequate step since SD card corruption happens way faster than HDD corruption.

By Tweakers user Atomstar, Wednesday 9 October 2019 18:29

Good point, no experience yet :p
vanaalten wrote on Monday 7 October 2019 @ 08:01:
Would it make sense to backup to a micro-SD card instead (somehow connected via USB)?
Although the idea is nice, this does not work with a live system, see 'Backup solution'. Any ideas on how to make this work are welcome :)
stuiterveer wrote on Monday 7 October 2019 @ 09:46:
When talking about backups, I personally don't feel like this is an adequate step since SD card corruption happens way faster than HDD corruption.
I'm already running my RPi off an SD card, so that risk is already there. Also, I believe wear of SD cards is greatly exaggerated, see Reduce filesystem usage to prevent SD card wear in a previous post.

By Tweakers user roelvdwal, Tuesday 15 October 2019 21:51

The reason why dd won't work is that it works at a lower level: it's filesystem-agnostic and copies block by block. This will fail if, for example, you write a large file to storage while dd is running: one part might be placed close to the beginning of the drive, another part at the end. With dd, it's possible that only the second portion gets copied. I'm 99% sure I'm omitting other things that will go wrong with dd on a live system, but that's one of the easiest to grasp and explain. Although this is not something I've experimented with, maybe looking into the ZFS filesystem is worthwhile, especially because it can create snapshots of a running filesystem. These snapshots only take up more space if you actually modify the filesystem after their creation. Compare it to making snapshots of virtual machines if you like. From what I've read, these can be sent over the network as well. For more info, see
https://serverfault.com/q...nuous-backups-of-zfs-pool
Keep in mind I'm only relying on other people's info, I've never built such a system myself
