I’m setting up automated offsite backups from my NSLU2 to Amazon S3. With surprisingly little effort, I’ve managed to get a tool called s3sync running on the “slug” (as it’s known). s3sync is a Ruby script, so in order to run it, I had to install Ruby, which in turn meant that I had to replace the slug’s firmware with a different version of Linux, called Unslung. All of this worked pretty much as advertised in the tools’ respective documentation – for the details, see the previous posts in this series.
As all of the pieces were in place, I next needed to do some simple tests to make sure it could handle the kind of files I wanted it to back up. In particular, I wanted it to be able to handle deep directory hierarchies, and to remember user and group ownership and file permissions.
The first step was to create some test files.
# cd /tmp
# mkdir testdata
# cd testdata
# mkdir directorynumber1
# cd directorynumber1
# mkdir directorynumber2
# cd directorynumber2
# cd directorynumber21
# pwd
/tmp/testdata/directorynumber1/directorynumber2/directorynumber3/directorynumber4/directorynumber5/directorynumber6/directorynumber7/directorynumber8/directorynumber9/directorynumber10/directorynumber11/directorynumber12/directorynumber13/directorynumber14/directorynumber15/directorynumber16/directorynumber17/directorynumber18/directorynumber19/directorynumber20/directorynumber21
# cat > file000
000
# chmod 000 file000
# cat > file644
644
# chmod 644 file644
# cat > file777
777
# chmod 777 file777
# chown guest:nobody file777
# chown bin:administrators file000
# ls -lrt
---------- 1 bin administ 4 Nov 14 2006 file000
-rw-r--r-- 1 root root 4 Nov 14 2006 file644
-rwxrwxrwx 1 guest nobody 4 Nov 14 2006 file777
#
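For anyone repeating this test, the nested directories could also be built with a short loop rather than one mkdir at a time. This is only a sketch, not what was typed in the session above: TESTROOT is a hypothetical variable introduced here for illustration, and it assumes the shell on the slug supports POSIX arithmetic and that mkdir accepts -p.

```shell
#!/bin/sh
# Sketch: build the 21-level test hierarchy in one go.
# TESTROOT is a hypothetical name; the session above used /tmp/testdata.
TESTROOT="${TESTROOT:-/tmp/testdata}"

# Assemble the full path: $TESTROOT/directorynumber1/.../directorynumber21
path="$TESTROOT"
i=1
while [ "$i" -le 21 ]; do
    path="$path/directorynumber$i"
    i=$((i + 1))
done

# Create every level of the hierarchy at once (assumes mkdir supports -p)
mkdir -p "$path"

# Report the length of the deepest path, to check it exceeds the limits being tested
echo "${#path} characters: $path"
```

The test files themselves would still be created in the deepest directory with cat, chmod and chown as shown above.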
So, I had some files with differing permissions and ownership, at the bottom of a directory hierarchy whose full path was over 350 characters long. I had a vague impression that there might be a 200-character key limit on S3, and I’m always worried about 255-character limits, so 350 seemed like a sensible test length; if a system can manage 350, it can probably manage much larger figures, up to 32,767 or so… Anyway, the next step was to sync the whole thing up to S3:
# cd /tmp/s3sync/
# ./s3sync.rb -r /tmp/testdata <my key ID>.Test:yetanotherprefix
#
A quick check with jets3t Cockpit confirmed that everything was uploaded with appropriate-looking keys, and also with properties specifying decent-looking integer owner, group and permission values. This looked good – no key-length limit issues. However, there was only one way to be absolutely sure that it was working:
# ./s3sync.rb -r <my key ID>.Test:yetanotherprefix/testdata/ /tmp/copytestdata
#
(Note the positions of the slashes, etc. – the full syntax for s3sync can take a while to work out, but the README documents it well if you take the time to read it…)
And then, to confirm that it was OK:
# cd /tmp/copytestdata/directorynumber1/directorynumber2/directorynumber3/directorynumber4/directorynumber5/directorynumber6/directorynumber7/directorynumber8/directorynumber9/directorynumber10/directorynumber11/directorynumber12/directorynumber13/directorynumber14/directorynumber15/directorynumber16/directorynumber17/directorynumber18/directorynumber19/directorynumber20/directorynumber21/
# ls -lrt
-rw-r--r-- 1 root root 4 Nov 14 01:03 file644
---------- 1 bin administ 4 Nov 14 01:03 file000
-rwxrwxrwx 1 guest nobody 4 Nov 14 01:03 file777
#
…which all looked correct!
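With only three files, checking the listing by eye is enough. For a larger tree, the same comparison could be scripted. The following is a minimal sketch under several assumptions: the function names snapshot and compare_trees are made up for this example, find is assumed to support -exec, file names are assumed to contain no spaces, and only mode, numeric owner and numeric group are compared (not file contents or timestamps).

```shell
#!/bin/sh
# Sketch: compare permissions and ownership between an original tree and
# its restored copy. snapshot and compare_trees are hypothetical names.

# Print "mode uid gid relative-path" for each regular file under $1, sorted.
# Assumes file names contain no whitespace (the path is taken from the last field).
snapshot() {
    ( cd "$1" && find . -type f -exec ls -ln {} \; ) |
        awk '{ print $1, $3, $4, $NF }' | sort
}

# Return 0 if both trees have identical attributes, non-zero otherwise.
# diff prints any lines that differ between the two snapshots.
compare_trees() {
    snapshot "$1" > "/tmp/attrs.original.$$"
    snapshot "$2" > "/tmp/attrs.copy.$$"
    diff "/tmp/attrs.original.$$" "/tmp/attrs.copy.$$"
    status=$?
    rm -f "/tmp/attrs.original.$$" "/tmp/attrs.copy.$$"
    return $status
}

# Example usage with the directories from this post:
#   compare_trees /tmp/testdata /tmp/copytestdata && echo "attributes match"
```

Numeric IDs (ls -n) are used rather than names so that the comparison does not depend on how long user and group names are truncated in the listing.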
So now I knew that s3sync would work from the NSLU2 to Amazon S3, that the file attributes I cared about were being persisted, and that deep directory hierarchies were not a problem. The next step would have to be to get it working with full SSL, as I don’t really want my private data flying over the public Internet unencrypted, and then to put the whole thing into a shell script and schedule a cron job to sync daily.