Project: Automated offsite backups for an NSLU2 -- part 11

Posted on 14 November 2006 in NSLU2 offsite backup project

Previously in this series: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9, Part 10.

I'm setting up automated offsite backups from my NSLU2 to Amazon S3. With surprisingly little effort, I've managed to get a tool called s3sync running on the "slug" (as it's known). s3sync is a Ruby script, so in order to run it, I had to install Ruby, which in turn meant that I had to replace the slug's firmware with a different version of Linux, called Unslung. All of this worked pretty much as advertised in the tools' respective documentation -- for the details, see the previous posts in this series.

Having confirmed that s3sync worked as I'd expect it to, I needed to install it in a sensible place -- I'd previously just put it in /tmp -- set it up so that I could use SSL to encrypt the data while it was on its way to Amazon, and then write a script to synchronise at least one of the directories I want backed up. I'd then be able to test the script, schedule it, test the scheduling, and then I'd be done!

First things first -- I was getting annoyed with not having some of my favourite packages installed on the slug, so:

# ipkg install less
Installing less (394-2) to root...
Downloading http://ipkg.nslu2-linux.org/feeds/unslung/cross/less_394-2_armeb.ipk
Installing ncursesw (5.5-1) to root...
Downloading http://ipkg.nslu2-linux.org/feeds/unslung/cross/ncursesw_5.5-1_armeb.ipk
Installing ncurses (5.5-1) to root...
Downloading http://ipkg.nslu2-linux.org/feeds/unslung/cross/ncurses_5.5-1_armeb.ipk
Configuring less
Configuring ncurses
Configuring ncursesw
# ipkg install bash
Installing bash (3.1-1) to root...
Downloading http://ipkg.nslu2-linux.org/feeds/unslung/cross/bash_3.1-1_armeb.ipk
Installing readline (5.1-1) to root...
Downloading http://ipkg.nslu2-linux.org/feeds/unslung/cross/readline_5.1-1_armeb.ipk
Configuring bash
Configuring readline
# ls /opt/bin/bash
# /opt/bin/bash
bash-3.1#

So, I edited /etc/passwd to make /opt/bin/bash the shell for root, logged out, then logged back in again.
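
For reference, that's just a matter of changing the shell field (the last one) on root's line in /etc/passwd so that it points at the new bash -- something like the line below, where every field other than the shell is whatever your slug already has rather than anything copied from mine:

root:x:0:0:root:/root:/opt/bin/bash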

OK, the next task was to install s3sync somewhere sensible: I felt that /home/s3sync was a good enough place for the s3sync script itself and my own shell scripts, so I put everything there:

-bash-3.1# cd /home
-bash-3.1# mkdir s3sync
-bash-3.1# cd s3sync
-bash-3.1# mv /tmp/s3sync/* .
-bash-3.1# ls
HTTPStreaming.rb README.txt      README_s3cmd.txt S3.rb           S3_s3sync_mod.rb
S3encoder.rb     s3cmd.rb        s3sync.rb        s3try.rb        thread_generator.rb
-bash-3.1#

Next, it was necessary to install some root certificates so that s3sync could use SSL to transfer the data. Working from John Eberly's post on how he set up s3sync, I did the following:

-bash-3.1# mkdir certs
-bash-3.1# cd certs
-bash-3.1# wget http://mirbsd.mirsolutions.de/cvs.cgi/~checkout~/src/etc/ssl.certs.shar
Connecting to mirbsd.mirsolutions.de[85.214.23.162]:80
-bash-3.1# sh ssl.certs.shar
x - 00869212.0
x - 052e396b.0
x - 0bb21872.0

...

x - f4996e82.0
x - f73e89fd.0
x - ff783690.0
-bash-3.1#
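
The shar just unpacks each root certificate into a file named after its OpenSSL subject hash (that's what all the xxxxxxxx.0 names are), which is exactly the directory layout that SSL_CERT_DIR expects. A quick sanity check that they all landed in the right place -- this should print the number of certificate files rather than an error:

-bash-3.1# ls /home/s3sync/certs/*.0 | wc -l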

And now I could put in scripts to upload to S3, based on John Eberly's:

-bash-3.1# cat > upload.sh
#!/opt/bin/bash
# script to sync local directory up to s3
cd /home/s3sync
export AWS_ACCESS_KEY_ID=<my key ID>
export AWS_SECRET_ACCESS_KEY=<my secret key>
export SSL_CERT_DIR=/home/s3sync/certs
./s3sync.rb -r --ssl --delete "/user data/Giles/Catalogue" <my key ID>.Backups:/remotefolder
-bash-3.1# chmod 700 upload.sh

The chmod was required to stop non-root users (of whom I naturally have hordes on the slug :-) from being able to read the secret key. Better to be safe than sorry. The directory I'm syncing here is just a small subdirectory of the area I ultimately want to back up to S3, which makes it a quick test case.
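
If you want to double-check that, the mode column for the script should come back as -rwx------, owned by root:

-bash-3.1# ls -l /home/s3sync/upload.sh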

Next, a download script:

-bash-3.1# cat > download.sh
#!/opt/bin/bash
# script to sync "directory" down from s3
cd /home/s3sync
export AWS_ACCESS_KEY_ID=<my key ID>
export AWS_SECRET_ACCESS_KEY=<my secret key>
export SSL_CERT_DIR=/home/s3sync/certs
./s3sync.rb -r --ssl --delete <my key ID>.Backups:/remotefolder/Catalogue/ /downloads/
-bash-3.1# chmod 700 download.sh
-bash-3.1#

Next, I created the <my key ID>.Backups bucket using jets3t Cockpit, and then ran the upload script:

-bash-3.1# ./upload.sh
-bash-3.1#
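
(As an aside, it looks like the bucket could also have been created from the slug itself: s3sync ships with s3cmd.rb, and its README_s3cmd.txt describes a createbucket command. I stuck with Cockpit, but something along these lines ought to work:)

-bash-3.1# export AWS_ACCESS_KEY_ID=<my key ID>
-bash-3.1# export AWS_SECRET_ACCESS_KEY=<my secret key>
-bash-3.1# ./s3cmd.rb createbucket <my key ID>.Backups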

A quick check confirmed that the data had been uploaded. However, I found myself wanting the tool to log rather more than that. s3sync's usage message mentioned a "-v" option to run it in verbose mode, so I added that to the upload script and reran it. There was still no output, but I suspected that was simply because there were no changes to upload... so I deleted the data from S3 using jets3t Cockpit, and reran. This time I got output:

-bash-3.1# ./upload.sh
Create node 19_Ejiri.jpg
Create node 22_Okabe.jpg
Create node 29_The_Original_Hachiman_Shrine_at_Suna_Village.jpg
Create node 47_Kameyama.jpg
-bash-3.1#
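
For reference, the only change to upload.sh was adding the -v flag to the s3sync line -- the same flags show up in the process listing further down:

./s3sync.rb -v -r --ssl --delete "/user data/Giles/Catalogue" <my key ID>.Backups:/remotefolder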

Time to test the download script (adding the -v to it first):

-bash-3.1# mkdir /downloads/
-bash-3.1# ./download.sh
Create node 19_Ejiri.jpg
Create node 22_Okabe.jpg
Create node 29_The_Original_Hachiman_Shrine_at_Suna_Village.jpg
Create node 47_Kameyama.jpg
-bash-3.1# ls -lrt /downloads/
-rwxrw----    1 guest    everyone   578008 Nov 14 22:30 19_Ejiri.jpg
-rwxrw----    1 guest    everyone   607822 Nov 14 22:30 22_Okabe.jpg
-rwxrw----    1 guest    everyone   563472 Nov 14 22:30 29_The_Original_Hachiman_Shrine_at_Suna_Village.jpg
-rwxrw----    1 guest    everyone   681194 Nov 14 22:31 47_Kameyama.jpg
-bash-3.1# ls -lrt /user\ data/Giles/Catalogue/
-rwxrw----    1 guest    everyone   607822 Mar 17  2005 22_Okabe.jpg
-rwxrw----    1 guest    everyone   578008 Mar 17  2005 19_Ejiri.jpg
-rwxrw----    1 guest    everyone   681194 Mar 17  2005 47_Kameyama.jpg
-rwxrw----    1 guest    everyone   563472 Mar 17  2005 29_The_Original_Hachiman_Shrine_at_Suna_Village.jpg
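
(Comparing sizes like this is only a smoke test; a stricter check would be to compare checksums of the original and downloaded files. Assuming the slug's BusyBox includes md5sum, something like this would do it for one of the files:)

-bash-3.1# md5sum "/user data/Giles/Catalogue/19_Ejiri.jpg" /downloads/19_Ejiri.jpg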

Hooray! So, finally, I decided to try syncing up my entire "user data" share from a cron job, scheduled to run a few minutes later. I modified the upload.sh script to point to the correct directory, and then edited /etc/crontab, adding this line:

42 22 * * * root /home/s3sync/upload.sh &> /tmp/s3sync.log
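
(One caveat: &> is a bash-ism, and crontab entries are normally run by /bin/sh, so depending on what /bin/sh is on the slug, the stderr half of that redirection may not behave as intended. The portable spelling would be:)

42 22 * * * root /home/s3sync/upload.sh > /tmp/s3sync.log 2>&1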

And then I waited until 10:42pm by the slug's clock (which, incidentally, seemed to have drifted by a minute or so since the previous evening). When the time came, I checked what processes were running:

-bash-3.1# ps auxww
  PID TTY     Uid        Size State Command
    1         root       1212   S   /bin/init
    2         root          0   S   [keventd]

...

 1628 ttyp1   root       2100   S   -bash
 1715         root       2036   S   /opt/bin/bash /home/s3sync/upload.sh
 1716         root      12856   S   /opt/bin/ruby ./s3sync.rb -v -r --ssl --del
 1718 ttyp1   root       1984   R   ps auxww
-bash-3.1#

Excellent. The logfile was there; nothing had been written to it yet, but checking the bucket showed that data was already being copied up. My best guess was that the output was being buffered and would be flushed to the logfile later on.

At this point, all I could really do was wait -- so it was time to leave the slug to get on with it, ready to check the results the next day. If everything had synchronised up correctly -- and a download to another machine worked -- then I'd be able to say that I'd completed the project :-)

Next: Scheduling part 2.