AWS Glacier Backups

A short list of commands to help zip or image data and content into mid-size chunks that can be stored in S3 buckets. Since we are storing backups on AWS S3 Glacier, which needs to be restored before access, we want to keep the number of files reasonable. AWS charges for each API request, so there is an incentive to store bigger files. The maximum size for an archive uploaded in a single request is 4 GB. So let's go with file sizes in the range of 100 MB to 3 GB.
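As a rough sketch of how a large backup could be kept inside that size range, the compressed tar stream can be piped through split. The helper name chunked_tar, the part- prefix, and the restore path are illustrative assumptions, not part of the original workflow. The chunk size is passed as a plain byte count because size suffixes differ between BSD and GNU split.

```shell
#!/bin/bash
# Hypothetical helper: archive the given folders and split the compressed
# stream into fixed-size chunks named <prefix>aa, <prefix>ab, ...
chunked_tar() {
  local chunk_bytes="$1" prefix="$2"
  shift 2
  # "-f -" writes the archive to stdout; split reads it from stdin ("-").
  tar -czf - "$@" | split -b "$chunk_bytes" - "$prefix"
}

# Example (assumed paths): 3 GiB chunks under /Volumes/tmp
# chunked_tar $((3 * 1024 * 1024 * 1024)) /Volumes/tmp/folders.tar.gz.part- folder1 folder2

# To restore, concatenate the parts in name order and extract:
# cat /Volumes/tmp/folders.tar.gz.part-* | tar -xzf - -C /Volumes/tmp/restore
```

Every part is needed to restore the archive, so each part has to be uploaded and tracked.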

  • Determine size of dir
    • du -sh /path/to/folder1
  • Create encrypted zip
    • zip -er file.zip /path/to/folder
  • Create tar.gz file
    • check free space on the target volume: df -h /Volumes/tmp
    • tar -czf /Volumes/tmp/folders.tar.gz folder1 folder2
  • Encrypt using openssl
    • openssl aes-256-cbc -pass file:storage.enc_key -in file.txt -out file.enc
    • openssl aes-256-cbc -d -pass file:storage.enc_key -in file.enc -out file.txt
  • Create encrypted tars
    • tar -czf - folder1 folder2 | openssl aes-256-cbc -pass file:storage.enc_key -out /Volumes/tmp/folders.tar.gz.enc
    • openssl aes-256-cbc -d -pass file:storage.enc_key -in /Volumes/tmp/folders.tar.gz.enc | tar -xzf - -C /Volumes/tmp/abx
  • Install & configure awscli
    • brew install awscli
    • aws configure
    • AWS Access Key ID: <access-id>
    • AWS Secret Access Key: <secret-access>
    • Default region name: us-west-2
    • Default output format: json
    • Config is stored in the ~/.aws directory
  • List vaults in Glacier
    • aws glacier list-vaults --account-id -
  • List jobs in Glacier Vault (generic)
    • After "initiate-job", check status using "list-jobs"
    • aws glacier list-jobs --account-id - --vault-name photos
  • Get list of archives in Glacier Vault
    • aws glacier initiate-job --account-id - --vault-name photos --job-parameters '{ "Type": "inventory-retrieval" }'
    • Job takes 3-5 hours to complete
    • Use "list-jobs" to get job-id
    • aws glacier get-job-output --account-id - --vault-name photos --job-id "j6ig7qCeJ4Ortc-D83EgHsNxm3RriaAkyEFma3_dx_TV_xix5_APExmpGrDLT7EU07Wxc_5BQfwllggqsgH_JfLusxIV" archiveList.json
    • File archiveList.json contains details of archives in vault
  • Upload archive to Glacier (small files)
    • aws glacier upload-archive --account-id - --vault-name photos --body pics-2008.tar.gz
  • Upload archive to Glacier (large files)
    • Include aws-sdk for java
    • Upload archive using high level api
    • Script file aws-archive.sh for java program
    • aws-archive.sh photos pics-2008.tar.gz
  • Download/Retrieve archive from Glacier
    • aws glacier initiate-job --account-id - --vault-name photos --job-parameters file://archive-retrieval.json
    • Job takes 3-5 hours to complete
    • Use "list-jobs" to get job-id
    • aws glacier get-job-output --account-id - --vault-name photos --job-id "xGvIJyQPC9weheMNwIf4s2z8Zct1lYGvjzdxz84VwhD-OaGtCRPwLCAGdr5c_m3qadoOkMGo-FYaLJ5psLKhhcFDjC1n" pics-2008.tar.gz
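
Since both inventory and archive retrieval jobs take hours, the wait between "initiate-job" and "get-job-output" could be scripted. The following is a minimal sketch, assuming the job status is read with "aws glacier describe-job" and its StatusCode field (InProgress, Succeeded, or Failed). The function name wait_for_job and the default 15-minute polling interval are illustrative choices, not part of the original notes.

```shell
#!/bin/bash
# Hypothetical helper: poll a Glacier job until it finishes.
# Returns 0 on Succeeded, 1 on Failed, 2 if the status call itself fails.
wait_for_job() {
  local vault="$1" job_id="$2" interval="${3:-900}" status
  while true; do
    status=$(aws glacier describe-job \
      --account-id - \
      --vault-name "$vault" \
      --job-id "$job_id" \
      --query 'StatusCode' \
      --output text) || return 2
    case "$status" in
      Succeeded) return 0 ;;
      Failed)    return 1 ;;
      *)         sleep "$interval" ;;  # still InProgress, try again later
    esac
  done
}

# Example (job id is a placeholder taken from "list-jobs"):
# wait_for_job photos "<job-id>" && \
#   aws glacier get-job-output --account-id - --vault-name photos \
#     --job-id "<job-id>" pics-2008.tar.gz
```

The retrieval request shown below also sets an SNSTopic, so a notification on that topic is an alternative to polling.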

archive-retrieval.json

{
  "Type": "archive-retrieval",
  "ArchiveId": "AveGlBWdJIDk8-THelSpu8FFo34KUmg8pVOQFvMxEQzM8MXMC6A4V7XcX3E3_qf7II3nYNuUpsgAhbSNYzbUUDKEmKv6VRwJvQZdP9m33ZpCGhsrMXnAgn05ng2xDvHHGFSRUjFf-g",
  "Description": "Retrieve SQL dump for audit team",
  "SNSTopic":"arn:aws:sns:us-west-2:112233445566:glacier-sandbox"
}
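
The ArchiveId changes for every retrieval, so the request file could be generated rather than edited by hand. This is a sketch only: write_retrieval_request is a hypothetical helper, it leaves out the optional SNSTopic field, and it assumes the archive id and description contain no double quotes or backslashes, since it does no JSON escaping.

```shell
#!/bin/bash
# Hypothetical helper: write a Glacier archive-retrieval request file.
# Usage: write_retrieval_request <archive-id> <description> [output-file]
write_retrieval_request() {
  local archive_id="$1" description="$2" out_file="${3:-archive-retrieval.json}"
  # Unquoted heredoc delimiter so the shell variables are expanded.
  cat > "$out_file" <<EOF
{
  "Type": "archive-retrieval",
  "ArchiveId": "$archive_id",
  "Description": "$description"
}
EOF
}

# Example (archive id is a placeholder taken from archiveList.json):
# write_retrieval_request "<archive-id>" "Retrieve pics-2008 backup"
# aws glacier initiate-job --account-id - --vault-name photos \
#   --job-parameters file://archive-retrieval.json
```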

aws-archive.sh

#!/bin/bash

CLASSPATH="/Users/bhira/Code/programming"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/aws-java-sdk-1.11.245.jar"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/commons-logging-1.2.jar"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/jackson-databind.jar"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/jackson-core-2.2.3.jar"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/jackson-annotations-2.1.2.jar"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/httpclient-4.5.4.jar"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/httpcore-4.4.7.jar"  
CLASSPATH="$CLASSPATH:/Users/bhira/Code/lib/joda-time-2.9.9.jar"

CLASSNAME="AWSArchiveUpload"

# Quote the expansions so paths and file names containing spaces survive intact
java -cp "$CLASSPATH" "$CLASSNAME" "$@"
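
The classpath above lists each jar by hand. As a hedged alternative, it could be assembled from whatever jars sit in the lib directory. The function name build_classpath is an illustrative assumption, and unlike the explicit list it picks up every .jar in the directory, which may or may not be what you want.

```shell
#!/bin/bash
# Hypothetical helper: start from a base directory (where the compiled class
# lives) and append every .jar found in the given lib directory.
build_classpath() {
  local cp="$1" lib_dir="$2" jar
  for jar in "$lib_dir"/*.jar; do
    [ -e "$jar" ] || continue   # skip the literal pattern when no jars match
    cp="$cp:$jar"
  done
  printf '%s\n' "$cp"
}

# Example, using the paths from the script above:
# CLASSPATH=$(build_classpath /Users/bhira/Code/programming /Users/bhira/Code/lib)
# java -cp "$CLASSPATH" AWSArchiveUpload "$@"
```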

References:
http://docs.aws.amazon.com/cli/latest/reference/glacier/index.html
https://www.madboa.com/blog/2016/09/23/glacier-cli-intro/
http://docs.aws.amazon.com/amazonglacier/latest/dev/uploading-an-archive-single-op-using-java.html

Baldeep Hira

bay area programmer working on mobile/tablet/web apps and enterprise cloud apps; ui/ux, html5 and everything else for a prettier web and world

  • San Francisco Bay Area