AWS - Not so easy DynamoDB import
- Details
- Written by: JC
- Category: Cloud
One of my (not so) favorite things to do when learning cloud is discovering gaps in documentation. When I wanted to import DynamoDB data, I thought it would be easy to use the format here:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html
It turns out the data needs to be in a slightly different format than that. At first, I thought it might be as easy as making a backup and seeing what the format looked like. Unfortunately, DynamoDB backups put one record per file (at least with the number of rows I had), which really doesn't scale well if you have thousands of records.
I found the data actually needs to be newline-delimited DynamoDB JSON, with each record wrapped in an "Item" key like below, in order to import using the built-in import function:
{"Item": {"id": {"N": "0"}, "admin": {"S": "fname"}, "type": {"S": "lname"}}}
{"Item": {"id": {"N": "1"}, "admin": {"S": "fname"}, "type": {"S": "lname"}}}
To get it into this format, I had to jump through a few hoops. First, I used the 'export-dynamodb' script found here:
https://github.com/truongleswe/export-dynamodb
This script also splits the export across multiple files, so I had to parse each one, convert the multi-line JSON into single lines like the ones above, and combine everything into a single file. Here is the sample script I used to do that:
import json
import os

# Directory produced by export-dynamodb; <table> is a placeholder for the table name.
data_dir = './dump/<table>/data'

# Write every exported item as a single line of DynamoDB JSON in one combined file.
with open('<table>.json', 'a') as out_file:
    for filename in os.listdir(data_dir):
        with open(os.path.join(data_dir, filename), 'r') as f:
            exported = json.load(f)
        for record in exported['Items']:
            out_file.write(json.dumps({'Item': record}) + '\n')
After the records were in one text file, I could use gzip to compress it and save some space before uploading it to S3 for the import.
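The compression and the import itself can be scripted as well. Below is a rough sketch using the standard gzip module and boto3's import_table call. The bucket name, key prefix, and table parameters are placeholders I made up for illustration, and the key schema assumes the numeric id attribute from the sample records above; adjust them to match your own table.

import gzip
import shutil

import boto3

# Compress the combined newline-delimited JSON file (placeholder file name).
with open('<table>.json', 'rb') as src, gzip.open('<table>.json.gz', 'wb') as dst:
    shutil.copyfileobj(src, dst)

# Upload the compressed file to S3 (bucket and key are made-up placeholders).
s3 = boto3.client('s3')
s3.upload_file('<table>.json.gz', 'my-import-bucket', 'dynamodb-import/<table>.json.gz')

# Kick off the built-in import; this creates a brand-new table from the S3 data.
dynamodb = boto3.client('dynamodb')
response = dynamodb.import_table(
    S3BucketSource={
        'S3Bucket': 'my-import-bucket',
        'S3KeyPrefix': 'dynamodb-import/',
    },
    InputFormat='DYNAMODB_JSON',
    InputCompressionType='GZIP',
    TableCreationParameters={
        'TableName': '<table>',
        'AttributeDefinitions': [{'AttributeName': 'id', 'AttributeType': 'N'}],
        'KeySchema': [{'AttributeName': 'id', 'KeyType': 'HASH'}],
        'BillingMode': 'PAY_PER_REQUEST',
    },
)
print(response['ImportTableDescription']['ImportStatus'])

One thing to keep in mind is that the S3 import feature creates a new table as part of the import; it won't load data into a table that already exists.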