One of my (not so) favorite things to do when learning cloud is discovering gaps in documentation.  When I wanted to import DynamoDB data, I thought it would be as simple as using the format documented here:

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html

It turns out it needs to be a bit different from that.  At first, I thought it might be as easy as making a backup and seeing what the format looked like. Unfortunately, the DynamoDB backups had one record per file (at least with the number of rows I had).  That really doesn't scale well if you have thousands of records.

I found the data actually needs to look like the below in order to import using the built-in import function:

{"Item": {"id": {"N": "0"}, "admin": {"S": "fname"}, "type": {"S": "lname"}}}
{"Item": {"id": {"N": "1"}, "admin": {"S": "fname"}, "type": {"S": "lname"}}}

To get it to this format, I did have to jump through a few hoops.  First, I used the 'export-dynamodb' script found here:

https://github.com/truongleswe/export-dynamodb

This script also exports each entry as its own file, so I had to parse each file, convert the multi-line JSON into a single line like the above, and append everything to one file. Here is the sample script I used to do that:

import json
import os

# Combine every exported file into a single JSON-lines file for import.
with open('<table>.json', 'w') as out_file:
    for filename in os.listdir('./dump/<table>/data'):
        # Each exported file holds an "Items" array of DynamoDB-JSON records.
        with open('./dump/<table>/data/' + filename, 'r') as f:
            data = json.loads(f.read())

        for record in data['Items']:
            # Wrap each record in {"Item": ...} and write it on its own line.
            out_file.write(json.dumps({'Item': record}) + '\n')

After the records were in one text file, I could then use gzip to compress the file to save some space.
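
If you'd rather keep everything in Python, the standard library can handle the compression and boto3 the upload. This is just a sketch, with the same '<table>' placeholder as above and a made-up bucket name and key:

import gzip
import shutil

import boto3

# Compress the combined file; the import supports GZIP input directly.
with open('<table>.json', 'rb') as src, gzip.open('<table>.json.gz', 'wb') as dst:
    shutil.copyfileobj(src, dst)

# Upload the compressed file under the prefix the import will read from.
s3 = boto3.client('s3')
s3.upload_file('<table>.json.gz', 'my-import-bucket', 'dynamodb-import/<table>.json.gz')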
