Sii
BlogerSii
Amazon QLDB – database similar to blockchain

Hard development 2 October 2023

Amazon QLDB – database similar to blockchain

Amazon Quantum Ledger Database (Amazon QLDB) is a NoSQL database that aims to provide the following features:

transparency,
immutability,
cryptographically verifiable transaction log.

It allows data to be stored in a secure way that guarantees it cannot be tampered with, as all changes to the database are easily traceable and these traces cannot be removed. This makes QLDB the perfect database when audit trails are required.

The database, called a ledger in QLDB, stores documents that are the counterparts of table rows from relational databases. A full history is available for each of the documents so that any changes made to them can be easily traced. The data can be queried and manipulated using SQL-like language, making it easier to understand how to use QLDB if you come from the world of SQL databases.

In the article you will find information about the journal and its structure, querying and modifying data, examples, legal compliance and the pros and cons of this solution.

Journal

A key component of QLDB is a journal, which is structurally similar to a transaction log in a traditional SQL database, in that all data changes to the database are written to the journal first, in sequential order. However, the journal is append-only, meaning that data can only be inserted into the journal and cannot be deleted or modified. Therefore, the journal guarantees the transparency of the database and is one of the key features leading to the immutability of the database.

The journal is the source of truth in the database. All the data that users get from it comes from this component, and all data changes are recorded there before they are available to users. It also handles concurrency and guarantees the correct (sequential) order of data changes.

It is possible to retrieve data directly from the journal so that it can be exported if required. Another feature provided by QLDB is QLDB Stream, which can deliver the data to Kinesis Data Streams for processing in various ways, such as triggering Lambda functions. This results in near real-time processing of data changes introduced in QLDB.

Structure of the journal

QLDB’s journal is supported by cryptography. Its structure is very similar to that of blockchains. It consists of sequential blocks that are linked together as a chain. Each block consists of user data (actual content), metadata and queries executed in the transaction that committed the block.

The journal block is hashed using the SHA-256 cryptographic function, and its hash is used to calculate the hash of the next block. In this way, if one of the committed blocks is modified, its hash becomes incorrect and the hashes of the next blocks become incorrect as well. So, the verification of the whole chain will eventually fail. This is how the database provides the cryptographically verifiable transaction log.

The structure of the journal is illustrated in the following diagram.

Fig. 1 The structure of the journal in Amazon QLDB

Each block stores many hashes. The most important are:

block’s hash which uniquely identifies the block – calculated as the hash of the concatenation of the previous block’s hash and the other hashes in the block (such as hashes of queries and documents),
previous block’s hash – the “Block’s hash” value of the previous block. This is how the block is linked to the previous block.

Querying and modifying data

Each time new data is written to the journal, it is materialised into a table, the name of which is specified by the user when inserting the data, just as in a normal SQL INSERT query. The materialised table provides views that allow easy retrieval of the current state of the data as well as the history of the data.

Data can be retrieved from the tables and views using a special SQL-like language called PartiQL, which is maintained by Amazon. This language can also be used to insert, update and delete data. Note that any queries that modify the data in any way do not directly modify the data in the tables. Instead, they are appended to the journal as new events, as described in the Journal section.

Each change to the same document results in the creation of a new document revision. Document revisions can be used to check the history of the document.

The data returned by PartiQL from QLDB is in a JSON-like format, which is actually an extension of JSON. The format is called Amazon Ion.

Example

Create tables

First, we can create tables using PartiQL like this:

CREATE TABLE BankAccount;
CREATE TABLE Customer;

There is no need to provide any kind of schema because QLDB does not enforce any schema, just like many other NoSQL databases.

Create indexes

We can also add some indexes to the created tables that will improve the performance of querying the data from them.Indexes are used whenever a table is queried to find a document in it using WHERE operator with an equality operator on an indexed field. If data is queried without using an index, a full table scan is performed, which significantly reduces performance and increases the cost of the operation.

There is one default index for each table – the document’s ID, which is automatically generated for each document and is available in the document’s metadata. The indexes can be added like this:

CREATE INDEX ON BankAccount (AccountNumber);
CREATE INDEX ON Customer (IdentityNumber);

Insert sample data

We can insert some data into the tables this way:

INSERT INTO BankAccount
<< {
  'AccountNumber': '123456789',
  'AccountType': 'Saving',
  'Customer': { 'IdentityNumber': 'Customer1' }
},
{
  'AccountNumber': '456789123',
  'AccountType': 'Checking',
  'Customer': { 'IdentityNumber': 'Customer2' }
} >>;

INSERT INTO Customer
<< {
  'IdentityNumber': 'Customer1',
  'Name': 'John',
  'Surname': 'Doe'
},
{
  'IdentityNumber': 'Customer2',
  'Name': 'Doris',
  'Surname': 'Lin'
} >>;

Query data

Basically, we can retrieve data just like in SQL databases – using SELECT operator, e.g.:

SELECT * FROM Customer AS c
WHERE c.Name = 'John';

This query returns details of the customer with the name John:

[
  {
    "IdentityNumber": "Customer1",
    "Name": "John",
    "Surname": "Doe"
  }
]

We can also join data from two tables like this:

SELECT c.Name, c.Surname, b.AccountNumber
FROM Customer AS c, BankAccount as b
WHERE c.IdentityNumber = b.Customer.IdentityNumber;

This query returns customer names along with their account numbers:

[
  {
    "Name": "John",
    "Surname": "Doe"
    "AccountNumber": "123456789"
  },
  {
    "Name": "Doris",
    "Surname": "Lin"
    "AccountNumber": "456789123"
  }
]

Modify data

We can update data like in SQL databases – using UPDATE operator, e.g.:

UPDATE BankAccount AS b
SET b.AccountType = 'SavingAccount'
WHERE b.AccountType = 'Saving';

Returned data looks like this:

[
  {
    "documentId": "84MQvUwiL6I3QvjowvCW6x"
  }
]

which is an array with IDs of the new document versions (updated ones).

View block’s metadata – committed view

We can query the committed view of the table to get more table details. The view returns documents from the system-defined table directly corresponding to the queried user table. Thus, the view consists of a list of documents from the user table in an extended version consisting of the following attributes:

the current data in the table – the same as returned by the regular SELECT on the table presented above,
the block’s address,
the block’s hash,
the block’s metadata describing the document and the transaction that committed the block.

Such queries are typically run to retrieve metadata of documents, in particular system-generated identifiers of the latest revisions of documents stored in a user table or the time at which they were stored (the transaction’s time). Querying a committed view of a table is as simple as modifying the regular SELECT query on the table by adding the _ql_committed_ prefix to the table name, e.g.:

SELECT * FROM _ql_committed_BankAccount;

In general, the _ql_ prefix is reserved for system objects in QLDB and committed views are an example of such objects.

The query returns a response that is an array of two elements, because there are two different documents in the BankAccount table:

[
  {
    blockAddress: {
      strandId: "IXlQaOgH5XSBHHptx1Anyl",
      sequenceNo: 83
    },
    hash: {{LZEMR8w8dgk/XIC/U1VDR5C3b/P2YBnwSvqlSOn4GVg=}},
    data: {
      AccountNumber: "123456789",
      Customer: {
        IdentityNumber: "Customer1"
      },
      AccountType: "SavingAccount"
    },
    metadata: {
      id: "84MQvUwiL6I3QvjowvCW6x",
      version: 1,
      txTime: 2023-08-29T23:05:23.544Z,
      txId: "42BHOyJI6wQCZ6orjQ0y4H"
    }
  },
  {
    blockAddress: {
      strandId: "IXlQaOgH5XSBHHptx1Anyl",
      sequenceNo: 31
    },
    hash: {{UG+YfHl3hwZofthfoPbWvG5c7k4agKjqtQr3rFYt0z8=}},
    data: {
      AccountNumber: "456789123",
      AccountType: "Checking",
      Customer: {
        IdentityNumber: "Customer2"
      }
    },
    metadata: {
      id: "BvtXPJzyhL0DNT262S6hVk",
      version: 0,
      txTime: 2023-08-29T22:47:34.673Z,
      txId: "72U8fHKd0YU4NI9l8Bg9ae"
    }
  }
]

View data history

We can also retrieve the data history for a document using the history function, which returns a view containing a list of document revisions with all block details, similar to the use of the _qldb_committed_ prefix usage, e.g.:

SELECT * FROM history(BankAccount)

The returned data this time consists of three elements – two documents originally inserted into the BankAccount table and an updated version of one of the documents:

[
  {
    blockAddress: {
      strandId: "IXlQaOgH5XSBHHptx1Anyl",
      sequenceNo: 31
    },
    hash: {{15G1wu8Pb3RWWOEsmPgEMP51qL89dC1q6PfxnV7q0+o=}},
    data: {
      AccountNumber: "123456789",
      AccountType: "Saving",
      Customer: {
        IdentityNumber: "Customer1"
      }
    },
    metadata: {
      id: "84MQvUwiL6I3QvjowvCW6x",
      version: 0,
      txTime: 2023-08-29T22:47:34.673Z,
      txId: "72U8fHKd0YU4NI9l8Bg9ae"
    }
  },
  {
    blockAddress: {
      strandId: "IXlQaOgH5XSBHHptx1Anyl",
      sequenceNo: 83
    },
    hash: {{LZEMR8w8dgk/XIC/U1VDR5C3b/P2YBnwSvqlSOn4GVg=}},
    data: {
      AccountNumber: "123456789",
      Customer: {
        IdentityNumber: "Customer1"
      },
      AccountType: "SavingAccount"
    },
    metadata: {
      id: "84MQvUwiL6I3QvjowvCW6x",
      version: 1,
      txTime: 2023-08-29T23:05:23.544Z,
      txId: "42BHOyJI6wQCZ6orjQ0y4H"
    }
  },
  {
    blockAddress: {
      strandId: "IXlQaOgH5XSBHHptx1Anyl",
      sequenceNo: 31
    },
    hash: {{UG+YfHl3hwZofthfoPbWvG5c7k4agKjqtQr3rFYt0z8=}},
    data: {
      AccountNumber: "456789123",
      AccountType: "Checking",
      Customer: {
        IdentityNumber: "Customer2"
      }
    },
    metadata: {
      id: "BvtXPJzyhL0DNT262S6hVk",
      version: 0,
      txTime: 2023-08-29T22:47:34.673Z,
      txId: "72U8fHKd0YU4NI9l8Bg9ae"
    }
  }
]

Legal compliance

Some data protection laws, such as the GDPR (General Data Protection Regulation) in the European Union, impose some restrictions and obligations on consumer rights, such as the right to be forgotten in the GDPR. The immutability of QLDB means that there is no way to comply with such a law, which effectively excludes this database from being used for such data. This significantly limits its usability.

To address this issue, an exception to the immutability rule has been added to QLDB – the data redaction operation. This operation permanently deletes an inactive document from the QLDB, making it possible to permanently delete some user data. However, it only removes the actual content (data provided by the user), leaving the block itself undeleted. So the block’s hash, sequence and metadata cannot be deleted and this type of data is stored permanently. In this way, it is possible to combine as much immutability as possible with legal compliance.

Usage analysis

Like any solution, QLDB has advantages and disadvantages. Among them are worth mentioning:

Benefits

immutability provides a secure way to store and track data and its changes,
fully auditable data integrity,
2-3 times faster than traditional blockchain frameworks due to the centralised nature of QLDB, as opposed to the decentralised nature of traditional blockchain, which eliminates the need for multi-party consensus.

Disadvantages

an overhead due to the need to store all data changes in a journal and materialise them in tables, resulting in additional storage and time required to perform operations compared to simpler databases,
a relatively new solution compared to many other databases, so the functionality is still limited, although it has already been significantly extended compared to earlier versions of QLDB.

In general, it may be a good idea to use QLDB where audit trailing is required and the ability to verify data integrity is a must. For less important data, where the extra layer of protection is not a priority, it may be better to use other databases that do not have the overhead of the extra protection.

Examples of use cases where QLDB can be applied:

financial transaction records,
transaction history in supply chain systems,
claims transaction history in insurance systems,
HR systems maintaining employee data.

More information

This article is only an overview and contains the most important information about QLDB. However, there are many more details and information that can be found in the QLDB Developer Guide.

Other knowledge sources:

A comparison of QLDB and relational databases to help those who are more familiar with relational databases understand QLDB concept.
The exact structure of the journal and its blocks, with a detailed example.
More details on the topic of QLDB verification process and the security provided by the cryptography in QLDB.

***

If you are interested in the subject of data warehouse, check out this article: “Implementation of the Modern Data Warehouse architecture in the Azure cloud“.

5/5 ( votes: 3)

Rating:

5/5 ( votes: 3)

Author

Paweł Woroniecki

He has been working at Sii for 3 years, programming mainly web applications in Java, but is no stranger to Angular, DevOps issues and software architecture areas. Currently a Software Engineer, who has also served as a Technical Leader

All articles written by the author

Cancel reply

You might also like

Hard development 26 July 2024

Don't miss out

Subscribe to our blog and receive information about the latest posts.

Get an offer

If you have any questions or would like to learn more about our offer, feel free to contact us.

Send your request Send your request

Natalia Competency Center Director

Join Sii

Find the job that's right for you. Check out open positions and apply.

Apply Apply

Paweł Process Owner

Who we are

What we offer

Industries

Career

Trainings

News & events

Blog

Amazon QLDB – database similar to blockchain

Journal

Structure of the journal

Querying and modifying data

Example

Create tables

Create indexes

Insert sample data

Query data

Modify data

View block’s metadata – committed view

View data history

Legal compliance

Usage analysis

More information

We're hiring!

Take part in our trainings

Explore Sii Poland's offer

Sii's news and events

Most popular tags

Leave a comment

Ensuring safe and ethical AI – the role of Guardrails in Llama 3

Hilla: optimal technology integration – Spring Boot and React in one place

Bluetooth Low Energy Basics – ‘The Story of King Harald in a Nutshell’

Modularity in Adobe Experience Manager

IEC 62304 Software Safety Classes vs. FDA Software Requirements

Data pseudonymization in Google Cloud

Migration to go_router — dev’s story

Price Tracker – a practical dive into serverless applications

Using ChatGPT for creating Spring framework applications (no hand programming)

Server Side Events – implementation and highlights

We're hiring!

Take part in our trainings

Explore Sii Poland's offer

Sii's news and events

Most popular tags

File upload error

No file was uploaded

Incorrect file size (max 5MB)

Invalid file format

Empty file

Processing...

Get an offer

Join Sii

Ta treść jest dostępna tylko w jednej wersji językowej. Nastąpi przekierowanie do strony głównej.

Ta treść jest dostępna tylko w jednej wersji językowej.
Nastąpi przekierowanie do strony głównej.