Exploring Write-Ahead Logs and Streaming SQLite to S3

May 25, 2022

I’ve been using SQLite in production for some of my personal projects for the last few months. It’s a simple and powerful tool for SaaS-type workloads, and because it’s embedded within the application it’s incredibly fast. One of the challenges is that it’s a single point of failure, which can make recovery and high availability a pain. Enter Litestream. Litestream streams SQLite to S3, and using it has solved a lot of these concerns for me. I’ve also been wanting to dive deep into write-ahead logs and how they get streamed to S3, which is what this post explores.

Consistency modes in databases

There are several modes for ensuring consistency in databases, including:

  1. Write-Ahead Logging (WAL)
  2. Journaling
  3. Two-Phase Commit
  4. Snapshot Isolation
  5. Multi-Version Concurrency Control (MVCC)

SQLite supports WAL and rollback journaling, and I wrote some simple implementations to understand how these modes work. The other modes, such as two-phase commit, snapshot isolation and MVCC, are more complex and geared towards distributed use cases. I’ll be doing a deep dive on those in another post.
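
As a quick aside, switching between journal modes in SQLite itself is a one-line pragma. Here’s a minimal sketch using Python’s built-in sqlite3 module (example.db is just a placeholder path):

import sqlite3

conn = sqlite3.connect('example.db')

# The default mode is a rollback journal; WAL switches the database to write-ahead logging
print(conn.execute('PRAGMA journal_mode').fetchone())      # typically ('delete',)
print(conn.execute('PRAGMA journal_mode=WAL').fetchone())  # ('wal',)

conn.close()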

Our in-memory database interface

from abc import ABC, abstractmethod


class KVStore(ABC):
    """Minimal key-value store interface with transactional commit/rollback."""

    @abstractmethod
    def __setitem__(self, key, value): ...

    @abstractmethod
    def __getitem__(self, key): ...

    @abstractmethod
    def commit(self): ...

    @abstractmethod
    def rollback(self): ...

Write-ahead log

Here’s an example implementation of a Python dictionary-backed key-value store with a write-ahead log.

import json
import os


class KVStoreWAL(KVStore):
    WAL_FILE = 'wal.log'

    def __init__(self):
        self.wal = {}
        self.data = {}
        # Recover state by replaying the log left over from a previous run
        if os.path.exists(self.WAL_FILE):
            with open(self.WAL_FILE, 'r') as f:
                for line in f:
                    entry = json.loads(line)
                    for key, value in entry.items():
                        if value is None:
                            # A null value in the log records a deletion
                            self.wal.pop(key, None)
                            self.data.pop(key, None)
                        else:
                            self.wal[key] = value
                            self.data[key] = value

        self.txn = []

    def __setitem__(self, key, value):
        # Writes are buffered in the current transaction until commit()
        self.txn.append((key, value))

    def __getitem__(self, key):
        return self.data.get(key)

    def commit(self):
        with open(self.WAL_FILE, 'a') as f:
            for key, value in self.txn:
                # Append to the log first (the "write-ahead" part), then apply in memory
                f.write(json.dumps({key: value}) + '\n')
                if value is not None:
                    self.data[key] = value
                    self.wal[key] = value
                else:
                    self.data.pop(key, None)
                    self.wal.pop(key, None)
        self.txn = []

    def rollback(self):
        self.txn = []

    def close(self):
        # Flush any pending changes so nothing buffered is lost on shutdown
        self.commit()

    def __del__(self):
        try:
            self.close()
        except Exception:
            pass

The rollback method simply resets the transaction list to an empty list, discarding any changes made within the transaction.
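
To see the behaviour end to end, here’s a small usage sketch (the keys and values are arbitrary). Writes only become visible and durable after commit, and a fresh instance recovers the committed state by replaying wal.log:

store = KVStoreWAL()
store['name'] = 'alice'
store['role'] = 'admin'
print(store['name'])   # None (assuming a fresh wal.log) - the write is still buffered
store.commit()
print(store['name'])   # 'alice' - applied in memory and appended to wal.log

store['role'] = None   # a None value marks a deletion
store.rollback()       # discard the pending deletion

# A new instance replays wal.log and recovers the committed state
recovered = KVStoreWAL()
print(recovered['role'])  # 'admin'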

Streaming WAL to S3

For a simplistic implementation, we can replace the disk I/O on wal.log with network calls that read from and write to S3 directly.

# In __init__, recover by reading the log from S3 instead of the local wal.log
self.s3 = boto3.client('s3')
response = self.s3.get_object(Bucket=self.bucket_name, Key=self.key_prefix + '/wal.log')
for line in response['Body'].read().decode().splitlines():
    entry = json.loads(line)
    self.wal.update(entry)
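
The commit path gets the same treatment. One wrinkle is that S3 objects can’t be appended to, so the simplest approach is to re-upload the whole log on every commit. Here’s a rough sketch along those lines, assuming the class keeps a log_lines buffer of every line written so far (initialised in __init__, e.g. from the lines read back during recovery):

def commit(self):
    for key, value in self.txn:
        # Record the change in the log buffer first, then apply it in memory
        self.log_lines.append(json.dumps({key: value}) + '\n')
        if value is not None:
            self.data[key] = value
            self.wal[key] = value
        else:
            self.data.pop(key, None)
            self.wal.pop(key, None)
    self.txn = []

    # S3 has no append operation, so re-upload the whole log object on each commit
    self.s3.put_object(
        Bucket=self.bucket_name,
        Key=self.key_prefix + '/wal.log',
        Body=''.join(self.log_lines).encode(),
    )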

This does add network overhead on every commit, but it keeps the example easy to understand for our purposes. Litestream does it differently: a background process asynchronously replicates changes to S3 every second. See: Data loss window.
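
To get a feel for that design, here’s a rough sketch of the same idea, not how Litestream actually works internally: a daemon thread periodically flushes the log to S3 so commits no longer block on the network. The flush_to_s3 helper and the one-second interval are assumptions for illustration:

import threading
import time


def start_replication(store, interval=1.0):
    """Periodically push the store's log to S3 from a background thread."""

    def replicate():
        while True:
            time.sleep(interval)
            # flush_to_s3 is a hypothetical helper that uploads the current
            # log buffer with put_object, like the commit sketch above
            store.flush_to_s3()

    thread = threading.Thread(target=replicate, daemon=True)
    thread.start()
    return thread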

Conclusion

Overall I’ve been pretty impressed with Litestream, and I had a fun time diving into write-ahead logs. The one thing that’ll make it even better is support for live replication. Being able to spin up multiple readers would be a huge win. There is an open thread for this: Write once - read many.


Written by Ganesh Iyer, a software engineer building platforms for leveraging artificial intelligence in healthcare.