Monday, June 12, 2023

Database replication using Confluent (Kafka) and Debezium

I have been playing with Confluent Cloud and Debezium for a little while and found them extremely useful for streaming data ingestion. The use cases I usually come across fall into the following two scenarios:

1. Use a Debezium CDC connector to generate change records to Kafka topics, then dump the change records to cloud storage or to a Delta Lake table (usually called the raw zone). You can subsequently consume these change records in your favorite data platform, such as Databricks or Snowflake, both of which have robust streaming ingestion support.
2. Alternatively, you may just want a copy of the production database for analytics usage, i.e. a like-for-like replication. You can use a JDBC sink connector for that. An additional benefit is that you can replicate data to a different target database platform, for example MySQL to SQL Server, Postgres to SQL Server, MySQL to Postgres, etc.
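
As a rough sketch, the two scenarios above map to a Debezium source connector plus (for scenario 2) a JDBC sink connector, both submitted to Kafka Connect as JSON configs. All hostnames, credentials, topic names, and database names below are placeholders; adjust them for your own Confluent/Kafka Connect environment. A Debezium MySQL source connector might look like:

```json
{
  "name": "mysql-source",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.example.com",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "********",
    "database.server.id": "184054",
    "topic.prefix": "prod",
    "database.include.list": "inventory",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-changes.inventory"
  }
}
```

For the like-for-like replication scenario, a JDBC sink connector can consume those topics and upsert into the target database. Debezium emits change-event envelopes, so the `ExtractNewRecordState` transform is typically needed to flatten them into plain rows before the sink writes them:

```json
{
  "name": "sqlserver-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "connection.url": "jdbc:sqlserver://sqlserver.example.com:1433;databaseName=analytics",
    "connection.user": "etl",
    "connection.password": "********",
    "topics": "prod.inventory.customers",
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    "transforms.unwrap.drop.tombstones": "false",
    "insert.mode": "upsert",
    "pk.mode": "record_key",
    "delete.enabled": "true",
    "auto.create": "true",
    "auto.evolve": "true"
  }
}
```

With `insert.mode` set to `upsert`, `pk.mode` set to `record_key`, and `delete.enabled` on, inserts, updates, and deletes in the source all propagate to the target, which is what makes the replica like-for-like.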
