Thursday, March 28, 2024

Elevating LLM Deployment with FastAPI and React: A Step-By-Step Guide

In a previous exploration, I delved into creating a Retrieval-Augmented Generation (RAG) demo, utilising Google’s Gemma model, Hugging Face, and Meta’s FAISS, all within a Python notebook. This demonstration showcased the potential to build a locally-run, RAG-powered application.

The conceptual flow of using RAG with LLMs. (Source)

This article aims to advance that groundwork by deploying the model and RAG functionality via FastAPI, then consuming the API through a straightforward ReactJS frontend. A notable enhancement in this iteration is the integration of the open-source Mistral 7B model and the Chroma vector database. The Mistral 7B model is acclaimed for its optimal balance between size and performance, surpassing the Llama 2 13B model across benchmarks and matching the prowess of Google’s Gemma model. Continue here
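As a flavour of the approach, here is a minimal sketch of the FastAPI side of such a service, assuming a sentence-transformers embedder, a local Chroma collection, and a Hugging Face text-generation pipeline wrapping Mistral 7B; the endpoint, model identifiers and prompt format are illustrative rather than lifted from the full article:

```python
# A rough sketch only: model names, collection names and prompt format are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel
import chromadb
from sentence_transformers import SentenceTransformer
from transformers import pipeline

app = FastAPI()

# Embedding model and vector store (assumed setup; the full article may differ).
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
chroma = chromadb.PersistentClient(path="./chroma_db")
collection = chroma.get_or_create_collection("rag-docs")

# Text-generation pipeline wrapping Mistral 7B (needs sufficient local resources).
generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

class Query(BaseModel):
    question: str

@app.post("/query")
def query_rag(q: Query):
    # Embed the question and retrieve the closest chunks from Chroma.
    embedding = embedder.encode(q.question).tolist()
    hits = collection.query(query_embeddings=[embedding], n_results=3)
    context = "\n".join(hits["documents"][0])

    # Stuff the retrieved context into the prompt and generate an answer.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {q.question}\nAnswer:"
    answer = generator(prompt, max_new_tokens=256)[0]["generated_text"]
    return {"answer": answer, "context": context}
```

The React frontend then only needs to POST a question to the /query endpoint and render the returned answer.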


Friday, March 1, 2024

Streamlining Real-Time CDC and Data Replication with Debezium and Kafka

 In today’s fast-paced digital landscape, efficient data management and replication are more critical than ever. This article walks you through setting up a streamlined, real-time Change Data Capture (CDC) and data replication pipeline using Debezium and Kafka. We’ll leverage Docker Compose for a simplified testing environment, avoiding the complexities of server provisioning.

For those considering cloud-based solutions, options like Confluent Cloud offer a Kafka service with a free trial. Alternatively, Azure Event Hubs or AWS’s Managed Kafka services provide robust platforms for handling large-scale data streams.
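In a self-managed setup like the Docker Compose environment above, the Debezium source connector is registered against the Kafka Connect REST API. A rough illustration in Python follows; hostnames, credentials and table names are placeholders, and the exact property names vary slightly between Debezium versions:

```python
# Illustrative only: hostnames, credentials and table names are placeholders.
import json
import requests

debezium_source = {
    "name": "inventory-connector",
    "config": {
        # Debezium Postgres source connector; swap the class for MySQL, SQL Server, etc.
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname": "inventory",
        "topic.prefix": "dbserver1",            # CDC topic prefix (Debezium 2.x naming)
        "table.include.list": "public.orders",  # capture only the tables you need
    },
}

# Kafka Connect exposes a REST API, on port 8083 in most Docker Compose setups.
resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(debezium_source),
)
resp.raise_for_status()
print(resp.json())
```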

The full article can be read here: https://medium.com/@george.vane/streamlining-real-time-cdc-and-data-replication-with-debezium-and-kafka-b4d3bc56e2ab

Monday, June 12, 2023

Database replication using Confluent (Kafka) and Debezium

I have been playing with Confluent Cloud and Debezium for a little while and found them extremely useful for streaming data ingestion. The usual use cases I came across include the following two scenarios:

1. Use the Debezium CDC connector to generate change records into Kafka topics, then dump the change records to either cloud storage or Delta Lake; this is usually called the raw zone. You can subsequently consume these change records in your favourite data platform, such as Databricks or Snowflake, both of which have robust streaming ingestion support.
2. Alternatively, often you just want a copy of the production database for analytics usage, so a like-for-like replication is what you need; you can use the JDBC sink connector for that. The additional benefit is that you can replicate data to a different target database platform, for example MySQL to SQL Server, Postgres to SQL Server, MySQL to Postgres, etc.
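For the second scenario, the JDBC sink connector configuration might look roughly like the sketch below; the connection URL, topic name and key settings are placeholders, and the unwrap transform is there because Debezium wraps each row change in an event envelope that most sink connectors need flattened first:

```python
# Illustrative only: connection details, topic names and key settings are placeholders.
import json
import requests

jdbc_sink = {
    "name": "jdbc-sink-orders",
    "config": {
        # Confluent JDBC sink connector writing the change stream into SQL Server.
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "topics": "dbserver1.public.orders",
        "connection.url": "jdbc:sqlserver://sqlserver:1433;databaseName=replica",
        "connection.user": "sa",
        "connection.password": "change-me",
        "auto.create": "true",       # create the target table if it does not exist
        "insert.mode": "upsert",     # upserts keep the copy in step with the source
        "delete.enabled": "true",    # propagate deletes from the source
        "pk.mode": "record_key",
        # Flatten Debezium's change-event envelope into plain rows before writing.
        "transforms": "unwrap",
        "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
        "transforms.unwrap.drop.tombstones": "false",
    },
}

# Register the sink on the Kafka Connect worker (default REST port 8083).
requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(jdbc_sink),
).raise_for_status()
```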

Friday, May 5, 2023

Migrating IBM DB2 to Google Bigtable and achieving FIPS-compliant encryption using a custom Java encryption library

This is about a project I undertook recently. The purpose was to migrate a large volume of on-prem DB2 data to Google Bigtable using Dataproc, a Spark-based solution on Google Cloud. A few things were notable from the project:

1. I had to use Scala to develop the solution because the encryption library was developed in Java, and although it has interoperability with Python, it has a lot of limitations that stopped me from using Python... on the other hand, Scala and Java just work together seamlessly.

2. For FIPS compliance, I had to use the Bouncy Castle library, which introduced issues in managing dependencies ("dependency hell", as some call it); in the end I had to use Maven to manage dependencies and shading, rather than sbt, due to the complexity.

3. I used the hbase-spark connector for talking to Bigtable. Since I am using Spark 3, I had to compile the connector library manually; see https://github.com/apache/hbase-connectors/tree/master/spark (a rough sketch of the write path is shown below).

(this project was done about a year ago)
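The real solution was written in Scala for the reasons above, but purely as an illustration of the data path, here is a PySpark-flavoured sketch. It assumes the manually compiled hbase-spark connector and the Bigtable HBase client are on the Spark classpath and that hbase-site.xml points at the Bigtable instance; connection details, table and column names are made up:

```python
# Illustrative sketch only; the real project was implemented in Scala.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("db2-to-bigtable-sketch").getOrCreate()

# Read the source table from DB2 over JDBC (connection details are placeholders).
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:db2://db2-host:50000/SAMPLE")
    .option("dbtable", "SCHEMA1.CUSTOMERS")
    .option("user", "db2user")
    .option("password", "change-me")
    .option("driver", "com.ibm.db2.jcc.DB2Driver")
    .load()
)

# Encryption of sensitive columns via the custom Java library would happen here (omitted).

# Write to Bigtable through the HBase Spark data source: map DataFrame columns
# onto an HBase row key and column family (names are made up).
(
    df.write.format("org.apache.hadoop.hbase.spark")
    .option("hbase.columns.mapping",
            "CUSTOMER_ID STRING :key, NAME STRING cf:name, EMAIL STRING cf:email")
    .option("hbase.table", "customers")
    .option("hbase.spark.use.hbasecontext", "false")
    .save()
)
```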

Handling Large Messages With Apache Kafka

While working on handling large messages with Kafka, I came across a few useful reference articles, bookmarking here for anyone who needs them:

https://dzone.com/articles/processing-large-messages-with-apache-kafka

https://www.morling.dev/blog/single-message-transforms-swiss-army-knife-of-kafka-connect/

https://www.kai-waehner.de/blog/2020/08/07/apache-kafka-handling-large-messages-and-files-for-image-video-audio-processing/

https://docs.confluent.io/cloud/current/connectors/single-message-transforms.html#cc-single-message-transforms-limitations

Tuesday, April 4, 2023

Object Tracking Demo

 


In a proof-of-concept project I undertook a while ago, the YOLO (You Only Look Once) object detection model was used in combination with the Deep SORT (Simple Online and Realtime Tracking) algorithm to track objects in real time. The aim was to showcase how this technology can be applied to traffic monitoring, specifically measuring the time it takes for vehicles to pass through a road junction. The project demonstrated that the combination of these technologies can accurately detect and track vehicles as they move through a monitored area, and the results showed that this method can provide accurate data on traffic flow, which could be useful for traffic management and infrastructure planning purposes.
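For anyone wanting to reproduce something similar, here is a rough sketch of how such a pipeline can be wired together, assuming the ultralytics and deep_sort_realtime packages; the original demo may have used different YOLO and Deep SORT implementations, and the video path, class filter and timing logic here are illustrative only:

```python
# Illustrative sketch; package choices and the timing logic are assumptions,
# not necessarily what the original demo used.
import cv2
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort

model = YOLO("yolov8n.pt")          # pretrained YOLO detector
tracker = DeepSort(max_age=30)      # Deep SORT keeps track IDs across frames

cap = cv2.VideoCapture("junction.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 25.0

first_seen, last_seen = {}, {}      # frame index of first/last sighting per track ID
frame_idx = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Detect vehicles (COCO classes 2, 3, 5, 7 = car, motorbike, bus, truck).
    detections = []
    for box in model(frame, verbose=False)[0].boxes:
        cls = int(box.cls[0])
        if cls in (2, 3, 5, 7):
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            conf = float(box.conf[0])
            # Deep SORT expects ([left, top, width, height], confidence, class).
            detections.append(([x1, y1, x2 - x1, y2 - y1], conf, cls))

    # Update the tracker and record when each track ID is seen.
    for track in tracker.update_tracks(detections, frame=frame):
        if not track.is_confirmed():
            continue
        tid = track.track_id
        first_seen.setdefault(tid, frame_idx)
        last_seen[tid] = frame_idx

    frame_idx += 1

cap.release()

# Time each tracked vehicle spent in view (a proxy for junction transit time).
for tid in first_seen:
    seconds = (last_seen[tid] - first_seen[tid]) / fps
    print(f"vehicle {tid}: {seconds:.1f}s in view")
```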


Thursday, December 1, 2022

The hidden cost of using a metadata-driven ingestion framework

I have come across a few implementations of metadata-driven ELT/ETL frameworks, designed and developed by some big consulting firms. Perhaps great minds think alike: they all use hashed values to detect changed records, in order to avoid duplication and insert only the changes. This indeed saves a lot of effort and simplifies the design.

However, one of the major drawbacks of this approach is that it often makes optimization impossible. For example, in Delta Lake, a MERGE INTO statement whose join condition is based only on the hashed key makes the operation extremely expensive: it rules out partition pruning, so partitioning is neither useful nor possible, and it often ends up with a high running cost.

It would probably make more sense to also define in the metadata a partition key and Z-ORDER columns alongside the key columns used to identify uniqueness. The Delta tables would then be created with the optional partition key and Z-ORDER columns, and the MERGE INTO statement updated to use join conditions based on the key columns, so that it can at least do partition pruning. Usually the partition key should be part of the key columns and in most cases is a date column; optionally defining the Z-ORDER columns to be the business key, etc., will also help.
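A sketch of what such a merge could look like on Delta Lake, with made-up table and column names (event_date as the partition key, business_key as the uniqueness and Z-ORDER column, row_hash as the framework's change-detection hash):

```python
# Illustrative Spark SQL on Delta Lake; table and column names are made up.
from pyspark.sql import SparkSession

# Assumes a Delta-enabled Spark session (on Databricks, `spark` already exists).
spark = SparkSession.builder.getOrCreate()

merge_sql = """
MERGE INTO target t
USING staged_changes s
  ON  t.event_date   = s.event_date      -- partition key in the join enables pruning
  AND t.business_key = s.business_key    -- key columns identify the row
WHEN MATCHED AND t.row_hash <> s.row_hash THEN
  UPDATE SET *                           -- the hash only decides whether to update
WHEN NOT MATCHED THEN
  INSERT *
"""
spark.sql(merge_sql)

# Keep files clustered on the business key so matched-row lookups stay cheap.
spark.sql("OPTIMIZE target ZORDER BY (business_key)")
```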
