Thursday, March 28, 2024

Elevating LLM Deployment with FastAPI and React: A Step-By-Step Guide

In a previous exploration, I walked through building a Retrieval-Augmented Generation (RAG) demo, utilising Google’s Gemma model, Hugging Face, and Meta’s FAISS, all within a Python notebook. That demonstration showcased how a locally run, RAG-powered application can be built.

The conceptual flow of using RAG with LLMs. (Source)

This article builds on that groundwork by deploying the model and the RAG functionality via FastAPI, and then consuming the API through a straightforward ReactJS frontend. A notable enhancement in this iteration is the integration of the open-source Mistral 7B model and the Chroma vector database. Mistral 7B is acclaimed for its balance of size and performance, outperforming Llama 2 13B across benchmarks and rivalling Google’s Gemma model. Continue here
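To give a taste of the FastAPI layer, here is a minimal sketch of a RAG query endpoint backed by Chroma and a Hugging Face pipeline. The model identifier, collection name, and prompt format below are illustrative assumptions, not the exact code from the full write-up.

from fastapi import FastAPI
from pydantic import BaseModel
import chromadb
from transformers import pipeline

app = FastAPI()

# Assumed local Chroma store and collection name; adjust to match your ingestion step.
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("documents")

# Assumed Hugging Face model id for Mistral 7B Instruct; loading it needs substantial memory or a GPU.
generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

class Query(BaseModel):
    question: str

@app.post("/ask")
def ask(query: Query):
    # Retrieve the most relevant chunks, then ground the generation on them.
    results = collection.query(query_texts=[query.question], n_results=3)
    context = "\n".join(results["documents"][0])
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query.question}\nAnswer:"
    )
    output = generator(prompt, max_new_tokens=256, return_full_text=False)
    return {"answer": output[0]["generated_text"]}

Served with uvicorn (for example, uvicorn main:app if the file is saved as main.py, a hypothetical filename), this exposes a POST /ask endpoint that the React frontend can call with a JSON body such as {"question": "..."}.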


Friday, March 1, 2024

Streamlining Real-Time CDC and Data Replication with Debezium and Kafka

In today’s fast-paced digital landscape, efficient data management and replication are more critical than ever. This article walks you through setting up a streamlined, real-time Change Data Capture (CDC) and data replication pipeline using Debezium and Kafka. We’ll leverage Docker Compose for a simplified testing environment, avoiding the complexities of server provisioning.

For those considering cloud-based solutions, options like Confluent Cloud offer a managed Kafka service with a free trial. Alternatively, Azure Event Hubs and Amazon MSK (Managed Streaming for Apache Kafka) provide robust platforms for handling large-scale data streams.
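To give a flavour of the consuming side of such a pipeline, below is a minimal Python sketch that reads Debezium change events from a Kafka topic. The broker address and the dbserver1.inventory.customers topic name follow the standard Debezium tutorial conventions and are assumptions here, not the exact configuration from the article.

import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "dbserver1.inventory.customers",        # assumed Debezium topic: <server>.<schema>.<table>
    bootstrap_servers="localhost:9092",     # assumed broker exposed by the Docker Compose setup
    value_deserializer=lambda v: json.loads(v) if v else None,
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    if event is None:
        continue  # tombstone record that Debezium emits after a delete
    payload = event.get("payload", event)
    # "op" is c (create), u (update) or d (delete); "after" holds the new row state.
    print(payload.get("op"), payload.get("after"))

Each captured change arrives as a structured event, so the same loop can feed a replica, a cache, or any other downstream consumer.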

The full article can be read here: https://medium.com/@george.vane/streamlining-real-time-cdc-and-data-replication-with-debezium-and-kafka-b4d3bc56e2ab
