
Features:
- Handles multiple file types (PDF, Word, Excel, etc.)
- Interactive AI-powered file chat
- LangChain integration
- Easy-to-use Streamlit UI
- Deployed on Hugging Face
Technology Used:
Python, LangChain, Hugging Face, Streamlit, Gemma 2, Google Colab, Vector DB (Chroma), Embeddings
RAG File Chat Application
An advanced AI application for interacting with multiple file types using Retrieval-Augmented Generation (RAG).
Project Overview
This project started with learning RAG on Colab by following a YouTube tutorial by @Daniel Bourke, in which I built the foundational RAG architecture from scratch.
I then extended it with LangChain to simplify the workflow and to support multiple files, following another tutorial.
Finally, I added a user-friendly Streamlit UI and deployed the application on Hugging Face Spaces for public access.
The application lets users upload various file types (PDFs, Word documents, plain-text and Markdown files) and chat with them. Relevant passages are retrieved from the uploaded files and passed to the LLM as context, so answers stay grounded in the documents.
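To make the retrieve-then-generate idea concrete, here is a minimal sketch in plain Python. It is not the application's code: the word-count "embedding" stands in for a real embedding model, the in-memory list stands in for the vector store, and the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical. The final LLM call is left out; the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical document chunks, for illustration only.
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The office is closed on public holidays.",
    "Late invoices incur a 2 percent fee.",
]
top = retrieve("When are invoices due?", chunks, k=2)
prompt = build_prompt("When are invoices due?", top)
print(prompt)
```

In a real pipeline the same three steps remain (embed, retrieve, prompt), with a learned embedding model and a vector database replacing the toy parts.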
Project Version Repositories
- RAG: From Scratch for 1 PDF: GitHub Link
- Conversion to Langchain Multiple Files: GitHub Link
- Deployment on Hugging Face Spaces with Streamlit UI: Hugging Face Spaces Link
Development Steps
- Started on Colab: Followed a YouTube tutorial to create a basic RAG implementation from scratch.
- Integrated LangChain: Simplified file handling and AI pipelines using LangChain.
- Built a Streamlit UI: Designed an interactive web interface for easy user interaction.
- Deployment: Hosted the final application on Hugging Face Spaces for public accessibility.
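A from-scratch RAG build typically begins by cutting each document into overlapping chunks before embedding them. The write-up does not say how this project splits text, so the following is only a sketch under assumptions: character-based windows, with a hypothetical `split_text` helper and arbitrary `chunk_size` and `overlap` values.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# Small values so the overlap is easy to see.
pieces = split_text("abcdefghijklmnopqrstuvwxy", chunk_size=10, overlap=4)
print(pieces)
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of some duplicated text in the index.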
Features
- Supports Multiple File Types: Chat with PDFs, Word, Excel, and more.
- AI-Powered Conversations: Leverages retrieval-augmented generation for context-aware interactions.
- LangChain Integration: Streamlines data ingestion and query handling.
- Streamlit-Based UI: Easy-to-use and visually appealing interface.
- Deployed for Access: Available on Hugging Face Spaces.
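One common way to support several file types is to pick a loader based on the file extension. The sketch below assumes that approach and is not taken from the project: `LOADERS`, `read_plain_text`, and `load_document` are hypothetical names, and only plain-text and Markdown loaders are shown because PDF, Word, and Excel parsing would need third-party libraries.

```python
from pathlib import Path


def read_plain_text(path: Path) -> str:
    """Loader for formats that are already plain text."""
    return path.read_text(encoding="utf-8")


# Hypothetical registry: file extension -> loader function.
# Loaders for .pdf, .docx, or .xlsx would be registered here the same way.
LOADERS = {
    ".txt": read_plain_text,
    ".md": read_plain_text,
}


def load_document(path) -> str:
    """Return the text of a file, choosing the loader by extension."""
    path = Path(path)
    loader = LOADERS.get(path.suffix.lower())
    if loader is None:
        raise ValueError(f"Unsupported file type: {path.suffix or '(none)'}")
    return loader(path)
```

Keeping the registry as a dictionary means a new format is added by registering one function, without touching `load_document`.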
Technology Used
- Python: Core programming language.
- LangChain: Simplifies RAG workflows and query handling.
- Streamlit: For building a user-friendly web UI.
- Hugging Face Spaces: Deployment platform.
- Google Colab: Initial development and experimentation platform.
- Gemma 2: LLM used for file understanding and response generation.