🔍 ShopLens AI

Multimodal RAG for E-Commerce Product Assistant

Upload any product image → Get specs, find alternatives, compare prices
Powered by vision LLMs, semantic search, and conversational AI — all in one app.

📌 Overview

ShopLens AI is a production-ready multimodal RAG (Retrieval-Augmented Generation) system that combines computer vision, vector search, and large language models into an e-commerce shopping assistant.

A user uploads a product photo — a pair of sneakers, a laptop, headphones — and the system:

Understands the image using the Llama 4 Scout vision model
Finds similar products using hybrid semantic + keyword search over a FAISS vector index
Answers questions in natural language using ChatGroq (LLaMA 3.3 70B)
Remembers context across a multi-turn conversation

Think of it as ChatGPT + Google Lens + Amazon Search combined into one interface.

🏗️ Architecture

User uploads product image
         │
         ▼
┌─────────────────────┐
│     vision.py       │  Llama 4 Scout (Groq) → Rich product description
│  Image → Text       │  Caches results by image hash
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│    retriever.py     │  FAISS semantic search
│  Hybrid Search      │  + BM25 keyword search
│  (RRF Fusion)       │  → Reciprocal Rank Fusion
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│      model.py       │  ChatGroq LLaMA 3.3 70B
│  RAG Generation     │  + Conversation memory
│                     │  → Natural language answer
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│      app.py         │  Streamlit dark UI
│  Web Interface      │  Multi-turn chat + product cards
└─────────────────────┘

🚀 Tech Stack

Component	Technology	Purpose
Vision LLM	Llama 4 Scout 17B (Groq)	Image → product description
Chat LLM	LLaMA 3.3 70B (Groq)	Answer generation
Embeddings	HuggingFace `all-MiniLM-L6-v2`	Text → vectors
Vector DB	FAISS (CPU)	Semantic similarity search
Keyword Search	BM25 (rank-bm25)	Exact brand/model matching
Fusion	Reciprocal Rank Fusion	Hybrid retrieval ranking
Framework	LangChain 0.3	LLM orchestration
UI	Streamlit	Web interface

📁 Project Structure

rag_prod/
│
├── app.py           → Streamlit web UI (dark premium theme)
├── pipeline.py      → Main orchestrator + ChatSession (multi-turn)
│
├── vision.py        → Image loading, format conversion, Groq vision call
├── retriever.py     → Hybrid FAISS + BM25 search with RRF fusion
├── embeddings.py    → HuggingFace model + FAISS index build/load/search
├── model.py         → ChatGroq LLM — answer generation + comparison table
│
├── catalog.py       → Product data + searchable text builder
├── ingest.py        → One-time index builder CLI
├── config.py        → All settings loaded from .env
│
├── requirements.txt
├── .env.example
└── README.md

⚡ Quick Start

1. Clone the repository

git clone https://github.com/yourusername/shoplens-ai.git
cd shoplens-ai/rag_prod

2. Create a virtual environment

python -m venv freshenv

# Windows
freshenv\Scripts\activate

# Mac / Linux
source freshenv/bin/activate

3. Install dependencies

pip install --upgrade pip
pip install -r requirements.txt

⚠️ NumPy note: If you get a NumPy 2.x compatibility error with FAISS, run:
pip install "numpy<2" && pip install faiss-cpu --force-reinstall

4. Set up API key

cp .env.example .env

Edit .env and add your Groq API key:

GROQ_API_KEY=gsk_your_actual_key_here

Get a free key at console.groq.com/keys — no credit card required.

5. Build the vector index

python ingest.py

Expected output:

INFO - Building FAISS index from 8 documents...
INFO - Index saved to faiss_product_index ✅
✅ Done: 8 products indexed and ready.

6. Launch the app

streamlit run app.py

Open http://localhost:8501 in your browser.

🎯 Features

Core Pipeline

Multimodal input — Upload product images (JPG, PNG, WEBP, AVIF auto-converted)
Vision understanding — Llama 4 Scout extracts category, brand, color, features, use case
Hybrid retrieval — FAISS semantic search + BM25 keyword search fused with RRF
Conversational AI — Multi-turn chat with windowed conversation memory
Price filtering — "under $300" auto-detected from natural language
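
As an illustration of the price-filtering feature, the sketch below pulls a price ceiling out of a query with a regular expression and applies it to a list of product dicts. The pattern, the function names `extract_max_price` and `filter_by_price`, and the choice to treat the ceiling as inclusive are assumptions made for this example; the project's actual detection logic may differ.

```python
import re
from typing import Dict, List, Optional

# Hypothetical pattern: matches phrases such as "under $300", "below 250",
# or "less than $1,200.50" and captures the number.
_PRICE_CAP = re.compile(
    r"\b(?:under|below|less than|cheaper than)\s*\$?\s*(\d[\d,]*(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_max_price(query: str) -> Optional[float]:
    """Return the price ceiling mentioned in the query, or None if there is none."""
    match = _PRICE_CAP.search(query)
    if match is None:
        return None
    # Drop thousands separators before converting to a number.
    return float(match.group(1).replace(",", ""))


def filter_by_price(products: List[Dict], max_price: Optional[float]) -> List[Dict]:
    """Keep products whose "price" is at or below max_price (no-op when max_price is None)."""
    if max_price is None:
        return products
    return [p for p in products if p["price"] <= max_price]
```

For example, `extract_max_price("running shoes under $300")` yields `300.0`, which can then be passed to `filter_by_price` together with the retrieved products.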

Advanced Retrieval

Reciprocal Rank Fusion (RRF) — Merges semantic and keyword rankings without score normalization
Image description caching — MD5 hash-based cache avoids redundant Vision API calls
Format normalization — AVIF, BMP, TIFF automatically converted to JPEG before processing
Post-retrieval filtering — Metadata filters applied after embedding search
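
The image description cache can be pictured roughly as follows: hash the raw image bytes with MD5 and only call the vision model when that hash has not been seen before. This is a minimal in-memory sketch; the names `image_hash` and `describe_image_cached`, and the `describe` callable standing in for the Groq vision call, are hypothetical and not taken from `vision.py`.

```python
import hashlib
from typing import Callable, Dict

# In-memory cache keyed by the MD5 digest of the raw image bytes.
# (Assumption: the real cache may live on disk or in session state instead.)
_description_cache: Dict[str, str] = {}


def image_hash(image_bytes: bytes) -> str:
    """Return the hex MD5 digest used as the cache key."""
    return hashlib.md5(image_bytes).hexdigest()


def describe_image_cached(image_bytes: bytes, describe: Callable[[bytes], str]) -> str:
    """Return a cached description, calling `describe` only on a cache miss."""
    key = image_hash(image_bytes)
    if key not in _description_cache:
        _description_cache[key] = describe(image_bytes)
    return _description_cache[key]
```

MD5 is used here only as a fast content fingerprint, not for security; two uploads of the same file map to the same key, so the second upload skips the Vision API call.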

UI / UX

Dark premium theme — #0a0a0f background with purple/orange accent gradients
Product cards — Match score, price, category, and direct product links
Suggestion chips — One-click common queries
Comparison mode — Auto-detects "compare" intent and generates markdown tables
Session management — Clear conversation without page reload

💡 Key Concepts

Why Hybrid Search?

Search Type	Strength	Weakness
Semantic (FAISS)	Understands meaning — "running shoes" matches "athletic footwear"	Misses exact brand names
Keyword (BM25)	Finds exact matches — "Nike Air Max 270"	No semantic understanding
Hybrid (RRF)	Best of both worlds	—

Reciprocal Rank Fusion

RRF combines two ranked lists without requiring score normalization:

RRF_score(doc) = Σ 1 / (k + rank_i)

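where rank_i is the document's 1-based position in ranked list i (here, the semantic list and the keyword list; a document missing from a list contributes nothing for that list), and k=60 is the constant used in the original RRF paper.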
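
The formula above can be sketched in a few lines of Python. The function name `reciprocal_rank_fusion` and the use of plain product ID strings are assumptions for illustration; `retriever.py` may structure this differently.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank is its 1-based position in that list.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


semantic = ["P1", "P2", "P3"]   # e.g. FAISS results, best first
keyword = ["P3", "P1", "P4"]    # e.g. BM25 results, best first
fused = reciprocal_rank_fusion([semantic, keyword])
```

Because only ranks are used, the FAISS distances and BM25 scores never need to be put on a common scale. Documents that appear in both lists (`P1` and `P3` here) end up ahead of documents found by only one retriever.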

RAG Pipeline

Query = image_description + user_question
   ↓
Retrieve top-K products from vector store
   ↓
Format as context: [Product name, specs, price, link]
   ↓
LLM generates answer grounded in retrieved context
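
A rough sketch of the query-building and context-formatting steps is shown below. The function names, the exact context layout, and the prompt wording are all hypothetical; the retrieval and LLM calls are left out, and the product dicts are assumed to carry the `name`, `specs`, `price`, and `url` fields used in the catalog JSON format.

```python
from typing import Dict, List


def build_query(image_description: str, user_question: str) -> str:
    """Combine the vision model's description with the user's question."""
    return f"{image_description.strip()} {user_question.strip()}".strip()


def format_context(products: List[Dict]) -> str:
    """Render retrieved products as numbered context lines for the LLM."""
    lines = []
    for i, p in enumerate(products, start=1):
        lines.append(f"[{i}] {p['name']} | {p['specs']} | ${p['price']} | {p['url']}")
    return "\n".join(lines)


def build_prompt(context: str, user_question: str) -> str:
    """Assemble a grounded prompt (wording is illustrative only)."""
    return (
        "Answer using only the products listed below. "
        "If the answer is not in the list, say so.\n\n"
        f"Products:\n{context}\n\n"
        f"Question: {user_question}"
    )
```

Keeping the context as one product per line makes it easy for the model to cite a specific item, and the instruction to stay within the listed products is what grounds the answer in retrieved data.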

🔧 Configuration

All settings are in config.py, overridable via .env:

Variable	Default	Description
`GROQ_API_KEY`	—	Required. Groq API key
`GROQ_MODEL`	`llama-3.3-70b-versatile`	Chat LLM
`VISION_MODEL`	`meta-llama/llama-4-scout-17b-16e-instruct`	Vision LLM
`EMBEDDING_MODEL`	`sentence-transformers/all-MiniLM-L6-v2`	Embedding model
`TOP_K_RESULTS`	`5`	Products to retrieve
`EMBEDDING_DEVICE`	`cpu`	`cpu` or `cuda`
`LLM_TEMPERATURE`	`0.3`	Lower = more factual
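
One way to picture how these variables might be read is the sketch below. It uses only the standard library and assumes the values from `.env` have already been loaded into the process environment (which loader the project uses is not stated here). The `Settings` dataclass and `load_settings` function are hypothetical names; the defaults are the ones from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    groq_model: str
    vision_model: str
    embedding_model: str
    top_k_results: int
    embedding_device: str
    llm_temperature: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to the documented defaults."""
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        # GROQ_API_KEY has no default and is required.
        raise RuntimeError("GROQ_API_KEY not found")
    return Settings(
        groq_api_key=api_key,
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        vision_model=env.get("VISION_MODEL", "meta-llama/llama-4-scout-17b-16e-instruct"),
        embedding_model=env.get("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"),
        top_k_results=int(env.get("TOP_K_RESULTS", "5")),
        embedding_device=env.get("EMBEDDING_DEVICE", "cpu"),
        llm_temperature=float(env.get("LLM_TEMPERATURE", "0.3")),
    )
```

Passing the environment in as a mapping keeps the function easy to test: `load_settings({"GROQ_API_KEY": "gsk_test", "TOP_K_RESULTS": "3"})` returns settings with `top_k_results == 3` and every other value at its default.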

📦 Adding More Products

From a JSON file:

python ingest.py --source path/to/catalog.json

JSON format:

[
  {
    "id": "P001",
    "name": "Product Name",
    "category": "Category",
    "brand": "Brand",
    "price": 199,
    "specs": "Technical specifications...",
    "description": "Product description...",
    "url": "https://example.com/product"
  }
]
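
Before indexing a custom catalog, it can help to check that every entry has the fields shown above. The sketch below validates a JSON string and builds one searchable text per product; `load_catalog`, `build_searchable_text`, and the exact text layout are assumptions for illustration and may not match what `catalog.py` or `ingest.py` actually do.

```python
import json
from typing import Dict, List

# Fields taken from the JSON format shown above.
REQUIRED_FIELDS = ("id", "name", "category", "brand", "price", "specs", "description", "url")


def load_catalog(raw_json: str) -> List[Dict]:
    """Parse a catalog JSON string and verify each product has all required fields."""
    products = json.loads(raw_json)
    if not isinstance(products, list):
        raise ValueError("catalog must be a JSON array of product objects")
    for index, product in enumerate(products):
        missing = [field for field in REQUIRED_FIELDS if field not in product]
        if missing:
            raise ValueError(f"product #{index} is missing fields: {', '.join(missing)}")
    return products


def build_searchable_text(product: Dict) -> str:
    """Concatenate the text fields that would be embedded and keyword-indexed."""
    return (
        f"{product['name']} by {product['brand']}. "
        f"Category: {product['category']}. "
        f"Specs: {product['specs']} "
        f"{product['description']}"
    )
```

Putting the brand and product name at the front of the searchable text gives keyword search exact tokens to match, while the specs and description give the embedding model more meaning to work with.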

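Upgrade the embedding model for better accuracy:

# In config.py (or set EMBEDDING_MODEL in .env)
EMBEDDING_MODEL = "BAAI/bge-large-en-v1.5"  # 1024-dim, higher accuracy

The default all-MiniLM-L6-v2 model produces 384-dimensional vectors, so an index built with it cannot be searched with 1024-dimensional query vectors. Re-run python ingest.py after changing the embedding model.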

🛠️ Troubleshooting

Error	Fix
`GROQ_API_KEY not found`	Check that the `.env` file is in the `rag_prod/` folder
`No FAISS index found`	Run `python ingest.py` before starting the app
`numpy.core.multiarray failed`	Run `pip install "numpy<2"`
`fbgemm.dll not found`	Install the Microsoft Visual C++ Redistributable (Windows)
`proxies` TypeError (groq)	Run `pip install groq==0.9.0 httpx==0.27.0`
Vision AVIF error	Run `pip install pillow-avif-plugin`
Port 8501 in use	Run `streamlit run app.py --server.port 8502`

👨‍💻 Contributors

@ash-iiiiish

🤝 Contributing

Contributions are welcome! Fork this repository and submit a pull request.

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.
