2025-05-07
Image Search Engine 🔍
An image search engine that uses deep learning to find images with similar visual content. It pairs neural-network feature extraction (ResNet50, EfficientNet, or MobileNetV3) with a LanceDB vector database for fast similarity search.
Features
- Multiple Model Support: Compatible with various deep learning architectures:
  - ResNet50
  - EfficientNet
  - MobileNetV3
- Vector Database Integration: Uses LanceDB for efficient similarity search
- Batch Processing: Optimized for processing large image datasets
- Web Interface: User-friendly FastAPI-based web application
- Configurable: Easy-to-modify YAML configuration files
- Multi-format Support: Handles various image formats (JPEG, PNG, WebP)
Installation
- Clone the repository:
git clone https://github.com/yourusername/image-search-engine.git
cd image-search-engine
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
Usage
1. Image Ingestion
To index your image dataset:
python ingest.py
This will:
- Scan your dataset directory for images
- Extract features using the specified model
- Store embeddings in the vector database
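The scanning step above can be sketched with the standard library alone. This is a minimal illustration, not the code in ingest.py: the helper names `find_images` and `batched` are hypothetical, and the extension set is an assumption derived from the formats listed under Features.

```python
from pathlib import Path

# Assumed extensions for the formats listed under Features (JPEG, PNG, WebP).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def find_images(dataset_path):
    """Return a sorted list of image files under dataset_path, searched recursively."""
    root = Path(dataset_path)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def batched(items, batch_size):
    """Yield successive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

Each batch of paths would then be loaded, passed through the feature-extraction model, and the resulting embeddings written to the vector database. Batching keeps memory use bounded on large datasets.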
2. Query Images
To search for similar images:
python query_db.py --image_path path/to/your/query/image.jpg
3. Web Interface
Start the web application:
python -m uvicorn app.app:app --reload
Access the web interface at http://localhost:8000
🔧 Configuration
Configuration files are stored in the configs/ directory. Available configurations:
- animals_efficientnet.yml
- animals_mobilenetv3.yml
- animals_resnet50.yml
- ocean_resnet50.yml
- ocean_resnet50_v2.yml
Example configuration:
MODEL_NAME: resnet50
MODEL_DIM: 2048
COLLECTION_NAME: images
DATASET_PATH: ./dataset
LANCEDB: ./vectordb
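As a rough sketch of how a configuration like the one above could be read and checked before ingestion, the snippet below parses flat `KEY: value` lines using only the standard library. It is a stand-in for a real YAML parser (a project would more likely call `yaml.safe_load` from PyYAML), and the function names and the assumption that all five keys are required are hypothetical, not taken from this repository.

```python
# Assumption: every key in the example configuration is required.
REQUIRED_KEYS = {"MODEL_NAME", "MODEL_DIM", "COLLECTION_NAME", "DATASET_PATH", "LANCEDB"}


def parse_flat_config(text):
    """Parse flat `KEY: value` lines into a dict, converting integer values."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        value = value.strip()
        config[key.strip()] = int(value) if value.isdigit() else value
    return config


def validate_config(config):
    """Raise ValueError if a required key is missing or MODEL_DIM is invalid."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing config keys: {sorted(missing)}")
    if not isinstance(config["MODEL_DIM"], int) or config["MODEL_DIM"] <= 0:
        raise ValueError("MODEL_DIM must be a positive integer")
    return config


EXAMPLE = """\
MODEL_NAME: resnet50
MODEL_DIM: 2048
COLLECTION_NAME: images
DATASET_PATH: ./dataset
LANCEDB: ./vectordb
"""

config = validate_config(parse_flat_config(EXAMPLE))
print(config["MODEL_NAME"], config["MODEL_DIM"])  # prints: resnet50 2048
```

Checking MODEL_DIM up front matters because the embedding size must match the vector dimension of the database table. ResNet50, for example, produces 2048-dimensional features, as in the example configuration.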
Project Structure
image-search-engine/
├── app/ # Web application
├── configs/ # Model configurations
├── dataset/ # Image dataset directory
├── engine/ # Core search engine components
├── vectordb/ # Vector database storage
├── ingest.py # Dataset ingestion script
├── query_db.py # Image query script
└── requirements.txt # Project dependencies
How It Works
- Feature Extraction: Deep learning models convert images into high-dimensional feature vectors
- Vector Storage: Features are normalized and stored in LanceDB
- Similarity Search: Query images are processed the same way and compared using cosine similarity
- Result Ranking: Most similar images are returned based on vector similarity
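The normalization, comparison, and ranking steps above can be illustrated in a few lines of plain Python. This is a toy sketch under stated assumptions: the 3-dimensional vectors and file names are made up, and in the actual project the search is delegated to LanceDB rather than done by a brute-force loop like this one.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def top_k(query, index, k=3):
    """Return the k (name, score) pairs most similar to query, best first.

    index maps a name to an already-normalized vector.
    """
    q = normalize(query)
    scored = [
        (name, sum(a * b for a, b in zip(q, vec)))
        for name, vec in index.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy index; real embeddings have MODEL_DIM entries (e.g. 2048).
index = {
    "cat.jpg": normalize([1.0, 0.0, 0.0]),
    "dog.jpg": normalize([0.9, 0.1, 0.0]),
    "boat.jpg": normalize([0.0, 0.0, 1.0]),
}

results = top_k([2.0, 0.0, 0.0], index, k=2)
for name, score in results:
    print(f"{name}: {score:.4f}")
```

Because both stored and query vectors are normalized, the magnitude of the query does not affect the ranking. Only the direction of the feature vector does.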
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.