#Natural Language Processing#A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
#Natural Language Processing#Efficient Retrieval Augmentation and Generation Framework
#Large Language Models#🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL
#Natural Language Processing#Use late-interaction multi-modal models such as ColPali in just a few lines of code.
Late Interaction Models Training & Retrieval
Neural Search
High-Performance Engine for Multi-Vector Search
ColBERT humor dataset for the task of humor detection, containing 200,000 jokes/news
An easy-to-use Python toolkit for flexibly adapting various neural ranking models to a target domain.
#Vector Search Engine#Vector Database with support for late interaction and token-level embeddings.
Tree-based indexes for neural search
#Computer Science#Efficient late-interaction retrieval systems in Julia!
Official codebase for the ACL 2025 Findings paper: Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval.
#Awesome#A list of multi-vector retrieval resources
#Vector Search Engine#A demonstration of hybrid search with reranking using Qdrant and the BGE-M3 model. A showcase of dense and sparse retrieval combined with ColBERT reranking for optimal search results.
An overview of popular reranking models and architectures for two-stage RAG pipelines
Open-source ColBERT-based document database
A Powerful Python Library to Build AI Applications with RAG
This is the Information Retrieval 2023-2024 fall semester CEID course project.