Welcome to the "Retrieval Optimization: From Tokenization to Vector Quantization" course! The course teaches you how to optimize vector search in large-scale, customer-facing RAG applications.
In this course, you'll dive deep into tokenization and vector quantization techniques, exploring how to optimize search in large-scale Retrieval-Augmented Generation (RAG) systems. Learn how different tokenization methods impact search quality and explore optimization techniques for vector search performance.
What You'll Learn:
- Embedding Models and Tokenization: Understand the inner workings of embedding models and how text is transformed into vectors.
- Tokenization Techniques: Explore several tokenizers like Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece, and how they affect search relevancy.
- Search Optimization: Learn to tackle common challenges such as terminology mismatches and truncated chunks in embedding models.
- Search Quality Metrics: Measure the quality of your search using various metrics and optimize search performance.
- HNSW Algorithm Tuning: Adjust Hierarchical Navigable Small World (HNSW) parameters to balance speed and relevance in vector search.
- Vector Quantization: Experiment with major quantization methods (product, scalar, and binary) and understand their impact on memory usage and search quality.
- Tokenization in Large Models: Learn how tokenization works in large language models and how it affects search quality.
- Training Tokenizers: Explore how Byte-Pair Encoding, WordPiece, and Unigram are trained and function in vector search.
- Search Optimization: Understand how to adjust HNSW parameters and vector quantization to optimize your retrieval systems.
- Kacper Łukawski: Developer Relations Lead at Qdrant. Kacper brings expertise in vector search optimization and teaches practical techniques to enhance search efficiency in RAG applications.
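
To make the memory trade-off behind scalar and binary quantization concrete, here is a minimal sketch in plain Python. It is not taken from the course materials: the helper names `scalar_quantize` and `binary_quantize`, the 1536-dimensional random vector, and the fixed value range of -1.0 to 1.0 are all assumptions chosen for illustration. Production vector databases use their own, more careful implementations.

```python
import random
import struct


def scalar_quantize(vec, lo, hi):
    """Map each float in [lo, hi] to one unsigned byte (0..255).

    Hypothetical helper: values outside the range are clamped.
    """
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((x - lo) * scale))) for x in vec)


def binary_quantize(vec):
    """Keep only the sign of each component, packing 8 dimensions per byte.

    Hypothetical helper: a bit is set when the component is positive.
    """
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


random.seed(0)
dim = 1536  # assumed embedding size, for illustration only
vec = [random.uniform(-1.0, 1.0) for _ in range(dim)]

float32_bytes = len(struct.pack(f"{dim}f", *vec))   # 4 bytes per dimension
scalar_bytes = len(scalar_quantize(vec, -1.0, 1.0))  # 1 byte per dimension
binary_bytes = len(binary_quantize(vec))             # 1 bit per dimension

print(float32_bytes, scalar_bytes, binary_bytes)  # 6144 1536 192
```

Under these assumptions, scalar quantization stores the vector in a quarter of the float32 footprint and binary quantization in one thirty-second of it. The price is precision, which is why quantization is usually evaluated together with search quality metrics.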
To enroll or learn more, visit deeplearning.ai.