Search
Full-text search refers to matching some or all of a text query with documents stored in a database. Compared to traditional database queries, full-text search returns results even for partial matches. This allows more flexible search interfaces, helping users find accurate results more quickly.
Prefix and infix searching: This allows you to search for parts of words, like finding "apple" by searching "app" or finding "highlight" by searching "light."
Morphology processing: This includes stemming and lemmatization. Stemming reduces inflected forms of a word, like "running" and "runs," to a common stem, "run." Lemmatization maps a word to its dictionary form, so "ran" becomes "run."
Fuzzy searching: This helps find results even when the query contains typos.
Exact result count: Full-text search provides the total number of documents that match the search criteria.
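The matching features above can be illustrated with a minimal sketch. The `DOCS` list and the three helper functions are hypothetical names invented for this example, and the fuzzy matcher uses the standard library's `difflib.SequenceMatcher` as a stand-in; real full-text engines rely on indexes and edit-distance automata rather than a linear scan.

```python
from difflib import SequenceMatcher

# Hypothetical toy corpus, made up for illustration.
DOCS = ["apple pie recipe", "highlight reel", "pineapple juice", "running shoes"]

def prefix_match(query, docs):
    """Return docs containing a word that starts with the query."""
    return [d for d in docs if any(w.startswith(query) for w in d.split())]

def infix_match(query, docs):
    """Return docs containing a word that has the query anywhere inside it."""
    return [d for d in docs if any(query in w for w in d.split())]

def fuzzy_match(query, docs, threshold=0.8):
    """Return docs containing a word whose similarity ratio to the query
    is at least `threshold` (an arbitrary cutoff chosen for this sketch)."""
    return [
        d for d in docs
        if any(SequenceMatcher(None, query, w).ratio() >= threshold for w in d.split())
    ]

prefix_hits = prefix_match("app", DOCS)     # "app" is a prefix of "apple"
infix_hits = infix_match("light", DOCS)     # "light" sits inside "highlight"
fuzzy_hits = fuzzy_match("runing", DOCS)    # typo for "running"
```

Note that the infix matcher is a superset of the prefix matcher: searching "app" with `infix_match` would also return "pineapple juice".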
Vectorization: Machine learning (ML) models, such as sentence transformers or OpenAI embeddings, convert the search query text and the documents into numerical representations. These representations are called vectors or embeddings.
Embedding space: These vectors are plotted in a multi-dimensional space, where the distance between vectors reflects the semantic similarity between the original pieces of text. Documents with similar meanings have vectors that are closer together in this space.
Nearest neighbors: The search engine uses algorithms like k-nearest neighbors (KNN) to find the vectors in the embedding space that are closest to the query vector. These closest vectors represent the documents that are most semantically similar to the search query.
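The three steps above can be sketched as a brute-force nearest-neighbor search. The vectors in `DOC_VECS` are hand-written placeholders, not the output of any real embedding model, and cosine similarity is assumed as the distance measure; production systems typically use approximate nearest-neighbor indexes instead of scanning every vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, similarity) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up 3-dimensional "embeddings"; real models produce hundreds of dimensions.
DOC_VECS = {
    "how to bake bread": [0.9, 0.1, 0.0],
    "sourdough starter guide": [0.8, 0.2, 0.1],
    "fixing a flat bike tire": [0.0, 0.1, 0.9],
}

# Pretend this is the embedding of a query such as "bread recipes".
query_vec = [0.85, 0.15, 0.05]
top = knn(query_vec, DOC_VECS, k=2)
```

With these placeholder vectors, the two baking documents are returned and the bike-repair document is excluded, even though the query shares no guaranteed keyword with "sourdough starter guide".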
| Feature | Full-Text Search | Vector Search |
| --- | --- | --- |
| Data Type | Structured or semi-structured text | Unstructured or high-dimensional data |
| Query Type | Keyword or phrase matching | Similarity matching |
| Primary Use Case | Exact matches, metadata filtering | Semantic understanding, recommendations |
| Technology Examples | PostgreSQL full-text search, Elasticsearch | pgvectorscale, FAISS |
Full-text search cannot capture the semantic relationships between words.
Vector search cannot reliably match exact keywords, so some of the precise meaning of the text may be missed.
Hybrid search combines the strengths of full-text search and vector search. It builds upon the accessible, search-as-you-type experience of full-text search and integrates the enhanced discovery capabilities that AI search enables.
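One way to combine the two result lists is reciprocal rank fusion (RRF). This is an assumption for illustration: the text does not say which fusion method a given engine uses, and weighted score blending is another common option. The document IDs and the constant `k=60` (a conventional default) are placeholders.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked well by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical outputs of a full-text search and a vector search for one query.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Because RRF works on ranks rather than raw scores, it avoids having to normalize keyword scores and vector similarities onto a shared scale.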
Suppose your final keyphrases are ranked as follows: Good Product, Great Product, Nice Product, Excellent Product, Easy Install, Nice UI, Lightweight, and so on.
The problem with this approach is that phrases like "good product," "nice product," and "excellent product" are all similar, describe the same property of the product, and are all ranked near the top. If we only have space to show 5 keyphrases, we don't want to fill it with these near-duplicates.
In traditional semantic search, results are ranked purely by similarity to the query, so the top-ranked results may be near-duplicates of one another.
Maximal Marginal Relevance (MMR) addresses this by reducing redundancy and increasing diversity in the results; it is also used in text summarization. MMR selects each phrase for the final keyphrase list according to a combined criterion of query relevance and novelty of information.
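A minimal sketch of MMR selection follows. At each step it picks the candidate that maximizes `lambda_ * relevance - (1 - lambda_) * redundancy`, where redundancy is the highest similarity to any phrase already selected. The phrase vectors, the query vector, and the `lambda_=0.5` weighting are made-up values for illustration, not embeddings from a real model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mmr(query_vec, candidates, top_n=3, lambda_=0.5):
    """Greedily select top_n candidate names, trading relevance against redundancy."""
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < top_n:
        def mmr_score(name):
            relevance = cosine(query_vec, remaining[name])
            # Similarity to the closest already-selected phrase (0 if none yet).
            redundancy = max(
                (cosine(remaining[name], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]
    return selected

# Hypothetical phrase embeddings: the three "product" phrases are nearly identical.
PHRASES = {
    "good product": [1.0, 0.05, 0.0],
    "great product": [1.0, 0.0, 0.05],
    "nice product": [0.98, 0.05, 0.05],
    "easy install": [0.1, 1.0, 0.0],
    "nice UI": [0.1, 0.0, 1.0],
}
query_vec = [1.0, 0.6, 0.4]  # placeholder embedding of the source text

diverse = mmr(query_vec, PHRASES, top_n=3)
by_relevance = sorted(PHRASES, key=lambda p: cosine(query_vec, PHRASES[p]), reverse=True)[:3]
```

With these placeholder vectors, ranking by relevance alone fills the top three slots with "product" phrases, while MMR keeps one of them and then prefers "easy install" and "nice UI". Raising `lambda_` toward 1 moves the result back toward pure relevance ranking.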
Unlike traditional keyword-based search, vector search retrieves results by analyzing the similarity between