Related papers: Semantic Search and Recommendation Algorithm
This work falls within the areas of information retrieval and the semantic web, and aims to improve the evaluation of web search tools. Indeed, the huge amount of information on the web, as well as the growing number of inexperienced users, creates new…
Text search based on lexical matching of keywords is not satisfactory due to polysemous and synonymous words. Semantic search that exploits word meanings, in general, improves search performance. In this paper, we survey WordNet-based…
Nowadays, search engine users commonly rely on query suggestions to improve their initial inputs. Current systems are very good at recommending lexical adaptations or spelling corrections to users' queries. However, they often struggle to…
Searching for information about a specific person is an online activity frequently performed by many users. In most cases, users submit queries containing a name to web search engines to find what they are looking for.…
Traditional information retrieval systems rely on keywords to index documents and queries. In such systems, documents are retrieved based on the number of shared keywords with the query. This lexical-focused retrieval leads to inaccurate…
Semantic search with large language models (LLMs) enables retrieval by meaning rather than keyword overlap, but scaling it requires major inference efficiency advances. We present LinkedIn's LLM-based semantic search framework for AI Job…
Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files.
Existing search engines use keyword matching or tf-idf based matching to map the query to web documents and rank them. They also consider other factors such as PageRank, hubs-and-authorities scores, and knowledge graphs to make the results…
Distributed representations of words have been shown to be useful for improving the effectiveness of IR systems in many sub-tasks such as query expansion, retrieval, and ranking. Algorithms like word2vec, GloVe and others are also key factors in many…
In this paper, we propose an alternative to deep neural networks for semantic information retrieval for the case of long documents. This new approach exploiting clustering techniques to take into account the meaning of words in Information…
Due to their ease of use and high accuracy, Word2Vec (W2V) word embeddings enjoy great success in the semantic representation of words, sentences, and whole documents as well as for semantic similarity estimation. However, they have the…
Search has long been an important tool for users to retrieve information. Syntactic search matches documents or objects containing specific keywords, with factors like user history, location, and preference used to improve the results.…
Neural information retrieval systems typically use a cascading pipeline, in which a first-stage model retrieves a candidate set of documents and one or more subsequent stages re-rank this set using contextualized language models such as…
This paper presents an approach to enhance search engines with information about word senses available in WordNet. The approach exploits information about the conceptual relations within the lexical-semantic net. In the wrapper for search…
The classical, vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space, instead of word forms. This result is obtained for a manually…
With recent advancements in information technology, there has been a huge surge in the amount of data available. However, information retrieval technology has not been able to keep up with this pace of information generation, resulting in over…
Vector representations and vector space modeling (VSM) play a central role in modern machine learning. We propose a novel approach to "vector similarity searching" over dense semantic representations of words and documents that can be…
Word2vec is a popular family of algorithms for unsupervised training of dense vector representations of words on large text corpuses. The resulting vectors have been shown to capture semantic relationships among their corresponding words,…
Traditional retrieval methods have been essential for assessing document similarity but struggle with capturing semantic nuances. Despite advancements in latent semantic analysis (LSA) and deep learning, achieving comprehensive semantic…
Recently, neural approaches to spoken content retrieval have become popular. However, they tend to be restricted in their vocabulary or in their ability to deal with imbalanced test settings. These restrictions limit their applicability in…