A search engine for phrase and free-text queries over Fars Persian news, built on TF-IDF, an inverted index, and champion lists.

Sabaghip/Search-Engine

Information Retrieval Course Project

Instructor: Dr. A. Nikabadi

Course content: CS276, Stanford University

Project Overview

  1. Preprocessing the data (normalization, tokenization, stemming, stopword removal)

  2. Created a positional inverted index

  3. Used Zipf's law

  4. Used Heaps' law

  5. Searching by normal queries, phrase queries (used permuterm index), and Boolean queries

  6. Ranking results

  7. Show words in vector representation

  8. Compute tf-idf

  9. Compute cosine similarity between query terms and documents

  10. Used index elimination techniques such as champion lists

  11. Rank results by relevance
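
The steps above can be illustrated with a minimal Python sketch. It is not the repository's code: the function names, the toy English tokens, the champion-list size `r`, and the log-scaled tf-idf weighting are all assumptions chosen for illustration, and the input is assumed to be already normalized, tokenized, and stemmed. It builds a positional inverted index, answers phrase queries from term positions, and ranks free-text queries by cosine similarity restricted to champion lists.

```python
import math
from collections import defaultdict

def build_positional_index(docs):
    """Map term -> {doc_id: [positions]} from pre-tokenized documents."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for pos, term in enumerate(tokens):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def phrase_match(index, phrase):
    """Return ids of documents in which the phrase tokens occur consecutively."""
    if not phrase or any(t not in index for t in phrase):
        return set()
    candidates = set(index[phrase[0]])
    for t in phrase[1:]:
        candidates &= set(index[t])
    hits = set()
    for d in candidates:
        for start in index[phrase[0]][d]:
            if all(start + i in index[t][d] for i, t in enumerate(phrase)):
                hits.add(d)
                break
    return hits

def tf_idf(tf, df, n_docs):
    """Assumed weighting: (1 + log10 tf) * log10(N / df)."""
    return (1 + math.log10(tf)) * math.log10(n_docs / df)

def build_champion_lists(index, n_docs, r=2):
    """For each term, keep only the r documents with the highest tf-idf weight."""
    champions = {}
    for term, postings in index.items():
        df = len(postings)
        weights = {d: tf_idf(len(p), df, n_docs) for d, p in postings.items()}
        top = sorted(weights, key=weights.get, reverse=True)[:r]
        champions[term] = {d: weights[d] for d in top}
    return champions

def doc_norms(index, n_docs):
    """Euclidean length of each document's tf-idf vector."""
    squared = defaultdict(float)
    for postings in index.values():
        df = len(postings)
        for d, p in postings.items():
            squared[d] += tf_idf(len(p), df, n_docs) ** 2
    return {d: math.sqrt(v) for d, v in squared.items()}

def rank(query, champions, norms, k=3):
    """Cosine score over champion lists only; each query term has weight 1."""
    terms = set(query)
    scores = defaultdict(float)
    for term in terms:
        for d, w in champions.get(term, {}).items():
            scores[d] += w
    q_norm = math.sqrt(len(terms))
    ranked = [(d, s / (norms[d] * q_norm)) for d, s in scores.items() if norms.get(d)]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)[:k]

# Toy corpus (hypothetical; stands in for preprocessed news articles).
docs = {
    1: ["tehran", "news", "agency", "reports"],
    2: ["news", "agency", "tehran"],
    3: ["football", "match", "tehran", "news"],
    4: ["weather", "report"],
}
index = build_positional_index(docs)
champions = build_champion_lists(index, len(docs), r=2)
norms = doc_norms(index, len(docs))

print(phrase_match(index, ["news", "agency"]))
print(rank(["football", "news"], champions, norms))
```

Scoring only the documents in each term's champion list trades some recall for speed: a relevant document outside every query term's top `r` is never scored, so `r` is usually set well above the number of results to return.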
