Fact Based Search Engine: News Fact Finder Utilizing Naive Bayes Classification
Many quality news sources are available on the Internet, and searching through all of them for facts related to a given subject would be exhausting for a user. We developed a niche sentence-level search engine called News Fact Finder to provide users with factual information relevant to their query. Sentence-level search is based on the intuition that a result in which all the query words occur within the same sentence is more relevant than one in which the query words are scattered across remote parts of the text. We therefore index our database with suffix arrays, which excel at exact substring matching. Our framework uses a Naive Bayes classifier to classify sentences as facts or opinions. Ranking is performed at the document level, so that a document with many related facts is ranked higher. News Fact Finder performs competitively on a large collection of news documents in providing relevant fact-based results to users. This is a novel approach to quality-based searching, ranking, indexing and categorization of news information.
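The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the training sentences, the documents, and all names (tokenize, split_sentences, NaiveBayes, rank_documents) are hypothetical, the classifier is a plain multinomial Naive Bayes with add-one smoothing, and a linear scan that checks whether every query word appears in a sentence stands in for the suffix-array index.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break on whitespace after ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of sentences
        self.vocab = set()

    def train(self, examples):
        for sentence, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(sentence):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, sentence):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(sentence):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def rank_documents(docs, query, clf):
    """Rank documents by the number of fact sentences containing every query word."""
    terms = set(tokenize(query))
    matches = {}
    for doc_id, text in docs.items():
        hits = [s for s in split_sentences(text)
                if terms <= set(tokenize(s)) and clf.predict(s) == "fact"]
        if hits:
            matches[doc_id] = hits
    return sorted(matches.items(), key=lambda kv: len(kv[1]), reverse=True)


# Hypothetical toy training data, invented for illustration only.
clf = NaiveBayes()
clf.train([
    ("The bank raised interest rates by two percent on Monday.", "fact"),
    ("The company reported revenue of five million dollars.", "fact"),
    ("The council approved the budget on Tuesday.", "fact"),
    ("I think the new policy is terrible.", "opinion"),
    ("I believe the decision is a wonderful idea.", "opinion"),
    ("In my opinion the plan seems awful.", "opinion"),
])

# Hypothetical documents, also invented for illustration.
docs = {
    "a": "The bank raised rates on Monday. The bank reported revenue on Tuesday. "
         "I think the bank is terrible.",
    "b": "The bank approved the budget. I believe the bank seems awful.",
    "c": "The council approved the budget on Tuesday.",
}

for doc_id, facts in rank_documents(docs, "bank", clf):
    print(doc_id, len(facts), facts)
```

In this sketch a document's score is simply its count of matching fact sentences, which mirrors the idea that a document with many related facts ranks higher. A real system would need a much larger labeled corpus and an index rather than a scan over every sentence.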
Keywords: search engine, word list, news article, inverted index, sentence level, news fact finder
Computer Engineering | Engineering
Faculty of Applied Science & Technology (FAST)
© Springer-Verlag Berlin Heidelberg 2013
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Salmon, Ricardo; Ribeiro, Cristina; and Amarala, Swathi, "Fact Based Search Engine: News Fact Finder Utilizing Naive Bayes Classification" (2013). Books and Websites. 7.
Salmon R., Ribeiro C., Amarala S. (2013) Fact Based Search Engine: News Fact Finder Utilizing Naive Bayes Classification. In: Pasi G., Bordogna G., Jain L. (Eds) Quality Issues in the Management of Web Information. Intelligent Systems Reference Library, vol 50. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37688-7_6