A spell-checking system for the Arabic language for the World Wide Web using a computational lexicon and orthographic and phonological rules

Author

Kuwait University

Abstract

This paper presents a web-based system for detecting and correcting spelling errors in Arabic text (Web Spell Checker), which we designed using the WebSpellChecker Engine. The system is offered as a cloud web service that can be integrated with any website or web application, and with smart device applications through an Application Programming Interface (API) that spell-checks Arabic text entered into those applications. Using a computational lexicon, the system covers a large percentage of the words of Standard Arabic in general and Modern Standard Arabic in particular. The lexicon is built from a large open-source word list, which was in turn generated from an open-source lexical database dedicated to the morphological analysis of Arabic nouns and verbs and designed using finite-state automata. The word list contains the possible inflected and derived forms of Standard Arabic words (examples: كَتَبَ، ويكتبان، كتبوا، فسيكتبن، كاتِبة، للكاتِبَين، المكتوب). The system can also re-rank the correction suggestions produced by the Levenshtein edit distance algorithm, which is used for automatic spelling correction: context-sensitive orthographic and phonological rules give priority to suggestions that correspond to spelling errors common among Arabic language users. Together, the computational lexicon and these rules provide the system with the linguistic knowledge it needs to detect and correct spelling errors in Standard Arabic texts entered into web applications.
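The re-ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the system's implementation: the function names (`levenshtein`, `is_common_confusion`, `suggest`), the `max_distance` threshold, and the small `CONFUSABLE` table of letter pairs are all assumptions made for illustration, and the table is a context-free simplification of the context-sensitive orthographic and phonological rules the paper refers to.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical, illustrative table of commonly confused Arabic letters.
# The actual rules of the system are context sensitive and are not given here.
CONFUSABLE = {frozenset(pair) for pair in [
    ("ه", "ة"), ("ى", "ي"), ("ا", "أ"), ("ا", "إ"), ("ض", "ظ"),
]}


def is_common_confusion(word: str, candidate: str) -> bool:
    """True if the two words differ only by substitutions listed in CONFUSABLE."""
    if len(word) != len(candidate):
        return False
    diffs = [(x, y) for x, y in zip(word, candidate) if x != y]
    return bool(diffs) and all(frozenset(d) in CONFUSABLE for d in diffs)


def suggest(word: str, lexicon: set[str], max_distance: int = 2) -> list[str]:
    """Return correction candidates, common-confusion matches ranked first."""
    if word in lexicon:
        return []  # the word is in the lexicon, nothing to correct
    scored = []
    for cand in lexicon:
        d = levenshtein(word, cand)
        if d <= max_distance:
            # Sort key: rule match first (False < True), then distance, then text.
            scored.append((not is_common_confusion(word, cand), d, cand))
    scored.sort()
    return [cand for _, _, cand in scored]


if __name__ == "__main__":
    toy_lexicon = {"مدرسة", "مدرس", "مدرسي"}
    print(suggest("مدرسه", toy_lexicon))
```

In this toy lexicon all three candidates are at edit distance 1 from the misspelling مدرسه, so edit distance alone cannot order them; the confusion table breaks the tie and places مدرسة first.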
 

Keywords