Can Proofreading be Automated?

People currently proofread their work by hand, whether it is academic or non-academic. Proofreading involves checking a text for errors in spelling (including typos), grammar and punctuation, and correcting them. To proofread accurately, one must have sufficient knowledge of the lexical and syntactic rules of the language. Natural Language Processing and Computational Linguistics, which deal with the interaction between machines and human language, offer a way to automate this task. Attempts so far have been less accurate than human proofreaders.

Syntactic rules can be hand-crafted and applied so that a program can spot and correct errors. Spelling errors can be detected, and the intended word predicted, by spelling correction programs; this feature is now commonly included in word processing software such as Microsoft Word. Machines can now perform part-of-speech (POS) tagging with 95% accuracy (Brill 1995). Knowing a word's part of speech in turn supports Word Sense Disambiguation. WordNet and related knowledge bases can also help with POS tagging and with predicting the correct sense of a polysemous word from its context and surrounding text. Lexical and syntactic analysis of a text enables the generation of parse trees, which are diagrams that show the syntactic relations among words and so help to recover the meaning underlying an expression. Syntax is the easier part of linguistics to automate, and machines already perform these tasks with good accuracy.
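
As a rough illustration of the two ideas above, the sketch below pairs one hand-crafted rule (flagging an immediately repeated word) with a very small spelling corrector that suggests the most frequent known word within one edit of a misspelling. The vocabulary, its frequency counts and the function names are all made up for the example; a real checker would learn word frequencies from a large corpus and use far more rules.

```python
import re

# Hypothetical toy vocabulary with made-up frequency counts; a real checker
# would derive these from a large corpus.
WORD_COUNTS = {
    "the": 500, "proofreading": 12, "their": 80, "there": 90,
    "grammar": 20, "spelling": 25, "sentence": 30, "text": 60,
}

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit (delete, swap, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    swaps = [left + right[1] + right[0] + right[2:]
             for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)


def correct(word):
    """Suggest the most frequent known word within one edit, else the word itself."""
    lower = word.lower()
    if lower in WORD_COUNTS:
        return lower
    candidates = [w for w in edits1(lower) if w in WORD_COUNTS]
    if not candidates:
        return word  # nothing close enough in the toy vocabulary
    return max(candidates, key=WORD_COUNTS.get)


def find_repeated_words(text):
    """Hand-crafted rule: flag an immediately repeated word such as 'the the'."""
    pattern = r"\b(\w+)\s+\1\b"
    return [m.group(1).lower()
            for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


print(correct("speling"))
print(find_repeated_words("This is the the text"))
```

Looking only one edit away keeps the candidate set small, at the cost of missing words with two or more mistakes. It also cannot catch a correctly spelled word used in the wrong place, such as "there" for "their", which is where context becomes necessary.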

However, this is not the case for semantics. For true artificial intelligence, with the ability to fully understand the meaning underlying a text, semantics must be taken into account. Humans infer meaning automatically, using logic and prior real-world knowledge. This is hard to reproduce in machines, since it would require a massive knowledge base and would be time-intensive. Computational semantics is advancing only slowly, through ontologies and knowledge bases, machine learning and syntactic rules.

For example, consider the sentence “We saw her duck.” Based on the surrounding text in the paragraph, a human can infer whether “duck” refers to an animal she owns or to the verb meaning “move downwards”. For a machine this is complex, because it must process the surrounding text in order to make an accurate judgement. It is ambiguous sentences of this kind that would cause problems for automated proofreading. Semantics and pragmatics require more research, and once these core problems are addressed, programs that rely on true AI can be applied to automate the task of proofreading.
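
To make the difficulty concrete, here is a minimal sketch of how a program might use the surrounding text to choose between the two readings of “duck”. It simply counts how many cue words for each sense appear in the context, loosely in the spirit of overlap-based methods such as the Lesk algorithm. The cue word lists and the function name are assumptions invented for this example; a real system would take them from a resource such as WordNet or learn them from data.

```python
import re

# Hypothetical cue words for each reading of "duck"; a real system would draw
# these from a lexical resource or from annotated training data.
SENSE_CUES = {
    "noun (the bird)": {"pond", "feathers", "quack", "feed", "bird", "swim", "pet"},
    "verb (move downwards)": {"ball", "thrown", "dodge", "avoid", "low", "hit", "branch"},
}


def disambiguate(context):
    """Return the sense whose cue words overlap most with the context.

    Returns None when no cue word appears or when the senses tie,
    i.e. when the context gives this toy model nothing to decide on.
    """
    tokens = set(re.findall(r"[a-z]+", context.lower()))
    scores = {sense: len(cues & tokens) for sense, cues in SENSE_CUES.items()}
    ranked = sorted(scores.values(), reverse=True)
    if ranked[0] == 0 or ranked[0] == ranked[1]:
        return None
    return max(scores, key=scores.get)


print(disambiguate("Someone had thrown a ball at her head. We saw her duck."))
print(disambiguate("We saw her duck."))
```

With no surrounding text the function cannot decide, which mirrors the problem described above. Exact word matching is also brittle: a context that mentions “swimming” rather than “swim” contributes nothing, and handling such variation is part of what makes semantics harder than syntax.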

