Unsupervised context-sensitive Bangla spelling correction with character n-gram

Authors

Roy, S., & Ali, F. B.

Abstract

We propose an unsupervised, context-aware method for detecting and correcting spelling errors in traditional Bangla script. Context awareness comes from building a representation of the words surrounding a misspelled word and using cosine similarity to find the best candidate correction. We adopt the fastText approach of character n-gram embeddings. Because a word's embedding is composed from its character n-grams, the model can generate embeddings for unknown words, a capability that conventional methods lack. We performed our experiments on Bangla text collected from popular newspapers, blogs, Bangla Wikipedia and Banglapedia. Our proposed method outperforms conventional methods.
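The candidate-ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the n-gram range, the embedding size, the hash bucket count, the use of CRC32 for hashing, and the averaging of context word vectors are all assumptions made for illustration, and the bucket vectors here are seeded random placeholders rather than trained fastText weights, so the ranking it produces carries no linguistic meaning.

```python
import math
import random
import zlib

DIM = 32        # embedding size (assumed for illustration)
BUCKETS = 1000  # number of hash buckets for n-grams (assumed)


def char_ngrams(word, n_min=3, n_max=5):
    """Return the character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    grams = [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]
    # Very short words may yield no n-grams; fall back to the padded word.
    return grams or [padded]


def bucket_vector(bucket):
    """Placeholder vector for a bucket; a real model would look up trained weights."""
    rng = random.Random(bucket)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def word_vector(word):
    """Average the vectors of a word's n-grams, so unseen words still get a vector."""
    grams = char_ngrams(word)
    total = [0.0] * DIM
    for gram in grams:
        bucket = zlib.crc32(gram.encode("utf-8")) % BUCKETS
        for i, value in enumerate(bucket_vector(bucket)):
            total[i] += value
    return [value / len(grams) for value in total]


def context_vector(context_words):
    """Represent the context as the mean of its word vectors (an assumed choice)."""
    vectors = [word_vector(w) for w in context_words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_candidate(context_words, candidates):
    """Pick the candidate whose vector is most similar to the context vector."""
    ctx = context_vector(context_words)
    return max(candidates, key=lambda c: cosine(word_vector(c), ctx))
```

A call such as `best_candidate(["ভাষা", "লিপি"], ["বাংলা", "বালা"])` returns whichever candidate lies closest to the context vector. With trained n-gram embeddings in place of the placeholder vectors, that choice would reflect how well each candidate fits the surrounding words.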

