Authors
Roy, S., & Ali, F. B.
Abstract
We propose an unsupervised, context-aware method for detecting and correcting spelling errors in traditional Bangla script. Context awareness comes from building a representation of the surrounding context words and using cosine similarity to find the best candidate for each misspelled word. We adopt the fastText approach of character n-gram embeddings, which can generate embeddings for unknown words, a capability that conventional methods lack. We ran our experiments on Bangla text collected from popular newspapers, blogs, Bangla Wikipedia, and Banglapedia. Our proposed method outperforms conventional methods.
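The candidate-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the context representation is built, so averaging the context word vectors is an assumption here, and the hash-derived n-gram vectors are only a deterministic stand-in for trained fastText vectors. The names `char_ngrams`, `ngram_vector`, `embed`, `context_vector`, and `best_candidate`, the dimensionality `DIM`, and the n-gram range 3 to 5 are all hypothetical choices made for illustration.

```python
import hashlib
import math

DIM = 16  # toy dimensionality (assumption; real models use far more)


def char_ngrams(word, n_min=3, n_max=5):
    """Return character n-grams of the word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = [wrapped[i:i + n]
             for n in range(n_min, n_max + 1)
             for i in range(len(wrapped) - n + 1)]
    # Fall back to the wrapped word itself if it is shorter than n_min.
    return grams or [wrapped]


def ngram_vector(ngram, dim=DIM):
    """Stand-in for a learned n-gram vector: derived from a hash, not trained."""
    digest = hashlib.sha256(ngram.encode("utf-8")).digest()
    return [(b - 127.5) / 127.5 for b in digest[:dim]]


def embed(word, dim=DIM):
    """Embed any word, seen or unseen, as the mean of its n-gram vectors."""
    grams = char_ngrams(word)
    total = [0.0] * dim
    for gram in grams:
        for k, value in enumerate(ngram_vector(gram, dim)):
            total[k] += value
    return [x / len(grams) for x in total]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def context_vector(context_words, embed_fn=embed):
    """Average the context word vectors (assumed aggregation; needs >= 1 word)."""
    vectors = [embed_fn(w) for w in context_words]
    return [sum(v[k] for v in vectors) / len(vectors)
            for k in range(len(vectors[0]))]


def best_candidate(context_words, candidates, embed_fn=embed):
    """Pick the candidate whose vector is closest to the context vector."""
    ctx = context_vector(context_words, embed_fn)
    return max(candidates, key=lambda c: cosine(ctx, embed_fn(c)))
```

Because `embed` composes a vector from n-grams, it returns a vector even for a word that never appeared in training, which is the property the abstract highlights. With the untrained stand-in vectors the ranking only reflects shared character n-grams; meaningful context-aware ranking would require passing trained vectors through `embed_fn`.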