Suffix Based Automated Parts of Speech Tagging for Bangla Language

Natural language processing (NLP) is the set of techniques by which computers process human language. Parts-of-speech (POS) tagging is a fundamental requirement for some NLP applications. It is considered a solved problem for some languages, such as English and Chinese, where taggers reach high accuracy (97%), but it remains unsolved for Bangla because of the language's ambiguity. Building a POS tagger for Bangla is not new work, but each of the available taggers has its own limitations. We chose to develop an unsupervised system rather than a supervised one, because a supervised system needs a large body of training data and the resources available for Bangla are scarce. Here we develop a POS tagger based mainly on Bangla grammar, especially suffixes, because Bangla is a highly inflectional language in which a single word has many variants depending on its suffix. The tagger assigns 8 base POS tags: rules based on Bangla grammar and suffixes are applied, together with a verb root dataset, to identify the tag of each word. To handle words without suffixes, a dataset of almost 14,500 Bangla words with their default POS tags is added to the system, which improves the tagger's efficiency. A modified version of a previously used suffix-analysis algorithm is applied, which results in a satisfactory accuracy of about 94.2%.
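To make the idea concrete, here is a minimal Python sketch of suffix-based tagging with a default-tag lexicon and a verb root check. It is not the authors' algorithm: the tag names, the tiny word lists, the suffix lists, the lookup order, and the function names `tag_word` and `tag_sentence` are all assumptions made for illustration. The abstract does not list the 8 tags or the actual rules, and the real system uses a lexicon of almost 14,500 words.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so visually identical Bangla strings compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical mini-resources, stand-ins for the paper's datasets.
DEFAULT_TAGS = {nfc(w): t for w, t in {
    "আমি": "PRONOUN",      # "I"
    "এবং": "CONJUNCTION",  # "and"
    "বই": "NOUN",          # "book"
}.items()}
VERB_ROOTS = {nfc(r) for r in ["কর", "বল", "লিখ"]}  # do, say, write

# Illustrative suffixes only, sorted so the longest suffix is tried first.
VERB_SUFFIXES = sorted((nfc(s) for s in ["ছিলাম", "লাম", "ছি", "ছে", "বে"]),
                       key=len, reverse=True)
NOUN_SUFFIXES = sorted((nfc(s) for s in ["গুলো", "দের", "কে", "টি"]),
                       key=len, reverse=True)


def tag_word(word):
    """Return an assumed POS tag for a single word."""
    word = nfc(word)
    # 1. Words with a known default tag (assumed to be checked first).
    if word in DEFAULT_TAGS:
        return DEFAULT_TAGS[word]
    # 2. Verb rule: a verb suffix counts only if the stem is a known root.
    for suffix in VERB_SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in VERB_ROOTS:
            return "VERB"
    # 3. Noun rule: a noun suffix attached to a non-empty stem.
    for suffix in NOUN_SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            return "NOUN"
    # 4. No rule fired; a real system would need a better fallback.
    return "UNKNOWN"


def tag_sentence(sentence):
    """Tag each whitespace-separated token of a sentence."""
    return [(token, tag_word(token)) for token in sentence.split()]
```

Calling `tag_sentence("আমি বইগুলো লিখছি")` returns one `(token, tag)` pair per word. The verb root check in step 2 is what keeps a word such as "সবে", which merely ends in a verb-like suffix, from being tagged as a verb.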

Monjoy Kumar Roy ; Pinto Kumar Paull ; Sheak Rashed Haider Noori ; S M Hasan Mahmud

Source: 2nd International Conference on Electrical, Computer and Communication Engineering, ECCE 2019

Mohammad Monir Hossan
Senior Assistant Director (Division of Research)