Suffix Based Automated Parts of Speech Tagging for Bangla Language

Abstract:
Natural language processing (NLP) is the set of techniques by which computers process human language. Parts-of-speech (POS) tagging is a fundamental requirement for some NLP applications. It is considered a solved problem for some other languages, such as English and Chinese, where taggers reach high accuracy (97%), but it remains an unsolved problem for Bangla because of the language's ambiguity. POS taggers for Bangla already exist, but each of the available ones has its own limitations. We chose to develop an unsupervised system rather than a supervised one, because a supervised system needs a large amount of training data and the resources available for Bangla are very limited. We develop a POS tagger based mainly on Bangla grammar, especially suffixes, because Bangla is a highly inflectional language in which a single word has many variants depending on its suffix. The tagger assigns 8 base POS tags: rules based on Bangla grammar and suffixes identify the tag of each word with the help of a verb root dataset. To handle words without suffixes, a dataset of almost 14,500 Bangla words with their default POS tags is added to the system, which helps increase the tagger's efficiency. A modified version of a previously used suffix-analysis algorithm is applied, which results in a satisfactory accuracy of about 94.2%.
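
For readers who want a feel for the general idea, here is a minimal sketch of suffix-based tagging with a default-tag lexicon. It is not the authors' algorithm: the tag names, the tiny suffix lists, the verb roots, the lexicon entries, and the order of the three steps are all assumptions made up for illustration, and the function names `tag_word` and `tag_sentence` are hypothetical.

```python
# Illustrative data only: none of these entries come from the paper's datasets.
VERB_ROOTS = {"কর", "খা", "যা"}                  # assumed sample of verb roots
VERB_SUFFIXES = ["ছিলাম", "ছি", "বে", "লাম"]      # assumed verb inflections
NOUN_SUFFIXES = ["দের", "টি", "কে"]              # assumed noun markers
DEFAULT_TAGS = {                                  # assumed default-tag lexicon
    "আমি": "PRONOUN",
    "এবং": "CONJUNCTION",
    "সুন্দর": "ADJECTIVE",
}


def tag_word(word):
    """Return a POS tag for one word, or "UNKNOWN" if no rule applies."""
    # Step 1 (assumed order): words listed in the lexicon get their default tag.
    if word in DEFAULT_TAGS:
        return DEFAULT_TAGS[word]

    # Step 2: a verb suffix counts only if what remains is a known verb root.
    for suffix in sorted(VERB_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and word[:-len(suffix)] in VERB_ROOTS:
            return "VERB"

    # Step 3: a noun suffix attached to a non-empty stem marks a noun.
    for suffix in sorted(NOUN_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return "NOUN"

    return "UNKNOWN"


def tag_sentence(sentence):
    """Tag each whitespace-separated word of a sentence."""
    return [(word, tag_word(word)) for word in sentence.split()]


if __name__ == "__main__":
    for word, tag in tag_sentence("আমি কাজটি করছি"):
        print(word, tag)
```

Trying the longest suffix first avoids matching a short suffix that happens to be the tail of a longer one. A real tagger would need far larger rule sets and some way to resolve words that match several rules.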

Authors:
Monjoy Kumar Roy ; Pinto Kumar Paull ; Sheak Rashed Haider Noori ; S M Hasan Mahmud

Source: 2nd International Conference on Electrical, Computer and Communication Engineering, ECCE 2019

Link: https://ieeexplore.ieee.org/document/8679161
Mohammad Monir Hossan
Senior Assistant Director (Division of Research)
E-mail: monirhossain@daffodilvarsity.edu.bd