My Research
AI and machine learning research contributions
My research focuses on artificial intelligence, machine learning, and data-driven solutions to real-world problems.
Here is the research project I completed in graduate school.

Natural Language Processing (NLP) for low-resource languages presents unique challenges due to limited datasets, dialectal variation, and a lack of standardized tools. Developing AI solutions for such languages requires creative approaches to handle data scarcity, noise, and linguistic diversity. My research focused on applying advanced NLP techniques to address these challenges and create practical solutions for real-world communication in Sri Lanka.
Next-Generation Noisy Robust Speech Translation System
Supervisor: Dr. Madusha Chandrasena, PhD, Senior Lecturer
Department of Computer Systems Engineering, University of Kelaniya
I worked on a Next-Generation Noisy Robust Speech Translation system focused on enabling communication between Sinhala and Tamil speakers in Sri Lanka.
Background & Motivation
Sri Lanka is a multilingual nation where language plays a key role in communication, governance, education, and the delivery of public services. Sinhala and Tamil are both official languages, but their speakers largely belong to different ethnic groups. A high percentage of the population is monolingual, which frequently leads to communication breakdowns in both formal and informal contexts.
Recent developments in speech technology have produced translation aids that translate spoken language in real time. However, most available solutions are built for high-resource languages (e.g., English, Spanish, or Chinese) and therefore support Sinhala and Tamil only partially. These shortcomings make existing tools less effective and less applicable in Sri Lanka.
Research Challenges
A major limitation in building Sinhala and Tamil speech translation systems is the difficulty of creating large datasets with high-quality annotations. Because these are low-resource languages, they lack the digital infrastructure needed to develop stable AI models for Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS). As a result, existing tools are less accurate and in many cases do not reflect the depth and breadth of these languages.
The other drawback is poor performance in noise. Background noise causes a large drop in speech recognition accuracy in most real-life settings, including workplaces such as hospitals and markets, and public transport. The wide diversity of dialects and accents across the country is another problem that many tools struggle to cope with.
These constraints indicate that speech translation systems that work robustly in noisy conditions have to be designed specifically for Sinhala and Tamil. This project addresses these issues with a focus on real-life use, so that multilingual communication in Sri Lanka can be improved.