Natural Language Processing and Knowledge Discovery Research Group

The Natural Language Processing and Knowledge Discovery (NLP-KD) research group focuses on cutting-edge research in the mining and processing of text and natural language data, particularly in Vietnamese and multilingual contexts. The group’s activities fall into two main areas:

Scientific Research Publication and Research Training Capacity

The research group has published numerous international articles in ISI/SCIE journals and at leading conferences in the fields of Artificial Intelligence (AI), Natural Language Processing (NLP), and Data Science. Key research directions include intelligent conversational systems, recommendation systems, natural language processing, deep learning, computer vision, and biomedical data analysis. Currently, many PhD candidates, master’s students, and undergraduates are actively participating in research at laboratories affiliated with the group.

Projects and Collaboration

Members of the research group are leading various research topics and projects at multiple levels, ranging from institutional to national and international. In addition to academic activities, the group actively promotes technology transfer, applying research outcomes in practical settings within both domestic and international enterprises and organizations.

Members

Associate Professor, PhD Le Anh Cuong, Group Leader

Research achievements of the group leader:

Principal Investigator of 2 NAFOSTED-funded projects and 1 national-level KC program project
Successfully supervised 7 PhD students and numerous Master’s students
Published nearly 100 international publications in prestigious journals and conferences in Natural Language Processing (NLP) and Machine Learning
Second Prize Winner of "Nhân tài đất Việt" Award 2017

PhD Tran Luong Quoc Dai, Secretary, Key Member

Research achievements:

Team member of a successfully completed NAFOSTED-funded project
Published 7 international articles indexed in Web of Science (WoS)
Active collaborations with international research groups

PhD Tran Thanh Phuoc, Key Member

Research achievements:

Principal Investigator of 2 institutional-level research projects and key member of 1 provincial-level project
Published 43 scientific papers, including 20 WoS-indexed international publications; H-index: 6 (WoS)
Participated in database design and system integration consulting for an e-commerce platform in pharmaceuticals
Collaborated with multiple international research groups

PhD Pham Van Huy, Key Member

Research achievements:

Published 42 articles indexed in WoS/Scopus
H-index: 17 (based on WoS)
Currently Principal Investigator of 2 scientific research projects
Extensive collaborations with international research groups

PhD Trinh Hung Cuong, Key Member

Research achievements:

Published 16 international articles, including 7 indexed in WoS
Active collaborations with international research teams

PhD Ho Thi Linh, Member

Research achievements:

Member of a successfully completed NAFOSTED-funded project
Published 4 international articles, with 2 indexed in WoS

MSc Vu Dinh Hong, Member

Research achievements:

Published 4 international papers
Participated in a successfully completed NAFOSTED project
Involved in the transfer of 2 AI-based technology products to enterprises

Publications

Toan Nguyen Mau, Anh-Cuong Le, Duc-Hong Pham, Van-Nam Huynh. “An information fusion based approach to context-based fine-tuning of GPT models”, 2024, Information Fusion 104: 102202.
Phuc-Nghi Nguyen and Phuoc Tran. “Constructing a Chinese-Vietnamese bilingual corpus from subtitle websites”, 2024, International Journal of Intelligent Information and Database Systems 16(4): 385–408.
Quoc-Dai Luong Tran, Anh-Cuong Le, Van-Nam Huynh. “Enhancing Conversational Model With Deep Reinforcement Learning and Adversarial Learning”, 2023, IEEE Access 11: 75955-75970.
Lkhagvadorj Munkhdalai, Tsolmon Munkhdalai, Jae-Eun Hong, Van-Huy Pham, Nipon Theera-Umpon and Keun Ho Ryu. “Discrimination Neural Network Model for Binary Classification Tasks on Tabular Data”, 2023, IEEE Access 11: 15404-15418.
Phuoc Tran, Dat Nguyen, Huu-Anh Tran, Thien Nguyen, and Tram Tran. “Building a Closed-Domain Question Answering System for a Low-Resource Language”, 2023, ACM Transactions on Asian and Low-Resource Language Information Processing 22(5): Article 109.
Thi-Linh Ho, Anh-Cuong Le, Dinh-Hong Vu. “Multiview Fusion Using Transformer Model for Recommender Systems: Integrating the Utility Matrix and Textual Sources”, 2023, Applied Sciences 13(10): 6324.
Quoc-Dai Luong Tran, Anh-Cuong Le. “Exploring bi-directional context for improved chatbot response generation using deep reinforcement learning”, 2023, Applied Sciences 13(8): 5041.
Ngoc-Khuong Nguyen, Dac-Nhuong Le, Viet-Ha Nguyen, Anh-Cuong Le. “A Method of Integrating Length Constraints into Encoder-Decoder Transformer for Abstractive Text Summarization”, 2023, Intelligent Automation & Soft Computing 38(1): 95-109.
Quoc-Dai Luong Tran, Dinh-Hong Vu, Anh-Cuong Le, Ashwin Ittoo. “Contextual Modeling in Context-Aware Conversation Systems”, 2023, KSII Transactions on Internet and Information Systems 17(5): 1601-1622.
Thi-Linh Ho, Anh-Cuong Le, Dinh-Hong Vu. “Enhancing recommender systems by fusing diverse information sources through data transformation and feature selection”, 2023, KSII Transactions on Internet and Information Systems 17(5): 1413-1432.
Hung-Cuong Trinh, Van-Huy Pham, Anh H. Vo. “Remaining Useful Life Estimation based on Noise Injection and a Kalman Filter Ensemble of modified Bagging Predictors”, 2023, KSII Transactions on Internet and Information Systems 17(12): 3242-3265.
Phuoc Tran, Duy Khanh Nguyen, Tram Tran, and Bay Vo. “Using Syntax and Shallow Semantic Analysis for Vietnamese Question Generation”, 2021, Proceedings of the 13th International Conference on Knowledge and Systems Engineering (KSE 2021): 1-6.
Quoc-Tuan Vo, Phuoc Tran and Tram Tran. “Sentiment analysis for a Low-Resource Language: A study on a Vietnamese University”, 2020, Proceedings of the 4th International Conference on Information System and Data Mining (ICISDM 2020): 70–74.
Phuoc Tran, Thien Nguyen, Dinh Hung Vu, Huu-Anh Tran and Bay Vo. “A Method of Chinese-Vietnamese Bilingual Corpus Construction for Machine Translation”, 2020, Proceedings of the 2020 4th International Conference on Machine Learning and Soft Computing (ICMLSC 2020): 120-124.
Lkhagvadorj Munkhdalai, Tsolmon Munkhdalai, Van-Huy Pham, Meijing Li, Keun Ho Ryu and Nipon Theera-Umpon. “Recurrent Neural Network-Augmented Locally Adaptive Interpretable Regression for Multivariate Time-Series Forecasting”, 2022, IEEE Access 10: 11871-11885.
Quoc-Dai Luong Tran, Anh-Cuong Le and Van-Nam Huynh. “Towards a Human-like Chatbot using Deep Adversarial Learning”, 2022, 2022 14th International Conference on Knowledge and Systems Engineering (KSE): 1-5.
Tuguldur Amarbayasgalan, Van-Huy Pham, Nipon Theera-Umpon, Yong Piao and Keun Ho Ryu. “An Efficient Prediction Method for Coronary Heart Disease Risk Based on Two Deep Neural Networks Trained on Well-Ordered Training Datasets”, 2021, IEEE Access 9: 135210-135223.
Khaliunaa Davagdorj, Ji-Won Bae, Van-Huy Pham, Nipon Theera-Umpon and Keun Ho Ryu. “Explainable Artificial Intelligence Based Framework for Non-Communicable Diseases Prediction”, 2021, IEEE Access 9: 123672-123688.
Thien Nguyen, Lam Nguyen, Phuoc Tran, and Huu Nguyen. “Improving Transformer-Based Neural Machine Translation with Prior Alignments”, 2021, Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021): 3651–3657.
Thien Nguyen, Huu Nguyen, and Phuoc Tran. “Sublemma-Based Neural Machine Translation”, 2021, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (EACL 2021): 2094–2100.
Quoc-Dai Luong Tran and Anh-Cuong Le. “A Deep Reinforcement Learning Model using Long Contexts for Chatbots”, 2021, 2021 International Conference on System Science and Engineering (ICSSE): 83-87.
Quoc-Dai Luong Tran, Anh-Cuong Le, Duc-Hong Pham. “Emotionally-Aware Sequence-to-Sequence Models for Conversational Systems”, 2021, Advances in Intelligent Information Hiding and Multimedia Signal Processing. Smart Innovation, Systems and Technologies 212: 245-254.
Phuong Nguyen Huy Pham, Bich-Ngan Thi Nguyen, Quy Thi Ngoc Co, Nguyen Tu Nguyen, Phuoc Tran and Vaclav Snasel. “An efficient hybrid algorithm for community structure detection in complex networks based on node influence”, 2020, Journal of Intelligent & Fuzzy Systems 39(4): 5235-5246.
Thien Nguyen, Trang Nguyen, Huu Nguyen and Phuoc Tran. “Transformer Encoders Incorporating Word Translation for Russian-Vietnamese Machine Translation”, 2020, Proceedings of the Sixth International Workshop on Balto-Slavic Natural Language Processing (BSNLP 2020): 59–64.
Huu Xuan Huynh, Van-Huy Pham, Edwin Lughofer, Mazin G. K. Al-Shamri, Mohammed F. R. A. Jabar, Nashwan M. Abdulkadir, Ahmed I. Abed, Sameer A. Hasan. “Context-Similarity Collaborative Filtering Recommendation”, 2020, IEEE Access 8: 33342-33351.
Bayu Tanuwijaya, Van-Huy Pham, Said Broumi, Darjan Karabasevic, Edmundas Kazimieras Zavadskas, Dragan Pamucar, Edmundas Zavadskas. “A Novel Single Valued Neutrosophic Hesitant Fuzzy Time Series Model: Applications in Indonesian and Argentinian Stock Index Forecasting”, 2020, IEEE Access 8: 60126-60141.
Khaliunaa Davagdorj, Jin-Suk Lee, Van-Huy Pham and Keun Ho Ryu. “A Comparative Analysis of Machine Learning Methods for Class Imbalance in a Smoking Cessation Intervention”, 2020, Applied Sciences 10(9): 3307.
Hung-Cuong Trinh, Yung-Keun Kwon. “A Data-Independent Genetic Algorithm Framework for Fault-Type Classification and Remaining Useful Life Prediction”, 2020, Applied Sciences 10(1): 368.
Tuguldur Amarbayasgalan, Van-Huy Pham, Nipon Theera-Umpon and Keun Ho Ryu. “Unsupervised Anomaly Detection Approach for Time-Series in Multi-Domains Using Deep Reconstruction Error”, 2020, Symmetry 12(8): 1251.
Phuoc Tran, Van-Deo Duong, Dien Dinh, Bay Vo, Huu Nguyen, and Long H.B. Nguyen. “Projecting dependency syntax labels from English into Vietnamese in English-Vietnamese bilingual corpus”, 2019, Proceedings of the 2019 International Conference on Asian Language Processing (IALP): 247-252.

Products

Develop intelligent systems capable of reasoning, learning, and making decisions like humans to solve complex problems across various domains.
Design algorithms that enable computers to automatically improve their prediction or classification performance through learning from data.
Create models and techniques that help computers understand, analyze, and generate human language in a natural and accurate manner.
Build computer vision systems that can recognize, analyze, and understand image and video content similarly to how humans do.
Train large-scale language models based on transformer architectures and diverse datasets to generate natural language, answer questions, and support a wide range of AI tasks.
Develop multimodal models that can process and integrate information from multiple data sources, such as text, images, audio, and video, for more comprehensive understanding.
Apply AI in healthcare to diagnose diseases, analyze medical images, predict disease progression, and support accurate and effective treatment decisions.
Leverage AI in finance to forecast markets, detect fraud, automate transactions, and analyze financial risks, thereby improving operational efficiency and security.

TON DUC THANG UNIVERSITY