|November 23 – 24, 2018, Ho Chi Minh City, Vietnam|
School of Information Science, JAIST
Minh Le Nguyen is currently an Associate Professor at the School of Information Science, JAIST, where he leads the lab on Machine Learning and Natural Language Understanding. He received his B.Sc. degree in information technology from Hanoi University of Science and his M.Sc. degree in information technology from Vietnam National University, Hanoi, in 1998 and 2001, respectively. He received his Ph.D. degree in Information Science from the School of Information Science, Japan Advanced Institute of Science and Technology (JAIST) in 2004, and was an assistant professor at the School of Information Science, JAIST, from 2008 to 2013. His research interests include machine learning, natural language understanding, question answering, text summarization, machine translation, big data mining, and deep learning.
In this talk, we focus on state-of-the-art work on natural language generation (NLG) using deep learning approaches. We will highlight existing work on NLG from the leading natural language processing conferences in 2018, and then present applications of NLG in chatbot systems. The first part of the tutorial will cover background knowledge on deep learning for natural language processing. The second part will discuss NLG techniques, from the basics to the state of the art. The third part will show how NLG techniques are used in spoken dialog systems (e.g. Microsoft's Cortana, Apple's Siri, Amazon Alexa, Google Assistant, and Facebook's M) and chatbot systems. The final part will conclude with a discussion of the challenges of applying NLG to the Vietnamese language.
Ho Chi Minh City University of Technology (HCMUT), Vietnam
Dr. Quan Thanh Tho is an Associate Professor in the Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), Vietnam. He received his B.Eng. degree in Information Technology from HCMUT in 1998 and his Ph.D. degree from Nanyang Technological University, Singapore, in 2006. His current research interests include formal methods, program analysis/verification, the Semantic Web, machine learning/data mining, and intelligent systems. He currently heads the Department of Software Engineering in the Faculty, and also serves as the Chair of the Computer Science Program (undergraduate level).
A statistical language model is a probability distribution over sequences of words. Language modeling is used in various computing tasks such as speech recognition, machine translation, optical character and handwriting recognition, and information retrieval, among other applications. Whereas the n-gram model is the traditional approach to language modeling, neural language models have recently emerged as a means of approximating the probability of a sentence using neural networks and word embeddings. An advantage of a neural language model is that it can be further applied to other NLP tasks for which training data may be limited. In this talk, we realize this idea by introducing a Vietnamese neural language model trained on a large corpus of social media data. When applying this neural language model to other NLP tasks, including entity recognition, spam detection, and topic modeling, with relatively small training datasets, we observe improved performance compared to existing deep learning approaches that use typical word embedding techniques.
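To illustrate the abstract's opening definition, the following is a minimal sketch of a traditional n-gram (here, bigram) language model with add-one smoothing, assigning a probability to a word sequence. The toy corpus and word choices are illustrative assumptions, not data from the talk:

```python
from collections import Counter

# Hypothetical toy corpus; a real model would be trained on a large
# collection of text (e.g. social media data, as in the talk).
corpus = [
    ["i", "like", "language", "models"],
    ["i", "like", "neural", "networks"],
    ["neural", "language", "models", "work", "well"],
]

# Count unigrams and bigrams, padding each sentence with <s> and </s>.
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    padded = ["<s>"] + sent + ["</s>"]
    unigrams.update(padded)
    bigrams.update(zip(padded, padded[1:]))

vocab_size = len(unigrams)

def sentence_probability(words):
    """P(w1..wn) ~ product of P(w_i | w_{i-1}), add-one smoothed."""
    padded = ["<s>"] + words + ["</s>"]
    prob = 1.0
    for prev, cur in zip(padded, padded[1:]):
        prob *= (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
    return prob

p_fluent = sentence_probability(["i", "like", "language", "models"])
p_shuffled = sentence_probability(["models", "language", "like", "i"])
print(p_fluent > p_shuffled)  # the fluent ordering gets higher probability
```

A neural language model replaces these sparse counts with a network over word embeddings, which is what allows it to be reused for other tasks with limited training data, as the talk describes.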