CUHK Research: Changing the world

keep our eyes very closely on what’s hot in this area if we’re going to catch the train.” To meet this challenge, he and his top students monitor what is new in their field. “The research intensity is very high, which means a lot of people are researching the same problem at the same time. What may once have been a 50-year research problem can now be solved in half a year – then my PhD students need to find another topic to study.”

Professor Jia draws on his 20-plus years of experience to help 40 or so PhD students in his group see the big picture and focus on the most important topics in the computer engineering community, such as how to combine multi-modality information including natural language, images, videos, and sound.

A related initiative is the Deep Vision Lab, which Professor Jia set up informally so that computer engineering friends from other Hong Kong universities and top universities overseas can, together, review key papers and the latest developments. They aim to identify the most important problems to be worked on over the next five to 10 years.

Visualising through words

Professor Jia is taking on one of the toughest problems yet: how to bridge computer vision and natural language processing (NLP). “Originally, language and visual content processing were totally separate fields of research, but they are converging because the computer vision people are looking for NLP models to process visual data. There was also a time when NLP researchers used computer vision solutions.”

Among many potential impacts, success would make it possible to create a poster by talking naturally to a computer instead of typing in keywords. That is easier said than done. “The way we encode information and messages for vision and language is completely different,” says Professor Jia. “Joining these two is undoubtedly one of the most important tasks in my research pipeline.”

The Deep Vision Lab
