News: A paper on indoor robot navigation has been accepted to ICRA 2018!
News: A paper on 360° visual grounding has been accepted to AAAI 2018!

About Me

I received my MS from National Tsing Hua University, where I worked with Prof. Min Sun on machine learning (deep learning) and its applications in robot learning, computer vision, natural language processing, and their intersections. I also had the pleasure of being a visiting student at the Stanford Vision Lab, working with Dr. Juan Carlos Niebles.

My recent projects focus on robot learning, video forecasting, and visual-language understanding (reasoning). I have also worked on video captioning, video title generation, video question answering, and accident anticipation.

My CV [PDF], last updated January 2018.

Visiting Student
Sept. 16 - Mar. 17

MS. in EE
Sept. 14 - Jul. 17

Summer Intern
Jul. 14 - Sept. 14

BS. in MEM
Sept. 10 - Jun. 14

Summer Intern
Jul. 13 - Aug. 13



Kuo-Hao Zeng, De-An Huang, Juan Carlos Niebles, Min Sun

Under Submission

Omnidirectional CNN for Visual Place Recognition and Navigation

Hung-Jui Huang, Tsun-Hsuan Wang, Juan-Ting Lin, Chan-Wei Hu, Kuo-Hao Zeng, Min Sun

ICRA 2018

Self-view Grounding Given a Narrated 360° Video

Shih-Han Chou, Yi-Chun Chen, Kuo-Hao Zeng, Hou-Ning Hu, Jianlong Fu, Min Sun

AAAI 2018
ICCV 2017 Workshop

Visual Forecasting by Imitating Dynamics in Natural Sequences

Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles

ICCV 2017 Spotlight

Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization

Kuo-Hao Zeng, Shih-Han Chou, Fu-Hsiang Chan, Juan Carlos Niebles, Min Sun

CVPR 2017 Spotlight

Leveraging Video Descriptions to Learn Video Question Answering

Kuo-Hao Zeng, Tseng-Hung Chen, Ching-Yao Chuang, Yuan-Hong Liao, Juan Carlos Niebles, Min Sun

AAAI 2017

Title Generation for User Generated Videos

Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun

ECCV 2016

Video Captioning via Sentence Augmentation and Spatio-Temporal Attention

Tseng-Hung Chen, Kuo-Hao Zeng, Wan-Ting Hsu, Min Sun

ACCV 2016 Workshop

Semantic Highlight Retrieval and Term Prediction

Kuo-Hao Zeng, Yen-Chen Lin, Ali Farhadi, Min Sun

TIP 2017
ICIP 2016
CVPR 2015 Workshop

Side Projects

Microsoft - MSR Video to Language Challenge


The MSR Video to Language Challenge was hosted at ACM MM 2016. The challenge aims to advance research on video captioning.

Computer Vision for Visual Effects

Team 11

CVFX is a graduate-level course at NTHU. We formed a team to complete all the course assignments, the term project, and the final project.

Work & Research Experiences

  • Ph.D. study
    begins in
    Fall 2018!

  • Sept. 2016 - Mar. 2017

    Visiting Student Researcher

    Stanford Vision Lab

    Research on cutting-edge AI techniques for computer vision applications. The projects concentrate on visual forecasting, including accident anticipation and the generalization of visual forecasting models.

  • Oct. 2014 - Aug. 2016

    MS. Student Researcher

    NTHU VSLab

    Focus on computer vision, machine (deep) learning, and their intersection. Research topics center on vision and language. The projects include video titling and video QA.

  • Jul. 2014 - Sept. 2014

    Summer Intern


    Develop efficient software/firmware tools to validate high-speed communication circuits. The tools read and write the registers and display the results.

  • Jul. 2013 - Aug. 2013

    Summer Intern


    Validate harmonic drives. A harmonic drive is a key component of a robot or robotic arm. It is located at the joints of a robotic system and provides gear reduction, increased rotational speed, or differential gearing.

Advisors & Collaborators

** indicates advisor.


Spring 2015

Head TA, Signal & System, NTHU

Fall 2015

Head TA, Computer Vision, NTHU