Exploring the Role of Context and Social Media to Enhance the User-Experience in Photography
and Image Tag Recommendation
Yogesh Singh Rawat, PhD in Computer Science
National University of Singapore
23 Mar 2017 (Thursday)
2:30pm - 3:30pm
Meeting Room 4-4, Level 4
School of Information Systems
Singapore Management University
80 Stamford Road
We look forward to seeing you at this research seminar.
The last decade has seen a significant improvement in the ease of capture of multimedia content.
Cameras now have intelligent features, such as automatic focus, face detection, etc., to assist users in
taking better photos. However, it is still a challenge to capture high-quality photographs. The complex
nature of photography makes it difficult to provide real-time assistance to a user for improving the
aesthetics of capture. The recent advancements in sensor technology, wireless networks, and social
media provide us an opportunity to enhance the photography experience of users.
Our work aims to provide real-time photography assistance to users by leveraging camera
sensors and social media content. We focus on two different aspects of user-experience in
photography. The first focus is on camera guidance and the second is on location recommendation for
photography. In addition, we also focus on exploring the role of context and user-preference in
developing a deep network for image tag recommendation. In this talk, we will provide a brief
overview of our work on camera guidance and mainly discuss the research on location and image-tag
recommendation.
In camera guidance, we focus on landmark photography where feedback regarding scene composition
and camera parameter settings is provided to a user while a photograph is being captured. We also
focus on group photography, where we use ideas from the spring-electric graph model and color energy to
provide real-time feedback to the user about the arrangement of people, their position and relative
size on the image frame. In location recommendation, we first focus on a viewpoint recommendation
system that can provide real-time guidance based on the preview on the user's camera, the current time, and
the user's geo-location. Next, we introduce a photography trip recommendation method which guides a
user to explore any tourist location from the photography perspective. We draw on ideas from
behavioural science for a better photography experience. More specifically, we utilize the optimal
foraging theory to recommend a tour for efficient exploration of a tourist location.
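
As a rough illustration of the foraging idea above, the sketch below greedily picks the next spot with the highest value per unit of travel time, which is the "profitability" notion used in foraging models. It is a minimal sketch under stated assumptions, not the method presented in the talk: the function name plan_tour, the planar coordinates, the appeal scores, and the time budget are all hypothetical.

```python
import math


def plan_tour(start, spots, budget):
    """Greedily order spots by value gained per unit of travel time.

    start:  (x, y) starting position (hypothetical planar coordinates)
    spots:  dict mapping name -> ((x, y), value), where value is an assumed
            photographic-appeal score (for example, one mined from shared photos)
    budget: total travel time available, in the same units as distance
    """
    position, remaining, tour = start, budget, []
    unvisited = dict(spots)
    while unvisited:
        # Profitability = expected value / travel cost from the current position.
        def profitability(name):
            location, value = unvisited[name]
            return value / max(math.dist(position, location), 1e-9)

        best = max(unvisited, key=profitability)
        location, _ = unvisited.pop(best)
        cost = math.dist(position, location)
        if cost > remaining:
            continue  # unreachable within the remaining budget; consider the rest
        tour.append(best)
        position, remaining = location, remaining - cost
    return tour
```

For instance, with a nearby low-value spot, a slightly farther high-value spot, and a distant spot outside the budget, the sketch visits the high-value spot first and skips the distant one.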
Social media users like to assign tags to captured images before sharing them with others. One of the
biggest advantages of adding tags to photographs is that it makes them searchable and easily
discoverable by other users. However, assigning tags to photographs that are shared immediately
after capture can be challenging. In such a scenario, real-time tag prediction based on image content
can be very useful. That said, the tags a user assigns to an image depend on the user-context
as well as the visual content. In addition, user-preference plays an important role in predicting
image tags. Motivated by this, we propose a convolutional deep neural network which integrates the
visual content along with the user-preference and user-context in a unified framework for tag
prediction. We observe that integrating user-preference and the context significantly improves the
tag prediction accuracy.
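
To make the idea of a unified framework concrete, the sketch below concatenates a visual feature vector with user-preference and user-context vectors and scores each tag with a single linear layer followed by a sigmoid. This is only an illustrative stand-in, assuming precomputed features and hand-set weights: the function predict_tags, the feature layout, and the weights are hypothetical, and the network described in the talk is a convolutional deep network whose details are not given here.

```python
import math


def predict_tags(visual, preference, context, weights, biases, tags, top_k=2):
    """Rank tags from concatenated visual, preference, and context features.

    visual, preference, context: lists of floats (assumed precomputed features)
    weights: one weight row per tag, each as long as the concatenated features
    biases:  one bias per tag
    """
    features = visual + preference + context  # simple feature concatenation
    scores = {}
    for tag, row, bias in zip(tags, weights, biases):
        logit = sum(w * x for w, x in zip(row, features)) + bias
        scores[tag] = 1.0 / (1.0 + math.exp(-logit))  # per-tag probability
    # Return the top_k tags with the highest predicted probability.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Because the preference and context features enter the same score as the visual features, two users sharing the same image can receive different tag rankings.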
About the speaker
Yogesh Singh Rawat recently completed his doctoral studies in the Computer Science Department at the School of
Computing, National University of Singapore, where he worked with Professor Mohan Kankanhalli in the
Multimedia Analysis and Synthesis Lab. He obtained his BTech degree in Computer Science and
Engineering from the Indian Institute of Technology, IIT-BHU, Varanasi in 2009. His PhD dissertation
focuses on enhancing the photography experience of users utilizing social media and camera sensors. It is
centered around computational media aesthetics and analysis of social media images for
photography. His research interests lie at the intersection of Machine Learning, Big Data, Social
Computing, Image and Video Processing, and Multimedia Computing. Before joining NUS in summer
2012, he worked at Mentor Graphics, India (2009-2012), with Praveen Shukla as an
R&D developer in the Veloce Emulation team.
LARC is supported by the National Research Foundation, Prime Minister's Office, Singapore under its International Research Centres in Singapore Funding Initiative.