Relative Attributes for Abandoned Object Detection
Relative attributes have been successfully applied to large-scale object
detection, recognition and zero-shot learning. I have worked on an approach that
uses the relative attributes framework for abandoned object detection and alert
prioritization in large-scale video surveillance. Abandoned object alerts are
represented in terms of three attributes: Staticness, Foregroundness and
Abandonment. A ranking function for each of the three relative attributes is
learnt using the SVM Ranking formulation. Using the learnt attributes, a second-level
ranker prioritizes the alerts that are most relevant to the end user.
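The sketch below illustrates the general idea behind a ranking SVM: learn a linear scoring function from ordered pairs by penalizing, with a hinge loss, every pair that is scored in the wrong order. It is only a minimal illustration and not the system described above. The function names, the toy two-dimensional feature vectors and the stochastic gradient solver are all assumptions made for the example.

```python
import random


def train_rank_svm(pairs, dim, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn weights w so that w.x_hi > w.x_lo for each (x_hi, x_lo) pair.

    Runs stochastic gradient descent on the pairwise objective
    reg/2 * |w|^2 + max(0, 1 - w.(x_hi - x_lo)).
    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for x_hi, x_lo in pairs:
            diff = [a - b for a, b in zip(x_hi, x_lo)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            violated = margin < 1.0  # hinge loss is active for this pair
            for i in range(dim):
                grad = reg * w[i] - (diff[i] if violated else 0.0)
                w[i] -= lr * grad
    return w


def score(w, x):
    """Attribute strength of a feature vector under the learnt ranker."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical feature vectors, ordered from weakest to strongest attribute.
a, b, c, d = (0.0, 1.0), (1.0, 0.5), (2.0, 0.0), (3.0, 0.7)
training_pairs = [(b, a), (c, b), (d, c), (c, a), (d, a), (d, b)]
weights = train_rank_svm(training_pairs, dim=2)
ranked = sorted([a, b, c, d], key=lambda x: score(weights, x), reverse=True)
```

In this toy setup `ranked` lists the items from strongest to weakest attribute. A second-level ranker could be trained in the same way, using the three attribute scores as its input features.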
Retail Video Analytics
Video surveillance and analytics have played a pivotal role in deterring threats posed by anti-social elements
to public facilities such as airports, government buildings and even military installations. They also have great
potential for enforcing compliance in private establishments such as retail stores. A major source of revenue
shrink in retail stores is the cashier's intentional or unintentional failure to check out items properly.
More recently, a few automated video surveillance systems have been developed to monitor cashier lanes
and detect non-compliant activities. These systems use data from surveillance video cameras and transaction logs
(TLog) recorded at the Point-of-Sale (POS). Approaches that make use of only the statistics of the TLog data to
detect abnormal events within transactions tend to have a high false positive rate compared to video-based
systems. On the other hand, video-based systems visually monitor the activities of a cashier around the
POS to detect item checkouts and verify them against the TLog data. Being able to detect
as many non-compliant events as possible while keeping the number of false alarms low is key to the successful
deployment of these systems. Optimizing these two conflicting objectives is challenging because of
variations and noise within the input data streams. While working as part of the Exploratory Computer Vision
Group at the IBM T.J. Watson Research Center, I contributed a text-based approach to analyzing videos
represented as time-ordered discrete features. Instead of using the two streams
(video and TLog) of data separately, we posit that much can be learned about the nature of an item scan
performed by a cashier by combining them into a single stream. This is because most item checkouts are normal
(no fraud), so a barcode is registered in the TLog. By analyzing visual information around the registered
barcode events, it is possible to model variations in the cashier's activities for checking out an item. This is
helpful in detecting non-compliant cashier activities more robustly in the presence of noise in either the TLog
or the video data. More broadly, this work aims to open up new ways of looking at
video data in order to infer useful knowledge. Text-based algorithms are simpler and faster
than many sophisticated video analysis techniques, but their potential in addressing some of the vision
challenges is yet to be fully explored.
Microarray data analysis
Microarrays enable simultaneous monitoring of thousands of genes in a tissue. Methods for analyzing such data, including normalization,
gene selection and phenotypic state prediction, have to account for both technical and biological noise in the data. I am working on evaluating
the effectiveness of existing methods and developing novel methods to address these issues. New methods to incorporate Gene Ontology (GO) tags
for state prediction are being explored with the help of probabilistic generative models for multimodal data. Phenotypic state prediction
performance is being used to evaluate these methods.
Competitive Expectation Maximization
Expectation Maximization (EM) is a commonly used machine learning tool for missing data problems, with probabilistic
mixture modeling as one application. However, it is inherently a local maximum likelihood algorithm and is sensitive to initialization. In probabilistic
mixture modeling, automatically determining the number of clusters is also an important research problem. To address these issues, I
have worked on implementing a variant of the Competitive Expectation Maximization (CEM) algorithm. While determining the number of mixture
components automatically, CEM is reported to reach the global maximum likelihood with high probability. The implementation was done in C.
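For readers unfamiliar with the baseline, the following is a minimal sketch of plain EM for a one-dimensional Gaussian mixture with a fixed number of components. It is not CEM and not the C implementation mentioned above: the way CEM chooses the number of components is omitted, and the function names, toy data and initial parameters are assumptions made for illustration. It does show the dependence on initialization, since the caller has to supply the starting means, variances and weights.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Plain EM for a 1-D Gaussian mixture with a fixed number of components."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the soft assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against collapsing variance
    return means, variances, weights


# Toy data: two well-separated clusters around 0 and 10.
data = [-0.5, 0.0, 0.5, 0.2, -0.2, 9.5, 10.0, 10.5, 10.2, 9.8]
means, variances, weights = em_gmm_1d(
    data, means=[1.0, 9.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
```

With this well-separated data and a reasonable starting point, the estimated means settle near the two cluster centers. A poor starting point can leave plain EM in a worse local maximum, which is the weakness that motivates CEM.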
Semantic evaluation of features using word prediction performance
Large databases of digital images that come with words associated with the images help to learn relationships among visual features
of image regions and words. This can be used to predict words for new images automatically (auto-annotation). It has applications in
content-based image indexing and retrieval, and also in generic object recognition. I am working on identifying and evaluating visual
features and segmentation algorithms using word prediction performance as the measure. This work also involves programming in C and C++
and writing shell scripts to set up and manage experiments on large databases.
Modifications to Normalized Cuts segmentation algorithm
As part of my Master's thesis, I worked on modifying the Normalized
Cuts segmentation algorithm to improve its grouping performance on natural images. I then compared the original and modified
versions of the Normalized Cuts algorithm using the word prediction tool.
Human face detection and tracking in color image sequences
Human Face Recognition technology requires isolation of face(s) of individuals from an image sequence/video. As part of my
undergraduate thesis, I worked in a group of two to develop an algorithm to detect/track and segment human face(s) from color
images/video. This was done using statistical skin color modeling and connected component operators. This module provided the necessary
input to a surveillance system employing face recognition technology. The algorithm was implemented in MATLAB.
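As an illustration of the connected component step, the sketch below labels 4-connected regions in a binary mask, such as the output of a skin color classifier. It is written in Python for brevity and is not the MATLAB implementation from the thesis. The function name, the choice of 4-connectivity and the toy mask are assumptions.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask.

    Returns (labels, count), where labels has the same shape as mask,
    background pixels are 0 and regions are numbered 1..count in scan order.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy mask: 1 marks a pixel classified as skin (hypothetical data).
mask = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
labels, count = label_components(mask)
```

Each labeled region can then be filtered by size or shape to decide whether it is a face candidate.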
"Cross modal disambiguation," Kobus Barnard, Keiji
Yanai, Matthew Johnson, and Prasad Gabbur, in Toward Category-Level Object Recognition, Jean Ponce, Martial Hebert, Cordelia Schmid, eds., Springer-Verlag LNCS Vol. 4170, 2006.
"Evaluation Strategies for Image Understanding and Retrieval," Keiji Yanai, Nikhil V.
Shirahatti, Prasad Gabbur and Kobus Barnard, Proc. of ACM Multimedia Workshop on Multimedia Information Retrieval (MIR), Singapore,
November, 2005 (Invited paper).
Scientist, ID Analytics Inc., San Diego, CA, Summer 2011-Present
Working on boosted classifiers to quantify identity risk.
Working on an approach to learn individual statistics from group statistics
(obtained from United States Census Bureau data) to improve feature set for
identity risk prediction.
Researcher in Computer Vision and Machine Learning, Exploratory
Computer Vision Group, IBM T.J. Watson Research Center, Hawthorne, NY, Summer
2010-Spring 2011
Worked with Dr. Sharathchandra Pankanti on computer vision and machine learning algorithms for video surveillance applications.
Research Associate, with Kobus Barnard, Computer Science Dept., University of Arizona, Spring,
Summer & Fall 2003, Spring 2005, Spring, Summer & Fall 2007, Spring, Summer & Fall 2008, Spring 2009
Working on microarray data analysis.
Worked on Competitive Expectation Maximization.
Worked on image segmentation and visual features for auto-annotation using joint image-word modeling.
Research Assistant, with Hong Hua, Optical Sciences Center, University of Arizona,
Fall 2005, Spring & Fall 2006
Worked on vision-based eye-gaze tracking for fovea contingent displays.
Worked with Hui Chao on automatic content
extraction and layout change in multilayer PSD images. Developed a real-time algorithm to do connected component labeling to aid in the
process of automatic content extraction.
Worked on a motion detection algorithm in video for surveillance and security applications. Evaluated
the performance of different color spaces (RGB, HSL, CIE L*a*b* and YCbCr) in suppressing the effects of small illumination changes and weak
shadows on motion detection.
Developed DSP algorithms for a people counting system based on a planar scan infra-red sensor. This
system has been successfully installed to continuously monitor a secure area in one of the airports in the US.
Intern, Eyematic Interfaces Inc., Los Angeles, CA, Summer 2002
Worked on using synthetic images for face representations under varying illumination conditions. Rendered a database of synthetic face images under different illuminations using 3D Studio Max software. These images were used in training their face finder neural network to improve its performance when operating under poor lighting conditions. Also developed quantitative measures to compare performances of face detection systems and used these measures to show an improvement in performance of the above system.
Teaching Assistant, University of Arizona, Spring & Fall 2002
Course - C Programming for Engineering Applications (ECE 275). Supervised programming labs, conducted lectures and maintained class website.
Grader, University of Arizona, Fall 2001
Course - Digital Signal Processing (ECE 429/529). Graded assignments and tracked student grades.
Young Engineering Research Fellow, Indian Institute of Science, Bangalore, India, Fall 2000
Research with Prof. K. R. Ramakrishnan on developing a
computational method for human face detection and tracking in color image sequences.
Intern, Centre for Development of Telematics (C-Dot), India, Spring 2000
Implemented a database project, in C++, on automating a mobile service billing system. Designed an
object-oriented model for maintaining their subscriber database and used it in automating the billing process. Worked in a team of 3 to accomplish the goals of the project.
PROJECTS
Error control coding
Compared the performance of uncoded and convolutionally coded communication systems using QPSK modulation over Additive White Gaussian Noise (AWGN) channels. The decoding scheme was implemented using the Viterbi algorithm.
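A minimal sketch of convolutional encoding and hard-decision Viterbi decoding is given below. The code parameters are assumptions chosen for illustration (rate 1/2, constraint length 3, generators 7 and 5 in octal), since the project description does not state them. The QPSK modulation and AWGN channel are left out, and a single flipped bit stands in for channel noise.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_len=3):
    """Rate-1/n convolutional encoder; appends K-1 zero tail bits."""
    state = 0  # the previous K-1 input bits, most recent in the high bit
    out = []
    for bit in list(bits) + [0] * (constraint_len - 1):
        reg = (bit << (constraint_len - 1)) | state
        out.extend(bin(reg & g).count("1") % 2 for g in generators)
        state = reg >> 1
    return out


def viterbi_decode(received, generators=(0b111, 0b101), constraint_len=3):
    """Hard-decision Viterbi decoder using Hamming distance as branch metric."""
    n = len(generators)
    num_states = 1 << (constraint_len - 1)
    inf = float("inf")
    metrics = [0.0] + [inf] * (num_states - 1)  # encoder starts in state 0
    paths = [[] for _ in range(num_states)]
    for t in range(0, len(received), n):
        symbol = received[t:t + n]
        new_metrics = [inf] * num_states
        new_paths = [None] * num_states
        for state in range(num_states):
            if metrics[state] == inf:
                continue  # state not reachable yet
            for bit in (0, 1):
                reg = (bit << (constraint_len - 1)) | state
                expected = [bin(reg & g).count("1") % 2 for g in generators]
                cost = metrics[state] + sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                if cost < new_metrics[nxt]:  # keep the survivor path
                    new_metrics[nxt] = cost
                    new_paths[nxt] = paths[state] + [bit]
        metrics, paths = new_metrics, new_paths
    # The tail bits force the encoder back to state 0; drop them from the output.
    best = paths[0]
    return best[:len(best) - (constraint_len - 1)]


message = [1, 0, 1, 1, 0, 0, 1, 0]
coded = conv_encode(message)
noisy = list(coded)
noisy[5] ^= 1  # simulate a single channel error
decoded = viterbi_decode(noisy)
```

Here `decoded` recovers `message` despite the flipped bit. An uncoded transmission would have delivered that error directly to the receiver, which is the kind of difference the comparison measures.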
Modeling of speech signals
Performed speech signal analysis/synthesis using Linear Predictive Coding (LPC).
Determined AR-coefficients of a speech process using Yule-Walker equations.
Modeled a deterministic signal as the impulse response of an all-pole filter using Prony's method.
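The Yule-Walker equations mentioned above are commonly solved with the Levinson-Durbin recursion. The sketch below shows that recursion in Python. It is an illustration and not the code written for this project: the function names and the sign convention (x[n] predicted as the sum of a[k] * x[n-k]) are assumptions.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation estimates r[0..max_lag]."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations r[m] = sum_k a[k] * r[m - k].

    Returns (a, error), where a holds the AR coefficients a[1..order]
    and error is the final prediction error variance.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        reflection = acc / error
        new_a = a[:]
        new_a[m] = reflection
        for k in range(1, m):
            new_a[k] = a[k] - reflection * a[m - k]
        a = new_a
        error *= 1.0 - reflection ** 2
    return a[1:], error


# Autocorrelation of an ideal AR(1) process with coefficient 0.5.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the recursion returns a first coefficient of 0.5 and a second coefficient of zero, as expected for a first-order process. In practice the autocorrelation values would come from `autocorrelation` applied to a windowed speech frame.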
Adaptive filter design
Performed noise cancellation in speech signals using an RLS adaptive filter and a Wiener filter.
Computer graphics
Implemented line and polygon drawing algorithms.
Implemented 3D viewing pipeline.
Implemented recursive ray-tracing.
Image analysis/Computer vision algorithms
Implemented image morphology algorithms for dilation, erosion, opening, closing and connected component labeling.
Implemented edge detection algorithms using Roberts, Sobel, Prewitt, Frei-Chen and LoG operators with hysteresis
thresholding and non-maximum suppression.
Implemented Hough transform for straight-line detection.
Performed image segmentation using Kittler's algorithm.
4-bit Microprocessor Design
Worked in a team of three to design a 4-bit microprocessor with an associated instruction set. The design used an efficient microprogramming architecture. Basic gate-level implementation and simulation were done in VHDL.
Offline signature verification system
Authentication/verification of hand-written signatures can be automated using digital signal processing algorithms on scanned images of signatures. As an attempt towards this, I developed software in MATLAB for automatic verification of hand-written signatures using simple geometric features.
GRADUATE LEVEL COURSES
Random Processes for Engineering Applications.
Stochastic Processes.
Linear Algebra.
Numerical Analysis.
Computer Vision.
Computer Graphics.
Regression and Multivariate Analysis.
Digital Image Processing.
Digital Communication Systems.
Information Theory.
Algebraic Coding Theory.
Linear Systems Theory.
Computer Aided Logic Design.
Advanced Digital Signal Processing.
Advanced Concepts in Software Systems - Multimedia Data Mining and Retrieval.
Fundamentals of Statistical Machine Learning.
AWARDS / MERITS
Graduate College Fellowship, University of Arizona, Fall 2001.
Young Engineering Research Fellow of the Indian Institute of Science, Bangalore, India.
Ranked 3rd in the University in my undergraduate degree (2nd in a class of 75).
COMPUTER SKILLS
Languages: C, C++, Objective-C, Java, HTML, Perl.
Packages: MATLAB, OpenGL, OpenCV.
Hardware Description Language: VHDL.
Assembly level programming: Intel 8085, 8086, MC68000, TI TMS320C54x.