Maneesh Kumar Singh

Engineering Video Surveillance Systems: Design Practices and Challenges to Successful Adoption


Design and implementation of intelligent visual surveillance (IVS) systems, for both forensic and real-time needs, has seen more than two decades of research and development, beginning with early large systems like CMU VSAM, ADVISOR and IBM S3 from the late 1990s and early 2000s. Nonetheless, in spite of the ever-expanding and ubiquitous need for IVS systems, and with notable exceptions for limited use cases, they have not yet been adopted by the industry on a large scale. I was a part of the core team in the Real-Time Vision and Modeling Department at Siemens Corporate Technology, Princeton, NJ, and was intimately involved in the design and implementation of such single- and multi-camera monitoring and surveillance systems for both indoor settings (railways, airports, tunnels, automotive) and outdoor settings (perimeter security, aerial surveillance, traffic monitoring). I will draw on this experience to highlight some of the main challenges and the design practices we followed to address them. I will also discuss whether recent advances in deep learning, open set theory and domain adaptation are likely to address some of these challenges, and whether significant challenges remain.

Dr. Singh is the Director of Image and Video Analytics at Verisk Analytics. He leads the R&D efforts for the development of cognitive analytics and machine learning technologies for business use cases involving computer vision, natural language processing and risk analysis on structured and unstructured data. Verisk Analytics builds tools for risk assessment, risk forecasting and decision analytics in a variety of sectors including insurance, financial services, energy, government and human resources.
From 2013 to 2015, Dr. Singh was a Technology Leader in the Center for Vision Technologies at SRI International, Princeton, NJ. At SRI, he was the technical lead for the DARPA Visual Media Reasoning (VMR) project for Automatic Performance Characterization (APC) and led the development and implementation of efficient Pareto-optimal performance curves and a multithreaded APC system for benchmarking more than 40 CV and ML algorithms. Dr. Singh was the Algorithms Lead for the DARPA CwC CHAPLIN project for designing a human-computer collaboration (HCC) system to enable composition of visual narratives (cartoon strips, movies) with effective collaboration between a human actor and the computer. He was also a key performer on the DARPA DTM (Deep Temporal Models) seedling project for designing deep learning algorithms on video data. Previously, Dr. Singh was a Staff Scientist at Siemens Corporate Technology, Princeton, NJ, until 2013. At Siemens, he led and contributed to a large number of projects for the successful development and deployment of computer vision and machine learning technologies in multi-camera security and surveillance, aerial surveillance, advanced driver assistance and intelligent traffic control; industrial inspection; and medical image processing and patient diagnostics. Dr. Singh received his Ph.D. in Electrical and Computer Engineering from the University of Illinois in 2003. He has authored over 25 publications and 14 U.S. and international patents.

Peter H. Tu

Surveillance and Social Situational Awareness


This talk will describe a variety of methods that have been developed for the purposes of understanding group level social behaviors using stand-off video surveillance methods. Three main topics are considered: 1) the GE Sherlock System: a comprehensive approach to capturing and analyzing non-verbal cues of persons in crowd/group level interactions, 2) One Shot Learning: a new approach to crowd level behavior recognition based on the concept that a new behavior can be recognized with as little as a single example and 3) Agent Based Inference: a novel approach to the analysis of individual cognitive states of persons interacting in group or crowd level social interactions. The talk starts with a description of the GE Sherlock system, which encompasses methods such as person tracking in crowds, dynamic PTZ camera control, facial analytics from a distance such as gaze estimation and expression recognition, upper body affective pose analysis and the inference of social states such as rapport and hostility. The talk then discusses how cues derived from the Sherlock system can be used to construct semantically meaningful behavior descriptors, or affects, allowing for signature matching between behaviors, which can be viewed as a form of one shot learning. Going beyond affects based on direct observation, we argue that more meaningful affects can be constructed via the inference of the cognitive states of each individual. To this end we introduce the Agent Based Inference framework. The talk concludes with a discussion of how such methods are making their way into commercial use via efforts such as the intelligent city, the intelligent airport and the intelligent hospital.


Peter is currently a Senior Principal Scientist at GE Global Research. He received his B.S. in Systems Design Engineering from the University of Waterloo, Canada, in 1990 and his Ph.D. in Engineering Science from Oxford University, England, in 1995.

In 1990 Dr. Tu joined Sony Research in Tokyo, Japan, where he developed a number of computer vision algorithms for man-machine interfaces. While at Oxford University, his research was devoted to the development of computer vision methods for the automatic analysis of seismic imagery. In 1997 Dr. Tu became a senior research scientist working at General Electric’s Global Research center. In partnership with Lockheed Martin, he has developed a set of latent fingerprint matching algorithms for the FBI Automatic Fingerprint Identification System (AFIS). Dr. Tu has also developed optical methods for the precise measurement of 3D parts in a manufacturing setting. Dr. Tu was the principal investigator for the FBI ReFace project, which is tasked with developing an automatic system for face reconstruction from skeletal remains. In 2006, he was the principal investigator for the National Institute of Justice’s 3D Face Enhancer Program. This work was focused on improving face recognition from poor quality surveillance video. In 2008, Dr. Tu led the GE video analytics team that participated in the DHS STIDP demonstration program; the goal of STIDP is to establish an effective defence against suicide bomber attacks. Dr. Tu is the principal investigator for the DARPA sponsored effort associated with group level behavior recognition at a distance. Currently Dr. Tu is the Senior Principal Scientist for a group of 15 researchers in the field of multi-view video analysis with the aim of achieving reliable behavior recognition in complex environments. He has helped to develop a large number of analytic capabilities including: person detection from fixed and moving platforms, crowd segmentation, multi-view tracking, person reacquisition, face modelling, face expression analysis, face recognition at a distance, face verification from photo IDs and articulated motion analysis. Dr. Tu has over 50 peer reviewed publications and has filed more than 25 U.S. patents.