Moving object tracking by regularization via sparsity in wide area aerial video

Date
2017-12-15
Authors
Özyurt, Erdem Onur
Journal Title
Journal ISSN
Volume Title
Publisher
Institute of Science and Technology
Abstract
The field of object tracking has recently been drawing increasing attention from researchers around the world. Object tracking is the process of estimating the location of one or more objects over a video sequence spanning some period of time. The task, already accompanied by challenging problems such as occlusion, illumination change, fast motion, and pose and scale variations, becomes even harder when wide area aerial video sequences are involved, because a target object shot from a long distance by an aerial vehicle is represented by only a few pixels. The growing demand for video surveillance over wide area aerial video, for instance for traffic monitoring, security and surveillance, or environmental analysis, necessitates dealing with these problems. In this thesis, the problem of object tracking in wide area aerial video is handled by developing a method that employs a target update scheme. The proposed model is based on regularization via sparsity-based conditional density propagation. The regularization task is carried out by sparsely representing the possible target candidates and recursively approximating the best candidate with minimal reconstruction error, in keeping with the philosophy of conditional density propagation, i.e., particle filtering. The target candidates are propagated through the frames via estimated affine transformation parameters that constitute the state vector. Particle filtering over sparse target representations already exists in the literature: each target candidate is represented in the linear span of a fixed or slowly updating target template set plus a trivial template set, the template coefficients are derived through l1 minimization, and the candidate with the smallest reconstruction error is selected as the tracking result.
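The sparse-representation step can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the ISTA solver, template dimensions, and regularization weight `lam` are assumed stand-ins for the exact l1 minimization used in the work.

```python
import numpy as np

def l1_approximate(y, T, lam=0.01, steps=500):
    """Represent candidate y in the span of target templates T (columns)
    plus trivial (identity) templates, with an l1 penalty on the
    coefficients. Solved with plain ISTA as a stand-in solver."""
    B = np.hstack([T, np.eye(len(y))])        # [target | trivial] templates
    c = np.zeros(B.shape[1])
    step = 1.0 / np.linalg.norm(B, 2) ** 2    # step size below 1/Lipschitz
    for _ in range(steps):
        c = c - step * (B.T @ (B @ c - y))    # gradient step on 0.5||y - Bc||^2
        c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)  # soft-threshold
    return c, float(np.linalg.norm(y - B @ c))  # coefficients, reconstruction error

def best_candidate(candidates, T):
    """Select the particle whose sparse reconstruction error is smallest."""
    errors = [l1_approximate(y, T)[1] for y in candidates]
    return int(np.argmin(errors))
```

A candidate that matches a target template well is reconstructed mostly by the target templates at low error, while a corrupted candidate must lean on many trivial-template coefficients, which the l1 penalty makes costly.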
The process is repeated recursively in subsequent frames. Our major contribution to this system is the integration of a monitoring mechanism that allows the tracker to automatically perceive anomalies and update the observation model accordingly. Specifically, tracking is monitored through model parameters that include the minimum reconstruction error after approximating the target candidates in the current frame, the degree to which the coefficients of the trivial templates dominate those of the target templates in the l1 minimization, and the number of particles used in the l1 minimization. After the approximation via l1 minimization in the current frame, if no anomaly implying divergence from the target object is detected, tracking continues with the current target template set without interruption. Whenever an anomaly is reported, a detection of the target object is requested from the object detector, which in our model is a deep learning network. Because of its excellent object localization capability, Faster R-CNN is employed as the object detector. Once a detection is received, a new target template set is generated and tracking resumes with the updated set. Unlike existing models that track continuously, the ability to suspend tracking prevents the tracker from diverging, which increases robustness to occlusions in which the target object leaves the scene. In addition, updating the target template set upon a detection makes the tracker more robust to fast motion, because the template set is refreshed with a more recent appearance of the target object. To avoid false alarms, it is vital that the target update be performed on time.
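The monitoring decision over the three model parameters can be sketched as below. The threshold values are illustrative assumptions, not those used in the thesis, and the actual system analyzes these parameters retrospectively rather than frame by frame.

```python
import numpy as np

def anomaly_detected(min_error, target_coefs, trivial_coefs, n_particles,
                     err_thresh=0.30, dominance_thresh=2.0, min_particles=50):
    """Report an anomaly when any monitored model parameter signals
    divergence: a large minimum reconstruction error, trivial-template
    coefficients dominating target-template coefficients, or too few
    particles in the l1 minimization. All thresholds are illustrative."""
    dominance = np.abs(trivial_coefs).sum() / max(np.abs(target_coefs).sum(), 1e-12)
    return bool(min_error > err_thresh
                or dominance > dominance_thresh
                or n_particles < min_particles)
```

When this returns True, the tracker would suspend tracking, query the Faster R-CNN detector, rebuild the target template set from the detection, and resume.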
To achieve this, the proposed monitoring system is designed to be triggered by analyzing the model parameters retrospectively over time. The experiments are performed on the VIVID and UAV123 datasets, which include wide area aerial videos. Commonly used evaluation metrics, including success rate, precision rate, and center location error, are used to report object tracking performance. Numerical results demonstrate that the proposed tracking scheme, integrated with the Faster R-CNN based object detector, significantly improves accuracy compared to the baseline l1 tracker and other state-of-the-art trackers.
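The evaluation metrics named above are standard in the tracking literature; a sketch with bounding boxes given as (x, y, w, h) tuples:

```python
import numpy as np

def center_location_error(pred, gt):
    """Euclidean distance between box centers; boxes are (x, y, w, h)."""
    (px, py, pw, ph), (gx, gy, gw, gh) = pred, gt
    return float(np.hypot((px + pw / 2) - (gx + gw / 2),
                          (py + ph / 2) - (gy + gh / 2)))

def iou(a, b):
    """Intersection-over-union (overlap) of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    return inter / (aw * ah + bw * bh - inter)

def success_rate(preds, gts, overlap_thresh=0.5):
    """Fraction of frames whose overlap exceeds the threshold."""
    return sum(iou(p, g) > overlap_thresh for p, g in zip(preds, gts)) / len(gts)

def precision_rate(preds, gts, dist_thresh=20.0):
    """Fraction of frames whose center location error is within the threshold."""
    return sum(center_location_error(p, g) <= dist_thresh
               for p, g in zip(preds, gts)) / len(gts)
```

Sweeping `overlap_thresh` and `dist_thresh` over a range yields the success and precision curves commonly plotted for these benchmarks.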
Description
Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2017
Keywords
object tracking, image processing
Citation