One or Two Research Paper Digests

Posts

Showing posts from October, 2020

Deeper and Wider Siamese Networks for Real-Time Visual Tracking

In this paper , the authors investigate how to increase the robustness and accuracy of existing Siamese trackers used for visual object tracking. Visual object tracking Visual object tracking is one of the fundamental problems in computer vision. It aims to estimate the position of an arbitrary target in a video sequence, given only its location in the initial frame. It has numerous applications in surveillance, robotics, and human-computer interaction. Siamese Networks and their usage in Trackers Siamese networks are a class of neural networks that fundamentally learns to generate comparable feature vectors from their twin inputs. By learning to compute these comparable feature vectors, it learns differentiable characteristics for each type of image class. With these output vectors, it is possible to compare the two inputs and say if they belong to the same image class or not. For example, this is used in one-shot learning for facial recognition. Here the siamese network learns t...

Joint Pose and Shape Estimation of Vehicles from LiDAR Data

In this paper , the authors address the problem of estimating the pose and shape of vehicles from LiDAR Data. This is a common problem to be solved in autonomous vehicle applications. Autonomous vehicles are equipped with many sensors to perceive the world around them. LiDAR being one of them is what the authors focus on in this paper. A key requirement of the perception system is to identify other vehicles in the road and make decisions based on their pose and shape. The authors put forth a pipeline that jointly determines pose and shape from LiDAR data. More about Pose and Shape Estimation LiDAR sensors capture the world around them in point clouds. Often, the first step in LiDAR processing is to perform some sort of clustering or segmentation, to isolate parts of the point cloud which belong to individual objects. The next step is to infer the pose and shape of the object. This is mostly done by a modal perception . Meaning the whole object is perceived based on partial s...