Most of this content was published as a CVPR 2020 paper (see paper here).
We contribute Mapillary Street-Level Sequences (MSLS), a large dataset for place recognition from image sequences. It contains more than 1.6 million images curated from the Mapillary collaborative mapping platform. The dataset is orders of magnitude larger than current data sources and reflects the diversity of true lifelong learning. It features images from 30 major cities across six continents, hundreds of distinct cameras, and substantially different viewpoints and capture times, spanning all seasons over a nine-year period. All images are geo-located with GPS and compass, and feature high-level attributes such as road type. We propose a set of benchmark tasks designed to push state-of-the-art performance and provide baseline studies. We show that current state-of-the-art methods still have a long way to go, and that the lack of diversity in existing datasets has prevented generalization to new environments.
PyTorch implementation of "Burst Image Deblurring Using Permutation Invariant Convolutional Neural Networks"
The network takes a burst sequence of n images and outputs one sharper image. Each burst image is fed through a Siamese U-Net with skip connections. After each convolutional block, a global max pooling is applied to gather information across the images in the burst sequence. After the parallel tracks, the feature maps are collapsed into a single, cleaner output image.
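The pooling step described above can be illustrated with a minimal sketch. It uses plain Python lists instead of PyTorch tensors to stay dependency-free, and the function names are hypothetical. Concatenating the pooled vector back onto each frame's features is an assumption about how the shared information is fed to the next block.

```python
from itertools import permutations


def max_across_burst(features):
    """Element-wise max over the burst axis.

    `features` holds one feature vector (list of floats) per burst frame;
    all vectors must have the same length.
    """
    return [max(values) for values in zip(*features)]


def share_global_feature(features):
    """Append the pooled burst features to each frame's own features.

    Assumption: the pooled information is fed back by concatenation.
    """
    pooled = max_across_burst(features)
    return [frame + pooled for frame in features]


# Toy feature vectors for a burst of three frames (made-up numbers).
burst = [[0.1, 0.9, 0.3], [0.7, 0.2, 0.4], [0.5, 0.6, 0.8]]
pooled = max_across_burst(burst)

# Any ordering of the frames yields the same pooled vector,
# which is what makes the operation permutation invariant.
assert all(max_across_burst(list(p)) == pooled for p in permutations(burst))
```

Because the max is taken independently per feature, the same function works for any burst length n.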
PyTorch implementation of TI-pooling (transformation-invariant pooling) from "TI-pooling: transformation-invariant pooling for feature learning in Convolutional Neural Networks"
TI-pooling is a simple technique that makes a Convolutional Neural Network (CNN) transformation-invariant. That is, given a set of nuisance transformations (such as rotations, scale changes, shifts, or illumination changes), TI-pooling guarantees that the output of the network does not depend on whether the input image was transformed.
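The idea can be sketched without any deep learning library. In the toy example below, the transformation set is the four 90-degree rotations, and two fixed linear filters (hypothetical, chosen only for illustration) stand in for the shared CNN branch. The features of every transformed copy are computed with the same weights, then the element-wise maximum is taken.

```python
def rot90(img):
    """Rotate a square 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotations(img):
    """All four 90-degree rotations of the image (the transformation set)."""
    out = [img]
    for _ in range(3):
        out.append(rot90(out[-1]))
    return out


# Hypothetical stand-in for the shared CNN branch: two fixed linear filters.
# Neither filter is rotation invariant on its own.
FILTERS = [
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0], [0, 2, 0]],
]


def extract_features(img):
    """Apply every filter to the image and return one response per filter."""
    return [
        sum(w * x for w_row, x_row in zip(f, img) for w, x in zip(w_row, x_row))
        for f in FILTERS
    ]


def ti_pool(img):
    """Element-wise max of the features over all transformed copies."""
    per_transform = [extract_features(t) for t in rotations(img)]
    return [max(values) for values in zip(*per_transform)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rotated = rot90(image)

# The raw features change under rotation, the TI-pooled features do not.
assert extract_features(image) != extract_features(rotated)
assert ti_pool(image) == ti_pool(rotated)
```

The invariance is exact here because the four rotations form a closed set: rotating the input only reorders the transformed copies, and the maximum ignores order.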
This project seeks to estimate the 3D position of railroad furniture. The project consists of two parts:
1) Adapt and train a 2D object detection model on the entire Railroad dataset. The model takes an image as input and outputs a 2D bounding box and a class label for each object of interest in the image. Evaluate the performance of the trained model and investigate the causes of errors.
2) Estimate a dense depth map that allows COWI to measure the 3D positions of the detected objects. Adapt a model to predict depth from sequences of images and demonstrate how this depth information can be used to obtain 3D information about the detected objects. Evaluate the performance of the depth estimates.
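One way the two parts could be combined is sketched below. It assumes a pinhole camera model with known intrinsics (fx, fy, cx, cy), which the project description does not specify. It also uses the median depth inside the box as the object depth. All function names and the toy numbers are hypothetical.

```python
from statistics import median


def box_depth(depth_map, box):
    """Median depth inside a box given as (x_min, y_min, x_max, y_max).

    The upper bounds are exclusive; depth_map[v][u] is the depth at
    pixel column u and row v.
    """
    x_min, y_min, x_max, y_max = box
    values = [depth_map[v][u] for v in range(y_min, y_max) for u in range(x_min, x_max)]
    return median(values)


def backproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth into camera coordinates.

    Assumes a pinhole camera without lens distortion.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def box_to_3d(depth_map, box, fx, fy, cx, cy):
    """3D position of a detection: box centre back-projected at the box depth."""
    x_min, y_min, x_max, y_max = box
    u = (x_min + x_max) / 2
    v = (y_min + y_max) / 2
    return backproject(u, v, box_depth(depth_map, box), fx, fy, cx, cy)


# Toy 4x6 depth map with a constant depth of 10 (made-up values).
depth_map = [[10.0] * 6 for _ in range(4)]
position = box_to_3d(depth_map, (2, 1, 6, 3), fx=2.0, fy=2.0, cx=3.0, cy=2.0)
```

The median is used rather than the mean so that background pixels inside the box have less influence on the estimated object depth.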
The AWS DeepRacer was developed for research in deep reinforcement learning; however, its performance is very limited when it is trained with the built-in models and their variants. To improve the performance of the AWS DeepRacer in tracking trajectories with predefined markers, we explore the viability of using other controllers.
We worked with a proportional controller and model predictive control (MPC) for the DeepRacer car. Due to the limited time for this course project, we implemented the proportional controller on the on-board computer and simulated MPC for tracking a spline trajectory.
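A proportional steering law of the kind mentioned above can be sketched as follows. This is not the code that ran on the car: the gain kp, the steering limit max_steer, and the choice of heading error as the controlled quantity are all assumptions made for illustration.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def proportional_steering(x, y, heading, target_x, target_y, kp=1.0, max_steer=0.5):
    """Steering command proportional to the heading error towards a target.

    kp and max_steer are hypothetical values; positive output steers left.
    """
    desired = math.atan2(target_y - y, target_x - x)
    error = wrap_angle(desired - heading)
    # Saturate the command at the assumed steering limit.
    return max(-max_steer, min(max_steer, kp * error))


# Target slightly to the left of a car at the origin facing along +x.
command = proportional_steering(0.0, 0.0, 0.0, 1.0, 0.1)
```

To follow a spline, the target point would be moved along the trajectory ahead of the car, and the command recomputed at every control step.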
Pronoun resolution is the task of linking a pronoun to the correct noun. We find that pre-trained state-of-the-art neural models tend to be biased. We show that one of the most commonly used datasets for training coreference resolution models has a substantial bias towards male entities, which causes models to perform better on male examples. Since this can lead to discrimination against, for example, job applicants, we are motivated to highlight the problem and explore methods to mitigate this gender bias. We find that using gender-neutral word embeddings, such as the Debiased GoogleNews embeddings or the Word Dependency embeddings, can lower the bias in the model predictions considerably.
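A simple way to quantify such a gap is to score male and female examples separately and compare the two. The sketch below uses accuracy and a female-to-male ratio; both the metric and the record layout are assumptions, and the toy examples are made up for illustration, not results from this project.

```python
def accuracy(examples):
    """Fraction of examples whose predicted antecedent matches the gold one."""
    return sum(e["predicted"] == e["gold"] for e in examples) / len(examples)


def gender_gap(examples):
    """Per-gender accuracy and the female-to-male accuracy ratio.

    A ratio of 1.0 means equal performance; below 1.0 means the model
    does better on male examples.
    """
    by_gender = {}
    for e in examples:
        by_gender.setdefault(e["gender"], []).append(e)
    scores = {g: accuracy(items) for g, items in by_gender.items()}
    return scores, scores["female"] / scores["male"]


# Made-up toy predictions, only to show the calculation.
toy_examples = [
    {"gender": "male", "gold": "A", "predicted": "A"},
    {"gender": "male", "gold": "B", "predicted": "B"},
    {"gender": "male", "gold": "A", "predicted": "A"},
    {"gender": "male", "gold": "B", "predicted": "A"},
    {"gender": "female", "gold": "A", "predicted": "A"},
    {"gender": "female", "gold": "B", "predicted": "A"},
    {"gender": "female", "gold": "A", "predicted": "B"},
    {"gender": "female", "gold": "B", "predicted": "B"},
]
scores, ratio = gender_gap(toy_examples)
```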
The objective of this thesis is to adapt an open-source SLAM implementation and test how well it works with data recorded from the Tobii Pro Glasses 2. The report investigates all aspects of the Extended Kalman Filter (EKF) SLAM method for data recorded with the Tobii Pro Glasses 2.