Depré, Nicolas
[UCL]
Franck, Arthur
[UCL]
Macq, Benoît
[UCL]
This master’s thesis presents an original approach to trajectory prediction using a transformer-based model that takes images as input. Trajectory forecasting has many applications in fields such as autonomous driving, robotics, and surveillance systems. Until now, architectures have relied mainly on Recurrent Neural Networks (RNNs), and more specifically Long Short-Term Memory models (LSTMs), along with Convolutional Neural Networks (CNNs). This study introduces TrajViViT, a Trajectory Video Vision Transformer. Although transformers have previously been applied to trajectory prediction [24], our methodology stands apart by supplying only images as the model’s input. This approach allows us to study the vision capabilities of the transformer on a trajectory prediction task. To guide the model towards the target it needs to track, a black box is superimposed on the target’s position. The model’s task is to detect the box and make a prediction based on its movement. We demonstrate that vision-transformer-based models have potential for such a task and can beat a Kalman filter on longer-term predictions. Our implementation does not perform as well as state-of-the-art models, but it still shows interesting results given that no coordinates are provided as input. A PyTorch implementation of the model can be found at https://github.com/arfranck/TrajViViT.
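As a minimal sketch of the input-preparation step described above (the function name, tensor shapes, and box size are illustrative assumptions, not taken from the thesis code), a black box can be drawn over the target’s pixel coordinates in each frame before the image sequence is passed to the transformer:

```python
import torch

def mark_target(frames: torch.Tensor, positions: torch.Tensor, box_size: int = 8) -> torch.Tensor:
    """Superimpose a black box on the target's position in each frame.

    frames:    (T, C, H, W) top-view image sequence
    positions: (T, 2) pixel coordinates (x, y) of the target in each frame
    box_size:  side length of the black box in pixels (illustrative default)
    """
    marked = frames.clone()
    T, _, H, W = frames.shape
    half = box_size // 2
    for t in range(T):
        x, y = positions[t].long().tolist()
        # Clamp the box to the image boundaries.
        x0, x1 = max(x - half, 0), min(x + half, W)
        y0, y1 = max(y - half, 0), min(y + half, H)
        # Zero out the pixels so the target is covered by a black box.
        marked[t, :, y0:y1, x0:x1] = 0.0
    return marked

# Example usage with dummy data: 8 grayscale frames of size 64x64.
frames = torch.rand(8, 1, 64, 64)
positions = torch.randint(0, 64, (8, 2))
marked_frames = mark_target(frames, positions)
```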


Bibliographic reference: Depré, Nicolas ; Franck, Arthur. TrajViViT: trajectory forecasting with video vision transformers on top-view image sequences. Ecole polytechnique de Louvain, Université catholique de Louvain, 2023. Prom. : Macq, Benoît.
Permanent URL: http://hdl.handle.net/2078.1/thesis:42047