Beyond Typical Annotations: Adapting Data Pipelines for VLM-Driven Cars

23 Jun 2026

11:45 - 12:10

Room 2

Software, AI and SDV architecture

Everyone in the autonomous vehicle industry is excited about Vision-Language Models (VLMs) and Vision-Language-Action (VLAs). These models promise to help cars actually understand a scene, rather than just see it. But there is a catch: to teach a model to reason, you cannot just feed it traditional data. We are moving past simply drawing boxes around cars and pedestrians. In this talk, we will look at the practical side of how data pipelines must change to support VLMs. We will discuss how to add rich context and language to your data, ensuring your models get the exact information they need to make smart, human-like decisions on the road.

The New Data Reality: Why standard labels (like simple boxes and tags) are no longer enough for the next generation of AV models
Adding Context: Practical ways to shift your pipeline from pure geometry to rich, descriptive language.
Feeding the VLM: How to properly format and process data so that reasoning models can actually use it.

Speakers

Rafael Oliveira, senior solutions architect - Uber

Autonomous Vehicle Tech Expo Conference

Beyond Typical Annotations: Adapting Data Pipelines for VLM-Driven Cars