Researchers from the College of Information Technology at United Arab Emirates University and the School of Computer & Information Technology at Pakistan’s Beaconhouse National University have proposed using large language models to manage computing resources in connected car networks. In a nutshell, this solution will allow self-driving cars to process data from cameras and sensors faster without wasting precious fractions of a second in complex traffic situations.
Modern cars are famously becoming veritable data centers on wheels. They are equipped with dozens of sensors, cameras and radars that continuously generate enormous amounts of information. These data need to be analyzed instantly so the car can detect obstacles, estimate distances and decide whether to maneuver. However, onboard computers cannot always handle these workloads in real time. In such cases, some computations can be offloaded to other network members, namely neighboring cars or dedicated roadside computing nodes. This process is called dynamic task offloading.
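The mechanism can be sketched in a few lines of Python. This is purely illustrative; the class and field names below are assumptions rather than anything taken from the paper. Each heavy task has several candidate execution targets, and a naive scheduler simply picks the one with the lowest estimated completion time.

```python
from dataclasses import dataclass
from enum import Enum

class Target(Enum):
    LOCAL = "local"                # keep the task on the current vehicle
    NEIGHBOR = "neighbor_vehicle"  # offload to a nearby car over a V2V link
    RSU = "roadside_unit"          # offload to a roadside computing node over a V2I link

@dataclass
class Candidate:
    target: Target
    transfer_ms: float   # time to ship the input data over the link
    compute_ms: float    # time to execute the task on that node

def naive_offload(candidates: list[Candidate]) -> Candidate:
    """Baseline policy: pick the lowest total latency, ignoring link stability."""
    return min(candidates, key=lambda c: c.transfer_ms + c.compute_ms)
```

A policy this simple is exactly what struggles in dense, fast-changing traffic, which is the problem the researchers set out to address.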
Existing task offloading methods have faced significant challenges. Conventional algorithms, including those based on reinforcement learning, adapt poorly to rapidly changing traffic flows. Cars move constantly, so the distances between them keep changing and connection quality fluctuates. As a result, the system may choose an unreliable channel, forcing the task to be repeated, which increases latency and power consumption.
The researchers installed a compact language model on roadside computing nodes; the model was similar in architecture to those used in chatbots and was trained to solve engineering problems. Instead of generating text, the model analyzes road environment parameters: vehicle speed and trajectory, available computing power, communication channel bandwidth, battery level and specific task requirements. Based on these data, it selects the optimal option: keeping the computation in the current car, transferring it to a neighboring car or sending it to a roadside server.
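As a rough illustration of how such a decision step might look: the road-environment parameters can be serialized into a structured prompt and the model's answer mapped back to one of the three options. The paper does not publish its prompt format, so the feature names, units and prompt wording below are assumptions.

```python
import json

# Hypothetical state snapshot; the actual feature set and units are assumptions.
state = {
    "vehicle_speed_kmh": 62,
    "trajectory_heading_deg": 115,
    "local_cpu_load_pct": 88,
    "neighbor_cpu_load_pct": 35,
    "rsu_cpu_load_pct": 20,
    "v2v_bandwidth_mbps": 27,
    "v2i_bandwidth_mbps": 80,
    "battery_level_pct": 64,
    "task_size_mb": 12,
    "task_deadline_ms": 50,
}

ALLOWED_DECISIONS = {"local", "neighbor_vehicle", "roadside_server"}

def build_prompt(state: dict) -> str:
    """Serialize the road-environment parameters into a compact decision prompt."""
    return (
        "You schedule compute tasks in a vehicular network.\n"
        f"State: {json.dumps(state)}\n"
        "Reply with exactly one of: local, neighbor_vehicle, roadside_server."
    )

def parse_decision(model_output: str) -> str:
    """Map the model's free-text reply to one of the three offloading options."""
    text = model_output.strip().lower()
    for option in ALLOWED_DECISIONS:
        if option in text:
            return option
    return "local"  # conservative fallback: keep the task on the current vehicle
```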
In order to train the model, the researchers created a special dataset combining real-world traffic scenarios and computer simulation results. The model was trained to minimize execution latency, reduce power consumption and account for connection stability. A key element was predicting the time period during which vehicles would be able to maintain a stable connection. If there was a risk of connection loss before computations were completed, the system would reject this option in advance.
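The connection-lifetime check can also be illustrated with a small sketch. This is an assumption-laden simplification that uses straight-line motion and a fixed radio range, whereas the actual predictor in the paper may be learned from trajectory data.

```python
def link_lifetime_s(relative_speed_mps: float, remaining_range_m: float) -> float:
    """Rough estimate of how long two nodes stay within radio range.

    Assumes straight-line motion at a constant relative speed.
    """
    if relative_speed_mps <= 0:
        return float("inf")  # the nodes are not separating
    return remaining_range_m / relative_speed_mps

def is_feasible(task_time_s: float, transfer_time_s: float,
                relative_speed_mps: float, remaining_range_m: float,
                safety_margin: float = 1.2) -> bool:
    """Reject an offloading target if the link may break before results return."""
    needed = (transfer_time_s + task_time_s) * safety_margin
    return needed <= link_lifetime_s(relative_speed_mps, remaining_range_m)
```

An offloading option that fails this check is dropped before the task is ever transferred, which is what prevents wasted retransmissions.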
In a simulated environment, the system based on the Llama 3.2:1b model demonstrated significant advantages. Compared to the best-performing conventional algorithms, task execution latency was reduced by an average of 15.3%, power consumption went down by 22.1% and the execution success rate reached 97.5% even with high traffic density.
Moreover, thanks to model compression and optimization, decision-making time on a GPU dropped to 8–10 milliseconds, 30–40 times faster than on a conventional processor. This meets real-time requirements.
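For a sense of what serving such a compact model could look like in practice, here is a minimal sketch that queries a locally hosted, quantized Llama 3.2 1B build through the Ollama Python client. This is not the authors' benchmark harness: it assumes the model has already been pulled onto the machine, and the measured time includes the round-trip to the local server, so it will not directly reproduce the reported 8–10 ms figure.

```python
import time
import ollama  # assumes a local Ollama server with the llama3.2:1b model already pulled

# Hypothetical decision prompt; the authors' actual prompt format is not published.
prompt = (
    "You schedule compute tasks in a vehicular network.\n"
    "State: speed=62 km/h, local CPU load=88%, RSU load=20%, "
    "V2I bandwidth=80 Mbps, task=12 MB, deadline=50 ms.\n"
    "Reply with exactly one of: local, neighbor_vehicle, roadside_server."
)

start = time.perf_counter()
reply = ollama.generate(
    model="llama3.2:1b",  # the 1B-parameter model, typically served in quantized form
    prompt=prompt,
    options={"num_predict": 8, "temperature": 0.0},  # short, deterministic output
)
elapsed_ms = (time.perf_counter() - start) * 1000

print(reply["response"].strip(), f"({elapsed_ms:.1f} ms)")
```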



