
📹 Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding


The Video-LLaMA project aims to give large language models the ability to understand video and audio.

Video-LLaMA is a multimodal system that extends large language models (LLMs) so they can understand both the visual and the audio content of a video.




