#AGI A research team at PolyU has developed “VideoMind”, an AI model designed to mimic human cognition when watching long-form videos. Built on a novel LoRA-based modular architecture, it surpasses GPT-4o and Gemini 1.5 in video grounding accuracy while consuming far less compute and memory. This advancement opens doors to applications in smart surveillance, sports analysis, and video search.
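The post credits much of VideoMind's efficiency to a LoRA-based design. As background only, here is a minimal numpy sketch of the general LoRA (Low-Rank Adaptation) idea: a frozen weight matrix is augmented with a small trainable low-rank update. All names and dimensions below are hypothetical illustrations, not details from the VideoMind paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 64, 64, 4  # rank << d_in keeps the adapter tiny

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, rank))                   # zero-init: adapter starts as a no-op

def forward(x):
    # Base model output plus the low-rank update B @ (A @ x).
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
# Before any training, B is all zeros, so the adapted model
# exactly matches the frozen base model.
assert np.allclose(forward(x), W @ x)

# Why this saves memory: only A and B are trained, not W.
full_params = W.size
lora_params = A.size + B.size
print(f"full fine-tune params: {full_params}, LoRA adapter params: {lora_params}")
```

Because only the small `A` and `B` matrices receive gradients, multiple role-specific adapters can share one frozen backbone, which is the kind of modularity the post attributes to VideoMind.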
📌 Connect to the Web3 world for the price of a cup of coffee https://patreon.com/wanszezit
Full article https://www.ejtech.ai/ceoai/%e6%9c%ac%e5%9c%b0-%e5%89%b5%e7%a7%91-%e5%8b%95%e6%85%8b-%e7%90%86%e5%a4%a7-ai%e6%a8%a1%e5%9e%8b-%e4%bb%bf%e4%ba%ba%e9%a1%9e-%e8%a7%a3%e8%ae%80-%e9%95%b7%e7%89%87/