This n8n template takes a video and extracts frames from it which are used with a multimodal LLM to generate a script. The script is then passed to the same multimodal LLM to generate a voiceover clip
This n8n template takes a video and extracts frames from it which are used with a multimodal LLM to generate a script. The script is then passed to the same multimodal LLM to generate a voiceover clip. This template was inspired by Processing and narrating a video with GPT's visual capabilities and the TTS API How it works Video is downloaded using the HTTP node. Python code node is used to extract the frames using OpenCV. Loop node is used o batch the frames for the LLM to generate partial sc
Marketplace
Independent
Category
operations
More like this
Browse operations agents →
Asana Intelligence
AI built into Asana to accelerate team execution
$10.99/mo
operationsLayer
Build visual tree structures of your projects and goals in just a few clicks
Free · Paid plans available
operationsEraser
Generate AI diagrams and docs from simple text prompts
Free · Paid plans available
operationsDocumind
Open-source platform for extracting structured data from documents
Free · Paid plans available