This template crawls a website from its sitemap, deduplicates URLs in Supabase, scrapes pages with Crawl4AI, cleans and validates the text, then stores content and metadata in a Supabase vector store using OpenAI embeddings. It’s a reliable, repeatable pipeline for building searchable knowledge bases, SEO research corpora, and RAG datasets.

⸻

Good to know
• Built-in de-duplication via a scrape_queue table (status: pending/completed/error).
• Resilient flow: waits, retries, and marks failed tasks
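
The queue behaviour described above (de-duplicate URLs, then move each one from pending to completed or error after retries) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the template's actual implementation: the `ScrapeQueue` class is a hypothetical in-memory stand-in for the scrape_queue table, `process` and its `max_attempts` and `wait_seconds` parameters are invented for the example, and the `scrape` callable stands in for the Crawl4AI step.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

# Status values taken from the description of the scrape_queue table.
PENDING, COMPLETED, ERROR = "pending", "completed", "error"


@dataclass
class ScrapeQueue:
    """Hypothetical in-memory stand-in for the scrape_queue table (URL -> status)."""
    rows: Dict[str, str] = field(default_factory=dict)

    def enqueue(self, urls: Iterable[str]) -> List[str]:
        """Add unseen URLs as pending and return only the newly added ones."""
        added = []
        for url in urls:
            if url not in self.rows:  # de-duplication: skip URLs already queued
                self.rows[url] = PENDING
                added.append(url)
        return added

    def pending(self) -> List[str]:
        """Return the URLs that still need to be scraped."""
        return [url for url, status in self.rows.items() if status == PENDING]


def process(queue: ScrapeQueue,
            scrape: Callable[[str], str],
            max_attempts: int = 3,
            wait_seconds: float = 0.0) -> None:
    """Scrape each pending URL, retrying on failure and recording the outcome."""
    for url in queue.pending():
        for attempt in range(max_attempts):
            try:
                scrape(url)  # assumed to raise an exception when scraping fails
            except Exception:
                if attempt + 1 < max_attempts:
                    time.sleep(wait_seconds)  # wait before the next retry
                continue
            queue.rows[url] = COMPLETED
            break
        else:
            # Every attempt failed, so the task is marked as an error.
            queue.rows[url] = ERROR
```

Because `enqueue` ignores URLs that already have a row, re-running the crawl on the same sitemap adds only pages that were not seen before, which is what makes the pipeline repeatable.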
Marketplace: Independent
Category: marketing