This web scraper runs entirely in your browser and is well suited to creating training data for AI models. It discovers pages by reading the website's sitemap.xml file, which makes it a particularly good fit for modern platforms such as Squarespace and Shopify, which generate sitemaps automatically.
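As a rough illustration of sitemap-based discovery, the sketch below pulls page URLs out of a sitemap document using only the Python standard library. It is not the scraper's actual code: the function name `extract_urls` and the sample sitemap are hypothetical, and it assumes the sitemap uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in a sitemap document (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Made-up sample input; a real sitemap would be fetched from the site itself.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_urls(SAMPLE):
        print(url)
```

Because the search matches `<loc>` at any depth, the same helper also returns the child sitemap addresses listed in a sitemap index file, which could then be parsed in turn.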
The scraper preserves the structure of your content, including titles, paragraphs, lists and tables, while removing unnecessary elements such as navigation menus and footers. It also captures metadata, images and PDF documents.
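One way to picture this kind of structure-preserving cleanup is the minimal sketch below, which keeps headings, paragraphs and list items while skipping anything inside navigation or footer elements. It is an assumption about the general approach, not the tool's implementation: the names `ContentExtractor` and `extract_blocks` and the tag sets are hypothetical, and tables, metadata, images and PDFs are left out for brevity.

```python
from html.parser import HTMLParser

# Illustrative tag sets (assumptions, not the scraper's real configuration).
SKIP_TAGS = {"nav", "footer", "script", "style"}
KEEP_TAGS = {"h1", "h2", "h3", "p", "li"}


class ContentExtractor(HTMLParser):
    """Collect (tag, text) pairs for content tags outside skipped regions."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.current = None   # content tag currently being collected
        self.buffer = []      # text fragments for the current tag
        self.blocks = []      # extracted (tag, text) pairs, in document order

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in KEEP_TAGS:
            self.current = tag
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == self.current:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.blocks.append((tag, text))
            self.current = None

    def handle_data(self, data):
        if self.skip_depth == 0 and self.current is not None:
            self.buffer.append(data)


def extract_blocks(html: str) -> list[tuple[str, str]]:
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Made-up sample page for illustration.
SAMPLE_HTML = """
<html><body>
<nav><ul><li>Home</li><li>Shop</li></ul></nav>
<h1>Care guide</h1>
<p>Wash at <b>30</b> degrees.</p>
<ul><li>Cotton</li><li>Linen</li></ul>
<footer><p>Copyright</p></footer>
</body></html>
"""

if __name__ == "__main__":
    for tag, text in extract_blocks(SAMPLE_HTML):
        print(tag, text)
```

Keeping the tag name alongside each text block is what lets the output retain its structure, so a later step can tell a heading from a paragraph or a list item.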
