This tool uses the Firecrawl /v2/scrape API with LLM-powered JSON extraction optimized for text-to-speech. An AI model reads the page and extracts clean plain text that sounds natural when read aloud — no markdown, no URLs, no footnote markers, no formatting symbols.
- `{type: "json", prompt: "...", schema: {...}}`: the LLM reads the full page and extracts structured data (`title` + `article_text`) based on a TTS-optimized prompt.
- `{type: "screenshot", fullPage: true}`: a full-page screenshot rendered server-side via a headless browser (works even on sites that block iframes).

```javascript
const prompt = `Extract the main article content as clean plain text optimized for text-to-speech.

Rules:
- Return ONLY the article body text that should be read aloud
- No markdown formatting: no **, no ##, no [](), no bullet symbols
- No URLs or links, just the link text if part of a sentence
- No footnote markers like [1], [2], (1), *, †
- No figure/image references like (Fig. 1), (see image above)
- No image captions or alt text
- No author names, bylines, dates, or publication info
- No share/subscribe/comment prompts or UI elements
- No ads, related articles, or navigation
- Expand common abbreviations for natural speech
- Keep direct quotes and dialogue naturally
- Separate paragraphs with double newlines
- For the title field: extract just the article headline`;

const resp = await fetch('https://api.firecrawl.dev/v2/scrape', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    // FIRECRAWL_API_KEY is a placeholder; substitute your own key
    'Authorization': 'Bearer FIRECRAWL_API_KEY',
  },
  body: JSON.stringify({
    url, // the page to scrape
    formats: [
      { type: 'screenshot', fullPage: true },
      {
        type: 'json',
        prompt,
        schema: {
          type: 'object',
          properties: {
            title: {
              type: 'string',
              description: 'The article headline as a clean sentence'
            },
            article_text: {
              type: 'string',
              description: 'Full article body as plain text for TTS. No markdown, no URLs, no footnotes.'
            }
          },
          required: ['title', 'article_text']
        }
      }
    ],
  }),
});
```
The prompt tells the LLM to extract TTS-optimized plain text. The schema returns two fields, title and article_text; title is listed first so that TTS apps can announce it before the body.
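To illustrate the "title first, then body" ordering, here is a minimal sketch of turning the parsed response into a single string for a TTS engine. The helper name `toSpeechText` is hypothetical, and the response shape it reads (`data.json.title` and `data.json.article_text`) is an assumption not shown in the text above, so check it against the actual API response before relying on it.

```javascript
// Hypothetical helper, not part of the tool itself.
// Assumed response shape: { success, data: { json: { title, article_text }, screenshot } }
function toSpeechText(payload) {
  const extracted = payload && payload.data && payload.data.json;
  if (!extracted || typeof extracted.article_text !== 'string') {
    throw new Error('No article_text found in scrape response');
  }

  const title = typeof extracted.title === 'string' ? extracted.title.trim() : '';
  const body = extracted.article_text.trim();

  // Title first, then a paragraph break, so the headline is read before the body.
  return title ? `${title}\n\n${body}` : body;
}

// Example with a hand-written payload (illustrative data only).
const sample = {
  success: true,
  data: {
    json: {
      title: 'An Example Headline',
      article_text: 'First paragraph.\n\nSecond paragraph.',
    },
  },
};
console.log(toSpeechText(sample));
```

With the request shown earlier, this would be called as `toSpeechText(await resp.json())`. Throwing on a missing `article_text` keeps an empty or failed extraction from being passed silently to the speech engine.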