Adarsh J

Team: ViksithBharat2047

Adarsh J

adarsh_j

Description

Blog to Podcast Agent

Convert any blog article into a spoken podcast using AI — just paste a URL and get an audio file in seconds.

How It Works

1. You paste a blog URL into the app
2. An AI agent scrapes and reads the article
3. Mistral AI summarizes it into a conversational podcast script
4. gTTS converts the script into spoken audio
5. You listen or download the mp3
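
The flow above can be sketched as three stages chained together. This is a minimal illustration, not the code in app.py: the function names are hypothetical, and the summarization stage (Mistral AI via Agno in the app) is left as a callable you pass in rather than a concrete API call. The scrape and speech stages assume the usual `Article` and `gTTS` entry points of Newspaper4k and gTTS.

```python
from typing import Callable


def scrape_article(url: str) -> str:
    """Download a blog post and return its main text."""
    from newspaper import Article  # module installed by the newspaper4k package

    article = Article(url)
    article.download()
    article.parse()
    return article.text


def synthesize_mp3(script: str, out_path: str) -> str:
    """Convert the script to speech with gTTS and save it as an mp3."""
    from gtts import gTTS

    gTTS(text=script, lang="en").save(out_path)
    return out_path


def blog_to_podcast(
    url: str,
    summarize: Callable[[str], str],  # hypothetical hook for the LLM step
    scrape: Callable[[str], str] = scrape_article,
    synthesize: Callable[[str, str], str] = synthesize_mp3,
    out_path: str = "podcast.mp3",
) -> str:
    """Run scrape -> summarize -> text-to-speech and return the mp3 path."""
    text = scrape(url)
    if not text.strip():
        raise ValueError(f"No article text could be extracted from {url}")
    script = summarize(text)
    return synthesize(script, out_path)
```

Passing the stages as arguments keeps each one replaceable, for example to swap in a stub summarizer while testing the scraper.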

Tech Stack

| Tool | Purpose |
|------|---------|
| Agno | Agent framework |
| Mistral AI | LLM for summarization |
| Newspaper4k | Web article scraping |
| gTTS | Text to speech |
| Streamlit | Web UI |

Setup

1. Clone the repo

git clone https://github.com/Adars2005/BlogtoPodcast-mistral-.git blog-to-podcast
cd blog-to-podcast

2. Install dependencies

pip install -r requirements.txt

3. Get your API key

You only need one API key, from Mistral AI (free, no credit card required):

Go to console.mistral.ai
Sign up and click API Keys in the sidebar
Click Create new key and copy it

4. Run the app

streamlit run app.py

Usage

1. Open the app in your browser (usually http://localhost:8501)
2. Paste your Mistral API key in the sidebar
3. Enter a blog URL in the input field
4. Click 🎙️ Generate Podcast
5. Listen in the browser or click Download Podcast to save the mp3
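
For readers curious how such a UI is typically wired in Streamlit, here is a rough sketch of the steps above. It is not the contents of app.py: `validate_inputs`, `render`, and the `generate_mp3_bytes` callback are hypothetical names, and the widget labels are copied from the usage steps.

```python
from urllib.parse import urlparse


def validate_inputs(api_key: str, url: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not api_key.strip():
        problems.append("Enter your Mistral API key in the sidebar.")
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("Enter a full blog URL starting with http:// or https://.")
    return problems


def render(generate_mp3_bytes) -> None:
    """Draw the page; generate_mp3_bytes(api_key, url) is a hypothetical callback."""
    import streamlit as st

    api_key = st.sidebar.text_input("Mistral API key", type="password")
    url = st.text_input("Blog URL")
    if st.button("🎙️ Generate Podcast"):
        problems = validate_inputs(api_key, url)
        for problem in problems:
            st.error(problem)
        if not problems:
            audio = generate_mp3_bytes(api_key, url)
            st.audio(audio, format="audio/mp3")  # in-browser player
            st.download_button(
                "Download Podcast", audio, file_name="podcast.mp3", mime="audio/mpeg"
            )
```

Checking the inputs before calling the model avoids spending an API request on an empty key or a malformed URL.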

Requirements

agno
mistralai
streamlit
newspaper4k
lxml_html_clean
certifi
gtts
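
If the app fails on startup with an import error, a quick check like the one below can show which requirement is missing. This is an optional sketch; the mapping of package names to import names is an assumption based on how these packages are normally imported (Newspaper4k, for instance, is imported as `newspaper`).

```python
from importlib.util import find_spec

# Package name (as in requirements.txt) -> module name used in `import`.
REQUIRED_MODULES = {
    "agno": "agno",
    "mistralai": "mistralai",
    "streamlit": "streamlit",
    "newspaper4k": "newspaper",
    "lxml_html_clean": "lxml_html_clean",
    "certifi": "certifi",
    "gtts": "gtts",
}


def missing_packages(required: dict[str, str] = REQUIRED_MODULES) -> list[str]:
    """Return the packages whose module cannot be found in this environment."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("All dependencies found." if not missing else f"Missing: {', '.join(missing)}")
```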

URLs That Work Well

Newspaper4k works best with open, public blogs. Use these types of sites:

✅ Works well:

thepythoncode.com
learnpython.com/blog
thenewstack.io
hostinger.com/tutorials

❌ Avoid (block scrapers or fail to load):

medium.com → 403 error
netflixtechblog.com → SSL issues
devops.com → 403 error
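
One way to fail fast on the sites listed above is to check the hostname before scraping. The sketch below is illustrative only: the helper name is hypothetical, and the set contains just the three hosts from this list, not an exhaustive blocklist.

```python
from urllib.parse import urlparse

# Hosts reported above as returning 403 errors or SSL issues.
PROBLEM_HOSTS = {"medium.com", "netflixtechblog.com", "devops.com"}


def is_known_problem_site(url: str) -> bool:
    """True if the URL's host is one of PROBLEM_HOSTS or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == bad or host.endswith("." + bad) for bad in PROBLEM_HOSTS)
```

Matching on the full host or a dot-prefixed suffix avoids false positives for unrelated domains that merely end with the same letters.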

Project Structure

blog-to-podcast/
│
├── app.py              # Main Streamlit app
├── requirements.txt    # Python dependencies
└── README.md           # This file

Troubleshooting

| Error | Cause | Fix |
|-------|-------|-----|
| 401 Unauthorized | Wrong API key | Get a fresh key from console.mistral.ai |
| 403 Forbidden | Site blocks scrapers | Try a different blog URL |
| SSL Certificate Error | Python SSL issue | Already fixed in code via certifi |
| 404 Not Found | Article deleted/moved | Try a different URL |
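
The same mapping can be applied in code to turn a raw error message into a readable hint. This is a sketch under assumptions: `hint_for_error` is a hypothetical helper, and it assumes the status code or the word "SSL" appears somewhere in the error text, which depends on the library raising the error.

```python
import re
from typing import Optional

# Hints taken from the troubleshooting entries above.
STATUS_HINTS = {
    "401": "Wrong API key: get a fresh key from console.mistral.ai.",
    "403": "The site blocks scrapers: try a different blog URL.",
    "404": "The article was deleted or moved: try a different URL.",
}
SSL_HINT = "Python SSL issue: this should already be handled in code via certifi."


def hint_for_error(message: str) -> Optional[str]:
    """Return a troubleshooting hint for an error message, or None if unknown."""
    match = re.search(r"\b(401|403|404)\b", message)
    if match:
        return STATUS_HINTS[match.group(1)]
    if "ssl" in message.lower() or "certificate" in message.lower():
        return SSL_HINT
    return None
```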

Limitations

gTTS requires an internet connection to generate audio
Very long articles are trimmed to avoid hitting model limits
Some websites actively block automated scraping
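
To illustrate the trimming mentioned above, here is one simple way to cap article length before sending it to the model. The function name and the 8000-character default are assumptions for illustration; the actual limit used by the app is not stated here.

```python
def trim_article(text: str, max_chars: int = 8000) -> str:
    """Cut text to at most max_chars, preferring a sentence or word boundary."""
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Prefer ending on the last complete sentence inside the cut.
    end = max(cut.rfind(". "), cut.rfind("! "), cut.rfind("? "))
    if end != -1:
        return cut[: end + 1]
    # Otherwise fall back to the last word boundary, then to a hard cut.
    space = cut.rfind(" ")
    return cut[:space] if space != -1 else cut
```

Ending on a sentence boundary keeps the summarizer from receiving a half-finished thought at the end of its input.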
