Web Scrape AI Assistant
A web scraping and content analysis tool that uses AI models to extract, summarize, and answer questions about web content. Built with a Flask backend and a responsive frontend styled with Tailwind CSS.

⨠Features
- Dual AI Model Support: Choose between Gemini and Groq LLMs for content processing
- URL Scraping: Scrape a given website URL directly, with intelligent content extraction
- Search Functionality: Search for topics and automatically fetch relevant web content
- AI-Powered Summaries: Generate concise, well-structured summaries of web content
- Interactive Chat: Ask questions about the scraped content with AI-powered responses
- Beautiful UI: Modern, responsive design with Tailwind CSS and custom animations
- Robust Error Handling: Fallback mechanisms and comprehensive error management
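
The fallback behavior listed under Robust Error Handling could look roughly like the sketch below. It is an illustration only: `summarize_with_fallback` and the two stand-in model functions are hypothetical names that do not come from the project's `app.py`, and real Gemini or Groq calls would take the place of the stubs.

```python
from typing import Callable, Sequence


def summarize_with_fallback(text: str, models: Sequence[Callable[[str], str]]) -> str:
    """Try each model callable in order and return the first successful result."""
    errors = []
    for model in models:
        try:
            return model(text)
        except Exception as exc:  # a real app would catch narrower exception types
            errors.append(f"{getattr(model, '__name__', 'model')}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real Gemini and Groq API calls.
def failing_model(text: str) -> str:
    raise TimeoutError("simulated API timeout")


def working_model(text: str) -> str:
    return f"summary of {len(text)} characters"


# The first model fails, so the second one produces the result.
result = summarize_with_fallback("some scraped text", [failing_model, working_model])
```

Passing the models as an ordered list keeps the retry order explicit and makes it easy to add a third provider later.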

Technical Stack
- Backend: Flask (Python)
- Frontend: HTML, JavaScript, Tailwind CSS
- AI Models:
  - Google's Gemini API
  - Mixtral, served through the Groq API
- Web Scraping: BeautifulSoup4, Requests
- Additional Tools: dotenv for environment management
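
To give a feel for the content extraction step, here is a minimal sketch that pulls visible text out of an HTML page. The project lists BeautifulSoup4 for this job; the sketch swaps in the standard library's `html.parser` so that it runs without extra installs. The class and function names are hypothetical and are not taken from `app.py`.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes while skipping script, style, and noscript content."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than zero while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Title</h1><script>var x = 1;</script>"
    "<p>Hello world.</p></body></html>"
)
extracted = extract_text(sample)
```

For the sample page, `extracted` contains the heading and paragraph text but none of the CSS or JavaScript.
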
Setup Instructions
Prerequisites
- Python 3.7 or higher
- API keys for Google Gemini and Groq
Installation
Clone the repository:
git clone https://github.com/AdarshXKumAR/Web-Scrapper-AI-Assistant.git
cd Web-Scrapper-AI-Assistant
Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install dependencies:
pip install -r requirements.txt
Create a .env.local file in the root directory with your API keys:
GEMINI_API_KEY=your_gemini_api_key
GROQ_API_KEY=your_groq_api_key
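
A quick way to confirm that both keys are present before starting the server is sketched below. Note that python-dotenv's `load_dotenv()` looks for a file named `.env` by default, so a file called `.env.local` generally has to be passed explicitly, for example `load_dotenv(".env.local")`. To stay free of third-party imports, the sketch parses the file with the standard library instead; `read_env_file` and `missing_keys` are hypothetical helper names, not part of the project.

```python
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("GEMINI_API_KEY", "GROQ_API_KEY")


def read_env_file(path):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Demonstration with a throwaway file instead of a real .env.local.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env.local"
    env_path.write_text(
        "GEMINI_API_KEY=your_gemini_api_key\n# GROQ key not set yet\n",
        encoding="utf-8",
    )
    loaded = read_env_file(env_path)
    missing = missing_keys(loaded)
```

Here `missing` lists `GROQ_API_KEY`, because the demo file only defines the Gemini key.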
Running the Application
Start the Flask development server:
python app.py
Open your browser and navigate to:
http://127.0.0.1:5000
Usage Guide
- Select AI Model: Choose between Gemini and Groq using the toggle buttons
- URL Scraping:
  - Enter a URL in the input field
  - Alternatively, enter a topic to search for
  - Click "Analyze Content"
- Search & Analyze:
  - Click the "Search & Analyze" tab
  - Enter a search query
  - Click the "Search & Analyze" button
- Chat with Content:
  - After content is scraped and analyzed, use the chat interface
  - Ask questions about the content
  - Receive AI-powered responses
Project Structure
web-scrape-ai-assistant/
├── app.py              # Flask application and backend logic
├── templates/          # HTML templates
│   └── index.html      # Main application interface
├── .env.local          # Environment variables (not in repo)
├── requirements.txt    # Python dependencies
└── screenshots/        # Application screenshots
API Features
/scrape
- Endpoint for scraping URLs and topics
/search
- Search for content and analyze it
/chat
- Ask questions about analyzed content
/check-models
- Check available AI models
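
The endpoints above can also be called from a script rather than the browser UI. The sketch below builds a request for `/scrape` using only the standard library. It assumes the endpoint accepts a JSON POST body, and the field names `url` and `model` are guesses made for illustration; check `app.py` for the actual request schema.

```python
import json
from urllib.request import Request

BASE_URL = "http://127.0.0.1:5000"  # development server address from the setup steps


def build_json_post(endpoint, payload):
    """Build (but do not send) a JSON POST request for one of the endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url=f"{BASE_URL}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The field names "url" and "model" are assumptions, not the documented schema.
scrape_request = build_json_post(
    "/scrape", {"url": "https://example.com", "model": "gemini"}
)

# To send it while the Flask server is running:
#   from urllib.request import urlopen
#   with urlopen(scrape_request, timeout=30) as response:
#       print(json.loads(response.read()))
```

Building the request separately from sending it makes it possible to inspect the URL and payload without a running server.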
Limitations
- Web scraping may be blocked by some websites with anti-scraping measures
- Search functionality depends on public search engines and may be limited
- Large webpages may be truncated due to token limits of AI models
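
The truncation mentioned in the last point might be implemented along these lines. This is a sketch under stated assumptions: it counts characters as a rough proxy for tokens, the 12000-character default is an arbitrary placeholder, and `truncate_for_model` is a hypothetical name rather than a function from the project.

```python
def truncate_for_model(text, max_chars=12000, marker="\n[content truncated]"):
    """Cut text to at most max_chars, preferring to break at a space."""
    if len(text) <= max_chars:
        return text
    budget = max_chars - len(marker)  # room left for the original text
    if budget <= 0:
        return text[:max_chars]
    cut = text[:budget]
    last_space = cut.rfind(" ")
    if last_space > 0:
        cut = cut[:last_space]  # avoid ending in the middle of a word
    return cut + marker


short = truncate_for_model("hello world", max_chars=50)
long_text = "word " * 100  # 500 characters
clipped = truncate_for_model(long_text, max_chars=60)
```

Short inputs pass through unchanged, while long inputs are cut and end with the marker so that a reader or the model can tell the content is incomplete.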
Future Improvements
- Add support for additional AI models
- Implement content caching for faster responses
- Add PDF and document analysis capabilities
- Implement user authentication and saved content history
- Add export functionality for summaries
License
MIT License
Acknowledgements
Created with ❤️ by AdarshXKumAR