🌐 ScrapeGraph Python SDK

Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.

📦 Installation

pip install scrapegraph-py

🚀 Features

  • 🤖 AI-powered web scraping and search
  • 🔄 Both sync and async clients
  • 📊 Structured output with Pydantic schemas
  • 🔍 Detailed logging
  • ⚡ Automatic retries
  • 🔐 Secure authentication

🎯 Quick Start

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

Note

You can set the SGAI_API_KEY environment variable and initialize the client without parameters: client = Client()
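When relying on the environment variable, it can help to fail early with a readable message if it is missing. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper written for illustration and is not part of the SDK.

```python
import os


def require_api_key(env_var: str = "SGAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; it is not part of scrapegraph-py.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or pass api_key= to Client()"
        )
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found; Client() can be created without arguments.")
    except RuntimeError as exc:
        print(f"Configuration problem: {exc}")
```

Calling this check once at startup surfaces a missing key before any request is attempted.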

📚 Available Endpoints

🤖 SmartScraper

Extract structured data from any webpage or HTML content using AI.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

# Using a URL
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

# Or using HTML content
html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
    </body>
</html>
"""

response = client.smartscraper(
    website_html=html_content,
    user_prompt="Extract the company description"
)

print(response)

Output Schema (Optional)

from pydantic import BaseModel, Field
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)

🔍 SearchScraper

Perform AI-powered web searches with structured results and reference URLs.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.searchscraper(
    user_prompt="What is the latest version of Python and its main features?"
)

print(f"Answer: {response['result']}")
print(f"Sources: {response['reference_urls']}")

Output Schema (Optional)

from pydantic import BaseModel, Field
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

class PythonVersionInfo(BaseModel):
    version: str = Field(description="The latest Python version number")
    release_date: str = Field(description="When this version was released")
    major_features: list[str] = Field(description="List of main features")

response = client.searchscraper(
    user_prompt="What is the latest version of Python and its main features?",
    output_schema=PythonVersionInfo
)
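The SearchScraper example reads the `result` and `reference_urls` keys from the response. As a rough sketch, a small helper can turn such a dictionary into readable text and tolerate missing keys. `summarize_search` is a hypothetical name, and the sample data is hand-written rather than real API output.

```python
from typing import Any, Mapping


def summarize_search(response: Mapping[str, Any]) -> str:
    """Render a searchscraper-style response as readable text.

    Only the `result` and `reference_urls` keys are read; missing keys
    fall back to empty values instead of raising KeyError.
    """
    result = response.get("result", "")
    urls = response.get("reference_urls") or []
    lines = [f"Answer: {result}"]
    if urls:
        lines.append("Sources:")
        lines.extend(f"  {i}. {url}" for i, url in enumerate(urls, start=1))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-written sample data, not real API output.
    sample = {
        "result": {"version": "3.x", "major_features": ["example feature"]},
        "reference_urls": ["https://www.python.org/downloads/"],
    }
    print(summarize_search(sample))
```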

📝 Markdownify

Convert any webpage into clean, formatted markdown.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.markdownify(
    website_url="https://example.com"
)

print(response)
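A common next step is writing the converted page to disk. The sketch below assumes the response is a dictionary with the markdown text under a `result` key; that layout is an assumption here, so check the actual response before relying on it. `save_markdown` is a hypothetical helper, not an SDK function.

```python
from pathlib import Path
from typing import Any, Mapping


def save_markdown(response: Mapping[str, Any], path: str) -> Path:
    """Write the markdown text from a response to a UTF-8 file.

    ASSUMPTION: the markdown string lives under the "result" key.
    """
    markdown = response.get("result")
    if not isinstance(markdown, str):
        raise ValueError("response has no markdown text under 'result'")
    target = Path(path)
    target.write_text(markdown, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Hand-written sample standing in for a markdownify response.
    sample = {"result": "# Example Domain\n\nThis domain is for use in examples.\n"}
    print(f"Wrote {save_markdown(sample, 'example.md')}")
```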

⚡ Async Support

All endpoints support async operations:

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
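The async client is most useful when several pages are processed at once. The following sketch shows one way to cap the number of requests in flight with `asyncio.Semaphore`. It is self-contained: `fake_scrape` is a stand-in coroutine used so the example runs without network access or an API key, and `gather_limited` is a hypothetical helper, not part of the SDK.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def gather_limited(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[dict]],
    limit: int = 3,
) -> List[dict]:
    """Run `fetch(url)` for every URL with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url: str) -> dict:
        async with semaphore:
            return await fetch(url)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def fake_scrape(url: str) -> dict:
    """Stand-in for an awaitable such as client.smartscraper(...)."""
    await asyncio.sleep(0.01)
    return {"url": url, "result": f"content of {url}"}


async def demo() -> None:
    urls = [f"https://example.com/page/{n}" for n in range(5)]
    for item in await gather_limited(urls, fake_scrape, limit=2):
        print(item["url"])


if __name__ == "__main__":
    asyncio.run(demo())
```

With the real SDK, `fetch` could be a small coroutine that calls `client.smartscraper(website_url=url, user_prompt=...)` inside an `async with AsyncClient() as client:` block. A suitable limit depends on your plan's rate limits, which this README does not cover.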

📖 Documentation

For detailed documentation, visit docs.scrapegraphai.com

🛠️ Development

For information about setting up the development environment and contributing to the project, see our Contributing Guide.

💬 Support & Feedback

  • 📧 Email: support@scrapegraphai.com
  • 💻 GitHub Issues: Create an issue
  • 🌟 Feature Requests: Request a feature
  • ⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:
    from scrapegraph_py import Client
    
    client = Client(api_key="your-api-key-here")
    
    client.submit_feedback(
        request_id="your-request-id",
        rating=5,
        feedback_text="Great results!"
    )

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Links


Made with ❤️ by ScrapeGraph AI