[Bug]: Deep crawling request via Docker FastAPI POST not working #932

Closed
RemeLards opened this issue Apr 3, 2025 · 3 comments
Labels
🐞 Bug Something isn't working 🩺 Needs Triage Needs attention of maintainers

@RemeLards

crawl4ai version

0.5

Expected Behavior

Extract up to 50 pages using the BFS deep crawl strategy.

Current Behavior

The JSON response contains this error:

'error_message': "'list' object has no attribute 'status_code'"

Is this reproducible?

Yes

Inputs Causing the Bug

Steps to Reproduce

Code snippets

import requests

payload = {
  "urls": [
    "https://en.wikipedia.org/wiki/Mel_scale"
  ],
  "crawler_config": {
    "type": "CrawlerRunConfig",
    "params": {
      "deep_crawl_strategy": {
        "type": "BFSDeepCrawlStrategy",
        "params": {
          "max_depth": 2,
          "max_pages": 50
        }
      },
      "scraping_strategy": {
        "type": "WebScrapingStrategy",
        "params": {}
      }
    }
  }
}

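# Send the crawl request to the Dockerized FastAPI server.
response = requests.post(
    url="http://localhost:8000/crawl",
    json=payload,
)

# Parse the body once, then print it and the number of returned results.
data = response.json()
print(data)
print(len(data["results"]))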

OS

Linux

Python version

3.12

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

{'success': True, 'results': [{'url': 'https://en.wikipedia.org/wiki/Mel_scale', 'html': '', 'success': False, 'cleaned_html': None, 'media': {}, 'links': {}, 'downloaded_files': None, 'js_execution_result': None, 'screenshot': None, 'pdf': None, 'extracted_content': None, 'metadata': {}, 'error_message': "'list' object has no attribute 'status_code'", 'session_id': None, 'response_headers': None, 'status_code': None, 'ssl_certificate': None, 'dispatch_result': {'task_id': '6cb58043-0180-4372-b951-01ba8c959baa', 'memory_usage': 36.921875, 'peak_memory': 36.921875, 'start_time': 1743657324.6034276, 'end_time': 1743657336.7007508, 'error_message': "'list' object has no attribute 'status_code'"}, 'redirected_url': None}]}
1
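
Note that the top-level 'success' is True while the single result inside 'results' has 'success': False, so the failure only shows up per result. A minimal sketch for surfacing those per-result errors, assuming the response keeps the shape shown in the log above; the helper name `failed_results` is hypothetical and not part of crawl4ai:

```python
def failed_results(response_json: dict) -> list:
    """Return (url, error_message) pairs for results whose own 'success' flag is false."""
    failures = []
    for result in response_json.get("results", []):
        if not result.get("success", False):
            failures.append((result.get("url"), result.get("error_message")))
    return failures


# Trimmed copy of the response from the log above (only the fields used here).
sample = {
    "success": True,
    "results": [
        {
            "url": "https://en.wikipedia.org/wiki/Mel_scale",
            "success": False,
            "error_message": "'list' object has no attribute 'status_code'",
        }
    ],
}

for url, message in failed_results(sample):
    print(f"{url}: {message}")
```

In the reproduction script, the same helper could be called on the parsed JSON body before counting results, so a lone failed entry is not mistaken for a successful one-page crawl.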

RemeLards added the 🐞 Bug and 🩺 Needs Triage labels on Apr 3, 2025
@aravindkarnam (Collaborator)

@RemeLards "'list' object has no attribute 'status_code'" was a known issue that was fixed in 0.5.0.post8. Update your version and the error should go away. I ran your exact input on that version and it worked for me!
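
To check whether an environment already has the fixed release, something like the following might help. It is only a sketch: the threshold 0.5.0.post8 is taken from the comment above, the distribution name `crawl4ai` is assumed to match what is installed, the version parser is a deliberately simplified one (it only handles `N.N.N` and `N.N.N.postM`), and the helper names are hypothetical. With the Docker setup from this report it would need to run inside the container, not on the host.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(ver: str) -> tuple:
    """Turn '0.5.0.post8' into ((0, 5, 0), 8); no post segment counts as -1.

    Simplified on purpose: not a full PEP 440 parser.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:\.?post(\d+))?", ver.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {ver!r}")
    release = [int(part) for part in match.group(1).split(".")]
    release += [0] * (3 - len(release))  # pad '0.5' to (0, 5, 0)
    post = int(match.group(2)) if match.group(2) is not None else -1
    return tuple(release), post


def has_fix(installed: str, required: str = "0.5.0.post8") -> bool:
    """True if `installed` is at or after the release said to contain the fix."""
    return parse_release(installed) >= parse_release(required)


def installed_crawl4ai_version():
    """Return the installed crawl4ai version string, or None if it is not installed."""
    try:
        return version("crawl4ai")
    except PackageNotFoundError:
        return None


current = installed_crawl4ai_version()
if current is None:
    print("crawl4ai is not installed in this environment")
else:
    print(f"crawl4ai {current}: fix present = {has_fix(current)}")
```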

@handyman7

@aravindkarnam Can you please point to where the 0.5.0.post8 source code can be found? It is probably not on the main branch yet. Is there a PR with this fix?

@RemeLards (Author)

As @handyman7 said, a 0.5.0.post8 branch isn't available.
