[Bug]: Markdown output has incorrect spacing. #599
Labels
💪 - Beginner
Difficulty level - Beginners
🐞 Bug
Something isn't working
⚡ High
Priority - High
📌 Root caused
identified the root cause of bug
⚙️ Under Test
Bug fix / Feature request that's under testing
Comments
interested @aravindkarnam
@tautikAg Thanks for showing interest. The next release is on Feb 15th, so plan to raise a PR 2-3 days in advance.
@tautikAg Hi. Were you able to make progress on this?
hey @aravindkarnam, I am testing rn. Will tag you on the PR soon (in a few hrs)
aravindkarnam added a commit that referenced this issue Feb 12, 2025
Fix Markdown Incorrect Spacing #599
crawl4ai version
0.4.247
Expected Behavior
I'm trying to scrape a page from the Blender manual at https://docs.blender.org/manual/en/4.3/editors/outliner/interface.html
The markdown should look a little more like this (scraped with jina-ai):
Notice the spacing between paragraphs.
Current Behavior
Instead it messes up the spacing like so:
Notice that the spacing between paragraphs is messed up. LLMs can pick up this paragraph proximity.
Is there any option in CrawlerRunConfig that I should know about that can fix this? @aravindkarnam @unclecode
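For anyone hitting the same thing, here is a minimal post-processing sketch that could be applied to the markdown string after crawling. It is not crawl4ai's fix and does not use any crawl4ai API. It assumes the problem is an irregular number of blank lines between paragraphs (the issue does not include the actual output, so that is a guess), and the function name `normalize_paragraph_spacing` is hypothetical.

```python
import re


def normalize_paragraph_spacing(markdown: str) -> str:
    """Separate paragraphs by exactly one blank line.

    Hypothetical helper, not part of crawl4ai. Assumes the spacing
    problem is runs of extra (or whitespace-only) blank lines.
    """
    # Normalize Windows / old-Mac line endings to "\n".
    text = markdown.replace("\r\n", "\n").replace("\r", "\n")
    # Strip trailing whitespace so whitespace-only lines become empty.
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    # Collapse three or more newlines (two or more blank lines) into two.
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Drop leading/trailing blank lines and end with a single newline.
    return text.strip("\n") + "\n"


if __name__ == "__main__":
    sample = "Para one.\n\n\n\n  \nPara two.   \n\n\nPara three."
    print(normalize_paragraph_spacing(sample))
```

Note that this treats the whole string uniformly, so it would also collapse blank lines inside fenced code blocks and remove trailing double-space hard line breaks. If those matter for your pages, the function would need to skip such regions.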
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
Code snippets
OS
macOS
Python version
3.11.9
Browser
Edge
Browser version
No response
Error logs & Screenshots (if applicable)
No response