feature: Auto-generate llms.txt #1

Closed
pawamoy opened this issue Mar 2, 2025 · 7 comments · Fixed by #4
Assignees
Labels
feature New feature or request insiders Candidate for Insiders

Comments

@pawamoy
Owner

pawamoy commented Mar 2, 2025

Is your feature request related to a problem? Please describe.

@Kludex pointed out that what this plugin generates is llms-full.txt. Per the spec, the llms.txt file is just a summary:

  • H1 heading
  • optional quote or text paragraphs describing the project
  • one or more H2 headings, each followed only by a single list
  • each item is a link to another Markdown page, or to an arbitrary resource

Example: https://docs.cursor.com/llms.txt. More examples: https://llmstxthub.com/.
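
As a rough illustration of the outline listed above, here is a minimal checker sketch. It only encodes the four bullet points from this comment, not the full spec, and the function name `check_llms_txt` and the link regex are assumptions made for this example.

```python
import re

# A list item that is a Markdown link, optionally followed by ": description".
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)]+\)(: .*)?$")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means the outline looks right."""
    # Blank lines carry no structure here, so drop them up front.
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    problems = []
    if not lines or not lines[0].startswith("# "):
        problems.append("first non-blank line must be an H1 heading")
    in_section = False
    sections = 0
    for line in lines[1:]:
        if line.startswith("## "):
            in_section = True
            sections += 1
        elif line.startswith("# "):
            problems.append(f"unexpected second H1: {line!r}")
        elif in_section and not LINK_ITEM.match(line):
            # Inside an H2 section, only link list items are expected.
            problems.append(f"not a link list item: {line!r}")
        # Lines before the first H2 are the optional quote or paragraphs.
    if sections == 0:
        problems.append("expected at least one H2 section")
    return problems


sample = (
    "# Project\n\n> Summary.\n\n## Docs\n\n"
    "- [Home](https://example.com/index.md): start here\n"
)
print(check_llms_txt(sample))
```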

Describe the solution you'd like

mkdocs-llmstxt could:

  • generate a md file for each page
  • generate /llms.txt with the H1 heading, a Docs H2 heading, and links to all md pages

...but it couldn't (at least not automatically):

  • generate a quote or text paragraphs in llms.txt
  • generate descriptions of each link in llms.txt
  • organize links in more than one Docs section in llms.txt
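
The part that can be automated might look like the sketch below. It assumes each page's Markdown file is published at the same relative path as its source file, which is a choice made for this example; `render_llms_txt` and its parameters are hypothetical names, not the plugin's API.

```python
from urllib.parse import urljoin


def render_llms_txt(title, site_url, pages):
    """Build an llms.txt body: H1 heading, one "Docs" H2, one link per page.

    pages is a list of (page_title, relative_md_path) tuples.
    """
    base = site_url.rstrip("/") + "/"  # urljoin needs a trailing slash on the base
    lines = [f"# {title}", "", "## Docs", ""]
    for page_title, md_path in pages:
        lines.append(f"- [{page_title}]({urljoin(base, md_path)})")
    return "\n".join(lines) + "\n"


print(render_llms_txt(
    "My Project",
    "https://example.com/docs",
    [("Intro", "guide/intro.md"), ("API", "reference/api.md")],
))
```

The quote, the per-link descriptions, and any split into several sections are left out on purpose, since those are the parts that cannot be derived automatically.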

Describe alternatives you've considered

/

Additional context

/

pawamoy self-assigned this Mar 2, 2025
pawamoy added the feature (New feature or request) and insiders (Candidate for Insiders) labels, and added then removed the fund (Issue priority can be boosted) label, Mar 2, 2025
@Viicos
Contributor

Viicos commented Mar 12, 2025

@pawamoy I'd like to work on this one as we'll want to also have a proper llms.txt file for our documentation. Here is the proposed way to do so:

  • By default, we follow the spec, by including links in the llms.txt file.
  • We add mkdocs configuration values to provide a title, a short summary (displayed as a blockquote), and an optional Markdown description (displayed after the blockquote).
  • All inputs specified in inputs: are converted to Markdown, exposed in the final build, and included as URL links in the llms.txt file.
  • An extra optional_inputs: parameter is added, which can be used to provide links that should go in the optional section (TODO: how to handle exclusions/duplicates between inputs and optional_inputs).
  • Add a generate_full boolean configuration value, indicating whether an llms_full.txt file is generated.

Does that sound good to you?

@pawamoy
Owner Author

pawamoy commented Mar 13, 2025

Thanks for offering your help @Viicos!

My own suggestion is:

  • maintain current behavior
  • special case behavior when output file is exactly "llms.txt"
  • in this case, follow the spec: each input file gets its equivalent md file, and the output is our list of links, not a concatenation of the input files (that's your first and third point 👍)
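
A minimal sketch of that special case, assuming pages are available as `(title, md_url, markdown_text)` tuples. `build_output` is a hypothetical helper written for this example and does not reflect the plugin's actual code.

```python
def build_output(output, pages):
    """Return the content of the output file.

    pages is a list of (title, md_url, markdown_text) tuples.
    """
    if output == "llms.txt":
        # Spec-style output: a list of links to the per-page Markdown files.
        return "\n".join(f"- [{title}]({url})" for title, url, _ in pages) + "\n"
    # Any other file name keeps the current behavior: concatenate the pages.
    return "\n\n".join(text.strip() for _, _, text in pages) + "\n"


pages = [
    ("Intro", "https://example.com/intro.md", "# Intro\n\nHello.\n"),
    ("Usage", "https://example.com/usage.md", "# Usage\n\nRun it.\n"),
]
print(build_output("llms.txt", pages))
print(build_output("llms-full.txt", pages))
```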

About the title and summary, maybe we could already use site_name as title, and site_description as quote. Then we only need one new (optional) configuration option under llmstxt: markdown_description (or similar name).
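
The fallback described here could be sketched as follows. Plain dicts stand in for the MkDocs configuration and the plugin options, and `render_header` is a hypothetical name; `markdown_description` is only the option name floated in this comment.

```python
def render_header(mkdocs_config, plugin_config):
    """Build the top of llms.txt from existing MkDocs settings.

    site_name becomes the H1 title, site_description the blockquote,
    and the optional markdown_description is appended after it.
    """
    parts = [f"# {mkdocs_config['site_name']}"]
    description = mkdocs_config.get("site_description")
    if description:
        parts.append(f"> {description}")
    extra = plugin_config.get("markdown_description")
    if extra:
        parts.append(extra.strip())
    return "\n\n".join(parts) + "\n"


print(render_header(
    {"site_name": "My Project", "site_description": "A short summary."},
    {"markdown_description": "Longer notes about the project.\n"},
))
```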

About the optional inputs you mention: I wonder if we couldn't "simply" split input files into sections. If inputs is a list, treat it as a single Docs section. If inputs is a dict, use each key as a section name.

List (current behavior):

plugins:
- llmstxt:
    files:
    - output: llms.txt
      inputs:
      - file1.md
      - reference/*/*.md

Dict (new additional behavior):

plugins:
- llmstxt:
    files:
    - output: llms.txt
      inputs:
        Docs:
        - file1.md
        Optional:
        - reference/*/*.md
        External resources:
        - https://example.com

Such a dict format would only be useful when the output file is "llms.txt". We could support it for any file, but that means we'd have to shift headings when concatenating. I recommend we worry about that later.
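
Accepting both shapes could come down to a small normalization step, sketched below. `normalize_inputs` is a hypothetical helper for illustration; the only behavior taken from this comment is that a list maps to a single Docs section and a dict maps keys to section names.

```python
def normalize_inputs(inputs):
    """Turn the list or dict form of inputs into {section name: [entries]}."""
    if isinstance(inputs, dict):
        # Dict form: each key is already a section name.
        return {name: list(entries) for name, entries in inputs.items()}
    # List form (current behavior): everything goes into one "Docs" section.
    return {"Docs": list(inputs)}


print(normalize_inputs(["file1.md", "reference/*/*.md"]))
print(normalize_inputs({
    "Docs": ["file1.md"],
    "Optional": ["reference/*/*.md"],
    "External resources": ["https://example.com"],
}))
```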

WDYT 😄 ?

@pawamoy
Owner Author

pawamoy commented Mar 13, 2025

I can't find any mention of "llms-full.txt" on https://llmstxt.org/ anymore 🤔 So I'm starting to question the usefulness of being able to choose the output file. Is llms-full.txt still a thing?

@Viicos
Contributor

Viicos commented Mar 13, 2025

> Is llms-full.txt still a thing?

I'm not even sure it was a thing. It only comes up for an example project, FastHTML, and even there it is noted that they use a different, XML-based format. And their "full" output doesn't mean "concatenate everything", but rather that it includes the "optional" section.

Your proposed inputs format looks like it could work.

@pawamoy
Owner Author

pawamoy commented Mar 13, 2025

> I'm not even sure it was a thing.

Thanks. Then I'm open to completely reworking the plugin, so that it just follows the spec. We would remove output, rename inputs to sections, only accept a dict (not a list), and move it one layer up (replacing files).

plugins:
- llmstxt:
    sections:
      Docs:
      - file1.md
      Optional:
      - reference/*/*.md
      External resources:
      - https://example.com

In most cases the llms-full.txt file would be too big anyway.
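
To show how such a sections mapping might be resolved, here is a sketch that expands glob patterns against a list of known page paths and passes external URLs through unchanged. The helper names and the matching rule (each `*` stays within one path segment) are assumptions for this example, not the behavior of the released plugin.

```python
from fnmatch import fnmatchcase


def matches(path, pattern):
    """Match segment by segment so that * never crosses a / boundary."""
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    return len(path_parts) == len(pattern_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


def resolve_sections(sections, pages):
    """Expand each section's entries into concrete pages or external URLs."""
    resolved = {}
    for name, entries in sections.items():
        items = []
        for entry in entries:
            if entry.startswith(("http://", "https://")):
                items.append(entry)  # external resource, kept as-is
            else:
                items.extend(page for page in pages if matches(page, entry))
        resolved[name] = items
    return resolved


sections = {
    "Docs": ["file1.md"],
    "Optional": ["reference/*/*.md"],
    "External resources": ["https://example.com"],
}
pages = ["file1.md", "reference/api/module.md", "reference/index.md"]
print(resolve_sections(sections, pages))
```

Note that `reference/index.md` is not picked up here, because the pattern `reference/*/*.md` requires exactly one directory level below `reference/`.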

@pawamoy
Owner Author

pawamoy commented Apr 8, 2025

Released as 0.2.0.

@Kludex

Kludex commented Apr 8, 2025

Thanks. :)

3 participants