This repository was archived by the owner on Feb 15, 2022. It is now read-only.

Markdown processing pipeline design #245

Closed
ghost opened this issue Mar 31, 2021 · 13 comments


ghost commented Mar 31, 2021

Create a diagram and brief explanation of the phases in markdown processing and how data will be passed between phases.

Demonstrate the contents of the markdown after each transformation.

Related to #222, #243, #229, #137.

@ghost ghost self-assigned this Mar 31, 2021
@ghost ghost changed the title Markdown rendering design Markdown processing pipeline design Mar 31, 2021

ghost commented Mar 31, 2021

Possible steps:

  • use ocaml-mdx and dune to compile snippets
  • generate table of contents and inject heading ids into markdown or generated html
  • render markdown to html
  • apply code highlight at build time or when displaying
  • check that internal links point to valid pages; this can be implemented in two phases: dump the links to a file, then check the link file
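A minimal sketch of the two-phase link check from the last step, assuming internal links are inline markdown links whose target starts with "/". The names `extractInternalLinks`, `dumpLinks`, `checkLinkFile` and the JSON link-file format are hypothetical, chosen only for illustration; nothing in this issue fixes them.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Hypothetical shapes, for illustration only.
type Page = { path: string; markdown: string };
type LinkRecord = { source: string; target: string };

// Collect targets of inline links such as [text](/path#fragment).
// Assumption: internal links start with "/". Fragments are dropped.
// Reference-style links are not handled; inline images are also matched.
function extractInternalLinks(markdown: string): string[] {
  const links: string[] = [];
  const pattern = /\[[^\]]*\]\((\/[^)\s#]*)[^)]*\)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Phase 1: dump every internal link, with the page it came from, to a file.
function dumpLinks(pages: Page[], linkFile: string): void {
  const records: LinkRecord[] = [];
  for (const page of pages) {
    for (const target of extractInternalLinks(page.markdown)) {
      records.push({ source: page.path, target });
    }
  }
  writeFileSync(linkFile, JSON.stringify(records, null, 2));
}

// Phase 2: read the link file back and return links whose target is not a known page.
function checkLinkFile(linkFile: string, validPages: Set<string>): LinkRecord[] {
  const records: LinkRecord[] = JSON.parse(readFileSync(linkFile, "utf8"));
  return records.filter((record) => !validPages.has(record.target));
}

// Usage: two pages, one of which links to a page that does not exist.
const linkFile = join(mkdtempSync(join(tmpdir(), "links-")), "links.json");
dumpLinks(
  [
    { path: "/docs", markdown: "See [install](/docs/install) and [faq](/docs/faq#top)." },
    { path: "/docs/install", markdown: "Back to [docs](/docs)." },
  ],
  linkFile
);
const broken = checkLinkFile(linkFile, new Set(["/docs", "/docs/install"]));
console.log(broken);
```

Splitting the work this way lets phase 1 run per page during the build, while phase 2 runs once after all pages are known.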


ghost commented Apr 2, 2021

Are there hooks in the Next.js build process to run a task before any pages build or after all pages have been built? Alternatively, should we introduce inotifywait to implement a third command, make markdown:watch?


ghost commented Apr 3, 2021

We should use a different file extension for inputs / outputs of each stage. Possibly:

.omdx, .md, .mdtoc


ghost commented Apr 6, 2021

Leaning towards Ashish's suggestion: as the first step of processing a markdown file, ReScript invokes an external operating system process that runs the OCaml logic. The process writes its result to stdout, and ReScript continues working with that result.
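A rough sketch of that hand-off, written in TypeScript against Node's `child_process` module rather than ReScript. The helper name `runExternalStep` is hypothetical, and the real command line for the OCaml tool is not decided in this issue, so the demo at the bottom uses the Node binary itself as a stand-in process.

```typescript
import { spawnSync } from "child_process";

// Run an external command and return whatever it wrote to stdout.
// Throws if the process cannot be started or exits with a non-zero status.
function runExternalStep(command: string, args: string[]): string {
  const result = spawnSync(command, args, { encoding: "utf8" });
  if (result.error) {
    throw result.error; // for example, the command was not found
  }
  if (result.status !== 0) {
    throw new Error(`${command} exited with status ${result.status}: ${result.stderr}`);
  }
  return result.stdout;
}

// Hypothetical real-world call (tool name and arguments are assumptions):
//   const processed = runExternalStep("some-ocaml-tool", ["path/to/page.md"]);

// Stand-in demo: a child Node process plays the role of the OCaml tool.
const demoOutput = runExternalStep(process.execPath, [
  "-e",
  "process.stdout.write('# Hello from the child process')",
]);
console.log(demoOutput);
```

The synchronous call keeps the sketch short; an asynchronous `spawn` or `execFile` would fit just as well if the caller is already async.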


ghost commented Apr 8, 2021

We need to experiment with invoking ocaml-mdx soon, in order to get more clarity on what is possible. In particular, will dune become a runtime dependency of the nextjs build?

Member

agarwal commented Apr 9, 2021

So far I'm thinking of a system with 3 build tools in this order:

  1. dune. Builds an ocaml CLI tool ocamlorg. Then contains other rules that call ocamlorg to build various files. These rules might generate .md, .yaml, and even .res files.
  2. rescript. Now ReScript takes over. It continues the build by compiling .res to .js files.
  3. next. Last, NextJS does its work.

All 3 tools support a watch mode, so ideally a developer launches all 3 and gets incremental builds across all stages.


ghost commented Apr 9, 2021

I would move that comment into a new issue. This issue is for the specifics of integrating ReScript with remark, omd, and ocaml-mdx, possibly using a Node.js child process. The comment above addresses this at a high level, but it is also useful for structuring the build for other processing such as #140.


ghost commented Apr 9, 2021

It will be useful to understand rescript-lang's highlightjs integration (https://github.com/rescript-association/rescript-lang.org/blob/master/src/common/HighlightJs.res) while contemplating this design.


ghost commented Apr 10, 2021

Starting to add diagrams and descriptions of options here - https://docs.google.com/document/d/1dCfHnYwCM8-lCZZ38ZwpLBrP_Ru6NISBHS12qO2Wo90/edit?usp=sharing


ghost commented Apr 12, 2021

I'm going to run with spikes to explore how the following combination might work:

  • when a page is building and runs getStaticProps, we call out to a node child process which runs ocaml-mdx to deal with code snippets. The result of this step will be a string written to stdout and read through the node child_process library. (related to Evaluate ocaml code in markdown files #229)
  • parse the values in the string from prior phases
  • generate the table of contents with custom code operating on the AST generated by remark-js
  • pass the table of contents data to the table of contents component, and pass the markdown body to rehype-react (or mdxjs)
  • integrate with highlight.js to enhance the output from rehype-react (is this feasible?) (related to Implement code highlighting #137)
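To make the table of contents step more concrete, here is a small sketch that walks a hand-built tree in the mdast shape (root, heading with `depth`, text and inlineCode with `value`) that remark produces. It does not call remark itself; the `MdNode` interface is a simplified stand-in, and `buildToc`, `slugify`, and the id scheme are assumptions for illustration, not the agreed design.

```typescript
// Simplified stand-in for mdast nodes; real remark trees carry more fields.
interface MdNode {
  type: string;
  depth?: number;
  value?: string;
  children?: MdNode[];
}

interface TocEntry {
  id: string;
  depth: number;
  text: string;
}

// Concatenate the text of a node and all of its descendants.
function plainText(node: MdNode): string {
  if (node.value !== undefined) return node.value;
  return (node.children ?? []).map(plainText).join("");
}

// Assumed id scheme: lowercase, runs of non-alphanumerics become one hyphen.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Build one entry per top-level heading; repeated slugs get a numeric suffix.
// Headings nested inside other blocks (such as blockquotes) are skipped.
function buildToc(root: MdNode): TocEntry[] {
  const seen = new Map<string, number>();
  const toc: TocEntry[] = [];
  for (const node of root.children ?? []) {
    if (node.type !== "heading" || node.depth === undefined) continue;
    const text = plainText(node);
    const base = slugify(text);
    const count = seen.get(base) ?? 0;
    seen.set(base, count + 1);
    toc.push({ id: count === 0 ? base : `${base}-${count}`, depth: node.depth, text });
  }
  return toc;
}

// Usage with a tiny hand-written tree.
const exampleTree: MdNode = {
  type: "root",
  children: [
    { type: "heading", depth: 1, children: [{ type: "text", value: "Getting Started" }] },
    { type: "paragraph", children: [{ type: "text", value: "Some prose." }] },
    {
      type: "heading",
      depth: 2,
      children: [
        { type: "text", value: "Install " },
        { type: "inlineCode", value: "dune" },
      ],
    },
  ],
};
console.log(buildToc(exampleTree));
```

The resulting entries could be passed as props to the table of contents component, while the same ids are injected into the rendered headings so the links resolve.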

Based on the findings from the spikes, I will either continue pursuing that design, or regroup and either come up with a new design or select one of the other alternatives.


ghost commented Apr 14, 2021

The spike #290 did not reveal any significant problems with using node child_process or using ocaml-mdx. I'm going to pursue the design noted above.

@ghost ghost closed this as completed Apr 14, 2021
@agarwal agarwal unassigned ghost Apr 22, 2021

ghost commented May 4, 2021

If we decide to use omd later, we can use rehype-parse and rehype-react to accept generated html.

@agarwal agarwal assigned ghost May 19, 2021

ghost commented Jun 10, 2021

@tmattio This issue might be interesting for you. There is no action needed here.

This issue was closed.