[RFC]: support for structured package data #1147
Comments
@Planeshifter Given your previous efforts to build scaffolding tooling, would be good to get your opinion on the above proposal and what, if any, additional structured information might be useful.
@Planeshifter Pinging you here, in case you have forgotten about this issue.
@kgryte is this in the works?
@Snehil-Shah Sort of. We've created a Google sheet for collecting this information, but that effort has stalled. Something like this would be rather useful, but it involves a fair amount of manual labor, and we haven't had the bandwidth to push forward.
Opening up a tracking issue for this one should help us move forward, since it does require a good number of additions to be made. Should we open one?
From what I gather, resolving this will help in the scaffolding process of both the GSheets project and developing C implementations, right?
@adityacodes30 Before opening up a tracking issue, we need to settle on the desired path forward. But, yes, this is also relevant to the scaffolding process for both GSheets and the C implementation work.
Generally, I think we should start with
My main concern, and it is for me a serious one, is that this increases duplication of package documentation even more, which is already quite excessive. If we undertake this, I think it's necessary to at the same time build tooling (either LLM-assisted or just deterministic) that scaffolds out the required other files such as
As for the proposed schema, it seems sensible. I would drop
Here are my answers to the raised questions:
Not sure. Maybe something for testing. Should it support
This is a pretty deep rabbit hole and not something that should be encoded in metadata, I think. Burden would be on the person populating the metadata to make sure any constraints are satisfied.
Probably most that have
In my view, bloat will not be an issue. Metadata would be stripped when publishing packages, so this would only affect the development environment.
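As a rough sketch of what stripping the metadata at publish time could look like: the helper name `stripMetadata`, the `scaffold` subfield name, and the example package object below are all assumptions made for illustration, not part of any existing publishing tooling.

```javascript
// Returns a copy of a parsed package.json object with a development-only
// metadata subfield removed from the `__stdlib__` configuration object.
// NOTE: `stripMetadata` and the `scaffold` field name are hypothetical.
function stripMetadata( pkg, field ) {
	// Deep copy via JSON round-trip; safe here because package.json is plain JSON.
	var out = JSON.parse( JSON.stringify( pkg ) );
	var cfg = out.__stdlib__;
	if ( cfg && Object.prototype.hasOwnProperty.call( cfg, field ) ) {
		delete cfg[ field ];
	}
	return out;
}

// Illustrative input only; the metadata contents are made up.
var pkg = {
	'name': '@stdlib/math/base/ops/add',
	'__stdlib__': {
		'scaffold': {
			'alias': 'add'
		}
	}
};

var published = stripMetadata( pkg, 'scaffold' );
console.log( JSON.stringify( published, null, 2 ) );
```

Working on a copy leaves the in-repo object untouched, so the development environment keeps the metadata while the published artifact does not carry it.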
Re: extra keywords. The point here is that there are keywords which are universal for a particular conceptual function and which should be included in all downstream scaffolded packages, and others which are not universal and which a scaffolding tool may, or may not, be interested in using.
I don't have a simple answer here. To me, it is a balance of trade-offs. Right now, the situation is not tenable, as we need to individually define example ranges, aliases, etc., for all higher-order packages (e.g., strided, iter, ndarray), which vastly outweighs the maintenance and creation burden if we bite the bullet when creating a base package in the first place.
PR-URL: #2893 Ref: #1147 Co-authored-by: Athan Reines <[email protected]> Reviewed-by: Athan Reines <[email protected]> Signed-off-by: Gunj Joshi <[email protected]>
PR-URL: #2912 Ref: #1147 Reviewed-by: Athan Reines <[email protected]>
PR-URL: #2914 Ref: #1147 Co-authored-by: Athan Reines <[email protected]> Reviewed-by: Athan Reines <[email protected]> Signed-off-by: Athan Reines <[email protected]>
PR-URL: #2922 Ref: #1147 Co-authored-by: Athan Reines <[email protected]> Reviewed-by: Athan Reines <[email protected]>
PR-URL: #2927 Ref: #1147 Co-authored-by: Athan Reines <[email protected]> Reviewed-by: Athan Reines <[email protected]> Signed-off-by: Gunj Joshi <[email protected]> Signed-off-by: Athan Reines <[email protected]>
Description
This RFC proposes adding structured package data to facilitate automation and scaffolding.
Overview
The need for structured package data has been discussed at various points during stdlib development. This need has become more pressing when seeking to automate specialized package generation for packages which wrap "base" packages for use with other data structures. The most prominent example is the math/base/special/* APIs, which are wrapped to generate a variety of higher-order packages, including
math/iter
math/strided
math/ generics supporting ndarrays, arrays, and scalars
and more recently in work exposing those APIs in spreadsheet contexts. In each context, one needs to
and in some contexts
While various attempts have been made to automate scaffolding of higher-order packages, where possible, each attempt has relied on manual entry of necessary scaffold data, including parameter names, descriptions, and example values. To date, we have not created a centralized database from which we pull desired package meta data.
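To make the idea of pulling scaffold data from one central record more concrete, here is a minimal sketch. The record shape (`alias`, `parameters`, `name`, `desc`, `example`) and the helpers `scaffoldParams` and `scaffoldUsage` are hypothetical and chosen only for illustration; they are not the schema proposed in this RFC.

```javascript
// Hypothetical metadata record for a "base" package. Field names are
// illustrative assumptions, not an agreed-upon schema.
var meta = {
	'alias': 'add',
	'parameters': [
		{ 'name': 'x', 'desc': 'first input value', 'example': 1 },
		{ 'name': 'y', 'desc': 'second input value', 'example': 2 }
	]
};

// Builds JSDoc-style parameter lines from the shared record.
function scaffoldParams( meta ) {
	return meta.parameters.map( function toLine( p ) {
		return '* @param {number} ' + p.name + ' - ' + p.desc;
	});
}

// Builds an example invocation from the same record.
function scaffoldUsage( meta ) {
	var args = meta.parameters.map( function toArg( p ) {
		return String( p.example );
	});
	return 'var v = ' + meta.alias + '( ' + args.join( ', ' ) + ' );';
}

console.log( scaffoldParams( meta ).join( '\n' ) );
console.log( scaffoldUsage( meta ) );
```

Because both helpers read from the same record, a higher-order package generator would not need its own hand-entered copy of parameter names, descriptions, or example values.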
Proposal
In this RFC, I propose adding structured meta data to "base" packages. This structured meta data can then be used in various automation contexts, most prominent of which is automated scaffolding.
The meta data would be stored as JSON in a subfield of the __stdlib__ configuration object of package.json files. The choice of JSON stems from the ability to use JSON Schema for validation and linting.
Examples
I've included two examples below.
math/base/ops/add:
stats/base/dists/arcsine/pdf:
Annotated Overview
Discussion
Related Issues
No.
Questions
a < b. In the example JSON, I've simply manually adjusted the PRNG parameters and the example values to ensure we don't run afoul of that constraint. It was not clear to me how we might include such constraints in a universal way which is machine parseable and actionable in scaffolding tools.
package.json files. This could lead to bloat in the package.json files. Another possibility is putting such info in a separate .stdlibrc file in the root package directory. Would this be preferable?
Other
No.
cc @Planeshifter
Checklist
RFC: