[RFC]: Develop a Google Sheets extension which exposes stdlib functionality #57
Comments
@adityacodes30 Thank you for sharing a draft of your proposal. A few comments:
Got it. Scaffolding does make sense, and I will integrate it in the steps that follow. That should significantly reduce the time needed to implement the packages. I think scaffolding and automation will save me enough time to work on the CAS and perhaps the ndarray APIs as well. About parsing: I found this. Math.js does provide a parser, and it also supports a scope in which functions can be defined, which could be used to call stdlib APIs. It should prove useful, although some R&D is still needed. If need be, we can also build a parser in-house.
Hi @adityacodes30, thanks for sending this draft proposal! I see you have a very ambitious project with a lot of deliverables, which we may need to refine in order to better scope the whole project and set you up for success. Please use these questions as guidance,
I would recommend that you first think about the tutorial(s) and documentation in order to understand and plan which APIs would make the most sense to work on early. Also, please take into account that I estimate a tutorial needs around 3 weeks to develop.
Thank you @steff456. That does put things more into perspective. I will incorporate the suggestions. Yes, documentation is key. With this additional information, I think work on the tutorials would likely continue into the extended coding period and beyond the program.
Full name
Aditya Sapra
University status
Yes
University name
Thapar Institute Of Engineering and Technology
University program
Computer Engineering
Expected graduation
2026
Short biography
Hi, I am Aditya Sapra, and I am currently in my 4th semester pursuing Computer Engineering at Thapar Institute of Engineering and Technology. Additionally, I hold the CS50 certificate from Harvard University. I am passionate about computer science and have been actively developing projects. While my primary language of choice is JavaScript, I also have experience working with C, C++, and Python. I am very interested in backend systems, computer networking, cloud, and operating systems.
I have hands-on experience with technologies such as MongoDB, PostgreSQL, React, Node, and Express; message queues such as RabbitMQ and Apache Kafka; and deployment technologies such as Docker, Docker Compose, and Kubernetes. Apart from that, I have a keen interest in statistics and machine learning.
My formal coursework to date includes Computer Programming, Object Oriented Programming, Computer Networks, Operating Systems, DBMS, Data Structures, and Design and Analysis of Algorithms, along with 2 undergraduate-level Mathematics subjects.
Timezone
India Standard Time GMT+5:30
Contact details
email: [email protected], linkedin: https://www.linkedin.com/in/aditya-sapra-a70475252/, github: adityacodes30
Platform
Mac
Editor
I use VSCode as my daily driver due to its rich extension support and wide adoption. I use nano when I'm working on remote Linux VMs.
Programming experience
Creating things through code that materialize into real-world usage has always been a motivating factor in my programming journey. I have been programming for ~2 years, during which I've created some projects that I am truly proud of. I have working experience with JavaScript, C, C++, and Python. I have made projects encompassing a range of different technologies and domains. I've listed some notable ones below:
Catalog Scoring for Open Network for Digital Commerce
Deepword
Food festival website
Agency Website
AppscriptDB
Other Projects [ Linktree ]
JavaScript experience
The first language I learned was JavaScript. The majority of my projects and programming experience has been in JavaScript, and it is my go-to language for building things. I have used JS to build an array of things, ranging from backend APIs to complex frontend animations. More recently, my contributions to stdlib have helped me gain a deeper understanding as well. I like JavaScript because of its intuitive syntax, cross-platform flexibility, and huge ecosystem, but most of all for its unparalleled community support. It is easy to build and deploy things that have the potential to be at the forefront of the impact that technology makes. That said, I would like to see JavaScript natively adopt some TypeScript fundamentals, such as data types.
Node.js experience
I have used Node.js almost exclusively to create the backend APIs across my projects. I have working experience with Node coupled with libraries such as express, cors, jsonwebtoken, and more. I have handled many core backend functionalities with Node, such as callback APIs, server-side code, file handling, and database connections.
C/Fortran experience
I learned C during my coursework across two semesters, as well as through the CS50 assignments, where I implemented a variety of algorithms and programs, such as a greedy algorithm and recovering images from a wiped memory card. I also have a broad overview of memory management in C and some experience working with makefiles.
Interest in stdlib
When I first came across stdlib, I was beyond delighted. Since JS originated as a browser-centric language, a standard library was never part of the plan. But as the language and the world evolved, the need arose. Stdlib seems like the answer.
What excites me about the project is the accessibility it provides to the general population. Often, people might not have the technical know-how or the resources to run statistical functions via code. I think stdlib has the potential to solve that issue by running in the browser itself. The potential is endless. I also see it being standardized in education because of the ease it provides. I think stdlib, along with the tools built on top of it, has the potential to make a big impact on the educational field.
Version control
Yes
Contributions to stdlib
Merged PRs
feat: add string/base/last-grapheme-cluster
feat: add string/base/last-code-point
feat: add string/base/last
feat: add array/base/join
Implements is-positive finite
feat: add assert/is-same-date-object
As of the time of submission. You can view all PRs here.
Code review - 1413
Issues
[RFC]: Add @stdlib/string/last
Goals
By the end of this project, I plan to fully implement the G-Sheets project, which enables people to use stdlib and all its related functionality. Although work on this has already started, a lot remains to be done before it reaches a POC phase. I have tried my best to go through stdlib and research which functions would be most in demand and most suitable for exposing in Google Sheets.
I will be dividing the project into phases spread across 12 weeks.
Phase 1 - Getting the base packages ready!
235-260h
According to the TODO here, a total of 11 namespaces need to be implemented, spanning ~800 packages, of which 68 are implemented. That leaves 729+ packages still to be implemented. I intend to commit ~30-40 minutes on average per package, including the following steps
A summary of the packages is given below:
Note: This is flexible, depending on further discussion and the scope of the project.
Additional tests with GAPTS
Phase 2 - Implementing 2D array semantics
35h
When working with arrays, arrays of different shapes can be combined under certain rules (broadcasting). Existing and custom wrapping logic will be used to perform these operations, based on the discussions and scope set during the three-week community bonding period. stdlib/ndarray will be particularly helpful in implementing these.
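To make the idea of combining differently shaped arrays concrete, here is a minimal sketch of NumPy-style broadcasting for the 2D case. The function names broadcastShapes and broadcastBinary are hypothetical and written only for illustration; the actual rules are still to be decided with the mentors, and the stdlib ndarray utilities may make parts of this unnecessary.

```javascript
/**
* Returns the shape obtained by broadcasting two shapes together, or null if
* the shapes are incompatible. Dimensions are compared from the right; two
* dimensions are compatible when they are equal or when one of them is 1.
*/
function broadcastShapes( a, b ) {
	var n = Math.max( a.length, b.length );
	var out = new Array( n );
	var da;
	var db;
	var i;
	for ( i = 0; i < n; i++ ) {
		// Missing leading dimensions are treated as having size 1:
		da = ( i < a.length ) ? a[ a.length-1-i ] : 1;
		db = ( i < b.length ) ? b[ b.length-1-i ] : 1;
		if ( da !== db && da !== 1 && db !== 1 ) {
			return null;
		}
		out[ n-1-i ] = ( da === 1 ) ? db : da;
	}
	return out;
}

/**
* Applies a binary function element-wise to two non-empty 2D arrays
* (rows x columns), stretching any dimension of size 1.
*/
function broadcastBinary( fcn, x, y ) {
	var shape = broadcastShapes( [ x.length, x[ 0 ].length ], [ y.length, y[ 0 ].length ] );
	var out = [];
	var row;
	var i;
	var j;
	if ( shape === null ) {
		throw new RangeError( 'shapes cannot be broadcast together' );
	}
	for ( i = 0; i < shape[ 0 ]; i++ ) {
		row = [];
		for ( j = 0; j < shape[ 1 ]; j++ ) {
			// A dimension of size 1 always reads index 0:
			row.push( fcn(
				x[ ( x.length === 1 ) ? 0 : i ][ ( x[ 0 ].length === 1 ) ? 0 : j ],
				y[ ( y.length === 1 ) ? 0 : i ][ ( y[ 0 ].length === 1 ) ? 0 : j ]
			) );
		}
		out.push( row );
	}
	return out;
}
```

For example, a 2x1 column combined with a 1x3 row yields a 2x3 result, which maps naturally onto a rectangular block of cells.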
Phase 3 - Implementing performant fused operations
35h
Because all functions are executed as RPCs (Remote Procedure Calls), multiple operations need to be fused into one so that they complete in a single server call, reducing both the number of network requests and latency. A wrapper function will be explored to chain multiple function calls per call. A number of standard, frequently used fused operations can be included as well. There will also be work on performant element-wise iteration APIs. To handle volume, we need to create APIs that can iterate over the spreadsheet data efficiently, possibly by chunking the data and processing it in batches. Operation fusion will likely require developing a computer algebra system (CAS), because Google Sheets will treat the data as strings, and we will need to parse it and then call stdlib APIs.
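As a rough illustration of the wrapper idea, the sketch below applies a comma-separated chain of unary functions to every cell of a range in one call, so that a formula such as =FUSED(A1:B2, "abs, sqrt") would need a single round trip instead of one per operation. The names FUSED and REGISTRY are hypothetical, and the Math functions are stand-ins for stdlib APIs so that the example stays self-contained.

```javascript
// Hypothetical registry of exposed functions (stand-ins for stdlib APIs):
var REGISTRY = {
	'abs': Math.abs,
	'sqrt': Math.sqrt,
	'round': Math.round
};

/**
* Applies a chain of named unary functions, left to right, to a scalar or to
* every cell of a 2D range. Always returns a 2D array.
*/
function FUSED( range, names ) {
	var rows = Array.isArray( range ) ? range : [ [ range ] ];
	var fcns = String( names ).split( ',' ).map( function resolve( name ) {
		var key = name.trim();
		// Only accept own properties so that, e.g., 'toString' is rejected:
		if ( !Object.prototype.hasOwnProperty.call( REGISTRY, key ) ) {
			throw new Error( 'unknown function: ' + key );
		}
		return REGISTRY[ key ];
	});
	return rows.map( function onRow( row ) {
		return row.map( function onCell( value ) {
			return fcns.reduce( function apply( acc, fcn ) {
				return fcn( acc );
			}, value );
		});
	});
}
```

A string-based chain is only one possible design; a parsed expression, as discussed for the CAS, would allow functions with more than one argument.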
Phase 4 - Documentation and tutorials
35h
This phase rounds off the aforesaid work by tying everything together via documentation. While the individual READMEs will be created when the packages are added, this phase will add higher-level documentation for namespaces and packages, among other things. Tutorials on the usage of API functions will be set up as well via a subpage on the main stdlib website, with video snippets if time permits. This completes the core part of the execution.
Phase 5 - Adding side panel in sheets for users
25h
This feature will allow users to interact with a side panel in the Sheets app itself to search functions on the fly and view relevant descriptions and related tutorials. This will add a level of interactivity that makes usage extremely beginner friendly.
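The search behind such a panel could be as simple as the following sketch, which filters a function catalog by name or description. The catalog entries and the function names (CATALOG, searchFunctions, STDLIB_IS_EVEN, and so on) are hypothetical; in practice the catalog would presumably be generated from package metadata. In Apps Script, a sidebar is typically displayed with HtmlService and SpreadsheetApp.getUi().showSidebar(), and the panel's client code can call a server-side function like this one through google.script.run.

```javascript
// Hypothetical catalog entries, for illustration only:
var CATALOG = [
	{ 'name': 'STDLIB_IS_EVEN', 'description': 'Tests whether a value is an even number.' },
	{ 'name': 'STDLIB_IS_PRIME', 'description': 'Tests whether a value is a prime number.' },
	{ 'name': 'STDLIB_LEFT_PAD', 'description': 'Left pads a string to a given length.' }
];

/**
* Returns the catalog entries whose name or description contains the query
* (case-insensitive). An empty query returns no results.
*/
function searchFunctions( catalog, query ) {
	var q = String( query ).trim().toLowerCase();
	if ( q === '' ) {
		return [];
	}
	return catalog.filter( function matches( entry ) {
		return (
			entry.name.toLowerCase().indexOf( q ) !== -1 ||
			entry.description.toLowerCase().indexOf( q ) !== -1
		);
	});
}
```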
Phase 6 - Streamlining Build processes
15h
Once the initial proof of concept has been made, we will need to push updates regularly, as is the nature of open source. Therefore, a CI/CD pipeline will be set up to update the deployments. This can also be done before week one to ease development. For now, it is reserved for last.
Note: The phases are not necessarily sequential; as with anything else in development, the steps cut across one another.
Why this project?
Stdlib has excited me because it uses JavaScript to run in the browser environment, which opens the door to accessibility where sufficient computing resources might not be present. But to harness that power, we need to build solutions on top of it so that it can benefit end users. This project does exactly that. I think implementation of this project will
Because this is a user-facing project, I am enthusiastic that my work might impact thousands of people. It is also a way for me to accelerate my learning journey as a quality SWE.
Qualifications
I have working proficiency in JavaScript and Node.js. I also have experience with backend technologies, through which I have become very familiar with networking, APIs, and code optimisation, among other things. I have worked with Google Workspace and Apps Script, which play a pivotal role in this project.
I have also been working on this project through issue #3, which has given me a deeper understanding of the existing processes and repository structure. Working on generating git hooks and the related makefiles has given me an understanding of the underlying code and processes. I have added 6+ packages to the main repo, which demonstrates my understanding of the codebase and general practices, and I have consistently learned from feedback to produce quality PRs. More recently, I have picked up the book Introduction to Algorithms to further my algorithmic skills and write efficient code.
Prior art
This project has already started at gsheets.
However, I found that a repo has been implemented that brings Lodash to Gsheets: lodashgs. It uses the main lodash (a JS utility library) directory as a submodule.
For array broadcasting, I found NumPy's implementation interesting.
We intend to use a parser; Math.js has one.
Commitment
Keeping in mind the scope described, I intend to work full-time-equivalent hours on this project over the 12-week period, extending it to 16 weeks. In the community bonding period, I intend to discuss the scope of the project and finalise the checkpoints with the mentors. As soon as that is done, I will begin implementation, starting with generating CI/CD Git workflows to streamline further development.
My year-end / summer break falls across the coding period, so I will be fully available to concentrate my energy on this project. ~30h/week
After the program, I intend to explore monetisation measures for this project and implement tensor and notebook related functionality.
Schedule
Assuming a 12 week schedule,
During the community bonding period, I plan to work with the mentors to discuss the final scope and set clear, confirmed goals and milestones for the project. I also plan to implement the Git CI/CD workflows and decide on a temporary build process for the upcoming weeks. I plan to start on the project during the bonding period itself if all deliberation agendas have been met.
Week 1 will see the implementation of the base packages mentioned above. To avoid maintainer noise and clutter, I will spread my pull requests for the packages throughout the week and group similar packages into one PR if need be. Ideally, week 1 should see the implementation of ~45+ packages.
Packages to be implemented from assert namespace
contains
is-absolute-http-uri
is-absolute-path
is-absolute-uri
is-alphagram
is-alphanumeric
is-anagram
is-ascii
is-between
is-binary-string
is-blank-string
is-boolean
is-capitalized
is-composite
is-cube-number
is-current-year
is-digit-string
is-email-address
is-empty-string
is-even
is-falsy
is-finite
is-hex-string
is-infinite
is-integer
is-leap-year
is-localhost
is-lowercase
is-nan
is-negative-integer
is-negative-number
is-nonnegative-integer
is-nonnegative-number
is-nonpositive-integer
is-nonpositive-number
is-number
is-odd
is-positive-integer
is-positive-number
is-prime
is-probability
is-regexp-string
is-relative-path
is-relative-uri
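Because a Sheets custom function receives a single value when given one cell and a 2D array when given a range, each exposed predicate needs to handle both. The following is a minimal sketch of how a scaffolded wrapper for the assert packages might do that. The names wrapPredicate and STDLIB_IS_EVEN are hypothetical, and isEven is a local stand-in for the corresponding stdlib package so that the example is self-contained.

```javascript
/**
* Stand-in for an stdlib predicate; tests whether a value is an even integer.
*/
function isEven( value ) {
	return ( typeof value === 'number' && Number.isInteger( value ) && value % 2 === 0 );
}

/**
* Wraps a predicate so that it accepts either a single cell value or a 2D
* range, in which case the predicate is applied to every cell.
*/
function wrapPredicate( predicate ) {
	return function wrapped( input ) {
		if ( Array.isArray( input ) ) {
			return input.map( function onRow( row ) {
				return row.map( function onCell( cell ) {
					return predicate( cell );
				});
			});
		}
		return predicate( input );
	};
}

// Hypothetical name under which the function would be exposed to the sheet:
var STDLIB_IS_EVEN = wrapPredicate( isEven );
```

With a helper of this kind, most packages in the list reduce to a one-line registration, which is where scaffolding and automation could save the most time.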
Week 2 will include implementing further packages and finalising the PRs of the previous week.
Packages to be implemented from assert:
is-semver
is-square-number
is-string
is-triangular-number
is-truthy
is-unc-path
is-uppercase
is-uri
is-whitespace
deepequal
is-camelcase
is-complex
is-constantcase
is-even
is-kebabcase
Note: is-complex will likely need an additional utility to parse a string into a complex number, which is discussed further on in the proposal.
Packages to be implemented from string:
acronym
code-point-at
ends-with
format
from-code-point
left-pad
left-trim-n
left-trim
num-grapheme-clusters
pad
percent-encode
remove-first
remove-last
remove-punctuation
remove-words
repeat
replace
reverse
right-pad
right-trim-n
right-trim
split-grapheme-clusters
starts-with
substring-after-last
substring-after
substring-before-last
substring-before
trim
truncate-middle
truncate
Packages to be implemented from random:
base/randi
base/randn
base/randu
sample (refactoring)
shuffle (refactoring)
Week 3 will include implementing further packages and finalising the PRs of the previous week. In week 3, I will complete the implementation of the array and number namespaces.
Packages to be implemented from array:
datespace
incrspace
logspace
unitspace
cartesian-square
cartesian-product
cartesian-power
n-cartesian-product
one-to
zero-to
take
I have noticed in my research that the cartesian-related functions are much needed. We will need to figure out how to render the resulting products in Google Sheets.
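One possible rendering, sketched below, is to return one pair per row so that the product spills into a two-column block of cells. This is only an assumption about a layout that might work; the function names toList and cartesianProductRows are hypothetical and do not correspond to the stdlib array packages.

```javascript
/**
* Flattens a scalar, a 1D array, or a 2D range into a flat list of values.
*/
function toList( input ) {
	if ( !Array.isArray( input ) ) {
		return [ input ];
	}
	return input.reduce( function flatten( acc, item ) {
		return acc.concat( item );
	}, [] );
}

/**
* Returns the Cartesian product of two inputs as a 2D array in which each
* row holds one pair.
*/
function cartesianProductRows( a, b ) {
	var x = toList( a );
	var y = toList( b );
	var out = [];
	var i;
	var j;
	for ( i = 0; i < x.length; i++ ) {
		for ( j = 0; j < y.length; j++ ) {
			out.push( [ x[ i ], y[ j ] ] );
		}
	}
	return out;
}
```

Products of more than two inputs would simply add columns under the same layout.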
Packages to be implemented from number:
float64/base/exponent
float64/base/from-binary-string
float64/base/from-words
float64/base/get-high-word
float64/base/get-low-word
float64/base/normalize
float64/base/set-high-word
float64/base/set-low-word
float64/base/signbit
float64/base/to-binary-string
float64/base/to-words
uint16/base/from-binary-string
uint16/base/to-binary-string
uint32/base/from-binary-string
uint32/base/rotl
uint32/base/rotr
uint32/base/to-binary-string
uint8/base/from-binary-string
uint8/base/to-binary-string
Week 4 will see the implementation of the math/base/special packages, which include the following packages (Link), and the implementation of the nlp packages:
expand-acronyms
expand-contractions
ordinalize
porter-stemmer
tokenize
Week 5 would see the implementation of stats:
anova1
binomial-test
chi2gof
chi2test
fligner-test
kruskal-test
kstest
levene-test
lowess
padjust
pcorrtest
ranks
ttest
ttest2
vartest
wilcoxon
ztest
ztest2
base/*
base/dists/*
Implementation of stats will require implementing stats/base/*, which exposes statistical tests. This step will likely take time due to the volume of packages present. I will use scaffolding and automation processes to set up the APIs.
By week 6, a lot of code will have been written and be ready for the midterm evaluation! Beyond the midterm evaluation, I will focus on getting the backlog of old PRs over the finish line and implementing the simulate packages. I will also research stats here.
awgn
awln
awun
bartlett-hann-pulse
bartlett-pulse
cosine-wave
flat-top-pulse
hann-pulse
lanczos-pulse
periodic-sinc
pulse
sine-wave
square-wave
triangle-wave
In this week I will implement the complex namespace. This will make it possible to perform operations on complex numbers. By default, complex numbers are represented in Google Sheets as strings. My approach here will be to use the parser we have in stdlib to convert the string inputs to complex numbers. The complex namespace will require some R&D to implement. However, its implementation will be quite useful, as it will have wide-ranging implications.
I plan to implement complex/base/assert and complex, and to use additional utilities, such as wrapping functions and the parser from complex/base, to facilitate development.
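To show what the string-to-complex step involves, here is a small self-contained sketch that parses cell strings such as "3+4i" into real and imaginary parts. It is not the stdlib parser mentioned above; parseComplex and its regular expressions are hypothetical stand-ins, and the set of accepted formats (an "i" or "j" suffix, an optional real part) is an assumption.

```javascript
// Unsigned decimal number with an optional exponent, e.g. 3, 2.5, .5, 1e-3:
var U = '(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?';

var RE_FULL = new RegExp( '^([+-]?' + U + ')([+-])(' + U + ')?[ij]$' ); // a+bi
var RE_IMAG = new RegExp( '^([+-]?)(' + U + ')?[ij]$' ); // bi
var RE_REAL = new RegExp( '^[+-]?' + U + '$' ); // a

/**
* Converts a sign and an optional magnitude into an imaginary component.
* A missing magnitude (as in "1+i") is treated as 1.
*/
function imag( sign, digits ) {
	var v = ( digits === void 0 ) ? 1 : parseFloat( digits );
	return ( sign === '-' ) ? -v : v;
}

/**
* Parses a cell value into an object with `re` and `im` fields. Returns null
* if the value is not recognized as a real or complex number.
*/
function parseComplex( value ) {
	var s = String( value ).replace( /\s+/g, '' );
	var m = RE_FULL.exec( s );
	if ( m ) {
		return { 're': parseFloat( m[ 1 ] ), 'im': imag( m[ 2 ], m[ 3 ] ) };
	}
	m = RE_IMAG.exec( s );
	if ( m ) {
		return { 're': 0, 'im': imag( m[ 1 ], m[ 2 ] ) };
	}
	if ( RE_REAL.test( s ) ) {
		return { 're': parseFloat( s ), 'im': 0 };
	}
	return null;
}
```

The resulting pair could then be passed to the relevant stdlib complex constructor before calling the wrapped function.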
In this week I plan to implement the BLAS (basic linear algebra subprograms) namespace, in which the blas/base and blas/ext/base functions would be particularly useful in computations. These have wide-ranging applications and would be among the most used packages according to my research.
In this week I will implement the 2D array semantics according to the rules and operations decided. This should take around 40 hours, as it will include a good amount of R&D and a number of iterations to settle on the final code. A pull request will be generated for each RFC.
Beyond completing the 2d array semantics and broadcasting apis, Week 10 will focus on implementing performant fused operations, focusing on optimising multiple function calls into single server calls. I will research and experiment with element-wise iteration APIs to ensure efficiency and begin exploring the creation of APIs for efficient iteration over spreadsheet data, considering chunking and batch processing.
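A minimal sketch of the chunking idea follows: the rows of a range are split into fixed-size batches, each batch is processed by a callback, and the results are stitched back together in order. The name processInChunks and the callback contract (each batch returns an array with one output row per input row) are assumptions made for illustration; the right batch size and whether chunking pays off at all would need to be measured.

```javascript
/**
* Processes the rows of a 2D range in batches of at most `chunkSize` rows.
* The callback receives one batch and must return an array of output rows.
*/
function processInChunks( rows, chunkSize, fcn ) {
	var out = [];
	var res;
	var i;
	var j;
	if ( !Number.isInteger( chunkSize ) || chunkSize < 1 ) {
		throw new RangeError( 'chunk size must be a positive integer' );
	}
	for ( i = 0; i < rows.length; i += chunkSize ) {
		// `slice` clamps the end index, so the last batch may be shorter:
		res = fcn( rows.slice( i, i+chunkSize ) );
		for ( j = 0; j < res.length; j++ ) {
			out.push( res[ j ] );
		}
	}
	return out;
}
```

The same helper could wrap a fused operation so that large ranges are handled batch by batch within a single call.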
In week 11, I will complete the implementation of performant fused operations, ensuring that multiple operations are efficiently combined into single server calls. I will also finalize the element-wise iteration APIs, ensuring they meet performance requirements and handle large volumes of data effectively. A final PR for phase 3 will be generated in week 12.
The additional tasks involved are
- Design a wrapper function for chaining multiple function calls into a single RPC.
- Prototype and test the wrapper function with common use cases.
By the end of this week, I aim to have production-ready performant fused operation APIs with tests. I will also continue with the documentation process, focusing on creating higher-level documentation for namespaces and packages.
I will shift focus to documentation and tutorials, starting with the creation of higher-level documentation for namespaces and packages. I will begin drafting tutorials on API usage, considering both written and video formats. Furthermore, I will set up a subpage on the main stdlib website to host the tutorials and related documentation. There will be a page for each namespace, and therefore ~13-14 pages. The website documentation will describe the functionality and definitions in detail. I plan to write 2 tutorials, which will cover the stats and math namespaces.
By the end of this week I aim to have
I will continue working on adding the side panel in sheets for users, focusing on the design and user interface aspects. The interface will also link to tutorials. Additionally, I will start streamlining build processes to set up a CI/CD pipeline for future updates.
During the final week, I will focus on completing any pending tasks, conducting final testing and debugging, and preparing for the final evaluation. I will ensure that all deliverables are well-documented and ready for deployment, making any last-minute updates or improvements as necessary.
Notes:
Related issues
#13