
[RFC]: develop C implementation for base special mathematical & base statistical distribution functions #107


Open
7 tasks done
Neerajpathak07 opened this issue Mar 17, 2025 · 7 comments
Labels
2025 2025 GSoC proposal. received feedback A proposal which has received feedback. rfc Project proposal.

Comments

@Neerajpathak07

Neerajpathak07 commented Mar 17, 2025

Full name

Neeraj Pathak

University status

Yes

University name

Vishwakarma Institute of Technology

University program

Bachelor of Technology in Instrumentation & Control Engineering

Expected graduation

May 2027

Short biography

I am a 2nd-year undergraduate student at Vishwakarma Institute of Technology, currently pursuing a B.Tech in Instrumentation & Control Engineering. My interest in programming began in my first year, when I started learning C/C++ and Python along with core concepts like data structures and algorithms, object-oriented programming, and database management systems. By the end of my first year I had entered the world of web development, gaining solid knowledge of and practice with JavaScript, TypeScript, React, Next, Vue, Redis, MongoDB, and Express.js. This helped me extensively in the early days of my 2nd year to build projects and work with newer technologies. Starting to contribute to open source was also one of my best decisions, as it gave me hands-on coding practice and let me learn along the way. I am also keen on learning about DevOps, Machine Learning, Artificial Intelligence, and Data Science.

Timezone

Indian Standard Time ( IST ), UTC+5:30

Contact details

Email:- [email protected], [email protected]
Github:-Neerajpathak07
LinkedIn:- Neeraj Pathak

Platform

Windows

Editor

I prefer Visual Studio Code because of its access to thousands of extensions, including GitHub Actions and EditorConfig, which help in running workflows and keeping code indentation consistent, and hence assist in maintaining code quality.

Programming experience

Having no prior technical background, I started with the very basics by learning C/C++ and Python in the 1st year of college, as these were beginner-friendly languages. After that I dived into data structures and algorithms and practiced competitive programming alongside learning core concepts like object-oriented programming and computer networks. I also explored mathematical and scientific computing, alongside machine learning, using Python.

My 2nd year was when I was introduced to web development and joined a coding club at my college (Microsoft Learn Students Club), which helped me learn full-stack development with technologies like JavaScript, TypeScript, React, Next, Vue, and many more.
This club also gave me an opportunity to work on industry-level projects like:

BISM :- We made a website for this library, situated in India, handling a database of 30,000+ books using Redis Insight. I was very new at the time, so I couldn't contribute as effectively as I wanted, but I worked on UI elements, created an admin page, and designed the backend structure for my club heads. This project taught me the basics of web development and, although I didn't contribute much, it laid the building blocks for my journey in web dev.

The first project I made whilst having a firm grip on Web Development was:-

React-Code-editor:- Through this project I got a basic idea of what APIs are and how to call them. I also performed GET and POST requests while working on React components to create an interactive website.

My introduction to open source was through Hacktoberfest’24, where I learned to contribute to open-source repositories.
My first contribution was related to data structures and algorithms: I uploaded a few code files on binary search trees, linked lists, heaps, and arrays.
After that I explored CircuitVerse where, after consistent effort, I was fortunate enough to become a member working on issue triaging. Its purpose is to sort, prioritize, and route issues to ensure things flow smoothly. I also review PRs and assign issues to new contributors to familiarize them with the codebase.

JavaScript experience

Learning JavaScript was the building block of learning web development for me, and I did it through a course in the 1st year of college. Thanks to my professors I was able to grasp it in no time. Learning JavaScript helped me understand how a website really works, what async/await is, and what callbacks and promises really do, which strengthened my basics. After that I could easily learn and understand various frameworks and tools like React, Node.js, MongoDB, and Express.

The majority of my contributions to stdlib were in JavaScript, which gave me hands-on practice working with API calls, writing tests and examples, and creating JSON dependencies, and helped me explore other advantages of its rich ecosystem.

I have always seen JavaScript as a very powerful tool which can enhance the skill set of beginners and open doors to the vast world of web development. It is also quite easy to grasp (personal opinion) and widely used with frameworks like React, Express, Node.js, and many more, through which one can build interactive and complex websites.

However, according to last year’s GitHub report, Python has overtaken JavaScript as the #1 language. This suggests that, slowly but steadily, the demand for AI and machine learning is rising. As people migrate to the currently popular technologies, the demand for JavaScript is in some turmoil.

Node.js experience

Outside of stdlib I have only worked with npm packages and client-side JavaScript code. Working on stdlib packages was when I got to fully explore the potential of Node.js, working on Node APIs and documenting them.

C/Fortran experience

I started learning C programming at the beginning of my 1st year of college. I gave it quite some time, which deepened my knowledge of various components of C, along with sufficient knowledge of ASCII and character encoding formats for text data in computers and on the internet. Additionally, I worked on a few projects using various statements and loops incorporated inside custom functions. While learning data structures I covered basic topics like arrays, linked lists, and stacks using C, coding the logic without any built-in code snippets to get a crystal-clear understanding of how each data structure really works.

I have sufficient knowledge of and experience in Fortran, although I have worked with it much less for stdlib packages.

Interest in stdlib

I was keen on learning data analytics and mathematical computing, and while researching it I found that there are several libraries providing algorithms for optimization and for mathematical, statistical, algebraic, differential, and integral equations.
But there was a problem: these were mostly written in Julia or another language in which I was not proficient at the time. I then recalled that while learning C programming and working on mathematical operations in it, we would start by adding

#include <stdlib.h>

first, and I always wondered: what does this file do? While researching it further I came across the stdlib repo, and after making my way around the repo all my questions were answered.

While exploring the repository over a prolonged period, I discovered that stdlib offers packages ranging from ndarray, statistical, and basic linear algebra to mathematical ones and many more. stdlib also has an in-house read-eval-print loop (REPL) which offers various options, like running examples with custom input values, finding existing aliases as well as related ones, and quite a few more.

stdlib offers functionality similar to Julia or SciPy, but in very beginner-friendly languages like C and JavaScript. That is also why I want to contribute to this big library and bring some meaningful changes and ideas to the table. Inside the stdlib community, mentors and contributors bond with each other, helping one another with the issues they face and discussing ideas on a weekly basis.

The maintainers also welcomed new contributors wholeheartedly and guided me to my first contribution.

Which I think is the true spirit of OpenSource.

Version control

Yes

Contributions to stdlib

Merged PR’s

Open PR’s

Issues

stdlib showcase

Scientific-Calculator

Goals

The goal of this project is to develop C implementations for math/base/special and stats/base/dists packages, in both single and double precision, and to add newer packages based on upstream implementations to expand the scope of the library. Upon completion of this project we’ll be able to add support for other core functionality of the library, like strided, blas, and ndarrays, along with achieving feature parity with the upstream libraries.

Moreover, I will extend the effort on constants/float32 packages so that each double-precision constant in constants/float64 has an equivalent single-precision version, which lays the groundwork for quite a few single-precision packages in math/base/special. Expanding the scope of constants/float64 will also support adding NEW packages based on upstream implementations.
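To illustrate what a double-precision package and its single-precision counterpart might look like side by side, here is a minimal sketch. The function and macro names (`example_deg2rad`, `example_deg2radf`, `EXAMPLE_FLOAT64_PI_OVER_180`, `EXAMPLE_FLOAT32_PI_OVER_180`) are hypothetical and chosen only for this example; they are not the actual stdlib symbols.

```c
/* Hypothetical constants for this sketch: pi/180 in each precision. */
#define EXAMPLE_FLOAT64_PI_OVER_180 0.017453292519943295
#define EXAMPLE_FLOAT32_PI_OVER_180 0.017453292519943295f

/* Double-precision: converts an angle from degrees to radians. */
double example_deg2rad( const double x ) {
	return x * EXAMPLE_FLOAT64_PI_OVER_180;
}

/* Single-precision counterpart: same logic, but with a float constant */
/* and float arithmetic, so no double-precision dependency is needed.  */
float example_deg2radf( const float x ) {
	return x * EXAMPLE_FLOAT32_PI_OVER_180;
}
```

The point of the sketch is that the single-precision version needs its own float constant, which is why the constants/float32 work has to land alongside the single-precision math/base/special packages.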

Apart from this, I will also work side by side on adding support for structured package data to facilitate automation and scaffolding for math/base/special packages and other base packages which can be used with other data structures.

Why this project?

What really drives my passion for this project is the potential to work on something that serves user requirements for scientific computing and mathematical operations.
Furthermore, adding core functionality and utilities like those in Julia, SciPy, and NumPy, in C and JavaScript, has a promising future: it can be accessible through websites and web browsers and can also cater to beginners interested in this field.
All in all, I am excited to work on this to upskill and to work alongside the community to learn and grow more.

Qualifications

I have studied C/C++ and JavaScript since the beginning of my undergrad and have learned core concepts like data structures and algorithms alongside object-oriented programming. I have also learned and practiced several JavaScript frameworks and tools like React, Next, MongoDB, and Express.

During my contribution phase at stdlib I was glad to gain an immense amount of experience with C, JavaScript, and Node.js, coding essentially in the infamous “stdlib way”.
Working on multiple math/base/special packages gave me invaluable experience, helping me understand what goes into the development of higher-level APIs and utilities.
Working on stdlib packages was also when I thoroughly read the documentation for various upstream implementations like:-

This gave me a basic idea of their code structure and how a package should be implemented for such libraries built by industry professionals.

Moreover, what I might lack in skills I make up for with my determination to learn. Even when I was struggling early on with my contributions to stdlib, I did not give in easily; I constantly took guidance from the maintainers and worked consistently to improve my code quality.

Prior art

A major chunk of the double-precision and a few single-precision implementations for math/base/special functions, as well as constants/float64, were worked on last summer by Gunj Joshi in #41.
This was followed by consistent contributions from contributors working on single-precision versions of the existing double-precision ones, ref:- #649.

For stats/base/dists the generic packages have been implemented as part of good-first-issues.

Additionally, work on automation and scaffolding is in progress and was also worked on the previous summer, ref:- #1147. Prior work has also been done on various packages like pow and exp.

Commitment

As I do not have any prior commitments, I believe I am in a good position to invest 35+ hrs/week, bringing the project length to 350+ hours. I am also open to working on this a bit longer to add newer packages and resolve any bugs or documentation work.

I’ll also have a month-long vacation from college after exams, so I will give the maximum amount of time to this project.

Schedule

Implementation Plan

For adding math/base/special packages I have divided the priority order of these implementations into 3 phases:-

  • Phase 1:- Developing C implementations for existing math/base/special and stats/base/dists packages, along with adding standalone packages for both.

  • Phase 2:- Adding the single- and double-precision packages listed in #649 which haven't been implemented yet. After that, developing C implementations for single-precision math/base/special packages, referencing the upstream implementations and the existing double-precision packages.

  • Phase 3:- Adding NEW packages listed in the Phase 3 graph, referencing the upstream implementations.
    I have scoped this to very few packages since I will invest the maximum amount of time in ensuring proper implementation of Phase 1 and Phase 2. However, I would consult the mentors first on the priority and importance of these to the stdlib repo.

I will be referencing the following Implementation plan that I have laid out phase-wise:-

Phase_1.pdf
Phase_2.pdf
Phase_3.pdf

A big thanks to Gunj Joshi for providing invaluable insights on dependencies in #41.

Since we base our packages primarily on upstream implementations, I’ll refer to these mathematical libraries while adding new packages and developing float32 implementations for existing float64 versions:-

  • Julia

  • Scipy

  • FreeBSD

  • Boost

  • Golang

  • Cephes

  • Amos
    and more.

  • Working in parallel on adding single-precision constants to constants/float32 for the corresponding constants/float64 packages, as and when we move forward with adding single-precision and NEW packages for math/base/special.

  • Getting the PRs made by contributors on stats/base/dists packages over the finish line. In the worst case, if a contributor has been inactive for a while, I'll implement them myself, but only after we have implemented a major chunk of the project.

  • Ensuring that the existing functions in math/base/special strictly follow the referenced upstream implementation. Likewise, I will refactor any package which deviates from the in-house code conventions.

  • Moreover, I will refer to IEEE 754 for single-precision floating-point values and precision limits, and to C99 while dealing with complex numbers.
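As a rough illustration of the single-precision limits mentioned in the last point, the sketch below uses the standard `<float.h>` macro `FLT_EPSILON` to show where float32 stops resolving differences around 1.0. The helper name `example_changes_onef` is hypothetical, and the sketch assumes `float` is an IEEE 754 binary32 value with round-to-nearest.

```c
#include <float.h>

/* Returns 1 if adding `delta` to 1.0f changes the stored float32 value, */
/* and 0 otherwise. The `volatile` store forces rounding to float even   */
/* on platforms which evaluate expressions in a wider precision.         */
int example_changes_onef( const float delta ) {
	volatile float y = 1.0f + delta;
	return ( y != 1.0f );
}
```

Under those assumptions, `example_changes_onef( FLT_EPSILON )` reports a change while `example_changes_onef( FLT_EPSILON / 4.0f )` does not, which is the kind of limit that test tolerances for single-precision packages need to respect.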

Challenges

• While working on single-precision packages we’ll have to check whether the corresponding modules exist for float32 values.
• While adding new packages from Julia we need to check whether the implementation requires prerequisites, such as packages in the constants and number namespaces.
• Being familiar with the binary format of float32 in order to properly understand packages like fromWordf & toWordf.
• Using appropriate hexadecimal & binary values with accurate variable names.
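The float32 binary format mentioned above can be explored with a small sketch that reinterprets a `float` as its 32-bit word and back. The function names here (`example_to_wordf`, `example_from_wordf`, `example_exponent_bitsf`) are hypothetical and are not the stdlib APIs; the sketch assumes `float` is an IEEE 754 binary32 value.

```c
#include <stdint.h>
#include <string.h>

/* Copies the bit pattern of a float into an unsigned 32-bit word. */
uint32_t example_to_wordf( const float x ) {
	uint32_t w;
	memcpy( &w, &x, sizeof( w ) );
	return w;
}

/* Copies an unsigned 32-bit word into a float with the same bit pattern. */
float example_from_wordf( const uint32_t w ) {
	float x;
	memcpy( &x, &w, sizeof( x ) );
	return x;
}

/* Extracts the 8 biased exponent bits (bits 23 through 30) of a float. */
uint32_t example_exponent_bitsf( const float x ) {
	return ( example_to_wordf( x ) >> 23 ) & 0xFFu;
}
```

Using `memcpy` rather than a pointer cast avoids strict-aliasing problems. Under the binary32 assumption, `1.0f` maps to the word `0x3F800000`, whose biased exponent is 127.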

Schedule

Assuming a 12 week schedule,

  • Community Bonding Period: I plan to start my work from the first week itself, developing C implementations for existing math/base/special packages along with adding standalone packages which do not need any prerequisite dependencies.

Phase_1.pdf

Referencing Phase 1 listed packages and Standalone packages.

  • Week 1: Wrapping up the Phase 1 implementations and resolving any bugs. Moving on, I will add C implementations for existing stats/base/dists packages, e.g. arcsine/logcdf, beta/quantile, etc.

  • Week 2: By this time I will start adding the double-precision packages listed in #649 which haven’t been implemented yet, in parallel adding essential packages to the constants and number namespaces in order to use them as in-house dependencies while working on these implementations.

  • Week 3: I aim to complete the addition of the majority of the double-precision packages in this week itself, in order to have more room to work on the corresponding single-precision implementations.

  • Week 4: I will work on any bugs, errors, or backlog from the previous week. From this week I’ll start on the major chunk of the project, which is adding single-precision implementations for the existing double-precision packages.

Phase_2.pdf

I’ll also be referencing the Phase 2 dependency graph.
As listed in the graph, powf acts as the center for many 32-bit packages, and I have started working on it beforehand to have it in place by the time we start working on Phase 2.

  • Week 5: Continuing the work on single-precision packages, aiming to complete the majority of them by the end of this week.

  • Week 6: (midterm) I’ll dedicate this week to completing any previous backlog and fixing outstanding bugs.

  • Week 7: After adding a major chunk of packages in math/base/special, I will move forward with adding new packages to stats/base/dists referencing the upstream implementations, since we’ll need a number of math/base/special packages as dependencies while implementing them.

Phase_3.pdf

Will be working my way through the packages listed in Phase 3.

  • Week 8: In this week I plan to invest some time in resolving errors and completing any backlog left, alongside ensuring that all the existing implementations are in line with the upstream implementations. Towards the end of the week I will set up the blueprints and prerequisites for the work ahead on Phase 3.

  • Week 9: From the start of this week I will add NEW packages based on the upstream implementations, in both single and double precision, after consulting with the mentors on the priority and importance of these to the stdlib repo.

Phase_3.pdf

I will be referencing the Phase 3 mathematical packages, which have been drawn from scipy.special based on which packages are heavily and commonly used.

  • Week 10: As we come closer to the finish line, I aim to start working on automation and scaffolding to support math/base/special packages like powf, refactoring a few packages, and extending this to base packages like strided and generic arrays and scalars.

  • Week 11: I'll dedicate this entire week to closely evaluating my work so far and aim to wrap up errors, bug fixes, documentation, and any remaining backlog.

  • Week 12: Further work on adding upstream packages after consulting with the mentors on their importance.

  • Final Week: In this week I will go through any remaining backlog, work on any packages left to be implemented as scoped in this project, and resolve any scaffolding errors, submitting the final project after approval from the mentors.

Notes:

  • The community bonding period is a 3 week period built into GSoC to help you get to know the project community and participate in project discussion. This is an opportunity for you to set up your local development environment, learn how the project's source control works, refine your project plan, read any necessary documentation, and otherwise prepare to execute on your project proposal.
  • Usually, even week 1 deliverables include some code.
  • By week 6, you need enough done at this point for your mentor to evaluate your progress and pass you. Usually, you want to be a bit more than halfway done.
  • By week 11, you may want to "code freeze" and focus on completing any tests and/or documentation.
  • During the final week, you'll be submitting your project.

Related issues

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • I have read and understood the stdlib showcase requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.
@Neerajpathak07 Neerajpathak07 added 2025 2025 GSoC proposal. rfc Project proposal. labels Mar 17, 2025
@Neerajpathak07
Author

Neerajpathak07 commented Mar 17, 2025

CC:- @kgryte @Planeshifter @gunjjoshi
Will be updating the proposal for the following points in the upcoming week:-

  • Removing C implementations for variadic interfaces from the dependency graphs.
  • adding the Stdlib-showcase
  • Scoping automation and scaffolding for recent updates and releases.

@kgryte
Member

kgryte commented Mar 31, 2025

@Neerajpathak07 Thank you for opening this RFC. A couple of comments/questions:

  1. You've included stats distributions in this RFC. Do you have a rough idea of how many packages still require C implementations in the stats/base/dists namespace?
  2. You've listed various libraries providing reference implementation for special math functions. How do you plan to assess which reference implementation we should follow? Do you have a specific set of criteria you are planning to use?

@Neerajpathak07
Author

Neerajpathak07 commented Mar 31, 2025

@kgryte Thank you for your review

  1. Yes. While drafting this proposal I researched which in-house stats/base/dists packages still require C implementations and found that approximately 50+ packages still do NOT have them. To be realistic and achieve the goal within the proposed timeline, I plan to work on the C implementations of packages with high usage and priority. How did I decide this? Some stats/base/dists packages like beta are closely tied to math/base/special packages such as betaincv, and the exponential packages are also used explicitly. I am mainly targeting the following distributions:- beta, exponential, gamma, and uniform, which are also available in scipy.stats and have high use in upstream libraries.
    These sum up to about 15-19 packages.
    I have also created this dependency-graph to get an idea of the prerequisite math/base/special packages.
    Post GSoC we can work on the C implementations of the rest of the packages.

  2. I have divided which library I'll reference while working on the implementations as follows:-

  • FreeBSD:- for adding single-precision packages for the existing double-precision ones, along with adding Bessel functions of the 1st and 2nd kinds of order 0.
  • Julia, SciPy & Boost:- for inverse, airy, and ellip functions, and prominently for packages listed in Phase_3.pdf.
  • Golang:- for hypot, tanh & pow10, relying on this library more to get a basic understanding of the function parameters and arguments along with their special cases.
  • Amos:- I will refer to this library, alongside C99, while working on special functions for complex values and arguments.
  • Additionally, I will mention which library I have referred to by adding a link to it in the JavaScript and C implementations.
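To make the scope in point 1 a bit more concrete, here is a minimal sketch of what a C implementation of a distribution function could look like, using the uniform CDF as the simplest case. The function name `example_uniform_cdf` and its handling of invalid parameters (returning NaN when `a >= b`) are assumptions made for this sketch, not the actual stdlib API.

```c
#include <math.h>

/* Evaluates the CDF of a continuous uniform distribution on [a, b] at x. */
/* Returns NaN if any argument is NaN or if a >= b (assumed convention).  */
double example_uniform_cdf( const double x, const double a, const double b ) {
	if ( isnan( x ) || isnan( a ) || isnan( b ) || a >= b ) {
		return NAN;
	}
	if ( x < a ) {
		return 0.0; /* Below the support. */
	}
	if ( x >= b ) {
		return 1.0; /* At or above the upper bound. */
	}
	return ( x - a ) / ( b - a );
}
```

Distributions such as beta and gamma follow the same shape (validate parameters, handle the edges of the support, then evaluate) but delegate the final step to math/base/special functions, which is why the dependency graph matters for ordering the work.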

@kgryte kgryte added needs feedback received feedback A proposal which has received feedback. and removed needs feedback labels Apr 3, 2025
@Neerajpathak07
Author

Neerajpathak07 commented Apr 5, 2025

@kgryte @gunjjoshi Is there anything else you would suggest to enhance this proposal?

@kgryte
Member

kgryte commented Apr 6, 2025

Nothing more to add from me.

@gunjjoshi
Member

stdlib showcase
Scientific-Calculator

This link seems to be broken.
Apart from this, nothing more from me as well. Best of luck, @Neerajpathak07!

@Neerajpathak07
Author

@gunjjoshi Yeah, apparently I had kept it as a private repository in order to make some changes. I'll make it public now for you to view!
