[RFC]: add LAPACK bindings and implementations for linear algebra #108
Labels
2025
2025 GSoC proposal.
received feedback
A proposal which has received feedback.
rfc
Project proposal.
Full name
Aayush Khanna
University status
Yes
University name
Indian Institute of Technology (Banaras Hindu University), Varanasi
University program
Major in Civil Engineering
Expected graduation
2027
Short biography
I am a sophomore at IIT Varanasi, majoring in Civil Engineering. I have loved spending time with computers since middle school. I have a strong background in mathematics from my coursework, and in my free time I enjoy tinkering with my personal projects. DevOps is one of my many areas of interest and I'm pursuing it aggressively.
I've worked as a full-stack developer intern at NumberLabs, where I worked on the frontend of the product, server side code, database connections and CI using TypeScript and GitHub Actions. I have also been recognized as one of the Core Contributors to stdlib-js for my contributions. I participate in hackathons regularly and I love the thrill of building cool stuff under a time crunch. I won the first prize in StackHack 2.0 organized by Mercer | Mettl, where I developed the backend of our app using TypeScript and Hono.js and deployed it to Cloudflare Workers. I also organize hackathons and conduct workshops on various programming languages as a member of the programming club of my college.
I'm constantly learning and picking up new stuff! It was during one of these adventures that I landed on stdlib-js. I was really excited when my first PR in an open source project, for @stdlib/assert/is-same-accessor-array, was merged, and I haven't looked back since then.
Timezone
Indian Standard Time (UTC +5:30)
Contact details
email:[email protected], github:aayush0325
Platform
Linux
Editor
I use VS Code for development because of the rich extension marketplace, the dev container support and because it's open sourced.
Programming experience
My programming experience consists of many different projects, and I've also worked as a full stack developer.
JavaScript experience
I've mainly used JavaScript for frontend development using frameworks like React and Next.js along with libraries like MantineUI and MaterialUI.
I love JavaScript for its rich ecosystem including frameworks and libraries for frontend development, cross-platform native apps, CLI applications, authentication services and of course numerical computing! This allows me to use JavaScript for quick prototyping, fun side projects and as a general purpose scripting language. I love the various runtimes that exist for JavaScript such as Node, Deno and Bun. I use Bun for a lot of my side projects for the performance boost that it provides as compared to Node.
I dislike JavaScript's lack of static types, as it makes working with large codebases challenging. However, TypeScript fixes this issue.
Node.js experience
I've gained experience with Node.js over the past few months through my contributions to stdlib-js and my tenure as a full-stack developer intern. I use Node.js for general purpose scripting and to develop backend APIs for my projects. I'm familiar with various frameworks and libraries such as Express, Prisma, Mongoose, Hono.js, Auth.js, and Passport.js. Recently, I've been implementing LAPACK routines in Node.js as part of my contribution to stdlib-js.
C/Fortran experience
I learnt C as part of my intro to CS course in college, which focused on C programming and data structures. I've gained experience working with C from my work here at stdlib, where I added the ndarray APIs in C for packages in stats/strided/*, refactored native add-ons from C++ to C, and developed C implementations for mathematical and statistical functions. Although I don't have a lot of experience in Fortran, I'm comfortable reading and writing it to go through LAPACK source code, re-implement LAPACK routines, and test my implementations against the Fortran implementation.
Interest in stdlib
I believe that stdlib addresses a very large gap in the JavaScript ecosystem, namely numerical computing. While there have been other attempts at this, stdlib takes the effort to a much larger scale, which can have a massive impact on the JavaScript ecosystem and encourage developers to use it more. The development philosophy of stdlib takes the long way of implementing all the required algorithms and utilities from scratch rather than stringing together unrelated open source packages without any uniformity, which is why stdlib has both the lower level kernels and the higher level user facing APIs. This means that stdlib can also be used as a base layer for developing other libraries and frameworks in JavaScript. I really believe in the long term vision of stdlib, and this is why I'm excited to contribute to it.
I also applaud the maintainers for creating a beginner friendly environment and I would love to maintain the same environment for all the newer contributors to come. stdlib will always be special to me as I made my first open source contribution here :)
Version control
Yes
Contributions to stdlib
I've been contributing to stdlib for a while now, and I was recently recognized as one of the Core Contributors to the project for my efforts. Having authored over 200 PRs and 600 commits to the main branch of our repository, I've helped with a lot of things around here, such as:
stats/base/*.
ndarray APIs for double precision and single precision packages in stats/base/*.
stats/base/* to stats/strided/*.
stats/base/dists/*.
blas/ext/base/wasm/*.
Additionally, I keep revisiting our code base to find any gaps in our documentation, examples and other errors that may have been missed by the reviewers. My most recent efforts include automating the process of migrating packages from stats/base/* to stats/strided/* and adding LAPACK routines to stdlib.
I've also helped with the onboarding of new contributors by writing the contributing FAQs doc and by documenting the process of setting up a dev container to set up the stdlib repository locally for development. I regularly help out with code review on PRs which are related to the stats/base/* namespace of the project.
stdlib showcase
As a showcase project, I want to present napi-stdlib-cli! napi-stdlib-cli is a CLI tool to generate native addons for really simple C functions so that we can run them in Node.js. This project leverages @stdlib/napi to write the interface between C and Node.js, @stdlib/utils/library-manifest to resolve dependencies, and node-gyp to compile the native addon. @stdlib/napi is a beautiful layer of abstraction over N-API (node-api) bindings which makes it really simple to write addon files and brings uniformity to our code base. Native addons can be really useful if we want to port an existing library written in C to Node.js, use hardware optimized libraries, or get a performance boost. My showcase project makes it easy for developers to write native addons and uses stdlib in a unique way!
Goals
I want to set an ambitious goal for this summer: implementing 48 LAPACK routines in stdlib according to our conventions, each with a comprehensive test suite that achieves 100% test coverage, along with relevant docs, thorough benchmarks and examples. Each routine would be structured as per stdlib conventions.
The routines would support both row-major (C-style) and column-major (Fortran-style) storage layouts, overcoming the original LAPACK's column-major limitation.
This also includes any utility or helper functions that we may need. By the end of this project, @stdlib/lapack/base would have JavaScript implementations for 48 LAPACK routines.
As this is a long term project, I'd love to work on it post GSoC as well, adding new routines, adding single precision and complex equivalents of these routines, writing C implementations for routines with an existing JavaScript implementation, opening issues, and reviewing PRs for these LAPACK routines.
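To illustrate the row-major and column-major support mentioned above, here is a minimal sketch of how a stride-based routine can treat both layouts with one code path. The helper names `layoutStrides` and `oneNorm` are hypothetical and chosen only for this example; they are not existing stdlib APIs, and the actual packages may structure this differently.

```javascript
'use strict';

// Hypothetical helper: strides [ strideA1, strideA2 ] for a contiguous M-by-N matrix.
function layoutStrides( order, M, N ) {
    if ( order === 'row-major' ) {
        return [ N, 1 ]; // moving down a row skips N elements
    }
    return [ 1, M ]; // column-major: moving across a column skips M elements
}

// Hypothetical routine: maximum absolute column sum (the matrix one-norm).
// Element (i,j) is always read as A[ offsetA + i*strideA1 + j*strideA2 ],
// so the loop body does not need to know the storage layout.
function oneNorm( M, N, A, strideA1, strideA2, offsetA ) {
    var max = 0.0;
    var sum;
    var i;
    var j;
    for ( j = 0; j < N; j++ ) {
        sum = 0.0;
        for ( i = 0; i < M; i++ ) {
            sum += Math.abs( A[ offsetA + ( i*strideA1 ) + ( j*strideA2 ) ] );
        }
        if ( sum > max ) {
            max = sum;
        }
    }
    return max;
}

// The same logical matrix [ [ 1, -2, 3 ], [ 4, 5, -6 ] ] in both layouts:
var rowMajor = new Float64Array( [ 1.0, -2.0, 3.0, 4.0, 5.0, -6.0 ] );
var colMajor = new Float64Array( [ 1.0, 4.0, -2.0, 5.0, 3.0, -6.0 ] );

var sr = layoutStrides( 'row-major', 2, 3 );
var sc = layoutStrides( 'column-major', 2, 3 );

var normRow = oneNorm( 2, 3, rowMajor, sr[ 0 ], sr[ 1 ], 0 );
var normCol = oneNorm( 2, 3, colMajor, sc[ 0 ], sc[ 1 ], 0 );

console.log( normRow, normCol ); // both layouts give the same norm
```

Passing strides and an offset explicitly, rather than a layout flag alone, also lets the same routine operate on sub-matrices and non-contiguous views without copying.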
Why this project?
Each LAPACK routine comes with a challenge of its own when one tries to implement it in the stdlib way, i.e. by thoroughly testing it, benchmarking it and rolling it out with relevant docs. Every LAPACK routine offers the opportunity to learn something new, and that is why I'm excited to take on this project. This project pushes my limits and it'll make me a better engineer. There are many real world applications of LAPACK routines, such as financial modelling, machine learning and structural engineering. The value which stdlib can offer to its users will definitely increase once we have these routines, and I would love to be a part of this effort.
Qualifications
I am confident that I'm qualified to take this project on over the summer. I've been a consistent contributor for months which makes me familiar with our style of writing code and documentation. I've worked on a variety of different problems at stdlib and I'm well prepared to think independently and come up with solutions based on my experience. I'm familiar with the storage layouts (row-major, column-major) and optimization techniques (loop-tiling, loop-reordering) that I may need to use to write these routines. I'm somewhat familiar with Fortran and have a decent understanding of linear algebra from my course work.
I've also authored PRs for a few LAPACK routines during the GSoC contribution period, which makes me a strong candidate to work on this project:
dgtts2: #5882
dgttrf: #5754
zlaswp: #5496
claswp: #5525
I've also made an effort to lay out a long term plan for adding LAPACK routines to stdlib. To do this, I analyzed the LAPACK source code using Doxygen and determined the order in which to implement its routines by performing a topological sort. This ensures that each routine is implemented only after all the routines it depends on are completed. I've also parsed the graphs to generate visual representations of the caller and call graphs of each routine for better understanding. You can learn more about the steps I took and the results I achieved by reading here!
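The ordering step described above can be sketched with Kahn's algorithm for topological sorting. The call graph below is a small illustrative subset written by hand for this example; it is an assumption, not the output of the Doxygen analysis, and `topologicalOrder` is a hypothetical helper name.

```javascript
'use strict';

// Illustrative call graph (assumed): each key lists the routines it calls.
// Every routine that appears as a dependency must also appear as a key.
var deps = {
    'dgehrd': [ 'dgehd2', 'dlahr2', 'dlarfb' ],
    'dgehd2': [ 'dlarfg', 'dlarf' ],
    'dlahr2': [ 'dlarfg' ],
    'dlarfb': [],
    'dlarfg': [],
    'dlarf': []
};

// Kahn's algorithm: repeatedly emit routines whose dependencies are all done.
function topologicalOrder( graph ) {
    var dependents = {};
    var remaining = {};
    var names = Object.keys( graph );
    var queue = [];
    var order = [];
    var name;
    var i;
    var j;

    for ( i = 0; i < names.length; i++ ) {
        dependents[ names[ i ] ] = [];
    }
    for ( i = 0; i < names.length; i++ ) {
        name = names[ i ];
        remaining[ name ] = graph[ name ].length;
        for ( j = 0; j < graph[ name ].length; j++ ) {
            dependents[ graph[ name ][ j ] ].push( name );
        }
        if ( remaining[ name ] === 0 ) {
            queue.push( name ); // no dependencies: ready to implement
        }
    }
    while ( queue.length > 0 ) {
        name = queue.shift();
        order.push( name );
        for ( j = 0; j < dependents[ name ].length; j++ ) {
            remaining[ dependents[ name ][ j ] ] -= 1;
            if ( remaining[ dependents[ name ][ j ] ] === 0 ) {
                queue.push( dependents[ name ][ j ] );
            }
        }
    }
    if ( order.length !== names.length ) {
        throw new Error( 'call graph contains a cycle' );
    }
    return order;
}

var order = topologicalOrder( deps );
console.log( order.join( ' -> ' ) );
```

In the resulting order, every routine appears only after all of the routines it calls, which is the property the implementation plan relies on.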
Prior art
My main source of truth for this project will be the original source code as documented in netlib LAPACK. Some work has already been done on this front, as tracked in Issue 2464, which focuses on the routines that fall under the Linear Solve: A * X = B section of the LAPACK routines. The issue tackles low hanging fruit from the sub-sections LU: computational routines, Cholesky: computational routines, LDL: computational routines and Triangular computational routines. This work sets the standard for what a LAPACK routine should look like when we implement it here using stdlib conventions. It establishes what the examples, tests and documentation should look like, and the process was documented delightfully in one of stdlib's blog posts. This work will definitely serve as an important resource for me as I work on the project.
Commitment
I do not have any commitments after May 10th, 2025. Since the project will primarily take place during my summer break, I won't face any blockers and will be able to focus on the project. Given the project's requirement of 350 hours, I plan to dedicate 35-40 hours per week, distributed throughout the week. If necessary, I am prepared to extend my work hours to ensure we stay on schedule as planned during the application period. My summer break ends on July 11th, 2025, after which I will return to college. However, I will make provisions to compensate for any lost time and complete the project within the allotted coding period.
Schedule
For the scope of this project, I've handpicked 48 routines which I will work on over the summer. These routines are listed in this spreadsheet along with their source code. They have been selected on the basis of usability, the time needed to implement them, and the topological order deduced from the sort I performed. The routines lead to 2 higher level routines, dgeev and dgesvj. We can take a breadth first approach rather than a depth first approach in implementing the dependencies for these routines; for example, in the call graph of dgeev, the development of dlarfg, dlarf, dlartg, dgebak, and dgebal can take place in parallel. This would help us account for PR review cycles and ensure that we don't run into any blockers.
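One way to picture the breadth first approach is to group routines into dependency levels, where everything in a level can be developed in parallel once the previous levels are merged. The sketch below uses a simplified, hand-written fragment of the dgeev call graph as an assumption for illustration; `dependencyLevels` is a hypothetical helper, and the graph is assumed to be acyclic.

```javascript
'use strict';

// Simplified call graph fragment (assumed for illustration only).
var deps = {
    'dgeev': [ 'dgebal', 'dgehrd', 'dgebak' ],
    'dgehrd': [ 'dgehd2' ],
    'dgehd2': [ 'dlarfg', 'dlarf' ],
    'dgebal': [],
    'dgebak': [],
    'dlarfg': [],
    'dlarf': []
};

// Level 0 holds routines with no dependencies; a routine's level is one more
// than the highest level among the routines it calls.
function dependencyLevels( graph ) {
    var levels = [];
    var memo = {};

    function levelOf( name ) {
        var list = graph[ name ] || [];
        var lvl = 0;
        var i;
        if ( memo[ name ] !== void 0 ) {
            return memo[ name ];
        }
        for ( i = 0; i < list.length; i++ ) {
            lvl = Math.max( lvl, levelOf( list[ i ] ) + 1 );
        }
        memo[ name ] = lvl;
        return lvl;
    }

    Object.keys( graph ).forEach( function ( name ) {
        var lvl = levelOf( name );
        while ( levels.length <= lvl ) {
            levels.push( [] );
        }
        levels[ lvl ].push( name );
    });
    return levels;
}

var levels = dependencyLevels( deps );
levels.forEach( function ( routines, i ) {
    console.log( 'level ' + i + ': ' + routines.join( ', ' ) );
});
```

With several routines open at the same level, a PR waiting on review does not stall work on its siblings.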
Assuming a 12 week schedule,
Community Bonding Period: From May 11th, 2025 to June 1st, 2025, I'll work on the routines dgebal, dgebak, dlarf1f, dlarfg, dgebd2, dlabrd, dgebrd, dlatrs, dlacn2, drscl and dgecon.
Week 1: During week 1, I plan to work on the routines dgehd2, dlahr2, dlarfb and dgehrd.
Week 2: During week 2, I plan to work on the routines dlanv2, dlahqr, dlarft and dorm2r.
Week 3: During week 3, I plan to work on the routines dormqr, dormhr and dlarf.
Week 4: During week 4, I plan to work on the routines dlarfx, dlasy2 and dlaexc.
Week 5: During week 5, I plan to work on the routines dlaqr2, dlaqr1 and dlaqr5.
Week 6: (midterm) During week 6 I'll submit my project for midterm evaluation to my mentor and I'll work on the routines dlaqr4, dlaqr3 and dlaqr0.
Week 7: During week 7, I plan to work on the routines dhseqr, dorg2r and dorgqr.
Week 8: During week 8, I plan to work on the routines dorghr, dladiv and dlaln2.
Week 9: During week 9, I plan to work on the routines dtrevc3 and dgeev. It is expected that the implementation and testing for dgeev will be time consuming since it's a higher level routine so I've given it more time here in the schedule.
Week 10: During week 10, I'll work on the routines dgelq2, dgelqf, dgeqr2 and dgeqrf.
Week 11: During week 11, I'll work on the routines dlaqps, dgeqp3, dgsvj0 and dlaqp2.
Week 12: During week 12, I'll work on the routines dgscvj and dgesvj. It is expected that the implementation and testing for dgesvj will be time consuming since it is a higher level routine so I've given it more time here in the schedule.
Final Week: I'll wrap up the project by finishing any documentation or testing that remains and submit my project. I would also love to lay the groundwork this week for adding C implementations of these routines and single precision equivalents, to carry on this work post GSoC.
Notes:
Related issues
Linear Solve: A * X = B section of the LAPACK routines. The issue tackles low hanging fruits from the sub-sections LU: computational routines, Cholesky: computational routines, LDL: computational routines and Triangular computational routines.
Checklist
[RFC]:
and succinctly describes your proposal.