[RFC]: Make code blocks on website documentation interactive #60

Closed
6 tasks done
Shubh942 opened this issue Mar 25, 2024 · 6 comments
Labels
2024 2024 GSoC proposal. rfc Project proposal.

Comments

@Shubh942

Shubh942 commented Mar 25, 2024

Full name

Shubh Mehta

University status

Yes

University name

IIIT Jabalpur

University program

Computer Science

Expected graduation

2025

Short biography

I am Shubh Mehta, a prefinal-year student pursuing a bachelor's degree in Computer Science at IIIT Jabalpur. My focus lies in cybersecurity research, where I actively look for vulnerabilities in the systems of various multinational corporations. Additionally, I specialize in developing websites and APIs with a strong emphasis on security, aiming to keep them bug-free.

Proficient in languages and technologies such as C++, JavaScript, Python, Node.js, and React.js, I am currently expanding my skills in DevOps. My involvement in competitive programming has honed my problem-solving abilities, particularly in dealing with complex issues. In addition to practical experience, I've studied the theoretical foundations through coursework covering Data Structures and Algorithms.

My keen interest lies in frontend application maintenance, particularly in mitigating security threats like Cross-Site Scripting (XSS), which can expose websites to exploitation by attackers. Familiarity with security helps me in undertaking the project "Make code blocks on website documentation interactive". Working on this project will deepen my knowledge of both frontend development and security, as the code blocks must not allow malicious code to render.

Timezone

India Standard Time

Contact details

email: [email protected],[email protected],[email protected],github:Shubh942,Linkedin: https://www.linkedin.com/in/shubh-mehta197/

Platform

Linux

Editor

Visual Studio Code (VS Code) is my preferred code editor because it works seamlessly with Ubuntu and offers many useful features and plugins. It's easy to use, with a simple interface and powerful debugging tools, making it great for writing code in languages like JavaScript, C, and C++. It also integrates well with Git for version control and collaboration. Overall, VS Code makes coding easier and more efficient, whether I'm working on personal projects or professional software development.

Programming experience

Over the past three years, I've been deeply immersed in programming, covering data structures, object-oriented programming, and other CS fundamentals. Through active engagement in competitive programming, I've attained the Specialist rank on Codeforces, and I am also quite active on LeetCode and CodeChef. My exploration of cybersecurity has been fruitful, uncovering vulnerabilities in prominent companies such as Swiggy, Indeed, and Boatzon, with reports promptly relayed to their security teams. Additionally, I've participated in various hackathons.

  • Hackoverflow (Prometeo 23, held at IIT Jodhpur): Our team was selected in the top 10.
  • Amazon Hackon: Our team was selected in the top 50 among 50k+ teams.

I've developed numerous projects using libraries and frameworks such as React, Node, and Express. In these projects, I've prioritized security by conducting rigorous testing to ensure protection against vulnerabilities.

  • CodePro: A versatile platform offering real-time chat functionality, discussion forums, and friend networking. Users can engage in dynamic discussions, connect with friends, and access administrative tools for enhanced management.

    • Real-time chat functionality powered by sockets for seamless communication.
    • Discussion forums enabling multiple users to engage in topic-based discussions.
    • Friend networking system allowing users to send and accept friend requests.
    • Enhanced security features ensuring chat is restricted to friends only.
    • Administrative tools granting special privileges to moderators for content moderation and user support.
      Github : Link
  • GdbUi: Currently in development, this project aims to simplify code debugging by offering a user-friendly graphical interface. Unlike traditional terminal-based tools like GDB, it provides a GUI with views for Registers, Memory Management, Threads, and Breakpoints. Additionally, it offers a structured folder system for organizing the user's workspace.
    Github: Link

JavaScript experience

In the past three years, I've immersed myself in JavaScript, from learning its basics to mastering advanced asynchronous programming. Asynchronous programming is made simple with the async/await syntax. This allows for cleaner and more efficient code handling asynchronous tasks. I've also explored various JavaScript libraries like React, Node.js, and Express.js, each offering unique capabilities for innovation.

Personally, I find arrow functions in JavaScript incredibly convenient. Their concise syntax and implicit return simplify code structure, making them a preferred choice for handling callbacks.

Node.js experience

In my personal projects with Node.js, I've predominantly utilized the Express library to develop server-side applications, integrating various databases such as PostgreSQL, MySQL, and MongoDB. Throughout these projects, I've implemented numerous middleware for authentication and routing, ensuring robust security and efficient navigation within the applications. Additionally, I've incorporated authorization features to maintain administrative functionality, allowing for seamless management of user privileges. Moreover, I've developed a multitude of APIs and managed authentication processes to safeguard sensitive data and ensure secure access to resources.

C/Fortran experience

My proficiency in C is grounded in a strong foundation built through coursework and participation in competitive programming. I possess a deep understanding of low-level programming concepts, adeptness in memory management, proficiency in implementing data structures, and skill in algorithm implementation. From system-level programming to embedded systems development, I've successfully completed projects spanning a diverse range of domains.

In addition, I'm currently engaged in a project involving the integration of GDB into a user-friendly interface for debugging C code without relying on the terminal.

Interest in stdlib

Stdlib, with its extensive collection of math functions, provides developers with a rich toolkit that significantly simplifies coding tasks. Offering a wide range of tools, Stdlib proves invaluable for developers working on diverse projects, spanning various domains. For me, Stdlib represents the perfect blend of two passions: mathematics and web development. Contributing to such a prominent organization would not only be a great learning experience but also a rewarding opportunity to make a meaningful impact in the developer community.

Version control

Yes

Contributions to stdlib

So far in stdlib, I am still contributing with my following pull requests.

Goals

Project Idea

My project aims to enhance website documentation by making code blocks interactive. These interactive code blocks will allow users to edit the code and receive real-time annotations on the output. By enabling users to edit arrays and instantly review the response, the interactive code blocks will significantly improve the user experience. The proposed design includes mechanisms to track and respond to changes effectively.

image

Description

I'm flexible with adapting the design to meet specific requirements. It's essential to note that the use of the require function is restricted to the Node.js environment and cannot be utilized in a browser setting because web browsers do not have a built-in module system that directly supports require. Moreover, granting permission to execute entire code blocks poses security risks, as malicious users could inject harmful payloads into the site. To mitigate this risk, input sanitization measures are implemented to prevent the execution of code if it does not adhere to the expected format. This helps ensure that only safe and expected inputs are processed and executed within the code blocks.

The plan is to integrate standard library packages dynamically, loading each one only when an interactive code block requires it. This lazy loading avoids fetching unnecessary resources, which improves performance and user experience.

Idea for running require Function

Bundling executable code enables the creation of versatile applications capable of accepting inputs in any format and producing desired outputs. This approach enhances usability across diverse environments, streamlining code execution for seamless functionality despite input variations. Consequently, it optimizes performance and enhances user experience.

For making code Blocks interactive

There are several options for a code editor library that we can use to show real-time annotations as the user edits.

  1. Ace Editor
  2. CodeMirror
  3. We also have the option to develop our own logic for creating the editor with annotations, eliminating the need for external libraries. This approach ensures independence from third-party dependencies, allowing us to tailor the editor precisely to our requirements.

Idea for security measures:

  1. Input Sanitization: Ensure that user input is properly sanitized before being rendered in code blocks. This involves removing or escaping any potentially malicious HTML, JavaScript, or other executable code.
  2. Input Validation: Validate user input to ensure that it adheres to expected formats and structures. Reject or sanitize input that does not meet validation criteria before rendering it in code blocks.
  3. Escape HTML Characters: HTML characters such as < or > can be interpreted as markup, so we escape them with an encoding function similar to PHP's htmlspecialchars() before rendering.
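A minimal JavaScript sketch of the escaping step in item 3. The helper name `escapeHtml` and the exact set of escaped characters are assumptions for illustration, since browsers do not ship a built-in equivalent of `htmlspecialchars()`.

```javascript
// Characters that are significant in HTML, mapped to their entity equivalents.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

// Hypothetical helper: convert a value to a string that is displayed as text
// instead of being parsed as markup.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
```

When output is written with `textContent` rather than `innerHTML`, the browser does not parse it as HTML, so explicit escaping may only be needed where markup is assembled as a string.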

Example of injecting payload

image

Our code returns NaN, as it sanitizes the input.
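One way such validation could behave, assuming the editable input is expected to be an array of numbers: parse the text as JSON instead of evaluating it, and return NaN for anything else. The function name `parseNumericArray` is hypothetical, and the prototype may work differently.

```javascript
// Hypothetical validator: accept only a JSON array of finite numbers.
function parseNumericArray(text) {
  let parsed;
  try {
    // JSON.parse never executes code, unlike eval.
    parsed = JSON.parse(text);
  } catch (err) {
    return NaN; // not valid JSON, e.g. a script payload
  }
  if (!Array.isArray(parsed)) {
    return NaN;
  }
  const allNumbers = parsed.every(
    (v) => typeof v === 'number' && Number.isFinite(v)
  );
  return allNumbers ? parsed : NaN;
}

console.log(parseNumericArray('[ 1, 2, 3 ]'));
console.log(parseNumericArray('<script>alert(1)</script>'));
```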

I have also created a working prototype of my idea, which can be seen in the video linked below.
Video Link: Link

I have also attached the GitHub link and hosted link of my GDB project, in which I integrated a code editor with a folder structure.
Github link: Link
Hosted link: Link

Implementation of Prototype:

Hosted Link: Link
GitHub Link: Link

I've developed an implementation for executing code blocks directly in the browser. To achieve real-time annotations, I've integrated the logic from my GDB-Ui project, where a code editor has been implemented. I can use the logic of integrating the code editor in stdlib for real-time annotations.

When attempting to execute code directly in the browser, errors such as "require is not defined" occur because the require function is unavailable in the browser environment. To address this limitation, bundling tools like Webpack or Browserify can be used to create a bundle that encapsulates the required functionality. By bundling an entry file (here named bundle.js) and its dependencies into a single output file (bundled.js), users can input their code and receive the corresponding output within the browser.

  1. Use a bundling tool like Webpack or Browserify to encapsulate the required functionality into a single bundle.
  2. Create an entry file, such as bundle.js, exposing a function that accepts the user's code and returns the corresponding output.
  3. Run the following command to create bundled.js:
     browserify bundle.js -o bundled.js

By performing the above action we can create the bundled.js file which can be used in HTML for rendering purposes.
Currently, I've handled all these tasks manually, but I'm actively engaged in research and studying articles to devise an automated solution.

Why this project?

Implementing this project requires a deep understanding of cybersecurity principles and the ability to tackle complex challenges. As someone deeply passionate about both security and code blocks, I'm excited about the technical hurdles involved in designing and implementing interactive code blocks.

With a background in computer science and software development, this project resonates perfectly with my interests and expertise. I bring a blend of theoretical knowledge and hands-on experience to the project, enabling me to offer valuable insights and solutions.

I'm eager to embark on this journey and contribute to the JavaScript ecosystem. This project perfectly aligns with my skills and aspirations, fueling my enthusiasm to make significant contributions and ensure its success.

Qualifications

I am a Full Stack Developer with a deep understanding of cyber security, and I am also quite active in competitive programming. Achieving the Specialist rank on Codeforces, along with regular activity on LeetCode and CodeChef, underscores my proficiency in algorithmic problem-solving. Beyond programming challenges, I am also a bug hunter: I have discovered and disclosed critical vulnerabilities in prominent organizations such as Swiggy, Indeed, and Boatzon.

  1. Swiggy: Discovered a critical threat allowing unauthorized access to restaurant accounts without OTP verification.
  2. Indeed: Uncovered two XSS vulnerabilities in profile and details pages, potentially enabling attackers to inject malicious scripts, compromise user data, and execute unauthorized actions.
  3. Boatzon: Identified a loophole allowing users to buy products at zero cost through price tampering, posing a significant risk to financial transactions and platform integrity.

These issues have been reported to the respective companies, and the vulnerabilities have since been fixed.

Moreover, I've assumed leadership roles, notably as Security and Backend Lead in our college's Fusion Open Source project. The project is for maintaining institutional affairs.

My experience in cyber security helps me write code with fewer bugs and vulnerabilities.

I've demonstrated my proficiency in utilizing web APIs and handling real-time data within a JavaScript environment, alongside a strong focus on cyber security to ensure secure project development. These experiences have equipped me with the skills necessary to excel in projects like this.

Prior art

In my research for this project, I studied a range of resources to understand existing implementations. I looked at bundling packages with tools like Webpack or Browserify, and at implementing code editors using libraries such as Ace Editor or CodeMirror, or building one from scratch. Additionally, I explored security measures to ensure the robustness of the project.

For Bundling the packages
Bundling is vital for this project as it's crucial for executing code blocks. We can reference a video demonstrating bundling, showcasing the use of the require function, along with documentation from FreeCodeCamp for detailed guidance on this task.

Bundling via Browserify: Link
Freecodecamp Article: Link

For Implementing Code Editor

Ace Editor: Embedding the editor to the site Link
Without Library: developing by using textbox Link

For security
To ensure application security, it's crucial to have a thorough understanding of potential threats.

XSS by PortSwigger: Link
The PortSwigger article demonstrates various XSS attacks and their workings, serving as a guide for securing applications against such vulnerabilities.

Commitment

During my summer vacation from May to the first week of July, I will dedicate 40 hours per week to the project. Once my college resumes, I will be able to allocate 20-22 hours per week starting from that time.

Acknowledging the importance of this project, I aim to dedicate full-time hours during the summer. I'll collaborate with my mentor to brainstorm implementation strategies and task distribution, ensuring the successful achievement of project milestones.

1 May - 26 May -> Bonding Period
27 May - first week of July -> 40 hours/week ( 40 * 6 )
8 July - 17 August -> 20-22 hours/week ( counted as 20 * 6 )
Total = 240 + 120 = 360 hours

Schedule

Assuming a 12 week schedule,

  • Community Bonding Period:

    • I will actively engage with the stdlib community and mentors, participating in discussions, seeking guidance, and providing progress updates. Additionally, I will conduct further research on bundling and code block implementation, adapting techniques to meet the requirements of stdlib. Furthermore, I am committed to enhancing my knowledge about the code editor that will be implemented in this project.
  • Week 1:

    • Start implementing frontend components for dynamic code blocks and real-time annotations based on the designed interface.
    • Set up basic functionality for code block rendering and user interaction.
  • Week 2-3:

    • Continue frontend implementation, focusing on integrating real-time annotation features.
    • Begin developing the functions to handle user input for processing outputs.
  • Week 4:

    • Implement bundling of frontend code containing required dependencies for browser execution.
    • Integrate the function that manages user input with the bundled code.
  • Week 5:

    • Testing for the frontend code blocks involves verifying their rendering, interactivity, and compatibility across different browsers and devices.
    • Testing the user inputs with all possible options.
  • Week 6: (midterm)

    • Verify that the tasks planned for weeks 1-5 have been implemented.
    • Midterm evaluation and adjustments as needed.
  • Week 7:

    • Lazily integrate the code editor into documentation pages, with dynamic loading.
  • Week 8:

    • Provide an easy switch between ES5 (function expressions) and ES6 (arrow functions) syntax.
    • Writing functions using the appropriate syntax based on the chosen configuration option.
  • Week 9- Week 10:

    • Sanitize the code according to the security measures recommended by OWASP.
    • Also encode <, >, ", and other special characters so that they cannot be rendered as markup.
  • Week 11:

    • Automated testing via Burp Suite employs predefined payloads to thoroughly assess application security, identifying potential vulnerabilities and enhancing protection against diverse attack vectors.
    • Enhancing the code according to the responses of the payloads.
  • Week 12:

    • Finalize implementations and documentation for all implemented things.
    • Conduct manual (hands-on) testing and debugging.
  • Final Week:

    • Prepare the project for submission or integration into the main webpage.

Notes:

  • The community bonding period is a 3 week period built into GSoC to help you get to know the project community and participate in project discussion. This is an opportunity for you to set up your local development environment, learn how the project's source control works, refine your project plan, read any necessary documentation, and otherwise prepare to execute on your project proposal.
  • Usually, even week 1 deliverables include some code.
  • By week 6, you need enough done at this point for your mentor to evaluate your progress and pass you. Usually, you want to be a bit more than halfway done.
  • By week 11, you may want to "code freeze" and focus on completing any tests and/or documentation.
  • During the final week, you'll be submitting your project.

Related issues

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.
@Shubh942 Shubh942 added 2024 2024 GSoC proposal. rfc Project proposal. labels Mar 25, 2024
@Pranavchiku
Member

Hey @Shubh942, impressive proposal, thanks for applying! A few suggestions / questions I have though things look good to me.

  • Can you please share code / way of how did you get that arrayView2IteratorRight stuff working and discuss about other possibilities?
  • Also, I think you lack slightly on contribution part, you may push and get patches merged, feel free to ping if you want to work on a particular feature.
  • You may add more details about the project, like which packages you'll pickup first, do we need to write up code for everything manually or these will be automated for each package, etc.

@Shubh942
Author

@Pranavchiku, thank you for your response. I added the column on prototype implementation and also added the necessary links, can you please review it also? Regarding the automation of this functionality, I'm actively researching and exploring various approaches to streamline the process. Any suggestions or guidance from your end would be highly appreciated. Additionally, I will also focus on my contribution part and make a good contribution to stdlib.

@kgryte
Member

kgryte commented Mar 31, 2024

@Shubh942 Thanks for sharing your draft proposal. A few comments:

  1. You've mentioned security risks. I am curious whose security we are concerned about. For code evaluation, it is happening on a user's local machine, in their web browser, and we shouldn't be performing any execution on our servers. In fact, the entire point is to leverage a user's local browser for example execution. So, I am not following the concerns about sanitation, etc.
  2. In your examples, you provide an output textarea for displaying results. Our preference would be to leverage our current doctest comment convention for displaying results. Do you have any thoughts on how you might be able to leverage those comments?
  3. One of the key problems to solve for this project is the dependency loading problem. Namely, each code block may have a different set of stdlib dependencies. And we cannot simply generate specialized bundles for each code block. And further, as users should be able to edit code blocks and dynamically require other stdlib dependencies, generating code block bundles ahead of time would not be sufficient. One approach is to use our ES module builds, which you can learn more about by searching some of our standalone stdlib repositories.

@Shubh942
Author

Shubh942 commented Apr 1, 2024

Thank You @kgryte for your response.

After further research and examining your suggestion to utilize ES modules, I have found a more robust approach for this task. Additionally, I have explored the repository provided by stdlib for reference on utilizing ES modules. Based on this, here's the plan I propose for approaching this project.

Prototype Video

bandicam.2024-04-01.16-38-09-025.mp4

In the above video, I defined two code blocks

  1. Import statement
  2. User code

The import function within the code block serves as the entry point for users to execute their code. Through this, we extract the names of dependencies, which users can then leverage for additional tasks. We selectively import these dependencies and integrate them with the user's code, enabling them to utilize the functions provided by stdlib seamlessly.
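A rough sketch of how dependency names might be pulled out of a code block, assuming dependencies appear as `require( '...' )` calls or `import ... from '...'` statements with `@stdlib/` package names. The function name `extractDependencies` is hypothetical, and the regular expression is a simplification; a real implementation would more likely use a JavaScript parser.

```javascript
// Hypothetical helper: collect unique @stdlib package names referenced in source text.
function extractDependencies(code) {
  // Matches require( '@stdlib/...' ) and from '@stdlib/...' with either quote style.
  const pattern = /(?:require\(\s*|from\s+)['"](@stdlib\/[^'"]+)['"]/g;
  const found = new Set();
  let match;
  while ((match = pattern.exec(code)) !== null) {
    found.add(match[1]);
  }
  return Array.from(found);
}

const example = [
  "var abs = require( '@stdlib/math/base/special/abs' );",
  "import dnansumpw from '@stdlib/blas/ext/base/dnansumpw';",
  "var again = require( '@stdlib/math/base/special/abs' );"
].join('\n');

console.log(extractDependencies(example));
```

Because the result is deduplicated, each package would only need to be loaded once even if a block references it several times.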

The way of performing this task

Initially, we can carry out the following operation for each dependency.

import dnansumpw from 'https://cdn.jsdelivr.net/gh/stdlib-js/blas-ext-base-dnansumpw@esm/index.mjs';

The above reference is taken from: Link
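The per-dependency import can be generalized with a dynamic `import()` so that packages are only fetched when a code block needs them. The sketch below assumes that other packages follow the same URL pattern as the one quoted above and that each build exposes the function as its default export; `toEsmUrl` and `createLoader` are hypothetical names.

```javascript
// Convert a package name to the URL of its ES module build.
// Assumption: every package follows the pattern of the URL quoted above.
function toEsmUrl(pkg) {
  const repo = pkg.replace(/^@stdlib\//, '').replace(/\//g, '-');
  return 'https://cdn.jsdelivr.net/gh/stdlib-js/' + repo + '@esm/index.mjs';
}

// Create a loader that imports each package at most once.
// `importer` is injectable so the sketch can be exercised without network access.
function createLoader(importer = (url) => import(url)) {
  const cache = new Map();
  return function load(pkg) {
    if (!cache.has(pkg)) {
      // Cache the promise so concurrent requests share a single fetch.
      cache.set(pkg, importer(toEsmUrl(pkg)).then((mod) => mod.default));
    }
    return cache.get(pkg);
  };
}

// Usage in a browser (inside a module or an async function):
//   const load = createLoader();
//   const dnansumpw = await load('@stdlib/blas/ext/base/dnansumpw');
console.log(toEsmUrl('@stdlib/blas/ext/base/dnansumpw'));
```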

image

We take this approach because require is not available in web browsers, and a browser import needs a URL or path rather than a bare package name. Therefore, we collect the necessary functions from these exports in utils.js and import them from there.

import importModules from "./utils.js";

After that, we can use this in our browser for code execution.

          const modules = await importModules();
          const var1 = {};
          dependencies.forEach((dependency) => {
            // Skip names that the loaded modules do not provide
            if (!Object.prototype.hasOwnProperty.call(modules, dependency)) {
              return;
            }
            const mod = modules[dependency];
            // Use the module's default export when it has one; otherwise use the module directly
            const value = (mod && mod.default !== undefined) ? mod.default : mod;
            // Assign the value to the window object to make it available in the user code
            window[dependency] = value;
            // Also keep it in var1 under the dependency's name
            var1[dependency] = value;
          });

Using the method described above, we can ensure that the dependencies are made available to the user. Consequently, the user will have the flexibility to call any dependency within their code blocks as needed.

 const result = eval(userCode);
 document.getElementById("output").innerText = result; // Display the output

We've set up the dependencies as defaults so that the user's code can seamlessly utilize these functions.
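As an alternative to assigning dependencies to `window` and calling `eval`, the loaded functions could be passed to the user's code as named parameters through the `Function` constructor. This is a different technique from the one in the prototype, sketched under the assumption that the user code ends with a `return` statement; `runUserCode` is a hypothetical name. It does not sandbox the code; it only avoids polluting the global scope.

```javascript
// Hypothetical runner: execute user code with dependencies bound as parameters.
// Keys of `deps` must be valid JavaScript identifiers.
function runUserCode(userCode, deps) {
  const names = Object.keys(deps);
  const values = names.map((name) => deps[name]);
  // The last argument is the function body; earlier ones are parameter names.
  const fn = new Function(...names, '"use strict";\n' + userCode);
  return fn(...values);
}

const output = runUserCode(
  'return sum( [ 1, 2, 3 ] );',
  { sum: (arr) => arr.reduce((acc, v) => acc + v, 0) }
);
console.log(output); // prints 6
```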

@Shubh942
Author

Shubh942 commented Apr 1, 2024

A practical approach to using the commented output as the result is to tag each commented output with a class name such as result1, result2, result3, and so forth. We can then access and update each result through its class: once the outputs are available, we iterate over the elements in a loop and assign the respective value to each one.

resultArray.forEach((element, index) => {
  const className = `result${index + 1}`;
  element.classList.add(className);

  // Set the content of the element to its class name (placeholder for the computed value)
  element.textContent = className;
});
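Another way to reuse the doctest-style comments, without extra class names, is to rewrite the comment text itself. The sketch below assumes return annotations of the form `// returns <value>` and that computed values are supplied in the order the annotations appear; `updateAnnotations` is a hypothetical name.

```javascript
// Hypothetical helper: replace the value in each `// returns ...` comment, in order.
function updateAnnotations(source, values) {
  let index = 0;
  return source
    .split('\n')
    .map((line) => {
      const match = /^(\s*\/\/ returns ).*$/.exec(line);
      if (match === null || index >= values.length) {
        return line; // not an annotation, or no value left to insert
      }
      // JSON.stringify is a stand-in formatter; note that it renders NaN as null.
      const updated = match[1] + JSON.stringify(values[index]);
      index += 1;
      return updated;
    })
    .join('\n');
}

const source = [
  'var v = abs( -1.0 );',
  '// returns 1.0',
  '',
  'v = abs( 2.0 );',
  '// returns 2.0'
].join('\n');

console.log(updateAnnotations(source, [5, 7]));
```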

@Shubh942
Author

Shubh942 commented Apr 1, 2024

The website appears to be susceptible to reflected XSS (Cross-Site Scripting) attacks, as it directly executes JavaScript code provided by the user without proper sanitization. In such a scenario, the users themselves could unknowingly execute malicious scripts provided by an attacker, leading to potential compromise of their own sensitive data.

To mitigate the risk posed by such threats, we implement sanitization measures to filter out any potentially harmful content from user input.

image

@kgryte kgryte closed this as completed Apr 30, 2024