feat: Add template for structured extraction #185

Merged (1 commit) on Jul 25, 2024

marcusschiesser (Collaborator) commented Jul 25, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a structured extraction template to streamline data handling processes.
    • Added support for a new template type, "extractor," enhancing template configuration options.
    • Enhanced the question prompt options to include "Structured Extractor," improving user interactions.
    • Launched a FastAPI application to serve extraction queries and manage structured data responses.
  • Documentation

    • Added a comprehensive README template for the FastAPI project, detailing setup and usage instructions.
  • Bug Fixes

    • Improved error handling and response structuring in API query requests.

changeset-bot bot commented Jul 25, 2024

🦋 Changeset detected

Latest commit: a4d6145

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| create-llama | Patch |

coderabbitai bot commented Jul 25, 2024

Walkthrough

This PR adds a structured extraction template to create-llama. It extends TemplateType with a new "extractor" value, updates installTemplate and askQuestions to handle the new type, and ships a FastAPI application that serves extraction queries and returns structured responses.

Changes

| Files | Summary of Changes |
| --- | --- |
| .changeset/proud-seals-yell.md | Added a new template for structured data extraction to standardize processes and improve data handling. |
| helpers/index.ts, helpers/types.ts | Modified installTemplate function to include a new template type "extractor"; updated TemplateType to reflect this addition. |
| questions.ts | Enhanced askQuestions function to include "extractor" as a new option and adjusted control flow for framework selection. |
| templates/types/extractor/fastapi/README-template.md | Introduced a README for a FastAPI project using LlamaIndex, outlining setup, usage, and API interactions. |
| templates/types/extractor/fastapi/app/api/routers/extractor.py | Added FastAPI router to handle extraction queries, including request validation and error management. |
| templates/types/extractor/fastapi/app/api/routers/output.py | Created Output model for structured API responses, enhancing validation and data serialization. |
| templates/types/extractor/fastapi/main.py | Implemented FastAPI app setup, including environment configuration and router initialization for extraction functionality. |
| templates/types/extractor/fastapi/pyproject.toml | Defined project metadata and dependencies using Poetry, streamlining project management and setup. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant API
    participant QueryEngine
    participant Output

    User->>API: Send query request
    API->>QueryEngine: Process request data
    QueryEngine->>Output: Generate response
    Output-->>API: Return structured data
    API-->>User: Provide response
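
The flow in the diagram can be sketched in plain Python. This is only an illustration under stated assumptions: `Output`, its fields, `QueryEngine`, and `handle_query` are hypothetical stand-ins, not the code from `extractor.py` or `output.py`, and no FastAPI or LlamaIndex calls are used.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Output:
    """Hypothetical structured response; the real model lives in output.py."""

    response: str
    page_numbers: List[int]


class QueryEngine:
    """Stand-in for the query engine (an assumption, not the LlamaIndex API)."""

    def query(self, query: str) -> Output:
        # A real engine would retrieve context and ask an LLM for structured output.
        return Output(response=f"Answer to: {query}", page_numbers=[1])


def handle_query(payload: dict, engine: QueryEngine) -> dict:
    """API step: validate the request, delegate to the engine, return plain data."""
    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("'query' must be a non-empty string")
    output = engine.query(query)
    return asdict(output)
```

Calling `handle_query({"query": "..."}, QueryEngine())` walks the same four hops as the diagram: the request reaches the API function, the engine produces an `Output`, and the structured data is returned to the caller as a dictionary.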

In a garden where bunnies hop,
New templates bloom, they’ll never stop!
With data structured, neat, and bright,
Our workflows dance in pure delight.
Hooray for changes, let joy abound,
In rabbit holes of code, we’re glory-bound! 🐰✨


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Outside diff range, codebase verification and nitpick comments (4)
templates/types/extractor/fastapi/app/api/routers/output.py (1)

5-5: Consider using a more specific logger name.

Using a more specific logger name can help differentiate logs from different modules.

- logger = logging.getLogger("uvicorn")
+ logger = logging.getLogger(__name__)
templates/types/extractor/fastapi/main.py (1)

19-20: Consider using a more specific logger name.

Using a more specific logger name can help differentiate logs from different modules.

- logger = logging.getLogger("uvicorn")
+ logger = logging.getLogger(__name__)
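
A small sketch of what the logger-name suggestion buys: with per-module names, each record carries the module it came from. The dotted names below are hypothetical stand-ins for what `__name__` would evaluate to in two modules, and `ListHandler` is an illustrative helper, not part of the template.

```python
import logging
from typing import List


class ListHandler(logging.Handler):
    """Collects the logger name of every record it receives."""

    def __init__(self) -> None:
        super().__init__()
        self.names: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.names.append(record.name)


handler = ListHandler()
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# In a real module this would be logging.getLogger(__name__); the names
# below are assumed module paths used only for illustration.
output_logger = logging.getLogger("app.api.routers.output")
main_logger = logging.getLogger("main")

output_logger.info("building structured output")
main_logger.info("starting server")
```

One caveat worth checking before applying the change: uvicorn's default logging setup is understood to attach its handlers to the `uvicorn` loggers rather than to the root logger, so module-named loggers may need root logging configured (as the sketch does) for INFO messages to show up.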
templates/types/extractor/fastapi/app/api/routers/extractor.py (1)

40-50: Clarify the error message.

The error message should be more user-friendly and avoid referencing specific commands.

- detail=str(
-    "StorageContext is empty - call 'poetry run generate' to generate the storage first"
- )
+ detail="Storage context is empty. Please generate the storage first."
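
If the command hint is dropped from the client-facing message, it can still be kept in the server log. The sketch below illustrates that split under assumptions: `ApiError` is a hypothetical stand-in for FastAPI's `HTTPException`, `require_index` is an invented helper, and the 500 status code is assumed rather than taken from the PR.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Hint for operators; the command is quoted from the original error message.
OPERATOR_HINT = "call 'poetry run generate' to generate the storage first"
# Client-facing text, as proposed in the review suggestion.
CLIENT_DETAIL = "Storage context is empty. Please generate the storage first."


class ApiError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException (status_code + detail)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def require_index(index: Optional[object]) -> object:
    """Return the index, or log the operator hint and raise a client-safe error."""
    if index is None:
        logger.error("StorageContext is empty - %s", OPERATOR_HINT)
        raise ApiError(status_code=500, detail=CLIENT_DETAIL)  # status code is an assumption
    return index
```

Note also that the `str(...)` wrapper in the original snippet is redundant, since the argument is already a string literal.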
templates/types/extractor/fastapi/README-template.md (1)

58-60: Specify the language for the fenced code block.

To comply with markdown linting rules, specify the language for the fenced code block.

-```
+```shell
ENVIRONMENT=prod python main.py

<details>
<summary>Tools</summary>

<details>
<summary>Markdownlint</summary><blockquote>

58-58: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</blockquote></details>

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>Review details</summary>

**Configuration used: CodeRabbit UI**
**Review profile: CHILL**

<details>
<summary>Commits</summary>

Files that changed from the base of the PR and between a553d5051e1d158f0f94f619c0320d797f9034bb and a4d6145bfbaca594397d3133515b685fb6c50681.

</details>


<details>
<summary>Files selected for processing (9)</summary>

* .changeset/proud-seals-yell.md (1 hunks)
* helpers/index.ts (1 hunks)
* helpers/types.ts (1 hunks)
* questions.ts (2 hunks)
* templates/types/extractor/fastapi/README-template.md (1 hunks)
* templates/types/extractor/fastapi/app/api/routers/extractor.py (1 hunks)
* templates/types/extractor/fastapi/app/api/routers/output.py (1 hunks)
* templates/types/extractor/fastapi/main.py (1 hunks)
* templates/types/extractor/fastapi/pyproject.toml (1 hunks)

</details>






<details>
<summary>Files skipped from review due to trivial changes (2)</summary>

* .changeset/proud-seals-yell.md
* templates/types/extractor/fastapi/pyproject.toml

</details>



<details>
<summary>Additional context used</summary>

<details>
<summary>Ruff</summary><blockquote>

<details>
<summary>templates/types/extractor/fastapi/main.py</summary><blockquote>

43-43: Remove unnecessary `True if ... else False`

(SIM210)
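
For reference, a minimal sketch of what SIM210 flags and the usual fix. The function names and the `environment == "dev"` condition are hypothetical; the actual expression on line 43 of `main.py` is not shown in this review.

```python
def should_reload_verbose(environment: str) -> bool:
    # Flagged by SIM210: the comparison already yields a bool.
    return True if environment == "dev" else False


def should_reload(environment: str) -> bool:
    # Equivalent and simpler: return the comparison directly.
    return environment == "dev"


def has_value(value: object) -> bool:
    # When the condition is not already a bool, wrap it in bool() instead.
    return bool(value)
```

Both `should_reload` variants return the same result for every input; the second form just removes the redundant conditional expression.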

</blockquote></details>

</blockquote></details>
<details>
<summary>Markdownlint</summary><blockquote>

<details>
<summary>templates/types/extractor/fastapi/README-template.md</summary><blockquote>

58-58: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</blockquote></details>

</blockquote></details>

</details>
<details>
<summary>Additional comments not posted (21)</summary><blockquote>

<details>
<summary>templates/types/extractor/fastapi/app/api/routers/output.py (2)</summary><blockquote>

`8-22`: **LGTM!**

The `Output` class is well-defined with appropriate fields and validation.

---

`24-32`: **LGTM!**

The `Config` class provides a useful example for the JSON schema.

</blockquote></details>
<details>
<summary>templates/types/extractor/fastapi/main.py (7)</summary><blockquote>

`1-3`: **LGTM!**

Environment variables are loaded using `dotenv`.

---

`5-10`: **LGTM!**

Imports and initial setup for FastAPI and logging are correct.

---

`11-13`: **LGTM!**

The extractor router and settings are correctly imported.

---

`15-18`: **LGTM!**

FastAPI app instance and settings initialization are correctly done.

---

`22-30`: **LGTM!**

CORS middleware is correctly configured for development mode.

---

`32-35`: **LGTM!**

Redirection to the documentation page is correctly implemented.

---

`38-38`: **LGTM!**

The extractor router is correctly included.

</blockquote></details>
<details>
<summary>templates/types/extractor/fastapi/app/api/routers/extractor.py (7)</summary><blockquote>

`1-3`: **LGTM!**

Imports for logging and environment variables are correct.

---

`4-7`: **LGTM!**

Imports for FastAPI, HTTPException, and Pydantic are correct.

---

`8-10`: **LGTM!**

Imports for the output model and index function are correct.

---

`11-13`: **LGTM!**

API router instance and logger are correctly initialized.

---

`16-24`: **LGTM!**

The `RequestData` class is well-defined with appropriate fields and validation.

---

`27-37`: **LGTM!**

The `query_request` endpoint is well-defined and correctly processes the request.

---

`52-58`: **LGTM!**

The `get_query_engine` function is well-defined and correctly configures the query engine.

</blockquote></details>
<details>
<summary>helpers/types.ts (1)</summary><blockquote>

`21-21`: **LGTM! The new template type "extractor" enhances flexibility.**

The addition of the "extractor" type to `TemplateType` looks good and enhances the flexibility of the type definition.

</blockquote></details>
<details>
<summary>templates/types/extractor/fastapi/README-template.md (1)</summary><blockquote>

`1-68`: **LGTM! The README provides clear and comprehensive instructions.**

The instructions for setting up and running the FastAPI project with LlamaIndex are clear and comprehensive.

<details>
<summary>Tools</summary>

<details>
<summary>Markdownlint</summary><blockquote>

58-58: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</blockquote></details>

</details>

</blockquote></details>
<details>
<summary>helpers/index.ts (1)</summary><blockquote>

`166-170`: **LGTM! The conditional logic now supports the "extractor" template type.**

The modification to include the "extractor" template type in the conditional logic of the `installTemplate` function looks good and broadens the functionality.

</blockquote></details>
<details>
<summary>questions.ts (2)</summary><blockquote>

`345-348`: **Addition of new template option "Structured Extractor".**

The addition of the new option "Structured Extractor" is well-integrated into the existing list of template choices.

---

`409-411`: **Modification of conditional logic to handle "extractor" template.**

The conditional logic now appropriately includes a check for the "extractor" template, setting the framework to "fastapi" if the template is either "multiagent" or "extractor".

</blockquote></details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->
