Commit ef46e89

kartikpersistent, vasanthasaikalluri, jayanth-002, abhishekkumar-27, and prakriti-solankey committed
Concurrent processing of files (#665)
* Update README.md
* Dropped the old vector index (#652)
* added cypher_queries and llm chatbot files
* updated llm-chatbot-python
* added llm-chatbot-python
* updated llm-chatbot-python folder
* Added chatbot "hybrid" mode use case
* added the concurrent file processing
* page refresh scenario
* fixed waiting files processing issue in refresh scenario
* removed boolean param
* fixed processedCount issue
* checkbox with waiting check
* fixed the refresh scenario with processing files
* processing files check
* server side error
* processing file count check for processing files less than batch size
* processing count check to handle all selected files
* created helper functions
* code improvements
* __ changes (#656)
* DiffbotGraphTransformer doesn't need an LLMGraphTransformer (#659)
  Co-authored-by: jeromechoo <[email protected]>
* Removed experiments/llm-chatbot-python folder from DEV branch
* reduced the password clear timeout
* Removed experiments/Cypher_Queries.ipynb file from DEV branch
* disabled the close button on banner and connection dialog while API is in pending state
* update delete query with entities
* node id check (#663)
* Status source and type filtering (#664)
  * status source
  * Name change
  * type change
* rollback to previous working nvl version
* added the alert
* add BATCH_SIZE to docker
* temp fixes for 0.3.1
* alert fix for less than batch size processing
* new virtual env
* added Hybrid Chat modes (#670)
* Rename the function #657
* label and checkboxes placement changes (#675)
  * label and checkboxes placement changes
  * checkbox placement changes
* Graph node filename check
* env fixes with latest nvl libraries
* format fixes
* removed local files
* Remove TotalPages when saving file locally (#684)
* file_name reference and verify_ssl issue fixed (#683)
* User flow changes for recreating supported vector index (#682)
  * removed the if check
  * Add one more check for create vector index when chunks exist without embeddings
  * removed local files
  * condition changes
  * chunks exists check
  * chunk exists without embeddings check
  * vector Index issue fixed
  * vector index with different dimension
  * Update graphDB_dataAccess.py
  Co-authored-by: Pravesh Kumar <[email protected]>
* ndl changes
* property spell fix

Co-authored-by: vasanthasaikalluri <[email protected]>
Co-authored-by: Jayanth T <[email protected]>
Co-authored-by: abhishekkumar-27 <[email protected]>
Co-authored-by: Prakriti Solankey <[email protected]>
Co-authored-by: Jerome Choo <[email protected]>
Co-authored-by: jeromechoo <[email protected]>
Co-authored-by: Pravesh Kumar <[email protected]>
1 parent 81e64f4 commit ef46e89

15 files changed: +418 -88 lines changed

.gitignore

+4 -2
@@ -3,7 +3,7 @@
 __pycache__/
 *.py[cod]
 *$py.class
-
+.vennv
 # C extensions
 *.so
 /backend/graph
@@ -170,4 +170,6 @@ google-cloud-cli-469.0.0-linux-x86_64.tar.gz
 /backend/chunks
 google-cloud-cli-linux-x86_64.tar.gz
 .vennv
-newenv
+newenv
+files
+

backend/src/graphDB_dataAccess.py

+1 -1
@@ -185,7 +185,7 @@ def execute_query(self, query, param=None):
 
     def get_current_status_document_node(self, file_name):
         query = """
-                MATCH(d:Document {fileName : $file_name}) RETURN d.stats AS Status , d.processingTime AS processingTime,
+                MATCH(d:Document {fileName : $file_name}) RETURN d.status AS Status , d.processingTime AS processingTime,
                 d.nodeCount AS nodeCount, d.model as model, d.relationshipCount as relationshipCount,
                 d.total_pages AS total_pages, d.total_chunks AS total_chunks , d.fileSize as fileSize,
                 d.is_cancelled as is_cancelled, d.processed_chunk as processed_chunk, d.fileSource as fileSource
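
The one-word change above matters because Cypher returns null for a property that does not exist on a node instead of raising an error, so `d.stats` would have yielded a null `Status` silently. The sketch below models that behaviour in plain Python without a database. The `project` helper and the sample document are hypothetical and exist only for illustration; they are not part of the repository.

```python
from typing import Any, Dict, Mapping, Optional


def project(node: Mapping[str, Any], returns: Mapping[str, str]) -> Dict[str, Optional[Any]]:
    """Model `RETURN d.<prop> AS <alias>`: a missing property becomes None (Cypher null)."""
    return {alias: node.get(prop) for alias, prop in returns.items()}


# Hypothetical Document node properties, for illustration only.
document = {"fileName": "report.pdf", "status": "Processing", "processingTime": 12.5}

# Before the fix: the misspelled property name silently yields null.
before = project(document, {"Status": "stats", "processingTime": "processingTime"})

# After the fix: the alias carries the real status value.
after = project(document, {"Status": "status", "processingTime": "processingTime"})

if __name__ == "__main__":
    print(before)
    print(after)
```

Because nothing fails at query time, a typo like this usually only shows up downstream, for example when a caller checks the returned status.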

docker-compose.yml

+1
@@ -61,6 +61,7 @@ services:
       - CHUNK_SIZE=${CHUNK_SIZE-5242880}
       - ENV=${ENV-DEV}
       - CHAT_MODES=${CHAT_MODES-""}
+      - BATCH_SIZE=${BATCH_SIZE-2}
     volumes:
       - ./frontend:/app
       - /app/node_modules
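
In Compose interpolation, `${BATCH_SIZE-2}` falls back to `2` only when `BATCH_SIZE` is unset. A variable that is set but empty is passed through as an empty string (the `:-` form would cover that case too). The Python sketch below mirrors that rule for a consumer of the variable. `resolve_batch_size` is a hypothetical helper name and the validation is an assumption, not code from this commit.

```python
import os
from typing import Mapping

DEFAULT_BATCH_SIZE = 2  # mirrors the default in ${BATCH_SIZE-2}


def resolve_batch_size(env: Mapping[str, str] = os.environ) -> int:
    """Read BATCH_SIZE the way `${BATCH_SIZE-2}` would: default only when unset."""
    raw = env.get("BATCH_SIZE", str(DEFAULT_BATCH_SIZE))
    # `${VAR-default}` passes an empty value through, so reject it explicitly.
    if not raw.strip():
        raise ValueError("BATCH_SIZE is set but empty")
    size = int(raw)
    if size < 1:
        raise ValueError("BATCH_SIZE must be a positive integer")
    return size


if __name__ == "__main__":
    print(resolve_batch_size())
```

Failing loudly on an empty or non-positive value is a design choice for this sketch; silently falling back to the default would be an equally reasonable alternative.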

frontend/Dockerfile

+3 -3
@@ -1,21 +1,21 @@
 # Step 1: Build the React application
 FROM node:20 AS build
 
-ARG VITE_BACKEND_API_URL="https://dev-backend-dcavk67s4a-uc.a.run.app"
+ARG VITE_BACKEND_API_URL="http://localhost:8000"
 ARG VITE_REACT_APP_SOURCES=""
 ARG VITE_LLM_MODELS=""
-ARG VITE_GOOGLE_CLIENT_ID="967196130891-vsu933h8nj6b6l6gfuk0nhh0pcagu0aa.apps.googleusercontent.com"
+ARG VITE_GOOGLE_CLIENT_ID=""
 ARG VITE_BLOOM_URL="https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true"
 ARG VITE_TIME_PER_CHUNK=4
 ARG VITE_TIME_PER_PAGE=50
 ARG VITE_LARGE_FILE_SIZE=5242880
 ARG VITE_CHUNK_SIZE=5242880
 ARG VITE_CHAT_MODES=""
 ARG VITE_ENV="DEV"
+ARG VITE_BATCH_SIZE=2
 
 WORKDIR /app
 COPY package.json yarn.lock ./
-RUN yarn add @neo4j-nvl/base @neo4j-nvl/react
 RUN yarn install
 COPY . ./
 RUN VITE_BACKEND_API_URL=$VITE_BACKEND_API_URL \

frontend/example.env

+2 -1
@@ -8,4 +8,5 @@ TIME_PER_PAGE=50
 CHUNK_SIZE=5242880
 LARGE_FILE_SIZE=5242880
 GOOGLE_CLIENT_ID=""
-CHAT_MODES=""
+CHAT_MODES=""
+BATCH_SIZE=2
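
The commit title and the new `BATCH_SIZE=2` default suggest that selected files are processed a few at a time instead of all at once. The scheduling code itself is not among the files shown here, so the following is only a hypothetical Python sketch of the general idea: a semaphore caps the number of in-flight files at `batch_size`, and a waiting file starts as soon as a slot frees up. `process_concurrently` and `fake_extract` are invented names, and the project's actual frontend logic may differ.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def process_concurrently(
    files: Iterable[str],
    worker: Callable[[str], Awaitable[str]],
    batch_size: int = 2,
) -> Dict[str, str]:
    """Run `worker` on every file with at most `batch_size` running at once."""
    semaphore = asyncio.Semaphore(batch_size)

    async def run_one(name: str):
        # Files beyond the limit wait here until a slot is released.
        async with semaphore:
            return name, await worker(name)

    pairs = await asyncio.gather(*(run_one(name) for name in files))
    return dict(pairs)


async def fake_extract(name: str) -> str:
    """Hypothetical stand-in for the per-file extraction request."""
    await asyncio.sleep(0.01)
    return "Completed"


if __name__ == "__main__":
    statuses = asyncio.run(
        process_concurrently(["a.pdf", "b.pdf", "c.pdf"], fake_extract, batch_size=2)
    )
    print(statuses)
```

A semaphore keeps the slots full continuously. Processing in fixed groups of `batch_size` would be simpler, but it would leave slots idle while the slowest file in each group finishes.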
