Commit 9353d84

kartikpersistent, vasanthasaikalluri, jayanth-002, abhishekkumar-27, and prakriti-solankey committed
Concurrent processing of files (#665)
* Update README.md
* Dropped the old vector index (#652)
* added cypher_queries and llm chatbot files
* updated llm-chatbot-python
* added llm-chatbot-python
* updated llm-chatbot-python folder
* Added chatbot "hybrid" mode use case
* added the concurrent file processing
* page refresh scenario
* fixed waiting files processing issue in refresh scenario
* removed boolean param
* fixed processedCount issue
* checkbox with waiting check
* fixed the refresh scenario with processing files
* processing files check
* server side error
* processing file count check for processing files less than batch size
* processing count check to handle all selected files
* created helper functions
* code improvements
* __ changes (#656)
* DiffbotGraphTransformer doesn't need an LLMGraphTransformer (#659)
  Co-authored-by: jeromechoo <[email protected]>
* Removed experiments/llm-chatbot-python folder from DEV branch
* reduced the password clear timeout
* Removed experiments/Cypher_Queries.ipynb file from DEV branch
* disabled the close button on banner and connection dialog while API is in pending state
* update delete query with entities
* node id check (#663)
* Status source and type filtering (#664)
  * status source
  * Name change
  * type change
* rollback to previous working nvl version
* added the alert
* add BATCH_SIZE to docker
* temp fixes for 0.3.1
* alert fix for less than batch size processing
* new virtual env
* added Hybrid Chat modes (#670)
* Rename the function #657
* label and checkboxes placement changes (#675)
  * label and checkboxes placement changes
  * checkbox placement changes
* Graph node filename check
* env fixes with latest nvl libraries
* format fixes
* removed local files
* Remove TotalPages when saving file locally (#684)
* file_name reference and verify_ssl issue fixed (#683)
* User flow changes for recreating supported vector index (#682)
  * removed the if check
  * Add one more check for creating the vector index when chunks exist without embeddings
  * removed local files
  * condition changes
  * chunks exist check
  * chunk exists without embeddings check
  * vector index issue fixed
  * vector index with different dimension
  * Update graphDB_dataAccess.py
  Co-authored-by: Pravesh Kumar <[email protected]>
* ndl changes
* property spell fix

---------

Co-authored-by: vasanthasaikalluri <[email protected]>
Co-authored-by: Jayanth T <[email protected]>
Co-authored-by: abhishekkumar-27 <[email protected]>
Co-authored-by: Prakriti Solankey <[email protected]>
Co-authored-by: Jerome Choo <[email protected]>
Co-authored-by: jeromechoo <[email protected]>
Co-authored-by: Pravesh Kumar <[email protected]>
1 parent 709770f commit 9353d84

15 files changed: +438 -107 lines changed

.gitignore (+4 -2)

@@ -3,7 +3,7 @@
 __pycache__/
 *.py[cod]
 *$py.class
-
+.vennv
 # C extensions
 *.so
 /backend/graph
@@ -170,4 +170,6 @@ google-cloud-cli-469.0.0-linux-x86_64.tar.gz
 /backend/chunks
 google-cloud-cli-linux-x86_64.tar.gz
 .vennv
-newenv
+newenv
+files
+

backend/src/graphDB_dataAccess.py (+1 -1)

@@ -185,7 +185,7 @@ def execute_query(self, query, param=None):
 
     def get_current_status_document_node(self, file_name):
         query = """
-                MATCH(d:Document {fileName : $file_name}) RETURN d.stats AS Status , d.processingTime AS processingTime,
+                MATCH(d:Document {fileName : $file_name}) RETURN d.status AS Status , d.processingTime AS processingTime,
                 d.nodeCount AS nodeCount, d.model as model, d.relationshipCount as relationshipCount,
                 d.total_pages AS total_pages, d.total_chunks AS total_chunks , d.fileSize as fileSize,
                 d.is_cancelled as is_cancelled, d.processed_chunk as processed_chunk, d.fileSource as fileSource
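
The one-word change from `d.stats` to `d.status` matters because Cypher returns null for a property that does not exist on a node instead of raising an error, so a misspelled key fails silently. The sketch below imitates that behaviour with a plain Python dictionary. It is only an illustration: the node contents and the `read_property` helper are hypothetical and do not come from the repository.

```python
# Toy stand-in for a Document node's property map (values are made up).
document_node = {
    "fileName": "report.pdf",
    "status": "Completed",
    "processingTime": 12.5,
}


def read_property(node: dict, name: str):
    """Mimic Cypher property access: a missing property yields null (None), not an error."""
    return node.get(name)


old_status = read_property(document_node, "stats")   # misspelled key from the removed line
new_status = read_property(document_node, "status")  # key used by the added line

print(old_status)  # None
print(new_status)  # Completed
```

With the misspelled key, the `Status` column would presumably have come back as null for every document, which is consistent with the "property spell fix" entry in the commit message.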

docker-compose.yml (+1)

@@ -61,6 +61,7 @@ services:
       - CHUNK_SIZE=${CHUNK_SIZE-5242880}
       - ENV=${ENV-DEV}
       - CHAT_MODES=${CHAT_MODES-""}
+      - BATCH_SIZE=${BATCH_SIZE-2}
     volumes:
       - ./frontend:/app
       - /app/node_modules
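
The `${BATCH_SIZE-2}` form uses Compose's default-value interpolation: the default applies only when the variable is unset, whereas the `${VAR:-default}` form also applies it when the variable is set but empty. The following Python sketch emulates that rule for a single variable. The `substitute_default` function is a hypothetical helper written for this note and is not part of Docker Compose or of this repository.

```python
from typing import Mapping, Optional


def substitute_default(
    env: Mapping[str, str], name: str, default: str, colon: bool = False
) -> str:
    """Emulate ${NAME-default} (colon=False) and ${NAME:-default} (colon=True)."""
    value: Optional[str] = env.get(name)
    if value is None:
        # Variable is unset: both forms fall back to the default.
        return default
    if colon and value == "":
        # Only the ':-' form treats an empty value as missing.
        return default
    return value


print(substitute_default({}, "BATCH_SIZE", "2"))                              # 2
print(substitute_default({"BATCH_SIZE": "4"}, "BATCH_SIZE", "2"))             # 4
print(repr(substitute_default({"BATCH_SIZE": ""}, "BATCH_SIZE", "2")))        # ''
print(substitute_default({"BATCH_SIZE": ""}, "BATCH_SIZE", "2", colon=True))  # 2
```

One practical consequence: with the form used in this diff, an empty `BATCH_SIZE=` entry in a `.env` file would be passed through as an empty string and would not fall back to 2.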

frontend/Dockerfile (+23 -22)

@@ -1,33 +1,34 @@
 # Step 1: Build the React application
 FROM node:20 AS build
 
-ARG BACKEND_API_URL="http://localhost:8000"
-ARG REACT_APP_SOURCES=""
-ARG LLM_MODELS=""
-ARG GOOGLE_CLIENT_ID=""
-ARG BLOOM_URL="https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true"
-ARG TIME_PER_CHUNK=4
-ARG TIME_PER_PAGE=50
-ARG LARGE_FILE_SIZE=5242880
-ARG CHUNK_SIZE=5242880
-ARG CHAT_MODES=""
-ARG ENV="DEV"
+ARG VITE_BACKEND_API_URL="http://localhost:8000"
+ARG VITE_REACT_APP_SOURCES=""
+ARG VITE_LLM_MODELS=""
+ARG VITE_GOOGLE_CLIENT_ID=""
+ARG VITE_BLOOM_URL="https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true"
+ARG VITE_TIME_PER_CHUNK=4
+ARG VITE_TIME_PER_PAGE=50
+ARG VITE_LARGE_FILE_SIZE=5242880
+ARG VITE_CHUNK_SIZE=5242880
+ARG VITE_CHAT_MODES=""
+ARG VITE_ENV="DEV"
+ARG VITE_BATCH_SIZE=2
 
 WORKDIR /app
 COPY package.json yarn.lock ./
-RUN yarn add @neo4j-nvl/base @neo4j-nvl/react
 RUN yarn install
 COPY . ./
-RUN BACKEND_API_URL=$BACKEND_API_URL \
-    REACT_APP_SOURCES=$REACT_APP_SOURCES \
-    LLM_MODELS=$LLM_MODELS \
-    GOOGLE_CLIENT_ID=$GOOGLE_CLIENT_ID \
-    BLOOM_URL=$BLOOM_URL \
-    TIME_PER_CHUNK=$TIME_PER_CHUNK \
-    CHUNK_SIZE=$CHUNK_SIZE \
-    ENV=$ENV \
-    LARGE_FILE_SIZE=${LARGE_FILE_SIZE} \
-    CHAT_MODES=$CHAT_MODES \
+RUN VITE_BACKEND_API_URL=$VITE_BACKEND_API_URL \
+    VITE_REACT_APP_SOURCES=$VITE_REACT_APP_SOURCES \
+    VITE_LLM_MODELS=$VITE_LLM_MODELS \
+    VITE_GOOGLE_CLIENT_ID=$VITE_GOOGLE_CLIENT_ID \
+    VITE_BLOOM_URL=$VITE_BLOOM_URL \
+    VITE_TIME_PER_CHUNK=$VITE_TIME_PER_CHUNK \
+    VITE_CHUNK_SIZE=$VITE_CHUNK_SIZE \
+    VITE_ENV=$VITE_ENV \
+    VITE_LARGE_FILE_SIZE=${VITE_LARGE_FILE_SIZE} \
+    VITE_CHAT_MODES=$VITE_CHAT_MODES \
+    VITE_BATCH_SIZE=$VITE_BATCH_SIZE
     yarn run build
 
 # Step 2: Serve the application using Nginx

frontend/example.env (+2 -1)

@@ -8,4 +8,5 @@ TIME_PER_PAGE=50
 CHUNK_SIZE=5242880
 LARGE_FILE_SIZE=5242880
 GOOGLE_CLIENT_ID=""
-CHAT_MODES=""
+CHAT_MODES=""
+BATCH_SIZE=2
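
Judging by the commit title and the commit-message entries about batch size, `BATCH_SIZE` appears to cap how many files are processed at the same time. The frontend code that implements this is not part of the excerpt shown here, so the following is only a minimal Python sketch of the general technique, using an `asyncio.Semaphore` as the limiter. The names `process_all` and `process_one`, the file names, and the sleep that stands in for the real extraction call are all assumptions made for illustration.

```python
import asyncio


async def process_all(file_names, batch_size=2):
    """Process every file while never running more than batch_size at once."""
    semaphore = asyncio.Semaphore(batch_size)
    running = 0  # files currently being processed
    peak = 0     # highest concurrency observed, recorded for demonstration

    async def process_one(name):
        nonlocal running, peak
        async with semaphore:  # waits here while batch_size files are in flight
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # placeholder for the real extraction request
            running -= 1
            return f"{name}: Completed"

    # gather() returns results in the same order as the input list.
    results = await asyncio.gather(*(process_one(n) for n in file_names))
    return results, peak


results, peak = asyncio.run(
    process_all(["a.pdf", "b.pdf", "c.pdf", "d.pdf", "e.pdf"], batch_size=2)
)
print(peak)  # 2
print(results)
```

In this sketch a waiting file starts as soon as any running file finishes, so the queue drains continuously instead of in fixed groups of two. Whether the actual frontend behaves this way or waits for a whole batch to complete cannot be determined from the hunks above.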
