Commit 23aa5b8

kartikpersistent, praveshkumar1988, prakriti-solankey, vasanthasaikalluri, and jayanth-002 committed
Retry processing (#698)
* Remove TotalPages when save file on local (#684)
* file_name reference and verify_ssl issue fixed (#683)
* User flow changes for recreating supported vector index (#682)
  * removed the if check
  * Add one more check for create vector index when chunks are exist without embeddings
  * removed local files
  * condition changes
  * chunks exists check
  * chunk exists without embeddings check
  * vector Index issue fixed
  * vector index with different dimension
  * Update graphDB_dataAccess.py
  Co-authored-by: Pravesh Kumar
* Reapply "Dockerfile changes with VITE label". This reverts commit a83e085.
* Revert "Dockerfile changes with VITE label". This reverts commit 2840ebc.
* Concurrent processing of files (#665)
  * Update README.md
  * Dropped the old vector index (#652)
  * added cypher_queries and llm chatbot files
  * updated llm-chatbot-python
  * added llm-chatbot-python
  * updated llm-chatbot-python folder
  * Added chatbot "hybrid" mode use case
  * added the concurrent file processing
  * page refresh scenario
  * fixed waiting files processing issue in refresh scenario
  * removed boolean param
  * fixed processedCount issue
  * checkbox with waiting check
  * fixed the refresh scenario with processing files
  * processing files check
  * server side error
  * processing file count check for processing files less than batch size
  * processing count check to handle all selected files
  * created helper functions
  * code improvements
  * __ changes (#656)
  * DiffbotGraphTransformer doesn't need an LLMGraphTransformer (#659) (Co-authored-by: jeromechoo)
  * Removed experiments/llm-chatbot-python folder from DEV branch
  * reduced the password clear timeout
  * Removed experiments/Cypher_Queries.ipynb file from DEV branch
  * disabled the closed button on banner and connection dialog while API is in pending state
  * update delete query with entities
  * node id check (#663)
  * Status source and type filtering (#664)
    * status source
    * Name change
    * type change
  * rollback to previous working nvl version
  * added the alert
  * add BATCH_SIZE to docker
  * temp fixes for 0.3.1
  * alert fix for less than batch size processing
  * new virtual env
  * added Hybrid Chat modes (#670)
  * Rename the function #657
  * label and checkboxes placement changes (#675)
    * label and checkboxes placement changes
    * checkbox placement changes
  * Graph node filename check
  * env fixes with latest nvl libraries
  * format fixes
  * removed local files
  * ndl changes
  * property spell fix
  Co-authored-by: vasanthasaikalluri, Jayanth T, abhishekkumar-27, Prakriti Solankey, Jerome Choo, jeromechoo, Pravesh Kumar
* env changes
* format fixes
* set retry status
* retry processing backend
* added the retry icon on rows
* vite changes in docker compose
* added retry dialog
* Integrated the Retry processing API
* Integrated the Extract API for retry processing
* Integrated ndl toast component
* replaced foreach with normal for loop for better performance
* types improvements
* used toast component
* spell fix
* Issue fixed
* processing changes in main
* function closing fix
* retry processing issue fixed
* autoclosing the retry popup on retry api success
* removed the retry if check
* resetting the node and relationship count on retry
* added the enter key events on the popups
* fixed wikipedia icon on large file alert popup
* setting nodes to 0 and start from last processed chunk logic changes
* Retry Popup fixes
* status changes for upload failed scenario
* kept condition specific
* changed status to reprocess from retry
* Reprocess wording changes
* tooltip changes
* wordings and size changes
* Changed status to Reprocess

Co-authored-by: Pravesh Kumar, Prakriti Solankey, vasanthasaikalluri, Jayanth T, abhishekkumar-27, Jerome Choo, jeromechoo, aashipandya
1 parent f8ad5fc commit 23aa5b8

34 files changed, +1027 -745 lines changed

backend/score.py

+26 -8
@@ -137,7 +137,9 @@ async def extract_knowledge_graph_from_file(
     file_name=Form(None),
     allowedNodes=Form(None),
     allowedRelationship=Form(None),
-    language=Form(None)
+    language=Form(None),
+    access_token=Form(None),
+    retry_condition=Form(None)
 ):
     """
     Calls 'extract_graph_from_file' in a new thread to create Neo4jGraph from a
@@ -159,30 +161,30 @@ async def extract_knowledge_graph_from_file(

         if source_type == 'local file':
             result = await asyncio.to_thread(
-                extract_graph_from_file_local_file, uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship)
+                extract_graph_from_file_local_file, uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship, retry_condition)

         elif source_type == 's3 bucket' and source_url:
             result = await asyncio.to_thread(
-                extract_graph_from_file_s3, uri, userName, password, database, model, source_url, aws_access_key_id, aws_secret_access_key, allowedNodes, allowedRelationship)
+                extract_graph_from_file_s3, uri, userName, password, database, model, source_url, aws_access_key_id, aws_secret_access_key, file_name, allowedNodes, allowedRelationship, retry_condition)

         elif source_type == 'web-url':
             result = await asyncio.to_thread(
-                extract_graph_from_web_page, uri, userName, password, database, model, source_url, allowedNodes, allowedRelationship)
+                extract_graph_from_web_page, uri, userName, password, database, model, source_url, file_name, allowedNodes, allowedRelationship, retry_condition)

         elif source_type == 'youtube' and source_url:
             result = await asyncio.to_thread(
-                extract_graph_from_file_youtube, uri, userName, password, database, model, source_url, allowedNodes, allowedRelationship)
+                extract_graph_from_file_youtube, uri, userName, password, database, model, source_url, file_name, allowedNodes, allowedRelationship, retry_condition)

         elif source_type == 'Wikipedia' and wiki_query:
             result = await asyncio.to_thread(
-                extract_graph_from_file_Wikipedia, uri, userName, password, database, model, wiki_query, max_sources, language, allowedNodes, allowedRelationship)
+                extract_graph_from_file_Wikipedia, uri, userName, password, database, model, wiki_query, language, file_name, allowedNodes, allowedRelationship, retry_condition)

         elif source_type == 'gcs bucket' and gcs_bucket_name:
             result = await asyncio.to_thread(
-                extract_graph_from_file_gcs, uri, userName, password, database, model, gcs_project_id, gcs_bucket_name, gcs_bucket_folder, gcs_blob_filename, access_token, allowedNodes, allowedRelationship)
+                extract_graph_from_file_gcs, uri, userName, password, database, model, gcs_project_id, gcs_bucket_name, gcs_bucket_folder, gcs_blob_filename, access_token, file_name, allowedNodes, allowedRelationship, retry_condition)
         else:
             return create_api_response('Failed',message='source_type is other than accepted source')
-
+
         if result is not None:
             result['db_url'] = uri
             result['api_name'] = 'extract'
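
Every call in this hunk passes its arguments to `asyncio.to_thread` positionally, so the newly inserted `file_name` and `retry_condition` only reach the right parameters if each worker function declares them in the same order. A minimal sketch of that pattern follows; `extract_stub` and all argument values are hypothetical stand-ins, not code from this repository.

```python
import asyncio


def extract_stub(uri, model, source_url, file_name, allowed_nodes,
                 allowed_relationship, retry_condition):
    """Hypothetical stand-in for a blocking extract_graph_from_* worker."""
    # Return a few arguments by name so the positional mapping is visible.
    return {
        "file_name": file_name,
        "allowedNodes": allowed_nodes,
        "retry_condition": retry_condition,
    }


async def main():
    # asyncio.to_thread(func, *args) runs func(*args) in a worker thread and
    # forwards the arguments in the order given, so the order here must match
    # the worker's signature exactly.
    return await asyncio.to_thread(
        extract_stub,
        "bolt://localhost:7687",  # placeholder URI
        "example-model",          # placeholder model name
        "https://example.com",    # placeholder source URL
        "example.pdf",
        "Person,Company",
        "WORKS_AT",
        "example_condition",      # placeholder retry_condition value
    )


result = asyncio.run(main())
```

Passing the arguments by keyword instead (`asyncio.to_thread(func, file_name=..., retry_condition=...)`) would make such signature changes less order-sensitive.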
@@ -623,6 +625,22 @@ async def merge_duplicate_nodes(uri=Form(), userName=Form(), password=Form(), da
         return create_api_response(job_status, message=message, error=error_message)
     finally:
         gc.collect()
+
+@app.post("/retry_processing")
+async def retry_processing(uri=Form(), userName=Form(), password=Form(), database=Form(), file_name=Form(), retry_condition=Form()):
+    try:
+        graph = create_graph_database_connection(uri, userName, password, database)
+        await asyncio.to_thread(set_status_retry, graph,file_name,retry_condition)
+        #set_status_retry(graph,file_name,retry_condition)
+        return create_api_response('Success',message=f"Status set to Reprocess for filename : {file_name}")
+    except Exception as e:
+        job_status = "Failed"
+        message="Unable to set status to Retry"
+        error_message = str(e)
+        logging.exception(f'{error_message}')
+        return create_api_response(job_status, message=message, error=error_message)
+    finally:
+        gc.collect()

 if __name__ == "__main__":
     uvicorn.run(app)
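
The new `/retry_processing` route takes six required form fields. As a rough illustration of how a client might call it, the sketch below builds (but does not send) a form-encoded POST with the standard library. The base URL, the credentials, and the `retry_condition` value are placeholders; the set of accepted `retry_condition` values is not shown in this diff.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_retry_request(base_url, uri, user_name, password, database,
                        file_name, retry_condition):
    """Build a POST request for the /retry_processing route without sending it."""
    # Field names mirror the Form() parameters declared on the endpoint.
    form = {
        "uri": uri,
        "userName": user_name,
        "password": password,
        "database": database,
        "file_name": file_name,
        "retry_condition": retry_condition,
    }
    return Request(
        f"{base_url}/retry_processing",
        data=urlencode(form).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_retry_request(
    "http://localhost:8000",    # assumed backend address
    "neo4j://localhost:7687",   # placeholder database URI
    "neo4j", "password", "neo4j",
    "example.pdf",
    "example_condition",        # placeholder; real values are not in this diff
)

# Decode the body again to inspect what would be sent.
fields = parse_qs(request.data.decode("utf-8"))
# To actually send the request: urllib.request.urlopen(request)
```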

backend/src/entities/source_node.py

+2
@@ -23,3 +23,5 @@ class sourceNode:
     language:str=None
     is_cancelled:bool=None
     processed_chunk:int=None
+    access_token:str=None
+    retry_condition:str=None

backend/src/graphDB_dataAccess.py

+7 -4
@@ -68,10 +68,10 @@ def update_source_node(self, obj_source_node:sourceNode):
             if obj_source_node.processing_time is not None and obj_source_node.processing_time != 0:
                 params['processingTime'] = round(obj_source_node.processing_time.total_seconds(),2)

-            if obj_source_node.node_count is not None and obj_source_node.node_count != 0:
+            if obj_source_node.node_count is not None :
                 params['nodeCount'] = obj_source_node.node_count

-            if obj_source_node.relationship_count is not None and obj_source_node.relationship_count != 0:
+            if obj_source_node.relationship_count is not None :
                 params['relationshipCount'] = obj_source_node.relationship_count

             if obj_source_node.model is not None and obj_source_node.model != '':
@@ -83,11 +83,14 @@ def update_source_node(self, obj_source_node:sourceNode):
             if obj_source_node.total_chunks is not None and obj_source_node.total_chunks != 0:
                 params['total_chunks'] = obj_source_node.total_chunks

-            if obj_source_node.is_cancelled is not None and obj_source_node.is_cancelled != False:
+            if obj_source_node.is_cancelled is not None:
                 params['is_cancelled'] = obj_source_node.is_cancelled

-            if obj_source_node.processed_chunk is not None and obj_source_node.processed_chunk != 0:
+            if obj_source_node.processed_chunk is not None :
                 params['processed_chunk'] = obj_source_node.processed_chunk
+
+            if obj_source_node.retry_condition is not None :
+                params['retry_condition'] = obj_source_node.retry_condition

             param= {"props":params}
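
Dropping the `!= 0` and `!= False` guards appears to be what lets a retry reset stored values: the commit message mentions resetting the node and relationship counts on retry, and under the old condition a value of 0 would never have been written. A minimal sketch of the difference, using a simplified hypothetical helper `build_params` rather than the real `update_source_node`:

```python
def build_params(node_count, skip_zero):
    """Hypothetical, simplified version of the nodeCount check in this diff."""
    params = {}
    if skip_zero:
        # Old condition: both None and 0 are ignored.
        if node_count is not None and node_count != 0:
            params["nodeCount"] = node_count
    else:
        # New condition: only None is ignored, so 0 can overwrite a stale count.
        if node_count is not None:
            params["nodeCount"] = node_count
    return params


old_params = build_params(0, skip_zero=True)
new_params = build_params(0, skip_zero=False)
print(old_params)  # {}
print(new_params)  # {'nodeCount': 0}
```

The same reasoning applies to `relationship_count`, `processed_chunk`, and `is_cancelled`, which the diff relaxes in the same way.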
