
Commit 46e334d

kartikpersistent, praveshkumar1988, prakriti-solankey, vasanthasaikalluri, and jayanth-002 authored
Retry processing (#698)
* Remove TotalPages when saving file locally (#684)
* file_name reference and verify_ssl issue fixed (#683)
* User flow changes for recreating supported vector index (#682): removed the if check; added a check for creating the vector index when chunks exist without embeddings; removed local files; chunks-exist and chunks-without-embeddings checks; fixed the vector index issue and vector index with a different dimension; updated graphDB_dataAccess.py
* Reapply "Dockerfile changes with VITE label" (reverts commit a83e085)
* Revert "Dockerfile changes with VITE label" (reverts commit 2840ebc)
* Concurrent processing of files (#665): updated README.md; dropped the old vector index (#652); added and later removed the llm-chatbot-python and Cypher_Queries experiments from the DEV branch; added the chatbot "hybrid" mode use case; concurrent file processing; page-refresh scenario fixes for waiting and processing files; removed a boolean param; fixed the processedCount issue; checkbox with waiting check; processing-file count checks for files below the batch size and for all selected files; created helper functions; code improvements
* __ changes (#656)
* DiffbotGraphTransformer doesn't need an LLMGraphTransformer (#659)
* Reduced the password clear timeout
* Disabled the close button on the banner and connection dialog while an API call is pending
* Updated delete query with entities
* Node id check (#663)
* Status source and type filtering (#664): status source, name, and type changes
* Rolled back to the previous working nvl version; added the alert
* Added BATCH_SIZE to docker; temp fixes for 0.3.1; alert fix for less-than-batch-size processing; new virtual env
* Added Hybrid Chat modes (#670)
* Renamed the function (#657)
* Label and checkbox placement changes (#675)
* Graph node filename check
* env fixes with the latest nvl libraries; format fixes; ndl changes
* Property spell fix
* env changes; set retry status; retry processing backend
* Added the retry icon on rows; vite changes in docker compose; added the retry dialog
* Integrated the Retry processing API and the Extract API for retry processing
* Integrated the ndl toast component; replaced forEach with a plain for loop for better performance; type improvements
* Spell fix; issue fixed; processing changes in main; function-closing fix
* Fixed the retry processing issue; auto-close the retry popup on retry API success; removed the retry if check
* Reset the node and relationship counts on retry; set nodes to 0 and start from the last processed chunk
* Added Enter-key events on the popups; fixed the Wikipedia icon on the large-file alert popup
* Retry popup fixes; status changes for the upload-failed scenario
* Changed the status wording from Retry to Reprocess; tooltip, wording, and size changes

Co-authored-by: Pravesh Kumar <[email protected]>
Co-authored-by: Prakriti Solankey <[email protected]>
Co-authored-by: vasanthasaikalluri <[email protected]>
Co-authored-by: Jayanth T <[email protected]>
Co-authored-by: abhishekkumar-27 <[email protected]>
Co-authored-by: Jerome Choo <[email protected]>
Co-authored-by: jeromechoo <[email protected]>
Co-authored-by: aashipandya <[email protected]>
1 parent 7652b64 commit 46e334d

35 files changed (+1006 −739 lines)

backend/score.py (+25 −8)
@@ -137,7 +137,8 @@ async def extract_knowledge_graph_from_file(
     allowedNodes=Form(None),
     allowedRelationship=Form(None),
     language=Form(None),
-    access_token=Form(None)
+    access_token=Form(None),
+    retry_condition=Form(None)
     ):
     """
     Calls 'extract_graph_from_file' in a new thread to create Neo4jGraph from a
@@ -161,30 +162,30 @@ async def extract_knowledge_graph_from_file(
         merged_file_path = os.path.join(MERGED_DIR,file_name)
         logging.info(f'File path:{merged_file_path}')
         result = await asyncio.to_thread(
-            extract_graph_from_file_local_file, uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship)
+            extract_graph_from_file_local_file, uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship, retry_condition)

     elif source_type == 's3 bucket' and source_url:
         result = await asyncio.to_thread(
-            extract_graph_from_file_s3, uri, userName, password, database, model, source_url, aws_access_key_id, aws_secret_access_key, allowedNodes, allowedRelationship)
+            extract_graph_from_file_s3, uri, userName, password, database, model, source_url, aws_access_key_id, aws_secret_access_key, file_name, allowedNodes, allowedRelationship, retry_condition)

     elif source_type == 'web-url':
         result = await asyncio.to_thread(
-            extract_graph_from_web_page, uri, userName, password, database, model, source_url, allowedNodes, allowedRelationship)
+            extract_graph_from_web_page, uri, userName, password, database, model, source_url, file_name, allowedNodes, allowedRelationship, retry_condition)

     elif source_type == 'youtube' and source_url:
         result = await asyncio.to_thread(
-            extract_graph_from_file_youtube, uri, userName, password, database, model, source_url, allowedNodes, allowedRelationship)
+            extract_graph_from_file_youtube, uri, userName, password, database, model, source_url, file_name, allowedNodes, allowedRelationship, retry_condition)

     elif source_type == 'Wikipedia' and wiki_query:
         result = await asyncio.to_thread(
-            extract_graph_from_file_Wikipedia, uri, userName, password, database, model, wiki_query, max_sources, language, allowedNodes, allowedRelationship)
+            extract_graph_from_file_Wikipedia, uri, userName, password, database, model, wiki_query, language, file_name, allowedNodes, allowedRelationship, retry_condition)

     elif source_type == 'gcs bucket' and gcs_bucket_name:
         result = await asyncio.to_thread(
-            extract_graph_from_file_gcs, uri, userName, password, database, model, gcs_project_id, gcs_bucket_name, gcs_bucket_folder, gcs_blob_filename, access_token, allowedNodes, allowedRelationship)
+            extract_graph_from_file_gcs, uri, userName, password, database, model, gcs_project_id, gcs_bucket_name, gcs_bucket_folder, gcs_blob_filename, access_token, file_name, allowedNodes, allowedRelationship, retry_condition)
     else:
         return create_api_response('Failed',message='source_type is other than accepted source')
-
+
     if result is not None:
         result['db_url'] = uri
         result['api_name'] = 'extract'
@@ -626,6 +627,22 @@ async def merge_duplicate_nodes(uri=Form(), userName=Form(), password=Form(), da
         return create_api_response(job_status, message=message, error=error_message)
     finally:
         gc.collect()
+
+@app.post("/retry_processing")
+async def retry_processing(uri=Form(), userName=Form(), password=Form(), database=Form(), file_name=Form(), retry_condition=Form()):
+    try:
+        graph = create_graph_database_connection(uri, userName, password, database)
+        await asyncio.to_thread(set_status_retry, graph,file_name,retry_condition)
+        #set_status_retry(graph,file_name,retry_condition)
+        return create_api_response('Success',message=f"Status set to Reprocess for filename : {file_name}")
+    except Exception as e:
+        job_status = "Failed"
+        message="Unable to set status to Retry"
+        error_message = str(e)
+        logging.exception(f'{error_message}')
+        return create_api_response(job_status, message=message, error=error_message)
+    finally:
+        gc.collect()

 if __name__ == "__main__":
     uvicorn.run(app)
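The diff above threads `file_name` and the new `retry_condition` form field through every extraction path before dispatching by `source_type`. A minimal sketch of that dispatch pattern is below; the handler bodies are stand-in stubs, not the project's real implementations, and only two of the six source types are shown.

```python
# Sketch of the source_type dispatch in extract_knowledge_graph_from_file:
# each accepted source type maps to a handler that now also receives
# file_name and retry_condition. Handlers here are illustrative stubs.

def extract_local(file_name, retry_condition=None):
    return {"fileName": file_name, "source": "local file",
            "retry_condition": retry_condition}

def extract_s3(file_name, retry_condition=None):
    return {"fileName": file_name, "source": "s3 bucket",
            "retry_condition": retry_condition}

HANDLERS = {
    "local file": extract_local,
    "s3 bucket": extract_s3,
}

def extract(source_type, file_name, retry_condition=None):
    handler = HANDLERS.get(source_type)
    if handler is None:
        # mirrors the create_api_response('Failed', ...) branch for
        # unaccepted source types
        return {"status": "Failed",
                "message": "source_type is other than accepted source"}
    return handler(file_name, retry_condition=retry_condition)
```

Passing `retry_condition` through every handler keeps the retry decision in one place (the endpoint) while each extractor decides how to resume.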

backend/src/entities/source_node.py (+1)

@@ -24,3 +24,4 @@ class sourceNode:
     is_cancelled:bool=None
     processed_chunk:int=None
     access_token:str=None
+    retry_condition:str=None
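The entity change is a single optional field on the source-node record. A minimal sketch of the resulting shape, using a dataclass (the field names come from the diff; the dataclass form and the example value are assumptions, not the project's exact class):

```python
from dataclasses import dataclass
from typing import Optional

# Sketch of the sourceNode entity after this commit: retry_condition joins
# the other optional bookkeeping fields, defaulting to None so existing
# callers are unaffected.
@dataclass
class SourceNode:
    is_cancelled: Optional[bool] = None
    processed_chunk: Optional[int] = None
    access_token: Optional[str] = None
    retry_condition: Optional[str] = None
```

Because the field defaults to `None`, nodes created before this commit simply carry no retry condition until one is set.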

backend/src/graphDB_dataAccess.py (+7 −4)

@@ -71,10 +71,10 @@ def update_source_node(self, obj_source_node:sourceNode):
         if obj_source_node.processing_time is not None and obj_source_node.processing_time != 0:
             params['processingTime'] = round(obj_source_node.processing_time.total_seconds(),2)

-        if obj_source_node.node_count is not None and obj_source_node.node_count != 0:
+        if obj_source_node.node_count is not None :
             params['nodeCount'] = obj_source_node.node_count

-        if obj_source_node.relationship_count is not None and obj_source_node.relationship_count != 0:
+        if obj_source_node.relationship_count is not None :
             params['relationshipCount'] = obj_source_node.relationship_count

         if obj_source_node.model is not None and obj_source_node.model != '':
@@ -86,11 +86,14 @@ def update_source_node(self, obj_source_node:sourceNode):
         if obj_source_node.total_chunks is not None and obj_source_node.total_chunks != 0:
             params['total_chunks'] = obj_source_node.total_chunks

-        if obj_source_node.is_cancelled is not None and obj_source_node.is_cancelled != False:
+        if obj_source_node.is_cancelled is not None:
             params['is_cancelled'] = obj_source_node.is_cancelled

-        if obj_source_node.processed_chunk is not None and obj_source_node.processed_chunk != 0:
+        if obj_source_node.processed_chunk is not None :
             params['processed_chunk'] = obj_source_node.processed_chunk
+
+        if obj_source_node.retry_condition is not None :
+            params['retry_condition'] = obj_source_node.retry_condition

         param= {"props":params}