Skip to content

Commit 7e06073

Browse files
aashipandyakartikpersistentedenbuaaabhishekkumar-27vasanthasaikalluri
committed
Dev to staging (#859)
* processing count updated on cancel * format fixes * remove whitespace for enviroment variable which due to an error "xxx may not contain whitespace" (#707) * updated disconnected nodes * updated disconnected nodes * fix: Processed count update on failed condition * added disconnected and up nodes * removed __Entity__ labels * removed graph_object * removed graph object in the function * resetting the alert message on success scenario * Modified queries * populate graph schema * not clearing the password when there is error scenario * fixed the vector index loading issue * fix: empty credentials payload for recreate vector index api * chatbot status (#676) * chatbot status * connection status check for ASK button * refresh disable check * review comment resolved * format fixes * added properties and modified to entity labels * Post processing call after all files completion (#716) * Dev To STAGING (#532) * format fixes and graph schema indication fix * Update README.md * added chat modes variable in env updated the readme * spell fix * added the chat mode in env table * added the logos * fixed the overflow issues * removed the extra fix * Fixed specific scenario "when the text from schema closes it should reopen the previous modal" * readme changes * removed dev console logs * added new retrieval query (#533) * format fixes and tab rendering fix * fixed the setting modal reopen issue --------- Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: vasanthasaikalluri <[email protected]> * Dev (#535) * format fixes and graph schema indication fix * Update README.md * added chat modes variable in env updated the readme * spell fix * added the chat mode in env table * added the logos * fixed the overflow issues * removed the extra fix * Fixed specific scenario "when the text from schema closes it should reopen the previous modal" * readme changes * removed dev console logs * added new retrieval query (#533) * format fixes and tab rendering fix * fixed the setting modal reopen issue --------- Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: vasanthasaikalluri <[email protected]> * Dev (#537) * format fixes and graph schema indication fix * Update README.md * added chat modes variable in env updated the readme * spell fix * added the chat mode in env table * added the logos * fixed the overflow issues * removed the extra fix * Fixed specific scenario "when the text from schema closes it should reopen the previous modal" * readme changes * removed dev console logs * added new retrieval query (#533) * format fixes and tab rendering fix * fixed the setting modal reopen issue --------- Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: vasanthasaikalluri <[email protected]> * Fix typo: correct 'josn_obj' to 'json_obj' (#697) * Staging To Main (#495) * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * recent merges * pdf deletion due to out of diskspace * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * Convert is_cancelled value from string to bool * added the default page size * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * offset in chunks (#389) * page number in gcs loader (#393) * added youtube timestamps (#392) * chat pop up button (#387) * expand * minimize-icon * css changes * chat history * chatbot wider Side Nav * expand icon * chatbot UI * Delete * merge fixes * code suggestions --------- Co-authored-by: kartikpersistent <[email protected]> * chunks create before extraction using is_pre_process variable (#383) * chunks create before extraction using is_pre_process variable * Return total pages for Model * update requirement.txt * total pages on uplaod API * added the Confirmation Dialog * added the selected files into the confirmation modal * format and lint fixes * added the stop watch image * fileselection on alert dialog * Add timeout in docker for gunicorn workers * Add cancel icon to info popup (#384) * Info Modal Changes * css changes * recent merges * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * added the default page size * Convert is_cancelled value from string to bool * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * Save Total Pages in DB * Added total Pages * file selection when we didn't select anything from Main table * added the danger icon only for large files * added the overflow for more files and file selection for all new files * moved the interface to types * added the icon accoroding to the source * set total page for wiki and youtube * h3 heading * merge * updated the alert on basis if total pages * deleted chunks * polling based on total pages * isNan check * large file based on file size for s3 and gcs * file source in server side event * time calculation based on chunks for gcs and s3 --------- Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: aashipandya <[email protected]> * fixed the layout issue * Populate graph schema (#399) * crreate new endpoint populate_graph_schema and update the query for getting lables from DB * Added main.py changes * conditionally-including-the-gcs-login-flow-in-gcs-as-source (#396) * added the condtion * removed llms * Fixed issue : Remove extra unused param * get emb only if used (#278) * Chatbot chunks (#402) * Added file name to the content sent to LLM * added chunk text in the response * increased the docs parts sent to llm * Modified graph query * mardown rendering * youtube starttime * icons * offset changes * removed the files due to codespace space issue --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user (#405) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * fixed css issue * fixed status blank issue * Modified response when no docs is retrived (#413) * Fixed env/docker-compose for local deployments + README doc (#410) * Fixed env/docker-compose for local deployments + README doc * wrong place for ENV in README * by default, removed langsmith + fixed knn score string to float * by default, removed langsmith + fixed knn score string to float * Fixed strings in docker-compose env * Added requirements (neo4j 5.15 or later, APOC, and instructions for Neo4j Desktop) * Missed the TIME_PER_PAGE env, was causing NaN issue in the approx time processing notification. fixed that * Support for all unstructured files (#401) * all unstructured files * responsiveness * added file type * added the extensions * spell mistake * ppt file changes --------- Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user with checkbox (#415) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * Extract schema using direct ChatOpenAI API and Chain * integrated the checkbox for schema to text dialog * Update SettingModal.tsx --------- Co-authored-by: Pravesh Kumar <[email protected]> * gcs file content read via storage client (#417) * gcs file content read via storage client * added the access token the file state --------- Co-authored-by: kartikpersistent <[email protected]> * pypdf2 to read files from gcs (#420) * 407 remove driver from frontend (#416) * removed driver * removed API * connecting to database on page refresh --------- Co-authored-by: kartikpersistent <[email protected]> * Css handling of info modal and Tooltips (#418) * css change * toolTips * Sidebar Tooltips * copy to clip * css change * added image types * added gcs * type fix * docker changes * speech * added the toolip for dropzone sources --------- Co-authored-by: kartikpersistent <[email protected]> * Fixed retrival bugs (#421) * yarn format fixes * changed the delete message * added the cancel button * changed the message on tooltip * added space * UI fixes * tooltip for setting * updated req * wikipedia URL input (#424) * accept only wikipedia links * added wikipedia link * added wikilink regex * wikipedia single url only * changed the alert message * wording change * pushed validation state persist error --------- Co-authored-by: aashipandya <[email protected]> * speech and copy (#422) * speech and copy * startTime * added chunk properties * tooltips --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Fixed issue for out of range in KNN API * solved conflicts * conflict solved * Remove logging info from update KNN API * tooltip changes * format and lint fixes * responsiveness changes * Fixed issue for total pages GCS, S3 * UI polishing (#428) * button and tooltip changes * checking validation on change * settings module populate fix * format fixes * opening the modal after auth success * removed the limit * added the scrobar for dropdowns * speech state (#426) * speech state * Button Details changes * delete wording change * Total pages in buckets (#431) * page number NA for buckets * added N/A for gcs and s3 pages * total pages for gcs * remove unwanted logger --------- Co-authored-by: kartikpersistent <[email protected]> * removed the max width * Update FileTable.tsx * Update the docker file * Modified prompt (#438) * Update Dockerfile * Update Dockerfile * Update Dockerfile * rendering Fix * Local file upload gcs (#442) * Uplaod file to GCS * GCS local upload fixed issue and delete file from GCS after processing and failed or cancelled * Add life cycle rule on uploaded bucket * pdf upload local and gcs bucket check * delete files when processed and extract changes --------- Co-authored-by: Pravesh Kumar <[email protected]> * Modified chat length and entities used (#443) * metadata for unstructured files (#446) * Unstructured file metadata (#447) * metadata for unstructured files * sleep in gcs upload * updated * icons added to chunks (#435) * icons added to chunks * info modal icons * Dev (#433) * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * recent merges * pdf deletion due to out of diskspace * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * Convert is_cancelled value from string to bool * added the default page size * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * offset in chunks (#389) * page number in gcs loader (#393) * added youtube timestamps (#392) * chat pop up button (#387) * expand * minimize-icon * css changes * chat history * chatbot wider Side Nav * expand icon * chatbot UI * Delete * merge fixes * code suggestions --------- Co-authored-by: kartikpersistent <[email protected]> * chunks create before extraction using is_pre_process variable (#383) * chunks create before extraction using is_pre_process variable * Return total pages for Model * update requirement.txt * total pages on uplaod API * added the Confirmation Dialog * added the selected files into the confirmation modal * format and lint fixes * added the stop watch image * fileselection on alert dialog * Add timeout in docker for gunicorn workers * Add cancel icon to info popup (#384) * Info Modal Changes * css changes * recent merges * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * added the default page size * Convert is_cancelled value from string to bool * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * Save Total Pages in DB * Added total Pages * file selection when we didn't select anything from Main table * added the danger icon only for large files * added the overflow for more files and file selection for all new files * moved the interface to types * added the icon accoroding to the source * set total page for wiki and youtube * h3 heading * merge * updated the alert on basis if total pages * deleted chunks * polling based on total pages * isNan check * large file based on file size for s3 and gcs * file source in server side event * time calculation based on chunks for gcs and s3 --------- Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: aashipandya <[email protected]> * fixed the layout issue * Populate graph schema (#399) * crreate new endpoint populate_graph_schema and update the query for getting lables from DB * Added main.py changes * conditionally-including-the-gcs-login-flow-in-gcs-as-source (#396) * added the condtion * removed llms * Fixed issue : Remove extra unused param * get emb only if used (#278) * Chatbot chunks (#402) * Added file name to the content sent to LLM * added chunk text in the response * increased the docs parts sent to llm * Modified graph query * mardown rendering * youtube starttime * icons * offset changes * removed the files due to codespace space issue --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user (#405) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * fixed css issue * fixed status blank issue * Modified response when no docs is retrived (#413) * Fixed env/docker-compose for local deployments + README doc (#410) * Fixed env/docker-compose for local deployments + README doc * wrong place for ENV in README * by default, removed langsmith + fixed knn score string to float * by default, removed langsmith + fixed knn score string to float * Fixed strings in docker-compose env * Added requirements (neo4j 5.15 or later, APOC, and instructions for Neo4j Desktop) * Missed the TIME_PER_PAGE env, was causing NaN issue in the approx time processing notification. fixed that * Support for all unstructured files (#401) * all unstructured files * responsiveness * added file type * added the extensions * spell mistake * ppt file changes --------- Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user with checkbox (#415) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * Extract schema using direct ChatOpenAI API and Chain * integrated the checkbox for schema to text dialog * Update SettingModal.tsx --------- Co-authored-by: Pravesh Kumar <[email protected]> * gcs file content read via storage client (#417) * gcs file content read via storage client * added the access token the file state --------- Co-authored-by: kartikpersistent <[email protected]> * pypdf2 to read files from gcs (#420) * 407 remove driver from frontend (#416) * removed driver * removed API * connecting to database on page refresh --------- Co-authored-by: kartikpersistent <[email protected]> * Css handling of info modal and Tooltips (#418) * css change * toolTips * Sidebar Tooltips * copy to clip * css change * added image types * added gcs * type fix * docker changes * speech * added the toolip for dropzone sources --------- Co-authored-by: kartikpersistent <[email protected]> * Fixed retrival bugs (#421) * yarn format fixes * changed the delete message * added the cancel button * changed the message on tooltip * added space * UI fixes * tooltip for setting * updated req * wikipedia URL input (#424) * accept only wikipedia links * added wikipedia link * added wikilink regex * wikipedia single url only * changed the alert message * wording change * pushed validation state persist error --------- Co-authored-by: aashipandya <[email protected]> * speech and copy (#422) * speech and copy * startTime * added chunk properties * tooltips --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Fixed issue for out of range in KNN API * solved conflicts * conflict solved * Remove logging info from update KNN API * tooltip changes * format and lint fixes * responsiveness changes * Fixed issue for total pages GCS, S3 * UI polishing (#428) * button and tooltip changes * checking validation on change * settings module populate fix * format fixes * opening the modal after auth success * removed the limit * added the scrobar for dropdowns * speech state (#426) * speech state * Button Details changes * delete wording change * Total pages in buckets (#431) * page number NA for buckets * added N/A for gcs and s3 pages * total pages for gcs * remove unwanted logger --------- Co-authored-by: kartikpersistent <[email protected]> * removed the max width * Update FileTable.tsx * Update the docker file * Modified prompt (#438) * Update Dockerfile * Update Dockerfile * Update Dockerfile * rendering Fix * Local file upload gcs (#442) * Uplaod file to GCS * GCS local upload fixed issue and delete file from GCS after processing and failed or cancelled * Add life cycle rule on uploaded bucket * pdf upload local and gcs bucket check * delete files when processed and extract changes --------- Co-authored-by: Pravesh Kumar <[email protected]> * Modified chat length and entities used (#443) * metadata for unstructured files (#446) * Unstructured file metadata (#447) * metadata for unstructured files * sleep in gcs upload * updated * icons added to chunks (#435) * icons added to chunks * info modal icons --------- Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: Pravesh Kumar <[email protected]> Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: Ajay Meena <[email protected]> Co-authored-by: Morgan Senechal <[email protected]> Co-authored-by: karanchellani <[email protected]> * fixed gcs status message issue * added if check for failed count * Null issue Fixed from backend for upload API and graph_document when model name mismatch * added word break issue * Added neo4j-rust-ext * processing time estimation based on bytes * File extension upper case fixed, File delete from GCS or local based on env variable. * timer per byte * Update Dockerfile * Adding sort rows on the table (#451) * Gcs upload folder hashed (#453) * implement foldername hashed in GCS bucket uplaod * Raise exception if invalid model selected * folder name for gcs upload --------- Co-authored-by: aashipandya <[email protected]> * upload all unstructuredfiles to gcs (#455) * Mofified chunk query (#454) * Added libre office for fixing error -- soffice command was not found. Please install libreoffice on your system and try again. - Install instructions: https://www.libreoffice.org/get-help/install-howto/ - Mac: https://formulae.brew.sh/cask/libreoffice - Debian: https://wiki.debian.org/LibreOffice" * Fix the PARTIAL CONTENT issue * File-table no data found (#456) * 'file-table'' * review comment * Llm format change (#459) * changed the llm models format to lowercase * added the error message * llm model changes * format fixes * removed unused import * added the capitalize method * delete files from merged_file_path only if source is local file --------- Co-authored-by: aashipandya <[email protected]> * commented total page code (#460) * format fixes * removed the disabled check on dropdown * Large file env * DEV to STAGING (#461) * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * recent merges * pdf deletion due to out of diskspace * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * Convert is_cancelled value from string to bool * added the default page size * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * offset in chunks (#389) * page number in gcs loader (#393) * added youtube timestamps (#392) * chat pop up button (#387) * expand * minimize-icon * css changes * chat history * chatbot wider Side Nav * expand icon * chatbot UI * Delete * merge fixes * code suggestions --------- Co-authored-by: kartikpersistent <[email protected]> * chunks create before extraction using is_pre_process variable (#383) * chunks create before extraction using is_pre_process variable * Return total pages for Model * update requirement.txt * total pages on uplaod API * added the Confirmation Dialog * added the selected files into the confirmation modal * format and lint fixes * added the stop watch image * fileselection on alert dialog * Add timeout in docker for gunicorn workers * Add cancel icon to info popup (#384) * Info Modal Changes * css changes * recent merges * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * added the default page size * Convert is_cancelled value from string to bool * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * Save Total Pages in DB * Added total Pages * file selection when we didn't select anything from Main table * added the danger icon only for large files * added the overflow for more files and file selection for all new files * moved the interface to types * added the icon accoroding to the source * set total page for wiki and youtube * h3 heading * merge * updated the alert on basis if total pages * deleted chunks * polling based on total pages * isNan check * large file based on file size for s3 and gcs * file source in server side event * time calculation based on chunks for gcs and s3 --------- Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: aashipandya <[email protected]> * fixed the layout issue * Populate graph schema (#399) * crreate new endpoint populate_graph_schema and update the query for getting lables from DB * Added main.py changes * conditionally-including-the-gcs-login-flow-in-gcs-as-source (#396) * added the condtion * removed llms * Fixed issue : Remove extra unused param * get emb only if used (#278) * Chatbot chunks (#402) * Added file name to the content sent to LLM * added chunk text in the response * increased the docs parts sent to llm * Modified graph query * mardown rendering * youtube starttime * icons * offset changes * removed the files due to codespace space issue --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user (#405) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * fixed css issue * fixed status blank issue * Modified response when no docs is retrived (#413) * Fixed env/docker-compose for local deployments + README doc (#410) * Fixed env/docker-compose for local deployments + README doc * wrong place for ENV in README * by default, removed langsmith + fixed knn score string to float * by default, removed langsmith + fixed knn score string to float * Fixed strings in docker-compose env * Added requirements (neo4j 5.15 or later, APOC, and instructions for Neo4j Desktop) * Missed the TIME_PER_PAGE env, was causing NaN issue in the approx time processing notification. fixed that * Support for all unstructured files (#401) * all unstructured files * responsiveness * added file type * added the extensions * spell mistake * ppt file changes --------- Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user with checkbox (#415) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * Extract schema using direct ChatOpenAI API and Chain * integrated the checkbox for schema to text dialog * Update SettingModal.tsx --------- Co-authored-by: Pravesh Kumar <[email protected]> * gcs file content read via storage client (#417) * gcs file content read via storage client * added the access token the file state --------- Co-authored-by: kartikpersistent <[email protected]> * pypdf2 to read files from gcs (#420) * 407 remove driver from frontend (#416) * removed driver * removed API * connecting to database on page refresh --------- Co-authored-by: kartikpersistent <[email protected]> * Css handling of info modal and Tooltips (#418) * css change * toolTips * Sidebar Tooltips * copy to clip * css change * added image types * added gcs * type fix * docker changes * speech * added the toolip for dropzone sources --------- Co-authored-by: kartikpersistent <[email protected]> * Fixed retrival bugs (#421) * yarn format fixes * changed the delete message * added the cancel button * changed the message on tooltip * added space * UI fixes * tooltip for setting * updated req * wikipedia URL input (#424) * accept only wikipedia links * added wikipedia link * added wikilink regex * wikipedia single url only * changed the alert message * wording change * pushed validation state persist error --------- Co-authored-by: aashipandya <[email protected]> * speech and copy (#422) * speech and copy * startTime * added chunk properties * tooltips --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Fixed issue for out of range in KNN API * solved conflicts * conflict solved * Remove logging info from update KNN API * tooltip changes * format and lint fixes * responsiveness changes * Fixed issue for total pages GCS, S3 * UI polishing (#428) * button and tooltip changes * checking validation on change * settings module populate fix * format fixes * opening the modal after auth success * removed the limit * added the scrobar for dropdowns * speech state (#426) * speech state * Button Details changes * delete wording change * Total pages in buckets (#431) * page number NA for buckets * added N/A for gcs and s3 pages * total pages for gcs * remove unwanted logger --------- Co-authored-by: kartikpersistent <[email protected]> * removed the max width * Update FileTable.tsx * Update the docker file * Modified prompt (#438) * Update Dockerfile * Update Dockerfile * Update Dockerfile * rendering Fix * Local file upload gcs (#442) * Uplaod file to GCS * GCS local upload fixed issue and delete file from GCS after processing and failed or cancelled * Add life cycle rule on uploaded bucket * pdf upload local and gcs bucket check * delete files when processed and extract changes --------- Co-authored-by: Pravesh Kumar <[email protected]> * Modified chat length and entities used (#443) * metadata for unstructured files (#446) * Unstructured file metadata (#447) * metadata for unstructured files * sleep in gcs upload * updated * icons added to chunks (#435) * icons added to chunks * info modal icons * fixed gcs status message issue * added if check for failed count * Null issue Fixed from backend for upload API and graph_document when model name mismatch * added word break issue * Added neo4j-rust-ext * processing time estimation based on bytes * File extension upper case fixed, File delete from GCS or local based on env variable. * timer per byte * Update Dockerfile * Adding sort rows on the table (#451) * Gcs upload folder hashed (#453) * implement foldername hashed in GCS bucket uplaod * Raise exception if invalid model selected * folder name for gcs upload --------- Co-authored-by: aashipandya <[email protected]> * upload all unstructuredfiles to gcs (#455) * Mofified chunk query (#454) * Added libre office for fixing error -- soffice command was not found. Please install libreoffice on your system and try again. - Install instructions: https://www.libreoffice.org/get-help/install-howto/ - Mac: https://formulae.brew.sh/cask/libreoffice - Debian: https://wiki.debian.org/LibreOffice" * Fix the PARTIAL CONTENT issue * File-table no data found (#456) * 'file-table'' * review comment * Llm format change (#459) * changed the llm models format to lowercase * added the error message * llm model changes * format fixes * removed unused import * added the capitalize method * delete files from merged_file_path only if source is local file --------- Co-authored-by: aashipandya <[email protected]> * commented total page code (#460) * format fixes * removed the disabled check on dropdown * Large file env --------- Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: aashipandya <[email protected]> Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: Ajay Meena <[email protected]> Co-authored-by: Morgan Senechal <[email protected]> Co-authored-by: karanchellani <[email protected]> * DEV to STAGING (#462) * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * recent merges * pdf deletion due to out of diskspace * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * Convert is_cancelled value from string to bool * added the default page size * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * offset in chunks (#389) * page number in gcs loader (#393) * added youtube timestamps (#392) * chat pop up button (#387) * expand * minimize-icon * css changes * chat history * chatbot wider Side Nav * expand icon * chatbot UI * Delete * merge fixes * code suggestions --------- Co-authored-by: kartikpersistent <[email protected]> * chunks create before extraction using is_pre_process variable (#383) * chunks create before extraction using is_pre_process variable * Return total pages for Model * update requirement.txt * total pages on uplaod API * added the Confirmation Dialog * added the selected files into the confirmation modal * format and lint fixes * added the stop watch image * fileselection on alert dialog * Add timeout in docker for gunicorn workers * Add cancel icon to info popup (#384) * Info Modal Changes * css changes * recent merges * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * added the default page size * Convert is_cancelled value from string to bool * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * Save Total Pages in DB * Added total Pages * file selection when we didn't select anything from Main table * added the danger icon only for large files * added the overflow for more files and file selection for all new files * moved the interface to types * added the icon accoroding to the source * set total page for wiki and youtube * h3 heading * merge * updated the alert on basis if total pages * deleted chunks * polling based on total pages * isNan check * large file based on file size for s3 and gcs * file source in server side event * time calculation based on chunks for gcs and s3 --------- Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: aashipandya <[email protected]> * fixed the layout issue * Populate graph schema (#399) * crreate new endpoint populate_graph_schema and update the query for getting lables from DB * Added main.py changes * conditionally-including-the-gcs-login-flow-in-gcs-as-source (#396) * added the condtion * removed llms * Fixed issue : Remove extra unused param * get emb only if used (#278) * Chatbot chunks (#402) * Added file name to the content sent to LLM * added chunk text in the response * increased the docs parts sent to llm * Modified graph query * mardown rendering * youtube starttime * icons * offset changes * removed the files due to codespace space issue --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user (#405) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * fixed css issue * fixed status blank issue * Modified response when no docs is retrived (#413) * Fixed env/docker-compose for local deployments + README doc (#410) * Fixed env/docker-compose for local deployments + README doc * wrong place for ENV in README * by default, removed langsmith + fixed knn score string to float * by default, removed langsmith + fixed knn score string to float * Fixed strings in docker-compose env * Added requirements (neo4j 5.15 or later, APOC, and instructions for Neo4j Desktop) * Missed the TIME_PER_PAGE env, was causing NaN issue in the approx time processing notification. fixed that * Support for all unstructured files (#401) * all unstructured files * responsiveness * added file type * added the extensions * spell mistake * ppt file changes --------- Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user with checkbox (#415) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * Extract schema using direct ChatOpenAI API and Chain * integrated the checkbox for schema to text dialog * Update SettingModal.tsx --------- Co-authored-by: Pravesh Kumar <[email protected]> * gcs file content read via storage client (#417) * gcs file content read via storage client * added the access token the file state --------- Co-authored-by: kartikpersistent <[email protected]> * pypdf2 to read files from gcs (#420) * 407 remove driver from frontend (#416) * removed driver * removed API * connecting to database on page refresh --------- Co-authored-by: kartikpersistent <[email protected]> * Css handling of info modal and Tooltips (#418) * css change * toolTips * Sidebar Tooltips * copy to clip * css change * added image types * added gcs * type fix * docker changes * speech * added the toolip for dropzone sources --------- Co-authored-by: kartikpersistent <[email protected]> * Fixed retrival bugs (#421) * yarn format fixes * changed the delete message * added the cancel button * changed the message on tooltip * added space * UI fixes * tooltip for setting * updated req * wikipedia URL input (#424) * accept only wikipedia links * added wikipedia link * added wikilink regex * wikipedia single url only * changed the alert message * wording change * pushed validation state persist error --------- Co-authored-by: aashipandya <[email protected]> * speech and copy (#422) * speech and copy * startTime * added chunk properties * tooltips --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Fixed issue for out of range in KNN API * solved conflicts * conflict solved * Remove logging info from update KNN API * tooltip changes * format and lint fixes * responsiveness changes * Fixed issue for total pages GCS, S3 * UI polishing (#428) * button and tooltip changes * checking validation on change * settings module populate fix * format fixes * opening the modal after auth success * removed the limit * added the scrobar for dropdowns * speech state (#426) * speech state * Button Details changes * delete wording change * Total pages in buckets (#431) * page number NA for buckets * added N/A for gcs and s3 pages * total pages for gcs * remove unwanted logger --------- Co-authored-by: kartikpersistent <[email protected]> * removed the max width * Update FileTable.tsx * Update the docker file * Modified prompt (#438) * Update Dockerfile * Update Dockerfile * Update Dockerfile * rendering Fix * Local file upload gcs (#442) * Uplaod file to GCS * GCS local upload fixed issue and delete file from GCS after processing and failed or cancelled * Add life cycle rule on uploaded bucket * pdf upload local and gcs bucket check * delete files when processed and extract changes --------- Co-authored-by: Pravesh Kumar <[email protected]> * Modified chat length and entities used (#443) * metadata for unstructured files (#446) * Unstructured file metadata (#447) * metadata for unstructured files * sleep in gcs upload * updated * icons added to chunks (#435) * icons added to chunks * info modal icons * fixed gcs status message issue * added if check for failed count * Null issue Fixed from backend for upload API and graph_document when model name mismatch * added word break issue * Added neo4j-rust-ext * processing time estimation based on bytes * File extension upper case fixed, File delete from GCS or local based on env variable. * timer per byte * Update Dockerfile * Adding sort rows on the table (#451) * Gcs upload folder hashed (#453) * implement foldername hashed in GCS bucket uplaod * Raise exception if invalid model selected * folder name for gcs upload --------- Co-authored-by: aashipandya <[email protected]> * upload all unstructuredfiles to gcs (#455) * Mofified chunk query (#454) * Added libre office for fixing error -- soffice command was not found. Please install libreoffice on your system and try again. - Install instructions: https://www.libreoffice.org/get-help/install-howto/ - Mac: https://formulae.brew.sh/cask/libreoffice - Debian: https://wiki.debian.org/LibreOffice" * Fix the PARTIAL CONTENT issue * File-table no data found (#456) * 'file-table'' * review comment * Llm format change (#459) * changed the llm models format to lowercase * added the error message * llm model changes * format fixes * removed unused import * added the capitalize method * delete files from merged_file_path only if source is local file --------- Co-authored-by: aashipandya <[email protected]> * commented total page code (#460) * format fixes * removed the disabled check on dropdown * Large file env --------- Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: aashipandya <[email protected]> Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: Ajay Meena <[email protected]> Co-authored-by: Morgan Senechal <[email protected]> Co-authored-by: karanchellani <[email protected]> * added upload api * changed the dropzone error message * Dev to staging (#466) * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * recent merges * pdf deletion due to out of diskspace * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * Convert is_cancelled value from string to bool * added the default page size * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * offset in chunks (#389) * page number in gcs loader (#393) * added youtube timestamps (#392) * chat pop up button (#387) * expand * minimize-icon * css changes * chat history * chatbot wider Side Nav * expand icon * chatbot UI * Delete * merge fixes * code suggestions --------- Co-authored-by: kartikpersistent <[email protected]> * chunks create before extraction using is_pre_process variable (#383) * chunks create before extraction using is_pre_process variable * Return total pages for Model * update requirement.txt * total pages on uplaod API * added the Confirmation Dialog * added the selected files into the confirmation modal * format and lint fixes * added the stop watch image * fileselection on alert dialog * Add timeout in docker for gunicorn workers * Add cancel icon to info popup (#384) * Info Modal Changes * css changes * recent merges * Integration_qa test (#375) * Test IntegrationQA added * update test cases * update test * update node count assertions * test changes * update changes * modification test * Code refatctor test cases * Handle allowedlist issue in test * test changes * update test * test case execution * test chatbot updates * test case update file * added file --------- Co-authored-by: Pravesh Kumar <[email protected]> * fixed status blank issue * Rendering the file name instead of link for gcs and s3 sources in the info modal * added the default page size * Convert is_cancelled value from string to bool * Issue fixed Processed chunked as 0 when file re-process again * Youtube timestamps (#386) * Wikipedia source to accept all valid urls * wikipedia url to support multiple languages * integrated wiki langauge param for extract api * Youtube video timestamps --------- Co-authored-by: kartikpersistent <[email protected]> * groq llm integration backend (#286) * groq llm integration backend * groq and description in node properties * added groq in options --------- Co-authored-by: kartikpersistent <[email protected]> * Save Total Pages in DB * Added total Pages * file selection when we didn't select anything from Main table * added the danger icon only for large files * added the overflow for more files and file selection for all new files * moved the interface to types * added the icon accoroding to the source * set total page for wiki and youtube * h3 heading * merge * updated the alert on basis if total pages * deleted chunks * polling based on total pages * isNan check * large file based on file size for s3 and gcs * file source in server side event * time calculation based on chunks for gcs and s3 --------- Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: aashipandya <[email protected]> * fixed the layout issue * Populate graph schema (#399) * crreate new endpoint populate_graph_schema and update the query for getting lables from DB * Added main.py changes * conditionally-including-the-gcs-login-flow-in-gcs-as-source (#396) * added the condtion * removed llms * Fixed issue : Remove extra unused param * get emb only if used (#278) * Chatbot chunks (#402) * Added file name to the content sent to LLM * added chunk text in the response * increased the docs parts sent to llm * Modified graph query * mardown rendering * youtube starttime * icons * offset changes * removed the files due to codespace space issue --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user (#405) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * fixed css issue * fixed status blank issue * Modified response when no docs is retrived (#413) * Fixed env/docker-compose for local deployments + README doc (#410) * Fixed env/docker-compose for local deployments + README doc * wrong place for ENV in README * by default, removed langsmith + fixed knn score string to float * by default, removed langsmith + fixed knn score string to float * Fixed strings in docker-compose env * Added requirements (neo4j 5.15 or later, APOC, and instructions for Neo4j Desktop) * Missed the TIME_PER_PAGE env, was causing NaN issue in the approx time processing notification. fixed that * Support for all unstructured files (#401) * all unstructured files * responsiveness * added file type * added the extensions * spell mistake * ppt file changes --------- Co-authored-by: kartikpersistent <[email protected]> * Settings modal to support generating the labels from the llm by using text given by user with checkbox (#415) * added the json * added schema from text dialog * integrated the schemaAPI * added the alert * resize fixes * Extract schema using direct ChatOpenAI API and Chain * integrated the checkbox for schema to text dialog * Update SettingModal.tsx --------- Co-authored-by: Pravesh Kumar <[email protected]> * gcs file content read via storage client (#417) * gcs file content read via storage client * added the access token the file state --------- Co-authored-by: kartikpersistent <[email protected]> * pypdf2 to read files from gcs (#420) * 407 remove driver from frontend (#416) * removed driver * removed API * connecting to database on page refresh --------- Co-authored-by: kartikpersistent <[email protected]> * Css handling of info modal and Tooltips (#418) * css change * toolTips * Sidebar Tooltips * copy to clip * css change * added image types * added gcs * type fix * docker changes * speech * added the toolip for dropzone sources --------- Co-authored-by: kartikpersistent <[email protected]> * Fixed retrival bugs (#421) * yarn format fixes * changed the delete message * added the cancel button * changed the message on tooltip * added space * UI fixes * tooltip for setting * updated req * wikipedia URL input (#424) * accept only wikipedia links * added wikipedia link * added wikilink regex * wikipedia single url only * changed the alert message * wording change * pushed validation state persist error --------- Co-authored-by: aashipandya <[email protected]> * speech and copy (#422) * speech and copy * startTime * added chunk properties * tooltips --------- Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: kartikpersistent <[email protected]> * Fixed issue for out of range in KNN API * solved conflicts * conflict solved * Remove logging info from update KNN API * tooltip changes * format and lint fixes * responsiveness changes * Fixed issue for total pages GCS, S3 * UI polishing (#428) * button and tooltip changes * checking validation on change * settings module populate fix * format fixes * opening the modal after auth success * removed the limit * added the scrobar for dropdowns * speech state (#426) * speech state * Button Details changes * delete wording change * Total pages in buckets (#431) * page number NA for buckets * added N/A for gcs and s3 pages * total pages for gcs * remove unwanted logger --------- Co-authored-by: kartikpersistent <[email protected]> * removed the max width * Update FileTable.tsx * Update the docker file * Modified prompt (#438) * Update Dockerfile * Update Dockerfile * Update Dockerfile * rendering Fix * Local file upload gcs (#442) * Uplaod file to GCS * GCS local upload fixed issue and delete file from GCS after processing and failed or cancelled * Add life cycle rule on uploaded bucket * pdf upload local and gcs bucket check * delete files when processed and extract changes --------- Co-authored-by: Pravesh Kumar <[email protected]> * Modified chat length and entities used (#443) * metadata for unstructured files (#446) * Unstructured file metadata (#447) * metadata for unstructured files * sleep in gcs upload * updated * icons added to chunks (#435) * icons added to chunks * info modal icons * fixed gcs status message issue * added if check for failed count * Null issue Fixed from backend for upload API and graph_document when model name mismatch * added word break issue * Added neo4j-rust-ext * processing time estimation based on bytes * File extension upper case fixed, File delete from GCS or local based on env variable. * timer per byte * Update Dockerfile * Adding sort rows on the table (#451) * Gcs upload folder hashed (#453) * implement foldername hashed in GCS bucket uplaod * Raise exception if invalid model selected * folder name for gcs upload --------- Co-authored-by: aashipandya <[email protected]> * upload all unstructuredfiles to gcs (#455) * Mofified chunk query (#454) * Added libre office for fixing error -- soffice command was not found. Please install libreoffice on your system and try again. - Install instructions: https://www.libreoffice.org/get-help/install-howto/ - Mac: https://formulae.brew.sh/cask/libreoffice - Debian: https://wiki.debian.org/LibreOffice" * Fix the PARTIAL CONTENT issue * File-table no data found (#456) * 'file-table'' * review comment * Llm format change (#459) * changed the llm models format to lowercase * added the error message * llm model changes * format fixes * removed unused import * added the capitalize method * delete files from merged_file_path only if source is local file --------- Co-authored-by: aashipandya <[email protected]> * commented total page code (#460) * format fixes * removed the disabled check on dropdown * Large file env * added upload api * changed the dropzone error message --------- Co-authored-by: abhishekkumar-27 <[email protected]> Co-authored-by: kartikpersistent <[email protected]> Co-authored-by: aashipandya <[email protected]> Co-authored-by: vasanthasaikalluri <[email protected]> Co-authored-by: Prakriti Solankey <[email protected]> Co-authored-by: Ajay Meena <[email protected]> Co-authored-by: Morgan Senechal <[email protected]> Co-authored-by: karanchellani <142801957+karanchellani… * modified the summary creation * fixed the summary creation * Configuration change. Update LLM models and remove --preload from docker file * Retry processing (#698) * Remove TotalPages when save file on local (#684) * file_name reference and verify_ssl issue fixed (#683) * User flow changes for recreating supported vector index (#682) * removed the if check * Add one more check for create vector index when chunks are exist without embeddings * removed local files * condition changes * chunks exists check * chunk exists without embeddings check * vector Index issue fixed * vector index with different dimension * Update graphDB_dataAccess.py --------- Co-authored-by: Pravesh Kumar <[email protected]> * Remove TotalPages when save file on local (#684) * file_name reference and verify_ssl issue fixed (#683) * User flow changes for recreating supported vector index (#682) * removed the if check * Add one more check for create vector index when chunks are exist without embeddings * removed local files * condition changes * chunks exists check * chunk exists without embeddings check * vector Index issue fixed * vector index with different dimension * Update graphDB_dataAccess.py --------- Co-authored-by: Pravesh Kumar <[email protected]> * Reapply "Dockerfile changes with VITE label" This reverts commit a83e0855fbf54d2b5af009d96c4edf0bcd7ab84a. * Revert "Dockerfile changes with VITE label" This reverts commit 2840ebc9e6156c51465a9f54be72ca2d014147c2. * Concurrent processing of files (#665) * Update README.md * Droped the old vector index (#652) * added cypher_queries and llm chatbot files * updated llm-chatbot-python * added llm-chatbot-python * updated llm-chatbot-python folder * Added chatbot "hybrid " mode use case * added the concurrent file processing * page refresh scenario * fixed waiting files processing issue in refresh scenario * removed boolean param * fixed processedCount issue * checkbox with waiting check * fixed the refresh scenario with processing files * processing files check * server side error * processing file count check for processing files less than batch size * processing count check to handle allselected files * created helper functions * code improvements * __ changes (#656) * DiffbotGraphTransformer doesn't need an LLMGraphTransformer (#659) Co-authored-by: jeromechoo <[email protected]
1 parent 98da2fa commit 7e06073

File tree

19 files changed

+194
-228
lines changed

19 files changed

+194
-228
lines changed

backend/score.py

+12-32
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,6 @@
3636
from src.ragas_eval import *
3737
from starlette.types import ASGIApp, Message, Receive, Scope, Send
3838
import gzip
39-
from langchain_neo4j import Neo4jGraph
4039

4140
logger = CustomLogger()
4241
CHUNK_DIR = os.path.join(os.path.dirname(__file__), "chunks")
@@ -82,9 +81,10 @@ async def __call__(self, scope: Scope, receive: Receive, send: Send):
8281
await gzip_middleware(scope, receive, send)
8382
app = FastAPI()
8483
# SecWeb(app=app, Option={'referrer': False, 'xframe': False})
85-
# app.add_middleware(ContentSecurityPolicy, Option={'default-src': ["'self'"], 'base-uri': ["'self'"], 'block-all-mixed-content': []}, script_nonce=False, style_nonce=False, report_only=False)
84+
app.add_middleware(ContentSecurityPolicy, Option={'default-src': ["'self'"], 'base-uri': ["'self'"], 'block-all-mixed-content': []}, script_nonce=False, style_nonce=False, report_only=False)
8685
app.add_middleware(XContentTypeOptions)
8786
app.add_middleware(XFrame, Option={'X-Frame-Options': 'DENY'})
87+
#app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=5)
8888
app.add_middleware(CustomGZipMiddleware, minimum_size=1000, compresslevel=5,paths=["/sources_list","/url/scan","/extract","/chat_bot","/chunk_entities","/get_neighbours","/graph_query","/schema","/populate_graph_schema","/get_unconnected_nodes_list","/get_duplicate_nodes","/fetch_chunktext"])
8989
app.add_middleware(
9090
CORSMiddleware,
@@ -930,6 +930,16 @@ async def fetch_chunktext(
930930
page_no: int = Form(1)
931931
):
932932
try:
933+
payload_json_obj = {
934+
'api_name': 'fetch_chunktext',
935+
'db_url': uri,
936+
'userName': userName,
937+
'database': database,
938+
'document_name': document_name,
939+
'page_no': page_no,
940+
'logging_time': formatted_time(datetime.now(timezone.utc))
941+
}
942+
logger.log_struct(payload_json_obj, "INFO")
933943
start = time.time()
934944
result = await asyncio.to_thread(
935945
get_chunktext_results,
@@ -945,8 +955,6 @@ async def fetch_chunktext(
945955
json_obj = {
946956
'api_name': 'fetch_chunktext',
947957
'db_url': uri,
948-
'userName': userName,
949-
'database': database,
950958
'document_name': document_name,
951959
'page_no': page_no,
952960
'logging_time': formatted_time(datetime.now(timezone.utc)),
@@ -963,34 +971,6 @@ async def fetch_chunktext(
963971
finally:
964972
gc.collect()
965973

966-
@app.post("/backend_connection_configuation")
967-
async def backend_connection_configuation():
968-
try:
969-
graph = Neo4jGraph()
970-
logging.info(f'login connection status of object: {graph}')
971-
if graph is not None:
972-
graph_connection = True
973-
isURI = os.getenv('NEO4J_URI')
974-
isUsername= os.getenv('NEO4J_USERNAME')
975-
isDatabase= os.getenv('NEO4J_DATABASE')
976-
isPassword= os.getenv('NEO4J_PASSWORD')
977-
encoded_password = encode_password(isPassword)
978-
graphDb_data_Access = graphDBdataAccess(graph)
979-
gds_status = graphDb_data_Access.check_gds_version()
980-
write_access = graphDb_data_Access.check_account_access(database=isDatabase)
981-
return create_api_response('Success',message=f"Backend connection successful",data={'graph_connection':graph_connection,'uri':isURI,'user_name':isUsername,'database':isDatabase,'password':encoded_password,'gds_status':gds_status,'write_access':write_access})
982-
else:
983-
graph_connection = False
984-
return create_api_response('Success',message=f"Backend connection is not successful",data=graph_connection)
985-
except Exception as e:
986-
graph_connection = False
987-
job_status = "Failed"
988-
message="Unable to connect backend DB"
989-
error_message = str(e)
990-
logging.exception(f'{error_message}')
991-
return create_api_response(job_status, message=message, error=error_message + ' or fill from the login dialog', data=graph_connection)
992-
finally:
993-
gc.collect()
994974

995975
if __name__ == "__main__":
996976
uvicorn.run(app)

backend/src/graphDB_dataAccess.py

+1-1
Original file line numberDiff line numberDiff line change
@@ -362,7 +362,7 @@ def get_duplicate_nodes_list(self):
362362
score_value = float(os.environ.get('DUPLICATE_SCORE_VALUE'))
363363
text_distance = int(os.environ.get('DUPLICATE_TEXT_DISTANCE'))
364364
query_duplicate_nodes = """
365-
MATCH (n:!Chunk&!Session&!Document&!`__Community__`) with n
365+
MATCH (n:!Chunk&!Session&!Document&!`__Community__`&!`__Entity__`) with n
366366
WHERE n.embedding is not null and n.id is not null // and size(toString(n.id)) > 3
367367
WITH n ORDER BY count {{ (n)--() }} DESC, size(toString(n.id)) DESC // updated
368368
WITH collect(n) as nodes

backend/src/shared/constants.py

-84
Original file line numberDiff line numberDiff line change
@@ -164,90 +164,6 @@
164164
LIMIT $limit
165165
"""
166166

167-
NODEREL_COUNT_QUERY_WITH_COMMUNITY = """
168-
MATCH (d:Document)
169-
WHERE d.fileName IS NOT NULL
170-
OPTIONAL MATCH (d)<-[po:PART_OF]-(c:Chunk)
171-
OPTIONAL MATCH (c)-[he:HAS_ENTITY]->(e:__Entity__)
172-
OPTIONAL MATCH (c)-[sim:SIMILAR]->(c2:Chunk)
173-
OPTIONAL MATCH (c)-[nc:NEXT_CHUNK]->(c3:Chunk)
174-
OPTIONAL MATCH (e)-[ic:IN_COMMUNITY]->(comm:__Community__)
175-
OPTIONAL MATCH (comm)-[pc1:PARENT_COMMUNITY]->(first_level:__Community__)
176-
OPTIONAL MATCH (first_level)-[pc2:PARENT_COMMUNITY]->(second_level:__Community__)
177-
OPTIONAL MATCH (second_level)-[pc3:PARENT_COMMUNITY]->(third_level:__Community__)
178-
WITH
179-
d.fileName AS filename,
180-
count(DISTINCT c) AS chunkNodeCount,
181-
count(DISTINCT po) AS partOfRelCount,
182-
count(DISTINCT he) AS hasEntityRelCount,
183-
count(DISTINCT sim) AS similarRelCount,
184-
count(DISTINCT nc) AS nextChunkRelCount,
185-
count(DISTINCT e) AS entityNodeCount,
186-
collect(DISTINCT e) AS entities,
187-
count(DISTINCT comm) AS baseCommunityCount,
188-
count(DISTINCT first_level) AS firstlevelcommCount,
189-
count(DISTINCT second_level) AS secondlevelcommCount,
190-
count(DISTINCT third_level) AS thirdlevelcommCount,
191-
count(DISTINCT ic) AS inCommunityCount,
192-
count(DISTINCT pc1) AS parentCommunityRelCount1,
193-
count(DISTINCT pc2) AS parentCommunityRelCount2,
194-
count(DISTINCT pc3) AS parentCommunityRelCount3
195-
WITH
196-
filename,
197-
chunkNodeCount,
198-
partOfRelCount + hasEntityRelCount + similarRelCount + nextChunkRelCount AS chunkRelCount,
199-
entityNodeCount,
200-
entities,
201-
baseCommunityCount + firstlevelcommCount + secondlevelcommCount + thirdlevelcommCount AS commCount,
202-
inCommunityCount + parentCommunityRelCount1 + parentCommunityRelCount2 + parentCommunityRelCount3 AS communityRelCount
203-
CALL (entities) {
204-
UNWIND entities AS e
205-
RETURN sum(COUNT { (e)-->(e2:__Entity__) WHERE e2 in entities }) AS entityEntityRelCount
206-
}
207-
RETURN
208-
filename,
209-
COALESCE(chunkNodeCount, 0) AS chunkNodeCount,
210-
COALESCE(chunkRelCount, 0) AS chunkRelCount,
211-
COALESCE(entityNodeCount, 0) AS entityNodeCount,
212-
COALESCE(entityEntityRelCount, 0) AS entityEntityRelCount,
213-
COALESCE(commCount, 0) AS communityNodeCount,
214-
COALESCE(communityRelCount, 0) AS communityRelCount
215-
"""
216-
NODEREL_COUNT_QUERY_WITHOUT_COMMUNITY = """
217-
MATCH (d:Document)
218-
WHERE d.fileName = $document_name
219-
OPTIONAL MATCH (d)<-[po:PART_OF]-(c:Chunk)
220-
OPTIONAL MATCH (c)-[he:HAS_ENTITY]->(e:__Entity__)
221-
OPTIONAL MATCH (c)-[sim:SIMILAR]->(c2:Chunk)
222-
OPTIONAL MATCH (c)-[nc:NEXT_CHUNK]->(c3:Chunk)
223-
WITH
224-
d.fileName AS filename,
225-
count(DISTINCT c) AS chunkNodeCount,
226-
count(DISTINCT po) AS partOfRelCount,
227-
count(DISTINCT he) AS hasEntityRelCount,
228-
count(DISTINCT sim) AS similarRelCount,
229-
count(DISTINCT nc) AS nextChunkRelCount,
230-
count(DISTINCT e) AS entityNodeCount,
231-
collect(DISTINCT e) AS entities
232-
WITH
233-
filename,
234-
chunkNodeCount,
235-
partOfRelCount + hasEntityRelCount + similarRelCount + nextChunkRelCount AS chunkRelCount,
236-
entityNodeCount,
237-
entities
238-
CALL (entities) {
239-
UNWIND entities AS e
240-
RETURN sum(COUNT { (e)-->(e2:__Entity__) WHERE e2 in entities }) AS entityEntityRelCount
241-
}
242-
RETURN
243-
filename,
244-
COALESCE(chunkNodeCount, 0) AS chunkNodeCount,
245-
COALESCE(chunkRelCount, 0) AS chunkRelCount,
246-
COALESCE(entityNodeCount, 0) AS entityNodeCount,
247-
COALESCE(entityEntityRelCount, 0) AS entityEntityRelCount
248-
"""
249-
250-
251167
## CHAT SETUP
252168
CHAT_MAX_TOKENS = 1000
253169
CHAT_SEARCH_KWARG_SCORE_THRESHOLD = 0.5

backend/test_integrationqa.py

+21
Original file line numberDiff line numberDiff line change
@@ -137,6 +137,27 @@ def test_graph_website(model_name):
137137
# print("Fail: ", e)
138138
return weburl_result
139139

140+
def test_graph_website(model_name):
141+
"""Test graph creation from a Website page."""
142+
#graph, model, source_url, source_type
143+
source_url = 'https://www.amazon.com/'
144+
source_type = 'web-url'
145+
create_source_node_graph_web_url(graph, model_name, source_url, source_type)
146+
147+
weburl_result = extract_graph_from_web_page(URI, USERNAME, PASSWORD, DATABASE, model_name, source_url, '', '')
148+
logging.info("WebUrl test done")
149+
print(weburl_result)
150+
151+
try:
152+
assert weburl_result['status'] == 'Completed'
153+
assert weburl_result['nodeCount'] > 0
154+
assert weburl_result['relationshipCount'] > 0
155+
print("Success")
156+
except AssertionError as e:
157+
print("Fail: ", e)
158+
return weburl_result
159+
160+
140161
def test_graph_from_youtube_video(model_name):
141162
"""Test graph creation from a YouTube video."""
142163
source_url = 'https://www.youtube.com/watch?v=T-qy-zPWgqA'

frontend/src/components/ChatBot/ChunkInfo.tsx

+1-1
Original file line numberDiff line numberDiff line change
@@ -268,4 +268,4 @@ const ChunkInfo: FC<ChunkProps> = ({ loading, chunks, mode }) => {
268268
</>
269269
);
270270
};
271-
export default ChunkInfo;
271+
export default ChunkInfo;

frontend/src/components/ChatBot/CommunitiesInfo.tsx

+1-1
Original file line numberDiff line numberDiff line change
@@ -79,4 +79,4 @@ const CommunitiesInfo: FC<CommunitiesProps> = ({ loading, communities, mode }) =
7979
);
8080
};
8181

82-
export default CommunitiesInfo;
82+
export default CommunitiesInfo;

frontend/src/components/ChatBot/EntitiesInfo.tsx

+1-1
Original file line numberDiff line numberDiff line change
@@ -150,4 +150,4 @@ const EntitiesInfo: FC<EntitiesProps> = ({ loading, mode, graphonly_entities, in
150150
</>
151151
);
152152
};
153-
export default EntitiesInfo;
153+
export default EntitiesInfo;

frontend/src/components/ChatBot/chatInfo.ts

+1-1
Original file line numberDiff line numberDiff line change
@@ -37,4 +37,4 @@ export const handleGraphNodeClick = async (
3737
setLoadingGraphView(false);
3838
}
3939
}
40-
};
40+
};

frontend/src/components/Content.tsx

+78-11
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,16 @@ import { Button, Typography, Flex, StatusIndicator, useMediaQuery } from '@neo4j
44
import { useCredentials } from '../context/UserCredentials';
55
import { useFileContext } from '../context/UsersFiles';
66
import { extractAPI } from '../utils/FileAPI';
7-
import { BannerAlertProps, ChildRef, ContentProps, CustomFile, OptionType, UserCredentials, chunkdata } from '../types';
7+
import {
8+
BannerAlertProps,
9+
ChildRef,
10+
ContentProps,
11+
CustomFile,
12+
OptionType,
13+
UserCredentials,
14+
chunkdata,
15+
connectionState,
16+
} from '../types';
817
import deleteAPI from '../services/DeleteFiles';
918
import { postProcessing } from '../services/PostProcessing';
1019
import { triggerStatusUpdateAPI } from '../services/ServerSideStatusUpdateAPI';
@@ -57,7 +66,16 @@ const Content: React.FC<ContentProps> = ({
5766
const [openGraphView, setOpenGraphView] = useState<boolean>(false);
5867
const [inspectedName, setInspectedName] = useState<string>('');
5968
const [documentName, setDocumentName] = useState<string>('');
60-
const { setUserCredentials, userCredentials, setConnectionStatus, isGdsActive, isReadOnlyUser } = useCredentials();
69+
const {
70+
setUserCredentials,
71+
userCredentials,
72+
connectionStatus,
73+
setConnectionStatus,
74+
isGdsActive,
75+
setGdsActive,
76+
setIsReadOnlyUser,
77+
isReadOnlyUser,
78+
} = useCredentials();
6179
const [showConfirmationModal, setshowConfirmationModal] = useState<boolean>(false);
6280
const [extractLoading, setextractLoading] = useState<boolean>(false);
6381
const [retryFile, setRetryFile] = useState<string>('');
@@ -90,12 +108,11 @@ const Content: React.FC<ContentProps> = ({
90108
setchatModes,
91109
model,
92110
} = useFileContext();
93-
const [viewPoint, setViewPoint] = useState<'tableView' | 'showGraphView' | 'chatInfoView' | 'neighborView'>(
94-
'tableView'
95-
);
111+
const [viewPoint, setViewPoint] = useState<'tableView' | 'showGraphView' | 'chatInfoView'|'neighborView'>('tableView');
96112
const [showDeletePopUp, setshowDeletePopUp] = useState<boolean>(false);
97113
const [deleteLoading, setdeleteLoading] = useState<boolean>(false);
98114

115+
99116
const { updateStatusForLargeFiles } = useServerSideEvent(
100117
(inMinutes, time, fileName) => {
101118
showNormalToast(`${fileName} will take approx ${time} ${inMinutes ? 'Min' : 'Sec'}`);
@@ -106,14 +123,55 @@ const Content: React.FC<ContentProps> = ({
106123
}
107124
);
108125
const childRef = useRef<ChildRef>(null);
109-
const incrementPage = async () => {
126+
const incrementPage = () => {
110127
setCurrentPage((prev) => prev + 1);
111-
await getChunks(documentName, currentPage + 1);
112128
};
113-
const decrementPage = async () => {
129+
const decrementPage = () => {
114130
setCurrentPage((prev) => prev - 1);
115-
await getChunks(documentName, currentPage - 1);
116131
};
132+
useEffect(() => {
133+
if (!init && !searchParams.has('connectURL')) {
134+
let session = localStorage.getItem('neo4j.connection');
135+
if (session) {
136+
let neo4jConnection = JSON.parse(session);
137+
setUserCredentials({
138+
uri: neo4jConnection.uri,
139+
userName: neo4jConnection.user,
140+
password: atob(neo4jConnection.password),
141+
database: neo4jConnection.database,
142+
port: neo4jConnection.uri.split(':')[2],
143+
});
144+
if (neo4jConnection.isgdsActive !== undefined) {
145+
setGdsActive(neo4jConnection.isgdsActive);
146+
}
147+
if (neo4jConnection.isReadOnlyUser !== undefined) {
148+
setIsReadOnlyUser(neo4jConnection.isReadOnlyUser);
149+
}
150+
} else {
151+
setOpenConnection((prev) => ({ ...prev, openPopUp: true }));
152+
}
153+
setInit(true);
154+
} else {
155+
setOpenConnection((prev) => ({ ...prev, openPopUp: true }));
156+
}
157+
}, []);
158+
useEffect(() => {
159+
if (currentPage >= 1) {
160+
(async () => {
161+
await getChunks(documentName, currentPage);
162+
})();
163+
}
164+
}, [currentPage, documentName]);
165+
useEffect(() => {
166+
setFilesData((prevfiles) => {
167+
return prevfiles.map((curfile) => {
168+
return {
169+
...curfile,
170+
model: curfile.status === 'New' || curfile.status === 'Reprocess' ? model : curfile.model,
171+
};
172+
});
173+
});
174+
}, [model]);
117175

118176
useEffect(() => {
119177
if (afterFirstRender) {
@@ -206,6 +264,15 @@ const Content: React.FC<ContentProps> = ({
206264
}
207265
toggleChunksLoading();
208266
};
267+
const getChunks = async (name: string, pageNo: number) => {
268+
toggleChunksLoading();
269+
const response = await getChunkText(userCredentials as UserCredentials, name, pageNo);
270+
setTextChunks(response.data.data.pageitems);
271+
if (!totalPageCount) {
272+
setTotalPageCount(response.data.data.total_pages);
273+
}
274+
toggleChunksLoading();
275+
};
209276
const extractData = async (uid: string, isselectedRows = false, filesTobeProcess: CustomFile[]) => {
210277
if (!isselectedRows) {
211278
const fileItem = filesData.find((f) => f.id == uid);
@@ -848,7 +915,7 @@ const Content: React.FC<ContentProps> = ({
848915
setTotalPageCount(null);
849916
}
850917
setCurrentPage(1);
851-
await getChunks(name, 1);
918+
// await getChunks(name, 1);
852919
}
853920
}}
854921
ref={childRef}
@@ -931,4 +998,4 @@ const Content: React.FC<ContentProps> = ({
931998
);
932999
};
9331000

934-
export default Content;
1001+
export default Content;

0 commit comments

Comments
 (0)