Commit ca86f26

Restructure examples folder (#146)
* Structure proposal
* Backup old examples in a specific folder (tmp)
* WIP: example folder structure refactoring
* ruff
* Add result formatter example
* LLM examples
* MistralAILLM example + doc
* Simple KG builder example
* Embeder examples
* Weaviate example
* Fix import for cohere embeddings
* Format
* Update README with links to new files
* Move Pinecone examples
* Can't remove this file yet - but remove link to this specific file from doc - need to keep the file until the next release but then remove
* Pinecone + cleaning
* Cleaning 'old' folder
* Components examples
* Test and harmonize retriever section
* Deal with qdrant examples - add custom component
* Nicer path definition
* Mypy/ruff
* Rename answer -> QA + add links
* Use pre_filters variable for explicitness
* ruff
* ruff
* Missing files for db operations
* Fix openai example
* Fix CI
* :'(
1 parent 47312c0 commit ca86f26

87 files changed, +3299 -888 lines changed

docs/source/user_guide_kg_builder.rst

+1 -1
@@ -33,7 +33,7 @@ A Knowledge Graph (KG) construction pipeline requires a few components:
 This package contains the interface and implementations for each of these components, which are detailed in the following sections.
 
 To see an end-to-end example of a Knowledge Graph construction pipeline,
-refer to `this example <https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/pipeline/kg_builder.py>`_.
+refer to the `example folder <https://github.com/neo4j/neo4j-graphrag-python/blob/main/examples/>`_ in the project GitHub repository.
 
 **********************************
 Knowledge Graph Builder Components

docs/source/user_guide_rag.rst

+26
@@ -78,6 +78,7 @@ If OpenAI cannot be used directly, there are a few available alternatives:
 - Use Azure OpenAI (GPT...).
 - Use Google VertexAI (Gemini...).
 - Use Anthropic LLM (Claude...).
+- Use Mistral LLM
 - Use Cohere.
 - Use a local Ollama model.
 - Implement a custom interface.
@@ -164,6 +165,31 @@ To use Anthropic, instantiate the `AnthropicLLM` class:
 See :ref:`anthropicllm`.
 
 
+Using MistralAI LLM
+-------------------
+
+To use MistralAI, instantiate the `MistralAILLM` class:
+
+.. code:: python
+
+    from neo4j_graphrag.llm import MistralAILLM
+
+    llm = MistralAILLM(
+        model_name="mistral-small-latest",
+        api_key=api_key, # can also set `MISTRAL_API_KEY` in env vars
+    )
+    llm.invoke("say something")
+
+
+.. note::
+
+    In order to run this code, the `mistralai` Python package needs to be installed:
+    `pip install mistralai`
+
+See :ref:`mistralaillm`.
+
+
+
 Using Cohere LLM
 ----------------
 
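The comment in the added snippet says the key can be passed as `api_key` or set as `MISTRAL_API_KEY` in the environment. A minimal sketch of that kind of fallback follows. The helper `resolve_api_key` is hypothetical and is not part of `neo4j_graphrag` or `mistralai`. The precedence shown (explicit argument first, then environment) is an assumption made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "MISTRAL_API_KEY",
) -> str:
    """Return the explicit key if given, otherwise the value of `var_name`.

    Hypothetical helper for illustration only. `env` defaults to the
    process environment and can be replaced by a plain dict in tests.
    """
    source = os.environ if env is None else env
    key = api_key or source.get(var_name)
    if not key:
        raise ValueError(f"No API key passed and {var_name} is not set")
    return key
```

Failing early with a clear message, as sketched here, tends to be easier to debug than an authentication error raised later by the remote service.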

examples/README.md

+132
@@ -0,0 +1,132 @@
+# Examples Index
+
+This folder contains examples usage for the different features
+supported by the `neo4j-graphrag` package:
+
+- [Build Knowledge Graph](#build-knowledge-graph) from PDF or text
+- [Retrieve](#retrieve) information from the graph
+- [Question Answering](#answer-graphrag) (Q&A)
+
+Each of these steps have many customization options which
+are listed in [the last section of this file](#customize).
+
+## Build Knowledge Graph
+
+- [End to end PDF to graph simple pipeline](build_graph/simple_kg_builder_from_pdf.py)
+- [End to end text to graph simple pipeline](build_graph/simple_kg_builder_from_text.py)
+
+
+## Retrieve
+
+- [Retriever from an embedding vector](retrieve/similarity_search_for_vector.py)
+- [Retriever from a text](retrieve/similarity_search_for_text.py)
+- [Graph-based retrieval with VectorCypherRetriever](retrieve/vector_cypher_retriever.py)
+- [Hybrid retriever](./retrieve/hybrid_retriever.py)
+- [Hybrid Cypher retriever](./retrieve/hybrid_cypher_retriever.py)
+- [Text2Cypher retriever](./retrieve/text2cypher_search.py)
+
+
+### External Retrievers
+
+#### Weaviate
+
+- [Vector search](customize/retrievers/external/weaviate/weaviate_vector_search.py)
+- [Text search with local embeder](customize/retrievers/external/weaviate/weaviate_text_search_local_embedder.py)
+- [Text search with remote embeder](customize/retrievers/external/weaviate/weaviate_text_search_remote_embedder.py)
+
+#### Pinecone
+
+- [Vector search](./customize/retrievers/external/pinecone/pinecone_vector_search.py)
+- [Text search](./customize/retrievers/external/pinecone/pinecone_text_search.py)
+
+
+### Qdrant
+
+- [Vector search](./customize/retrievers/external/qdrant/qdrant_vector_search.py)
+- [Text search](./customize/retrievers/external/qdrant/qdrant_text_search.py)
+
+
+## Answer: GraphRAG
+
+- [End to end GraphRAG](./answer/graphrag.py)
+
+
+## Customize
+
+### Retriever
+
+- [Control result format for VectorRetriever](customize/retrievers/result_formatter_vector_retriever.py)
+- [Control result format for VectorCypherRetriever](customize/retrievers/result_formatter_vector_cypher_retriever.py)
+
+
+### LLMs
+
+- [OpenAI (GPT)](./customize/llms/openai_llm.py)
+- [Azure OpenAI]()
+- [VertexAI (Gemini)](./customize/llms/vertexai_llm.py)
+- [MistralAI](./customize/llms/mistalai_llm.py)
+- [Cohere](./customize/llms/cohere_llm.py)
+- [Anthropic (Claude)](./customize/llms/anthropic_llm.py)
+- [Ollama]()
+- [Custom LLM](./customize/llms/custom_llm.py)
+
+
+### Prompts
+
+- [Using a custom prompt](old/graphrag_custom_prompt.py)
+
+
+### Embedders
+
+- [OpenAI](./customize/embeddings/openai_embeddings.py)
+- [Azure OpenAI](./customize/embeddings/azure_openai_embeddings.py)
+- [VertexAI](./customize/embeddings/vertexai_embeddings.py)
+- [MistralAI](./customize/embeddings/mistalai_embeddings.py)
+- [Cohere](./customize/embeddings/cohere_embeddings.py)
+- [Ollama](./customize/embeddings/ollama_embeddings.py)
+- [Custom LLM](./customize/embeddings/custom_embeddings.py)
+
+
+### KG Construction - Pipeline
+
+- [End to end example with explicit components and text input](./customize/build_graph/pipeline/kg_builder_from_text.py)
+- [End to end example with explicit components and PDF input](./customize/build_graph/pipeline/kg_builder_from_pdf.py)
+
+#### Components
+
+- Loaders:
+    - [Load PDF file](./customize/build_graph/components/loaders/pdf_loader.py)
+    - [Custom](./customize/build_graph/components/loaders/custom_loader.py)
+- Text Splitter:
+    - [Fixed size splitter](./customize/build_graph/components/splitters/fixed_size_splitter.py)
+    - [Splitter from LangChain](./customize/build_graph/components/splitters/langhchain_splitter.py)
+    - [Splitter from LLamaIndex](./customize/build_graph/components/splitters/llamaindex_splitter.py)
+    - [Custom](./customize/build_graph/components/splitters/custom_splitter.py)
+- [Chunk embedder]()
+- Schema Builder:
+    - [User-defined](./customize/build_graph/components/schema_builders/schema.py)
+- Entity Relation Extractor:
+    - [LLM-based](./customize/build_graph/components/extractors/llm_entity_relation_extractor.py)
+    - [LLM-based with custom prompt](./customize/build_graph/components/extractors/llm_entity_relation_extractor_with_custom_prompt.py)
+    - [Custom](./customize/build_graph/components/extractors/custom_extractor.py)
+- Knowledge Graph Writer:
+    - [Neo4j writer](./customize/build_graph/components/writers/neo4j_writer.py)
+    - [Custom](./customize/build_graph/components/writers/custom_writer.py)
+- Entity Resolver:
+    - [SinglePropertyExactMatchResolver](./customize/build_graph/components/resolvers/simple_entity_resolver.py)
+    - [SinglePropertyExactMatchResolver with pre-filter](./customize/build_graph/components/resolvers/simple_entity_resolver_pre_filter.py)
+    - [Custom resolver](./customize/build_graph/components/resolvers/custom_resolver.py)
+- [Custom component](./customize/build_graph/components/custom_component.py)
+
+
+### Answer: GraphRAG
+
+- [LangChain compatibility](./customize/answer/langchain_compatiblity.py)
+- [Use a custom prompt](./customize/answer/custom_prompt.py)
+
+
+## Database Operations
+
+- [Create vector index](database_operations/create_vector_index.py)
+- [Create full text index](create_fulltext_index.py)
+- [Populate vector index](populate_vector_index.py)

@@ -0,0 +1,73 @@
+"""This example illustrates how to get started easily with the SimpleKGPipeline
+and ingest PDF into a Neo4j Knowledge Graph.
+
+This example assumes a Neo4j db is up and running. Update the credentials below
+if needed.
+
+OPENAI_API_KEY needs to be in the env vars.
+"""
+
+import asyncio
+from pathlib import Path
+
+import neo4j
+from neo4j_graphrag.embeddings import OpenAIEmbeddings
+from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline
+from neo4j_graphrag.experimental.pipeline.pipeline import PipelineResult
+from neo4j_graphrag.llm import LLMInterface
+from neo4j_graphrag.llm.openai_llm import OpenAILLM
+
+# Neo4j db infos
+URI = "neo4j://localhost:7687"
+AUTH = ("neo4j", "password")
+DATABASE = "neo4j"
+
+
+root_dir = Path(__file__).parents[4]
+file_path = root_dir / "data" / "Harry Potter and the Chamber of Secrets Summary.pdf"
+
+
+# Instantiate Entity and Relation objects. This defines the
+# entities and relations the LLM will be looking for in the text.
+ENTITIES = ["Person", "Organization", "Location"]
+RELATIONS = ["SITUATED_AT", "INTERACTS", "LED_BY"]
+POTENTIAL_SCHEMA = [
+    ("Person", "SITUATED_AT", "Location"),
+    ("Person", "INTERACTS", "Person"),
+    ("Organization", "LED_BY", "Person"),
+]
+
+
+async def define_and_run_pipeline(
+    neo4j_driver: neo4j.Driver,
+    llm: LLMInterface,
+) -> PipelineResult:
+    # Create an instance of the SimpleKGPipeline
+    kg_builder = SimpleKGPipeline(
+        llm=llm,
+        driver=neo4j_driver,
+        embedder=OpenAIEmbeddings(),
+        entities=ENTITIES,
+        relations=RELATIONS,
+        potential_schema=POTENTIAL_SCHEMA,
+    )
+    return await kg_builder.run_async(file_path=str(file_path))
+
+
+async def main() -> PipelineResult:
+    llm = OpenAILLM(
+        model_name="gpt-4o",
+        model_params={
+            "max_tokens": 2000,
+            "response_format": {"type": "json_object"},
+        },
+    )
+    with neo4j.GraphDatabase.driver(URI, auth=AUTH, database=DATABASE) as driver:
+        res = await define_and_run_pipeline(driver, llm)
+    await llm.async_client.close()
+    return res
+
+
+if __name__ == "__main__":
+    res = asyncio.run(main())
+    print(res)
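The file above locates its PDF with `Path(__file__).parents[4]`, so the index has to match how deep the script sits below the directory that holds `data`. The sketch below shows how `parents` indexing behaves. The script path and the helper `data_file` are hypothetical and chosen only for illustration. `PurePosixPath` is used so the result does not depend on the local filesystem.

```python
from pathlib import PurePosixPath

# Hypothetical location of an example script, for illustration only.
script = PurePosixPath("/repo/examples/customize/build_graph/pipeline/example.py")

# parents[0] is the containing directory; each further index goes one level up.
for depth, parent in enumerate(script.parents):
    print(depth, parent)


def data_file(script_path: PurePosixPath, levels_up: int, name: str) -> PurePosixPath:
    """Build `<ancestor>/data/<name>`.

    The ancestor is `script_path.parents[levels_up]`, i.e. `levels_up`
    directories above the folder that contains the script.
    """
    return script_path.parents[levels_up] / "data" / name


# With the hypothetical path above, index 4 resolves to /repo.
print(data_file(script, 4, "example.pdf"))  # /repo/data/example.pdf
```

An index larger than the number of ancestors raises `IndexError`, which is a quick way to notice that a script has been moved to a shallower folder.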

@@ -0,0 +1,70 @@
+"""This example illustrates how to get started easily with the SimpleKGPipeline
+and ingest text into a Neo4j Knowledge Graph.
+
+This example assumes a Neo4j db is up and running. Update the credentials below
+if needed.
+"""
+
+import asyncio
+
+import neo4j
+from neo4j_graphrag.embeddings import OpenAIEmbeddings
+from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline
+from neo4j_graphrag.experimental.pipeline.pipeline import PipelineResult
+from neo4j_graphrag.llm import LLMInterface
+from neo4j_graphrag.llm.openai_llm import OpenAILLM
+
+# Neo4j db infos
+URI = "neo4j://localhost:7687"
+AUTH = ("neo4j", "password")
+DATABASE = "neo4j"
+
+# Text to process
+TEXT = """The son of Duke Leto Atreides and the Lady Jessica, Paul is the heir of House Atreides,
+an aristocratic family that rules the planet Caladan."""
+
+# Instantiate Entity and Relation objects. This defines the
+# entities and relations the LLM will be looking for in the text.
+ENTITIES = ["Person", "House", "Planet"]
+RELATIONS = ["PARENT_OF", "HEIR_OF", "RULES"]
+POTENTIAL_SCHEMA = [
+    ("Person", "PARENT_OF", "Person"),
+    ("Person", "HEIR_OF", "House"),
+    ("House", "RULES", "Planet"),
+]
+
+
+async def define_and_run_pipeline(
+    neo4j_driver: neo4j.Driver,
+    llm: LLMInterface,
+) -> PipelineResult:
+    # Create an instance of the SimpleKGPipeline
+    kg_builder = SimpleKGPipeline(
+        llm=llm,
+        driver=neo4j_driver,
+        embedder=OpenAIEmbeddings(),
+        entities=ENTITIES,
+        relations=RELATIONS,
+        potential_schema=POTENTIAL_SCHEMA,
+        from_pdf=False,
+    )
+    return await kg_builder.run_async(text=TEXT)
+
+
+async def main() -> PipelineResult:
+    llm = OpenAILLM(
+        model_name="gpt-4o",
+        model_params={
+            "max_tokens": 2000,
+            "response_format": {"type": "json_object"},
+        },
+    )
+    with neo4j.GraphDatabase.driver(URI, auth=AUTH, database=DATABASE) as driver:
+        res = await define_and_run_pipeline(driver, llm)
+    await llm.async_client.close()
+    return res
+
+
+if __name__ == "__main__":
+    res = asyncio.run(main())
+    print(res)
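Both example files declare `ENTITIES`, `RELATIONS` and a `POTENTIAL_SCHEMA` of `(source, relation, target)` triples that reuse those names. A small consistency check can catch a typo in a triple before any LLM call is made. The sketch below is a hypothetical helper, not part of `neo4j_graphrag`, and makes no claim about what `SimpleKGPipeline` validates internally.

```python
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]


def find_undeclared(
    entities: Sequence[str],
    relations: Sequence[str],
    potential_schema: Sequence[Triple],
) -> List[str]:
    """Return one message per schema element that was never declared.

    Hypothetical helper for illustration only.
    """
    known_entities = set(entities)
    known_relations = set(relations)
    problems: List[str] = []
    for source, relation, target in potential_schema:
        if source not in known_entities:
            problems.append(f"unknown source entity: {source}")
        if relation not in known_relations:
            problems.append(f"unknown relation: {relation}")
        if target not in known_entities:
            problems.append(f"unknown target entity: {target}")
    return problems


# Values copied from the text example above.
ENTITIES = ["Person", "House", "Planet"]
RELATIONS = ["PARENT_OF", "HEIR_OF", "RULES"]
POTENTIAL_SCHEMA = [
    ("Person", "PARENT_OF", "Person"),
    ("Person", "HEIR_OF", "House"),
    ("House", "RULES", "Planet"),
]

print(find_undeclared(ENTITIES, RELATIONS, POTENTIAL_SCHEMA))
```

Returning a list of messages rather than raising on the first problem lets a caller report every mismatch in one pass.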

examples/graphrag_custom_prompt.py renamed to examples/customize/answer/custom_prompt.py

+1 -15
@@ -8,31 +8,18 @@
 - Logging configuration
 """
 
-import logging
-
 import neo4j
 from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
 from neo4j_graphrag.generation import GraphRAG, RagTemplate
 from neo4j_graphrag.llm import OpenAILLM
 from neo4j_graphrag.retrievers import VectorCypherRetriever
-from neo4j_graphrag.types import RetrieverResultItem
 
 URI = "neo4j://localhost:7687"
 AUTH = ("neo4j", "password")
 DATABASE = "neo4j"
 INDEX = "moviePlotsEmbedding"
 
 
-# setup logger config
-logger = logging.getLogger("neo4j_graphrag")
-logging.basicConfig(format="%(asctime)s - %(message)s")
-logger.setLevel(logging.DEBUG)
-
-
-def formatter(record: neo4j.Record) -> RetrieverResultItem:
-    return RetrieverResultItem(content=f'{record.get("title")}: {record.get("plot")}')
-
-
 driver = neo4j.GraphDatabase.driver(
     URI,
     auth=AUTH,
@@ -44,8 +31,7 @@ def formatter(record: neo4j.Record) -> RetrieverResultItem:
 retriever = VectorCypherRetriever(
     driver,
     index_name=INDEX,
-    retrieval_query="with node, score return node.title as title, node.plot as plot",
-    result_formatter=formatter,
+    retrieval_query="WITH node, score RETURN node.title as title, node.plot as plot",
     embedder=embedder,
 )
 
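This diff removes the `formatter` callable that was passed as `result_formatter`. To see what that callable produced, the sketch below reruns its logic in isolation. A plain mapping stands in for `neo4j.Record` and a small dataclass stands in for `RetrieverResultItem`. Both stand-ins, and the sample record, are assumptions made so the snippet runs without a database.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class ResultItem:
    """Stand-in for `RetrieverResultItem`; only `content` is modelled."""

    content: str


def formatter(record: Mapping[str, Any]) -> ResultItem:
    # Same string layout as the removed function: "<title>: <plot>".
    return ResultItem(content=f'{record.get("title")}: {record.get("plot")}')


# Hypothetical record with the two fields returned by the retrieval query.
item = formatter({"title": "Some Movie", "plot": "A short plot summary."})
print(item.content)  # Some Movie: A short plot summary.
```

A missing key yields the string `None` in the output rather than an error, because `get` returns `None` by default.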
