
Commit f49be97

Merge pull request #1 from mongodb-developer/refresh-mongodb-langchain
Refresh langchain-mongodb
2 parents 4d1e104 + bade173 commit f49be97

5 files changed: +75 -44 lines changed

docs/5-vector-search/3-implementing-vector-search.mdx

Lines changed: 35 additions & 0 deletions
@@ -6,8 +6,43 @@ There are two components that you will need to do to implement vector search int

 First, you will need to create this mathematical representation of your data. This is called vectorization. In the Create Vectors section, you will learn how to create vectors with some of the major encoders.

+You will use [LangChain JS](https://js.langchain.com/docs/get_started/introduction) to create vectors for your dataset. LangChain JS is a JavaScript library that allows you to create vectors from text data. You can use it to create vectors for your dataset and for your query.
+```javascript
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
+...
+await MongoDBAtlasVectorSearch.fromDocuments(
+  docs,
+  new OpenAIEmbeddings(),
+  {
+    collection,
+    indexName: "vector_index",
+    textKey: "text",
+    embeddingKey: "embedding",
+  }
+);
+```
+
 ## Vectorize your query

 The second component is to vectorize your query. This is the same process as vectorizing your dataset, but instead of vectorizing a large dataset, you are vectorizing a single query.

 It is important to use the same encoder for both your dataset and your query. This is because the encoder learns a specific way to represent the data. If you use a different encoder, the vectors will be different and the search will not work.
+
+The query is performed by the [LangChain JS](https://js.langchain.com/docs/get_started/introduction) library. You can use the `MongoDBAtlasVectorSearch` class to perform the search.
+```javascript
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
+...
+const vectorStore = new MongoDBAtlasVectorSearch(
+  new OpenAIEmbeddings({
+    stripNewLines: true,
+  }), dbConfig);
+
+const question = await req.text();
+const retriever = await vectorStore.asRetriever({
+  searchType: "mmr", // Defaults to "similarity
+  searchKwargs: {
+    fetchK: 20,
+    lambda: 0.1,
+  },
+});
+```
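
The retriever options in this hunk (`searchType: "mmr"`, `fetchK`, `lambda`) ask for maximal marginal relevance: fetch a pool of candidates, then pick results that are relevant to the query but not redundant with each other. The following is a minimal sketch of that idea on toy 2-D vectors, not LangChain's implementation. The names `cosine` and `mmrSelect` are hypothetical, and the sketch assumes the common weighting in which a `lambda` near 0 favors diversity and a `lambda` near 1 favors similarity to the query.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: pick k candidate indices, trading relevance to the
// query against redundancy with the candidates already selected.
function mmrSelect(query, candidates, k, lambda) {
  const selected = [];
  const remaining = candidates.map((_, i) => i);
  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    remaining.forEach((idx, pos) => {
      const relevance = cosine(query, candidates[idx]);
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(candidates[idx], candidates[s])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });
    selected.push(remaining.splice(bestPos, 1)[0]);
  }
  return selected;
}

const query = [1, 0];
const candidates = [
  [1, 0], // identical to the query
  [0.99, 0.14], // near-duplicate of the first candidate
  [0.6, 0.8], // less similar to the query, but points elsewhere
];
const diverse = mmrSelect(query, candidates, 2, 0.1);
const relevant = mmrSelect(query, candidates, 2, 1);
console.log(diverse, relevant);
```

With these toy vectors, `lambda` of 0.1 selects indices 0 and 2 (skipping the near-duplicate), while `lambda` of 1 selects indices 0 and 1. In the retriever above, `fetchK: 20` plays the role of the candidate pool size.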

docs/5-vector-search/4-create-embeddings.mdx

Lines changed: 3 additions & 3 deletions
@@ -7,8 +7,8 @@ If you look at the `package.js` file, you'll see the available scripts. One is c
 ```js
 import { promises as fsp } from "fs";
 import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
-import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
-import { OpenAIEmbeddings } from "langchain/embeddings/openai";
+import { OpenAIEmbeddings } from "@langchain/openai";
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
 import { MongoClient } from "mongodb";
 import "dotenv/config";

@@ -46,7 +46,7 @@ console.log("Done: Closing Connection");
 await client.close();
 ```

-This script is using the `langchain` library to create embeddings for each document in the `_workshop_assets/fake_docs` folder. It's using the `RecursiveCharacterTextSplitter` to split each document into chunks of 500 characters with a 50 character overlap. It's then using the `OpenAIEmbeddings` class to create embeddings for each chunk. Finally, it's using the `MongoDBAtlasVectorSearch` class to store the embeddings in a MongoDB Atlas cluster.
+This script is using the `langchain` [library](https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/langchain-js/) to create embeddings for each document in the `_workshop_assets/fake_docs` folder. It's using the `RecursiveCharacterTextSplitter` to split each document into chunks of 500 characters with a 50 character overlap. It's then using the `OpenAIEmbeddings` class to create embeddings for each chunk. Finally, it's using the `MongoDBAtlasVectorSearch` class to store the embeddings in a MongoDB Atlas cluster.

 ## 🏃‍♀️ Run the script
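
The changed paragraph describes splitting each document into 500-character chunks with a 50-character overlap. As a rough illustration of what that overlap means, here is a simplified fixed-window sketch. The helper `chunkText` is hypothetical, and it is not how `RecursiveCharacterTextSplitter` works internally: the real splitter also prefers to break at separators such as paragraph and line breaks, so its chunk boundaries will generally differ.

```javascript
// Hypothetical helper: split text into fixed-size windows where each window
// repeats the last `overlap` characters of the previous one.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const step = chunkSize - overlap; // how far each window advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

// 1000 characters of repeating digits, so overlapping regions are easy to compare.
const sample = Array.from({ length: 1000 }, (_, i) => i % 10).join("");
const chunks = chunkText(sample);
console.log(chunks.map((c) => c.length));
```

For the 1000-character sample this yields three chunks of 500, 500, and 100 characters, and the last 50 characters of each chunk reappear at the start of the next.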

docs/5-vector-search/5-create-index.mdx

Lines changed: 7 additions & 11 deletions
@@ -21,20 +21,16 @@ This time, you will use the **JSON Editor** to create your index
 <Screenshot src="img/screenshots/7-vector-search/6-create-index/3-json-editor.png" alt="The 'Create Index' page with the 'JSON Editor' tab highlighted" url="https://cloud.mongodb.com" />


-Select your database and collection, leave the index name set to `default`, and add the following code in the JSON editor:
+Select your database and collection, leave the index name set to `vector_index`, and add the following code in the JSON editor:

 ```js
 {
-  "mappings": {
-    "dynamic": true,
-    "fields": {
-      "embedding": {
-        "dimensions": 1536,
-        "similarity": "cosine",
-        "type": "knnVector"
-      }
-    }
-  }
+  "fields": [{
+    "type": "vector",
+    "path": "embedding",
+    "numDimensions": 1536,
+    "similarity": "cosine"
+  }]
 }
 ```
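
The new definition declares a `vector` field at path `embedding` with 1536 dimensions, so the vectors stored in that field are expected to have exactly that many elements. A small sketch of a sanity check for this follows. Both helpers, `vectorIndexDefinition` and `fitsIndex`, are hypothetical names introduced for illustration, and the check only handles a top-level field path.

```javascript
// Hypothetical helper: build a definition in the same shape as the JSON
// editor example in this hunk.
function vectorIndexDefinition(path, numDimensions, similarity = "cosine") {
  return { fields: [{ type: "vector", path, numDimensions, similarity }] };
}

// Hypothetical helper: check that a document's embedding has the number of
// dimensions the index definition declares (top-level paths only).
function fitsIndex(doc, definition) {
  const field = definition.fields[0];
  const vector = doc[field.path];
  return Array.isArray(vector) && vector.length === field.numDimensions;
}

const definition = vectorIndexDefinition("embedding", 1536);
const goodDoc = { text: "hello", embedding: new Array(1536).fill(0.1) };
const badDoc = { text: "hello", embedding: new Array(768).fill(0.1) };
console.log(fitsIndex(goodDoc, definition)); // true
console.log(fitsIndex(badDoc, definition)); // false
```

A check like this can be run on a sample document before creating the index, which helps catch a mismatch between the embedding model's output size and `numDimensions`.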

docs/5-vector-search/6-update-routes.mdx

Lines changed: 30 additions & 30 deletions
@@ -9,40 +9,40 @@ The default API route for our chats can be found in `api/chat/route.ts`.
 Before we modify that route, we'll implement a new route dedicated to handling Vector Search. Create a new file `api/vectorSearch/route.ts` and add the following code:

 ```ts
-import { OpenAIEmbeddings } from "langchain/embeddings/openai";
-import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
+import { OpenAIEmbeddings } from "@langchain/openai";
 import mongoClientPromise from '@/app/lib/mongodb';
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";

 export async function POST(req: Request) {
-  const client = await mongoClientPromise;
-  const dbName = "docs";
-  const collectionName = "embeddings";
-  const collection = client.db(dbName).collection(collectionName);
+  const client = await mongoClientPromise;
+  const dbName = "docs";
+  const collectionName = "embeddings";
+  const collection = client.db(dbName).collection(collectionName);
+
+  const dbConfig = {
+    collection: collection,
+    indexName: "vector_index", // The name of the Atlas search index to use.
+    textKey: "text", // Field name for the raw text content. Defaults to "text".
+    embeddingKey: "embedding", // Field name for the vector embeddings. Defaults to "embedding".
+  };
+  const vectorStore = new MongoDBAtlasVectorSearch(
+    new OpenAIEmbeddings({
+      stripNewLines: true,
+    }), dbConfig);
+
+  const question = await req.text();
+  const retriever = await vectorStore.asRetriever({
+    searchType: "mmr", // Defaults to "similarity
+    searchKwargs: {
+      fetchK: 20,
+      lambda: 0.1,
+    },
+  });
+
+
+  const retrievedResults = await retriever.getRelevantDocuments(question)

-  const question = await req.text();
-
-  const vectorStore = new MongoDBAtlasVectorSearch(
-    new OpenAIEmbeddings({
-      modelName: 'text-embedding-ada-002',
-      stripNewLines: true,
-    }), {
-      collection,
-      indexName: "default",
-      textKey: "text",
-      embeddingKey: "embedding",
-    });
-
-  const retriever = vectorStore.asRetriever({
-    searchType: "mmr",
-    searchKwargs: {
-      fetchK: 20,
-      lambda: 0.1,
-    },
-  });
-
-  const retrieverOutput = await retriever.getRelevantDocuments(question);
-
-  return Response.json(retrieverOutput);
+  return Response.json(retrievedResults);
 }
 ```
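
Because the route reads the question with `req.text()`, a caller would send it as a plain-text POST body rather than JSON. The sketch below builds such a request. It assumes a runtime with a global `Request` (a browser or Node 18+), that the file location maps to the `/api/vectorSearch` path, and a local dev server at `http://localhost:3000`; the helper name `buildVectorSearchRequest` and the base URL are hypothetical.

```javascript
// Hypothetical helper: build the POST request the route expects, with the
// question sent as a plain-text body.
function buildVectorSearchRequest(question, baseUrl = "http://localhost:3000") {
  return new Request(`${baseUrl}/api/vectorSearch`, {
    method: "POST",
    headers: { "Content-Type": "text/plain" },
    body: question,
  });
}

const request = buildVectorSearchRequest("How do I create a vector index?");
console.log(request.method, request.url);

// Sending it requires the dev server to be running, for example:
//   const response = await fetch(request);
//   const documents = await response.json();
```

The response is whatever the route passes to `Response.json`, which in this diff is the array of retrieved documents.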
