Merge pull request #1 from mongodb-developer/refresh-mongodb-langchain

codeSTACKr · web-flow · commit f49be975331c · 2024-05-14T08:30:05.000-05:00
Refresh langchain-mongodb
diff --git a/docs/5-vector-search/3-implementing-vector-search.mdx b/docs/5-vector-search/3-implementing-vector-search.mdx
@@ -6,8 +6,43 @@ There are two components that you will need to do to implement vector search int
 
 First, you will need to create this mathematical representation of your data. This is called vectorization. In the Create Vectors section, you will learn how to create vectors with some of the major encoders.
 
+You will use [LangChain JS](https://js.langchain.com/docs/get_started/introduction) to create vectors for your dataset. LangChain JS is a JavaScript framework whose embedding integrations turn text into vectors, so you can use it to vectorize both your dataset and your query.
+
+```javascript
+import { OpenAIEmbeddings } from "@langchain/openai";
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
+// ...
+
+// Embed each document in `docs` and store the text and vector in `collection`.
+await MongoDBAtlasVectorSearch.fromDocuments(docs, new OpenAIEmbeddings(), {
+  collection,
+  indexName: "vector_index",
+  textKey: "text",
+  embeddingKey: "embedding",
+});
+```
+
 ## Vectorize your query
 
 The second component is to vectorize your query. This is the same process as vectorizing your dataset, but instead of vectorizing a large dataset, you are vectorizing a single query.
 
 It is important to use the same encoder for both your dataset and your query. This is because the encoder learns a specific way to represent the data. If you use a different encoder, the vectors will be different and the search will not work.
+
+The search itself also goes through [LangChain JS](https://js.langchain.com/docs/get_started/introduction). The `MongoDBAtlasVectorSearch` class embeds the query with the encoder you pass in and runs the search against your collection.
+
+```javascript
+import { OpenAIEmbeddings } from "@langchain/openai";
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
+// ...
+// `dbConfig` holds the collection, index name, and field keys.
+const vectorStore = new MongoDBAtlasVectorSearch(
+  new OpenAIEmbeddings({ stripNewLines: true }),
+  dbConfig
+);
+
+const question = await req.text();
+const retriever = vectorStore.asRetriever({
+  searchType: "mmr", // Defaults to "similarity"
+  searchKwargs: { fetchK: 20, lambda: 0.1 },
+});
+```
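The `mmr` search type used in the hunk above re-ranks candidates with maximal marginal relevance. The sketch below is not LangChain's implementation; it is a minimal illustration on toy 2-D vectors, and `cosine`, `mmrSelect`, and the sample data are hypothetical names introduced here. It shows how a low `lambda` (such as the `0.1` above) favors diversity, while a `lambda` of `1` ranks by relevance alone.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy MMR: repeatedly pick the candidate that maximizes
// lambda * sim(query, doc) - (1 - lambda) * max sim(doc, alreadySelected).
function mmrSelect(query, candidates, k, lambda) {
  const selected = [];
  const remaining = candidates.map((_, i) => i);
  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    remaining.forEach((idx, pos) => {
      const relevance = cosine(query, candidates[idx]);
      const redundancy = selected.length === 0
        ? 0
        : Math.max(...selected.map((s) => cosine(candidates[idx], candidates[s])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });
    selected.push(remaining.splice(bestPos, 1)[0]);
  }
  return selected; // indexes into `candidates`, in pick order
}

// Toy data: candidates 0 and 1 are near-duplicates; candidate 2 points elsewhere.
const query = [1, 0];
const candidates = [[1, 0.1], [1, 0.12], [0.5, 0.9]];

const diverse = mmrSelect(query, candidates, 2, 0.1);  // [0, 2]: skips the near-duplicate
const relevant = mmrSelect(query, candidates, 2, 1.0); // [0, 1]: pure relevance
console.log(diverse, relevant);
```

In the retriever configuration, `fetchK` plays the role of the candidate pool size: the store fetches that many nearest neighbors first and only then applies the re-ranking.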
diff --git a/docs/5-vector-search/4-create-embeddings.mdx b/docs/5-vector-search/4-create-embeddings.mdx
@@ -7,8 +7,8 @@ If you look at the `package.js` file, you'll see the available scripts. One is c
 ```js
 import { promises as fsp } from "fs";
 import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
-import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
-import { OpenAIEmbeddings } from "langchain/embeddings/openai";
+import { OpenAIEmbeddings } from "@langchain/openai";
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
 import { MongoClient } from "mongodb";
 import "dotenv/config";
 
@@ -46,7 +46,7 @@ console.log("Done: Closing Connection");
 await client.close();
 ```
 
-This script is using the `langchain` library to create embeddings for each document in the `_workshop_assets/fake_docs` folder. It's using the `RecursiveCharacterTextSplitter` to split each document into chunks of 500 characters with a 50 character overlap. It's then using the `OpenAIEmbeddings` class to create embeddings for each chunk. Finally, it's using the `MongoDBAtlasVectorSearch` class to store the embeddings in a MongoDB Atlas cluster.
+This script uses the `langchain` [library](https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/langchain-js/) to create embeddings for each document in the `_workshop_assets/fake_docs` folder. It uses the `RecursiveCharacterTextSplitter` to split each document into chunks of 500 characters with a 50-character overlap, then uses the `OpenAIEmbeddings` class to create an embedding for each chunk. Finally, it uses the `MongoDBAtlasVectorSearch` class to store the embeddings in a MongoDB Atlas cluster.
 
 ## 🏃‍♀️ Run the script
 
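The chunking described in this file (500-character chunks with a 50-character overlap) can be illustrated with a fixed sliding window. This is a simplified sketch, not the `RecursiveCharacterTextSplitter` algorithm, which also tries to break on separators such as paragraphs and sentences; `chunkText` is a hypothetical helper introduced only for illustration.

```javascript
// Split `text` into windows of `chunkSize` characters; each window starts
// `chunkSize - chunkOverlap` characters after the previous one.
function chunkText(text, chunkSize = 500, chunkOverlap = 50) {
  if (chunkOverlap >= chunkSize) {
    throw new RangeError("chunkOverlap must be smaller than chunkSize");
  }
  const step = chunkSize - chunkOverlap;
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

const sample = "x".repeat(1200);
const sampleChunks = chunkText(sample);
console.log(sampleChunks.map((c) => c.length)); // [ 500, 500, 300 ]
```

The overlap means the last 50 characters of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still fully present in at least one chunk more often than with hard cuts.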
diff --git a/docs/5-vector-search/5-create-index.mdx b/docs/5-vector-search/5-create-index.mdx
@@ -21,20 +21,16 @@ This time, you will use the **JSON Editor** to create your index
 <Screenshot src="img/screenshots/7-vector-search/6-create-index/3-json-editor.png" alt="The 'Create Index' page with the 'JSON Editor' tab highlighted" url="https://cloud.mongodb.com" />
 
 
-Select your database and collection, leave the index name set to `default`, and add the following code in the JSON editor:
+Select your database and collection, leave the index name set to `vector_index`, and add the following code in the JSON editor:
 
 ```js
 {
-  "mappings": {
-    "dynamic": true,
-    "fields": {
-      "embedding": {
-        "dimensions": 1536,
-        "similarity": "cosine",
-        "type": "knnVector"
-      }
-    }
-  }
+  "fields": [{
+    "type": "vector",
+    "path": "embedding",
+    "numDimensions": 1536,
+    "similarity": "cosine"
+  }]
 }
 ```
 
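For context on what the new index definition enables: LangChain issues the search for you, but the same index can also be queried directly with the `$vectorSearch` aggregation stage. The sketch below builds such a pipeline using the index name, path, and dimension count from this file. `buildVectorSearchPipeline` is a hypothetical helper, the 10x `numCandidates` oversampling is an assumption, and the all-zero vector is only a placeholder for a real embedding.

```javascript
// Values taken from the index definition in this file.
const INDEX_NAME = "vector_index";
const VECTOR_PATH = "embedding";
const NUM_DIMENSIONS = 1536;

// Build an aggregation pipeline that queries the vector index directly.
function buildVectorSearchPipeline(queryVector, limit = 4) {
  // A query vector from a different encoder would typically fail this check.
  if (queryVector.length !== NUM_DIMENSIONS) {
    throw new RangeError(
      `expected ${NUM_DIMENSIONS} dimensions, got ${queryVector.length}`
    );
  }
  return [
    {
      $vectorSearch: {
        index: INDEX_NAME,
        path: VECTOR_PATH,
        queryVector,
        numCandidates: limit * 10, // assumption: oversample 10x before ranking
        limit,
      },
    },
    // Keep the text and expose the similarity score for each hit.
    { $project: { _id: 0, text: 1, score: { $meta: "vectorSearchScore" } } },
  ];
}

// Placeholder vector; a real one would come from the embedding model.
const pipeline = buildVectorSearchPipeline(new Array(NUM_DIMENSIONS).fill(0), 4);
// With a connected collection: await collection.aggregate(pipeline).toArray();
```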
diff --git a/docs/5-vector-search/6-update-routes.mdx b/docs/5-vector-search/6-update-routes.mdx
@@ -9,40 +9,40 @@ The default API route for our chats can be found in `api/chat/route.ts`.
 Before we modify that route, we'll implement a new route dedicated to handling Vector Search. Create a new file `api/vectorSearch/route.ts` and add the following code:
 
 ```ts
-import { OpenAIEmbeddings } from "langchain/embeddings/openai";
-import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
+import { OpenAIEmbeddings } from "@langchain/openai";
 import mongoClientPromise from '@/app/lib/mongodb';
+import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
 
 export async function POST(req: Request) {
-  const client = await mongoClientPromise;
-  const dbName = "docs";
-  const collectionName = "embeddings";
-  const collection = client.db(dbName).collection(collectionName);
+    const client = await mongoClientPromise;
+    const dbName = "docs";
+    const collectionName = "embeddings";
+    const collection = client.db(dbName).collection(collectionName);
+
+    const dbConfig = {
+        collection,
+        indexName: "vector_index", // The name of the Atlas search index to use.
+        textKey: "text", // Field name for the raw text content. Defaults to "text".
+        embeddingKey: "embedding", // Field name for the vector embeddings. Defaults to "embedding".
+    };
+
+    const vectorStore = new MongoDBAtlasVectorSearch(
+        new OpenAIEmbeddings({ stripNewLines: true }),
+        dbConfig
+    );
+
+    const question = await req.text();
+    const retriever = vectorStore.asRetriever({
+        searchType: "mmr", // Defaults to "similarity".
+        searchKwargs: {
+            fetchK: 20,
+            lambda: 0.1,
+        },
+    });
+
+    const retrievedResults = await retriever.getRelevantDocuments(question);
   
-  const question = await req.text();
-
-  const vectorStore = new MongoDBAtlasVectorSearch(
-    new OpenAIEmbeddings({
-      modelName: 'text-embedding-ada-002',
-      stripNewLines: true,
-    }), {
-    collection,
-    indexName: "default",
-    textKey: "text", 
-    embeddingKey: "embedding",
-  });
-
-  const retriever = vectorStore.asRetriever({
-    searchType: "mmr",
-    searchKwargs: {
-      fetchK: 20,
-      lambda: 0.1,
-    },
-  });
-  
-  const retrieverOutput = await retriever.getRelevantDocuments(question);
-  
-  return Response.json(retrieverOutput);
+    return Response.json(retrievedResults);
 }
 ```
 
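Because the route above reads the question with `req.text()`, a caller should send it as the raw request body rather than as JSON. The following is a minimal sketch of such a caller. The `searchDocs` helper, the `/api/vectorSearch` URL (inferred from the file path), and the assumption that each returned document carries a `pageContent` string are illustrative, not confirmed by this diff.

```javascript
// Send the question as plain text and return the text of each retrieved document.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function searchDocs(question, fetchImpl = fetch) {
  const response = await fetchImpl("/api/vectorSearch", {
    method: "POST",
    headers: { "Content-Type": "text/plain" },
    body: question, // raw text, matching req.text() in the route
  });
  if (!response.ok) {
    throw new Error(`Vector search failed with status ${response.status}`);
  }
  // Assumed shape: an array of documents, each with a `pageContent` string.
  const docs = await response.json();
  return docs.map((doc) => doc.pageContent);
}
```

A chat route could call `searchDocs(question)` and prepend the returned passages to the prompt as context before invoking the model.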
diff --git a/static/img/screenshots/7-vector-search/6-create-index/3-json-editor.png b/static/img/screenshots/7-vector-search/6-create-index/3-json-editor.png