NotebookLLM is a powerful open-source, AI-powered notebook system that can be deployed locally and respects your privacy.
Features include:
- AI + notes, with content integration supporting links, PDF, TXT, PPT, and YouTube.
- A locally deployed LLM system, also compatible with any LLM that exposes the OpenAI API.
- A podcast generator that works with the locally deployed LLM system.
- Notes in both Chinese and English, plus Chinese and English podcast generation with a rich set of podcast voices.
- Jun, independent developer
- Built on the open-source open-notebook project, extended with an integrated, locally deployable LLM.
- Podcast generation is built on a fork of podcastfy.
The notebookllm architecture diagram is shown below.

The locally deployed ipex-llm LLM service is built on the ipex-llm framework and FastAPI and is compatible with the OpenAI API. The supported endpoints are as follows:
| Feature | OpenAI Endpoint | OpenAI Model | ipex-llm Model and Description | Status |
|---|---|---|---|---|
| Chat | /v1/chat/completions | gpt-4o/gpt-4o-mini | Qwen/Qwen2.5-7B-Instruct | Done |
| Embeddings | /v1/embeddings | text-embedding-3-small/large | maidalun1020/bce-embedding-base_v1 | Done |
| Text to Speech | /v1/audio/speech | tts-1/tts-1-hd | microsoft/speecht5_tts, hexgrad/Kokoro-82M | Done |
| Podcast transcript generation | - | - | Generates the podcast transcript through the new ipex-llm OpenAI-compatible provider | Done |
| Podcast audio generation | - | - | Generates the podcast audio through podcastfy | Done |
Text to Speech is currently built on Microsoft's speecht5_tts model and the hexgrad/Kokoro-82M model.
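For reference, here is a minimal sketch of SpeechT5 synthesis with the transformers library. This is an illustration only, not the repository's actual tts.py; the speaker x-vector from the public CMU Arctic dataset is an assumption made for the demo:

import soundfile as sf
import torch
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Load the locally downloaded SpeechT5 model and HiFi-GAN vocoder
processor = SpeechT5Processor.from_pretrained("/models/microsoft/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("/models/microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("/models/microsoft/speecht5_hifigan")

inputs = processor(text="Hello from NotebookLLM.", return_tensors="pt")
# SpeechT5 needs a speaker embedding; this x-vector is just an example voice
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

speech = model.generate_speech(inputs["input_ids"], speaker, vocoder=vocoder)
sf.write("speech.wav", speech.numpy(), samplerate=16000)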
The open-notebook UI does not yet have a dedicated ipex-llm provider page; adding one is a candidate optimization. For now, the workaround is to use the OpenAI provider and set OPENAI_BASE_URL to the address of the ipex-llm LLM service.
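The same trick works for any OpenAI-compatible client. For example, a minimal Python check against the local service (assuming the openai package is installed and the service is listening on port 9000):

from openai import OpenAI

# Point the standard OpenAI client at the local ipex-llm service
client = OpenAI(base_url="http://127.0.0.1:9000/v1", api_key="sk-xxxx")

resp = client.chat.completions.create(
    model="qwen2.5-7b-instruct",
    messages=[{"role": "user", "content": "Hello, please introduce yourself"}],
)
print(resp.choices[0].message.content)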
The locally deployed ipex-llm LLM service is built on the ipex-llm framework and FastAPI, exposes an OpenAI-compatible API, and lets the whole LLM system run locally. It lives in the llm directory, with the following structure:
├── Dockerfile
├── Makefile
├── __init__.py
├── config.py
├── kokoro_tts.py
├── requirements.txt
├── service.py
└── tts.py
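For orientation, here is a heavily simplified sketch of how such a service can be written. This is an illustrative assumption, not the actual service.py; the real service also implements streaming, the embeddings endpoint, and the TTS endpoints:

import time

from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

MODEL_PATH = "/models/Qwen/Qwen2.5-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
# load_in_4bit=True applies ipex-llm low-bit quantization for efficient CPU inference
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, load_in_4bit=True,
                                             trust_remote_code=True)

app = FastAPI()

class ChatRequest(BaseModel):
    model: str
    messages: list
    temperature: float = 0.7
    max_tokens: int = 512

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    # Render the conversation with the model's chat template, then generate
    prompt = tokenizer.apply_chat_template(req.messages, tokenize=False,
                                           add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=req.max_tokens,
                            temperature=req.temperature, do_sample=True)
    answer = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                              skip_special_tokens=True)
    # Reply in the OpenAI chat.completion response shape
    now = int(time.time())
    return {
        "id": f"chatcmpl-{now}",
        "object": "chat.completion",
        "created": now,
        "model": req.model,
        "choices": [{"index": 0,
                     "message": {"role": "assistant", "content": answer},
                     "finish_reason": "stop"}],
    }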
The llm directory contains a Makefile:
make help

Usage:
  make <target>

General
  help              Display this help.
  image-llm         Build docker image with the llm.
  push-image-llm    Push llm images.
Build the ipex-llm image and push it to the Alibaba Cloud image registry:
GIT_VERSION=v0.0.4 make image-llm
GIT_VERSION=v0.0.4 make push-image-llm
Download the models into the /models directory with the Hugging Face CLI:
## Qwen2.5-7B-Instruct model
huggingface-cli download --resume-download Qwen/Qwen2.5-7B-Instruct --local-dir /models/Qwen/Qwen2.5-7B-Instruct
## bce-embedding-base_v1 model
huggingface-cli download --resume-download maidalun1020/bce-embedding-base_v1 --local-dir /models/maidalun1020/bce-embedding-base_v1
## speecht5_tts model
huggingface-cli download --resume-download microsoft/speecht5_tts --local-dir /models/microsoft/speecht5_tts
## speecht5_hifigan model
huggingface-cli download --resume-download microsoft/speecht5_hifigan --local-dir /models/microsoft/speecht5_hifigan
## Kokoro Voice model
huggingface-cli download --resume-download hexgrad/Kokoro-82M --local-dir /models/hexgrad/Kokoro-82M
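Equivalently, the downloads can be scripted with the huggingface_hub Python API (a hedged alternative to the CLI commands above):

from huggingface_hub import snapshot_download

# Mirror the CLI downloads above into /models/<org>/<name>
for repo in [
    "Qwen/Qwen2.5-7B-Instruct",
    "maidalun1020/bce-embedding-base_v1",
    "microsoft/speecht5_tts",
    "microsoft/speecht5_hifigan",
    "hexgrad/Kokoro-82M",
]:
    snapshot_download(repo_id=repo, local_dir=f"/models/{repo}")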
The docker directory contains a docker-compose.yaml file:
services:
  llm:
    image: registry.cn-hangzhou.aliyuncs.com/2456868764/ipex-llm:v0.0.3
    ports:
      - "9000:9000"
    volumes:
      - /models:/models
    pull_policy: always
  surrealdb:
    image: registry.cn-hangzhou.aliyuncs.com/2456868764/surrealdb:v2
    ports:
      - "8000:8000"
    volumes:
      - ./surreal_data:/mydata
    command: start --user root --pass root rocksdb:/mydata/mydatabase.db
    pull_policy: always
    user: root
  open_notebook:
    image: registry.cn-hangzhou.aliyuncs.com/2456868764/open_notebook:0.1.1
    ports:
      - "8080:8502"
    environment:
      - OPENAI_BASE_URL=http://llm:9000/v1
      - OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxx
      - SURREAL_ADDRESS=surrealdb
      - SURREAL_PORT=8000
      - SURREAL_USER=root
      - SURREAL_PASS=root
      - SURREAL_NAMESPACE=open_notebook
      - SURREAL_DATABASE=open_notebook
    depends_on:
      - surrealdb
    pull_policy: always
    volumes:
      - ./notebook_data:/app/data
networks:
  default:
    driver: bridge
Start the containers:
docker-compose up -d
Testing was done on an Alibaba Cloud ECS instance:
- Instance type: ecs.c8i.3xlarge
- CPU & memory: 12 vCPUs, 24 GiB
- Processor: Intel® Xeon® Emerald Rapids or Intel® Xeon® Sapphire Rapids, with a base frequency of at least 2.7 GHz and an all-core turbo of 3.2 GHz, for stable compute performance.
Once the containers are up, open http://localhost:8080/ in a browser to access the WebUI:
- http://127.0.0.1:8080/Notebooks
- http://127.0.0.1:8080/Ask_and_Search
- http://127.0.0.1:8080/Podcasts
Test the /v1/chat/completions endpoint (the example below disables streaming, matching the response shown):

curl -X 'POST' \
  'http://localhost:9000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "你好,请介绍一下自己"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 100,
    "stream": false
  }'
{"id":"chatcmpl-1736399818","object":"chat.completion","created":1736399818,"model":"qwen2.5-7b-instruct","choices":[{"index":0,"message":{"role":"assistant","content":"你好!我是Qwen,由阿里云开发的人工智能模型。我能够帮助用户生成各种类型的文本,如文章、故事、诗歌、故事等,并且也可以回答问题、参与对话等。很高兴为你服务!如果你有任何问题或需要帮助,随时都可以向我提问。"},"finish_reason":"stop"}],"usage":{"prompt_tokens":33,"completion_tokens":61,"total_tokens":94}}
Test the /v1/embeddings endpoint:

curl -X POST \
  http://localhost:9000/v1/embeddings \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "第一个测试句子"
    ]
  }'
{"object":"list","data":[{"object":"embedding","embedding":[-0.0017389243002980947,-0.02127007581293583,-0.0169060360640...],"index":0}],"model":"text-embedding-3-small","usage":{"prompt_tokens":7,"total_tokens":7}}
Test the /v1/audio/speech endpoint; the synthesized audio is saved to speech_echo.mp3:

curl -X POST \
  'http://localhost:9000/v1/audio/speech' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "tts-1",
    "input": "Istio is an open source service mesh that layers transparently onto existing distributed applications. Istio’s powerful features provide a uniform and more efficient way to secure, connect, and monitor services. Istio is the path to load balancing, service-to-service authentication, and monitoring – with few or no service code changes. ",
    "voice": "echo",
    "response_format": "mp3"
  }' \
  --output speech_echo.mp3
The podcastfy fork lives on the feat-openai branch of my personal GitHub podcastfy repository. The following test code calls the locally deployed LLM to generate a podcast transcript:
from podcastfy.client import generate_podcast
import os

if __name__ == "__main__":
    text = """Istio is an open source service mesh that layers transparently onto existing distributed applications. Istio’s powerful features provide a uniform and more efficient way to secure, connect, and monitor services. Istio is the path to load balancing, service-to-service authentication, and monitoring – with few or no service code changes. It gives you:
Secure service-to-service communication in a cluster with mutual TLS encryption, strong identity-based authentication and authorization
Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic
Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection
A pluggable policy layer and configuration API supporting access controls, rate limits and quotas
Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress
Istio is designed for extensibility and can handle a diverse range of deployment needs. Istio’s control plane runs on Kubernetes, and you can add applications deployed in that cluster to your mesh, extend the mesh to other clusters, or even connect VMs or other endpoints running outside of Kubernetes.
A large ecosystem of contributors, partners, integrations, and distributors extend and leverage Istio for a wide variety of scenarios. You can install Istio yourself, or a large number of vendors have products that integrate Istio and manage it for you.
How it works
Istio uses a proxy to intercept all your network traffic, allowing a broad set of application-aware features based on configuration you set.
The control plane takes your desired configuration, and its view of the services, and dynamically programs the proxy servers, updating them as the rules or the environment changes.
The data plane is the communication between services. Without a service mesh, the network doesn’t understand the traffic being sent over, and can’t make any decisions based on what type of traffic it is, or who it is from or to.
Istio supports two data plane modes:
sidecar mode, which deploys an Envoy proxy along with each pod that you start in your cluster, or running alongside services running on VMs.
ambient mode, which uses a per-node Layer 4 proxy, and optionally a per-namespace Envoy proxy for Layer 7 features."""

    # Route podcastfy's OpenAI calls to the locally deployed ipex-llm service
    os.environ['OPENAI_API_KEY'] = 'sk-xxxx'
    os.environ['OPENAI_BASE_URL'] = "http://127.0.0.1:9000/v1"

    # transcript_only=True generates the transcript without synthesizing audio
    transcript_file = generate_podcast(text=text, tts_model="openai",
                                       api_key_label="OPENAI_API_KEY",
                                       llm_model_name="gpt-4o", transcript_only=True)
    print(transcript_file)
The generated podcast transcript:
Welcome to PODCASTIFY - Your Personal Generative AI Podcast. Today, we're diving into the world of Istio, an open-source service mesh that simplifies and enhances the management of distributed applications. Person1, what do you think is the most exciting feature of Istio?
<Person1> Oh wow, I mean, look at the power it packs! Imagine having secure service-to-service communication right out of the box with mutual TLS encryption. That’s like having a digital handshake that confirms everyone’s identity before they exchange data. It’s not just about connecting services; it’s about making sure those connections are secure.
</Person1><Person2> Absolutely, Person1! And let’s not forget automatic load balancing. This is a game-changer for handling high volumes of traffic without needing to rewrite any code. It’s like having a traffic cop directing cars through a busy city intersection without ever slowing down the flow.
</Person2><Person1> Right! And with fine-grained control over traffic behavior—think of it as having the power to reroute traffic based on rules. It’s like having a smart navigation system that knows when to take you a different route to avoid traffic jams. And let’s talk about the pluggable policy layer. This means you can tailor Istio to fit your exact needs, whether it’s controlling access, setting rate limits, or managing quotas.
</Person1><Person2> Exactly! And the best part? All of this comes with automatic metrics, logs, and traces for monitoring everything that happens within your cluster. It’s like having a dashboard that shows you exactly how well your applications are performing in real-time.
</Person2><Person1> That’s the beauty of it! But wait, there’s more. Istio isn’t just a tool for Kubernetes. It’s designed to be highly extensible and can work with a wide range of deployment needs, including VMs and other endpoints outside of Kubernetes.
</Person1><Person2> And let’s not forget the ecosystem. There’s a huge community backing Istio with contributions, partnerships, and integrations. Whether you’re installing it yourself or letting a vendor manage it for you, there’s always someone who can help.
</Person2><Person1> True, true. But how does Istio actually work under the hood? Let’s break it down. So, Istio uses a proxy to intercept all your network traffic. Think of it as a digital intermediary that makes all the magic happen.
</Person1><Person2> Correct! And the control plane takes your desired configuration and updates the proxies dynamically. It’s like a smart thermostat that learns your preferences and adjusts the temperature accordingly.
</Person2><Person1> And then there’s the data plane—the actual communication between services. Without a service mesh like Istio, the network wouldn’t understand the type of traffic being sent or who it’s coming from or going to. It’s like having a dark city where no one knows the rules.
</Person1><Person2> And to top it all off, Istio supports two modes for deploying its proxies: sidecar mode, where it runs alongside services in pods, and ambient mode, which uses a per-node Layer 4 proxy for general communication and a per-namespace Layer 7 proxy for more granular control.
</Person2><Person1> Exactly! It’s all about giving you the flexibility and control you need to manage your services efficiently. Thanks for joining us on this journey through the world of Istio!
</Person1><Person2> Thanks for listening! Join us next time for more insights on cutting-edge tech. Until then, happy coding!</Person2>
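From here, the same fork can render the transcript to audio by passing transcript_file back to generate_podcast. A sketch, assuming the same environment variables as the script above so that TTS goes through the local /v1/audio/speech endpoint:

# Continues from the script above: synthesize audio from the saved transcript
audio_file = generate_podcast(transcript_file=transcript_file, tts_model="openai")
print(audio_file)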
Below is a sample podcast about Istio and DeepSeek whose audio was generated with the Kokoro-82M model.