NotebookLlama is a fully open-source tool, built on LlamaCloud, that helps users manage documents and generate podcast-like audio content. It is an alternative to Google NotebookLM for researchers, students, and business users. Users can upload documents, create knowledge bases, and extract key information through automated analysis. NotebookLlama also converts document content into natural-sounding audio, so users can listen to information on the go. The project is hosted on GitHub, with open code, community support, and a documented installation process for tech enthusiasts and professionals.
Function List
- Document Upload and Management: Support for uploading documents in multiple formats (e.g. PDF) to build individual or team knowledge bases.
- Knowledge Extraction and Summarization: Automatically analyze documents, extract core content and generate summaries through LlamaCloud technology.
- Audio Generation: Convert document content into podcast-like audio with support for natural speech output.
- Open Source and Customizable: The code is completely open source, so users can modify or extend it as needed.
- Multi-Platform Support: Runs via Docker and Streamlit and supports local or cloud deployments.
- Intelligent Search: Provides intelligent search based on document content to quickly locate information.
Usage Guide
Installation process
To use NotebookLlama, users need to complete the installation and configuration first. Below are the detailed installation steps:
- Clone the codebase
Run the following command in the terminal to clone the NotebookLlama project locally:
git clone https://github.com/run-llama/notebookllama
Go to the project directory:
cd notebookllama/
- Install dependencies
Use the uv tool to install the necessary dependency packages:
uv sync
Make sure that Python and uv are installed. If they are not, install Python 3.8 or above first, then install uv with:
pip install uv
- Configure API keys
The project requires three API keys: OpenAI, ElevenLabs, and LlamaCloud. The steps are as follows:
  - Open the .env.example file in the project directory.
  - Get the API keys:
    - OPENAI_API_KEY: Log in to the OpenAI platform and generate a key in the account settings.
    - ELEVENLABS_API_KEY: Get it on the Settings page of the ElevenLabs website.
    - LLAMACLOUD_API_KEY: Visit the LlamaCloud dashboard to get the key.
  - Fill the keys into the .env.example file and then rename the file:
mv .env.example .env
- Run the initialization scripts
Execute the following commands to create the LlamaCloud extraction agent and index:
uv run tools/create_llama_extract_agent.py
uv run tools/create_llama_cloud_index.py
- Start the services
Start the Postgres and Jaeger services with Docker:
docker compose up -d
Start the MCP server:
uv run src/notebookllama/server.py
- Run the Streamlit application
Launch the Streamlit front-end interface:
streamlit run src/notebookllama/Home.py
Install ffmpeg (if not already installed) to support audio functionality:
  - On Ubuntu:
sudo apt-get install ffmpeg
  - On macOS:
brew install ffmpeg
- Access the application
Open your browser and visit http://localhost:8751/
You can now start using NotebookLlama.
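If one of the steps above fails with an authentication error, a quick first check is whether the .env file actually defines all three keys. The following is a minimal sketch and not part of the NotebookLlama repository: the function names parse_env and missing_keys are illustrative, and it assumes the file uses plain KEY=VALUE lines.

```python
from pathlib import Path

# The three key names come from the API key configuration step above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY", "LLAMACLOUD_API_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text):
    """Return the required keys that are absent or empty, in a fixed order."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; rename .env.example to .env first.")
    else:
        missing = missing_keys(env_path.read_text())
        print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run it from the project directory. It only checks that each key has a non-empty value; it does not verify that the keys are valid with the respective services.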
Main Functions
Document uploading and knowledge base creation
- Procedure:
  - Log in to the Streamlit interface and click the "Upload Document" button.
  - Select a PDF or other supported document format to upload to the system.
  - The system automatically parses the document content and incorporates it into the knowledge base.
- Features:
  - Supports batch uploading, suitable for processing large amounts of research material.
  - Document content is automatically indexed for subsequent search and analysis.
Knowledge Extraction and Summarization
- Procedure:
  - Select the uploaded document in the interface.
  - Click the "Extract Information" or "Generate Summary" button.
  - The system analyzes the document and outputs key points, summaries, or Q&A content.
- Features:
  - Intelligent analysis based on LlamaCloud for accurate and concise extraction.
  - Supports a user-defined extraction scope, e.g., extracting only a certain chapter.
Audio Generation
- Procedure:
  - Select the document or summary content for which you need to generate audio.
  - Click the "Generate Podcast" button; the system calls the ElevenLabs API to convert the text to speech.
  - Download the generated audio file or play it directly online.
- Features:
  - The audio is natural and smooth, close to a human-hosted podcast.
  - Supports multi-language voice output, suitable for internationalization needs.
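To give a rough idea of what a text-to-speech call to ElevenLabs involves, the sketch below builds (but does not send) a request to the public ElevenLabs REST endpoint. This is an illustration only and not NotebookLlama's own code, which may well use the ElevenLabs SDK instead. The function name build_tts_request, the placeholder voice ID, and the choice of model are assumptions.

```python
import json
import urllib.request

# Assumption: the public ElevenLabs REST text-to-speech endpoint.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Build a POST request for text-to-speech without sending it."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{voice_id}",
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,  # the ELEVENLABS_API_KEY value
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
    )


if __name__ == "__main__":
    # Placeholder values: substitute a real voice ID and API key.
    req = build_tts_request("Hello from my notes.", "YOUR_VOICE_ID", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # Sending the request uses paid API credits, so it is left commented out:
    # with urllib.request.urlopen(req) as resp, open("out.mp3", "wb") as f:
    #     f.write(resp.read())
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before any API credits are spent.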
Intelligent Search
- Procedure:
  - Enter a keyword or question in the interface.
  - The system returns relevant document fragments or answers.
- Features:
  - Search results are drawn from the document content, which keeps answers relevant.
  - Supports complex queries, such as "summarize the topic of a document".
Notes
- Ensure that the network connection is stable, since the API calls require internet access.
- If audio generation fails, check whether ffmpeg is properly installed.
- Regularly update the codebase to get the latest features:
git pull origin main
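The ffmpeg check mentioned above can be scripted. The following minimal sketch, which is not part of the project, looks up the command-line tools named in the installation steps on the PATH. The function name find_missing_tools and the exact tool list are illustrative assumptions.

```python
import shutil

# Tools named in the installation steps; adjust the list to your setup.
REQUIRED_TOOLS = ("uv", "docker", "ffmpeg")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

The lookup function is passed in as a parameter only so the check can be exercised without the tools installed. Finding a tool on PATH does not guarantee that it works, so running a command such as ffmpeg -version is still a useful follow-up.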
Application Scenarios
- Academic research
Researchers can upload academic papers to quickly extract key information or generate summaries. The audio feature is suitable for listening to the content of a paper while commuting.
- Business analysis
Enterprise users can upload market reports or internal documents to build a knowledge base. The intelligent search and summarization features help locate key data quickly to aid decision-making.
- Education
Students can upload textbooks or handouts to generate summaries or audio for easy review. The audio feature is especially suitable for auditory learners.
- Content creation
Podcast creators can convert articles or notes to audio to quickly generate podcast content and save recording time.
FAQ
- What document formats does NotebookLlama support?
It currently supports PDF, TXT, and other common formats, and may be expanded to more formats in the future.
- Do I need to pay to use the APIs?
Yes, the APIs for OpenAI, ElevenLabs, and LlamaCloud require their respective paid accounts. Users need to register and obtain the keys themselves.
- Does local deployment require high-performance hardware?
A typical home computer (8GB RAM, 4-core CPU) can run it. A Docker deployment requires about 10GB of disk space.
- How good is the generated speech?
The voices, provided by ElevenLabs, are close to the level of a human announcer and support multiple languages and tones.