
Local Deployment Tools for Open Source Large Models


Local LLM Notepad: A Portable Tool for Running Local Large Language Models Offline
Local LLM Notepad is an open source offline application that allows users to run Local Large Language Models on any Windows computer via a USB device without an Internet connection and without installation. Users simply copy a single executable file (EXE) and a model file (e.g. GGUF format) to a USB drive...
07-03 310kudos
llm.pdf: An experimental project to run a large language model in a PDF file
llm.pdf is an open source project that allows users to run Large Language Models (LLMs) directly in PDF files. This project, developed by EvanZhouDev and hosted on GitHub, demonstrates an innovative approach: compiling llama.cpp via Emscripten as ...
05-05 6580kudos
Aana SDK: An Open Source Tool for Easy Deployment of Multimodal AI Models
Aana SDK is an open source framework developed by Mobius Labs, named after the Malayalam word ആന (elephant). It helps developers quickly deploy and manage multimodal AI models that process text, images, audio, video, and other data. Aana SDK is based on the Ray distributed computing framework ...
03-25 8860kudos
BrowserAI: Running AI Models Locally in the Browser with WebGPU
BrowserAI is an open source tool that lets users run AI models locally, directly in the browser. It was developed by the Cloud-Code-AI team and supports models such as Llama, DeepSeek, and Kokoro. Users can complete text generation through the browser without a server or complex setup...
03-16 9520kudos
LitServe: Rapidly Deploying Enterprise-Grade Inference Services for General AI Models
LitServe is an open source AI model serving engine from Lightning AI, built on FastAPI and focused on rapidly deploying inference services for general-purpose AI models. It supports a wide range of scenarios, from large language models (LLMs), vision models, and audio models to classical machine learning models, providing batch...
03-10 8150kudos
Nexa: A compact multimodal AI solution that runs locally
Nexa AI is a platform focused on multimodal AI solutions that run locally. It offers a wide range of AI models, including Natural Language Processing (NLP), Computer Vision, Speech Recognition and Generation (ASR and TTS), all of which can be run locally on devices without relying on cloud-based services. This ...
02-01 1.2 K0kudos
vLLM: A memory-efficient LLM inference and serving engine
vLLM is a high-throughput, memory-efficient inference and serving engine for large language models (LLMs). Originally developed by the Sky Computing Lab at UC Berkeley, it has become a community project driven by academia and industry. vLLM aims to provide fast, easy...
01-17 1.1 K0kudos
Llama 3.2 Reasoning WebGPU: Running Llama-3.2 in a Browser
Transformers.js is a JavaScript library provided by Hugging Face designed to run state-of-the-art machine learning models directly in the browser without server support. The library is compatible with Hugging Face's Python version of transformer...
01-15 1.1 K0kudos
Harbor: a containerized toolset for easily managing and running AI services with one-click deployment of local LLM development environments
Harbor is a containerized LLM toolset focused on simplifying the deployment and management of local AI development environments. It enables developers to launch and manage all AI service components, including LLM backends, API interfaces, and front-end interfaces, with a single click through a clean command line interface (CLI) and companion application....
01-02 1.4 K0kudos
Xinference: Easy Distributed AI Model Deployment and Serving
Xorbits Inference (Xinference for short) is a powerful and versatile library focused on providing distributed deployment and serving of language models, speech recognition models, and multimodal models. With Xorbits Inference, users can easily deploy and serve their own models or built-in advanced models,...
01-02 9300kudos
AI Dev Gallery: A Windows Native AI Model Development Toolset for Integrating On-Device Models into Windows Applications
AI Dev Gallery is an AI development tools application from Microsoft (currently in public preview) designed for Windows developers. It provides a comprehensive platform to help developers easily integrate AI features into their Windows applications. The most notable feature of the tool is that it provides...
12-30 1.4 K0kudos
LightLLM: An Efficient Lightweight Framework for Inference and Serving of Large Language Models
LightLLM is a Python-based Large Language Model (LLM) inference and serving framework known for its lightweight design, ease of extension, and efficient performance. The framework leverages a variety of well-known open source implementations, including FasterTransformer, TGI, vLLM, and FlashAtten...
12-17 1.0 K0kudos
Transformers.js: Running nearly 700 large AI models locally in the browser
Transformers.js is a JavaScript library developed by Hugging Face to enable users to run state-of-the-art machine learning models directly in the browser without server support. The library is compatible with Hugging Face's Python trans...
12-02 1.4 K0kudos
GLM-Edge: Zhipu Releases On-Device Large Language Models and Multimodal Understanding Models for Mobile, Car, and PC Platforms
GLM-Edge is a series of large language models and multimodal understanding models designed for on-device use, from Tsinghua University (Zhipu Qingyan). These models include GLM-Edge-1.5B-Chat, GLM-Edge-4B-Chat, GLM-Edge-V-2B and GLM-Edge-V-5...
12-01 1.4 K0kudos
EXO: Running distributed AI clusters on idle home devices, with support for multiple inference engines and automated device discovery
Exo is an open source project that lets users run their own AI cluster on everyday devices (e.g. iPhone, iPad, Android, Mac, Linux, etc.). Through dynamic model partitioning and automated device discovery, Exo is able to unify multiple devices into a single powerful GPU, supporting multiple models such as LLaMA, Mistral...
11-28 2.2 K0kudos
LocalAI: An open source local AI deployment solution with support for multiple model architectures and a WebUI for unified management of models and APIs
LocalAI is an open source, locally run alternative to hosted AI services that aims to provide API interfaces compatible with OpenAI, Claude, and others. It runs on consumer-grade hardware, does not require a GPU, and can perform a wide range of tasks such as text, audio, video, and image generation, and voice cloning. LocalAI was developed by Ettore Di G...
11-28 1.8 K0kudos
llamafile: Distribute and run LLMs as a single file, with simplified deployment and cross-platform support
llamafile is a tool from the Mozilla Builders project designed to simplify the deployment and operation of large language models (LLMs). By combining llama.cpp with Cosmopolitan Libc, llamafile takes the complexity of LLM deployment through...
11-21 1.4 K0kudos
Petals: Run and fine-tune large language models on distributed, shared GPUs, pooling GPU resources like a BitTorrent network
Petals is an open source project developed by the BigScience Workshop to run Large Language Models (LLMs) through a distributed computing approach. Users can run and fine-tune LLMs at home using consumer-grade GPUs or Google Colab, e.g. Llama 3...
11-20 1.4 K0kudos
Aphrodite Engine: an efficient LLM inference engine that supports multiple quantization formats and distributed inference.
The Aphrodite Engine is the official backend engine for PygmalionAI, designed to provide an inference endpoint for PygmalionAI sites and to support the rapid deployment of Hugging Face-compatible models. The engine utilizes vLLM's Paged Attention technology to enable efficient K/...
11-20 1.3 K0kudos
