llama-cpp-agent

v4 Finalist

Build agents with llama.cpp, vLLM, and TGI

In this Hugging Face space, you can chat with a research agent that can access the web and will provide you with a research report on any topic you ask about.

About

The llama-cpp-agent framework is a tool designed to simplify interactions with Large Language Models (LLMs). It provides an interface for chatting with LLMs, executing function calls, generating structured output, performing retrieval-augmented generation (RAG), and processing text using agentic chains with tools. The framework uses guided sampling to constrain the model output to user-defined structures, so even models not fine-tuned for function calling or JSON output can produce them reliably. The framework is compatible with the llama.cpp server, llama-cpp-python and its server, and with the TGI and vLLM servers.

Key Features

- Simple Chat Interface: Engage in seamless conversations with LLMs.
- Structured Output: Generate structured output (objects) from LLMs.
- Single and Parallel Function Calling: Execute functions using LLMs.
- RAG (Retrieval-Augmented Generation): Perform retrieval-augmented generation with ColBERT reranking.
- Agent Chains: Process text using agent chains with tools, supporting Conversational, Sequential, and Mapping chains.
- Guided Sampling: Allows most 7B LLMs to do function calling and produce structured output.
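To illustrate the structured-output idea, here is a minimal stdlib-only sketch of the concept: a user-defined structure (a dataclass) and a validator that accepts a model's raw JSON text only if it matches that structure. This is a toy standing in for the framework's guided sampling, not the library's actual API; the `BookSummary` type, the `parse_structured_output` helper, and the hard-coded `raw_output` string are all hypothetical.

```python
import json
from dataclasses import dataclass, fields

# Hypothetical target structure: with guided sampling, the model
# would be constrained to emit JSON matching this shape.
@dataclass
class BookSummary:
    title: str
    author: str
    year: int

def parse_structured_output(raw, cls):
    """Validate a model's raw JSON text against a dataclass schema.

    Raises ValueError if a required field is missing or has the
    wrong type; returns an instance of `cls` otherwise.
    """
    data = json.loads(raw)
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            raise ValueError(f"missing field: {f.name}")
        if not isinstance(data[f.name], f.type):
            raise ValueError(f"wrong type for field: {f.name}")
        kwargs[f.name] = data[f.name]
    return cls(**kwargs)

# Stand-in for a guided-sampling generation; a real agent constrains
# the LLM so that its output always parses like this.
raw_output = '{"title": "Dune", "author": "Frank Herbert", "year": 1965}'
book = parse_structured_output(raw_output, BookSummary)
```

In the real framework, the constraint is enforced during generation rather than checked after the fact, which is what lets models without function-calling fine-tuning still produce valid objects.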