PRAG: A Parametric Retrieval-Augmented Generation Tool for Improving the Performance of Q&A Systems

2025-02-06

General Introduction

PRAG (Parametric Retrieval-Augmented Generation) is a retrieval-augmented generation tool that embeds external knowledge directly into the parameter space of a Large Language Model (LLM) instead of appending it to the input context. By integrating external knowledge at the parameter level, it aims to avoid the limitations of traditional in-context retrieval-augmented generation, reduce computational overhead, and improve the model's reasoning and synthesis capabilities. PRAG provides an end-to-end implementation, including a data augmentation module, a parameter training module, and an inference module, for evaluating performance on various question-answering datasets.

Feature List

  • Data Augmentation Module: Converts documents into augmented datasets.
  • Parameter Training Module: Trains additional LoRA parameters to produce a parametric representation of each document.
  • Inference Module: Merges the parametric representations of the relevant documents and inserts them into the LLM for inference.
  • Environment Installation: Provides detailed installation steps and dependencies.
  • Custom Data Augmentation: Supports either using the pre-augmented data files directly or running data augmentation yourself.
  • Retrieval Preparation: Downloads and prepares the Wikipedia dataset for retrieval.

 

Usage Guide

Environment Installation

  1. Create and activate a virtual environment:
     conda create -n prag python=3.10.4
     conda activate prag
  2. Install the necessary dependencies:
     pip install torch==2.1.0
     pip install -r requirements.txt
  3. Edit src/root_dir_path.py and set the ROOT_DIR variable to the path of the folder where PRAG is stored.
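
Before running anything else, it can help to confirm that ROOT_DIR points at the right folder. The sketch below is not part of PRAG: the helper name `check_root_dir` and the list of expected entries are assumptions for illustration, and the path shown is a placeholder to be replaced with your own.

```python
import os

# Placeholder value: replace with the folder where PRAG is stored.
ROOT_DIR = "/path/to/PRAG"


def check_root_dir(root_dir, required=("src", "requirements.txt")):
    """Return the expected entries that are missing from root_dir.

    The `required` names are an assumption based on the files mentioned
    in the installation steps, not a list defined by PRAG itself.
    """
    if not os.path.isdir(root_dir):
        raise FileNotFoundError(f"ROOT_DIR does not exist: {root_dir}")
    return [
        name
        for name in required
        if not os.path.exists(os.path.join(root_dir, name))
    ]
```

An empty list from `check_root_dir(ROOT_DIR)` suggests the path is plausible; any names returned are entries that were not found under it.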

Data Augmentation

  1. Use the pre-augmented data files:
     tar -xzvf data_aug.tar.gz
  2. Or run data augmentation yourself:
    • Download the Wikipedia dataset:
      mkdir -p data/dpr
      wget -O data/dpr/psgs_w100.tsv.gz https://dl.fbaipublicfiles.com/dpr/wikipedia_split/psgs_w100.tsv.gz
    • Prepare BM25 retrieval:
      # For the specific steps, refer to the project documentation
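
To get a feel for the downloaded passages and for what BM25 ranking does, here is a small self-contained sketch. It is not PRAG's retrieval setup, which is described in the project documentation. It assumes the DPR passage file is a gzipped TSV with a header row and the columns id, text, title; the function names `read_passages` and `bm25_rank`, the whitespace tokenizer, and the `k1` and `b` defaults are illustrative choices.

```python
import csv
import gzip
import math
from collections import Counter


def read_passages(path, limit=None):
    """Yield (id, title, text) tuples from a DPR-style gzipped TSV file."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader)  # skip the header row (assumed: id, text, title)
        for i, row in enumerate(reader):
            if limit is not None and i >= limit:
                break
            pid, text, title = row[0], row[1], row[2]
            yield pid, title, text


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices ordered by a BM25 score, best first."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)
```

For example, `list(read_passages("data/dpr/psgs_w100.tsv.gz", limit=1000))` loads only the first thousand passages, which keeps experiments light given the size of the full file. A real deployment would normally use a search engine or index rather than scoring every passage in Python.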

Parameter Training

  1. Generate a parametric representation of each document:
     # For the specific steps, refer to the project documentation
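
LoRA, which the parameter training module relies on, stores a weight update as the product of two small matrices rather than as a full matrix. The sketch below is a pure-Python illustration of that idea, not PRAG's training code. The function names `lora_delta` and `lora_param_count`, the `alpha / rank` scaling convention, and any dimensions used with them are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def lora_delta(B, A, alpha, rank):
    """Compute the low-rank weight update (alpha / rank) * B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in), so the
    result has the same shape as the full weight matrix (d_out, d_in).
    """
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(B, A)]


def lora_param_count(d_out, d_in, rank):
    """Number of trainable values in one LoRA adapter for one matrix."""
    return rank * (d_out + d_in)
```

For a hypothetical 4096 x 4096 weight matrix, `lora_param_count(4096, 4096, 2)` gives 16384 values, compared with 16777216 in the full matrix. That size difference is what makes it practical to keep a separate set of parameters per document.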

Inference

  1. Merge the parametric representations of the relevant documents and insert them into the LLM for inference:
     # For the specific steps, refer to the project documentation
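
One simple way to picture the merge step is to add the weight updates of the retrieved documents together and apply the result to the base weights. The sketch below shows that on plain nested lists. Summation is an assumed merging strategy used here for illustration, and the names `merge_deltas` and `apply_delta` are hypothetical; consult the project documentation for what PRAG actually does.

```python
def merge_deltas(deltas, weight=1.0):
    """Sum several same-shaped weight updates element by element.

    `weight` is an optional global scale applied to the merged update.
    """
    if not deltas:
        raise ValueError("at least one delta is required")
    rows, cols = len(deltas[0]), len(deltas[0][0])
    return [
        [weight * sum(d[r][c] for d in deltas) for c in range(cols)]
        for r in range(rows)
    ]


def apply_delta(base, delta):
    """Return base + delta without modifying the base weights."""
    return [
        [b + d for b, d in zip(base_row, delta_row)]
        for base_row, delta_row in zip(base, delta)
    ]


# Toy example: two per-document updates merged into a 2 x 2 base matrix.
base = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.5, 0.0], [0.0, 0.0]]
doc_b = [[0.0, 0.25], [0.0, 0.0]]
merged_weights = apply_delta(base, merge_deltas([doc_a, doc_b]))
```

Because `apply_delta` returns a new matrix, the base weights stay unchanged and a different set of documents can be merged for the next question.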
