jerryjliu / gpt_index
- Wednesday, December 7, 2022, 00:39:14
An index created by GPT to organize external information and answer queries!
GPT Index is a project consisting of a set of data structures that are created using GPT-3 and can be traversed using GPT-3 in order to answer queries.
That's where the GPT Index data structures come in. Instead of relying on world knowledge encoded in the model weights, a GPT Index data structure does the following:
The high-level design exercise of this project is to test the capability of GPT-3 as a general-purpose processor to organize and retrieve data. From our current understanding, related works have used GPT-3 to reason with external db sources (see below); this work links reasoning with knowledge building.
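To make the idea of an index that is built and then traversed by a language model more concrete, here is a toy sketch. It is not the GPT Index implementation: `Node`, `build_tree`, `query_tree`, `toy_summarize`, and `toy_choose` are hypothetical names, and the two `toy_*` functions are simple stand-ins for the places where a real system would prompt GPT-3 (to summarize a group of nodes, and to pick the child most relevant to a question).

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    text: str  # leaf: a raw text chunk; parent: a summary of its children
    children: List["Node"] = field(default_factory=list)


def build_tree(chunks: List[str],
               summarize: Callable[[List[str]], str],
               num_children: int = 2) -> Node:
    """Build a tree bottom-up by grouping nodes and summarizing each group."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if num_children < 2:
        raise ValueError("num_children must be at least 2")
    level = [Node(text=c) for c in chunks]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), num_children):
            group = level[i:i + num_children]
            parents.append(Node(text=summarize([n.text for n in group]),
                                children=group))
        level = parents
    return level[0]


def query_tree(root: Node, question: str,
               choose: Callable[[str, List[str]], int]) -> str:
    """Walk from the root to a leaf, asking `choose` which child to follow."""
    node = root
    while node.children:
        idx = choose(question, [c.text for c in node.children])
        node = node.children[idx]
    return node.text


def _words(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def toy_summarize(texts: List[str]) -> str:
    # Assumption: stands in for an LLM summary by concatenating child texts.
    return " ".join(texts)


def toy_choose(question: str, options: List[str]) -> int:
    # Assumption: stands in for an LLM choice by counting shared words.
    scores = [len(_words(question) & _words(o)) for o in options]
    return scores.index(max(scores))


chunks = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Tokyo is the capital of Japan.",
    "Ottawa is the capital of Canada.",
]
root = build_tree(chunks, toy_summarize)
answer = query_tree(root, "capital of Japan", toy_choose)
print(answer)  # Tokyo is the capital of Japan.
```

Replacing the two stand-in callables with functions that prompt a language model would leave the build and traversal logic unchanged, which is the separation this sketch is meant to show.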
pip install gpt-index
Examples are in the examples folder. Indices are in the indices folder (see list of indices below).
To build a tree index do the following:
from gpt_index import GPTTreeIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = GPTTreeIndex(documents)
To save to disk and load from disk, do:
# save to disk
index.save_to_disk('index.json')
# load from disk
index = GPTTreeIndex.load_from_disk('index.json')
To query:
index.query("<question_text>?", child_branch_factor=1)
The main third-party package requirements are transformers, openai, and langchain.
All requirements should be contained within the setup.py file. To run the package locally without building the wheel, run pip install -r requirements.txt.
Tree Index: a tree data structure
Keyword Table Index: a keyword-based table
List Index: a simple list-based data structure
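As a rough illustration of what a keyword-based table could look like, the sketch below maps keywords to the text chunks that contain them and answers a lookup by ranking chunks on the number of matching keywords. This is an assumption about the general technique, not the code in the indices folder: `extract_keywords`, `build_keyword_table`, and `lookup` are hypothetical names, and the stopword filter stands in for whatever keyword extraction the project actually performs.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Assumption: a tiny stopword list replaces real keyword extraction.
STOPWORDS = {"a", "an", "the", "is", "of", "in", "and", "to", "what", "which"}


def extract_keywords(text: str) -> Set[str]:
    """Lowercase the text and keep every word that is not a stopword."""
    return {w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS}


def build_keyword_table(chunks: List[str]) -> Dict[str, Set[int]]:
    """Map each keyword to the positions of the chunks that mention it."""
    table: Dict[str, Set[int]] = defaultdict(set)
    for i, chunk in enumerate(chunks):
        for kw in extract_keywords(chunk):
            table[kw].add(i)
    return table


def lookup(table: Dict[str, Set[int]], chunks: List[str],
           question: str) -> List[str]:
    """Return chunks sharing keywords with the question, best match first."""
    counts: Dict[int, int] = defaultdict(int)
    for kw in extract_keywords(question):
        for i in table.get(kw, ()):
            counts[i] += 1
    ranked = sorted(counts, key=lambda i: (-counts[i], i))
    return [chunks[i] for i in ranked]


chunks = [
    "Tree indices summarize child nodes.",
    "Keyword tables map keywords to text chunks.",
    "List indices store chunks in order.",
]
table = build_keyword_table(chunks)
hits = lookup(table, chunks, "Which index uses keywords?")
print(hits)
```

A list-based index would be simpler still: it would keep the chunks in order and visit each one in turn, with no lookup table.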
We currently offer connectors into the following data sources. External data sources are accessed through their APIs with the corresponding authentication token.
Notion (NotionPageReader)
Google Docs (GoogleDocsReader)
Slack (SlackReader)
MongoDB (SimpleMongoReader)
Wikipedia (WikipediaReader)
Local file directory (SimpleDirectoryReader)
Example notebooks showing how to use the data connectors are in the Data Connector Example Notebooks.
Measuring and Narrowing the Compositionality Gap in Language Models, by Press et al.
ReAct: Synergizing Reasoning and Acting in Language Models, by Yao et al.
Please let me know if there are other related works. I am not up-to-date on the latest NLP/LLM arXiv papers or GitHub projects, and I am happy to give references/credit below.