jerryjliu / gpt_index
- Wednesday, December 7, 2022, 00:39:14
An index created by GPT to organize external information and answer queries!
GPT Index is a project consisting of a set of data structures that are created using GPT-3 and can be traversed using GPT-3 in order to answer queries.
That's where the GPT Index data structures come in. Instead of relying on world knowledge encoded in the model weights, a GPT Index data structure does the following:
The high-level design exercise of this project is to test the capability of GPT-3 as a general-purpose processor to organize and retrieve data. As far as we know, related works have used GPT-3 to reason over external database sources (see below); this work links reasoning with knowledge building.
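To make the "created using GPT-3 and traversed using GPT-3" idea concrete, here is a minimal sketch of a tree-shaped index. It is not the gpt_index implementation: `Node`, `build_tree`, `query_tree`, `fake_summarize`, and `fake_choose` are hypothetical names, and the two `fake_*` functions stand in for the places where a real system would call the language model. The parameter name `child_branch_factor` is borrowed from the query call shown later in this README; its meaning here (how many children to follow at each level) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    text: str  # chunk text for a leaf, summary text for an internal node
    children: List["Node"] = field(default_factory=list)


def build_tree(chunks: List[str],
               summarize: Callable[[List[str]], str],
               num_children: int = 2) -> Node:
    """Build bottom-up: group nodes, then summarize each group into a parent."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if num_children < 2:
        raise ValueError("num_children must be at least 2")
    level = [Node(c) for c in chunks]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), num_children):
            group = level[i:i + num_children]
            parents.append(Node(summarize([n.text for n in group]), group))
        level = parents
    return level[0]


def query_tree(root: Node,
               question: str,
               choose: Callable[[str, List[str]], List[int]],
               child_branch_factor: int = 1) -> List[str]:
    """Walk down from the root; `choose` ranks the children at each level."""
    if not root.children:
        return [root.text]
    ranked = choose(question, [c.text for c in root.children])
    leaves: List[str] = []
    for i in ranked[:child_branch_factor]:
        leaves.extend(query_tree(root.children[i], question, choose,
                                 child_branch_factor))
    return leaves


# Stand-ins for the model calls (assumptions, for illustration only).
def fake_summarize(texts: List[str]) -> str:
    return " | ".join(texts)


def fake_choose(question: str, options: List[str]) -> List[int]:
    # Rank options by how many words they share with the question.
    q = set(question.lower().split())
    scores = [len(q & set(o.lower().split())) for o in options]
    return sorted(range(len(options)), key=lambda i: -scores[i])


chunks = ["cats purr", "dogs bark", "birds sing", "fish swim"]
root = build_tree(chunks, fake_summarize, num_children=2)
print(query_tree(root, "which dogs bark", fake_choose))
```

With these stand-ins, the query follows one branch per level and ends at a single leaf. Raising `child_branch_factor` explores more branches at the cost of more calls to `choose`.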
pip install gpt-index
Examples are in the examples folder. Indices are in the indices folder (see list of indices below).
To build a tree index do the following:
from gpt_index import GPTTreeIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = GPTTreeIndex(documents)
To save to disk and load from disk:
# save to disk
index.save_to_disk('index.json')
# load from disk
index = GPTTreeIndex.load_from_disk('index.json')
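Since building an index calls the model, it may be worth building once and reusing the saved file afterwards. The helper below is a sketch of that pattern, not part of gpt_index: `load_or_build` is a hypothetical name, and it takes the build/load/save steps as plain callables so it can be tried without any API calls. The demo uses JSON stand-ins in a temporary directory.

```python
import json
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(path: str,
                  build: Callable[[], T],
                  load: Callable[[str], T],
                  save: Callable[[T, str], None]) -> T:
    """Load the index from `path` if it exists, else build it and save it."""
    if os.path.exists(path):
        return load(path)
    index = build()
    save(index, path)
    return index


# Demo with stand-in callables (no model calls are made here).
calls = []


def _build():
    calls.append("build")
    return {"nodes": ["a", "b"]}


def _save(obj, path):
    with open(path, "w") as f:
        json.dump(obj, f)


def _load(path):
    with open(path) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "index.json")
    first = load_or_build(demo_path, _build, _load, _save)   # builds and saves
    second = load_or_build(demo_path, _build, _load, _save)  # loads from disk

# With gpt_index, the calls shown in this README could be plugged in like so:
# index = load_or_build(
#     "index.json",
#     build=lambda: GPTTreeIndex(SimpleDirectoryReader("data").load_data()),
#     load=GPTTreeIndex.load_from_disk,
#     save=lambda idx, p: idx.save_to_disk(p),
# )
```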
To query the index:
index.query("<question_text>?", child_branch_factor=1)
The main third-party package requirements are transformers, openai, and langchain.
All requirements should be contained within the setup.py file. To run the package locally without building the wheel, simply run pip install -r requirements.txt.
Tree Index: tree data structures
Keyword Table Index: a keyword-based table
List Index: a simple list-based data structure
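As a rough illustration of what "a keyword-based table" could look like, the sketch below maps keywords to the chunks that contain them and ranks chunks by how many query keywords they match. This is an assumption about the general idea, not the library's code: `simple_keywords`, `build_keyword_table`, and `lookup` are hypothetical names, and the regex extractor is a stand-in for whatever keyword extraction the real index uses.

```python
import re
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Set


def simple_keywords(text: str) -> Set[str]:
    """Stand-in extractor: lowercase words of four or more letters."""
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def build_keyword_table(chunks: List[str],
                        extract: Callable[[str], Set[str]] = simple_keywords
                        ) -> Dict[str, List[int]]:
    """Map each keyword to the indices of the chunks that mention it."""
    table: Dict[str, List[int]] = defaultdict(list)
    for i, chunk in enumerate(chunks):
        for kw in sorted(extract(chunk)):
            table[kw].append(i)
    return dict(table)


def lookup(table: Dict[str, List[int]],
           chunks: List[str],
           question: str,
           extract: Callable[[str], Set[str]] = simple_keywords) -> List[str]:
    """Return matching chunks, best match first (ties keep chunk order)."""
    hits: Counter = Counter()
    for kw in extract(question):
        for i in table.get(kw, []):
            hits[i] += 1
    ranked = sorted(hits, key=lambda i: (-hits[i], i))
    return [chunks[i] for i in ranked]


docs = [
    "The tree index summarizes documents hierarchically.",
    "The list index stores chunks in sequence.",
    "Slack messages load through a reader.",
]
table = build_keyword_table(docs)
print(lookup(table, docs, "How does the tree index work?"))
```

In this toy run the first chunk ranks highest because it shares two keywords with the question, the second shares one, and the third is not returned at all.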
We currently offer connectors into the following data sources. External data sources are retrieved through their APIs using the corresponding authentication token.
Notion (NotionPageReader)
Google Docs (GoogleDocsReader)
Slack (SlackReader)
MongoDB (SimpleMongoReader)
Wikipedia (WikipediaReader)
Local file directory (SimpleDirectoryReader)
Example notebooks of how to use data connectors are found in the Data Connector Example Notebooks.
Measuring and Narrowing the Compositionality Gap in Language Models, by Press et al.
ReAct: Synergizing Reasoning and Acting in Language Models, by Yao et al.
Please let me know if there are other related works. I am not up to date on the latest NLP/LLM arXiv papers or GitHub projects, and I am happy to give references/credit below.