smol-ai / developer
- Thursday, May 18, 2023, 00:00:02
with 100k context windows on the way, it's now feasible for every dev to have their own smol developer
Human-centric & Coherent Whole Program Synthesis aka your own personal junior developer
Build the thing that builds the thing! a smol dev for every dev in every situation
this is a prototype of a "junior developer" agent (aka smol dev) that scaffolds an entire codebase out for you once you give it a product spec, but does not end the world or overpromise AGI. instead of making and maintaining specific, rigid, one-shot starters, like create-react-app or create-nextjs-app, this is basically create-anything-app where you develop your scaffolding prompt in a tight loop with your smol dev.
AI that is helpful, harmless, and honest is complemented by a codebase that is simple, safe, and smol - <200 lines of Python and Prompts, so this is easy to understand and customize.
engineering with prompts, rather than prompt engineering
The demo example in prompt.md shows the potential of an AI-enabled, but still firmly human-developer-centric, workflow:

- main.py generates code
- debugger.py reads the whole codebase to make specific code change suggestions
- Loop until happiness is attained

Notice that AI is only used as long as it is adding value - once it gets in your way, just take over the codebase from your smol junior developer with no fuss and no hurt feelings. (we could also have smol-dev take over an existing codebase and bootstrap its own prompt... but that's a Future Direction)
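The workflow above can be sketched as a tiny orchestration loop. `generate`, `debug`, and `is_happy` are hypothetical stand-ins for what `main.py`, `debugger.py`, and the human do; none of these names exist in the repo:

```python
# Illustrative sketch of the smol-dev loop, not code from this repo.
# generate: spec -> {filename: code}   (roughly what main.py does, via GPT-4)
# debug:    codebase -> codebase       (roughly what debugger.py suggests)
# is_happy: codebase -> bool           (the human's judgment)
def dev_loop(spec, generate, debug, is_happy, max_iters=5):
    codebase = generate(spec)
    for _ in range(max_iters):
        if is_happy(codebase):
            break
        codebase = debug(codebase)
    return codebase
```

The point of the sketch is the exit condition: the human decides when to stop, and can take over the files at any iteration.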
Not no code, not low code, but some third thing.
Perhaps a higher-order evolution of programming where you still need to be technical, but no longer have to implement every detail, at least when scaffolding things out.
naturally generated with gpt4, like we did for babyagi
Please subscribe to https://latent.space/ for a fuller writeup and insights and reflections
- prompts can refer to variable_names or entire ``` code fenced code samples
- curl input and output pasted straight into the prompt
- cat-ing the whole codebase with your error message and getting specific fix suggestions - particularly delightful!
- coherence via shared_dependencies.md, and then insisting on using that in generating each file. This basically means GPT is able to talk to itself...
- shared_dependencies.md is sometimes not comprehensive in understanding what the hard dependencies between files are. So we just solved it by specifying a specific name in the prompt. Felt dirty at first but it works, and really it's just clear unambiguous communication at the end of the day.
- see prompt.md for SOTA smol-dev prompting
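The coherence trick above amounts to generating `shared_dependencies.md` once, then embedding it verbatim in every per-file generation prompt so each call sees the same contract. `build_file_prompt` is an illustrative helper, not the repo's actual code:

```python
def build_file_prompt(app_prompt: str, shared_deps: str, filename: str) -> str:
    """Compose a per-file generation prompt that always includes the
    shared_dependencies.md contract (illustrative sketch, not repo code)."""
    return (
        f"You are generating {filename} for this app:\n{app_prompt}\n\n"
        "The files share these dependencies (names must match exactly):\n"
        f"{shared_deps}\n\n"
        f"Now output only the full contents of {filename}."
    )
```

Because every file's prompt carries the same dependency list, GPT is effectively coordinating with its own earlier output.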
We were working on a Chrome Extension, which requires images to be generated, so we added some usecase specific code in there to skip destroying/regenerating them, that we haven't decided how to generalize.
We don't have access to GPT4-32k, but if we did, we'd explore dumping entire API/SDK documentation into context.
The feedback loop is very slow right now (`time` says about 2-4 mins to generate a program with GPT-4, even with parallelization due to Modal (occasionally spiking higher)), but it's a safe bet that it will go down over time (see also "future directions" below).
it's basically:

```
git clone https://github.com/smol-ai/developer
```

then copy .example.env to .env, filling in your API keys. There are no python dependencies to wrangle thanks to using Modal as a self-provisioning runtime.
Unfortunately this project also uses 3 waitlisted things:

- `pip install modal-client` (private beta - hit up the modal team to get an invite, and login)

or, to run without Modal:

```
pip install -r requirements.txt
python main_no_modal.py YOUR_PROMPT_HERE
```
yes, the most important skill in being an ai engineer is social engineering to get off waitlists. Modal will let you in if you say the keyword "swyx"
you'll have to adapt this code on a fork if you want to use it on other infra. please open issues/PRs and i'll happily highlight your fork here.
the /generated and /exampleChromeExtension folders contain a Chrome Manifest V3 extension that reads the current page, and offers a popup UI that has the page title+content and a textarea for a prompt (with a default value we specify). When the user hits submit, it sends the page title+content to the Anthropic Claude API along with the up-to-date prompt to summarize it. The user can modify that prompt and re-send the prompt+content to get another summary view of the content.
this entire extension was generated by the prompt in prompt.md
(except for the images), and was built up over time by adding more words to the prompt in an iterative process.
basic usage
```
modal run main.py --prompt "a Chrome extension that, when clicked, opens a small window with a page where you can enter a prompt for reading the currently open page and generating some response from openai"
```
after a while of adding to your prompt, you can extract your prompt to a file; as long as your "prompt" ends in a .md extension, we'll go look for that file:

```
modal run main.py --prompt prompt.md
```
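A guess at how that `.md` convention might be implemented (the repo's actual code may differ):

```python
from pathlib import Path

def resolve_prompt(prompt: str) -> str:
    """If the --prompt argument ends in .md, treat it as a file path
    and read the prompt from disk; otherwise use it verbatim (sketch)."""
    if prompt.endswith(".md"):
        return Path(prompt).read_text()
    return prompt
```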
each time you run this, the generated directory is deleted (except for images) and all files are rewritten from scratch.
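The delete-except-images behavior could look roughly like this; the image-extension list is an assumption, not taken from the repo:

```python
from pathlib import Path

# Assumed set of image extensions to preserve; the repo's actual list may differ.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".svg"}

def clean_generated(directory: str) -> None:
    """Remove previously generated files, keeping images so they
    don't have to be regenerated on every run (illustrative sketch)."""
    for path in Path(directory).rglob("*"):
        if path.is_file() and path.suffix.lower() not in IMAGE_EXTS:
            path.unlink()
```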
The shared_dependencies.md file is a helper file that ensures coherence between files.
if you make a tweak to the prompt and only want it to affect one file, and keep the rest of the files, specify the file param:

```
modal run main.py --prompt prompt.md --file popup.js
```
take the entire contents of the generated directory in context, feed in an error, get a response. this takes advantage of longer (32k-100k) contexts, so we don't have to do any embedding of the source.

```
modal run debugger.py --prompt "Uncaught (in promise) TypeError: Cannot destructure property 'pageTitle' of '(intermediate value)' as it is undefined. at init (popup.js:59:11)"

# gpt4
modal run debugger.py --prompt "your_error msg_here" --model=gpt-4
```
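Conceptually, the debugger just concatenates every generated file plus your error message into one long prompt, leaning on the large context window instead of embeddings. A minimal sketch (illustrative, not the repo's code):

```python
from pathlib import Path

def build_debug_prompt(directory: str, error_msg: str) -> str:
    """Dump every generated file (path + contents) followed by the error
    message, relying on a long context window instead of retrieval (sketch)."""
    parts = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            parts.append(f"=== {path} ===\n{path.read_text()}")
    parts.append(
        f"Here is the error:\n{error_msg}\n"
        "Suggest specific code changes to fix it."
    )
    return "\n\n".join(parts)
```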
take the entire contents of the generated directory in context, and get a prompt back that could synthesize the whole program. basically smol dev, in reverse.

```
modal run code2prompt.py # ~0.5 second

# use gpt4
modal run code2prompt.py --model=gpt-4 # 2 mins, MUCH better results
```
We have done indicative runs of both, stored in code2prompt-gpt3.md vs code2prompt-gpt4.md. Note how much better gpt4 is at prompt engineering its future self.
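Reversing the direction is the same context dump with a different instruction at the end. Again an illustrative sketch, not the actual code2prompt.py:

```python
from pathlib import Path

def build_code2prompt(directory: str) -> str:
    """Feed the whole codebase in and ask for the prompt that would
    regenerate it -- smol dev in reverse (illustrative sketch)."""
    dump = "\n\n".join(
        f"=== {p} ===\n{p.read_text()}"
        for p in sorted(Path(directory).rglob("*")) if p.is_file()
    )
    return (
        dump + "\n\nWrite a descriptive, bullet-pointed prompt "
        "that would generate this exact codebase."
    )
```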
Naturally, we had to try code2prompt2code
...
```
# add prompt... this needed a few iterations to get right
modal run code2prompt.py --prompt "make sure all the id's of the DOM elements, and the data structure of the page content (stored with {pageTitle, pageContent }) , referenced/shared by the js files match up exactly. take note to only use Chrome Manifest V3 apis. rename the extension to code2prompt2code" --model=gpt-4 # takes 4 mins. produces semi working chrome extension copy based purely on the model-generated description of a different codebase
```
```
# must go deeper
modal run main.py --prompt code2prompt-gpt4.md --directory code2prompt2code
```
We leave the social and technical impacts of multilayer generative deep-frying of codebases as an exercise to the reader.
things to try / would accept open issue discussions and PRs:

- per-file prompt files, like popup.html.md and content_script.js.md and so on
- prompt.md for existing codebases - write a script to read in a codebase and write a descriptive, bullet pointed prompt that generates it (see smol pm, but it's not very good yet - would love for some focused polish/effort until we have a quine smol developer that can generate itself lmao)
- run `modal run anthropic.py --prompt prompt.md --outputdir=anthropic` to try it