Introducing recon - a CLI tool to gather context for LLMs

I use large language models like ChatGPT and Claude Opus a lot. No. More than that. A lot a lot.

I use them on my coding side projects, on my writing projects, at my day job, as well as for personal things like organizing my D&D campaign.

But one problem I constantly run into is that there are things I'd like to ask an AI, but the relevant data is either huge or spread all over the place. For example, if I'm working on RPG Portrait, I may want to discuss growth rates and have it help me prioritize my tasks accordingly. But the growth rates are in a Postgres database, the tasks are in GitHub issues, and it probably needs a bit of the app's source code too.

That's where recon comes in.

The basics

Recon is a command-line tool that makes it easy to gather context from various sources and feed it to an LLM as a single, coherent prompt. With a simple command, you can recursively gather text from files and directories, fetch additional information from URLs, and pipe it all to your AI assistant of choice.

Here's a simple example:

recon --files ./src/landing-pages \
  --urls https://example.com/rules-for-landing-page-copy \
  --prompt "Evaluate our landing pages against these rules" \
  | llm

This command will:

  1. Recursively read all files in the ./src/landing-pages directory
  2. Fetch the text from the URL https://example.com/rules-for-landing-page-copy
  3. Combine all the text (each piece labeled with the name of its source) into a single prompt
  4. Pipe the prompt to the llm command
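
To make step 3 a bit more concrete, here is a minimal sketch of what "combine everything into one prompt" could look like. It is not recon's actual implementation: the `buildPrompt` name, the `{ name, text }` source shape, and the `=== name ===` header format are all assumptions made up for illustration.

```javascript
// Hypothetical sketch: NOT recon's real code or output format.
// Each gathered source is assumed to look like { name, text }.
function buildPrompt(sources, prompt) {
  // Label every chunk of text with where it came from.
  const sections = sources.map(
    ({ name, text }) => `=== ${name} ===\n${text.trim()}`
  );
  // Put the question last so it follows the context it refers to.
  return [...sections, prompt].join('\n\n');
}

const example = buildPrompt(
  [
    { name: './src/landing-pages/index.html', text: '<h1>Welcome</h1>\n' },
    { name: 'https://example.com/rules-for-landing-page-copy', text: 'Rule 1: Be brief.' },
  ],
  'Evaluate our landing pages against these rules'
);
console.log(example);
```

Labeling each section with its source name is what lets the model tell a file from a fetched page when it answers.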

Side note: If you're not using Simon Willison's excellent llm CLI tool, you probably should be.

But even if you're not using llm, you can still use recon to gather context and feed it to a model. It supports a --clipboard flag for when you want to paste the prompt into a web-based chat interface.

Getting fancy

The --files and --urls flags are handy for ad-hoc queries, but recon also supports a configuration file that lets you define named commands, each with its own sources and prompt. This makes it easy to reuse the same sources across multiple queries.

// .recon.config.mjs
export default {
  commands: {
    // The key is the name of the command
    docs: {
      gather: {
        files: ['./docs'],
        urls: ['https://example.com/some-more-docs']
      }
    }
  }
};

Now I can run the following command:

recon docs \
  --prompt "What's the procedure for multi-stage migrations?" \
  --clipboard

Getting fanciest

Recon also supports defining your own custom "Recon agents" that can gather text from any source you can imagine. For example, you could write an agent that fetches text from a Google Sheet, a database, or even a proprietary API.

// This is a hypothetical recon agent for querying a database
import myDbAgent from './myDbAgent';

export default {
  agents: [myDbAgent],
  commands: {
    growth: {
      // The default prompt for this command
      prompt: "How are we doing on our growth goals?",
      gather: {
        // The default set of files to gather information from
        files: ['./docs/business-plan.md'],
        // The default set of URLs to gather information from
        urls: ['https://example.com/'],
        // This tells myDbAgent how to gather its information
        myDb: {
          query: 'SELECT COUNT(*) FROM users',
        },
        // Additional notes to pass along with the prompt.
        // This is the built-in `notes` agent
        notes: `REMEMBER: All replies should be in business-speak.
The more synergy, the better.
`,
      },
    },
  },
};
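
The agent interface itself isn't shown in this post, so the following is only a guess at what something like ./myDbAgent might contain. It assumes an agent is an object with a `name` matching its config key (`myDb`) and an async `gather` function that receives that key's options and returns text. Both the shape and the `runQuery` helper are hypothetical; `runQuery` is stubbed so the sketch runs without a database.

```javascript
// Hypothetical shape for ./myDbAgent; the real recon agent interface may differ.
// `runQuery` stands in for a real database client call and returns canned rows.
async function runQuery(sql) {
  return [{ count: 1234 }];
}

const myDbAgent = {
  // Assumed to match the `myDb` key under `gather` in the config.
  name: 'myDb',
  // Assumed contract: take the options from the config, return text for the prompt.
  async gather({ query }) {
    const rows = await runQuery(query);
    return `Query: ${query}\nResult: ${JSON.stringify(rows)}`;
  },
};

myDbAgent
  .gather({ query: 'SELECT COUNT(*) FROM users' })
  .then((text) => console.log(text));
```

Returning plain text keeps the agent's job small: fetch something, describe it, and let the rest of the tool worry about assembling the prompt.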

Future plans

I have a lot of ideas for improving recon. Some are obvious, like adding more built-in recon agents. One upcoming feature I'm particularly excited about is letting custom agents register their own CLI flags. This would make the CLI's ad-hoc querying capabilities even more powerful.

Wrapping up

It's still early days for recon, but I use it every day and it's already saved me a ton of time. I hope that it can do the same for some of you!

If you're interested in trying it out or contributing, you can find it on GitHub.