May 24, 2024

*(Image: a serene beach scene with lines of source code meticulously carved into the sand, waves gently approaching.)*

Andreessen’s “software will eat the world”, the growth of the tech industry, etc. all fundamentally trace back to a simple economic observation: software is expensive to produce, but has essentially zero marginal cost to distribute or to execute.

LLM-generated code takes this economic reduction a step further by driving the cost of *producing* software to zero as well. This enables a category of software that hasn’t really existed before — let’s call it *ephemeral code*. I’ll argue that this is a Very Big Deal, in ways that haven’t been fully appreciated or leveraged yet.

TLDR 1: Generating and running ephemeral code is the most powerful action-taking paradigm for LLMs, and will come to dominate most agent-like applications in the days to come.

TLDR 2: This is due to its supremacy on three increasingly fuzzy metrics: (1) expressiveness per token, (2) expressiveness per atomic operation, and (3) economic alignment.

(skippable background) What do we mean by LLM “actions”?

Karpathy often has great analogies, but one in particular sticks out:

TLDR: looking at LLMs as chatbots is the same as looking at early computers as calculators.

Chatbots take advantage of the natural language capabilities of LLMs, but the underlying ability to reason has a far bigger upside and application surface. Putting it more bluntly: LLM impact will be felt most acutely through their ability to make plans and take actions.

The simplest paradigm for taking actions is to ask the LLM to choose between a discrete set of levers given context. Function calling expands this to allow the levers to take arguments. And ephemeral code expands this even further by allowing the LLM to construct an arbitrary computational graph using the levers available to it.

Let’s lead with an example.

Kel shops at Safeway and hates looking through coupons — so much so that he’s actually willing to give their industry-leading Safeway Assistant (TM) a shot.

“Any deals on orange soda this weekend?”

There are a few (overlapping) ways the Safeway engineering team could handle this. From the title of this post, you can probably guess which one is going to win.


Option 1: Function calling.

Idea: Provide your LLM with an API reference, including a custom-built tool `check_for_deals(query_term, date_range)`. Ask your LLM to include in its output a structured choice of which function to call and which arguments to pass.
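To make this concrete, here’s a minimal sketch of the function-calling loop. Everything here is illustrative: `check_for_deals` is the hypothetical tool from above (stubbed instead of querying a real deals database), the tool schema mimics the style most LLM function-calling APIs use, and the model’s structured output is simulated rather than coming from an actual LLM call.

```python
import json

# Hypothetical tool the Safeway team exposes to the model (stubbed;
# a real implementation would query the deals database).
def check_for_deals(query_term: str, date_range: str) -> list[dict]:
    return [{"item": query_term, "discount": "2 for $4", "valid": date_range}]

# Schema describing the tool, in the style passed alongside the prompt
# to an LLM function-calling interface.
TOOLS = {
    "check_for_deals": {
        "description": "Search current promotions for a product.",
        "parameters": {
            "query_term": {"type": "string"},
            "date_range": {"type": "string"},
        },
    }
}

REGISTRY = {"check_for_deals": check_for_deals}

def dispatch(model_output: str) -> list[dict]:
    """Parse the model's structured tool choice and invoke the matching function."""
    call = json.loads(model_output)
    fn = REGISTRY[call["name"]]
    return fn(**call["arguments"])

# Simulated structured output for "Any deals on orange soda this weekend?"
model_output = json.dumps({
    "name": "check_for_deals",
    "arguments": {"query_term": "orange soda", "date_range": "this weekend"},
})
deals = dispatch(model_output)
print(deals[0]["item"])  # orange soda
```

The key constraint is visible in the shape of `dispatch`: the model can only select one lever from `REGISTRY` and fill in its arguments — it cannot compose levers into anything richer.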


Fast, intuitive, obvious! This approach works well for scenarios where user queries are dominated by a small number of request types. Unfortunately, the most compelling use cases for agents and LLMs tend to involve a heavy tail of personalized use cases and one-off scenarios.