For a school activity we are organizing a rubber band bracelet making. Here is a how to we did for it. Feel free to reuse it if you're doing the same.
LLMs have seen a huge surge of popularity with ChatGPT by going from prompt to text for various use cases. But what's really exciting is that they are also extremely useful as ways to implement normal functions within a program. This is what I call LLM As A Function.
For example, if you want to build a website builder that will only use components that you have in your design system, you can write the following:
const prompt = 'Build a chat app UI'; const components = llm<Array<string>>( 'You only have the following components: ' + designSystem.getAllExistingComponents().join(', ') + '\n' + 'What components to do you need to do the following:\n' + prompt ); // ['List', 'Card', 'ProfilePicture', 'TextInput'] const result = llm<{javascript: string, css: string}>( 'You only have the following components: ' + components.join(',') + '\n' + 'Here are examples of how to use them:\n' + components.map(component => designSystem.getExamplesForComponent(component).join('\n') ).join('\n') + '\n' + 'Write code for making the following:\n' + prompt ); // { javascript: '...', css: '...' } |
What's pretty magical here is that the llm
calls are taking as input arbitrary string but output real values and not just string. In this case it's using JavaScript and TypeScript for the type definition but can be anything you want (Python, Java, Hack...).
How does it work?
The function that we are using is llm<Type>(prompt: string): Type
. It takes an explicit type that will be returned.
The first step is you need to have introspection / code generation from your language to be able to take the type you put and manipulate it. With this type we are going to do two things:
We will convert it to a JSON example and augment the prompt with it. For example in the second invocation the type was {javascript: string, css: string}
, we are going to generate: 'You need to respond using JSON that looks like {"javascript": "...", "css": "..."}'
. We are using prompt engineering to nudge the LLM to be responding in the format we want.
We also convert it to a JSON Schema that looks something like this:
{ "type": "object", "properties": { "javascript": {"type": "string"}, "css": {"type": "string"} } } |
This is fed to JSONFormer which restricts what the LLM can output to 100% follow the schema. The way LLMs generate the next token is by computing the probability of every single token and then picking the most likely. JSONFormer restricts it to only the tokens that match the schema.
In this case, the first generated tokens can only be {"javascript": "
and then the LLM is left filling the blanks until the next "
at which point it will be forced to insert ", "css": "
, left on its own again and then forced with "}
.
The great property of LLMs generating new tokens based on the previous ones is that even without the added prompt engineering, if it sees {"javascript": "
it will automatically continue generating JSON and will not be likely to add all the intros like Sure, here is the response
.
At this point we are guaranteed to get a valid JSON using our structure. So we can use JSON.parse()
on it and then convert it to the JavaScript object we requested.
Conclusion
Before we implemented this magic llm<Type>()
function, we'd see people adding a lot brittle logic in order to try and get the LLM to output things in the correct format, do lot of prompt engineering, add fuzzy parsing, retry logic... This was both brittle and added latency to the system.
This is not only a reliability improvement but really unlocks a whole new world of possibilities. You can now leverage LLMs within your codebase to implement functions that returns values just like any other function would, but instead of writing code to run it, you tell it what to do using text.
Early in my time at Facebook I realized the hard way that I couldn't do everything myself and for things to be sustainable I needed to find ways to work with other people on the problems I cared about.
But how do you do that in practice? There are lots of techniques from various courses, training, tips... In this note I'm going to explain the technique I'm using the most that has been very successful for me: "casting lines".
So you want something to happen, let say implement a feature in a tool. The first step is to post a message in the feedback group of the tool explaining what the problem is and what you want to happen. It's fine if it's your own tool. The objective here is that you have something you can reference when talking to people, you can send them the link with all the context. It can also be an issue on a github project, a quip, a note... the form doesn't matter as long as you can link to it.
If the thing is already on people's roadmap or already implemented under a gk, then congratz, you win. But most likely it is not.
This is where you start "casting lines". The idea is that anytime you chat with someone, whether it is in 1-1 meetings, group conversations, hallway chat... and the topic of discussion comes close (for a very lax definition of close), you want to bring up that specific feature: "It would be so awesome if we could do X". At you see the reaction. If that person feels interested, you then start to get them excited about them building it. Find ways it connects to their strengths, roadmap, career objectives... and of course send them the link.
In practice, the success rate of this approach, in the moment, is small because people usually don't have nothing to do right now and can jump on shipping a feature that they never thought about. But if you keep casting lines consistently in all your interactions with people, at some, point someone will bite.
The more lines you cast, the more stuff are going to get done.
While this technique has been very effective at getting things done at scale, there are drawbacks to this approach. The biggest one being uncertainty around timelines. Unless someone bites, you don't know when something will be done. Some of my lines are still up from many years ago.
PS: while researching for this note, I learned that the fishing technique shown in the cover photo is called "Troll Fishing".
If you watch pro pool players, most of the time the game is super boring, if you don’t believe me, watch this video from someone that puts 152 balls in a row. What’s interesting is that if you were to look at each shot individually, most of them are easy. I can likely make the 152 pots he did in a row, if I didn’t have to care about positioning myself for the next ball.
The real talent of pro pool players is being able to not only pot the ball but put the white ball in a good position for shooting the next ball. When they play well, they “make the game easy” by having the white ball always in a good position for the next shot.
What this means is that if you see a pro player doing some crazy shot, this means that they “got out of position” in the previous ball. And in practice at this level, usually they did a mistake a few shots earlier and haven’t been able to correct the position back and it gradually amplified.
This is a really bad property when watching the game, so most tournaments introduce a 30s limit so you don’t let the players properly think through and increase the likelihood of making mistakes, and having to come up with interesting shots.
There are a lot of interesting strategies in order to get good at it:
- Routes that avoid crossing “danger zones” (right behind an enemy ball)
- Land in a place where there are multiple shots available next rather than a single one
- Come into the line of the next shot so that you don’t need to be super precise with your speed
- Remove all the balls next to each others so you don’t have to go up and down the table with longer shots
Now is probably the point where you’re asking yourself, that’s interesting but what does it have to do with software engineering. Well, I think that there are a lot of parallels with building software.
When I see people doing very visible and consequential actions, I find myself thinking that they are doing a “hero shot” and it must mean that they got “out of position” for the past few shots and now the only option that they have left is unsatisfying but there’s no other choice.
On the other hand, I see people appearing to somehow always be in easy projects where everything just works out fine and they deliver a lot of impact. I used to think that they were lucky, now I think that they are pro players and are able to plan multiple shots in advance and able to execute on their strategy.
Andres Suarez pointed me to some interesting code in the Hack codebase:
let slash_escaped_string_of_path path = let buf = Buffer.create (String.length path) in String.iter (fun ch -> match ch with | '\\' -> Buffer.add_string buf "zB" | ':' -> Buffer.add_string buf "zC" | '/' -> Buffer.add_string buf "zS" | '\x00' -> Buffer.add_string buf "z0" | 'z' -> Buffer.add_string buf "zZ" | _ -> Buffer.add_char buf ch ) path; Buffer.contents buf |
What it does is to turn all the occurrences of \
, :
, /
, \0
and z
into zB
, zC
, zS
, z0
and zZ
. This way, there won't be any of those characters in the original string which are probably invalid in the context where that string is transported. But you still have a way to get them back by transforming all the z-sequences
back to their original form.
Why is it useful?
The first interesting aspect about it is that it's using z
as an escape character instead of the usual \
. In practice, it's less likely for a string to contain a z
rather than a \
so we have to escape less often.
But the big wins are coming when escaping multiple times. In the \
escape sequence, it looks something like this:
\
->\\
->\\\\
->\\\\\\\\
->\\\\\\\\\\\\\\\\
whereas with the z escape sequence:
z
->zZ
->zZZ
->zZZZ
->zZZZZ
The fact that escaping a second time doubles the number of escape characters is problematic in practice. I was working on a project once where we found out that the \
character represented 70% of the payload!
Conclusion
It's way too late to change all the existing programming languages to use a different way to escape characters but if you have the opportunity to design an escape sequence, know that \
escape sequence is not always the best 🙂