Beyond ChatGPT: Language Models as Intelligent Business Decision Engines
Much has been made of the abilities and limitations of large language models (it's all “ChatGPT” to your uncle) since they roared into public prominence a couple of years ago. And with the holiday season in full swing, chances are a family member or friend has cornered you at a party to share their opinions on the subject.
That's been my experience, at least. It's always an interesting conversation, but this year was a little different. With a couple of LLM-centered production releases under my belt, I feel like I can finally offer a little insight into the practical roles these models are ready to take on.
One of the earliest ideas that I had when starting to experiment with LLMs was an API consumer that automatically updated to accommodate relatively minor changes to the target schema. Initially this was spurred on by some frequent, breaking changes to a weather API I was using for a project. Adjusting to the changes wasn't rocket science, but it required attention, time, and a level of vigilance that could probably be better applied elsewhere.
I imagined a language model that could perform the mild course correction on its own and continue without needing to bring me into the mix. I was curious, but I was also cheap, so I spun up a tiny local model to test the theory (big shout-out to open-hermes-2).
The results were surprisingly good on the first go-around. When a request failed, the OpenAPI spec and error code were sent to the model, which then adjusted the outgoing requests via an intermediary .txt file. It didn't always fix things on the first try, but given up to three chances, it almost always got there.
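The retry loop itself is simple. Here's a minimal sketch of the shape of it; `send_request` and `ask_model` are hypothetical stand-ins for the actual HTTP call and the model call, which would serialize the corrected parameters (in my case, through that intermediary .txt file):

```python
def fetch_with_self_repair(send_request, ask_model, spec, max_attempts=3):
    """Try a request; on failure, hand the model the OpenAPI spec and
    the error, let it propose corrected parameters, and retry.

    send_request(params) -> response dict, raises on an API error
    ask_model(spec, error, current) -> corrected params dict
    """
    params = {}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send_request(params)
        except Exception as err:
            last_error = err
            # The model sees the spec, the error text, and the params
            # that failed, and returns a patched parameter dict.
            params = ask_model(spec=spec, error=str(err), current=params)
    raise RuntimeError(f"gave up after {max_attempts} attempts: {last_error}")
```

The cap on attempts matters: without it, a schema change the model can't reason its way around turns into an infinite loop of failed requests.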
Fast forward to a couple of months ago: working on a project here at Shift, I ended up giving gpt-4o a similar degree of autonomy. We were streamlining data entry and aggregation for a client. It was a manual process, and one that in prior years probably couldn't have been automated at all. It wasn't particularly complicated, but, like my weather data, certain pieces lacked a reliable API and couldn't be counted on to keep the data moving through the chain.
I ended up letting the LLM make some simple choices in the hazy areas, for example:
based on { prior_data }, choose the correct parameter on the current page from the following: [A, B, C, etc.]
To hedge against the randomness inherent in working with these models, it actually makes the decision three times and goes with the majority. Though, looking at the logs, I don't think there's ever been a split decision.
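The majority vote is a few lines of code. Here's a sketch, assuming a `choose` callable that wraps the actual model call and returns one of the options:

```python
from collections import Counter

def majority_decision(choose, options, votes=3):
    """Ask the model the same question several times and keep the most
    common answer, damping out sampling randomness.

    choose(options) -> one element of options (wraps the LLM call)
    """
    ballots = [choose(options) for _ in range(votes)]
    winner, _count = Counter(ballots).most_common(1)[0]
    return winner
```

With an odd number of votes over two effective choices, there's always a clear winner; with more options a plurality can still slip through, which is worth keeping in mind if the option list grows.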
It's simple, but it worked, and it saved the hours upon hours of manual copy/pasting that preceded it. AGI may not be here yet (not starting that conversation!), but the current crop of frontier models is more than capable of making the small decisions that turn them into powerful universal adapters.
So the next time your uncle wants to talk shop, you can let him know that Skynet isn't quite here, but that he may be able to offload some of the small decisions and focus his brainpower on the bigger ones.