How LLMs are starting to use tools, followed by the cutting-edge topic of agents, where we let LLMs decide for themselves what action to take next.
Tool use for a food-order-taking chatbot
- The LLM can't just say ...., because it needs to take some action to actually send xxx to you
- Instead, the LLM might output an order command (order xxxx for the user) together with the message to show the user: "Okay, it's on its way."
- An LLM that has been fine-tuned to output text like this can generate an order, which in this case triggers a software application that passes the request on
- The user does not see the full LLM output; only the user-message part is sent to the user as the response
- A better user interface would pop up a verification dialog before the order is placed
- Because LLM outputs are not completely reliable, for any safety-critical or mission-critical action it is a good idea to let the user confirm that the action is right before the LLM can trigger a potentially costly mistake by itself
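The order flow above can be sketched roughly as follows. This is a minimal illustration, not a real ordering API: the `ORDER:` / `USER:` line format, the function names, and the `confirm` and `place_order` callbacks are all assumptions made up for the example.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical output format (an assumption): the fine-tuned model emits
# "ORDER: <item>" on one line and "USER: <message>" on another.
ORDER_RE = re.compile(r"^ORDER:\s*(.+)$", re.MULTILINE)
USER_RE = re.compile(r"^USER:\s*(.+)$", re.MULTILINE)


def parse_llm_output(text: str) -> Tuple[Optional[str], str]:
    """Split raw LLM output into (order item or None, message for the user)."""
    order = ORDER_RE.search(text)
    user = USER_RE.search(text)
    item = order.group(1).strip() if order else None
    # If there is no USER line, fall back to showing the whole output.
    message = user.group(1).strip() if user else text.strip()
    return item, message


def handle_llm_output(
    text: str,
    confirm: Callable[[str], bool],
    place_order: Callable[[str], None],
) -> str:
    """Return the text to show the user; place the order only if confirmed."""
    item, message = parse_llm_output(text)
    if item is None:
        return message  # No action requested, just a reply.
    if confirm(item):  # Stand-in for the verification dialog.
        place_order(item)  # Stand-in for the request to the ordering software.
        return message
    return "Order cancelled."


if __name__ == "__main__":
    raw = "ORDER: burger\nUSER: Okay, it's on its way."
    print(handle_llm_output(raw, lambda item: True, lambda item: None))
```

Only the `USER:` part reaches the user, and nothing is ordered unless `confirm` returns `True`, which mirrors the verification step in the notes.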
Tools for reasoning
- LLMs are not great at precise math
- Having learned to predict the next word, and even after instruction tuning, they still tend to get exact arithmetic wrong
- So, rather than having the LLM output the answer directly, the LLM can output text such as "after compounding and so on, you would have" followed by a calculator expression in place of the number
- That expression can be interpreted as a command to call an external calculator program, which explicitly computes the right answer; the result is then plugged back into the text to give the user the correct figure
- So by giving LLMs the ability to call tools in their output, we can significantly extend the reasoning or the action-taking capabilities of LLMs
- Make sure that tools aren't triggered in a way that causes harm or irreversible damage
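The calculator idea above can be sketched like this. The `CALCULATOR[...]` tag syntax and the function names are assumptions chosen for illustration, not a format the notes specify. The expression is evaluated with a small whitelist over Python's `ast` module rather than `eval()`, so the tool cannot be used to run arbitrary code.

```python
import ast
import operator
import re

# Hypothetical tag format (an assumption): CALCULATOR[<arithmetic expression>]
CALC_RE = re.compile(r"CALCULATOR\[([^\]]+)\]")

# Only these binary operators are allowed inside an expression.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def safe_eval(expr: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")

    return walk(ast.parse(expr, mode="eval").body)


def run_calculator_tool(llm_output: str) -> str:
    """Replace every CALCULATOR[...] tag with its computed value (2 decimals)."""
    return CALC_RE.sub(lambda m: f"{safe_eval(m.group(1)):.2f}", llm_output)


if __name__ == "__main__":
    text = "After compounding, you would have $CALCULATOR[100 * 1.05 ** 8]."
    print(run_calculator_tool(text))
    # prints: After compounding, you would have $147.75.
```

The deposit and interest figures in the usage line are made-up inputs. Rejecting anything outside the whitelist is one simple way to keep a tool from being triggered harmfully.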
Agents
- Use LLMs to choose and carry out complex sequences of actions
- Agents are a more experimental area that goes beyond triggering a tool to carry out a single action: they explore whether LLMs can choose and carry out complex sequences of actions
- Cutting-edge area of AI research
- It's not yet mature enough to rely on for the most important applications
- An agent uses an LLM as a reasoning engine to figure out which steps it needs to carry out to complete the task
- This reasoning engine, the LLM, might decide that it first needs to search for the list of xxx,
- then visit the website for each xxx, and finally write a summary for each xxx based on its homepage content
- Then, perhaps through a sequence of calls to the reasoning engine, it may figure out that to search for xxx it has to trigger a tool that calls a web search engine with the query xxx
- It may then visit the websites of some of the xxx to download their homepages
- Finally, it may call an LLM yet again to summarize the text it found on each website
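The search, visit, summarize sequence above can be sketched as a simple loop. Everything here is a stand-in: `run_agent`, the action dictionary shape, the tool names, the `FAKE_WEB` data, and especially `scripted_decide`, which replaces the LLM reasoning engine with hard-coded rules so the example runs offline. A real agent would call an LLM at that point and real search and download tools.

```python
from typing import Any, Callable, Dict, List

Action = Dict[str, Any]  # {"tool": <name>, "input": <argument>}
Step = Dict[str, Any]    # {"action": Action, "result": <tool output>}


def run_agent(
    task: str,
    decide: Callable[[str, List[Step]], Action],
    tools: Dict[str, Callable[[Any], Any]],
    max_steps: int = 10,
) -> Any:
    """Repeatedly ask the reasoning engine for the next action and run it."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = decide(task, history)
        if action["tool"] == "finish":
            return action["input"]
        if action["tool"] not in tools:
            raise ValueError(f"unknown tool: {action['tool']}")
        result = tools[action["tool"]](action["input"])
        history.append({"action": action, "result": result})
    # A step limit keeps a confused agent from looping forever.
    raise RuntimeError("agent did not finish within max_steps")


def scripted_decide(task: str, history: List[Step]) -> Action:
    """Hard-coded stand-in for the LLM: search, fetch each page, summarize each."""
    if not history:
        return {"tool": "search", "input": task}
    urls = history[0]["result"]
    fetched = [s for s in history if s["action"]["tool"] == "fetch"]
    if len(fetched) < len(urls):
        return {"tool": "fetch", "input": urls[len(fetched)]}
    summarized = [s for s in history if s["action"]["tool"] == "summarize"]
    if len(summarized) < len(fetched):
        return {"tool": "summarize", "input": fetched[len(summarized)]["result"]}
    return {"tool": "finish", "input": [s["result"] for s in summarized]}


# Made-up pages so the sketch needs no network access.
FAKE_WEB = {
    "https://a.example": "Alpha home page text",
    "https://b.example": "Beta home page text",
}

TOOLS: Dict[str, Callable[[Any], Any]] = {
    "search": lambda query: list(FAKE_WEB),      # stand-in for a web search
    "fetch": lambda url: FAKE_WEB[url],          # stand-in for a page download
    "summarize": lambda text: text.split()[0],   # stand-in for an LLM summary
}


def demo() -> List[str]:
    return run_agent("find the sites", scripted_decide, TOOLS)


if __name__ == "__main__":
    print(demo())
```

The `max_steps` cap is a design choice that reflects the caution in the notes: since agents are not yet reliable, the loop is bounded rather than trusted to stop on its own.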