Langtorch is a Java framework for building applications on top of large language models. It is designed with reusability, composability, and a fluent API style in mind, and it helps you build workflows or pipelines that include large language models.
In Langtorch, we introduce the concept of a processor. A processor is a container for the smallest computational unit in Langtorch. The response it produces can originate from a large language model, such as OpenAI's GPT model (reached through a REST API), or from a deterministic Java function.
A processor is an interface that includes two functions: run() and runAsync(). Anything that implements these two functions can be considered a processor. For instance, a processor could be something that sends an HTTP request to OpenAI to invoke its GPT model and generate a response. It could also be a calculator function, where the input is 1+1, and the output is 2.
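The idea can be illustrated with a minimal sketch. The generic `Processor<I, O>` signature, the default `runAsync()` implementation, and the `AdditionProcessor` class are assumptions made for this example and may not match Langtorch's actual interface.

```java
import java.util.concurrent.CompletableFuture;

class ProcessorSketch {
    /** Hypothetical shape of a processor: one synchronous and one asynchronous entry point. */
    interface Processor<I, O> {
        O run(I input);

        // Assumption of this sketch: by default, runAsync() runs run() on the common pool.
        default CompletableFuture<O> runAsync(I input) {
            return CompletableFuture.supplyAsync(() -> run(input));
        }
    }

    /** A deterministic processor that sums integers separated by '+'. */
    static class AdditionProcessor implements Processor<String, Integer> {
        @Override
        public Integer run(String input) {
            int sum = 0;
            for (String term : input.split("\\+")) {
                sum += Integer.parseInt(term.trim());
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        Processor<String, Integer> calculator = new AdditionProcessor();
        System.out.println(calculator.run("1+1"));             // prints 2
        System.out.println(calculator.runAsync("2+3").join()); // prints 5
    }
}
```

A processor backed by an LLM would implement the same two methods, with `run()` issuing the HTTP request instead of doing arithmetic.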
Using this approach, we can conveniently add a new processor, such as one for the soon-to-be-publicly-available Google PaLM 2 API. Chaining different processors together also lets us work around some of the shortcomings of large language models (LLMs). For instance, when implementing a chatbot, if a user asks a mathematical question, we can use the LLM to parse the question into an input for our calculator and get an accurate answer, rather than letting the LLM compute the result directly.
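The chatbot example might be sketched as follows. The `then()` helper and both processors are hypothetical, and the "LLM" step is replaced by a regular expression so that the sketch runs offline; a real implementation would send a prompt to a model instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChainSketch {
    interface Processor<I, O> {
        O run(I input);

        // Hypothetical helper: feed this processor's output into the next one.
        default <R> Processor<I, R> then(Processor<O, R> next) {
            return input -> next.run(run(input));
        }
    }

    // Stand-in for an LLM call that extracts an arithmetic expression from a question.
    static final Processor<String, String> EXTRACT_EXPRESSION = question -> {
        Matcher m = Pattern.compile("(\\d+)\\s*\\+\\s*(\\d+)").matcher(question);
        if (!m.find()) {
            throw new IllegalArgumentException("No addition found in: " + question);
        }
        return m.group(1) + "+" + m.group(2);
    };

    // Deterministic calculator: the arithmetic never goes through the model.
    static final Processor<String, Integer> CALCULATOR = expression -> {
        String[] terms = expression.split("\\+");
        return Integer.parseInt(terms[0]) + Integer.parseInt(terms[1]);
    };

    public static void main(String[] args) {
        Processor<String, Integer> mathChain = EXTRACT_EXPRESSION.then(CALCULATOR);
        System.out.println(mathChain.run("What is 12 + 30?")); // prints 42
    }
}
```

The design point is that the model only translates the question into a structured input, while the deterministic processor produces the number.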
Note: The processor is the smallest computational unit in Langtorch, so a processor is generally only allowed to process a single task. For example, it could have the ability to handle text completion, chat completion, or generate images based on a prompt. If the requirements are complex, such as first generating a company's slogan through text completion, and then generating an image based on the slogan, this should be accomplished by chaining different processors together, rather than completing everything within a single processor.
As previously mentioned, the processor is the smallest container of a computational unit, and often it is not sufficient to handle all situations. We need to enhance the processor!
Here we introduce the concept of a Capability. If the processor is likened to a steam engine, then a Capability could be a steam train built on that engine, or an electricity generator built on it.
Imagine that you are implementing a chatbot. If the processor is based on OpenAI's API, sending each user input to the OpenAI GPT-4 model and returning its response, what would the user experience be like?
It would be poor: the chatbot could not refer back to anything said earlier in the conversation, because it does not incorporate chat history. In a capability, therefore, we can add memory (a simple implementation of memory is to put all conversation records into the request sent to OpenAI).
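A minimal sketch of such a memory-carrying capability is shown below. The `ChatWithMemory` class and its `chat()` method are hypothetical names, and the model is faked with a lambda that reports how many lines of context it received.

```java
import java.util.ArrayList;
import java.util.List;

class MemorySketch {
    interface Processor<I, O> {
        O run(I input);
    }

    /** Hypothetical capability: wraps a chat processor and sends the conversation so far. */
    static class ChatWithMemory {
        private final Processor<String, String> model;
        private final List<String> history = new ArrayList<>();

        ChatWithMemory(Processor<String, String> model) {
            this.model = model;
        }

        String chat(String userMessage) {
            history.add("User: " + userMessage);
            // Simplest possible memory: send the full transcript on every request.
            String reply = model.run(String.join("\n", history));
            history.add("Assistant: " + reply);
            return reply;
        }
    }

    public static void main(String[] args) {
        // Fake model that reports how many lines of context it was given.
        Processor<String, String> fakeModel =
                prompt -> "saw " + prompt.split("\n").length + " line(s)";
        ChatWithMemory bot = new ChatWithMemory(fakeModel);
        System.out.println(bot.chat("My name is Ada."));  // prints saw 1 line(s)
        System.out.println(bot.chat("What is my name?")); // prints saw 3 line(s)
    }
}
```

Sending the whole transcript grows the request with every turn, so a production capability would likely need to truncate or summarize older messages.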
To make capabilities easier to combine, we introduce the concept of a Node Adapter, and we refer to a composition of capability nodes as a Capability graph.
However, the Capability graph must be a Directed Acyclic Graph (DAG), i.e., no cycles are allowed.
The Node Adapter is primarily used for validation and optimization of the Capability graph. It wraps the capability and also includes some information about the Capability graph, such as the current node's globally unique ID, what the next nodes are, and so on.
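The validation role can be sketched with a depth-first cycle check. The `NodeAdapter` fields, `connectTo()`, and `isAcyclic()` are illustrative assumptions, not Langtorch's actual classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GraphSketch {
    /** Hypothetical node adapter: wraps a capability and records its ID and successors. */
    static class NodeAdapter {
        final String id;                 // globally unique node ID
        final Object capability;         // placeholder for the wrapped capability
        final List<NodeAdapter> next = new ArrayList<>();

        NodeAdapter(String id, Object capability) {
            this.id = id;
            this.capability = capability;
        }

        // Adds an edge from this node to the given node and returns that node.
        NodeAdapter connectTo(NodeAdapter node) {
            next.add(node);
            return node;
        }
    }

    /** Returns true if no cycle is reachable from the start node. */
    static boolean isAcyclic(NodeAdapter start) {
        return !hasCycle(start, new HashSet<>(), new HashSet<>());
    }

    // Depth-first search: a node seen again while still on the current path means a cycle.
    private static boolean hasCycle(NodeAdapter node, Set<String> onPath, Set<String> done) {
        if (onPath.contains(node.id)) return true;
        if (done.contains(node.id)) return false;
        onPath.add(node.id);
        for (NodeAdapter successor : node.next) {
            if (hasCycle(successor, onPath, done)) return true;
        }
        onPath.remove(node.id);
        done.add(node.id);
        return false;
    }

    public static void main(String[] args) {
        NodeAdapter parse = new NodeAdapter("parse", "LLM parser");
        NodeAdapter calc = new NodeAdapter("calc", "calculator");
        parse.connectTo(calc);
        System.out.println(isAcyclic(parse)); // prints true
        calc.connectTo(parse);                // introduces a cycle
        System.out.println(isAcyclic(parse)); // prints false
    }
}
```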