Agent as a service #16614
Replies: 3 comments
-
@qingtian1771 have you experimented with LangGraph? Got any insights?
-
@AdithiyaG, yes, I have learned LangGraph. I think it is great for special artificial intelligence (SAI), but we still need a general artificial intelligence (GAI). LangGraph is a state machine: a designed graph is suitable for handling a certain type of problem. However, a key problem is that you don't know what questions users will ask, so you must design an architecture that can handle them all. So the critical question is: how do we design an architecture for all kinds of questions? Maybe there are two kinds of methods.

Let's talk about the first solution. Before answering the user's question, we need to analyze it. Maybe we need to ask some of the following questions about the user's question:
All of the above problems need to be thought through before the agent answers the user's question. Problems vary greatly, and so do solutions, so the structure of the state machine should change as the problem changes. This method seems complicated and costly, and maybe it is not the best solution, but we can still explore how to dynamically build the graph, which is the structure of the state machine. At present, LangGraph graphs are hard-coded: the state-machine structure is fixed and represented by code. To make it dynamic, maybe we need three steps:
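The idea of building the state-machine structure at runtime can be sketched in plain Python. This is a hypothetical minimal illustration, not the LangGraph API; the node names and the "analysis" step are invented for the example:

```python
# Minimal sketch of a dynamically built state machine: nodes and edges
# are chosen at runtime based on an analysis of the user's question.

def analyze(state):
    # Illustrative "question analysis" step: decide which nodes
    # the graph needs for this particular question.
    state["plan"] = ["retrieve", "answer"] if "?" in state["question"] else ["answer"]
    return state

def retrieve(state):
    state["context"] = f"facts about: {state['question']}"
    return state

def answer(state):
    state["answer"] = f"answering '{state['question']}' using {state.get('context', 'no context')}"
    return state

NODES = {"analyze": analyze, "retrieve": retrieve, "answer": answer}

def run_dynamic_graph(question):
    # Step 1: always run the analysis node first.
    state = {"question": question}
    state = NODES["analyze"](state)
    # Step 2: wire the remaining nodes into a linear graph chosen at runtime.
    for node_name in state["plan"]:
        state = NODES[node_name](state)
    return state

result = run_dynamic_graph("What is a state machine?")
```

Here the graph topology is data (the `plan` list) rather than code, which is the essence of making the structure dynamic.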
Dynamic graphs are complicated, so let's talk about the second solution: creating a universal agent architecture. First, let's give an example to explain what an intelligent unit is. An intelligent unit looks like a tool, but it is actually a graph, a state machine. It can be registered on the web, and other intelligent units can use it. For example, a General Intelligent Unit (GIU, the general artificial intelligence mentioned above :-) ) can use one to help the user write a kind of document. The chat might look like the following:

User to GIU: Hello, I want to write a rent agreement.
(The GIU searches the web and finds an intelligent unit called "Document Writing Assistant" (DWA) that can help finish this task, so the GIU asks the DWA for help.)
GIU to DWA: The user asked me to write a rent agreement, could you please help?
(The user sends some example documents to the GIU, and the GIU forwards these documents to the DWA.)
GIU to DWA: The user has provided these example documents.
(When the user has answered all of the questions:)
GIU to DWA: The following are the questions and answers: {questions and answers}

During the entire communication process, the user is unaware of the existence of the DWA. The user just talks to the GIU and the problem is solved. @hwchase17 and @hinthornw, maybe have a look. Thanks!
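The discovery step in this exchange could be sketched as a registry of intelligent units. Everything here is a hypothetical illustration: the `IntelligentUnit` class, the capability tags, and the in-memory registry are assumptions, not an existing API:

```python
# Hypothetical registry where "intelligent units" (each a state machine
# behind a chat interface) register themselves, and a General Intelligent
# Unit (GIU) discovers them by capability.

class IntelligentUnit:
    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = set(capabilities)

    def chat(self, message):
        # A real unit would run its internal graph; this stub just echoes.
        return f"[{self.name}] handling: {message}"

REGISTRY = []

def register(unit):
    REGISTRY.append(unit)

def find_unit(capability):
    # GIU-side discovery: first registered unit advertising the capability.
    return next((u for u in REGISTRY if capability in u.capabilities), None)

register(IntelligentUnit("Document Writing Assistant", ["write_document"]))

dwa = find_unit("write_document")
reply = dwa.chat("user asks for a rent agreement; here are example documents")
```

In a real deployment the registry would be a web service and `chat` a network call, but the shape of the interaction is the same: the user only ever talks to the GIU.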
-
GAI and SAI
What will the artificial intelligence that the world finally presents look like? Do future artificial intelligences have anything in common? I think one of the common features is the use of artificial intelligence through dialogue. In the future, you can complete the following tasks through dialogue:
If the above tasks can be achieved through dialogue, do we need a lot of separate dialogue applications to achieve them? Or can we implement all functions through a single conversational application?
My idea is to implement all functions through a general conversation application, but in fact this general conversation application does not complete specific tasks itself. It just acts as a proxy: it forwards user requests to other specific conversation applications, and then sends the feedback from the specific conversation application back to the user.
In fact, this general conversational program is not just an agent; it does many other things. We can think of it as a personal assistant, and we can give it a name: general artificial intelligence (GAI). The opposite is special artificial intelligence (SAI), which is used to complete a specific task, for example designing a three-dimensional model.
General Artificial Intelligence (GAI) mainly accomplishes the following tasks:
Special artificial intelligence (SAI) mainly accomplishes the following tasks:
The conversation between GAI and SAI can be completed through a protocol. Let's first give an example to illustrate what a protocol is.
For example, with the FastAPI library in Python, a client (which can be a browser) accesses the interfaces exposed by a FastAPI application through the HTTP protocol. These interfaces can be GET, POST, PUT, DELETE, etc. for some resource. This is a kind of protocol.
Similarly, the conversation between GAI and SAI works like the above. GAI and SAI use language (analogous to the HTTP protocol) to talk, but not all language can be used: the SAI can only understand content related to its field (analogous to the interfaces exposed by the FastAPI application). The GAI completes a task through conversation with the SAI, such as the design of a three-dimensional model.
But the GAI doesn't know what to say to the SAI before the conversation. What actually happens is that the SAI tells the GAI how to talk to it; you can think of it as the SAI sending a prompt to the GAI. After the GAI gets this prompt, it knows how to talk to the SAI. During the conversation, the SAI continuously sends new prompts to the GAI according to the content of the conversation, to control the direction and content of the conversation.
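This prompt handshake could be sketched as a two-class message exchange. The message shapes and the SAI's scripted prompts are assumptions made for illustration; a real GAI would feed each received prompt to a language model:

```python
# Sketch of the "SAI steers GAI via prompts" handshake described above.

class SAI:
    """Special AI: tells the GAI what to ask the user next."""
    def __init__(self):
        self.steps = iter([
            "Ask the user what object they want to model.",
            "Ask the user for the main dimensions.",
            "Confirm the design and say the model is ready.",
        ])

    def next_prompt(self, gai_message):
        # On each turn the SAI returns a new prompt that controls
        # the direction and content of the conversation.
        return next(self.steps, None)

class GAI:
    """General AI: relays between the user and the SAI."""
    def __init__(self, sai):
        self.sai = sai

    def relay(self, user_message):
        prompt = self.sai.next_prompt(user_message)
        # A real GAI would pass `prompt` to an LLM; here we surface it.
        return prompt

gai = GAI(SAI())
turn1 = gai.relay("I want to design a mug")
turn2 = gai.relay("A coffee mug")
```

The key point the sketch shows: the control flow of the dialogue lives in the SAI, while the GAI only relays.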
The basic process is that the user talks to GAI, GAI talks to SAI, GAI feeds back the results of the dialogue to the user, and the user continues the dialogue with GAI.
The advantage of blocking users from direct conversations with the SAI is that it prevents users from sending the SAI things it cannot understand, which would cause confusion in the conversation.
How to design an SAI
Let's imagine an example to see how an SAI is designed. Let's use a 3D model design SAI: its function is to let users interactively design a 3D model, such as a mug or a bicycle.
Usually, users do interactive design with 3D design software such as Autodesk or SolidWorks. Users use the mouse and keyboard to work within the interface of the 3D design software. Behind this interaction there are actually commands, one by one. For example, adding an R3 fillet (a rounded corner of radius 3) to a certain part corresponds to a command.
Therefore, a three-dimensional model can correspond to a command sequence, and the result of executing this command sequence is the 3D model. So, "3D model = command sequence".
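The "3D model = command sequence" idea can be illustrated with a toy interpreter. The command names and the model representation are invented for the example; a real CAD kernel would be far richer:

```python
# Toy illustration of "3D model = command sequence": executing the
# sequence reproduces the model.

def execute(command_sequence):
    model = {"parts": [], "fillets": []}
    for cmd, *args in command_sequence:
        if cmd == "add_cylinder":
            # args: radius, height
            model["parts"].append(("cylinder", args[0], args[1]))
        elif cmd == "fillet":
            # args: corner radius, e.g. 3 for an R3 fillet
            model["fillets"].append(args[0])
    return model

# A mug body with an R3 fillet, expressed as a command sequence:
mug_commands = [
    ("add_cylinder", 40, 90),
    ("fillet", 3),
]

mug = execute(mug_commands)
```

Since the sequence fully determines the model, editing the model reduces to editing this text, which is exactly what the prompt template below asks the language model to do.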
This command sequence can be understood as a piece of text, which becomes the context of the conversation; a large language model can then be used to modify this text. The prompt template used for modification might be designed like this:
"
Modify the following model based on user input, and the output results are displayed in the result:
model: {command_sequence}
user input: {input}
result:
"
When using it, fill in the variables in the prompt template and send it to the large language model; then you can expect the model to output the desired result. But one problem is that there may be no such command sequences in the training data of the large language model, so the output may be incorrect. There may be two ways to deal with this problem:
Use RAG to get some examples and add them to the prompt.
The language model is fine-tuned to adapt to the generation of command sequences.
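The first option (RAG) could look like the following minimal sketch. The example store and the word-overlap scoring are illustrative stand-ins for a real vector store and embedding similarity:

```python
# Minimal sketch of the RAG option: retrieve the command-sequence
# examples most similar to the user request and prepend them to the
# prompt, so the model sees the command language before generating.

EXAMPLE_STORE = [
    ("add a fillet to the mug", "fillet 3"),
    ("make the cylinder taller", "edit_cylinder height=120"),
    ("add a handle", "add_handle r=25"),
]

def retrieve(query, k=2):
    # Score each stored example by word overlap with the query
    # (a real retriever would use embeddings instead).
    q = set(query.lower().split())
    scored = sorted(
        EXAMPLE_STORE,
        key=lambda ex: len(q & set(ex[0].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query):
    examples = retrieve(query)
    shots = "\n".join(f"request: {t}\ncommands: {c}" for t, c in examples)
    return f"Examples:\n{shots}\n\nrequest: {query}\ncommands:"

prompt = build_prompt("add a fillet to the handle")
```

The second option, fine-tuning on command sequences, removes the need for in-prompt examples but ties the model to one SAI's command language.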
The above is just a brief discussion of how to make an SAI; the actual process would be much more complicated. From a business perspective, there will be SAI providers in the future, perhaps a certain 3D design software manufacturer, who develop an SAI and put it online so that it can be discovered and used.
An SAI based on this architecture is actually a service. The service is used through dialogue, thus avoiding the complexity of an API or a software interface. Interacting with the SAI through the GAI ensures the accuracy of the interaction. The result is that the design of the SAI becomes simple, and it does not need to handle many abnormal situations.
As you can imagine, this Agent as a service approach can be used in a wide range of applications, and may even change the world.