Replace the file-manager and chunk-manager services by the NON and NDN functions of the CYFS stack #220

Open
3 tasks
weiqiushi opened this issue Apr 17, 2023 · 6 comments
Labels: App-Manager (The App Manager basic service), OOD-daemon (The OOD-daemon basic service), task (This is a task)

Comments

@weiqiushi
Member

Now that the NON and NDN features of the CYFS stack are largely available, we can start considering using them to replace the old file-manager and chunk-manager services.

  • Release the "final" version of file-manager and chunk-manager. This version inserts the data it receives into the CYFS stack as objects and chunks, and is designed to stay compatible with the old NamedDataClient.
  • Reimplement NamedDataClient on top of the CYFS stack interface, using either a built-in stack or a shared stack to upload data to the OOD's CYFS stack.

A few versions later:

  • Consider removing the file-manager and chunk-manager services.

@lurenpluto
Member

Modifications to the basic services need to be made very carefully and designed in advance, taking into account the series of compatibility issues that an upgrade brings. A few questions that come to mind so far:

  • For the old local files and chunks, will they no longer be recognized after upgrading to the new version of the services? Does someone need to republish the data, or will there be built-in data migration logic? Or is the old data unimportant and safe to discard?

  • After switching to cyfs-stack's NON/NDN, how will the uploaded files be tracked and managed? Especially once the GC mechanism is introduced, these files need to be associated with global-state so that they are not recycled, so the solution design needs to be completed.

@weiqiushi
Member Author

Question 1: The "final" version of file-manager and chunk-manager will have a one-time merge logic that will reinsert all the old local data into the local CYFS stack and delete the original independent local data after the successful insertion. After that, all get requests will be redirected to the NON and NDN interfaces of the CYFS stack.
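The one-time merge described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ToyStack`, `put_chunk`, `get_chunk` and `migrate_legacy_store` are hypothetical names, not the CYFS stack API, and a plain SHA-256 digest stands in for a real ChunkId.

```python
import hashlib


class ToyStack:
    """Hypothetical in-memory stand-in for the local CYFS stack storage."""

    def __init__(self):
        self.chunks = {}

    def put_chunk(self, data: bytes) -> str:
        # Assumption: a SHA-256 hex digest plays the role of the ChunkId.
        chunk_id = hashlib.sha256(data).hexdigest()
        self.chunks[chunk_id] = data
        return chunk_id

    def get_chunk(self, chunk_id: str):
        return self.chunks.get(chunk_id)


def migrate_legacy_store(legacy: dict, stack: ToyStack) -> dict:
    """Reinsert every legacy entry into the stack.

    A legacy entry is deleted only after the insert has been confirmed
    by reading the data back. Returns a map of legacy key -> new id.
    """
    migrated = {}
    for key in list(legacy):  # copy the keys, since we delete while iterating
        data = legacy[key]
        chunk_id = stack.put_chunk(data)
        if stack.get_chunk(chunk_id) == data:
            del legacy[key]
            migrated[key] = chunk_id
    return migrated
```

Because entries are removed only after a confirmed insert, an interrupted run can simply be repeated: whatever is still in the legacy store is picked up again, and data already migrated is not touched.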

Question 2: The upload logic of the new NamedDataClient will follow the upload logic of CYFS-tool: it will associate the uploaded ObjectId, with system permission, to a fixed path in the root-state to ensure that it is not reclaimed by GC. The merge logic of file-manager will also associate the inserted object ids with the same path.
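To make the relationship between the fixed root-state path and GC concrete, here is a minimal toy model. Everything in it is an assumption for illustration: `ToyRootState`, `pin`, `collect_garbage` and the path `/upload` are hypothetical and do not reflect the real root-state or GC interfaces.

```python
class ToyRootState:
    """Hypothetical model: each path holds a set of pinned object ids."""

    def __init__(self):
        self.paths = {}

    def pin(self, path: str, object_id: str) -> None:
        self.paths.setdefault(path, set()).add(object_id)

    def reachable(self) -> set:
        # Every object id referenced from any path counts as live.
        return set().union(*self.paths.values())


def collect_garbage(store: dict, root_state: ToyRootState) -> list:
    """Delete every stored object that no root-state path references."""
    live = root_state.reachable()
    dead = sorted(oid for oid in store if oid not in live)
    for oid in dead:
        del store[oid]
    return dead


# Assumption: a single fixed path shared by uploads and the merge logic.
HYPOTHETICAL_UPLOAD_PATH = "/upload"
```

In this model an object survives a GC pass only while some path still references it, which is why both the upload logic and the merge logic need to bind their object ids to the same path.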

@lurenpluto
Member

Question 1: The "final" version of file-manager and chunk-manager will have a one-time merge logic that will reinsert all the old local data into the local CYFS stack and delete the original independent local data after the successful insertion. After that, all get requests will be redirected to the NON and NDN interfaces of the CYFS stack.

Question 2: The upload logic of the new NamedDataClient will follow the upload logic of CYFS-tool and will associate the uploaded ObjectId with system permission to a fixed path in the root-state to ensure that it will not be reclaimed by GC. The merge logic of file-manager will also associate the inserted object ids to the same path

I have a few questions in mind about how the data uploaded through CYFS-tool is managed, including:

1. Where to store the files for each upload

If part of a file's content is modified locally and the file is uploaded again with the same command, does the current logic overwrite the existing file, or generate a new one because the FileId (ObjectId) has changed? Is there any similar overwriting logic?

2. Data validity management

Does uploaded data exist forever? That is, does the "upload directory" or chunk-cache on the OOD (where should the new version store it?) keep growing indefinitely?

If you attach a FileId/ChunkId to root-state, you need to consider lifecycle management. Especially for multiple versions of the same data, you need a strategy that makes it easy to manage, for example:

When data is attached to the root-state, it is associated by name instead of by ObjectId, so that when data with the same name is uploaded multiple times, only the last version is kept and the rest are recycled and deleted by GC. If management is always based on ObjectId, it feels very unfriendly, because from the user's point of view it is difficult to manage data by object-id.
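A small sketch of the name-based strategy, to show why re-uploading under the same name leaves the older version unreferenced. `NamedBindings` and its methods are hypothetical, and a SHA-256 digest stands in for the ObjectId.

```python
import hashlib


def object_id(data: bytes) -> str:
    # Assumption: a content hash stands in for the real ObjectId.
    return hashlib.sha256(data).hexdigest()


class NamedBindings:
    """Hypothetical name -> object id table, one binding per name."""

    def __init__(self):
        self.by_name = {}
        self.store = {}

    def upload(self, name: str, data: bytes) -> str:
        oid = object_id(data)
        self.store[oid] = data
        # Rebinding the name drops the reference to the previous version.
        self.by_name[name] = oid
        return oid

    def gc(self) -> list:
        """Remove stored objects that no name refers to any more."""
        live = set(self.by_name.values())
        dead = [oid for oid in self.store if oid not in live]
        for oid in dead:
            del self.store[oid]
        return dead
```

With bindings keyed by ObjectId instead, every upload would add a new entry and nothing would ever become unreferenced, which is the growth problem raised above.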

3. Whether CYFS-tool can delete the uploaded data

Can the tool currently delete specified data? If so, how does it work from the user's point of view? Do we also need an interface to query the list of uploaded data? I feel there needs to be an easy and error-free way to manage and delete data.

@weiqiushi
Member Author

As a conceptual implementation of the cyfs-tool upload command, NamedDataClient serves the same purpose as upload: converting local data into data that can be streamed in the CYFS network, where it can be used by other components.

They differ in that:

  • cyfs-tool relies on the local runtime stack and uses the trans publish and download interfaces for uploads: it first publishes the relevant files on the runtime and then downloads them on the OOD side.
  • NamedDataClient may rely on a self-built CYFS stack. Its implementation will hook into the existing logic, directly using the NON and NDN interfaces of the protocol stack: it generates objects locally and then uploads the chunk data directly to the OOD.

The reason for the difference in logic between these two components is that NamedDataClient needs to implement both upload and download, and needs to support subfolder downloads, which the trans interface does not currently support. The implementation above was chosen because it achieves this functionality with minimal modifications.

Data storage location

On the OOD side, the data will be stored in the chunk cache as a unit. If a file is modified and uploaded again, a new FileId is generated and it is treated as a new file. Chunk overwriting and reuse logic is handled by the new chunk-manager component built into the CYFS stack.
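The behaviour described here (a modified file gets a new FileId while unchanged chunks can be reused) follows from content addressing. The sketch below is a simplified illustration, not the actual CYFS id derivation: the chunk size, `chunk_ids` and `file_id` are assumptions, and a real FileId is computed from more than the chunk list.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny chunks, chosen only to keep the example small


def chunk_ids(data: bytes) -> list:
    """Split data into fixed-size chunks and hash each chunk."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]


def file_id(data: bytes) -> str:
    """Derive a file id from the ordered list of chunk ids (simplified)."""
    joined = "".join(chunk_ids(data)).encode()
    return hashlib.sha256(joined).hexdigest()


old = b"aaaabbbbcccc"
new = b"aaaabbbbdddd"  # only the last chunk was modified

# Chunk ids that appear in both versions and therefore need no re-upload.
shared = set(chunk_ids(old)) & set(chunk_ids(new))
```

Here the two versions get different file ids, while the two unchanged chunks keep their ids and could be served from the existing chunk cache.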

Data management logic

When a file/folder is uploaded, it becomes part of the CYFS network and is managed via ObjectId. This is the design principle of these two tools. In most cases they will not be used by regular users, but by DecApp developers as part of a development process or some kind of integrated development tool. Their output will be used as input for other parts.

Currently, neither of these tools is designed for delete operations.
When the CYFS stack implements the GC logic for chunks and objects, both tools will get a simple deletion process: enter the object id or CYFS o-link, and the object bound to the corresponding path on the root-state is deleted. The GC logic will then handle the rest.
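A rough sketch of that deletion step, under loud assumptions: the root-state is modelled as a plain dict of path -> object id, the o-link is assumed to end with the object id as its last path segment, and `extract_object_id` and `unbind` are hypothetical helpers rather than cyfs-tool commands.

```python
def extract_object_id(ref: str) -> str:
    """Accept either a bare object id or an o-link.

    Assumption: an o-link starts with "cyfs://o/" and its last path
    segment is the object id.
    """
    if ref.startswith("cyfs://o/"):
        return ref.rstrip("/").split("/")[-1]
    return ref


def unbind(root_state: dict, ref: str) -> list:
    """Remove every root-state path bound to the referenced object.

    Returns the removed paths. Reclaiming the object data itself is
    left to a later GC pass.
    """
    oid = extract_object_id(ref)
    removed = [path for path, bound in root_state.items() if bound == oid]
    for path in removed:
        del root_state[path]
    return removed
```

The tool only removes the binding; it never deletes chunk or object data directly, so the stack's GC remains the single place where storage is reclaimed.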

@lurenpluto
Member

lurenpluto commented Apr 18, 2023

If the tool is only based on the object-id layer for data management, it will be cumbersome to use, so let's go back to the original requirements and look at some issues:

1. The design purpose of cyfs-tool

What is the purpose of cyfs-tool, and what are the needs of users and developers? What are some typical usage scenarios?

2. How to access and use the published data?

Through o-links or r-links? Once attached to the root-state, the data can be accessed by r-link.

3. From the user's point of view, how to manage multiple versions of the same data?

Theoretically, this is a very common requirement, and versions can be managed through name + r-link, that is, only one copy of the data with a given name can exist on the root-state.
But if we simply keep an object-id and access the data only through o-link, then versioning is extremely difficult for the user and a burden for the cyfs-stack (no GC).

@weiqiushi
Member Author

  1. The main purpose of this tool is to upload a local file to the OOD, where it can then be used by other components. The envisioned scenarios are usually:
    • Upload a local image to the OOD and then fill the o-link into the Body icon field of the People object, to be displayed as the avatar of this People
    • Similar to the previous scenario, upload an icon and use its o-link as the icon of a DecApp
    • Upload a static resource and then use this resource in a cyfs page with

In the absence of other products, cyfs-tool serves as a "basic" solution, allowing users to complete the first step of interaction with the CYFS stack.

  2. In the case of cyfs-tool, the o-link is usually used. The r-link generated at the same time is just a by-product to avoid GC, and does not represent the real value of r-link.

  3. From the perspective of the base tool, there is no such thing as multiple versions of a piece of data. Different data have different ObjectIds, which also most directly reflects the features of the CYFS object protocol.
    I believe that as the CYFS protocol develops, more functional and user-friendly products will make it easier for ordinary users to use CYFS, for example a web3 web disk, web3 IM, etc.

weiqiushi added a commit that referenced this issue Apr 19, 2023
… redirect get/set request to stack non functions
weiqiushi added a commit that referenced this issue Apr 19, 2023
…redirect get/put request to stack ndn function
Projects
Status: 💬To Discuss

2 participants