Yet Another Distributed File System (YADFS)

This distributed file system is designed to efficiently store, manage, and retrieve large files across multiple machines in a network.

High-Level Architecture

YADFS separates metadata management from data storage: Name Nodes manage the metadata, and Data Nodes store the data itself. The key components are:

Name Nodes

Manages metadata about files and directories.
Maintains a namespace hierarchy and file-to-block mapping.
Monitors the health and availability of Data Nodes.
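
As an illustration of the metadata described above, here is a minimal sketch of how a Name Node might hold the namespace and the file-to-block mapping in memory. The class name `NameNodeMetadata`, its fields, and `add_file` are hypothetical and not taken from the YADFS code; the sketch assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class NameNodeMetadata:
    """Hypothetical in-memory metadata store (illustrative names only)."""

    # Directory path -> names of its direct children (the namespace hierarchy).
    namespace: dict[str, set[str]] = field(default_factory=lambda: {"/": set()})
    # File path -> ordered list of block IDs (the file-to-block mapping).
    file_blocks: dict[str, list[str]] = field(default_factory=dict)
    # Block ID -> IDs of the Data Nodes that hold a replica of the block.
    block_locations: dict[str, list[str]] = field(default_factory=dict)

    def add_file(self, directory, name, blocks):
        """Register a file; `blocks` maps block ID -> Data Node IDs, in file order."""
        if directory not in self.namespace:
            raise FileNotFoundError(directory)
        path = directory.rstrip("/") + "/" + name
        self.namespace[directory].add(name)
        self.file_blocks[path] = list(blocks)   # keys keep insertion order
        self.block_locations.update(blocks)
        return path
```

Keeping block locations in a separate mapping means a replica can move between Data Nodes without touching the file entry.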

Availability Check

Periodically pings each Data Node.
Data Nodes acknowledge the Name Node ping.
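
The ping-and-acknowledge cycle could be sketched as follows. This is an assumption-laden illustration rather than the project's implementation: `AvailabilityMonitor`, the `ping` callable, and the 10-second timeout are all hypothetical, and the transport used for the ping is left to the caller.

```python
import time


class AvailabilityMonitor:
    """Hypothetical tracker of Data Node acknowledgements on the Name Node."""

    def __init__(self, ping, timeout_s=10.0, clock=time.monotonic):
        self.ping = ping            # callable(node_id) -> True if the node acknowledged
        self.timeout_s = timeout_s  # assumed value; the README does not specify one
        self.clock = clock
        self.last_ack = {}          # node_id -> time of the last acknowledged ping

    def register(self, node_id):
        self.last_ack[node_id] = self.clock()

    def check_all(self):
        """Ping every registered node once and record the acknowledgements."""
        for node_id in self.last_ack:
            try:
                acked = self.ping(node_id)
            except OSError:
                acked = False       # an unreachable node counts as no acknowledgement
            if acked:
                self.last_ack[node_id] = self.clock()

    def healthy_nodes(self):
        """Nodes whose last acknowledgement is within the timeout."""
        now = self.clock()
        return sorted(n for n, t in self.last_ack.items() if now - t <= self.timeout_s)
```

Calling `check_all()` on a timer gives the periodic behaviour; injecting `clock` keeps the logic testable without sleeping.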

Data Nodes

Handle the reading and writing of data.
Provide an API for read and write operations on data blocks.
Replicate each block across Data Nodes with a replication factor of 3.
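
A block read/write API of this kind might look like the sketch below, which stores each block as one file in a local directory. `BlockStore` and `write_replicated` are hypothetical names, and writing to the first three stores is a simplifying assumption; the README does not say how replica targets are chosen.

```python
from pathlib import Path


class BlockStore:
    """Hypothetical on-disk block store for a single Data Node."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def write_block(self, block_id, data):
        (self.root / block_id).write_bytes(data)

    def read_block(self, block_id):
        return (self.root / block_id).read_bytes()

    def has_block(self, block_id):
        return (self.root / block_id).is_file()


def write_replicated(stores, block_id, data, replication_factor=3):
    """Write one block to the first `replication_factor` stores and return them."""
    if len(stores) < replication_factor:
        raise ValueError("not enough Data Nodes for the requested replication factor")
    targets = stores[:replication_factor]
    for store in targets:
        store.write_block(block_id, data)
    return targets
```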

Organizing Data in Data Nodes

Data is stored in the form of files within folders.
At least one root folder exists.
Users can view a virtual tree of all folders and files.
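
One way to produce such a virtual tree from a flat list of paths is sketched here. The function name `render_tree` and the two-space indentation are illustrative choices; the sketch does not distinguish files from empty folders.

```python
def render_tree(paths, root="/"):
    """Render absolute paths as an indented tree, one entry per line."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            if part:
                node = node.setdefault(part, {})

    lines = [root]

    def walk(node, depth):
        # Children are listed alphabetically, indented by their depth.
        for name in sorted(node):
            lines.append("  " * depth + name)
            walk(node[name], depth + 1)

    walk(tree, 1)
    return "\n".join(lines)


print(render_tree(["/docs/a.txt", "/docs/b.txt", "/img"]))
```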

Note on Data Blocks

Data blocks are used to store and manage large files efficiently.
Each file is divided into fixed-size blocks distributed across Data Nodes.
Metadata tracks the location of each block.
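
To make the fixed-size split concrete, the sketch below computes the byte range of each block. A file of size S with block size B yields ceil(S / B) blocks, and only the last one may be shorter. `block_ranges` is a hypothetical helper, and the block size is left as a parameter because the README does not state one.

```python
def block_ranges(file_size, block_size):
    """Return (offset, length) for every block of a file of `file_size` bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]
```

For example, a 10-byte file with 4-byte blocks gives three ranges: two full blocks and a final block of 2 bytes.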

Fault Tolerance

Implements mechanisms to handle Data Node failures.
Maintains multiple replicas of data blocks.
Detects failed nodes and redistributes data blocks to healthy nodes.
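
The redistribution step could be planned roughly as follows once a node is known to have failed. This is a sketch under assumptions: `plan_rereplication` is a hypothetical function, targets are picked in list order, and blocks with no surviving replica are simply skipped.

```python
def plan_rereplication(block_locations, failed_node, healthy_nodes, replication_factor=3):
    """Map each under-replicated block to (source node, new target nodes)."""
    plan = {}
    for block_id, holders in block_locations.items():
        survivors = [n for n in holders if n != failed_node]
        missing = replication_factor - len(survivors)
        if missing <= 0 or not survivors:
            continue  # fully replicated, or no copy left to replicate from
        candidates = [
            n for n in healthy_nodes if n != failed_node and n not in survivors
        ]
        targets = candidates[:missing]
        if targets:
            plan[block_id] = (survivors[0], targets)
    return plan
```

The plan only names a source and targets; copying the block and updating the metadata would follow as separate steps.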

Client Interaction and Features

Clients interact with YADFS through a command-line or web interface that supports the following actions:

Metadata Operations

Create, delete, move, and copy directories and files.
List files and directories within a directory.
Traverse directories.
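
These operations can be sketched on a nested dictionary, where folders are dictionaries and files are leaf entries. The `Namespace` class below is hypothetical and handles only the simple cases; for instance, it does not guard against moving a folder into itself.

```python
from copy import deepcopy


class Namespace:
    """Hypothetical namespace: folders are dicts, files are lists of block IDs."""

    def __init__(self):
        self.root = {}

    def _parent_and_name(self, path):
        parts = [p for p in path.split("/") if p]
        if not parts:
            raise ValueError("the root folder cannot be modified")
        node = self.root
        for part in parts[:-1]:
            node = node[part]                    # KeyError if a folder is missing
            if not isinstance(node, dict):
                raise NotADirectoryError(part)
        return node, parts[-1]

    def create(self, path, entry=None):
        """Create a folder (entry=None) or a file entry at `path`."""
        parent, name = self._parent_and_name(path)
        if name in parent:
            raise FileExistsError(path)
        parent[name] = {} if entry is None else entry

    def delete(self, path):
        parent, name = self._parent_and_name(path)
        del parent[name]

    def copy(self, src, dst):
        parent, name = self._parent_and_name(src)
        self.create(dst, deepcopy(parent[name]))

    def move(self, src, dst):
        self.copy(src, dst)
        self.delete(src)

    def list(self, path="/"):
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node[part]
        return sorted(node)
```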

DFS Operations

Upload and download files from YADFS.

Process Flows

Upload Process Flow

File Splitting: The client divides the file into fixed-size blocks.
Block Creation: The client assigns a unique identifier to each block.
Uploading Blocks: The client sends the data blocks to Data Nodes.
Replication: The system creates replicas of each block for fault tolerance.
Metadata Update: The client updates the metadata with the file information.
Namespace Resolution: The client and the file system determine where each block is stored.
Client Acknowledgment: The client receives acknowledgements from the Data Nodes.
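
The upload steps above can be sketched end to end with in-memory stand-ins for the Data Nodes. Everything here is illustrative: `upload` is a hypothetical function, block IDs come from `uuid.uuid4()`, and replicas are placed round-robin, none of which is confirmed by the README.

```python
import uuid


def upload(data, data_nodes, block_size=4, replication_factor=3):
    """Split `data`, store each block on several nodes, return the block list.

    `data_nodes` maps node ID -> dict of block ID -> bytes (an in-memory stand-in).
    The result is an ordered list of (block_id, [node IDs]) for the metadata update.
    """
    node_ids = sorted(data_nodes)
    if len(node_ids) < replication_factor:
        raise ValueError("not enough Data Nodes for the requested replication factor")
    blocks = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        block_id = uuid.uuid4().hex                      # unique identifier per block
        chunk = data[offset:offset + block_size]
        # Round-robin placement: consecutive nodes, starting one further per block.
        targets = [node_ids[(index + r) % len(node_ids)] for r in range(replication_factor)]
        for node_id in targets:
            data_nodes[node_id][block_id] = chunk
        blocks.append((block_id, targets))
    return blocks
```

The returned list is what a client would hand to the Name Node so that the file-to-block mapping and block locations can be recorded.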

Download Process Flow

Client Request: The client requests a file download from the DFS.
Metadata Retrieval: The client retrieves the file information from the Name Node.
Block Location Retrieval: The client learns the locations of the data blocks.
Data Block Retrieval: The client requests the data blocks from the Data Nodes.
Data Transfer: The Data Nodes transfer the blocks to the client.
Reassembly: The client reassembles the blocks into the original file.
File Completion Check: The client checks that all data blocks were retrieved successfully.
Cleanup: The client may delete temporary data and close connections.
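
A minimal sketch of the retrieval, completion check, and reassembly steps follows. The `download` function and the `fetch` callable are hypothetical, and falling back to the next replica when a node cannot serve a block is an assumption that the README does not spell out.

```python
def download(blocks, fetch):
    """Fetch every block in order and join them into the original bytes.

    `blocks` is an ordered list of (block_id, [node IDs]) from the Name Node.
    `fetch(node_id, block_id)` returns the block bytes or raises KeyError/OSError.
    """
    parts = []
    for block_id, node_ids in blocks:
        for node_id in node_ids:
            try:
                parts.append(fetch(node_id, block_id))
                break                                   # block retrieved, move on
            except (KeyError, OSError):
                continue                                # try the next replica
        else:
            # Completion check: no replica could supply this block.
            raise OSError(f"block {block_id} could not be retrieved from any replica")
    return b"".join(parts)
```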

Metadata Operation Process Flow

Client Request: The client sends an operation request to the Name Node.
Name Node Verification: The Name Node verifies the validity of the operation.
Name Node Operation: The Name Node performs the operation and updates the metadata.
Client Response: The client receives a response from the Name Node with the status of the operation.
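
The request, verification, operation, and response cycle might be sketched as a single handler. The request and response shapes, the `handle_request` name, and the two supported operations (`mkdir`, `rmdir`) are assumptions made for illustration only.

```python
import posixpath


def handle_request(directories, request):
    """Verify and apply one metadata request; return a status dictionary.

    `directories` is the set of existing folder paths and always contains "/".
    """
    op = request.get("op")
    path = request.get("path", "").rstrip("/")

    # Verification: reject unknown operations and invalid paths.
    if op not in {"mkdir", "rmdir"}:
        return {"ok": False, "error": f"unsupported operation: {op}"}
    if not path.startswith("/"):
        return {"ok": False, "error": "path must be absolute and not the root"}

    # Operation: update the metadata only when the request is valid.
    if op == "mkdir":
        if path in directories:
            return {"ok": False, "error": "folder already exists"}
        if posixpath.dirname(path) not in directories:
            return {"ok": False, "error": "parent folder does not exist"}
        directories.add(path)
    else:
        if path not in directories:
            return {"ok": False, "error": "no such folder"}
        if any(d.startswith(path + "/") for d in directories):
            return {"ok": False, "error": "folder is not empty"}
        directories.remove(path)

    # Response: report the outcome to the client.
    return {"ok": True, "path": path}
```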
