The Google File System (GFS) was developed by Google to meet its large-scale data-processing needs. The Hadoop Distributed File System (HDFS) was originally developed at Yahoo! Inc. and is now maintained as open source by the Apache Software Foundation. HDFS was designed based on the published descriptions of GFS and Google's MapReduce.

As Internet data grew rapidly, a way to store this incoming flood of data was needed, so Google built the GFS distributed file system; HDFS was later developed to serve similar needs outside Google. Both run on commodity hardware, so component failures are frequent. To keep the systems reliable, data is replicated across multiple nodes; by default, each block is stored as 3 replicas. These file systems commonly hold millions of files, many of them very large. Data is read far more often than it is written, and the workload mixes large streaming reads with smaller occasional reads.

How GFS works: GFS consists of a single master node and many chunkservers, accessed by multiple clients. To read data, a client sends the master the file name and the index of the chunk it needs, and the master replies with the location of that chunk. The master stores the name...
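The read path described above (client sends a file name and chunk index; the master answers with the chunk's replica locations) can be sketched roughly as below. This is an illustrative model, not the actual GFS API: the `Master` class, its in-memory tables, and the server addresses are all assumptions made for the example.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses fixed-size 64 MB chunks

class Master:
    """Toy GFS master: keeps file-to-chunk mappings and chunk locations in memory."""

    def __init__(self):
        self.file_chunks = {}      # file name -> list of chunk handles
        self.chunk_locations = {}  # chunk handle -> list of chunkserver addresses

    def add_chunk(self, filename, handle, replicas):
        # Each chunk is stored on several chunkservers (3 replicas by default).
        self.file_chunks.setdefault(filename, []).append(handle)
        self.chunk_locations[handle] = replicas

    def lookup(self, filename, byte_offset):
        """Client request: translate (file name, offset) into a chunk index,
        then return the chunk handle and the chunkservers holding it."""
        chunk_index = byte_offset // CHUNK_SIZE
        handle = self.file_chunks[filename][chunk_index]
        return handle, self.chunk_locations[handle]

master = Master()
# Hypothetical file with one chunk replicated on three chunkservers.
master.add_chunk("/logs/web.log", "chunk-a1", ["cs1:7000", "cs2:7000", "cs3:7000"])

# A read at byte offset 10 MB falls in chunk index 0.
handle, locations = master.lookup("/logs/web.log", 10 * 1024 * 1024)
print(handle, locations)
```

After this lookup the client contacts one of the returned chunkservers directly for the data, so the master stays out of the data path and only serves metadata.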