Q> What is HDFS Block size? How is it different from traditional file system block size? In HDFS, data is split into blocks that are distributed across multiple nodes in the cluster. Each block is typically 64 MB or 128 MB in size (64 MB was the default in Hadoop 1.x; 128 MB in Hadoop 2.x and later). Traditional file systems use much smaller blocks (typically around 4 KB); HDFS uses large blocks to keep the NameNode's per-block metadata small and to minimize seek overhead when streaming large files. |
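A file's block layout can be inspected with fsck, and the block size can be overridden per upload; a minimal sketch, assuming a hypothetical path /user/data/big.log:

```shell
# Show the blocks that make up an existing file (hypothetical path)
hdfs fsck /user/data/big.log -files -blocks

# Upload a file with a non-default block size (256 MB, given in bytes)
# for just this write, without changing the cluster-wide default
hdfs dfs -D dfs.blocksize=268435456 -put big.log /user/data/big.log
```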
||||||||||||||||||
Q> How can you transfer or copy files from one node to another node in a Hadoop cluster? distcp (distributed copy) is a tool used for large inter- and intra-cluster copying; it runs as a MapReduce job, so the copy is performed in parallel across the cluster.
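Typical distcp invocations look like the following sketch; the NameNode hosts, ports, and paths are hypothetical:

```shell
# Inter-cluster copy: source and destination NameNodes are hypothetical
hadoop distcp hdfs://nn1:8020/user/data hdfs://nn2:8020/user/data

# Intra-cluster copy of a directory to a backup location
hadoop distcp /user/data /user/backup
```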
|
||||||||||||||||||
Q> How to change the block size of a file that already exists in the cluster? The block size of an existing file cannot be changed in place; the file must be rewritten. distcp (distributed copy) can be used to copy the file with a new block size.
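By default distcp writes new files with the destination's block-size settings (pass -pb to preserve the source block size instead), so an override on the command line takes effect; a sketch with hypothetical paths:

```shell
# Rewrite a file with a 256 MB block size (value in bytes); the new
# copy can then replace the original
hadoop distcp -D dfs.blocksize=268435456 /user/data/big.log /user/data/big_256m.log
```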
|
||||||||||||||||||
Q> How to set the replication factor for one file when it is uploaded by ‘hdfs dfs -put’ command in HDFS? hdfs dfs -D dfs.replication=N -put <source_url> <destination_url>
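A concrete sketch with a hypothetical file and replication factor of 2; -stat %r can then confirm the setting took effect:

```shell
# Upload with replication factor 2 for this file only
hdfs dfs -D dfs.replication=2 -put report.csv /user/data/report.csv

# Verify: %r prints the file's replication factor
hdfs dfs -stat %r /user/data/report.csv
```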
|
||||||||||||||||||
Q> How to change replication factor of existing files in HDFS OR How do you overwrite replication factor of an existing file? Use hdfs dfs -setrep.
Ex. To set replication of an individual file to 4: hdfs dfs -setrep -w 4 /path/to/file
Ex. To change replication of a particular directory to 2 recursively: hdfs dfs -setrep -R -w 2 /path/to/dir
Ex. You can also do this recursively to change replication of entire HDFS to 1: hdfs dfs -setrep -R -w 1 /
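The -w flag makes -setrep wait until the NameNode reports the target replication as achieved, which can take a long time on large trees; a sketch with a hypothetical path:

```shell
# Set replication of a single file to 4 and block until re-replication
# has completed
hdfs dfs -setrep -w 4 /user/data/big.log

# Without -w the command returns immediately and replication proceeds
# in the background; progress can be checked with fsck
hdfs fsck /user/data/big.log -files -blocks
```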
|
||||||||||||||||||
Q> Find version of Java, Hadoop, Hive, Pig, Sqoop, HBase, Spark, Oozie, Impala Hadoop: hadoop version
Sqoop: sqoop version
HBase: hbase version
Oozie: oozie version
Java: java -version
Hive: hive --version
Pig: pig --version
Impala: impala-shell --version
Spark: spark-submit --version
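The checks above can be wrapped in a small shell sketch that skips tools missing from the PATH; the flags are the standard version flags for each CLI, and nothing else is assumed about the host:

```shell
# Print the first line of each tool's version output, or "not installed"
# if the tool is not on the PATH.
report() {
  name=$1; shift
  if command -v "$1" >/dev/null 2>&1; then
    # java -version and some others print to stderr, hence 2>&1
    printf '%s: ' "$name"; "$@" 2>&1 | head -n 1
  else
    printf '%s: not installed\n' "$name"
  fi
}
report Hadoop hadoop version
report Sqoop  sqoop version
report HBase  hbase version
report Oozie  oozie version
report Java   java -version
report Hive   hive --version
report Pig    pig --version
report Impala impala-shell --version
report Spark  spark-submit --version
```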
|