How is indexing done in Hadoop HDFS?
What is throughput? How does HDFS provide good throughput?
What is "Non DFS Used" in the HDFS web console?
What is the benefit of the distributed cache? Why can't we just keep the file in HDFS and have the application read it?
What is the difference between HDFS and HBase?
What happens if a block in Hadoop HDFS is corrupted?
What is data integrity? How does HDFS ensure the integrity of the data blocks it stores?
Replication causes data redundancy and consumes a lot of space, so why is it used in HDFS?
If a particular file is 50 MB, will its HDFS block still consume the full 64 MB default block size?
What is a block in HDFS? What is the default block size in Hadoop 1 and Hadoop 2? Can we change the block size?
What is a rack awareness algorithm?
What tools are available to send streaming data to HDFS?
How do you use the HDFS put command to transfer data from Flume to HDFS?
What is the optimal block size in HDFS?
What are file permissions in HDFS? How does HDFS check permissions for files and directories?