Should I install Spark on all nodes of a YARN cluster?
Explain JobConf in MapReduce.
Do we need to enter a password even if the key has been added to SSH?
You have a file personal_data.txt in an HDFS directory with 100 records. You want to see only the first 5 records from the personal_data.txt file. How will you do this?
What is an accumulator in Spark?
What are views in Hive?
Is Hive a NoSQL database?
Give me some examples of columnar databases.
Which command is used to list all the tables in a database or list all the columns in a table?
What are the differences between Pig and Hive?
What are the exact differences between the reduce and fold operations in Spark?
What are the benefits of block transfer?
Suppose a file of size 514 MB is stored in HDFS (Hadoop 2.x) using the default block size and the default replication factor. How many blocks will be created in total, and what will be the size of each block?
How is Apache Spark better than Hadoop?
What is the use of a DataFrame in Spark?