Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Is it possible to run real-time analysis directly on the big data collected by Flume? If yes, explain how.
What does the following query do? INSERT OVERWRITE TABLE employees PARTITION (country, state) SELECT ..., se.cnty, se.st FROM staged_employees se;
While loading data into a Hive table using the LOAD DATA clause, how do you specify that it is an HDFS file and not a local file?
What is the difference between Hive and Spark?
What are the other components of Ambari that are important for automation and integration?
What is Nagios used for in Ambari?
What is a local repository and when will you use it?
Replication causes data redundancy and consumes a lot of space, so why is it used in HDFS?
Why is Spark popular?
Describe the Impala shell (the impala-shell command).
What is a broker?
Explain the terms Spark partitions and partitioners.
What is Ganglia used for in Ambari?
Explain the major libraries that constitute the Spark ecosystem.
How do I get Apache Spark on Windows 10?
What problems can be addressed by using ZooKeeper?
What is FlumeNG?
What do you understand by the term "straggler"?