Explain the execution plan of a Pig script, or: what is the difference between the logical and physical plans of an Apache Pig script?
Is Impala intended to handle real-time queries in low-latency applications, or is it meant for ad hoc queries used in data exploration?
How can you import only a subset of rows from a table?
What features were added in Ambari version 2.6.2?
Define Cassandra?
Explain the process to trigger automatic clean-up in Spark to manage accumulated metadata.
What is the difference between an RDBMS and Hadoop?
What is the full form of MSLAB?
Is it possible to add 100 more nodes when we already have 100 nodes in Hive?
What are the features of Apache Mahout?
How does Spark work with Python?
What is a CQL collection in Cassandra?
How does HDFS ensure the integrity of the data blocks it stores?
What is Non DFS Used in HDFS?
What is the logical plan in the Pig architecture?
What are driver memory and executor memory in Spark?
What do you mean by a task instance?