What is a Spark RDD?
Answer / Neeraj Sahu
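A Resilient Distributed Dataset (RDD) is the fundamental data structure in Apache Spark: an immutable, fault-tolerant collection of objects, partitioned across the nodes of a cluster so that it can be processed in parallel. If a partition is lost, Spark recomputes it from the RDD's lineage, the recorded chain of transformations that produced it.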
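To make the idea concrete, here is a minimal pure-Python sketch of the RDD concept. It is a toy model only, not Spark's API: the class `ToyRDD` and its method `compute_partition` are hypothetical names invented for this illustration, and threads on one machine stand in for cluster nodes. In real PySpark the corresponding calls would be `SparkContext.parallelize`, `RDD.map`, and `RDD.collect`.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical class, NOT Spark's API)."""

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source  # list of partitions; only set on a root dataset
        self._parent = parent  # lineage: the dataset this one was derived from
        self._fn = fn          # lineage: the function applied to the parent

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the input into contiguous chunks, one per partition.
        data = list(data)
        size = max(1, -(-len(data) // num_partitions))  # ceiling division
        return cls(source=[data[i:i + size] for i in range(0, len(data), size)])

    def map(self, fn):
        # Immutable and lazy: return a new dataset that only records lineage.
        return ToyRDD(parent=self, fn=fn)

    def num_partitions(self):
        if self._parent is None:
            return len(self._source)
        return self._parent.num_partitions()

    def compute_partition(self, index):
        # A derived partition is rebuilt from its parent on demand, so a
        # "lost" partition can always be recomputed from the lineage.
        if self._parent is None:
            return list(self._source[index])
        return [self._fn(x) for x in self._parent.compute_partition(index)]

    def collect(self):
        # Compute every partition in parallel, then concatenate in order.
        with ThreadPoolExecutor() as pool:
            parts = list(pool.map(self.compute_partition,
                                  range(self.num_partitions())))
        return [x for part in parts for x in part]


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=3)
squares = numbers.map(lambda x: x * x)      # nothing is computed yet
result = squares.collect()                  # [1, 4, 9, 16, 25, 36]
recomputed = squares.compute_partition(1)   # [9, 16], rebuilt from lineage
```

Note that `squares` never stores its own data. It keeps only a reference to its parent and the function to apply, which is the property that lets a missing partition be rebuilt instead of being restored from a replica.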
What is Immutable?
Explain the difference between Spark SQL and Hive.
What does the Spark Engine do?
How do I start a Spark server?
In how many ways can you create an RDD in Spark?
Do I need to install Hadoop for Spark?
Is Spark better than MapReduce?
How is an RDD fault-tolerant?
What are the components of the Apache Spark ecosystem?
What are the ways to launch Apache Spark over YARN?
What are the functions of "Spark Core"?
Explain the difference in processing speed between Hadoop and Apache Spark.