What is an RDD?
Answer / Laljeet
RDD stands for Resilient Distributed Dataset. It is a fault-tolerant, immutable, distributed collection of data objects that is split into partitions and processed in parallel across the nodes of an Apache Spark cluster.
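To make the idea concrete, here is a minimal toy sketch in plain Python. It is not Spark's API: the `ToyRDD` class and its `compute_partition` method are hypothetical names invented for illustration (only the names `parallelize`, `map`, and `collect` mirror real Spark operations). It assumes that fault tolerance comes from remembering how each partition was derived, so a lost partition can be recomputed from its source data.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's implementation):
    partitioned source data plus the list of steps needed to rebuild it."""

    def __init__(self, source_partitions, transforms=()):
        # Source data, split into partitions; never modified after creation.
        self._source = [list(p) for p in source_partitions]
        # Recorded transformation steps, applied only when an action runs.
        self._transforms = tuple(transforms)

    @classmethod
    def parallelize(cls, data, num_partitions):
        data = list(data)
        # Round-robin split; in a real cluster each partition could sit on a different node.
        return cls([data[i::num_partitions] for i in range(num_partitions)])

    def map(self, fn):
        # Transformation: returns a new ToyRDD with the step recorded, computes nothing yet.
        return ToyRDD(self._source, self._transforms + (fn,))

    def compute_partition(self, index):
        # Rebuild a single partition by replaying the recorded steps on its source data.
        rows = self._source[index]
        for fn in self._transforms:
            rows = [fn(x) for x in rows]
        return rows

    def collect(self):
        # Action: computes every partition and gathers the results in partition order.
        out = []
        for i in range(len(self._source)):
            out.extend(self.compute_partition(i))
        return out


rdd = ToyRDD.parallelize(range(1, 7), num_partitions=3)
squares = rdd.map(lambda x: x * x)          # nothing is computed here
total = sum(squares.collect())              # the action triggers the work
recovered = squares.compute_partition(1)    # rebuild one "lost" partition on its own
```

In this sketch each partition can be computed independently of the others, which is what allows parallel processing, and a single partition can be rebuilt without touching the rest, which is the intuition behind "resilient".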
What are the differences between the caching and persistence methods in Apache Spark?
What is the driver program in Spark?
How do you identify whether a given operation in your program is a transformation or an action?
What is executor memory in a Spark application?
What is standalone mode in Spark?
What is the use of the Spark driver, and where does it get executed on the cluster?
What is the Spark tool in big data?
Apache Spark is a good fit for which type of machine learning techniques?
What is Spark MLlib?
Is Spark SQL a database?
Is Java required for Spark?
What is a pipelined RDD?