Explain mapPartitions() and mapPartitionsWithIndex()
Answer / Mitan Verma
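mapPartitions() is a transformation in Apache Spark that applies a function once per partition rather than once per element. The function receives an iterator over all the elements of one partition and returns an iterator of results, and those results make up the new RDD. It does not re-partition or shuffle the data: the result RDD has the same number of partitions as the input, and each output partition depends only on the matching input partition. This makes it a good fit when there is expensive setup, such as opening a database connection, that should happen once per partition instead of once per record as it would with map(). mapPartitionsWithIndex() works the same way, except that the function also receives the integer index of the partition as its first argument. That is useful when the logic depends on which partition is being processed, for example skipping a header line that sits in partition 0 or checking how records are distributed across partitions.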
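The difference between the two is easier to see with a small example. What follows is only a plain-Python model of the semantics, not Spark itself: an RDD is stood in for by a list of lists (one inner list per partition), and `map_partitions`, `map_partitions_with_index`, `partition_sum` and `tag_with_index` are hypothetical names made up for this sketch. In PySpark the corresponding calls would be `rdd.mapPartitions(f)` and `rdd.mapPartitionsWithIndex(f)`.

```python
from typing import Callable, Iterator, List, Tuple


def map_partitions(partitions: List[list],
                   f: Callable[[Iterator], Iterator]) -> List[list]:
    """Model of mapPartitions: call f once per partition, passing an iterator."""
    return [list(f(iter(part))) for part in partitions]


def map_partitions_with_index(partitions: List[list],
                              f: Callable[[int, Iterator], Iterator]) -> List[list]:
    """Model of mapPartitionsWithIndex: f also receives the partition index."""
    return [list(f(index, iter(part))) for index, part in enumerate(partitions)]


def partition_sum(elements: Iterator[int]) -> Iterator[int]:
    # Runs once per partition and emits a single value for it.
    yield sum(elements)


def tag_with_index(index: int, elements: Iterator[int]) -> Iterator[Tuple[int, int]]:
    # Pairs every element with the index of the partition it lives in.
    for value in elements:
        yield (index, value)


# Assumed layout: six numbers spread over three partitions.
partitions = [[1, 2, 3], [4, 5], [6]]

sums = map_partitions(partitions, partition_sum)
tagged = map_partitions_with_index(partitions, tag_with_index)

print(sums)    # [[6], [9], [6]]
print(tagged)  # [[(0, 1), (0, 2), (0, 3)], [(1, 4), (1, 5)], [(2, 6)]]
```

Note that the number of partitions is unchanged in both results, while the number of elements inside each partition can change (three elements became one sum in the first partition).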