What are the abstractions of Apache Spark?
Answer / Sada Shiv Mishra
The primary abstraction in Apache Spark is the Resilient Distributed Dataset (RDD), an immutable, partitioned collection of objects that can be processed in parallel and recomputed automatically on failure. Built on top of RDDs are DataFrames and Datasets, which add a schema of named columns for structured data and let Spark's Catalyst optimizer plan queries more efficiently.