What are the ways to create RDDs in Apache Spark? Explain.
Answer / Dibya Prakash Tiwari
Apache Spark offers three main ways to create an RDD through the SparkContext:

- parallelize: creates an RDD from a local collection, with an optional number of partitions (parallelism).
- textFile: reads one or more text files from HDFS or any Hadoop-compatible file system, producing an RDD with one element per line.
- wholeTextFiles: reads zero or more files from HDFS or any Hadoop-compatible file system as an RDD of (filename, content) pairs, one pair per file.
Explain Spark saveAsTextFile() operation?
What is spark ml?
What is meant by in-memory processing in Spark?
What are the components of Spark Ecosystem?
How many types of Transformation are there?
Explain the flatMap() transformation in Apache Spark?
What is big data spark?
What is the disadvantage of spark sql?
How will you calculate the number of executors required to do real-time processing using Apache Spark? What factors need to be considered for deciding on the number of nodes for real-time processing?
List the various types of "Cluster Managers" in Spark.
What is spark executor cores?
What are spark jobs?