How can we create RDD in Apache Spark?
Answer / Dhanraj
RDDs can be created from various sources such as local files, HDFS files, or other RDDs using Spark's API (Application Programming Interface). The main ways to create an RDD are through the SparkContext: parallelize(collection) turns an in-memory collection into an RDD, textFile(path) reads a text file (local or HDFS) line by line, and wholeTextFiles(path) reads a directory of small files as (filename, content) pairs. These methods exist on the SparkContext in Scala, Java, and Python alike (in PySpark, e.g. sc.parallelize(...)); a new RDD can also be derived by applying a transformation such as map or filter to an existing RDD.
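A minimal sketch of these creation paths in Scala, assuming a SparkSession named `spark` is already available (as in spark-shell); the file path `data.txt` is a placeholder:

```scala
// Obtain the SparkContext from an existing SparkSession
val sc = spark.sparkContext

// 1. From an in-memory collection
val numbers = sc.parallelize(Seq(1, 2, 3, 4, 5))

// 2. From a text file (local path or HDFS URI), one record per line
val lines = sc.textFile("data.txt")

// 3. From an existing RDD, via a transformation
val doubled = numbers.map(_ * 2)
```

Note that parallelize is mainly useful for testing and prototyping; production data normally enters Spark through textFile or another external-source reader.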