What are the three steps involved in big data?
What happens when two clients try to access the same file in HDFS?
Can you explain Apache Ambari?
What are the different methods to run Spark over Apache Hadoop?
What is the default extension of the files produced by a Sqoop import using the --compress parameter?
What are the components of the Cassandra data model?
What is the difference between textFile and wholeTextFiles in Apache Spark?
What common mistakes do developers make when using Apache Spark?
What is standalone mode in Spark?
What is a common use case for Flume?
What is a SerDe in Hive?
Are Spark DataFrames distributed?
What is meant by a replication strategy?
What is the purpose of RecordReader in Hadoop?
What is the difference between Pig Latin and HiveQL?