Big Data Interview Questions
Questions Answers Views Company eMail

What are IdentityMapper and IdentityReducer in MapReduce?

643

Explain the working of MapReduce?

667

Write a MapReduce program for character count?

713

How to proceed to write your first MapReduce program?

1031

How to set the number of reducers?

1040

Developing a MapReduce Application?

660

Different ways of debugging a job in MapReduce?

807

Explain the Reducer's reduce phase?

711

Why do we use HDFS for applications having large data sets and not when there are a lot of small files?

1 2137

What are the functions of NameNode?

1 1413

How to configure Hadoop to reuse the JVM for mappers?

793

mapper or reducer?

633

How to resolve IOException: Cannot create directory

679

Does Pig support multi-line commands?

573

How to change replication factor of files already stored in HDFS?

713


Un-Answered Questions { Big Data }

Explain Spark streaming?

189


Explain bucketing in Hive?

501


What is the role of Consumer API?

304


What are the advantages and disadvantages of archiving partitions in Hive?

459


When to use secondary indexes?

50






What are clusters in Cassandra?

42


Why can't aggregation be done in the Mapper?

304


Is Spark good for machine learning?

207


Explain what speculative execution is?

219


Please explain the sparse vector in Spark.

211


Is it possible to join multiple fields in Pig scripts?

314


Why does HDFS store data on commodity hardware despite the higher chance of failures?

21


Explain the common input formats in Hadoop?

240


What are sink processors?

645


Explain the JsonLoader and JsonStorage functions in Pig?

323