How do you handle compression in Pig?
Answer / Tripti Agarwal
In Apache Pig, compression is handled through Hadoop's compression codecs, not through table storage clauses (STORED AS and CREATE TABLE belong to Hive, not Pig). Pig's built-in loaders such as PigStorage read gzip (.gz) and bzip2 (.bz2) files transparently, choosing the codec from the file extension. To compress the intermediate results Pig writes between MapReduce jobs, set the properties 'pig.tmpfilecompression' and 'pig.tmpfilecompression.codec'. To compress the final output, set 'output.compression.enabled' to true and 'output.compression.codec' to a codec class such as 'org.apache.hadoop.io.compress.GzipCodec' or 'org.apache.hadoop.io.compress.SnappyCodec'.
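A minimal Pig Latin sketch of the settings described above (the input path, output path, and schema are hypothetical; the SET properties are standard Pig/Hadoop settings):

```pig
-- PigStorage reads .gz/.bz2 input transparently based on the file extension
raw = LOAD 'input/data.txt.gz' USING PigStorage(',')
      AS (id:int, name:chararray);

-- Compress Pig's intermediate (temporary) files between MapReduce jobs
SET pig.tmpfilecompression true;
SET pig.tmpfilecompression.codec gz;

-- Compress the final job output with Gzip
SET output.compression.enabled true;
SET output.compression.codec org.apache.hadoop.io.compress.GzipCodec;

STORE raw INTO 'output/compressed' USING PigStorage(',');
```

Note that the SET commands apply per script, so different scripts (or different STORE targets run as separate jobs) can use different codecs.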
Why we use BloomMapFile?
Explain the difference between COUNT_STAR and COUNT functions in Apache Pig?
What is a bag in Pig Latin?
Can we say a COGROUP is a group of more than 1 data set?
How can we see only the top 15 records from student.txt out of 100 records in the HDFS directory?
What is UDF in Pig?
What is hadoop pig?
Why should we use ‘distinct’ keyword in Pig scripts?
How to load data in pig?
What is BloomMapFile?
Explain bagtostring in pig?
What are the common Hadoop Pig interview questions that you have been asked in a Hadoop job interview?