Tuesday, March 28, 2017

Uncompress file to HDFS without unzipping on local FS

Sourced from: http://bigdatanoob.blogspot.com/2011/07/copy-and-uncompress-file-to-hdfs.html


A quick and dirty way to uncompress a large file directly into HDFS, without having to uncompress it on the local filesystem first:

Syntax:
       $ gunzip -c localfile.gz | hadoop fs -put - /user/user1/localfile


Explanation of options:
       gunzip -c  = The -c option causes gunzip to write the decompressed output to
       standard output (stdout) instead of replacing the .gz file on disk.

       The '-' given as the source argument to hadoop fs -put tells it to read the
       file contents from standard input (stdin) rather than from a local path.
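The same streaming pattern can be sketched and verified locally; here `cat > restored.txt` stands in for `hadoop fs -put -` as the stdin-reading consumer (a hypothetical stand-in, since no HDFS cluster is assumed):

```shell
# Create a small file and a compressed copy of it.
printf 'hello hdfs\n' > sample.txt
gzip -c sample.txt > sample.txt.gz

# Stream-decompress to stdout and pipe into a stdin-reading consumer,
# exactly as gunzip -c | hadoop fs -put - would do against HDFS.
gunzip -c sample.txt.gz | cat > restored.txt

# Confirm the round trip preserved the contents.
diff sample.txt restored.txt && echo "round-trip OK"
```

Nothing is ever written to local disk in decompressed form in the HDFS case; the only local footprint is the original .gz file.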


So with this example:
       $ gunzip -c 3GB_json.gz | hadoop fs -put - /user/cloudera/3GB_json

The shell runs gunzip on the compressed 3 GB JSON file (3GB_json.gz), sending its decompressed output to stdout. That output is piped into the hadoop fs -put operation, which writes the payload to the HDFS file /user/cloudera/3GB_json.






