Sourced from: http://bigdatanoob.blogspot.com/2011/07/copy-and-uncompress-file-to-hdfs.html
Quick and dirty method to uncompress a large file directly into HDFS without having to uncompress it locally first:
Syntax:
$ gunzip -c localfile.gz | hadoop fs -put - /user/user1/localfile
Explanation of options:
gunzip -c = The -c option writes the output of gunzip to standard output (stdout) instead of to a file on disk.
The '-' given as the source in the hadoop fs -put operation tells it to read the file contents from standard input (stdin).
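Since the decompressed data never touches the local disk, it can be worth a sanity check after the transfer. One way to do this (a sketch, assuming md5sum is installed and you can afford a second decompression pass) is to compare checksums of the decompressed stream and the copy now in HDFS:
$ gunzip -c localfile.gz | md5sum
$ hadoop fs -cat /user/user1/localfile | md5sum
If the two hashes match, the upload is intact.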
So with this example:
$ gunzip -c 3GB_json.gz | hadoop fs -put - /user/cloudera/3GB_json
The shell runs gunzip on the compressed 3 GB JSON file (3GB_json.gz), sending its decompressed output to stdout, which is piped into the hadoop fs -put operation, which writes the payload to the HDFS file /user/cloudera/3GB_json.
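The same pattern works for other compression formats that can decompress to stdout, assuming the matching decompressor is installed. For example, with a bzip2-compressed file:
$ bzcat localfile.bz2 | hadoop fs -put - /user/user1/localfile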