Member since: 06-07-2016
Posts: 923
Kudos Received: 322
Solutions: 115
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4101 | 10-18-2017 10:19 PM |
| | 4345 | 10-18-2017 09:51 PM |
| | 14851 | 09-21-2017 01:35 PM |
| | 1841 | 08-04-2017 02:00 PM |
| | 2424 | 07-31-2017 03:02 PM |
02-01-2017
11:10 PM
@samuel sayag What is this script element doing in your hbase-site.xml and hive-site.xml? Can you please remove it and try again?
02-01-2017
02:24 PM
In production, you would have "edge nodes" where the client programs are installed and talk to the cluster. But even if you put data on the local file system of a data node and then copy it into HDFS, that will not prevent data distribution. The client file lives in the local file system (XFS, ext4), which is separate from HDFS (not entirely, but as far as your question is concerned). Standard practice is to use an edge node, not the name node.
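The edge-node workflow described above can be sketched as follows. The hostname and paths here are placeholders, and the sketch assumes the HDFS client is configured on the edge node:

```shell
# Hypothetical workflow: land the file on an edge node, then load it into HDFS.
# "edge01" and all paths are placeholders; adjust for your cluster.
scp data.csv user@edge01:/tmp/data.csv
ssh user@edge01 'hdfs dfs -put /tmp/data.csv /user/me/data.csv'
# Once put into HDFS, the file is split into blocks and replicated
# across data nodes regardless of which machine issued the put.
```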
01-31-2017
08:44 PM
Fair enough. See the new answer by @bpreachuk. I was assuming you were looking for free tools, but if you can get Syncsort, or if you already have it, that's the easiest way to do this.
01-31-2017
08:20 PM
@samuel sayag The error you are getting is: "Unable to set watcher on znode (/hbase/hbaseid)". Is your ZooKeeper running? If yes, please share your hbase-site.xml.
01-31-2017
05:01 PM
1 Kudo
@Joby Johny You can use Cloudbreak to set up an HDP cluster on AWS if HDC does not have everything you need. http://sequenceiq.com/cloudbreak-docs/release-1.6.1/aws/
01-31-2017
04:55 PM
@Prasanna G PuTTY is an SSH client, not an HDFS client. Once you SSH into your sandbox, you can run hdfs commands, because the sandbox is where HDFS (including its shell commands) is installed.
This is similar to the fact that you cannot run "ls /some/directory" from PuTTY before you SSH into the box.
01-31-2017
07:13 AM
@Saurabh
What is the value of the property dfs.namenode.accesstime.precision? Check the docs here: https://hadoop.apache.org/docs/r2.6.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml Use FileStatus.getAccessTime() to get the last access time. The result depends on the precision set above: if it is set to zero, access times are not maintained at all; if it is left at the default of one hour, you get access times accurate to within one hour; if you have set your own precision, you get whatever you set. https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/fs/FileStatus.html
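The precision semantics can be illustrated with a small sketch. This is not Hadoop code; the function name and values are hypothetical, modeling only the behavior described above (precision 0 disables tracking, otherwise the stored time is refreshed only when it is more than one precision-window old):

```python
def recorded_access_time(actual_access_ms, last_recorded_ms, precision_ms):
    """Model of the access time the NameNode would keep after a read.

    precision_ms == 0 disables access-time tracking entirely;
    otherwise the stored time is refreshed only when the previously
    recorded time is more than precision_ms in the past.
    """
    if precision_ms == 0:
        return 0  # access times are not maintained
    if actual_access_ms - last_recorded_ms > precision_ms:
        return actual_access_ms  # stale enough: update the stored time
    return last_recorded_ms      # within the window: keep the old value

HOUR_MS = 60 * 60 * 1000  # the default precision (one hour)

# A read 30 minutes after the last recorded access keeps the old value...
print(recorded_access_time(30 * 60 * 1000, 0, HOUR_MS))   # 0
# ...but a read two hours later updates it.
print(recorded_access_time(2 * HOUR_MS, 0, HOUR_MS))      # 7200000
```

So with the default setting, two reads an hour apart can report the same access time; lowering the precision trades NameNode overhead for fresher values.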
01-31-2017
05:08 AM
@karthick baskaran You can use the following project; it uses JRecord to do the conversion: https://github.com/tmalaska/CopybookInputFormat You can then use Spark to read your EBCDIC files from Hadoop and convert them to ASCII using the above library.
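For the character-conversion step alone, here is a minimal sketch independent of JRecord and Spark. It assumes the text fields use code page 037 (the common US/Canada EBCDIC code page); mainframe files may use a different code page, and packed-decimal (COMP-3) fields still need copybook-aware handling like JRecord provides:

```python
# Simulate a record read from an EBCDIC file by encoding a known string.
ebcdic_bytes = "HELLO".encode("cp037")
print(ebcdic_bytes.hex())                # c8c5d3d3d6 -- EBCDIC, not ASCII

# Decoding with the matching code page recovers the text.
ascii_text = ebcdic_bytes.decode("cp037")
print(ascii_text)                        # HELLO
```

This only covers plain text fields; binary and packed fields in a copybook layout are why a library such as JRecord is the practical choice.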
01-31-2017
03:00 AM
@Prasanna G All documentation for the sandbox is right here, which I think you are aware of: http://hortonworks.com/hadoop-tutorial/learning-the-ropes-of-the-hortonworks-sandbox/ As for which file system, I have never verified it, but I would expect a standard Linux file system such as ext4. Just type the "mount" command without any parameters and it will show you the mounted file systems and their types.
01-31-2017
02:15 AM
1 Kudo
@Prasanna G I think you are copying into your local file system and looking in HDFS. Check your local tmp folder. Also, your full command is not visible, but I am assuming it is something like this:
pscp -P 2222 C:\Users\prgovind\Downloads\f.txt root@localhost:///tmp
Is that right? You first need to copy the file to the /tmp folder on your sandbox and then push it into HDFS. Try the following:
pscp -P 2222 C:\Users\prgovind\Downloads\f.txt root@sandbox:/tmp
ssh root@sandbox
hdfs dfs -put /tmp/f.txt /user/praskutti