Member since: 06-07-2016
Posts: 923
Kudos Received: 322
Solutions: 115
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 4132 | 10-18-2017 10:19 PM |
|  | 4363 | 10-18-2017 09:51 PM |
|  | 14923 | 09-21-2017 01:35 PM |
|  | 1863 | 08-04-2017 02:00 PM |
|  | 2433 | 07-31-2017 03:02 PM |
08-25-2016
03:48 PM
@Sami Ahmad Can you try distcp2 instead? hadoop distcp2 hdfs:///user/sami/ hdfs:///user/zhang
08-25-2016
02:10 AM
2 Kudos
@Michel Sumbul When you talk about encryption in HBase, you encrypt the HFiles and the WAL. You cannot encrypt only some columns and not others; when you encrypt the HFile, all of its cells are encrypted. Please check the following link on how to implement this: https://hbase.apache.org/book.html#hbase.encryption.server You can also create an HDFS-level encryption zone for the /hbase directory, and your data will be encrypted at rest. Please check the following link: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_hdfs_admin_tools/content/hbase-with-hdfs-encr.html
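To make the first option concrete, a minimal sketch of the server-side configuration described in the HBase book (the keystore path and password are placeholders for your environment):

```xml
<!-- hbase-site.xml: keystore path and password are placeholders -->
<property>
  <name>hbase.crypto.keyprovider</name>
  <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
</property>
<property>
  <name>hbase.crypto.keyprovider.parameters</name>
  <value>jceks:///path/to/hbase/conf/hbase.jks?password=****</value>
</property>
<property>
  <name>hbase.crypto.master.key.name</name>
  <value>hbase</value>
</property>
```

With that in place, encryption is enabled per column family, e.g. create 'mytable', {NAME => 'cf', ENCRYPTION => 'AES'} in the HBase shell (table and family names are hypothetical).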
08-24-2016
09:14 PM
1 Kudo
When creating table use the following: TBLPROPERTIES ('serialization.null.format'='') Then do INSERT INTO table_name (col1,col3, col5) select * from csvtable . Check the following. This should just work. https://community.hortonworks.com/questions/1216/techniques-for-dealing-with-malformed-data-hive.html
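A minimal sketch with hypothetical table and column names, assuming the CSV staging table has matching columns:

```sql
-- Empty strings in the underlying files are surfaced as NULL.
CREATE TABLE target_table (col1 STRING, col3 STRING, col5 STRING)
TBLPROPERTIES ('serialization.null.format'='');

INSERT INTO TABLE target_table
SELECT col1, col3, col5 FROM csvtable;
```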
08-24-2016
09:00 PM
3 Kudos
@Brandon Wilson
Theoretically, there isn't a limit on the number of snapshots, but, like everything, there is a price to pay. A snapshot, as you know, only captures metadata at a point in time. Now imagine you created one snapshot every minute (an extreme example to explain what will happen). Two hours later you have 120 snapshots.

HFiles are immutable. The moment a snapshot is taken, it holds references to the HFiles at that point in time. An HBase snapshot doesn't make any copies of the data; that only happens when you restore from the snapshot. But what do you think happens when a compaction or deletion is triggered? If a snapshot references those immutable HFiles, they are moved to an archive folder rather than really deleted, because you might decide to restore from that snapshot. If you have lots of compactions and updates, each snapshot might be pointing to different HFiles, which means your snapshots will affect your storage.

So, there is no theoretical limit on the number of snapshots, but snapshots used aggressively are not free of cost: you might end up using a significant amount of storage. Keeping them for 30 days, 6 months, or a year will require significant storage overhead.
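To make the lifecycle concrete, a quick sketch in the HBase shell (table and snapshot names are hypothetical):

```
snapshot 'mytable', 'mytable-snap-1'    # captures HFile references only, no data copy
list_snapshots                          # every listed snapshot pins its HFiles in the archive
delete_snapshot 'mytable-snap-1'        # unpins them so the cleaner can reclaim space
```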
08-24-2016
07:45 PM
@Gaurab D Do you have the Oracle driver in the Sqoop directory? Check /usr/hdp/current/sqoop-client/lib/. If it's not there, please put it there so it is on the classpath.
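A sketch; the exact ojdbc jar name and version depend on your Oracle download:

```bash
# copy the JDBC driver into Sqoop's lib so it lands on the classpath
cp ojdbc7.jar /usr/hdp/current/sqoop-client/lib/
ls /usr/hdp/current/sqoop-client/lib/ | grep -i ojdbc
```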
08-24-2016
05:55 PM
So you have two clusters on the same node? Is it possible that the two clusters have different block size settings? Can you please verify the dfs.blocksize setting on both clusters?
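A quick way to check, run against each cluster's configuration:

```bash
# prints the effective block size in bytes for the cluster whose config is loaded
hdfs getconf -confKey dfs.blocksize
```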
08-24-2016
05:49 PM
@Hans Feldmann One way I parsed my JSON was to convert it to Avro. So basically, after getting rid of special characters from the JSON using the ReplaceText processor, I sent it to InferAvroSchema, then used ConvertJSONToAvro with the inferred schema, and wrote the result to HDFS, where I had a table and read it in Hive. Another way is to use a JSON Hive SerDe; that's actually much easier. Check this out: https://github.com/rcongiu/Hive-JSON-Serde
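A sketch of the SerDe route using the linked project; the jar path, table, and columns are hypothetical and must mirror your JSON structure:

```sql
ADD JAR /tmp/json-serde-1.3.8-jar-with-dependencies.jar;

CREATE EXTERNAL TABLE json_events (id STRING, name STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/user/hans/json_events';
```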
08-24-2016
04:03 AM
This is a connection issue. Can you connect to the cluster from the command line on this machine?
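For example, a quick sanity check from that machine (assuming an HDFS cluster; the path is just an example):

```bash
# if this hangs or fails, the problem is network/config, not the application
hdfs dfs -ls /
```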
08-24-2016
04:02 AM
So there is no credential cache. You need to do a kinit first; then run klist -A and it will show you the credential cache. After that it should work.
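A sketch with a placeholder principal and realm:

```bash
kinit user@EXAMPLE.COM   # obtain a Kerberos ticket (prompts for the password)
klist -A                 # list all credential caches; the new ticket should appear
```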