Member since: 01-09-2019 · 401 Posts · 163 Kudos Received · 80 Solutions
07-07-2016
12:20 PM
I don't see a reason for the first insert to be a text or uncompressed Avro file. Using HCatalog, you can import directly from Sqoop into a Hive table stored as ORC. That would save you a lot of space because of compression. Once the initial import is in Hive as ORC, you can still continue and transform the data as necessary. If the reason for writing as text is access from Pig and MapReduce, an HCatalog table can also be accessed from Pig/MR.
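As a rough sketch of the Sqoop-to-ORC-via-HCatalog path described above (the JDBC URL, database, table, and user names here are placeholders, not from the original thread):

```shell
# Hypothetical sketch: import a relational table straight into a Hive ORC
# table through HCatalog, skipping the intermediate text/Avro step.
# Connection string, credentials, and table names are placeholders.
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username etl_user -P \
  --table orders \
  --hcatalog-database default \
  --hcatalog-table orders_orc \
  --create-hcatalog-table \
  --hcatalog-storage-stanza "STORED AS ORC"
```

The `--hcatalog-storage-stanza` option controls the storage clause of the auto-created Hive table, which is where the ORC (and hence compression) benefit comes from.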
07-07-2016
12:36 AM
1 Kudo
If you already have a running cluster, exporting the blueprint from Ambari, editing the relevant entries, and using that to create the DR cluster works. Another approach is to create your first cluster from a blueprint as well; that makes creating the second cluster easier. I don't know of any other custom tools that can create an almost identical cluster.
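The export-and-reuse flow can be sketched with the Ambari REST API (host, credentials, and cluster/blueprint names below are placeholders):

```shell
# Hypothetical sketch: export the blueprint of an existing cluster, then
# register the edited copy under a new name for the DR cluster.
curl -u admin:admin \
  "http://ambari-host:8080/api/v1/clusters/prod?format=blueprint" \
  -o prod_blueprint.json

# After editing prod_blueprint.json (configs, stack settings, etc.),
# register it as a new blueprint:
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d @prod_blueprint.json \
  "http://ambari-host:8080/api/v1/blueprints/dr-blueprint"
```

Creating the DR cluster itself then takes a second POST to `/api/v1/clusters/<name>` with a host-mapping template that references the registered blueprint.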
07-06-2016
06:32 PM
1 Kudo
A quick workaround is to disable Ranger authorization under Hive -> Authorization in the Ambari UI and restart Hive. However, if Ranger is part of your tests and you want to keep it enabled, this is not the solution.
07-06-2016
03:13 PM
1 Kudo
Checkpointing is the process of merging the edit logs with the base fsimage. The result is stored in the NameNode metadata directories. It's not the same as the edit log: the edit log records the changes you make to HDFS, while the checkpointed fsimage is the merged snapshot of the namespace.
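A manual checkpoint can be forced from the command line, which makes the merge visible on disk (a sketch, assuming you run it as the HDFS superuser; the metadata path is a placeholder for your `dfs.namenode.name.dir` setting):

```shell
# Hypothetical sketch: force a checkpoint so current edits are merged
# into a fresh fsimage.
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace   # merges edit logs into a new fsimage
hdfs dfsadmin -safemode leave

# Inspect the result in the NameNode metadata directory (path is a
# placeholder for your dfs.namenode.name.dir value):
ls /hadoop/hdfs/namenode/current/   # fsimage_* and edits_* files
```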
07-05-2016
04:07 PM
@bganesan Now that https://issues.apache.org/jira/browse/RANGER-205 is fixed, can we use the rest API instead of DB script?
07-01-2016
09:07 PM
1 Kudo
When you click on the OVA to import it, you will see a "Guest OS Type" field. What do you see there, and has it been changed? You should see Red Hat (64-bit). By default you don't change this, and if you do change it, the import is not going to work.
07-01-2016
03:47 PM
If you want to use the files as is, then yes. But do you already have the files split by date? In that case you would need the date both as a regular column and as a partition column (with different names). You may be better off reorganizing these files into ORC for better lookup speeds; to do that, create a second table stored as ORC and do an INSERT OVERWRITE into it.
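The reorganize-into-ORC step can be sketched with a dynamic-partition insert (table and column names here are placeholders, not from the original thread):

```shell
# Hypothetical sketch: rewrite a text-backed staging table into a
# date-partitioned ORC table with a single INSERT OVERWRITE.
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

CREATE TABLE events_orc (id BIGINT, payload STRING)
PARTITIONED BY (event_date STRING)
STORED AS ORC;

-- The date column of the source feeds the partition column here;
-- it must be the last column in the SELECT list.
INSERT OVERWRITE TABLE events_orc PARTITION (event_date)
SELECT id, payload, event_date FROM events_staging;
"
```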
07-01-2016
03:29 PM
Why do you need to keep the date information as a column? If you are doing the merge with Pig (or a Hive query), you can move the date field from a column into a partition.
07-01-2016
03:08 PM
1 Kudo
It is difficult to say what your partition should be with only this information. The best way to find a partition is from your future query patterns. If you know most queries will have a WHERE clause on a column whose values are not high cardinality, that column could be your partition. If you think your queries mostly hit a date range, you could partition by date.
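Two quick checks that follow from the advice above, sketched with placeholder table and column names: measure the cardinality of a candidate column before committing to it, and confirm that your typical date-range query can prune partitions.

```shell
# Hypothetical sketch: check candidate-partition cardinality, then run a
# typical date-range query that benefits from partition pruning.
hive -e "
-- A low distinct count suggests the column is safe to partition on.
SELECT COUNT(DISTINCT region) FROM events;

-- On a table partitioned by event_date, this scans only the matching
-- partitions rather than the whole table.
SELECT COUNT(*) FROM events
WHERE event_date BETWEEN '2016-06-01' AND '2016-06-30';
"
```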