Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
Title | Views | Posted
---|---|---
| 2627 | 12-25-2018 10:42 PM
| 12061 | 10-09-2018 03:52 AM
| 4164 | 02-23-2018 11:46 PM
| 1839 | 09-02-2017 01:49 AM
| 2166 | 06-21-2017 12:06 AM
03-23-2017
11:30 PM
Inserting non-ASCII strings from the command line might not work for a number of reasons, and even if it works it's impractical for large data. Instead, put all your strings (any script supported by UTF-8, including Cyrillic) in a file, upload the file to HDFS, create an external table based on that file, and explore the table using, for example, "... WHERE name='привет'"; that should work. Note that for a new table there is no need to declare serialization.encoding='UTF-8', it is UTF-8 by default. Ditto for external tables already using UTF-8 data. You need it only if the input file is in a non-UTF-8 character set such as KOI8-R, and in that case it would be serialization.encoding='KOI8-R'. For more details on using native charsets in Hive see my article.
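A minimal sketch of that flow in HiveQL, assuming a one-column, tab-delimited file already uploaded to /data/utf8_names on HDFS (the path, table, and column names here are only illustrative):
-- external table over the UTF-8 file; serialization.encoding defaults to UTF-8
CREATE EXTERNAL TABLE names_utf8 (name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/utf8_names';
-- query with a Cyrillic literal
SELECT * FROM names_utf8 WHERE name = 'привет';
-- only needed if the file is in a non-UTF-8 charset such as KOI8-R
ALTER TABLE names_utf8 SET SERDEPROPERTIES ('serialization.encoding'='KOI8-R');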
03-22-2017
08:24 AM
Hi @Juan Manuel Nieto, well done! I noticed AMBARI-18898 and suspected it was causing havoc on the command line, but didn't have time to try it. Now, though, after fixing Solr, Ranger audit cannot connect to it and Ambari is showing false "http 500" alerts on both Infra Solr instances. Edit: I missed "DEFAULT" in the name rules; I had omitted it because I tried with only one rule before. After adding DEFAULT everything is back to normal!
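For reference, the name rules in question are the Kerberos auth-to-local rules; a sketch of what the corrected value might look like, where the RULE lines are only placeholders for whatever rules you already have and the important part is the trailing DEFAULT:
RULE:[1:$1@$0](ambari-qa@YOUR.REALM)s/.*/ambari-qa/
RULE:[2:$1@$0](infra-solr@YOUR.REALM)s/.*/infra-solr/
DEFAULT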
03-18-2017
08:22 AM
1 Kudo
It worked. part-m-00001 is not a separate table; it's just another file in your import directory. If you create an external table on /date_new7, Hive will see a single table with 3 rows. Ditto for MapReduce jobs taking /date_new7 as their input. If you end up with many small files you can merge them into one from time to time using, for example, hadoop-streaming; see this example and set "mapreduce.job.reduces=1".
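A sketch of that external table, assuming the Sqoop import wrote comma-delimited text files and guessing a two-column schema (replace the columns with your actual ones):
CREATE EXTERNAL TABLE date_new7 (id INT, dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/date_new7';
-- Hive reads every part-m-* file under /date_new7 as one table
SELECT COUNT(*) FROM date_new7;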
03-17-2017
02:12 AM
Great analogy, but I only have a bike! 🙂 I'd like to be able to say "set my.transport.engine=ferrari;" and have it appear at my front door!
03-17-2017
01:41 AM
You need 2 principals and keytabs per instance: SOLR_KERBEROS_PRINCIPAL=HTTP/solr-host1.fqdn@YOUR.REALM, which is saved in infra-solr-env.sh, and principal=infra-solr/solr-host1.fqdn@YOUR.REALM, which goes into infra_solr_jaas.conf. And two more for your host2. You can set both from Ambari, in the "Advanced infra-solr-env" section, referring to the host FQDN as _HOST, for example HTTP/_HOST@YOUR.REALM.
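Roughly, the split looks like this; the JAAS snippet is only a sketch of the usual Krb5LoginModule layout, and the keytab path is an assumption for illustration:
infra-solr-env.sh (set via the "Advanced infra-solr-env" section in Ambari):
SOLR_KERBEROS_PRINCIPAL=HTTP/_HOST@YOUR.REALM
infra_solr_jaas.conf:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/etc/security/keytabs/ambari-infra-solr.service.keytab"
  principal="infra-solr/solr-host1.fqdn@YOUR.REALM";
};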
03-16-2017
07:48 AM
If your table is partitioned you have to create the ORC table first ("STORED AS ORC") and then "INSERT INTO" it, listing all fields in the SELECT. Also enable dynamic partitions:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
create table if not exists t1 (a int, b int) partitioned by (c int); -- your original table
create table t1orc (a int, b int) partitioned by (c int) stored as ORC; -- your compressed table
insert into table t1orc partition(c) select a, b, c from t1;
03-16-2017
05:42 AM
1 Kudo
As simple as this:
CREATE TABLE t1_orc STORED AS ORC AS SELECT * FROM <your-existing-table>;
Note that if you have a single 200 TB table, this is going to take a while. You can test on a smaller table first.
03-16-2017
01:58 AM
Okay, actually I had made a mess of my Oozie folders and couldn't stop Oozie, because /var/tmp/oozie was missing. After adding it and restarting Oozie I can see a 1-action DAG. So, again, if you have fewer than 25 actions, make sure your Oozie is healthy, then restart it and retry.
03-16-2017
01:33 AM
This may be a bug; I see the same error on my HDP-2.5.0 Oozie with only 1 action. Do you really have more than 25 actions? On an HDP-2.5.3 cluster it works fine with 1 action. If you really do have more than 25 actions then it won't work, as shown by @Murali Ramasami.
03-14-2017
04:14 AM
Hi guys, this is an old question that has already been resolved. If you have new issues, please post a new question. In your case you apparently have a mismatch between the system time on your DB server and the system time on the node where you run Sqoop.