Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
Title | Views | Posted
---|---|---
| 2627 | 12-25-2018 10:42 PM
| 12061 | 10-09-2018 03:52 AM
| 4164 | 02-23-2018 11:46 PM
| 1839 | 09-02-2017 01:49 AM
| 2166 | 06-21-2017 12:06 AM
03-23-2017
11:30 PM
Inserting non-ASCII strings from the command line might not work for a number of reasons, and even if it works it's impractical for large data. Instead, put all your strings (any script supported by UTF-8, including Cyrillic) in a file, upload the file to HDFS, create an external table based on that file, and explore the table using, for example, "... WHERE name='привет'"; that should work. Note that for a new table there is no need to declare serialization.encoding='UTF-8', it is UTF-8 by default. Ditto for external tables already using UTF-8 data. You need it only if the input file is in a non-UTF-8 character set such as KOI8-R, and in that case it would be serialization.encoding='KOI8-R'. For more details on using native charsets in Hive see my article.
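A minimal sketch of that flow in HiveQL, assuming a one-column, tab-delimited file already uploaded to /data/utf8_names on HDFS (the path, table, and column names here are only illustrative):
-- external table over the UTF-8 file; serialization.encoding defaults to UTF-8
CREATE EXTERNAL TABLE names_utf8 (name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/utf8_names';
-- query with a Cyrillic literal
SELECT * FROM names_utf8 WHERE name = 'привет';
-- only needed if the file is in a non-UTF-8 charset such as KOI8-R
ALTER TABLE names_utf8 SET SERDEPROPERTIES ('serialization.encoding'='KOI8-R');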
03-22-2017
08:24 AM
Hi @Juan Manuel Nieto, well done! I noticed AMBARI-18898 and suspected it was causing havoc on the command line, but didn't have time to try it. Now, though, after fixing Solr, Ranger audit cannot connect to it and Ambari is showing false "http 500" alerts on both Infra Solr instances. Edit: I missed "DEFAULT" in the name rules; I had omitted it because I tried with only one rule before. After adding DEFAULT everything is back to normal!
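For reference, the name rules in question are the Kerberos auth-to-local rules; a sketch of what the corrected value might look like, where the RULE lines are only placeholders for whatever rules you already have and the important part is the trailing DEFAULT:
RULE:[1:$1@$0](ambari-qa@YOUR.REALM)s/.*/ambari-qa/
RULE:[2:$1@$0](infra-solr@YOUR.REALM)s/.*/infra-solr/
DEFAULT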
03-18-2017
08:22 AM
1 Kudo
It worked. part-m-00001 is not a separate table; it's just another file in your import directory. If you create an external table on /date_new7, Hive will see a single table with 3 rows. Ditto for MapReduce jobs taking /date_new7 as their input. If you end up with many small files you can merge them into one from time to time using, for example, hadoop-streaming; see this example and set "mapreduce.job.reduces=1".
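A sketch of that external table, assuming the Sqoop import wrote comma-delimited text files and guessing a two-column schema (replace the columns with your actual ones):
CREATE EXTERNAL TABLE date_new7 (id INT, dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/date_new7';
-- Hive reads every part-m-* file under /date_new7 as one table
SELECT COUNT(*) FROM date_new7;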
03-17-2017
02:12 AM
Great analogy, but I only have a bike! 🙂 I'd like to be able to say "set my.transport.engine=ferrari;" and have it appear at my front door!
03-17-2017
01:41 AM
You need 2 principals and keytabs per instance: SOLR_KERBEROS_PRINCIPAL=HTTP/solr-host1.fqdn@YOUR.REALM, which is saved in infra-solr-env.sh, and principal=infra-solr/solr-host1.fqdn@YOUR.REALM, which goes into infra_solr_jaas.conf. And two more for your host2. You can set both from Ambari, in the "Advanced infra-solr-env" section, referring to the host FQDN as _HOST, for example HTTP/_HOST@YOUR.REALM.
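Roughly, the split looks like this; the JAAS snippet is only a sketch of the usual Krb5LoginModule layout, and the keytab path is an assumption for illustration:
infra-solr-env.sh (set via the "Advanced infra-solr-env" section in Ambari):
SOLR_KERBEROS_PRINCIPAL=HTTP/_HOST@YOUR.REALM
infra_solr_jaas.conf:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/etc/security/keytabs/ambari-infra-solr.service.keytab"
  principal="infra-solr/solr-host1.fqdn@YOUR.REALM";
};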
03-16-2017
07:48 AM
If your table is partitioned you have to create the ORC table first ("STORED AS ORC") and then "INSERT INTO" it, listing all fields in the SELECT. Also enable dynamic partitions:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
create table if not exists t1 (a int, b int) partitioned by (c int); -- your original table
create table t1orc (a int, b int) partitioned by (c int) stored as ORC; -- your compressed table
insert into table t1orc partition(c) select a, b, c from t1;
03-16-2017
05:42 AM
1 Kudo
As simple as this:
CREATE TABLE t1_orc STORED AS ORC AS SELECT * FROM <your-existing-table>;
Note that if you have a single 200 TB table, this is going to take a while. You can test on a smaller table first.
03-16-2017
01:58 AM
Okay, actually I had made a mess of my Oozie folders and couldn't stop Oozie, because /var/tmp/oozie was missing. After adding it and restarting Oozie I can see a 1-action DAG. So, again, if you have fewer than 25 actions, make sure your Oozie is healthy, then restart it and retry.
03-16-2017
01:33 AM
This may be a bug; I see the same error on my HDP-2.5.0 Oozie with only 1 action. Do you really have more than 25 actions? On an HDP-2.5.3 cluster it works fine with 1 action. If you really do have more than 25 actions then it won't work, as shown by @Murali Ramasami.
03-14-2017
04:14 AM
Hi guys, this is an old question that has already been resolved. If you have new issues, please post a new question. In your case you apparently have a mismatch between the system time on your DB server and the system time on the node where you run Sqoop.