Member since: 10-22-2016
Posts: 28
Kudos Received: 5
Solutions: 0
05-04-2018
02:32 PM
Further digging around in the Hive source code turned up https://github.com/apache/hive/commit/8ce0118ffe517f0c622571778251cbd9f760c4f5#diff-a0e344e574e0fe542ad8a98e64c967cf, in particular https://github.com/apache/hive/blob/1eea5a80ded2df33d57b2296b3bed98cb18383fd/ql/src/test/queries/clientpositive/reloadJar.q, which leads me to believe that HDFS should be supported:
--! qt:dataset:src
dfs -mkdir ${system:test.tmp.dir}/aux;
dfs -cp ${system:hive.root}/data/files/identity_udf.jar ${system:test.tmp.dir}/aux/udfexample.jar;
SET hive.reloadable.aux.jars.path=${system:test.tmp.dir}/aux;
RELOAD;
CREATE TEMPORARY FUNCTION example_iden AS 'IdentityStringUDF';
EXPLAIN
SELECT example_iden(key)
FROM src LIMIT 1;
SELECT example_iden(key)
FROM src LIMIT 1;
DROP TEMPORARY FUNCTION example_iden;
dfs -rm -r ${system:test.tmp.dir}/aux;
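For context, the kind of UDF class such a test jar would provide looks roughly like the following. This is only a sketch using the classic Hive UDF API; I have not checked the actual contents of identity_udf.jar, so the class body is my assumption.
```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical reconstruction of the IdentityStringUDF referenced above:
// a UDF that simply returns its string argument unchanged.
public class IdentityStringUDF extends UDF {
  public Text evaluate(Text input) {
    // Return the input as-is (null stays null).
    return input;
  }
}
```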
EDIT: It appears that CREATE TEMPORARY FUNCTION example_iden AS 'IdentityStringUDF'; throws a warning: WARN ... permanent functions created without USING clause will not be replicated. So I assume the USING '/path/to/jar.jar' clause is mandatory for permanent UDFs even when the reloadable flag is set.
05-03-2018
05:04 PM
Hi, I want to update Hive UDFs without requiring a restart of Hive. According to:
https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_mc_hive_udf.html#concept_zb2_rxr_lw
setting hive.reloadable.aux.jars.path
is required. I have set it to
/user/hive/libs/udf
(which resides on HDFS). However, following their documentation I see:
file:///usr/lib/hive/lib/foo.jar
which is confusing me. Does this property only work for files residing on the local file system? Do I understand correctly that I should execute Beeline's RELOAD manually? Also, in case this property works for HDFS, does it automatically pick up (load) the classes in the jar, so that specifying the jar in CREATE FUNCTION foo AS ... USING JAR 'my/path/to/jar-1.jar' is no longer required?
Desired behaviour:
1. copy jar to HDFS
/user/hive/lib/udf/foo-1.jar
2. add the function to Hive:
DROP FUNCTION IF EXISTS foo;
CREATE FUNCTION foo AS 'my.class.path.in.jar.FooUDF' USING JAR '/user/hive/lib/udf/foo-1.jar';
3. add a new jar to HDFS
/user/hive/lib/udf/foo-2.jar
4. update the function in Hive:
DROP FUNCTION IF EXISTS foo;
CREATE FUNCTION foo AS 'my.class.path.in.jar.FooUDF' USING JAR '/user/hive/lib/udf/foo-2.jar';
This currently does not work and requires a restart of Hive. Without the restart, queries see either the updated UDF or still the old one, seemingly in round-robin fashion. How can I get Hive to not require a restart when updating a UDF? Also, I do not want to put the UDF into a local directory; it should reside on HDFS. Best,
Georg
Labels:
- Apache Hive
04-16-2018
07:21 AM
Confirmed that it is set to false, and Maximum Total Concurrent Queries is > 0. The problem seems to have resolved itself after restarting a couple of times.
04-12-2018
08:18 AM
When executing the following query on LLAP:
WITH first as (SELECT 1 as a, 1 as b) SELECT * FROM first as f JOIN first as f2 ON f.a=f2.a
Hive fails with:
IllegalArgumentException: max parallelism must be positive, currently is 0
Labels:
- Apache Hive
02-19-2018
08:44 AM
I face problems with a Hive UDF and Java dependency hell:
2018-02-19 10:11:49,328 [ERROR] [main] |app.DAGAppMaster|: Error starting DAGAppMaster
java.lang.VerifyError: Bad return type
Exception Details:
Location:
org/apache/hadoop/hdfs/DFSClient.getQuotaUsage(Ljava/lang/String;)Lorg/apache/hadoop/fs/QuotaUsage; @94: areturn
Reason:
Type 'org/apache/hadoop/fs/ContentSummary' (current frame, stack[0]) is not assignable to 'org/apache/hadoop/fs/QuotaUsage' (from method signature)
Obviously, dependencies with wrong versions are somehow clashing.
When looking at the jar via jar tf myjar.jar | grep hdfs, no contents are returned; the same for fd. This seems strange to me, as the exception clearly states that these classes should be involved in the problem.
The error occurs only on Tez, not on regular Hive.
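For what it's worth, one way to check at runtime which jar actually supplies the classes named in the VerifyError is plain reflection. This is only a diagnostic sketch and assumes the Hadoop client classes are on the classpath of the JVM you run it in:
```java
// Diagnostic sketch: print the jar that provides each class involved in the
// VerifyError. Differing locations usually point at the version clash.
public class WhichJar {
  public static void main(String[] args) throws Exception {
    String[] names = {
        "org.apache.hadoop.hdfs.DFSClient",
        "org.apache.hadoop.fs.QuotaUsage",
        "org.apache.hadoop.fs.ContentSummary"
    };
    for (String name : names) {
      Class<?> c = Class.forName(name);
      System.out.println(name + " -> "
          + c.getProtectionDomain().getCodeSource().getLocation());
    }
  }
}
```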
Labels:
- Apache Hive
- Apache Tez
07-05-2017
01:59 PM
I want to use Spark's JDBC connection to write a data frame to Oracle. My data frame has a string column which is very long, at least longer than the default 255 characters that Spark will allocate when creating the schema.
How can I still write to the Oracle table? Does it work when I manually create the schema first with CLOB datatypes? If yes, how can I get Spark to only `TRUNCATE` instead of overwriting the table?
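What I have in mind is roughly the following (a sketch only, not verified against Oracle; table, column, and connection details are placeholders). It assumes the table is created manually up front with a CLOB column, so Spark only has to truncate and refill it via the `truncate` writer option (available since Spark 2.1):
```java
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class OracleWriteSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder().appName("oracle-write-sketch").getOrCreate();

    // Placeholder source; in reality this is the data frame with the long string column.
    Dataset<Row> df = spark.read().option("header", "true").csv("/tmp/input.csv");

    Properties props = new Properties();
    props.setProperty("user", "scott");                     // placeholder credentials
    props.setProperty("password", "tiger");
    props.setProperty("driver", "oracle.jdbc.OracleDriver");

    df.write()
      .mode(SaveMode.Overwrite)
      // With a manually pre-created table, 'truncate' makes Overwrite issue a
      // TRUNCATE TABLE instead of DROP/CREATE, so the CLOB column definition survives.
      .option("truncate", "true")
      .jdbc("jdbc:oracle:thin:@//dbhost:1521/ORCL", "MY_SCHEMA.MY_TABLE", props);
  }
}
```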
Labels:
- Apache Spark
04-24-2017
05:38 PM
Thanks a lot for the great tutorial. How could this be extended to not only listen to a web socket, but also periodically send control commands, like those documented at https://blockchain.info/api/api_websocket, for example `{"op":"unconfirmed_sub"}`?
04-07-2017
08:00 PM
I see, so HDP will only ship the default Spark build? It is just a compile switch for Spark, so having that enabled would be nice. Can I just replace the HDP binaries?
04-07-2017
04:52 PM
Well, at least Spark has the option to include these machine-learning optimizations. I am just interested in whether HDP deploys these optimizations or whether manual work is required.
04-06-2017
05:18 PM
1 Kudo
Hi, is there any possibility in the newly released HDP 2.6 to install NiFi in cluster mode via Ambari?
Labels:
- Cloudera DataFlow (CDF)
04-04-2017
02:33 PM
There is a similar question, https://community.hortonworks.com/questions/52359/netlib-java-and-anaconda-for-spark-ml.html, but the netlib-lgpl part was not answered there.
04-04-2017
02:32 PM
Hi, I wonder if the Spark version provided by HDP actually ships with the optimizations mentioned in http://spark.apache.org/docs/latest/ml-guide.html compiled in: "MLlib uses the linear algebra package Breeze, which depends on netlib-java for optimised numerical processing. If native libraries are not available at runtime, you will see a warning message and a pure JVM implementation will be used instead."
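A quick way to check which implementation actually gets picked up at runtime is to ask netlib-java directly. This is only a sketch; it assumes the netlib-java classes that Breeze pulls in are on the classpath. If the output shows F2jBLAS, the pure JVM fallback is being used rather than a native library:
```java
import com.github.fommil.netlib.BLAS;
import com.github.fommil.netlib.LAPACK;

public class BlasCheck {
  public static void main(String[] args) {
    // Prints e.g. com.github.fommil.netlib.NativeSystemBLAS (native) or
    // com.github.fommil.netlib.F2jBLAS (pure JVM fallback).
    System.out.println("BLAS:   " + BLAS.getInstance().getClass().getName());
    System.out.println("LAPACK: " + LAPACK.getInstance().getClass().getName());
  }
}
```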
Labels:
- Apache Spark
03-27-2017
10:39 AM
How can I batch-read data from a source (let's say CSV or JDBC) into NiFi to query a web service? I want to read only every k-th row (let's say every 10th row) and read 50,000 records into a single batch. However, I must keep state, i.e. know which records have already been processed by NiFi.
Labels:
- Apache NiFi
02-20-2017
09:32 AM
Where is the environment variable $ACCUMULO_HOME defined for HDP 2.5.3? In the config I can only find references to this variable for further directories, but not the actual definition. In /usr/hdp/current/ there is no longer a single accumulo folder; rather I see several folders (accumulo-master, accumulo-gc, accumulo-tracer, ...).
02-16-2017
08:04 AM
Great. Do you know when 2.5.4 should be released?
02-15-2017
06:24 AM
Ambari supports the installation of Spark 2.0.0 as a technical preview. This version of Spark contains a lot of bugs which are fixed in 2.0.2 or 2.1.0. How can I install either of these versions via Ambari?
Labels:
- Apache Ambari
- Apache Spark
02-14-2017
01:23 PM
Is it possible to export a blueprint of the current cluster configuration?
02-14-2017
01:03 PM
2 Kudos
Initially, I deployed a blueprint to Ambari. Having used the nice Ambari UI to make some configuration changes, I would like to know how to export the current cluster configuration as a blueprint. If this is not possible, how can I access the "default" configuration to know the config values which need to be passed in a blueprint?
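If I remember correctly, Ambari's REST API can render the running cluster as a blueprint via a GET with format=blueprint. A minimal sketch of what I would try (host, cluster name, and credentials are placeholders, and I have not verified this against every Ambari version):
```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class ExportBlueprint {
  public static void main(String[] args) throws Exception {
    // Hypothetical host, cluster name, and credentials; adjust to your setup.
    URL url = new URL("http://ambari-host:8080/api/v1/clusters/mycluster?format=blueprint");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes("UTF-8"));
    conn.setRequestProperty("Authorization", "Basic " + auth);
    conn.setRequestProperty("X-Requested-By", "ambari");

    try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // blueprint JSON of the current cluster
      }
    }
  }
}
```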
Labels:
- Apache Ambari
02-13-2017
07:06 PM
I used
```
"policymgr_external_url" : "https://{% if isSingleNode %} {{ groups[cluster_name+'_mn01'][0] }} {% else %} {{ groups[cluster_name+'_mn03'][0] }} {% endif %}:6182",
{% else %}
"policymgr_external_url" : "http://{% if isSingleNode %} {{ groups[cluster_name+'_mn01'][0] }} {% else %} {{ groups[cluster_name+'_mn03'][0] }}{% endif %}:6080",
{% endif %}
```
to create the JSON. It is adapted from https://github.com/bushnoh/ansible-hadoop-asap/blob/master/blueprints/bare_cluster.bp.j2. After removing the spaces in all the places needed, Ambari finally installs Ranger successfully.
02-13-2017
05:27 PM
When installing Ranger via a blueprint I get the following exception:
2017-02-13 18:22:10,754 [E] create_dbversion_catalog.sql file import failed!
2017-02-13 18:22:40,763 [JISQL] /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64/bin/java -cp /usr/hdp/current/ranger-admin/ews/lib/mysql-connector-java.jar:/usr/hdp/current/ranger-admin/jisql/lib/* org.apache.util.sql.Jisql -driver mysqlconj -cstring jdbc:mysql://mn01.vagrant :3306/ranger -u 'ranger' -p '********' -noheader -trim -c \; -query "show tables like 'x_db_version_h';"
SQLException : SQL state: 3D000 java.sql.SQLException: No database selected ErrorCode: 1046
SQLException : SQL state: 3D000 java.sql.SQLException: No database selected ErrorCode: 1046
2017-02-13 18:22:41,149 [I] Table x_db_version_h does not exist in database ranger
2017-02-13 18:22:41,149 [JISQL] /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64/bin/java -cp /usr/hdp/current/ranger-admin/ews/lib/mysql-connector-java.jar:/usr/hdp/current/ranger-admin/jisql/lib/* org.apache.util.sql.Jisql -driver mysqlconj -cstring jdbc:mysql://mn01.vagrant :3306/ranger -u 'ranger' -p '********' -noheader -trim -c \; -input /usr/hdp/current/ranger-admin/db/mysql/create_dbversion_catalog.sql
Error executing: create table if not exists x_db_version_h ( id bigint not null auto_increment primary key, version varchar(64) not null, inst_at timestamp not null default current_timestamp, inst_by varchar(256) not null, updated_at timestamp null default null, updated_by varchar(256) not null, active ENUM('Y', 'N') default 'Y' ) ;
java.sql.SQLException: No database selected
SQLException : SQL state: 3D000 java.sql.SQLException: No database selected ErrorCode: 1046
2017-02-13 18:22:41,556 [E] create_dbversion_catalog.sql file import failed!
The blueprint's JSON can be found here: https://gist.github.com/geoHeil/bbe4eb9cef4f4e6c2feca743f2b19bc8, the complete input JSON here: https://gist.github.com/geoHeil/b54181d35c0d4549c0da25465cc93e29, and the full output txt here: https://gist.github.com/geoHeil/6b3d08d748e03703b35ddce424b108c1.
Labels:
- Apache Ambari
- Apache Ranger
02-08-2017
11:47 AM
@Ashnee Sharma is there any documentation on how to perform this task?
02-08-2017
07:28 AM
Can I use Ambari to install the very latest, still unsupported-by-HDP versions of e.g. Spark or Accumulo? I don't mean the technical previews here, but rather the current versions of these services.
Labels:
- Apache Ambari
- Apache Spark
02-02-2017
01:37 PM
1 Kudo
I recently read this news announcement, http://hortonworks.com/partner/syncsort/, and am wondering why Syncsort is proposed instead of NiFi.
Labels:
- Apache NiFi