Member since: 11-24-2017
Posts: 76
Kudos Received: 8
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2904 | 05-14-2018 10:28 AM
 | 5345 | 03-28-2018 12:19 AM
 | 2614 | 02-07-2018 02:54 AM
 | 3077 | 01-26-2018 03:41 AM
 | 4450 | 01-05-2018 02:06 AM
04-17-2018
06:37 AM
@Harsh J Thank you; unfortunately I only have access to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.
04-16-2018
01:55 AM
Thank you, that's exactly what I was thinking. With all queries aggregated in one script I gain speed (no overhead from spinning up YARN containers per query), but in case of an error I lose granularity for debugging.
04-10-2018
11:45 AM
Thank you very much for the detailed answer @mzkdm. This is indeed a very interesting point. Do you think it could make sense to have daily partitions, since my main ingestion workflow runs once a day? And how can I force Hive or Impala users to use the latest point-in-time data? Thanks for the help!
03-28-2018
12:19 AM
2 Kudos
I've finally solved it using the executeUpdate method:

// invalidate metadata on Impala so it picks up the new table state
try {
    Statement stmt = impalaConn.createStatement();
    try {
        // no trailing semicolon: the JDBC driver sends a single statement
        String query = "INVALIDATE METADATA";
        // executeUpdate returns an update count; there is no ResultSet to iterate
        int result = stmt.executeUpdate(query);
    }
    finally {
        stmt.close();
    }
}
catch (SQLException ex) {
    // print the whole chain of SQLExceptions
    while (ex != null) {
        ex.printStackTrace();
        ex = ex.getNextException();
    }
    System.exit(1);
}

Thanks for the help!
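The same cleanup can be written more compactly with try-with-resources, which closes the statement automatically even when executeUpdate throws. This is just a sketch: `impalaConn` is assumed to be an already-open JDBC connection, and `chainMessages` is a hypothetical helper (not part of any driver) that mirrors the getNextException() loop above:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class InvalidateMetadata {

    // Run INVALIDATE METADATA with automatic statement cleanup.
    static void invalidate(Connection impalaConn) throws SQLException {
        try (Statement stmt = impalaConn.createStatement()) {
            stmt.executeUpdate("INVALIDATE METADATA"); // no trailing semicolon
        } // stmt.close() is called here, even if executeUpdate throws
    }

    // Hypothetical helper: flatten a chained SQLException into its messages,
    // mirroring the getNextException() loop from the post.
    static List<String> chainMessages(SQLException ex) {
        List<String> messages = new ArrayList<>();
        while (ex != null) {
            messages.add(ex.getMessage());
            ex = ex.getNextException();
        }
        return messages;
    }
}
```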
03-02-2018
01:10 PM
The second error is expected: you are using the HiveServer1 driver to connect to HiveServer2, which is why it fails. According to your beeline command you have HS2, so you need the following URL: jdbc:hive2://localhost:10000;AuthMech=0;transportMode=binary with Class.forName("com.cloudera.hive.jdbc4.HS2Driver"); The UID is not required, but you can provide it. Since you don't have LDAP configured, you definitely don't need to set a password.
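As a sketch, the connection setup might look like the following. `buildHs2Url` is a hypothetical helper for assembling the URL above, and the driver class name assumes the Cloudera Hive JDBC4 driver jar is on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class Hs2Connect {

    // Hypothetical helper: assemble an HS2 JDBC URL from its parts.
    static String buildHs2Url(String host, int port, int authMech, String transportMode) {
        return "jdbc:hive2://" + host + ":" + port
                + ";AuthMech=" + authMech
                + ";transportMode=" + transportMode;
    }

    static Connection connect() throws ClassNotFoundException, SQLException {
        // Requires the Cloudera Hive JDBC4 driver on the classpath
        Class.forName("com.cloudera.hive.jdbc4.HS2Driver");
        String url = buildHs2Url("localhost", 10000, 0, "binary");
        // AuthMech=0 means no authentication, so no user/password is needed
        return DriverManager.getConnection(url);
    }
}
```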
02-07-2018
02:54 AM
Solved, I was looking in the wrong section:
01-26-2018
03:41 AM
Finally solved by adding the connection-manager parameter and the Teradata JDBC drivers to the path:

sqoop import \
--connection-manager com.teradata.jdbc.TeraDriver \
--connect jdbc:teradata://host/DATABASE=db \
[...]

The following files should be available to Oozie (I've put them in the application lib folder):

sqoop-connector-teradata-1.6c5.jar
tdgssconfig.jar
terajdbc4.jar

I guess the documentation should be updated, since with the information provided the import from Teradata works from standalone Sqoop but always fails inside Oozie.
01-25-2018
12:51 AM
Thank you for the answer. JAVA_HOME is not set, and I've installed the JDK version from the CDH repository. Then I guess Cloudera Manager also installed JDK 6 during the installation/configuration phase. So you recommend uninstalling JDK 6 and 7 and installing the latest supported JDK on each host?
01-22-2018
07:58 AM
Hi bgooley, thank you for the answer! I've set up the Sqoop 1 Client Gateway and deployed the configuration. The Teradata Connector parcel was already activated during the cluster configuration phase. What puzzles me is that the Cloudera documentation for the manual installation says that the following property must be added to the sqoop-site.xml file in order to use the Teradata connector with Oozie:

<configuration>
  <property>
    <name>sqoop.connection.factories</name>
    <value>com.cloudera.connector.teradata.TeradataManagerFactory</value>
  </property>
</configuration>

I've followed the Teradata connector installation path through Cloudera Manager, so I was expecting the property to have been injected automatically by CM, but it seems to be missing from the sqoop-site.xml files:

/run/cloudera-scm-agent/process/ccdeploy_sqoop-conf_etcsqoopconf.cloudera.sqoop_client_1328941158299821742/sqoop-conf/sqoop-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>sqoop.connection.factories</name>
    <value></value>
  </property>
  <property>
    <name>sqoop.tool.plugins</name>
    <value></value>
  </property>
</configuration>

/etc/sqoop/conf.cloudera.sqoop_client/sqoop-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>sqoop.connection.factories</name>
    <value></value>
  </property>
  <property>
    <name>sqoop.tool.plugins</name>
    <value></value>
  </property>
</configuration>

In this case, should I inject it manually from the configuration in the Cloudera Manager web console, as explained by @saranvisa? Thanks for the help
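If you do end up injecting it manually, the fragment would presumably go into the Sqoop 1 Client safety valve for sqoop-site.xml in Cloudera Manager, along the lines of the snippet below. This is a sketch based on the property from the manual-installation docs quoted above; verify the factory class name against your connector version:

```xml
<property>
  <name>sqoop.connection.factories</name>
  <value>com.cloudera.connector.teradata.TeradataManagerFactory</value>
</property>
```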
01-05-2018
02:06 AM
Solved with the following:

job.properties

nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/sqoop-app
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop
oozie.action.sharelib.for.hive=hive,hcatalog,sqoop

I also copied hive-site.xml into the root folder of the application and referenced it in the workflow under the <job-xml> tag:

workflow.xml

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
<start to="sqoop_action" />
<action name="sqoop_action">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/cloudera/categories"/>
</prepare>
<job-xml>hive-site.xml</job-xml>
<command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by ',' --hive-import --hive-table cloudera.categories</command>
</sqoop>
<ok to="success"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>JOB FAILED!</message>
</kill>
<end name="success"/>
</workflow-app>