Member since: 11-24-2017
Posts: 76
Kudos Received: 8
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2904 | 05-14-2018 10:28 AM
 | 5345 | 03-28-2018 12:19 AM
 | 2614 | 02-07-2018 02:54 AM
 | 3077 | 01-26-2018 03:41 AM
 | 4450 | 01-05-2018 02:06 AM
04-17-2018
06:37 AM
@Harsh J Thank you; unfortunately I only have access to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.
04-16-2018
01:55 AM
Thank you, that's exactly what I was thinking. With all queries aggregated in one script I gain speed (no overhead from spinning up YARN containers per query), but in case of an error I lose granularity for debugging.
04-10-2018
11:45 AM
Thank you very much for the detailed answer @mzkdm. This is indeed a very interesting point. Do you think it could make sense to have daily partitions, since my main ingestion workflow runs once a day? And how can I force Hive or Impala users to use the latest point-in-time data? Thanks for the help!
03-28-2018
12:19 AM
2 Kudos
I've finally solved it using the executeUpdate method:

// invalidate metadata on Impala so it picks up the new table state
try {
    Statement stmt = impalaConn.createStatement();
    try {
        // no trailing semicolon: the JDBC driver sends a single statement
        String query = "INVALIDATE METADATA";
        // executeUpdate returns an update count; there is no ResultSet to iterate
        int result = stmt.executeUpdate(query);
    }
    finally {
        stmt.close();
    }
}
catch (SQLException ex) {
    // print the whole chain of SQLExceptions
    while (ex != null) {
        ex.printStackTrace();
        ex = ex.getNextException();
    }
    System.exit(1);
}

Thanks for the help!
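The same cleanup can be written more compactly with try-with-resources, which closes the statement automatically even when executeUpdate throws. This is just a sketch: `impalaConn` is assumed to be an already-open JDBC connection, and `chainMessages` is a hypothetical helper (not part of any driver) that mirrors the getNextException() loop above:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class InvalidateMetadata {

    // Run INVALIDATE METADATA with automatic statement cleanup.
    static void invalidate(Connection impalaConn) throws SQLException {
        try (Statement stmt = impalaConn.createStatement()) {
            stmt.executeUpdate("INVALIDATE METADATA"); // no trailing semicolon
        } // stmt.close() is called here, even if executeUpdate throws
    }

    // Hypothetical helper: flatten a chained SQLException into its messages,
    // mirroring the getNextException() loop from the post.
    static List<String> chainMessages(SQLException ex) {
        List<String> messages = new ArrayList<>();
        while (ex != null) {
            messages.add(ex.getMessage());
            ex = ex.getNextException();
        }
        return messages;
    }
}
```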
03-02-2018
01:10 PM
The second error is expected: you are using the HiveServer1 driver to connect to HiveServer2, which is why it fails. According to your beeline command you have HS2, so you need the following URL: jdbc:hive2://localhost:10000;AuthMech=0;transportMode=binary with Class.forName("com.cloudera.hive.jdbc4.HS2Driver"); The UID is not required, but you can provide it. Since you don't have LDAP configured, you definitely don't need to set a password.
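As a sketch, the connection setup might look like the following. `buildHs2Url` is a hypothetical helper for assembling the URL above, and the driver class name assumes the Cloudera Hive JDBC4 driver jar is on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class Hs2Connect {

    // Hypothetical helper: assemble an HS2 JDBC URL from its parts.
    static String buildHs2Url(String host, int port, int authMech, String transportMode) {
        return "jdbc:hive2://" + host + ":" + port
                + ";AuthMech=" + authMech
                + ";transportMode=" + transportMode;
    }

    static Connection connect() throws ClassNotFoundException, SQLException {
        // Requires the Cloudera Hive JDBC4 driver on the classpath
        Class.forName("com.cloudera.hive.jdbc4.HS2Driver");
        String url = buildHs2Url("localhost", 10000, 0, "binary");
        // AuthMech=0 means no authentication, so no user/password is needed
        return DriverManager.getConnection(url);
    }
}
```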
02-07-2018
02:54 AM
Solved, I was looking in the wrong section:
01-26-2018
03:41 AM
Finally solved by adding the connection-manager parameter and the Teradata JDBC drivers to the path:

sqoop import \
--connection-manager com.teradata.jdbc.TeraDriver \
--connect jdbc:teradata://host/DATABASE=db \
[...]

The following files should be available to Oozie (I've put them in the application lib folder):

sqoop-connector-teradata-1.6c5.jar
tdgssconfig.jar
terajdbc4.jar

I guess the documentation should be updated, since with the information provided the import from Teradata works from standalone Sqoop but always fails inside Oozie.
01-25-2018
12:51 AM
Thank you for the answer. JAVA_HOME is not set, and I've installed the JDK version from the CDH repository. Then I guess Cloudera Manager also installed JDK 6 during the installation/configuration phase. So you recommend uninstalling JDK 6 and 7 and installing the latest supported JDK on each host?
01-22-2018
07:58 AM
Hi bgooley, thank you for the answer! I've set up the Sqoop 1 Client Gateway and deployed the configuration. The Teradata Connector parcel was already activated during the cluster configuration phase. What puzzles me is that the Cloudera documentation for the manual installation says that the following property must be added to the sqoop-site.xml file in order to use the Teradata connector with Oozie:

<configuration>
  <property>
    <name>sqoop.connection.factories</name>
    <value>com.cloudera.connector.teradata.TeradataManagerFactory</value>
  </property>
</configuration>

I've followed the Teradata connector installation path through Cloudera Manager, so I was expecting the property to have been injected automatically by CM, but it seems to be missing from the sqoop-site.xml files:

/run/cloudera-scm-agent/process/ccdeploy_sqoop-conf_etcsqoopconf.cloudera.sqoop_client_1328941158299821742/sqoop-conf/sqoop-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>sqoop.connection.factories</name>
    <value></value>
  </property>
  <property>
    <name>sqoop.tool.plugins</name>
    <value></value>
  </property>
</configuration>

/etc/sqoop/conf.cloudera.sqoop_client/sqoop-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>sqoop.connection.factories</name>
    <value></value>
  </property>
  <property>
    <name>sqoop.tool.plugins</name>
    <value></value>
  </property>
</configuration>

In this case, should I inject it manually from the configuration in the Cloudera Manager web console, as explained by @saranvisa? Thanks for the help
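If you do end up injecting it manually, the fragment would presumably go into the Sqoop 1 Client safety valve for sqoop-site.xml in Cloudera Manager, along the lines of the snippet below. This is a sketch based on the property from the manual-installation docs quoted above; verify the factory class name against your connector version:

```xml
<property>
  <name>sqoop.connection.factories</name>
  <value>com.cloudera.connector.teradata.TeradataManagerFactory</value>
</property>
```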
01-05-2018
02:06 AM
Solved with the following:

job.properties

nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/sqoop-app
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop
oozie.action.sharelib.for.hive=hive,hcatalog,sqoop

I also copied hive-site.xml into the root folder of the application and referenced it in the workflow under the <job-xml> tag:

workflow.xml

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
<start to="sqoop_action" />
<action name="sqoop_action">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/cloudera/categories"/>
</prepare>
<job-xml>hive-site.xml</job-xml>
<command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by ',' --hive-import --hive-table cloudera.categories</command>
</sqoop>
<ok to="success"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>JOB FAILED!</message>
</kill>
<end name="success"/>
</workflow-app>