Member since: 10-19-2017
Posts: 23
Kudos Received: 1
Solutions: 0
11-15-2021
06:14 AM
Hello, I am interested in the same question (CDP 7.1.6). Best regards
09-08-2021
03:19 AM
Hi, I am having the same issue on CDP 7.1.6 with Oozie 5.1.0, but the suggested solution no longer seems to work. Setting

<property>
  <name>oozie.launcher.yarn.app.mapreduce.am.env</name>
  <value>SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark/</value>
</property>

has no effect. Is there anything else I can do? Has the setting changed?
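For context, here is a trimmed-down sketch of how the action looks in my workflow.xml (the master, mode, class, jar and transition names below are placeholders, not my real job):

<action name="spark-node">
  <spark xmlns="uri:oozie:spark-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- property from the suggested solution; on CDP 7.1.6 it seems to have no effect -->
      <property>
        <name>oozie.launcher.yarn.app.mapreduce.am.env</name>
        <value>SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark/</value>
      </property>
    </configuration>
    <master>yarn</master>
    <mode>cluster</mode>
    <name>example-spark-job</name>
    <class>com.example.Main</class>
    <jar>${nameNode}/user/oozie/apps/example/example.jar</jar>
  </spark>
  <ok to="end"/>
  <error to="fail"/>
</action>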
06-02-2021
06:34 AM
Hi, you can get the legacy behavior of CREATE TABLE by executing

SET hive.create.as.external.legacy=true;

https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/configuring-apache-hive/topics/hive_create_table_default.html

This gives you the old Hive 1.x / 2.x behavior for your tables.
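A minimal sketch of how you would use it in a Beeline session (sales_demo and its columns are just example names):

SET hive.create.as.external.legacy=true;

-- With the legacy setting, a plain CREATE TABLE should behave like Hive 1.x/2.x
-- instead of being turned into a managed, transactional (ACID) table.
CREATE TABLE sales_demo (
  id INT,
  amount DOUBLE
)
STORED AS ORC;

-- Check how the table was actually created (Table Type, transactional properties)
DESCRIBE FORMATTED sales_demo;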
11-05-2019
11:15 PM
I have the solution now: in the "Custom hanaes-site" set

sap.hana.es.dmz.proxy.host=sandbox-proxy

and in the remote data source in HANA:

CREATE REMOTE SOURCE "proxy_spark" ADAPTER "sparksql"
  CONFIGURATION 'server=<SANDBOX-VM IP>;port=8090;ssl_mode=disabled;proxy_host=<SANDBOX-VM IP>'
  WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=hanaes;password=hanaes';

With the WebHCat server enabled I got errors saying that the port is not free, so either use another port that is published in the sandbox-proxy or disable the WebHCat server. Normally 8090 and 8091 should be free to use according to the docs.
11-05-2019
09:16 AM
I think we have to configure the sandbox-proxy container as proxy host in the sparkcontroller config. I will try that next and report the result
11-05-2019
04:42 AM
Hi, we have exactly the same issue... querying an external table as suggested does not help. We have also completely deleted Ranger and Knox from the Sandbox. The port we use is 8090; both 8090 and 8091 are published in the HDP docker container. Does anyone know how to resolve the issue?
10-19-2017
03:29 AM
I am building a Java application that is supposed to import data via Sqoop and do some HDFS operations. The application should run on Cloudera CDH 5.12.0 with Sqoop 1.4.6.

I successfully ran the Sqoop import using a Java ProcessBuilder, but this seems like a dirty way to do it. My next approach was to use the Sqoop ImportTool, but there I ran into deprecation issues: I used org.apache.sqoop.tool.ImportTool, which accepts a SqoopOptions object in its constructor, but it needs the SqoopOptions from the com.cloudera... package, which is deprecated. So I tried a different solution:

import org.apache.sqoop.Sqoop;
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
Sqoop.runTool(sqoopOptions.toArray(new String[sqoopOptions.size()]), conf);

This results in the following error:

17/10/18 11:01:15 WARN tool.SqoopTool: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
17/10/18 11:01:15 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.12.0
17/10/18 11:01:15 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/10/18 11:01:15 WARN sqoop.ConnFactory: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
17/10/18 11:01:15 INFO manager.SqlManager: Using default fetchSize of 1000
17/10/18 11:01:15 ERROR oracle.OracleConnectionFactory: Unable to load the jdbc driver class : oracle.jdbc.OracleDriver
17/10/18 11:01:15 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.RuntimeException: Unable to load the jdbc driver class : oracle.jdbc.OracleDriver
at org.apache.sqoop.manager.oracle.OracleConnectionFactory.loadJdbcDriver(OracleConnectionFactory.java:75)
at org.apache.sqoop.manager.oracle.OracleConnectionFactory.createOracleJdbcConnection(OracleConnectionFactory.java:52)
at org.apache.sqoop.manager.oracle.OraOopConnManager.makeConnection(OraOopConnManager.java:97)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.oracle.OraOopManagerFactory.accept(OraOopManagerFactory.java:114)
at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184)
at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:270)
at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:95)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:609)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at jobs.util.FullTableImportStrategy.runSqoopImportToHdfsTempDir(FullTableImportStrategy.java:77)
at jobs.util.FullTableImportStrategy.execute(FullTableImportStrategy.java:29)
at jobs.ImportJob.<init>(ImportJob.java:12)
at start.Main.main(Main.java:12)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

The Oracle JDBC driver is set up correctly, since everything works from the command line and with the ProcessBuilder. How can I resolve this error, and is there maybe a better way to run a Sqoop command from a Java program? Thank you for your help!
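For completeness, here is a trimmed-down, self-contained version of what I am running; the connection string, table name, credentials and target directory are placeholders, not my real values:

import org.apache.hadoop.conf.Configuration;
import org.apache.sqoop.Sqoop;

public class SqoopImportExample {
    public static void main(String[] args) throws Exception {
        // Placeholder Oracle connection, table and credentials -- not my real values
        String[] sqoopArgs = new String[] {
            "import",
            "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
            "--username", "scott",
            "--password", "tiger",
            "--table", "SOME_TABLE",
            "--target-dir", "/tmp/sqoop_import_test"
        };

        Configuration conf = new Configuration();
        // Same call as in my application; this is where the
        // "Unable to load the jdbc driver class" error appears.
        int exitCode = Sqoop.runTool(sqoopArgs, conf);
        System.out.println("Sqoop exit code: " + exitCode);
    }
}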
Labels:
- Apache Sqoop
09-05-2017
11:46 AM
Do you need any more details to help me? Has anyone already faced the same issue?
09-04-2017
05:02 PM
I have a Falcon feed on an external Hive table. This feed is the input to a Falcon process. When I schedule both the input feed and the process, the process stays in WAITING status. I understand that Hive thinks the data is not there yet, because no partition has been added for it. If the process instance started, I could run MSCK REPAIR TABLE input_table;. If I do this manually, the feed is processed as soon as the data has been copied to the right directory. But how can I add the partitions automatically?
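For reference, this is what I run by hand today, next to the explicit alternative; the partition column feed_date, its value and the HDFS location below are just examples, not my real schema:

-- What I currently run manually after the data has landed in HDFS
MSCK REPAIR TABLE input_table;

-- Explicit alternative; this would need the concrete partition value per feed instance
ALTER TABLE input_table ADD IF NOT EXISTS
  PARTITION (feed_date='2017-09-04')
  LOCATION '/data/input_table/feed_date=2017-09-04';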
Labels:
- Apache Falcon
- Apache HCatalog
- Apache Hive
08-22-2017
08:05 AM
So basically the root of the problem is a lack of resources? And if I cannot increase them, I can only deal with the symptoms (the alerts)?
08-22-2017
07:39 AM
A few minutes after the job has ended, everything is back to normal.
08-22-2017
07:38 AM
Hi, my cluster seems to work fine, but when I submit a Hive or Sqoop job I get alerts (see screenshot). I already followed the recommendations of the post "how to get rid of ambari stale alerts", but the alerts keep showing up... My cluster is running on vSphere virtual machines. Could this cause the problem, e.g. an overloaded network?
Labels:
- Apache Ambari
- Apache Hadoop
- Apache Hive
06-01-2017
12:00 PM
1 Kudo
Hi, I am just getting into Hadoop and HDFS. I am confused about how HDFS prevents data loss in case the NameNode fails. In the documentation I found three mechanisms: the Secondary NameNode, the Checkpoint Node, and the Backup Node. I understand the differences between them, but I am not sure whether the Checkpoint Node and the Backup Node are deprecated, since I can't find them in my Hortonworks distribution. I also understand that none of them is necessary if you deploy Hadoop HA. Is there any guideline on which of these should be used in production? Thank you for your answers.
Labels:
- Apache Hadoop