Member since
06-07-2016
923
Posts
322
Kudos Received
115
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3287 | 10-18-2017 10:19 PM |
 | 3645 | 10-18-2017 09:51 PM |
 | 13324 | 09-21-2017 01:35 PM |
 | 1350 | 08-04-2017 02:00 PM |
 | 1734 | 07-31-2017 03:02 PM |
07-06-2016
10:02 PM
Check the link I just added to my answer.
07-06-2016
09:58 PM
Hi @Qi Wang Which user is running the sqoop command? Can you verify that the file /etc/hive/2.5.0.0-817/0/xasecure-audit.xml exists? Does the user running the sqoop import have read access to this file? Also, check the following link. It might be your issue. https://community.hortonworks.com/questions/369/installed-ranger-in-a-cluster-and-running-into-the.html
07-06-2016
05:12 PM
@Sunile Manjee Yes. Here is what I did. Let me know if you have any questions.

try {
    // Log in from the keytab; everything Spark-related runs inside doAs as this user
    UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(kerberos_principal, kerberos_keytab);
    objectOfMyType = ugi.doAs(new PrivilegedExceptionAction<MyType>() {
        @Override
        public MyType run() throws Exception {
            System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
            System.setProperty("spark.kryo.registrator", "fire.util.spark.Registrator");
            System.setProperty("spark.akka.timeout", "900");
            System.setProperty("spark.worker.timeout", "900");
            System.setProperty("spark.storage.blockManagerSlaveTimeoutMs", "3200000");
            // create spark context (must happen inside doAs on a kerberized cluster)
            SparkConf sparkConf = new SparkConf().setAppName("MyApp");
            sparkConf.setMaster("local");
            sparkConf.set("spark.broadcast.compress", "false");
            sparkConf.set("spark.shuffle.compress", "false");
            JavaSparkContext ctx = new JavaSparkContext(sparkConf);
            // JavaSparkContext has no sqlctx() method, so build the SQLContext explicitly
            SQLContext sqlContext = new SQLContext(ctx);
            DataFrame tdf = sqlContext.read().format("com.databricks.spark.csv")
                    .option("header", String.valueOf(header)) // Use first line of all files as header
                    .option("inferSchema", "true") // Automatically infer data types
                    .option("delimiter", delimiter)
                    .load(path);
            // some more application specific code here
            return objectOfMyType;
        }
    });
} catch (Exception exception) {
    exception.printStackTrace();
}
07-06-2016
03:28 PM
1 Kudo
I figured this out. I changed the master to local and simply loaded the remote HDFS data. It was still throwing an exception because the cluster is kerberized. I was using UserGroupInformation to create a proxy user with a valid keytab to access the cluster, but it failed because I was creating the JavaSparkContext outside of the "doAs" method. Once I created the JavaSparkContext inside "doAs", as the right proxy user, everything worked.
07-01-2016
07:36 PM
Hive jdbc jar should be at the following location. You can copy it from here.
/usr/hdp/current/hive-client/lib/hive-jdbc.jar
07-01-2016
08:26 AM
Hi, I am trying to run an application from Eclipse so I can set breakpoints and watch the values of my variables as they change. I create a JavaSparkContext from a "SparkConf" object. This object should have access to my yarn-site.xml and core-site.xml so it knows how to connect to the cluster. I have these files under /etc/hadoop/conf, and two environment variables, "HADOOP_CONF_DIR" and "YARN_CONF_DIR", set on my Mac (where Eclipse runs) using ~/Library/LaunchAgents/environment.plist. I have verified that these variables are available when I boot the Mac, and I can read them in my app in Eclipse using System.getenv("HADOOP_CONF_DIR"); they point to the right location. I have also tried adding the environment variables to my build configuration in Eclipse.

After doing all this, my code consistently fails, apparently because it is unable to read yarn-site.xml or core-site.xml. I run into the following issue:

INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/07/01 00:57:16 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

As you can see, it is not trying to connect to the correct ResourceManager address. Here is how the code looks in create(). Please let me know what you think, as this is blocking me.

public static JavaSparkContext create() {
    System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
    System.setProperty("spark.kryo.registrator", "fire.util.spark.Registrator");
    System.setProperty("spark.akka.timeout", "900");
    System.setProperty("spark.worker.timeout", "900");
    System.setProperty("spark.storage.blockManagerSlaveTimeoutMs", "3200000");
    // create spark context
    SparkConf sparkConf = new SparkConf().setAppName("MyApp");
    // if (clusterMode == false)
    {
        sparkConf.setMaster("yarn-client");
        sparkConf.set("spark.broadcast.compress", "false");
        sparkConf.set("spark.shuffle.compress", "false");
    }
    JavaSparkContext ctx = new JavaSparkContext(sparkConf); // <- fails here
    return ctx;
}
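A connection attempt to 0.0.0.0:8032 is what the YARN client falls back to when no ResourceManager address has been configured, which fits the suspicion that yarn-site.xml is not being picked up. As a minimal sketch for narrowing this down, the check below can be run from the same Eclipse launch configuration to see whether the JVM sees HADOOP_CONF_DIR and can read the two files. The class HadoopConfCheck and its method names are hypothetical helpers, not part of Spark or Hadoop, and passing this check does not prove the directory is on the application's classpath.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper (not a Spark or Hadoop API).
final class HadoopConfCheck {

    // Client config files the question expects SparkConf to find.
    static final String[] REQUIRED = {"core-site.xml", "yarn-site.xml"};

    // Returns the required files that are missing or unreadable in confDir.
    static List<String> missingFiles(Path confDir) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!Files.isReadable(confDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String dir = System.getenv("HADOOP_CONF_DIR");
        if (dir == null) {
            System.out.println("HADOOP_CONF_DIR is not set in this JVM");
            return;
        }
        System.out.println("HADOOP_CONF_DIR=" + dir);
        System.out.println("Missing or unreadable: " + missingFiles(Paths.get(dir)));
    }
}
```

If the list comes back empty and the connection still goes to 0.0.0.0:8032, the next thing to check would be whether the configuration directory is actually on the classpath of the Eclipse run configuration.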
Labels: Apache Spark
06-30-2016
08:41 PM
@hoda moradi You will have to do some research, but you might be missing a jar file. Are you sure the JDBC jar files are on the classpath? See the following two links. https://community.hortonworks.com/questions/19396/oozie-hive-action-errors-out-with-exit-code-12.html https://community.hortonworks.com/articles/9148/troubleshooting-an-oozie-flow.html
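A quick way to test the missing-jar theory is to ask the JVM whether it can load the driver class. This is a minimal sketch: ClasspathCheck and isOnClasspath are hypothetical names, and org.apache.hive.jdbc.HiveDriver is the driver class shipped in the Hive JDBC jar. It only checks the classpath of the JVM it runs in, so for an Oozie action it has to run with the same classpath the action uses.

```java
// Hypothetical helper (not part of Hive or Oozie): checks whether a class
// can be found on the current classpath without initializing it.
final class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.apache.hive.jdbc.HiveDriver";
        System.out.println(driver + " on classpath: " + isOnClasspath(driver));
    }
}
```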
06-30-2016
08:22 PM
Hi @hoda moradi Here is the issue you are running into.

User: hive is not allowed to impersonate anonymous
at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:266) at

I am assuming this is simple development and you are not too concerned about policies. If you are, then only your organization's security team can tell you which users hive can impersonate. But basically you need to enable hive impersonation. Can you check whether the following is set to true in your hive-site.xml?

<property>
  <name>hive.server2.enable.impersonation</name>
  <description>Enable user impersonation for HiveServer2</description>
  <value>true</value>
</property>

Then check the following link to set up the proxyuser settings for the hive user in core-site.xml: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_ambari_views_guide/content/_setup_HDFS_proxy_user.html

You need to set the following. Remember, this definitely cannot be * if this is for work, and that is where your security team comes in. They will tell you who the hive user can impersonate.

hadoop.proxyuser.hive.groups=*
hadoop.proxyuser.hive.hosts=*
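
For a non-development cluster, the same two properties can be restricted in core-site.xml instead of using the wildcard. The fragment below is only a sketch: the group names and the host name are hypothetical placeholders, and the real values would come from your security team. Both properties accept a comma-separated list.

```xml
<!-- core-site.xml: group and host values below are hypothetical placeholders -->
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>analysts,etl</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>hs2-host.example.com</value>
</property>
```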
06-28-2016
05:26 PM
Do you have Ambari running? You should be able to check the status of your JHS from Ambari. Otherwise, this should bring up the UI, assuming you haven't modified the default ports. http://<host>:19888
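If no browser is handy, the same check can be done from code. This is a minimal sketch that requests the history server's REST info endpoint (/ws/v1/history/info) on the default port and prints the HTTP status. JhsProbe and its method names are hypothetical, and the default host "localhost" is an assumption, so pass your own host as the first argument. On a cluster with authentication enabled on the web UIs, a 401 response would still show that the server is up.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper (not a Hadoop API): probes the Job History Server over HTTP.
final class JhsProbe {

    // Builds the URL of the history server's REST info endpoint.
    static String infoUrl(String host, int port) {
        return "http://" + host + ":" + port + "/ws/v1/history/info";
    }

    // Sends a GET request and returns the HTTP status code.
    static int status(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        conn.setRequestMethod("GET");
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost"; // assumed default
        String url = infoUrl(host, 19888); // 19888 is the default JHS web port
        System.out.println(url + " -> HTTP " + status(url));
    }
}
```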
06-28-2016
04:22 PM
@hoda moradi Can you please share your log? Is your job history server running? Thanks