Member since: 11-07-2016
Posts: 637
Kudos Received: 253
Solutions: 144

My Accepted Solutions
Title | Views | Posted
---|---|---
| 2193 | 12-06-2018 12:25 PM
| 2222 | 11-27-2018 06:00 PM
| 1726 | 11-22-2018 03:42 PM
| 2775 | 11-20-2018 02:00 PM
| 5005 | 11-19-2018 03:24 PM
10-17-2018
04:08 AM
@Anpan K, Yes. You can read it like below:
%pyspark
content = sc.textFile("file:///path/example.txt")
If the file scheme is not given, it defaults to HDFS.
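A quick way to sanity-check where an unqualified path would resolve (a minimal sketch; /path/example.txt is just a placeholder): without a scheme, the path is looked up against fs.defaultFS, which is normally HDFS.
# hdfs dfs -ls /path/example.txt ---> checks the path on HDFS (what an unqualified path resolves to)
# ls -l /path/example.txt ---> checks the same path on the local filesystem (what file:// points to)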
10-16-2018
04:06 PM
@Michael Bronson, Are you running the command from the same node where ZooKeeper is running? Can you please paste the command that you are running? Also, can you try passing the proper hostname instead of localhost:2181 when running the command?
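For example (a sketch; zk-host-1.example.com is a placeholder for a real ZooKeeper hostname from your cluster):
# zookeeper-client -server zk-host-1.example.com:2181
Once the prompt shows CONNECTED instead of CONNECTING, you can list the root znodes:
[zk: zk-host-1.example.com:2181(CONNECTED) 0] ls /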
10-16-2018
03:53 PM
@Michael Bronson, Is your cluster Kerberized? If it is not Kerberized, you may not have kinit installed. In that case, you can just run these commands:
# su hdfs
# zookeeper-client -server {zk-host}:2181
## [zk: zkhost-1(CONNECTED) 1] ls /
If this doesn't work, try restarting the ZooKeeper server and then try again.
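If you want to check whether ZooKeeper itself is healthy before restarting it, a rough sketch (the host is a placeholder and the zkServer.sh path varies by distribution):
# echo ruok | nc {zk-host} 2181 ---> ZooKeeper's built-in health check; it replies "imok" when the server is up
# /usr/hdp/current/zookeeper-server/bin/zkServer.sh status ---> shows whether this server is leader, follower, or standalone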
10-16-2018
03:45 PM
@HENI MAHER, Looks like ResourceManager is not running. Please start ResourceManager and try again.
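If you prefer to confirm this from the command line rather than Ambari, a rough sketch (Hadoop 3.x syntax; older releases use yarn-daemon.sh start resourcemanager instead):
# ps -ef | grep -i resourcemanager | grep -v grep ---> check whether a ResourceManager process is running on the RM host
# su - yarn -c "yarn --daemon start resourcemanager" ---> start the ResourceManager as the yarn user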
10-16-2018
03:38 PM
I am facing an issue while starting the Spark Thrift Server when NameNode HA is enabled. I have 2 NameNodes, on host1 and host2. The Thrift Server starts when the NameNode on host1 is active, and fails to start when the NameNode on host1 is standby. Below is the stack trace:
Exception in thread "main" org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1952)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1423)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3085)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1154)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:966)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
);
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:53)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:79)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Pasting the contents of spark-thrift-sparkconf.conf:
spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.initialExecutors 0
spark.dynamicAllocation.maxExecutors 10
spark.dynamicAllocation.minExecutors 0
spark.eventLog.dir hdfs:///spark2-history/
spark.eventLog.enabled true
spark.executor.extraJavaOptions -XX:+UseNUMA
spark.executor.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
spark.hadoop.cacheConf false
spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.interval 7d
spark.history.fs.cleaner.maxAge 90d
spark.history.fs.logDirectory hdfs:///spark2-history/
spark.history.provider org.apache.spark.deploy.history.FsHistoryProvider
spark.io.compression.lz4.blockSize 128kb
spark.master yarn-client
spark.scheduler.allocation.file /usr/hdp/current/spark2-thriftserver/conf/spark-thrift-fairscheduler.xml
spark.scheduler.mode FAIR
spark.shuffle.file.buffer 1m
spark.shuffle.io.backLog 8192
spark.shuffle.io.serverThreads 128
spark.shuffle.service.enabled true
spark.shuffle.unsafe.file.output.buffer 5m
spark.sql.autoBroadcastJoinThreshold 26214400
spark.sql.hive.convertMetastoreOrc true
spark.sql.hive.metastore.jars /usr/hdp/3.0.0.0-1634/spark2/standalone-metastore/standalone-metastore-1.21.2.3.0.0.0-1634-hive3.jar
spark.sql.hive.metastore.version 3.0
spark.sql.orc.filterPushdown true
spark.sql.orc.impl native
spark.sql.statistics.fallBackToHdfs true
spark.sql.warehouse.dir /apps/spark/warehouse
spark.unsafe.sorter.spill.reader.buffer.size 1m
spark.yarn.executor.failuresValidityInterval 2h
spark.yarn.maxAppAttempts 1
spark.yarn.queue default
I checked core-site.xml and hdfs-site.xml on the node where the Spark Thrift Server is running. fs.defaultFS has the proper value (i.e. hdfs://namespace). I am guessing that it is picking up the host1 value from some config file, but I am not sure which one. Please let me know any other places to look. Thanks
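For reference, a few commands that can show which configuration the node actually resolves (a sketch; nn1/nn2 are the logical NameNode IDs from dfs.ha.namenodes.<nameservice> in hdfs-site.xml, and the conf paths are the usual HDP locations):
# hdfs getconf -confKey fs.defaultFS ---> default filesystem the Hadoop client on this node resolves
# hdfs haadmin -getServiceState nn1 ---> reports active or standby for the first NameNode
# hdfs haadmin -getServiceState nn2 ---> reports active or standby for the second NameNode
# grep -R "host1" /etc/spark2/conf /etc/hadoop/conf ---> look for a hard-coded host1 reference in the client configs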
Labels:
- Apache Hadoop
- Apache Spark
10-16-2018
03:24 PM
@Michael Bronson, From the logs it looks like the client is not yet connected to the server: [zk: localhost:2181(CONNECTING) 0]. If it were connected, you would see CONNECTED instead of CONNECTING. If your cluster is Kerberized, you need to run kinit before connecting with the ZooKeeper client. You can run the steps below:
# kinit -kt /etc/security/keytabs/hdfs.headless.keytab {principal}
# zookeeper-client -server {zk-host}:2181
## [zk: zkhost-1(CONNECTED) 1] ls /
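If you are not sure which principal to pass to kinit, it can be read from the keytab itself (the principal shown below is only an example):
# klist -kt /etc/security/keytabs/hdfs.headless.keytab ---> lists the principals stored in the keytab
# kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-mycluster@EXAMPLE.COM ---> authenticate with one of the listed principals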
10-16-2018
03:27 AM
2 Kudos
If you have erasure coded a directory and performed some operations on it, you might have observed WARN messages like the ones below:
WARN erasurecode.ErasureCodeNative: Loading ISA-L failed: Failed to load libisal.so.2 (libisal.so.2: cannot open shared object file: No such file or directory)
WARN erasurecode.ErasureCodeNative: ISA-L support is not available in your platform... using builtin-java codec where applicable
These WARN messages are due to the ISA-L library not being present on the node. Below are the steps to enable the library.
1) Clone the isa-l GitHub repository
# git clone https://github.com/01org/isa-l.git
2) Go to the cloned directory
# cd isa-l
3) Install yasm if you do not have it already
# yum install -y yasm ---> CentOS
# apt-get install yasm ---> Ubuntu
4) Build the library
# make -f Makefile.unx
5) Copy the library files to the lib directory
# cp bin/libisal.so bin/libisal.so.2 /lib64
6) Verify that the ISA-L library is enabled properly
# hadoop checknative
Expected output
18/10/12 10:20:03 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
18/10/12 10:20:03 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /usr/hdp/3.0.0.0-1634/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
zstd : false
snappy: true /usr/hdp/3.0.0.0-1634/hadoop/lib/native/libsnappy.so.1
lz4: true revision:10301
bzip2: true /lib64/libbz2.so.1
openssl: true /lib64/libcrypto.so
ISA-L: true /lib64/libisal.so.2 -------------> Shows that ISA-L is loaded.
If step 6 shows the /usr/lib64 directory instead of /lib64, you need to copy the .so files from step 5 to the /usr/lib64 directory. Perform these steps on all DataNode and NameNode hosts, or copy the .so files from the above node to the /lib64 directory of all other nodes. Hope this helps 🙂
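Once the library is loaded, a quick way to exercise an erasure-coded path and confirm the WARN messages are gone (a sketch; /data/ec and example.txt are hypothetical):
# hdfs ec -listPolicies ---> lists the erasure coding policies known to the cluster
# hdfs ec -getPolicy -path /data/ec ---> shows which policy is applied to the directory
# hdfs dfs -get /data/ec/example.txt /tmp/ ---> reading a file back should no longer print the "using builtin-java codec" warning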
10-15-2018
03:47 PM
@Madhura Mhatre, You can install all these components on some other node and then stop and delete them from the old node.
1) Go to Ambari -> Hosts
2) Select the host you want to move these components to
3) Click on the +ADD button and select the component you want to install (Spark2 History Server, Livy for Spark2 Server, etc.)
4) Start the components on the new host after installing them
5) Click on the old host, stop all the Spark components, and delete them.
-Aditya
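The same add/install step can also be scripted through Ambari's REST API if you prefer (a rough sketch; admin:admin, ambari-host, CLUSTER_NAME, NEW_HOST and the SPARK2_JOBHISTORYSERVER component name are placeholders to adapt to your cluster):
# curl -u admin:admin -H 'X-Requested-By: ambari' -X POST http://ambari-host:8080/api/v1/clusters/CLUSTER_NAME/hosts/NEW_HOST/host_components/SPARK2_JOBHISTORYSERVER ---> registers the component on the new host
# curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"HostRoles": {"state": "INSTALLED"}}' http://ambari-host:8080/api/v1/clusters/CLUSTER_NAME/hosts/NEW_HOST/host_components/SPARK2_JOBHISTORYSERVER ---> triggers the install on that host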
10-11-2018
04:09 AM
1 Kudo
@vamsi valiveti, You need at least the class name to get all the methods available in a class. I can think of a couple of solutions without needing to Google it.
Method 1: Run this command (replace {jar-path} with the real jar path):
jar -tf {jar-path} | grep -i class | sed -e 's/\//./g' | sed -e 's/\.class//g' | xargs javap -classpath {jar-path}
Method 2: Open the jar file, check the list of classes, and then list the methods in the class you want.
1) Check the class names using vim (not vi)
vim Piggybank.jar
2) Take the class name in which you want to list the methods (copy the path including the package name) and run
javap -classpath {path-to-jar-file} {full-class-name-including-package-name}
ex: javap -classpath example.jar org.apache.hadoop.xyz.Abc (Abc is the class name)
If this helps, please take a moment to login and accept the answer.
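To make Method 1 easier to follow, here is what each stage of the pipeline does, followed by the complete command again ({jar-path} is still a placeholder for your jar):
1) jar -tf {jar-path} ---> lists every entry in the jar
2) grep -i class ---> keeps only the .class entries
3) sed -e 's/\//./g' ---> turns path separators into package dots
4) sed -e 's/\.class//g' ---> strips the .class suffix, leaving fully qualified class names
5) xargs javap -classpath {jar-path} ---> prints the methods and fields of each class
# jar -tf {jar-path} | grep -i class | sed -e 's/\//./g' -e 's/\.class//g' | xargs javap -classpath {jar-path}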
10-10-2018
06:18 PM
@Sami Ahmad, You can run the command below:
set;
For example, if you want to check the params that are set to true, you can run
hive -e 'set;' | grep true
-Aditya
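If you only need one specific property, set also accepts a property name (hive.exec.parallel below is just an example property):
# hive -e 'set hive.exec.parallel;' ---> prints the current value of a single Hive property
You can also run set hive.exec.parallel; from inside the Hive shell.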