Member since 07-03-2017 · 10 Posts · 1 Kudos Received · 0 Solutions
07-16-2018
12:25 AM
I have a root user and a test-user. Steps I followed to launch spark2-shell:

As root:
1. Log in as root
2. kinit with the keytab and principal
3. spark2-shell

I was able to launch spark2-shell successfully without any error.

As test-user:
1. Log in as test-user
2. kinit with the keytab and principal
3. spark2-shell

I get the exception below while launching. It looks like a Unix file system "Permission denied" error. Can you help me figure out what I need to change? (A diagnostic sketch follows the stack trace.)

java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1053)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:129)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:126)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:938)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:97)
... 47 elided
Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.io.IOException: Permission denied;
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:108)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:106)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:94)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)
... 61 more
Caused by: java.lang.RuntimeException: java.io.IOException: Permission denied
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:554)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:176)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:357)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:261)
at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:68)
at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:67)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:196)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:196)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:196)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:99)
... 70 more
Caused by: java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2024)
at org.apache.hadoop.hive.common.FileUtils.createTempFile(FileUtils.java:787)
at org.apache.hadoop.hive.ql.session.SessionState.createTempFile(SessionState.java:872)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:552)
... 84 more
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^

Regards,
Sankar
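The deepest cause in the trace is java.io.UnixFileSystem.createFileExclusively inside Hive's SessionState.start, i.e. the shell cannot create a local temporary file as test-user. A minimal diagnostic sketch, assuming the default local scratch locations and a CDH-style config path (/tmp/hive, hive.exec.local.scratchdir, and /etc/hive/conf/hive-site.xml are assumptions; verify them on your cluster):

# Run these as test-user.
# 1. Check whether the local Hive scratch directories exist and are writable by test-user.
id
ls -ld /tmp /tmp/hive "/tmp/hive/$(whoami)" 2>/dev/null
# 2. Look up the configured local scratch and resource-download directories
#    (config path is an assumption; use your cluster's hive-site.xml).
grep -A1 -E 'hive.exec.local.scratchdir|hive.downloaded.resources.dir' /etc/hive/conf/hive-site.xml
# If one of these directories is owned by root and not writable by test-user,
# that would explain why root can launch spark2-shell while test-user cannot.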
Labels: Apache Hive, Apache Spark
11-24-2017
08:04 PM
I followed http://atlas.apache.org/InstallationSteps.html to set up Atlas (with external HBase and Solr). After starting ./atlas_start.py, no more log entries were generated after "AuditFilter initialization started".

Application log:

2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingDslQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingFullTextQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingBasicQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,176 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityLineageService.getSchemaForHiveTableByGuid (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,227 INFO - [main:] ~ Starting service org.apache.atlas.web.service.ActiveInstanceElectorService (Services:53)
2017-11-24 18:09:41,227 INFO - [main:] ~ HA is not enabled, no need to start leader election service (ActiveInstanceElectorService:96)
2017-11-24 18:09:41,228 INFO - [main:] ~ Starting service org.apache.atlas.kafka.KafkaNotification (Services:53)
2017-11-24 18:09:41,228 INFO - [main:] ~ Starting service org.apache.atlas.notification.NotificationHookConsumer (Services:53)
2017-11-24 18:09:41,228 INFO - [main:] ~ HA is disabled, starting consumers inline. (NotificationHookConsumer:143)
2017-11-24 18:09:41,228 INFO - [main:] ~ Consumer property: atlas.kafka.enable.auto.commit: null (KafkaNotification:275)
2017-11-24 18:09:41,346 WARN - [main:] ~ The configuration hook.group.id = atlas was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,346 WARN - [main:] ~ The configuration data = /home/ec2-user/sankar/atlas/distro/target/apache-atlas-1.0.0-SNAPSHOT-bin/apache-atlas-1.0.0-SNAPSHOT/data/kafka was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.connection.timeout.ms = 200 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration key.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.session.timeout.ms = 400 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration value.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.connect = 10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.sync.time.ms = 20 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,348 WARN - [main:] ~ The configuration poll.timeout.ms = 1000 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,416 INFO - [main:] ~ Starting service org.apache.atlas.repository.audit.HBaseBasedAuditRepository (Services:53)
2017-11-24 18:09:41,417 INFO - [NotificationHookConsumer thread-0:] ~ [atlas-hook-consumer-thread], Starting (Logging$class:68)
2017-11-24 18:09:41,418 INFO - [NotificationHookConsumer thread-0:] ~ ==> HookConsumer doWork() (NotificationHookConsumer$HookConsumer:305)
2017-11-24 18:09:41,418 INFO - [NotificationHookConsumer thread-0:] ~ Atlas Server is ready, can start reading Kafka events. (NotificationHookConsumer$HookConsumer:508)
2017-11-24 18:09:41,437 INFO - [main:] ~ HA is disabled. Hence creating table on startup. (HBaseBasedAuditRepository:384)
2017-11-24 18:09:41,438 INFO - [main:] ~ Checking if table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:343)
2017-11-24 18:09:41,447 INFO - [main:] ~ Table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:355)
2017-11-24 18:09:41,835 INFO - [main:] ~ AuditFilter initialization started (AuditFilter:57)

Config file:

# Graph Database
#Configures the graph database to use. Defaults to JanusGraph 0.1.1
#atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase
# Graph Storage
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=apache_atlas_titan
#Hbase
#For standalone mode , specify localhost
#for distributed mode, specify zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=10.115.80.165,10.115.80.168,10.115.80.97
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000
# Gremlin Query Optimizer
#
# Enables rewriting gremlin queries to maximize performance. This flag is provided as
# a possible way to work around any defects that are found in the optimizer until they
# are resolved.
#atlas.query.gremlinOptimizerEnabled=true
# Delete handler
#
# This allows the default behavior of doing "soft" deletes to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.graph.SoftDeleteHandler - all deletes are "soft" deletes
# org.apache.atlas.repository.graph.HardDeleteHandler - all deletes are "hard" deletes
#
#atlas.DeleteHandler.impl=org.apache.atlas.repository.graph.SoftDeleteHandler
# Entity audit repository
#
# This allows the default behavior of logging entity changes to hbase to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.audit.HBaseBasedAuditRepository - log entity changes to hbase
# org.apache.atlas.repository.audit.NoopEntityAuditRepository - disable the audit repository
#
#atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.NoopEntityAuditRepository
#org.apache.atlas.repository.audit.HBaseBasedAuditRepository
# Graph Search Index
atlas.graph.index.search.backend=solr
#Solr
#Solr cloud mode properties
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
#Solr http mode properties
atlas.graph.index.search.solr.mode=http
atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr
# Solr-specific configuration property
atlas.graph.index.search.max-result-set-size=150
######### Notification Configs #########
atlas.notification.embedded=false
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
#localhost:9026
atlas.kafka.bootstrap.servers=10.115.80.165:9092
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.connection.timeout.ms=200
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas
atlas.kafka.enable.auto.commit=false
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.poll.timeout.ms=1000
atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000
# Enable for Kerberized Kafka clusters
#atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab
######### Hive Lineage Configs #########
## Schema
atlas.lineage.schema.query.hive_table=hive_table where __guid='%s'\, columns
atlas.lineage.schema.query.Table=Table where __guid='%s'\, columns
## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443
######### Security Properties #########
# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks
######### Server Properties #########
atlas.rest.address=http://10.115.80.165:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false
######### Entity Audit Configs #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181

I tried to hit the application URL (say, http://localhost:21000), but it throws an error:

wget http://10.115.80.165:21000 --no-proxy
--2017-11-24 17:54:05--  http://10.x.x.x:21000/
Connecting to 10.x.x.x:21000... connected.
HTTP request sent, awaiting response... No data received.
Retrying.
I verified that the port is in use with netstat -tunlp | grep 21000:

tcp 0 0 0.0.0.0:21000 0.0.0.0:* LISTEN 35415/java

I have no idea how to proceed.
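Since the port is listening but wget receives no data, it may help to check whether the Atlas web application responds at all and to watch the log while doing so. A minimal sketch, assuming the default admin REST path and admin/admin file-based credentials (both assumptions; adjust host, port, credentials, and log path for your installation):

# Probe the Atlas admin API directly, bypassing any proxy
# (endpoint path and admin/admin credentials are assumptions).
curl -v -u admin:admin --noproxy '*' http://10.115.80.165:21000/api/atlas/admin/version
# Watch the application log while the request is in flight
# (path is relative to the Atlas installation directory).
tail -f logs/application.log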
Labels: Apache Atlas
09-27-2017
11:09 PM
Hi All,

1. I am trying to launch a 5-node cluster (AWS) using a Cloudera Director bootstrap (Director IP: say 10.10.10.1). Cloudera Director is proxy enabled.
2. I have created a local repository and exposed it on another instance (say http://10.10.10.2/test_repo).
3. I set NO_PROXY in .bash_profile on the Cloudera Director host (10.10.10.1) to reach the repository, and I am able to access it from the shell.

Cloudera Director has a properties file called application.properties in /etc/cloudera-director-client/. I have updated the proxy settings in application.properties, but there is no property for NO_PROXY to exclude specific IPs. I have set the repository URL property in the config file, but while bootstrapping with the config file I get a "503 Service Unavailable" error.

Is there anything I can do to exclude the local repository (10.10.10.2) from the proxy, similar to NO_PROXY, in application.properties? Or is there any other solution I can configure?

Sankaranarayanan. S.
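One possibility, offered only as a sketch: the JVM-level property http.nonProxyHosts excludes hosts from an HTTP proxy for standard Java HTTP clients. Whether the Director client picks these flags up from JAVA_OPTS is an assumption to verify for your Director version, and proxy.example.com:3128 and cluster.conf below are placeholders for your own proxy and bootstrap configuration file:

# Hypothetical: pass standard JVM proxy settings, including an exclusion for the
# local repository host, to the Director client via JAVA_OPTS.
# Director honoring JAVA_OPTS / these properties is an assumption, not documented behavior.
export JAVA_OPTS="-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=3128 -Dhttp.nonProxyHosts=10.10.10.2|localhost"
cloudera-director bootstrap cluster.conf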
Labels: Cloudera Manager