Member since 07-03-2017 · 10 Posts · 1 Kudos Received · 0 Solutions
07-16-2018
12:25 AM
I have a root user and a test-user. Steps I followed to launch spark2-shell:

As root:
1. Log in as root
2. kinit with the keytab and principal
3. spark2-shell

I was able to launch spark2-shell successfully without any error.

As test-user:
1. Log in as test-user
2. kinit with the keytab and principal
3. spark2-shell

I get the exception below while launching. It looks like a Unix file system "Permission denied" error. Can you help me figure out what I need to change? (A diagnostic sketch follows the stack trace.)

java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1053)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:129)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:126)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:938)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:97)
... 47 elided
Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.io.IOException: Permission denied;
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:108)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:106)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:94)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)
... 61 more
Caused by: java.lang.RuntimeException: java.io.IOException: Permission denied
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:554)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:176)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:357)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:261)
at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:68)
at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:67)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:196)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:196)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:196)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:99)
... 70 more
Caused by: java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2024)
at org.apache.hadoop.hive.common.FileUtils.createTempFile(FileUtils.java:787)
at org.apache.hadoop.hive.ql.session.SessionState.createTempFile(SessionState.java:872)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:552)
... 84 more
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^

Regards,
Sankar
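The deepest cause in the trace is java.io.UnixFileSystem.createFileExclusively inside Hive's SessionState.start, i.e. the shell cannot create a local temporary file as test-user. A minimal diagnostic sketch, assuming the default local scratch locations and a CDH-style config path (/tmp/hive, hive.exec.local.scratchdir, and /etc/hive/conf/hive-site.xml are assumptions; verify them on your cluster):

# Run these as test-user.
# 1. Check whether the local Hive scratch directories exist and are writable by test-user.
id
ls -ld /tmp /tmp/hive "/tmp/hive/$(whoami)" 2>/dev/null
# 2. Look up the configured local scratch and resource-download directories
#    (config path is an assumption; use your cluster's hive-site.xml).
grep -A1 -E 'hive.exec.local.scratchdir|hive.downloaded.resources.dir' /etc/hive/conf/hive-site.xml
# If one of these directories is owned by root and not writable by test-user,
# that would explain why root can launch spark2-shell while test-user cannot.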
Labels: Apache Hive, Apache Spark
11-24-2017
08:04 PM
I followed http://atlas.apache.org/InstallationSteps.html to set up Atlas (with external HBase and Solr). After starting ./atlas_start.py, no more log entries were generated after "AuditFilter initialization started".

Application log:

2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingDslQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingFullTextQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingBasicQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,176 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityLineageService.getSchemaForHiveTableByGuid (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,227 INFO - [main:] ~ Starting service org.apache.atlas.web.service.ActiveInstanceElectorService (Services:53)
2017-11-24 18:09:41,227 INFO - [main:] ~ HA is not enabled, no need to start leader election service (ActiveInstanceElectorService:96)
2017-11-24 18:09:41,228 INFO - [main:] ~ Starting service org.apache.atlas.kafka.KafkaNotification (Services:53)
2017-11-24 18:09:41,228 INFO - [main:] ~ Starting service org.apache.atlas.notification.NotificationHookConsumer (Services:53)
2017-11-24 18:09:41,228 INFO - [main:] ~ HA is disabled, starting consumers inline. (NotificationHookConsumer:143)
2017-11-24 18:09:41,228 INFO - [main:] ~ Consumer property: atlas.kafka.enable.auto.commit: null (KafkaNotification:275)
2017-11-24 18:09:41,346 WARN - [main:] ~ The configuration hook.group.id = atlas was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,346 WARN - [main:] ~ The configuration data = /home/ec2-user/sankar/atlas/distro/target/apache-atlas-1.0.0-SNAPSHOT-bin/apache-atlas-1.0.0-SNAPSHOT/data/kafka was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.connection.timeout.ms = 200 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration key.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.session.timeout.ms = 400 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration value.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.connect = 10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.sync.time.ms = 20 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,348 WARN - [main:] ~ The configuration poll.timeout.ms = 1000 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,416 INFO - [main:] ~ Starting service org.apache.atlas.repository.audit.HBaseBasedAuditRepository (Services:53)
2017-11-24 18:09:41,417 INFO - [NotificationHookConsumer thread-0:] ~ [atlas-hook-consumer-thread], Starting (Logging$class:68)
2017-11-24 18:09:41,418 INFO - [NotificationHookConsumer thread-0:] ~ ==> HookConsumer doWork() (NotificationHookConsumer$HookConsumer:305)
2017-11-24 18:09:41,418 INFO - [NotificationHookConsumer thread-0:] ~ Atlas Server is ready, can start reading Kafka events. (NotificationHookConsumer$HookConsumer:508)
2017-11-24 18:09:41,437 INFO - [main:] ~ HA is disabled. Hence creating table on startup. (HBaseBasedAuditRepository:384)
2017-11-24 18:09:41,438 INFO - [main:] ~ Checking if table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:343)
2017-11-24 18:09:41,447 INFO - [main:] ~ Table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:355)
2017-11-24 18:09:41,835 INFO - [main:] ~ AuditFilter initialization started (AuditFilter:57)

Config file:

# Graph Database
#Configures the graph database to use. Defaults to JanusGraph 0.1.1
#atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase
# Graph Storage
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=apache_atlas_titan
#Hbase
#For standalone mode , specify localhost
#for distributed mode, specify zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=10.115.80.165,10.115.80.168,10.115.80.97
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000
# Gremlin Query Optimizer
#
# Enables rewriting gremlin queries to maximize performance. This flag is provided as
# a possible way to work around any defects that are found in the optimizer until they
# are resolved.
#atlas.query.gremlinOptimizerEnabled=true
# Delete handler
#
# This allows the default behavior of doing "soft" deletes to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.graph.SoftDeleteHandler - all deletes are "soft" deletes
# org.apache.atlas.repository.graph.HardDeleteHandler - all deletes are "hard" deletes
#
#atlas.DeleteHandler.impl=org.apache.atlas.repository.graph.SoftDeleteHandler
# Entity audit repository
#
# This allows the default behavior of logging entity changes to hbase to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.audit.HBaseBasedAuditRepository - log entity changes to hbase
# org.apache.atlas.repository.audit.NoopEntityAuditRepository - disable the audit repository
#
#atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.NoopEntityAuditRepository
#org.apache.atlas.repository.audit.HBaseBasedAuditRepository
# Graph Search Index
atlas.graph.index.search.backend=solr
#Solr
#Solr cloud mode properties
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
#Solr http mode properties
atlas.graph.index.search.solr.mode=http
atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr
# Solr-specific configuration property
atlas.graph.index.search.max-result-set-size=150
######### Notification Configs #########
atlas.notification.embedded=false
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
#localhost:9026
atlas.kafka.bootstrap.servers=10.115.80.165:9092
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.connection.timeout.ms=200
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas
atlas.kafka.enable.auto.commit=false
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.poll.timeout.ms=1000
atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000
# Enable for Kerberized Kafka clusters
#atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab
######### Hive Lineage Configs #########
## Schema
atlas.lineage.schema.query.hive_table=hive_table where __guid='%s'\, columns
atlas.lineage.schema.query.Table=Table where __guid='%s'\, columns
## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443
######### Security Properties #########
# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks
######### Server Properties #########
atlas.rest.address=http://10.115.80.165:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false
######### Entity Audit Configs #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181

I tried to hit the application URL (say, http://localhost:21000), but it throws an error:

wget http://10.115.80.165:21000 --no-proxy
--2017-11-24 17:54:05--  http://10.x.x.x:21000/
Connecting to 10.x.x.x:21000... connected.
HTTP request sent, awaiting response... No data received.
Retrying.
I verified that the port is in use with netstat -tunlp | grep 21000:

tcp 0 0 0.0.0.0:21000 0.0.0.0:* LISTEN 35415/java

I have no idea how to proceed.
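Since the port is listening but wget receives no data, it may help to check whether the Atlas web application responds at all and to watch the log while doing so. A minimal sketch, assuming the default admin REST path and admin/admin file-based credentials (both assumptions; adjust host, port, credentials, and log path for your installation):

# Probe the Atlas admin API directly, bypassing any proxy
# (endpoint path and admin/admin credentials are assumptions).
curl -v -u admin:admin --noproxy '*' http://10.115.80.165:21000/api/atlas/admin/version
# Watch the application log while the request is in flight
# (path is relative to the Atlas installation directory).
tail -f logs/application.log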
Labels: Apache Atlas
09-27-2017
11:09 PM
Hi All,

1. I am trying to launch a 5-node cluster (AWS) using a Cloudera Director bootstrap (Director IP: say 10.10.10.1). Cloudera Director is proxy enabled.
2. I have created a local repository and exposed it on another instance (say http://10.10.10.2/test_repo).
3. I set NO_PROXY in .bash_profile on the Cloudera Director host (10.10.10.1) to reach the repository, and I am able to access it from the shell.

Cloudera Director has a properties file called application.properties in /etc/cloudera-director-client/. I have updated the proxy settings in application.properties, but there is no property for NO_PROXY to exclude specific IPs. I have set the repository URL property in the config file, but while bootstrapping with the config file I get a "503 Service Unavailable" error.

Is there anything I can do to exclude the local repository (10.10.10.2) from the proxy, similar to NO_PROXY, in application.properties? Or is there any other solution I can configure?

Sankaranarayanan. S.
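One possibility, offered only as a sketch: the JVM-level property http.nonProxyHosts excludes hosts from an HTTP proxy for standard Java HTTP clients. Whether the Director client picks these flags up from JAVA_OPTS is an assumption to verify for your Director version, and proxy.example.com:3128 and cluster.conf below are placeholders for your own proxy and bootstrap configuration file:

# Hypothetical: pass standard JVM proxy settings, including an exclusion for the
# local repository host, to the Director client via JAVA_OPTS.
# Director honoring JAVA_OPTS / these properties is an assumption, not documented behavior.
export JAVA_OPTS="-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=3128 -Dhttp.nonProxyHosts=10.10.10.2|localhost"
cloudera-director bootstrap cluster.conf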
Labels: Cloudera Manager