Member since: 10-24-2015
Posts: 207
Kudos Received: 18
Solutions: 4
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 743 | 03-04-2018 08:18 PM |
| | 2046 | 09-19-2017 04:01 PM |
| | 490 | 01-28-2017 10:31 PM |
| | 196 | 12-08-2016 03:04 PM |
02-10-2020
09:58 AM
Hi, can someone explain dead executors in Spark and their behavior? Can they slow down a job? How can dead executors be prevented?
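For context, these are the kinds of settings I have been looking at; the values and the job name are placeholder assumptions, not recommendations - my understanding is that executors usually "die" from off-heap memory overruns (YARN kills the container) or missed heartbeats:

# Hedged spark-submit sketch: headroom and timeout knobs tied to dead executors
spark-submit \
  --conf spark.executor.memory=4g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  --conf spark.network.timeout=600s \
  my_job.py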
Tags: Spark
12-03-2019
12:51 PM
@hadoop_ammiredd We are also getting the same error on HDP 2.6.3. Were you able to resolve this issue?
10-17-2019
06:41 PM
Hi All,
I need help finding the YARN queue usage of all queues, from either the RM REST API or the AMS API.
Basically, I need to run a script that pulls this data, checks whether any YARN queue is going beyond a threshold value, and sends out an alert.
I cannot use Ambari alerts here; it needs to be in script form. Something like the sketch below is what I have in mind.
Appreciate the help.
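A minimal sketch, assuming the Capacity Scheduler and that jq and bc are available; the RM host, threshold, and alert address are placeholders:

#!/usr/bin/env bash
# Pull per-queue usage from the RM scheduler REST API and alert on breaches.
RM="http://rm-host:8088"
THRESHOLD=90
curl -s "$RM/ws/v1/cluster/scheduler" |
  jq -r '.scheduler.schedulerInfo.queues.queue[] | "\(.queueName) \(.absoluteUsedCapacity)"' |
  while read -r queue used; do
    # absoluteUsedCapacity is a percentage of the whole cluster
    if (( $(echo "$used > $THRESHOLD" | bc -l) )); then
      echo "ALERT: queue $queue at ${used}% (threshold ${THRESHOLD}%)" |
        mail -s "YARN queue alert" ops@example.com
    fi
  done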
05-28-2019
08:57 AM
@Vinay Thanks for the reply. What I want to know is how to restrict a Hive database so that it cannot have more than one policy. Thanks for your time.
05-28-2019
03:35 AM
Hi, I have a very simple question. Can a Hive database have multiple Ranger policies? And is there any way to restrict a database to a single policy?
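For reference, a hedged sketch of how one could count the policies that already exist per database, through Ranger's public REST API; the admin host, credentials, and Hive service name are placeholders:

# Count policies per database resource in one Hive service
curl -s -u admin:password \
  "http://ranger-host:6080/service/public/v2/api/service/my_hive_service/policy" |
  jq -r '.[] | .resources.database.values[]?' | sort | uniq -c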
03-17-2019
01:59 PM
@Geoffrey Shelton Okot The cluster has HA enabled, with an active and a standby NameNode. The tool we use is Atscale; its configuration asks for a primary NN and a secondary NN and doesn't have an option for the nameservice name. HDP version 2.6.3.
03-17-2019
12:33 AM
Hi, we have BI tools whose configuration file has the primary NameNode and secondary NameNode hardcoded. We are seeing a lot of "read operation in standby not allowed" errors. I guess this occurs when a client first checks which NameNode is active and can throw an error if it tries to connect to the standby, but we are seeing this error more often than usual. Jobs are running fine, though. I would like to know whether this is a cluster-side issue, with the NameNode failing over too many times, or a BI tool configuration issue, because we are hardcoding the NN and SNN instead of the nameservice name. A quick check I plan to run is below. Thanks.
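Assuming the standard HDFS HA setup; the service IDs nn1/nn2 are placeholders (list the real ones with hdfs getconf -confKey dfs.ha.namenodes.<nameservice>):

# Which NameNode currently holds which role?
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# Clients that use hdfs://<nameservice> with the ConfiguredFailoverProxyProvider
# resolve the active NameNode themselves, so no hosts need to be hardcoded.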
12-31-2018
03:13 AM
@Jay Kumar SenSharma It works - great! There was a typo in your command; this works for me:
http://$AMS_COLLECTOR_HOSTNAME:6188/ws/v1/timeline/metrics?metricNames=yarn.QueueMetrics.Queue=root.default.AvailableMB._max&appId=resourcemanager&startTime=1545613074&endTime=1546217874 Thanks again.
12-31-2018
02:19 AM
@Jay Kumar SenSharma Thank you so much for the explanation and commands, but I am getting a 404 Not Found error when I try to run the command using curl. I see there are some special characters in it and was wondering whether they have anything to do with it. For example, the path below has a ☆ before tTime, and after max there is a curly = sign (≈): http://$AMS_COLLECTOR_HOSTNAME:6188/ws/v1/timeline/metrics?metricNames=yarn.QueueMetrics.Queue=root.default.AvailableMB._max≈pId=resourcemanager☆tTime=1545613074&endTime=1546217874
12-30-2018
08:53 PM
Hi, I am trying to find the Ambari Metrics REST API path that gives the memory usage for a particular queue, Q1, for the last week. How can I get that?
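Roughly what I am after; the metric name AllocatedMB is my assumption (patterned on the QueueMetrics examples elsewhere on this page), the collector host and the queue path root.Q1 are placeholders, and times are epoch seconds:

# Last 7 days of a queue's memory metrics from the AMS collector
START=$(date -d '7 days ago' +%s)
END=$(date +%s)
curl -s "http://$AMS_COLLECTOR_HOSTNAME:6188/ws/v1/timeline/metrics?metricNames=yarn.QueueMetrics.Queue=root.Q1.AllocatedMB._max&appId=resourcemanager&startTime=$START&endTime=$END"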
12-20-2018
02:30 PM
Hi All, I have access to Grafana and other tools, but we use a BI tool that runs on a separate YARN queue. I want to write a script, or otherwise get data into a CSV, containing the queue utilization (only that one specific queue) and other metrics related to it, as a separate report - either from Grafana or by any other means. Can somebody suggest a solution? I would like to update the metrics every 5 minutes, roughly as sketched below. Thanks.
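Roughly the shape I have in mind, assuming the Capacity Scheduler REST API and jq; the RM host, the queue name bi_queue, and the output path are placeholders:

#!/usr/bin/env bash
# /usr/local/bin/queue_report.sh - append one CSV row: timestamp,usedCapacity
USED=$(curl -s "http://rm-host:8088/ws/v1/cluster/scheduler" |
  jq -r '.scheduler.schedulerInfo.queues.queue[] | select(.queueName == "bi_queue") | .absoluteUsedCapacity')
echo "$(date +%FT%T),$USED" >> /var/reports/bi_queue_usage.csv

# crontab entry to refresh every 5 minutes:
# */5 * * * * /usr/local/bin/queue_report.sh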
11-12-2018
07:19 PM
Hi, we use an application that runs Hive and Spark jobs. From our application (an edge node of the Hadoop cluster) we want a script that checks whether the connections to Hive and Spark are working. Hive uses Beeline with logins, and we cannot put a login in the script. Is there any other way to monitor the Hive and Spark JDBC connections - using curl, perhaps? A couple of rough ideas I have are below. Please suggest.
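Two hedged ideas, with every host, port, and principal below being a placeholder; the second assumes a Kerberized cluster, so the script authenticates from a keytab instead of carrying a login:

# 1) Cheap liveness probe: is HiveServer2 listening at all?
nc -z hs2-host 10000 && echo "HS2 port open"

# 2) Full JDBC round trip without a password in the script
kinit -kt /etc/security/keytabs/monitor.keytab monitor@EXAMPLE.COM
beeline -u "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@EXAMPLE.COM" \
  -e "select 1;" > /dev/null 2>&1 && echo "Hive JDBC OK" || echo "Hive JDBC FAILED"
# The Spark Thrift Server should accept the same beeline probe on its own port.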
10-30-2018
12:37 AM
@nyadav Hi, we are having the same issue. Even though the GPL compression libraries are installed by default, I am still getting this error. Please let me know what you did to resolve it.
10-15-2018
10:06 PM
I have a user, user1, in group group1. I use a third-party tool that writes Spark event logs to a directory in HDFS; currently we use the user's home directory to hold the logs temporarily. I created the directory like this:
hdfs dfs -mkdir /user/user1/sparkeventlogs
It was created under user1:group2, so I changed ownership to the right group:
hdfs dfs -chown -R user1:group1 /user/user1/sparkeventlogs
I also added an ACL to the directory using setfacl, and when I run getfacl it shows the correct user and group assigned, both with rwx permissions.
Now, when the job is run under that AD group, it fails with permission denied, reporting
user1:group2 drwx------
when the directory is actually
user1:group1 drwxrwx---
We have Ranger enabled, but I don't have access to it. One thing I have not ruled out is shown below. Thanks for your help.
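I am fairly sure that an ACL set with setfacl covers the directory itself, while files created later only pick up the directory's default ACL - a sketch using the names from my post:

# Add a default ACL so files the tool creates later inherit group1's access
hdfs dfs -setfacl -m default:group:group1:rwx /user/user1/sparkeventlogs
# Verify both the access ACL and the default ACL
hdfs dfs -getfacl /user/user1/sparkeventlogs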
09-16-2018
10:04 PM
Hi All, I am trying to understand how a tool like Atscale (a data-wrangling tool) connects to the Hive metastore. There is odd behavior in the Atscale log files: when you wrangle data through Atscale, it connects to and disconnects from the Hive metastore every few seconds. Is this normal? When a user submits a query, when is the connection to the Hive metastore made, and when does it end? And how often can this happen? In the case of Atscale there could be aggregates involved; I am not sure how that works. Thanks in advance.
05-21-2018
06:42 PM
I added the hive user to all queues, and now I see Hive running applications in each queue - I am not sure why. I am also seeing this error:
2018-05-21 14:40:26,547 INFO [HiveServer2-Handler-Pool: Thread-306]: thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(313)) - Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
2018-05-21 14:40:26,551 WARN [HiveServer2-Handler-Pool: Thread-306]: impl.MetricsSystemImpl (MetricsSystemImpl.java:init(152)) - hiveserver2 metrics system already initialized!
2018-05-21 14:40:26,551 ERROR [HiveServer2-Handler-Pool: Thread-306]: metastore.HiveMetaStore (HiveMetaStore.java:init(518)) - error in Metrics init: java.lang.reflect.InvocationTargetException null
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.common.metrics.common.MetricsFactory.init(MetricsFactory.java:42)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:515)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:77)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:83)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6022)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:203)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1549)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:89)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:135)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:107)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3252)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3271)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:524)
at org.apache.hive.service.cli.session.HiveSessionImpl.open(HiveSessionImpl.java:144)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy46.open(Unknown Source)
at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:281)
at org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:204)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:421)
at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:316)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1257)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1242)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.metrics2.MetricsException: Metrics source hiveserver2 already exists!
at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:144)
at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:117)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
at com.github.joshelser.dropwizard.metrics.hadoop.HadoopMetrics2Reporter.<init>(HadoopMetrics2Reporter.java:206)
at com.github.joshelser.dropwizard.metrics.hadoop.HadoopMetrics2Reporter.<init>(HadoopMetrics2Reporter.java:62)
at com.github.joshelser.dropwizard.metrics.hadoop.HadoopMetrics2Reporter$Builder.build(HadoopMetrics2Reporter.java:162)
at org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.initReporting(CodahaleMetrics.java:377)
at org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.<init>(CodahaleMetrics.java:199)
05-21-2018
06:19 PM
Hi all, as soon as I enabled the Hive ACID properties and restarted the Hive services, I started getting these logs in HiveServer2 and the ResourceManager. Although the ResourceManager is configured on port 8050, it is looking for 8032 and retrying. Hive is also unable to submit an application to a queue.
2018-05-21 14:17:01,965 INFO [main]: retry.RetryInvocationHandler (RetryInvocationHandler.java:log(267)) - Exception while invoking ApplicationClientProtocolPBClientImpl.submitApplication over rm1 after 24 failover attempts. Trying to failover immediately.
org.apache.hadoop.security.AccessControlException: User hive does not have permission to submit application_1526926153151_0001 to queue AdHoc
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:380)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:292)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:585)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:239)
at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
at com.sun.proxy.$Proxy42.submitApplication(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:259)
at org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72)
at org.apache.tez.client.TezClient.start(TezClient.java:437)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:117)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:76)
at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:488)
at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:87)
at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:720)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:593)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): User hive does not have permission to submit application_1526926153151_0001 to queue AdHoc
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:380)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:292)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:585)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1552)
at org.apache.hadoop.ipc.Client.call(Client.java:1496)
at org.apache.hadoop.ipc.Client.call(Client.java:1396)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy41.submitApplication(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:236)
... 23 more
2018-05-21 14:18:13,106 INFO [main]: server.HiveServer2 (HiveServer2.java:stop(405)) - Shutting down HiveServer2
2018-05-21 14:18:13,106 INFO [main]: thrift.ThriftCLIService (ThriftCLIService.java:stop(218)) - Thrift server has stopped
2018-05-21 14:18:13,106 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:ThriftBinaryCLIService is stopped.
2018-05-21 14:18:13,107 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
2018-05-21 14:18:13,107 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
2018-05-21 14:18:23,108 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:CLIService is stopped.
2018-05-21 14:18:23,108 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(824)) - 0: Shutting down the object store...
2018-05-21 14:18:23,108 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(393)) - ugi=hive ip=unknown-ip-addr cmd=Shutting down the object store...
2018-05-21 14:18:23,109 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(824)) - 0: Metastore shutdown complete.
2018-05-21 14:18:23,109 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(393)) - ugi=hive ip=unknown-ip-addr cmd=Metastore shutdown complete.
2018-05-21 14:18:23,109 INFO [main]: service.AbstractService (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
2018-05-21 14:18:23,117 WARN [main-EventThread]: server.HiveServer2 (HiveServer2.java:process(335)) - This HiveServer2 instance is now de-registered from ZooKeeper. The server will be shut down after the last client sesssion completes.
2018-05-21 14:18:23,117 WARN [main-EventThread]: server.HiveServer2 (HiveServer2.java:process(343)) - This instance of HiveServer2 has been removed from the list of server instances available for dynamic service discovery. The last client session has ended - will shutdown now.
2018-05-21 14:18:23,120 INFO [main]: zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x2620074a7ab1b48 closed
2018-05-21 14:18:23,120 INFO [main]: server.HiveServer2 (HiveServer2.java:removeServerInstanceFromZooKeeper(361)) - Server instance removed from ZooKeeper.
2018-05-21 14:18:23,120 INFO [main-EventThread]: server.HiveServer2 (HiveServer2.java:stop(405)) - Shutting down HiveServer2
2018-05-21 14:18:23,120 WARN [main]: server.HiveServer2 (HiveServer2.java:startHiveServer2(508)) - Error starting HiveServer2 on attempt 1, will retry in 60 seconds
org.apache.hadoop.security.AccessControlException: User hive does not have permission to submit application_1526926153151_0001 to queue AdHoc
05-14-2018
09:04 PM
@Alpesh Virani you can set it up using Ranger policies: https://hortonworks.com/hadoop-tutorial/manage-security-policy-hive-hbase-knox-ranger/
05-10-2018
02:59 AM
@Deepesh Thanks for the information. How do you use it or run queries - just like Hive? How does it work behind the scenes? It would be helpful if you could provide details. Thanks.
05-09-2018
03:16 AM
Hi, I just enabled Hive interactive querying (LLAP) on HDP 2.5.3, and I notice the Hive CLI is automatically disabled; I am no longer able to log in with the Hive CLI. I can only do a JDBC login through Beeline, but I am not sure how to run queries on Hive LLAP. How do you run a query, or write scripts to run jobs, with Hive LLAP? Please help.
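From what I can tell so far, scripts run the same way as against classic HiveServer2, just pointed at the HiveServer2 Interactive JDBC URL. A sketch - the host is a placeholder, 10500 is what I believe the HDP default interactive Thrift port to be, and the exact URL is shown in Ambari under Hive > Summary:

# Ad-hoc query against LLAP
beeline -u "jdbc:hive2://llap-host:10500/default" -e "select count(*) from my_table;"
# Script file, the way a scheduled job would run it
beeline -u "jdbc:hive2://llap-host:10500/default" -f my_job.hql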
04-16-2018
09:56 PM
@Rahul Soni I am using the exact same user for both clusters; this user has all permissions.
04-16-2018
06:44 PM
@Rahul Soni Thanks for the reply, but I get this error:
hive> insert overwrite directory "hdfs://nn:8020/test/" select * from table;
FAILED: SemanticException Error creating temporary folder on: hdfs://nn:8020/test
04-16-2018
04:04 PM
Hi, we have two different Hadoop clusters (Cloudera and HDP), and I have to run an import query on Impala twice a day to bring that data into our HDP cluster. What is the best way to do this? As far as I can see, Sqoop only imports from relational databases. Maybe distcp, but what if I want to run a query with WHERE conditions? Is there any other way? And if I decide to pull all the data from a partition with distcp anyway, what ports should be open on both clusters? A rough idea of what I am considering is below. Thanks in advance.
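The two-step approach I am considering, as a hedged sketch: materialize the filtered result on the source cluster with Impala, then copy it across with distcp. All hostnames, databases, and paths are placeholders, and distcp would need at least the NameNode RPC port (8020 by default) and the DataNode transfer ports reachable between the clusters:

# Step 1: run the WHERE-filtered query on the Cloudera side
impala-shell -q "INSERT OVERWRITE TABLE export_db.t_filtered SELECT * FROM db.t WHERE dt = '2018-04-16';"
# Step 2: copy the resulting directory to the HDP cluster
hadoop distcp \
  hdfs://cdh-nn:8020/user/hive/warehouse/export_db.db/t_filtered \
  hdfs://hdp-nn:8020/landing/t_filtered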
04-09-2018
09:49 PM
Hi all, I am currently using Sqoop in --direct mode to copy Netezza tables into Hive. If I run 4 parallel Sqoop jobs, how does that affect the HDP cluster, and how does it affect the Netezza database? Also, is there a better and faster way to import data from Netezza? My jobs look roughly like the sketch below.
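For reference, one of my imports looks roughly like this (connection details are placeholders); each mapper opens its own database connection, so 4 parallel jobs with 4 mappers each would mean around 16 concurrent Netezza sessions:

sqoop import \
  --connect jdbc:netezza://nz-host:5480/proddb \
  --username loader -P \
  --table SALES \
  --direct --num-mappers 4 \
  --hive-import --hive-table stage.sales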
04-04-2018
04:11 PM
@dthakkar @Sindhu Yes, I did set -Dmapreduce.job.queuename=<queue_name>, but if you look at the YARN jobs list, two applications run: the first uses the queue named in that property, and the second uses the default queue. I have no idea why it launches two separate jobs. I resolved this by configuring queue mappings and increasing the AM resource percent. Thanks.
04-01-2018
03:13 PM
@Goutham Koneru @Artem Ervit @Ram Venkatesh Hi, I am trying to install Python 3 on my HDP 2.5.3 cluster too. How does this affect components other than Spark? Is it recommended in production? Can I use Anaconda instead?
04-01-2018
02:26 PM
@Aishwarya Sudhakar nn is the NameNode; you will find it in core-site.xml under the fs.defaultFS property. But I think your issue, as I mentioned earlier, is that you saved your file without a leading '/' on the "demo" directory, so it went into your user home directory. Look at the output of:
hdfs dfs -ls demo/dataset.csv
and it will show you which user home it is in. Either use that path, or move the file to the root like this:
hdfs dfs -mv demo/dataset.csv /demo/dataset.csv
Make sure your /demo directory exists; if not, create it:
hdfs dfs -mkdir /demo
Hope this helps.
03-30-2018
09:28 PM
@Andrew Watson @Dave Russell Have you tried installing Python 3 on HDP? What did you use to install it - virtualenv? A conda env? Which works better? Do you have instructions? And is this advisable in a production environment?
03-30-2018
03:11 PM
@Aishwarya Sudhakar
Use the whole absolute path and try:
sc.textFile("hdfs://nn:8020/demo/dataset.csv")
You can find the absolute path in core-site.xml; look for fs.defaultFS. Also make sure your file is at the root path, because you mentioned "demo/dataset.csv" and not "/demo/dataset.csv"; if it is not, it should be in your user home directory, like "/user/yourusername/demo/dataset.csv".
03-29-2018
01:54 PM
Hi, can somebody explain to me how conda packages and environments work? I am using HDP 2.5.3 with Python 2.6. I am planning to use NLTK, which seems to require Python 3 or above; the sketch below is roughly what I am considering. Is there any suggestion other than a conda env for a production environment? Thanks.
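What I understand so far, as a sketch; the env name and paths are placeholders, and this assumes Miniconda is installed under $HOME so the system Python that HDP components use is left untouched:

# Create an isolated Python 3 environment with nltk
conda create -n nlp python=3.6 nltk
source activate nlp
# Point PySpark at the env's interpreter, for this user only
export PYSPARK_PYTHON=$HOME/miniconda3/envs/nlp/bin/python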