About Int17

Int17 · ‎11-08-2016

Hi Sagar, thank you for your hints, but I can't test them because my cluster has been destroyed. 🙂 Klaus

Int17 · ‎10-28-2016

Hello; the client Installation process failed with this error on all nodes: stderr: /var/lib/ambari-agent/data/errors-3460.txt Traceback (most recent call last): File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_client.py", line 75, in <module> OozieClient().execute() File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute method(env) File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_client.py", line 36, in install self.install_packages(env) File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 404, in install_packages Package(name) File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__ self.env.run() File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run self.run_action(resource, action) File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action provider_action() File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 49, in action_install self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos) File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/apt.py", line 53, in wrapper return function_to_decorate(self, name, *args[2:]) File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/apt.py", line 97, in install_package self.checked_call_until_not_locked(cmd, sudo=True, env=INSTALL_CMD_ENV, logoutput=self.get_logoutput()) File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 72, in checked_call_until_not_locked return self.wait_until_not_locked(cmd, is_checked=True, **kwargs) File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 80, in 
wait_until_not_locked code, out = func(cmd, **kwargs) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner result = function(command, **kwargs) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call tries=tries, try_sleep=try_sleep) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper result = _call(command, **kwargs_copy) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call raise Fail(err_msg) resource_management.core.exceptions.Fail: Execution of '/usr/bin/apt-get -q -o Dpkg::Options::=--force-confdef --allow-unauthenticated --assume-yes install 'oozie-2-3-.*'' returned 100. Reading package lists... Building dependency tree... Reading state information... E: Unable to locate package oozie-2-3-.* E: Couldn't find any package by regex 'oozie-2-3-.*' stdout: /var/lib/ambari-agent/data/output-3460.txt 2016-10-28 08:03:58,249 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.4.0.0-169 2016-10-28 08:03:58,250 - Checking if need to create versioned conf dir /etc/hadoop/2.4.0.0-169/0 2016-10-28 08:03:58,250 - call['conf-select create-conf-dir --package hadoop --stack-version 2.4.0.0-169 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1} 2016-10-28 08:03:58,271 - call returned (1, '/etc/hadoop/2.4.0.0-169/0 exist already', '') 2016-10-28 08:03:58,271 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.4.0.0-169 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False} 2016-10-28 08:03:58,295 - checked_call returned (0, '/usr/hdp/2.4.0.0-169/hadoop/conf -> /etc/hadoop/2.4.0.0-169/0') 2016-10-28 08:03:58,295 - Ensuring that hadoop has the correct symlink structure 2016-10-28 08:03:58,295 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf 2016-10-28 08:03:58,296 - 
Group['spark'] {} 2016-10-28 08:03:58,297 - Group['hadoop'] {} 2016-10-28 08:03:58,297 - Group['users'] {} 2016-10-28 08:03:58,297 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,298 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,298 - User['oozie'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']} 2016-10-28 08:03:58,299 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,299 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']} 2016-10-28 08:03:58,300 - User['accumulo'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,300 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,301 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']} 2016-10-28 08:03:58,301 - User['flume'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,302 - User['kafka'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,303 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,303 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,304 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,304 - User['hbase'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,305 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']} 2016-10-28 08:03:58,305 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555} 2016-10-28 08:03:58,306 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh 
ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'} 2016-10-28 08:03:58,316 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if 2016-10-28 08:03:58,317 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'recursive': True, 'mode': 0775, 'cd_access': 'a'} 2016-10-28 08:03:58,317 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555} 2016-10-28 08:03:58,318 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'} 2016-10-28 08:03:58,322 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if 2016-10-28 08:03:58,322 - Group['hdfs'] {} 2016-10-28 08:03:58,322 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': [u'hadoop', u'hdfs']} 2016-10-28 08:03:58,323 - Directory['/etc/hadoop'] {'mode': 0755} 2016-10-28 08:03:58,334 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'} 2016-10-28 08:03:58,335 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0777} 2016-10-28 08:03:58,350 - Repository['HDP-2.4'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP/ubuntu14/2.x/updates/2.4.0.0', 'action': ['create'], 'components': [u'HDP', 'main'], 'repo_template': '{{package_type}} {{base_url}} {{components}}', 'repo_file_name': 'HDP', 'mirror_list': None} 2016-10-28 08:03:58,354 - File['/tmp/tmpgzSjgM'] {'content': 'deb http://public-repo-1.hortonworks.com/HDP/ubuntu14/2.x/updates/2.4.0.0 HDP main'} 2016-10-28 08:03:58,355 - 
Writing File['/tmp/tmpgzSjgM'] because contents don't match 2016-10-28 08:03:58,355 - File['/tmp/tmpKdatW3'] {'content': StaticFile('/etc/apt/sources.list.d/HDP.list')} 2016-10-28 08:03:58,356 - Writing File['/tmp/tmpKdatW3'] because contents don't match 2016-10-28 08:03:58,358 - Repository['HDP-UTILS-1.1.0.20'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/ubuntu12', 'action': ['create'], 'components': [u'HDP-UTILS', 'main'], 'repo_template': '{{package_type}} {{base_url}} {{components}}', 'repo_file_name': 'HDP-UTILS', 'mirror_list': None} 2016-10-28 08:03:58,360 - File['/tmp/tmpgWh4hC'] {'content': 'deb http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/ubuntu12 HDP-UTILS main'} 2016-10-28 08:03:58,360 - Writing File['/tmp/tmpgWh4hC'] because contents don't match 2016-10-28 08:03:58,360 - File['/tmp/tmpy240ZL'] {'content': StaticFile('/etc/apt/sources.list.d/HDP-UTILS.list')} 2016-10-28 08:03:58,360 - Writing File['/tmp/tmpy240ZL'] because contents don't match 2016-10-28 08:03:58,362 - Package['unzip'] {} 2016-10-28 08:03:58,382 - Skipping installation of existing package unzip 2016-10-28 08:03:58,383 - Package['curl'] {} 2016-10-28 08:03:58,402 - Skipping installation of existing package curl 2016-10-28 08:03:58,402 - Package['hdp-select'] {} 2016-10-28 08:03:58,423 - Skipping installation of existing package hdp-select 2016-10-28 08:03:58,695 - Package['zip'] {} 2016-10-28 08:03:58,719 - Skipping installation of existing package zip 2016-10-28 08:03:58,720 - Package['mysql-connector-java'] {} 2016-10-28 08:03:58,739 - Skipping installation of existing package mysql-connector-java 2016-10-28 08:03:58,739 - Package['extjs'] {} 2016-10-28 08:03:58,759 - Skipping installation of existing package extjs 2016-10-28 08:03:58,759 - Package['oozie-2-3-.*'] {} 2016-10-28 08:03:58,779 - Installing package oozie-2-3-.* ('/usr/bin/apt-get -q -o Dpkg::Options::=--force-confdef --allow-unauthenticated --assume-yes install 
'oozie-2-3-.*'')
2016-10-28 08:03:59,244 - Execution of '['/usr/bin/apt-get', '-q', '-o', 'Dpkg::Options::=--force-confdef', '--allow-unauthenticated', '--assume-yes', 'install', u'oozie-2-3-.*']' returned 100. Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package oozie-2-3-.*
E: Couldn't find any package by regex 'oozie-2-3-.*'
2016-10-28 08:03:59,245 - Failed to install package oozie-2-3-.*. Executing `/usr/bin/apt-get update -qq`
2016-10-28 08:04:32,864 - Retrying to install package oozie-2-3-.*

When I do this manually, version 2.4.0.0-169 is available:

# apt-get install oozie\*
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'oozie-2-4-0-0-169' for regex 'oozie*'
Note, selecting 'oozie-server' for regex 'oozie*'
Note, selecting 'oozie-2-4-0-0-169-server' for regex 'oozie*'
Note, selecting 'oozie-client' for regex 'oozie*'
Note, selecting 'oozie' for regex 'oozie*'
Note, selecting 'oozie-2-4-0-0-169-client' for regex 'oozie*'
The following extra packages will be installed:
  bigtop-tomcat
The following NEW packages will be installed:
  bigtop-tomcat oozie oozie-2-4-0-0-169 oozie-2-4-0-0-169-client oozie-2-4-0-0-169-server oozie-client oozie-server
0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.
Need to get 672 MB of archives.
After this operation, 789 MB of additional disk space will be used.
Do you want to continue? [Y/n] n
Abort.

How can I tell Ambari to use the current version (2.4.0.0-169) instead of 2.3? 🙂 Klaus
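The mismatch between the pattern Ambari requests and the packages the repository offers can be reproduced offline. The following is only a sketch: the package names are copied from the manual apt-get output above, Python's `re.search` stands in for apt's own regex matching (an assumption), and `matching_packages` is a hypothetical helper, not part of Ambari.

```python
import re

# Package names copied from the manual `apt-get install oozie\*` output above.
AVAILABLE = [
    "oozie",
    "oozie-client",
    "oozie-server",
    "oozie-2-4-0-0-169",
    "oozie-2-4-0-0-169-client",
    "oozie-2-4-0-0-169-server",
]


def matching_packages(pattern, names):
    """Return the names in which the regex matches somewhere.

    Hypothetical helper: re.search approximates how apt treats a
    package argument that contains regex characters.
    """
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


# Pattern taken from the failing Ambari command (stack 2.3).
requested = matching_packages("oozie-2-3-.*", AVAILABLE)
# The same pattern with the 2.4 prefix that the repository actually serves.
stack_24 = matching_packages("oozie-2-4-.*", AVAILABLE)

print("2-3 pattern matches:", requested)
print("2-4 pattern matches:", stack_24)
```

Under these assumptions the 2-3 pattern selects nothing, which is consistent with the "Couldn't find any package by regex" error, while the 2-4 pattern selects the three versioned packages.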

Int17 · ‎10-27-2016

Hi Josh, I found this in the Tracer log file:

2016-10-27 07:23:48,988 [start.Main] ERROR: Thread 'tracer' died.
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /tracers/trace-
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
	at org.apache.accumulo.fate.zookeeper.ZooUtil.putEphemeralSequential(ZooUtil.java:463)
	at org.apache.accumulo.fate.zookeeper.ZooReaderWriter.putEphemeralSequential(ZooReaderWriter.java:99)
	at org.apache.accumulo.tracer.TraceServer.registerInZooKeeper(TraceServer.java:297)
	at org.apache.accumulo.tracer.TraceServer.<init>(TraceServer.java:235)
	at org.apache.accumulo.tracer.TraceServer.main(TraceServer.java:339)
	at org.apache.accumulo.tracer.TracerExecutable.execute(TracerExecutable.java:33)
	at org.apache.accumulo.start.Main$1.run(Main.java:93)
	at java.lang.Thread.run(Thread.java:745)

After deleting the Tracer ZooKeeper directory (rmr /tracers), the Tracer process started without problems. Many thanks for your support. 🙂 Klaus

Int17 · ‎10-26-2016

Hello, I have a fresh installation of Accumulo, and my problem is that the Tracer process terminates with:

2016-10-26 14:56:50,314 [start.Main] ERROR: Thread 'tracer' died.
org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForFateOperation
	at org.apache.accumulo.core.client.impl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:303)
	at org.apache.accumulo.core.client.impl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:261)
	at org.apache.accumulo.core.client.impl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1427)
	at org.apache.accumulo.core.client.impl.TableOperationsImpl.create(TableOperationsImpl.java:188)
	at org.apache.accumulo.core.client.impl.TableOperationsImpl.create(TableOperationsImpl.java:155)
	at org.apache.accumulo.tracer.TraceServer.<init>(TraceServer.java:211)
	at org.apache.accumulo.tracer.TraceServer.main(TraceServer.java:339)
	at org.apache.accumulo.tracer.TracerExecutable.execute(TracerExecutable.java:33)
	at org.apache.accumulo.start.Main$1.run(Main.java:93)

I have no idea why. Could someone help, please? 🙂 Klaus

Int17 · ‎08-25-2016

Hi Robert, I have no logs from the TaskManagers in the log dir. I played a bit with the heap.mb size of the TaskManagers, and after I entered 4096 for it, the TaskManagers started. Thanks for your interest in helping. 🙂 Klaus
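For anyone checking the same setting, here is a rough sketch of reading the heap value out of a flat `key: value` configuration file. It assumes that "heap.mb" refers to the Flink 1.x key `taskmanager.heap.mb` in flink-conf.yaml; the sample text and the helper `parse_flat_conf` are hypothetical, with only the JobManager host, port, and the value 4096 taken from this thread.

```python
def parse_flat_conf(text):
    """Parse flat 'key: value' lines, ignoring blank lines and # comments.

    Hypothetical helper; it does not handle nested YAML.
    """
    conf = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


# Hypothetical excerpt of a flink-conf.yaml; host and port come from the
# job submission output in this thread, 4096 from the post above.
SAMPLE = """
# JobManager
jobmanager.rpc.address: dedcm4229
jobmanager.rpc.port: 6130

# TaskManager heap in MB (assumed key name)
taskmanager.heap.mb: 4096
"""

conf = parse_flat_conf(SAMPLE)
heap_mb = int(conf["taskmanager.heap.mb"])
print("TaskManager heap (MB):", heap_mb)
```

Printing the parsed value before restarting the cluster is a cheap way to confirm that the file the TaskManagers read contains the value that was intended.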

Int17 · ‎08-24-2016

Hello, I have a similar issue to the one discussed here. These are the settings:

I see no TaskManagers. The overview shows:

0 Task Managers
0 Task Slots
0 Available Task Slots

Running the example word count job, I receive:

/usr/apache/flink-1.1.1/bin# /usr/apache/flink-1.1.1/bin/flink run /usr/apache/flink-1.1.1/examples/streaming/WordCount.jar
Cluster configuration: Standalone cluster with JobManager at dedcm4229/10.79.210.78:6130
Using address dedcm4229:6130 to connect to JobManager.
JobManager web interface address http://dedcm4229:8081
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Submitting job with JobID: 47fee79c80eba58333eec5c3c3ee1cf0. Waiting for job completion.
08/24/2016 16:32:07 Job execution switched to status RUNNING.
08/24/2016 16:32:07 Source: Collection Source -> Flat Map(1/1) switched to SCHEDULED
08/24/2016 16:32:07 Job execution switched to status FAILING.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not enough free slots available to run the job. You can decrease the operator parallelism or increase the number of slots per TaskManager in the configuration. Task to schedule: < Attempt #0 (Source: Collection Source -> Flat Map (1/1)) @ (unassigned) - [SCHEDULED] > with groupID < 963af48f2c5d35ff2fcaa1bc235543a7 > in sharing group < SlotSharingGroup [7168183d09cf33bacf5ac595e608bd87, 963af48f2c5d35ff2fcaa1bc235543a7] >.
Resources available to scheduler: Number of instances=0, total number of slots=0, available slots=0 at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256) at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131) at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:306) at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:454) at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:326) at org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:741) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1332) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291) at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 08/24/2016 16:32:07 Source: Collection Source -> Flat Map(1/1) switched to CANCELED 08/24/2016 16:32:07 Keyed 
Aggregation -> Sink: Unnamed(1/1) switched to CANCELED 08/24/2016 16:32:07 Job execution switched to status FAILED. ------------------------------------------------------------ The program finished with the following exception: org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed. at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:413) at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:92) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:389) at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:68) at org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount.java:93) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:509) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:331) at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:777) at org.apache.flink.client.CliFrontend.run(CliFrontend.java:253) at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1005) at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1048) Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed. 
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$8.apply$mcV$sp(JobManager.scala:822) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$8.apply(JobManager.scala:768) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$8.apply(JobManager.scala:768) at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not enough free slots available to run the job. You can decrease the operator parallelism or increase the number of slots per TaskManager in the configuration. Task to schedule: < Attempt #0 (Source: Collection Source -> Flat Map (1/1)) @ (unassigned) - [SCHEDULED] > with groupID < 963af48f2c5d35ff2fcaa1bc235543a7 > in sharing group < SlotSharingGroup [7168183d09cf33bacf5ac595e608bd87, 963af48f2c5d35ff2fcaa1bc235543a7] >. 
Resources available to scheduler: Number of instances=0, total number of slots=0, available slots=0 at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256) at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131) at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:306) at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:454) at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:326) at org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:741) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1332) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291) ... 9 more Could someone have a look into this log above and give advice to fix this issue please? 🙂 Klaus
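The exception text reduces to simple arithmetic: the scheduler reports 0 instances and 0 slots, so even a job with parallelism 1 cannot be placed. A minimal sketch of that check follows; `can_schedule` and the slots-per-TaskManager values are hypothetical and only illustrate the numbers in the message, not Flink's actual scheduler logic.

```python
def can_schedule(instances, slots_per_instance, parallelism):
    """Return (fits, total_slots) for a job of the given parallelism.

    Hypothetical model: total slots = registered TaskManagers times
    slots per TaskManager; the job fits if that covers its parallelism.
    """
    total_slots = instances * slots_per_instance
    return total_slots >= parallelism, total_slots


# Values from the log: "Number of instances=0", tasks shown as (1/1).
# The slots-per-instance value of 4 is an arbitrary illustration.
fits, total = can_schedule(instances=0, slots_per_instance=4, parallelism=1)
print("no TaskManagers registered:", fits, total)

# With a single registered TaskManager offering one slot the job would fit.
fits_one, total_one = can_schedule(instances=1, slots_per_instance=1, parallelism=1)
print("one TaskManager, one slot:", fits_one, total_one)
```

With zero registered instances, raising the slots per TaskManager does not help; the TaskManagers have to start and register with the JobManager first.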

Int17 · ‎08-24-2016

On my system this works:

ACCUMULO_CONF_DIR=/etc/accumulo/conf/server accumulo init

After the init, I found no further issues. Many thanks for your detailed help 🙂 Klaus

Int17 · ‎08-23-2016

Additionally, I ran:

tables -l
accumulo.metadata => !0
accumulo.replication => +rep
accumulo.root => +r
trace => 1

checkTablets: the scan gets stuck.

/usr/bin/accumulo admin checkTablets
2016-08-23 12:19:18,521 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss
*** Looking for offline tablets ***
Scanning zookeeper
+r<<@(null,de-hd-cluster.data-node3.com:9997[25669407cc8000b],de-hd-cluster.data-node3.com:9997[25669407cc8000b]) is ASSIGNED_TO_DEAD_SERVER #walogs:1
*** Looking for missing files ***
Scanning : accumulo.root (-inf,~ : [] 9223372036854775807 false)

GetMasterStats reported:

/usr/bin/accumulo org.apache.accumulo.test.GetMasterStats
2016-08-23 11:15:21,623 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss
State: NORMAL
Goal State: NORMAL
Unassigned tablets: 1
Dead tablet servers count: 0
Tablet Servers
Name: de-hd-cluster.data-node3.com:9997
Ingest: 0.00
Last Contact: 1471943720583
OS Load Average: 0.12
Queries: 0.00
Time Difference: 1.3
Total Records: 0
Lookups: 0
Recoveries: 0

🙂 Klaus
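The two outputs above disagree in an interesting way: checkTablets says the root tablet is assigned to a dead server, while GetMasterStats counts zero dead tablet servers and one unassigned tablet. A small sketch for pulling those fields out of the stats text is shown below; the sample is copied from the post, and `parse_stats` is a hypothetical helper, not an Accumulo API.

```python
# Excerpt copied from the GetMasterStats output in the post above.
STATS = """State: NORMAL
Goal State: NORMAL
Unassigned tablets: 1
Dead tablet servers count: 0
Tablet Servers
Name: de-hd-cluster.data-node3.com:9997"""


def parse_stats(text):
    """Split 'Key: value' lines into a dict; lines without ':' are skipped."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


stats = parse_stats(STATS)
unassigned = int(stats["Unassigned tablets"])
dead = int(stats["Dead tablet servers count"])

# A tablet that stays unassigned while no server is counted as dead matches
# the stale ASSIGNED_TO_DEAD_SERVER entry reported by checkTablets.
suspicious = unassigned > 0 and dead == 0
print("unassigned:", unassigned, "dead servers:", dead, "suspicious:", suspicious)
```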

Int17 · ‎08-23-2016

Hello Josh, thanks for your quick reply. I thought that the peaks in the memory usage have something to do with the table issue. On the Accumulo monitor page I see now:

In recent logs I see only this warning:

[fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss

After a restart I see:

2016-08-23 09:19:30,318 [replication.WorkDriver] DEBUG: Sleeping 30000 ms before next work assignment
2016-08-23 09:19:36,776 [master.Master] DEBUG: Finished gathering information from 1 servers in 0.00 seconds
2016-08-23 09:19:36,776 [master.Master] DEBUG: not balancing because there are unhosted tablets: 1
2016-08-23 09:19:43,087 [recovery.RecoveryManager] DEBUG: Unable to initate log sort for hdfs://de-hd-cluster.name-node.com:8020/apps/accumulo/data/wal/de-hd-cluster.data-node3.com+9997/91ece971-7485-4acf-aa7f-dcde00fafce9: java.io.FileNotFoundException: File does not exist: /apps/accumulo/data/wal/de-hd-cluster.data-node3.com+9997/91ece971-7485-4acf-aa7f-dcde00fafce9
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2835)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:733)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:663)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
	at
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145) 2016-08-23 09:19:43,611 [state.ZooTabletStateStore] DEBUG: root tablet logSet [hdfs://de-hd-cluster.name-node.com:8020/apps/accumulo/data/wal/de-hd-cluster.data-node3.com+9997/91ece971-7485-4acf-aa7f-dcde00fafce9] 2016-08-23 09:19:43,611 [state.ZooTabletStateStore] DEBUG: Returning root tablet state: +r<<@(null,de-hd-cluster.data-node3.com:9997[25669407cc8000b],de-hd-cluster.data-node3.com:9997[25669407cc8000b]) 2016-08-23 09:19:43,611 [recovery.RecoveryManager] DEBUG: Recovering hdfs://de-hd-cluster.name-node.com:8020/apps/accumulo/data/wal/de-hd-cluster.data-node3.com+9997/91ece971-7485-4acf-aa7f-dcde00fafce9 to hdfs://de-hd-cluster.name-node.com:8020/apps/accumulo/data/recovery/91ece971-7485-4acf-aa7f-dcde00fafce9 2016-08-23 09:19:43,614 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.master.recovery.HadoopLogCloser 2016-08-23 09:19:43,615 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://de-hd-cluster.name-node.com:8020/apps/accumulo/data/wal/de-hd-cluster.data-node3.com+9997/91ece971-7485-4acf-aa7f-dcde00fafce9 (in : 300s), tablet +r<< holds a reference 2016-08-23 09:19:43,615 [master.Master] DEBUG: [Root Table]: scan time 0.00 seconds 2016-08-23 09:19:43,615 [master.Master] DEBUG: [Root Table] sleeping for 60.00 seconds 2016-08-23 09:19:46,779 [master.Master] DEBUG: Finished gathering information from 1 servers in 0.00 seconds 2016-08-23 09:19:46,779 [master.Master] DEBUG: not balancing because there are unhosted tablets: 1 2016-08-23 09:19:56,782 [master.Master] DEBUG: Finished gathering information from 1 servers in 0.00 seconds 2016-08-23 09:19:56,782 [master.Master] DEBUG: not balancing because there are unhosted tablets: 1 2016-08-23 09:20:00,318 
[replication.WorkDriver] DEBUG: Sleeping 30000 ms before next work assignment 2016-08-23 09:20:06,785 [master.Master] DEBUG: Finished gathering information from 1 servers in 0.00 seconds 2016-08-23 09:20:06,785 [master.Master] DEBUG: not balancing because there are unhosted tablets: 1 2016-08-23 09:20:16,788 [master.Master] DEBUG: Finished gathering information from 1 servers in 0.00 seconds 2016-08-23 09:20:16,788 [master.Master] DEBUG: not balancing because there are unhosted tablets: 1 2016-08-23 09:24:44,144 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.master.recovery.HadoopLogCloser 2016-08-23 09:24:44,144 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://de-hd-cluster.name-node.com:8020/apps/accumulo/data/wal/de-hd-cluster.data-node3.com+9997/91ece971-7485-4acf-aa7f-dcde00fafce9 (in : 300s), tablet +r<< holds a reference Here the tables in Hadoop: root@NameNode:~# hadoop fs -ls -R /apps/accumulo/data/tables/ drwxr-xr-x - accumulo hdfs 0 2016-04-19 14:16 /apps/accumulo/data/tables/!0 drwxr-xr-x - accumulo hdfs 0 2016-08-08 13:33 /apps/accumulo/data/tables/!0/default_tablet -rw-r--r-- 3 accumulo hdfs 871 2016-08-08 13:33 /apps/accumulo/data/tables/!0/default_tablet/F0002flt.rf drwxr-xr-x - accumulo hdfs 0 2016-08-10 10:57 /apps/accumulo/data/tables/!0/table_info -rw-r--r-- 3 accumulo hdfs 933 2016-08-08 10:14 /apps/accumulo/data/tables/!0/table_info/A0002bqu.rf -rw-r--r-- 3 accumulo hdfs 933 2016-08-08 10:19 /apps/accumulo/data/tables/!0/table_info/A0002bqx.rf -rw-r--r-- 3 accumulo hdfs 122 2016-08-10 10:57 /apps/accumulo/data/tables/!0/table_info/A004gpfm.rf_tmp -rw-r--r-- 3 accumulo hdfs 688 2016-08-08 13:33 /apps/accumulo/data/tables/!0/table_info/F0002fl0.rf drwxr-xr-x - accumulo hdfs 0 2016-04-19 14:16 /apps/accumulo/data/tables/+r drwxr-xr-x - accumulo hdfs 0 2016-08-10 10:57 /apps/accumulo/data/tables/+r/root_tablet -rw-r--r-- 3 accumulo hdfs 974 2016-08-08 10:19 
/apps/accumulo/data/tables/+r/root_tablet/A0002bqz.rf -rw-r--r-- 3 accumulo hdfs 16 2016-08-10 10:57 /apps/accumulo/data/tables/+r/root_tablet/A004gpfl.rf_tmp -rw-r--r-- 3 accumulo hdfs 754 2016-08-10 10:13 /apps/accumulo/data/tables/+r/root_tablet/C004eodm.rf -rw-r--r-- 3 accumulo hdfs 364 2016-08-10 10:18 /apps/accumulo/data/tables/+r/root_tablet/F004ew4v.rf -rw-r--r-- 3 accumulo hdfs 364 2016-08-10 10:29 /apps/accumulo/data/tables/+r/root_tablet/F004fdch.rf -rw-r--r-- 3 accumulo hdfs 364 2016-08-10 10:34 /apps/accumulo/data/tables/+r/root_tablet/F004fn1f.rf -rw-r--r-- 3 accumulo hdfs 364 2016-08-10 10:39 /apps/accumulo/data/tables/+r/root_tablet/F004ftix.rf -rw-r--r-- 3 accumulo hdfs 364 2016-08-10 10:44 /apps/accumulo/data/tables/+r/root_tablet/F004g3af.rf -rw-r--r-- 3 accumulo hdfs 364 2016-08-10 10:54 /apps/accumulo/data/tables/+r/root_tablet/F004glat.rf drwxr-xr-x - accumulo hdfs 0 2016-04-19 14:16 /apps/accumulo/data/tables/+rep drwxr-xr-x - accumulo hdfs 0 2016-04-19 14:16 /apps/accumulo/data/tables/+rep/default_tablet drwxr-xr-x - accumulo hdfs 0 2016-04-19 14:18 /apps/accumulo/data/tables/1 drwxr-xr-x - accumulo hdfs 0 2016-08-10 10:57 /apps/accumulo/data/tables/1/default_tablet -rw-r--r-- 3 accumulo hdfs 2524936 2016-07-23 23:11 /apps/accumulo/data/tables/1/default_tablet/A0002041.rf -rw-r--r-- 3 accumulo hdfs 1502864 2016-07-29 11:17 /apps/accumulo/data/tables/1/default_tablet/C00024ci.rf -rw-r--r-- 3 accumulo hdfs 899175 2016-08-03 18:50 /apps/accumulo/data/tables/1/default_tablet/C00028be.rf -rw-r--r-- 3 accumulo hdfs 1428721 2016-08-07 13:21 /apps/accumulo/data/tables/1/default_tablet/C0002av5.rf -rw-r--r-- 3 accumulo hdfs 211245 2016-08-08 05:11 /apps/accumulo/data/tables/1/default_tablet/C0002bj6.rf -rw-r--r-- 3 accumulo hdfs 30474 2016-08-08 07:42 /apps/accumulo/data/tables/1/default_tablet/C0002bn1.rf -rw-r--r-- 3 accumulo hdfs 50286 2016-08-08 10:03 /apps/accumulo/data/tables/1/default_tablet/C0002bqh.rf -rw-r--r-- 3 accumulo hdfs 122 
2016-08-10 10:57 /apps/accumulo/data/tables/1/default_tablet/C004gpfk.rf_tmp
-rw-r--r-- 3 accumulo hdfs 905 2016-08-08 13:28 /apps/accumulo/data/tables/1/default_tablet/F0002byb.rf

The command

root@hdp-accumulo-instance> scan -np -t accumulo.root

hangs. Do you know how I can get rid of this table? 🙂 Klaus

Int17 · ‎08-22-2016

Hello, I receive the following messages from Accumulo every 10 seconds:

monitor_de-hd-cluster.name-node.com.debug.log:
2016-08-22 07:43:14,841 [impl.ThriftScanner] DEBUG: Failed to locate tablet for table : !0 row : ~err_
2016-08-22 07:43:23,167 [monitor.Monitor] INFO : Failed to obtain problem reports
java.lang.RuntimeException: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
	at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:161)
	at org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:252)
	at org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:310)
	at org.apache.accumulo.monitor.Monitor.fetchData(Monitor.java:346)
	at org.apache.accumulo.monitor.Monitor$1.run(Monitor.java:486)
	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
	at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:230)
	at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:80)
	at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:151)
	... 6 more
2016-08-22 07:43:23,510 [impl.ThriftScanner] DEBUG: Failed to locate tablet for table : !0 row : ~err_
2016-08-22 07:43:26,533 [monitor.Monitor] INFO : Failed to obtain problem reports
java.lang.RuntimeException: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
	at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:161)
	at org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:252)
	at org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:310)
	at org.apache.accumulo.monitor.Monitor.fetchData(Monitor.java:346)
	at org.apache.accumulo.monitor.Monitor$1.run(Monitor.java:486)
	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
	at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:230)
	at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:80)
	at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:151)
	... 6 more

After stopping Accumulo, the alternating memory usage was gone. The cluster is not used by anyone and has nothing to do. I have attached all debug log files from after a restart of Accumulo. Could anyone assist? 🙂 Klaus

Last Visited	‎03-09-2016 09:20 PM

Member Since	‎02-23-2016 09:00 PM
Posts	48
Kudos received	7

Cloudera Community

Re: Accumulo Tracer Service waitForFateOperation

Re: Flink cluster configuration issue - no slots a...

Re: Tutorial: Tag based policies with Apache Range...

Re: Tag based policies with Apache Ranger and Apac...

Re: Failed to install with cloudera-manager-instal...

Re: Oozie client installation failed in Ambari

Oozie client installation failed in Ambari

Re: Accumulo Tracer Service waitForFateOperation

Accumulo Tracer Service waitForFateOperation

Re: Flink cluster configuration issue - no slots a...

Flink cluster configuration issue - no slots avail...

Re: Failed to locate tablet for table : !0 row : ~...

Re: Failed to locate tablet for table : !0 row : ~...

Re: Failed to locate tablet for table : !0 row : ~...

Failed to locate tablet for table : !0 row : ~err_