Member since: 06-28-2017
Posts: 279
Kudos Received: 43
Solutions: 24
My Accepted Solutions
Views | Posted
---|---
2016 | 12-24-2018 08:34 AM
5393 | 12-24-2018 08:21 AM
2250 | 08-23-2018 07:09 AM
9804 | 08-21-2018 05:50 PM
5185 | 08-20-2018 10:59 AM
02-04-2018
01:27 PM
There is no hbck tool available on the nodes, so I couldn't try that.
02-03-2018
02:35 PM
1 Kudo
I have an issue with the Ambari Metrics Collector that I haven't been able to solve so far. It started with the Metrics Collector being restarted frequently until it finally stopped. So I followed the solution provided here: https://community.hortonworks.com/questions/121137/ambari-metrics-collector-restarting-again-and-agai.html (stopped the AMS completely, deleted the HBase files, etc.).

Now when I start the Metrics Collector again, Ambari shows the alert:

Connection failed: [Errno 111] Connection refused to cgihdp4.localnet:6188

When I check this on the node, the alert is confirmed; no process is listening on the port:

[root@cgihdp4 ~]# netstat -tulpn | grep 6188
[root@cgihdp4 ~]#

So I stopped and restarted the AMS on that node again:

[root@cgihdp4 ~]# ambari-metrics-collector status
AMS is not running.
[root@cgihdp4 ~]# ambari-metrics-collector start
Sa 3. Feb 11:18:07 CET 2018 Starting HBase.
starting master, logging to /var/log/ambari-metrics-collector/hbase-root-master-cgihdp4.out
Verifying ambari-metrics-collector process status...
Sa 3. Feb 11:18:10 CET 2018 Collector successfully started.
Sa 3. Feb 11:18:10 CET 2018 Initializing Ambari Metrics data model
Sa 3. Feb 11:18:27 CET 2018 Ambari Metrics data model initialization check 1
Sa 3. Feb 11:18:42 CET 2018 Ambari Metrics data model initialization check 2
Sa 3. Feb 11:18:58 CET 2018 Ambari Metrics data model initialization check 3
Sa 3. Feb 11:19:13 CET 2018 Ambari Metrics data model initialization check 4
Sa 3. Feb 11:19:30 CET 2018 Ambari Metrics data model initialization check 5
Sa 3. Feb 11:19:45 CET 2018 Ambari Metrics data model initialization check 6
Sa 3. Feb 11:20:01 CET 2018 Ambari Metrics data model initialization check 7
Sa 3. Feb 11:20:16 CET 2018 Ambari Metrics data model initialization check 8
Sa 3. Feb 11:20:34 CET 2018 Ambari Metrics data model initialization check 9
Sa 3. Feb 11:20:49 CET 2018 Ambari Metrics data model initialization check 10
[root@cgihdp4 ~]# ambari-metrics-collector status
AMS is running as process 32154.
[root@cgihdp4 ~]# netstat -tulpn | grep 6188
[root@cgihdp4 ~]# ps -ef | grep 32154
root 8187 31808 0 11:46 pts/0 00:00:00 grep 32154
root 32154 1 1 11:18 pts/0 00:00:24 /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xms640m -Xmx640m -Djava.library.path=/usr/lib/ams-hbase/lib/hadoop-native -Djava.security.auth.login.config=/etc/ams-hbase/conf/ams_collector_jaas.conf -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/ambari-metrics-collector/collector-gc.log-201802031118 -cp /usr/lib/ambari-metrics-collector/*:/etc/ambari-metrics-collector/conf -Djava.net.preferIPv4Stack=true -Dams.log.dir=/var/log/ambari-metrics-collector -Dproc_timelineserver org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
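A quick cross-check in this situation (a sketch; host and port are taken from the alert above): the process being alive does not mean the embedded ApplicationHistoryServer ever bound the port.

```shell
# Is anything listening on the collector port? (ss is the modern netstat)
ss -tlnp | grep 6188 || echo "nothing listening on 6188"

# Probe the timeline URL from the sink log; any HTTP response at all means the
# port is up, while "connection refused" matches the Ambari alert.
curl -sS --max-time 5 "http://cgihdp4.localnet:6188/ws/v1/timeline/metrics" \
  || echo "collector not reachable on 6188"
```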
Looks like I'm missing an important point here. I checked the logs, where I see the following messages in /var/log/ambari-metrics-collector/hbase-ams-master-cgihdp4.log:

2018-02-03 11:43:59,838 WARN [ProcedureExecutorThread-2] procedure.CreateTableProcedure: The table SYSTEM.CATALOG does not exist in meta but has a znode. run hbck to fix inconsistencies.
...
2018-02-03 11:50:19,149 ERROR [cgihdp4.localnet,61300,1517601244862_ChoreService_1] master.BackupLogCleaner: Failed to get hbase:backup table, therefore will keep all files
[stacktrace removed...]
2018-02-03 11:51:14,716 INFO [timeline] timeline.HadoopTimelineMetricsSink: Unable to connect to collector, http://cgihdp4.localnet:6188/ws/v1/timeline/metrics
This exceptions will be ignored for next 100 times
2018-02-03 11:51:14,717 WARN [timeline] timeline.HadoopTimelineMetricsSink: Unable to send metrics to collector by address:http://cgihdp4.localnet:6188/ws/v1/timeline/metrics
Some odd things I also noticed when trying to follow the above-mentioned solution:

- the AMS user had no write permission on the HDFS trash, so all file deletions were failing until I added the parameter -skipTrash
- the directory 'hbase.tmp.dir'/zookeeper did not exist (and still doesn't exist)

Any ideas on how to resolve this? (I will try to run hbck as mentioned in the log.)
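Since the log explicitly points at a stale znode for SYSTEM.CATALOG, one way to inspect it is through the zkcli of the AMS-embedded HBase. This is only a sketch under assumptions: the /usr/lib/ams-hbase path is inferred from the classpath in the ps output above, and the znode parent (/ams-hbase-unsecure here) must match zookeeper.znode.parent in ams-hbase-site. Stop AMS and back up before deleting anything.

```shell
# ZooKeeper CLI of the AMS-embedded HBase; the path is an assumption based on
# the classpath shown above, and is skipped here if it is absent.
ZKCLI="/usr/lib/ams-hbase/bin/hbase"
if [ -x "$ZKCLI" ]; then
  "$ZKCLI" --config /etc/ams-hbase/conf zkcli <<'EOF'
ls /ams-hbase-unsecure/table
EOF
else
  echo "ams-hbase not found; run this on the collector node"
fi
# If SYSTEM.CATALOG still appears under the table znodes although the table is
# gone, deleting the stale znode is what the hbck hint boils down to:
#   rmr /ams-hbase-unsecure/table/SYSTEM.CATALOG
```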
Labels:
- Apache Ambari
01-31-2018
02:45 PM
@Matt Andruff Maybe you can verify the following point. In your initial log I can see this entry:

18/01/30 20:50:46 INFO RMProxy: Connecting to ResourceManager at this.server.fqdn/192.168.1.100:8032

So to me it looks like it tries to connect using the IP address. If that is true, and the reverse lookup of the IP doesn't return the name this.server.fqdn, the ticket that was granted for yarn/this.server.fqdn@MYREALM.INTERNAL can't be accepted. And from what I can see in your output, this is the case (if this is just a result of the obfuscation, correct me):

- this.server.fqdn => 192.168.1.100 (ok)
- this => 192.168.1.100 (ok)
- 192.168.1.100 => ip-192.168.1.100.us-west-2.compute.internal <> this.server.fqdn (not ok; this is the potential root cause of your authentication issue)

What should help in that case:

- Make sure your client uses the server name instead of the IP, so that the reverse lookup is not invoked.
- Ensure that the reverse lookup returns the name this.server.fqdn (sometimes this is not possible due to network topology).
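The lookups described above can be scripted as a quick check (a sketch, using the obfuscated host/IP from this thread; substitute your real values):

```shell
# getent consults the same resolver chain (hosts file + DNS) that Java uses.
HOST="this.server.fqdn"
IP="192.168.1.100"

fwd=$(getent hosts "$HOST" | awk '{print $1}')   # forward: name -> IP
rev=$(getent hosts "$IP"   | awk '{print $2}')   # reverse: IP -> name

echo "forward: $HOST -> ${fwd:-<no answer>}"
echo "reverse: $IP -> ${rev:-<no answer>}"
[ "$rev" = "$HOST" ] && echo "reverse lookup matches" \
                      || echo "MISMATCH: the service principal will be built from '$rev'"
```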
01-31-2018
12:42 PM
I think I just found the explanation for why your installation is not available: "You must do SSH on port 2222 when you want to connect to the actual docker container where HDP binaries are installed." I have just read that information here: https://community.hortonworks.com/questions/167327/hadoop-cmd-not-found-error-putty-hortonworks-sandb.html This applies to WinSCP as well: try connecting to port 2222 instead of the default (22).
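In a plain shell the same connection attempt would look like this (hostname as used in the tutorials; adjust it if you connect via IP):

```shell
# SSH into the Docker container holding the HDP binaries: port 2222, not 22.
# BatchMode/ConnectTimeout only make this sketch fail fast if the host is down.
ssh -p 2222 -o BatchMode=yes -o ConnectTimeout=5 root@sandbox.hortonworks.com 'echo connected' \
  || echo "could not reach sandbox.hortonworks.com on port 2222"
```

Note that ssh takes the port as lowercase -p, while scp uses uppercase -P.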
01-31-2018
12:13 PM
1 Kudo
Almost all tools that do authorization base it on usernames. E.g. in Ranger you configure usernames to allow access. And most tools can use an authentication mechanism other than Kerberos, so all of them need a mapping from the Kerberos principal to a username. If you have configured SSH to accept Kerberos authentication, the system still needs to know which user has been authenticated, e.g. to determine the home directory and to start the user-specific environment.
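As an illustration of such a mapping in the Hadoop world: the hadoop.security.auth_to_local property in core-site.xml translates principals into short usernames. A hypothetical fragment, assuming the realm MYREALM.INTERNAL seen elsewhere in this thread:

```xml
<!-- core-site.xml: hypothetical mapping, realm borrowed from this thread -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0](yarn@MYREALM.INTERNAL)s/.*/yarn/
    RULE:[1:$1@$0](.*@MYREALM.INTERNAL)s/@.*//
    DEFAULT
  </value>
</property>
```

The first rule maps two-component principals like yarn/host@MYREALM.INTERNAL to the local user yarn; the second strips the realm from one-component principals.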
01-31-2018
11:52 AM
It basically means that no Hortonworks stack is installed (as there is no /usr/hdp directory). The sandbox is supposed to come with all the installation already done. So either you are connecting to the wrong machine (unlikely, as you are able to log in as root), or something is wrong with your sandbox. Would you mind downloading the sandbox again and importing it into your tool (I guess you are using VirtualBox)?
01-31-2018
08:04 AM
If the file is not in the sandbox, it means you either have the wrong sandbox, or the sandbox doesn't match the tutorial. The tutorial says HDP 2.6.0, so I think you downloaded HDP 2.6.3, which links to the tutorial you are following. Please excuse me if I am explaining something obvious: after logging in to the sandbox, try clicking '..' (or the button with the up-arrow), or click the folder button labelled '/' to get to the so-called root of the filesystem (which is something else than /root). Then check whether you can see a directory 'usr'. In your screenshot, your working directory is '/root/start_scripts'.
01-31-2018
07:43 AM
@Matt Andruff your Kerberos log says that your application (spark_remote/this.server.fqdn@MYREALM.INTERNAL) was granted a service ticket for yarn/this.server.fqdn@MYREALM.INTERNAL. It looks like this ticket is not accepted by the ResourceManager. If your ResourceManager is otherwise working well with Kerberos, I really think @Geoffrey Shelton Okot is right that it is something with the names. Can you check your name resolution with the commands below and verify that they all return the FQDN (this.server.fqdn) and the same IP (192.168.1.100)?

nslookup this.server.fqdn
nslookup this
nslookup 192.168.1.100
01-30-2018
01:07 PM
With WinSCP you should not have to enter the commands; you can copy files by simply dragging them from the right side (the remote machine, here the sandbox) to the left side (your local machine). So try to just move the .Main.py file to the correct directory. On the right side, open the directory you want to copy the files into.

The command itself:

scp -P 2222 root@sandbox.hortonworks.com:/usr/hdp/current/spark2-client/python/lib/pyspark.zip ~/HelloSpark/

is supposed to copy the file /usr/hdp/current/spark2-client/python/lib/pyspark.zip from the server sandbox.hortonworks.com into the directory HelloSpark below your home directory on your local machine. To do the same with WinSCP, go to /usr/hdp/current/spark2-client/python/lib/ on the right side, select the file pyspark.zip and drag it to the left side.

One important point: in the tutorial the SSH port is 2222 (not the default 22; hence the parameter -P 2222). The commands provided in the tutorial are to be entered in a shell, which means you open a shell (like bash) on your local machine and enter the commands there. The commands actually fail because they use the name sandbox.hortonworks.com, while you are connecting via an IP address (192.168.47.128). So if you really want, you can try running the commands with sandbox.hortonworks.com replaced by 192.168.47.128. I guess the target ~/HelloSpark could also fail if you are running Windows on your local machine.
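If you do want to try the shell variant with the IP instead of the hostname, the substituted command would look like this (IP from your screenshot; it only works if port 2222 is reachable and ~/HelloSpark exists):

```shell
# Tutorial copy command with sandbox.hortonworks.com replaced by the IP,
# keeping the container SSH port 2222; fails fast when unreachable.
scp -P 2222 -o BatchMode=yes -o ConnectTimeout=5 \
    root@192.168.47.128:/usr/hdp/current/spark2-client/python/lib/pyspark.zip ~/HelloSpark/ \
  || echo "copy failed: check that the sandbox is running and port 2222 is open"
```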
01-30-2018
12:25 PM
Are you able to start an SSH session on your sandbox (e.g. with PuTTY)? The default port for SCP is 22, which is the SSH port. So if you are able to start a shell but cannot get WinSCP connected, it must be your WinSCP configuration.