Member since: 05-12-2015
Posts: 33
Kudos Received: 2
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 6208 | 03-05-2016 06:34 PM |
| 780 | 05-26-2015 01:56 PM |
08-04-2016
08:19 AM
Hello @nallen: Could you help me with this question? https://community.hortonworks.com/questions/49823/need-instructions-to-start-data-ingest-on-metron.html
08-04-2016
07:54 AM
Hello, I managed to install single-node Metron via Vagrant and reached the point where the Kibana status is green and all Hadoop services are in a good state. But I am unsure how to start ingesting pcap, NetFlow, or Bro data. Looking at the GitHub codebase of 0.2.BETA, it looks like you have so far only implemented data ingestion sensors for pcap using fastcapa and for Bro using bro-plugin-kafka; correct me if I am wrong about this. Please provide some instructions for the following:
1) How do I start data ingestion: pcap or YAF NetFlow?
2) How do I check the status of ingestion?
3) How do I check Storm processing?
4) Kibana UI: where do I look for the alerts?
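In the meantime, a minimal sketch of how one might verify that sensor data is flowing, assuming an HDP-style Kafka install under /usr/hdp/current/kafka-broker and a sensor topic named bro (both are assumptions, not taken from the Metron docs):
# List Kafka topics to see which sensor topics exist
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --list
# Tail a sensor topic to confirm events are arriving
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper node1:2181 --topic bro
# Confirm the Metron Storm topologies are running
storm list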
08-02-2016
07:15 PM
@Ryan Platt Thank you for the help. Your suggested fixes worked; the Kibana dashboard is up with green status. But now I am unsure where to look for data ingestion, pcap or NetFlow. Could you help me with the next steps, please?
06-29-2016
05:24 PM
2 Kudos
Hello guys @Former Member: when can we expect the release of the Cloudera Distribution of Apache Kafka with Kafka version 0.10? Is there an official roadmap for Kafka on CDH?
- Tags:
- cdh
- Kafka
- kafka 0.10
06-20-2016
07:49 PM
@nallen and @David Lyle: Here are the versions:
Metron 0.1BETA
--
* master
--
commit 739e2eb523dd6b4daeeccd3bab5a4a614ace8328
Author: dlyle65535 <dlyle65535@gmail.com>
Date: Tue Jun 14 15:35:15 2016 -0400
METRON-212: Allow additional Elasticsearch templates to be loaded to the index (dlyle65535 via cestella) closes apache/incubator-metron#145
--
metron-deployment/roles/metron_elasticsearch_templates/tasks/load_templates.yml | 1 +
1 file changed, 1 insertion(+)
--
ansible 2.0.0.2
config file =
configured module search path = Default w/o overrides
--
Vagrant 1.8.4
--
Python 2.7.10
--
Apache Maven 3.3.3 (7994120775791599e205a5524ec3e0dfe41d4a06; 2015-04-22T04:57:37-07:00)
Maven home: /usr/local/Cellar/maven/3.3.3/libexec
Java version: 1.8.0_91, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.11.4", arch: "x86_64", family: "mac"
--
Darwin Venkatas-MacBook-Pro.local 15.4.0 Darwin Kernel Version 15.4.0: Fri Feb 26 22:08:05 PST 2016; root:xnu-3248.40.184~3/RELEASE_X86_64 x86_64
06-15-2016
07:43 PM
My single-node Vagrant-based installation is failing at the Elasticsearch start stage. Could you guys help me out with some troubleshooting?
TASK [elasticsearch : Install Elasticsearch.] **********************************
changed: [node1]
TASK [elasticsearch : Create Data Directories] *********************************
changed: [node1] => (item=/data1/elasticsearch)
changed: [node1] => (item=/data2/elasticsearch)
TASK [elasticsearch : Configure Elasticsearch.] ********************************
changed: [node1] => (item={u'regexp': u'#cluster\\.name', u'line': u'cluster.name: metron'})
changed: [node1] => (item={u'regexp': u'#network\\.host:', u'line': u'network.host: _eth1:ipv4_'})
changed: [node1] => (item={u'regexp': u'#discovery\\.zen\\.ping\\.unicast\\.hosts', u'line': u'discovery.zen.ping.unicast.hosts: [ node1 ]'})
changed: [node1] => (item={u'regexp': u'#path\\.data', u'line': u'path.data: /data1/elasticsearch,/data2/elasticsearch'})
TASK [elasticsearch : Start Elasticsearch.] ************************************
changed: [node1]
TASK [elasticsearch : include] *************************************************
included: /Users/<user>/BigData/metron/incubator-metron-Metron_0.1BETA_rc7/deployment/roles/elasticsearch/tasks/configure_index.yml for node1
TASK [elasticsearch : Wait for Elasticsearch Host to Start] ********************
ok: [node1]
TASK [elasticsearch : Wait for Green Index Status] *****************************
fatal: [node1]: FAILED! => {"failed": true, "msg": "ERROR! The conditional check 'result.content.find(\"green\") != -1' failed. The error was: ERROR! error while evaluating conditional (result.content.find(\"green\") != -1): ERROR! 'dict object' has no attribute 'content'"}
PLAY RECAP *********************************************************************
node1 : ok=94 changed=38 unreachable=0 failed=1
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again
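For reference, the failed conditional is checking the cluster health endpoint; a minimal sketch of querying it directly on the guest, assuming Elasticsearch is reachable at node1:9200 (9200 is the ES default; the bind address comes from the network.host setting above):
# The playbook waits for "green" to appear in this response
curl -s 'http://node1:9200/_cluster/health?pretty'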
03-05-2016
06:34 PM
This is now fixed.
03-04-2016
11:25 AM
Hello guys, I am trying to set up livy_server so the Spark notebooks work with Hue 3.9.0 on CDH 5.6.0. I am facing the following error: the Livy server fails to start with a Hue database password error:
[root@xxxxxx hue]# ./build/env/bin/hue livy_server
Traceback (most recent call last):
File "./build/env/bin/hue", line 12, in <module>
load_entry_point('desktop==3.9.0', 'console_scripts', 'hue')()
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/desktop/core/src/desktop/manage_entry.py", line 57, in entry
execute_from_command_line(sys.argv)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/management/__init__.py", line 399, in e xecute_from_command_line
utility.execute()
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/management/__init__.py", line 392, in e xecute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/management/__init__.py", line 261, in f etch_command
commands = get_commands()
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/management/__init__.py", line 107, in g et_commands
apps = settings.INSTALLED_APPS
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/conf/__init__.py", line 54, in __getattr__
self._setup(name)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/conf/__init__.py", line 49, in _setup
self._wrapped = Settings(settings_module)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/conf/__init__.py", line 128, in __init__
mod = importlib.import_module(self.SETTINGS_MODULE)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/utils/importlib.py", line 40, in import_modu le
__import__(name)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/desktop/core/src/desktop/settings.py", line 309, in <module>
"PASSWORD" : desktop.conf.get_database_password(),
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/desktop/core/src/desktop/conf.py", line 1219, in get_database_password
password = DATABASE.PASSWORD_SCRIPT.get()
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/desktop/core/src/desktop/lib/conf.py", line 140, in get
return self.config.get_value(data, present=present, prefix=self.prefix, coerce_type=True)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/desktop/core/src/desktop/lib/conf.py", line 256, in get_value
return self._coerce_type(raw_val, prefix)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/desktop/core/src/desktop/lib/conf.py", line 276, in _coerce_type
return self.type(raw)
File "/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hue/desktop/core/src/desktop/conf.py", line 66, in coerce_password_from_script
raise subprocess.CalledProcessError(p.returncode, script)
subprocess.CalledProcessError: Command '/var/run/cloudera-scm-agent/process/4295-hue-HUE_SERVER/altscript.sh sec-8-password' returned non-zero exit status 1
I made sure the current Hue password set in CM works fine. Please help here.
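A quick way to narrow this down is to re-run the password script named in the traceback and inspect its exit status; a minimal sketch (the script path and argument are copied from the traceback; running it directly is an assumption about how to reproduce the failure):
# Re-run the script Hue invokes for the database password and show its exit code
/var/run/cloudera-scm-agent/process/4295-hue-HUE_SERVER/altscript.sh sec-8-password
echo "exit status: $?"
# Note: the numbered process directory changes on every restart, so adjust the path to the newest HUE_SERVER dir.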
08-07-2015
11:44 AM
Harsh, thank you for the comments. I tried changing permissions of /hadoopX to 700 and started the datanode service on it, but upon datanode service start, cm-agent is still creating /hadoopX/data and /hadoopX/local on the root partition. Here is the log:
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Activating Process 608-hdfs-DATANODE
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Created /var/run/cloudera-scm-agent/process/608-hdfs-DATANODE
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Chowning /var/run/cloudera-scm-agent/process/608-hdfs-DATANODE to apps (513) apps (515)
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Chmod'ing /var/run/cloudera-scm-agent/process/608-hdfs-DATANODE to 0751
[07/Aug/2015 11:19:14 +0000] 14569 MainThread parcel INFO prepare_environment begin: {u'CDH': u'4.7.1-1.cdh4.7.1.p0.47'}, [u'cdh'], [u'cdh-plugin', u'hdfs-plugin']
[07/Aug/2015 11:19:14 +0000] 14569 MainThread parcel INFO The following requested parcels are not available: {}
[07/Aug/2015 11:19:14 +0000] 14569 MainThread parcel INFO Obtained tags ['cdh'] for parcel CDH
[07/Aug/2015 11:19:14 +0000] 14569 MainThread parcel INFO prepare_environment end: {'CDH': '4.7.1-1.cdh4.7.1.p0.47'}
[07/Aug/2015 11:19:14 +0000] 14569 MainThread util INFO Extracted 8 files and 0 dirs to /var/run/cloudera-scm-agent/process/608-hdfs-DATANODE.
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Created /var/run/cloudera-scm-agent/process/608-hdfs-DATANODE/logs
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Chowning /var/run/cloudera-scm-agent/process/608-hdfs-DATANODE/logs to apps (513) apps (515)
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Chmod'ing /var/run/cloudera-scm-agent/process/608-hdfs-DATANODE/logs to 0751
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Created /hadoop6/data
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Chowning /hadoop6/data to apps (513) hadoop (493)
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Chmod'ing /hadoop6/data to 0700
[07/Aug/2015 11:19:14 +0000] 14569 MainThread agent INFO Triggering supervisord update.
[07/Aug/2015 11:19:14 +0000] 14569 MainThread abstract_monitor INFO Refreshing DataNodeMonitor for None
And we mount disks using fstab only. Our main goal is to not decommission services on a node whose disk failures are <= dfs.datanode.failed.volumes.tolerated (2). Could you help with some other workaround to stop cm-agent from creating data dirs on the root partition, maybe by adding a check in the agent at $CMF_PATH/agent/src/cmf/agent.py?
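One possible stop-gap, sketched below under the assumption that each /hadoopX directory should be a dedicated mount point (the path list is illustrative; this is not a CM-provided mechanism), is to check the mounts before the datanode is started:
# Warn if any data disk is not actually mounted, since /data and /local would then land on the root partition
for d in /hadoop1 /hadoop2 /hadoop3 /hadoop4 /hadoop5 /hadoop6; do
  mountpoint -q "$d" || echo "WARNING: $d is not a mount point"
done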
07-30-2015
11:32 AM
Hello, we have an issue with the root partition getting filled whenever a datanode directory fails and is unmounted: the Cloudera agent recreates the data dirs upon server power cycle or Cloudera agent restart.
Our environment: CDH 4.7.1, CM 4.8.5
Our current data dir setup:
dfs.datanode.data.dir = /hadoopX/data
mapred.local.dir = /hadoopX/local
Whenever a drive /hadoopX fails and we unmount it for repair, the Cloudera agent creates /hadoopX/data and /hadoopX/local directories on the root partition. Due to running jobs, the root partition (200 GB) gets filled pretty soon, resulting in service (datanode, tasktracker) failures. Is there a workaround? How can we stop the Cloudera agent from creating data dirs on the root partition? I see that Ambari had a similar issue, and it is fixed: https://issues.apache.org/jira/browse/AMBARI-7506. Please suggest any workaround. Thank you, I appreciate your help. - Thanks, Vganji
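A quick way to spot the symptom is to check which device each data dir actually lives on; a minimal sketch, assuming the /hadoopX layout above (the path list is illustrative):
# Anything reporting the root device here is sitting on the wrong partition
for d in /hadoop1/data /hadoop2/data; do
  df -P "$d" | awk 'NR==2 {print $6, "->", $1}'
done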
06-26-2015
02:08 PM
Gautam, what is the recommended heap size setting for CM services like the Service Monitor, Host Monitor, and Activity Monitor, for, say, an around 500-node cluster? And also, we have misconfigured alerts in CM for JT HA: we don't have an HA setup, yet I see the health test result for MAPREDUCE_HA_JOB_TRACKER_HEALTH going bad very frequently. Our JT health is just fine.
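For reference, one way to see what heap the management services currently run with is to pull the -Xmx flag out of each live JVM; a minimal sketch (matching on 'java' is a crude assumption about the process names):
# Print each Java process ID with its max-heap flag
for pid in $(pgrep -f java); do
  echo "$pid: $(ps -o args= -p "$pid" | grep -o -- '-Xmx[^ ]*')"
done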
06-25-2015
11:30 AM
Hello guys, we are running CDH 4.7.1 with CM 4.8.5 and we have pools set in the fair scheduler using CM's MapReduce configuration. One of the pools is set with the following thresholds:
<pool name="pool-name">
<minMaps>144</minMaps>
<minReduces>30</minReduces>
<maxMaps>144</maxMaps>
<maxReduces>30</maxReduces>
</pool>
But the scheduler page at 'jt-host:50030/scheduler' shows a completely different value. There are many other pools set, but only this particular pool has the issue. Could you guys suggest any ways to debug this? - Thanks, vganji
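One debugging angle is to compare the pool definition the JobTracker actually loaded against what CM shows; a minimal sketch, assuming CM deploys the allocations file into the JobTracker's process directory (the directory pattern and file name are guesses):
# Inspect the deployed pool definition in the newest JobTracker process dir
latest=$(ls -dt /var/run/cloudera-scm-agent/process/*JOBTRACKER* 2>/dev/null | head -1)
grep -A5 'pool name="pool-name"' "$latest"/*scheduler*.xml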
06-19-2015
09:25 AM
Well, then do you suggest any workaround? And also, is there an Apache JIRA for this issue? I was unable to find one. We restart our jobtracker every two weeks.
06-18-2015
11:39 PM
Thanks for the reply, Wilfred. Our jobhistory is not on HDFS; it's on the local filesystem, inside the jobtracker logs directory.
jobtracker log location: /hadoop/logs
jobtracker jobhistory: /hadoop/logs/history ---> current size is 71 GB
And I understand 'mapreduce.jobtracker.jobhistory.maxage' is the maximum age for log retention, and we are not using this parameter in our cluster. Will moving the jobhistory to HDFS help?
06-18-2015
01:21 PM
Hello, the jobhistory page at 'jobtracker_host:50030/jobhistory.jsp' is not picking up completed jobs. This issue started right after a jobtracker service restart. I checked the following things:
* Permissions are fine for the jobhistory and log locations.
* Job XML files are properly being generated on disk for the latest jobs.
* Only the 'jobhistory.jsp' page is not picking them up.
Our jobhistory location is at 71 GB right now. Is there a chance that the JSP page is not able to pick up such a huge amount of data? We are currently running CDH 4.7.1 with CM 4.8.5. And also, is this related to the property 'mapreduce.jobtracker.jobhistory.maxage' described at http://mapredit.blogspot.com/2011/10/hadoop-log-retention.html? This question is related to "Jobtracker's jobhistory page is empty". Please suggest; any help is appreciated.
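To test the "too much data" theory, it may help to measure how large a listing the JSP has to perform; a minimal sketch, assuming the history location from the follow-up post (/hadoop/logs/history):
# Count history files and time a full listing, roughly what the page's listStatus must do
time ls /hadoop/logs/history | wc -l
du -sh /hadoop/logs/history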
06-18-2015
12:11 PM
Hello guys, this issue came back again. The jobhistory page at 'jobtracker_host:50030/jobhistory.jsp' is not picking up completed jobs. This issue started right after a jobtracker service restart. I checked the following things:
* Permissions are fine for the jobhistory and log locations.
* Job XML files are properly being generated on disk for the latest jobs.
* Only the 'jobhistory.jsp' page is not picking them up.
Our jobhistory location is at 71 GB right now. Is there a chance that the JSP page is not able to pick up such a huge amount of data? And also, is this related to the property 'mapreduce.jobtracker.jobhistory.maxage' described at http://mapredit.blogspot.com/2011/10/hadoop-log-retention.html? Please suggest; any help is appreciated.
06-17-2015
01:36 PM
FYI: we were using a custom location for the logs, not the default /var/log/. I was able to fix this issue, but it was a weird workaround: moving the log and history locations to the default /var/log and then moving them back to the original location. Jobhistory worked fine after this. I was unable to find the exact reason why this issue happened in the first place.
06-17-2015
01:27 PM
Yes, Gautam. I lost it when I was trying to move the question to an appropriate topic group.
06-17-2015
01:21 PM
Hello, we have NN and JT without an HA configuration, but our CM (CM 4.8.5 / CDH 4.7.1) reports alerts for:
* Active NameNode Role Health Check
* Standby NameNode Health Check
* Active JobTracker Health
These appear under MapReduce -> Configs -> Monitoring -> service_wide and HDFS -> Configs -> Monitoring -> service_wide. We also get these alerts even though the Jobtracker is healthy and running. Please recommend how to change the CM alerts configuration. Any help is appreciated. Thank you.
06-10-2015
11:00 AM
Hello,
Our jobtracker's jobhistory page at 'http://jobtracker_host:50030/jobhistory.jsp' is showing up empty with a 'No files found' message. I know this page is populated, from the job history location, by the local webserver running inside the jobtracker service. I have checked the job history logs location on disk and it is just fine, with a lot of data.
Did anyone face a similar issue or know how to resolve this? Please suggest; any help is appreciated.
I am using: CM 4.8.5, CDH 4.7.1, MRv1
If it helps, I looked up the Hadoop code and found how this JSP page is rendered.
Code block from: https://github.com/apache/hadoop-mapreduce/blob/HDFS-641/src/webapps/job/jobhistory.jsp
Path[] jobFiles = FileUtil.stat2Paths(fs.listStatus(new Path(historyLogDir), jobLogFileFilter));
out.println("<!-- user : " + user + ", jobid : " + jobid + "-->");
if (null == jobFiles || jobFiles.length == 0) {
out.println("No files found!");
return ;
}
I am trying to find where the value of 'historyLogDir' gets populated from.
(Screenshot of the jobtracker page was attached here.)
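As a sketch of chasing historyLogDir down from the configuration side, assuming MRv1 client configs live under /etc/hadoop/conf (which property keys feed that variable is an assumption):
# Look for history-location properties in the client and JobTracker configs
grep -ri 'job.history' /etc/hadoop/conf/ /var/run/cloudera-scm-agent/process/*JOBTRACKER*/ 2>/dev/null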
06-09-2015
10:29 AM
Gautam, could you also help me out with how and where to enable GC logging for the Cloudera Manager server?
06-08-2015
10:30 PM
Thank you, Gautam. Appreciate it.
06-08-2015
06:26 PM
Hello, where do I change the Cloudera Manager server's heap size and GC settings? I am running CM 4.8.5.
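For what it's worth, on many CM installations the server's JVM options live in /etc/default/cloudera-scm-server; a minimal sketch of raising the heap and enabling GC logging there (the values below are illustrative, and the location may vary by version):
# /etc/default/cloudera-scm-server -- illustrative heap and GC-logging settings
export CMF_JAVA_OPTS="-Xmx4G -XX:MaxPermSize=256m -verbose:gc -Xloggc:/var/log/cloudera-scm-server/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
# Then restart the server to pick up the change:
service cloudera-scm-server restart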
06-08-2015
12:00 PM
Hello, we have a Cloudera Manager 4.8.5 managed cluster, and anyone on the same network as the CM host machine can download the client configurations without any authentication via the below two static URLs:
http://cm_host:7180/api/v5/clusters/{cluster-name}/services/mapreduce1/clientConfig
http://cm_host:7180/cmf/services/22/client-config
Is there a way to restrict this access via the CM API or CM settings? Any suggestions are appreciated.
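For anyone reproducing this, a minimal sketch of verifying the unauthenticated access from another machine on the network (the URL is the second one quoted above):
# Expect HTTP 200 even though no credentials are supplied
curl -s -o /dev/null -w '%{http_code}\n' 'http://cm_host:7180/cmf/services/22/client-config'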
05-27-2015
09:40 PM
Thanks Harsh, appreciate it.
05-27-2015
09:32 PM
I know that it could be set in the jobtracker, but I am trying to set it at the job level to avoid a jobtracker restart. So, as I understand it, it is not at all possible to set it at the job level? Interesting. Thanks, Harsh.
05-27-2015
01:57 PM
I have a mapreduce job which requires more than the current value of 10 MB that is set in mapred-site.xml. I want to use a temporarily increased value for my current job. Can I set it in the driver class, like 'job.setXXXXXX()'? And I also don't want to use the -D option at the command line. Any suggestions?
05-26-2015
01:56 PM
It's resolved! It looks like /metrics has moved to /jmx based on the newer metrics2 implementation.
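For anyone hitting the same thing, a minimal sketch of reading NameNode metrics from the new endpoint (the hostname is a placeholder; the qry parameter narrows the output to a single MBean):
# Fetch all metrics, or filter to one bean
curl -s 'http://namenode:50070/jmx'
curl -s 'http://namenode:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo'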
05-26-2015
01:37 PM
We recently upgraded our cluster from CDH3u3 to CDH 4.7.1. After the upgrade, 'http://namenode:50070/metrics' is not showing up. Does anything need to be enabled for this in CDH4, or am I missing something? Please suggest.