Member since: 04-23-2017
Posts: 30
Kudos Received: 3
Solutions: 2

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2654 | 06-11-2016 04:19 AM
 | 2367 | 05-22-2016 05:51 AM
11-13-2017
01:55 PM
I am trying to use the Kafka Schema Registry with the HDFS sink. I am able to retrieve the schema using a curl command and to write data into my Kafka topic using the kafka-avro-console-producer. Flume sees the event but then throws the error below. I have tried different configurations and nothing is working. Please help.

process failed
org.apache.flume.FlumeException: Could not find schema for event [Event headers = {timestamp=1510603952954, topic=customerdata2, partition=1}, body.length = 11 ]

Below is my Flume configuration, where I list my Kafka schema registry.

# Sources, channels, and sinks are defined per
# agent name, in this case tier1.
tier1.sources = source1
tier1.channels = channel1
tier1.sinks = sink1

tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
tier1.sources.source1.deserializer.schemaType = LITERAL
tier1.sources.source1.zookeeperConnect = zk-nonprod:2181
tier1.sources.source1.topic = customerdata2
tier1.sources.source1.groupId = flume
tier1.sources.source1.channels = channel1
tier1.sources.source1.kafka.consumer.timeout.ms = 100

tier1.channels.channel1.type = memory
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactionCapacity = 1000
tier1.channels.channel1.parseAsFlumeEvent = false
tier1.channels.channel1.kafka.consumer.auto.offset.reset = earliest

tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.serializer = org.apache.flume.sink.hdfs.AvroEventSerializer$Builder
tier1.sinks.sink1.hdfs.kerberosPrincipal = $KERBEROS_PRINCIPAL
tier1.sinks.sink1.hdfs.kerberosKeytab = $KERBEROS_KEYTAB
tier1.sinks.sink1.hdfs.path = hdfs://cdh-namenode:8020/tmp/kafka/%{topic}/%y-%m-%d
tier1.sinks.sink1.hdfs.rollInterval = 5
tier1.sinks.sink1.hdfs.rollSize = 0
tier1.sinks.sink1.hdfs.rollCount = 0
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.serializer = avro_event
tier1.sinks.sink1.hdfs.fileSuffix = .avro
tier1.sinks.sink1.hdfs.schema.registry.url = http://kafka-util:8081
tier1.sinks.sink1.channel = channel1
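For reference, this is roughly how I have been sanity-checking that the schema is actually registered; the customerdata2-value subject name is an assumption based on the registry's default topic-name convention.

# List registered subjects, then fetch the latest schema for the topic (subject name is an assumption)
curl http://kafka-util:8081/subjects
curl http://kafka-util:8081/subjects/customerdata2-value/versions/latest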
Labels:
- Apache Flume
- Apache Kafka
- HDFS
04-26-2017
07:15 AM
Any help would be appreciated. I have researched this for the last 72 hours and nothing I have found has worked. I have checked security and the folders are owned by yarn, and there appears (unless I am missing something) to be plenty of space on all the drives. All four NodeManagers are still marked as unhealthy and none of my jobs will complete. The below is still listed in the log.

Unable to create directory /data/yarn/nm error mkdir of /data/yarn failed, removing from the list of valid directories.
Unable to create directory /data/yarn/container-logs error mkdir of /data/yarn failed, removing from the list of valid directories.
Most of the disks failed. 1/1 local-dirs are bad: /data/yarn/nm; 1/1 log-dirs are bad: /data/yarn/container-logs
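Roughly what I have been checking on the NodeManager hosts so far; since the failure is on the mkdir of /data/yarn itself, the parent /data mount seems worth a look too.

# Is /data mounted, writable, and not full?
df -h /data
mount | grep /data
ls -ld /data /data/yarn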
04-24-2017
06:31 AM
Brand new installation of Cloudera 5.11.0, with no issues encountered during the install. NodeManager health for all 4 datanodes is marked as bad. The role instance log for all nodes shows the lines below.

Unable to create directory /data/yarn/nm error mkdir of /data/yarn failed, removing from the list of valid directories.
Unable to create directory /data/yarn/container-logs error mkdir of /data/yarn failed, removing from the list of valid directories.
Most of the disks failed. 1/1 local-dirs are bad: /data/yarn/nm; 1/1 log-dirs are bad: /data/yarn/container-logs

I checked the nodes and all of them have the directories stated in the error. To try and isolate the problem, I tested one datanode by renaming the current directories and restarting the service, and both directories were recreated successfully. I also checked space to see if that may be an issue, but there is 500GB of space on the drive with 1% used. Any ideas??
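For completeness, these are the ownership and permission checks/fixes I would expect to need on each NodeManager host; the yarn:hadoop ownership is an assumption, so adjust it to match your setup.

# Verify the directories exist and are writable by the yarn user
ls -ld /data /data/yarn /data/yarn/nm /data/yarn/container-logs
# Recreate them with the expected ownership if anything is off
sudo mkdir -p /data/yarn/nm /data/yarn/container-logs
sudo chown -R yarn:hadoop /data/yarn
sudo chmod -R 755 /data/yarn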
Labels:
- Apache YARN
- Cloudera Manager
06-11-2016
04:19 AM
1 Kudo
Got it!! I added the below property to the mapred-site.xml file and the query ran successfully.

<property>
<name>mapreduce.app-submission.cross-platform</name>
<value>True</value>
</property>
06-11-2016
03:30 AM
Here is the property I added.

<property>
<name>hdp.version</name>
<value>2.4.0.0-169</value>
</property>
06-11-2016
03:26 AM
I got past the substitution error by adding the hdp.version value to the mapred-site.xml file. I am now getting the error below. Any suggestions?

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
2016-06-11 03:04:34,485 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(767)) - application_1465576751567_0007 State change from FINAL_SAVING to FAILED
2016-06-11 03:04:34,485 WARN resourcemanager.RMAuditLogger (RMAuditLogger.java:logFailure(323)) - USER=pdw_user OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1465576751567_0007 failed 2 times due to AM Container for appattempt_1465576751567_0007_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://sandbox.hortonworks.com:8088/cluster/app/application_1465576751567_0007Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e38_1465576751567_0007_02_000001
Exit code: 1
Exception message: /bin/bash: line 0: fg: no job control
Stack trace: ExitCodeException exitCode=1: /bin/bash: line 0: fg: no job control
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application. APPID=application_1465576751567_0007
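If it helps, the full container logs can also be pulled with the command below (assuming log aggregation is enabled on the sandbox); the application id is the one from the log above.

yarn logs -applicationId application_1465576751567_0007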
06-10-2016
06:17 PM
Ok - I got past the parse error by substituting the version number, and the job was submitted to my sandbox. A different error, shown below, came up this time.

Application application_1465576751567_0001 failed 2 times due to AM Container for appattempt_1465576751567_0001_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://sandbox.hortonworks.com:8088/cluster/app/application_1465576751567_0001Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e38_1465576751567_0001_02_000001
Exit code: 1
Exception message: /hadoop/yarn/local/usercache/pdw_user/appcache/application_1465576751567_0001/container_e38_1465576751567_0001_02_000001/launch_container.sh: line 11: %PWD%;$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure;job.jar/job.jar;job.jar/classes/;job.jar/lib/*;%PWD%/*: bad substitution
06-10-2016
05:45 PM
Artem - after adding the files into the config folder in the Polybase path, I got the error below. I checked the mapred-site.xml file, and the path it is unable to parse is in mapreduce.application.framework.path. Should I go through the file and hardcode the current version everywhere I see ${hdp.version}?

Java exception raised on call to JobSubmitter_SubmitJob: Error [Unable to parse '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a URI, check the setting for mapreduce.application.framework.path] occurred while accessing external file.'
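If hardcoding is the way to go, this is roughly what I have in mind, run wherever the copied conf files can be edited; 2.4.0.0-169 is the version reported by hadoop version on this sandbox, and the sed is only a sketch.

# Confirm the MR framework archive exists at the versioned path in HDFS
hdfs dfs -ls /hdp/apps/2.4.0.0-169/mapreduce/mapreduce.tar.gz
# Find every use of the placeholder in the copied conf files, then substitute the real version
grep -rn 'hdp.version' .
sed -i 's/\${hdp.version}/2.4.0.0-169/g' mapred-site.xml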
06-10-2016
05:24 PM
Tim - I do not see it as a setup issue, because I am making it to the cluster and it is failing while trying to run the job on the cluster. Plus, I can successfully run a query from the SQL Server 2016 install against my ORC table. The SQL 2016 instructions for setting up pushdown only say to copy the classpath value, so there is not a lot there as to the "How-To".
06-10-2016
05:21 PM
I am working on this suggestion.
06-10-2016
02:45 AM
@Artem Ervits can you give me a hint as to what is wrong with my classpath?
06-09-2016
02:32 PM
2 Kudos
I have Polybase set up and configured to work with my HDP Sandbox. I am able to query my ORC table successfully, but when I try to use pushdown the query fails. I set up the pushdown by copying the yarn.application.classpath from the yarn-site.xml in /etc/hadoop/conf on my HDP Sandbox into the yarn-site.xml file located at C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf.

<property>
  <name>yarn.application.classpath</name>
  <value>$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
</property>

When I review the job log within Ambari I see the error below. Can anyone tell me what I am missing in my classpath?

Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
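A couple of sanity checks I can run on the sandbox side; the jar name pattern is an assumption about the HDP layout.

# See what the classpath actually expands to on the cluster
yarn classpath | tr ':' '\n' | grep -i mapreduce
# Confirm the jar that contains MRAppMaster is present under the HDP install
find /usr/hdp/ -name 'hadoop-mapreduce-client-app*.jar'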
Labels:
- Apache YARN
05-22-2016
05:51 AM
After a lot of trial and error, reading documentation, and beers, I was able to figure out that the issue was the name of my pig jar. I renamed it to pig.jar in the lib folder, backed out all changes to my pig-env.sh, and my Oozie job ran and succeeded. Thank you to everyone for your help, and especially to Jitendra for taking the time to give suggestions.
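For anyone hitting the same thing, the rename was roughly the following; the jar name and the workflow lib path are placeholders for my actual ones.

# Placeholders only; point these at your workflow's lib directory in HDFS
hdfs dfs -ls /user/<user>/oozie/Flatten_Tweet2/lib
hdfs dfs -mv /user/<user>/oozie/Flatten_Tweet2/lib/pig-<version>.jar /user/<user>/oozie/Flatten_Tweet2/lib/pig.jar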
05-21-2016
11:56 PM
The job still fails with the same error in the Oozie log.

Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain]
05-21-2016
11:30 PM
I have copied both files from /etc/hadoop/conf to /usr/hdp/2.4.0.0-169/pig/conf and set their permissions to 755 with chmod. Is this correct?
05-21-2016
11:03 PM
I did not get any pig jars when I ran grep -iR core-site.xml /usr/hdp/2.4.0.0-169/pig. The only output was the below. Should I copy the xml files to the pig/conf as you instructed?

/usr/hdp/2.4.0.0-169/pig/doc/api/constant-values.html:<td><code>"core-site.xml"</code></td>
/usr/hdp/2.4.0.0-169/pig/conf/pig.properties:# By default, Pig expects hadoop configs (hadoop-site.xml and core-site.xml)
/usr/hdp/2.4.0.0-169/pig/CHANGES.txt:PIG-4247: S3 properties are not picked up from core-site.xml in local mode (cheolsoo)
/usr/hdp/2.4.0.0-169/pig/CHANGES.txt:PIG-3145: Parameters in core-site.xml and mapred-site.xml are not correctly substituted (cheolsoo)
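If copying is what you meant, this is the step I understand it to be; copying every *-site.xml is a guess at which two files are actually needed.

# Copy the Hadoop client configs into Pig's conf directory and make them readable
cp /etc/hadoop/conf/*-site.xml /usr/hdp/2.4.0.0-169/pig/conf/
chmod 644 /usr/hdp/2.4.0.0-169/pig/conf/*-site.xml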
05-21-2016
10:51 PM
Here is what my new pig-env.sh looks like.

JAVA_HOME=/usr/lib/jvm/java
HADOOP_HOME=${HADOOP_HOME:-/usr/hdp/current/hadoop-client}
export HADOOP_CONF_DIR=$HADOOP_CONF_DIR:/etc/hadoop/conf
export PIG_CLASSPATH=$PIG_CLASSPATH:$HADOOP_CONF_DIR
if [ -d "/usr/lib/tez" ]; then
PIG_OPTS="$PIG_OPTS -Dmapreduce.framework.name=yarn"
fi
05-21-2016
10:46 PM
Here is the output, Jitendra. Do you see anything wrong?

/usr/hdp/2.4.0.0-169/pig/doc/api/constant-values.html:<td><code>"hadoop-site.xml"</code></td>
/usr/hdp/2.4.0.0-169/pig/conf/pig.properties:# By default, Pig expects hadoop configs (hadoop-site.xml and core-site.xml)
/usr/hdp/2.4.0.0-169/pig/CHANGES.txt:PIG-3200: MiniCluster should delete hadoop-site.xml on shutDown (prkommireddi via cheolsoo)
/usr/hdp/2.4.0.0-169/pig/CHANGES.txt:PIG-2491: Pig docs still mention hadoop-site.xml (daijy)
/usr/hdp/2.4.0.0-169/pig/CHANGES.txt:PIG-1791: System property mapred.output.compress, but pig-cluster-hadoop-site.xml doesn't (daijy)
/usr/hdp/2.4.0.0-169/pig/CHANGES.txt:PIG-1186: Pig do not take values in "pig-cluster-hadoop-site.xml" (daijy)
/usr/hdp/2.4.0.0-169/pig/CHANGES.txt: 'pig-cluster-hadoop-site.xml' in the non HOD case just like it does in the
/usr/hdp/2.4.0.0-169/pig/RELEASE_NOTES.txt:variable to point to the directory with your hadoop-site.xml file and then run
05-21-2016
10:28 PM
I am running the HDP Sandbox. My hadoop version command returns the output below, and to my recollection I have not changed anything about the environment.

hadoop version
Hadoop 2.7.1.2.4.0.0-169
Subversion git@github.com:hortonworks/hadoop.git -r 26104d8ac833884c8776473823007f176854f2eb
Compiled by jenkins on 2016-02-10T06:18Z
Compiled with protoc 2.5.0
From source with checksum cf48a4c63aaec76a714c1897e2ba8be6
This command was run using /usr/hdp/2.4.0.0-169/hadoop/hadoop-common-2.7.1.2.4.0.0-169.jar
05-21-2016
09:48 PM
Actually, that is not the issue. When I looked further down in the log I found the error below. I will investigate, but I appreciate any guidance as well.

ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath).
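While I dig into it, these are the quick checks I am starting with; paths are the ones from this sandbox.

# Is the Hadoop conf dir set, and is core-site.xml where I expect it?
echo $HADOOP_CONF_DIR
ls -l /etc/hadoop/conf/core-site.xml
# See whether anything under the Pig install already references core-site.xml
grep -iR core-site.xml /usr/hdp/2.4.0.0-169/pig | head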
05-21-2016
09:29 PM
I took a look at the stderr logs and got the output below. Does this make sense?

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/23/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/44/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Details at logfile: /hadoop/yarn/local/usercache/mbharrel/appcache/application_1463627394303_0038/container_e16_1463627394303_0038_01_000002/pig-job_1463627394303_0038.log
05-21-2016
04:33 AM
I am trying to run an Oozie job that runs a Pig script. My job.properties file is local and my workflow.xml is in HDFS. Below is my properties file.

nameNode=hdfs://192.168.56.104:8020
jobTracker=192.168.56.104:8050
queueName=default
flattenRoot=Flatten_Tweet2
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/${flattenRoot}
oozie.action.sharelib.for.pig=pig,hcatalog,hive

And here is my workflow file.

<workflow-app name='pig-wf' xmlns="uri:oozie:workflow:0.3">
<start to='pig-node'/>
<action name='pig-node'>
<pig>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<script>Flatten_Tweet2.pig</script>
<file>lib/hive-site.xml</file>
</pig>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Pig failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>

I keep getting the below error no matter what I try.

Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]

Any help would be appreciated. Thanks
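For completeness, I am submitting the job roughly like this; the Oozie server URL is the sandbox default and may differ in your environment.

# Run from the directory that holds the local job.properties
oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config job.properties -run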
Tags:
- Data Processing
- Oozie
Labels:
- Apache Oozie
05-19-2016
03:20 AM
Changing the ulimit appears to have fixed that issue. I am now presented with the below issue, but at least it is something different. Thank you, Xi, for the help.
05-02-2016
05:24 PM
I was using IE Edge on my new Windows 10 OS. I just tried it with Chrome and it worked!! Thank you sir.
04-29-2016
01:35 PM
I am working on a Twitter analysis POC and have noticed that Hive queries run in the CLI work fine, but when I run them from the Hive View in Ambari some of them do not return results. When this happens, the Logs tab shows "INFO : Tez session hasn't been created yet. Opening session". I can see the query info within the Tez View, and it shows successful completion, but the results are never returned to the Hive View. I am using HDP Sandbox 2.4.
Labels:
- Apache Hive