Member since
02-06-2018
47
Posts
5
Kudos Received
0
Solutions
03-12-2019
05:02 PM
@Jordan Moore This is what we have done. However, I am not able to see any consumer metrics, only broker metrics. Is there something I am doing wrong?
03-07-2019
06:32 PM
I am using the Confluent HDFS sink connector and would like to know how to expose the connector's consumer metrics through either JMX or a REST API. I checked the following two properties files; however, I don't know how to expose the metrics on a JMX port: 1. connect-standalone.properties 2. consumer.properties
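Those two files configure the worker and its consumers, but the consumer metrics themselves are exposed as JMX MBeans by the worker JVM. A minimal sketch of one way to surface them, assuming the stock Kafka launcher scripts (which honor the JMX_PORT environment variable); the port number and property file names are illustrative:

# kafka-run-class.sh (used by connect-standalone.sh) turns JMX_PORT into
# -Dcom.sun.management.jmxremote.port=... with remote JMX enabled.
export JMX_PORT=9999

# Start the standalone worker; the sink connector's embedded consumers then
# register MBeans such as kafka.consumer:type=consumer-fetch-manager-metrics,
# which jconsole or a JMX exporter can read on port 9999.
bin/connect-standalone.sh connect-standalone.properties hdfs-sink.properties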
Tags:
- Kafka
Labels:
- Apache Kafka
01-24-2019
08:01 PM
Repo Description
I have created a set of NiFi utilities for the following tasks.
Find stopped processor groups: the following script finds processor groups in which no processor is running. find_stopped_processor_groups.py
Find invalid processor groups: the following script finds processor groups that have at least one invalid processor, since a processor group with even a single invalid processor cannot be run. find_invalid_processor_groups.py
Find duplicate processor groups: the following script finds duplicate processor groups based on their names. find_duplicate_processor_groups.py
Find controller services: the following script finds all the database controller services used by NiFi. find_duplicate_processor_groups.py
Instructions on how to install and run are on the repo page.
Repo Info
Github Repo URL: https://github.com/Gaurang033/nifapi
Github account name: Gaurang033
Repo name: nifapi
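A minimal sketch (not the repo's actual scripts) of the NiFi REST calls such utilities boil down to, assuming an unsecured NiFi at localhost:8080 and the standard /nifi-api endpoints:

# Child process groups of the root group; each entry includes runningCount,
# stoppedCount and invalidCount, which is enough to flag stopped or invalid groups.
curl -s http://localhost:8080/nifi-api/process-groups/root/process-groups

# Controller services visible from the root group; filter the output for
# database services such as DBCPConnectionPool.
curl -s http://localhost:8080/nifi-api/flow/process-groups/root/controller-services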
Tags:
- nifi-api
- nifi-controller-service
- nifi-processor
- nifi-templates
- solutions
01-18-2019
10:51 PM
Not exactly. As I mentioned, the state can only be either STARTED or INSTALLED. I want to see whether the service is facing any issue.
01-18-2019
07:34 PM
Hi guys, I am trying to check service status with the Ambari REST API; however, I am not able to find any documentation that explains it in detail. For example, if I hit the following REST URL http://localhost:8080/api/v1/clusters/Sandbox/services/HDFS I get the following output:
  "maintenance_state": "OFF",
  "repository_state": "CURRENT",
  "service_name": "HDFS",
  "state": "STARTED"
},
"alerts_summary": {
  "CRITICAL": 0,
  "MAINTENANCE": 8,
  "OK": 293,
  "UNKNOWN": 4,
  "WARNING": 0
},
However, I am not sure how to interpret this. Should I care about MAINTENANCE, UNKNOWN, and WARNING, or is just checking that nothing is CRITICAL good enough? This is mainly for developers to understand and track how long any service is down.
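A minimal sketch of how such a check could be scripted, assuming default admin/admin credentials (an assumption) and using only the ServiceInfo/state and alerts_summary fields shown above; the pass/fail rule (STARTED and no CRITICAL alerts) is just an illustration:

#!/bin/bash
AMBARI=http://localhost:8080
CLUSTER=Sandbox
SERVICE=HDFS

# Ask Ambari only for the service state and the alert summary.
RESP=$(curl -s -u admin:admin \
  "$AMBARI/api/v1/clusters/$CLUSTER/services/$SERVICE?fields=ServiceInfo/state,alerts_summary")

# Crude text checks: healthy means state STARTED and a CRITICAL alert count of 0.
echo "$RESP" | grep -q '"state"[ ]*:[ ]*"STARTED"' || { echo "$SERVICE is not STARTED"; exit 1; }
echo "$RESP" | grep -q '"CRITICAL"[ ]*:[ ]*0'      || { echo "$SERVICE has CRITICAL alerts"; exit 1; }
echo "$SERVICE looks healthy"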
Labels:
- Labels:
-
Apache Ambari
10-01-2018
07:30 PM
I am trying to understand the Hive query plan for a simple DISTINCT query, and I have a small confusion regarding the output of one of the stages. I have a simple table with just two columns, id and value, and just 4 rows as shown below. Data:
hive> select * from temp.test_distinct;
OK
1 100
2 100
3 100
4 150
Plan:
hive> explain select distinct value from temp.test_distinct;
OK
Plan not optimized by CBO.
Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Stage-0
Fetch Operator
limit:-1
Stage-1
Reducer 2
File Output Operator [FS_6]
compressed:false
Statistics:Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}
Group By Operator [GBY_4]
| keys:KEY._col0 (type: string)
| outputColumnNames:["_col0"]
| Statistics:Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
|<-Map 1 [SIMPLE_EDGE]
Reduce Output Operator [RS_3]
key expressions:_col0 (type: string)
Map-reduce partition columns:_col0 (type: string)
sort order:+
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
Group By Operator [GBY_2]
keys:value (type: string)
outputColumnNames:["_col0"]
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
Select Operator [SEL_1]
outputColumnNames:["value"]
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
TableScan [TS_0]
alias:test_distinct
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
Time taken: 0.181 seconds, Fetched: 35 row(s)
Confusion: the TableScan, Select Operator, and Group By Operator show that they processed 4 rows, which makes sense to me. But shouldn't the next stage after the Group By Operator receive only 2 rows to process, since the group by removes the other rows? In my DAG I can see the output of the mapper is just two rows and not four; however, that doesn't seem to match the plan. Am I looking at it wrong?
Labels:
- Apache Hadoop
- Apache Hive
09-12-2018
09:27 PM
@Venkatesh Kancharla Please open a new question with all the details and logs.
09-07-2018
03:08 PM
Compaction works only on transactional tables, and to make a table transactional it should meet the following properties:
- It should be an ORC table
- It should be bucketed
- It should be a managed table
So, as you can see, you can't run compaction on a non-transactional table; if you do it from Hive you will definitely get an error, though I am not sure about Spark.
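A minimal sketch of those requirements in practice, assuming ACID transactions are already enabled on the cluster; the table name and columns are made up for illustration:

hive -e "
CREATE TABLE demo_txn (id INT, value STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
-- Only legal on a transactional table like the one above.
ALTER TABLE demo_txn COMPACT 'major';
SHOW COMPACTIONS;
"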
09-05-2018
05:54 PM
You are not getting the desired result because your compaction has failed. Please check the YARN logs to understand what might have gone wrong.
08-13-2018
07:10 PM
I found the solution, so I am posting it here. The problem I was having was that I was just stopping the Docker container after changing the command I use to start the HDP image; I didn't realize I needed to remove the container as well. The following steps helped.
Save the Docker work: docker commit <hdp_container_id> <hdp_container_id>
Stop and remove the container:
docker stop <hdp_container_id>
docker rm <hdp_container_id>
Open port 9083 (the Hive metastore) by modifying start-sandbox-hdp-standalone_2-6-4.sh:
#!/bin/bash
echo "Waiting for docker daemon to start up:"
until docker ps 2>&1| grep STATUS>/dev/null; do sleep 1; done; >/dev/null
docker ps -a | grep sandbox-hdp
if [ $? -eq 0 ]; then
docker start sandbox-hdp
else
docker pull hortonworks/sandbox-hdp-standalone:2.6.4
docker run --name sandbox-hdp --hostname "sandbox-hdp.hortonworks.com" --privileged -d \
-p 9083:9083 \
Start Docker: ./start-sandbox-hdp-standalone_2-6-4.sh
08-10-2018
02:15 PM
@Sandeep Nemuri How do I check whether the metastore is up and reachable from my local machine? I logged into the Docker container and I can telnet to port 9083 there. However, if I try to do that from my local machine, it doesn't work. The one thing I realized is that the port is not exposed in the Docker image or mentioned anywhere on the webpage: https://hortonworks.com/tutorial/hortonworks-sandbox-guide/section/3/ I exposed the port and restarted the Docker container; however, I am still not able to connect to that port using telnet from my local machine or from the Presto server (which is also on my local machine).
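A minimal sketch of checking the mapping and reachability from the host, assuming the container is named sandbox-hdp as in the start script from this thread:

# Does the container actually publish 9083 to the host?
docker port sandbox-hdp 9083
docker ps --format '{{.Names}}: {{.Ports}}' | grep sandbox-hdp
# Is anything answering on the host side? (-z only tests that the connection opens)
nc -vz localhost 9083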
08-10-2018
12:59 AM
I am trying to connect to the Hive metastore of my HDP sandbox (from Presto); however, it's throwing the following error.
Hive catalog:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://sandbox-hdp.hortonworks.com:9083
hive.metastore.authentication.type=NONE
Error: Query 20180810_005352_00000_umgac failed: Failed connecting to Hive metastore: [sandbox-hdp.hortonworks.com:9083]
I tried using the following values for hive.metastore.uri; however, I am getting the same error:
thrift://localhost:9083
thrift://127.0.0.1:9083
thrift://<IP of docker container>:9083
thrift://<IP of local machine>:9083
Labels:
- Apache Hive
07-24-2018
07:17 PM
I am trying to connect to HBase using the Java HBase REST client; however, it's giving the following error. The HBase REST server uses Kerberos authentication, so I have created a Kerberos ticket and am trying to authenticate using that ticket. Code:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.rest.client.Client;
import org.apache.hadoop.hbase.rest.client.Cluster;
import org.apache.hadoop.hbase.rest.client.RemoteAdmin;
import org.apache.hadoop.security.UserGroupInformation;

public class RestExample {
public static void main(String[] args) throws IOException {
Configuration conf = HBaseConfiguration.create();
UserGroupInformation.setConfiguration(conf);
String projectDir = System.getProperty("user.dir");
System.out.println(projectDir);
UserGroupInformation.loginUserFromKeytab("gaurang.shah@mydomain.com", projectDir+"/gaurang.shah.keytab");
// vv RestExample
Cluster cluster = new Cluster();
cluster.add("hbase_host.mydomain.com", 17000);
Client client = new Client(cluster);
TableName tableName = TableName.valueOf("bda:aaa");
RemoteAdmin remoteAdmin = new RemoteAdmin(client, conf);
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
remoteAdmin.createTable(tableDesc);
}
} StackTrace: Exception in thread "main" java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: http://hbase_host.mydomain.com:17000/bda:aaa/schema?user.name=gaurang.shah, status: 403, message: Forbidden
at org.apache.hadoop.hbase.rest.client.Client.negotiate(Client.java:285)
at org.apache.hadoop.hbase.rest.client.Client.executeURI(Client.java:239)
at org.apache.hadoop.hbase.rest.client.Client.executePathOnly(Client.java:204)
at org.apache.hadoop.hbase.rest.client.Client.execute(Client.java:265)
at org.apache.hadoop.hbase.rest.client.Client.put(Client.java:557)
at org.apache.hadoop.hbase.rest.client.Client.put(Client.java:504)
at org.apache.hadoop.hbase.rest.client.Client.put(Client.java:474)
at org.apache.hadoop.hbase.rest.client.RemoteAdmin.createTable(RemoteAdmin.java:294)
at ca.cantire.RestExample.main(RestExample.java:42)
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: http://hbase_host.mydomain.com:17000/bda:aaa/schema?user.name=gaurang.shah, status: 403, message: Forbidden
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:281)
at org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:77)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:212)
at org.apache.hadoop.hbase.rest.client.Client.negotiate(Client.java:280)
... 8 more
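Side note: a minimal sketch of verifying SPNEGO access to the same REST endpoint outside of Java, assuming the keytab from the code above and a curl build with GSS support; a 200 response would confirm that the ticket and the endpoint's Kerberos setup line up.

# Get a ticket from the keytab, then let curl do the SPNEGO handshake.
kinit -kt gaurang.shah.keytab gaurang.shah@mydomain.com
curl --negotiate -u : -v "http://hbase_host.mydomain.com:17000/version/cluster"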
Labels:
- Apache HBase
07-11-2018
02:26 PM
Hi guys,
I am trying to load data files into my Hive table and facing an issue: if the files are located on the local filesystem it doesn't work, but if I move the file to HDFS then it works without any issue. The following command is not working in beeline; however, it works perfectly in the hive CLI:
load data local inpath '/home/gaurang.shah/test.json' into table temp.test;
The data is located on the node where one of the HiveServer2 instances is running, and I have given it all the permissions as well.
[gaurang.shah@aa ~] pwd
/home/gaurang.shah
[gaurang.shah@aa ~]$ ll test.json
-rwxrwxrwx 1 gaurang.shah domain users 56 Jul 11 13:54 test.json
Labels:
- Apache Hive
06-08-2018
02:40 PM
Hi guys, I am thinking of using the HBase REST API to interact with the HBase REST server. Would someone please let me know if there is a REST client available for the HBase REST server? I found the following two for Python; if someone has used either of them, would you please share the experience? https://github.com/barseghyanartur/starbase https://github.com/wbolster/happybase
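For reference, a minimal sketch of talking to the HBase REST server directly with curl (no client library), assuming an unsecured REST server; host, port, and table name are illustrative:

# Cluster version and status.
curl -s -H "Accept: application/json" http://hbase-rest-host:8080/version/cluster
curl -s -H "Accept: application/json" http://hbase-rest-host:8080/status/cluster
# Schema of an existing table.
curl -s -H "Accept: application/json" http://hbase-rest-host:8080/my_table/schema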
Tags:
- Data Processing
- HBase
Labels:
- Apache HBase
06-05-2018
08:01 PM
It's throwing the following error if I have multiple column families in my HBase table. Does this approach work only for a single column family?
java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Multiple family directories found in hdfs://hadoopdev/apps/hive/warehouse/temp.db/employee_details/_temporary/0/_temporary/attempt_1527799542731_1180_r_000000_0
04-17-2018
08:21 PM
@Naresh P R I am using the following repo and it resolved my issue: http://repo.hortonworks.com/content/groups/public/
04-16-2018
02:20 PM
@Naresh P R Thanks, now I can resolve the dependency; however, I am not able to compile the code. I am getting the following error. I tried to add the dependency mentioned; however, it's not helping either. [ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.2-beta-5:single (default-cli) on project hive-normalize: Failed to create assembly: Failed to resolve dependencies for project: ca.abc:hive-normalize:jar:1.0: Missing:
[ERROR] ----------
[ERROR] 1) org.mortbay.jetty:jetty-util:jar:6.1.26.hwx
[ERROR]
[ERROR] Try downloading the file manually from the project website.
[ERROR]
[ERROR] Then, install it using the command:
[ERROR] mvn install:install-file -DgroupId=org.mortbay.jetty -DartifactId=jetty-util -Dversion=6.1.26.hwx -Dpackaging=jar -Dfile=/path/to/file
[ERROR]
[ERROR] Alternatively, if you host your own repository you can deploy the file there:
[ERROR] mvn deploy:deploy-file -DgroupId=org.mortbay.jetty -DartifactId=jetty-util -Dversion=6.1.26.hwx -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR]
[ERROR] Path to dependency:
[ERROR] 1) ca.abc:hive-normalize:jar:1.0
[ERROR] 2) org.apache.hive:hive-exec:jar:1.2.1000.2.6.4.0-91
[ERROR] 3) org.apache.hive:hive-shims:jar:1.2.1000.2.6.4.0-91
[ERROR] 4) org.apache.hive.shims:hive-shims-0.23:jar:1.2.1000.2.6.4.0-91
[ERROR] 5) org.apache.hadoop:hadoop-hdfs:jar:2.7.3.2.6.4.0-91
[ERROR] 6) org.mortbay.jetty:jetty-util:jar:6.1.26.hwx
[ERROR]
[ERROR] 2) org.mortbay.jetty:jetty:jar:6.1.26.hwx
[ERROR]
[ERROR] Try downloading the file manually from the project website.
[ERROR]
[ERROR] Then, install it using the command:
[ERROR] mvn install:install-file -DgroupId=org.mortbay.jetty -DartifactId=jetty -Dversion=6.1.26.hwx -Dpackaging=jar -Dfile=/path/to/file
[ERROR]
[ERROR] Alternatively, if you host your own repository you can deploy the file there:
[ERROR] mvn deploy:deploy-file -DgroupId=org.mortbay.jetty -DartifactId=jetty -Dversion=6.1.26.hwx -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR]
[ERROR] Path to dependency:
[ERROR] 1) ca.abc:hive-normalize:jar:1.0
[ERROR] 2) org.apache.hive:hive-exec:jar:1.2.1000.2.6.4.0-91
[ERROR] 3) org.apache.hive:hive-shims:jar:1.2.1000.2.6.4.0-91
[ERROR] 4) org.apache.hive.shims:hive-shims-0.23:jar:1.2.1000.2.6.4.0-91
[ERROR] 5) org.apache.hadoop:hadoop-hdfs:jar:2.7.3.2.6.4.0-91
[ERROR] 6) org.mortbay.jetty:jetty:jar:6.1.26.hwx
[ERROR]
[ERROR] 3) org.mortbay.jetty:jetty-sslengine:jar:6.1.26.hwx
[ERROR]
[ERROR] Try downloading the file manually from the project website.
[ERROR]
[ERROR] Then, install it using the command:
[ERROR] mvn install:install-file -DgroupId=org.mortbay.jetty -DartifactId=jetty-sslengine -Dversion=6.1.26.hwx -Dpackaging=jar -Dfile=/path/to/file
[ERROR]
[ERROR] Alternatively, if you host your own repository you can deploy the file there:
[ERROR] mvn deploy:deploy-file -DgroupId=org.mortbay.jetty -DartifactId=jetty-sslengine -Dversion=6.1.26.hwx -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR]
[ERROR] Path to dependency:
[ERROR] 1) ca.abc:hive-normalize:jar:1.0
[ERROR] 2) org.apache.hive:hive-exec:jar:1.2.1000.2.6.4.0-91
[ERROR] 3) org.apache.hive:hive-shims:jar:1.2.1000.2.6.4.0-91
[ERROR] 4) org.apache.hive.shims:hive-shims-0.23:jar:1.2.1000.2.6.4.0-91
[ERROR] 5) org.apache.hadoop:hadoop-yarn-server-resourcemanager:jar:2.7.3.2.6.4.0-91
[ERROR] 6) org.apache.hadoop:hadoop-yarn-server-common:jar:2.7.3.2.6.4.0-91
[ERROR] 7) org.apache.hadoop:hadoop-yarn-registry:jar:2.7.3.2.6.4.0-91
[ERROR] 8) org.apache.hadoop:hadoop-common:jar:2.7.3.2.6.4.0-91
[ERROR] 9) org.mortbay.jetty:jetty-sslengine:jar:6.1.26.hwx
[ERROR]
[ERROR] ----------
[ERROR] 3 required artifacts are missing.
[ERROR]
[ERROR] for artifact:
[ERROR] ca.abc:hive-normalize:jar:1.0
[ERROR]
[ERROR] from the specified remote repositories:
[ERROR] hortonworks.extrepo (http://repo.hortonworks.com/content/repositories/releases, releases=true, snapshots=true),
[ERROR] central (https://repo.maven.apache.org/maven2, releases=true, snapshots=false)
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
04-12-2018
07:41 PM
Hi guys,
Could someone let me know how they decide which hive-exec jar version is compatible with their environment?
Here are the two approaches I have taken; the first one is failing and the second one passes. However, the second approach is really messy and I would like to make the first approach work somehow.
First approach - not working:
- Use a Maven project to compile and build a single (fat) jar.
- Check which version of hive-exec.jar Hadoop is using from the following directory:
/usr/hdp/current/hive-client/lib/hive-exec.jar -> hive-exec-1.2.1000.2.6.4.0-91.jar
- Use the matching version as a Maven dependency:
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>1.2.1</version>
</dependency>
Second approach - working:
- Create a simple Java project (not Maven).
- Copy the JAR from Hadoop and add it to the classpath.
- Compile and create the class files.
- Create a thin jar (without any dependencies) and provide the classpath in the manifest.mf file:
Class-path: /usr/hdp/current/hive-client/lib/hive-exec.jar
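A minimal sketch of how the first approach could be made to line up with the cluster, assuming the HDP layout from this post and the Hortonworks public repo mentioned elsewhere in this feed; the mvn dependency:get call just verifies the coordinate resolves:

# The symlink target carries the full HDP build version.
ls -l /usr/hdp/current/hive-client/lib/hive-exec.jar
# -> hive-exec-1.2.1000.2.6.4.0-91.jar, so the Maven coordinate to match would be
#    org.apache.hive:hive-exec:1.2.1000.2.6.4.0-91, resolvable from the Hortonworks repo.
mvn dependency:get -Dartifact=org.apache.hive:hive-exec:1.2.1000.2.6.4.0-91 \
  -DremoteRepositories=http://repo.hortonworks.com/content/groups/public/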
Labels:
- Labels:
-
Apache Hive
04-12-2018
05:20 AM
I am learning how to create a custom UDF in Hive. I have created a custom UDF which does nothing; it just returns the given text as it is. I am able to load this JAR in Hive without any issue, I am also able to create a function from this jar, and I am able to execute this function. Problem: if this jar is added, I can't load one table/HDFS file from another table; the following simple query fails.
insert into demo1 select * from demo;
Stacktrace:
Vertex failed, vertexName=Map 1, vertexId=vertex_1523501275422_0010_3_00, diagnostics=[Task failed, taskId=task_1523501275422_0010_3_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_1523501275422_0010_01_000007 finished with diagnostics set to [Container completed. ]], TaskAttempt 1 failed, info=[Container container_1523501275422_0010_01_000008 finished with diagnostics set to [Container completed. ]], TaskAttempt 2 failed, info=[Container container_1523501275422_0010_01_000009 finished with diagnostics set to [Container completed. ]], TaskAttempt 3 failed, info=[Container container_1523501275422_0010_01_000010 finished with diagnostics set to [Container completed. ]]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1523501275422_0010_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
Please note that both tables have the same structure. If I remove the jar, then I can execute the above query without any issue. Code (ca.abc.demo):
package ca.abc;

import org.apache.hadoop.hive.ql.exec.UDF;

public class demo extends UDF {
public String evaluate(String s) {
if (s == null) {
return null;
}else{
return s;
}
}
}
pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>ca.cantire</groupId>
  <artifactId>hive-normalize</artifactId>
  <version>1.0</version>
  <dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.hive/hive-exec -->
    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-exec</artifactId>
      <version>2.3.3</version>
    </dependency>
  </dependencies>
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>2.8</version>
        </plugin>
        <plugin>
          <artifactId>maven-assembly-plugin</artifactId>
          <configuration>
            <archive>
              <manifest>
                <mainClass>ca.cantire.demo</mainClass>
              </manifest>
            </archive>
            <descriptorRefs>
              <descriptorRef>jar-with-dependencies</descriptorRef>
            </descriptorRefs>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>
Labels:
- Apache Hive
04-12-2018
02:33 AM
Hi guys, is there any way to normalize Hive UTF-8 data? I am talking about NFC normalization. Currently I have written a custom UDF that does that; however, I wanted to know if there is a better, easier way to do it for a whole table/HDFS file.
Tags:
- Data Processing
- Hive
Labels:
- Apache Hive
04-11-2018
03:51 PM
Hi guys, I have written a custom UDF which works fine if I run it in a SELECT query. However, if I try to export data to HDFS using a SELECT query that contains the custom UDF, it fails. The following query gets executed successfully:
CREATE temporary FUNCTION normalize as 'ca.test.Normalize' USING JAR 'hdfs://hadoopdev/tmp/udf/normalizer-1.0-jar-with-dependencies.jar';
select normalize(test_desc, 'NFC') from temp.test_special_char;
The following query fails:
CREATE temporary FUNCTION normalize as 'ca.test.Normalize' USING JAR 'hdfs://hadoopdev/tmp/udf/normalizer-1.0-jar-with-dependencies.jar';
INSERT OVERWRITE DIRECTORY '/tmp/test_special_char/'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
NULL DEFINED AS ''
select normalize(test_desc, 'NFC') from temp.test_special_char;
StackTrace:
INFO : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph.
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1519224124029_102976 failed 2 times due to AM Container for appattempt_1519224124029_102976_000002 exited with exitCode: 255
For more detailed output, check the application tracking page: http://abc:8088/cluster/app/application_1519224124029_102976 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e71_1519224124029_102976_02_000001
Exit code: 255
Stack trace: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:109)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Shell output: main : command provided 1
main : run as user is gaurang.shah
main : requested yarn user is gaurang.shah
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /data/10/hadoop/yarn/local/nmPrivate/application_1519224124029_102976/container_e71_1519224124029_102976_02_000001/container_e71_1519224124029_102976_02_000001.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
Container exited with a non-zero exit code 255
Failing this attempt. Failing the application.
at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:779)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:217)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:272)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:152)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1745)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1491)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1289)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:253)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:264)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=1)
Labels:
- Apache Hive
04-11-2018
01:06 PM
@bkosaraju The HDFS file has UTF-8 encoding and the Netezza table also has UTF-8 encoding; the problem is with NFC (normalization).
04-10-2018
12:45 AM
I am trying to export data from HDFS to Netezza, and a few French characters are giving me trouble. The only related post I found on the internet is the following: http://grokbase.com/t/sqoop/user/137gtanzx8/sqoop-utf-8-data-load-issue However, the problem is I am not sure which configuration file he is talking about. Would someone please let me know in which configuration file I need to provide the connection encoding?
Labels:
- Apache Hadoop
- Apache Sqoop
03-19-2018
06:16 PM
The issue was with the \n character at the end of the file. It works perfectly without issue on Netezza; however, it creates an issue on SQL Server. The following command, used to create a new password file, resolved the issue:
tr -d '\n' < sqlserver_password.pass > sqlserver.pass
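For completeness, a minimal sketch of producing a newline-free password file from scratch and locking it down on HDFS; the HDFS path is the one used in this thread, and the chmod is a general precaution rather than something required here:

# echo -n writes the password without a trailing newline.
echo -n 'MySecretPassword' > sqlserver.pass
# Put it where --password-file expects it and restrict access.
hdfs dfs -put -f sqlserver.pass /user/gaurang.shah/sqlserver.pass
hdfs dfs -chmod 400 /user/gaurang.shah/sqlserver.pass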
03-19-2018
03:05 PM
Hi guys, I am trying to export data from HDFS to SQL Server; it works fine if I provide the password as an argument. However, if I provide a password file, then it fails. The password file has only a single line, the password only, with no newline or special character at the end of the line. The same thing works for Netezza; it fails only for SQL Server.
Sqoop version: 1.4.6.2.5.3.0-37
Driver jar: sqljdbc4-2.0.jar
sqoop export --connect "jdbc:sqlserver://abc.com:58850;databaseName=IKB_PROD;schema=dbo;" --table "SQOOP_TEST_SMALL" --export-dir /tmp/SQOOP_TEST_SMALL_20180101_010101 --username HADOOP_USR --password-file /user/gaurang.shah/sqlserver_password.pass --verbose
Warning: /usr/hdp/2.5.3.0-37/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/03/19 14:30:17 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.3.0-37
18/03/19 14:30:17 DEBUG tool.BaseSqoopTool: Enabled debug logging.
18/03/19 14:30:17 DEBUG password.FilePasswordLoader: Fetching password from specified path: /user/gaurang.shah/sqlserver_password.pass
18/03/19 14:30:18 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
18/03/19 14:30:18 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
18/03/19 14:30:18 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
18/03/19 14:30:18 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
18/03/19 14:30:18 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
18/03/19 14:30:18 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:sqlserver:
18/03/19 14:30:18 INFO manager.SqlManager: Using default fetchSize of 1000
18/03/19 14:30:18 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.SQLServerManager@6f0628de
18/03/19 14:30:18 INFO tool.CodeGenTool: Beginning code generation
18/03/19 14:30:18 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM [SQOOP_TEST_SMALL] AS t WHERE 1=0
18/03/19 14:30:18 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
18/03/19 14:30:18 ERROR manager.SqlManager: Error executing statement: com.microsoft.sqlserver.jdbc.SQLServerException: Login failed for user 'HADOOP_USR'.
com.microsoft.sqlserver.jdbc.SQLServerException: Login failed for user 'HADOOP_USR'.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:196)
at com.microsoft.sqlserver.jdbc.TDSTokenHandler.onEOF(tdsparser.java:246)
Labels:
- Apache Sqoop
03-14-2018
11:15 PM
1 Kudo
Hi guys, I am considering Sqoop to import/export data between RDBMS and HDFS. I found the following issues with Sqoop:
- It still uses MapReduce as its execution engine, which is slowly dying.
- Tuning the number of mappers to speed up execution is a tiresome process; finding a column that can be evenly distributed is not easy when you don't have a primary key (Netezza) or when it's a combination of two columns.
- HCatalog is not supported in Sqoop version 2.
- Sqoop 2 is deprecated by Cloudera; is something similar going to happen with Hortonworks as well? https://www.cloudera.com/documentation/enterprise/5-6-x/topics/cdh_ig_sqoop_vs_sqoop2.html
Could someone also tell me the roadmap of Sqoop? Should I consider writing a Spark script which does the import/export? Would it be faster?
Labels:
- Apache Sqoop
03-07-2018
05:32 AM
@Rahul Soni Thanks for the great explanation. Just a quick question: is there a way I can modify that value, in case I need to restart the flow from some specific point?
03-06-2018
03:58 AM
1 Kudo
@Constantin Stanca Could you please explain the approach in detail?
03-05-2018
05:17 PM
1 Kudo
@Constantin Stanca Yes, "no activity" for me means 0 records. Could you please explain this approach in detail?