Member since: 04-01-2019
Posts: 20
Kudos Received: 0
Solutions: 0
02-24-2020
07:21 AM
I am trying to load data into an HBase table, but it is throwing this error:
Caused by: <line 2, column 57> pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.pig.backend.hadoop.hbase.HbaseStorage using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1339)
at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1324)
at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5184)
at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7782)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
commands executed:
register /usr/hdp/2.6.4.63-2/hbase/lib/hbase-*.jar
rawd = LOAD '/user/bdauser/hbasedata.txt' USING PigStorage(',') AS (product:charArray,type:charArray);
grunt> STORE rawd into 'hbase://odsvivonext:pighbasetest' USING apache.pig.backend.hadoop.hbase.HbaseStorage('hbasedata:product,hbasedata:type');
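For reference, the class name in that STORE clause looks like the likely cause: Pig's built-in HBase storer is org.apache.pig.backend.hadoop.hbase.HBaseStorage (note the capital B), while the statement and the unresolved name in the error both use HbaseStorage with a lowercase b, and the statement is also missing the org. prefix. A corrected statement would look roughly like this (table name, column families, and delimiter kept exactly as in the question):
STORE rawd INTO 'hbase://odsvivonext:pighbasetest' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('hbasedata:product,hbasedata:type');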
02-18-2020
01:59 AM
Can you share the full exception? Please share the job.properties and workflow.xml
02-14-2020
12:29 AM
I have copied hbase-common.jar into the Oozie sharelib paths /user/oozie/share/lib/java and /user/oozie/share/lib/pig, but Oozie does not pick the jar up and throws an HBase class-not-found error. However, when I copy the jars to /usr/hdp/current/oozie/libext/, /usr/hdp/current/oozie/lib, and /usr/hdp/current/oozie/oozie-server/webapps/oozie/WEB-INF/lib/, Oozie picks them up and the job completes successfully.
I am confused about why Oozie cannot pick up the jar when it is present in the HDFS sharelib, yet the job succeeds when I copy it into WEB-INF and libext on the local Oozie server.
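One common reason for this (an assumption here, since the workflow.xml is not shown) is that Oozie serves the sharelib from a timestamped lib_<timestamp> directory and caches its contents, so jars copied directly under /user/oozie/share/lib/pig are not seen until the sharelib is refreshed. A rough sketch of the usual checks with the oozie CLI (the host and port are placeholders):
#List the jars Oozie currently sees in the pig sharelib
oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist pig
#After copying the jar into the active lib_<timestamp> directory, ask Oozie to rescan the sharelib
oozie admin -oozie http://<oozie-host>:11000/oozie -sharelibupdate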
01-30-2020
02:08 AM
I have a streaming application ingesting data into Kafka in an HDP cluster, but I do not understand how to manage the schema; I do not want to send the schema with every record. How can I manage this with a schema registry in an HDP cluster without installing HDF?
01-07-2020
05:13 AM
I want to read HBase tables, perform transformations over the data, and store the final result into a Kafka topic using a Spark Streaming job. I am following the procedure below to achieve this.
I am using newAPIHadoopRDD to read the HBase table from the Spark Streaming job and run transformations over the data. In this step I load the data into an RDD, but I also want to register a schema with Kafka, as my final destination is insertion of the records into Hive.
Basically, I am following these steps:
1. Read the HBase tables and load the data into a Spark RDD.
2. Perform the transformations.
3. Load the transformed data into a Kafka topic.
I run all of the above steps in a Spark Streaming job. For step 1, newAPIHadoopRDD reads the data; for the transformations I use Spark functions plus custom functions; and for the final load into Kafka the Spark job acts as the producer, using the Kafka APIs (see the sketch below). But I am not sure how to register the HBase schema with Kafka.
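A minimal sketch of that flow in Scala (an illustration only: the table name tab, column families cf1/cf2, the broker address, and the topic name mytopic are placeholders, and the payload is sent as a plain string with no schema registration):
import java.util.Properties
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("hbase-to-kafka").getOrCreate()

// 1. Read the HBase table as an RDD of (rowkey, Result) via newAPIHadoopRDD
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "tab")
val hbaseRdd = spark.sparkContext.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

// 2. Transform: pull two example columns out of each row and build a CSV payload
val records = hbaseRdd.map { case (_, result) =>
  val fname = Bytes.toString(result.getValue(Bytes.toBytes("cf1"), Bytes.toBytes("fname")))
  val lname = Bytes.toString(result.getValue(Bytes.toBytes("cf2"), Bytes.toBytes("lname")))
  s"$fname,$lname"
}

// 3. Load into Kafka: one producer per partition, the Spark job acting as the producer
records.foreachPartition { partition =>
  val props = new Properties()
  props.put("bootstrap.servers", "broker1:6667")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  val producer = new KafkaProducer[String, String](props)
  partition.foreach(value => producer.send(new ProducerRecord[String, String]("mytopic", value)))
  producer.close()
}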
01-07-2020
02:56 AM
Hello,
1. ORC files can be created from Avro, but not directly. This can be done in two steps (see the example commands at the end of this post):
a. Convert the Avro into JSON format using the avro-tools jar on the command line.
b. Convert the JSON file into ORC using the orc-tools jar (introduced in ORC 1.4). [See: https://orc.apache.org/news/2017/05/08/ORC-1.4.0/]
2. Through Hive tables - Yes, we can accomplish this by creating a new table with ORC storage format and inserting data from the table that holds the data in Avro format. [In the example below, test2 stores the data in ORC format and test1 in Avro.]
CREATE TABLE test2
(col1 string,
col2 string)
STORED AS ORC;
INSERT INTO test2
SELECT * FROM test1;
Thanks!
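P.S. For step 1, the command-line conversion looks roughly like this (the jar versions are illustrative and the exact orc-tools flags can vary between releases):
#Step 1a: Avro -> JSON with avro-tools
java -jar avro-tools-1.8.2.jar tojson data.avro > data.json
#Step 1b: JSON -> ORC with orc-tools (the convert command was added in ORC 1.4)
java -jar orc-tools-1.4.0-uber.jar convert data.json -o data.orc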
12-13-2019
02:20 AM
@warrior Can you elaborate on your setup? Are you running HDP or CDH? How many distinct REALMs do you have (please share them after tokenizing your real REALM names)? @KuldeepK has well documented how to set up cross-realm trust between two MIT KDCs; this involves configuring the realm mapping and editing your krb5.conf, capaths, etc. That is how the entries should look on the edge node and on all hosts in the other cluster, including the data nodes. First read and try to understand the logic in the mapping, then you can use the AD and the DN REALMs. Let me know if you still need help after perusing that document. HTH
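For illustration only, a cross-realm fragment in krb5.conf usually looks like this (the REALM and domain names below are placeholders, not taken from your setup; "." marks a direct trust path):
[capaths]
  CLUSTER1.EXAMPLE.COM = {
    CLUSTER2.EXAMPLE.COM = .
  }
[domain_realm]
  .cluster2.example.com = CLUSTER2.EXAMPLE.COM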
10-11-2019
02:57 AM
I am confused by the HBase snapshot cloning mechanism. Cloning a snapshot in HBase allows creating a new table from an existing snapshot taken of some other table, yet modifications to the newly created table have no impact on either the snapshot or the original table the snapshot was created from. Below is an example of what I am trying to understand about this behavior.
Suppose we have an HBase table tab with two cells: cf1:fname = 'Anurag' and cf2:lname = 'Mishra'. With this structure I create a snapshot of the table named tabsnapshot. Since the clone operation allows creating a new table, assume I create one named newtab. Now newtab exists with the same structure and data as tab, and when I update a record in newtab it makes no difference to the original table or to the snapshot.
How is this possible? There is no data movement, yet an update on the new table does not update the data of the original table; it only updates itself, with no disturbance to the original table. Functionally this should indeed be the only acceptable behavior (a table created from a snapshot should not disturb the other table), but since the snapshot operation does not involve any data copy or movement and does not affect the region server, how does an update on the new table make no difference to the original HBase table?
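For reference, the sequence described above corresponds to these hbase shell commands (table and snapshot names taken from the question):
# Take a snapshot of 'tab' and clone it into a new table 'newtab'
snapshot 'tab', 'tabsnapshot'
clone_snapshot 'tabsnapshot', 'newtab'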
09-03-2019
11:03 AM
Hi,
The properties mentioned in the question are correct. Also, as shown below, you can set a cleaner interval and a maximum age for how long the logs should be kept.
spark.history.fs.cleaner.enabled=true
spark.history.fs.cleaner.interval=1d
spark.history.fs.cleaner.maxAge=5d
Thanks,
AKR
04-22-2019
03:37 PM
Hi @warrior, Could you please paste the documentation link you were following to install Cloudera Manager? Here is the latest doc for your information: https://www.cloudera.com/documentation/enterprise/latest/topics/install_cm_cdh.html
Also, did you install the JDBC driver on the CM host and all the other hosts that require database access, as this doc mentions? https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ig_mysql.html#cmig_topic_5_5_3
Thanks, Li
04-01-2019
07:03 AM
Hi @warrior, Can you show us your configuration file? Regards, Manu.
09-06-2018
12:27 PM
I am getting an error when I try to access the Oozie Workflow Manager view:
java.lang.RuntimeException: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1506)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:3036)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:489)
at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:218)
at org.apache.ambari.server.view.ViewURLStreamProvider.getHttpURLConnection(ViewURLStreamProvider.java:239)
at org.apache.ambari.server.view.ViewURLStreamProvider.getInputStream(ViewURLStreamProvider.java:216)
at org.apache.ambari.server.view.ViewURLStreamProvider.readFrom(ViewURLStreamProvider.java:103)
at org.apache.ambari.server.view.ViewURLStreamProvider.readAs(ViewURLStreamProvider.java:117)
at org.apache.ambari.server.view.ViewURLStreamProvider.readAsCurrent(ViewURLStreamProvider.java:131)
at org.apache.oozie.ambari.view.AmbariIOUtil.readFromUrl(AmbariIOUtil.java:45)
at org.apache.oozie.ambari.view.OozieDelegate.readFromOozie(OozieDelegate.java:152)
at org.apache.oozie.ambari.view.OozieDelegate.consumeService(OozieDelegate.java:118)
at org.apache.oozie.ambari.view.OozieDelegate.consumeService(OozieDelegate.java:111)
at org.apache.oozie.ambari.view.OozieProxyImpersonator.handleGet(OozieProxyImpersonator.java:453)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) at
I have set the properties below in custom-oozie:
oozie.service.ProxyUserService.proxyuser.ambari-server-AADSPRD.hosts=*
oozie.service.ProxyUserService.proxyuser.ambari-server-AADSPRD.groups=*
oozie.authentication.type=kerberos
I followed the link below to resolve this, but I am still facing the issue: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-views/content/wfm-kerberos-setup.html
03-17-2018
02:48 AM
@Shyam Mishra, You can also do it from the Ambari GUI, but it would be a tedious task if you have many nodes. Here is how you can do it from Ambari:
Click on Hosts in the top toolbar in Ambari.
Select the host where you want to install the client.
Click on the Add button in the Components tab.
Select Spark client / Spark2 client.
Please refer to the screenshot. If the client is already installed on the node, you will not see the Spark client in the Add menu. If this helped resolve the issue, please accept the answer by clicking the Accept button. This will be helpful for other community users.
-Aditya
01-05-2018
05:47 AM
@Shyam Mishra Are you sure that all your cluster nodes and the Ambari server are on the same network, and that their IP address and hostname mappings are correct? I see that your NameNode address is 10.128.0.2 but your Ambari server address is in a completely different range, 104.197.146.171. Is that intentional, or is it mapped to an incorrect address by mistake? Also, please check that the FQDN of every host, including the ambari-server, is correct and is being resolved properly from the other agent machines:
# hostname -f
# cat /etc/hosts
Please make sure that the /etc/hosts file mapping on all hosts is correct (i.e., has the correct IP address and hostname combination) and that the hostname (FQDN) is correct (hostname -f), as we expect.
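For illustration, an /etc/hosts entry on each node would look something like this (the IP addresses and FQDNs below are placeholders, not your real values):
10.128.0.2   namenode1.example.internal   namenode1
10.128.0.3   ambari.example.internal      ambari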
11-21-2017
04:58 PM
@Shyam Mishra Thanks for the further details. Can you please check whether your web server is up and running on your GCP compute engine instance? By default it does not come with the package installed, so you have to install it separately, using yum on Red Hat/CentOS flavors and apt-get on Debian flavors. The steps for CentOS/Red Hat machines are below:
#To check the installed package
yum list installed httpd
#To install http web server
yum install httpd -y
#To start the httpd webserver service
systemctl start httpd.service
#Another way to start the httpd service
service httpd start
#Once service is up and running restart your ambari server
ambari-server restart
#Now check the service status
ambari-server status
Once these steps are executed you can check your Ambari portal link and let us know the status of the problem.
Please note: You will not find the iptables service installed on your machine unless you have explicitly installed it on your GCP machine.
Regards,
SKS