Member since: 06-22-2017
Posts: 10
Kudos Received: 0
Solutions: 0
02-27-2018
10:03 AM
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.filter.{FilterList, FirstKeyOnlyFilter, KeyOnlyFilter}
import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableMapReduceUtil}

// named "scan" so it does not shadow the shell's SparkContext (sc)
// KeyOnlyFilter + FirstKeyOnlyFilter strip values and extra columns, keeping the count cheap
val scan = new Scan()
scan.setFilter(new FilterList(new KeyOnlyFilter(), new FirstKeyOnlyFilter()))

val conf = HBaseConfiguration.create()
conf.addResource(new Path("/usr/hdp/current/hbase-client/conf/hbase-site.xml"))
conf.set("hbase.zookeeper.quorum", "xxx")
conf.set(TableInputFormat.INPUT_TABLE, "TEST_TABLE")
conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan))

val hBaseRDD = sc.newAPIHadoopRDD(conf,
  classOf[org.apache.hadoop.hbase.mapreduce.TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])
hBaseRDD.count()
02-23-2018
01:42 PM
I am using the Phoenix Spark connector to load data into a DataFrame, but it does not return any rows. I checked by using saveAsTextFile, but it only creates the output folder and empty part files.
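Roughly what I am doing (a minimal sketch, not my exact code; the table name and zkUrl are placeholders):

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
// load a Phoenix table through the phoenix-spark DataFrame source
val df = sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", "TEST_TABLE")                  // placeholder Phoenix table
  .option("zkUrl", "zk-host:2181:/hbase-secure")  // placeholder ZooKeeper URL
  .load()
println(df.count())   // this comes back 0, and saveAsTextFile likewise writes only empty part files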
02-16-2018
04:41 AM
We have to read a large dataset from an HBase table and then deduplicate it against a small CSV file. It seems we are not using the optimal method to read the table. Can anyone help?
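For reference, the kind of approach we have in mind is the following (a rough sketch, not our exact code; the table name, quorum, CSV path, and the choice of the row key / first CSV column as the dedup key are placeholders or assumptions):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes

// full scan of the HBase table, keeping only the row keys
val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "zk-host")         // placeholder quorum
conf.set(TableInputFormat.INPUT_TABLE, "BIG_TABLE")   // placeholder table
val hbaseKeys = sc.newAPIHadoopRDD(conf,
    classOf[TableInputFormat],
    classOf[ImmutableBytesWritable],
    classOf[Result])
  .map { case (key, _) => Bytes.toString(key.copyBytes()) }

// the CSV is small, so collect its keys and broadcast them, then filter the big side
val csvKeys = sc.broadcast(
  sc.textFile("/tmp/small.csv").map(_.split(",")(0)).collect().toSet)
val deduped = hbaseKeys.filter(k => !csvKeys.value.contains(k))
println(deduped.count())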
02-06-2018
06:31 AM
I am using spark-shell in yarn-client mode. I wanted to stop my current SparkContext and deploy a new one. I then created a new RDD and ran a reduce job, and got the error below (the exact steps are sketched after the stack trace):
Caused by: java.lang.ClassNotFoundException: $iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC$anonfun$1
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$anon$1.resolveClass(JavaSerializer.scala:68)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1858)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1744)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2032)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1566)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2277)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2201)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2059)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1566)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2277)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2201)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2059)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1566)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2277)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2201)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2059)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1566)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:426)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
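The steps I ran were roughly the following (a minimal sketch; the app name is a placeholder and the exact settings differ):

import org.apache.spark.{SparkConf, SparkContext}

// stop the context the shell started, then build a fresh one
sc.stop()
val conf = new SparkConf().setAppName("shell-restart")   // placeholder app name
val sc2 = new SparkContext(conf)

// a new RDD plus a reduce job; the reduce's anonymous function is the
// $anonfun$1 that the executors fail to load in the trace above
val rdd = sc2.parallelize(1 to 100)
println(rdd.reduce(_ + _))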
10-13-2017
12:03 PM
OK, so suppose I submit a jar with spark-submit and that jar contains the code that connects to HBase; how do I pass this in that case?
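To make the question concrete: the code inside the jar connects to HBase roughly like this (a minimal sketch with placeholder names), and the open part is where the Kerberos credentials come in when it runs through spark-submit:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

val hconf = HBaseConfiguration.create()
hconf.set("hbase.zookeeper.quorum", "zk-host")   // placeholder quorum
val connection = ConnectionFactory.createConnection(hconf)
val table = connection.getTable(TableName.valueOf("TEST_TABLE"))   // placeholder table
// question: how do the principal and keytab get passed to this when it is launched via spark-submit?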
10-13-2017
11:48 AM
It is saying permission denied:
klist -kte /etc/security/keytabs/hbase.headless.keytab
Keytab name: FILE:/etc/security/keytabs/hbase.headless.keytab
klist: Permission denied while starting keytab scan
10-13-2017
11:36 AM
What is that <princ>? Sorry, I am new to this.
10-13-2017
11:36 AM
What is this file, and what is that <princ> tag?
10-13-2017
11:23 AM
I am trying to schedule a script: a folder is uploaded to HDFS and then downloaded to the machine on which Oozie schedules the job. From there the script is run, e.g. sh thatdownloadedscript.sh. In that script I connect to the HBase shell and just list the tables, but it gives me the following error:
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2.2.5.0.0-1245, r53538b8ab6749cbb6fdc0fe448b89aa82495fb3f, Fri Aug 26 01:32:27 UTC 2016
count 'DRM_Email1'
ERROR: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Here is some help for this command:
Count the number of rows in a table. Return value is the number of rows.
This operation may take a LONG time (Run '$HADOOP_HOME/bin/hadoop jar
hbase.jar rowcount' to run a counting mapreduce job). Current count is shown
every 1000 rows by default. Count interval may be optionally specified. Scan
caching is enabled on count scans by default. Default cache size is 10 rows.
If your rows are small in size, you may want to increase this
parameter. Examples:
06-22-2017
12:17 PM
Hi all,
I want to bulk-load data into HBase, and I am doing this with MapReduce.
I use HFileOutputFormat2.configureIncrementalLoad for this, and instead of the normal input format I created my own custom input format by extending CombineFileInputFormat.
When I build a fat jar and try to run it, I get a ClassNotFoundException for this custom class; maybe the HBase API is not able to find it.
How can I fix it? Please help. (A rough sketch of my driver setup follows the stack trace.)
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.rupesh.practice.main.WholeFileInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2208)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:779)
at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:452)
at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:410)
at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:391)
at com.abnamro.drm.hortonworks.metadataloader.BulkLoader.runforMapperExtended(BulkLoader.java:181)
at com.abnamro.drm.hortonworks.metadataloader.BulkLoader.runForMapper(BulkLoader.java:152)
at com.abnamro.drm.hortonworks.metadataloader.BulkLoader.run(BulkLoader.java:94)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at com.abnamro.drm.hortonworks.metadataloader.BulkLoader.loadMetadata(BulkLoader.java:72)
at com.abnamro.drm.hortonworks.metadataloader.BulkLoadMain.main(BulkLoadMain.java:26)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: Class com.rupesh.practice.main.WholeFileInputFormat not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2114)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2206)
... 17 more
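For reference, my driver setup is roughly the following (sketched in Scala for brevity here; my real code is Java, and everything except WholeFileInputFormat is a placeholder):

import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import com.rupesh.practice.main.WholeFileInputFormat

val conf = HBaseConfiguration.create()
val job = Job.getInstance(conf, "hbase-bulk-load")
job.setJarByClass(classOf[WholeFileInputFormat])         // tells Hadoop which jar to ship with the job
job.setInputFormatClass(classOf[WholeFileInputFormat])   // the custom CombineFileInputFormat subclass
// job.setMapperClass(...)                               // my mapper, emitting (ImmutableBytesWritable, Put)
job.setMapOutputKeyClass(classOf[ImmutableBytesWritable])
job.setMapOutputValueClass(classOf[Put])

val connection = ConnectionFactory.createConnection(conf)
val tableName = TableName.valueOf("TEST_TABLE")          // placeholder table name
// this is the call that ends in the ClassNotFoundException above
HFileOutputFormat2.configureIncrementalLoad(job,
  connection.getTable(tableName),
  connection.getRegionLocator(tableName))

FileOutputFormat.setOutputPath(job, new Path("/tmp/hfiles"))   // placeholder HFile output dir
job.waitForCompletion(true)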