07-27-2016
07:49 AM
2 Kudos
Thanks. I'm certain you're hitting the same error as HADOOP-12559, given that the AuthenticationException comes at write time, and from the client package that's used for HTTP work - indicating that the NN is unable to contact the KMS. You'll also likely observe this error only a while after a NameNode restart (it works immediately after the NN restart), and it may go away after a day or so only to return again, which is in line with HADOOP-12559's behaviour within the NameNode. A bug-fix update within the 5.5.x line, or any minor upgrade to a newer release, should resolve this.
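If you'd like to confirm the exact release you're on before planning that upgrade, the Hadoop CLI prints the full build string, which on CDH carries the release in the version suffix (e.g. "2.6.0-cdh5.5.1"):

~> hadoop version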
07-27-2016
06:37 AM
1 Kudo
Here's one example that uses the native hbase-spark module via DataFrames in PySpark: http://community.cloudera.com/t5/Storage-Random-Access-HDFS/Include-latest-hbase-spark-in-CDH/m-p/43236/highlight/true#M2280
07-27-2016
05:58 AM
Retry your command this way:

~> HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -put testkb.txt /data/fi

Among other output, it should produce the full exception trace before it aborts with the same message.
07-27-2016
12:12 AM
4 Kudos
You should be able to read HBase Spark connector data via DataFrames in PySpark, via the sqlContext, already today:

~> hbase shell
> create 't', 'c'
> put 't', '1', 'c:a', 'a column data'
> put 't', '1', 'c:b', 'b column data'
> exit
~> export SPARK_CLASSPATH=$(hbase classpath)
~> pyspark
> hTbl = sqlContext.read.format('org.apache.hadoop.hbase.spark') \
    .option('hbase.table', 't') \
    .option('hbase.columns.mapping', 'KEY_FIELD STRING :key, A STRING c:a, B STRING c:b') \
    .option('hbase.use.hbase.context', False) \
    .option('hbase.config.resources', 'file:///etc/hbase/conf/hbase-site.xml') \
    .load()
> hTbl.show()
+---------+-------------+-------------+
|KEY_FIELD|            A|            B|
+---------+-------------+-------------+
|        1|a column data|b column data|
+---------+-------------+-------------+

There are some limitations, as the JIRA notes, of course. Which specific missing feature are you looking for, just so we know the scope of the request?
07-26-2016
05:48 PM
What version of CDH do you use? Can you share the full stack trace around the exception? Depending on your version and the stack trace, you're most likely hitting the failure described in https://issues.apache.org/jira/browse/HADOOP-12559. This has been addressed in CDH 5.5.4 onwards for the 5.5.x line, and the fix is also in all 5.6.x, 5.7.x, and later releases. An ACL setting failure would give you a different error, such as a 403 from the KMS.
07-24-2016
09:57 PM
Using single quotes around the value will help it get evaluated properly in the shell; the & is otherwise taken by the shell as a token to fork the process. An example of quoting in shell:

… --connect 'jdbc:mysql://IP/DB?zeroDateTimeBehavior=convertToNull&useTimezone=true&serverTimezone=GMT' \ …
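As a fuller sketch of the quoted invocation (the host, database, table, and user below are placeholders, not taken from your actual command):

~> sqoop import \
     --connect 'jdbc:mysql://IP/DB?zeroDateTimeBehavior=convertToNull&useTimezone=true&serverTimezone=GMT' \
     --username someuser -P \
     --table sometable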
07-22-2016
06:43 AM
1 Kudo
The POST command on the API you're using [1] requires passing the username and password as query parameters, not as a JSON object in the request body. Try something like this instead:

~> curl -X POST -u "admin:admin" -i 'http://localhost:7180/api/v11/cm/commands/importAdminCredentials?username=user/admin@REALM&password=your-password'

The expected type of each parameter is noted in the table at the link above. You need to use JSON request bodies if and only if the POST description requires such a structure, for example for this request [2] (notice the two query parameters, plus an additional request-body data structure).
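For contrast, a JSON-body POST is generally shaped like the sketch below. The endpoint path and body field here are placeholders for illustration only, not a real CM API call; take the actual path and structure from the API docs:

~> curl -X POST -u "admin:admin" -H "Content-Type: application/json" \
     -d '{"someField": "someValue"}' \
     'http://localhost:7180/api/v11/path/from/the/docs'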
07-07-2016
11:42 PM
1 Kudo
The copy is done to a temporary file, which is then moved to the actual destination upon completion. There's no "merge", only a move. This procedure ensures partial file copies aren't left behind if the job fails or gets killed.
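As an aside, the HDFS shell's put follows the same write-then-rename pattern, which you can observe mid-copy; the temporary suffix below is what the shell uses, though DistCp's own temporary names differ:

~> hadoop fs -put largefile.bin /data/ &
~> hadoop fs -ls /data/
   # while the copy is in flight, the listing shows /data/largefile.bin._COPYING_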
07-07-2016
07:31 PM
LazyOutputFormat is available for both APIs. Here's the one for the older API: http://archive.cloudera.com/cdh5/cdh/5/hadoop/api/org/apache/hadoop/mapred/lib/LazyOutputFormat.html
07-06-2016
11:59 PM
Block-level copies (with file merges) are not supported as a DistCp feature yet. However, you can use the -update option to do progressive copies, resuming after the last failure, as sketched below.
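A minimal sketch of a resumed copy (the source and destination paths are placeholders): run the same command again after an interruption, and -update skips files that already match at the target:

~> hadoop distcp -update hdfs://nn1:8020/src hdfs://nn2:8020/dest
   ... job fails or is killed partway ...
~> hadoop distcp -update hdfs://nn1:8020/src hdfs://nn2:8020/dest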