Member since: 08-08-2018
Posts: 49
Kudos Received: 2
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 3635 | 08-11-2018 12:05 AM |
03-21-2019
06:41 PM
I have the same issue. Any ideas? It is saying that certain params are not set, but they are set in Ambari. I have restarted Ambari and the Hive node, but no change.
10-08-2018
04:32 PM
@wxu What should I be using to write my Hive scripts in Ambari today?
09-17-2018
08:16 PM
Hello, I wanted to cross-post my Stack Overflow post about a Hive issue here. I wrote it up on that site because it has better formatting, but I will watch the comments here as well. Find all the details here
09-08-2018
08:16 PM
Hello, one normally disables Tez for Hive using: SET hive.execution.engine=mr;
But when I use this option in the Hive shell I get:
0: jdbc:hive2://my_server:2181,> SET hive.execution.engine = mr;
Error: Error while processing statement: hive execution engine mr is not supported. (state=42000,code=1)
What's going on? Tez is not working for me and I want to try MR instead. I'm using HDP 3.0.
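For reference, this is roughly how I'm checking and setting the property from the shell (a sketch, not verbatim: HIVE_JDBC_URL is a stand-in for the ZooKeeper-based JDBC URL shown in the prompt above):
# HIVE_JDBC_URL is a placeholder for the JDBC URL from the prompt above.
HIVE_JDBC_URL='<zookeeper jdbc url from the prompt above>'
# The first SET (no value) prints the current engine; the second is the statement that fails for me.
beeline -u "$HIVE_JDBC_URL" -e "SET hive.execution.engine; SET hive.execution.engine=mr;"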
09-07-2018
11:14 PM
Did you solve this?
09-06-2018
06:22 AM
@Eugene Mogilevsky did you figure this out?
09-05-2018
04:23 AM
I have a 4-node cluster and this did not work for me. Same error: /bucket_00003 could only be written to 0 of the 1 minReplication nodes. There are 4 datanode(s) running and no node(s) are excluded in this operation.
08-22-2018
05:15 PM
Can someone please let me know why my Hive query is taking such a long time?
The query SELECT DISTINCT res_flag AS n FROM my_table; took ~75 minutes to complete. The query SELECT COUNT(*) FROM my_table WHERE res_flag = 1; took 73 minutes to complete.
The table is stored on HDFS (replicated on 4 nodes) as a CSV and is only 100 MB in size. It has 6 columns, all VARCHAR or TINYINT. The column I'm querying contains NAs. It is an external table. My Hive query runs as a Tez job on YARN using 26 cores and ~120 GB of memory. I am not using LLAP. Any idea what's going on? I'm on HDP 3.0.
EDIT: I imported the CSV into HDFS using the following command:
hdfs dfs -Ddfs.replication=4 -put '/mounted/path/to/file.csv' /dir/file.csv
I used these commands to create the table in Hive:
CREATE EXTERNAL TABLE my_table (
svcpt_id VARCHAR(50),
start_date VARCHAR(8),
end_date VARCHAR(8),
prem_id VARCHAR(20),
res_flag TINYINT,
num_prem_id TINYINT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs://ncienspk01/APS';
LOAD DATA INPATH '/dir/file.csv' INTO TABLE my_table;
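For reference, a couple of quick sanity checks I can run against the table's HDFS location (a sketch, assuming the default filesystem is the ncienspk01 namenode from the DDL above):
# Size and file count of the data backing my_table (location from the DDL).
hdfs dfs -du -h /APS
hdfs dfs -ls /APS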
08-20-2018
01:01 AM
@Sandeep Nemuri I think this may be because of the way Sqoop writes null values from the RDBMS into HDFS. I was seeing some "null" strings in the FLOAT column. I set --null-non-string to "" in Sqoop and am experimenting to see if Phoenix still gets held up. I suspect the "\N" issue is related: Sqoop may be writing these values in place of the float values the parser is looking for. It's confusing, though, because there is no reason it would write "\N" in the float column. If I use the -g,--ignore-errors option, what happens?
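For anyone following along, this is roughly the re-import I'm experimenting with (a sketch: connection string, table, and paths are placeholders patterned on my earlier commands; the two --null-* flags are the only real change):
# Re-import with explicit null handling so NULLs don't land in the CSV as "\N".
sqoop import \
  --connect "jdbc:sqlserver://my_server:1433;database=my_db" \
  --username my_user -P \
  --table "my_table" \
  --target-dir /staging/my_table \
  --split-by "pk" --num-mappers 12 \
  --null-string '' \
  --null-non-string ''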
08-18-2018
08:11 PM
@Sandeep Nemuri Thanks for your response, here is the head of one of the blocks on HDFS: $ head C://Users/dzafar/Downloads/000000_0
75000001,2220901556,1.2,20180602,2100,n,A,-700,n
75000002,4631103346,2.13,20180602,2100,n,A,-700,n
75000003,810079025,8.26,20180602,2100,n,A,-700,n
75000004,4991209767,3.95,20180602,2100,n,A,-700,n
75000005,4161180101,0.48,20180602,2100,n,A,-700,n
75000006,8450216737,1.14,20180602,2100,n,A,-700,n
75000007,4170587823,1.86,20180602,2100,n,A,-700,n
75000008,4920666845,10.39,20180602,2100,n,A,-700,n
75000009,840684899,0.29,20180602,2100,n,A,-700,n
75000010,3180799190,2.66,20180602,2100,n,A,-700,n
08-17-2018
09:38 PM
@Sandeep Nemuri I recently moved data into HDFS using this command:
sqoop import --connect "jdbc:sqlserver://my_server:1433;database=APS_AMI" --username boop -P --table "my_table" --target-dir "hdfs://my_server/test/300M.csv" --split-by "pk" --m 12;
Now I'm trying to move it into Phoenix and getting this error:
Error: java.lang.RuntimeException: org.apache.phoenix.schema.IllegalDataException: java.sql.SQLException: ERROR 201 (22000): Illegal data.
...
Caused by: org.apache.phoenix.schema.IllegalDataException: java.sql.SQLException: ERROR 201 (22000): Illegal data.
...
Caused by: java.lang.NumberFormatException: For input string: "\N"
It looks like Phoenix is having trouble parsing the HDFS CSV. Do you know how I can fix it?
FOR REFERENCE: The Phoenix schema I'm using is:
CREATE IMMUTABLE TABLE my_table (
pk VARCHAR(50),
id CHAR(10),
height_value FLOAT,
read_date INTEGER,
read_time SMALLINT,
units CHAR(1),
is_estimate CHAR(1),
UTC_offset SMALLINT,
Match_flag CHAR(1)
CONSTRAINT pk PRIMARY KEY (id))
IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS,
COLUMN_ENCODED_BYTES = 1;
The columns, already loaded in Hive, are:
| table.pk | table.id | table.height_value | table.read_date | table.read_time | table.units | table.is_estimate | table.utc_offset | table.match_flag |
This is the command I used to move the data into Phoenix:
HADOOP_CLASSPATH=/usr/hdp/current/hbase-master/lib/hbase-protocol.jar:/usr/hdp/current/hbase-master/conf hadoop jar /usr/hdp/current/phoenix-client/phoenix-5.0.0.3.0.0.0-1634-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table my_table --input /test/300M.csv
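In case it helps anyone, here is the fuller form of the bulk-load command I've been experimenting with (a sketch: zk_host is a placeholder for my ZooKeeper quorum; the --zookeeper and --delimiter flags are standard CsvBulkLoadTool options, but I have not re-run this exact form yet):
# Same CsvBulkLoadTool invocation as above, with the ZooKeeper quorum and the
# field delimiter spelled out explicitly.
HADOOP_CLASSPATH=/usr/hdp/current/hbase-master/lib/hbase-protocol.jar:/usr/hdp/current/hbase-master/conf \
hadoop jar /usr/hdp/current/phoenix-client/phoenix-5.0.0.3.0.0.0-1634-client.jar \
  org.apache.phoenix.mapreduce.CsvBulkLoadTool \
  --table my_table \
  --input /test/300M.csv \
  --zookeeper zk_host:2181 \
  --delimiter ','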
08-16-2018
08:32 PM
@Sandeep Nemuri Thanks for the pointers. I finally got it working. For others running Spark-Phoenix in Zeppelin, you need to:
1. On the Spark client node, create a symbolic link to 'hbase-site.xml' in the Spark conf directory:
ln -s /usr/hdp/current/hbase-master/conf/hbase-site.xml /usr/hdp/current/spark2-client/conf/hbase-site.xml
2. Add the following to both spark.driver.extraClassPath and spark.executor.extraClassPath in spark-defaults.conf:
/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar:/usr/hdp/current/spark-client/lib/spark-assembly-1.6.1.2.4.2.0-258-hadoop2.7.1.2.4.2.0-258.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar
3. Add the following jars in Zeppelin under the Spark2 interpreter's dependencies:
/usr/hdp/current/phoenix-server/lib/phoenix-spark-5.0.0.3.0.0.0-1634.jar
/usr/hdp/current/hbase-master/lib/hbase-common-2.0.0.3.0.0.0-1634.jar
/usr/hdp/current/hbase-client/lib/hbase-client-2.0.0.3.0.0.0-1634.jar
/usr/hdp/current/hbase-client/lib/htrace-core-3.2.0-incubating.jar
/usr/hdp/current/phoenix-client/phoenix-client.jar
/usr/hdp/current/phoenix-client/phoenix-server.jar
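For copy-paste convenience, steps 1 and 2 in shell form (a sketch: the spark2 conf path is what it is on my HDP 3.0 nodes, and EXTRA_CP is a stand-in for the full colon-separated jar list from step 2):
# Step 1: symlink hbase-site.xml into the Spark conf directory.
SPARK_CONF_DIR=/usr/hdp/current/spark2-client/conf
ln -s /usr/hdp/current/hbase-master/conf/hbase-site.xml "$SPARK_CONF_DIR/hbase-site.xml"
# Step 2: append both classpath properties to spark-defaults.conf.
EXTRA_CP='<colon-separated jar list from step 2>'
cat >> "$SPARK_CONF_DIR/spark-defaults.conf" <<EOF
spark.driver.extraClassPath $EXTRA_CP
spark.executor.extraClassPath $EXTRA_CP
EOF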
08-16-2018
08:24 PM
you're the best 🙂
08-16-2018
05:03 PM
@Sandeep Nemuri I edited my question above, do you mind taking a look at it? I'm seeing a CsvBulkLoadTool and a JsonBulkLoadTool. How should I bulk load my Sqoop-loaded data?
08-16-2018
04:42 PM
@Sandeep Nemuri I'm not sure I follow. The page you pointed to shows a bulk load from CSV to Phoenix or from HDFS JSON to Phoenix. Can you provide a link or a command showing how one would go from Sqoop's HDFS output to Phoenix directly?
08-16-2018
03:41 PM
How about this one? com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory
08-16-2018
02:55 PM
@Sandeep Nemuri Thanks so much for the advice! My table occupies 1.5 terabytes in MS SQL server. This will be a one-time migration. I feel that exporting to a csv would take days of processing time. Your last approach sounds the best but I've heard that it is not possible based on this post and this post (my table does have a float column). That being said, what is my best option? I'm thinking that I may try to split up my table into n csv files and load them sequentially into Phoenix. Would that be the best option for data at this size?
08-16-2018
04:40 AM
@Vinicius Higa Murakami I don't mind providing the jars, but is there a way to add all of them at once to Zeppelin? I'm also having trouble figuring out which Phoenix jar this class is from. Do you know?
08-16-2018
04:24 AM
Bump! I am also looking for a solution
08-15-2018
11:38 PM
1 Kudo
Hello, I have HDP 3.0 with Sqoop v1.4.7. What is the best way to migrate my data from an external RDBMS into something queryable from Phoenix? I want to make sure I import it in a way that will give very fast queries. Do I need to Sqoop it into HDFS first, or can I go directly into HBase? It looks like the Sqoop-Phoenix integration is not yet complete, so I believe I will need to Sqoop the data into HDFS or HBase and then connect it to Phoenix. Can someone show (or point) me to how to do that? This post makes me think that I will need to go RDBMS > HDFS > CSV > Phoenix, please tell me that is not true... Thanks!
08-15-2018
11:18 PM
I've heard that Sqoop 1.4.7 enables import through Phoenix directly. Can you please add a section on how to do that with HDP 3.0, or write a new post?
08-15-2018
06:03 AM
I'm trying to use the Phoenix-Spark2 connector in Zeppelin as described here, and I'm having some confusion about dependencies. Here is the code I'm running:
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
val sc = new SparkContext("local", "phoenix-test")
val sqlContext = new SQLContext(sc)
val df = sqlContext.load(
"org.apache.phoenix.spark",
Map("table" -> "dz_places",
"zkUrl" -> "driver_node:2181"))
I keep getting ClassNotFoundExceptions, so I go and find the related jar and add it to the Zeppelin Spark2 interpreter dependencies. So far I've added these jars:
/usr/hdp/current/phoenix-server/lib/phoenix-spark-5.0.0.3.0.0.0-1634.jar
/usr/hdp/current/phoenix-server/lib/phoenix-core-5.0.0.3.0.0.0-1634.jar
/usr/hdp/current/hbase-master/lib/hbase-common-2.0.0.3.0.0.0-1634.jar
/usr/hdp/current/hbase-client/lib/hbase-client-2.0.0.3.0.0.0-1634.jar
/usr/hdp/current/hbase-client/lib/htrace-core-3.2.0-incubating.jar
Now I'm seeing this error:
java.lang.NoClassDefFoundError: org/apache/tephra/TransactionSystemClient
I'm not seeing this class's jar in any of the HBase or Phoenix lib folders. What's going on? Why do I need to add all of these? Where is this particular class housed? Is there a better way to specify these dependencies? Using /usr/hdp/current/hbase-client/lib/*.jar threw an error.
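In the meantime, this is how I'm hunting for the jar that actually ships the missing class (a plain shell search, nothing Phoenix-specific; the directories are the ones from this HDP install):
# Search the Phoenix and HBase jar directories for the class named in the
# NoClassDefFoundError above.
for j in /usr/hdp/current/phoenix-client/*.jar \
         /usr/hdp/current/phoenix-server/lib/*.jar \
         /usr/hdp/current/hbase-client/lib/*.jar; do
  if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/tephra/TransactionSystemClient'; then
    echo "$j"
  fi
done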
08-15-2018
05:03 AM
I had to do a lot of finagling, but this command ultimately solved my issues. Thanks! I did inadvertently install hbase-master on one of the datanodes, and it is not registered in Ambari. Should I delete it? If so, `yum remove hbase-master` did not work. How would I do so?
08-14-2018
11:45 PM
@Jay Kumar SenSharma I think the issue is with a bad hbase-client install. What is the yum install arg to reinstall that?
08-14-2018
11:34 PM
@Jay Kumar SenSharma I didn't see an entry about the RegionServer. I tried deleting and installing again in Ambari and this did not result in anything different. Only the conf dir is present.
08-14-2018
11:33 PM
I tried again and am still only seeing the conf directory present.
08-14-2018
11:28 PM
I have already removed and installed again once and it did not solve the issue.
08-14-2018
11:24 PM
@amarnath reddy pappu Yep, I don't doubt that the message is correct, but what do I do next? Only the conf directory is present:
[dzafar@MYSERVER03 ~]$ ls /usr/hdp/current/hbase-regionserver
conf
08-14-2018
11:17 PM
Hello, I just installed HBase RegionServers on 2 of my datanodes. I'm getting the following error when I start them on HDP 3.0:
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/hdp/current/hbase-regionserver/bin/hbase-daemon.sh --config /usr/hdp/current/hbase-regionserver/conf start regionserver' returned 127. -bash: /usr/hdp/current/hbase-regionserver/bin/hbase-daemon.sh: No such file or directory
What am I missing? I installed them through Ambari.
08-14-2018
04:42 PM
Hello, I'm looking to read tables from Phoenix using Zeppelin. I don't want to use the JDBC connection, preferring the plugin described here: https://phoenix.apache.org/phoenix_spark.html I believe I am not specifying the dependencies correctly, because even though I added this to the Zeppelin interpreter (see zeppelin-phoenix-config.png), I'm still getting the following error:
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
error: object phoenix is not a member of package org.apache
How do I fix this? I've also added /usr/hdp/current/phoenix-server/lib/:/usr/hdp/current/phoenix-client/lib/:/usr/hdp/current/phoenix-server/ to both spark.driver.extraClassPath and spark.executor.extraClassPath in Spark2's spark-defaults.conf. Thanks in advance! P.S. I'm trying to figure all this out so that I can write a sparklyr extension to interact with HBase data through Phoenix. Any pointers toward that goal are appreciated as well.
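To take Zeppelin out of the picture while debugging, I'm also trying the connector from a plain spark-shell (a sketch: the jar paths are the ones on my HDP 3.0 nodes, and this is only a sanity check, not a fix):
# Launch a plain spark-shell with the connector and client jars; if
# "import org.apache.phoenix.spark._" works here, the problem is on the
# Zeppelin dependency side rather than with the jars themselves.
spark-shell \
  --jars /usr/hdp/current/phoenix-server/lib/phoenix-spark-5.0.0.3.0.0.0-1634.jar,/usr/hdp/current/phoenix-client/phoenix-client.jar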