Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

phoenix upsert boolean values

avatar

I am inserting a csv file using the apache phoenix insert python utility and i have field in the phoenix table which holds a boolean value. In the csv file that value is represented as 1/0 and when it goes into phoenix it should go as true/false. but it is going as false/true. Any thoughts on this. The CSV file used as an import is generated through an ETL from mysql source.

1 ACCEPTED SOLUTION

avatar

It is related to PHOENIX-2434, which has been fixed in HDP 2.5/ apache v4.7.

View solution in original post

9 REPLIES 9

avatar
Super Guru

"In the csv file that value is represented as 1/0 and when it goes into phoenix it should go as true/false. but it is going as false/true"

> create table booltest(pk varchar not null primary key, truth boolean);
> upsert into booltest values('true', 1);
Error: ERROR 203 (22005): Type mismatch. INTEGER cannot be coerced to BOOLEAN (state=22005,code=203)

It seems like UPSERTS will not coerce integers into a boolean. I'm curious how the CSV tool is doing this. psql.py seems to do this fine:

$ echo "true,1" > ~/booleans.csv
$ echo "false,0" >> ~/booleans.csv
$ /usr/local/lib/phoenix/bin/psql.py -t BOOLTEST localhost:2181:/hbase-1.2 ~/booleans.csv
$ sqlline.py ...
> 0: jdbc:phoenix:localhost:2181:/hbase-1.2> select * from booltest;
+--------+--------+
|   PK   | TRUTH  |
+--------+--------+
| false  | false  |
| true   | true   |
+--------+--------+

Similarly, using the CsvBulkLoadTool shows the same for me:

yarn jar /usr/local/lib/phoenix/phoenix-4.9.0-HBase-1.2-SNAPSHOT-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -Dmapred.map.child.java.opts="-Xmx1G" --table BOOLTEST --input booleans.csv -z localhost:2181:/hbase-1.2

Maybe do a sanity check on your processing?

avatar

HI @Josh Elser, i tried to do the same thing from my environment, but the results are different. I am using the phoenix utility provided by HDP 2.4. any thoughts on this

        
$ cat booleans.csv
true,1
false,0
$ /usr/hdp/2.4.0.0-169/phoenix/bin/psql.py -t BOOLTEST -h PK,TRUTH localhost:2181:/hbase-unsecure booleans.csv
0: jdbc:phoenix:localhost:2181:/hbase-unse> select * from BOOLTEST;
+------------------------------------------+------------------------------------------+
|                    PK                    |                  TRUTH                   |
+------------------------------------------+------------------------------------------+
| false                                    | false                                    |
| true                                     | false                                    |
+------------------------------------------+------------------------------------------+

avatar
Super Guru

Oh, that's different than what you said earlier: Both values have "false" (not the values being inverted). Maybe it is a bug with 2.4.0.0. I'll have to see if I can use that exact version and see if I can reproduce the issue.

avatar

Thanks @Josh Elser and @Ankit Singhal. Is there a work around for this, without upgrading it to HDP 2.5 ? As upgrade involves involving lot of teams using the cluster.

avatar
Super Guru

This fix may eventually make it into a maintenance release for HDP. If you have a Hortonworks support contract, you can escalate this through them to get a fix made with urgency.

The workaround is to provide "true" and "false" when referring to booleans instead of "1" and "0".

avatar

Hi @Josh Elser, as a work arouns what i did was copied the phoenix 4.8 jars to the hbase classpath on all the nodes and made it work. IS it a good approach to do 😉 Also as phoenix stores all the version information on the system tables when it gets connected for the first time. Since i have been using phoenix 4.4 earlier and now 4.8 , will it cause any issues for accessign the existing data. Do i need to drop all the tables and create it back?

avatar
Super Guru

That is not a good idea. It is not well tested as to how the version of Phoenix provided in HDP2.3/2.4 works with Apache Phoenix 4.8. You are likely on your own there 🙂

avatar

It is related to PHOENIX-2434, which has been fixed in HDP 2.5/ apache v4.7.

avatar
Super Guru

Thanks for sharing, Ankit!