Created 10-13-2016 07:12 AM
I am inserting a csv file using the apache phoenix insert python utility and i have field in the phoenix table which holds a boolean value. In the csv file that value is represented as 1/0 and when it goes into phoenix it should go as true/false. but it is going as false/true. Any thoughts on this. The CSV file used as an import is generated through an ETL from mysql source.
Created 10-15-2016 07:09 PM
It is related to PHOENIX-2434, which has been fixed in HDP 2.5/ apache v4.7.
Created 10-13-2016 05:30 PM
"In the csv file that value is represented as 1/0 and when it goes into phoenix it should go as true/false. but it is going as false/true"
> create table booltest(pk varchar not null primary key, truth boolean); > upsert into booltest values('true', 1); Error: ERROR 203 (22005): Type mismatch. INTEGER cannot be coerced to BOOLEAN (state=22005,code=203)
It seems like UPSERTS will not coerce integers into a boolean. I'm curious how the CSV tool is doing this. psql.py seems to do this fine:
$ echo "true,1" > ~/booleans.csv $ echo "false,0" >> ~/booleans.csv $ /usr/local/lib/phoenix/bin/psql.py -t BOOLTEST localhost:2181:/hbase-1.2 ~/booleans.csv $ sqlline.py ... > 0: jdbc:phoenix:localhost:2181:/hbase-1.2> select * from booltest; +--------+--------+ | PK | TRUTH | +--------+--------+ | false | false | | true | true | +--------+--------+
Similarly, using the CsvBulkLoadTool shows the same for me:
yarn jar /usr/local/lib/phoenix/phoenix-4.9.0-HBase-1.2-SNAPSHOT-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -Dmapred.map.child.java.opts="-Xmx1G" --table BOOLTEST --input booleans.csv -z localhost:2181:/hbase-1.2
Maybe do a sanity check on your processing?
Created 10-14-2016 04:54 AM
HI @Josh Elser, i tried to do the same thing from my environment, but the results are different. I am using the phoenix utility provided by HDP 2.4. any thoughts on this
$ cat booleans.csv true,1 false,0 $ /usr/hdp/2.4.0.0-169/phoenix/bin/psql.py -t BOOLTEST -h PK,TRUTH localhost:2181:/hbase-unsecure booleans.csv 0: jdbc:phoenix:localhost:2181:/hbase-unse> select * from BOOLTEST; +------------------------------------------+------------------------------------------+ | PK | TRUTH | +------------------------------------------+------------------------------------------+ | false | false | | true | false | +------------------------------------------+------------------------------------------+
Created 10-14-2016 03:54 PM
Oh, that's different than what you said earlier: Both values have "false" (not the values being inverted). Maybe it is a bug with 2.4.0.0. I'll have to see if I can use that exact version and see if I can reproduce the issue.
Created 10-17-2016 01:05 PM
Thanks @Josh Elser and @Ankit Singhal. Is there a work around for this, without upgrading it to HDP 2.5 ? As upgrade involves involving lot of teams using the cluster.
Created 10-17-2016 03:11 PM
This fix may eventually make it into a maintenance release for HDP. If you have a Hortonworks support contract, you can escalate this through them to get a fix made with urgency.
The workaround is to provide "true" and "false" when referring to booleans instead of "1" and "0".
Created 10-18-2016 07:39 AM
Hi @Josh Elser, as a work arouns what i did was copied the phoenix 4.8 jars to the hbase classpath on all the nodes and made it work. IS it a good approach to do 😉 Also as phoenix stores all the version information on the system tables when it gets connected for the first time. Since i have been using phoenix 4.4 earlier and now 4.8 , will it cause any issues for accessign the existing data. Do i need to drop all the tables and create it back?
Created 10-18-2016 02:45 PM
That is not a good idea. It is not well tested as to how the version of Phoenix provided in HDP2.3/2.4 works with Apache Phoenix 4.8. You are likely on your own there 🙂
Created 10-15-2016 07:09 PM
It is related to PHOENIX-2434, which has been fixed in HDP 2.5/ apache v4.7.
Created 10-16-2016 09:26 PM
Thanks for sharing, Ankit!