Member since: 03-09-2016
Posts: 10
Kudos Received: 5
Solutions: 0
10-22-2016
06:32 PM
@Michael Young... the error says "Failed to enable and lock VT-x features", but you have mentioned "You must have a CPU with the VT-x features enabled". Since the error message says "failed to enable", can you please clarify what is required here? I am also getting the error below:

Failed to open a session for the virtual machine Hortonworks Docker Sandbox.
VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED).
Result Code: E_FAIL (0x80004005)
Component: ConsoleWrap
Interface: IConsole {872da645-4a9b-1727-bee2-5585105b9eed}
03-14-2016
02:42 AM
1 Kudo
@Manikandan Kannan did redefining the table with the hcat option and then sqooping with the hcatalog feature not work either? How about switching back to the '--export-dir <directory>.*' option?
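For reference, a rough sketch of that fallback (the JDBC URL, table name, and warehouse path here are hypothetical, not from the original post; the trailing "*" is the form suggested for exporting partition directories recursively):

sqoop export \
  --connect jdbc:mysql://dbhost/salesdb \
  --username sqoop_user -P \
  --table sales_export \
  --export-dir '/apps/hive/warehouse/sales/*'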
03-13-2016
10:52 AM
1 Kudo
@Manikandan Kannan it appears that to use the --hcatalog option in Sqoop, the table first needs to be created via HCatalog, e.g. hcat -e "create table ...". In addition, the following needs to be provided as well, per the docs: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_dataintegration/content/sqoop-hcatalog-integration.html

--hcatalog-database: Specifies the database name for the HCatalog table. If not specified, the default database name 'default' is used. Providing the --hcatalog-database option without --hcatalog-table is an error. This is not a required option.

Kindly let me know if this works... else you can switch to the "--export-dir <directory>.*" option (the " * " exports partitions recursively).
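To make this concrete, a minimal sketch of the flow described above (the database, table, columns, storage format, and JDBC connection details are hypothetical, not from the original post):

# create the target table through HCatalog first
hcat -e "create table default.sales_export (id int, amount double) stored as orc;"

# then export it with Sqoop's HCatalog integration
sqoop export \
  --connect jdbc:mysql://dbhost/salesdb \
  --username sqoop_user -P \
  --table sales_export \
  --hcatalog-database default \
  --hcatalog-table sales_export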
03-13-2016
09:00 AM
@Rushikesh Deshmukh On the other hand, Hive's regexp_replace can help in cleaning the data, e.g. below: this removes nested '\', '\t' and '\r' combinations of unformatted data within a single JSON string.

-- populate the clean src table
insert overwrite table src_clean PARTITION (ddate='${hiveconf:DATE_VALUE}')
select regexp_replace(
         regexp_replace(
           regexp_replace(
             regexp_replace(full_json_line,
               "\\\\\\\\\\\\\\\\t|\\\\\\\\\\\\\\\\n|\\\\\\\\\\\\\\\\r",
               "\\\\\\\\\\\\\\\\<t or n or r>"),
             "\\\\\\\\\\\\t|\\\\\\\\\\\\n|\\\\\\\\\\\\r",
             "\\\\\\\\ "),
           "\\\\\\\\t|\\\\\\\\n|\\\\\\\\r",
           "\\\\\\\\<t or n or r>"),
         "\\\\t|\\\\n|\\\\r",
         "") as full_json_line
from src_unclean where ddate='${hiveconf:DATE_VALUE}';
03-13-2016
06:29 AM
@Rushikesh Deshmukh I see a Geometry data type (point) included in your data set. For insights on geospatial calculations, you can start at https://cwiki.apache.org/confluence/display/Hive/Spatial+queries and https://github.com/Esri/spatial-framework-for-hadoop/wiki/ST_Geometry-in-Hive-versus-SQL.
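To illustrate, a minimal sketch of querying point data with the Esri spatial framework linked above (the jar names, function registrations, and coordinates are assumptions, not from the original post):

-- register the Esri UDFs (jar paths are hypothetical)
ADD JAR esri-geometry-api.jar;
ADD JAR spatial-sdk-hive.jar;
CREATE TEMPORARY FUNCTION ST_Point AS 'com.esri.hadoop.hive.ST_Point';
CREATE TEMPORARY FUNCTION ST_AsText AS 'com.esri.hadoop.hive.ST_AsText';

-- build a point and render it as WKT
SELECT ST_AsText(ST_Point(-122.0, 37.0));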
03-13-2016
03:59 AM
@mike pal.. there shouldn't be any need to specify INPUTFORMAT and OUTPUTFORMAT; you can simply avoid this extra work and just use STORED AS TEXTFILE to expose the text file in Hive. In most cases TEXTFILE is the default file format, unless the configuration parameter hive.default.fileformat has a different setting.
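For example, a minimal sketch (the table name, column, delimiter, and HDFS location are hypothetical):

CREATE EXTERNAL TABLE web_logs (line STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/web_logs';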
03-12-2016
05:29 AM
@Jitendra Yadav Similar to partitioned tables, we cannot directly load bucketed tables; rather, we need to use an INSERT OVERWRITE TABLE ... SELECT ... FROM clause from another table to populate the bucketed table. Also, hive.enforce.bucketing should be set to true so that the number of reducers need not be specified explicitly. Kindly let me know if you find this useful.
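A minimal sketch of that flow (table names, columns, and bucket count are hypothetical):

SET hive.enforce.bucketing = true;

CREATE TABLE users_bucketed (id INT, name STRING)
CLUSTERED BY (id) INTO 4 BUCKETS;

-- populate from a plain staging table; a direct LOAD DATA would bypass bucketing
INSERT OVERWRITE TABLE users_bucketed
SELECT id, name FROM users_staging;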
03-10-2016
04:06 AM
2 Kudos
@Roberto Sancho.. Similar to partitioned tables, we cannot directly load bucketed tables; rather, we need to use an INSERT OVERWRITE TABLE ... SELECT ... FROM clause from another table to populate the bucketed table. Also, hive.enforce.bucketing should be set to true so that the number of reducers need not be specified explicitly. Kindly let me know if you find this useful.