I'm trying to find a good way to submit a Hive query and capture the output results with the least amount of lag between steps. I'm mainly a Windows user, so as of right now I write a query, save it to a text file, scp it to my Linux box, then run hive -f "myfile.txt" to execute the query. I'm using basic SQL code like so:
SELECT count(*) AS COUNT
FROM (
    SELECT ds.`keyfield`
    FROM `mydb`.`mytable` ds
    WHERE ds.`field` = 'myvalue1' OR ds.`field` = 'myvalue2'
    GROUP BY ds.`keyfield`
) f;
I'm trying to get the COUNT result back to my SQL Server, where I log it in a table. I would NOT like to use PolyBase, and I'd like to keep using HQL as the query language, since it's very similar to T-SQL. I would imagine there's a way to write my query, just post it to my HDP 3 Hortonworks cluster, and get back a job id I can use to find the result once it completes, but I can't seem to find a writeup on it.
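A couple of approaches that may fit, sketched below with assumed hostnames (substitute your HiveServer2 / WebHCat nodes). Beeline ships with HDP 3 and connects over JDBC, so the scp step goes away entirely; WebHCat's REST endpoint is the one that hands back a job id, matching the "post it and poll" workflow described above:

```shell
# Option 1: run the script synchronously over JDBC with beeline.
# csv2 output gives a header line plus the value, easy to capture.
beeline -u "jdbc:hive2://myhiveserver.mydomain.com:10000/default" \
        --outputformat=csv2 --silent=true \
        -f myfile.txt > count.csv

# Option 2: POST the query to WebHCat (default port 50111) and get a job id.
# user.name and statusdir here are illustrative assumptions.
curl -s -d user.name=hive \
     -d execute="SELECT count(*) FROM mydb.mytable" \
     -d statusdir="/tmp/hive.output" \
     "http://myhiveserver.mydomain.com:50111/templeton/v1/hive"
# responds with {"id":"job_..."}; once the job completes, the result
# is written to /tmp/hive.output/stdout on HDFS
```

On the SQL Server side, a scheduled task could run the beeline one-liner and bulk-insert count.csv, which avoids PolyBase.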
I have a large file I have to process every month. When I import it, only one mapper is used, which makes the import unnecessarily slow. The file is comma delimited with double-quote text qualifiers. Any advice on getting the cluster to parallelize the processing would be helpful. I basically create the external table and the managed table, then query the data from the external table into the managed table. See below. (Using HDP 2.6.5.) Thanks!

CREATE EXTERNAL TABLE `ext_incomingdata`(
  `record_note` string COMMENT 'from deserializer',
  `firstdate` string COMMENT 'from deserializer',
  `lastdate` string COMMENT 'from deserializer',
  `fn` string COMMENT 'from deserializer',
  `ln` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'quoteChar'='\"',
  'separatorChar'=',')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://mynode.mydomain.com:8020/etl/mydir/201905/import.csv'
TBLPROPERTIES (
  'last_modified_by'='hdfs',
  'last_modified_time'='1556747118',
  'numFiles'='1',
  'numRows'='0',
  'rawDataSize'='0',
  'skip.header.line.count'='1',
  'totalSize'='37567255367',
  'transient_lastDdlTime'='1556747118');

CREATE TABLE `incomingdata`(
  `record_note` string,
  `firstdate` string,
  `lastdate` string,
  `fn` string,
  `ln` string)
STORED AS ORC;

INSERT INTO incomingdata SELECT * FROM ext_incomingdata;
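One thing worth checking first: a plain-text file of that size (~37 GB) should normally be split across many tasks, but a gzip-compressed file cannot be split at all and will always get exactly one mapper. If the file really is uncompressed text, the Tez grouping settings control how much input each task gets, so lowering them should raise the task count. A sketch, assuming an HDP 2.6.5 cluster running Hive on Tez:

```shell
# a sketch, assuming the source file is uncompressed text -- if it is a
# gzipped .csv, no setting will split it; split the file itself instead
hive -e "
SET tez.grouping.min-size=134217728;  -- aim for at least ~128 MB per task
SET tez.grouping.max-size=268435456;  -- cap each task at ~256 MB of input
INSERT INTO incomingdata SELECT * FROM ext_incomingdata;
"
```

The exact sizes are illustrative; tune them against the cluster's container size and the observed task count.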
I'm trying to connect the Visual Studio HDInsight Emulator to a Hortonworks HDP 2.6 cluster (3 master servers and 3 data nodes), but every time I try to connect I have issues with HiveServer2 and WebHDFS. Any help would be appreciated. I'm trying to find a good GUI for writing Hive queries outside of the Hive View, since it was discontinued in future distros. Here's the writeup I'm using: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-emulator-visual-studio

I'm using Visual Studio 2017 and HDP 2.6, and I've got my endpoints set up as:

WebHCat: http://hdpmn03.mydomain.com:50111 (this is the node my Hive / WebHCat resides on)
HiveServer2: http://hdpmn03.mydomain.com:10000 (same node; I did a port scan and some googling and found that it's usually port 10000, not the default 10001 that V.S. tries to set)
WebHDFS: http://hdpmn01.mydomain.com:50070 (this is my namenode; I post files via curl all the time, so I know this is correct)
SSH: http://hdpmn01.mydomain.com:22 (CentOS 7 box; I assume this is right, since I get green when I hit next)
Yarn Timeline: http://hdpmn01.mydomain.com:8188 (Yarn Timeline is on this box)
User: root
Pw: ######

When I hit next:

WebHCat service is connected successfully.
Failed to connect to HiveServer2 - Error Detail: Failed to open connection. Please check your connection string. See inner exception for failure details. (But I have no way of seeing what the inner error is, unless you guys have any ideas.)
WebHDFS shows a pause sign: could be connected but needs a config change (if I hit update, it changes to error 403 Forbidden).
SSH service is connected successfully.

Any help in getting V.S. connected so I can play with some queries and whatnot would be super helpful! Thanks!
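To see the underlying errors outside of Visual Studio, the two failing services can be probed directly. Two points worth knowing: HiveServer2 speaks the Thrift protocol rather than HTTP, so a browser or curl won't tell you much (beeline will), and a WebHDFS 403 often comes down to the user.name query parameter being missing or not authorized. A sketch using the hostnames from the post; root as the HDFS user is an assumption:

```shell
# WebHDFS sanity check: if this returns 403, compare the user.name value
# with what the emulator sends and with the directory's HDFS permissions
curl -i "http://hdpmn01.mydomain.com:50070/webhdfs/v1/tmp?op=LISTSTATUS&user.name=root"

# HiveServer2 is Thrift, not HTTP, so test it with beeline; any failure here
# prints the real connection error that Visual Studio is hiding
beeline -u "jdbc:hive2://hdpmn03.mydomain.com:10000/default" -e "show databases;"
```

If beeline connects but Visual Studio doesn't, the difference is usually in the connection string (transport mode, authentication, or the 10000 vs 10001 port mismatch mentioned above).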