Member since: 01-12-2016
Posts: 123
Kudos Received: 12
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1501 | 12-12-2016 08:59 AM
11-04-2016 01:34 PM
Where can I get the jar file containing com.ibm.spss.hive.serde2.xml.XmlSerDe?
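For reference, an assumption not stated in the thread: that class is published on Maven Central as the hivexmlserde artifact (group com.ibm.spss.hive.serde2.xml). Once downloaded, the jar can be registered per Hive session, e.g.:

```sql
-- Hypothetical local path and version; adjust to the jar you downloaded.
ADD JAR /tmp/hivexmlserde-1.0.0.0.jar;
```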
11-03-2016 11:38 AM
Thanks for the quick response. So `0: jdbc:hive2://hdp224.local:10000/default> !sh hdfs dfs -ls /` is how HDFS commands are run. How do I run Linux shell commands from Beeline? That is, what is the Beeline equivalent of `hive> !pwd;`?
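A sketch of the idea, assuming a Beeline version whose SQLLine layer supports the !sh command (older releases may lack it): prefix any OS command with !sh.

```
0: jdbc:hive2://hdp224.local:10000/default> !sh pwd
0: jdbc:hive2://hdp224.local:10000/default> !sh hdfs dfs -ls /
```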
11-03-2016 11:16 AM
1) What is the difference between these two commands? To get the table names in a database, which one should I use?
0: jdbc:hive2://> show tables;
or
0: jdbc:hive2://> !tables
2) How do I run HDFS and Unix commands in Beeline?
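For context, a hedged comparison: show tables; is a HiveQL statement executed by HiveServer2 (so it works in scripts and from any client), whereas !tables is a client-side SQLLine command that lists tables through JDBC metadata. Either lists the tables visible in the current connection; OS and HDFS commands go through !sh:

```
0: jdbc:hive2://> show tables;        (HiveQL, runs on the server)
0: jdbc:hive2://> !tables             (SQLLine client command, JDBC metadata)
0: jdbc:hive2://> !sh hdfs dfs -ls /  (OS/HDFS command via !sh)
```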
Labels:
- Apache Hive
11-02-2016 09:00 AM
1 Kudo
I am new to Hive and followed the tutorial linked below; any input will be appreciated.
URL: http://hadooptutorial.info/java-vs-hive/
Input file [docs]: The core of Apache Hadoop consists of a storage part, known as Hadoop Distributed File System (HDFS), and a processing part called MapReduce. Hadoop splits files into large blocks and distributes them across nodes in a cluster. To process data, Hadoop transfers packaged code for nodes to process in parallel, based on the data that needs to be processed. This approach takes advantage of data locality (nodes manipulating the data they have access to) to allow the dataset to be processed faster and more efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file system where computation and data are distributed via high-speed networking.
Script for Hive in the link:
CREATE TABLE docs (line STRING);
LOAD DATA INPATH 'docs' OVERWRITE INTO TABLE docs;
CREATE TABLE word_counts AS
SELECT word, count(1) AS count FROM
(SELECT explode(split(line, '\\s')) AS word FROM docs) w
GROUP BY word
ORDER BY word;
I have the following clarifications on the script.
1) The script aborted with an "Invalid postscript" error, so I changed the CREATE TABLE statement as below. What am I missing, given that the author of the original link did not hit this error?
CREATE TABLE docs (line STRING) STORED AS TEXTFILE;
2) When I got the Invalid postscript error, the file [docs] that I had placed in my HDFS home directory was deleted, and I am not sure why. Do I need to place the file again every time I get an Invalid postscript error?
3) The CREATE statement below produces a table with the following format. If I want to change the settings to TEXTINPUTFORMAT, how do I change it?
SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
CREATE TABLE word_counts AS
SELECT word, count(1) AS count FROM
(SELECT explode(split(line, '\\s')) AS word FROM docs) w
GROUP BY word
ORDER BY word;
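A possible explanation for questions 1) and 2), offered as an assumption about this cluster's configuration: if hive.default.fileformat is set to ORC, a plain CREATE TABLE defaults to ORC storage, and loading a text file into it makes the ORC reader fail with "Invalid postscript" when it validates the file footer. Separately, LOAD DATA INPATH moves (rather than copies) the file into the table's warehouse directory, which is why docs disappears from the HDFS home directory and must be placed again before retrying.

```sql
-- Assumes the cluster defaults new tables to ORC; override for the session:
SET hive.default.fileformat=TextFile;

-- Or declare the storage format explicitly in the DDL:
CREATE TABLE docs (line STRING) STORED AS TEXTFILE;
```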
Labels:
- Apache Hive
10-27-2016 08:21 AM
Hi, I have the clarifications below on the incr command.
1) What does COUNTER VALUE = 23 mean? I can see that cf1:no3 was incremented by 3, but I do not understand what the value 23 refers to.
incr 'emp_vamsi','123','cf1:no3',3
2) How is the number of rows [3 row(s)] determined? Is it the number of distinct row keys present in the HBase table?
3) The command below gives the following error. Can we not use incr on an existing column in HBase?
incr 'emp_vamsi','123','cf1:no',3
Error: ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Attempted to increment field that isn't 64 bits wide
Below is my source:
hbase(main):028:0> scan 'emp_vamsi'
ROW COLUMN+CELL
123 column=cf1:name, timestamp=1477459091210, value=vamsi1
123 column=cf1:no, timestamp=1477546999021, value=1
123 column=cf1:no1, timestamp=1477547356118, value=\x00\x00\x00\x00\x00\x00\x00\x19
123 column=cf1:no3, timestamp=1477547272487, value=\x00\x00\x00\x00\x00\x00\x00\x14
123 column=cf1:no34, timestamp=1477547405178, value=\x00\x00\x00\x00\x00\x00\x00\x15
123 column=cf1:role, timestamp=1477459091210, value=TL
345 column=cf1:name, timestamp=1477459091210, value=vishnu
345 column=cf1:role, timestamp=1477459091210, value=HR
567 column=cf1:name, timestamp=1477459091210, value=ramya
567 column=cf1:role, timestamp=1477459091210, value=wife
3 row(s) in 0.0350 seconds
Incr command:
hbase(main):029:0> incr 'emp_vamsi','123','cf1:no3',3
COUNTER VALUE = 23
hbase(main):030:0> scan 'emp_vamsi'
ROW COLUMN+CELL
123 column=cf1:name, timestamp=1477459091210, value=vamsi1
123 column=cf1:no, timestamp=1477546999021, value=1
123 column=cf1:no1, timestamp=1477547356118, value=\x00\x00\x00\x00\x00\x00\x00\x19
123 column=cf1:no3, timestamp=1477547447862, value=\x00\x00\x00\x00\x00\x00\x00\x17
123 column=cf1:no34, timestamp=1477547405178, value=\x00\x00\x00\x00\x00\x00\x00\x15
123 column=cf1:role, timestamp=1477459091210, value=TL
345 column=cf1:name, timestamp=1477459091210, value=vishnu
345 column=cf1:role, timestamp=1477459091210, value=HR
567 column=cf1:name, timestamp=1477459091210, value=ramya
567 column=cf1:role, timestamp=1477459091210, value=wife
3 row(s) in 0.0440 seconds
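A hedged note on the transcript above: HBase counters are stored as 8-byte big-endian longs, so \x00\x00\x00\x00\x00\x00\x00\x17 is decimal 23; cf1:no3 held 20 (\x14), and incrementing by 3 produced 23, which COUNTER VALUE reports as the counter's new value. The 3 row(s) figure is the count of distinct row keys scanned (123, 345, 567), and cf1:no fails to increment because it holds the one-byte string "1" rather than an 8-byte counter, hence "isn't 64 bits wide". The shell's get_counter command reads a counter back in decimal:

```
hbase(main):031:0> get_counter 'emp_vamsi', '123', 'cf1:no3'
COUNTER VALUE = 23
```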
Labels:
- Apache HBase
10-21-2016 09:11 AM
Could anybody tell me the purpose of the code highlighted in bold in the CREATE TABLE statement below?
CREATE EXTERNAL TABLE intermediate_access_logs (
  ip STRING, date STRING, method STRING, url STRING, http_version STRING,
  code1 STRING, code2 STRING, dash STRING, user_agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex' = '([^ ]*) - - \\[([^\\]]*)\\] "([^\ ]*) ([^\ ]*) ([^\ ]*)" (\\d*) (\\d*) "([^"]*)" "([^"]*)"',
  'output.format.string' = "%1$$s %2$$s %3$$s %4$$s %5$$s %6$$s %7$$s %8$$s %9$$s")
LOCATION '/user/hive/warehouse/original_access_logs';
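As a hedged aside on what those SERDEPROPERTIES do: each parenthesized capture group in input.regex is mapped, in declaration order, to one table column when rows are deserialized, while output.format.string (%1$s, %2$s, ...) controls how columns are formatted when rows are written back out. A minimal sketch with a hypothetical two-column table:

```sql
-- Hypothetical table: capture group 1 -> ip, group 2 -> ts.
CREATE EXTERNAL TABLE demo_logs (ip STRING, ts STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex' = '([^ ]*) \\[([^\\]]*)\\]'
)
LOCATION '/tmp/demo_logs';
```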
Labels:
- Apache Hive
05-18-2016 10:31 AM
Labels:
- Apache Hadoop
05-16-2016 02:56 PM
What is the purpose of the sample.pig script in the above link?
set hcat.bin /usr/bin/hcat;
sql show tables;
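A hedged reading, assuming this is the HCatalog sample from the tutorial: set hcat.bin tells Pig where the hcat executable lives, and the Grunt shell's sql command then passes the statement through to HCatalog, so Hive/HCatalog tables can be listed without leaving Pig:

```
grunt> set hcat.bin /usr/bin/hcat;  -- point Pig at the hcat executable
grunt> sql show tables;             -- run an HCatalog command from Grunt
```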