Member since
04-22-2016
931
Posts
46
Kudos Received
26
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| 1852 | 10-11-2018 01:38 AM | |
| 2217 | 09-26-2018 02:24 AM | |
| 2245 | 06-29-2018 02:35 PM | |
| 2914 | 06-29-2018 02:34 PM | |
| 6094 | 06-20-2018 04:30 PM |
04-20-2018
02:35 PM
Hbase is not saving all the versions of imported data despite versions set to 40 ? 1- create table with cf versions 40 create 'PUR_ACCT_PHX','cf1'
alter 'PUR_ACCT_PHX',{NAME => 'cf1', VERSIONS => 40}
hbase(main):004:0> describe 'PUR_ACCT_PHX'
Table PUR_ACCT_PHX is ENABLED
PUR_ACCT_PHX
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '40', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPR
ESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2- sqoop import table data consisting of 38 records SQL> desc tab4
Name Null? Type
----------------------------------------- -------- ----------------------------
ACCT_NUM NOT NULL NUMBER(8)
PUR_ID NUMBER(10)
PUR_DET_ID NOT NULL NUMBER(10)
PRODUCT_PUR_PRODUCT_CODE VARCHAR2(2)
PROD_AMT NUMBER(10,2)
PUR_TRANS_DATE DATE
ACCTTYPE_ACCT_TYPE_CODE VARCHAR2(2)
ACCTSTAT_ACCT_STATUS_CODE VARCHAR2(2)
EMP_EMP_CODE CHAR(4)
PLAZA_PLAZA_ID CHAR(6)
PURSTAT_PUR_STATUS_CODE CHAR(2) SQL> select count(*) from tab4;
COUNT(*)
----------
38 sqoop import --connect "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=patronQA)(port=1526))(connect_data=(service_name=patron)))" --username PATRON --table PATRON.TAB4 --hbase-table PUR_ACCT_PHX --column-family cf1 --hbase-row-key "ACCT_NUM" -m 4
3- scan hbase for all 38 versions but shows only 3 values/column
... View more
Labels:
04-17-2018
03:53 PM
and this is even more bizzare .. how did the rows increase in the table after truncate command from 1304 to 19962 ? hbase(main):010:0> count 'PUR_ACCT_PHX'
Current count: 1000, row: 5807472
1304 row(s) in 0.4150 seconds
=> 1304
hbase(main):011:0> truncate 'PUR_ACCT_PHX'
Truncating 'PUR_ACCT_PHX' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 5.5750 seconds
hbase(main):012:0> count 'PUR_ACCT_PHX'
Current count: 1000, row: 12857756
Current count: 2000, row: 13740556
Current count: 3000, row: 14526924
Current count: 4000, row: 15327898
Current count: 5000, row: 16045786
Current count: 6000, row: 16678996
Current count: 7000, row: 17351808
Current count: 8000, row: 17998624
Current count: 9000, row: 18645468
Current count: 10000, row: 19311016
Current count: 11000, row: 19935412
Current count: 12000, row: 20487390
Current count: 13000, row: 214002
Current count: 14000, row: 2799176
Current count: 15000, row: 3449430
Current count: 16000, row: 4322882
Current count: 17000, row: 5288986
Current count: 18000, row: 6087966
Current count: 19000, row: 833559
19962 row(s) in 0.9620 seconds
=> 19962
... View more
04-17-2018
03:51 PM
I am truncating an hbase table but its not emptying it and everytime I truncate it gives me a different count on the table ? four of the columns on this table have versions =10 set. I want to empty this table in one command.
hbase(main):001:0>
hbase(main):002:0*
hbase(main):003:0* truncate 'PUR_ACCT_PHX'
Truncating 'PUR_ACCT_PHX' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 9.7210 seconds
hbase(main):004:0> count 'PUR_ACCT_PHX'
Current count: 1000, row: 18220040
1133 row(s) in 0.1700 seconds
=> 1133
hbase(main):005:0> truncate 'PUR_ACCT_PHX'
Truncating 'PUR_ACCT_PHX' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 5.3810 seconds
hbase(main):006:0> count 'PUR_ACCT_PHX'
Current count: 1000, row: 13107118
Current count: 2000, row: 14153506
Current count: 3000, row: 15192260
Current count: 4000, row: 16091274
Current count: 5000, row: 1695352
Current count: 6000, row: 17850480
Current count: 7000, row: 18678016
Current count: 8000, row: 19573032
Current count: 9000, row: 2108730
Current count: 10000, row: 2905364
Current count: 11000, row: 3763048
Current count: 12000, row: 4989356
Current count: 13000, row: 6119274
Current count: 14000, row: 8709736
14447 row(s) in 1.1760 seconds
=> 14447
hbase(main):007:0> truncate 'PUR_ACCT_PHX'
Truncating 'PUR_ACCT_PHX' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 5.3440 seconds
hbase(main):008:0> count 'PUR_ACCT_PHX'
Current count: 1000, row: 1861961
Current count: 2000, row: 80757
2119 row(s) in 0.4240 seconds
=> 2119
hbase(main):009:0>
... View more
Labels:
04-16-2018
07:39 PM
ok thanks Josh I figured out my mistake . I didn't realize that phoenix automatically find the primary key and I don't have to specify the primary key column name explicitly . thanks for the guidance .
... View more
04-16-2018
06:09 PM
hi Josh I did follow that link but its showing one column table example and I have multiple columns , I tried various options to make multiple columns to work but I am not able to , kindly advise another post I found on this forum is suggesting exactly the same way I am trying to create the table so what I am doing wrong?
... View more
04-16-2018
03:47 PM
i have read all the posts regarding this but still I am not able to get it right 😞 see blow the first query is showing a new column with name "PK" and ACCT_NUM empty . the second query showing ACCT_NUM empty and PUR_ID populated. how can I map all the columns of a column family to the phoenix table columns where the hbase row key becomes the primary key column of the phoenix table ? hbase(main):001:0> scan 'PUR_ACCT_PHX2',{FILTER=>"(PrefixFilter('1001181'))"}
ROW COLUMN+CELL
1001181 column=cf1:ACCTSTAT_ACCT_STATUS_CODE, timestamp=1523892576289, value=01
1001181 column=cf1:ACCTTYPE_ACCT_TYPE_CODE, timestamp=1523892576289, value=01
1001181 column=cf1:EMP_EMP_CODE, timestamp=1523892576289, value=9999
1001181 column=cf1:PLAZA_PLAZA_ID, timestamp=1523892576289, value=009500
1001181 column=cf1:PRODUCT_PUR_PRODUCT_CODE, timestamp=1523892576289, value=31
1001181 column=cf1:PROD_AMT, timestamp=1523892576289, value=10
1001181 column=cf1:PURSTAT_PUR_STATUS_CODE, timestamp=1523892576289, value=10
1001181 column=cf1:PUR_DET_ID, timestamp=1523892576289, value=661592624
1001181 column=cf1:PUR_ID, timestamp=1523892576289, value=752545412
1001181 column=cf1:PUR_TRANS_DATE, timestamp=1523892576289, value=2015-01-20 06:36:10.0
1001181 column=cf1:_0, timestamp=1523892576289, value=
1 row(s) in 0.2230 seconds
0: jdbc:phoenix:hadoop1:2181:/hbase-unsecure> CREATE table "PUR_ACCT_PHX2" (pk VARCHAR PRIMARY KEY, "cf1"."ACCT_NUM" varchar);
0: jdbc:phoenix:hadoop1:2181:/hbase-unsecure> select * from PUR_ACCT_PHX2 LIMIT 10;
+----------+-----------+
| PK | ACCT_NUM |
+----------+-----------+
| 1001181 | |
| 1002226 | |
| 1002491 | |
CREATE table "PUR_ACCT_PHX2" (pk VARCHAR PRIMARY KEY, "cf1"."ACCT_NUM" varchar, "cf1"."PUR_ID" varchar);
0: jdbc:phoenix:hadoop1:2181:/hbase-unsecure> select acct_num,pur_id from PUR_ACCT_PHX2 LIMIT 10;
+-----------+------------+
| ACCT_NUM | PUR_ID |
+-----------+------------+
| | 752545412 |
| | 752541376 |
| | 752539214 |
... View more
Labels:
03-27-2018
08:15 PM
ok I found the issue with my statement , I was using "-" in table name which it didn't like . thanks for your help
... View more
03-27-2018
07:54 PM
I tried this but failed hive>
>
> CREATE EXTERNAL TABLE sqoop-avro
> STORED AS AVRO
> LOCATION 'hdfs:///sqoop-avro'
> TBLPROPERTIES ('avro.schema.url'='/tmp/sqoop/PATRON_TAB4.avsc');
NoViableAltException(306@[202:1: tableName : (db= identifier DOT tab= identifier -> ^( TOK_TABNAME $db $tab) |tab= identifier -> ^( TOK_TABNAME $tab) );])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:116)
at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableName(HiveParser_FromClauseParser.java:4942)
at org.apache.hadoop.hive.ql.parse.HiveParser.tableName(HiveParser.java:49834)
at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:6431)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:4012)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1786)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1152)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:211)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1265)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
FAILED: ParseException line 1:27 cannot recognize input near 'sqoop' '-' 'avro' in table name
hive> quit
> ;
[hdfs@hadoop1 ~]$ hdfs dfs -ls 'hdfs:///sqoop-avro'
Found 4 items
-rw-r--r-- 3 hdfs hdfs 28808 2018-03-27 15:40 hdfs:///sqoop-avro/part-m-00000.avro
-rw-r--r-- 3 hdfs hdfs 3127 2018-03-27 15:42 hdfs:///sqoop-avro/part-m-00001.avro
-rw-r--r-- 3 hdfs hdfs 3474 2018-03-27 15:42 hdfs:///sqoop-avro/part-m-00002.avro
-rw-r--r-- 3 hdfs hdfs 682403 2018-03-27 15:43 hdfs:///sqoop-avro/part-m-00003.avro
[hdfs@hadoop1 ~]$
... View more