Member since: 11-14-2015
Posts: 268
Kudos Received: 122
Solutions: 29
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1018 | 08-07-2017 08:39 AM |
 | 1868 | 07-26-2017 06:06 AM |
 | 5727 | 12-30-2016 08:29 AM |
 | 4368 | 11-28-2016 08:08 AM |
 | 3108 | 11-21-2016 02:16 PM |
09-07-2017
09:58 AM
1 Kudo
It's better to declare every field as VARCHAR and then use functions[1] to convert the values to numbers for mathematical operations. [1] https://phoenix.apache.org/language/functions.html#to_number
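For example, a minimal sketch (the table PRICES and column AMOUNT are hypothetical; AMOUNT is assumed to be stored as VARCHAR):
-- TO_NUMBER converts the VARCHAR value so it can be used in arithmetic and comparisons
SELECT TO_NUMBER(AMOUNT) * 1.10 AS AMOUNT_WITH_TAX
FROM PRICES
WHERE TO_NUMBER(AMOUNT) > 100;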
... View more
09-07-2017
08:35 AM
Grep for WARN or ERROR lines in the region server logs, and also check your system logs for resource-availability errors.
... View more
08-21-2017
10:59 AM
You need to set the same thing for the master opts as well, as described in my previous comment.
... View more
08-21-2017
10:17 AM
1 Kudo
@arun, you need to set -XX:MaxDirectMemorySize=<more_than_bucket_cache_size> in HBASE_MASTER_OPTS as well, because the master internally starts a region server for some tasks.
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:MaxDirectMemorySize=<more_than_bucket_cache_size>"
... View more
08-10-2017
11:21 AM
Do you have any errors in the region server logs? It seems the hbase:namespace table is somehow not getting assigned.
... View more
08-09-2017
03:04 PM
Before running the above command, please take the region servers and the master down if they aren't already (and keep ZooKeeper running).
... View more
08-09-2017
02:39 PM
If this is not a production cluster (and you are not doing replication or anything else dependent on ZooKeeper), can you try cleaning your ZooKeeper state? It is possible there are znodes for non-existent tables.
bin/hbase clean --cleanZk
... View more
08-07-2017
04:21 PM
Can you check the application and task logs from the YARN UI for any errors? Is the destination cluster for the distcp reachable from the source cluster? Did the job eventually fail with errors?
... View more
08-07-2017
10:06 AM
Map intermediate data is written and sorted on local disk before being sent to the reducer machines. You can:
* Reduce the map output by using a Combiner in between.
* Compress the map output with Gzip to save network IO, at the cost of extra CPU (mapred.compress.map.output=true, mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec).
* Decrease the split size (this distributes maps across servers) and increase the number of reducers so that each one has less data to sort and process.
* Stop speculative execution (mapred.map.tasks.speculative.execution=false).
* If you can optimize the sorting itself, supply your own algorithm via map.sort.class.
bq. Will I get any performance improvement if I increase the io.sort.mb parameter when the Map() task generates a huge amount of data?
Yes (though the impact may not be huge); you can tune it together with io.sort.factor.
... View more
08-07-2017
09:02 AM
Which version of Kafka are you using (does it use ZooKeeper)? Can you enable DEBUG-level logging and try to find the actual port it is trying to connect to? Is your Kafka secured (using SSL or SASL)?
... View more
08-07-2017
08:54 AM
Can't you use the CROSS operator with the FILTER operator for your use case (or the JOIN operator)?
grunt> cross_data = CROSS <detailed_table>, <lookup_table>;
grunt> joined_data = JOIN <detailed_table> BY email1, <lookup_table> BY email;
... View more
08-07-2017
08:41 AM
Can you paste the complete stack trace?
... View more
08-07-2017
08:39 AM
Currently, it seems there is no option in distcp to do so. A file can be replaced by a file, and a directory by a directory, by passing the "-update" option to the command.
... View more
08-07-2017
06:36 AM
1 Kudo
It seems that your HDFS is not healthy. Can you check your datanode and namenode logs for any exception traces or errors?
... View more
08-07-2017
06:22 AM
you can use "major_compact" command to run a major compaction on the table. In HBase shell:- hbase(main):013:0> major_compact 'tablename'
... View more
08-04-2017
01:27 PM
1 Kudo
Can you please check the causes mentioned by @Josh Elser in a comment on the thread below and try debugging? https://community.hortonworks.com/questions/11779/hbase-master-shutting-down-with-zookeeper-delete-f.html
... View more
08-04-2017
10:15 AM
1 Kudo
You can't execute "get" on a non-string row key from a shell. You need to do it through HBase Java API. Get get = new Get(toBytes("row1")); In Phoenix case, it would be difficult you to form an exact row key from the primary key values as it involves the usage of low level APIs( PTable.newKey(),PDataType.toBytes(value, column.getSortOrder) etc). However , if you are really looking for look ups, you can still do it from sql. SELECT * FROM MyTab WHERE feature='Temp' and TS=to_time('<ts_value>')
... View more
08-03-2017
02:38 PM
You can have an active and a standby HBase master on a two-node cluster independently of your HDFS configuration. But if you don't have an HA NameNode, then you have a single point of failure that can bring your whole HBase cluster down, as HBase relies on HDFS for storage.
... View more
08-03-2017
10:49 AM
Have you confirmed that the region observer is present in the table descriptor of your table? You can check this from the HBase shell using the "describe '<tablename>'" command.
... View more
08-02-2017
05:43 AM
You might be affected by these bugs:
https://issues.apache.org/jira/browse/PHOENIX-4041
https://issues.apache.org/jira/browse/PHOENIX-3611
... View more
08-02-2017
05:37 AM
It depends on what the 50K users are doing (if your cluster capacity and configuration are right, you can scale horizontally without any problem).
* If it is just point lookups (key-value access), then depending on the disks (SSD/HDD) you are using, you should be able to scale without any problem; some basic configuration tweaks are required, like increasing the number of handlers for the DataNode and RegionServer, the block cache/bucket cache, etc.
* If you are doing heavy scans, then you may need a large cluster that can bear this load; network, CPU, and disk will play an important role.
... View more
08-02-2017
05:24 AM
You can create a schema (which is similar to a database) by using the following grammar: https://phoenix.apache.org/language/index.html#create_schema
The schema will be mapped to a namespace in HBase, so your tables can be segregated logically as well as physically: https://phoenix.apache.org/namspace_mapping.html
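A minimal sketch (assuming namespace mapping is enabled via phoenix.schema.isNamespaceMappingEnabled; the schema and table names are hypothetical):
-- Creates a schema, which maps to an HBase namespace, and a table inside it
CREATE SCHEMA IF NOT EXISTS MY_SCHEMA;
CREATE TABLE MY_SCHEMA.EVENTS (
    ID BIGINT NOT NULL PRIMARY KEY,
    NAME VARCHAR
);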
... View more
08-01-2017
09:01 AM
It seems you are using the column mapping feature (it is the default from 4.11), which stores an encoded form of the column name as the HBase qualifier; this saves space and improves performance. Currently, we don't provide any API to give you that mapping (there is no standard JDBC API which exposes storage details). If your application requires using the HBase qualifier directly, then I would suggest creating the table with column encoding disabled, so that the column name is used as the HBase qualifier as well:
CREATE TABLE test
( tag varchar(10) NOT NULL,
ts DATE NOT NULL,
val INTEGER CONSTRAINT pk PRIMARY KEY (tag, ts)
)COLUMN_ENCODED_BYTES=0;
... View more
08-01-2017
08:45 AM
Your OR condition includes a filter on a non-primary-key column, so Phoenix has to read the full table anyway. There are two things you can try:
* You may add an index hint to the query (if you want a particular index to be used); see the sketch after this list.
* If you also have another index in which Branch is the leading key, then you can use a UNION of two queries to get the result fast: select AccountId,Name,Transactiondate from Accounts where Name = 'Rajesh' UNION select AccountId,Name,Transactiondate from Accounts where Branch = 'abc'
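A minimal sketch of the hint syntax (the index name IDX_ACCOUNTS_NAME is hypothetical and assumes such an index exists on Name):
-- The hint asks Phoenix to use the named index instead of the data table
SELECT /*+ INDEX(Accounts IDX_ACCOUNTS_NAME) */ AccountId, Name, Transactiondate
FROM Accounts
WHERE Name = 'Rajesh' OR Branch = 'abc';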
... View more
08-01-2017
08:36 AM
For the second question:
bq. Also, I want to add a table with defined column families and a few qualifiers. My requirement is to store the data with qualifiers at runtime. How can I make sure to fetch the valid qualifiers while fetching a particular record?
You can use dynamic columns while upserting data and doing a SELECT: https://phoenix.apache.org/dynamic_columns.html
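A minimal sketch of dynamic columns (the table EVENTLOG and its columns are hypothetical):
-- The dynamic column LASTGCTIME is declared inline with its type on both UPSERT and SELECT
UPSERT INTO EVENTLOG (EVENTID, LASTGCTIME TIME) VALUES (1, CURRENT_TIME());
SELECT EVENTID, LASTGCTIME FROM EVENTLOG (LASTGCTIME TIME) WHERE EVENTID = 1;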
... View more
07-28-2017
05:44 AM
Why SquirrelSQL is not able to make a connection to the HBase cluster is not clear from the stack trace. Would you mind enabling debug logging and pasting the logs here? Have you tried connecting using sqlline with the same URL from the same machine?
... View more
07-27-2017
10:11 AM
I wouldn't recommend updating a single component in a stack, as there will be incompatibility and upgrade issues. I suspect you might be hitting PHOENIX-2169, so it's better to reproduce the same issue somewhere in pre-prod/dev, see if the fix works, and then apply it to production.
Or, you can ask your vendor to provide a hotfix for the same version.
... View more
07-26-2017
10:49 AM
bq. One thing I wanted to highlight was my EVENT column datatype is VARCHAR....is that making EVENT column not to sort?
A VARCHAR field is also expected to sort in descending order, just like other data types. Your "DESC" declaration on the EVENT column will only be visible when there are records with the same values for col1, col2, and col3. You can order your keys as per your query pattern, or create an additional index with a new order; see the sketch below.
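A minimal sketch of such an index (the table name MY_TABLE and index name are hypothetical):
-- An index whose row key leads with EVENT in descending order
CREATE INDEX IDX_EVENT_DESC ON MY_TABLE (EVENT DESC, COL1, COL2, COL3);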
... View more
07-26-2017
10:43 AM
You need to move your snapshot from the exported directory to the directory where HBase looks for snapshots (.hbase-snapshot), so that list_snapshots and other commands (like clone_snapshot) can work easily:
hadoop dfs -mv hdfs://NAME_NODE:8020/hbase/.hbase-snapshot/<snapshot_name> hdfs://NAME_NODE:8020/apps/hbase/data/.hbase-snapshot/
hdfs dfs -mv hdfs://NAME_NODE:8020/hbase/archive/data/* hdfs://NAME_NODE:8020/apps/hbase/data/archive/data/
FYI: to list snapshots directly from the remote directory:
hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -remote-dir hdfs://NAME_NODE:8020/hbase/ -list-snapshots
... View more