Member since: 09-17-2014
Posts: 88
Kudos Received: 3
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
  | 2633 | 07-15-2015 08:57 PM
  | 9311 | 07-15-2015 06:32 PM
10-08-2018 09:28 AM
Hi experts! In HDFS there is a tool called the Balancer, whose purpose is to ensure an even distribution of blocks across the cluster. My question is: how frequently does it kick in to check whether the cluster is imbalanced or not? And is there any way to change this frequency? Thanks!
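For reference, here is how I run it today when triggering it manually (a sketch only: the threshold, bandwidth value, cron schedule, and log path below are just examples, not a recommendation):

```bash
# Run one balancing pass; -threshold is the allowed deviation (in percent) of a
# DataNode's utilization from the cluster average before blocks get moved.
hdfs balancer -threshold 10

# Optionally cap the bandwidth each DataNode may spend on balancing (bytes/sec).
hdfs dfsadmin -setBalancerBandwidth 104857600   # 100 MB/s

# Hypothetical cron entry if you want it to run on a fixed schedule, e.g. nightly:
# 0 2 * * * hdfs balancer -threshold 10 >> /var/log/hdfs-balancer.log 2>&1
```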
Labels:
- HDFS
10-05-2018 07:47 AM
Awesome, thank you!
10-05-2018 06:08 AM
@Fawze thanks for the script... unfortunately it throws an error for me: ./users_resource_cons.sh: line 14: syntax error: unexpected end of file
10-05-2018 06:02 AM
Thank you Thomas! Do you mean some concrete charts? 🙂 I checked Cloudera Manager -> YARN -> Resource Pools, and there are indeed lots of useful charts, but they show consumption per pool. For example, there could be a pool root.marketing, but within this pool there could be multiple users. So I want to get an understanding of which users consume which resources.
10-04-2018 06:50 AM
Hi dear experts! I have a challenge. I have a dynamic resource pool, let's say root.marketing. Many users who belong to this pool submit jobs to it (Bob, Alice, Tom). I want to know the resource consumption for each of these users, e.g. over the last day Bob used on average 33 cores, Alice 12, Tom 118... or something like this. In other words, I want to know who consumes what within the same pool. Thanks!
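For reference, one rough approach I'm considering (a sketch only: the rm-host name, the use of jq, and the 24-hour window are assumptions) is to sum the per-application vcoreSeconds/memorySeconds counters reported by the ResourceManager REST API, grouped by user:

```bash
# Sketch: per-user average cores and MB for one pool over the last 24 hours,
# based on the YARN ResourceManager REST API. Adjust RM host/port and pool name.
RM=http://rm-host:8088
POOL=root.marketing
SINCE=$(( ( $(date +%s) - 86400 ) * 1000 ))   # startedTimeBegin is in milliseconds

curl -s "$RM/ws/v1/cluster/apps?queue=$POOL&startedTimeBegin=$SINCE" \
  | jq -r '.apps.app[]? | "\(.user) \(.vcoreSeconds) \(.memorySeconds)"' \
  | awk '{ vcores[$1] += $2; mb[$1] += $3 }
         END { for (u in vcores)
                 printf "%s: %.1f avg cores, %.0f avg MB\n",
                        u, vcores[u] / 86400, mb[u] / 86400 }'
```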
Labels:
- Apache YARN
04-19-2016 03:29 PM
Hi dear experts! I'm trying to load data with the ImportTsv tool, like this:

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dmapreduce.job.reduces=1000 -Dimporttsv.columns="data:SS_SOLD_DATE_SK, HBASE_ROW_KEY" -Dimporttsv.separator="|" -Dimporttsv.bulk.output=/tmp/store_sales_hbase store_sales /user/root/benchmarks/bigbench/data/store_sales/*

but I get only one reducer (despite the -Dmapreduce.job.reduces=1000 setting). I even set mapreduce.job.reduces=1000 cluster-wide, but still get only one reducer. Could anybody hint at how to resolve this? Thank you in advance for any input!
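One thing worth trying (a sketch, under the assumption that in bulk-output mode the reducer count follows the number of regions in the target table rather than mapreduce.job.reduces; the split points below are placeholders, and this assumes the table can be (re)created pre-split):

```bash
# (Re)create the target table pre-split so the bulk-load job gets one reducer
# per region. Split points are hypothetical; pick ones that match the row-key space.
echo "create 'store_sales', 'data', {SPLITS => ['1000000', '2000000', '3000000']}" | hbase shell

# Then run the same bulk load; the reducer count should match the region count.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns="data:SS_SOLD_DATE_SK,HBASE_ROW_KEY" \
  -Dimporttsv.separator="|" \
  -Dimporttsv.bulk.output=/tmp/store_sales_hbase \
  store_sales /user/root/benchmarks/bigbench/data/store_sales/*
```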
Labels:
- Apache HBase
04-19-2016 11:17 AM
Hi dear experts! I'm trying to load data from CSV files on HDFS into HBase with ImportTsv (importtsv). It works perfectly fine when HBASE_ROW_KEY is a single CSV column, but I don't know how to create a composite HBASE_ROW_KEY (from two columns). For example, I have a CSV with 3 columns:

row1, 1, abc
row1, 2, dd
row2, 1, iop
row3, 1, kk

where a row can be uniquely identified by the first two columns. Any input will be highly appreciated!
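One workaround I'm considering (a sketch only; the paths, table name, and column family below are hypothetical) is to preprocess the file so that the two key columns are concatenated into a single field, and then point ImportTsv at that field as HBASE_ROW_KEY:

```bash
# Concatenate the first two CSV columns into one key field ("row1_1"), keep the
# value column, and write the result back to HDFS. Paths/names are placeholders.
hadoop fs -cat /data/input.csv \
  | awk -F', *' 'BEGIN { OFS = "," } { print $1 "_" $2, $3 }' \
  | hadoop fs -put -f - /data/input_with_key.csv

# Load it with the concatenated field as the row key.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.separator="," \
  -Dimporttsv.columns="HBASE_ROW_KEY,cf:val" \
  my_table /data/input_with_key.csv
```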
Labels:
- Apache HBase
01-22-2016 05:01 PM
Hi dear experts! I'm trying to export data with the sqoop.export.records.per.statement parameter, but for some reason Sqoop doesn't recognize it:

sqoop export --direct --connect jdbc:oracle:thin:@scaj43bda01:1521:orcl --username bds --password bds --table orcl_dpi --export-dir /tmp/dpi --input-fields-terminated-by ',' --lines-terminated-by '\n' -m 70 --batch -Dsqoop.export.records.per.statement=10000 -Dsqoop.export.statements.per.transaction=100

Warning: /opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p1168.923/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/01/22 19:59:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.1
16/01/22 19:59:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/01/22 19:59:38 ERROR tool.BaseSqoopTool: Error parsing arguments for export:
16/01/22 19:59:38 ERROR tool.BaseSqoopTool: Unrecognized argument: -Dsqoop.export.records.per.statement=10000
16/01/22 19:59:38 ERROR tool.BaseSqoopTool: Unrecognized argument: -Dsqoop.export.statements.per.transaction=100

I've tried removing the --direct flag (the target DB is Oracle), but it also doesn't help:

sqoop export --connect jdbc:oracle:thin:@host:1521:orcl --username user --password pass --table orcl_dpi --export-dir /tmp/dpi --input-fields-terminated-by ',' --lines-terminated-by '\n' -m 70 --batch -Dsqoop.export.records.per.statement=10000 -Dsqoop.export.statements.per.transaction=100

Warning: /opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p1168.923/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/01/22 20:00:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.1
16/01/22 20:00:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/01/22 20:00:29 ERROR tool.BaseSqoopTool: Error parsing arguments for export:
16/01/22 20:00:29 ERROR tool.BaseSqoopTool: Unrecognized argument: -Dsqoop.export.records.per.statement=10000
16/01/22 20:00:29 ERROR tool.BaseSqoopTool: Unrecognized argument: -Dsqoop.export.statements.per.transaction=100

Thank you!
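One thing worth trying (a sketch, assuming the generic Hadoop options given with -D have to come immediately after the tool name, before any Sqoop-specific arguments; host and credentials below are placeholders):

```bash
# Same export, but with the -D generic options moved to just after "sqoop export"
# so they are parsed as Hadoop generic arguments instead of unknown Sqoop flags.
sqoop export \
  -Dsqoop.export.records.per.statement=10000 \
  -Dsqoop.export.statements.per.transaction=100 \
  --connect jdbc:oracle:thin:@host:1521:orcl \
  --username user --password pass \
  --table orcl_dpi --export-dir /tmp/dpi \
  --input-fields-terminated-by ',' --lines-terminated-by '\n' \
  -m 70 --batch
```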
Labels:
01-13-2016 02:10 PM
So, I've started to play with this and ran into an interesting thing. When I process data compressed with LZMA, I read twice as much data as I actually have on HDFS. For example, the Hadoop client (hadoop fs -du) shows something like 100 GB; then I run an MR job (like a select count(1)) over this data, check the MR counters, and "HDFS bytes read" is twice as large (around 200 GB). With the gzip and bzip2 codecs, the Hadoop client file size and the MR counters are similar.