Member since 09-26-2015

48 Posts | 29 Kudos Received | 6 Solutions
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 7598 | 10-25-2016 12:53 PM |
| | 11030 | 10-22-2016 10:22 PM |
| | 5796 | 10-22-2016 09:34 PM |
| | 6703 | 10-21-2016 09:56 PM |
| | 2788 | 07-17-2016 05:26 PM |

07-14-2016 07:28 PM

@Krishna Srinivas Have you tried the Falcon mirroring feature? Instead of cluster-to-cluster replication, you can try replicating to different directories in the same cluster.

http://hortonworks.com/hadoop-tutorial/mirroring-datasets-between-hadoop-clusters-with-apache-falcon/
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_data_governance/content/section_mirroring_data_falcon.html
https://falcon.apache.org/HDFSDR.html
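For reference, a minimal sketch of driving the mirroring setup from the Falcon CLI; the entity file names and the feed name here are hypothetical placeholders, so see the tutorials above for the actual entity XML:

```bash
# Register the cluster entity first (replication within one cluster
# still needs the cluster definition registered with Falcon).
falcon entity -submit -type cluster -file primary-cluster.xml

# Submit and schedule a feed whose <locations> point at the source
# directory and whose replication target is a different path.
falcon entity -submitAndSchedule -type feed -file replication-feed.xml

# Check the feed's status once it is scheduled.
falcon entity -status -type feed -name replication-feed
```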
						
					
07-14-2016 03:59 PM

@ANSARI FAHEEM AHMED Are you referring to Hive/Tez job container sizes? If yes, you can go to the Hive CLI and try `set hive.tez.container.size;`, or if it is a MapReduce job, you can use the same set command for the MapReduce mapper or reducer memory properties (mapreduce.map.memory.mb / mapreduce.reduce.memory.mb). If it is the generic YARN container size for a particular YARN application, the containers are JVM processes: use the yarn application commands to get the application attempt ID, list the containers running for that attempt, and then `ps aux | grep <container pid>` should give you enough detail about the container size. A sketch of those commands follows below.
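A minimal sketch of that flow with the YARN CLI; the application, attempt, and container IDs below are hypothetical placeholders:

```bash
# List running YARN applications to find the application ID.
yarn application -list

# List the attempts for that application, then the containers per attempt.
yarn applicationattempt -list application_1468500000000_0001
yarn container -list appattempt_1468500000000_0001_000001

# On the container's node, inspect the JVM process behind it.
ps aux | grep container_1468500000000_0001_01_000002
```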
						
					
07-14-2016 03:43 PM

					
@ANSARI FAHEEM AHMED Are you referring to running jobs on the cluster, where the NameNode heap increases as they acquire resources on YARN? If so, your processes might be making a lot of NameNode requests under the hood, which could explain the growth in NameNode heap usage as well. The heap usage will eventually come down once garbage collection kicks in; you can watch that happen with jstat, as sketched below.
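A quick way to confirm the GC behavior, assuming shell access to the NameNode host (the pid lookup is a sketch):

```bash
# Find the NameNode JVM pid, then sample its heap/GC utilization
# every 5 seconds; old-gen usage should drop after each full GC.
NN_PID=$(pgrep -f 'org.apache.hadoop.hdfs.server.namenode.NameNode' | head -1)
jstat -gcutil "$NN_PID" 5000
```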
						
					
05-23-2016 02:33 PM

@Manoj Dhake Take a look at the link below for the Falcon bridge in Atlas 0.7:

http://atlas.incubator.apache.org/Bridge-Falcon.html

Hope this helps!
						
					
04-05-2016 05:41 PM

Is there any data encryption option for the Spark Thrift Server?
						
					
Labels:

01-13-2016 02:58 PM
3 Kudos

				
		
	
		
					
We need to set up an HDP cluster based on Isilon storage, and the customer is asking how much impact it would have on CPU usage on the Isilon nodes. The Isilon cluster is currently shared with other workloads as well. What are our experiences around this? Would the NameNode operations on Isilon cause a lot of CPU spikes that degrade the performance of the other workloads on Isilon?
						
					
Labels: Apache Hive

11-17-2015 07:46 PM

Trying to import table data from a Sybase table to Hive using the command below:

sqoop import --verbose --driver com.sybase.jdbc4.jdbc.SybDriver --connect jdbc:sybase:Tds:dbgbl-tst:8032/DATABASE=trim_bw --username hrongali -P --table trim_bw..account --hive-database trim_bw --hive-table account --hive-import -m 1

Sqoop is generating the alias below (AS trim_bw..account), which fails to execute in Sybase, and the following exception is thrown:

2015-11-17 14:29:48,511 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT col_1, col_2, col_3, col_4 FROM trim_bw..account AS trim_bw..account WHERE ( 1=1 ) AND ( 1=1 )
2015-11-17 14:29:48,514 ERROR [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception:
com.sybase.jdbc4.jdbc.SybSQLException: Incorrect syntax near '.'.
	at com.sybase.jdbc4.tds.Tds.processEed(Tds.java:4084)
	at com.sybase.jdbc4.tds.Tds.nextResult(Tds.java:3174)
	at com.sybase.jdbc4.tds.Tds.getResultSetResult(Tds.java:3940)
	at com.sybase.jdbc4.tds.TdsCursor.open(TdsCursor.java:328)
	at com.sybase.jdbc4.jdbc.SybStatement.executeQuery(SybStatement.java:2370)
	at com.sybase.jdbc4.jdbc.SybPreparedStatement.executeQuery(SybPreparedStatement.java:264)
	at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
	at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)

Note: tried without giving the database name in the --table parameter, but the table object is not recognized with that convention.
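One workaround worth sketching, since Sybase rejects the generated `AS trim_bw..account` alias: use Sqoop's `--query` option so the FROM clause is written by hand rather than generated. The connection details below are copied from the command above; the staging `--target-dir` path is a hypothetical placeholder (Sqoop requires one for free-form queries):

```bash
# Free-form query import: Sqoop substitutes its split predicate for
# $CONDITIONS, so no table alias is generated. With -m 1 no --split-by
# is needed. Single quotes keep the shell from expanding $CONDITIONS.
sqoop import --verbose \
  --driver com.sybase.jdbc4.jdbc.SybDriver \
  --connect jdbc:sybase:Tds:dbgbl-tst:8032/DATABASE=trim_bw \
  --username hrongali -P \
  --query 'SELECT * FROM trim_bw..account WHERE $CONDITIONS' \
  --target-dir /tmp/account_staging \
  --hive-database trim_bw --hive-table account --hive-import -m 1
```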
						
					
Labels: Apache Sqoop

11-12-2015 04:00 PM
3 Kudos

Sort-bucket the Hive table, read the bucketed Hive table in a MapReduce program, and hit HBase when the key changes. It requires programming effort, but it is very effective. Bucketing the Hive table makes sure that a particular key goes to only one bucket, so you hit HBase once per key. A sketch of the table definition is below.
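A minimal sketch of the bucketed, sorted table this relies on, issued through the Hive CLI; the table name, key column, and bucket count are hypothetical:

```bash
# Rows with the same account_key land in exactly one bucket and arrive
# sorted within it, so a reader sees each key change exactly once.
hive -e "
SET hive.enforce.bucketing=true;
CREATE TABLE accounts_bucketed (account_key STRING, payload STRING)
CLUSTERED BY (account_key) SORTED BY (account_key) INTO 32 BUCKETS;
"
```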
						
					
11-09-2015 04:58 PM
2 Kudos

The blog below provides a very good guideline too:

http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/
						
					
11-07-2015 09:35 PM
1 Kudo

Thanks Pardeep!
						
					