Member since: 05-30-2018
Posts: 1322
Kudos Received: 715
Solutions: 148
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4044 | 08-20-2018 08:26 PM |
| | 1943 | 08-15-2018 01:59 PM |
| | 2372 | 08-13-2018 02:20 PM |
| | 4104 | 07-23-2018 04:37 PM |
| | 5010 | 07-19-2018 12:52 PM |
07-12-2016
04:01 AM
1 Kudo
@pankaj chaturvedi
Inside your Pig script, add this: set exectype=tez;
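Equivalently, the execution engine can be selected when launching Pig; a minimal sketch (the script name is a placeholder):

```sh
# Run the script on the Tez execution engine instead of MapReduce
pig -x tez myscript.pig
```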
07-11-2016
05:56 PM
@mark doutre It is difficult to provide recommendations without knowing the use case. One thing that stands out to me with ASR is that the source data-generating app will have to adhere to a schema, which ends up being just another maintenance issue on the app side. I would rather have the data flow into NiFi and have it fork based on the type of data feed coming in.
07-11-2016
05:18 PM
With the Hive CLI I am able to execute an init script when I launch Hive. For example, my .hiverc file has add jar statements, and when I launch Hive it automatically executes all statements in the script. How can I do this with Beeline?
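For reference, Beeline accepts an init file with the -i flag, which runs every statement in the file before the session starts; a minimal sketch, where the JDBC URL and file path are placeholders:

```sh
# Execute the statements in init.sql (e.g., your ADD JAR lines) at startup;
# replace the host and file path with your own.
beeline -u jdbc:hive2://hs2-host:10000/default -i /home/user/init.sql
```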
Labels:
- Apache Hive
07-11-2016
04:54 AM
@Benjamin Leonhardi Did this work as expected? If we simply select one of the RMs in Falcon, will it fail over automatically to the secondary? I am trying to understand the impact if Falcon is pointing to an RM and that RM goes down.
07-11-2016
04:44 AM
Client, ApplicationMaster and NodeManager on RM failover

When there are multiple RMs, the configuration (yarn-site.xml) used by clients and nodes is expected to list all the RMs. Clients, ApplicationMasters (AMs) and NodeManagers (NMs) try connecting to the RMs in a round-robin fashion until they hit the Active RM. If the Active goes down, they resume the round-robin polling until they hit the "new" Active. This default retry logic is implemented as org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider. You can override the logic by implementing org.apache.hadoop.yarn.client.RMFailoverProxyProvider and setting the value of yarn.client.failover-proxy-provider to the class name.
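For illustration, a minimal yarn-site.xml fragment for this setup; the rm-ids and hostnames below are placeholders:

```xml
<!-- Enable RM HA and list all RMs so clients can round-robin between them -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master1.example.com</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>master2.example.com</value>
</property>
<!-- Default failover logic; point this at your own RMFailoverProxyProvider to override -->
<property>
  <name>yarn.client.failover-proxy-provider</name>
  <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
```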
07-11-2016
04:17 AM
1 Kudo
@Ahmad Debbas I have done this using Storm to parse emails/PDFs with Tika as documents land on HDFS. You can use the Storm HDFS spout (info here). Once the data is parsed, use another bolt to sink it into Solr. Pretty straightforward solution; a rough sketch of such a parsing bolt is below. NiFi is definitely a consideration as well: you would need to build a NiFi Tika processor, so that each event runs through the processor, the text is parsed, and the result flows into Solr. That could work too.
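A rough sketch of the parsing bolt, assuming the upstream spout emits a file path in a tuple field named "path" (the field name, class name, and downstream wiring are all assumptions, not the exact topology used):

```java
import java.io.File;

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
import org.apache.tika.Tika;

// Hypothetical bolt: extract plain text from a document with Tika and
// emit it for a downstream Solr indexing bolt.
public class TikaParseBolt extends BaseBasicBolt {

    private transient Tika tika;

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        if (tika == null) {
            tika = new Tika(); // lazy init: Tika is not serializable
        }
        String path = tuple.getStringByField("path"); // field name is an assumption
        try {
            String text = tika.parseToString(new File(path));
            collector.emit(new Values(path, text));
        } catch (Exception e) {
            // a real topology would route this to an error stream instead
            throw new RuntimeException("Tika parse failed for " + path, e);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("path", "text"));
    }
}
```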
07-11-2016
03:50 AM
1 Kudo
@SANTOSH DASH You can process data in Hadoop using many different services. If your data has a schema, you can start by processing it with Hive (full tutorial here). My preference is to do ELT logic with Pig (full tutorial here); there are many ways to skin a cat, and a minimal Pig sketch follows. The full list of tutorials is here.
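A minimal sketch of that kind of Pig ELT step; the input path and schema are invented for illustration:

```pig
-- Load delimited data with a declared schema; path and fields are placeholders
raw = LOAD '/data/events' USING PigStorage(',')
      AS (id:int, name:chararray, amount:double);

-- Simple transform: keep only rows above a threshold
big = FILTER raw BY amount > 100.0;

STORE big INTO '/data/events_big' USING PigStorage(',');
```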
07-11-2016
03:42 AM
@Kiran Jilla Are there any differences between pre-prod and prod? DB versions, services, Kerberos, etc.?
07-09-2016
04:26 AM
Duplicate question; answered here: https://community.hortonworks.com/questions/44208/hdfs-heterogeneous-storage-using-aws-s3-as-storage.html