Member since: 09-24-2015
Posts: 47
Kudos Received: 21
Solutions: 8
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 13599 | 06-07-2017 09:09 PM
 | 648 | 03-28-2017 04:46 PM
 | 676 | 12-08-2016 10:33 PM
 | 681 | 11-15-2016 05:41 PM
 | 2431 | 09-23-2016 04:26 PM
07-28-2017
05:17 PM
The sandbox is intended to be run on a desktop with a NAT network interface; it is not designed to sit on a server with multiple people accessing it, and using it that way will likely result in errors, warnings, difficulty accessing services, etc. For a "shared sandbox", the best option is probably to run the sandbox in a cloud environment such as AWS, which is described at https://community.hortonworks.com/articles/103754/hdp-sandbox-on-aws-1.html. If you'd still like to give it a try in your environment, just be aware that there are several ports that have to be forwarded in order to access the services / components of the Sandbox. Here are a couple of links that should help:
- Default Sandbox port forwards - https://hortonworks.com/tutorial/hortonworks-sandbox-guide/section/3/
- Port forwarding guide - https://hortonworks.com/tutorial/sandbox-port-forwarding-guide/
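If you do go down the port-forwarding route, a quick way to confirm from the client machine that a forwarded port is actually reachable is a simple TCP connection test. This is just a minimal sketch using the Python standard library; the host address and port list are placeholders for whatever forwards you configure.

```python
import socket

# Placeholder values: replace with your sandbox address and the ports you forwarded
SANDBOX_HOST = "192.168.10.150"
PORTS_TO_CHECK = [8080, 8888, 2222]  # e.g. Ambari UI, splash page, SSH

for port in PORTS_TO_CHECK:
    try:
        # A successful TCP connection means the forward is reachable from this machine
        with socket.create_connection((SANDBOX_HOST, port), timeout=5):
            print(f"{SANDBOX_HOST}:{port} is reachable")
    except OSError as err:
        print(f"{SANDBOX_HOST}:{port} is NOT reachable ({err})")
```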
07-27-2017
09:40 PM
1 Kudo
@Shubham Saxena Cloudbreak does support the "configurations" section; below is an example of a blueprint I have running in AWS right now. You might verify that you don't have incorrect or extraneous characters in there somewhere, and that the blueprint is otherwise formatted correctly. For proper formatting and structure, reviewing https://cwiki.apache.org/confluence/display/AMBARI/Blueprints#Blueprints-BlueprintStructure might be helpful.
{
"host_groups": [
{
"name": "host_group_master_1",
"configurations": [],
"components": [
{
"name": "ZOOKEEPER_SERVER"
},
{
"name": "HISTORYSERVER"
},
{
"name": "OOZIE_CLIENT"
},
{
"name": "NAMENODE"
},
{
"name": "OOZIE_SERVER"
},
{
"name": "HDFS_CLIENT"
},
{
"name": "YARN_CLIENT"
},
{
"name": "FALCON_SERVER"
},
{
"name": "METRICS_MONITOR"
},
{
"name": "MAPREDUCE2_CLIENT"
}
],
"cardinality": "1"
},
{
"name": "host_group_master_2",
"configurations": [],
"components": [
{
"name": "ZOOKEEPER_SERVER"
},
{
"name": "PIG"
},
{
"name": "ZOOKEEPER_CLIENT"
},
{
"name": "HIVE_SERVER"
},
{
"name": "METRICS_MONITOR"
},
{
"name": "TEZ_CLIENT"
},
{
"name": "HIVE_METASTORE"
},
{
"name": "HDFS_CLIENT"
},
{
"name": "YARN_CLIENT"
},
{
"name": "MYSQL_SERVER"
},
{
"name": "MAPREDUCE2_CLIENT"
},
{
"name": "RESOURCEMANAGER"
},
{
"name": "WEBHCAT_SERVER"
}
],
"cardinality": "1"
},
{
"name": "host_group_master_3",
"configurations": [],
"components": [
{
"name": "ZOOKEEPER_SERVER"
},
{
"name": "APP_TIMELINE_SERVER"
},
{
"name": "TEZ_CLIENT"
},
{
"name": "HBASE_MASTER"
},
{
"name": "HBASE_CLIENT"
},
{
"name": "HDFS_CLIENT"
},
{
"name": "METRICS_MONITOR"
},
{
"name": "SECONDARY_NAMENODE"
}
],
"cardinality": "1"
},
{
"name": "host_group_client_1",
"configurations": [],
"components": [
{
"name": "ZOOKEEPER_CLIENT"
},
{
"name": "PIG"
},
{
"name": "OOZIE_CLIENT"
},
{
"name": "HBASE_CLIENT"
},
{
"name": "HCAT"
},
{
"name": "KNOX_GATEWAY"
},
{
"name": "METRICS_MONITOR"
},
{
"name": "FALCON_CLIENT"
},
{
"name": "TEZ_CLIENT"
},
{
"name": "SLIDER"
},
{
"name": "SQOOP"
},
{
"name": "HDFS_CLIENT"
},
{
"name": "HIVE_CLIENT"
},
{
"name": "YARN_CLIENT"
},
{
"name": "METRICS_COLLECTOR"
},
{
"name": "MAPREDUCE2_CLIENT"
}
],
"cardinality": "1"
},
{
"name": "host_group_slave_1",
"configurations": [],
"components": [
{
"name": "HBASE_REGIONSERVER"
},
{
"name": "NODEMANAGER"
},
{
"name": "METRICS_MONITOR"
},
{
"name": "DATANODE"
}
]
}
],
"Blueprints": {
"blueprint_name": "hdp-small-default",
"stack_name": "HDP",
"stack_version": "2.6"
}
}
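Note that the "configurations" arrays in the blueprint above are empty. When you do populate one, each entry is an object keyed by a config type that wraps a "properties" map. The snippet below is only a sketch: the yarn-site property and value are illustrative, and the blueprint.json file name is assumed. Round-tripping the blueprint through a JSON parser like this is also an easy way to surface the kind of stray or extraneous characters mentioned above.

```python
import json

# Illustrative host-group "configurations" entry: a config type ("yarn-site")
# wrapping a "properties" map. The property name and value are examples only.
host_group_configurations = [
    {
        "yarn-site": {
            "properties": {
                "yarn.nodemanager.resource.memory-mb": "8192"
            }
        }
    }
]

# Parse the blueprint file; stray characters or malformed JSON will raise a
# json.JSONDecodeError that points at the offending position.
with open("blueprint.json") as f:  # assumed file name
    blueprint = json.load(f)

# Attach the example configuration to the first host group and pretty-print
# the result so it can be compared against the original file.
blueprint["host_groups"][0]["configurations"] = host_group_configurations
print(json.dumps(blueprint, indent=2))
```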
07-27-2017
09:09 PM
@Wendy Lam Can you detail what sort of setup you have? In general, the Sandbox runs as a completely standalone environment within either VirtualBox or VMware, and there should not be a need to configure ports or IP addresses. For example, if you import the Sandbox into VMware using the directions at https://hortonworks.com/tutorial/sandbox-deployment-and-install-guide/section/2/, once the Sandbox starts up you will see a screen that says something along the lines of "To initiate your Hortonworks Sandbox session, please open a browser and enter this address in the browser's address field: http://192.168.10.150:8888/". At that point you should be able to put that address in a browser and connect to the Sandbox. If you are running a firewall of some kind on your local PC, try temporarily disabling it to see if that resolves the problem.
07-20-2017
07:21 PM
@umair ahmed The hostname would be the actual host name of the Exchange server. According to the documentation: "Network address of Email server (e.g., pop.gmail.com, imap.gmail.com . . .)". Hope this helps, and please accept the answer if it was useful.
06-08-2017
07:42 PM
@Ir Mar Starting with HDP 2.6, you can use Workflow Designer to design and schedule workflows, including Spark jobs. Documentation is at https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_workflow-management/content/ch_wfm_basics.html. Alternatively, you can use Oozie to schedule Spark workflows; details can be found at https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/ch_oozie-spark-action.html. Hope this helps, and please remember to upvote / accept the answer if you found this useful.
06-07-2017
09:09 PM
1 Kudo
@bigdata.neophyte Here are a few answers for you:
- NiFi can be interacted with via the UI as well as its REST API. The API is documented at https://nifi.apache.org/docs/nifi-docs/rest-api/index.html (a minimal example of calling it is sketched below).
- NiFi is primarily a data flow tool, whereas Kafka is a broker for a pub/sub type of use pattern. Kafka is frequently used as the backing mechanism for NiFi flows in a pub/sub architecture, so while they work well together, they provide two different functions in a given solution. NiFi has a visual command and control mechanism, while Kafka does not have a native command and control GUI.
- Apache Atlas, Kafka, and NiFi can all work together to provide a comprehensive lineage / governance solution. There is a high-level architecture slide at https://hortonworks.com/apache/atlas/#section_2 as well as a tutorial that might help this make more sense at https://hortonworks.com/hadoop-tutorial/cross-component-lineage-apache-atlas/.
- Data prioritization, back pressure, and balancing latency and throughput are all among NiFi's strong points and can be leveraged easily; Kafka does not really provide data prioritization.
- Security aspects of both Kafka and NiFi are tightly integrated with Apache Ranger; take a look at https://hortonworks.com/apache/ranger/ for additional details.

Hope this helps, and please accept the answer if this was helpful.
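As a small illustration of the REST API point above, here is a hedged sketch of calling one NiFi endpoint from Python. It assumes an unsecured NiFi instance on localhost:8080 and uses the requests library; a secured instance would additionally need TLS and authentication, and the full set of endpoints is in the REST API documentation linked above.

```python
import requests

# Assumed base URL for an unsecured NiFi instance; adjust host/port as needed.
NIFI_API = "http://localhost:8080/nifi-api"

# System diagnostics is a read-only endpoint that makes a handy health check.
resp = requests.get(f"{NIFI_API}/system-diagnostics", timeout=10)
resp.raise_for_status()

# Print the aggregate snapshot (heap utilization, processor counts, etc.).
snapshot = resp.json()["systemDiagnostics"]["aggregateSnapshot"]
print(snapshot)
```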
06-02-2017
04:16 PM
@Badshah Rehman There is a great article on NiFi performance that covers several tuning aspects, including disk partitioning, at https://community.hortonworks.com/articles/7882/hdfnifi-best-practices-for-setting-up-a-high-perfo.html. I think it should provide you with all of the info you need.
05-31-2017
08:38 PM
1 Kudo
@Saikrishna Tarapareddy Yes, you should be able to. Take a look at this HCC article and see if it helps: https://community.hortonworks.com/articles/98394/accessing-data-from-osi-softs-pi-system.html.
05-31-2017
05:55 PM
@Naveen Keshava It is possible to use S3 as the storage for Hive; for example usage, refer to the documentation at https://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.14.1/bk_hdcloud-aws/content/s3-hive/index.html.
05-30-2017
06:16 PM
@Sunil Neurgaonkar NiFi does not support this right now, but you might look at something like putting a proxy or a load balancer in front of NiFi that can remap the URL as needed.