Created on 11-03-2015 04:19 AM
Exploring Apache Flink with HDP
Apache Flink is an open source platform for distributed stream and batch data processing. More details on Flink and how it is being used in the industry today are available here: http://flink-forward.org/?post_type=session. There are a few ways you can explore Flink on HDP 2.3:
1. Compilation on HDP 2.3.2
To compile Flink from source on HDP 2.3 you can use these commands:
curl -o /etc/yum.repos.d/epel-apache-maven.repo https://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo
yum -y install apache-maven-3.2*
git clone https://github.com/apache/flink.git
cd flink
mvn clean install -DskipTests -Dhadoop.version=2.7.1.2.3.2.0-2950 -Pvendor-repos
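The value given to -Dhadoop.version has to match the Hadoop build on your cluster, so it is worth reading it off the node rather than copying it. Here is a minimal sketch, assuming `hadoop version` prints `Hadoop <version>` on its first line; the helper name `hadoop_build_version` is made up for illustration.

```shell
# Hypothetical helper: pull the version string out of `hadoop version` output.
# It reads from stdin so it can be tried without a cluster.
hadoop_build_version() {
  awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

# Illustrative sample of the first line printed on an HDP 2.3.2 node.
printf 'Hadoop 2.7.1.2.3.2.0-2950\n' | hadoop_build_version

# On a real node (assumption: the hadoop client is on the PATH):
# mvn clean install -DskipTests -Dhadoop.version="$(hadoop version | hadoop_build_version)" -Pvendor-repos
```

If the helper prints nothing, the output did not start with a `Hadoop` line and the version should be looked up by hand.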
Note that with this option I ran into a classpath bug and raised it here: https://issues.apache.org/jira/browse/FLINK-3032
2. Run using precompiled tarball
wget http://www.gtlib.gatech.edu/pub/apache/flink/flink-0.9.1/flink-0.9.1-bin-hadoop27.tgz
tar xvzf flink-0.9.1-bin-hadoop27.tgz
cd flink-0.9.1
export HADOOP_CONF_DIR=/etc/hadoop/conf
./bin/yarn-session.sh -n 1 -jm 768 -tm 1024
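The YARN session relies on HADOOP_CONF_DIR pointing at the cluster's client configuration. A small pre-flight check can catch a wrong path before the session is launched. This is only a sketch: the function name `has_yarn_conf` is hypothetical, and treating the presence of yarn-site.xml as "good enough" is my assumption.

```shell
# Hypothetical helper: succeed if the directory looks like a Hadoop client
# config directory (assumption: it should contain yarn-site.xml).
has_yarn_conf() {
  [ -n "$1" ] && [ -f "$1/yarn-site.xml" ]
}

if has_yarn_conf "${HADOOP_CONF_DIR:-}"; then
  echo "HADOOP_CONF_DIR looks usable: $HADOOP_CONF_DIR"
else
  echo "HADOOP_CONF_DIR is unset or has no yarn-site.xml" >&2
fi
```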
3. Using Ambari service (demo purposes only for now)
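The Ambari service lets you easily install/compile Flink on HDP 2.3. The steps below assume the HDP sandbox VM.
Add a hosts entry for the sandbox to /etc/hosts on your machine (replace the IP address with that of your VM):
192.168.191.241 sandbox.hortonworks.com sandbox
Connect to the sandbox over SSH:
ssh root@sandbox.hortonworks.com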
Download the Flink service definition into the Ambari stack directory for your HDP version:
VERSION=$(hdp-select status hadoop-client | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/')
sudo git clone https://github.com/abajwa-hw/ambari-flink-service.git /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/FLINK
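If the service does not show up in Ambari later, the first thing to check is what VERSION resolved to, since it decides the directory the service is cloned into. The sketch below runs the same sed expression against a sample line so you can see what it keeps; the function name `hdp_stack_version` and the sample input are only for illustration.

```shell
# Same sed expression as above, wrapped so it can be fed sample input.
# It keeps only the major.minor part of the version reported by hdp-select.
hdp_stack_version() {
  sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/'
}

# Illustrative line in the format printed by `hdp-select status hadoop-client`.
echo 'hadoop-client - 2.3.2.0-2950' | hdp_stack_version   # prints 2.3

# On a real node:
# hdp-select status hadoop-client | hdp_stack_version
```

The result has to match an existing directory under /var/lib/ambari-server/resources/stacks/HDP/.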
Restart Ambari so that it picks up the new service definition:
#sandbox
service ambari restart
#non sandbox
sudo service ambari-server restart
On bottom left -> Actions -> Add service -> check Flink server -> Next -> Next -> Change any config you like (e.g. install dir, memory sizes, num containers or values in flink-conf.yaml) -> Next -> Deploy
Once installed, the service can also be checked, started and stopped through the Ambari REST API:
export SERVICE=FLINK
export PASSWORD=admin
export AMBARI_HOST=localhost
#detect name of cluster
output=$(curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' http://$AMBARI_HOST:8080/api/v1/clusters)
CLUSTER=$(echo $output | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p')
#get service status
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X GET http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
#start service
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
#stop service
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
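The cluster name detection above is the step most likely to fail silently, because an empty CLUSTER produces a malformed URL in every later call. The sketch below isolates that step so it can be tried against canned input. The function names and the sample response are made up for illustration; the sample only mimics the `"cluster_name" : "..."` field that the sed expression looks for.

```shell
# Hypothetical helper: print the first cluster name found in an Ambari
# /api/v1/clusters response read from stdin.
extract_cluster_name() {
  sed -n 's/.*"cluster_name" : "\([^"]*\)".*/\1/p' | head -n 1
}

# Illustrative response body (not captured from a real server).
sample_clusters_response() {
  cat <<'EOF'
{
  "items" : [
    {
      "Clusters" : {
        "cluster_name" : "Sandbox"
      }
    }
  ]
}
EOF
}

CLUSTER=$(sample_clusters_response | extract_cluster_name)
if [ -z "$CLUSTER" ]; then
  echo "could not detect cluster name" >&2
else
  echo "cluster: $CLUSTER"
fi

# Against a real server, reusing the variables exported above:
# CLUSTER=$(curl -s -u admin:$PASSWORD -H 'X-Requested-By: ambari' http://$AMBARI_HOST:8080/api/v1/clusters | extract_cluster_name)
```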
Run the WordCount example as the flink user:
su flink
export HADOOP_CONF_DIR=/etc/hadoop/conf
cd /opt/flink
./bin/flink run ./examples/flink-java-examples-0.9.1-WordCount.jar
To remove the service, delete it through the REST API:
export SERVICE=FLINK
export PASSWORD=admin
export AMBARI_HOST=localhost
#detect name of cluster
output=$(curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' http://$AMBARI_HOST:8080/api/v1/clusters)
CLUSTER=$(echo $output | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p')
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X DELETE http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
#if above errors out, run below first to fully stop the service
#curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
rm -rf /opt/flink*
rm /tmp/flink.tgz
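Since the DELETE call can error out while the service is still running, it can help to look at the reported state first. The sketch below pulls the `state` field out of a service status response. The function names and the sample response are made up for illustration, and the exact set of fields Ambari returns is an assumption on my part.

```shell
# Hypothetical helper: print the service state from a
# /api/v1/clusters/<cluster>/services/<service> response read from stdin.
# The leading quote in the pattern keeps "maintenance_state" from matching.
service_state() {
  sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Illustrative response body (not captured from a real server).
sample_service_response() {
  cat <<'EOF'
{
  "ServiceInfo" : {
    "cluster_name" : "Sandbox",
    "maintenance_state" : "OFF",
    "service_name" : "FLINK",
    "state" : "STARTED"
  }
}
EOF
}

state=$(sample_service_response | service_state)
if [ "$state" = "INSTALLED" ]; then
  echo "service is stopped, DELETE should go through"
else
  echo "service is in state $state, stop it before DELETE"
fi
```

Against a real server, pipe the output of the "get service status" curl call (without -i) into `service_state` instead of the sample.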
Created on 10-18-2016 05:32 PM
Hello,
After downloading the ambari-flink-service I placed it in /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/FLINK and restarted Ambari. But when I go to Actions > Add Service, Flink doesn't appear in the list. What could be the problem?
Thank you and regards,
Pedro Chaves
Created on 11-04-2016 07:12 PM
Had an issue with https://community.hortonworks.com/questions/54894/problem-when-i-install-flink-in-hortonworks.html on HDP 2.4. Have a fix - can give pull request if you want - Hananiel
Created on 10-24-2017 09:35 AM
Hi! When could we expect a stable, non-demo Ambari service for Flink that could be installed not only on your sandbox but also on real Hadoop infrastructure? I haven't found it in your roadmap.
Thanks in advance!
Andrey
Created on 09-06-2018 09:42 AM
Thanks, it is OK.