Member since: 06-05-2019
Posts: 128
Kudos Received: 133
Solutions: 11

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1792 | 12-17-2016 08:30 PM
 | 1334 | 08-08-2016 07:20 PM
 | 2375 | 08-08-2016 03:13 PM
 | 2475 | 08-04-2016 02:49 PM
 | 2280 | 08-03-2016 06:29 PM
12-17-2016
08:14 PM
Hi @Nick Pileggi, can you verify that the user trying to read the file has the "Decrypt EEK" permission in Ranger KMS? You can use my article here as a reference. Also, is your cluster Kerberized?
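A quick way to reproduce the check from the command line (a minimal sketch; the principal, keytab, and encryption-zone path are placeholders for your environment):

```bash
# Authenticate as the user who is failing to read the file
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM

# List encryption zones (requires the HDFS superuser) to confirm the path is inside a zone
hdfs crypto -listZones

# Try to read a file inside the encryption zone; a missing "Decrypt EEK"
# policy in Ranger KMS typically surfaces as an authorization error here
hdfs dfs -cat /secure_zone/sample.txt
```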
12-17-2016
07:50 PM
2 Kudos
Hi @rudra prasad biswas,
Great questions - don't worry about column families and column qualifiers, because Phoenix interacts with HBase and handles all of this for you automatically. Instead, simply create your table: CREATE TABLE mytable (id integer, first_name varchar, last_name varchar CONSTRAINT my_pk PRIMARY KEY (id)); Phoenix will create this table structure on top of HBase, automatically creating the column families and qualifiers. Phoenix also has the concept of dynamic columns, which lets you upsert additional columns at runtime - take a look at this documentation. If you'd like to see how Phoenix uses HBase to create column families and column qualifiers, I'd recommend reviewing the audit log in Ranger to see how they are being created.
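As an illustration of dynamic columns, here is a minimal sketch run through sqlline.py (the ZooKeeper connection string, the client path, and the middle_name column are hypothetical examples, not part of the original answer):

```bash
# Write the statements to a file and run them with the Phoenix client
cat > /tmp/dynamic_columns.sql <<'EOF'
-- Upsert a row that includes a dynamic column (middle_name) not present in the table DDL
UPSERT INTO mytable (id, first_name, last_name, middle_name VARCHAR) VALUES (1, 'John', 'Doe', 'Quincy');
-- Query the dynamic column by declaring it again at read time
SELECT id, first_name, middle_name FROM mytable (middle_name VARCHAR);
EOF

/usr/hdp/current/phoenix-client/bin/sqlline.py zk-host1:2181:/hbase-unsecure /tmp/dynamic_columns.sql
```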
11-30-2016
01:18 AM
5 Kudos
Prerequisites

1) The Ambari Infra service is installed - Ranger will use Ambari Infra's SolrCloud for Ranger audit.
2) MySQL is installed and running (I'll use Hive's Metastore MySQL instance; MySQL is one of many DB options).

Installing Apache Ranger using Ambari Infra (SolrCloud) for Ranger Audit

1) Find the location of mysql-connector-java.jar (assume /usr/share/java/mysql-connector-java.jar) and run the following command on the Ambari server: sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar (see the consolidated shell sketch after this list).
2) In Ambari, click Add Service.
3) Choose Ranger and click Next.
4) Choose "I have met all the requirements above." and click Proceed (this was done in step 1 above).
5) Assign master(s) for "Ranger Usersync" and "Ranger Admin" and click Next.
6) Assign Slaves and Clients - since we did not install Apache Atlas, Ranger TagSync is not required - and click Next.
7) Customize Services -> Ranger Audit: click "OFF" to enable SolrCloud (screenshots showed the toggle before and after clicking).
8) Customize Services -> Ranger Admin: enter the DB host you chose in "Ranger DB host" (in my case, MySQL) and a password in "Ranger DB password" for the user rangeradmin (Ranger will automatically add the user "rangeradmin"). Also add the credentials for a DB user with administrator privileges; this administrator will create the rangeradmin user and the Ranger tables. To create an administrator user in MySQL (note: rcicak2.field.hortonworks.com is the server where Ranger is being installed): CREATE USER 'ryan'@'rcicak2.field.hortonworks.com' IDENTIFIED BY 'lebronjamesisawesome'; GRANT ALL PRIVILEGES ON *.* TO 'ryan'@'rcicak2.field.hortonworks.com' WITH GRANT OPTION; Click Next.
9) Review -> click Deploy. "Install, Start and Test" will show you the progress of the Ranger installation.
10) Choose Ranger in Ambari.
11) Choose "Configs" and "Ranger Plugin" and select the services you'd like Ranger to authorize (you'll need to restart each service after saving changes).
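The pre-wizard work in steps 1 and 8 can be scripted ahead of time. A minimal sketch, using the example user, password, and hostname from above (adjust them for your environment):

```bash
# Point Ambari at the MySQL JDBC driver (step 1)
ls -l /usr/share/java/mysql-connector-java.jar
sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar

# Create the administrator account that the Ranger install will use (step 8)
mysql -u root -p <<'EOF'
CREATE USER 'ryan'@'rcicak2.field.hortonworks.com' IDENTIFIED BY 'lebronjamesisawesome';
GRANT ALL PRIVILEGES ON *.* TO 'ryan'@'rcicak2.field.hortonworks.com' WITH GRANT OPTION;
FLUSH PRIVILEGES;
EOF
```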
11-22-2016
12:57 PM
11 Kudos
The goal of this article is to ingest log data from multiple servers (running MiNiFi) that push their logs to a NiFi cluster. The NiFi cluster listens for the log data on an input port and routes it to an HDFS directory determined by the host name. This article assumes you are using Ambari for NiFi installation and administration.

1) Set nifi.remote.input.host and nifi.remote.input.socket.port on the NiFi cluster

Reason: This allows the NiFi cluster to expose an Input Port that MiNiFi can push to through a Remote Process Group; NiFi will listen on this Input Port (a quick verification sketch follows these substeps).

a) In Ambari, go to NiFi.
b) Choose the Configs tab.
c) Choose Advanced nifi-properties.
d) Set nifi.remote.input.host to your NiFi hostname and nifi.remote.input.socket.port to 10000.
e) Restart NiFi using Ambari.
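To confirm the settings took effect after the restart, you can check the rendered nifi.properties on a cluster node. A minimal sketch; the config path and the example values shown are assumptions for an Ambari-managed install:

```bash
# Verify the remote input settings that MiNiFi's Remote Process Group will use
grep -E 'nifi\.remote\.input\.(host|socket\.port)' /usr/hdf/current/nifi/conf/nifi.properties

# Expected to show something like (example values):
# nifi.remote.input.host=nifi-node1.field.hortonworks.com
# nifi.remote.input.socket.port=10000
```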
2) In NiFi, create a flow for the incoming log data (listening on the input port for MiNiFi data)

Reason: Listen for incoming log data and route it to an HDFS directory.

a) On the NiFi flow canvas, drag and drop an Input Port (name your Input Port - I named mine "listen_for_minifi_logs").
b) Drag and drop the RouteOnAttribute processor.
c) Create a connection between the Input Port and RouteOnAttribute.
d) Configure RouteOnAttribute - Properties (removing the red caution), adding two properties, one per server you're installing MiNiFi on. I have two servers - rcicak0.field.hortonworks.com and rcicak1.field.hortonworks.com; the incoming log data (flowfile) will contain an attribute called "host_name", and we'll route the flowfile based on that attribute.
e) Drag and drop three PutHDFS processors and create connections using hostname_rcicak0, hostname_rcicak1, and unmatched.
f) Each PutHDFS processor will write to a different HDFS directory - configure the properties of each PutHDFS processor: /tmp/rcicak0/, /tmp/rcicak1/, and /tmp/unmatched.
g) Configure the HDFS directory properties (adding a core-site.xml and the directory, depending on the connection).
h) Start ("play") the processors - at this point, the NiFi flow is ready to receive log data from MiNiFi.

3) Set up MiNiFi on at least one server

Reason: MiNiFi needs to push the log data to a Remote Process Group and delete the log file (the shell commands for the MiNiFi-side steps are consolidated in the sketch after this list).

a) Download MiNiFi (from http://hortonworks.com/downloads/) on each of the servers that contain log data.
b) Unzip minifi-0.0.1-bin.zip to a directory.
c) Complete step 4 below before continuing to d.
d) Using an account that has read/write permission to the log data directory (to read the files and delete them), run <install location>/minifi-0.0.1/bin/minifi.sh start.

4) Using a process group in NiFi, create the MiNiFi flow (pushing log data to a Remote Process Group)

Reason: Push the log data to a Remote Process Group and delete the log file.

a) Create a process group (call the group "minifi_flow").
b) Go into the process group "minifi_flow".
c) Drag and drop the GetFile processor.
d) Configure the GetFile processor - Properties. IMPORTANT: any file matching the File Filter's regular expression under the Input Directory (and its subdirectories when Recurse Subdirectories is set to true) will be deleted once the file is stored in MiNiFi's content repository. In the example above, the file filter matches hdfs-audit.log.<archive suffix>, which in this case is a date.
e) Drag and drop the UpdateAttribute processor and create a "success" connection between GetFile and UpdateAttribute.
f) Configure the UpdateAttribute processor - Properties, adding a host_name attribute whose value uses the NiFi Expression Language to obtain the hostname.
g) Drag and drop a Remote Process Group - use the nifi.remote.input.host from above for the URL. Wait for the connection to establish before continuing to h.
h) Add a connection between UpdateAttribute and the Remote Process Group - under "To Input", choose listen_for_minifi_logs.
i) Select all processors and relationships and create a template (download the template's XML file).
j) Use the minifi-toolkit (https://www.apache.org/dyn/closer.lua?path=/nifi/minifi/0.0.1/minifi-toolkit-0.0.1-bin.zip) and run the config tool: "config.sh transform minifi_flow_template.xml config.yml", which converts the template XML into the YML file read by MiNiFi.
k) Copy the config.yml file into the minifi-0.0.1/conf directory on each of the MiNiFi servers (if your MiNiFi agent is already started, restart the agent).
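Steps 3 and 4(j)-(k) on the MiNiFi side boil down to a handful of shell commands. A minimal sketch - the install location under /opt, the template file name, and the server list are assumptions taken from the examples above:

```bash
# On each server that produces log data: unpack MiNiFi (step 3b)
unzip minifi-0.0.1-bin.zip -d /opt

# On the machine with the minifi-toolkit: convert the downloaded NiFi template
# (step 4j) into the YAML config that MiNiFi reads
unzip minifi-toolkit-0.0.1-bin.zip -d /opt
/opt/minifi-toolkit-0.0.1/bin/config.sh transform minifi_flow_template.xml config.yml

# Copy the generated config to every MiNiFi server and (re)start the agent (steps 4k and 3d)
for host in rcicak0.field.hortonworks.com rcicak1.field.hortonworks.com; do
  scp config.yml "${host}:/opt/minifi-0.0.1/conf/config.yml"
  ssh "${host}" '/opt/minifi-0.0.1/bin/minifi.sh stop; /opt/minifi-0.0.1/bin/minifi.sh start'
done
```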
11-15-2016
12:41 AM
Please use the XSLT code located here - it is more complete than the XSLT code provided
10-03-2016
05:17 PM
2 Kudos
If you've received the error exitCode=7 after enabling Kerberos, you are hitting this Jira bug. Notice the bug outlines the issue but does not outline a solution. The good news is the solution is simple, as I'll document below. Problem: If you've enabled Kerberos through Ambari, you'll get through around 90-95% of the last step "Start and Test Services" and then receive the error: 16/09/26 23:42:49 INFO mapreduce.Job: Running job: job_1474928865338_0022
16/09/26 23:42:55 INFO mapreduce.Job: Job job_1474928865338_0022 running in uber mode : false
16/09/26 23:42:55 INFO mapreduce.Job: map 0% reduce 0%
16/09/26 23:42:55 INFO mapreduce.Job: Job job_1474928865338_0022 failed with state FAILED due to: Application application_1474928865338_0022 failed 2 times due to AM Container for appattempt_1474928865338_0022_000002 exited with
exitCode: 7
For more detailed output, check application tracking page:
http://master2.fqdn.com:8088/cluster/app/application_1474928865338_0022
Then, click on links to logs of each attempt.Diagnostics: Exception from container-launch.
Container id: container_e05_1474928865338_0022_02_000001
Exit code: 7
Stack trace: ExitCodeException exitCode=7:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
at org.apache.hadoop.util.Shell.run(Shell.java:487)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:371)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Shell output: main : command provided 1
main : run as user is ambari-qa
main : requested yarn user is ambari-qa
Container exited with a non-zero exit code 7
Failing this attempt. Failing the application. You'll notice that running "Service Checks" for Tez, MapReduce2, YARN, Pig (any service that creates a YARN container) fails with exitCode=7. This happens because the mounts backing YARN's local-dirs likely have the "noexec" flag set, meaning the binaries placed in those directories cannot be executed. Solution: Open /etc/fstab (with the proper permissions), remove the noexec flag from all mounted drives listed under YARN's "local-dirs", and then either remount the filesystems or reboot the machine - problem solved.
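For reference, here is a minimal sketch of checking and fixing the mount options; the mount point /hadoop/yarn/local is a placeholder - use whatever yarn.nodemanager.local-dirs points to in your cluster:

```bash
# Show which mounts currently carry the noexec option
mount | grep noexec

# After removing "noexec" from the matching entries in /etc/fstab,
# remount the affected filesystem in place instead of rebooting
sudo mount -o remount,exec /hadoop/yarn/local
```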
09-24-2016
12:24 AM
1 Kudo
You may be in a bind if you need to install HDP on Azure with CentOS 6 or RHEL 6 and only certain services (not everything). By following the steps below, you will be able to use ambari-server to install HDP on any of the supported Hortonworks/Azure VMs.

1) Configure your VMs - use the same VNet for all VMs. Run the remaining steps as root, or sudo the commands.

2) Update /etc/hosts on all your machines (vi /etc/hosts):
172.1.1.0 master1.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.1 master2.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.2 master3.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.3 worker1.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.4 worker2.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.5 worker3.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
* Use the FQDN (find it by typing hostname -f). The IP addresses are internal and can be found on eth0 by typing ifconfig.

3) Edit /etc/sudoers.d/waagent so that you don't need to type a password when sudoing
a) Change permissions on /etc/sudoers.d/waagent: chmod 600 /etc/sudoers.d/waagent
b) Update the file, changing "username ALL = (ALL) ALL" to "username ALL = (ALL) NOPASSWD: ALL": vi /etc/sudoers.d/waagent
c) Change permissions back on /etc/sudoers.d/waagent: chmod 440 /etc/sudoers.d/waagent
* Change username to the user that you sudo with (the user that will install Ambari).

4) Disable iptables
a) service iptables stop
b) chkconfig iptables off
* If you need iptables enabled, please make the necessary port configuration changes found here.

5) Disable transparent huge pages
a) Run the following in your shell: cat > /usr/local/sbin/ambari-thp-disable.sh <<-'EOF'
#!/usr/bin/env bash
# disable transparent huge pages: for Hadoop
thp_disable=true
if [ "${thp_disable}" = true ]; then
for path in redhat_transparent_hugepage transparent_hugepage; do
for file in enabled defrag; do
if test -f /sys/kernel/mm/${path}/${file}; then
echo never > /sys/kernel/mm/${path}/${file}
fi
done
done
fi
exit 0
EOF
b) chmod 755 /usr/local/sbin/ambari-thp-disable.sh
c) sh /usr/local/sbin/ambari-thp-disable.sh
* Perform a-c on all hosts to disable transparent huge pages.

6) If you don't have a private key generated (so that the host running ambari-server can use a private key to log in to all the hosts), perform this step
a) ssh-keygen -t rsa -b 2048 -C "username@master1.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net"
b) ssh-copy-id -i /locationofgeneratedinaabove/id_rsa.pub username@master1
* Run b for all hosts; this way you can ssh as the username into every host from the ambari-server host without a password (a quick verification sketch follows this list).

7) Install the Ambari repo on the server where you'll install Ambari (documentation): wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0/ambari.repo -O /etc/yum.repos.d/ambari.repo

8) Install ambari-server: yum install ambari-server

9) Setup ambari-server: ambari-server setup
* You can use the defaults by pressing ENTER.

10) Start ambari-server: ambari-server start
* This could take a few minutes to start up depending on the speed of your machine.

11) Open your browser and go to the IP address where ambari-server is running: http://ambariipaddress:8080
* Continue with your HDP 2.4.3 installation.
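Before starting the cluster install wizard, it can help to confirm that passwordless SSH and FQDN resolution work from the ambari-server host. A minimal sketch; the host names and user match the examples above and should be replaced with your own:

```bash
# Run from the host where ambari-server is installed
for host in master1 master2 master3 worker1 worker2 worker3; do
  fqdn="${host}.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net"
  # Should print each node's FQDN without prompting for a password
  ssh -o BatchMode=yes "username@${fqdn}" 'hostname -f'
done
```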
09-14-2016
03:12 AM
1 Kudo
When using a processor that calls an external API, is there a way to set a timeout (for example, if the API never responds, wait for at most one minute)? Or would the processor just continue to wait for a response?
Labels:
- Apache NiFi
09-14-2016
03:06 AM
1 Kudo
After creating a custom processor (described here), what is the best way to deploy the NAR file in a NiFi cluster? Do I need to deploy it to every node in the cluster and then restart the nodes one at a time (rolling restart)? Did anything change for deploying a custom processor to the cluster with NiFi 1.0?
Labels:
- Apache NiFi
08-29-2016
02:58 PM
Hi @ScipioTheYounger, yes - you are correct: StorageBasedAuthorizationProvider and DefaultHiveMetastoreAuthorizationProvider are the two providers supplied (https://cwiki.apache.org/confluence/display/Hive/Storage+Based+Authorization+in+the+Metastore+Server). DefaultHiveMetastoreAuthorizationProvider implements Hive's grant/revoke model, while StorageBasedAuthorizationProvider implements the HDFS-permission-based model (which is the one recommended on the Apache website). More info on configuring the storage-based model is here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_Sys_Admin_Guides/content/ref-5422cb60-d1d5-425a-b719-ec7bd03ee5d3.1.html
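To see which provider a metastore host is currently using, you can inspect the relevant hive-site.xml properties. A minimal sketch; the config path is the usual HDP location and may differ in your environment:

```bash
# The metastore-side authorization provider and the pre-event listener that enforces it
grep -A1 -E 'hive\.security\.metastore\.authorization\.manager|hive\.metastore\.pre\.event\.listeners' \
  /etc/hive/conf/hive-site.xml

# Storage-based authorization is typically indicated by the value:
# org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
```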