Member since
02-02-2017
18
Posts
1
Kudos Received
0
Solutions
03-26-2018
02:00 PM
Is a Kerberized server supported by the LivySessionController?
03-26-2018
01:55 PM
Is a Kerberized server supported by the LivySessionController? I tried the same approach on a Kerberized Hadoop cluster but was not able to get the expected results.
06-14-2017
07:42 PM
I am using an ExecuteScript processor to run a Python script on a remote machine.
This processor does not take input from a flow file from an upstream processor, and I want to pass the parameter to the ExecuteScript processor dynamically. I saw that the ExecuteStreamCommand processor can take parameters from an upstream processor, but I am unable to find an option to run the Python script on the remote machine through this processor.
Is there a way to run a remote Python script through the ExecuteStreamCommand processor?
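If it helps, a rough ExecuteStreamCommand configuration for this could look like the sketch below (the key path, host, script path, and the `my.param` flow-file attribute are all placeholder assumptions):

```
Command Path        ssh
Command Arguments   -i;/path/to/private_key;user@remotehost;python;/path/to/script.py;${my.param}
Argument Delimiter  ;
Ignore STDIN        true
```

Because Command Arguments are semicolon-delimited, a value taken from an upstream flow file's attribute can be passed to the remote script through NiFi expression language.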
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
05-17-2017
07:16 PM
We have a marketing report in the form of a tabular data set whose schema looks like this:

event                     | timestamp  | distinct_id | initial_referring_domain | ColumnX     | ColumnY
Page Viewed               | 1472688038 | 489687      | www.abc.com              | Sample Data | Sample Data
Page Viewed               | 1472688052 | 118805      | www.abc.com              | Sample Data | Sample Data
Request Information Click | 1472688056 | 192674      | www.abc.com              | Sample Data | Sample Data
Page Viewed               | 1472688087 | 204231      | ww.123.com               | Sample Data | Sample Data
Page Viewed               | 1472688161 | 76081       | www.abc.com              | Sample Data | Sample Data
Page Viewed               | 1472688219 | 186081      | www.abc.com              | Sample Data | Sample Data
Page Viewed               | 1472688236 | 83259       | www.google.co.in         | Sample Data | Sample Data
Page Viewed               | 1472688310 | 61410       | www.tuv.in               | Sample Data | Sample Data

We need to write a MapReduce program to find the most frequent initial referring domain, in order to determine which website is the most effective ad platform.
Approach:
- Remove rows having duplicate entries in the distinct_id column.
- Count the frequency of each entry in the initial_referring_domain column.
- Publish the frequency of each entry.
I am able to solve this problem in Hive and Pig but was not able to get the correct result in a MapReduce program. Any reference or piece of similar code would help.
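The steps above can be sketched as Hadoop Streaming style mapper/reducer functions in Python. This is only a minimal illustration, assuming tab-delimited input with the column order shown in the table above; a Java MapReduce job would follow the same map/shuffle/reduce shape:

```python
from collections import Counter

def mapper(lines):
    """Map step: emit (distinct_id, initial_referring_domain) pairs
    from tab-delimited rows shaped like the table above."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4:
            # Assumed column order:
            # event, timestamp, distinct_id, initial_referring_domain, ...
            yield fields[2], fields[3]

def reducer(pairs):
    """Reduce step: drop duplicate distinct_ids (keeping the first row
    seen for each id), then count the remaining referring domains."""
    first_domain_per_id = {}
    for distinct_id, domain in pairs:
        first_domain_per_id.setdefault(distinct_id, domain)
    return Counter(first_domain_per_id.values())
```

`reducer(mapper(rows)).most_common(1)` then yields the top referring domain. In a real Hadoop Streaming or Java MapReduce job, the framework's shuffle/sort phase sits between these two functions, grouping the mapper output by key; the deduplication on distinct_id would typically be its own first job.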
Labels:
- Apache Hadoop
03-30-2017
09:39 AM
Hey @Dan Chaffelson, is there a way we can connect to the remote machine with a password instead of a key? I was able to do SSH (with a password) from the backend. Can I do it from the "ExecuteProcess" processor or any other processor?
Thanks.
03-30-2017
09:36 AM
Hey @Matt Clarke,
Is there a way we can connect to the remote machine with a password instead of a key?
I was able to do SSH (with a password) from the backend. Can I do it from the "ExecuteProcess" processor?
Thanks.
03-24-2017
08:51 AM
@Dan Chaffelson The Python script (actually a large Python program, ~500 lines) pulls data from a MS SQL Server and dumps the data in CSV format to a specified location on that machine. The whole use case is extracting tabular data from various SQL sources and putting it into HDFS (HDP), and we are using NiFi to orchestrate the process, hence we want to instantiate and schedule the whole flow with NiFi only.
03-24-2017
08:03 AM
Hey @Dan Chaffelson, thanks for sharing. Can you help me with a processor (with configuration) that can be used to trigger a Python script on a remote machine (over SSH) and pull the result back to the HDF cluster?
I have used the ExecuteProcess processor with the configuration below to trigger the Python script on the remote machine (as suggested by @Matt Clarke) and was able to run the script:

Property            Value
Command             ssh
Command Arguments   -i "<path to private key>" <user>@<remotehost> python <script> &

Now I want to pull the output of the Python command (which is at /home/abc with 777 permissions) back to the HDF cluster.
Is there a processor to do that?
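One option worth trying for this last step (a sketch only; the hostname, user, key path, and file filter are placeholders) is a GetSFTP processor pointed at the remote output directory:

```
Hostname           <remotehost>
Port               22
Username           <user>
Private Key Path   <path to private key>
Remote Path        /home/abc
File Filter Regex  .*\.csv
```

GetSFTP would pull matching files from the remote machine into the NiFi flow, from where a PutHDFS processor could land them on the cluster.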
03-24-2017
07:25 AM
Thanks @Matt Clarke, it worked fine; I was able to execute the Python script.
Is there a way to get back the output data generated by the Python code, which is being saved to a directory on the remote machine?
03-23-2017
01:12 PM
@Matt I have tried exploring the ExecuteScript processor and the ExecuteProcess processor but was not able to SSH to the remote machine. Could you provide more detail on this, or help me with the processor and configurations? Thanks
03-23-2017
11:35 AM
I have a 3-node HDF cluster (in the AWS cloud) where NiFi is running across the cluster. I want to use a NiFi processor to trigger a shell/Python script on a remote machine (on-premise) to perform certain actions written in the shell/Python script.
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
02-20-2017
12:23 PM
I have around 400 tables in a SQL Server database, out of which I want to list 10 tables with the ListDatabaseTables processor in Apache NiFi in HDF. These 10 tables don't have a fixed naming pattern. When I try to add multiple tables, separated by semicolons, as the value of the Table Name Pattern property, the processor is not able to detect the tables.
The same works absolutely fine with a single Table Name Pattern value. Refer to the screenshots: the 1st processor works fine, whereas the 2nd processor is not able to find the tables. Any other suggestion to import data from 10 tables (not having a fixed naming pattern) out of 400-500 tables through NiFi would be highly appreciated.
02-09-2017
11:01 AM
1 Kudo
sqoop import \
  --table <TableName> \
  --connect "jdbc:jtds:sqlserver://<HostName>:<PortNo>;useNTLMv2=true;domain=<DomainName>;databaseName=<DB_Name>" \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --driver net.sourceforge.jtds.jdbc.Driver \
  --username <WindowsUserName> \
  --password <'********'> \
  --verbose \
  --target-dir <TargetDirectory> \
  -m 1

I think useNTLMv2=true may do the trick. Can you try the above command?
02-09-2017
09:47 AM
@RajendraM
Can you check whether the user for which you are generating the kinit ticket has permission to write to the directory you are specifying in the sqoop import command?
Also, could you share the Sqoop command you are firing now?
02-08-2017
06:35 PM
Rajendra, you need to first generate a kinit ticket for the user you have logged in as. Once the user is Kerberos-authenticated, you can then fire the sqoop query.
02-06-2017
06:01 AM
Thanks @Sindhu, it worked. I missed defining some of the parameters.
02-02-2017
03:18 PM
I am trying to import a table from Microsoft SQL Server 11.0.5058 through Sqoop (which is a service on the Hortonworks Data Platform) into HDFS. The user I have been given has only Windows authentication (LDAP) on SQL Server.
I tried a few approaches:
1. Kept sqljdbc4.jar in the Sqoop shared library and used the import command.
2. Downloaded sqljdbc_auth.dll, kept it in the Java library, and tried running the import command.
But no luck.
Labels:
- Apache Hadoop
- Apache Sqoop