Member since: 11-21-2017
Posts: 70
Kudos Received: 5
Solutions: 0
01-14-2021
11:54 PM
Hi ravikirandasar1, I also have the same query. Could you please let me know how you automated this job using crontab to download the files to HDFS every day?
01-10-2019
01:15 PM
I have currently set up HDF 3.3.1 with NiFi on a standalone machine. I want to go ahead and install HDFS for storage purposes. Can I work with the latest version of HDP? Please advise! @Matt Burgess
04-16-2018
11:03 AM
Hi Salvator, I'm facing the same problem. Did you find any solution for this?
01-23-2018
12:35 PM
1 Kudo
@Ravikiran Dasari Please accept the answer if it addresses your query 🙂 or let me know if you need any further information.
01-15-2018
12:00 PM
@Jay Kumar SenSharma Thank you. Do you have any idea about installing NiFi on an HDP cluster?
01-10-2018
03:10 PM
@Ravikiran Dasari Create a Sqoop job for your import:

sqoop job --create <job-name> -- import --connect "jdbc:sqlserver://10.21.29.15:1433;database=db;username=ReportingServices;password=ReportingServices" --check-column batchid --incremental append -m 1 --hive-table mmidwpresentation.journeypositions_archive --table JourneyPositions --hive-import --schema safedrive

Once you create the Sqoop job, Sqoop stores the last value of batchid (the --check-column argument); whenever you run the job again, Sqoop pulls only the new records after that last saved value.

Sqoop job arguments ($ sqoop job ...):
--create <job-name>  Define a new saved job with the specified job-id (name). A second Sqoop command line, separated by a --, should be specified; this defines the saved job.
--delete <job-name>  Delete a saved job.
--exec <job-name>    Given a job defined with --create, run the saved job.
--show <job-name>    Show the parameters for a saved job.
--list               List all saved jobs.
These are the arguments you can use with the sqoop job command to execute, list, delete jobs, etc. Use the --password-file option to set the path to a file containing the authentication password when creating Sqoop jobs.
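For example, a minimal sketch of creating and running such a saved job with --password-file instead of an inline password (the HDFS path and job name below are made up for illustration):

# store the DB password in a protected file on HDFS (hypothetical path)
echo -n "ReportingServices" > sqoop.password
hdfs dfs -put sqoop.password /user/sqoop/sqoop.password
hdfs dfs -chmod 400 /user/sqoop/sqoop.password

# create the saved job; Sqoop remembers the last batchid between runs
sqoop job --create journeypositions_incremental -- import \
  --connect "jdbc:sqlserver://10.21.29.15:1433;database=db" \
  --username ReportingServices \
  --password-file /user/sqoop/sqoop.password \
  --table JourneyPositions --schema safedrive \
  --check-column batchid --incremental append -m 1 \
  --hive-import --hive-table mmidwpresentation.journeypositions_archive

# run it now (or on a schedule from cron/Oozie)
sqoop job --exec journeypositions_incremental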
01-09-2018
08:58 AM
@Ravikiran Dasari: You can see all hive parameters with "hive -H":

hive -H
usage: hive
-d,--define <key=value> Variable substitution to apply to Hive
commands. e.g. -d A=B or --define A=B
-e <quoted-query-string> SQL from command line
-f <filename> SQL from files
-H,--help Print help information
-h <hostname> Connecting to Hive Server on remote host
--hiveconf <property=value> Use value for given property
--hivevar <key=value> Variable substitution to apply to hive
commands. e.g. --hivevar A=B
-i <filename> Initialization SQL file
-p <port> Connecting to Hive Server on port number
-S,--silent Silent mode in interactive shell
-v,--verbose Verbose mode (echo executed SQL to the
console)

You can add two or more tables to the same schema if they have different names (which will be the case if you use the timestamp). If you are running your create script in parallel, you can always just get a new timestamp in case a table with the timestamp you already have exists. If needed, you can add the date stamp as well: curr_timestamp=`date +%Y%m%d_%s`
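A minimal sketch of that idea, assuming a hypothetical table named staging_<timestamp>:

# build a unique suffix from the date stamp plus epoch seconds
curr_timestamp=`date +%Y%m%d_%s`

# pass it in with --hivevar; single quotes keep the shell from touching ${hivevar:ts}
hive --hivevar ts="${curr_timestamp}" -e 'CREATE TABLE staging_${hivevar:ts} (id INT, payload STRING)'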
12-12-2017
01:08 AM
No. 840 GB means a single node has almost 120 GB of RAM, and allocating all of it is not an ideal way to run the system, because each node needs some free memory for other services such as OS processes and the agents used by Ambari. Just start with 90 GB to 100 GB, then adjust it slightly from there.
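If this refers to the YARN NodeManager memory allocation (an assumption; the property name below is not from the original post), the corresponding yarn-site.xml setting for roughly 96 GB per node would look like:

<!-- memory (in MB) YARN may hand out to containers on this node;
     the remaining RAM stays free for the OS, Ambari agents, etc. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>98304</value>
</property>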
12-06-2017
05:16 PM
Yes, continuously and automatically. By default it polls for new files every 60 seconds; you can shrink that. You can also convert those files to Apache ORC and automatically build new Hive tables on them if the files are CSV, TSV, Avro, Excel, JSON, XML, EDI, HL7 or C-CDA. Install Apache NiFi on an edge node; there are ways to combine HDP 2.6 and HDF 3 with the new Ambari, but it's easiest to start with a separate node for Apache NiFi. You can also just download NiFi, unzip it, and run it on a laptop that has JDK 8 installed: https://www.apache.org/dyn/closer.lua?path=/nifi/1.4.0/nifi-1.4.0-bin.zip
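For example, a minimal sketch of the laptop route (the archive mirror URL and default port are assumptions based on a standard NiFi 1.4.0 install):

# download, unpack and start NiFi 1.4.0 (needs JDK 8 on the PATH)
wget https://archive.apache.org/dist/nifi/1.4.0/nifi-1.4.0-bin.zip
unzip nifi-1.4.0-bin.zip
cd nifi-1.4.0
./bin/nifi.sh start
# the UI then comes up at http://localhost:8080/nifi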
12-14-2018
01:05 PM
Hi @Jordan Moore, what option would you suggest if you have 100 different SFTP sources with 10-15 files in each of them? Configuring individual NiFi processes is not an option here. I've played around with NiFi processors and they are not very good at working with parameters. Would Spark be a good solution for my case? Thanks, Farid