Member since: 06-28-2017
Posts: 279
Kudos Received: 43
Solutions: 24

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2044 | 12-24-2018 08:34 AM
 | 5444 | 12-24-2018 08:21 AM
 | 2278 | 08-23-2018 07:09 AM
 | 9934 | 08-21-2018 05:50 PM
 | 5243 | 08-20-2018 10:59 AM
04-20-2018
09:14 AM
The usage of the script is displayed when you run it with the parameter --help or -h. From a quick look, it seems the script expects parameters by name, like -a ACTION or --action=ACTION, while you are providing them by position. I.e. instead of "configs.py get ..." it should be "configs.py -a get ...". Ambari has some mandatory configs that you can't delete. It should be easy to check whether the parameters you want to delete are mandatory or not: locate them in the Ambari UI and check whether the 'delete' action icon appears to the right of the config; it looks like a 'no entry' traffic sign. If the icon is there, you can delete them via the Ambari UI.
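For reference, a hedged sketch of the named-parameter style; the script path is the usual HDP location, and the flags other than -a are assumptions here, so verify them against the --help output of your Ambari version:

/var/lib/ambari-server/resources/scripts/configs.py --help
# pass the action by name instead of by position; host (-l), cluster (-n)
# and config-type (-c) are assumed flags, check them in the help output
/var/lib/ambari-server/resources/scripts/configs.py -a get -l ambari.example.com -n MyCluster -c core-site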
04-20-2018
08:56 AM
I guess you run just the command 'hive' from the shell? And you are doing this while logged in to the shell as the user 'hive'?
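If it helps, a quick way to check both from the shell before starting the CLI:

whoami   # shows the user the hive command would run as
hive     # then start the CLI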
04-20-2018
08:46 AM
OK, if it works for seven days, I agree that all other possible issues can be ignored for now. Are you able to fix the issue by running a kinit and then restarting Solr? If that enables your Solr to run for another 7 days, you might want to switch Solr from using the ticket to using a keytab. Keytabs do not expire.
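For example, a minimal sketch of that test; the keytab path is an assumption, adjust it to wherever your Solr keytab actually lives:

kinit -kt /etc/security/keytabs/solr.service.keytab solr/solr1.mycluster.com@MYCLUSTER.COM
klist    # verify the fresh ticket and its expiry date
# then restart the Solr service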
04-19-2018
10:21 PM
solr/solr1.mycluster.com@MYCLUSTER.COM is your Kerberos principal, and it is trying to connect to solr2.mycluster.com/172.31.16.23:8020, which is servername/IP:port. I really think this line is pointing to your issue: local host is: "java.net.UnknownHostException: solr1.mycluster.com: solr1.mycluster.com: System error". There should be a server name and an IP here, not an UnknownHostException. Can you check your hostname setup (DNS or /etc/hosts) on the machine solr1, i.e. open a shell and try a "ping solr1.mycluster.com"? The ticket lifetime is 24h in your setup, and the renewal needs to take place within that period. The renew_lifetime is the maximum lifetime of the renewed ticket. Do you get that error after around 24 hours, or does it happen when you start the service?
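A few quick checks to run on solr1 (standard tools, nothing cluster-specific):

ping -c 3 solr1.mycluster.com      # does the name resolve and answer at all?
getent hosts solr1.mycluster.com   # shows what DNS or /etc/hosts actually returns
hostname -f                        # run on solr1 itself; should print the FQDN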
04-19-2018
01:43 PM
This might be a silly question, but did you check the file permissions? Does the user running the hive command have permissions on the file?
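A quick way to verify, with the path as a placeholder for your actual file:

id                                  # user and groups the hive command runs as
ls -l /path/to/your/file            # owner, group and mode for a local file
hadoop fs -ls /path/to/your/file    # the same check if the file lives on HDFS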
04-17-2018
10:44 AM
Maybe I am getting your question wrong, but you want to convert one line of the file into one line of the Hive table, right? Your target table has 8 columns, while the text file only has 4 columns/words. To me it looks as if you don't do a text replace at all; please correct me if I am wrong:

Col1: Word1 of the file, complete
Col2: Word2 of the file, complete
Col3: Word3 of the file, complete
Col4: Word4 of the file, complete
Col5: second part of Word1 (not sure if it is the digits part, just the last 3 chars, or just half of the chars?)
Col6: Word2 of the file, complete
Col7: middle part of Word1 (just the middle two chars, or the two chars around the split of Col5?)
Col8: Word3 of the file, complete

So it looks like you are populating three columns with Word1 of the file, while Word2 and Word3 are populated into two columns each? Does the result need to be a Hive table or a file? And is the input a file stored on HDFS, or a stream where you receive it line by line? If it is a file, you may try the SerDe feature of Hive, as sketched below.
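As a starting point, a minimal sketch of the SerDe idea; the table name, HDFS location and regex are assumptions, and the 8 target columns would still need substr() logic on top once the split rules for Word1 are clear:

hive <<'EOF'
CREATE EXTERNAL TABLE words_raw (
  word1 STRING, word2 STRING, word3 STRING, word4 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex' = '(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)')
LOCATION '/data/words_input';
EOF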
04-17-2018
10:25 AM
1 Kudo
Let me try to explain, though not all the details will be included:

Cassandra is a BigTable-style, wide-column database that can use multiple nodes for storage and processing; it stores its data files on the nodes' own local filesystems. Cassandra comes with a query interface that has an SQL style (but does not cover the full SQL standard). Cassandra is not part of Hadoop, but it can be used together with Hadoop and HDFS.

Hive itself is a layer that can provide you SQL-style access to flat files in HDFS, to other databases connected to Hive via JDBC, or to internal Hive tables (also stored in HDFS). Hive is part of Hadoop and is typically described as the DWH layer of Hadoop, as it brings SQL query capabilities and a structured view on the files.

Spark is a parallel processing framework that is now also part of the Hadoop ecosystem but was originally developed independently. It can access HDFS files, streaming data, external SQL databases and much more. It is typically considered an alternative to the MapReduce framework, which was the basis of the first Hadoop versions (together with HDFS) and is still there in Hadoop. For processing the data it offers several options to include code, e.g. written in Python, R or even SQL. This depends of course on the data you are processing; image data, for example, is not well processed with Spark SQL.
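To make the interplay concrete, a tiny sketch (the table name is a placeholder): Spark can run SQL directly against an HDFS-backed Hive table:

spark-sql -e "SELECT count(*) FROM my_hive_table"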
04-16-2018
01:47 PM
Can you try running "systemctl status mysqld.service"? It seems that your MySQL daemon is not up, and that will also cause mysql_upgrade to fail. I think this entry from the log could point to the issue: /usr/sbin/mysqld: Unknown storage engine 'InnoDB'
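If the daemon really is down, the logs usually tell you why; the log path below is a common default and may differ on your distribution:

systemctl status mysqld.service
journalctl -u mysqld.service -n 50   # recent daemon messages
tail -n 50 /var/log/mysqld.log       # default log location, adjust if needed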
04-16-2018
06:59 AM
Your last (third) query should be fine; do you get any error message? I'm not sure whether you need to put brackets around the row filters, like { FILTER => (filterA AND filterB) }.
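For comparison, a hedged example of combining two filters in the HBase shell; the table name and filter values are placeholders:

hbase shell <<'EOF'
scan 'my_table', { FILTER => "PrefixFilter('row1') AND ValueFilter(=, 'binary:someval')" }
EOF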
02-27-2018
12:52 PM
1 Kudo
OK, the error handling can be implemented this way:
...
__MY_FTP_COMMANDS__
ret_ftp=$?                        # capture the ftp exit code right away (no spaces around '=')
if [ "${ret_ftp}" -eq 0 ]
then
    # if you have a logging facility you probably want to use it to log the status
    echo "Files successfully transferred"
else
    echo "Error in file transfer"
    exit ${ret_ftp}               # 'exit' in a script; 'return' only works inside a function
fi
# at this point the files should already be copied locally
hadoop fs -put -f ${dir} /destination
ret_hdfs=$?
# put a similar handling here for ${ret_hdfs}
For the password: SFTP, like ssh, is a little tricky, so to get rid of the password prompt I would recommend exchanging SSH keys. Once this is working, you can add the scheduled execution of the script to your crontab.
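A sketch of the key exchange and the cron entry; user, host and script path are placeholders:

ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa   # key without passphrase, for unattended runs
ssh-copy-id user@ftp.example.com                   # copies the public key to the remote host
# then via 'crontab -e', e.g. run the transfer script every day at 02:00:
# 0 2 * * * /opt/scripts/transfer_to_hdfs.sh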