Member since 01-24-2019
Posts: 49
Kudos Received: 4
Solutions: 0
11-21-2019
11:32 PM
Perfect, it worked.
02-14-2019
02:58 AM
@ujvala reddy The reason is that the first week of the year is defined as the first week containing 4 or more days of the new year. The first day of the week is Monday and the last day is Sunday. Refer to this thread for more details on this week-of-year behaviour.
09-28-2018
07:21 AM
If the first day of the week should be Monday, change the reference date in the subtraction/addition to 1900-01-08 (which falls on a Monday).

--First day of the week as Monday
select date_sub('2018-09-12',pmod(datediff('2018-09-12','1900-01-08'),7));
+-------------+--+
| _c0 |
+-------------+--+
| 2018-09-10 |
+-------------+--+
--Last day of the week as Sunday
select date_add('2018-09-12',6 - pmod(datediff('2018-09-12','1900-01-08'),7));
+-------------+--+
| _c0 |
+-------------+--+
| 2018-09-16 |
+-------------+--+
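The same modulo arithmetic can be sketched in Python (an illustration, not part of the original answer): with a Monday anchor date, subtracting `datediff(d, anchor) mod 7` days yields the Monday of d's week, and adding `6 - datediff(d, anchor) mod 7` days yields the Sunday.

```python
from datetime import date, timedelta

ANCHOR = date(1900, 1, 8)  # a Monday, same anchor as the Hive queries

def week_start(d: date) -> date:
    """Monday of the week containing d (mirrors Hive's date_sub/pmod)."""
    return d - timedelta(days=(d - ANCHOR).days % 7)

def week_end(d: date) -> date:
    """Sunday of the week containing d (mirrors Hive's date_add)."""
    return d + timedelta(days=6 - (d - ANCHOR).days % 7)

print(week_start(date(2018, 9, 12)))  # 2018-09-10
print(week_end(date(2018, 9, 12)))    # 2018-09-16
```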
07-24-2018
04:09 PM
@Gayathri Devi I've created a database in MariaDB and exported a Hive table using Sqoop on my lab setup. This worked well for me:

[sqoop@jsneep-lab ~]$ sqoop export --connect jdbc:mysql://172.3.2.1/export --username mariadb --password mariadb --table exported --direct --export-dir /apps/hive/warehouse/drivers

Make sure you have /usr/share/java/mysql-connector-java.jar present on your system; this gave me trouble initially.
04-02-2018
05:19 PM
1 Kudo
Do you have a target variable that you can predict, or logic that will allow you to convert a "low" CPU value into a target variable? Spark has a wide variety of models available for classification modeling: https://spark.apache.org/docs/latest/mllib-classification-regression.html If you are interested in seeing which factor contributes to a specific instance, I would recommend starting with a logistic regression model, as it provides more explanatory power, giving insight into which factor is contributing to a particular CPU failure.
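As a hedged, self-contained sketch of that explanatory-power point (plain Python rather than Spark MLlib, with made-up coefficients and feature names): in a fitted logistic model, each feature's contribution to the log-odds is simply coefficient times feature value, so the largest term points at the factor driving a particular prediction.

```python
import math

# Hypothetical fitted coefficients for three CPU-health features
# (illustrative values only, not from a real model).
coef = {"temperature": 1.8, "fan_speed": -0.9, "load_avg": 0.6}
intercept = -2.0

def predict_proba(x):
    """Failure probability from a logistic model: sigmoid(w.x + b)."""
    z = intercept + sum(coef[k] * x[k] for k in coef)
    return 1.0 / (1.0 + math.exp(-z))

def contributions(x):
    """Per-feature contribution to the log-odds; the largest term
    identifies the factor driving this particular prediction."""
    return {k: coef[k] * x[k] for k in coef}

sample = {"temperature": 2.1, "fan_speed": 0.4, "load_avg": 1.5}
print(round(predict_proba(sample), 2))            # ~0.91
c = contributions(sample)
print(max(c, key=c.get))                          # temperature
```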
03-28-2018
06:06 AM
3 Kudos
@Gayathri Devi, This depends on the data you have. Your data may be labelled or unlabelled, and different algorithms apply to each case. Assuming your data is labelled, you then have to determine whether you are solving a regression problem or a classification problem, and choose algorithms based on that. Since you wrote that you want to find outliers, I'm assuming it is a regression problem; you can then use algorithms like Linear Regression, Support Vector Regression, Decision Tree Regression, Random Forest Regression, etc. If your data is unlabelled, you have to use an unsupervised learning method, with algorithms like K-Means clustering, Hierarchical clustering, etc. The main part of solving any machine learning problem is understanding your data and choosing the right algorithm, so you may need to spend more time analysing the data. Here are a few links for the concepts mentioned above; you can find these algorithms in Spark.
https://spark.apache.org/docs/latest/ml-guide.html
https://machinelearningmastery.com/classification-versus-regression-in-machine-learning/
https://www.quora.com/What-is-the-main-difference-between-classification-problems-and-regression-problems-in-machine-learning
https://machinelearningmastery.com/supervised-and-unsupervised-machine-learning-algorithms/
https://stackoverflow.com/questions/19170603/what-is-the-difference-between-labeled-and-unlabeled-data
Happy machine learning 🙂 . -Aditya
02-28-2018
12:46 PM
2 Kudos
@Gayathri Devi
Could you try the query below? It reads the 2018-02-27T02:00 value and converts it to a timestamp (note HH, the 24-hour clock, rather than hh, so values after noon also parse correctly).

Query:-

hive> select from_unixtime(unix_timestamp('2018-02-27T02:00',"yyyy-MM-dd'T'HH:mm"),'yyyy-MM-dd HH:mm:ss');
+----------------------+--+
| _c0 |
+----------------------+--+
| 2018-02-27 02:00:00 |
+----------------------+--+

(or)

By using the regexp_replace function we can replace the T in your timestamp value:

hive> select regexp_replace('2018-02-27T02:00','T',' ');
+-------------------+--+
| _c0 |
+-------------------+--+
| 2018-02-27 02:00 |
+-------------------+--+

Then use the concat function to add the missing :00 so the value becomes a valid Hive timestamp:

hive> select concat(regexp_replace('2018-02-27T02:00','T',' '),":00");
+----------------------+--+
| _c0 |
+----------------------+--+
| 2018-02-27 02:00:00 |
+----------------------+--+
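For comparison, the same conversion sketched in Python (an illustration, not part of the Hive answer): parse the ISO-like value, then emit it in Hive's timestamp layout.

```python
from datetime import datetime

raw = "2018-02-27T02:00"

# Parse the ISO-like value, then format it as a Hive-style timestamp.
ts = datetime.strptime(raw, "%Y-%m-%dT%H:%M")
print(ts.strftime("%Y-%m-%d %H:%M:%S"))  # 2018-02-27 02:00:00
```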
02-28-2018
03:39 AM
1 Kudo
@Gayathri Devi
You can use the INPUT__FILE__NAME virtual column (which gives the input file name for each row of the table) to construct your query, then store the results in a final table. Create a temp table and keep your akolp9app1a_170905_0000.txt file in that table's location. Then use:

hive> select INPUT__FILE__NAME from table; //this statement returns your akolp9app1a_170905_0000.txt filename
+---------------------------------------------------------------------------------+--+
| input__file__name |
+---------------------------------------------------------------------------------+--+
| /apps/hive/warehouse/sales/akolp9app1a_170905_0000.txt |
+---------------------------------------------------------------------------------+--+

You can then use string functions such as substring on the input__file__name field to extract the hostname and date fields from it:

hive> select substring(INPUT__FILE__NAME,20,30) hostname,substring(INPUT__FILE__NAME,40,50) `date` from table;

Finally, insert the result of that select into your final table:

hive> insert into finaltable select substring(INPUT__FILE__NAME,20,30) hostname,substring(INPUT__FILE__NAME,40,50) `date` from table;

For more references:-
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VirtualColumns
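The substring offsets above depend on your warehouse path, so as a hedged illustration of the extraction itself (assuming the file name follows a `<hostname>_<yymmdd>_<sequence>.txt` layout), splitting on the underscore is less brittle than fixed offsets:

```python
import os

path = "/apps/hive/warehouse/sales/akolp9app1a_170905_0000.txt"

# Assumed layout: <hostname>_<yymmdd>_<sequence>.txt
name = os.path.basename(path)                      # akolp9app1a_170905_0000.txt
hostname, date, _seq = name.rsplit(".", 1)[0].split("_")
print(hostname, date)  # akolp9app1a 170905
```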
01-08-2018
07:11 AM
1 Kudo
@Gayathri Devi, You can use the script below.

beeline -u "{connection-string}" -e "show tables" | grep "$1"
if [ $? -eq 0 ]
then
echo "table found"
else
echo "table not found"
fi

Put the content in a file, say checktable.sh, and run the steps below:

chmod +x checktable.sh
./checktable.sh {tablename to check}

Thanks, Aditya
12-07-2017
10:58 AM
1 Kudo
@Gayathri Devi, Can you try this query:

insert into table tblename select * from (select from_unixtime(unix_timestamp('161223000001', 'yyMMddHHmmss')))b;

#2) If '1506614501' is an epoch value (seconds since 1970-01-01), do not re-parse it with a date pattern; pass it to from_unixtime directly:

hive> select from_unixtime(1506614501);

This returns the timestamp in the session time zone. Thanks, Aditya
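A quick cross-check of that epoch value in Python (illustration only; shown in UTC, whereas Hive uses the session time zone):

```python
from datetime import datetime, timezone

# 1506614501 interpreted as seconds since the Unix epoch, in UTC.
ts = datetime.fromtimestamp(1506614501, tz=timezone.utc)
print(ts.strftime("%Y-%m-%d %H:%M:%S"))  # 2017-09-28 16:01:41
```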