<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: what kind of data type i am going to use as a timestamp while inserting date as a value in hive and elasticsearch in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-kind-of-data-type-i-am-going-to-use-as-a-timestamp/m-p/169839#M49986</link>
    <description>&lt;P&gt;The most direct way is to transform the date to correct format in NiFi.  Alternatively, you could land it in a hive table and CTAS to a new table while transforming to correct format.  See this for Hive timestamp format to be used in either case: &lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-timestamp" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-timestamp&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;NiFi:  &lt;/STRONG&gt;Before  putting to hdfs or hive, use a ReplaceText processor.  You will use regex to find the timestamp pattern from original twitter json and replace it with the timestamp pattern needed in Hive/Kibana.  This article should help you out: &lt;A href="https://community.hortonworks.com/articles/57803/using-nifi-gettwitter-updateattributes-and-replace.html" target="_blank"&gt;https://community.hortonworks.com/articles/57803/using-nifi-gettwitter-updateattributes-and-replace.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Hive alternative:  &lt;/STRONG&gt;Here you either use a SerDe to transform the timestamp or you use regex.  In both cases, you land the data in a Hive table, then CTAS (Create Table as Select) to a final table.  This should help you out for this approach: &lt;A href="https://community.hortonworks.com/questions/19192/how-to-transform-hive-table-using-serde.html" target="_blank"&gt;https://community.hortonworks.com/questions/19192/how-to-transform-hive-table-using-serde.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;To me, the NiFi approach is superior (unless you must store the original with untransformed date into Hadoop).&lt;/P&gt;</description>
    <pubDate>Fri, 23 Dec 2016 20:50:50 GMT</pubDate>
    <dc:creator>gkeys</dc:creator>
    <dc:date>2016-12-23T20:50:50Z</dc:date>
    <item>
      <title>what kind of data type i am going to use as a timestamp while inserting date as a value in hive and elasticsearch</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-kind-of-data-type-i-am-going-to-use-as-a-timestamp/m-p/169838#M49985</link>
      <description>&lt;P&gt;i get the twitter data and put those data into hdfs using nifi,it gives a data,but my date formate is in Tue Dec 20 10:04:31 +0000 2016 and i create a hive table with those twitter data.when i create that table i have created_time field and i need to set datatype as date,it shows an null value then i change the datatype as string it shows an value.then i need to transfer a that table into elasticsearch using es-hadoop but i confused what datatype i am going to use as time field.&lt;/P&gt;&lt;P&gt;i tried string to that time filed,it shows a value but when i visualize that field in kibana,in show an error a date type is required,&lt;/P&gt;&lt;P&gt;my query is given below:&lt;/P&gt;&lt;P&gt;i also tried -&amp;gt;change my created_time field as date but i shows a null value &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="10736-screenshot-from-2016-12-23-173539.png" style="width: 1920px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/20130i2635C8DA7552A52A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="10736-screenshot-from-2016-12-23-173539.png" alt="10736-screenshot-from-2016-12-23-173539.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="10734-screenshot-from-2016-12-23-172821.png" style="width: 1920px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/20131iF273720656374CCB/image-size/medium?v=v2&amp;amp;px=400" role="button" title="10734-screenshot-from-2016-12-23-172821.png" alt="10734-screenshot-from-2016-12-23-172821.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;PRE&gt; create table if not exists tweets_text_partition(
  tweet_id bigint, 
  created_unixtime bigint, 
  created_time string, 
  displayname string, 
  msg string,
  fulltext string
)
row format delimited fields terminated by "|"
location "/tmp/tweets_staging";
*******************************************************************
curl -XPUT &lt;A href="http://sandbox.hortonworks.com:9200/twitter_text1/?pretty" target="_blank" rel="nofollow noopener noreferrer"&gt;http://sandbox.hortonworks.com:9200/twitter_text1/?pretty&lt;/A&gt; -d'{"rels":{"properties":{"tweet_id":{"type":"bigint"},"created_unixtime":{"type":"bigint"},"created_time":{"type":"string"},"displayname":{"type":"string"},"msg":{"type":"string"},"fulltext":{"type":"string"}}}}'

***********************************************************************
ADD JAR /root/ elasticsearch-hadoop-5.1.1.jar;

ADD JAR /usr/hdp/current/hive-client/lib/commons-httpclient-3.0.1.jar;
**********************************************************************************
CREATE EXTERNAL TABLE test.twitter_es1
(
  tweet_id bigint, 
  created_unixtime bigint, 
  created_time string, 
  displayname string, 
  msg string,
  fulltext string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
'es.resource' = 'twitterdata/data',
'es.nodes'='sandbox.hortonworks.com:9200',
'es.index.auto.create' = 'false',
'es.mapping.id'='tweet_id');
*************************************************************
INSERT OVERWRITE TABLE test.twitter_es1
SELECT
tweet_id, 
created_unixtime, 
created_time, 
displayname, 
msg,
fulltext
FROM
test.tweets_hive1;&lt;/PRE&gt;&lt;BR /&gt;&lt;IMG src="https://community.cloudera.com/t5/image/serverpage/image-id/7540iAD2A38E6422A91F5/image-size/large?v=1.0&amp;amp;px=999" border="0" alt="screenshot-from-2016-12-23-173530.png" title="screenshot-from-2016-12-23-173530.png" /&gt;</description>
      <pubDate>Sun, 18 Aug 2019 10:48:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-kind-of-data-type-i-am-going-to-use-as-a-timestamp/m-p/169838#M49985</guid>
      <dc:creator>a_rajeshinfotec</dc:creator>
      <dc:date>2019-08-18T10:48:49Z</dc:date>
    </item>
    <item>
      <title>Re: what kind of data type i am going to use as a timestamp while inserting date as a value in hive and elasticsearch</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-kind-of-data-type-i-am-going-to-use-as-a-timestamp/m-p/169839#M49986</link>
      <description>&lt;P&gt;The most direct way is to transform the date to correct format in NiFi.  Alternatively, you could land it in a hive table and CTAS to a new table while transforming to correct format.  See this for Hive timestamp format to be used in either case: &lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-timestamp" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-timestamp&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;NiFi:  &lt;/STRONG&gt;Before  putting to hdfs or hive, use a ReplaceText processor.  You will use regex to find the timestamp pattern from original twitter json and replace it with the timestamp pattern needed in Hive/Kibana.  This article should help you out: &lt;A href="https://community.hortonworks.com/articles/57803/using-nifi-gettwitter-updateattributes-and-replace.html" target="_blank"&gt;https://community.hortonworks.com/articles/57803/using-nifi-gettwitter-updateattributes-and-replace.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Hive alternative:  &lt;/STRONG&gt;Here you either use a SerDe to transform the timestamp or you use regex.  In both cases, you land the data in a Hive table, then CTAS (Create Table as Select) to a final table.  This should help you out for this approach: &lt;A href="https://community.hortonworks.com/questions/19192/how-to-transform-hive-table-using-serde.html" target="_blank"&gt;https://community.hortonworks.com/questions/19192/how-to-transform-hive-table-using-serde.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;To me, the NiFi approach is superior (unless you must store the original with untransformed date into Hadoop).&lt;/P&gt;</description>
      <pubDate>Fri, 23 Dec 2016 20:50:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/what-kind-of-data-type-i-am-going-to-use-as-a-timestamp/m-p/169839#M49986</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2016-12-23T20:50:50Z</dc:date>
    </item>
  </channel>
</rss>

