Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

Solved Go to solution
Highlighted

How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

New Contributor

In SQL , for example, we can use the following:

SELECT (to_date('01-12-2016','DD-MM-YYYY') - ROWNUM) AS DT FROM DUAL CONNECT BY ROWNUM <= 366

I tried uing the following in HiveQL which didnt work:

SELECT (to_date('01-12-2016','DD-MM-YYYY') - ROW_NUMBER() OVER () AS row_num) AS DT FROM DUAL CONNECT BY row_num <= 366;

Is there a way around this?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

@Nandini Bhattacharjee

You must use a custom UDF for your usecase.

I had some time to modify one of my UDF to suit your usecase.

To use this udf, you must pass 3 arguments:

1. Base Date

2. Count [this represents the number of dates you want to generate. If the count you input is positive then the dates generated will be starting from date that you passed above. If the count is negative, then the dates generated will be before the date you passed above.

This will be demonstrated in the demo below.

3. Date format as per Joda specs

https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html

Here is the usage demo, highlighted in yellow:

13816-demo.jpg

Here is the source code for the UDF:

package com.dc.hadoop;

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.Text;
import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

/**
 * UDF to generate all dates given [baseDate, count, dateformat]
 */
public final class DateGen extends GenericUDF
{


 Converter[] converters = null;
 StringObjectInspector inputOI = null;
 List<Object> dates = new ArrayList<>();


 @Override
 public ObjectInspector initialize(ObjectInspector[] objectInspectors) throws UDFArgumentException {
  if(objectInspectors.length != 3) {
   throw new UDFArgumentException("All params are required: baseDate, count and dateFormat");
  }


  if (!(objectInspectors[0] instanceof StringObjectInspector)
    || !(objectInspectors[1] instanceof StringObjectInspector)
    || !(objectInspectors[2] instanceof StringObjectInspector)) {
   throw new UDFArgumentException("All input should be of type STRING");
  }


  this.inputOI = (StringObjectInspector) objectInspectors[0];
  this.converters = new Converter[objectInspectors.length];
  converters[0] = ObjectInspectorConverters.getConverter(inputOI, PrimitiveObjectInspectorFactory.writableStringObjectInspector);


  return ObjectInspectorFactory.getStandardListObjectInspector(PrimitiveObjectInspectorFactory.writableStringObjectInspector);
 }


 @Override
 public Object evaluate(DeferredObject[] deferredObjects) throws HiveException {
  String start = inputOI.getPrimitiveJavaObject(deferredObjects[0].get());
  String counter = inputOI.getPrimitiveJavaObject(deferredObjects[1].get());
  String dateFormat = inputOI.getPrimitiveJavaObject(deferredObjects[2].get());


  if (start == null || counter == null || dateFormat == null) {
   return null;
  }


  DateTimeFormatter formatter = DateTimeFormat.forPattern(dateFormat);
  DateTime baseDate = null;
  Integer count = null;
  DateTime endDate = null;
  try {
   baseDate = formatter.parseDateTime(start);
   count = Integer.valueOf(counter);
  } catch (IllegalArgumentException e) {
   System.err.println("ERROR: DateRangeToDateArray - can't parse baseDate or count: " + baseDate + " " + count);
   baseDate = null;
  }


  if ( baseDate == null || count == null ) {
   return null;
  }
  if(count<0){
   endDate = baseDate;
   baseDate = baseDate.minusDays(-1*count);
  }else{
   endDate = baseDate.plusDays(count);
  }
  return getDatesArrayForRange(baseDate, endDate, formatter);
 }


 private List<Object> getDatesArrayForRange(DateTime start,
   DateTime end, DateTimeFormatter formatter) {
  
  dates.clear();
  while (start.isBefore(end) || start.isEqual(end)) {
   Text dateAsText = new Text(start.toString(formatter));
   dates.add(dateAsText);
   start = start.plusDays(1);
  }
  return dates;
 }
 
 @Override
 public String getDisplayString(String[] strings) {
  return "Generate all dates given [baseDate, count, dateformat]";
 }
}

Also, attached zip file containing sample jar file that I used for demo dtgen.zip

Hope this helps.

9 REPLIES 9

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

@Nandini Bhattacharjee

In lower versions of Hive, the following worked for me:

select TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP()));

In higher versions, you can use :

select CURRENT_DATE;

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

New Contributor

Hi @Dinesh Chitlangia . Yes I do use this. But i'm looking for a way to generate a series of dates using HiveQL

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

@Nandini Bhattacharjee do you mean you want to generate dates given a range. say x days from the past till today, or x days in future from today ?

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

New Contributor
@Dinesh Chitlangia

Exactly what i'm looking for....any ideas....

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

Expert Contributor
@Nandini Bhattacharjee

Can you try writing Hive UDF?

https://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFR... describes creating UDF for generating sequence of numbers.

You can follow the same approach to generate series of dates.

Also refer : https://community.hortonworks.com/questions/20168/sequence-number-generation-in-hive.html

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

@Nandini Bhattacharjee

1)Best way is to create UDF to generate sequence of date.

2)If you are SQL guy then create a stage table which loads row_number()over() as row_num. Then use this table to generate date_add(current_date,row_num) which will give you the date in sequence. Make sure you create rows as per your need in the stage table.

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

@Nandini Bhattacharjee

You must use a custom UDF for your usecase.

I had some time to modify one of my UDF to suit your usecase.

To use this udf, you must pass 3 arguments:

1. Base Date

2. Count [this represents the number of dates you want to generate. If the count you input is positive then the dates generated will be starting from date that you passed above. If the count is negative, then the dates generated will be before the date you passed above.

This will be demonstrated in the demo below.

3. Date format as per Joda specs

https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html

Here is the usage demo, highlighted in yellow:

13816-demo.jpg

Here is the source code for the UDF:

package com.dc.hadoop;

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.Text;
import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

/**
 * UDF to generate all dates given [baseDate, count, dateformat]
 */
public final class DateGen extends GenericUDF
{


 Converter[] converters = null;
 StringObjectInspector inputOI = null;
 List<Object> dates = new ArrayList<>();


 @Override
 public ObjectInspector initialize(ObjectInspector[] objectInspectors) throws UDFArgumentException {
  if(objectInspectors.length != 3) {
   throw new UDFArgumentException("All params are required: baseDate, count and dateFormat");
  }


  if (!(objectInspectors[0] instanceof StringObjectInspector)
    || !(objectInspectors[1] instanceof StringObjectInspector)
    || !(objectInspectors[2] instanceof StringObjectInspector)) {
   throw new UDFArgumentException("All input should be of type STRING");
  }


  this.inputOI = (StringObjectInspector) objectInspectors[0];
  this.converters = new Converter[objectInspectors.length];
  converters[0] = ObjectInspectorConverters.getConverter(inputOI, PrimitiveObjectInspectorFactory.writableStringObjectInspector);


  return ObjectInspectorFactory.getStandardListObjectInspector(PrimitiveObjectInspectorFactory.writableStringObjectInspector);
 }


 @Override
 public Object evaluate(DeferredObject[] deferredObjects) throws HiveException {
  String start = inputOI.getPrimitiveJavaObject(deferredObjects[0].get());
  String counter = inputOI.getPrimitiveJavaObject(deferredObjects[1].get());
  String dateFormat = inputOI.getPrimitiveJavaObject(deferredObjects[2].get());


  if (start == null || counter == null || dateFormat == null) {
   return null;
  }


  DateTimeFormatter formatter = DateTimeFormat.forPattern(dateFormat);
  DateTime baseDate = null;
  Integer count = null;
  DateTime endDate = null;
  try {
   baseDate = formatter.parseDateTime(start);
   count = Integer.valueOf(counter);
  } catch (IllegalArgumentException e) {
   System.err.println("ERROR: DateRangeToDateArray - can't parse baseDate or count: " + baseDate + " " + count);
   baseDate = null;
  }


  if ( baseDate == null || count == null ) {
   return null;
  }
  if(count<0){
   endDate = baseDate;
   baseDate = baseDate.minusDays(-1*count);
  }else{
   endDate = baseDate.plusDays(count);
  }
  return getDatesArrayForRange(baseDate, endDate, formatter);
 }


 private List<Object> getDatesArrayForRange(DateTime start,
   DateTime end, DateTimeFormatter formatter) {
  
  dates.clear();
  while (start.isBefore(end) || start.isEqual(end)) {
   Text dateAsText = new Text(start.toString(formatter));
   dates.add(dateAsText);
   start = start.plusDays(1);
  }
  return dates;
 }
 
 @Override
 public String getDisplayString(String[] strings) {
  return "Generate all dates given [baseDate, count, dateformat]";
 }
}

Also, attached zip file containing sample jar file that I used for demo dtgen.zip

Hope this helps.

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

New Contributor

@Dinesh Chitlangia brilliant! i will try this out. Thanks a lot!!

Re: How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

New Contributor

Amazing answer!

Don't have an account?
Coming from Hortonworks? Activate your account here