Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How do you generate dates using HiveQL? Is there a DUAL table in HIVE?

avatar
Contributor

In SQL , for example, we can use the following:

SELECT (to_date('01-12-2016','DD-MM-YYYY') - ROWNUM) AS DT FROM DUAL CONNECT BY ROWNUM <= 366

I tried uing the following in HiveQL which didnt work:

SELECT (to_date('01-12-2016','DD-MM-YYYY') - ROW_NUMBER() OVER () AS row_num) AS DT FROM DUAL CONNECT BY row_num <= 366;

Is there a way around this?

1 ACCEPTED SOLUTION

avatar

@Nandini Bhattacharjee

You must use a custom UDF for your usecase.

I had some time to modify one of my UDF to suit your usecase.

To use this udf, you must pass 3 arguments:

1. Base Date

2. Count [this represents the number of dates you want to generate. If the count you input is positive then the dates generated will be starting from date that you passed above. If the count is negative, then the dates generated will be before the date you passed above.

This will be demonstrated in the demo below.

3. Date format as per Joda specs

https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html

Here is the usage demo, highlighted in yellow:

13816-demo.jpg

Here is the source code for the UDF:

package com.dc.hadoop;

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.Text;
import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

/**
 * UDF to generate all dates given [baseDate, count, dateformat]
 */
public final class DateGen extends GenericUDF
{


 Converter[] converters = null;
 StringObjectInspector inputOI = null;
 List<Object> dates = new ArrayList<>();


 @Override
 public ObjectInspector initialize(ObjectInspector[] objectInspectors) throws UDFArgumentException {
  if(objectInspectors.length != 3) {
   throw new UDFArgumentException("All params are required: baseDate, count and dateFormat");
  }


  if (!(objectInspectors[0] instanceof StringObjectInspector)
    || !(objectInspectors[1] instanceof StringObjectInspector)
    || !(objectInspectors[2] instanceof StringObjectInspector)) {
   throw new UDFArgumentException("All input should be of type STRING");
  }


  this.inputOI = (StringObjectInspector) objectInspectors[0];
  this.converters = new Converter[objectInspectors.length];
  converters[0] = ObjectInspectorConverters.getConverter(inputOI, PrimitiveObjectInspectorFactory.writableStringObjectInspector);


  return ObjectInspectorFactory.getStandardListObjectInspector(PrimitiveObjectInspectorFactory.writableStringObjectInspector);
 }


 @Override
 public Object evaluate(DeferredObject[] deferredObjects) throws HiveException {
  String start = inputOI.getPrimitiveJavaObject(deferredObjects[0].get());
  String counter = inputOI.getPrimitiveJavaObject(deferredObjects[1].get());
  String dateFormat = inputOI.getPrimitiveJavaObject(deferredObjects[2].get());


  if (start == null || counter == null || dateFormat == null) {
   return null;
  }


  DateTimeFormatter formatter = DateTimeFormat.forPattern(dateFormat);
  DateTime baseDate = null;
  Integer count = null;
  DateTime endDate = null;
  try {
   baseDate = formatter.parseDateTime(start);
   count = Integer.valueOf(counter);
  } catch (IllegalArgumentException e) {
   System.err.println("ERROR: DateRangeToDateArray - can't parse baseDate or count: " + baseDate + " " + count);
   baseDate = null;
  }


  if ( baseDate == null || count == null ) {
   return null;
  }
  if(count<0){
   endDate = baseDate;
   baseDate = baseDate.minusDays(-1*count);
  }else{
   endDate = baseDate.plusDays(count);
  }
  return getDatesArrayForRange(baseDate, endDate, formatter);
 }


 private List<Object> getDatesArrayForRange(DateTime start,
   DateTime end, DateTimeFormatter formatter) {
  
  dates.clear();
  while (start.isBefore(end) || start.isEqual(end)) {
   Text dateAsText = new Text(start.toString(formatter));
   dates.add(dateAsText);
   start = start.plusDays(1);
  }
  return dates;
 }
 
 @Override
 public String getDisplayString(String[] strings) {
  return "Generate all dates given [baseDate, count, dateformat]";
 }
}

Also, attached zip file containing sample jar file that I used for demo dtgen.zip

Hope this helps.

View solution in original post

9 REPLIES 9

avatar

@Nandini Bhattacharjee

In lower versions of Hive, the following worked for me:

select TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP()));

In higher versions, you can use :

select CURRENT_DATE;

avatar
Contributor

Hi @Dinesh Chitlangia . Yes I do use this. But i'm looking for a way to generate a series of dates using HiveQL

avatar

@Nandini Bhattacharjee do you mean you want to generate dates given a range. say x days from the past till today, or x days in future from today ?

avatar
Contributor
@Dinesh Chitlangia

Exactly what i'm looking for....any ideas....

avatar
Super Collaborator
@Nandini Bhattacharjee

Can you try writing Hive UDF?

https://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFR... describes creating UDF for generating sequence of numbers.

You can follow the same approach to generate series of dates.

Also refer : https://community.hortonworks.com/questions/20168/sequence-number-generation-in-hive.html

avatar

@Nandini Bhattacharjee

1)Best way is to create UDF to generate sequence of date.

2)If you are SQL guy then create a stage table which loads row_number()over() as row_num. Then use this table to generate date_add(current_date,row_num) which will give you the date in sequence. Make sure you create rows as per your need in the stage table.

avatar

@Nandini Bhattacharjee

You must use a custom UDF for your usecase.

I had some time to modify one of my UDF to suit your usecase.

To use this udf, you must pass 3 arguments:

1. Base Date

2. Count [this represents the number of dates you want to generate. If the count you input is positive then the dates generated will be starting from date that you passed above. If the count is negative, then the dates generated will be before the date you passed above.

This will be demonstrated in the demo below.

3. Date format as per Joda specs

https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html

Here is the usage demo, highlighted in yellow:

13816-demo.jpg

Here is the source code for the UDF:

package com.dc.hadoop;

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.Text;
import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

/**
 * UDF to generate all dates given [baseDate, count, dateformat]
 */
public final class DateGen extends GenericUDF
{


 Converter[] converters = null;
 StringObjectInspector inputOI = null;
 List<Object> dates = new ArrayList<>();


 @Override
 public ObjectInspector initialize(ObjectInspector[] objectInspectors) throws UDFArgumentException {
  if(objectInspectors.length != 3) {
   throw new UDFArgumentException("All params are required: baseDate, count and dateFormat");
  }


  if (!(objectInspectors[0] instanceof StringObjectInspector)
    || !(objectInspectors[1] instanceof StringObjectInspector)
    || !(objectInspectors[2] instanceof StringObjectInspector)) {
   throw new UDFArgumentException("All input should be of type STRING");
  }


  this.inputOI = (StringObjectInspector) objectInspectors[0];
  this.converters = new Converter[objectInspectors.length];
  converters[0] = ObjectInspectorConverters.getConverter(inputOI, PrimitiveObjectInspectorFactory.writableStringObjectInspector);


  return ObjectInspectorFactory.getStandardListObjectInspector(PrimitiveObjectInspectorFactory.writableStringObjectInspector);
 }


 @Override
 public Object evaluate(DeferredObject[] deferredObjects) throws HiveException {
  String start = inputOI.getPrimitiveJavaObject(deferredObjects[0].get());
  String counter = inputOI.getPrimitiveJavaObject(deferredObjects[1].get());
  String dateFormat = inputOI.getPrimitiveJavaObject(deferredObjects[2].get());


  if (start == null || counter == null || dateFormat == null) {
   return null;
  }


  DateTimeFormatter formatter = DateTimeFormat.forPattern(dateFormat);
  DateTime baseDate = null;
  Integer count = null;
  DateTime endDate = null;
  try {
   baseDate = formatter.parseDateTime(start);
   count = Integer.valueOf(counter);
  } catch (IllegalArgumentException e) {
   System.err.println("ERROR: DateRangeToDateArray - can't parse baseDate or count: " + baseDate + " " + count);
   baseDate = null;
  }


  if ( baseDate == null || count == null ) {
   return null;
  }
  if(count<0){
   endDate = baseDate;
   baseDate = baseDate.minusDays(-1*count);
  }else{
   endDate = baseDate.plusDays(count);
  }
  return getDatesArrayForRange(baseDate, endDate, formatter);
 }


 private List<Object> getDatesArrayForRange(DateTime start,
   DateTime end, DateTimeFormatter formatter) {
  
  dates.clear();
  while (start.isBefore(end) || start.isEqual(end)) {
   Text dateAsText = new Text(start.toString(formatter));
   dates.add(dateAsText);
   start = start.plusDays(1);
  }
  return dates;
 }
 
 @Override
 public String getDisplayString(String[] strings) {
  return "Generate all dates given [baseDate, count, dateformat]";
 }
}

Also, attached zip file containing sample jar file that I used for demo dtgen.zip

Hope this helps.

avatar
Contributor

@Dinesh Chitlangia brilliant! i will try this out. Thanks a lot!!

avatar
Contributor

Amazing answer!