
Loading data into Druid failing in Hortonworks sandbox

Rising Star

I am following the steps from https://github.com/cartershanklin/hive-druid-ssb and the load step fails with the error below.

Error:

INFO  : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
INFO  : Starting task [Stage-0:MOVE] in serial mode
INFO  : Moving data to directory hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/ssb_druid from hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/.hive-staging_hive_2017-10-19_09-48-11_929_1281256378284617965-1/-ext-10002
INFO  : Starting task [Stage-4:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist.
INFO  : Resetting the caller context to HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916
INFO  : Completed executing command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10); Time taken: 6.296 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist. (state=08S01,code=1)




Closing: 0: jdbc:hive2://sandbox.hortonworks.com:10500/default

[root@sandbox hive-druid-ssb]# 
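
Note: the missing path in the stack trace sits under Hive's Druid working directory, configured by hive.druid.working.directory and defaulting to /tmp/workingDirectory on the default filesystem (HDFS here). As I understand it, the storage handler writes segment descriptors there during the CTAS, and the final DDL task reads them back to register the segments with Druid. A quick way to confirm that nothing was ever staged is to list that path; the staging directory name below is copied verbatim from the stack trace:

# List Hive's Druid working directory on HDFS. If the CTAS produced no
# segments, the staging directory never appears, and the DDL task fails
# with the FileNotFoundException shown above.
hdfs dfs -ls /tmp/workingDirectory/
hdfs dfs -ls /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/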


Full output of the failing load step:

[root@sandbox hive-druid-ssb]# sh 00load.sh 1 sandbox.hortonworks.com:10500 sandbox.hortonworks.com root hadoop

Connecting to jdbc:hive2://sandbox.hortonworks.com:10500/default

Connected to: Apache Hive (version 2.1.0.2.6.1.0-129)

Driver: Hive JDBC (version 1.2.1000.2.6.1.0-129)

Transaction isolation: TRANSACTION_REPEATABLE_READ

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.metadata.username=${DRUID_USERNAME};

No rows affected (0.14 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.metadata.password=${DRUID_PASSWORD};

No rows affected (0.007 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.metadata.uri=jdbc:mysql://${DRUID_HOST}/druid;

No rows affected (0.006 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.indexer.partition.size.max=1000000;

No rows affected (0.004 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.indexer.memory.rownum.max=100000;

No rows affected (0.007 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.broker.address.default=${DRUID_HOST}:8082;

No rows affected (0.006 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.coordinator.address.default=${DRUID_HOST}:8081;

No rows affected (0.007 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.storage.storageDirectory=/apps/hive/warehouse;

No rows affected (0.004 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.tez.container.size=1024;

No rows affected (0.007 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.passiveWaitTimeMs=180000;

No rows affected (0.005 seconds)

0: jdbc:hive2://sandbox.hortonworks.com:10500> 

0: jdbc:hive2://sandbox.hortonworks.com:10500> CREATE TABLE ssb_druid

0: jdbc:hive2://sandbox.hortonworks.com:10500> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'

0: jdbc:hive2://sandbox.hortonworks.com:10500> TBLPROPERTIES (

0: jdbc:hive2://sandbox.hortonworks.com:10500>   "druid.datasource" = "ssb_druid",

0: jdbc:hive2://sandbox.hortonworks.com:10500>   "druid.segment.granularity" = "MONTH",

0: jdbc:hive2://sandbox.hortonworks.com:10500>   "druid.query.granularity" = "DAY")

0: jdbc:hive2://sandbox.hortonworks.com:10500> AS

0: jdbc:hive2://sandbox.hortonworks.com:10500> SELECT

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(d_year || '-' || d_monthnuminyear || '-' || d_daynuminmonth as timestamp) as `__time`,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(c_city as string) c_city,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(c_nation as string) c_nation,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(c_region as string) c_region,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(d_weeknuminyear as string) d_weeknuminyear,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(d_year as string) d_year,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(d_yearmonth as string) d_yearmonth,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(d_yearmonthnum as string) d_yearmonthnum,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(lo_discount as string) lo_discount,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(lo_quantity as string) lo_quantity,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(p_brand1 as string) p_brand1,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(p_category as string) p_category,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(p_mfgr as string) p_mfgr,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(s_city as string) s_city,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(s_nation as string) s_nation,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   cast(s_region as string) s_region,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   lo_revenue,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   lo_extendedprice * lo_discount discounted_price,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   lo_revenue - lo_supplycost net_revenue

0: jdbc:hive2://sandbox.hortonworks.com:10500> FROM

0: jdbc:hive2://sandbox.hortonworks.com:10500>   ssb_${SCALE}_flat_orc.customer, ssb_${SCALE}_flat_orc.dates, ssb_${SCALE}_flat_orc.lineorder,

0: jdbc:hive2://sandbox.hortonworks.com:10500>   ssb_${SCALE}_flat_orc.part, ssb_${SCALE}_flat_orc.supplier

0: jdbc:hive2://sandbox.hortonworks.com:10500> where

0: jdbc:hive2://sandbox.hortonworks.com:10500>   lo_orderdate = d_datekey and lo_partkey = p_partkey

0: jdbc:hive2://sandbox.hortonworks.com:10500>   and lo_suppkey = s_suppkey and lo_custkey = c_custkey;

INFO  : Compiling command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10): CREATE TABLE ssb_druid

STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'

TBLPROPERTIES (

  "druid.datasource" = "ssb_druid",

  "druid.segment.granularity" = "MONTH",

  "druid.query.granularity" = "DAY")

AS

SELECT

  cast(d_year || '-' || d_monthnuminyear || '-' || d_daynuminmonth as timestamp) as `__time`,

  cast(c_city as string) c_city,

  cast(c_nation as string) c_nation,

  cast(c_region as string) c_region,

  cast(d_weeknuminyear as string) d_weeknuminyear,

  cast(d_year as string) d_year,

  cast(d_yearmonth as string) d_yearmonth,

  cast(d_yearmonthnum as string) d_yearmonthnum,

  cast(lo_discount as string) lo_discount,

  cast(lo_quantity as string) lo_quantity,

  cast(p_brand1 as string) p_brand1,

  cast(p_category as string) p_category,

  cast(p_mfgr as string) p_mfgr,

  cast(s_city as string) s_city,

  cast(s_nation as string) s_nation,

  cast(s_region as string) s_region,

  lo_revenue,

  lo_extendedprice * lo_discount discounted_price,

  lo_revenue - lo_supplycost net_revenue

FROM

  ssb_1_flat_orc.customer, ssb_1_flat_orc.dates, ssb_1_flat_orc.lineorder,

  ssb_1_flat_orc.part, ssb_1_flat_orc.supplier

where

  lo_orderdate = d_datekey and lo_partkey = p_partkey

  and lo_suppkey = s_suppkey and lo_custkey = c_custkey

INFO  : We are setting the hadoop caller context from HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916 to hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10

INFO  : Semantic Analysis Completed

INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:__time, type:timestamp, comment:null), FieldSchema(name:c_city, type:string, comment:null), FieldSchema(name:c_nation, type:string, comment:null), FieldSchema(name:c_region, type:string, comment:null), FieldSchema(name:d_weeknuminyear, type:string, comment:null), FieldSchema(name:d_year, type:string, comment:null), FieldSchema(name:d_yearmonth, type:string, comment:null), FieldSchema(name:d_yearmonthnum, type:string, comment:null), FieldSchema(name:lo_discount, type:string, comment:null), FieldSchema(name:lo_quantity, type:string, comment:null), FieldSchema(name:p_brand1, type:string, comment:null), FieldSchema(name:p_category, type:string, comment:null), FieldSchema(name:p_mfgr, type:string, comment:null), FieldSchema(name:s_city, type:string, comment:null), FieldSchema(name:s_nation, type:string, comment:null), FieldSchema(name:s_region, type:string, comment:null), FieldSchema(name:lo_revenue, type:double, comment:null), FieldSchema(name:discounted_price, type:double, comment:null), FieldSchema(name:net_revenue, type:double, comment:null)], properties:null)

INFO  : Completed compiling command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10); Time taken: 4.972 seconds

INFO  : We are resetting the hadoop caller context to HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916

INFO  : Setting caller context to query id hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10

INFO  : Executing command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10): CREATE TABLE ssb_druid

STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'

TBLPROPERTIES (

  "druid.datasource" = "ssb_druid",

  "druid.segment.granularity" = "MONTH",

  "druid.query.granularity" = "DAY")

AS

SELECT

  cast(d_year || '-' || d_monthnuminyear || '-' || d_daynuminmonth as timestamp) as `__time`,

  cast(c_city as string) c_city,

  cast(c_nation as string) c_nation,

  cast(c_region as string) c_region,

  cast(d_weeknuminyear as string) d_weeknuminyear,

  cast(d_year as string) d_year,

  cast(d_yearmonth as string) d_yearmonth,

  cast(d_yearmonthnum as string) d_yearmonthnum,

  cast(lo_discount as string) lo_discount,

  cast(lo_quantity as string) lo_quantity,

  cast(p_brand1 as string) p_brand1,

  cast(p_category as string) p_category,

  cast(p_mfgr as string) p_mfgr,

  cast(s_city as string) s_city,

  cast(s_nation as string) s_nation,

  cast(s_region as string) s_region,

  lo_revenue,

  lo_extendedprice * lo_discount discounted_price,

  lo_revenue - lo_supplycost net_revenue

FROM

  ssb_1_flat_orc.customer, ssb_1_flat_orc.dates, ssb_1_flat_orc.lineorder,

  ssb_1_flat_orc.part, ssb_1_flat_orc.supplier

where

  lo_orderdate = d_datekey and lo_partkey = p_partkey

  and lo_suppkey = s_suppkey and lo_custkey = c_custkey

INFO  : Query ID = hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10

INFO  : Total jobs = 1

INFO  : Launching Job 1 out of 1

INFO  : Starting task [Stage-1:MAPRED] in serial mode

INFO  : Session is already open

INFO  : Tez session missing resources, adding additional necessary resources

INFO  : Dag name: CREATE TABLE ssb_druid

STORED BY...c_custkey(Stage-1)

INFO  : Setting tez.task.scale.memory.reserve-fraction to 0.30000001192092896

INFO  : Status: Running (Executing on YARN cluster with App id application_1508387771410_0016)




--------------------------------------------------------------------------------

        VERTICES        MODE  STATUS      TOTAL  COMPLETED  RUNNING  PENDING  FAILED

--------------------------------------------------------------------------------

Map 1                   llap  SUCCEEDED          0        0        0       0       0

Map 2                   llap  SUCCEEDED          0        0        0       0       0

Map 4                   llap  SUCCEEDED          0        0        0       0       0

Map 5                   llap  SUCCEEDED          0        0        0       0       0

Map 6                   llap  SUCCEEDED          0        0        0       0       0

Reducer 3 ......        llap  SUCCEEDED          1        1        0       0       0

--------------------------------------------------------------------------------

VERTICES: 01/06  [==========================>>] 100%  ELAPSED TIME: 3.16 s     

--------------------------------------------------------------------------------

INFO  : Status: DAG finished successfully in 2.49 seconds

INFO  : 

INFO  : Query Execution Summary

INFO  : ----------------------------------------------------------------------------------------------

INFO  : OPERATION                            DURATION

INFO  : ----------------------------------------------------------------------------------------------

INFO  : Compile Query                           4.97s

INFO  : Prepare Plan                            2.05s

INFO  : Submit Plan                             0.72s

INFO  : Start DAG                               0.71s

INFO  : Run DAG                                 2.49s

INFO  : ----------------------------------------------------------------------------------------------

INFO  : 

INFO  : Task Execution Summary

INFO  : ----------------------------------------------------------------------------------------------

INFO  :   VERTICES      DURATION(ms)   CPU_TIME(ms)    GC_TIME(ms)   INPUT_RECORDS   OUTPUT_RECORDS

INFO  : ----------------------------------------------------------------------------------------------

INFO  :      Map 1              0.00              0              0               0                0

INFO  :      Map 2              0.00              0              0               0                0

INFO  :      Map 4              0.00              0              0               0                0

INFO  :      Map 5              0.00              0              0               0                0

INFO  :      Map 6              0.00              0              0               0                0

INFO  :  Reducer 3           1411.00              0              0               0                0

INFO  : ----------------------------------------------------------------------------------------------

INFO  : 

INFO  : LLAP IO Summary

INFO  : ----------------------------------------------------------------------------------------------

INFO  :   VERTICES ROWGROUPS  META_HIT  META_MISS  DATA_HIT  DATA_MISS  ALLOCATION     USED  TOTAL_IO

INFO  : ----------------------------------------------------------------------------------------------

INFO  :      Map 1         0         0          0        0B         0B          0B       0B     0.00s

INFO  :      Map 2         0         0          0        0B         0B          0B       0B     0.00s

INFO  :      Map 4         0         0          0        0B         0B          0B       0B     0.00s

INFO  :      Map 5         0         0          0        0B         0B          0B       0B     0.00s

INFO  :      Map 6         0         0          0        0B         0B          0B       0B     0.00s

INFO  : ----------------------------------------------------------------------------------------------

INFO  : 

INFO  : FileSystem Counters Summary

INFO  : 

INFO  : Scheme: HDFS

INFO  : ----------------------------------------------------------------------------------------------

INFO  :   VERTICES      BYTES_READ      READ_OPS     LARGE_READ_OPS      BYTES_WRITTEN     WRITE_OPS

INFO  : ----------------------------------------------------------------------------------------------

INFO  :      Map 1              0B             0                  0                 0B             0

INFO  :      Map 2              0B             0                  0                 0B             0

INFO  :      Map 4              0B             0                  0                 0B             0

INFO  :      Map 5              0B             0                  0                 0B             0

INFO  :      Map 6              0B             0                  0                 0B             0

INFO  :  Reducer 3              0B             1                  0                43B             1

INFO  : ----------------------------------------------------------------------------------------------

INFO  : 

INFO  : Scheme: FILE

INFO  : ----------------------------------------------------------------------------------------------

INFO  :   VERTICES      BYTES_READ      READ_OPS     LARGE_READ_OPS      BYTES_WRITTEN     WRITE_OPS

INFO  : ----------------------------------------------------------------------------------------------

INFO  :      Map 1              0B             0                  0                 0B             0

INFO  :      Map 2              0B             0                  0                 0B             0

INFO  :      Map 4              0B             0                  0                 0B             0

INFO  :      Map 5              0B             0                  0                 0B             0

INFO  :      Map 6              0B             0                  0                 0B             0

INFO  :  Reducer 3              0B             0                  0                 0B             0

INFO  : ----------------------------------------------------------------------------------------------

INFO  : 

INFO  : org.apache.tez.common.counters.DAGCounter:

INFO  :    NUM_SUCCEEDED_TASKS: 1

INFO  :    TOTAL_LAUNCHED_TASKS: 1

INFO  :    AM_CPU_MILLISECONDS: 2810

INFO  :    AM_GC_TIME_MILLIS: 13

INFO  : File System Counters:

INFO  :    FILE_BYTES_READ: 0

INFO  :    FILE_BYTES_WRITTEN: 0

INFO  :    FILE_READ_OPS: 0

INFO  :    FILE_LARGE_READ_OPS: 0

INFO  :    FILE_WRITE_OPS: 0

INFO  :    HDFS_BYTES_READ: 0

INFO  :    HDFS_BYTES_WRITTEN: 43

INFO  :    HDFS_READ_OPS: 1

INFO  :    HDFS_LARGE_READ_OPS: 0

INFO  :    HDFS_WRITE_OPS: 1

INFO  : org.apache.tez.common.counters.TaskCounter:

INFO  :    REDUCE_INPUT_RECORDS: 0

INFO  :    OUTPUT_RECORDS: 0

INFO  :    SHUFFLE_BYTES_DECOMPRESSED: 0

INFO  : HIVE:

INFO  :    RECORDS_OUT_1_default.ssb_druid: 0

INFO  : TaskCounter_Reducer_3_INPUT_Map_2:

INFO  :    REDUCE_INPUT_RECORDS: 0

INFO  :    SHUFFLE_BYTES_DECOMPRESSED: 0

INFO  : TaskCounter_Reducer_3_OUTPUT_out_Reducer_3:

INFO  :    OUTPUT_RECORDS: 0

INFO  : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode

INFO  : Starting task [Stage-0:MOVE] in serial mode

INFO  : Moving data to directory hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/ssb_druid from hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/.hive-staging_hive_2017-10-19_09-48-11_929_1281256378284617965-1/-ext-10002

INFO  : Starting task [Stage-4:DDL] in serial mode

ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist.

INFO  : Resetting the caller context to HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916

INFO  : Completed executing command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10); Time taken: 6.296 seconds

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist. (state=08S01,code=1)




Closing: 0: jdbc:hive2://sandbox.hortonworks.com:10500/default

[root@sandbox hive-druid-ssb]# 

3 Replies

Rising Star

Got it working. The script doesn't generate any data when the scale is set to 1. The error is a one-liner in the console output, so it was difficult to spot. Even though no data was generated, the script continued on to create the table instead of exiting.
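
In case it helps others who hit this: a minimal sanity check (my own addition, not something the repo provides) is to count the rows in one of the generated source tables before running the load step; if the count is 0, the Druid CTAS writes no segments and fails exactly as above. The database name follows the ssb_${SCALE}_flat_orc pattern that 00load.sh substitutes into the query:

# Assumes scale factor 1, matching "sh 00load.sh 1 ..." above.
SCALE=1
beeline -u jdbc:hive2://sandbox.hortonworks.com:10500/default \
  -e "SELECT count(*) FROM ssb_${SCALE}_flat_orc.lineorder;"
# A count of 0 means data generation failed. Regenerate the data before
# re-running the load; the script itself does not stop on empty input.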

Explorer

Hi, can you please share more details about how you got it working? I'm facing the same problem while trying to create a new table with the Druid storage handler.

Explorer

@nbalaji-elangovan @Megh Vidani

Can you please explain what you did to make it work? Creating a table on Hive LLAP using the Druid Storage Handler gives me a similar error.