Created on 10-22-2017 01:12 PM - edited 09-16-2022 05:25 AM
Steps: https://github.com/cartershanklin/hive-druid-ssb
Error:
INFO : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
INFO : Starting task [Stage-0:MOVE] in serial mode
INFO : Moving data to directory hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/ssb_druid from hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/.hive-staging_hive_2017-10-19_09-48-11_929_1281256378284617965-1/-ext-10002
INFO : Starting task [Stage-4:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist.
INFO : Resetting the caller context to HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916
INFO : Completed executing command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10); Time taken: 6.296 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist. (state=08S01,code=1)
Closing: 0: jdbc:hive2://sandbox.hortonworks.com:10500/default
[root@sandbox hive-druid-ssb]#
Failing load step:
[root@sandbox hive-druid-ssb]# sh 00load.sh 1 sandbox.hortonworks.com:10500 sandbox.hortonworks.com root hadoop
Connecting to jdbc:hive2://sandbox.hortonworks.com:10500/default
Connected to: Apache Hive (version 2.1.0.2.6.1.0-129)
Driver: Hive JDBC (version 1.2.1000.2.6.1.0-129)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.metadata.username=${DRUID_USERNAME};
No rows affected (0.14 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.metadata.password=${DRUID_PASSWORD};
No rows affected (0.007 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.metadata.uri=jdbc:mysql://${DRUID_HOST}/druid;
No rows affected (0.006 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.indexer.partition.size.max=1000000;
No rows affected (0.004 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.indexer.memory.rownum.max=100000;
No rows affected (0.007 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.broker.address.default=${DRUID_HOST}:8082;
No rows affected (0.006 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.coordinator.address.default=${DRUID_HOST}:8081;
No rows affected (0.007 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.storage.storageDirectory=/apps/hive/warehouse;
No rows affected (0.004 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.tez.container.size=1024;
No rows affected (0.007 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500> set hive.druid.passiveWaitTimeMs=180000;
No rows affected (0.005 seconds)
0: jdbc:hive2://sandbox.hortonworks.com:10500>
0: jdbc:hive2://sandbox.hortonworks.com:10500> CREATE TABLE ssb_druid
0: jdbc:hive2://sandbox.hortonworks.com:10500> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
0: jdbc:hive2://sandbox.hortonworks.com:10500> TBLPROPERTIES (
0: jdbc:hive2://sandbox.hortonworks.com:10500> "druid.datasource" = "ssb_druid",
0: jdbc:hive2://sandbox.hortonworks.com:10500> "druid.segment.granularity" = "MONTH",
0: jdbc:hive2://sandbox.hortonworks.com:10500> "druid.query.granularity" = "DAY")
0: jdbc:hive2://sandbox.hortonworks.com:10500> AS
0: jdbc:hive2://sandbox.hortonworks.com:10500> SELECT
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(d_year || '-' || d_monthnuminyear || '-' || d_daynuminmonth as timestamp) as `__time`,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(c_city as string) c_city,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(c_nation as string) c_nation,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(c_region as string) c_region,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(d_weeknuminyear as string) d_weeknuminyear,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(d_year as string) d_year,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(d_yearmonth as string) d_yearmonth,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(d_yearmonthnum as string) d_yearmonthnum,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(lo_discount as string) lo_discount,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(lo_quantity as string) lo_quantity,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(p_brand1 as string) p_brand1,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(p_category as string) p_category,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(p_mfgr as string) p_mfgr,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(s_city as string) s_city,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(s_nation as string) s_nation,
0: jdbc:hive2://sandbox.hortonworks.com:10500> cast(s_region as string) s_region,
0: jdbc:hive2://sandbox.hortonworks.com:10500> lo_revenue,
0: jdbc:hive2://sandbox.hortonworks.com:10500> lo_extendedprice * lo_discount discounted_price,
0: jdbc:hive2://sandbox.hortonworks.com:10500> lo_revenue - lo_supplycost net_revenue
0: jdbc:hive2://sandbox.hortonworks.com:10500> FROM
0: jdbc:hive2://sandbox.hortonworks.com:10500> ssb_${SCALE}_flat_orc.customer, ssb_${SCALE}_flat_orc.dates, ssb_${SCALE}_flat_orc.lineorder,
0: jdbc:hive2://sandbox.hortonworks.com:10500> ssb_${SCALE}_flat_orc.part, ssb_${SCALE}_flat_orc.supplier
0: jdbc:hive2://sandbox.hortonworks.com:10500> where
0: jdbc:hive2://sandbox.hortonworks.com:10500> lo_orderdate = d_datekey and lo_partkey = p_partkey
0: jdbc:hive2://sandbox.hortonworks.com:10500> and lo_suppkey = s_suppkey and lo_custkey = c_custkey;
INFO : Compiling command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10): CREATE TABLE ssb_druid STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' TBLPROPERTIES ( "druid.datasource" = "ssb_druid", "druid.segment.granularity" = "MONTH", "druid.query.granularity" = "DAY") AS SELECT cast(d_year || '-' || d_monthnuminyear || '-' || d_daynuminmonth as timestamp) as `__time`, cast(c_city as string) c_city, cast(c_nation as string) c_nation, cast(c_region as string) c_region, cast(d_weeknuminyear as string) d_weeknuminyear, cast(d_year as string) d_year, cast(d_yearmonth as string) d_yearmonth, cast(d_yearmonthnum as string) d_yearmonthnum, cast(lo_discount as string) lo_discount, cast(lo_quantity as string) lo_quantity, cast(p_brand1 as string) p_brand1, cast(p_category as string) p_category, cast(p_mfgr as string) p_mfgr, cast(s_city as string) s_city, cast(s_nation as string) s_nation, cast(s_region as string) s_region, lo_revenue, lo_extendedprice * lo_discount discounted_price, lo_revenue - lo_supplycost net_revenue FROM ssb_1_flat_orc.customer, ssb_1_flat_orc.dates, ssb_1_flat_orc.lineorder, ssb_1_flat_orc.part, ssb_1_flat_orc.supplier where lo_orderdate = d_datekey and lo_partkey = p_partkey and lo_suppkey = s_suppkey and lo_custkey = c_custkey
INFO : We are setting the hadoop caller context from HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916 to hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:__time, type:timestamp, comment:null), FieldSchema(name:c_city, type:string, comment:null), FieldSchema(name:c_nation, type:string, comment:null), FieldSchema(name:c_region, type:string, comment:null), FieldSchema(name:d_weeknuminyear, type:string, comment:null), FieldSchema(name:d_year, type:string, comment:null), FieldSchema(name:d_yearmonth, type:string, comment:null), FieldSchema(name:d_yearmonthnum, type:string, comment:null), FieldSchema(name:lo_discount, type:string, comment:null), FieldSchema(name:lo_quantity, type:string, comment:null), FieldSchema(name:p_brand1, type:string, comment:null), FieldSchema(name:p_category, type:string, comment:null), FieldSchema(name:p_mfgr, type:string, comment:null), FieldSchema(name:s_city, type:string, comment:null), FieldSchema(name:s_nation, type:string, comment:null), FieldSchema(name:s_region, type:string, comment:null), FieldSchema(name:lo_revenue, type:double, comment:null), FieldSchema(name:discounted_price, type:double, comment:null), FieldSchema(name:net_revenue, type:double, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10); Time taken: 4.972 seconds
INFO : We are resetting the hadoop caller context to HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916
INFO : Setting caller context to query id hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10
INFO : Executing command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10): CREATE TABLE ssb_druid STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' TBLPROPERTIES ( "druid.datasource" = "ssb_druid", "druid.segment.granularity" = "MONTH", "druid.query.granularity" = "DAY") AS SELECT cast(d_year || '-' || d_monthnuminyear || '-' || d_daynuminmonth as timestamp) as `__time`, cast(c_city as string) c_city, cast(c_nation as string) c_nation, cast(c_region as string) c_region, cast(d_weeknuminyear as string) d_weeknuminyear, cast(d_year as string) d_year, cast(d_yearmonth as string) d_yearmonth, cast(d_yearmonthnum as string) d_yearmonthnum, cast(lo_discount as string) lo_discount, cast(lo_quantity as string) lo_quantity, cast(p_brand1 as string) p_brand1, cast(p_category as string) p_category, cast(p_mfgr as string) p_mfgr, cast(s_city as string) s_city, cast(s_nation as string) s_nation, cast(s_region as string) s_region, lo_revenue, lo_extendedprice * lo_discount discounted_price, lo_revenue - lo_supplycost net_revenue FROM ssb_1_flat_orc.customer, ssb_1_flat_orc.dates, ssb_1_flat_orc.lineorder, ssb_1_flat_orc.part, ssb_1_flat_orc.supplier where lo_orderdate = d_datekey and lo_partkey = p_partkey and lo_suppkey = s_suppkey and lo_custkey = c_custkey
INFO : Query ID = hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Session is already open
INFO : Tez session missing resources, adding additional necessary resources
INFO : Dag name: CREATE TABLE ssb_druid STORED BY...c_custkey(Stage-1)
INFO : Setting tez.task.scale.memory.reserve-fraction to 0.30000001192092896
INFO : Status: Running (Executing on YARN cluster with App id application_1508387771410_0016)

--------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED
--------------------------------------------------------------------------------
Map 1                 llap     SUCCEEDED      0          0        0        0       0
Map 2                 llap     SUCCEEDED      0          0        0        0       0
Map 4                 llap     SUCCEEDED      0          0        0        0       0
Map 5                 llap     SUCCEEDED      0          0        0        0       0
Map 6                 llap     SUCCEEDED      0          0        0        0       0
Reducer 3 ......      llap     SUCCEEDED      1          1        0        0       0
--------------------------------------------------------------------------------
VERTICES: 01/06  [==========================>>] 100%  ELAPSED TIME: 3.16 s
--------------------------------------------------------------------------------
INFO : Status: DAG finished successfully in 2.49 seconds
INFO :
INFO : Query Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : OPERATION                                                  DURATION
INFO : ----------------------------------------------------------------------------------------------
INFO : Compile Query                                                 4.97s
INFO : Prepare Plan                                                  2.05s
INFO : Submit Plan                                                   0.72s
INFO : Start DAG                                                     0.71s
INFO : Run DAG                                                       2.49s
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : Task Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO :   VERTICES   DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
INFO : ----------------------------------------------------------------------------------------------
INFO :      Map 1           0.00             0            0              0               0
INFO :      Map 2           0.00             0            0              0               0
INFO :      Map 4           0.00             0            0              0               0
INFO :      Map 5           0.00             0            0              0               0
INFO :      Map 6           0.00             0            0              0               0
INFO :  Reducer 3        1411.00             0            0              0               0
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : LLAP IO Summary
INFO : ----------------------------------------------------------------------------------------------
INFO :   VERTICES  ROWGROUPS  META_HIT  META_MISS  DATA_HIT  DATA_MISS  ALLOCATION  USED  TOTAL_IO
INFO : ----------------------------------------------------------------------------------------------
INFO :      Map 1          0         0          0        0B         0B          0B    0B     0.00s
INFO :      Map 2          0         0          0        0B         0B          0B    0B     0.00s
INFO :      Map 4          0         0          0        0B         0B          0B    0B     0.00s
INFO :      Map 5          0         0          0        0B         0B          0B    0B     0.00s
INFO :      Map 6          0         0          0        0B         0B          0B    0B     0.00s
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : FileSystem Counters Summary
INFO :
INFO : Scheme: HDFS
INFO : ----------------------------------------------------------------------------------------------
INFO :   VERTICES  BYTES_READ  READ_OPS  LARGE_READ_OPS  BYTES_WRITTEN  WRITE_OPS
INFO : ----------------------------------------------------------------------------------------------
INFO :      Map 1          0B         0               0             0B          0
INFO :      Map 2          0B         0               0             0B          0
INFO :      Map 4          0B         0               0             0B          0
INFO :      Map 5          0B         0               0             0B          0
INFO :      Map 6          0B         0               0             0B          0
INFO :  Reducer 3          0B         1               0            43B          1
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : Scheme: FILE
INFO : ----------------------------------------------------------------------------------------------
INFO :   VERTICES  BYTES_READ  READ_OPS  LARGE_READ_OPS  BYTES_WRITTEN  WRITE_OPS
INFO : ----------------------------------------------------------------------------------------------
INFO :      Map 1          0B         0               0             0B          0
INFO :      Map 2          0B         0               0             0B          0
INFO :      Map 4          0B         0               0             0B          0
INFO :      Map 5          0B         0               0             0B          0
INFO :      Map 6          0B         0               0             0B          0
INFO :  Reducer 3          0B         0               0             0B          0
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : org.apache.tez.common.counters.DAGCounter:
INFO :    NUM_SUCCEEDED_TASKS: 1
INFO :    TOTAL_LAUNCHED_TASKS: 1
INFO :    AM_CPU_MILLISECONDS: 2810
INFO :    AM_GC_TIME_MILLIS: 13
INFO : File System Counters:
INFO :    FILE_BYTES_READ: 0
INFO :    FILE_BYTES_WRITTEN: 0
INFO :    FILE_READ_OPS: 0
INFO :    FILE_LARGE_READ_OPS: 0
INFO :    FILE_WRITE_OPS: 0
INFO :    HDFS_BYTES_READ: 0
INFO :    HDFS_BYTES_WRITTEN: 43
INFO :    HDFS_READ_OPS: 1
INFO :    HDFS_LARGE_READ_OPS: 0
INFO :    HDFS_WRITE_OPS: 1
INFO : org.apache.tez.common.counters.TaskCounter:
INFO :    REDUCE_INPUT_RECORDS: 0
INFO :    OUTPUT_RECORDS: 0
INFO :    SHUFFLE_BYTES_DECOMPRESSED: 0
INFO : HIVE:
INFO :    RECORDS_OUT_1_default.ssb_druid: 0
INFO : TaskCounter_Reducer_3_INPUT_Map_2:
INFO :    REDUCE_INPUT_RECORDS: 0
INFO :    SHUFFLE_BYTES_DECOMPRESSED: 0
INFO : TaskCounter_Reducer_3_OUTPUT_out_Reducer_3:
INFO :    OUTPUT_RECORDS: 0
INFO : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode
INFO : Starting task [Stage-0:MOVE] in serial mode
INFO : Moving data to directory hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/ssb_druid from hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/.hive-staging_hive_2017-10-19_09-48-11_929_1281256378284617965-1/-ext-10002
INFO : Starting task [Stage-4:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist.
INFO : Resetting the caller context to HIVE_SSN_ID:ae1bcabc-646b-4e05-96cb-40fe4990a916
INFO : Completed executing command(queryId=hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10); Time taken: 6.296 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20171019094811_def7c784-7ab4-4803-943e-65c5b36f8d10/segmentsDescriptorDir does not exist. (state=08S01,code=1)
Closing: 0: jdbc:hive2://sandbox.hortonworks.com:10500/default
[root@sandbox hive-druid-ssb]#
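Note the counter RECORDS_OUT_1_default.ssb_druid: 0 above: the CTAS wrote zero rows, and every map vertex read 0 bytes. A quick way to tell whether the failure comes from empty inputs rather than from Druid itself is to count the source rows and list the storage handler's working directory. This is a minimal diagnostic sketch, not part of the hive-druid-ssb scripts; it assumes beeline is on the PATH, the HiveServer2 Interactive URL from the log, and the default hive.druid.working.directory of /tmp/workingDirectory:

# Hypothetical diagnostic, run from a shell on the sandbox.
HS2_URL="jdbc:hive2://sandbox.hortonworks.com:10500/default"

# If this prints 0, the CTAS had nothing to index and the storage
# handler never wrote any segment descriptors.
beeline -u "$HS2_URL" -e "select count(*) from ssb_1_flat_orc.lineorder;"

# The Stage-4 DDL task fails when segmentsDescriptorDir is missing
# under the Druid working directory.
hdfs dfs -ls -R /tmp/workingDirectory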
Created 10-23-2017 02:04 PM
Got it working. The script doesn't generate any data when the scale is set to 1. The error was a single line in the console output, so it was hard to spot. Even though no data was generated, the script went on to attempt the table creation rather than exiting; a guard like the sketch below would have caught it.
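In other words, the root cause was an empty source database, not Druid. A fail-fast check at the top of the load script makes this obvious. A minimal sketch, assuming beeline is on the PATH and the connection URL from this thread; this is hypothetical and not part of the actual 00load.sh:

#!/bin/sh
# Hypothetical guard: abort before the Druid CTAS if the fact table is empty.
SCALE=$1
HS2_URL="jdbc:hive2://sandbox.hortonworks.com:10500/default"

# Count rows in the fact table the CTAS joins against.
ROWS=$(beeline -u "$HS2_URL" --silent=true --showHeader=false --outputformat=tsv2 \
  -e "select count(*) from ssb_${SCALE}_flat_orc.lineorder;")

case "$ROWS" in
  ''|0)
    echo "ssb_${SCALE}_flat_orc.lineorder has no rows; generate data first." >&2
    exit 1
    ;;
esac

With a check like this the script stops before creating a Druid table over an empty result set, instead of failing later in the DDL task with the segmentsDescriptorDir error.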
Created 05-07-2018 03:22 PM
Hi, can you please share more details about how you got it to work? I'm facing the same problem while trying to create a new table with the Druid storage handler.
Created 12-18-2018 09:47 AM
Can you please explain what you did to make it work? Creating a table on Hive LLAP using the Druid Storage Handler gives me a similar error.