Sandbox v3.1.0 performance issue: GROUP BY query over 5400 rows takes 6 minutes to run.


New Contributor

Following the tutorial:

https://hortonworks.com/tutorial/hadoop-tutorial-getting-started-with-hdp/section/3/#analyze-the-tru...

I created the truckmileage table OK.

The query below takes 367 seconds to run in Hive, and hangs in DAS (Data Analytics Studio):

SELECT truckid, avg(mpg) avgmpg FROM truckmileage GROUP BY truckid;

Why is the performance so bad?
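For reference, a minimal way to reproduce the timing from the sandbox shell, reusing the exact JDBC connection string that appears in the Hive output below (a sketch, assuming the sandbox defaults):

# Time the query through Beeline, using the same ZooKeeper discovery URL as in the session log
beeline -u "jdbc:hive2://sandbox-hdp.hortonworks.com:2181/default;password=hive;serviceDiscoveryMode=zooKeeper;user=hive;zooKeeperNamespace=hiveserver2" \
  -e "SELECT truckid, avg(mpg) avgmpg FROM truckmileage GROUP BY truckid;"

Beeline reports the elapsed time at the end ("100 rows selected (367.595 seconds)" in the log below).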

The sandbox is running on VMware with 16 GB of memory.

The host has 32 GB of memory, an SSD, and a 7th-gen i7.

Is there a better development tool than DAS, one that will:

1) keep metadata up to date? i.e. refresh the tables listed in the 'default' schema after a table is dropped (see the sketch just below)

2) report errors (although this might be a cache issue), similar to the above?
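
As a sketch of the check behind question 1 (the same DROP/SHOW sequence appears verbatim in the Beeline session below):

# Drop a table, then ask HiveServer2 directly what the metastore now contains
beeline -u "jdbc:hive2://sandbox-hdp.hortonworks.com:2181/default;password=hive;serviceDiscoveryMode=zooKeeper;user=hive;zooKeeperNamespace=hiveserver2" \
  -e "drop table truckmileage;" \
  -e "show tables;"

Beeline reflects the drop immediately, so if a tool still lists the table, the staleness is in that tool's cache rather than in the metastore.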

Hive Output

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

login as: root
root@sandbox-hdp.hortonworks.com's password:
Last login: Mon Jun 24 11:42:11 2019
[root@sandbox-hdp ~]# wget sandbox-hdp.hortonworks.com:1080
--2019-07-22 11:54:12-- http://sandbox-hdp.hortonworks.com:1080/
Resolving sandbox-hdp.hortonworks.com (sandbox-hdp.hortonworks.com)... 172.18.0.2
Connecting to sandbox-hdp.hortonworks.com (sandbox-hdp.hortonworks.com)|172.18.0.2|:1080... connected.
HTTP request sent, awaiting response... 302 Found
Location: /splash.html [following]
--2019-07-22 11:54:12-- http://sandbox-hdp.hortonworks.com:1080/splash.html
Connecting to sandbox-hdp.hortonworks.com (sandbox-hdp.hortonworks.com)|172.18.0.2|:1080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2446 (2.4K) [text/html]
Saving to: ‘index.html.6’

100%[==========================================================================================================>] 2,446 --.-K/s in 0.01s

2019-07-22 11:54:12 (225 KB/s) - ‘index.html.6’ saved [2446/2446]

[root@sandbox-hdp ~]# wget sandbox-hdp.hortonworks.com:8080
--2019-07-22 11:54:22-- http://sandbox-hdp.hortonworks.com:8080/
Resolving sandbox-hdp.hortonworks.com (sandbox-hdp.hortonworks.com)... 172.18.0.2
Connecting to sandbox-hdp.hortonworks.com (sandbox-hdp.hortonworks.com)|172.18.0.2|:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2075 (2.0K) [text/html]
Saving to: ‘index.html.7’

100%[==========================================================================================================>] 2,075 --.-K/s in 0s

2019-07-22 11:54:22 (258 MB/s) - ‘index.html.7’ saved [2075/2075]

[root@sandbox-hdp ~]# wget sandbox-hdp.hortonworks.com:30800
--2019-07-22 11:54:52-- http://sandbox-hdp.hortonworks.com:30800/
Resolving sandbox-hdp.hortonworks.com (sandbox-hdp.hortonworks.com)... 172.18.0.2
Connecting to sandbox-hdp.hortonworks.com (sandbox-hdp.hortonworks.com)|172.18.0.2|:30800... failed: Connection refused.
[root@sandbox-hdp ~]# spark
-bash: spark: command not found
[root@sandbox-hdp ~]# spark-sql
SPARK_MAJOR_VERSION is set to 2, using Spark2
19/07/22 12:34:38 INFO metastore: Trying to connect to metastore with URI thrift://sandbox-hdp.hortonworks.com:9083
19/07/22 12:34:39 INFO metastore: Connected to metastore.
19/07/22 12:34:41 INFO SessionState: Created HDFS directory: /tmp/spark/root
19/07/22 12:34:41 INFO SessionState: Created local directory: /tmp/root
19/07/22 12:34:41 INFO SessionState: Created local directory: /tmp/de7aaa34-604c-45b5-9670-a3bb2f98766b_resources
19/07/22 12:34:41 INFO SessionState: Created HDFS directory: /tmp/spark/root/de7aaa34-604c-45b5-9670-a3bb2f98766b
19/07/22 12:34:41 INFO SessionState: Created local directory: /tmp/root/de7aaa34-604c-45b5-9670-a3bb2f98766b
19/07/22 12:34:41 INFO SessionState: Created HDFS directory: /tmp/spark/root/de7aaa34-604c-45b5-9670-a3bb2f98766b/_tmp_space.db
19/07/22 12:34:41 INFO SparkContext: Running Spark version 2.3.1.3.0.1.0-187
19/07/22 12:34:41 INFO SparkContext: Submitted application: SparkSQL::172.18.0.2
19/07/22 12:34:41 INFO SecurityManager: Changing view acls to: root
19/07/22 12:34:41 INFO SecurityManager: Changing modify acls to: root
19/07/22 12:34:41 INFO SecurityManager: Changing view acls groups to:
19/07/22 12:34:41 INFO SecurityManager: Changing modify acls groups to:
19/07/22 12:34:41 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/07/22 12:34:42 INFO Utils: Successfully started service 'sparkDriver' on port 40389.
19/07/22 12:34:42 INFO SparkEnv: Registering MapOutputTracker
19/07/22 12:34:42 INFO SparkEnv: Registering BlockManagerMaster
19/07/22 12:34:42 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
19/07/22 12:34:42 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
19/07/22 12:34:42 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-ce9a4897-dab9-492e-a312-c1d776e5ecb7
19/07/22 12:34:42 INFO MemoryStore: MemoryStore started with capacity 93.3 MB
19/07/22 12:34:42 INFO SparkEnv: Registering OutputCommitCoordinator
19/07/22 12:34:42 INFO log: Logging initialized @8627ms
19/07/22 12:34:43 INFO Server: jetty-9.3.z-SNAPSHOT, build timestamp: 2018-06-05T17:11:56Z, git hash: 84205aa28f11a4f31f2a3b86d1bba2cc8ab69827
19/07/22 12:34:43 INFO Server: Started @8993ms
19/07/22 12:34:43 INFO AbstractConnector: Started ServerConnector@474c9131{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
19/07/22 12:34:43 INFO Utils: Successfully started service 'SparkUI' on port 4040.
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@859ea42{/jobs,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@396ef8b2{/jobs/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@72825400{/jobs/job,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5f117b3d{/jobs/job/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1174a305{/stages,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@71b6d77f{/stages/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1866da85{/stages/stage,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@3f685162{/stages/stage/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@11f406f8{/stages/pool,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@987455b{/stages/pool/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@622fdb81{/storage,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1f3165e7{/storage/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@2ec3633f{/storage/rdd,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1d5d5621{/storage/rdd/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@13275d8{/environment,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@15b82644{/environment/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@20576557{/executors,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@574cd322{/executors/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@45c2e0a6{/executors/threadDump,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@119c745c{/executors/threadDump/json,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@a7ad6e5{/static,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@26221bad{/,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@158f4cfe{/api,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@3e47a03{/jobs/job/kill,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7d9ba6c{/stages/stage/kill,null,AVAILABLE,@Spark}
19/07/22 12:34:43 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://sandbox-hdp.hortonworks.com:4040
19/07/22 12:34:43 INFO RMProxy: Connecting to ResourceManager at sandbox-hdp.hortonworks.com/172.18.0.2:8050
19/07/22 12:34:44 INFO Client: Requesting a new application from cluster with 1 NodeManagers
19/07/22 12:34:44 INFO Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.1.0-187/0/resource-types.xml
19/07/22 12:34:44 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container)
19/07/22 12:34:44 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
19/07/22 12:34:44 INFO Client: Setting up container launch context for our AM
19/07/22 12:34:44 INFO Client: Setting up the launch environment for our AM container
19/07/22 12:34:44 INFO Client: Preparing resources for our AM container
19/07/22 12:34:46 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs://sandbox-hdp.hortonworks.com:8020/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-yarn-archive.tar.gz
19/07/22 12:34:46 INFO Client: Source and destination file systems are the same. Not copying hdfs://sandbox-hdp.hortonworks.com:8020/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-yarn-archive.tar.gz
19/07/22 12:34:46 INFO Client: Distribute hdfs cache file as spark.sql.hive.metastore.jars for HDP, hdfsCacheFile:hdfs://sandbox-hdp.hortonworks.com:8020/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-hive-archive.tar.gz
19/07/22 12:34:46 INFO Client: Source and destination file systems are the same. Not copying hdfs://sandbox-hdp.hortonworks.com:8020/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-hive-archive.tar.gz
19/07/22 12:34:47 INFO Client: Uploading resource file:/tmp/spark-b5dc651c-8276-48f7-b524-a85c9beadded/__spark_conf__6301860785565142139.zip -> hdfs://sandbox-hdp.hortonworks.com:8020/user/root/.sparkStaging/application_1563796668538_0003/__spark_conf__.zip
19/07/22 12:34:47 INFO SecurityManager: Changing view acls to: root
19/07/22 12:34:47 INFO SecurityManager: Changing modify acls to: root
19/07/22 12:34:47 INFO SecurityManager: Changing view acls groups to:
19/07/22 12:34:47 INFO SecurityManager: Changing modify acls groups to:
19/07/22 12:34:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/07/22 12:34:47 INFO Client: Submitting application application_1563796668538_0003 to ResourceManager
19/07/22 12:34:48 INFO YarnClientImpl: Submitted application application_1563796668538_0003
19/07/22 12:34:48 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1563796668538_0003 and attemptId None
19/07/22 12:34:49 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:49 INFO Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1563798887829
final status: UNDEFINED
tracking URL: http://sandbox-hdp.hortonworks.com:8088/proxy/application_1563796668538_0003/
user: root
19/07/22 12:34:50 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:51 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:52 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:53 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:54 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:55 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:56 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:57 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:58 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:59 INFO Client: Application report for application_1563796668538_0003 (state: ACCEPTED)
19/07/22 12:34:59 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox-hdp.hortonworks.com, PROXY_URI_BASES -> http://sandbox-hdp.hortonworks.com:8088/proxy/application_1563796668538_0003), /proxy/application_1563796668538_0003
19/07/22 12:34:59 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
19/07/22 12:35:00 INFO Client: Application report for application_1563796668538_0003 (state: RUNNING)
19/07/22 12:35:00 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 172.18.0.2
ApplicationMaster RPC port: 0
queue: default
start time: 1563798887829
final status: UNDEFINED
tracking URL: http://sandbox-hdp.hortonworks.com:8088/proxy/application_1563796668538_0003/
user: root
19/07/22 12:35:00 INFO YarnClientSchedulerBackend: Application application_1563796668538_0003 has started running.
19/07/22 12:35:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41283.
19/07/22 12:35:00 INFO NettyBlockTransferService: Server created on sandbox-hdp.hortonworks.com:41283
19/07/22 12:35:00 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/07/22 12:35:00 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
19/07/22 12:35:00 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, sandbox-hdp.hortonworks.com, 41283, None)
19/07/22 12:35:00 INFO BlockManagerMasterEndpoint: Registering block manager sandbox-hdp.hortonworks.com:41283 with 93.3 MB RAM, BlockManagerId(driver, sandbox-hdp.hortonworks.com, 41283, None)
19/07/22 12:35:00 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, sandbox-hdp.hortonworks.com, 41283, None)
19/07/22 12:35:00 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, sandbox-hdp.hortonworks.com, 41283, None)
19/07/22 12:35:01 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
19/07/22 12:35:01 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4e4395c{/metrics/json,null,AVAILABLE,@Spark}
19/07/22 12:35:02 INFO EventLoggingListener: Logging events to hdfs:/spark2-history/application_1563796668538_0003
19/07/22 12:35:09 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (172.18.0.2:41560) with ID 2
19/07/22 12:35:09 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (172.18.0.2:41564) with ID 1
19/07/22 12:35:09 INFO BlockManagerMasterEndpoint: Registering block manager sandbox-hdp.hortonworks.com:36001 with 93.3 MB RAM, BlockManagerId(2, sandbox-hdp.hortonworks.com, 36001, None)
19/07/22 12:35:09 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
19/07/22 12:35:09 INFO SharedState: loading hive config file: file:/etc/spark2/3.0.1.0-187/0/hive-site.xml
19/07/22 12:35:09 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/apps/spark/warehouse').
19/07/22 12:35:09 INFO SharedState: Warehouse path is '/apps/spark/warehouse'.
19/07/22 12:35:09 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL.
19/07/22 12:35:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@fe42a09{/SQL,null,AVAILABLE,@Spark}
19/07/22 12:35:09 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json.
19/07/22 12:35:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1ffd0114{/SQL/json,null,AVAILABLE,@Spark}
19/07/22 12:35:09 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution.
19/07/22 12:35:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@99c4993{/SQL/execution,null,AVAILABLE,@Spark}
19/07/22 12:35:09 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json.
19/07/22 12:35:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@9729a97{/SQL/execution/json,null,AVAILABLE,@Spark}
19/07/22 12:35:09 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
19/07/22 12:35:09 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@64bba0eb{/static/sql,null,AVAILABLE,@Spark}
19/07/22 12:35:09 INFO BlockManagerMasterEndpoint: Registering block manager sandbox-hdp.hortonworks.com:36081 with 93.3 MB RAM, BlockManagerId(1, sandbox-hdp.hortonworks.com, 36081, None)
19/07/22 12:35:09 INFO HiveUtils: Initializing HiveMetastoreConnection version 3.0 using file:/usr/hdp/current/spark2-client/standalone-metastore/standalone-metastore-1.21.2.3.0.1.0-187-hive3.jar:file:/usr/hdp/current/spark2-client/standalone-metastore/standalone-metastore-1.21.2.3.0.1.0-187-hive3.jar
19/07/22 12:35:10 INFO HiveConf: Found configuration file file:/usr/hdp/current/spark2-client/conf/hive-site.xml
Hive Session ID = c260d73e-bac3-4604-9ef0-e73948f1d8f8
19/07/22 12:35:10 INFO SessionState: Hive Session ID = c260d73e-bac3-4604-9ef0-e73948f1d8f8
19/07/22 12:35:11 INFO SessionState: Created HDFS directory: /tmp/spark/root/c260d73e-bac3-4604-9ef0-e73948f1d8f8
19/07/22 12:35:11 INFO SessionState: Created local directory: /tmp/root/c260d73e-bac3-4604-9ef0-e73948f1d8f8
19/07/22 12:35:11 INFO SessionState: Created HDFS directory: /tmp/spark/root/c260d73e-bac3-4604-9ef0-e73948f1d8f8/_tmp_space.db
19/07/22 12:35:11 INFO HiveClientImpl: Warehouse location for Hive client (version 3.0.0) is /apps/spark/warehouse
19/07/22 12:35:12 INFO HiveMetaStoreClient: Trying to connect to metastore with URI thrift://sandbox-hdp.hortonworks.com:9083
19/07/22 12:35:12 INFO HiveMetaStoreClient: Opened a connection to metastore, current connections: 1
19/07/22 12:35:12 INFO HiveMetaStoreClient: Connected to metastore.
19/07/22 12:35:12 INFO RetryingMetaStoreClient: RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=root (auth:SIMPLE) retries=1 delay=5 lifetime=0
19/07/22 12:35:14 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
spark-sql> show tables;
Time taken: 1.543 seconds
19/07/22 12:35:43 INFO SparkSQLCLIDriver: Time taken: 1.543 seconds
spark-sql> show database;
Error in query:
missing 'FUNCTIONS' at '<EOF>'(line 1, pos 13)

== SQL ==
show database
-------------^^^

spark-sql> exit;
19/07/22 12:44:59 INFO AbstractConnector: Stopped Spark@474c9131{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
19/07/22 12:44:59 INFO SparkUI: Stopped Spark web UI at http://sandbox-hdp.hortonworks.com:4040
19/07/22 12:44:59 INFO YarnClientSchedulerBackend: Interrupting monitor thread
19/07/22 12:44:59 INFO YarnClientSchedulerBackend: Shutting down all executors
19/07/22 12:44:59 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
19/07/22 12:44:59 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
19/07/22 12:44:59 INFO YarnClientSchedulerBackend: Stopped
19/07/22 12:44:59 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/07/22 12:44:59 INFO MemoryStore: MemoryStore cleared
19/07/22 12:44:59 INFO BlockManager: BlockManager stopped
19/07/22 12:44:59 INFO BlockManagerMaster: BlockManagerMaster stopped
19/07/22 12:45:00 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/07/22 12:45:00 INFO SparkContext: Successfully stopped SparkContext
19/07/22 12:45:00 INFO ShutdownHookManager: Shutdown hook called
19/07/22 12:45:00 INFO ShutdownHookManager: Deleting directory /tmp/spark-4c39aa1b-cf45-4e62-a31f-b0d456f26e3f
19/07/22 12:45:00 INFO ShutdownHookManager: Deleting directory /tmp/spark-b5dc651c-8276-48f7-b524-a85c9beadded
[root@sandbox-hdp ~]# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://sandbox-hdp.hortonworks.com:2181/default;password=hive;serviceDiscoveryMode=zooKeeper;user=hive;zooKeeperNamespace=hiveserver2
19/07/22 12:45:13 [main]: INFO jdbc.HiveConnection: Connected to sandbox-hdp.hortonworks.com:10000
Connected to: Apache Hive (version 3.1.0.3.0.1.0-187)
Driver: Hive JDBC (version 3.1.0.3.0.1.0-187)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.0.1.0-187 by Apache Hive
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2>
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> show tables;
INFO : Compiling command(queryId=hive_20190722124529_307d7bf1-25a5-408a-8a9f-a3f8a897594b): show tables
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20190722124529_307d7bf1-25a5-408a-8a9f-a3f8a897594b); Time taken: 0.055 seconds
INFO : Executing command(queryId=hive_20190722124529_307d7bf1-25a5-408a-8a9f-a3f8a897594b): show tables
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20190722124529_307d7bf1-25a5-408a-8a9f-a3f8a897594b); Time taken: 0.027 seconds
INFO : OK
+---------------+
| tab_name |
+---------------+
| geolocation |
| truckmileage |
| trucks |
+---------------+
3 rows selected (0.34 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> select count(*) from trucks;
INFO : Compiling command(queryId=hive_20190722124551_523c67eb-6abc-4669-8dce-a0ca6fa2772b): select count(*) from trucks
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20190722124551_523c67eb-6abc-4669-8dce-a0ca6fa2772b); Time taken: 1.575 seconds
INFO : Executing command(queryId=hive_20190722124551_523c67eb-6abc-4669-8dce-a0ca6fa2772b): select count(*) from trucks
INFO : Completed executing command(queryId=hive_20190722124551_523c67eb-6abc-4669-8dce-a0ca6fa2772b); Time taken: 0.033 seconds
INFO : OK
+------+
| _c0 |
+------+
| 100 |
+------+
1 row selected (1.764 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> select count(*) from truckmileage;
INFO : Compiling command(queryId=hive_20190722124615_f37b1297-c900-41d7-b5dd-83ab3516141d): select count(*) from truckmileage
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20190722124615_f37b1297-c900-41d7-b5dd-83ab3516141d); Time taken: 0.286 seconds
INFO : Executing command(queryId=hive_20190722124615_f37b1297-c900-41d7-b5dd-83ab3516141d): select count(*) from truckmileage
INFO : Completed executing command(queryId=hive_20190722124615_f37b1297-c900-41d7-b5dd-83ab3516141d); Time taken: 0.01 seconds
INFO : OK
+-------+
| _c0 |
+-------+
| 5400 |
+-------+
1 row selected (0.359 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> select * from truckmileage limit 10;
INFO : Compiling command(queryId=hive_20190722124828_b0d3d206-3951-4949-babb-8fb5c9a45364): select * from truckmileage limit 10
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:truckmileage.truckid, type:string, comment:null), FieldSchema(name:truckmileage.driverid, type:string, comment:null), FieldSchema(name:truckmileage.rdate, type:string, comment:null), FieldSchema(name:truckmileage.miles, type:int, comment:null), FieldSchema(name:truckmileage.gas, type:int, comment:null), FieldSchema(name:truckmileage.mpg, type:double, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20190722124828_b0d3d206-3951-4949-babb-8fb5c9a45364); Time taken: 0.289 seconds
INFO : Executing command(queryId=hive_20190722124828_b0d3d206-3951-4949-babb-8fb5c9a45364): select * from truckmileage limit 10
INFO : Completed executing command(queryId=hive_20190722124828_b0d3d206-3951-4949-babb-8fb5c9a45364); Time taken: 0.012 seconds
INFO : OK
+-----------------------+------------------------+---------------------+---------------------+-------------------+--------------------+
| truckmileage.truckid | truckmileage.driverid | truckmileage.rdate | truckmileage.miles | truckmileage.gas | truckmileage.mpg |
+-----------------------+------------------------+---------------------+---------------------+-------------------+--------------------+
| A1 | A1 | jun13 | 9217 | 1914 | 4.815569487983281 |
| A1 | A1 | may13 | 8769 | 1892 | 4.63477801268499 |
| A1 | A1 | apr13 | 14234 | 3008 | 4.732047872340425 |
| A1 | A1 | mar13 | 11519 | 2262 | 5.092396109637489 |
| A1 | A1 | feb13 | 8676 | 1596 | 5.43609022556391 |
| A1 | A1 | jan13 | 10025 | 1878 | 5.338125665601704 |
| A1 | A1 | dec12 | 12647 | 2331 | 5.425568425568426 |
| A1 | A1 | nov12 | 10214 | 2054 | 4.972736124634859 |
| A1 | A1 | oct12 | 10807 | 2134 | 5.064198687910028 |
| A1 | A1 | sep12 | 11127 | 2191 | 5.078502966681881 |
+-----------------------+------------------------+---------------------+---------------------+-------------------+--------------------+
10 rows selected (0.426 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> drop table truckmileage;
INFO : Compiling command(queryId=hive_20190722125349_d02175fb-629c-493f-ba13-7b3e92f9a2e6): drop table truckmileage
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20190722125349_d02175fb-629c-493f-ba13-7b3e92f9a2e6); Time taken: 0.073 seconds
INFO : Executing command(queryId=hive_20190722125349_d02175fb-629c-493f-ba13-7b3e92f9a2e6): drop table truckmileage
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20190722125349_d02175fb-629c-493f-ba13-7b3e92f9a2e6); Time taken: 0.449 seconds
INFO : OK
No rows affected (0.578 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> show tables;
INFO : Compiling command(queryId=hive_20190722125420_7078d451-9c33-4a1e-b2b7-45d460ad2782): show tables
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20190722125420_7078d451-9c33-4a1e-b2b7-45d460ad2782); Time taken: 0.052 seconds
INFO : Executing command(queryId=hive_20190722125420_7078d451-9c33-4a1e-b2b7-45d460ad2782): show tables
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20190722125420_7078d451-9c33-4a1e-b2b7-45d460ad2782); Time taken: 0.072 seconds
INFO : OK
+--------------+
| tab_name |
+--------------+
| geolocation |
| trucks |
+--------------+
2 rows selected (0.253 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> show tables;
INFO : Compiling command(queryId=hive_20190722131852_1be196f2-bf54-4b0d-9173-59668fe20819): show tables
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20190722131852_1be196f2-bf54-4b0d-9173-59668fe20819); Time taken: 0.044 seconds
INFO : Executing command(queryId=hive_20190722131852_1be196f2-bf54-4b0d-9173-59668fe20819): show tables
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20190722131852_1be196f2-bf54-4b0d-9173-59668fe20819); Time taken: 0.027 seconds
INFO : OK
+--------------+
| tab_name |
+--------------+
| geolocation |
| trucks |
+--------------+
2 rows selected (0.099 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> SELECT truckid, avg(mpg) avgmpg FROM truckmileage GROUP BY truckid;
INFO : Compiling command(queryId=hive_20190722134150_70ee823f-8dbe-457e-ba92-ef523f334fa5): SELECT truckid, avg(mpg) avgmpg FROM truckmileage GROUP BY truckid
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:truckid, type:string, comment:null), FieldSchema(name:avgmpg, type:double, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20190722134150_70ee823f-8dbe-457e-ba92-ef523f334fa5); Time taken: 0.496 seconds
INFO : Executing command(queryId=hive_20190722134150_70ee823f-8dbe-457e-ba92-ef523f334fa5): SELECT truckid, avg(mpg) avgmpg FROM truckmileage GROUP BY truckid
INFO : Query ID = hive_20190722134150_70ee823f-8dbe-457e-ba92-ef523f334fa5
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20190722134150_70ee823f-8dbe-457e-ba92-ef523f334fa5
INFO : Tez session hasn't been created yet. Opening session
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... container SUCCEEDED 2 2 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 11.82 s
----------------------------------------------------------------------------------------------
INFO : Status: DAG finished successfully in 11.76 seconds
INFO :
INFO : Query Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : OPERATION DURATION
INFO : ----------------------------------------------------------------------------------------------
INFO : Compile Query 0.50s
INFO : Prepare Plan 352.98s
INFO : Get Query Coordinator (AM) 0.00s
INFO : Submit Plan 0.53s
INFO : Start DAG 1.66s
INFO : Run DAG 11.76s
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : Task Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS
INFO : ----------------------------------------------------------------------------------------------
INFO : Map 1 3130.00 6,700 155 5,400 100
INFO : Reducer 2 5409.00 7,460 125 100 0
INFO : ----------------------------------------------------------------------------------------------
INFO :
INFO : org.apache.tez.common.counters.DAGCounter:
INFO : NUM_SUCCEEDED_TASKS: 3
INFO : TOTAL_LAUNCHED_TASKS: 3
INFO : AM_CPU_MILLISECONDS: 4780
INFO : AM_GC_TIME_MILLIS: 0
INFO : File System Counters:
INFO : FILE_BYTES_READ: 2176
INFO : FILE_BYTES_WRITTEN: 1398
INFO : HDFS_BYTES_READ: 56428
INFO : HDFS_BYTES_WRITTEN: 3563
INFO : HDFS_READ_OPS: 8
INFO : HDFS_WRITE_OPS: 4
INFO : HDFS_OP_CREATE: 2
INFO : HDFS_OP_GET_FILE_STATUS: 6
INFO : HDFS_OP_OPEN: 2
INFO : HDFS_OP_RENAME: 2
INFO : org.apache.tez.common.counters.TaskCounter:
INFO : REDUCE_INPUT_GROUPS: 100
INFO : REDUCE_INPUT_RECORDS: 100
INFO : COMBINE_INPUT_RECORDS: 0
INFO : SPILLED_RECORDS: 200
INFO : NUM_SHUFFLED_INPUTS: 2
INFO : NUM_SKIPPED_INPUTS: 0
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : MERGED_MAP_OUTPUTS: 2
INFO : GC_TIME_MILLIS: 280
INFO : TASK_DURATION_MILLIS: 7301
INFO : CPU_MILLISECONDS: 14160
INFO : PHYSICAL_MEMORY_BYTES: 1946157056
INFO : VIRTUAL_MEMORY_BYTES: 8015695872
INFO : COMMITTED_HEAP_BYTES: 1946157056
INFO : INPUT_RECORDS_PROCESSED: 6
INFO : INPUT_SPLIT_LENGTH_BYTES: 55074
INFO : OUTPUT_RECORDS: 100
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_BYTES: 1592
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 1804
INFO : OUTPUT_BYTES_PHYSICAL: 1342
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILLS_BYTES_READ: 1342
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : SHUFFLE_CHUNK_COUNT: 1
INFO : SHUFFLE_BYTES: 1342
INFO : SHUFFLE_BYTES_DECOMPRESSED: 1804
INFO : SHUFFLE_BYTES_TO_MEM: 0
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_DISK_DIRECT: 1342
INFO : NUM_MEM_TO_DISK_MERGES: 0
INFO : NUM_DISK_TO_DISK_MERGES: 0
INFO : SHUFFLE_PHASE_TIME: 256
INFO : MERGE_PHASE_TIME: 418
INFO : FIRST_EVENT_RECEIVED: 150
INFO : LAST_EVENT_RECEIVED: 150
INFO : HIVE:
INFO : CREATED_FILES: 2
INFO : DESERIALIZE_ERRORS: 0
INFO : RECORDS_IN_Map_1: 5400
INFO : RECORDS_OUT_0: 100
INFO : RECORDS_OUT_INTERMEDIATE_Map_1: 100
INFO : RECORDS_OUT_INTERMEDIATE_Reducer_2: 0
INFO : RECORDS_OUT_OPERATOR_FS_12: 100
INFO : RECORDS_OUT_OPERATOR_GBY_10: 100
INFO : RECORDS_OUT_OPERATOR_GBY_8: 100
INFO : RECORDS_OUT_OPERATOR_MAP_0: 0
INFO : RECORDS_OUT_OPERATOR_RS_9: 100
INFO : RECORDS_OUT_OPERATOR_SEL_11: 100
INFO : RECORDS_OUT_OPERATOR_SEL_7: 5400
INFO : RECORDS_OUT_OPERATOR_TS_0: 5400
INFO : Shuffle Errors:
INFO : BAD_ID: 0
INFO : CONNECTION: 0
INFO : IO_ERROR: 0
INFO : WRONG_LENGTH: 0
INFO : WRONG_MAP: 0
INFO : WRONG_REDUCE: 0
INFO : Shuffle Errors_Reducer_2_INPUT_Map_1:
INFO : BAD_ID: 0
INFO : CONNECTION: 0
INFO : IO_ERROR: 0
INFO : WRONG_LENGTH: 0
INFO : WRONG_MAP: 0
INFO : WRONG_REDUCE: 0
INFO : TaskCounter_Map_1_INPUT_truckmileage:
INFO : INPUT_RECORDS_PROCESSED: 6
INFO : INPUT_SPLIT_LENGTH_BYTES: 55074
INFO : TaskCounter_Map_1_OUTPUT_Reducer_2:
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : OUTPUT_BYTES: 1592
INFO : OUTPUT_BYTES_PHYSICAL: 1342
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 1804
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_RECORDS: 100
INFO : SHUFFLE_CHUNK_COUNT: 1
INFO : SPILLED_RECORDS: 100
INFO : TaskCounter_Reducer_2_INPUT_Map_1:
INFO : ADDITIONAL_SPILLS_BYTES_READ: 1342
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : COMBINE_INPUT_RECORDS: 0
INFO : FIRST_EVENT_RECEIVED: 150
INFO : LAST_EVENT_RECEIVED: 150
INFO : MERGED_MAP_OUTPUTS: 2
INFO : MERGE_PHASE_TIME: 418
INFO : NUM_DISK_TO_DISK_MERGES: 0
INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
INFO : NUM_MEM_TO_DISK_MERGES: 0
INFO : NUM_SHUFFLED_INPUTS: 2
INFO : NUM_SKIPPED_INPUTS: 0
INFO : REDUCE_INPUT_GROUPS: 100
INFO : REDUCE_INPUT_RECORDS: 100
INFO : SHUFFLE_BYTES: 1342
INFO : SHUFFLE_BYTES_DECOMPRESSED: 1804
INFO : SHUFFLE_BYTES_DISK_DIRECT: 1342
INFO : SHUFFLE_BYTES_TO_DISK: 0
INFO : SHUFFLE_BYTES_TO_MEM: 0
INFO : SHUFFLE_PHASE_TIME: 256
INFO : SPILLED_RECORDS: 100
INFO : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
INFO : OUTPUT_RECORDS: 0
INFO : org.apache.hadoop.hive.ql.exec.tez.HiveInputCounters:
INFO : GROUPED_INPUT_SPLITS_Map_1: 1
INFO : INPUT_DIRECTORIES_Map_1: 1
INFO : INPUT_FILES_Map_1: 1
INFO : RAW_INPUT_SPLITS_Map_1: 1
INFO : Completed executing command(queryId=hive_20190722134150_70ee823f-8dbe-457e-ba92-ef523f334fa5); Time taken: 366.94 seconds
INFO : OK
+----------+---------------------+
| truckid | avgmpg |
+----------+---------------------+
| A100 | 4.939038953107008 |
| A13 | 5.8205454797114795 |
| A15 | 5.009636162465418 |
| A18 | 6.295206724472384 |
| A19 | 5.04935270034368 |
| A20 | 4.434637900251125 |
| A25 | 4.756763642452679 |
| A27 | 4.940763169593083 |
| A29 | 4.652011108298253 |
| A30 | 6.202007499297939 |
| A31 | 5.186630121374391 |
| A32 | 5.17353087986957 |
| A33 | 5.65188891838994 |
| A34 | 4.739824311056264 |
| A35 | 5.738289483023642 |
| A38 | 5.917100754501856 |
| A4 | 4.504676811811306 |
| A41 | 4.855358136222561 |
| A42 | 4.559024143168264 |
| A43 | 4.43516544002827 |
| A44 | 5.018996103708838 |
| A45 | 5.208540799607152 |
| A46 | 5.113125679089717 |
| A47 | 4.722808315173039 |
| A48 | 5.592252419550423 |
| A49 | 5.62379167691581 |
| A5 | 5.223425534662345 |
| A53 | 4.669730376107297 |
| A54 | 5.122082710218873 |
| A56 | 4.765154211116589 |
| A57 | 4.745868653350815 |
| A58 | 4.337846525372068 |
| A59 | 5.623936629997015 |
| A6 | 5.709810313341764 |
| A61 | 6.238571478738324 |
| A65 | 4.582795667934662 |
| A66 | 5.422210415994836 |
| A67 | 5.712187104402484 |
| A68 | 5.232702632427136 |
| A72 | 4.86017814628895 |
| A73 | 4.861641727794131 |
| A74 | 5.507235804693363 |
| A75 | 5.023487463406983 |
| A78 | 4.9469631491473125 |
| A79 | 6.230205355202399 |
| A80 | 5.051228410947556 |
| A82 | 5.174718804682691 |
| A83 | 5.175163152000204 |
| A84 | 4.511143943462746 |
| A85 | 4.871965540887994 |
| A86 | 5.160003390095063 |
| A89 | 4.793870887656507 |
| A91 | 5.681998816091858 |
| A96 | 5.08203982229108 |
| A98 | 4.809896132716975 |
| A1 | 4.785822711239916 |
| A10 | 5.401717663765759 |
| A11 | 5.502368692859457 |
| A12 | 4.686163839064591 |
| A14 | 4.93990730262057 |
| A16 | 4.933367404839269 |
| A17 | 4.902493924800388 |
| A2 | 5.795686564164179 |
| A21 | 4.248467131526379 |
| A22 | 4.4532219767104335 |
| A23 | 4.6041832903470885 |
| A24 | 5.044884395723807 |
| A26 | 5.03754065351013 |
| A28 | 5.66550861501657 |
| A3 | 4.341879025699085 |
| A36 | 4.7409481499360595 |
| A37 | 5.053048701633835 |
| A39 | 4.970242119598287 |
| A40 | 5.8199316380329105 |
| A50 | 4.515401782721258 |
| A51 | 5.710871580417459 |
| A52 | 4.70956059568735 |
| A55 | 4.8977503054229725 |
| A60 | 5.96129692348809 |
| A62 | 5.770136723132981 |
| A63 | 4.923412925924469 |
| A64 | 4.8431621534829015 |
| A69 | 5.481884538504378 |
| A7 | 5.090388922443801 |
| A70 | 5.002603211188059 |
| A71 | 4.617594869783823 |
| A76 | 4.635618187782364 |
| A77 | 5.708873823384679 |
| A8 | 4.997700888590846 |
| A81 | 5.343851087145179 |
| A87 | 4.89421612894924 |
| A88 | 5.457134924282174 |
| A9 | 5.412078635821513 |
| A90 | 5.070842112896156 |
| A92 | 5.787692965941835 |
| A93 | 4.389087862719203 |
| A94 | 4.994076713661026 |
| A95 | 4.361883485283725 |
| A97 | 4.2505120106866645 |
| A99 | 5.5660509833665595 |
+----------+---------------------+
100 rows selected (367.595 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2>

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------




3 REPLIES

Re: Sandbox v3.1.0 performance issue: GROUP BY query over 5400 rows takes 6 minutes to run.

Explorer

@chris5 Did you ever find out why the performance is so bad, or how to improve it?

I'm running the HDP Sandbox in Docker on a machine with 8 cores and 32 GB of RAM, and this exact same query takes over 10 minutes.
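
For anyone comparing setups, a quick way to confirm how much memory Docker actually grants the sandbox container (these are generic Docker commands; the container name sandbox-hdp is an assumption based on the hostname in the logs):

# Live memory usage versus the container's configured limit
docker stats --no-stream sandbox-hdp
# Configured memory limit in bytes (0 means no limit was set)
docker inspect -f '{{.HostConfig.Memory}}' sandbox-hdp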

Re: Sandbox v3.1.0 performance issue: GROUP BY query over 5400 rows takes 6 minutes to run.

New Contributor

No, sorry! The query was from the first tutorial, on the sample data; I can't see how things are going to scale to millions of rows!

Chris

Re: Sandbox v3.1.0 performance issue: GROUP BY query over 5400 rows takes 6 minutes to run.

New Contributor

Can anybody answer this fundamental question,

or point me in the right direction as to the best use of Hadoop?

I'm really concerned that if everything has to be stored in key-value pairs, that means:

1) a lot of data and servers are needed;

2) schema-on-read becomes a hindrance;

3) the simplest join will break the system.

Any help much appreciated.

Chris
