Created 06-22-2023 10:09 PM
Hi everyone,
We are unable to get the record count into the variable after Kerberos Authentication implementation and Ranger Installation. we are using following command but it's showing count by printing. Following is we used command
I need to get the count into the variable. anyone please help me!
count=$(beeline -u "jdbc:hive2://dev-lisa.realm.com:10000/default;principal=hive/dev-lisa.intranet.slt.com.lk@REALM.COM;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks" -n hadoopuser02 -e 'SELECT count(*) from table_name; ')
Created 07-13-2023 11:59 PM
use showHeader=false and outputformat as csv2 to extract only count value to bash variable.
[root@nightly-cm-ap-2 ~]# count=$(beeline -u "<jdbcURL>" --showHeader=false --outputformat=csv2 -e "select count(*) from sample_07")
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
INFO : Compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.213 seconds
INFO : Executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO : Completed executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.009 seconds
INFO : OK
1 row selected (0.675 seconds)
[root@nightly-cm-ap-2 ~]# echo $count
822
[root@nightly-cm-ap-2 ~]#
For additional information use beeline --help.
[root@nightly-cm-ap-2 ~]# beeline --help
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
Usage: java org.apache.hive.cli.beeline.BeeLine
-u <database url> the JDBC URL to connect to
-c <named url> the named JDBC URL to connect to,
which should be present in beeline-site.xml
as the value of beeline.hs2.jdbc.url.<namedUrl>
-r reconnect to last saved connect url (in conjunction with !save)
-n <username> the username to connect as
-p <password> the password to connect as
-d <driver class> the driver class to use
-i <init file> script file for initialization
-e <query> query that should be executed
-f <exec file> script file that should be executed
-w (or) --password-file <password file> the password file to read password from
--hiveconf property=value Use value for given property
--hivevar name=value hive variable name and value
This is Hive specific settings in which variables
can be set at session level and referenced in Hive
commands or queries.
--property-file=<property-file> the file to read connection properties (url, driver, user, password) from
--color=[true/false] control whether color is used for display
--showHeader=[true/false] show column names in query results
--escapeCRLF=[true/false] show carriage return and line feeds in query results as escaped \r and \n
--headerInterval=ROWS; the interval between which heades are displayed
--fastConnect=[true/false] skip building table/column list for tab-completion
--autoCommit=[true/false] enable/disable automatic transaction commit
--verbose=[true/false] show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showDbInPrompt=[true/false] display the current database name in the prompt
--showNestedErrs=[true/false] display nested errors
--numberFormat=[pattern] format numbers using DecimalFormat pattern
--force=[true/false] continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTH the maximum width to use when displaying columns
--silent=[true/false] be more silent
--report=[true/false] show number of rows and execution time after query execution
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv] format mode for result display
Note that csv, and tsv are deprecated - use csv2, tsv2 instead
--incremental=[true/false] Defaults to false. When set to false, the entire result set
is fetched and buffered before being displayed, yielding optimal
display column sizing. When set to true, result rows are displayed
immediately as they are fetched, yielding lower latency and
memory usage at the price of extra display column padding.
Setting --incremental=true is recommended if you encounter an OutOfMemory
on the client side (due to the fetched result set size being large).
Only applicable if --outputformat=table.
--incrementalBufferRows=NUMROWS the number of rows to buffer when printing rows on stdout,
defaults to 1000; only applicable if --incremental=true
and --outputformat=table
--truncateTable=[true/false] truncate table column when it exceeds length
--delimiterForDSV=DELIMITER specify the delimiter for delimiter-separated values output format (default: |)
--isolation=LEVEL set the transaction isolation level
--nullemptystring=[true/false] set to true to get historic behavior of printing null as empty string
--maxHistoryRows=MAXHISTORYROWS The maximum number of rows to store beeline history.
--delimiter=DELIMITER set the query delimiter; multi-char delimiters are allowed, but quotation
marks, slashes, and -- are not allowed; defaults to ;
--convertBinaryArrayToString=[true/false] display binary column data as string or as byte array
--getUrlsFromBeelineSite Print all urls from beeline-site.xml, if it is present in the classpath
--help display this message
Example:
1. Connect using simple authentication to HiveServer2 on localhost:10000
$ beeline -u jdbc:hive2://localhost:10000 username password
2. Connect using simple authentication to HiveServer2 on hs.local:10000 using -n for username and -p for password
$ beeline -n username -p password -u jdbc:hive2://hs2.local:10012
3. Connect using Kerberos authentication with hive/localhost@mydomain.com as HiveServer2 principal
$ beeline -u "jdbc:hive2://hs2.local:10013/default;principal=hive/localhost@mydomain.com"
4. Connect using SSL connection to HiveServer2 on localhost at 10000
$ beeline "jdbc:hive2://localhost:10000/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword"
5. Connect using LDAP authentication
$ beeline -u jdbc:hive2://hs2.local:10013/default <ldap-username> <ldap-password>
Created 06-23-2023 08:22 AM
Hi @Choolake , sorry, maybe it is not clear only for me, have you executed the beeline command alone? What do you exactly get when you do so?
beeline -u "jdbc:hive2://<lb_or_hs2_hostname>:10000/default;principal=hive/<lbhostname_if_lb_is_enabled_otherwise_hs2_hostname>@REALM.COM;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks" -e 'SELECT count(*) from table_name; '
just to understand if you get any errors. Then repeat with redirecting the stderr to /dev/null
beeline -u "jdbc:hive2://<lb_or_hs2_hostname>:10000/default;principal=hive/<lbhostname_if_lb_is_enabled_otherwise_hs2_hostname>@REALM.COM;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks" -e 'SELECT count(*) from table_name; ' 2>/dev/null
Thanks
Miklos
Created 06-25-2023 09:55 PM
Hi,@mszurap
Thanks for the reply.!
I not got any errors after end the commands.it was just printing the result as the following.
+-----------+
| _c0 |
+-----------+
| 84300000 |
+-----------+
I need to get the record count into the variable of "count" as following
count=84300000
Created 07-10-2023 09:17 AM
Just alias the column name. With aggregations any output automatically gives a value of c + Index number. In this case it is c0 as it's only one aggregate column. An example of a query with the above output as count with the number under it is SELECT count)(*) as count from <TABLE_NAME> should give you the c0 column listed as count. If you are looking to store the output as a variable as a local variable in you could do something like:
count=`hive -S -e "select count(*) from <TABLE_NAME>;"` echo $count
Created 07-13-2023 05:58 AM
Hi @Choolake
You can use it as below.
[root@c3910-node2 1546341386-hive-HIVEMETASTORE]# count=`beeline -u "jdbc:hive2://c3910-node3.coelab.cloudera.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=lpsZVjarkhv7WPONRXNCVscCkntpfjTtilIcrZH8NeP" -e "select count(*) from default.test"`
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://c3910-node3.coelab.cloudera.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=lpsZVjarkhv7WPONRXNCVscCkntpfjTtilIcrZH8NeP
23/07/13 12:56:03 [main]: INFO jdbc.HiveConnection: Connected to c3910-node3.coelab.cloudera.com:10000
Connected to: Apache Hive (version 3.1.3000.7.1.8.35-1)
Driver: Hive JDBC (version 3.1.3000.7.1.8.35-1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
INFO : Compiling command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8): select count(*) from default.test
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8); Time taken: 0.103 seconds
INFO : Executing command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8): select count(*) from default.test
INFO : Completed executing command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8); Time taken: 0.01 seconds
INFO : OK
1 row selected (0.245 seconds)
Beeline version 3.1.3000.7.1.8.35-1 by Apache Hive
Closing: 0: jdbc:hive2://c3910-node3.coelab.cloudera.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=lpsZVjarkhv7WPONRXNCVscCkntpfjTtilIcrZH8NeP
[root@c3910-node2 1546341386-hive-HIVEMETASTORE]# echo $count
+------+ | _c0 | +------+ | 1 | +------+
[root@c3910-node2 1546341386-hive-HIVEMETASTORE]#
You might need to further put some formatting on the "count" variable, as that contains the entire output of the select query.
Let me know if this helps!
Cheers!
Let me know if this helps.
Created 07-18-2023 01:16 AM
Thank you!
Created 07-13-2023 11:59 PM
use showHeader=false and outputformat as csv2 to extract only count value to bash variable.
[root@nightly-cm-ap-2 ~]# count=$(beeline -u "<jdbcURL>" --showHeader=false --outputformat=csv2 -e "select count(*) from sample_07")
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
INFO : Compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.213 seconds
INFO : Executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO : Completed executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.009 seconds
INFO : OK
1 row selected (0.675 seconds)
[root@nightly-cm-ap-2 ~]# echo $count
822
[root@nightly-cm-ap-2 ~]#
For additional information use beeline --help.
[root@nightly-cm-ap-2 ~]# beeline --help
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
Usage: java org.apache.hive.cli.beeline.BeeLine
-u <database url> the JDBC URL to connect to
-c <named url> the named JDBC URL to connect to,
which should be present in beeline-site.xml
as the value of beeline.hs2.jdbc.url.<namedUrl>
-r reconnect to last saved connect url (in conjunction with !save)
-n <username> the username to connect as
-p <password> the password to connect as
-d <driver class> the driver class to use
-i <init file> script file for initialization
-e <query> query that should be executed
-f <exec file> script file that should be executed
-w (or) --password-file <password file> the password file to read password from
--hiveconf property=value Use value for given property
--hivevar name=value hive variable name and value
This is Hive specific settings in which variables
can be set at session level and referenced in Hive
commands or queries.
--property-file=<property-file> the file to read connection properties (url, driver, user, password) from
--color=[true/false] control whether color is used for display
--showHeader=[true/false] show column names in query results
--escapeCRLF=[true/false] show carriage return and line feeds in query results as escaped \r and \n
--headerInterval=ROWS; the interval between which heades are displayed
--fastConnect=[true/false] skip building table/column list for tab-completion
--autoCommit=[true/false] enable/disable automatic transaction commit
--verbose=[true/false] show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showDbInPrompt=[true/false] display the current database name in the prompt
--showNestedErrs=[true/false] display nested errors
--numberFormat=[pattern] format numbers using DecimalFormat pattern
--force=[true/false] continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTH the maximum width to use when displaying columns
--silent=[true/false] be more silent
--report=[true/false] show number of rows and execution time after query execution
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv] format mode for result display
Note that csv, and tsv are deprecated - use csv2, tsv2 instead
--incremental=[true/false] Defaults to false. When set to false, the entire result set
is fetched and buffered before being displayed, yielding optimal
display column sizing. When set to true, result rows are displayed
immediately as they are fetched, yielding lower latency and
memory usage at the price of extra display column padding.
Setting --incremental=true is recommended if you encounter an OutOfMemory
on the client side (due to the fetched result set size being large).
Only applicable if --outputformat=table.
--incrementalBufferRows=NUMROWS the number of rows to buffer when printing rows on stdout,
defaults to 1000; only applicable if --incremental=true
and --outputformat=table
--truncateTable=[true/false] truncate table column when it exceeds length
--delimiterForDSV=DELIMITER specify the delimiter for delimiter-separated values output format (default: |)
--isolation=LEVEL set the transaction isolation level
--nullemptystring=[true/false] set to true to get historic behavior of printing null as empty string
--maxHistoryRows=MAXHISTORYROWS The maximum number of rows to store beeline history.
--delimiter=DELIMITER set the query delimiter; multi-char delimiters are allowed, but quotation
marks, slashes, and -- are not allowed; defaults to ;
--convertBinaryArrayToString=[true/false] display binary column data as string or as byte array
--getUrlsFromBeelineSite Print all urls from beeline-site.xml, if it is present in the classpath
--help display this message
Example:
1. Connect using simple authentication to HiveServer2 on localhost:10000
$ beeline -u jdbc:hive2://localhost:10000 username password
2. Connect using simple authentication to HiveServer2 on hs.local:10000 using -n for username and -p for password
$ beeline -n username -p password -u jdbc:hive2://hs2.local:10012
3. Connect using Kerberos authentication with hive/localhost@mydomain.com as HiveServer2 principal
$ beeline -u "jdbc:hive2://hs2.local:10013/default;principal=hive/localhost@mydomain.com"
4. Connect using SSL connection to HiveServer2 on localhost at 10000
$ beeline "jdbc:hive2://localhost:10000/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword"
5. Connect using LDAP authentication
$ beeline -u jdbc:hive2://hs2.local:10013/default <ldap-username> <ldap-password>
Created 07-18-2023 07:27 AM
@Choolake, Thank you for your participation in Cloudera Community. I'm happy to see you resolved your issue. Please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Vidya Sargur,