Support Questions

Find answers, ask questions, and share your expertise

Unable to get the record count into the variable after Kerberos Authentication implementation and Ranger Installation

avatar
Contributor

Hi everyone,

 

We are unable to get the record count into the variable  after Kerberos Authentication implementation and Ranger Installation. we are using following command but it's showing count by printing. Following is we used command

I need to get the count into the variable. anyone please help me!

 

count=$(beeline -u "jdbc:hive2://dev-lisa.realm.com:10000/default;principal=hive/dev-lisa.intranet.slt.com.lk@REALM.COM;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks" -n hadoopuser02 -e 'SELECT count(*) from table_name; ')

1 ACCEPTED SOLUTION

avatar
Super Collaborator

use  showHeader=false and outputformat as csv2 to extract only count value to bash variable. 

[root@nightly-cm-ap-2 ~]# count=$(beeline -u "<jdbcURL>"  --showHeader=false --outputformat=csv2 -e "select count(*) from sample_07")
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
INFO  : Compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.213 seconds
INFO  : Executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO  : Completed executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.009 seconds
INFO  : OK
1 row selected (0.675 seconds)
[root@nightly-cm-ap-2 ~]# echo $count
822
[root@nightly-cm-ap-2 ~]#

 

For additional information use beeline --help.

[root@nightly-cm-ap-2 ~]# beeline --help
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
Usage: java org.apache.hive.cli.beeline.BeeLine
   -u <database url>               the JDBC URL to connect to
   -c <named url>                  the named JDBC URL to connect to,
                                   which should be present in beeline-site.xml
                                   as the value of beeline.hs2.jdbc.url.<namedUrl>
   -r                              reconnect to last saved connect url (in conjunction with !save)
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --property-file=<property-file> the file to read connection properties (url, driver, user, password) from
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --escapeCRLF=[true/false]       show carriage return and line feeds in query results as escaped \r and \n
   --headerInterval=ROWS;          the interval between which heades are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showDbInPrompt=[true/false]   display the current database name in the prompt
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --report=[true/false]           show number of rows and execution time after query execution
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
   --incremental=[true/false]      Defaults to false. When set to false, the entire result set
                                   is fetched and buffered before being displayed, yielding optimal
                                   display column sizing. When set to true, result rows are displayed
                                   immediately as they are fetched, yielding lower latency and
                                   memory usage at the price of extra display column padding.
                                   Setting --incremental=true is recommended if you encounter an OutOfMemory
                                   on the client side (due to the fetched result set size being large).
                                   Only applicable if --outputformat=table.
   --incrementalBufferRows=NUMROWS the number of rows to buffer when printing rows on stdout,
                                   defaults to 1000; only applicable if --incremental=true
                                   and --outputformat=table
   --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --maxHistoryRows=MAXHISTORYROWS The maximum number of rows to store beeline history.
   --delimiter=DELIMITER           set the query delimiter; multi-char delimiters are allowed, but quotation
                                   marks, slashes, and -- are not allowed; defaults to ;
   --convertBinaryArrayToString=[true/false]    display binary column data as string or as byte array
   --getUrlsFromBeelineSite        Print all urls from beeline-site.xml, if it is present in the classpath
   --help                          display this message

   Example:
    1. Connect using simple authentication to HiveServer2 on localhost:10000
    $ beeline -u jdbc:hive2://localhost:10000 username password

    2. Connect using simple authentication to HiveServer2 on hs.local:10000 using -n for username and -p for password
    $ beeline -n username -p password -u jdbc:hive2://hs2.local:10012

    3. Connect using Kerberos authentication with hive/localhost@mydomain.com as HiveServer2 principal
    $ beeline -u "jdbc:hive2://hs2.local:10013/default;principal=hive/localhost@mydomain.com"

    4. Connect using SSL connection to HiveServer2 on localhost at 10000
    $ beeline "jdbc:hive2://localhost:10000/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword"

    5. Connect using LDAP authentication
    $ beeline -u jdbc:hive2://hs2.local:10013/default <ldap-username> <ldap-password>

View solution in original post

7 REPLIES 7

avatar

Hi @Choolake , sorry, maybe it is not clear only for me, have you executed the beeline command alone? What do you exactly get when you do so? 


beeline -u "jdbc:hive2://<lb_or_hs2_hostname>:10000/default;principal=hive/<lbhostname_if_lb_is_enabled_otherwise_hs2_hostname>@REALM.COM;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks" -e 'SELECT count(*) from table_name; '

 

just to understand if you get any errors. Then repeat with redirecting the stderr to /dev/null

 

beeline -u "jdbc:hive2://<lb_or_hs2_hostname>:10000/default;principal=hive/<lbhostname_if_lb_is_enabled_otherwise_hs2_hostname>@REALM.COM;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks" -e 'SELECT count(*) from table_name; ' 2>/dev/null

 

Thanks

 Miklos

avatar
Contributor

Hi,@mszurap

Thanks for the reply.!

I not got any errors after end the commands.it was just printing the result as the following.

+-----------+
| _c0 |
+-----------+
| 84300000 |
+-----------+

I need to get the record count into the variable of "count" as following

count=84300000

 

avatar
Cloudera Employee

Just alias the column name.  With aggregations any output automatically gives a value of c + Index number.  In this case it is c0 as it's only one aggregate column.  An example of a query with the above output as count with the number under it is SELECT count)(*) as count from <TABLE_NAME> should give you the c0 column listed as count.  If you are looking to store the output as a variable as a local variable in you could do something like:

count=`hive -S -e "select count(*) from <TABLE_NAME>;"`
echo $count

 

avatar
Super Collaborator

Hi @Choolake 

You can use it as below. 

[root@c3910-node2 1546341386-hive-HIVEMETASTORE]# count=`beeline -u "jdbc:hive2://c3910-node3.coelab.cloudera.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=lpsZVjarkhv7WPONRXNCVscCkntpfjTtilIcrZH8NeP" -e "select count(*) from default.test"`
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p35.41426564/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://c3910-node3.coelab.cloudera.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=lpsZVjarkhv7WPONRXNCVscCkntpfjTtilIcrZH8NeP
23/07/13 12:56:03 [main]: INFO jdbc.HiveConnection: Connected to c3910-node3.coelab.cloudera.com:10000
Connected to: Apache Hive (version 3.1.3000.7.1.8.35-1)
Driver: Hive JDBC (version 3.1.3000.7.1.8.35-1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
INFO : Compiling command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8): select count(*) from default.test
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8); Time taken: 0.103 seconds
INFO : Executing command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8): select count(*) from default.test
INFO : Completed executing command(queryId=hive_20230713125603_1478c0aa-6c44-42b2-a78f-b323ac872df8); Time taken: 0.01 seconds
INFO : OK
1 row selected (0.245 seconds)
Beeline version 3.1.3000.7.1.8.35-1 by Apache Hive
Closing: 0: jdbc:hive2://c3910-node3.coelab.cloudera.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=lpsZVjarkhv7WPONRXNCVscCkntpfjTtilIcrZH8NeP
[root@c3910-node2 1546341386-hive-HIVEMETASTORE]# echo $count
+------+ | _c0 | +------+ | 1 | +------+
[root@c3910-node2 1546341386-hive-HIVEMETASTORE]#

You might need to further put some formatting on the "count" variable, as that contains the entire output of the select query.

 

Let me know if this helps!

 

Cheers!

Let me know if this helps.

avatar
Contributor

Thank you!

avatar
Super Collaborator

use  showHeader=false and outputformat as csv2 to extract only count value to bash variable. 

[root@nightly-cm-ap-2 ~]# count=$(beeline -u "<jdbcURL>"  --showHeader=false --outputformat=csv2 -e "select count(*) from sample_07")
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
INFO  : Compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.213 seconds
INFO  : Executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f): select count(*) from sample_07
INFO  : Completed executing command(queryId=hive_20230714065303_399f60fb-d947-4fcf-b017-c64a4a39251f); Time taken: 0.009 seconds
INFO  : OK
1 row selected (0.675 seconds)
[root@nightly-cm-ap-2 ~]# echo $count
822
[root@nightly-cm-ap-2 ~]#

 

For additional information use beeline --help.

[root@nightly-cm-ap-2 ~]# beeline --help
SLF4J: Class path contains multiple SLF4J bindings.
.
.
.
Usage: java org.apache.hive.cli.beeline.BeeLine
   -u <database url>               the JDBC URL to connect to
   -c <named url>                  the named JDBC URL to connect to,
                                   which should be present in beeline-site.xml
                                   as the value of beeline.hs2.jdbc.url.<namedUrl>
   -r                              reconnect to last saved connect url (in conjunction with !save)
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --property-file=<property-file> the file to read connection properties (url, driver, user, password) from
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --escapeCRLF=[true/false]       show carriage return and line feeds in query results as escaped \r and \n
   --headerInterval=ROWS;          the interval between which heades are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showDbInPrompt=[true/false]   display the current database name in the prompt
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --report=[true/false]           show number of rows and execution time after query execution
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
   --incremental=[true/false]      Defaults to false. When set to false, the entire result set
                                   is fetched and buffered before being displayed, yielding optimal
                                   display column sizing. When set to true, result rows are displayed
                                   immediately as they are fetched, yielding lower latency and
                                   memory usage at the price of extra display column padding.
                                   Setting --incremental=true is recommended if you encounter an OutOfMemory
                                   on the client side (due to the fetched result set size being large).
                                   Only applicable if --outputformat=table.
   --incrementalBufferRows=NUMROWS the number of rows to buffer when printing rows on stdout,
                                   defaults to 1000; only applicable if --incremental=true
                                   and --outputformat=table
   --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --maxHistoryRows=MAXHISTORYROWS The maximum number of rows to store beeline history.
   --delimiter=DELIMITER           set the query delimiter; multi-char delimiters are allowed, but quotation
                                   marks, slashes, and -- are not allowed; defaults to ;
   --convertBinaryArrayToString=[true/false]    display binary column data as string or as byte array
   --getUrlsFromBeelineSite        Print all urls from beeline-site.xml, if it is present in the classpath
   --help                          display this message

   Example:
    1. Connect using simple authentication to HiveServer2 on localhost:10000
    $ beeline -u jdbc:hive2://localhost:10000 username password

    2. Connect using simple authentication to HiveServer2 on hs.local:10000 using -n for username and -p for password
    $ beeline -n username -p password -u jdbc:hive2://hs2.local:10012

    3. Connect using Kerberos authentication with hive/localhost@mydomain.com as HiveServer2 principal
    $ beeline -u "jdbc:hive2://hs2.local:10013/default;principal=hive/localhost@mydomain.com"

    4. Connect using SSL connection to HiveServer2 on localhost at 10000
    $ beeline "jdbc:hive2://localhost:10000/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword"

    5. Connect using LDAP authentication
    $ beeline -u jdbc:hive2://hs2.local:10013/default <ldap-username> <ldap-password>

avatar
Community Manager

@Choolake, Thank you for your participation in Cloudera Community. I'm happy to see you resolved your issue. Please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.

 



Regards,

Vidya Sargur,
Community Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community: