Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How to get the total number of rows from multiple tables in a database?

Solved Go to solution
Highlighted

How to get the total number of rows from multiple tables in a database?

Explorer

In my db i have many tables that start with 'bza' in table name.

 

I wrote all those table names into a text file name as tables.txt

 

Now using a shell script (count_row.sh) I am reading each line from tables.txt and execuing the hive commnad:

 

 

while IFS='' read -r line || [[ -n "$line" ]]; do
echo ""
echo ""
echo ""
echo "Counting the table : $line"
eval "hive -e 'select count(*) from $line'"
done < "$1"

This results in total rows from each table, which I am writting it into a text file rows.txt as

 

$ ./count_row.sh tables.txt > row.txt

Now my task is I want to count the total number of rows from all the tables by summing the individual result. 

 

Is there any way to achieve that ? Or any alternate way to count the rows from mutliple tables.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

Re: How to get the total number of rows from multiple tables in a database?

Master Guru
Should be a simple extension in bash, if that is what you're looking for:

./count_row.sh tables.txt | paste -s -d+ - | bc

Ref: http://stackoverflow.com/questions/450799/shell-command-to-sum-integers-one-per-line#comment12469220...

P.s. It may be more efficient to generate a list of queries and run it via a single hive command, cause each command runs a whole new JVM.
9 REPLIES 9

Re: How to get the total number of rows from multiple tables in a database?

Master Guru
Should be a simple extension in bash, if that is what you're looking for:

./count_row.sh tables.txt | paste -s -d+ - | bc

Ref: http://stackoverflow.com/questions/450799/shell-command-to-sum-integers-one-per-line#comment12469220...

P.s. It may be more efficient to generate a list of queries and run it via a single hive command, cause each command runs a whole new JVM.

Re: How to get the total number of rows from multiple tables in a database?

Explorer

@Harsh J
By saying "Generate a list of queries" you mean, creating a list of queries and separate by ;(semicolon) in hive command like
hive -e 'query1;query2;...queryN' ?

Re: How to get the total number of rows from multiple tables in a database

Master Guru
Yes that could work too (or a file with them, passed via -f or such).

Re: How to get the total number of rows from multiple tables in a database

Explorer
@Harsh J
Using hive -f a file can be passed, but the main purpose of writing the loop in script is to automate the process. I have to apply the same command for 250 tables.

Re: How to get the total number of rows from multiple tables in a database

Master Guru
Right, it was suggested as an optimisation aside of the summing question,
given the described example. Does the bc command not solve your original
question?

Re: How to get the total number of rows from multiple tables in a database

Explorer
I was just asking it. yes it solved using bc

Re: How to get the total number of rows from multiple tables in a database

Master Guru
Glad to know! Please consider marking the thread resolved, so others with a
similar question can find a solution quicker.

Feel free to post a new thread with any further questions.

Re: How to get the total number of rows from multiple tables in a database?

Explorer

It doesn't work for me, here is the screenshot:

 

Screenshot - 5_5_2017 , 9_23_33 PM.jpg

 

What am I missing here?

 

Thank you very much.

Re: How to get the total number of rows from multiple tables in a database?

Explorer

Done keyword is missing in your script. In your loop body use done at last.

Don't have an account?
Coming from Hortonworks? Activate your account here