Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How to get the total number of rows from multiple tables in a database?

avatar
Contributor

In my db i have many tables that start with 'bza' in table name.

 

I wrote all those table names into a text file name as tables.txt

 

Now using a shell script (count_row.sh) I am reading each line from tables.txt and execuing the hive commnad:

 

 

while IFS='' read -r line || [[ -n "$line" ]]; do
echo ""
echo ""
echo ""
echo "Counting the table : $line"
eval "hive -e 'select count(*) from $line'"
done < "$1"

This results in total rows from each table, which I am writting it into a text file rows.txt as

 

$ ./count_row.sh tables.txt > row.txt

Now my task is I want to count the total number of rows from all the tables by summing the individual result. 

 

Is there any way to achieve that ? Or any alternate way to count the rows from mutliple tables.

 

 

1 ACCEPTED SOLUTION

avatar
Mentor
Should be a simple extension in bash, if that is what you're looking for:

./count_row.sh tables.txt | paste -s -d+ - | bc

Ref: http://stackoverflow.com/questions/450799/shell-command-to-sum-integers-one-per-line#comment12469220...

P.s. It may be more efficient to generate a list of queries and run it via a single hive command, cause each command runs a whole new JVM.

View solution in original post

9 REPLIES 9

avatar
Mentor
Should be a simple extension in bash, if that is what you're looking for:

./count_row.sh tables.txt | paste -s -d+ - | bc

Ref: http://stackoverflow.com/questions/450799/shell-command-to-sum-integers-one-per-line#comment12469220...

P.s. It may be more efficient to generate a list of queries and run it via a single hive command, cause each command runs a whole new JVM.

avatar
Contributor

@Harsh J
By saying "Generate a list of queries" you mean, creating a list of queries and separate by ;(semicolon) in hive command like
hive -e 'query1;query2;...queryN' ?

avatar
Mentor
Yes that could work too (or a file with them, passed via -f or such).

avatar
Contributor
@Harsh J
Using hive -f a file can be passed, but the main purpose of writing the loop in script is to automate the process. I have to apply the same command for 250 tables.

avatar
Mentor
Right, it was suggested as an optimisation aside of the summing question,
given the described example. Does the bc command not solve your original
question?

avatar
Contributor
I was just asking it. yes it solved using bc

avatar
Mentor
Glad to know! Please consider marking the thread resolved, so others with a
similar question can find a solution quicker.

Feel free to post a new thread with any further questions.

avatar
Contributor

It doesn't work for me, here is the screenshot:

 

Screenshot - 5_5_2017 , 9_23_33 PM.jpg

 

What am I missing here?

 

Thank you very much.

avatar
Explorer

Done keyword is missing in your script. In your loop body use done at last.