@senthil kumar Spark will push the predicates down to the data source, HBase in this case, and keep only the resulting DataFrame (after the filter) in memory. Spark does the same for most databases. It does not do this blindly, though: Spark assesses all the operations that will run on the DataFrame, builds an execution plan from them, and decides whether to push a filter down to the source or apply it in memory. For small tables, filtering in memory might make sense.
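
To make the difference concrete, here is a minimal toy sketch in plain Python. It is not Spark's API or the HBase connector: `ToySource`, `read_filtered`, and the sample data are all hypothetical names made up for illustration. It only models the idea that with push down the source returns just the matching rows, while without it every row is transferred and then filtered in memory.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


@dataclass
class ToySource:
    """Hypothetical stand-in for an external store such as HBase."""
    rows: List[Row]
    rows_sent: int = 0  # counts rows that left the source

    def scan(self, pushed: Optional[Predicate] = None) -> List[Row]:
        # With a pushed-down predicate the filter runs inside the source,
        # so only matching rows are transferred to the caller.
        result = [r for r in self.rows if pushed is None or pushed(r)]
        self.rows_sent += len(result)
        return result


def read_filtered(source: ToySource, predicate: Predicate, push_down: bool) -> List[Row]:
    if push_down:
        return source.scan(pushed=predicate)
    # No push down: pull every row, then filter in the engine's memory.
    return [r for r in source.scan() if predicate(r)]


def older_than_40(row: Row) -> bool:
    return row["age"] > 40


# Made-up sample data: 1000 rows with ages cycling through 0..49.
DATA: List[Row] = [{"id": i, "age": i % 50} for i in range(1000)]

if __name__ == "__main__":
    for push_down in (True, False):
        source = ToySource(list(DATA))
        result = read_filtered(source, older_than_40, push_down)
        print(f"push_down={push_down}: kept {len(result)} rows, "
              f"source sent {source.rows_sent} rows")
```

Both paths return the same rows; only the amount of data that leaves the source differs. In Spark itself, calling `explain()` on the DataFrame prints the physical plan, which is where you can check whether a given filter was actually pushed down for your source.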