Until now, I have been using import command to import data and have been saving --last-value in the default metastore.
1. What to specify for --last-value initially in the import command for incremental import based on last modified value. I see it's 0 for append mode but what should that be in lastmodified mode?
2. I have a table I need to import data incrementally from but the modified_at field of this table is stored in another table referenced via foreign key. Now, since there are multiple tables involved here, I understand that I will need to use free query form but how exactly does it fetch --last-value from metastore? I of course am not going to hard code the last value in free query.
Also, how do I provide modified_at column from this referenced table?
What should be the exact syntax in this case?
I don't believe that Incremental Imports, http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports, has any problem working alongside of Free-form Query Imports, http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_free_form_query_imports. That said, here are my attempts for answers to your two questions.
1) I believe you'd use the same initial value for "lastmodified" as you would for "append".
2) So, I'm thinking you want to do something like the following.
sqoop import \ --query 'SELECT a.*, b.* FROM a JOIN b on (a.id == b.id) WHERE $CONDITIONS' \ --split-by a.id --target-dir /user/foo/joinresults \ --check-column b.THE_LAST_UPDATED_TIMESTAMP_FIELD \ --incremental lastmodified \ --last-value HANDLE_THIS_THE_WAY_U_R_ALREADY_DOING_IT