Which tools are best suited to implementing the Speed, Batch, and Serving layers? We did some research on Apache Flink, Spark, Storm, and Samza, but we are still confused about which one to choose. I would greatly appreciate your suggestions.
It is difficult to give you a silver-bullet architecture without knowing the use case.
30k-foot view:
PS: I would use Kappa instead of Lambda.
If you are set on using Lambda, you need to start with sinks and push to a resilient layer (i.e. Kafka). Your batch layer could be served by LLAP or Spark (my preference). For the speed layer you can use Spark (2.3 with continuous processing mode) or Storm/Streaming Analytics Manager. Serving is very difficult to answer without knowing your use case. For predefined access models, Phoenix may be the way to go. Other than that, use LLAP, Druid, or an RDBMS. You need a NoSQL DB to serve your real-time layer, and when batch needs to feed updated analytics, that would be Phoenix/HBase (again, if a predefined access model is available). Hope that gets you started.
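To make the division of work between the three layers concrete, here is a minimal toy sketch in plain Python. It is not tied to any of the tools named above: the in-memory list only stands in for a replayable log such as a Kafka topic, and every name in it (`event_log`, `ingest`, `run_batch`, `run_speed`, `serve`) is hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical stand-in for a durable, replayable log (e.g. a Kafka topic).
event_log = []


def ingest(event_key):
    """Append an event to the log and return its offset."""
    event_log.append(event_key)
    return len(event_log) - 1


def run_batch(upto_offset):
    """Batch layer: recompute the full view from the log up to an offset."""
    return Counter(event_log[:upto_offset])


def run_speed(from_offset):
    """Speed layer: count only the events the last batch run has not covered."""
    return Counter(event_log[from_offset:])


def serve(key, batch_view, speed_view):
    """Serving layer: merge the batch view with the real-time delta."""
    return batch_view[key] + speed_view[key]


# Three events arrive, then a batch run covers all of them.
for key in ["page_a", "page_b", "page_a"]:
    ingest(key)
batch_offset = len(event_log)
batch_view = run_batch(batch_offset)

# One more event arrives after the batch run; only the speed layer sees it.
ingest("page_a")
speed_view = run_speed(batch_offset)

print(serve("page_a", batch_view, speed_view))  # prints 3
```

In a Kappa-style design the separate `run_batch` path would go away: a single streaming job would build the view and, when the logic changes, replay the log from offset 0 instead of relying on a second codebase.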