About MarlinGomez

MarlinGomez · ‎12-08-2025

Hey everyone,

I’m currently preparing for the CCA175 (CCA Spark and Hadoop Developer) exam and focusing heavily on hands-on scenario challenges to strengthen my understanding. So far, I’ve practiced with various ingestion and transformation pipelines, but I’m stuck on one scenario that feels very close to what the actual exam might present. Midway through my study plan, I started using Certs Matrix, which has helped me evaluate different approaches to solving Spark and Hadoop workflow problems under pressure.

The scenario I’m trying to clarify is this: if you receive streaming data in inconsistent formats and need to cleanse, transform, and store it efficiently in HDFS, which approach would be most exam-accurate: (1) using Spark Structured Streaming with schema evolution, (2) designing separate ETL pipelines for each input format, or (3) relying on a unified schema-on-read strategy?

I’d really appreciate insights from anyone who has taken CCA175 or handled similar real-world pipelines. Your guidance would help me refine my preparation.
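To make the unified schema-on-read option a bit more concrete, here is a minimal sketch in plain Python (not Spark, and not an exam solution). It assumes just two input formats, JSON objects and comma-separated lines, and a made-up target schema; the field names `id`, `ts`, and `value` and the `normalize` helper are hypothetical and only there for illustration.

```python
import csv
import io
import json

# Hypothetical unified schema: every clean record ends up with these fields.
FIELDS = ("id", "ts", "value")


def normalize(line):
    """Parse one raw line (JSON object or CSV) into the unified schema.

    Returns None for lines that cannot be parsed, so the caller can route
    them to a quarantine location instead of failing the whole batch.
    """
    line = line.strip()
    if not line:
        return None

    if line.startswith("{"):
        # Assumed format 1: one JSON object per line.
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            return None
    else:
        # Assumed format 2: comma-separated values in FIELDS order.
        parts = next(csv.reader(io.StringIO(line)), [])
        if len(parts) != len(FIELDS):
            return None
        raw = dict(zip(FIELDS, parts))

    # Cleansing step: enforce types and drop records with missing fields.
    try:
        return {
            "id": str(raw["id"]).strip(),
            "ts": str(raw["ts"]).strip(),
            "value": float(raw["value"]),
        }
    except (KeyError, TypeError, ValueError):
        return None


if __name__ == "__main__":
    lines = [
        '{"id": 1, "ts": "2025-12-08T13:49:00", "value": "3.5"}',
        "2,2025-12-08T13:50:00,4",
        "garbage",
    ]
    clean = [rec for rec in map(normalize, lines) if rec is not None]
    print(clean)
```

The raw lines are stored and interpreted only at read time, and anything that does not fit the schema is set aside rather than crashing the job. In Spark, the same idea would presumably be expressed by reading the input as text and parsing it against an explicit schema (for example with `from_json`), but that part is not shown here.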

Last Visited	‎12-08-2025 01:49 PM

Member Since	‎12-08-2025 01:49 PM
Posts	1

Cloudera Community

Need Help Clarifying a Real CCA175 Scenario