@johnwook You don't have to install any external database for CDSW to interact with your Hadoop cluster. CDSW interacts with Hadoop through gateway nodes, and those take care of this.
NOTE: Cloudera Data Science Workbench uses a PostgreSQL database that runs within a container on the master host at /var/lib/cdsw/current/postgres-data. So you cannot use a custom database with CDSW. You have to use the one shipped with CDSW.
You may want to look at the CDSW architecture to understand how CDSW works with Hadoop.