
Issue writing to HDFS using Python - Package used HDFS, Subprocess



We are having trouble writing to HDFS from Python using the hdfs package and the subprocess module. The code runs on our desktop and connects to an HDFS cluster hosted on Linux.

We tried placing the file locally on the Linux server and running the code from the desktop while connected to the server.

We also tried placing the file on the desktop and running the code from the desktop while connected to the server.

With InsecureClient, the read function works and the file is read even though we connect from the desktop. Writing with the write function of the hdfs Python package fails, however, so we tried subprocess instead. Running the Python code with the subprocess functions or Popen raises a file-not-found error.
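For reference, a write through the hdfs package could look roughly like the sketch below. It is not the original code from this post. It assumes WebHDFS is enabled and listening on port 50070 (the usual NameNode HTTP port on Hadoop 2.x, but check your cluster), and the names webhdfs_url, upload_with_hdfs_package and the user argument are made up for illustration. Note that InsecureClient takes an http:// WebHDFS URL rather than the hdfs://...:8020 RPC address.

```python
from urllib.parse import urlparse


def webhdfs_url(hdfs_uri, http_port=50070):
    """Build an http:// WebHDFS base URL from an hdfs:// URI.

    http_port=50070 is an assumption; use whatever port your
    NameNode actually serves WebHDFS on.
    """
    host = urlparse(hdfs_uri).hostname
    return "http://{}:{}".format(host, http_port)


def upload_with_hdfs_package(local_file, hdfs_uri, user):
    """Upload a local file (one that exists on the machine running Python)."""
    # Third-party package (pip install hdfs), imported lazily so the
    # URL helper above can be used without it.
    from hdfs import InsecureClient
    from hdfs.util import HdfsError

    client = InsecureClient(webhdfs_url(hdfs_uri), user=user)
    remote_dir = urlparse(hdfs_uri).path
    try:
        # upload() reads local_file from the desktop and writes it to HDFS.
        return client.upload(remote_dir, local_file, overwrite=True)
    except HdfsError as err:
        # The server-side message usually says why the write was rejected.
        print("HDFS write failed:", err)
        raise
```

Printing the HdfsError message is the main point here: it shows the reason the cluster gave for refusing the write, which the post above does not include.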


Python version: 3.6

HDP: 2.6

Error: FileNotFoundError: [WinError 2] The system cannot find the file specified

filename: test.csv

hdfs_path: hdfs://xxxxx.xxx.xxx.xxx:8020/test/data/

from subprocess import Popen, PIPE

put = Popen(["hadoop", "fs", "-put", filename, hdfs_path], stdin=PIPE, bufsize=-1)
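When Popen itself raises FileNotFoundError with [WinError 2], Windows could not find the program to launch (here hadoop), not test.csv. A missing test.csv would be reported by the hadoop command after it started. The sketch below checks for the executable before running the command. The helper names build_put_command and put_file are hypothetical, and it assumes a Hadoop client is installed on the machine running Python.

```python
import shutil
import subprocess


def build_put_command(filename, hdfs_path, executable="hadoop"):
    """Return the argument list for `hadoop fs -put`, or fail early."""
    resolved = shutil.which(executable)
    if resolved is None:
        # Same condition that makes Popen raise [WinError 2] on Windows.
        raise FileNotFoundError(
            "'{}' is not on PATH on this machine; install a Hadoop "
            "client or run the command on a cluster node".format(executable)
        )
    return [resolved, "fs", "-put", filename, hdfs_path]


def put_file(filename, hdfs_path):
    """Run the put and capture its output (Python 3.6 compatible)."""
    cmd = build_put_command(filename, hdfs_path)
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
```

If the desktop has no Hadoop client, the options are to install and configure one there, to run the command on the Linux server (for example over SSH), or to write through WebHDFS with the hdfs package.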