Cluster information:
Kubernetes version: v1.15.3
Installation method: minikube
Host OS: macOS
Hi, I have a minikube cluster running locally and I am trying to use Apache Spark in client mode on Kubernetes to read a file. I expect the read to work, but when I run textfile = sc.textFile("README.md")
and then execute textfile.count()
the output is java.io.FileNotFoundException: File file:/Users/jaoks/Desktop/spark-2.4.5-bin-hadoop2.6/README.md does not exist
I am sure that my driver program is executing in that same directory, because os.getcwd()
returns '/Users/jaoks/Desktop/spark-2.4.5-bin-hadoop2.6'.
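For reference, this is the exact sequence I run in the pyspark shell (sc is the SparkContext the shell creates automatically; README.md is the file that ships with the Spark distribution):

```python
import os
print(os.getcwd())   # '/Users/jaoks/Desktop/spark-2.4.5-bin-hadoop2.6'

# README.md sits in the driver's working directory shown above
textfile = sc.textFile("README.md")
textfile.count()     # raises java.io.FileNotFoundException when run against Kubernetes
```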
I also tried the same thing without Kubernetes, just running plain pyspark, and it works. So my question is: what is wrong with my submit?
./bin/pyspark \
  --master k8s://https://192.168.64.16:8443 \
  --deploy-mode client \
  --name spark-shell \
  --driver-memory 512M \
  --executor-memory 512M \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=spark-py:latest \
  --conf spark.kubernetes.namespace=sparktest
Any help would be awesome. I need to use Apache Spark in client mode with minikube, so any feedback would be great as well.