Deploy Spark into Kubernetes Cluster

Hi,

I’m a newbie to Kubernetes and the Spark environment. I’ve been asked to deploy Spark on Kubernetes so that it can scale horizontally automatically.

The problem is, I can’t get the SparkPi example from the official website (https://spark.apache.org/docs/latest/running-on-kubernetes#cluster-mode) to work.

I’ve already followed the instructions, but the pods fail to execute. Here is what I did:

  1. Already ran: kubectl proxy
  2. Executed:

spark-submit --master k8s://https://localhost:6445 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=5 --conf spark.kubernetes.container.image=xnuxer88/spark-kubernetes-bash-test-entry:v1 local:///opt/spark/examples/jars/spark-examples_2.11-2.3.2.jar

I get the error:
Error: Could not find or load main class org.apache.spark.examples.SparkPi

  3. When I check the Docker image (by creating a container from it), I do find the file; see the check below.
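
For reference, this is roughly how I checked (assuming the image has ls available; I override the entrypoint so the command runs directly):

docker run --rm --entrypoint ls xnuxer88/spark-kubernetes-bash-test-entry:v1 -l /opt/spark/examples/jars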

Is there any instruction I might have missed?

Please Help.

Thank You.

Link : https://stackoverflow.com/questions/52623435/deploy-spark-into-kubernetes-cluster

Heya!

Did you by chance see this similar issue? https://stackoverflow.com/questions/51467082/sparkpi-on-kubernetes-could-not-find-or-load-main-class

Is it possible the JAR file isn’t actually present in the image? I might try using their spark-submit command and see if that moves you along. 🙂
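
If the jar is there, the driver pod output might narrow things down. Something along these lines (the pod name below is just a placeholder; take the real one from kubectl get pods):

kubectl get pods
kubectl logs <spark-pi-driver-pod>
kubectl describe pod <spark-pi-driver-pod>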

Hi Jeffy,

Still the same error.

Command:

spark-submit --master k8s://https://localhost:6445 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=5 --conf spark.kubernetes.container.image=xnuxer88/spark-kubernetes-bash-test-entry:v1 --conf spark.kubernetes.authenticate.driver.serviceAccountName=default https://github.com/JWebDev/spark/raw/master/spark-examples_2.11-2.3.1.jar

Thank You.


Hi,

I think you should use the path local:///opt/spark/jars/abcd.jar when you run your Spark jobs in a k8s cluster.
The script that ships with the Spark installation copies all the jars under spark-2.3.2/examples/jars/ (e.g. abcd.jar) to the /opt/spark/jars location in the image.
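
For example, assuming the examples jar really does land under /opt/spark/jars in your image, the submit would look something like this (same image and master URL as in your original command):

spark-submit \
  --master k8s://https://localhost:6445 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.container.image=xnuxer88/spark-kubernetes-bash-test-entry:v1 \
  local:///opt/spark/jars/spark-examples_2.11-2.3.2.jar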

Also, before running this, can you build your Docker image using the docker-image-tool.sh script under bin/ in the Spark installation directory?
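
If it helps, the usual invocation from the Spark docs is roughly (repository name and tag below are placeholders):

./bin/docker-image-tool.sh -r <your-docker-repo> -t v1 build
./bin/docker-image-tool.sh -r <your-docker-repo> -t v1 push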

Has anyone used the Spark Operator to run Spark applications on Kubernetes?
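
From what I can tell from the operator’s docs, the job is described as a SparkApplication custom resource and applied with kubectl. A rough sketch, written from memory and not verified (the image, namespace, service account and resource sizes are placeholders):

cat <<'EOF' | kubectl apply -f -
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: "xnuxer88/spark-kubernetes-bash-test-entry:v1"  # placeholder: your Spark image
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.3.2.jar"
  sparkVersion: "2.3.2"
  restartPolicy:
    type: Never
  driver:
    cores: 1
    memory: "512m"
    serviceAccount: default
  executor:
    cores: 1
    instances: 2
    memory: "512m"
EOF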

Check out this blog post for a complete working solution; you can simply check out the code and run the jobs locally.

  • REST APIs to start, stop, and monitor Spark jobs with a click.
  • Production-ready demo Spark batch and fault-tolerant streaming jobs.
  • Run Spark jobs locally in an IDE and on Minikube.
  • Launch Spark job jars or Docker images using spark-submit.
  • Deployment on Kubernetes or AWS EMR.