I am new to Airflow, so please bear with me. I have already deployed Airflow successfully with the Celery Executor on AKS, and I am now trying to deploy it with the Kubernetes Executor on Azure Kubernetes Service. I am using the helm chart provided by tekn0ir, with some modifications, and deployed it successfully with kubectl. The deployment has pods for the scheduler, the webserver, and PostgreSQL, plus dynamically created pods for running task instances. To synchronize DAGs, I use a git-sync init container, which syncs the DAGs correctly on both the scheduler and the webserver. However, when I trigger a DAG, a new pod is created to run the task instance, but it fails. Its logs suggest that the DAGs were not synchronized onto the worker pod:
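For reference, if I understand the chart correctly, the worker pod's DAG path is driven by the git-sync settings in the `[kubernetes]` section of airflow.cfg. The key names below come from the stock Airflow 1.10 configuration; the values are illustrative, mirroring the environment shown in the pod description further down (my actual values may differ):

```ini
[kubernetes]
# Illustrative values only, reconstructed from the worker pod's environment.
git_repo = https://github.com/NamanBhat/dags
git_branch = master
git_sync_root = /opt/airflow/dags
git_sync_dest =
git_dags_folder_mount_point = /opt/airflow/dags
```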
This is the error log for the failed pod:
naman@IDC20Intern733:~/airflow-chart$ kubectl logs --v=3 -n airflow2 getpathi1-d5f07e2ec521471baa827bc82cf97a1e
[2020-06-25 11:16:10,309] {settings.py:213} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=1
/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
""")
[2020-06-25 11:16:10,480] {__init__.py:51} INFO - Using executor LocalExecutor
[2020-06-25 11:16:10,852] {dagbag.py:90} INFO - Filling up the DagBag from /opt/airflow/dags/getpath.py
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 32, in <module>
args.func(args)
File "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 500, in run
dag = get_dag(args)
File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 146, in get_dag
'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: getpath. Either the dag did not exist or it failed to parse.
The description of the failed pod is given below:
naman@IDC20Intern733:~/airflow-chart$ kubectl describe pod --v=4 -n airflow2 getpathi1-d5f07e2ec521471baa827bc82cf97a1e
Name: getpathi1-d5f07e2ec521471baa827bc82cf97a1e
Namespace: airflow2
Priority: 0
Node: aks-nodepool1-41373778-vmss000001/172.19.0.35
Start Time: Thu, 25 Jun 2020 16:46:05 +0530
Labels: airflow-worker=f356d4e9-5dd2-4fe7-9974-712e9e2b3744
airflow_version=1.10.10
dag_id=getpath
execution_date=2020-06-25T10_03_52.802506_plus_00_00
kubernetes_executor=True
task_id=i1
try_number=1
Annotations: <none>
Status: Failed
IP: 172.19.0.50
IPs: <none>
Init Containers:
git-sync-clone:
Container ID: docker://2423801373d6d152461f5d68d87b20e4a87c27cd311c0c6646a9947c7cc6a7e4
Image: k8s.gcr.io/git-sync:v3.1.1
Image ID: docker-pullable://k8s.gcr.io/git-sync@sha256:aa701af5a29738f4a7aa1686f189787d92cea6401e0e81a170e98b10d365a949
Port: <none>
Host Port: <none>
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 25 Jun 2020 16:46:06 +0530
Finished: Thu, 25 Jun 2020 16:46:08 +0530
Ready: True
Restart Count: 0
Environment:
GIT_SYNC_REPO: https://github.com/NamanBhat/dags
GIT_SYNC_BRANCH: master
GIT_SYNC_ROOT: /opt/airflow/dags
GIT_SYNC_DEST:
GIT_SYNC_DEPTH: 1
GIT_SYNC_ONE_TIME: true
GIT_SYNC_REV:
GIT_KNOWN_HOSTS: false
Mounts:
/opt/airflow/dags from airflow-dags (rw)
/var/run/secrets/kubernetes.io/serviceaccount from airflow2-token-bksdt (ro)
Containers:
base:
Container ID: docker://bf5515c4cdd2c4d9e7cd12ececc9e79172abefe2d2f31c5a2f00fbc204c9dfad
Image: tekn0ir/airflow-docker:latest
Image ID: docker-pullable://tekn0ir/airflow-docker@sha256:e4b717801870d330487288b489b9b6462881bc038627b284cbead77fbe98c5cd
Port: <none>
Host Port: <none>
Command:
airflow
run
getpath
i1
2020-06-25T10:03:52.802506+00:00
--local
--pool
default_pool
-sd
/opt/airflow/dags/getpath.py
State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 25 Jun 2020 16:46:09 +0530
Finished: Thu, 25 Jun 2020 16:46:11 +0530
Ready: False
Restart Count: 0
Environment Variables from:
airflow2-env ConfigMap Optional: false
Environment:
AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://postgres:airflow@airflow2-postgresql:5432/airflow
AIRFLOW__CORE__DAGS_FOLDER: /opt/airflow/dags/
Mounts:
/opt/airflow/dags from airflow-dags (ro)
/opt/airflow/logs from airflow-logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from airflow2-token-bksdt (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
airflow-dags:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
airflow-logs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
airflow2-token-bksdt:
Type: Secret (a volume populated by a Secret)
SecretName: airflow2-token-bksdt
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m26s default-scheduler Successfully assigned airflow2/getpathi1-d5f07e2ec521471baa827bc82cf97a1e to aks-nodepool1-41373778-vmss000001
Normal Pulled 8m25s kubelet, aks-nodepool1-41373778-vmss000001 Container image "k8s.gcr.io/git-sync:v3.1.1" already present on machine
Normal Created 8m25s kubelet, aks-nodepool1-41373778-vmss000001 Created container git-sync-clone
Normal Started 8m25s kubelet, aks-nodepool1-41373778-vmss000001 Started container git-sync-clone
Normal Pulled 8m22s kubelet, aks-nodepool1-41373778-vmss000001 Container image "tekn0ir/airflow-docker:latest" already present on machine
Normal Created 8m22s kubelet, aks-nodepool1-41373778-vmss000001 Created container base
Normal Started 8m22s kubelet, aks-nodepool1-41373778-vmss000001 Started container base
How can I resolve this issue so that the DAGs are synchronized to the right path in the worker pods and the tasks execute successfully?
This is the link to the helm chart: https://github.com/tekn0ir/airflow-chart.
Thanks in advance.