We’re looking at the possibility of using Kubernetes to facilitate our development pipelines.
Currently we use 360 virtual machines as dedicated GitLab Runners. Every time a pipeline is triggered on GitLab, we run ~1,300 jobs (mostly tests: unit, selenium, etc). Each Runner (1 per VM) runs one job at a time.
For every job, we spin up a docker-compose environment which runs 38 containers. Obviously this is pretty hefty and, with dependencies, can take 3-4 minutes to become usable.
.gitlab-ci.yml looks something like this:

    stages:
      - unit
      - integration
      - selenium

    "unit test 1":
      stage: unit
      script:
        - docker-compose exec -T unit-test-container-name phpunit --filter blah unit-test-file-path.php

    # (... lots more of these)

    "integration test 1":
      stage: integration
      script:
        - docker-compose exec -T unit-test-container-name phpunit --filter blah integration-test-file-path.php

    # (... lots more of these)

I intended to spin up 100+ pods as GitLab Runners and then, every time a pipeline runs, spin up the 38 containers and a Service as a “development environment”. In my head, I was then going to use the runners to communicate with the development environment to run the tests, be it through kubectl commands, curl requests, etc.
However, quite a few of our tests change the state of our mysql database. E.g. a test might add some new rows; another test might delete them.
Two questions for guys with more Kubernetes experience than me:
- Our tests currently run commands like docker-compose exec -T unit-test-container-name phpunit --filter blah unit-test-file-path.php. How would you replicate this in the setup I’m referring to above? Would I just get the GitLab Runners (pods in the Kubernetes cluster) to run kubectl exec? What would be the most efficient way for these Runner pods to communicate with the “development environment” service?
- How can I deal with the state issues? If I ran the jobs sequentially, I could reset the state after every single job by either restarting the mysql container or by wiping and re-importing the data. However, with 1,300 jobs, running sequentially is going to seriously slow our pipelines down. Is there any way at all I could run the tests in parallel, even if some of them had to be chunked up?
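
To make the chunking idea a bit more concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under my own assumptions: each chunk of jobs would get its own copy of the environment (e.g. its own namespace with its own mysql), and the jobs inside a chunk would run one after another with a state reset in between. The names dev-env-0 and unit-test-pod are made up, and the script only builds the kubectl exec command, it does not run anything.

```python
def chunk_jobs(jobs, n_envs):
    """Deal jobs round-robin into n_envs chunks.

    Assumption: each chunk is served by its own isolated environment,
    so state changes in one chunk cannot affect another.
    """
    chunks = [[] for _ in range(n_envs)]
    for i, job in enumerate(jobs):
        chunks[i % n_envs].append(job)
    return chunks


def exec_command(namespace, pod, test_file, test_filter):
    """Build a kubectl exec call roughly equivalent to our docker-compose exec.

    kubectl exec allocates no TTY unless -t is passed, which matches
    the -T flag we use with docker-compose. The command is returned
    as an argument list and is not executed here.
    """
    return ["kubectl", "exec", "-n", namespace, pod, "--",
            "phpunit", "--filter", test_filter, test_file]


if __name__ == "__main__":
    # Hypothetical job list standing in for our ~1,300 test files.
    jobs = [f"test-file-{i}.php" for i in range(1300)]
    chunks = chunk_jobs(jobs, 100)
    print(len(chunks), "chunks, largest has", max(len(c) for c in chunks), "jobs")
    # Hypothetical namespace and pod names, for illustration only.
    print(" ".join(exec_command("dev-env-0", "unit-test-pod", chunks[0][0], "blah")))
```

With 100 environments, that would mean 13 jobs running back to back in each one, rather than 1,300 in a single sequence.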
All thoughts and ideas appreciated.