Hi folks,
We’re looking into using Kubernetes to run our development pipelines.
Currently we use 360 virtual machines as dedicated GitLab Runners. Every time a pipeline is triggered on GitLab, we run ~1,300 jobs (mostly tests: unit, Selenium, etc). Each Runner (1 per VM) runs a single job at a time.
For every job, we spin up a `docker-compose` environment which runs 38 containers. Obviously this is pretty hefty, and with dependencies it can take 3-4 minutes to become usable.
Our `.gitlab-ci.yml` looks something like this:
```yaml
stages:
  - unit
  - integration
  - selenium

"unit test 1":
  stage: unit
  script:
    - docker-compose exec -T unit-test-container-name phpunit --filter blah unit-test-file-path.php

# (... lots more of these)

"integration test 1":
  stage: integration
  script:
    - docker-compose exec -T unit-test-container-name phpunit --filter blah integration-test-file-path.php

# (... lots more of these)
```
Etc, etc.
I intended to spin up 100+ pods as GitLab Runners, and then every time a pipeline runs, spin up the 38 containers and a Service as a “development environment”. In my head, I was then going to use the Runners to communicate with the development environment to run the tests, be it through `kubectl` commands or `curl` requests, etc.
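To make that concrete, here's a rough sketch of the shape I had in mind: one environment per pipeline, stamped with the pipeline ID. Every name, label and image below is a placeholder rather than anything we actually have, and I'd substitute the variable with something like `envsubst < dev-env.yaml | kubectl apply -f -`:

```yaml
# One "development environment" per pipeline (sketch, placeholder names).
apiVersion: v1
kind: Service
metadata:
  name: dev-env-$CI_PIPELINE_ID
spec:
  selector:
    app: dev-env-$CI_PIPELINE_ID
  ports:
    - port: 80          # placeholder port
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dev-env-$CI_PIPELINE_ID
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dev-env-$CI_PIPELINE_ID
  template:
    metadata:
      labels:
        app: dev-env-$CI_PIPELINE_ID
    spec:
      containers:
        - name: unit-test-container-name
          image: registry.example.com/our-app:latest   # placeholder image
        # (... the other 37 containers, or split across several Deployments)
```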
However, quite a few of our tests change the state of our `mysql` database. E.g. a test might add some new rows; another test might delete them.
Two questions for those with more Kubernetes experience than me:
- Our tests currently run commands like `docker-compose exec -T unit-test-container-name phpunit --filter blah unit-test-file-path.php`. How would you replicate this in the setup I’m referring to above? Would I just get the GitLab Runners (pods in the Kubernetes cluster) to run `kubectl exec`? What would be the most efficient way for these Runner pods to communicate with the “development environment” Service? (There’s a sketch of what I mean just below this list.)
- How can I deal with the state issues? If I ran the jobs sequentially, I could reset the state after every single job by either restarting the `mysql` container or by wiping and re-importing the data. However, with 1,300 jobs, running sequentially is going to seriously slow our pipelines down. Is there any way at all I could run parallel tests, even if perhaps some of them were chunked up? (Again, see the sketch below the list.)
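On question 1, the most literal translation of our `docker-compose exec` lines that I could come up with looks like the below. It assumes the Runner pods run under a service account that's allowed to exec into the environment's pods, and all the names are placeholders again:

```yaml
"unit test 1":
  stage: unit
  script:
    # find the pod belonging to this pipeline's environment
    - POD=$(kubectl get pods -l app=dev-env-$CI_PIPELINE_ID -o jsonpath='{.items[0].metadata.name}')
    # -c picks the container inside the pod, much like the compose service name
    - kubectl exec "$POD" -c unit-test-container-name -- phpunit --filter blah unit-test-file-path.php
```

The alternative I mentioned would be to expose the test runners over HTTP behind the Service and drive them with `curl`, which would avoid handing the Runners exec rights on the cluster.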
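On question 2, the chunking I'm imagining would lean on GitLab's `parallel:` keyword, which runs N copies of a job and sets `CI_NODE_INDEX`/`CI_NODE_TOTAL` in each copy. If each chunk got its own copy of the environment (note the index in the label), state resets would only need to happen between the tests inside a chunk. `split-tests.sh` and `reset.sql` are made-up stand-ins for "divide the test list into buckets" and "wipe and re-import the data":

```yaml
"integration tests":
  stage: integration
  parallel: 10   # 10 copies of this job; GitLab sets CI_NODE_INDEX (1..10) and CI_NODE_TOTAL (10)
  script:
    # each chunk talks to its own copy of the environment
    - POD=$(kubectl get pods -l app=dev-env-$CI_PIPELINE_ID-$CI_NODE_INDEX -o jsonpath='{.items[0].metadata.name}')
    # hypothetical helper that prints this chunk's share of the test files
    - ./split-tests.sh "$CI_NODE_INDEX" "$CI_NODE_TOTAL" > chunk.txt
    - |
      while read -r test; do
        kubectl exec "$POD" -c unit-test-container-name -- phpunit "$test"
        # reset mysql state between tests inside the chunk only
        kubectl exec "$POD" -c mysql -- sh -c 'mysql ourdb < /seed/reset.sql'
      done < chunk.txt
```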
All thoughts and ideas appreciated
Duncan