Agile delivery
Tech

How a Flawed CI/CD will Blow you off

Ramalingam S

We are in the era of Agile deliveries. Many engineering managers are aware of the benefits that Continuous Integration (CI) and Continuous Delivery (CD) can bring to an organisation. However, not many talk about the day-to-day obstacles that can turn out to be major roadblocks if the CI/CD process is not followed meticulously.

QA and DevOps

If you are a software QA engineer and are not yet sure about the connection, please read on:

  • Quality is not only QA’s problem any more
  • Build promotions, CI and CD are not only DevOps' problem any more
  • More and more automation on the DevOps front
  • Success of DevOps relies on a well-disciplined QA practice
  • DevOps is not killing QA

What could go wrong?

Many agile teams today share a common working model:

  • Multiple streams of work
  • DoD (Definition of Done) enforces automation (DevOps, Testing, Processes etc)
  • Continuous integration
  • Automated Functional tests
  • Continuous Deployment

Every item in the above list is supposed to add value at multiple levels and facilitate quick and stable product delivery.

However, release engineering and the path to production deployment become the most cumbersome part of the journey in many places. Many symptoms can indicate CI/CD flaws:

  • Deployments become flaky/unreliable over time
  • Deployments often exceed the allowed downtime
  • Production releases become no less than gambling (yes, you heard it right)
  • Lots of quality issues/defect slippages

The success of release engineering and related practices depends heavily on the nature of the team (including Devs, QAs and DevOps). It depends on too many factors, and it is hard to propose a single ready-made solution to set them right.

I have tried to compile a list of issues that we have faced in Agile delivery teams, along with the possible reasons for them. The result is a list of symptoms and their corresponding possible reasons.

I am not sure the table can be used as a rule of thumb, but it can definitely be used as a checklist for optimizing release engineering practices.

| Area | Symptoms | Possible reasons |
| --- | --- | --- |
| End-to-end pipeline setup | Promoting builds takes more time; deployment failures; build monitor is always red; "Who broke the build?" blame game | Semi-automated pipeline (manual selection of builds, data migrations etc.); flaky IaC modules; unstable connections (data, network) |
| Deployment frequency | Significant amount of time in a sprint spent on deployments; issues around merging, cherry-picking etc.; issues around verification; business/client is not happy with the deployment frequency | The chosen deployment frequency is not right (too early or too late); the frequency was not a conscious decision between business, engineering and clients |
| Branching strategy - trunk-based development | Incomplete features in production; frequent defect leakages; longer test cycle; lack of clarity around features to be enabled/not enabled/should not be enabled | Defects around feature flags; lack of discipline around clearing unused flags; feature factories; lack of visibility/documentation of available feature flags |
| Branching strategy - multiple branches | Merging often takes a long time; tests are repeated to check for merge misses; unstable components | Missed merges causing issues; no automated means of finding merge misses |
| Discipline around test automation | "I can't wait for the tests to complete!"; decent test coverage but still a lot of defects identified in production; people are hesitant to sign up for UI test failures; "Dev complete, tests are not ready :)" | Long-feedback tests; unrealistic data; UI tests bloated over time; catch-up game? (effort estimation for test automation is often ignored/overlooked) |
| Too many stories running across sprints | Adverse impact on test efficiency and testing effort; complicated merges; multiple test cycles | Stories are not sliced properly |
| Migrations | Deployment overshooting the planned downtime; point of no return, no reverse migrations :) | Non-additive migrations; no two-way compatibility (code and schema); no blue-green deployments |
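
To make the "Migrations" row more concrete, here is a minimal sketch of what an additive migration with two-way compatibility could look like. It uses Python's built-in sqlite3 module with an in-memory database. The `users` table, the `email` column and all function names are hypothetical and chosen purely for illustration; a real project would use its own schema and migration tooling.

```python
import sqlite3


def create_v1_schema(conn: sqlite3.Connection) -> None:
    """Schema as the currently deployed (old) release expects it."""
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )


def old_code_add_user(conn: sqlite3.Connection, name: str) -> None:
    """Old release: only knows about the `name` column."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def migrate_additive(conn: sqlite3.Connection) -> None:
    """Additive step: add a nullable column and leave everything else as is.

    Nothing is renamed, dropped or made mandatory, so the old release
    keeps working against the migrated schema.
    """
    conn.execute("ALTER TABLE users ADD COLUMN email TEXT")


def new_code_add_user(conn: sqlite3.Connection, name: str, email: str) -> None:
    """New release: writes the new column as well."""
    conn.execute(
        "INSERT INTO users (name, email) VALUES (?, ?)", (name, email)
    )


def run_demo() -> list:
    """Run old code, migrate, then run old and new code side by side."""
    conn = sqlite3.connect(":memory:")
    create_v1_schema(conn)
    old_code_add_user(conn, "alice")       # before the migration
    migrate_additive(conn)
    old_code_add_user(conn, "bob")         # old code after the migration
    new_code_add_user(conn, "carol", "carol@example.com")
    rows = conn.execute(
        "SELECT name, email FROM users ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

Calling `run_demo()` returns all three rows, with `email` left empty for the rows written by the old code. Because the old release still runs against the new schema, a rollback of the code does not require a reverse migration, and old and new versions can run side by side during a blue-green switch.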

There could be many other patterns; the above list is only indicative. It would really help to review release engineering practices continuously and adopt the most relevant ones.