Ada Support

Scaling developer workflows at Ada

Raj Ravichandiran, Robert D'Ippolito, and Andre Marcelo-Tanner
Engineering Team

Ada is a rapidly scaling startup with a big engineering organization, and as our customer base grew, so did our technology and our developer toil. Solutions and processes that worked when we were a team of 10-20 people no longer applied when we became a team of 80-100.

This translated to slower deployment cycles and frustrated engineers, which was absolutely unacceptable — Ada Engineers deserve better.

The Developer Automation and Deployment Engineering (DADE) team (that’s us, this post’s authors) was tasked with finding a solution that improved workflows and streamlined development processes. The ultimate goal, of course, is to create a top-notch developer experience at Ada that translates to happier engineers and a better product for our customers.

So we sat down with the different engineering teams to understand what the biggest problems were, and what a good experience would look like for them. Read on to find out what these problems were and how we found a solution that addressed them with DevSpace.

bad experience: lack of unique, ephemeral testing environments

Ada’s engineers had to share a few clusters for testing and development, which resulted in long deployment cycles and reduced team velocity.

Every time an engineer needed to run tests on a cluster, they had to notify other engineers so that nobody made changes during that time that would interfere with the testing.

As a band-aid solution, engineers started blocking deployment time slots on their calendars. This still caused a lot of friction and confusion, and it left a large window for human error: engineers might forget to notify others, or they might simply be unaware of deployment schedules, particularly as new team members were joining at a fast clip.

Scenarios like this one were common:

  • Engineer A books a time slot to test changes
  • They deploy their branch and start testing
  • They soon realize that something isn’t working
  • After some debugging, they find out that Engineer B, unaware of the booked slot, has overwritten their code by accident

bad experience: complex local development setup process

Setting up services for development was a long and arduous task for both new and seasoned engineers.

Each service had its own development setup requirements: a codebase plus the dependent services needed to run it locally. During onboarding, new engineers needed to set up the local environment for the service they would be working on. They had to make sure they installed the right packages with the right versions and set up supporting tools like Docker to deploy the service locally. The process was not always properly documented, so engineers often ended up with missing packages or modules, or with the wrong versions of them.

This was an issue for seasoned engineers as well. If they’d been off a certain service for a few months and then came back to it, they usually found that the local setup had changed and their previous setup was out of date.

The local environments also consumed a lot of the CPU and memory on engineers’ local machines, and often made it impossible to use other applications at the same time. This was especially painful for our machine learning scientists, who deal with gigabytes of data and need access to GPU hardware.

bad experience: friction while accessing clusters

Finally, accessing our Kubernetes clusters was difficult. Engineers who were interested in checking the status of their deployments or needed access to Kubernetes APIs had to follow a multi-step process involving VPNs, auth tokens, and long CLI commands. This process also had to be repeated every time a cluster was upgraded or deployed.

The DevOps team was inundated with tickets, which meant long development cycles just to solve simple Kubernetes issues. It was simply not scalable.

DADE team to the rescue

Before we started looking for solutions, we sat down with different engineering teams to understand what a “VIP developer experience” would look like to them. Based on those conversations, we settled on a few guiding principles:

  1. Engineers should be able to spin up dedicated, remote environments for testing on-demand.
  2. New and seasoned engineers should be able to deploy any core service within 10 minutes of cloning a repo.
  3. Engineers should not need to run services locally.
  4. Engineers should have adequate visibility into how their remote infrastructure is performing and have the tools to act on faulty deployments.

Armed with these principles, we started our search for the right solution and ultimately chose DevSpace, an open source developer tool supported by Loft Labs. DevSpace has two main functions that solve the problems outlined above while meeting the engineers’ guidelines: DevSpace Deploy and DevSpace Development (Dev).

good experience: dedicated, remote testing environments

The first thing we did was implement DevSpace Deploy for our core Ada services. This gives engineers the ability to spin up dedicated, remote testing environments to work on.

While this alone is already a major improvement to the workflow, DevSpace Deploy also creates a few extra efficiencies:

  1. Dependency feature: Engineers can select additional services to deploy within their environment as dependencies, without disrupting the live services or being disrupted by work being done on them. They can mix and match to create the unique environment that meets their development requirements for the test, without worrying about overlapping work or human error.
  2. Dedicated and/or connected: If there are any critical services that engineers need for their environment to behave correctly, but they don’t need to make any changes to them, they can simply connect to them instead of deploying them within the dedicated environment. This reduces the workload that needs to be deployed on each dedicated environment, and environments can be deployed faster.
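The choice between deploying a service and connecting to it can be pictured as a simple planning step. The sketch below is only an illustration of that idea, not DevSpace's API or our actual configuration: `plan_environment`, `EnvironmentPlan`, and the service names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentPlan:
    """Which services run inside the dedicated environment and which are shared."""
    deploy: list[str] = field(default_factory=list)
    connect: list[str] = field(default_factory=list)


def plan_environment(required: list[str], modified: set[str]) -> EnvironmentPlan:
    """Deploy the services an engineer is changing; connect to the rest."""
    plan = EnvironmentPlan()
    for service in required:
        if service in modified:
            # The engineer is changing this service, so it gets its own copy.
            plan.deploy.append(service)
        else:
            # Unchanged services are reused from a shared instance.
            plan.connect.append(service)
    return plan


# Hypothetical service names, for illustration only.
plan = plan_environment(
    required=["api", "chat-ui", "ml-inference"],
    modified={"api"},
)
print(plan.deploy)   # ['api']
print(plan.connect)  # ['chat-ui', 'ml-inference']
```

The fewer services land in `deploy`, the less there is to spin up in each dedicated environment, which is where the faster deployments come from.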

With DevSpace Deploy, engineers can focus on doing their work without worrying about the pitfalls of a shared cluster. As a result, development is faster and deployment schedules are shorter. Having dedicated environments also means that reviewers can test PRs more thoroughly, so higher-quality code is deployed to production.

good experience: quick setup and remote infrastructure

Next, we used DevSpace Dev to solve the setup and local infrastructure problems.

Instead of having to install and run different services locally, engineers can access up-to-date remote containers of the various services they’re working on. They no longer have to worry about having all the right tools or the right versions of repositories, or about creating a local development environment at all.

Engineers are also able to sync their local file system with the remote container, so they can edit code locally and see the changes reflected in the remote environment. If you’ve ever used a cloud service with a local folder like Dropbox or Google Drive, it’s the same idea. The files are hosted remotely, but you can access, edit, and update them locally or remotely, and it all syncs together.
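To make the sync idea concrete, here is a minimal sketch of a single one-way sync pass. It is a simplification under stated assumptions, not how DevSpace implements sync: a second local directory stands in for the remote container, changes flow in one direction only, and `sync_once` and `file_digest` are hypothetical helper names.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return a content hash so unchanged files can be skipped."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync_once(source: Path, target: Path) -> list[str]:
    """Copy new or changed files from source to target.

    Returns the relative paths that were copied.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = target / rel
        if dst.exists() and file_digest(dst) == file_digest(src):
            continue  # Content is identical, nothing to do.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied


# Demo: a temporary "remote" directory stands in for the container.
with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "local"
    remote = Path(tmp) / "remote"
    local.mkdir()
    remote.mkdir()
    (local / "app.py").write_text("print('v1')\n")
    print(sync_once(local, remote))  # ['app.py']
    print(sync_once(local, remote))  # []
    (local / "app.py").write_text("print('v2')\n")
    print(sync_once(local, remote))  # ['app.py']
```

A real sync tool would also watch for file events, handle deletions, and propagate changes in both directions; this sketch only shows the "copy what changed" core.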

This simplifies the development workflow and makes it faster for engineers to get up and running or to switch between services. It also lightens the load on local machines by freeing up CPU and memory. It’s especially useful when features or code changes need to be shared across development teams or with product teams: engineers can simply share a URL so the team can see the latest updates live.

good experience: easy access and personal dashboard

In addition to DevSpace, Loft Labs offers a control plane that provides direct access to our Kubernetes clusters with auditing and role-based access control capabilities.

Engineers no longer need to follow a long multi-step process to get access to our clusters; they can simply log in through Ada’s identity provider. Once they’re in, they can see a simple dashboard that lists services they’ve created along with the logs associated with those services. They can also use this dashboard to observe and inspect their application once it has been deployed to all environments.

DevSpace rollout and reception

Rolling out DevSpace has been a collaborative and iterative process. We hosted walkthrough sessions with different teams and used these sessions to collect data on how they use the tool and feedback on the deployment workflow. We used this information to optimize the rollout.

We also created thorough documentation and a series of video tutorials that enable engineers to get started on DevSpace on their own.

Initially, we rolled out DevSpace in our core services, but after we presented it to the engineering organization at large, many other teams adopted it and configured it for their own smaller services.

For us, the DADE team, this was a huge win. It’s clear that we’ve been able to improve the engineering experience and empower teams to do better and more valuable work. This translates to better job satisfaction and a better product for our customers.

Ada is still growing, and we’ve got our sights on introducing AI to non-digital channels such as voice. The work we do on Developer Automation and Deployment Engineering ensures engineers are equipped with the tools and services to do the work that matters most.
