Password-less access from pipeline to database
The goal
The goal is to be able to run database migrations as part of an Azure DevOps Pipeline without storing a secret anywhere. The databases are Azure SQL Databases, connected to a VNET, with no public access. There are two major problems to solve:
- Network access
- Authentication
Background
My current project is at a company with a software solution that has been in development for more than 20 years. Traditionally it ran on-premises, that is, at the customer’s site. This is still the case for a few customers, but most customers now use the service from a cloud-hosted solution: Software-as-a-Service (SaaS).
We are looking to modernize the cloud hosting platform. The customers’ data are potentially very sensitive, so we do not want public access to any data storage, and we do not want connection credentials floating around. We prefer to use Entra IDs and give explicit access to everyone who needs it, with audit trails that show who connected when and did what.
The challenge
- From a pipeline in ADO
- Run a custom tool (command line application)
- To upgrade all customers’ databases on an Azure SQL Server
Networking
When we started this task, the networking was not properly set up. We had some VNETs defined: one that the SQL Server was connected to, and another that hosted some build agents. Our goal is a hub-spoke network topology, which you can read more about on Microsoft Learn. It consists of the following:
- A hub VNET. This is where you put your VPN Gateways, ADO Build Agents, Bastion servers, etc.
- One or more spoke VNETs. Typically one for each environment: DEV, TEST, PREPROD, PROD. Or split into PROD and non-PROD. This all depends on how your organization needs to control access.
To make our challenge work, we did the following:
- Set up peering between the hub and the spoke containing the DEV environment that we were interested in at the moment
- Set up DNS zones that let us use the name of the SQL Server instead of the IP address when connecting from the hub network to the DEV network
- Set up a VPN Gateway that allowed us to test that everything was set up right
- We already had a VM Scale Set set up as ADO Build Agents in the hub network
(image to illustrate the topology)
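The peering and DNS pieces above can be sketched in Terraform (the azurerm provider we used). This is a minimal sketch with hypothetical resource names — adapt it to your own naming and modules:

```hcl
# Peering in both directions between the hub and the DEV spoke
resource "azurerm_virtual_network_peering" "hub_to_dev" {
  name                      = "hub-to-dev"
  resource_group_name       = azurerm_resource_group.hub.name
  virtual_network_name      = azurerm_virtual_network.hub.name
  remote_virtual_network_id = azurerm_virtual_network.dev.id
}

resource "azurerm_virtual_network_peering" "dev_to_hub" {
  name                      = "dev-to-hub"
  resource_group_name       = azurerm_resource_group.dev.name
  virtual_network_name      = azurerm_virtual_network.dev.name
  remote_virtual_network_id = azurerm_virtual_network.hub.id
}

# Private DNS zone so the SQL Server name resolves to its private address
resource "azurerm_private_dns_zone" "sql" {
  name                = "privatelink.database.windows.net"
  resource_group_name = azurerm_resource_group.dev.name
}

# Link the zone to the hub VNET so the build agents can resolve the name
resource "azurerm_private_dns_zone_virtual_network_link" "hub" {
  name                  = "hub-link"
  resource_group_name   = azurerm_resource_group.dev.name
  private_dns_zone_name = azurerm_private_dns_zone.sql.name
  virtual_network_id    = azurerm_virtual_network.hub.id
}
```

The `privatelink.database.windows.net` zone name is the standard one for Azure SQL private endpoints; the rest of the names are placeholders.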
With this, networking was fine.
Identity, authentication and authorization
- We defined a specific User-Assigned Managed Identity in Azure to be used for this task. We did it using Terraform, but you could use the portal, Bicep, ARM templates or any other way of defining your infrastructure. A managed identity is an Entra ID.
- We gave this identity Contributor rights on the SQL Server. For some strange reason this was needed to allow the next step; otherwise we would get “identity not found”.
- We gave this identity db_datareader, db_datawriter and db_ddladmin rights on each database. In the future this will be automated when provisioning databases (each customer has its own database)
- In ADO we created a Service Connection, and gave it Federated credentials to the managed identity. This allows the service connection to act on behalf of the managed identity.
- In ADO we gave the pipeline access to the service connection. Both to create the service connection and to authorize its usage, we needed to be Project Admin.
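The database-level rights above can be granted with T-SQL run against each customer database. A sketch, assuming the managed identity is named `db-migrations-identity` (the user name must match the identity’s name in Entra ID):

```sql
-- Create a contained database user for the managed identity
CREATE USER [db-migrations-identity] FROM EXTERNAL PROVIDER;

-- Add it to the built-in database roles mentioned above
ALTER ROLE db_datareader ADD MEMBER [db-migrations-identity];
ALTER ROLE db_datawriter ADD MEMBER [db-migrations-identity];
ALTER ROLE db_ddladmin   ADD MEMBER [db-migrations-identity];
```

This is the kind of script that can be folded into the automated database provisioning mentioned above.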
Now all the infrastructure was in place to enable access to the database. What was left was to actually get hold of the credentials and use them.
- In the pipeline we added a step of type `AzureCLI@2`. This step type supports using service connections.
  - The property `azureSubscription` is set to the name of the service connection. When the step starts, ADO automatically logs in to Azure.
  - In the inline script we get the access token using: `accessToken=$(az account get-access-token --resource https://database.windows.net/ --query "[accessToken]" -o tsv)`
  - The access token is passed to the migrations application as a parameter.
- In the migrations application we use the NuGet package `Microsoft.Data.SqlClient` version 6.0.1 to run our database migration scripts. It gets some parameters:
  - `connectionString`: contains just server name and database, no credentials (User Id, Password, Authentication). Used to create an instance of `SqlConnection` in the code.
  - `accessToken`: put into the property `AccessToken` on `SqlConnection`.
  - Other parameters, like where the SQL files are located.
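Put together, the pipeline step might look roughly like this. The service connection name, server, database and tool invocation are placeholders, not our actual values:

```yaml
- task: AzureCLI@2
  inputs:
    # Name of the ADO service connection federated to the managed identity
    azureSubscription: 'sc-db-migrations'
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      # Fetch a token scoped to Azure SQL -- no password involved
      accessToken=$(az account get-access-token \
        --resource https://database.windows.net/ \
        --query "[accessToken]" -o tsv)
      # Pass the token to the migrations tool as a parameter
      ./MigrationsTool \
        --connection-string "Server=tcp:myserver.database.windows.net;Database=customer1" \
        --access-token "$accessToken" \
        --scripts ./sql
```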
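On the application side, wiring the token into `Microsoft.Data.SqlClient` is only a few lines. A sketch — the class and parameter names are ours for illustration, not the actual tool’s:

```csharp
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

static class MigrationRunner
{
    // Opens a connection using only server and database in the connection string --
    // no User Id, Password or Authentication keywords.
    public static async Task<SqlConnection> OpenAsync(string connectionString, string accessToken)
    {
        var connection = new SqlConnection(connectionString);
        // The token from `az account get-access-token` goes straight into
        // AccessToken; SqlClient authenticates with it instead of credentials.
        connection.AccessToken = accessToken;
        await connection.OpenAsync();
        return connection;
    }
}
```

The migration scripts are then run on the returned open connection.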
Turtles all the way down?
With all this in place we were able to run database updates from the pipeline, without any public access to the database and without any passwords stored anywhere.
So how can this be secure? The only way to change something in the database is one of the following:
- Commit SQL code to the repository containing the SQL scripts
- Commit changes to the database updater application
- Change the pipeline, that is, commit changes to the repo again.
What are the attack vectors in our case now? (this is work in progress – I’m still learning…)
- Gain access to the managed identity: I have not been able to log in to it, e.g. with Azure CLI (`az login --identity --client-id <guid>`). I’m not sure how it could be done.
- Gain access to the pipeline: It is checked in to a repo, so you need commit rights there to modify it.
- Gain access to the Service Connection: If you are Project Admin in ADO, you can authorize other pipelines to use this service connection, which is connected to the managed identity that is authorized to change any database.
- Gain access to the migrations scripts: They are in a repo, so you need commit rights there to modify them.