Many teams struggle to build a service that notifies the owner of an expiring certificate or client secret when integrating applications with Azure AD. Often, users find out about the issue later rather than sooner and, as a result, lose access to valuable resources. Luckily, there is a way to fix that.
Although a service like this should be available in Azure AD application objects out of the box, this is unfortunately not the case.
The expiry of a client secret or certificate is one of the most common problems when the client credentials flow is used in an application. When a credential expires, the application can no longer authenticate itself and loses access to its resources.
What does that mean? Trouble for users but also for many teams involved in maintaining the application.
Replacing a certificate or a client secret is a no-brainer if you do it in time. The problem starts when it is too late, and you are trying to figure out why your application suddenly stopped working. If that happens, you have to take a different approach.
To prevent such situations, for one of our largest customers, I built a solution that works at the level of the entire Azure tenant.
It monitors all applications and warns the owners in advance when credentials are about to expire so they have time to act. The application owners get notifications 30, 14, and 7 days before the expiry date via email and MS Teams.
What is so great about this solution? First of all, I built it with some interesting services from the Azure world, such as Azure Automation Runbooks, Logic Apps, and Log Analytics, glued together with PowerShell.
And what is even better is that you can do just the same in your environment, almost for free. The monthly cost of this automation is roughly… $5 😊
In this article, I will show you the idea behind this approach.
But before anything else, I should also mention that I can’t share the original PowerShell code as I wrote it specifically for our client. However, knowing what blocks to use, I am sure many of you will be able to recreate the solution on your own.
Let us start with a brief introduction to what credentials there are in the context of Azure AD applications.
First of all, we talk about two elements here: client secrets and certificates.
At this point, we also need to understand two concepts that will make the integration and building of the solution possible:
App registration: to enable identity and access management, you register an application in the Azure portal and choose the supported tenant type. From that moment on, it is possible to integrate your application with Azure AD. You get a globally unique ID for your app and can add client secrets or certificates to it. (more info here)
Application object: a template or blueprint from which a service principal object is created in every tenant where the application is used. The application object describes how the service can issue tokens to access the application, the resources the application might need, and the actions that the application can take. (more info here)
Part of my job as a member of the Identity and Access Management team is integrating applications with the Azure AD service as an identity provider (IdP). Azure offers several options for doing so.
The most common methods to configure SSO include using the SAML protocol or the OAuth 2.0 framework in combination with OIDC.
When using OAuth 2.0, one of the more popular flows is the client credentials flow. It is commonly used for server-to-server interactions that must run in the background without immediate user interaction.
Why do we use the client credentials flow? Because we want the application to request an access token to access the resources itself, not on behalf of a user.
Applications communicate independently, like daemons or service accounts. With the client credentials flow, an app sends its own credentials (the Client ID and Client Secret) to the Identity Provider, which is set up to generate access tokens. If the credentials are valid, the Identity Provider returns an access token to the client app.
Complicated? If you would like to dig even deeper, check the documentation on Microsoft’s website.
In this type of communication, the application uses its own credentials created in App Registration located in the Azure Portal, which I guess many of you have already seen before.
Here, we can create a client secret that will act as a password for our application or, for the same purpose, add a certificate to use during authentication.
And this is exactly where the source of the problem lies. Client secrets and certificates have a validity period, after which they… expire.
In an ideal world, the owner responsible for the application should have procedures in place to monitor expiring credentials. Unfortunately, this is often not the case.
The owner is, ideally, the person responsible for the application who also has the appropriate permissions to make changes to it. In Azure AD, it is the account found under the ‘Owners’ tab in App Registration.
Later in this article, I will refer to this place whenever application owners are mentioned.
The account located here can make changes to the application object in Azure AD, and so it can also renew a certificate or client secret.
When designing the workaround for our client, I adopted a more complicated approach that involved ‘tracking’ the owner account of a particular application.
Why did I do that? I wanted to prepare the environment for a situation in which the owner of a given application leaves the organization. If that happens, their account in Azure AD is deleted, and, at the same time, it disappears from the settings in the application object.
This mechanism is designed to keep track of the owner and their manager in the Log Analytics database. The data can be used later when, for some reason, the application loses its owner.
As mentioned, it stores information about two people: the current owner of the application and their manager.
Why do we need this data? Because even if those people are no longer owners of the application, they can indicate who should step in.
In the notifications sent to the previous owner or their manager, I ask them to indicate new owners, explaining that the client secret or certificate is about to expire, which could trigger an incident.
This complicates the whole logic somewhat but gives us a huge advantage. We create a mechanism for maintaining continuity when it comes to managing the application.
After more than six months of using this solution in our client’s environment, I can only say that the response rate to this type of notification is close to 100%.
However, when building a similar solution, you can make simpler assumptions:
There are actually two methods:
The heart of the solution is a PowerShell script that collects and processes information about applications, their certificates, client secrets, and owners, and calculates the number of days until the expiry of a given certificate or client secret. The script is hosted in an Azure Automation Runbook.
Let us now look at the components of the solution.
This service allows us to do many different things in the cloud, but in my solution, it is used exclusively for running code. If you want to learn more about the service, I recommend you read the official documentation. It is really worth it!
In Azure Automation, we can also create a schedule for running a script automatically every day at a specified time. However, there are a couple of things to keep in mind.
Namely, the number of applications in an Azure tenant changes very quickly as new ones are created and others are deleted. Also, the time left until a certificate or client secret expires is ticking away constantly. This means we must scan the environment every day to work on the latest data.
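For illustration, such a daily schedule can also be created with the Az.Automation module instead of the portal. This is only a sketch, with hypothetical resource, runbook, and schedule names:

#A sketch with hypothetical names: create a daily schedule and link it to the runbook
New-AzAutomationSchedule -ResourceGroupName 'rg-monitoring' -AutomationAccountName 'aa-monitoring' `
    -Name 'DailySecretScan' -StartTime (Get-Date).Date.AddDays(1).AddHours(6) -DayInterval 1
Register-AzAutomationScheduledRunbook -ResourceGroupName 'rg-monitoring' -AutomationAccountName 'aa-monitoring' `
    -RunbookName 'ClientSecretMonitoring' -ScheduleName 'DailySecretScan'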
The PowerShell script connects to an Azure tenant, retrieves data of all applications with an active certificate or a client secret, and creates a record for each with the following information:
TimeStamp, Application ID, Application Name, Secret Display Name, Secret Expires Date, Secret Expires Days, Owner Display Name, Owner Email, Owner Manager Display Name, Owner Manager Email, Previous Owner Email, Previous Owner Manager Email, EmailSent, AlertType.
Then, the script sends such a record in JSON format to the Log Analytics database, where it is stored. One record per certificate or client secret is created every day so that we know when an application loses or changes owners.
It is also important to see the value of Secret Expires Days. It shows how many days are left until the certificate or client secret expires.
If it is 30, 14, or 7 days, the script, in addition to sending the record to the Log Analytics database, also sends it to Logic Apps, sets the EmailSent variable to Yes, and sets the AlertType variable to FirstAlert, SecondAlert, or ThirdAlert, respectively.
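To give you an idea of that decision logic, here is a minimal sketch in PowerShell. It assumes a $Record object with the field names listed above and the $LogicAppURL variable covered later in this article:

#A sketch of the alerting decision, assuming $Record uses the field names listed above
switch ($Record.SecretExpiresDays) {
    30      { $Record.AlertType = 'FirstAlert' }
    14      { $Record.AlertType = 'SecondAlert' }
    7       { $Record.AlertType = 'ThirdAlert' }
    default { $Record.AlertType = 'NoAlert' }
}
if ($Record.AlertType -ne 'NoAlert') {
    $Record.EmailSent = 'Yes'
    #In addition to Log Analytics, send the record to the Logic App
    Invoke-RestMethod -Method POST -Uri $LogicAppURL -Body ($Record | ConvertTo-Json) -ContentType 'application/json'
}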
According to Microsoft’s official documentation, Logic Apps is:
A cloud-based platform for creating and running automated workflows that integrate your apps, data, services, and systems. With this platform, you can quickly develop highly scalable integration solutions for your enterprise and business-to-business (B2B) scenarios. […] It simplifies the way that you connect legacy, modern, and cutting-edge systems across cloud, on premises, and hybrid environments.
This component comes in really handy when you need to integrate several services. In my solution, the Logic App I have created is very simple, and its only task is to send notifications.
There are three escalation paths to choose from, depending on what application data we have: the current owner, the previous owner, or the previous owner’s manager.
How does the Logic App know how to react?
As input, the script sends the same JSON record it sends to Log Analytics, which contains all the required information. The Logic App parses the record, processes it, and, based on the data in the record, selects the appropriate path.
For example, when the script finds an application without an owner, it puts NotExists in the Owner field, calls the function to search the archive (the Log Analytics database), and fills the record with the Previous Owner information it finds there.
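Expressed as PowerShell pseudo-code (the actual implementation is a chain of Condition actions, and the Send-To* functions are purely hypothetical placeholders), the selection boils down to something like:

#Escalation pseudo-code; Send-To* are hypothetical placeholders for the three notification branches
if ($Record.OwnerMail -ne 'NotExists')            { Send-ToOwner $Record }
elseif ($Record.PreviousOwnerUPN -ne 'NotExists') { Send-ToPreviousOwner $Record }
else                                              { Send-ToPreviousOwnerManager $Record }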
The dashboard is the last element of the solution. You can create it in Log Analytics Workbooks, a service that I really like and which allows you to visualize the data stored in the Log Analytics database.
Having a dashboard is necessary to monitor the performance of the solution. It gives you a clear picture of all actions performed, the number of notifications sent, and the data about the application owners.
Also, you can use it to see potential errors in the execution of both Azure Automation Runbook tasks and Logic App tasks and to collect statistical information such as the number of applications in a tenant, script execution time, etc.
As a picture is worth a thousand words, I have designed a diagram to make it easier for you to understand how it works.
On the left (the orange lines), you will find a scenario in which the application has no owner assigned, while on the right (the green lines), there is a plan for an application with an assigned owner.
An overview of the solution – see full view here
The description above has certainly given you an idea of what this solution looks like. I think it’s not difficult to build, but it does require some experience with the services I described, in particular with PowerShell, because the core logic lives in the script.
As I already mentioned, I cannot share the PowerShell, Logic App, or the KQL code itself because it was created for our client. However, there are a few universal things I can show you to make it easier for you to create your own automation.
Also, this is where the chapter will get a little bit more technical but, hopefully, just as interesting. There are a few things I would like to talk about in more detail that will make your job easier.
Ready?
The script needs to retrieve information about the applications from an Azure tenant. Therefore, it needs to authenticate to the tenant and have permission to read application and user objects.
As the script is supposed to work automatically, I create a Service Principal (SP) for authentication using the client credentials flow I mentioned before. If you don’t know how to do that, you may like to follow this guide.
Once the client secret has been added, we can use PowerShell to retrieve the access token, use it to make a call to the Graph API, and ask about what we want (or rather, what permissions our SP has).
I use the Graph API to retrieve information about applications, owners, and accounts in Azure AD.
Now, how do you retrieve an access token?
You can use the function that my teammate Konrad Szczudliński (hello!) wrote and that I used in our solution:
######
#Function: Get-OAuthToken
######
function Get-OAuthToken {
    <#
    .SYNOPSIS
    Author: Konrad Szczudliński
    .DESCRIPTION
    This function is used to retrieve the OAuth access token based on client ID, secret, and tenant domain.
    It also calculates TokenExpiryTime, reducing it by 2 minutes as a safety margin.
    #>
    param (
        [Parameter(Mandatory = $true, Position = 0)]
        [String]$ClientID,
        [Parameter(Mandatory = $true, Position = 1)]
        [String]$ClientSecret,
        [Parameter(Mandatory = $true, Position = 2)]
        [String]$LoginUrl,
        [Parameter(Mandatory = $true, Position = 3)]
        [String]$TenantDomain,
        [Parameter(Mandatory = $true, Position = 4)]
        [String]$Resource
    )
    begin {
        Write-Verbose "Started Get-OAuthToken function"
    }
    process {
        #Get an OAuth access token based on client id, secret and tenant domain
        $Body = @{grant_type = "client_credentials"; resource = $Resource; client_id = $ClientID; client_secret = $ClientSecret }
        try {
            $OAuth = Invoke-RestMethod -Method Post -Uri "$LoginUrl/$TenantDomain/oauth2/token?api-version=1.0" -Body $Body
        }
        catch {
            $CustomError = [ordered]@{Status = "Error"; Result = ""; Command = "Get-OAuthToken"; Message = "$($TenantDomain): $($_.Exception.Message)" }
        }
        #Construct the Authorization header for further use
        $AuthorizationHeader = @{'Authorization' = "$($OAuth.token_type) $($OAuth.access_token)" }
        #Get the expiry time for the token with a 120-second safety margin
        $TokenExpiryTime = (Get-Date) + (New-TimeSpan -Seconds (($OAuth.expires_in) - 120))
    }
    end {
        if ($null -eq $OAuth) {
            Write-Verbose -Message "Error: Get-OAuthToken Message: Cannot get access token"
            return $CustomError
        }
        else {
            $AccessParameters = [ordered]@{
                AuthorizationHeader = $AuthorizationHeader
                TokenExpiryTime     = $TokenExpiryTime
                TenantDomain        = $TenantDomain
                AccessToken         = $OAuth.access_token
            }
            Write-Verbose -Message "Success: Get-OAuthToken"
            return $AccessParameters
        }
    }
}
Now, use the following variables as the $LoginUrl and $Resource:
$LoginUrl = 'https://login.microsoftonline.com'
$Resource = 'https://graph.microsoft.com'
The other variables are rather self-explanatory. To retrieve an access token, you only need to call a function inside the script:
#Retrieve access token
$OAuthToken = Get-OAuthToken -TenantDomain $TenantDomainName -ClientID $ServicePrincipalApplicationID -ClientSecret $ServicePrincipalSecret -LoginUrl $LoginUrl -Resource $Resource
$token = $OAuthToken.AccessToken
The second place I authenticate is the PowerShell cmdlet Connect-AzAccount with the -ServicePrincipal switch.
I need this form of authentication in place to be able to send a KQL query to the Log Analytics database and ask for information about the previous owner or manager. Then, I send the query using the cmdlet Invoke-AzOperationalInsightsQuery. I will talk about that in more detail later on.
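A minimal sketch of that sign-in, reusing the Service Principal credentials from above and a hypothetical $TenantID variable:

#Sign in as the Service Principal for the Az cmdlets (e.g., Invoke-AzOperationalInsightsQuery)
$SecureSecret = ConvertTo-SecureString -String $ServicePrincipalSecret -AsPlainText -Force
$AzCredential = New-Object System.Management.Automation.PSCredential ($ServicePrincipalApplicationID, $SecureSecret)
Connect-AzAccount -ServicePrincipal -Credential $AzCredential -Tenant $TenantID | Out-Null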
Our Service Principal must also have the appropriate permissions to be able to read the application and user information in the Azure tenant, so do not forget to add them.
Here’s how you can do that: in the Azure portal, open the App Registration of your Service Principal and go to the ‘API permissions’ tab.
Once you’re there, you may add two Microsoft Graph application permissions: Application.Read.All (to read application objects and their owners) and User.Read.All (to read user accounts and their managers).
Then you have to ask your Global Administrator to give consent to these permissions.
And that’s all! Now you can use the Graph API to read information about applications and users (owners).
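Once consent has been granted, a quick sanity check is a minimal Graph call with the $token retrieved earlier:

#Quick sanity check: returns one application if the permissions were granted correctly
Invoke-RestMethod -Headers @{Authorization = "Bearer $token"} -Uri 'https://graph.microsoft.com/v1.0/applications?$top=1' -Method Get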
By now, you know how to retrieve the access token for the Service Principal with the required permission to read the client secret or certificate information configured for these applications.
So how do we retrieve information about all the applications and select only those that have a client secret configured?
Below I will show you how I did that using the Invoke-RestMethod cmdlet.
The $token variable used below comes from the example I showed in the Authentication section above.
#Scan all apps in Azure tenant
$odataQuery = "?`$select=id,appId,displayName,passwordCredentials"
$Uri = "https://graph.microsoft.com/v1.0/applications$($odataQuery)"
$AllApplications = Invoke-RestMethod -Headers @{Authorization = "Bearer $($token)"} -Uri $Uri -Method Get
$AllApplicationsValue = $AllApplications.value
#Follow the @odata.nextLink paging links until all applications have been retrieved
$AllApplicationsNextLink = $AllApplications."@odata.nextLink"
While ($Null -ne $AllApplicationsNextLink) {
    $AllApplications = Invoke-RestMethod -Headers @{Authorization = "Bearer $($token)"} -Uri $AllApplicationsNextLink -Method Get
    $AllApplicationsNextLink = $AllApplications."@odata.nextLink"
    $AllApplicationsValue += $AllApplications.value
}
#Keep only the applications that have at least one client secret configured
$AppsWithSecretConfigured = $AllApplicationsValue | Where-Object {$_.PasswordCredentials}
By following this approach, you will find all the applications you need in the $AppsWithSecretConfigured variable. You can then process them one by one in a ForEach loop, creating a record for each client secret or certificate, which you store in Log Analytics or send to the Logic App.
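Here is a simplified sketch of such a loop; it assumes the Get-ExpireDays helper described later in this article:

#A simplified processing loop; record creation and posting are covered in the following sections
foreach ($App in $AppsWithSecretConfigured) {
    foreach ($Secret in $App.passwordCredentials) {
        #endDateTime is the expiry date returned by the Graph API for each client secret
        $SecretExpiresDays = Get-ExpireDays -EndDate $Secret.endDateTime
        #Build the record here (see the Create-AppRecord function below) and send it to Log Analytics
    }
}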
As I already explained, the Log Analytics workspace is where I store the daily scan results of the environment. There are many advantages to using it. In the context of my solution, the price and the low barrier to entry of the KQL query language are at the top of the list.
Let me give you an example of what our client spends on it.
There are about 12,000 applications scanned daily (about 1 million records/month), and the database retention is set at 2 years. And the monthly cost is… $1!
Yes, you read that right 😊
Now, let’s look at how to put this data to use. Obviously, we collect data so that we can make use of it, but to do that, you need to know how to query it. This is where KQL comes into play.
First, you will need to create a new Log Analytics workspace as explained here.
The next step is to assign the role of Log Analytics Reader to our Service Principal so that it can query the database.
You can do that in the Access control (IAM) tab of the newly created Log Analytics workspace.
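If you prefer PowerShell over the portal, the same assignment could look like this (the workspace resource ID is a hypothetical variable):

#Assign the Log Analytics Reader role to the Service Principal at the workspace scope
New-AzRoleAssignment -ApplicationId $ServicePrincipalApplicationID `
    -RoleDefinitionName 'Log Analytics Reader' -Scope $LogAnalyticsWorkspaceResourceId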
At this point, we have permission to read from the Log Analytics database, but there is no data there yet. It will appear after the first iteration of our script.
Now, we need to equip the script with a method for sending and saving data in Log Analytics. To do this, we will create a custom log for storing all the information.
Fortunately, Microsoft has a ready-to-use PowerShell function, Post-LogAnalyticsData, to help you with that. All you need to do is implement it in your script and remember the following:
# Specify the name of the record type that you'll be creating
$LogType = "MyRecordType"
# Create two records with the same set of properties to create
$json = @"
[{  "StringValue":  "MyString1",
    "NumberValue":  42,
    "BooleanValue": true,
    "DateValue":    "2019-09-12T20:00:00.625Z",
    "GUIDValue":    "9909ED01-A74C-4874-8ABF-D2678E3AE23D"
},
{   "StringValue":  "MyString2",
    "NumberValue":  43,
    "BooleanValue": false,
    "DateValue":    "2019-09-12T20:00:00.625Z",
    "GUIDValue":    "8809ED01-A74C-4874-8ABF-D2678E3AE23D"
}]
"@
In my script, the record for each client secret or certificate is built by a helper function, Create-AppRecord:
######
#Function: Create-AppRecord
######
function Create-AppRecord {
    [CmdletBinding()]
    Param (
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $AppDisplayName,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $ApplicationAppId,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $OwnerDisplayName,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $OwnerUPN,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $OwnerMail,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $OwnerManagerDisplayName,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $OwnerManagerUPN,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $OwnerManagerMail,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $PreviousOwnerDisplayName,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $PreviousOwnerUPN,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $PreviousOwnerManagerDisplayName,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $PreviousOwnerManagerUPN,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $SecretDisplayName,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $SecretExpiresDate,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $SecretExpiresDays,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        [ValidateSet("Yes","No")]$EmailSent,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        [ValidateSet("NoAlert","FirstAlert","SecondAlert","ThirdAlert")]$AlertType
    )
    $Details = [ordered]@{
        'Application'                     = $AppDisplayName
        'ApplicationAppId'                = $ApplicationAppId
        'OwnerDisplayName'                = $OwnerDisplayName
        'OwnerUPN'                        = $OwnerUPN
        'OwnerMail'                       = $OwnerMail
        'OwnerManagerDisplayName'         = $OwnerManagerDisplayName
        'OwnerManagerUPN'                 = $OwnerManagerUPN
        'OwnerManagerMail'                = $OwnerManagerMail
        'PreviousOwnerDisplayName'        = $PreviousOwnerDisplayName
        'PreviousOwnerUPN'                = $PreviousOwnerUPN
        'PreviousOwnerManagerDisplayName' = $PreviousOwnerManagerDisplayName
        'PreviousOwnerManagerUPN'         = $PreviousOwnerManagerUPN
        'SecretDisplayName'               = $SecretDisplayName
        'SecretExpiresDate'               = $SecretExpiresDate
        'SecretExpiresDays'               = $SecretExpiresDays
        'EmailSent'                       = $EmailSent
        'AlertType'                       = $AlertType
    }
    $AppRecord = New-Object PSObject -Property $Details
    return $AppRecord
}
$JSON = ConvertTo-Json -InputObject $AppRecord
Post-LogAnalyticsData -customerId $LogAnalyticsWorkspaceID -sharedKey $LogAnalyticsSharedKey -body ([System.Text.Encoding]::UTF8.GetBytes($JSON)) -logType $logType
Remember to fill in the $LogAnalyticsWorkspaceID and $LogAnalyticsSharedKey variables with the workspace ID and the primary key of your Log Analytics workspace.
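Both values are sensitive, so you may want to keep them out of the script body, for example as encrypted Azure Automation variables. A sketch, assuming you named the assets accordingly:

#Read the workspace ID and shared key from (encrypted) Automation variables; the asset names are hypothetical
$LogAnalyticsWorkspaceID = Get-AutomationVariable -Name 'LogAnalyticsWorkspaceID'
$LogAnalyticsSharedKey   = Get-AutomationVariable -Name 'LogAnalyticsSharedKey'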
I designed my logic in such a way that if the script finds an application without an owner, it runs an additional function that checks Log Analytics and looks there for a record from the past when the application had an assigned Owner.
Below you will find a function that will send a defined KQL query to the Log Analytics database. You must, of course, customize the KQL query yourself.
The one that I am posting serves only as an example. The query looks for the latest record in the database where OwnerUPN is not Empty. When such a record exists, my script uses it to fill the JSON record for an application without an owner.
Generally, you should adjust the query to suit your logic and what you will be sending to Log Analytics in the relevant properties for the application.
The example below shows how a KQL query can be sent from within PowerShell:
######
#Function: Find-PreviousAppOwners
######
function Find-PreviousAppOwners {
    [CmdletBinding()]
    Param (
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $ApplicationAppId,
        [Parameter(ValueFromPipeline=$true, Mandatory = $true)]
        $LogAnalyticsWorkspaceID
    )
    #Quote the AppId so it can be embedded in the KQL query string
    $ApplicationAppId = '"' + $ApplicationAppId + '"'
    $Query = "ClientSecretMonitoring_CL
    | where TimeGenerated > ago(730d)
    | where ApplicationAppId_g == $ApplicationAppId
    | where OwnerUPN_s != 'Empty'
    | where OwnerManagerUPN_s != 'Empty'
    | summarize arg_max(TimeGenerated, *) by OwnerMail_s
    | project OwnerDisplayName_s, OwnerUPN_s, OwnerManagerDisplayName_s, OwnerManagerUPN_s"
    $queryResults = Invoke-AzOperationalInsightsQuery -WorkspaceId $LogAnalyticsWorkspaceID -Query $Query -IncludeStatistics
    $PreviousAppOwners = [System.Linq.Enumerable]::ToArray($queryResults.Results)
    return $PreviousAppOwners
}
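Calling the function for an application that currently has no owner could look like this:

#Look up the previous owner data for an ownerless application
$PreviousAppOwners = Find-PreviousAppOwners -ApplicationAppId $App.appId -LogAnalyticsWorkspaceID $LogAnalyticsWorkspaceID
if ($PreviousAppOwners) {
    #Use the returned OwnerDisplayName_s, OwnerUPN_s, etc. to fill the Previous Owner fields of the record
}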
Luckily, this is quite simple. Each entry in the application’s PasswordCredentials property contains the expiry date (endDateTime) of the client secret.
To calculate the number of days left, use the following function:
######
#Function: Get-ExpireDays
######
function Get-ExpireDays {
    [CmdletBinding()]
    Param (
        [Parameter(ValueFromPipeline=$true, HelpMessage="Type Expire Date", Mandatory = $true)]
        $EndDate
    )
    #Compare the expiry date with today's date (midnight)
    [datetime]$StartDate = (Get-Date).Date
    try {
        $ExpireDays = ((New-TimeSpan -Start $StartDate -End $EndDate -ErrorAction Stop)).Days
    }
    catch {
        $ErrorMessage = $_.Exception.Message
    }
    return $ExpireDays
}
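A quick usage example with a literal date (in practice, you pass the endDateTime value from PasswordCredentials):

#Example: days left until the given date
Get-ExpireDays -EndDate '2023-12-31T12:00:00Z'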
In this step, we will also use the features of Logic Apps. And don’t worry if you are new to them. To get the solution up and running, we need to build a very simple Logic App that will send emails and notifications via MS Teams to the owner, the previous owner, or their manager.
We will start by creating a new, empty Logic App and selecting the Consumption plan.
Each Logic App has two types of elements: triggers, which start the workflow, and actions, which perform its steps.
In the case of our Logic App, I used a trigger called ‘When an HTTP request is received.’ When this trigger is saved in the Logic App Designer, an HTTP POST URL will be generated.
To this URL, we can send the payload that was built using the Create-AppRecord function and converted into JSON.
This can be done using the following cmdlet:
Invoke-RestMethod -Method POST -Uri $LogicAppURL -Body $JSON -ContentType 'application/json'
The variable $LogicAppURL is the URL generated in the trigger.
A JSON schema should then be generated and added to the Logic App for such a request (the trigger’s ‘Use sample payload to generate schema’ option makes this easy). At this point, the Logic App is ready to accept records created by the script.
The first action used in the Logic App is Parse JSON. It parses the input record so that you can retrieve from it any properties you need (Application Name, Application ID, Owner email, etc.).
With the record parsed this way, we are left with building the logic using Condition actions, which branch depending on the record data.
The action responsible for sending emails is Send email (V4) in the SendGrid connector. Alternatively, you can use the Send an email (V2) action in the Office 365 Outlook connector.
For sending MS Teams notifications, use the ‘Post message in a chat or channel’ action in the MS Teams connector after creating a service account and logging in with it. Also, remember that the account requires a license that includes MS Teams.
The script uses sensitive information (e.g., the Logic App URL) for authentication and communication with Log Analytics and Logic Apps. To secure these, you can use the Azure Key Vault service.
As the script is hosted in Azure Automation, I used the Credentials tab instead, where you can also store sensitive information securely.
The script can retrieve the information from Credentials using the Get-AutomationPSCredential cmdlet.
When creating the entry in Credentials, I set UserName to the AppID of the Service Principal that the script uses and Password to the value of the client secret generated for this Service Principal.
With the following code, it is possible to retrieve these credentials in the script:
$SP_Creds = Get-AutomationPSCredential -Name "ClientSecretsAndCertificatesMonitoring"
$ServicePrincipalApplicationID = $SP_Creds.GetNetworkCredential().UserName
$ServicePrincipalSecret = $SP_Creds.GetNetworkCredential().Password
And that would be the last step. It wasn’t so hard, was it?
Last but not least, don’t forget to secure all the sensitive information you will use in the script.
In this article, I wanted to present my idea for monitoring client secrets and certificates. This is not the only approach, and you can achieve the same results using other methods.
I have tried to give you as much information as possible to help you build your own solution. I hope you will find my notes handy, and if you have any questions, just let me know.