Author: ultroni1

Token management
Imagine a sudden spike in traffic to your API, perhaps because of a sale. To avoid overconsumption and possible service disruption, you need a way to manage that load.

Azure OpenAI Token Limit policy

As mentioned at the beginning of this unit, a sudden spike in traffic is something you need to handle. The good news is that Azure API Management has a Token Limit policy.

This policy lets customers set limits on token consumption, expressed in tokens per minute (TPM), and ensures fair and efficient utilization of OpenAI resources.

Key features

The key features of this policy are:
- Precise Control: Customers can assign token-based limits on various counter keys, such as Subscription key or IP Address, tailoring the enforcement to specific use cases.
- Real-Time Monitoring: The policy relies on token usage metrics returned from the OpenAI endpoint, allowing for accurate monitoring and enforcement of limits in real-time.
- Pre-Calculation of Tokens: It enables precalculation of prompt tokens on the Azure API Management side, minimizing unnecessary requests to the OpenAI backend if the limit is already exceeded.
- Enhanced Customization: Customers can apply headers and variables such as tokens-consumed and remaining-tokens within policies for better control and customization.
As you can see, there are quite a few features that help you manage costs, and thanks to the real-time monitoring you can make sure that you’re not exceeding the limits.
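
To make the idea of per-key limits and precalculation concrete, here is a minimal sketch of a tokens-per-minute counter. It is not how Azure API Management implements the policy: the `TokenLimiter` class, its fixed 60-second window, and the method names are all assumptions made for illustration. It only shows the general technique of checking an estimated prompt size against a per-key budget before forwarding a request, then recording the usage that the backend reports.

```python
import math
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenLimiter:
    """Hypothetical fixed-window tokens-per-minute counter, one window per counter key."""

    tokens_per_minute: int
    window_seconds: int = 60
    # counter key -> (window start time, tokens consumed in that window)
    _windows: dict = field(default_factory=dict)

    def _current_window(self, counter_key: str, now: float):
        start, used = self._windows.get(counter_key, (now, 0))
        if now - start >= self.window_seconds:
            start, used = now, 0  # the previous window expired, so start a new one
        return start, used

    def check(self, counter_key: str, estimated_prompt_tokens: int,
              now: Optional[float] = None):
        """Return (allowed, remaining_tokens, retry_after_seconds) before calling the backend."""
        now = time.monotonic() if now is None else now
        start, used = self._current_window(counter_key, now)
        self._windows[counter_key] = (start, used)
        remaining = self.tokens_per_minute - used
        if estimated_prompt_tokens > remaining:
            retry_after = math.ceil(start + self.window_seconds - now)
            return False, remaining, retry_after
        return True, remaining, 0

    def record(self, counter_key: str, tokens_consumed: int,
               now: Optional[float] = None) -> None:
        """Add the token usage reported by the backend to the caller's window."""
        now = time.monotonic() if now is None else now
        start, used = self._current_window(counter_key, now)
        self._windows[counter_key] = (start, used + tokens_consumed)


if __name__ == "__main__":
    limiter = TokenLimiter(tokens_per_minute=100)
    print(limiter.check("subscription-a", estimated_prompt_tokens=40))
    limiter.record("subscription-a", tokens_consumed=90)
    print(limiter.check("subscription-a", estimated_prompt_tokens=40))
```

Rejecting the request in `check` is what saves the round trip to the backend when the budget is already spent. The remaining-token count and retry delay it returns play the role that the remaining-tokens and Retry-After headers play in the policy.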

How to use it

To use this policy, you need to add it to the inbound processing pipeline of the API operation. Here’s how you can do it:

```
<azure-openai-token-limit counter-key="key value"
        tokens-per-minute="number"
        estimate-prompt-tokens="true | false"    
        retry-after-header-name="custom header name, replaces default 'Retry-After'" 
        retry-after-variable-name="policy expression variable name"
        remaining-tokens-header-name="header name"  
        remaining-tokens-variable-name="policy expression variable name"
        tokens-consumed-header-name="header name"
        tokens-consumed-variable-name="policy expression variable name" />
```
There are quite a few attributes you can set, but the most important ones are:
- counter-key: The key to use for counting tokens. This value can be a subscription key or an IP address.
- tokens-per-minute: The number of tokens allowed per minute.
- estimate-prompt-tokens: Whether to estimate prompt tokens or not.
October 8, 2025
API management
So what problems would make you seek out an API management solution? You most likely face the following challenges:
- Scaling, your APIs are used by many clients in different regions of the world and you need to ensure that they’re available and responsive.
- Security, you need to ensure that your API is secure and that only authorized clients can access it.
- Error management, you need to ensure that your API can handle errors gracefully.
- Monitoring, you need to monitor your APIs to ensure that they’re performing as expected.
- Resilience, you need to ensure that your API is resilient and can handle failures gracefully.
For each of these challenges you could opt for a point solution, but a collection of point solutions is hard to manage. Consider also that your APIs could be built on different tech stacks, which means you might need a different solution for each API. If you face all these challenges, you should consider a centralized API management solution like Azure API Management.

Let’s dive deeper into some challenges and see how a centralized API management solution like Azure API Management can help you address them.

Infrastructure as code, IaC

It’s perfectly fine to create your Azure resources in the Azure portal, but as your infrastructure grows, it becomes harder to manage. One of the problems you face is that you can’t easily replicate your infrastructure in another environment.

It’s also hard to trace all the changes that are made to your infrastructure. This situation is where Infrastructure as Code (IaC) comes in. IaC is the practice of managing your infrastructure using code. To apply IaC on Azure, you have several options, one of which is Bicep. Bicep is a Domain Specific Language (DSL) for deploying Azure resources declaratively. It’s a great way to manage your cloud resources. Here’s a simple example of what Bicep looks like:

```
param location string = 'eastus'

resource storageAccount 'Microsoft.Storage/storageAccounts@2021-06-01' = {
  name: 'mystorageaccount'
  location: location
  kind: 'StorageV2'
  sku: {
    name: 'Standard_LRS'
  }
}
```
In the preceding example, we defined a storage account using Bicep. We defined the location of the storage account, the kind of storage account, and the SKU (stock-keeping unit). Location is a parameter that we can pass in when we deploy the Bicep file. To deploy the file presented, we would use the Azure CLI like so:

```
az deployment group create --resource-group myResourceGroup --template-file main.bicep
```
The preceding command deploys the storage account to the resource group myResourceGroup and uses the Bicep file main.bicep to create the resources defined in the file.
Secure your recovery strategy
Make sure your backup and recovery setup is just as secure as your main environment, including security controls and frequency of backup.

You should always have a clean, safe version of your system ready to go in case something goes wrong. That way, you can switch over to a secure backup system and restore data without introducing any threats.

An inefficient recovery process can slow down recovery, which can cause you to miss recovery targets. For example, a security problem like encrypted backup data that you can’t decipher or corrupted backup data might slow down recovery.

Contoso’s challenge
- The system runs in active-active mode across regions, and the team has a disaster recovery plan to help restore operations in worst case scenarios.
- Part of this plan involves sending backups to a third region in the United States.
- During a recent drill, they found out those backups were being stored in a system that wasn’t checked often and didn’t have strong security.
- All the backups had been infected with malware. If they had had a real disaster at that time, they wouldn’t have been able to recover successfully.
Applying the approach and outcomes
- The team invested time and effort to secure the backup location. They added stronger network and identity protections, and now backups are stored in a way that can’t be changed or tampered with.
- After reviewing their security controls, the team finds that during the recovery process, the application runs without a WAF for a period of time. They change the order of operations to close that gap.
- The team is confident that the backups and the recovery process are much more secure and not easy targets anymore.
Enhance reliability through robust security
Use security controls and design patterns to stop attacks and bugs from overloading the system or locking people out.

This approach helps keep the system up and running, even if someone tries to take it down with something like a distributed denial of service (DDoS) attack.

Contoso’s challenge
- The workload team and the workload’s stakeholders know that this system must be extremely reliable because hotel guests rely on it for both business and leisure travel. If it goes down, hotels can’t run properly.
- The team has put a lot of effort into testing functional and nonfunctional requirements to make sure the system works well and stays operational, including using safe ways to roll out updates.
- They’ve focused on keeping things reliable, but they haven’t paid as much attention to security. A recent update had a bug that hackers took advantage of, crashing the system for several hotels. The attack overloaded servers in one region for over four hours, causing major problems for guests and staff.
- The attacker used the app’s servers to sneak in requests to a regional storage system and pull up fake folio data. One of those fake folios was huge and caused the servers to run out of memory. Then, when users tried again, it spread the problem to all the servers.
Applying the approach and outcomes
- The team changed the design so the app servers no longer handle folio requests directly. Instead, they’re using a Valet Key approach to limit access. This approach wouldn’t have stopped the attack completely, but it would have kept the damage contained.
- They also added better input checks to clean up anything suspicious before it reaches the system.
- With stronger input filtering and a smarter design, they’ve reduced the risk of this kind of attack happening again.
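
As a rough illustration of the Valet Key idea mentioned above, the sketch below has the application issue a short-lived token that is valid for exactly one resource path, so the client fetches that resource directly instead of routing the request through the app servers. Everything here is hypothetical: the `issue_valet_key` and `validate_valet_key` helpers, the token format, and the hard-coded demo secret. On Azure Storage, this role is typically played by a shared access signature rather than by hand-rolled code.

```python
import hashlib
import hmac

# Demo-only secret (an assumption for this sketch); a real key belongs in a secret store.
SECRET_KEY = b"demo-only-secret"


def issue_valet_key(resource_path: str, expires_at: int) -> str:
    """Sign one resource path together with its expiry time (seconds since epoch)."""
    message = f"{resource_path}\n{expires_at}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{expires_at}.{signature}"


def validate_valet_key(resource_path: str, token: str, now: int) -> bool:
    """Accept the token only for the signed path and only before it expires."""
    try:
        expires_text, signature = token.split(".", 1)
        expires_at = int(expires_text)
    except ValueError:
        return False  # malformed token
    if now >= expires_at:
        return False  # expired
    expected = issue_valet_key(resource_path, expires_at).split(".", 1)[1]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Because the token is bound to a single path and a short lifetime, a stolen or misused token exposes only that one resource for a limited time, which is the containment property the team was after.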
Proactively limit attack vectors

Set up controls ahead of time to block common ways that attackers try to break in, like bugs in your code, weak network setups, or missing antivirus.

Regularly scan your code, install security updates, keep software current, and run antivirus tools. These practices help reduce the ways that attackers can get in, and they help keep things running smoothly.

Contoso’s challenge
- The system runs on Azure VMs (virtual machines) that use the latest Ubuntu images from Azure Marketplace. When each VM starts up, it installs some certificates, adjusts a few SSH settings, and loads the app code. But it doesn’t use any antivirus or anti-malware tools.
- Azure Application Gateway fronts the solution, but it’s only used as an internet gateway. The web application firewall (WAF) function isn’t enabled currently.
- These choices leave the system exposed to potential risks, like vulnerabilities in the code or accidental malware installs.
Applying the approach and outcomes
- After talking with the security team in Contoso, the VMs are now enrolled in an enterprise-managed antivirus solution.
- The team also enables and fine-tunes the WAF function to block risky traffic, like SQL injection attempts, before it even reaches the app.
- Both the app and its platform now have stronger layered defenses to help keep the system stable and secure.
Do threat modeling to find and resolve potential threats
Analyze each part of your workflow and consider what could go wrong. Use an industry-standard methodology to classify the identified threats.

Threat modeling helps you find and fix security threats before they become real problems. Analyzing your workload helps you put together a report that shows which attack paths are the most serious and helps you quickly find weak spots.

Contoso’s challenge
- Even though they haven’t had a security problem yet, the workload team doesn’t have a clear way to check if all possible threats are covered by their current security setup.
- They realize that there’s a gap in their security, and if something goes wrong, they might not be ready.
Applying the approach and outcomes
- The team brings in a security consulting specialist to learn how to do threat modeling.
- After their first exercise, they find that they have well-designed controls for most threat vectors, but there are some gaps:
  - One problem was in a data cleanup task that runs after Apache Spark jobs. It had two insider threat risks for data leaks.
  - An old system used by a race team that’s no longer active still had access to sensitive race data.
- They’ve scheduled fixes for the next development cycle, including shutting down the old system.
Test controls yourself
Have security experts try to ethically hack your system occasionally to find weak spots. Regularly scan your infrastructure, code, and tools to catch any vulnerabilities before they become real problems.

Running security tests that mimic real-world attacks, like penetration testing, helps you see if your defenses actually work.

Threats can sneak in during updates or changes, so it’s smart to build vulnerability scanners right into your deployment process. That way, you can catch problems early and even block risky code from going live until it’s fixed.
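
One way to picture a scanner that blocks risky code is as a gate step that turns scan findings into a pass or fail exit code. The sketch below assumes a hypothetical `Finding` record and severity scale, and the identifiers in the usage lines are placeholders. Real scanners have their own output formats, so treat this as the shape of the check rather than an integration with any specific tool.

```python
from dataclasses import dataclass

# Assumed severity scale for this sketch; real scanners define their own.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Finding:
    package: str
    identifier: str  # for example, an ID reported by the scanner
    severity: str    # one of the SEVERITY_RANK keys


def blocking_findings(findings, threshold="high"):
    """Return the findings at or above the threshold severity."""
    minimum = SEVERITY_RANK[threshold]
    return [f for f in findings if SEVERITY_RANK[f.severity] >= minimum]


def gate(findings, threshold="high") -> int:
    """Return a process exit code: 0 lets the deployment continue, 1 blocks it."""
    blockers = blocking_findings(findings, threshold)
    for finding in blockers:
        print(f"BLOCKED by {finding.identifier} ({finding.severity}) in {finding.package}")
    return 1 if blockers else 0


if __name__ == "__main__":
    scan_results = [
        Finding("libfoo", "EXAMPLE-1", "low"),
        Finding("libbar", "EXAMPLE-2", "critical"),
    ]
    raise SystemExit(gate(scan_results))
```

A pipeline step that runs this script fails whenever a finding meets the threshold, which keeps the flagged code from going live until it’s fixed.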

Contoso’s challenge
- The threat modeling exercise helped the team find some gaps in their security setup. Now they want to make sure their fixes are strong and that nothing was missed.
- They’ve used open-source tools to test security and found it fun and useful. However, the team and stakeholders want to bring in security professionals to do thorough and rigorous testing regularly.
Applying the approach and outcomes
- The team contacts a well-known Microsoft partner that specializes in cloud security to talk about penetration testing.
- The workload team signs a Statement of Work for quarterly penetration testing, including one white-box test each year for extra confidence.
- The consulting team also helps the development team install anti-malware on dev boxes and the self-hosted build agents.
- Now, both the team and stakeholders feel a lot more confident that they’re ready for potential threats.
Get current, and stay current
Ensure that your systems always run the latest updates and security patches. Keep checking how things are working by using audit reports, benchmarks, and test results to spot areas to improve. Consider automation where possible. Use smart threat detection tools that can spot problems as they happen. And every so often, check that your setup still follows Security Development Lifecycle (SDL) best practices.

Keeping your security strong takes ongoing effort. By learning from real-world attacks and test results, you can stay ahead of attackers who are always finding new ways to break in. Automating repetitive tasks also helps reduce human mistakes that could create risks.

SDL reviews bring clarity around security features. They also help you keep track of your workload’s assets and their security reports, which cover where they came from, how they’re used, and any weak spots they might have.

Contoso’s challenge
- The developers that write the Apache Spark jobs are hesitant to make changes. They don’t think that it’s necessary. But this means that the Python and R packages they bring into the solution are likely to get stale over time.
Applying the approach and outcomes
- After the workload team reviews internal processes, they realize that if they don’t keep the Apache Spark jobs up-to-date, they could end up with unpatched components in their system.
- The team adopts a new standard for the Apache Spark jobs: all technologies in use must be kept up to date as part of the regular update and patch schedules.
- This method helps close the security gap and lowers the risk of the entire workload running outdated software. Plus, their PaaS and SaaS services help limit their exposure to this risk because they don’t have to patch underlying infrastructure.
Optimize the security of your backups
Make sure your backups are encrypted and can’t be changed after they’re saved, especially when they’re being moved or copied.

When you adopt this approach, if you ever need to recover data, you can trust that the backup wasn’t tampered with, either by accident or on purpose.

Contoso’s challenge
- Contoso generates the Environmental Protection Agency emissions report every month, but they only need to submit it three times a year.
- They store the report in an Azure Storage account as a backup, just in case something goes wrong with the main system.
- The backup report isn’t encrypted at rest, although it is sent over HTTPS to the storage account.
Applying the approach and outcomes
- After doing a security gap analysis, the team realizes that the unencrypted backup is a risk.
- They now encrypt the report and store it in Azure Blob Storage by using the write-once, read-many (WORM) setting, which keeps the file from being changed.
- They also add a check. The system now compares a Secure Hash Algorithm (SHA) hash of the report with the backup to make sure nothing is altered.
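
The hash comparison in the last step can be sketched in a few lines. The source only says "SHA hash", so SHA-256 is an assumption here, as are the helper names and the idea that both copies are available as local files.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large reports don't have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(report_path: Path, backup_path: Path) -> bool:
    """Return True when the backup is byte-for-byte identical to the report."""
    return sha256_of_file(report_path) == sha256_of_file(backup_path)
```

Any change to the backup, even a single byte, produces a different digest, so a mismatch signals that the backup was altered or corrupted after it was written.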
Defend your supply chain
Make sure your tools, libraries, and build systems are safe from tampering. Scan for vulnerabilities during builds and while things are running.

Knowing where your software comes from and checking that it’s legitimate throughout the life cycle helps you catch problems early and fix them before they reach production.

Contoso’s challenge
- The engineering team is setting up their build and release pipelines, but they haven’t made sure the build system is secure or reliable yet.
- They’re using some open-source tools in both their firmware and cloud systems.
- They’ve heard how supply chain attacks or insider threats can sneak in bad code that could mess with systems or leak data. If their customer’s environmental reporting gets compromised, it could be a huge problem for both Contoso and the customers.
Applying the approach and outcomes
- The team updates their build processes for both firmware and back-end cloud systems to include security scans for common vulnerabilities and exposures (CVEs) and malware in dependencies, code, and packages.
- They also look at anti-malware options for their Azure Stack HCI setup, such as Windows Defender Application Control.
- These steps help make sure the software and firmware that they ship doesn’t do anything unexpected, and that their customers’ reporting stays accurate and secure.
Employ strong cryptographic mechanisms

Use strong cryptography, like encryption, certificates, and code signing, to build trust. Make sure only trusted sources can decrypt these mechanisms.

When you adopt this approach, only trusted sources can access or change your system and data.

Even if someone intercepts encrypted data, they can’t read it without the right key. And digital signatures help confirm that nothing was tampered with along the way.

Contoso’s challenge
- The devices that they chose for sensing and data transfer don’t have enough processing power to support HTTPS or custom encryption.
- The workload team plans to use network boundaries as their primary isolation technique.
- A risk review flagged that unencrypted communication between IoT devices and control systems could be a big problem. Just segmenting the network isn’t enough.
Applying the approach and outcomes
- They worked with the device manufacturer to upgrade to a more powerful model. The new devices support certificate-based communication and can verify signed firmware before running it.
Apply encryption at every step of the data life cycle
Use encryption to protect your data, whether it’s in storage, moving across the network, or being processed. Base your encryption strategy on how sensitive the data is.

By following this approach, even if someone manages to get access, they can’t read anything without the right keys.

Sensitive data includes configuration information that’s used to gain further access inside the system. Data encryption can help you contain risks.

Contoso’s challenge
- Contoso Rise Up backs up each PostgreSQL database by using the built-in point-in-time restores. To be safe, they also make a daily backup that’s consistent and store it separately in a storage account.
- The disaster recovery storage account is restricted with just-in-time access and only a few Microsoft Entra ID accounts can access it.
- During a recovery drill, an employee tried to access a backup and accidentally copied the backup to a network share in the Contoso organization.
- A few months later, this backup was discovered and reported to Contoso’s privacy team. They did a full investigation into how it was accessed and what happened to it up to the time when the incident was discovered. Luckily, no sensitive information was exposed, and the file was deleted after they finished their investigation and audit.
Applying the approach and outcomes
- The team now has a clear rule that all backups must be encrypted at rest by using Azure Storage Service Encryption. And the encryption keys must be secured in Azure Key Vault.
- Even if a backup ends up somewhere it shouldn’t, the data inside it is useless without the decryption key. So a privacy breach is much less likely.
- The disaster recovery plan now includes standard guidance about how to properly handle backups, including how and when to safely decrypt a backup.