# Learning APIM Series (3) - Configuring multiple Azure OpenAI instances with APIM and managed identity

# Enable APIM managed identity

  1. Enable APIM managed identity

Make sure the system-assigned managed identity is enabled on the APIM resource.

# Prepare Azure OpenAI resource

  1. Add a role assignment for the APIM managed identity

(1) Add a role assignment with the Cognitive Services OpenAI User role

(2) Assign the role to the APIM managed identity

Note: Repeat this role assignment on every Azure OpenAI resource that APIM will call.

# Configure APIM with Azure OpenAI backends

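  1. Add each Azure OpenAI resource as an APIM backend. The policies later in this post refer to the two backends by their IDs, aoai-jpe-i-01-20230814 and aoai-jpe-i-02-20230814.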

# Add Azure API with OpenAI

  1. Add Azure API with OpenAI

(1) On the APIM APIs pane, choose Add API and select OpenAPI

  2. Create from OpenAPI specification

(1) Select Full in the Create from OpenAPI specification dialog

(2) Download the OpenAPI specification from the URL https://raw.githubusercontent.com/Azure/azure-rest-api-specs/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable/2023-05-15/inference.json

(3) In the downloaded file, change the servers setting to your Azure OpenAI endpoint URL

(4) Upload the modified JSON file as the OpenAPI specification

(5) Set the API URL suffix to openai

(6) Add the Starter and Unlimited products
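
Step (3) can also be scripted instead of edited by hand. The following is a minimal Python sketch, assuming the downloaded file is an OpenAPI 3 document with a top-level `servers` list. The function names are hypothetical, and whether the URL should end in `/openai` depends on how your backend URL and the API URL suffix combine, so check the result after import.

```python
import json


def set_server_url(spec: dict, endpoint_url: str) -> dict:
    """Return a copy of an OpenAPI 3 document whose `servers` list points at endpoint_url."""
    updated = dict(spec)  # shallow copy; only the top-level `servers` key is replaced
    updated["servers"] = [{"url": endpoint_url.rstrip("/")}]
    return updated


def rewrite_spec_file(src_path: str, dst_path: str, endpoint_url: str) -> None:
    """Read the downloaded spec, replace its servers entry, and write a new file."""
    with open(src_path, encoding="utf-8") as f:
        spec = json.load(f)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(set_server_url(spec, endpoint_url), f, indent=2)
```

For example, `rewrite_spec_file("inference.json", "inference-apim.json", "https://my-aoai-resource.openai.azure.com/openai")` writes a copy ready for upload; `my-aoai-resource` is a placeholder for your own resource name.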

# Set the Inbound policy to add the managed identity token

  1. Add the following policy to the Inbound policy

```xml
<!-- Get a token for Azure Cognitive Services with the APIM managed identity. -->
<authentication-managed-identity resource="https://cognitiveservices.azure.com" output-token-variable-name="msi-access-token" ignore-error="false" />
<!-- Pass the token to Azure OpenAI as a bearer token. -->
<set-header name="Authorization" exists-action="override">
    <value>@("Bearer " + (string)context.Variables["msi-access-token"])</value>
</set-header>
```

# Add Inbound policy for Azure OpenAI redundancy

  1. Add the following policy to the Inbound policy

```xml
<!-- Pick backend 1 or 2 at random; the upper bound of Next(1, 3) is exclusive. -->
<set-variable name="urlId" value="@{
    Random rnd = new Random();
    int urlId = rnd.Next(1, 3);
    return urlId.ToString();
}" />
<!-- Load the "down" flags that the backend retry policy stores in the cache. -->
<cache-lookup-value key="@("instance-01-is-down")" variable-name="instance-01-is-down" />
<cache-lookup-value key="@("instance-02-is-down")" variable-name="instance-02-is-down" />
<choose>
    <when condition="@(context.Variables.GetValueOrDefault<string>("urlId").Equals("1") && (!context.Variables.GetValueOrDefault<bool>("instance-01-is-down")))">
        <set-backend-service backend-id="aoai-jpe-i-01-20230814" />
    </when>
    <when condition="@(context.Variables.GetValueOrDefault<string>("urlId").Equals("2") && (!context.Variables.GetValueOrDefault<bool>("instance-02-is-down")))">
        <set-backend-service backend-id="aoai-jpe-i-02-20230814" />
    </when>
    <otherwise>
        <!-- The selected instance is flagged as down: return a 500 that includes the urlId. -->
        <return-response>
            <set-status code="500" reason="InternalServerError" />
            <set-header name="Microsoft-Azure-Api-Management-Correlation-Id" exists-action="override">
                <value>@{return Guid.NewGuid().ToString();}</value>
            </set-header>
            <set-body>@("A gateway-related error occurred while processing the request. " + context.Variables.GetValueOrDefault<string>("urlId"))</set-body>
        </return-response>
    </otherwise>
</choose>
```
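
To make the selection logic easier to reason about, here is a small Python model of the inbound policy. It is only a sketch of the control flow, not APIM code: the function name `pick_backend` and the plain dictionary standing in for the APIM cache are assumptions made for illustration.

```python
import random
from typing import Dict, Optional

BACKENDS = {1: "aoai-jpe-i-01-20230814", 2: "aoai-jpe-i-02-20230814"}
DOWN_FLAGS = {1: "instance-01-is-down", 2: "instance-02-is-down"}


def pick_backend(cache: Dict[str, bool], rng: random.Random) -> Optional[str]:
    """Mirror the inbound <choose>: return a backend id, or None for the 500 branch."""
    # rnd.Next(1, 3) in .NET excludes the upper bound, like randrange(1, 3): 1 or 2.
    url_id = rng.randrange(1, 3)
    # A missing cache entry behaves like GetValueOrDefault<bool>, i.e. False.
    if not cache.get(DOWN_FLAGS[url_id], False):
        return BACKENDS[url_id]
    # The policy does not try the other instance here; it returns HTTP 500.
    return None
```

The model shows one consequence of this design: while an instance is flagged as down, roughly half of the incoming requests still draw its urlId and receive the 500 response until the cache entry expires.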

# Add retry Backend policy

  1. Add the following policy to the Backend policy

```xml
<!-- Retry up to twice when the backend answers 429 (Too Many Requests). -->
<retry condition="@(context.Response.StatusCode == 429)" count="2" interval="1" max-interval="10" delta="1" first-fast-retry="true">
    <choose>
        <!-- context.Response is null on the first pass, so this branch only runs on retries. -->
        <when condition="@(context.Response != null && (context.Response.StatusCode == 429))">
            <choose>
                <!-- Flag the throttled instance as down for 10 seconds and switch to the other one. -->
                <when condition="@(context.Variables.GetValueOrDefault<string>("urlId").Equals("1"))">
                    <cache-store-value key="@("instance-01-is-down")" value="@(true)" duration="10" />
                    <set-backend-service backend-id="aoai-jpe-i-02-20230814" />
                </when>
                <when condition="@(context.Variables.GetValueOrDefault<string>("urlId").Equals("2"))">
                    <cache-store-value key="@("instance-02-is-down")" value="@(true)" duration="10" />
                    <set-backend-service backend-id="aoai-jpe-i-01-20230814" />
                </when>
                <otherwise />
            </choose>
        </when>
        <otherwise />
    </choose>
    <forward-request />
</retry>
```
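
The failover behaviour of this retry block can be traced with a short Python model. This is a sketch under stated assumptions: `forward_with_retry` and the `send` callback are hypothetical stand-ins for the policy and for `forward-request`, the cache is a plain dictionary, and the 10-second expiry and retry intervals are not modelled.

```python
from typing import Callable, Dict, List, Optional, Tuple

BACKENDS = {"1": "aoai-jpe-i-01-20230814", "2": "aoai-jpe-i-02-20230814"}
OTHER = {"1": "2", "2": "1"}


def forward_with_retry(
    url_id: str,
    send: Callable[[str], int],
    cache: Dict[str, bool],
    count: int = 2,
) -> Tuple[Optional[int], List[str]]:
    """Model the <retry> block: one initial attempt plus up to `count` retries on 429."""
    backend = BACKENDS[url_id]
    attempts: List[str] = []
    status: Optional[int] = None
    for _ in range(count + 1):
        if status == 429:
            # urlId never changes, so every retry flags the originally selected
            # instance and targets the other one.
            cache[f"instance-0{url_id}-is-down"] = True
            backend = BACKENDS[OTHER[url_id]]
        status = send(backend)
        attempts.append(backend)
        if status != 429:
            break
    return status, attempts
```

Tracing a request that starts on instance 2 while both instances return 429 gives the attempts 02, 01, 01: the second instance is tried twice, and only the originally selected instance is flagged as down.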

# Configure APIM subscription to accept api-key header and query parameter

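On the API's Settings tab, under Subscription, set the header name to api-key so that clients can pass the APIM subscription key in the same header they would use for an Azure OpenAI key. The query parameter name is configured in the same place.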

# Test the APIM in a VM with VNet integration

  1. Get the subscription key from APIM

  2. Add the APIM FQDN to the hosts file

```bash
vi /etc/hosts

# add the following line
10.0.4.4 apim-jpe-20230818.azure-api.net
```

  3. Use curl to test APIM

```bash
curl -X POST "https://apim-jpe-20230818.azure-api.net/openai/deployments/gpt-35-turbo/chat/completions?api-version=2023-05-15" \
  -H "Host: apim-jpe-20230818.azure-api.net" \
  -H "Content-Type: application/json" \
  -H "api-key: <your_APIM_subscription_key>" \
  -d '{
    "model": "gpt-35-turbo",
    "messages": [
      {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
      {"role": "user", "content": "Who were the founders of Microsoft?"}
    ]
  }'
```
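
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above. The helper names `build_chat_request` and `send` are hypothetical, and the response parsing assumes the usual chat completions shape with a `choices` list.

```python
import json
import urllib.request


def build_chat_request(apim_host: str, deployment: str, subscription_key: str) -> urllib.request.Request:
    """Build the POST request that the curl command sends through APIM."""
    url = (
        f"https://{apim_host}/openai/deployments/{deployment}"
        "/chat/completions?api-version=2023-05-15"
    )
    body = {
        "model": deployment,
        "messages": [
            {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
            {"role": "user", "content": "Who were the founders of Microsoft?"},
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": subscription_key},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("apim-jpe-20230818.azure-api.net", "gpt-35-turbo", "<your_APIM_subscription_key>"))` from the test VM should return the model's answer if the hosts entry and policies are in place.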

# Reference

  • Azure OpenAI API 2023-05-15 Swagger spec
  • Authenticate with managed identity
  • api-management-policy-snippets-Back-end API redundancy.policy.xml