Using OpenAI GPTs and Actions to Augment Infrastructure Engineering Workflows

How I configured a GPT with an External Action to a custom built API hosted on Azure Container Apps

Andrew Citera
7 min read · Dec 16, 2024

Introduction

Given that I spend a lot of time in the cloud platforms and infrastructure space, some of the first use cases I applied GenAI to were common infrastructure engineering workflows such as populating configuration files, managing resources with a CLI, or writing Terraform code. I found tools such as OpenAI’s ChatGPT to be effective productivity boosters; however, the infrastructure use case highlighted certain sharp edges. Accuracy matters more than creativity here, since a CLI command or a Terraform resource definition must be syntactically perfect and relies less on general logic than, say, an analytics script or a presentation. This manifested in issues such as resource names, command names, arguments, or parameters being outdated or wrong. The article below explores how GPTs and Actions can increase the effectiveness of these workflows, how to configure an Action, how to build and deploy an API on Azure, and what I learned along the way.

GPTs and Actions

One of the tools that proved effective in addressing accuracy issues was GPTs (see here), which are akin to retrieval-augmented generation (“RAG”) enabled chatbots offered as a service by OpenAI. While building a GPT, one can define custom instructions and upload documents to serve as “knowledge”. For example, I built a Terraform GPT where I uploaded all of the docs from the Terraform Azure resource provider. OpenAI handles the underlying complexity of vector embeddings and document search, so after publishing the GPT I could prompt it for Terraform code and the overall quality of the provided solutions increased over the general ChatGPT 4o model. Even with GPTs that had pre-uploaded knowledge, I occasionally ran into edge cases where similar-sounding parameters would be swapped, such as “server name” instead of the correct parameter “name”. See below for an example with an Azure CLI related question:

GPT with no actions enabled mixing up ‘--server-name’ with ‘--name’

It was this specific example that spawned the idea of using GPT Actions to increase the accuracy of Azure CLI related questions. GPT Actions allow GPTs to call external APIs to either retrieve data or perform an action on a user’s behalf with no manual intervention. They can be thought of as a low-code implementation of OpenAI function calling because the GPT platform abstracts away the mechanics of generating the API call and processing the response. When a user interacts with a GPT that has Actions enabled, the user simply asks a question in natural language and the GPT responds in natural language, with the API call and response happening under the hood. Instead of only a document-search-enabled GPT, my idea was to develop an API that could accept root Azure CLI commands and return the response from the native CLI help function back to the GPT in order to increase the argument accuracy. For instance, building off the earlier example, if the GPT could get to the point of knowing “az postgres flexible-server firewall-rule list” and pass that to the action (external API), the action could return the help output, which is the authoritative source of truth for the command arguments.

Output of the Azure CLI help command
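The mechanic described above can be sketched in a few lines of Python. This is a minimal, illustrative helper (not the article’s exact code) that shells out to the Azure CLI and captures the help text; it assumes the `az` binary is installed in the environment:

```python
import shlex
import subprocess


def build_help_args(command_root: str) -> list[str]:
    """Turn a root command like 'postgres flexible-server firewall-rule list'
    into the argument list for 'az ... --help'."""
    return ["az", *shlex.split(command_root), "--help"]


def get_az_help(command_root: str) -> str:
    """Run the Azure CLI help command and return its output text."""
    result = subprocess.run(
        build_help_args(command_root), capture_output=True, text=True
    )
    # az prints help text to stdout; unknown commands report errors on stderr
    return result.stdout if result.returncode == 0 else result.stderr
```

Keeping the argument construction separate from the subprocess call makes it easy to guarantee that only `az ... --help` can ever be executed, which matters once this sits behind a public endpoint.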

The Solution: GPT enabled with Actions to an Azure-Hosted API

The solution I ended up landing on was a GPT enabled with Actions that integrates with an API that takes a root Azure CLI command and returns the “--help” output as shown earlier. When the user prompts the GPT with an Azure CLI related question, the GPT uses its existing knowledge to formulate the command root, then reaches out to the API to retrieve the authoritative source of truth for the arguments and grounds its response in the response from the CLI itself.

High-Level Reference Architecture of the GPT Action Solution

Building and Hosting an API on Azure

The most complex portion of this undertaking was developing an API and hosting it on Azure, since the GPT itself doesn’t contain any compute environment where the CLI help command could be run. While there are lots of options, I decided on Azure Container Apps, which I found to be a good middle ground between Azure Web Apps for Containers and AKS: it can scale to zero, and given my background in Kubernetes I felt more comfortable with the replica-based model (see here to learn more about the different Azure container hosting services). The API portion of the overall solution consists of the below:

  1. Azure API Management (APIM): Acts as the API gateway to secure and manage the communication and routing to the Container App backend. Primarily used in my solution to enforce API key authentication for the end client and identity-based authentication for communicating with the backend
  2. Azure Container Apps (ACA): Hosts the Python API app and handles authentication, scaling, and incoming HTTP traffic (with a managed ingress controller)
  3. The API App Itself: Built using FastAPI and Gunicorn. FastAPI is a modern, high-performance web framework for building APIs with Python based on standard Python type hints, and Gunicorn is a Python WSGI HTTP server (run with Uvicorn workers, since FastAPI is an ASGI framework) that sits between the Azure Container Apps ingress controller (reverse proxy) and the Python application itself, since the FastAPI app alone cannot serve any content (learn more here)

The GPT Action Integration

After the API itself was built and tested, the Action integration was relatively straightforward. All that is required is an OpenAPI schema uploaded to the GPT builder (it can be exported from Azure APIM), the API keys for authentication, and updated custom instructions telling the GPT when to call out to the action (see here to learn more). Once the custom GPT is updated, when a user prompts with a CLI related question the GPT will call out to the API. See below for an example:

GPT Response with a function call to the AzCLI Help Command API

As one will notice in the example above, when the GPT uses the custom action to pull the parameters it accurately provides “--name” instead of the earlier error of “--server-name”. This is the power of providing highly specific context via an action.

To use the custom GPT yourself, check it out here: https://chatgpt.com/g/g-Q78XaDnfo-azure-cloud-solution-architect.

Key Learnings About Custom GPT Actions

  1. There’s no direct mechanism to define when a GPT will call the action; this functionality is abstracted by OpenAI. After some trial and error, I found that the custom instructions need to be explicit about when to call the action, and the endpoint descriptions in the OpenAPI schema need to be clear and concise
  2. Sequential action calls that build on each other are possible if custom instructions are clear. I was able to define this logic via custom instructions for situations where a user is asking the GPT about multiple Azure CLI commands at one time (this essentially allows the GPT to hit the API multiple times with different payloads)
  3. Azure Container Apps is a great service for a low-friction, serverless experience. I found that being able to rely on standard workflows using tools like Docker for containerization and open source frameworks like FastAPI for the app itself was easier to adopt than, say, trying to do the same thing with Azure Functions (more managed) or self-hosting (least managed)
  4. GPTs can be called from and between general chat windows using the “@” feature, so if I’m having a session with general ChatGPT 4o, I can bring in a custom GPT to answer a specific question by “@”-ing it by name and then go back to the general model after the specific prompt is answered

Conclusion

This project demonstrates how custom GPTs and Actions can assist infrastructure engineers with highly specific, accuracy-focused workflows, like generating Azure CLI commands. It’s worth noting that the use case of generating Azure CLI commands from human-readable prompts can be achieved with a lot of different solutions; this article is intended to demonstrate the Actions feature of custom GPTs and how to build and host an API on Azure, rather than a firmly opinionated statement that this tool is the ultimate solution. For instance, I’ve found that OpenAI’s new web search tool arguably does a better job of pulling in relevant Azure documentation to serve as context than uploading a large set of documents that are retrieved using vector search. As these tools continue to mature and evolve, my hunch is that using Actions as a form of RAG to retrieve information will skew more towards use cases that depend on internal information and systems, or user-specific information behind an authenticated service. In the infrastructure space this could mean pulling in company-specific guidance for private modules, or even pulling in help command info for private tools and APIs.

Thank you for taking the time to read! I’d love to hear your thoughts, feedback, or any of your own GenAI use cases in the cloud platforms space — feel free to share your insights or ask questions in the comments or reach out directly.

The views expressed in this post are my own and do not necessarily reflect the official policies, positions, or views of the global EY organization or its member firms.
