Azure Private Installation Instructions

Overview

This document describes how to install the Yellowbrick Data Warehouse in a private Azure Kubernetes Service (AKS) cluster. Because enterprise environments have differing requirements, this guide follows a bring-your-own-Kubernetes approach: you create and configure the infrastructure to fit your needs, and the installation then customizes it to meet the requirements of the Yellowbrick product. The reference architecture described here is a baseline that you can adapt to your enterprise’s existing infrastructure and policies.

NOTE

A private installation requires advanced knowledge of cloud infrastructure and networking. Please contact your internal IT and operations teams as needed, or reach out to Yellowbrick Support for assistance.

Understanding a Private AKS Cluster

A private Azure Kubernetes Service (AKS) cluster is an AKS configuration where the API server endpoint is accessible only within a private network, rather than over the public internet. This enhances security by ensuring that the cluster is isolated from public exposure. The key characteristics of a private AKS cluster include:

  • Private Nodes: Nodes are deployed in a Virtual Machine Scale Set (VMSS) without public IP addresses, so they cannot be reached from outside the virtual network.
  • Private Endpoint for AKS API: The AKS API server is accessed via a private endpoint within a virtual network, ensuring that only resources within that network can communicate with it.
  • Private Access to Azure Container Registry (ACR): The AKS cluster pulls images from ACR using a private endpoint, preventing public access.
  • Customizable Egress Configuration: Users can define specific outbound connectivity through a user-defined outbound type, typically managed through Azure Firewall or Network Security Groups (NSGs).

For more details, refer to Microsoft's documentation on private AKS clusters.

Infrastructure Preparation

Creating the Base Infrastructure

Before beginning the installation process, you must create the foundational infrastructure within Azure. Below is a suggested example configuration, but you may customize it according to your organization's needs:

  1. Resource Group:

    • Create a dedicated resource group to logically contain all resources related to the Yellowbrick deployment.
  2. Networking:

    • Virtual Network: Create a /16 virtual network to provide sufficient IP space.
    • Subnets:
      • /22 Subnet for Yellowbrick Deployment: Ensure this subnet includes a service endpoint for Microsoft.Storage.
      • /26 Subnet for Azure Firewall: Dedicated for managing network security.
      • /26 Subnet for Azure Firewall Management: Separate management plane.
  3. Azure Firewall:

    • Deploy an Azure Firewall instance with Layer 7 (application-level) rules to control outbound traffic. Ensure the firewall allows traffic to the required egress endpoints on ports 80 and 443.
  4. Route Table:

    • Associate a route table with the Yellowbrick subnet that directs all outbound traffic through the Azure Firewall.
  5. Azure Container Registry (ACR):

    • Create an ACR instance with a private endpoint. The naming convention should follow the format ${instance_name}${SHA1(resource_group_full_id)} for consistency across deployments.
  6. Private DNS Zones:

    • Create private DNS zones to resolve the private endpoints for both ACR and AKS within your virtual network. Refer to the Azure private cluster guide for more details on AKS private DNS zones.
  7. Tagging:

    • Apply the additional tag cluster_yellowbrick_io_owner = yb-install to all manually created infrastructure. This allows the Deployer to identify the correct components during deployment.
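
The address plan, registry name, and ownership tag described above can be sketched in Python before any resources are created. This is a minimal sketch, not the Deployer's implementation: the function names are hypothetical, the placement of the subnets inside the /16 is an arbitrary choice, and it assumes that SHA1 in the naming convention means the lowercase hex digest of the resource group ID exactly as Azure reports it. Confirm the expected encoding with Yellowbrick before creating the registry.

```python
import hashlib
import ipaddress

# Tag from this guide that lets the Deployer find manually created resources.
OWNER_TAG = {"cluster_yellowbrick_io_owner": "yb-install"}


def plan_subnets(vnet_cidr):
    """Carve one /22 and two /26 subnets out of a /16 virtual network."""
    vnet = ipaddress.ip_network(vnet_cidr)
    if vnet.prefixlen != 16:
        raise ValueError("this sketch expects a /16 virtual network")
    # The first /22 in the range is used for the Yellowbrick deployment.
    yellowbrick = next(vnet.subnets(new_prefix=22))
    # Take the first two /26 blocks that do not overlap the /22.
    free = (s for s in vnet.subnets(new_prefix=26) if not s.overlaps(yellowbrick))
    return {
        "yellowbrick": yellowbrick,
        "firewall": next(free),
        "firewall_management": next(free),
    }


def acr_name(instance_name, resource_group_id):
    """Build ${instance_name}${SHA1(resource_group_full_id)} (hex digest assumed)."""
    digest = hashlib.sha1(resource_group_id.encode("utf-8")).hexdigest()
    name = f"{instance_name}{digest}"
    # Azure Container Registry names must be 5-50 alphanumeric characters.
    if not (name.isascii() and name.isalnum() and 5 <= len(name) <= 50):
        raise ValueError(f"invalid registry name: {name!r}")
    return name
```

For a hypothetical address space of 10.0.0.0/16, `plan_subnets` returns 10.0.0.0/22 for Yellowbrick, 10.0.4.0/26 for the firewall, and 10.0.4.64/26 for firewall management. Because a SHA1 hex digest is 40 characters long, the instance name can be at most 10 characters under the hex-digest assumption.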

Firewall Egress Rules

To ensure proper communication between the private AKS cluster and Azure services, specific egress rules must be configured. These rules allow the cluster to access required Azure resources without exposing the cluster to the public internet.

The following endpoints must be allowed through your firewall:

  • *.data.mcr.microsoft.com - Used for pulling container images.
  • *.hcp.[azure_location].azmk8s.io - For AKS management.
  • acs-mirror.azureedge.net - Container image distribution.
  • login.microsoftonline.com - Authentication services.
  • management.azure.com - Azure Resource Manager.
  • mcr-0001.mcr-msedge.net and mcr.microsoft.com - Additional image repositories.

Refer to the official outbound rules guide for further details.
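
The endpoint list above can be kept as data and checked against the rules you actually configured. The following is a minimal sketch with hypothetical helper names; it approximates wildcard matching with Python's `fnmatchcase`, which may not behave identically to your firewall's FQDN matching, so treat it as a review aid rather than a substitute for testing the firewall itself.

```python
from fnmatch import fnmatchcase

# FQDN patterns from this guide; [azure_location] is turned into a parameter.
REQUIRED_FQDN_TEMPLATES = [
    "*.data.mcr.microsoft.com",
    "*.hcp.{azure_location}.azmk8s.io",
    "acs-mirror.azureedge.net",
    "login.microsoftonline.com",
    "management.azure.com",
    "mcr-0001.mcr-msedge.net",
    "mcr.microsoft.com",
]


def required_fqdns(azure_location):
    """Return the required patterns for one Azure location, e.g. 'eastus'."""
    return [t.format(azure_location=azure_location) for t in REQUIRED_FQDN_TEMPLATES]


def is_allowed(host, patterns):
    """Return True if host matches at least one allowed pattern."""
    host = host.lower().rstrip(".")
    return any(fnmatchcase(host, p) for p in patterns)


def missing_rules(configured, azure_location):
    """Return required patterns that are absent from the configured rules."""
    have = {c.lower() for c in configured}
    return [p for p in required_fqdns(azure_location) if p not in have]
```

Passing the target FQDNs exported from your firewall policy to `missing_rules` gives a quick list of entries that still need to be added for the chosen location.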

Deploying the Private AKS Cluster

When deploying the AKS cluster, ensure the following configurations:

  1. Cluster Configuration:

    • Kubernetes Version: Must be 1.30 or later.
    • SKU Tier: Standard.
    • Automatic Upgrade: Must be disabled.
    • Node Security Channel Type: Must be disabled.
    • Private Endpoint: Enable private endpoint and disable the public FQDN for the API server.
    • Identity Management: Enable local accounts, workload identity, and Azure RBAC.
    • Azure Policy: Enforce Azure Policy for resource governance.
    • Networking: Must use kubenet.
  2. System Node Pool:

    • Type: VirtualMachineScaleSets.
    • Autoscaling: Enabled.
    • VM Size: Standard_D4ds_v5.
    • Disk Size: 128 GB.
    • Operating System: Ubuntu.
    • Networking: No public IP addresses assigned to nodes.
    • Node Labels: Ensure the appropriate labels are applied:
      • cluster.yellowbrick.io/node_type: yb-operator
      • cluster.yellowbrick.io/hardware_type: Standard_D4ds_v5
  3. Managed Identity:

    • The AKS cluster should be configured with a user-provided managed identity, which requires the following roles:
      • Private DNS Zone Contributor: Scoped to the private DNS zones created earlier.
      • Network Contributor: For the virtual network and route table, allowing the cluster to manage the subnet and routes.
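
The cluster and system node pool requirements above lend themselves to a pre-flight check. The sketch below validates a plain dictionary of settings; every key name in it is hypothetical and chosen for readability, so you would need to map the keys to whatever your provisioning tool actually reports. It is an illustration of the checklist, not a Yellowbrick or Azure API.

```python
# Expected cluster-level settings, keyed by hypothetical names.
EXPECTED_CLUSTER = {
    "sku_tier": "Standard",
    "auto_upgrade": False,
    "node_security_channel": False,
    "private_endpoint": True,
    "public_fqdn": False,
    "local_accounts": True,
    "workload_identity": True,
    "azure_rbac": True,
    "azure_policy": True,
    "network_plugin": "kubenet",
}

# Expected system node pool settings, including the required node labels.
EXPECTED_POOL = {
    "type": "VirtualMachineScaleSets",
    "autoscaling": True,
    "vm_size": "Standard_D4ds_v5",
    "disk_gb": 128,
    "os": "Ubuntu",
    "public_ip": False,
    "labels": {
        "cluster.yellowbrick.io/node_type": "yb-operator",
        "cluster.yellowbrick.io/hardware_type": "Standard_D4ds_v5",
    },
}


def version_at_least(version, minimum=(1, 30)):
    """Compare major.minor numerically, so '1.9' is older than '1.30'."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum


def check_cluster(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not version_at_least(cfg["kubernetes_version"]):
        problems.append("kubernetes_version must be 1.30 or later")
    for key, want in EXPECTED_CLUSTER.items():
        if cfg.get(key) != want:
            problems.append(f"{key} should be {want!r}, found {cfg.get(key)!r}")
    pool = cfg.get("system_pool", {})
    for key, want in EXPECTED_POOL.items():
        if pool.get(key) != want:
            problems.append(f"system_pool.{key} should be {want!r}, found {pool.get(key)!r}")
    return problems
```

Versions are compared as integer pairs rather than strings because a string comparison would rank "1.9" above "1.30".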

Virtual Machine Deployment

Deploy a virtual machine within the Yellowbrick installation subnet:

  1. VM Source: Launch the VM using the Yellowbrick-provided image from the Azure Marketplace.
  2. Managed Identity: Assign the roles listed in the permissions.json file provided by Yellowbrick to the VM’s managed identity. Without these roles, the VM cannot manage the installation or interact with the required Azure resources.

Installation Process

NOTE

By installing Yellowbrick Enterprise Edition software into your Cloud Account, you agree to Yellowbrick’s Enterprise Edition EULA.

  1. Subscribe to the Yellowbrick Data Warehouse Enterprise Edition image in the Azure Marketplace.

  2. Create the base infrastructure as outlined in this deployment guide.

  3. Launch an Azure VM using the Yellowbrick image in the target VNet of the deployment. Assign the necessary roles to the VM’s managed identity, as described under Virtual Machine Deployment. Because this VM is not accessible from the internet, you may need to perform additional steps to ensure SSH and HTTPS access.

  4. Create an SSH connection to the VM as the ybdadmin user using the SSH key pair specified during the launch.

  5. The VM is configured to automatically start the interactive web UI for the deployment process. Retrieve the access key by executing /opt/ybd/get-access-key from the remote shell.

  6. From a web browser, access the VM over HTTPS on port 443 using the VM’s private IP address or DNS name. The connection will be encrypted with a self-signed certificate.

  7. Enter the access key from the previous step to proceed with the installation process.

  8. During the installation, specify that this is a private installation and provide the name of the AKS cluster previously created. The existence of the AKS cluster will be validated, and the network configuration will be shown. Verify that these values are correct.

  9. The Deployer will complete the deployment by configuring the cluster, assigning necessary Azure RBAC roles, creating additional node pools, and deploying the Yellowbrick Operator and related workloads.
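
Because the VM has no public address, it can help to confirm from your workstation or jump host that the SSH and HTTPS ports used in the steps above are reachable before you start. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes SSH listens on the default port 22, which this guide does not state explicitly.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable names.
        return False


def check_deployer_access(host):
    """Check SSH (port 22, assumed) and HTTPS (port 443) on the VM."""
    return {"ssh": port_open(host, 22), "https": port_open(host, 443)}
```

Calling `check_deployer_access` with the VM's private IP address or DNS name returns a dictionary showing which of the two ports accepted a connection. A `False` value usually points to a routing, NSG, or firewall rule that still needs to be adjusted.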

Terraform Reference

For a Terraform reference of this infrastructure, please see deployer-contrib.

Conclusion

By following this guide, you can establish a secure environment for the Yellowbrick Data Warehouse within Azure.