Cloudera on Azure DNS deep dive

Dongkai Yu

DNS setup best practice on Azure.

In Cloudera deployments on cloud, one of the key configuration elements is the DNS. Get it wrong and your deployment may become wholly unusable, with users unable to access the Cloudera data services. If the DNS setup is less than ideal, connectivity and performance issues may arise. In this blog, we’ll take you through our tried and tested best practices for setting up your DNS for use with Cloudera on Azure.

To give you a feel for the DNS dependencies, these are the Azure managed services used in a Cloudera deployment on Azure, together with the Cloudera services that rely on them:

  • AKS cluster: Data Warehouse, Data Engineering, Artificial Intelligence, and DataFlow

  • Azure Database for MySQL: Data Engineering

  • Storage account: all services

  • Azure Database for PostgreSQL: data lake, Data Hub clusters, Data Warehouse, Artificial Intelligence, and DataFlow

  • Key Vault: all services

Typical customer governance restrictions and the impact

Most Azure users run private networks with a firewall for egress control, and most restrict wildcard rules on that firewall. Because Cloudera resources are created on the fly, their endpoints typically have to be allowed through wildcard rules, which the security team may decline.

Most Azure users use hub-spoke network topology. DNS servers are usually deployed in the hub virtual network or an on-prem data center instead of in the Cloudera VNET. That means if DNS is not configured correctly, the deployment will fail.

Most Cloudera customers deploying on Azure allow the use of service endpoints; a smaller set of organizations does not. A service endpoint is the simpler way to let resources on a private network access managed services on Azure. If service endpoints are not allowed, the firewall and private endpoints are the other two options. Most cloud users do not like opening firewall rules because that introduces the risk of exposing private data on the internet. That leaves private endpoints as the only option, which in turn requires additional DNS configuration for the private endpoints.

Connectivity options from private network to Azure managed services

Going through Firewall to Internet

Traffic is routed from the private VNET to the firewall, and from there directly to the Azure managed service endpoint on the internet.

Going through Service endpoint

Azure provides service endpoints for resources on private networks to access the managed services on the internet without going through the firewall. That can be configured at a subnet level. Since Cloudera resources are deployed in different subnets, this configuration must be enabled on all subnets.
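Because the setting is per subnet, it is easy to miss one. The following is a minimal sketch of a coverage check, not part of any Cloudera or Azure tooling. The subnet names and the inventory format are hypothetical, and the service endpoint identifiers are assumptions that should be verified against the Microsoft documentation.

```python
# Service endpoint identifiers for storage accounts, PostgreSQL, and Key Vault.
# These strings are assumptions; confirm them in the Azure documentation.
REQUIRED_ENDPOINTS = {"Microsoft.Storage", "Microsoft.Sql", "Microsoft.KeyVault"}


def missing_service_endpoints(subnets):
    """Return {subnet_name: sorted missing endpoints} for incomplete subnets."""
    gaps = {}
    for name, enabled in subnets.items():
        missing = REQUIRED_ENDPOINTS - set(enabled)
        if missing:
            gaps[name] = sorted(missing)
    return gaps


# Hypothetical subnet inventory, for example exported from infrastructure code.
subnets = {
    "cloudera-datalake": ["Microsoft.Storage", "Microsoft.Sql", "Microsoft.KeyVault"],
    "cloudera-aks": ["Microsoft.Storage"],
}

print(missing_service_endpoints(subnets))
```

Any subnet that appears in the result still needs service endpoints enabled before Cloudera resources are placed in it.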

 


The DNS records of managed services that use service endpoints are on the internet and managed by Microsoft. The IP address of the service is a public IP that is routable from the subnet. Please refer to the Microsoft documentation for details.

Not all managed services support service endpoints. In a Cloudera deployment scenario, only storage accounts, PostgreSQL DB, and Key Vault support them.

Fortunately, most users allow service endpoints. If a customer doesn’t allow service endpoints, they have to use private endpoints, which are configured as described in the following sections.

Going through Private Endpoint

A private endpoint creates a network interface with a private IP address, and a private link service is associated with that network interface, so that other resources in the private network can reach the service through the private IP address.


The key here is for the private resources to resolve the service FQDN to that private IP address. There are two options for storing the DNS record:

  • Azure managed public DNS zones will always be there, but they store different types of IP addresses for the private endpoint. For example: 

    • Storage account private endpoint—the public DNS zone stores the public IP address of that service.

    • AKS API server private endpoint—the public DNS zone stores the private IP of that service.

  • Azure Private DNS zone: the DNS records will be synchronized to the Azure Default DNS of the linked VNET.


Private endpoints are available for all Azure managed services that are used in Cloudera deployments.

As a consequence, for storage accounts, users use either service endpoints or private endpoints. Because the public DNS zone always returns a public IP for a storage account, the private DNS zone becomes mandatory whenever a private endpoint is used.
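A quick way to see which kind of address a resolver returns is to classify the answers as private or public. The sketch below uses only the Python standard library. The `resolve` helper needs network access and is not called in the example, and the sample addresses are made up for illustration.

```python
import ipaddress
import socket


def classify(addresses):
    """Map each IP string to 'private' or 'public'."""
    return {
        a: "private" if ipaddress.ip_address(a).is_private else "public"
        for a in addresses
    }


def resolve(fqdn):
    """Resolve an FQDN with the system resolver (requires network access)."""
    infos = socket.getaddrinfo(fqdn, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Literal sample addresses so the example runs offline.
print(classify(["10.1.2.4", "20.60.1.1"]))
```

Running `classify(resolve(fqdn))` from a VM in the Cloudera VNET, with `fqdn` set to the storage account's FQDN, would show whether lookups return the private endpoint address or a public one.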

For AKS, these two DNS alternatives are both suitable. The challenges of private DNS zones will be discussed next.

 

Challenges of private DNS zone on Azure private network

Important Assumptions

As mentioned above, in the typical scenario most Azure users use a hub-and-spoke network architecture and deploy custom private DNS on the hub VNET.

The DNS records of a private DNS zone are synchronized only to the Azure default DNS of the linked VNET.

Simple Architecture Use Cases

One VNET scenario with private DNS zone:

When a private endpoint is created, Cloudera on Azure registers the private endpoint in the private DNS zone. The DNS record is then synchronized to the Azure Default DNS of the linked VNET.

If users run a custom private DNS, they can configure conditional forwarding to the Azure Default DNS for the domain suffix of the FQDN.

Hub-and-spoke VNET with Azure default DNS:

With Azure default DNS, this setup still works. The only limitation is that resources on an unlinked VNET cannot access AKS. Because AKS is used by Cloudera resources on the same VNET, this does not pose any major issues.

The Challenging Part

The most popular network architecture among Azure consumers is a hub-spoke network with custom private DNS servers deployed either on the hub VNET or on the on-premises network.

 


Since DNS records are not synchronized to the Azure Default DNS of the hub VNET, the custom private DNS server cannot find the DNS record for the private endpoint. Because the Cloudera VNET uses the custom private DNS server on the hub VNET, Cloudera resources send their lookups for the private endpoint FQDN to that server. The lookups fail, and so does provisioning.

Create DNS Server as a Forwarder.

With the DNS server deployed in the on-premises network, there is no Azure default DNS associated with that network, so the DNS server cannot find the DNS record for the FQDN of the private endpoint.
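The two failure cases can be captured in a deliberately simplified toy model. It is not how Azure DNS is implemented. It only encodes the rule described here: a custom DNS server can resolve a private endpoint if it forwards to the Azure default DNS of a VNET that is linked to the private DNS zone. The network names are hypothetical.

```python
def can_resolve_private_endpoint(dns_server_network, linked_vnets,
                                 forwards_to_azure_dns=True):
    """Toy model of private endpoint resolution through a custom DNS server."""
    if dns_server_network == "on-prem":
        # No Azure default DNS is associated with an on-premises network.
        return False
    # The server's own VNET must be linked to the private DNS zone.
    return forwards_to_azure_dns and dns_server_network in linked_vnets


# Zone linked only to the Cloudera VNET, DNS server on the hub VNET.
print(can_resolve_private_endpoint("hub-vnet", {"cloudera-vnet"}))
# Zone also linked to the hub VNET.
print(can_resolve_private_endpoint("hub-vnet", {"cloudera-vnet", "hub-vnet"}))
# DNS server on-premises, without an additional forwarder inside Azure.
print(can_resolve_private_endpoint("on-prem", {"cloudera-vnet"}))
```

Only the second case resolves, which is why the configuration steps later in this article focus on VNET links and forwarders.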

Azure resources that need private DNS support

Different Azure managed services have different DNS attributes, and the DNS configuration differs depending on the use case.

Storage account

Azure Storage Account supports both service endpoints and private endpoints. In a Cloudera deployment, all consumers of the storage account are on Azure in the same region, which makes service endpoints sufficient for the storage account.

When an on-premises workload is involved, the picture changes. Because the on-premises network doesn’t support service endpoints, the public IP returned by a DNS lookup of the storage account FQDN will inevitably send the traffic over the internet. In this scenario, a private endpoint for the storage account is required. Fortunately, we don’t need to consider this when creating Cloudera on Azure services; it only matters when replicating data from on-premises storage to the storage account.

Architecture decision Best Practice

  • Use Azure Storage service endpoint for all Azure subnets.

  • Create a private endpoint for Azure storage if there is on-premises data to be loaded to Azure Storage Account.

  • Do not use the private endpoint FQDN in the Cloudera on Azure configuration, regardless of whether a private endpoint exists.

Postgres DB

Cloudera on Azure supports three types of deployment for Azure Postgres DB: Single Server (to be deprecated), Flexible Server with delegated subnet, and Flexible Server with private link.

Postgres Single server

Azure Postgres Single Server will be deprecated. We don’t recommend using the single server option. For users who have to use a single server, Azure service endpoint is good enough.

Postgres Flexible with Delegated subnet

With the delegated subnet option for Postgres, a dedicated subnet is required. A /27 CIDR is recommended for the subnet, and a /28 is the minimum. A private DNS zone with any domain name is required. 
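The sizing rule can be checked before the subnet is requested from the network team. This is a small sketch using Python's standard `ipaddress` module. The function name and return labels are illustrative only.

```python
import ipaddress


def check_delegated_subnet(cidr):
    """Classify a subnet CIDR against the /28 minimum and /27 recommendation."""
    prefix = ipaddress.ip_network(cidr, strict=True).prefixlen
    if prefix > 28:
        return "too small"
    if prefix > 27:
        return "minimum"
    return "recommended"


for cidr in ("10.0.1.0/27", "10.0.1.0/28", "10.0.1.0/29"):
    print(cidr, check_delegated_subnet(cidr))
```

With `strict=True`, a CIDR whose host bits are set (for example `10.0.1.5/27`) raises a `ValueError`, which catches a common typo early.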

Postgres Flexible server with privatelink

With the privatelink option, a private DNS zone with name “privatelink.postgres.database.azure.com” is required. 

DNS resolution for Postgres DB

Azure stores a CNAME record for Postgres DB in the Azure public DNS zone. The CNAME record points to an A record in the private DNS zone.
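The two-step lookup can be illustrated with a toy resolver. The server name `mydb` and the IP address are hypothetical, and the dictionaries only stand in for the two zones. A real resolver that cannot see the private zone may return a different answer rather than nothing at all.

```python
PUBLIC_ZONE = {
    # CNAME held in the Azure public DNS zone (hypothetical server name).
    "mydb.postgres.database.azure.com":
        ("CNAME", "mydb.privatelink.postgres.database.azure.com"),
}
PRIVATE_ZONE = {
    # A record held in the private DNS zone (hypothetical private IP).
    "mydb.privatelink.postgres.database.azure.com": ("A", "10.10.0.5"),
}


def resolve(name, private_zone_visible, max_hops=5):
    """Follow CNAMEs until an A record is found; None if the chain breaks."""
    zones = [PUBLIC_ZONE] + ([PRIVATE_ZONE] if private_zone_visible else [])
    for _ in range(max_hops):
        record = next((zone[name] for zone in zones if name in zone), None)
        if record is None:
            return None
        record_type, value = record
        if record_type == "A":
            return value
        name = value  # follow the CNAME to its target
    return None


print(resolve("mydb.postgres.database.azure.com", private_zone_visible=True))
print(resolve("mydb.postgres.database.azure.com", private_zone_visible=False))
```

The second call fails at the second hop: the CNAME is found, but its target only exists in the private zone.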


Architecture decision Best Practice

Choose delegated subnet or privatelink. Private DNS zone for Postgres DB is mandatory.

Azure Key Vault

Azure Key Vault supports service endpoint and doesn’t need a private DNS zone.

Azure MySQL Flexible server

Azure MySQL Flexible server supports privatelink. A private DNS zone is required. Azure stores a CNAME record for Azure MySQL in Azure public DNS zone, and points that CNAME record to an A record in private DNS zone.

A private DNS zone for MySQL DB is therefore mandatory.

Azure Kubernetes Service

AKS API server supports privatelink. AKS stores an A record in the Azure Public DNS zone. Users can choose to use a private DNS zone, or not use a private DNS zone. When using a private DNS zone, another A record is created in the private DNS zone. These two A records are independent of each other. 

Since Azure stores one A record in the public DNS zone and another in the private DNS zone, users can choose to disable the private DNS zone for AKS if necessary.

Configuration best practices

Users do not need to do anything if the Cloudera VNET is configured to use the Azure Default DNS. The steps below cover the best practices and options when the VNET is configured with a custom private DNS.

Step 1: Collect Private DNS zone requirement

Since different Azure managed services have different DNS resolution options, the first step is to collect the private DNS zone requirements. Normally there are three types of private DNS zone to consider: the Postgres DB private DNS zone, the AKS private DNS zone, and the MySQL private DNS zone.
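One way to collect the requirements is to derive them from the list of Cloudera services you plan to run. The sketch below encodes the service-to-managed-service mapping from the start of this article. The short zone labels are illustrative, and the result is an upper bound: a zone may drop out if, for example, the AKS private DNS zone is disabled.

```python
# Mapping taken from the managed-service list at the top of this article.
ZONES_BY_SERVICE = {
    "Data Lake": {"postgres"},
    "Data Hub": {"postgres"},
    "Data Warehouse": {"postgres", "aks"},
    "Artificial Intelligence": {"postgres", "aks"},
    "DataFlow": {"postgres", "aks"},
    "Data Engineering": {"aks", "mysql"},
}


def required_private_dns_zones(services):
    """Return the sorted private DNS zone types needed for the given services."""
    zones = set()
    for service in services:
        zones |= ZONES_BY_SERVICE[service]
    return sorted(zones)


print(required_private_dns_zones(["Data Lake", "Data Engineering"]))
```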

Step 2: Pre-Create private DNS zones

Pre-create all required private DNS zones. DNS configuration is an important part of IT governance. Even though Cloudera can create these private DNS zones on your behalf, it’s better to pre-create them so that the DNS configuration stays under your own configuration management.

Step 3: Link VNET to private DNS zones and create conditional forward

Scenario 1: When using custom private DNS with the DNS server on the Cloudera VNET

  1. Link the Cloudera VNET to all the private DNS zones.

  2. On the custom private DNS server, create a conditional forwarder that forwards DNS requests for the Azure managed services to the Azure Default DNS server IP address (168.63.129.16).

Scenario 2: When using custom private DNS with the DNS server on the hub VNET

  1. Link the hub VNET to all the private DNS zones.

  2. On the custom private DNS server, create a conditional forwarder that forwards DNS requests for the Azure managed services to the Azure Default DNS server IP address (168.63.129.16).
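What a conditional forwarder looks like depends on the DNS server product. As one illustration, assuming the custom DNS server runs BIND 9, the helper below renders a forward-zone stanza for a given zone name. The choice of BIND is an assumption made for this sketch, and the generated text should be reviewed by whoever owns the DNS server.

```python
AZURE_DEFAULT_DNS = "168.63.129.16"  # Azure Default DNS address from the article


def forward_zone_stanza(zone, forwarder=AZURE_DEFAULT_DNS):
    """Render a BIND 9 forward-zone stanza (assumes a BIND-based DNS server)."""
    return (
        f'zone "{zone}" {{\n'
        f"    type forward;\n"
        f"    forward only;\n"
        f"    forwarders {{ {forwarder}; }};\n"
        f"}};\n"
    )


# Zone name required for Postgres Flexible Server with private link.
print(forward_zone_stanza("privatelink.postgres.database.azure.com"))
```

The `forwarder` parameter can be set to a different address when lookups must go to something other than the Azure Default DNS, as in the on-premises scenario.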

Scenario 3: When using custom private DNS with the DNS server on the on-premises network

  1. Link the Cloudera VNET to the private DNS zones.

  2. Create a delegated subnet for the Azure DNS resolver on the Cloudera VNET.

  3. Create the Azure DNS resolver and an inbound endpoint on that delegated subnet. A private IP will be associated with the inbound endpoint.

  4. On the custom private DNS server, create a conditional forwarder that forwards DNS requests for the Azure managed services to the private IP address of the Azure DNS resolver inbound endpoint.
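The three scenarios differ in only two respects: which VNET is linked to the private DNS zones, and where the conditional forwarder points. The sketch below summarizes that as a lookup. The location labels and the sample inbound endpoint IP are hypothetical.

```python
AZURE_DEFAULT_DNS = "168.63.129.16"


def dns_plan(dns_server_location, resolver_inbound_ip=None):
    """Return (VNET to link to the private DNS zones, conditional-forward target)."""
    if dns_server_location == "cloudera-vnet":
        return ("Cloudera VNET", AZURE_DEFAULT_DNS)
    if dns_server_location == "hub-vnet":
        return ("hub VNET", AZURE_DEFAULT_DNS)
    if dns_server_location == "on-premises":
        if resolver_inbound_ip is None:
            raise ValueError("on-premises DNS needs the resolver inbound endpoint IP")
        return ("Cloudera VNET", resolver_inbound_ip)
    raise ValueError(f"unknown DNS server location: {dns_server_location}")


print(dns_plan("hub-vnet"))
print(dns_plan("on-premises", resolver_inbound_ip="10.20.0.4"))
```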

Creating Cloudera Data Services

This section introduces the key DNS related configurations when creating Cloudera services.

Creating a Cloudera environment

  1. If DataFlow, Data Engineering, or Artificial Intelligence is required, the AKS private DNS zone ID for Liftie AKS clusters can only be configured at this step, so make sure this decision has been made.

  2. Cloudera supports both delegated subnet and privatelink for Azure PostgreSQL DB. Decide which one to use before this step. Data Warehouse doesn’t support privatelink for Azure PostgreSQL DB, so if a Data Warehouse is to be created, it’s better to use a delegated subnet.

  3. If an AKS private DNS zone ID is to be used, the configuration can only be specified with the Cloudera CLI.

  4. If an AKS private DNS zone ID is not to be used, the Cloudera UI can be used to create the environment.

  5. The network configuration for Azure PostgreSQL DB can be selected in the UI or with the Cloudera CLI.

Flexible Server with Delegated Subnet (deprecated)

When using a delegated subnet for Postgres DB, select Flexible Server with Delegated Subnet, select the delegated subnet in the subnet selection, and select the private DNS zone for Postgres DB.

Flexible Server with Private Link

When using a private link for Postgres DB, select Flexible Server with Private Link, and select the private DNS zone for Postgres DB.

NOTE: The private DNS zone name for a delegated subnet can be customized. The private DNS zone name for private link cannot be customized and has to be “privatelink.postgres.database.azure.com”.
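The naming rule in the note is easy to check before creating the environment. This is a minimal sketch; the mode labels `private_link` and `delegated_subnet` are illustrative names, not Cloudera CLI values.

```python
PRIVATELINK_ZONE = "privatelink.postgres.database.azure.com"


def valid_postgres_zone(mode, zone_name):
    """Check a Postgres private DNS zone name against the rule for the mode."""
    if mode == "private_link":
        # Private link requires this exact zone name.
        return zone_name == PRIVATELINK_ZONE
    if mode == "delegated_subnet":
        # Any non-empty zone name is acceptable for a delegated subnet.
        return bool(zone_name)
    raise ValueError(f"unknown mode: {mode}")


print(valid_postgres_zone("private_link", "privatelink.postgres.database.azure.com"))
print(valid_postgres_zone("private_link", "db.internal.example.com"))
print(valid_postgres_zone("delegated_subnet", "db.internal.example.com"))
```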

Creating Data Warehouse

As mentioned in the AKS private DNS zone features, Azure stores two A records for AKS API server. One in the public DNS zone, and another in the private DNS zone. They both point to the same private IP address. From AKS perspective, the external resources use the A record on the Public DNS zone, and the AKS internal resources use the A record on the Private DNS zone. 

Azure provides a way to disable the Private DNS zone for AKS API server, so that both AKS external and internal resources can use the A record on the public DNS zone to access the AKS API server. 

Cloudera Data Services can also leverage this feature to simplify the DNS forwarding process.

Use the Cloudera CLI to activate Cloudera Data Warehouse, and set ‘--private-dns-zone-aks’ to ‘None’ to disable the private DNS zone for AKS.

Creating DataFlow and Machine Learning

An Entitlement is needed to disable the private DNS zone for AKS: “LIFTIE_AKS_DISABLE_PRIVATE_DNS_ZONE”

Otherwise, the AKS private DNS zone configuration will be inherited from the environment setting.

Postgres DB private DNS zone configuration will be inherited from the environment setting.

Creating Data Engineering

The AKS private DNS zone configuration is the same as for DataFlow and Machine Learning.

Data Engineering uses Azure MySQL DB, so the MySQL private DNS zone has to be configured with the Cloudera CLI.

Summary

Bringing all things together, consider these best practices for setting up your DNS with Cloudera on Azure:

  • For the storage account, Key Vault, and Postgres DB
    • Use service endpoints as the first choice.
    • If service endpoints are not allowed, pre-create private DNS zones and link them to the VNET where the DNS server is deployed. Configure conditional forwarding from the custom private DNS to the Azure default DNS.
    • If the custom private DNS is deployed in the on-premises network, use the Azure DNS resolver or another DNS server as a DNS forwarder on the Cloudera VNET. Conditionally forward DNS lookups from the private DNS to the resolver endpoint.
  • For the Data Warehouse, DataFlow, or Machine Learning data services
    • Disable the private DNS zone and use the public DNS zone instead.
  • For the Data Engineering data service
    • Configure the Azure DNS resolver or another DNS server as a DNS forwarder on the Cloudera VNET. Conditionally forward DNS lookups from the private DNS to the resolver endpoint. Please refer to the Microsoft documentation for the details of setting up an Azure DNS Private Resolver.

For more background reading on network and DNS specifics for Azure, have a look at our documentation for the various data services: DataFlow, Data Engineering, Data Warehouse, and Machine Learning. We’re also happy to discuss your specific needs; in that case please reach out to your Cloudera account manager or get in touch.
