Cluster Requirements
To install OpsVerse instances in your own cloud using the self-hosted model, you need a Kubernetes (K8s) cluster in that cloud.
Most customers run a managed Kubernetes service (e.g., EKS on AWS, AKS on Azure, or GKE on GCP), and that's what the examples below show. However, any Kubernetes cluster will work as long as it meets the requirements below.
This page lists the requirements and shows examples. The following requirements were validated against ObserveNow:
- Kubernetes version: v1.25.x <= version <= v1.30.x
- Minimum of 3 worker nodes (with at least 2 vCPUs and 8 GB RAM each)
- VPC configured with a CIDR block whose prefix length is /21 or shorter (for example, /21 or /20) to ensure there are at least 2048 IPs available for the cluster
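As a rough illustration of the CIDR requirement, a /21 block contains 2^(32-21) = 2048 addresses. The Terraform sketch below is not taken from the OpsVerse examples; the variable name `vpc_cidr`, its default value, and the output name are assumptions. It rejects blocks with a prefix length longer than /21 and reports how many addresses the chosen block contains.

```hcl
# Hypothetical input: the CIDR block planned for the cluster's VPC.
variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/21"

  validation {
    # cidrhost() fails on malformed input, so can() doubles as a syntax check.
    condition     = can(cidrhost(var.vpc_cidr, 0)) && tonumber(split("/", var.vpc_cidr)[1]) <= 21
    error_message = "The VPC CIDR must be a valid IPv4 block with a prefix length of /21 or shorter."
  }
}

# Total addresses in the block: 2^(32 - prefix length), i.e. 2048 for a /21.
output "vpc_address_count" {
  value = pow(2, 32 - tonumber(split("/", var.vpc_cidr)[1]))
}
```

A shorter prefix such as /20 passes the check and doubles the address count, while /22 is rejected at plan time.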
In general, a cluster that runs any of the apps offered by OpsVerse needs the following, irrespective of the provider:
- Networking and security resources: Network resources (a VPC in AWS and GCP or a VNet in Azure, plus related resources such as subnets, gateways, route tables, and a certificate manager) are needed for a secure, well-connected infrastructure that runs all the apps smoothly.
- Kubernetes cluster (for example, EKS on AWS): Specify the cluster and its configuration, such as name, Kubernetes version, node group (name, OS, type of OS), and network and security configs.
- Object storage buckets: Object storage is a storage service offered by cloud providers (S3 in AWS, Google Cloud Storage (GCS) in GCP, and Azure Blob Storage in Azure). It is designed to store and retrieve vast amounts of unstructured data in the form of objects. OpsVerse's ObserveNow requires object storage to function properly.
- IAM resources: IAM (Identity and Access Management) resources manage secure access to the services and resources offered by the cloud providers. They let admins control who can access the cloud infrastructure and what actions a user or automated bot account can perform. They are needed because OpsVerse's ObserveNow frequently talks to object storage buckets.
- Access to object storage: Configure IAM so that the pods running in the cluster can access the object storage. This is crucial for OpsVerse's ObserveNow, which relies on object storage for log storage and retrieval and for backup operations. The approach depends on the provider:
  - AWS: Create an IAM role and a policy
  - GCP: Create an IAM service account that binds to Workload Identity
  - Azure: Create a storage account key via the Azure Portal to access the storage container
To create an Amazon EKS (Elastic Kubernetes Service) cluster using Terraform, follow these steps:
Set up the provider: Providers are a logical abstraction of an upstream API. They are responsible for understanding API interactions and exposing resources. Configure the appropriate provider.
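A minimal sketch of a provider setup is shown below for orientation. The version constraints, the `region` variable, and its default value are assumptions rather than values required by OpsVerse; adjust them to your environment.

```hcl
terraform {
  required_version = ">= 1.3"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; pin to what your organization uses
    }
  }
}

# Assumed region; use the region where the cluster should run.
variable "region" {
  type    = string
  default = "us-east-1"
}

# Credentials are picked up from the environment or the shared AWS config.
provider "aws" {
  region = var.region
}
```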
Define the network and security resources: Specify the networking and security resources that the cluster will use.
- Network configs: Define the configs to create VPC, Subnets, IGW, NAT, Route tables, and other appropriate resources.
- Security configs: Define the configs to create IAM and certificate manager.
The Terraform code that creates these resources, along with the other snippets used in these examples, can be found here: https://github.com/opsverseio/private-saas
Object storage bucket (S3): This step creates an S3 bucket where ObserveNow stores logs and backups. The Terraform configuration for this resource is in the repository linked above.
NOTE: It is recommended to create the S3 bucket in the same region as the cluster.
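The following is a minimal sketch of such a bucket, not the configuration from the OpsVerse repository. The `bucket_name` variable, the resource labels, and the public-access block are assumptions added for illustration. The bucket is created in the region of the AWS provider that is in use, so sharing the cluster's provider configuration keeps both in the same region.

```hcl
# Hypothetical input: S3 bucket names must be globally unique.
variable "bucket_name" {
  type        = string
  description = "Name of the bucket that stores ObserveNow logs and backups"
}

resource "aws_s3_bucket" "observenow" {
  bucket = var.bucket_name
}

# Assumed hardening step: keep the bucket private.
resource "aws_s3_bucket_public_access_block" "observenow" {
  bucket                  = aws_s3_bucket.observenow.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# The bucket name is one of the details to share with OpsVerse later.
output "s3_bucket_name" {
  value = aws_s3_bucket.observenow.bucket
}
```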
IAM (Identity and Access Management):
- Role creation: This step creates a role for the ObserveNow instance, specifically so that the Loki pods can access the S3 bucket to store and retrieve logs.
- Configure IAM policy for the role/Object Storage Access: This step defines an IAM policy that allows the pods in the cluster to access the S3 bucket to store and retrieve logs and backup files.
The Terraform configuration for the role and a sample policy to attach to it are in the repository linked above.
NOTE: When you create your EKS cluster with Terraform, set enable_irsa = true to make sure an IAM OpenID Connect (OIDC) provider is created for your EKS cluster.
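To make the role and policy steps concrete, here is a hedged sketch of an IAM role that pods can assume through the cluster's OIDC provider, with a policy scoped to one bucket. Every variable, the resource names, and the list of S3 actions are assumptions for illustration; they are not the policy from the OpsVerse repository, so confirm the required actions and the service account details with OpsVerse.

```hcl
variable "bucket_name" {
  type = string
}

variable "oidc_provider_arn" {
  type = string
}

# OIDC issuer of the cluster, written without the "https://" prefix.
variable "oidc_provider_url" {
  type = string
}

# Hypothetical: namespace and service account used by the Loki pods.
variable "namespace" {
  type = string
}

variable "service_account" {
  type = string
}

# Trust policy: only the named service account may assume the role.
data "aws_iam_policy_document" "assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    condition {
      test     = "StringEquals"
      variable = "${var.oidc_provider_url}:sub"
      values   = ["system:serviceaccount:${var.namespace}:${var.service_account}"]
    }
  }
}

# Assumed set of actions for storing and retrieving objects in one bucket.
data "aws_iam_policy_document" "s3_access" {
  statement {
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::${var.bucket_name}"]
  }

  statement {
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["arn:aws:s3:::${var.bucket_name}/*"]
  }
}

resource "aws_iam_role" "loki" {
  name               = "observenow-loki-s3" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.assume.json
}

resource "aws_iam_policy" "loki_s3" {
  name   = "observenow-loki-s3" # hypothetical name
  policy = data.aws_iam_policy_document.s3_access.json
}

resource "aws_iam_role_policy_attachment" "loki_s3" {
  role       = aws_iam_role.loki.name
  policy_arn = aws_iam_policy.loki_s3.arn
}

# The role ARN is one of the details to share with OpsVerse later.
output "loki_role_arn" {
  value = aws_iam_role.loki.arn
}
```

Using policy documents as data sources instead of inline JSON lets Terraform validate the structure at plan time.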
EKS cluster creation: This step creates a new EKS cluster with 1 worker node pool. Cluster configs such as name, Kubernetes version, networking/security, and object storage buckets can be defined here. Also specify the EC2 instances that will act as worker nodes in the cluster.
After the cluster is created successfully, please send the following details to your OpsVerse point of contact (POC):
- S3 bucket name
- ARN details
This helps OpsVerse set up ObserveNow and gives you a smooth experience when creating OpsVerse apps.
There are 2 options when creating the cluster:
Option 1: Use an existing VPC and subnets for the cluster
If a VPC and subnets already exist in AWS, the same VPC and subnets can be used to create the cluster.
NOTE: The example snippet for this option (available in the repository linked above) is a generic working example that, assuming a VPC and subnets already exist, creates an EKS cluster with the following resources:
- EKS cluster with 1 worker node group of 3 nodes (4 vCPUs and 16 GB memory each)
- S3 bucket for Loki to store the logs and for the backups of VictoriaMetrics, ClickHouse, etc.
- IAM role to access the created S3 bucket.
- IAM policy that defines the scope of the IAM role.
Feel free to add more granular resources (IGW/NAT Gateways, route tables, ACM, etc.) as per your organization's security and networking standards.
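As a sketch of Option 1, the configuration below uses the community `terraform-aws-modules/eks/aws` module with an existing VPC. This is an assumption about tooling, not necessarily what the OpsVerse repository uses. The cluster name, Kubernetes version, instance type, and the `vpc_id` and `subnet_ids` variables are illustrative choices.

```hcl
# IDs of the VPC and subnets that already exist in AWS.
variable "vpc_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0" # assumed module version

  cluster_name    = "opsverse-private-saas" # hypothetical name
  cluster_version = "1.29"                  # any version in the validated range

  vpc_id     = var.vpc_id
  subnet_ids = var.subnet_ids

  # Creates the IAM OIDC provider needed for pod-level IAM roles.
  enable_irsa = true

  # Grants the identity running Terraform admin access to the cluster.
  enable_cluster_creator_admin_permissions = true

  # One node group with 3 nodes; m5.xlarge has 4 vCPUs and 16 GiB of memory.
  eks_managed_node_groups = {
    default = {
      instance_types = ["m5.xlarge"]
      min_size       = 3
      max_size       = 3
      desired_size   = 3
    }
  }
}

output "oidc_provider_arn" {
  value = module.eks.oidc_provider_arn
}
```

Keeping the node group at a fixed size of 3 matches the minimum worker count listed in the requirements; raise `max_size` if you want room to scale.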
Option 2: Create a new VPC and subnets, then create the cluster
If a VPC and subnets don't exist in AWS yet, they have to be created from scratch along with the cluster.
NOTE: The example snippet for this option is a generic working example that creates an EKS cluster with the following resources:
- A VPC spanning at least 2 availability zones
- Multiple subnets per availability zone (at least 1 public subnet and 'n' private subnets)
- EKS cluster with 1 worker node group of 3 nodes (4 vCPUs and 16 GB memory each)
- S3 bucket for Loki to store the logs and for the backups of VictoriaMetrics, ClickHouse, etc.
- IAM role to access the created S3 bucket.
- IAM policy that defines the scope of the IAM role.
Feel free to add more granular resources (IGW/NAT Gateways, route tables, etc.) as per your organization's security and networking standards.
Please refer to this working example for more details: https://github.com/opsverseio/private-saas
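For the networking part of Option 2, one possible sketch uses the community `terraform-aws-modules/vpc/aws` module. This is an assumption about tooling; the linked repository may structure it differently. The VPC name, the 10.0.0.0/21 block, and the subnet ranges are illustrative values only.

```hcl
data "aws_availability_zones" "available" {
  state = "available"
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # assumed module version

  name = "opsverse-private-saas" # hypothetical name
  cidr = "10.0.0.0/21"           # 2048 addresses

  # Spread the subnets over the first two availability zones.
  azs             = slice(data.aws_availability_zones.available.names, 0, 2)
  public_subnets  = ["10.0.0.0/24", "10.0.1.0/24"]
  private_subnets = ["10.0.2.0/23", "10.0.4.0/23"]

  # A single NAT gateway lets private subnets reach the internet.
  enable_nat_gateway = true
  single_nat_gateway = true
}

output "vpc_id" {
  value = module.vpc.vpc_id
}

output "private_subnet_ids" {
  value = module.vpc.private_subnets
}
```

The `vpc_id` and `private_subnet_ids` outputs can then be passed to whichever cluster configuration you use. A single NAT gateway keeps cost down but is a single point of failure, so production setups often use one per availability zone.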
Please work with your customer success rep to get PrivateSaaS enabled on your GCP account.
Please work with your customer success rep to get PrivateSaaS enabled on your Azure account.