Update mcad kuberay example #617

Merged: 4 commits, Sep 6, 2023
doc/usage/examples/kuberay/config/aw-raycluster-1.yaml (153 additions, 0 deletions)
@@ -0,0 +1,153 @@
apiVersion: mcad.ibm.com/v1beta1
kind: AppWrapper
metadata:
  name: raycluster-complete-1
  namespace: default
spec:
  resources:
    GenericItems:
    - replicas: 1
      custompodresources: # Optional section that specifies resource requirements
                          # for non-standard k8s resources; it follows the same format
                          # as standard k8s resource requests and limits.
      - replicas: 2       # Because AppWrappers are generic, they must declare the pods that will be
                          # created to fulfill the request; the values cannot be reliably extracted
                          # from the generictemplate below.
        requests:
          cpu: 8
          memory: 512Mi
        limits:
          cpu: 10
          memory: 1G
      generictemplate:
        # The resource requests and limits in this config are too small for production!
        # For examples with more realistic resource configuration, see
        # ray-cluster.complete.large.yaml and
        # ray-cluster.autoscaler.large.yaml.
        apiVersion: ray.io/v1alpha1
        kind: RayCluster
        metadata:
          labels:
            controller-tools.k8s.io: "1.0"
          # A unique identifier for the head node and workers of this cluster.
          name: raycluster-complete-1
        spec:
          rayVersion: '2.5.0'
          # Ray head pod configuration
          headGroupSpec:
            # Kubernetes Service Type. This is an optional field, and the default value is ClusterIP.
            # Refer to https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types.
            serviceType: ClusterIP
            # The `rayStartParams` are used to configure the `ray start` command.
            # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
            # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
            rayStartParams:
              dashboard-host: '0.0.0.0'
            # pod template
            template:
              metadata:
                # Custom labels. NOTE: To avoid conflicts with the KubeRay operator, do not define custom labels that start with `raycluster`.
                # Refer to https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
                labels: {}
              spec:
                containers:
                - name: ray-head
                  image: rayproject/ray:2.5.0
                  ports:
                  - containerPort: 6379
                    name: gcs
                  - containerPort: 8265
                    name: dashboard
                  - containerPort: 10001
                    name: client
                  lifecycle:
                    preStop:
                      exec:
                        command: ["/bin/sh","-c","ray stop"]
                  volumeMounts:
                  - mountPath: /tmp/ray
                    name: ray-logs
                  # The resource requests and limits in this config are too small for production!
                  # For an example with more realistic resource configuration, see
                  # ray-cluster.autoscaler.large.yaml.
                  # It is better to use a few large Ray pods than many small ones.
                  # For production, it is ideal to size each Ray pod to take up the
                  # entire Kubernetes node on which it is scheduled.
                  resources:
                    limits:
                      cpu: "1"
                      memory: "2G"
                    requests:
                      # For production use-cases, we recommend specifying integer CPU requests and limits.
                      # We also recommend setting requests equal to limits for both CPU and memory.
                      # For this example, we use a 500m CPU request to accommodate resource-constrained local
                      # Kubernetes testing environments such as KinD and minikube.
                      cpu: "500m"
                      memory: "2G"
                volumes:
                - name: ray-logs
                  emptyDir: {}
          workerGroupSpecs:
          # the number of pod replicas in this worker group
          - replicas: 1
            minReplicas: 1
            maxReplicas: 10
            # logical group name; here it is called small-group, but it can be any name that describes the group's function
            groupName: small-group
            # If worker pods need to be added, increment the replicas.
            # If worker pods need to be removed, decrement the replicas and populate the workersToDelete list.
            # The operator will remove pods from the list until the desired number of replicas is satisfied.
            # If the difference between the current replica count and the desired replicas is greater than the
            # number of entries in workersToDelete, random worker pods will be deleted.
            #scaleStrategy:
            #  workersToDelete:
            #  - raycluster-complete-worker-small-group-bdtwh
            #  - raycluster-complete-worker-small-group-hv457
            #  - raycluster-complete-worker-small-group-k8tj7
            # The `rayStartParams` are used to configure the `ray start` command.
            # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
            # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
            rayStartParams: {}
            # pod template
            template:
              spec:
                containers:
                - name: ray-worker
                  image: rayproject/ray:2.5.0
                  lifecycle:
                    preStop:
                      exec:
                        command: ["/bin/sh","-c","ray stop"]
                  # volumeMounts are optional.
                  # Refer to https://kubernetes.io/docs/concepts/storage/volumes/
                  volumeMounts:
                  - mountPath: /tmp/ray
                    name: ray-logs
                  # The resource requests and limits in this config are too small for production!
                  # For an example with more realistic resource configuration, see
                  # ray-cluster.autoscaler.large.yaml.
                  # It is better to use a few large Ray pods than many small ones.
                  # For production, it is ideal to size each Ray pod to take up the
                  # entire Kubernetes node on which it is scheduled.
                  resources:
                    limits:
                      cpu: "1"
                      memory: "1G"
                    requests:
                      # For production use-cases, we recommend specifying integer CPU requests and limits.
                      # We also recommend setting requests equal to limits for both CPU and memory.
                      # For this example, we use a 500m CPU request to accommodate resource-constrained local
                      # Kubernetes testing environments such as KinD and minikube.
                      cpu: "500m"
                      # For production use-cases, we recommend allocating at least 8GB of memory for each Ray container.
                      memory: "1G"
                # volumes are optional.
                # Refer to https://kubernetes.io/docs/concepts/storage/volumes/
                volumes:
                - name: ray-logs
                  emptyDir: {}
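
A minimal usage sketch for this example (assumptions: the MCAD controller and the KubeRay operator are already installed in the cluster, and the head service follows KubeRay's default <cluster-name>-head-svc naming; adjust names and namespace as needed):

# Submit the AppWrapper; MCAD queues it until the requested resources can be dispatched.
kubectl apply -f doc/usage/examples/kuberay/config/aw-raycluster-1.yaml

# Inspect the AppWrapper, the RayCluster it wraps, and the head/worker pods.
kubectl get appwrappers -n default
kubectl get rayclusters,pods -n default

# Forward the Ray dashboard (port 8265 on the head container) to localhost.
kubectl port-forward service/raycluster-complete-1-head-svc 8265:8265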
