Install DataChat

Follow this guide to install DataChat in your cluster using Helm.

Pre-Requisites

Note: For all of the PostgreSQL databases below, DataChat recommends configuring the max_connections parameter to 500 or more.

DataChat recommends setting up the following:

  • A Kubernetes cluster. More details can be found below.
  • Three cloud PostgreSQL databases:
    • One for DataChat's management database, which tracks a number of internal objects, including users and sessions. This should have at least 4 vCPUs, 8GB of memory, and 10GB of storage.
    • One for DataChat's compute engine, which is used to run computations on data, most often for file-based datasets. This should have at least 4 vCPUs, 8GB of memory, and 15GB of storage.
    • One for DataChat's "pipeliner" service. This should have at least 2 vCPUs, 8GB of memory, and 10GB of storage.
  • An object storage bucket. This bucket is created with a service such as Google Cloud Storage or Amazon S3 and is used to store user data, such as files.
  • An IAM-style user. Create a user that can access the object storage bucket on behalf of DataChat.

If you do not want to use external services, DataChat comes with the ability to create in-cluster services instead.
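The max_connections recommendation in the note above can be checked from a shell before you install. The following is a minimal sketch, assuming the psql client is installed and can reach each database. The helper name check_max_connections and the connection URI are hypothetical and are not part of DataChat.

```shell
#!/usr/bin/env bash
# Sketch: warn when a PostgreSQL server's max_connections is below the
# recommended 500. Assumes psql is on PATH; the URI is a placeholder.

RECOMMENDED_MAX_CONNECTIONS=500

check_max_connections() {
  local uri="$1"
  local current
  # -t (tuples only) and -A (unaligned) make psql print just the value.
  current="$(psql "$uri" -tA -c 'SHOW max_connections;')" || return 2
  if [ "$current" -ge "$RECOMMENDED_MAX_CONNECTIONS" ]; then
    echo "ok: max_connections=$current"
  else
    echo "low: max_connections=$current (recommended >= $RECOMMENDED_MAX_CONNECTIONS)"
    return 1
  fi
}

# Example (hypothetical URI), repeated for each of the three databases:
# check_max_connections "postgresql://user:password@management-db.example.com:5432/postgres"
```

How max_connections is changed depends on your cloud provider, so the sketch only reports the current value.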

Configuring a Kubernetes Cluster

Before you can install DataChat, you must have a Kubernetes cluster to install it into. We recommend using Autopilot clusters from Google Kubernetes Engine (GKE) for the most hands-free maintenance. Autopilot clusters will automatically handle operations such as cluster scaling, Kubernetes version upgrades, security, and other settings. Regardless of which type of cluster you use, we have some recommendations that apply to them all:

  • Use "general purpose" compute resources.
  • Make sure your cluster has a load balancer, ingress, and ingress controller configured. DataChat recommends using the ingress-nginx controller.
  • Use "private" clusters that are only accessible through a load balancer and ingress.

The sections below cover cluster-configuration-specific requirements and recommendations.

Autopilot Configuration

For a detailed list of all the configuration options available for Autopilot clusters, refer to Google's documentation.

For a production-grade cluster, we recommend:

  • Using the general-purpose compute class. Autopilot clusters will automatically scale for you, but you can generally expect ~16 total nodes (using the e2-standard-8 instance class) to run a full, production-ready instance of DataChat.
  • Private clusters that are accessed only by a load balancer.
  • An ingress and load balancer to manage traffic.

Other Kubernetes Configurations

If you would prefer (or are required) to use a more traditional Kubernetes cluster, such as Standard GKE clusters or clusters built on top of Amazon's Elastic Kubernetes Service (EKS), we recommend:

  • Using general purpose compute instances, such as AWS's m5.4xlarge (16 vCPU, 64GB of memory) or GCP's e2-standard-8 (8 vCPU, 32GB of memory) or e2-standard-16 (16 vCPU, 64GB of memory).
  • Creating two node pools in different availability zones. We recommend starting with 10 nodes, leaving room to scale up or down as needed.
  • A load balancer, ingress, and ingress controller, as mentioned above.

Using External Services

If you created the external services noted above:

  • In the requiredConfigs section:
    • DATACHAT_APP_DB_HOST: The IP address or host name of the management database.
    • DATACHAT_APP_DB_PORT: The port of the management database.
    • DATACHAT_COMPUTE_ENGINE_HOST: The IP address or host name of the compute engine.
    • DATACHAT_COMPUTE_ENGINE_PORT: The port of the compute engine.
    • DATACHAT_PIPELINER_METADATA_STORAGE_HOST: The IP address or host name of the "pipeliner metadata storage" database.
    • DATACHAT_PIPELINER_METADATA_STORAGE_PORT: The port for the "pipeliner metadata storage" database. The default is 5432.
    • DATACHAT_STORAGE_BUCKET: The name of the object storage bucket.
    • DATACHAT_STORAGE_ENDPOINT_URL: The URL of the object storage endpoint, e.g. https://s3.us-east-2.amazonaws.com for AWS S3 or https://storage.googleapis.com for GCP Cloud Storage.
    • DATACHAT_STORAGE_S3_REGION: The region the bucket is stored in, e.g. us-east-2.
  • (Optional) In the extraConfigs.dcConfig section:
    • DATACHAT_APP_DB_SSL_MODE: The sslmode option for the management database. This defaults to disable. To use SSL, set this to require.
    • DATACHAT_COMPUTE_ENGINE_SSL_MODE: The sslmode option for the compute engine. This defaults to disable. To use SSL, set this to require.
  • In the global.secrets.appStorage section:
    • global.secrets.appStorage.id: The access key ID of the user that can access the bucket.
    • global.secrets.appStorage.secret: The secret access key of the user that can access the bucket.
  • In the global.secrets.managementDatabase section, configure the username and password used to connect to the management database.
  • In the global.secrets.computeEngineCredentials section, configure the username and password used to connect to the compute engine.
  • In the secrets.secretTls section, if you are configuring TLS outside of DataChat, use the contents of your .crt and .key files:
    • Pass in your base64 encoded certificate.
    • Pass in your base64 encoded key.
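The settings above can be collected in an overrides file that is later passed to helm with -f. The following is a minimal sketch, assuming that requiredConfigs and extraConfigs are top-level keys and that the values are plain strings. The file name, the nesting, and every value shown are placeholders, so compare them with the chart's own values.yaml before use. The username, password, and TLS fields are left out because their exact key names are not listed above.

```shell
#!/usr/bin/env bash
# Sketch: write an overrides file for external services.
# Key nesting and all values are assumptions; confirm them against the chart.
cat > external-services-values.yaml <<'EOF'
requiredConfigs:
  DATACHAT_APP_DB_HOST: "10.0.0.10"            # placeholder address
  DATACHAT_APP_DB_PORT: "5432"
  DATACHAT_COMPUTE_ENGINE_HOST: "10.0.0.11"    # placeholder address
  DATACHAT_COMPUTE_ENGINE_PORT: "5432"
  DATACHAT_PIPELINER_METADATA_STORAGE_HOST: "10.0.0.12"  # placeholder address
  DATACHAT_PIPELINER_METADATA_STORAGE_PORT: "5432"
  DATACHAT_STORAGE_BUCKET: "my-datachat-bucket"          # placeholder name
  DATACHAT_STORAGE_ENDPOINT_URL: "https://s3.us-east-2.amazonaws.com"
  DATACHAT_STORAGE_S3_REGION: "us-east-2"
extraConfigs:
  dcConfig:
    DATACHAT_APP_DB_SSL_MODE: "require"          # optional; defaults to disable
    DATACHAT_COMPUTE_ENGINE_SSL_MODE: "require"  # optional; defaults to disable
global:
  secrets:
    appStorage:
      id: "<access key ID>"
      secret: "<secret access key>"
EOF

echo "wrote external-services-values.yaml"
```

Because this file contains credentials, keep it out of version control or supply the secret values another way.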

Using In-Cluster Services

If you did not create those external components and would prefer to use the in-cluster services for the compute engine, management database, and object storage:

  • Set global.compute-engine.enabled to true.
  • Set global.management-database.enabled to true.
  • Set global.seaweedfs.enabled to true.
  • Set pipeliner-metadata-storage.enabled to true.

By enabling the above services, you will not need to set any configurations in the requiredConfigs block (e.g. DATACHAT_APP_DB_*, DATACHAT_COMPUTE_ENGINE_*, and DATACHAT_STORAGE_* ).
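As a sketch, the four switches above could be written to an overrides file like this. The file name is a placeholder and the nesting simply mirrors the dotted paths in the list. Note that the list places compute-engine under global, while the Values table later lists compute-engine.enabled at the top level, so check the chart's values.yaml for the path your chart version expects.

```shell
#!/usr/bin/env bash
# Sketch: enable the in-cluster services in an overrides file.
# The nesting mirrors the dotted paths named in the list above; treat it as
# an assumption and confirm it against the chart's values.yaml.
cat > in-cluster-values.yaml <<'EOF'
global:
  compute-engine:
    enabled: true
  management-database:
    enabled: true
  seaweedfs:
    enabled: true
pipeliner-metadata-storage:
  enabled: true
EOF

echo "wrote in-cluster-values.yaml"
```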

Installation and Maintenance

Install or Upgrade

Note: helm upgrade --install covers both the first installation of a chart and the upgrade of an existing release.

From a Private Chart Repository

  1. Acquire the JSON key file given to you by your DataChat representative.
  2. Activate the service account with gcloud:
gcloud auth activate-service-account --key-file=<path to key file>
  3. Log in to the private repo using helm registry:
cat <path to key file> | helm registry login -u _json_key --password-stdin https://us-central1-docker.pkg.dev
  4. Run the following command to install the chart. This command creates the namespace and installs the chart resources if they don't already exist. If they do exist, they are upgraded:
helm upgrade --install datachat oci://us-central1-docker.pkg.dev/dc-shared/helm/datachat --version <chart-version> \
--create-namespace --namespace datachat \
--set secrets.dockerconfigjson.dockerUsername=_json_key_base64 \
--set secrets.dockerconfigjson.dockerPassword=$(cat <path to JSON secret key> | tr -d '\n' | base64) \
-f <path to overrides-values.yaml>

From a Chart Archive

  1. Download the Helm chart's .tgz archive from the link provided by your DataChat representative.
  2. Acquire the JSON key file given to you by your DataChat representative.
  3. Run the following command to install the chart. This command creates the namespace and installs the chart resources if they don't already exist. If they do exist, they are upgraded:
helm upgrade --install datachat ./datachat-<chart-version>.tgz \
--create-namespace --namespace datachat \
--set secrets.dockerconfigjson.dockerUsername=_json_key_base64 \
--set secrets.dockerconfigjson.dockerPassword=$(cat <path to JSON secret key> | tr -d '\n' | base64) \
-f <path to overrides-values.yaml>
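One detail in the install commands above is worth checking on Linux: GNU base64 wraps its output at 76 characters by default, so an unquoted $(...) substitution may be split into several arguments. The following is a minimal sketch of encoding the key file onto a single line first. The helper name encode_key_file and the file path are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: base64-encode a key file onto a single line.
# Stripping newlines keeps the value as one shell word when it is passed
# to --set, regardless of whether the local base64 wraps its output.
encode_key_file() {
  base64 < "$1" | tr -d '\n'
}

# Example (hypothetical path):
# DOCKER_PASSWORD="$(encode_key_file ./datachat-key.json)"
# Then pass it quoted in the install command:
#   --set secrets.dockerconfigjson.dockerPassword="$DOCKER_PASSWORD"
```

Quoting the variable when it is passed to helm avoids word splitting in the same way.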

Verification

Helm has provenance tools, which help chart users verify the integrity and origin of a chart. To verify the datachat helm chart, follow these steps:

  1. Acquire the .prov and public key files sent to you from DataChat. If you're installing from the private chart repository, each chart's .prov file exists within the repo.
  2. Add the public key to gpg. For example, gpg --import datachat-public-key.
  3. Pass the --verify flag in your helm commands.
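For a downloaded chart archive, the steps above might look like the following sketch. It uses helm verify, which checks an archive against the .prov file stored next to it. The helper name verify_chart, the file names, and the keyring location are placeholders. The export step reflects that Helm reads keyrings in the legacy GnuPG format, which newer GnuPG versions no longer write by default.

```shell
#!/usr/bin/env bash
# Sketch: import DataChat's public key and verify a chart archive.
# The .prov file must sit next to the .tgz archive being verified.

verify_chart() {
  local chart_archive="$1"                   # e.g. ./datachat-<chart-version>.tgz
  local public_key="$2"                      # e.g. ./datachat-public-key
  local keyring="${3:-./datachat-keyring.gpg}"

  gpg --import "$public_key" || return 1
  # Export the public keys to a legacy-format keyring file that Helm can read.
  gpg --export > "$keyring" || return 1
  helm verify "$chart_archive" --keyring "$keyring"
}

# Example (placeholder file names):
# verify_chart ./datachat-<chart-version>.tgz ./datachat-public-key
```

The same --keyring flag can be combined with --verify on helm install or helm upgrade.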

Uninstall

  1. Uninstall the chart:
helm -n datachat uninstall datachat
  2. (Optional) Delete kept PersistentVolumeClaims (PVC) and Secrets.

By default, the datachat Helm chart does not delete PVCs or Secrets when helm install, helm upgrade, or helm uninstall is run. This helps avoid accidental deletion of critical DataChat components, so after uninstalling, these resources persist in the datachat namespace.

We recommend keeping this default, but if you wish to disable it, set global.pvc.keep=false and global.secret.keep=false.
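To see what was kept after an uninstall, and to remove it if you no longer need it, something like the following sketch may help. It assumes kubectl is configured for the cluster and that the release was installed in the datachat namespace. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: review and remove resources kept after `helm uninstall`.
NAMESPACE="${NAMESPACE:-datachat}"

list_kept_resources() {
  # Show the PVCs and Secrets that remain in the namespace.
  kubectl -n "$NAMESPACE" get pvc,secret
}

delete_kept_resources() {
  # Irreversible: deletes every PVC and Secret in the namespace,
  # including any that were not created by the chart.
  kubectl -n "$NAMESPACE" delete pvc,secret --all
}

# Example:
# list_kept_resources
# delete_kept_resources
```

Listing first and deleting second gives you a chance to back up any data stored on the volumes.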

Note: You can delete the DataChat namespace with kubectl delete namespace datachat.

Values

KeyTypeDefaultDescription
ask-cache.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
ask-cache.containerPortslist[]Ports to add to .spec.template.spec.containers[x].ports.
ask-cache.envlist[]Adds env variables to .spec.template.spec.containers[x].env.
ask-cache.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
ask-cache.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
ask-cache.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
ask-cache.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
ask-cache.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
ask-cache.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
ask-cache.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
ask-cache.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
ask-cache.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
ask-cache.pvc.storageSizestring"10Gi"The size of the service's PersistentVolumeClaim.
ask-cache.replicasint1The number of Pod replicas to create.
ask-cache.resources.limits.cpustring"2"The CPU limit for this service.
ask-cache.resources.limits.memorystring"8Gi"The memory limit for this service.
ask-cache.resources.requests.cpustring"1"The CPU request for this service.
ask-cache.resources.requests.memorystring"4Gi"The memory request for this service.
ask-cache.spec.strategy.typestring"Recreate"The pod replacement strategy.
autocomplete-cache.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
autocomplete-cache.envlist[]Adds env variables to .spec.template.spec.containers[x].env.
autocomplete-cache.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
autocomplete-cache.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
autocomplete-cache.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
autocomplete-cache.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
autocomplete-cache.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
autocomplete-cache.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
autocomplete-cache.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
autocomplete-cache.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
autocomplete-cache.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
autocomplete-cache.pvc.storageSizestring"10Gi"The size of the service's PersistentVolumeClaim.
autocomplete-cache.replicasint1The number of Pod replicas to create.
autocomplete-cache.resources.limits.cpustring"2"The CPU limit for this service.
autocomplete-cache.resources.requests.cpustring"1"The CPU request for this service.
autocomplete-cache.resources.requests.memorystring"1Gi"The memory request for this service.
autocomplete-cache.spec.strategy.typestring"Recreate"The pod replacement strategy.
compute-engine.enabledboolfalseWhether to enable the in-cluster compute-engine engine instance.
compute-engine.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
compute-engine.containerPortslist[]Ports to add to .spec.template.spec.containers[x].ports.
compute-engine.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
compute-engine.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
compute-engine.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
compute-engine.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
compute-engine.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
compute-engine.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
compute-engine.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
compute-engine.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
compute-engine.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
compute-engine.pvc.storageClassNamestring""The name of the StorageClass to use for the PersistentVolumeClaim.
compute-engine.pvc.storageSizestring"100Gi"The size of the service's PersistentVolumeClaim.
compute-engine.replicasint1The number of Pod replicas to create.
compute-engine.resources.requests.memorystring"1500Mi"The memory request for this service.
compute-engine.spec.strategy.typestring"Recreate"The pod replacement strategy.
dc-app-service.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
dc-app-service.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
dc-app-service.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
dc-app-service.hpa.maxReplicasint8When global.autoscaling.hpa.enabled: true this value will set the service's HPA's .spec.maxReplicas.
dc-app-service.hpa.minReplicasint2When global.autoscaling.hpa.enabled: true this value will set the service's HPA's .spec.minReplicas.
dc-app-service.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
dc-app-service.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
dc-app-service.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
dc-app-service.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
dc-app-service.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
dc-app-service.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
dc-app-service.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
dc-app-service.replicasint2The number of Pod replicas to create.
dc-app-service.resources.limits.cpustring"4"The CPU limit for this service.
dc-app-service.resources.limits.memorystring"4Gi"The memory limit for this service.
dc-app-service.resources.requests.cpustring"4"The CPU request for this service.
dc-app-service.resources.requests.memorystring"4Gi"The memory request for this service.
dc-app-service.spec.strategy.rollingUpdate.maxSurgestring"50%"The maximum number of Pods that can be created over the desired number of Pods during the Pod replacement strategy.
dc-app-service.spec.strategy.rollingUpdate.maxUnavailablestring"50%"The maximum number of pods that can be unavailable during Pod recreation.
dc-app-service.spec.strategy.typestring"RollingUpdate"The pod replacement strategy.
dc-charting-service.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
dc-charting-service.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
dc-charting-service.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
dc-charting-service.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
dc-charting-service.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
dc-charting-service.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
dc-charting-service.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
dc-charting-service.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
dc-charting-service.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
dc-charting-service.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
dc-charting-service.replicasint1The number of Pod replicas to create.
dc-charting-service.resources.limits.cpustring"1"The CPU limit for this service.
dc-charting-service.resources.limits.memorystring"1Gi"The memory limit for this service.
dc-charting-service.resources.requests.cpustring"400m"The CPU request for this service.
dc-charting-service.resources.requests.memorystring"500Mi"The memory request for this service.
dc-charting-service.spec.strategy.typestring"Recreate"The pod replacement strategy.
dc-executor-service.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
dc-executor-service.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
dc-executor-service.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
dc-executor-service.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
dc-executor-service.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
dc-executor-service.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
dc-executor-service.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
dc-executor-service.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
dc-executor-service.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
dc-executor-service.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
dc-executor-service.replicasint1The number of Pod replicas to create.
dc-executor-service.resources.limits.cpustring"1"The CPU limit for this service.
dc-executor-service.resources.limits.memorystring"1Gi"The memory limit for this service.
dc-executor-service.resources.requests.cpustring"1"The CPU request for this service.
dc-executor-service.resources.requests.memorystring"1Gi"The memory request for this service.
dc-executor-service.spec.strategy.rollingUpdate.maxSurgestring"50%"The maximum number of Pods that can be created over the desired number of Pods during the Pod replacement strategy.
dc-executor-service.spec.strategy.rollingUpdate.maxUnavailablestring"50%"The maximum number of pods that can be unavailable during Pod recreation.
dc-executor-service.spec.strategy.typestring"RollingUpdate"The pod replacement strategy.
dc-executor-worker.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
dc-executor-worker.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
dc-executor-worker.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
dc-executor-worker.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
dc-executor-worker.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
dc-executor-worker.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
dc-executor-worker.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
dc-executor-worker.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
dc-executor-worker.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
dc-executor-worker.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
dc-executor-worker.replicasint1The number of Pod replicas to create.
dc-executor-worker.resources.limits.cpustring"1"The CPU limit for this service.
dc-executor-worker.resources.limits.memorystring"1Gi"The memory limit for this service.
dc-executor-worker.resources.requests.cpustring"1"The CPU request for this service.
dc-executor-worker.resources.requests.memorystring"1Gi"The memory request for this service.
dc-executor-worker.spec.strategy.rollingUpdate.maxSurgestring"50%"The maximum number of Pods that can be created over the desired number of Pods during the Pod replacement strategy.
dc-executor-worker.spec.strategy.rollingUpdate.maxUnavailablestring"50%"The maximum number of pods that can be unavailable during Pod recreation.
dc-executor-worker.spec.strategy.typestring"RollingUpdate"The pod replacement strategy.
dc-nl2code.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
dc-nl2code.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
dc-nl2code.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
dc-nl2code.hpa.maxReplicasint8When global.autoscaling.hpa.enabled: true this value will set the service's HPA's .spec.maxReplicas.
dc-nl2code.hpa.minReplicasint2When global.autoscaling.hpa.enabled: true this value will set the service's HPA's .spec.minReplicas.
dc-nl2code.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
dc-nl2code.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
dc-nl2code.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
dc-nl2code.replicasint1The number of Pod replicas to create.
dc-nl2code.resources.limits.cpustring"1"The CPU limit for this service.
dc-nl2code.resources.limits.memorystring"5Gi"The memory limit for this service.
dc-nl2code.resources.requests.cpustring"1"The CPU request for this service.
dc-nl2code.resources.requests.memorystring"2.5Gi"The memory request for this service.
dc-nl2code.spec.strategy.rollingUpdate.maxSurgestring"50%"The maximum number of Pods that can be created over the desired number of Pods during the Pod replacement strategy.
dc-nl2code.spec.strategy.rollingUpdate.maxUnavailablestring"50%"The maximum number of pods that can be unavailable during Pod recreation.
dc-nl2code.spec.strategy.typestring"RollingUpdate"The pod replacement strategy.
encoderlm-service.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
encoderlm-service.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
encoderlm-service.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
encoderlm-service.hpa.maxReplicasint10When global.autoscaling.hpa.enabled: true this value will set the service's HPA's .spec.maxReplicas.
encoderlm-service.hpa.minReplicasint1When global.autoscaling.hpa.enabled: true this value will set the service's HPA's .spec.minReplicas.
encoderlm-service.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
encoderlm-service.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
encoderlm-service.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
encoderlm-service.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
encoderlm-service.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
encoderlm-service.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
encoderlm-service.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
encoderlm-service.replicasint1The number of Pod replicas to create.
encoderlm-service.resources.limits.cpustring"1"The CPU limit for this service.
encoderlm-service.resources.limits.memorystring"2600Mi"The memory limit for this service.
encoderlm-service.resources.requests.cpustring"1"The CPU request for this service.
encoderlm-service.resources.requests.memorystring"2600Mi"The memory request for this service.
encoderlm-service.spec.strategy.rollingUpdate.maxSurgestring"50%"The maximum number of Pods that can be created over the desired number of Pods during the Pod replacement strategy.
encoderlm-service.spec.strategy.rollingUpdate.maxUnavailablestring"50%"The maximum number of pods that can be unavailable during Pod recreation.
encoderlm-service.spec.strategy.typestring"RollingUpdate"The pod replacement strategy.
executor-task-queue.commonEnvslist[]Adds common env variables to .spec.template.spec.containers[x].env.
executor-task-queue.containerPortslist[]Ports to add to .spec.template.spec.containers[x].ports.
executor-task-queue.envlist[]Adds env variables to .spec.template.spec.containers[x].env.
executor-task-queue.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
executor-task-queue.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
executor-task-queue.networkPolicy.ingress.extralist[]If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
executor-task-queue.networkPolicy.ingress.extraFromlist[]Adds additional ingress rules to the default ingress rule. For example, adding additional selectors will be placed in .spec.ingress[0].from block.
executor-task-queue.networkPolicy.egressobject{}Adds egress NetworkPolicy block.
executor-task-queue.networkPolicy.specOverridelist[]If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
executor-task-queue.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
executor-task-queue.prometheus.exposedboolfalseIf true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
executor-task-queue.prometheus.portslist[]If prometheus.exposed: true then add the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
executor-task-queue.pvc.storageSizestring"10Gi"The size of the service's PersistentVolumeClaim.
executor-task-queue.replicasint1The number of Pod replicas to create.
executor-task-queue.resources.limits.cpustring"50m"The CPU limit for this service.
executor-task-queue.resources.limits.memorystring"256Mi"The memory limit for this service.
executor-task-queue.resources.requests.cpustring"50m"The CPU request for this service.
executor-task-queue.resources.requests.memorystring"256Mi"The memory request for this service.
executor-task-queue.spec.strategy.typestring"Recreate"The pod replacement strategy.
object-store-gc.containerPortslist[]Ports to add to .spec.template.spec.containers[x].ports.
object-store-gc.extraEnvslist[]Adds extra env vars to .spec.template.spec.containers[x].env.
object-store-gc.extraPortslist[]Adds extra ports to .spec.template.spec.containers[x].ports.
object-store-gc.podAnnotationsobject{}Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
object-store-gc.resources.limits.cpustring"250m"The CPU limit for this service.
object-store-gc.resources.limits.memorystring"512Mi"The memory limit for this service.
object-store-gc.resources.requests.cpustring"250m"The CPU request for this service.
object-store-gc.resources.requests.memorystring"512Mi"The memory request for this service.
pipeliner-job-queue.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
pipeliner-job-queue.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
pipeliner-job-queue.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
pipeliner-job-queue.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
pipeliner-job-queue.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional - from array blocks here.
pipeliner-job-queue.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
pipeliner-job-queue.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
pipeliner-job-queue.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
pipeliner-job-queue.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
pipeliner-job-queue.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-job-queue.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-job-queue.pvc.storageSize | string | "10Gi" | The size of the service's PersistentVolumeClaim.
pipeliner-job-queue.replicas | int | 1 | The number of Pod replicas to create.
pipeliner-job-queue.resources.limits.cpu | string | "2" | The CPU limit for this service.
pipeliner-job-queue.resources.requests.cpu | string | "1" | The CPU request for this service.
pipeliner-job-queue.resources.requests.memory | string | "2Gi" | The memory request for this service.
pipeliner-job-queue.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
pipeliner-metadata-storage.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
pipeliner-metadata-storage.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
pipeliner-metadata-storage.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
pipeliner-metadata-storage.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
pipeliner-metadata-storage.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional - from array blocks here.
pipeliner-metadata-storage.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
pipeliner-metadata-storage.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
pipeliner-metadata-storage.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
pipeliner-metadata-storage.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
pipeliner-metadata-storage.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-metadata-storage.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-metadata-storage.pvc.storageSize | string | "50Gi" | The size of the service's PersistentVolumeClaim.
pipeliner-metadata-storage.replicas | int | 1 | The number of Pod replicas to create.
pipeliner-metadata-storage.resources.limits.cpu | string | "2" | The CPU limit for this service.
pipeliner-metadata-storage.resources.limits.memory | string | "1Gi" | The memory limit for this service.
pipeliner-metadata-storage.resources.requests.cpu | string | "1" | The CPU request for this service.
pipeliner-metadata-storage.resources.requests.memory | string | "512Mi" | The memory request for this service.
pipeliner-metadata-storage.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
pipeliner-sql.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
pipeliner-sql.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
pipeliner-sql.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
pipeliner-sql.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
pipeliner-sql.hpa.maxReplicas | int | 16 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA .spec.maxReplicas.
pipeliner-sql.hpa.minReplicas | int | 1 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA .spec.minReplicas.
pipeliner-sql.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional - from array blocks here.
pipeliner-sql.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
pipeliner-sql.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
pipeliner-sql.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
pipeliner-sql.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
pipeliner-sql.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-sql.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-sql.replicas | int | 2 | The number of Pod replicas to create.
pipeliner-sql.resources.limits.cpu | string | "4" | The CPU limit for this service.
pipeliner-sql.resources.limits.memory | string | "4Gi" | The memory limit for this service.
pipeliner-sql.resources.requests.cpu | string | "1" | The CPU request for this service.
pipeliner-sql.resources.requests.memory | string | "1Gi" | The memory request for this service.
pipeliner-sql.spec.strategy.rollingUpdate.maxSurge | string | "50%" | The maximum number of Pods that can be created over the desired number of Pods during a rolling update.
pipeliner-sql.spec.strategy.rollingUpdate.maxUnavailable | string | "50%" | The maximum number of Pods that can be unavailable during a rolling update.
pipeliner-sql.spec.strategy.type | string | "RollingUpdate" | The Pod replacement strategy.
pipeliner-web.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
pipeliner-web.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
pipeliner-web.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
pipeliner-web.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
pipeliner-web.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional - from array blocks here.
pipeliner-web.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
pipeliner-web.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
pipeliner-web.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
pipeliner-web.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
pipeliner-web.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-web.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
pipeliner-web.replicas | int | 1 | The number of Pod replicas to create.
pipeliner-web.resources.limits.cpu | string | "4" | The CPU limit for this service.
pipeliner-web.resources.limits.memory | string | "8Gi" | The memory limit for this service.
pipeliner-web.resources.requests.cpu | string | "4" | The CPU request for this service.
pipeliner-web.resources.requests.memory | string | "4Gi" | The memory request for this service.
pipeliner-web.spec.strategy.rollingUpdate.maxSurge | string | "50%" | The maximum number of Pods that can be created over the desired number of Pods during a rolling update.
pipeliner-web.spec.strategy.rollingUpdate.maxUnavailable | string | "50%" | The maximum number of Pods that can be unavailable during a rolling update.
pipeliner-web.spec.strategy.type | string | "RollingUpdate" | The Pod replacement strategy.
rabbitmq.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
rabbitmq.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
rabbitmq.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
rabbitmq.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
rabbitmq.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
rabbitmq.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
rabbitmq.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
rabbitmq.pvc.storageSize | string | "10Gi" | The size of the service's PersistentVolumeClaim.
rabbitmq.replicas | int | 1 | The number of Pod replicas to create.
rabbitmq.resources.limits.cpu | string | "2" | The CPU limit for this service.
rabbitmq.resources.limits.memory | string | "6Gi" | The memory limit for this service.
rabbitmq.resources.requests.cpu | string | "2" | The CPU request for this service.
rabbitmq.resources.requests.memory | string | "4Gi" | The memory request for this service.
rabbitmq.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
extraConfigs.dcConfig | object | {} | Extra configurations for the dc-config ConfigMap.
extraConfigs.dcExecutorConfig | object | {} | Extra configurations for the dc-executor-config ConfigMap.
extraConfigs.encoderLmServiceConfig | object | {} | Extra configurations for the encoderlm-service-config ConfigMap.
extraConfigs.nl2codeConfig | object | {} | Extra configurations for the nl2code-config ConfigMap.
extraConfigs.pipelinerConfig | object | {} | Extra configurations for the pipeliner-config ConfigMap.
extraConfigs.vectorDbConfig | object | {} | Extra configurations for the vector-db-config ConfigMap.
extraConfigs.vectorMgmtServiceConfig | object | {} | Extra configurations for the vector-mgmt-svc-config ConfigMap.
global.autoscaling.hpa.enabled | bool | false | Whether to enable HorizontalPodAutoscaler resources for the Deployments that support them. Note that this requires Prometheus to be installed in the cluster already. See the HorizontalPodAutoscaler section for more details.
global.autoscaling.podmonitoring.enabled | bool | false | Whether to enable PodMonitoring resources for the Deployments that support them. PodMonitoring is used within GKE clusters in conjunction with global.autoscaling.hpa.enabled.
global.autoscaling.vpa.enabled | bool | false | Whether to enable VerticalPodAutoscaler resources.
global.commonEnvs | list | [] | Adds common env variables to all DataChat Deployments at .spec.template.spec.containers[x].env.
global.commonLabels | object | {"app.kubernetes.io/part-of": "DataChat"} | Common labels to add to all objects that are part of this Helm chart. Note that these labels are set in both .spec.template.metadata.labels and .metadata.labels.
global.externalAskCache.enabled | bool | false | Whether to use an external Redis instance for the ask-cache instead of the in-cluster one. If false, an in-cluster ask-cache is created. If true, you must configure an external Redis instance and set requiredConfigs.DATACHAT_ASK_CACHE_HOST and requiredConfigs.DATACHAT_ASK_CACHE_PORT to that instance's host and port. Either an in-cluster or an external Redis instance is required to use GenAI capabilities.
global.genAiServices.enabled | bool | false | Whether to enable dc-nl2code, encoderlm-service, vector-db-init, vector-db-scrape-and-backup, and vector-mgmt-service. These services are needed to use GenAI capabilities.
global.image.pullSecrets.enabled | bool | true | Whether to create a Secret that contains credentials to pull images from a private repository.
global.image.pullSecrets.name | string | "regcred" | The name of the Secret that contains the image pull credentials. This name is used as the Kubernetes Secret name and is added to every DataChat Deployment at .spec.template.spec.imagePullSecrets[x].name.
global.image.repository | string | "us-central1-docker.pkg.dev/dc-shared/dc-registry/" | The private image repository to pull DataChat images from. If using a custom repository, be sure to provide the complete address, including the trailing slash (e.g. http://harbor.datachat.net/apps/datachat_/).
global.image.tag | string | "" | The DataChat release image tag.
global.networkPolicies.enabled | bool | false | Whether to enable NetworkPolicy objects. Note that additional configuration is required when enabled. See NetworkPolicies for more details.
global.networkPolicies.ingressControllerNetworkPolicy | list | [{"namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": "ingress-nginx"}}, "podSelector": {"matchLabels": {"app.kubernetes.io/name": "ingress-nginx"}}}] | When global.networkPolicies.enabled=true, this NetworkPolicy Ingress rule is applied to DataChat resources that are exposed via the Ingress object. This assumes the ingress controller being used is the ingress-nginx controller and that it is installed in the ingress-nginx Namespace.
global.networkPolicies.type.apiVersion | string | "networking.k8s.io/v1" | Allows you to override the API version of the NetworkPolicy. For example, if using Calico's NetworkPolicies, this would be projectcalico.org/v3.
global.networkPolicies.type.kind | string | "NetworkPolicy" | Allows you to override the resource kind of the NetworkPolicy. For example, if using Calico's NetworkPolicies, this could be set to GlobalNetworkPolicy.
global.prometheus.annotations | object | {"prometheus.io/scrape": "true", "prometheus.io/port": "9090"} | The annotations that allow Prometheus to scrape metrics from the Deployment. Set in .spec.template.metadata.annotations.
global.prometheus.ports | list | [{"name": "prometheus", "containerPort": 9090}] | The default Prometheus port and name to be added to a Deployment's .spec.template.spec.containers[x].ports.
global.pvc.keep | bool | true | When true, PersistentVolumeClaim objects will not be deleted when running helm uninstall, helm upgrade, or helm rollback. This avoids accidental deletion of data within those PVCs. See https://helm.sh/docs/howto/charts_tips_and_tricks/#tell-helm-not-to-uninstall-a-resource for more details.
global.pvc.storageClassName | string | "" | A global StorageClass name to use for all PersistentVolumeClaim resources. If empty, the PersistentVolumeClaims will use the default StorageClass in the Kubernetes cluster.
global.qdrant.enabled | bool | false | Whether to enable an in-cluster Qdrant vector database.
global.seaweedfs.enabled | bool | false | Whether to enable an in-cluster storage solution instead of using a cloud provider's object storage solution. For best performance, DataChat recommends using a cloud provider's object storage.
global.secret.keep | bool | true | When true, Secret objects will not be deleted when running helm uninstall, helm upgrade, or helm rollback. This avoids accidental deletion of Secrets required by DataChat. See https://helm.sh/docs/howto/charts_tips_and_tricks/#tell-helm-not-to-uninstall-a-resource for more details.
ingress.annotations | object | {"nginx.ingress.kubernetes.io/ssl-redirect": "true", "nginx.ingress.kubernetes.io/force-ssl-redirect": "true", "nginx.ingress.kubernetes.io/rewrite-target": "/$2$3", "nginx.ingress.kubernetes.io/proxy-body-size": "15G"} | Default annotations to add to the DataChat Ingress object. This assumes the ingress controller being used is the ingress-nginx controller.
ingress.hosts | list | [''] | The DNS names that DataChat will be accessible from.
ingress.ingressClassName | string | "nginx" | The class name to be used by the Ingress resource. Note that DataChat recommends using the ingress-nginx controller.
ingress.extraPaths | list | [] | Extra paths to add to the Ingress object.
ingress.extraAnnotations | object | {} | Extra annotations to add to the Ingress object.
ingress.tls.enabled | bool | false | Whether to enable TLS for the Ingress resource.
management-database.enabled | bool | false | Whether to enable the in-cluster management-database.
management-database.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
management-database.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
management-database.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
management-database.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
management-database.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
management-database.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
management-database.pvc.storageClassName | string | "" | The name of the StorageClass to use for the PersistentVolumeClaim.
management-database.pvc.storageSize | string | "30Gi" | The size of the service's PersistentVolumeClaim.
management-database.replicas | int | 1 | The number of Pod replicas to create.
management-database.serviceName | string | "management-database-cluster" | The name of the in-cluster management-database Service. Note that if you change this name, you must update requiredConfigs.dcConfig.DATACHAT_APP_DB_HOST to match.
management-database.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
management-db-init.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
management-db-init.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
management-db-init.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
management-db-init.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
management-db-init.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
management-db-init.resources.limits.cpu | string | "2" | The CPU limit for this service.
management-db-init.resources.limits.memory | string | "512Mi" | The memory limit for this service.
management-db-init.resources.requests.cpu | string | "1" | The CPU request for this service.
management-db-init.resources.requests.memory | string | "512Mi" | The memory request for this service.
management-service.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
management-service.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
management-service.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
management-service.networkPolicy.ingress.extraFrom | list | [] | Adds additional NetworkPolicy ingress - from blocks to the default ingress port for the DataChat service.
management-service.networkPolicy.ingress.extra | list | [] | Adds additional NetworkPolicy ingress - from blocks.
management-service.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
management-service.networkPolicy.specOverride | object | {} | Completely overrides the NetworkPolicy .spec. See NetworkPolicies for more details.
management-service.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
management-service.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
management-service.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
management-service.replicas | int | 1 | The number of Pod replicas to create.
management-service.resources.limits.cpu | string | "4" | The CPU limit for this service.
management-service.resources.limits.memory | string | "10Gi" | The memory limit for this service.
management-service.resources.requests.cpu | string | "4" | The CPU request for this service.
management-service.resources.requests.memory | string | "6Gi" | The memory request for this service.
management-service.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
messaging-server.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
messaging-server.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
messaging-server.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
messaging-server.networkPolicy.ingress.extraFrom | list | [] | Adds additional NetworkPolicy ingress - from blocks to the default ingress port for the DataChat service.
messaging-server.networkPolicy.ingress.extra | list | [] | Adds additional NetworkPolicy ingress - from blocks.
messaging-server.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
messaging-server.networkPolicy.specOverride | object | {} | Completely overrides the NetworkPolicy .spec. See NetworkPolicies for more details.
messaging-server.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
messaging-server.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
messaging-server.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
messaging-server.replicas | int | 1 | The number of Pod replicas to create.
messaging-server.resources.limits.cpu | string | "500m" | The CPU limit for this service.
messaging-server.resources.limits.memory | string | "250Mi" | The memory limit for this service.
messaging-server.resources.requests.cpu | string | "500m" | The CPU request for this service.
messaging-server.resources.requests.memory | string | "100Mi" | The memory request for this service.
messaging-server.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
message-store.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
message-store.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
message-store.env | list | [] | Adds env variables to .spec.template.spec.containers[x].env.
message-store.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
message-store.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
message-store.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional - from array blocks here.
message-store.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
message-store.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
message-store.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
message-store.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
message-store.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
message-store.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
message-store.pvc.storageSize | string | "10Gi" | The size of the service's PersistentVolumeClaim.
message-store.replicas | int | 1 | The number of Pod replicas to create.
message-store.resources.limits.cpu | string | "2" | The CPU limit for this service.
message-store.resources.requests.cpu | string | "1" | The CPU request for this service.
message-store.resources.requests.memory | string | "10Gi" | The memory request for this service.
message-store.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
messaging-server.enabled | bool | true | Whether to enable the messaging-server.
ml-task-executor.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
ml-task-executor.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
ml-task-executor.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
ml-task-executor.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
ml-task-executor.hpa.maxReplicas | int | 16 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA .spec.maxReplicas.
ml-task-executor.hpa.minReplicas | int | 4 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA .spec.minReplicas.
ml-task-executor.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional - from array blocks here.
ml-task-executor.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
ml-task-executor.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
ml-task-executor.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
ml-task-executor.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
ml-task-executor.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
ml-task-executor.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
ml-task-executor.replicas | int | 2 | The number of Pod replicas to create.
ml-task-executor.resources.limits.cpu | string | "5" | The CPU limit for this service.
ml-task-executor.resources.limits.memory | string | "10Gi" | The memory limit for this service.
ml-task-executor.resources.requests.cpu | string | "3" | The CPU request for this service.
ml-task-executor.resources.requests.memory | string | "1Gi" | The memory request for this service.
ml-task-executor.spec.strategy.rollingUpdate.maxSurge | string | "50%" | The maximum number of Pods that can be created over the desired number of Pods during a rolling update.
ml-task-executor.spec.strategy.rollingUpdate.maxUnavailable | string | "50%" | The maximum number of Pods that can be unavailable during a rolling update.
ml-task-executor.spec.strategy.type | string | "RollingUpdate" | The Pod replacement strategy.
object-store.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
object-store.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
object-store.env | list | [] | Adds env variables to .spec.template.spec.containers[x].env.
object-store.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
object-store.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
object-store.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional - from array blocks here.
object-store.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
object-store.networkPolicy.egress | object | {} | Adds an egress block to the NetworkPolicy.
object-store.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
object-store.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
object-store.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
object-store.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
object-store.pvc.storageSize | string | "10Gi" | The size of the service's PersistentVolumeClaim.
object-store.replicas | int | 1 | The number of Pod replicas to create.
object-store.resources.limits.cpu | string | "2" | The CPU limit for this service.
object-store.resources.requests.cpu | string | "1" | The CPU request for this service.
object-store.resources.requests.memory | string | "50Gi" | The memory request for this service.
object-store.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
optionalConfigs.vectorDbConfig.DATACHAT_VECTOR_DB_ENDPOINT | string | "http://qdrant:6333" | The endpoint of the vector database. This should include the scheme (http/https), DNS name or IP address, and port number.
optionalConfigs.vectorDbConfig.DATACHAT_VECTOR_DB_TYPE | string | "qdrant" | Which type of vector database to use. Supported values are qdrant and milvus.
pipeliner-metadata-storage.enabled | bool | false | Whether to enable the in-cluster pipeliner-metadata-storage.
qdrant.persistence.size | string | "25Gi" | The PersistentVolumeClaim size.
qdrant.resources.limits.cpu | string | "100m" | The CPU limit for this service.
qdrant.resources.limits.memory | string | "2Gi" | The memory limit for this service.
qdrant.resources.requests.cpu | string | "100m" | The CPU request for this service.
qdrant.resources.requests.memory | string | "2Gi" | The memory request for this service.
requiredConfigs.DATACHAT_STORAGE_BUCKET | string | "datachat-appdata" | The name of the storage bucket to use. For example, if you have a Google Cloud Storage bucket named app-data, you would put that name "app-data" here. If global.seaweedfs.enabled is true, this will be set to datachat-appdata.
requiredConfigs.DATACHAT_STORAGE_ENDPOINT_URL | string | "http://seaweedfs-s3:8333" | The endpoint that the storage bucket uses. For example, for Google Cloud Storage, use https://storage.googleapis.com. For AWS S3, use https://s3.<region>.amazonaws.com. Note that for AWS S3, only regular Amazon S3 endpoints have been tested. If global.seaweedfs.enabled is true, this value will automatically be set to http://seaweedfs-s3:8333.
requiredConfigs.DATACHAT_KV_STORE_HOST | string | 'object-store' | The hostname/IP address for the Redis instance used for the object store.
requiredConfigs.DATACHAT_KV_STORE_PORT | string | '6379' | The port of the Redis instance used for the object store.
requiredConfigs.DATACHAT_MSG_STORE_HOST | string | 'message-store' | The hostname/IP address for the Redis instance used for the message store.
requiredConfigs.DATACHAT_MSG_STORE_PORT | string | '6379' | The port of the Redis instance used for the message store.
requiredConfigs.DATACHAT_AC_STORE_HOST | string | 'autocomplete-cache' | The hostname/IP address for the Redis instance used for the autocomplete cache.
requiredConfigs.DATACHAT_AC_STORE_PORT | string | '6379' | The port of the Redis instance used for the autocomplete cache.
requiredConfigs.DATACHAT_ASK_CACHE_HOST | string | 'ask-cache' | The hostname/IP address for the Redis instance used for the ask cache.
requiredConfigs.DATACHAT_ASK_CACHE_PORT | string | '6379' | The port of the Redis instance used for the ask cache.
requiredConfigs.DATACHAT_EXECUTOR_TASK_QUEUE_HOST | string | 'executor-task-queue' | The hostname/IP address for the Redis instance used for the executor task queue.
requiredConfigs.DATACHAT_EXECUTOR_TASK_QUEUE_PORT | string | '6379' | The port of the Redis instance used for the executor task queue.
requiredConfigs.DATACHAT_PIPELINER_QUEUE_HOST | string | 'pipeliner-job-queue' | The hostname/IP address for the Redis instance used for the pipeliner job queue.
requiredConfigs.DATACHAT_PIPELINER_QUEUE_PORT | string | '6379' | The port of the Redis instance used for the pipeliner job queue.
requiredConfigs.DATACHAT_STORAGE_S3_REGION | string | "default" | The region where your storage backend resides. For example, us-central1 for Google Cloud Storage or us-east-2 for AWS S3. If global.seaweedfs.enabled is true, this defaults to default.
requiredConfigs.llm.providers.azure.api_config.api_organization | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.azure.api_config.api_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.azure.api_config.api_url | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.azure.api_config.api_version | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.azure.api_secret.azure_api_key | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.azure.api_secret.azure_client_id | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.azure.api_secret.azure_client_secret | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.azure.api_secret.azure_tenant_id | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.vertexai.api_config.api_url | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.vertexai.api_config.location | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.vertexai.api_config.project | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.vertexai.api_secret.vertexai_api_credentials | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.openai.api_config.api_organization | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.openai.api_config.api_type | string | "open_ai" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.openai.api_config.api_url | string | "https://api.openai.com/v1" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.openai.api_config.api_version | string | "2024-01-10" | See the LLM Configuration section for more details.
requiredConfigs.llm.providers.openai.api_secret.api_key | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.agent_supervisor.llm_provider | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.agent_supervisor.llm_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.agent_supervisor.llm_name | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.codegen.llm_provider | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.codegen.llm_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.codegen.llm_name | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.tag_prediction.llm_provider | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.tag_prediction.llm_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.tag_prediction.llm_name | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.backup_codegen.llm_provider | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.backup_codegen.llm_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.backup_codegen.llm_name | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.question_recommendations.llm_provider | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.question_recommendations.llm_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.question_recommendations.llm_name | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.objective_recommendations.llm_provider | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.objective_recommendations.llm_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.objective_recommendations.llm_name | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.table_explanations.llm_provider | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.table_explanations.llm_type | string | "" | See the LLM Configuration section for more details.
requiredConfigs.llm.tasks.table_explanations.llm_name | string | "" | See the LLM Configuration section for more details.
scheduler.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
scheduler.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
scheduler.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
scheduler.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
scheduler.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
scheduler.networkPolicy.egress | object | {} | Adds egress NetworkPolicy block.
scheduler.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
scheduler.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
scheduler.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
scheduler.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
scheduler.replicas | int | 1 | The number of Pod replicas to create.
scheduler.resources.limits.cpu | string | "750m" | The CPU limit for this service.
scheduler.resources.limits.memory | string | "6Gi" | The memory limit for this service.
scheduler.resources.requests.cpu | string | "750m" | The CPU request for this service.
scheduler.resources.requests.memory | string | "2.5Gi" | The memory request for this service.
scheduler.spec.strategy.type | string | "Recreate" | The Pod replacement strategy.
global.secrets.appStorage.enabled | bool | true | Whether to create the Secret resource.
global.secrets.appStorage.eso.enabled | bool | false | Whether to enable the ExternalSecret.
global.secrets.appStorage.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
global.secrets.appStorage.id | string | "<random 32 bytes>" | The storage backend's key ID.
global.secrets.appStorage.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
global.secrets.appStorage.secret | string | "<random 64 bytes>" | The storage backend's secret key.
global.secrets.computeEngineCredentials.enabled | bool | true | Whether to create the Secret resource.
global.secrets.computeEngineCredentials.eso.enabled | bool | false | Whether to enable the ExternalSecret.
global.secrets.computeEngineCredentials.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
global.secrets.computeEngineCredentials.password | string | "<random 64 bytes>" | The password for the compute engine.
global.secrets.computeEngineCredentials.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
global.secrets.computeEngineCredentials.username | string | "postgres" | The username for the compute engine.
secrets.dockerconfigjson.dockerEmail | string | "" | The email to use in the credentials. Note that this is optional.
secrets.dockerconfigjson.dockerPassword | string | "" | The password to log in to the private DataChat image repository.
secrets.dockerconfigjson.dockerServer | string | "https://us-central1-docker.pkg.dev" | The Docker repository server to point to.
secrets.dockerconfigjson.dockerUsername | string | "" | The username to log in to the private repository.
secrets.dockerconfigjson.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.dockerconfigjson.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.dockerconfigjson.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.eso.refreshInterval | string | "0" | The refresh interval for ExternalSecret resources.
secrets.eso.secretStoreRef.kind | string | "ClusterSecretStore" | The resource type of the ExternalSecretOperator's secret store.
secrets.eso.secretStoreRef.name | string | "" | The name of the existing ExternalSecretOperator secret store. Note that this requires you to have installed and configured the ExternalSecretOperator already. See https://external-secrets.io/latest/ for more details.
secrets.gauthClientSecret.client_id | string | "<random 32 bytes>" | The client_id for Google Drive Access.
secrets.gauthClientSecret.client_secret | string | "<random 32 bytes>" | The client_secret for Google Drive Access.
secrets.gauthClientSecret.enabled | bool | false | Whether to create the Secret resource.
secrets.gauthClientSecret.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.gauthClientSecret.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.gauthClientSecret.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.googleSsoSecret.clientId | string | "" | The client_id for Google SSO.
secrets.googleSsoSecret.clientSecret | string | "" | The client_secret for Google SSO.
secrets.googleSsoSecret.enabled | bool | false | Whether to create the Secret resource.
secrets.googleSsoSecret.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.googleSsoSecret.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.googleSsoSecret.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.iapAudienceSecret.enabled | bool | false | Whether to create the Secret resource.
secrets.iapAudienceSecret.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.iapAudienceSecret.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.iapAudienceSecret.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.iapAudienceSecret.value | string | "" | The IAP audience secret value.
global.secrets.managementDatabase.enabled | bool | true | Whether to create the Secret resource.
global.secrets.managementDatabase.eso.enabled | bool | false | Whether to enable the ExternalSecret.
global.secrets.managementDatabase.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
global.secrets.managementDatabase.password | string | "<random 64 bytes>" | Override for the password for the management database.
global.secrets.managementDatabase.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
global.secrets.managementDatabase.rotateFrom | string | "" | The name of the Secret which the new Secret's data will come from. To use this configuration you must also set global.secrets.managementDatabase.rotate: true.
global.secrets.managementDatabase.username | string | "postgres" | Override for the username for the management database.
secrets.jwtUtilityRsaKeys.DATACHAT_JWT_RSA_PRIVATE_KEY | string | "" | RSA PKCS#8 private key used for authentication token signing.
secrets.jwtUtilityRsaKeys.DATACHAT_JWT_RSA_PUBLIC_KEY | string | "" | RSA PKCS#8 public key used for authentication token verification.
secrets.jwtUtilityRsaKeys.enabled | string | true | Whether to create the Secret resource.
secrets.jwtUtilityRsaKeys.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.jwtUtilityRsaKeys.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.jwtUtilityRsaKeys.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.nl2codeSecret.enabled | bool | true | Whether to create the Secret resource.
secrets.nl2codeSecret.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.nl2codeSecret.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.nl2codeSecret.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.pipelinerSecret.enabled | bool | true | Whether to create the Secret resource.
secrets.pipelinerSecret.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.pipelinerSecret.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.pipelinerSecret.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.secretTls.enabled | bool | true | Whether to create the Secret resource.
secrets.secretTls.eso.enabled | bool | false | Whether to enable the ExternalSecret.
secrets.secretTls.eso.remoteSecretName | string | "" | The name of the secret that the ExternalSecret will use as its target.
secrets.secretTls.name | string | "secret-tls" | The name of the TLS Secret.
secrets.secretTls.rotate | bool | false | Whether to rotate the existing Secret resource. Note that this does not work on ExternalSecret resources.
secrets.secretTls.tlsCrt | string | "<self signed cert>" | The TLS certificate. If enabled and none is provided, a self-signed certificate is created using Helm's genSelfSignedCert function and its certificate is used here.
secrets.secretTls.tlsKey | string | "<self signed cert>" | The TLS private key. If enabled and none is provided, a self-signed certificate is created using Helm's genSelfSignedCert function and its key is used here.
vector-db-init.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
vector-db-init.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
vector-db-init.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
vector-db-init.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
vector-db-init.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
vector-db-init.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
vector-db-init.networkPolicy.egress | object | {} | Adds egress NetworkPolicy block.
vector-db-init.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
vector-db-init.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
vector-db-init.resources.limits.cpu | string | "2" | The CPU limit for this service.
vector-db-init.resources.limits.memory | string | "4Gi" | The memory limit for this service.
vector-db-init.resources.requests.cpu | string | "500m" | The CPU request for this service.
vector-db-init.resources.requests.memory | string | "256Mi" | The memory request for this service.
vector-mgmt-service.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
vector-mgmt-service.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
vector-mgmt-service.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
vector-mgmt-service.hpa.maxReplicas | int | 8 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA's .spec.maxReplicas.
vector-mgmt-service.hpa.minReplicas | int | 1 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA's .spec.minReplicas.
vector-mgmt-service.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
vector-mgmt-service.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
vector-mgmt-service.networkPolicy.egress | object | {} | Adds egress NetworkPolicy block.
vector-mgmt-service.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
vector-mgmt-service.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
vector-mgmt-service.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
vector-mgmt-service.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
vector-mgmt-service.replicas | int | 1 | The number of Pod replicas to create.
vector-mgmt-service.resources.limits.cpu | string | "1" | The CPU limit for this service.
vector-mgmt-service.resources.limits.memory | string | "1Gi" | The memory limit for this service.
vector-mgmt-service.resources.requests.cpu | string | "1" | The CPU request for this service.
vector-mgmt-service.resources.requests.memory | string | "700Mi" | The memory request for this service.
vector-mgmt-service.spec.strategy.rollingUpdate.maxSurge | string | "50%" | The maximum number of Pods that can be created over the desired number of Pods during a rolling update.
vector-mgmt-service.spec.strategy.rollingUpdate.maxUnavailable | string | "50%" | The maximum number of Pods that can be unavailable during a rolling update.
vector-mgmt-service.spec.strategy.type | string | "RollingUpdate" | The Pod replacement strategy.
web-app.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
web-app.env | list | [] | Adds env variables to .spec.template.spec.containers[x].env.
web-app.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
web-app.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
web-app.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
web-app.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
web-app.networkPolicy.egress | object | {} | Adds egress NetworkPolicy block.
web-app.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
web-app.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
web-app.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
web-app.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
web-app.replicas | int | 1 | The number of Pod replicas to create.
web-app.spec.strategy.rollingUpdate.maxSurge | string | "50%" | The maximum number of Pods that can be created over the desired number of Pods during a rolling update.
web-app.spec.strategy.rollingUpdate.maxUnavailable | string | "50%" | The maximum number of Pods that can be unavailable during a rolling update.
web-app.spec.strategy.type | string | "RollingUpdate" | The Pod replacement strategy.
worker-nlp.commonEnvs | list | [] | Adds common env variables to .spec.template.spec.containers[x].env.
worker-nlp.containerPorts | list | [] | Ports to add to .spec.template.spec.containers[x].ports.
worker-nlp.extraEnvs | list | [] | Adds extra env vars to .spec.template.spec.containers[x].env.
worker-nlp.extraPorts | list | [] | Adds extra ports to .spec.template.spec.containers[x].ports.
worker-nlp.hpa.maxReplicas | int | 16 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA's .spec.maxReplicas.
worker-nlp.hpa.minReplicas | int | 2 | When global.autoscaling.hpa.enabled: true, this value sets the service's HPA's .spec.minReplicas.
worker-nlp.networkPolicy.ingress.extra | list | [] | If global.networkPolicies.enabled: true, this adds extra ingress configurations to the NetworkPolicy's .spec.ingress block. For example, you would add additional array - from blocks here.
worker-nlp.networkPolicy.ingress.extraFrom | list | [] | Adds additional ingress rules to the default ingress rule. For example, additional selectors are placed in the .spec.ingress[0].from block.
worker-nlp.networkPolicy.egress | object | {} | Adds egress NetworkPolicy block.
worker-nlp.networkPolicy.specOverride | list | [] | If using custom NetworkPolicy objects, this allows you to override the .spec section entirely.
worker-nlp.podAnnotations | object | {} | Adds annotations to the Deployment's Pod spec at .spec.template.metadata.annotations.
worker-nlp.prometheus.exposed | bool | false | If true, allows prometheus.ports to be added to the Deployment's .spec.template.spec.containers[x].ports block.
worker-nlp.prometheus.ports | list | [] | If prometheus.exposed: true, adds the ports listed here to the Deployment's .spec.template.spec.containers[x].ports block.
worker-nlp.replicas | int | 2 | The number of Pod replicas to create.
worker-nlp.resources.limits.cpu | string | "5" | The CPU limit for this service.
worker-nlp.resources.limits.memory | string | "15Gi" | The memory limit for this service.
worker-nlp.resources.requests.cpu | string | "3" | The CPU request for this service.
worker-nlp.resources.requests.memory | string | "5Gi" | The memory request for this service.
worker-nlp.spec.strategy.rollingUpdate.maxSurge | string | "50%" | The maximum number of Pods that can be created over the desired number of Pods during a rolling update.
worker-nlp.spec.strategy.rollingUpdate.maxUnavailable | string | "50%" | The maximum number of Pods that can be unavailable during a rolling update.
worker-nlp.spec.strategy.type | string | "RollingUpdate" | The Pod replacement strategy.
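
Each dotted key in the table above corresponds to a nested key in your values file, as the customer-values.yaml examples in this guide show. As a rough illustration only, the helper below (a hypothetical function, not part of DataChat or Helm) prints the nested YAML for one dotted key. It assumes that no key segment contains a dot and that you quote string values yourself.

```shell
#!/bin/bash
# Hypothetical helper: print nested YAML for a dotted values key.
# Usage: dotted_to_yaml <dotted.key> <value>
dotted_to_yaml() {
  key=$1
  value=$2
  indent=""
  # While the key still contains a dot, emit the first segment as a parent key.
  while [ "${key#*.}" != "$key" ]; do
    printf '%s%s:\n' "$indent" "${key%%.*}"
    key=${key#*.}
    indent="  $indent"
  done
  # Emit the final segment together with its value.
  printf '%s%s: %s\n' "$indent" "$key" "$value"
}

dotted_to_yaml worker-nlp.hpa.maxReplicas 16
```

The output can be pasted into a values file and merged with any keys that are already there.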

Authentication Signing and Verification Keys

DataChat uses an RSA PKCS#8 public/private key pair to sign and verify access and refresh tokens for user authentication. To set up these keys, follow the steps below:

  1. Use the script below to generate a public and private key:
#!/bin/bash
set -euo pipefail

# Define the output file names
PRIVATE_KEY_FILE="private_key.pem"
PUBLIC_KEY_FILE="public_key.pem"

# Generate a 2048-bit RSA private key
openssl genpkey -algorithm RSA -out "$PRIVATE_KEY_FILE" -pkeyopt rsa_keygen_bits:2048

# Extract the public key from the private key
openssl rsa -pubout -in "$PRIVATE_KEY_FILE" -out "$PUBLIC_KEY_FILE"

echo "Private key saved to $PRIVATE_KEY_FILE"
echo "Public key saved to $PUBLIC_KEY_FILE"

# Print each key as a single line with literal \n separators,
# a form suitable for embedding in environment variables
awk '{printf "%s\\n", $0}' "./$PRIVATE_KEY_FILE" | sed 's/\\n$//'
echo -e "\n"
awk '{printf "%s\\n", $0}' "./$PUBLIC_KEY_FILE" | sed 's/\\n$//'

# Delete the temporary key files
rm -f "$PRIVATE_KEY_FILE" "$PUBLIC_KEY_FILE"
echo -e "\n\nTemporary key files deleted."
  2. Add the keys to customer-values.yaml:
...
secrets:
  ...
  jwtUtilityRsaKeys:
    DATACHAT_JWT_RSA_PRIVATE_KEY: 'your-private-key'
    DATACHAT_JWT_RSA_PUBLIC_KEY: 'your-public-key'
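
Before pasting a key into customer-values.yaml, it can help to confirm that it is in the expected form. The sketch below uses three hypothetical helper functions (the names are illustrative, not part of DataChat): one checks for the unencrypted PKCS#8 PEM header, one collapses a PEM file into a single line with literal \n separators in the same way as the script in step 1, and one expands that line back into multi-line PEM text for comparison.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting key files; names are illustrative only.

# Succeeds if the file starts with the unencrypted PKCS#8 private key header.
is_pkcs8_private_pem() {
  [ "$(head -n 1 "$1")" = "-----BEGIN PRIVATE KEY-----" ]
}

# Collapse a PEM file into one line, joining lines with a literal \n.
pem_to_single_line() {
  awk '{printf "%s\\n", $0}' "$1" | sed 's/\\n$//'
}

# Expand a single-line value (literal \n separators) back into PEM text.
single_line_to_pem() {
  printf '%b\n' "$1"
}

# Example usage (assumes private_key.pem exists in the current directory):
#   is_pkcs8_private_pem private_key.pem && pem_to_single_line private_key.pem
```

Expanding the single-line value and comparing it with the original file is a quick way to confirm that nothing was lost when copying the key.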

HorizontalPodAutoscaler

DataChat makes use of custom metrics to scale its services. Because of this, it requires the prometheus-adapter with a specific custom configuration.

Installation

  1. Add the prometheus-community helm repo:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  2. Create a values file with the following configuration:
image:
  tag: v0.10.0

logLevel: 1

prometheus:
  port: 9090

rules:
  default: false
  custom:
    - seriesQuery: 'active_tasks'
      name:
        as: 'active_tasks'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
    - seriesQuery: 'pipeliner_web_queue_size'
      name:
        as: 'pipeliner_web_queue'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      metricsQuery: 'avg_over_time(<<.Series>>{<<.LabelMatchers>>}[30s])'
    - seriesQuery: 'rabbitmq_queue_messages{queue="datachat.task.incoming.regular"}'
      name:
        as: 'queue_depth_of_datachat_task_incoming_regular_queue'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>, queue="datachat.task.incoming.regular"}) by (<<.GroupBy>>)'
    - seriesQuery: 'active_encoderlm_service_requests{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        as: 'percent_encoderlm_service_procs_busy'
      metricsQuery: 'avg(sum(<<.Series>>{<<.LabelMatchers>>}) by (pid, pod, namespace)) by (<<.GroupBy>>)'
    - seriesQuery: 'active_dc_nl2code_service_requests{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        as: 'percent_dc_nl2code_procs_busy'
      metricsQuery: 'avg(sum(<<.Series>>{<<.LabelMatchers>>}) by (pid, pod, namespace)) by (<<.GroupBy>>) / 5'
    - seriesQuery: 'active_dc_app_service_requests{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        as: 'percent_dc_app_service_procs_busy'
      metricsQuery: 'avg(sum(<<.Series>>{<<.LabelMatchers>>}) by (pid, pod, namespace)) by (<<.GroupBy>>)'
    - seriesQuery: 'rabbitmq_queue_messages{queue="ml_task_executor_rpc"}'
      name:
        as: 'queue_depth_of_datachat_ml_task_executor_rpc_queue'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>, queue="ml_task_executor_rpc"}) by (<<.GroupBy>>)'
    - seriesQuery: 'active_vector_mgmt_service_requests{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        as: 'percent_vector_mgmt_service_procs_busy'
      metricsQuery: 'avg(sum(<<.Series>>{<<.LabelMatchers>>}) by (pid, pod, namespace)) by (<<.GroupBy>>) / 5'
  external:
    - seriesQuery: 'pipeliner_sql_queue_size{job_queue="sqlsingle"}'
      name:
        as: 'pipeliner_sql_single_task_queue'
      resources:
        overrides:
          kubernetes_namespace: { resource: 'namespace' }
      metricsQuery: 'avg(avg_over_time(pipeliner_sql_queue_size{<<.LabelMatchers>>,job_queue="sqlsingle"}[30s]))'
    - seriesQuery: 'pipeliner_sql_queue_size{job_queue="sqlmulti"}'
      name:
        as: 'pipeliner_sql_multi_task_queue'
      resources:
        overrides:
          kubernetes_namespace: { resource: 'namespace' }
      metricsQuery: 'avg(avg_over_time(pipeliner_sql_queue_size{<<.LabelMatchers>>,job_queue="sqlmulti"}[30s]))'
  3. Install the prometheus-adapter Helm chart:
helm repo update && \
helm install prometheus-adapter prometheus-community/prometheus-adapter -n prometheus --create-namespace -f <path-to-your-values-file>
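  4. Upgrade your datachat chart and enable HPAs, replacing <release-name> and <chart> with the release name and chart reference used for your installation:
helm upgrade <release-name> <chart> -n datachat --reuse-values --set global.autoscaling.hpa.enabled=true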
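
Once the adapter is installed, one way to check that it serves the configured rules is to query the Kubernetes custom and external metrics APIs directly. The sketch below only builds the request paths; the two function names are hypothetical, the metric names come from the values file above, and the datachat namespace is assumed from the upgrade command. The kubectl calls are left as comments because they need access to a live cluster.

```shell
#!/bin/bash
# Hypothetical helpers that build metrics API request paths.

# Path for a pod-scoped custom metric in a namespace.
custom_pod_metric_path() {
  printf '/apis/custom.metrics.k8s.io/v1beta1/namespaces/%s/pods/*/%s\n' "$1" "$2"
}

# Path for an external metric in a namespace.
external_metric_path() {
  printf '/apis/external.metrics.k8s.io/v1beta1/namespaces/%s/%s\n' "$1" "$2"
}

# Example usage against a live cluster (not run here):
#   kubectl get --raw "$(custom_pod_metric_path datachat active_tasks)"
#   kubectl get --raw "$(external_metric_path datachat pipeliner_sql_single_task_queue)"
#   kubectl get hpa -n datachat
```

If a query returns an empty item list, the underlying Prometheus series may not exist yet, for example because the service has not handled any requests.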

NetworkPolicies

Some Container Network Interface (CNI) plugins do not support or need NetworkPolicies, so these resources are optional. To enable NetworkPolicy resources, set global.networkPolicies.enabled: true in the chart's values.yaml file.

Note: Services exposed via the Ingress resource will block all ingress traffic except from the ingress controller's namespace. Currently, this applies only to the Ingress-NGINX Controller in the ingress-nginx namespace. If you use another ingress controller, you must allow ingress traffic from its namespace for DataChat services exposed via the Ingress object.

To accommodate different organizational needs, the chart allows customization of the NetworkPolicy resources' .spec section. For example:

  1. DataChat currently provides only ingress rules. If you need egress rules:
# values.yaml
global:
  networkPolicies:
    enabled: true

# For each service's NetworkPolicy, add egress rules. For example:
ask-cache:
  networkPolicy:
    egress:
      # <your-egress-rules-here>
  2. If you use custom NetworkPolicy resources like Calico's GlobalNetworkPolicy:
# values.yaml
global:
  networkPolicies:
    enabled: true
    apiVersion: "projectcalico.org/v3"
    kind: "GlobalNetworkPolicy"

# For each service's NetworkPolicy, you will then have to override the `.spec` section and use Calico's custom NetworkPolicy syntax. For example:
ask-cache:
  networkPolicy:
    ingress:
    specOverride:
      selector: app.kubernetes.io/name == '<datachat-service>'
      types:
        - Ingress
      ingress:
        - action: Allow
          protocol: TCP
          source:
            selector: app.kubernetes.io/name == '<datachat-service>'
          destination:
            ports:
              - 6379
  3. If you are using an ingress controller other than the ingress-nginx controller, or if you are using the ingress-nginx controller in a namespace other than ingress-nginx, you need to update the ingress policy to allow traffic from your ingress controller to the exposed DataChat services. Replace $INGRESS_CONTROLLER_NAMESPACE in the policy below with the namespace where your ingress controller is running:
# values.yaml
global:
  networkPolicies:
    ingressControllerNetworkPolicy:
      - namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: $INGRESS_CONTROLLER_NAMESPACE
        podSelector:
          matchLabels:
            app.kubernetes.io/name: $INGRESS_CONTROLLER_NAMESPACE
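
As a minimal sketch of the substitution described in the last item, the hypothetical function below prints the values snippet with $INGRESS_CONTROLLER_NAMESPACE replaced by a namespace you pass in. It mirrors the snippet above, including its use of the same placeholder for both labels, and it assumes your namespace carries the kubernetes.io/metadata.name label. The namespace name traefik is only an example.

```shell
#!/bin/bash
# Hypothetical helper: render the ingress controller policy values for a namespace.
# Usage: render_ingress_policy_values <ingress-controller-namespace>
render_ingress_policy_values() {
  ns=$1
  cat <<EOF
global:
  networkPolicies:
    ingressControllerNetworkPolicy:
      - namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: $ns
        podSelector:
          matchLabels:
            app.kubernetes.io/name: $ns
EOF
}

# Example: write the snippet for a controller running in the "traefik" namespace.
render_ingress_policy_values traefik

# To confirm the namespace label on a live cluster (not run here):
#   kubectl get namespace traefik --show-labels
```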

LLM Configuration

DataChat uses Large Language Models (LLMs) for features such as Data Assistant code generation, dynamic recommendations, and table explanations. Each LLM provider is configured in two parts, an API configuration and an API secret:

  • API configuration. Includes parameters such as the model endpoint, model type, and endpoint API version.
  • API secret. Consists of the credentials needed to access the configured LLM endpoint.

To configure LLM usage in a DataChat deployment, specify the required information for each LLM provider, along with the LLM to use for each LLM task.

DataChat supports and uses the following LLM providers:

  • OpenAI
  • Azure OpenAI
  • Google VertexAI

The LLMs supported from each provider vary by LLM task:

  • Agent Supervisor: (OpenAI GPT-4o, Azure OpenAI GPT-4o, Google VertexAI Claude 3.5 Sonnet)
  • Code Generation: (OpenAI GPT-4, OpenAI GPT-4o, Azure OpenAI GPT-4, Azure OpenAI GPT-4o, Google VertexAI Claude 3.5 Sonnet)
  • Question Recommendations: (OpenAI GPT-4, Azure OpenAI GPT-4)
  • Objective Recommendations: (Google VertexAI Gemini 1.5 Pro)
  • Table Explanations: (OpenAI GPT-4, Azure OpenAI GPT-4)
  • Tag Predictions: (Google VertexAI Claude 3.5 Sonnet)
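
Because a task only works with the provider and model combinations listed for it, it can be useful to check a planned configuration before deploying. The function below is a hypothetical pre-flight check, not part of DataChat. It covers only the four tasks whose llm_type identifiers are spelled out in the support matrices later in this section (agent_supervisor, tag_prediction, codegen, and question_recommendations); other tasks are reported as unsupported simply because their identifiers are not encoded here.

```shell
#!/bin/bash
# Hypothetical pre-flight check for a task's llm_provider and llm_type.
# Usage: is_supported_llm <task> <llm_provider> <llm_type>
is_supported_llm() {
  case "$1:$2:$3" in
    "agent_supervisor:openai:gpt-4o") return 0 ;;
    "agent_supervisor:azure:gpt-4o") return 0 ;;
    "agent_supervisor:vertexai:claude-3-5-sonnet-v2@20241022") return 0 ;;
    "tag_prediction:vertexai:claude-3-5-sonnet-v2@20241022") return 0 ;;
    "codegen:openai:gpt-4o") return 0 ;;
    "codegen:azure:gpt-4o") return 0 ;;
    "codegen:openai:gpt-4") return 0 ;;
    "codegen:azure:gpt-4") return 0 ;;
    "codegen:vertexai:claude-3-5-sonnet-v2@20241022") return 0 ;;
    "question_recommendations:openai:gpt-4") return 0 ;;
    "question_recommendations:azure:gpt-4") return 0 ;;
    *) return 1 ;;  # unknown task or unsupported combination
  esac
}

if is_supported_llm codegen azure gpt-4o; then
  echo "codegen: azure/gpt-4o is supported"
fi
```

The strings are compared exactly, so a value such as GPT-4o or gpt4o is rejected.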

Note: The API secret is part of our secret management. To rotate the LLM API secret, in addition to updating the secret field (e.g. requiredConfigs.llm.providers.openai.api_secret.api_key), you must also set secrets.nl2codeSecret.rotate to true.
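
As an illustration of the rotation described in the note, the sketch below writes a small values overlay that sets a new OpenAI API key together with secrets.nl2codeSecret.rotate. The function name, the file name, and the <release-name> and <chart> placeholders in the commented Helm command are assumptions; adapt the provider block if you rotate an Azure or VertexAI secret instead.

```shell
#!/bin/bash
# Hypothetical helper: write a values overlay for rotating the OpenAI API key.
# Usage: write_rotation_values <output-file> <new-api-key>
write_rotation_values() {
  cat > "$1" <<EOF
requiredConfigs:
  llm:
    providers:
      openai:
        api_secret:
          api_key: "$2"
secrets:
  nl2codeSecret:
    rotate: true
EOF
}

# Example usage (not run here); remove the overlay file afterwards
# because it contains the secret in plain text:
#   write_rotation_values rotate-values.yaml "$NEW_OPENAI_API_KEY"
#   helm upgrade <release-name> <chart> -n datachat --reuse-values -f rotate-values.yaml
#   rm -f rotate-values.yaml
```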

OpenAI LLMs

To use OpenAI LLMs, customers must create their own API keys on the OpenAI website.

requiredConfigs:
  llm:
    providers:
      openai:
        api_config:
          api_url: "https://api.openai.com/v1"
          api_type: "open_ai"
          api_version: "2024-01-10"
          api_organization: "" # optional field to set.

        api_secret:
          api_key: "<API key created by your organization>"

Azure LLMs

Azure hosts OpenAI LLMs as part of its cloud services. Customers need to set up OpenAI resources on Azure to manage their LLM deployments.

requiredConfigs:
  llm:
    providers:
      azure:
        api_config:
          api_url: "<your Azure endpoint>"
          api_type: "azure" # if using Azure Application Gateway, api_type should be "azure_ad".
          api_version: "2024-02-15-preview"
          api_organization: "<your Azure organization name>"

        api_secret:
          ## Azure credentials after Azure OpenAI resource is created
          azure_api_key: ""
          ### if using Azure API Gateway feature
          azure_client_id: ""
          azure_client_secret: ""
          azure_tenant_id: ""

The api_config is similar to that of the openai block. For api_secret, the azure_api_key will be available once the Azure OpenAI Resource is ready.

If you are using an API Gateway to route LLM requests (where api_url will be the endpoint for the API Gateway), you will also need azure_client_id, azure_client_secret, and azure_tenant_id to authenticate with the Gateway. For details, see this document.

Note: If you are using an application gateway, set api_type to azure_ad.

Google VertexAI LLMs

To use Google VertexAI LLMs, customers need to set up a GCP project and deploy the supported LLMs.

requiredConfigs:
  llm:
    providers:
      vertexai:
        api_config:
          api_url: "<your VertexAI endpoint>"
          project: "<your VertexAI project>"
          location: "<your VertexAI project location>"

        api_secret:
          # VertexAI project credentials
          vertexai_api_credentials: "<your VertexAI project credentials>"

  • api_config:
    • api_url. The base URL to which LLM generation requests are sent. The VertexAI client will append the appropriate path to this URL.
    • project. The GCP project name.
    • location. The region in which the LLM is deployed.
  • api_secret:
    • vertexai_api_credentials. The VertexAI credentials for the GCP project.
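
To illustrate how the three api_config values relate to one another, here is a sketch for a hypothetical project named my-gcp-project deployed in us-central1. The URL layout follows the full example at the end of this section; the project name is an assumption made for illustration, so substitute your own values.

```yaml
requiredConfigs:
  llm:
    providers:
      vertexai:
        api_config:
          # Base URL only: regional host plus the project path.
          # "my-gcp-project" is a hypothetical project name.
          api_url: "https://us-central1-aiplatform.googleapis.com/v1/projects/my-gcp-project/"
          project: "my-gcp-project"
          location: "us-central1"

        api_secret:
          vertexai_api_credentials: "<your VertexAI project credentials>"
```

Note that the region appears both in the host name of api_url and in location, and the project appears both in the URL path and in project, so the values should be kept in sync.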

LLM Task Configuration

Before assigning an LLM to a task, make sure that DataChat supports that LLM provider and model for the task. Refer to the support matrices below and make sure that each value you provide is spelled exactly as shown.

  • llm_task:
    • llm_provider. The name of the LLM provider (must be one of azure, openai, or vertexai).
    • llm_type. The LLM model type (choose from the ones that are supported for the task).
    • llm_name. The name under which the LLM is deployed.

requiredConfigs:
  llm:
    tasks:
      llm_task:
        llm_provider: ""
        llm_type: ""
        llm_name: ""


# Support Matrix for Agent Supervisor Task
# |--------------|-------------------------------|
# | LLM Provider | LLM Type                      |
# |--------------|-------------------------------|
# | openai       | gpt-4o                        |
# | azure        | gpt-4o                        |
# | vertexai     | claude-3-5-sonnet-v2@20241022 |
# |--------------|-------------------------------|


# Support Matrix for Tag Prediction Task
# |--------------|-------------------------------|
# | LLM Provider | LLM Type                      |
# |--------------|-------------------------------|
# | vertexai     | claude-3-5-sonnet-v2@20241022 |
# |--------------|-------------------------------|


# Support Matrix for Codegen Task
# |--------------|-------------------------------|
# | LLM Provider | LLM Type                      |
# |--------------|-------------------------------|
# | openai       | gpt-4o                        |
# | azure        | gpt-4o                        |
# | openai       | gpt-4                         |
# | azure        | gpt-4                         |
# | vertexai     | claude-3-5-sonnet-v2@20241022 |
# |--------------|-------------------------------|


# Support Matrix for Question Recommendations Task
# |--------------|----------|
# | LLM Provider | LLM Type |
# |--------------|----------|
# | openai       | gpt-4    |
# | azure        | gpt-4    |
# |--------------|----------|


# Support Matrix for Objective Recommendations Task
# |--------------|--------------------|
# | LLM Provider | LLM Type           |
# |--------------|--------------------|
# | vertexai     | gemini-1.5-pro-001 |
# |--------------|--------------------|


# Support Matrix for Table Explanations Task
# |--------------|----------|
# | LLM Provider | LLM Type |
# |--------------|----------|
# | openai       | gpt-4    |
# | azure        | gpt-4    |
# |--------------|----------|
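
Reading the matrices above, the openai provider is listed for the agent supervisor, codegen, question recommendations, and table explanations tasks, while tag prediction and objective recommendations list only vertexai. A sketch of the tasks block for the four openai-supported tasks might look like this. The task keys are taken from the full example in the next section. The llm_name values assume the model is addressed by its type name, which that example does for gpt-4 but which is an assumption for gpt-4o. Whether the remaining tasks can be left unconfigured is not stated in this guide.

```yaml
requiredConfigs:
  llm:
    tasks:
      agent_supervisor:
        llm_provider: "openai"
        llm_type: "gpt-4o"
        llm_name: "gpt-4o"   # assumption: same as llm_type

      codegen:
        llm_provider: "openai"
        llm_type: "gpt-4o"
        llm_name: "gpt-4o"   # assumption: same as llm_type

      question_recommendations:
        llm_provider: "openai"
        llm_type: "gpt-4"
        llm_name: "gpt-4"

      table_explanations:
        llm_provider: "openai"
        llm_type: "gpt-4"
        llm_name: "gpt-4"
```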

LLM Providers and Tasks Configuration Example

requiredConfigs:
  llm:
    providers:
      openai:
        api_config:
          api_url: "https://api.openai.com/v1"
          api_type: "open_ai"
          api_version: "2024-01-10"
          api_organization: "datachat" # optional field to set.

        api_secret:
          api_key: ""

      azure:
        api_config:
          api_url: "https://datachatazure.openai.azure.com/"
          api_type: "azure" # if using Azure Application Gateway, api_type should be "azure_ad".
          api_version: "2024-02-15-preview"
          api_organization: "datachat"

        api_secret:
          # Azure credentials after Azure OpenAI resource is created
          azure_api_key: ""
          # if using Azure API Gateway feature
          azure_client_id: ""
          azure_client_secret: ""
          azure_tenant_id: ""

      vertexai:
        api_config:
          api_url: "https://us-central1-aiplatform.googleapis.com/v1/projects/datachat-gcp/"
          project: "datachat-gcp"
          location: "us-central1"

        api_secret:
          # VertexAI project credentials
          vertexai_api_credentials: ""

    tasks:
      agent_supervisor:
        llm_provider: "azure"
        llm_type: "gpt-4o"
        llm_name: "ask-gpt-4-o"

      codegen:
        llm_provider: "vertexai"
        llm_type: "claude-3-5-sonnet-v2@20241022"
        llm_name: "claude-3-5-sonnet-v2@20241022"

      tag_prediction:
        llm_provider: "vertexai"
        llm_type: "claude-3-5-sonnet-v2@20241022"
        llm_name: "claude-3-5-sonnet-v2@20241022"

      backup_codegen:
        llm_provider: "vertexai"
        llm_type: "gemini-1.5-pro-001"
        llm_name: "gemini-1.5-pro-001"

      question_recommendations:
        llm_provider: "openai"
        llm_type: "gpt-4"
        llm_name: "gpt-4"

      objective_recommendations:
        llm_provider: "vertexai"
        llm_type: "gemini-1.5-pro-001"
        llm_name: "gemini-1.5-pro-001"

      table_explanations:
        llm_provider: "azure"
        llm_type: "gpt-4"
        llm_name: "ask-gpt-4"