We are trying to run an ASR (automatic speech recognition) server on Kubernetes with scaling, but we are not able to achieve the result we were hoping for. Can anyone please help us with this?

Cluster information:

Kubernetes version: 1.24.1
Cloud being used: bare-metal
Installation method:
Host OS: Ubuntu
CNI and version:
CRI and version:

=================Deployment==================
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sbt-hindi
  labels:
    app: web
spec:
  selector:
    matchLabels:
      builder: PwitDevOps-hindi
  revisionHistoryLimit: 2
  replicas: 7
  progressDeadlineSeconds: 600
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: web
        builder: PwitDevOps-hindi
    spec:
      terminationGracePeriodSeconds: 30
      volumes:
        - name: models
          hostPath:
            path: /home/sbt/models
            type: DirectoryOrCreate
      # imagePullSecrets is a pod-level field, not a container field
      imagePullSecrets:
        - name: regsecret
      containers:
        - name: sbt-asr
          image: 'superbotdevops/sbt_asr:latest'
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 2700
              protocol: TCP
          env:
            - name: VOSK_MODEL_PATH
              value: /models/hi-IN/higher_education
          volumeMounts:
            - name: models
              mountPath: /models
              subPath: ''
          resources:
            requests:
              memory: 6Gi
              cpu: 700m
            limits:
              memory: 8Gi
              cpu: 800m
          # Probes are container-level fields. They check port 8080,
          # while the container above declares port 2700.
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - web
                topologyKey: kubernetes.io/hostname
=================Ingress==================
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tornado-socket
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/server-snippets: |
      location / {
        proxy_set_header Upgrade $http_upgrade;
        proxy_http_version 1.1;
        proxy_set_header X-Forwarded-Host $http_host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header Connection "upgrade";
        proxy_cache_bypass $http_upgrade;
      }
spec:
  rules:
    - host: tornado-ws.example.com
      http:
        paths:
          # path and pathType are required in networking.k8s.io/v1
          - path: /
            pathType: Prefix
            backend:
              service:
                name: tornado-socket
                port:
                  number: 8000