I’m using Hindi text in a ConfigMap. Hindi is fully supported by UTF-8, but after mounting the ConfigMap in a container I see random characters mixed into the text, although printing it works fine. What could cause these random characters in non-English text if UTF-8 is supported?
Cluster information:
Kubernetes version: v1.22.2
Cloud being used: MetalLB
Host OS: ubuntu20
After mounting configmap - प�~Mरिय �~W�~Mराह�~U, �~Fपन�~G �~Eपन
In configmap - प्रिय ग्राहक, आपने अपने
When I print the contents of the mounted file, the format is correct, but some characters differ from the original, for example आपने appears as अपने, along with many others. I’m using Go to open the mounted JSON file.
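One way to pin down exactly which characters differ is to compare the two strings code point by code point rather than by eye. The following is a minimal Go sketch, not the program used in this thread; the helper name firstDiff is hypothetical, and the two words are the ones quoted above, written as escapes so the source does not depend on the editor's encoding.

```go
package main

import "fmt"

// firstDiff returns the index (counted in runes, not bytes) of the first
// code point at which a and b differ, or -1 if the strings are identical.
// Hypothetical helper, for illustration only.
func firstDiff(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	for i := 0; i < len(ra) && i < len(rb); i++ {
		if ra[i] != rb[i] {
			return i
		}
	}
	if len(ra) != len(rb) {
		// One string is a prefix of the other: they differ where the shorter ends.
		if len(ra) < len(rb) {
			return len(ra)
		}
		return len(rb)
	}
	return -1
}

// printRunes prints every code point of s in U+XXXX form, which stays
// readable even on a terminal that cannot render Devanagari.
func printRunes(label, s string) {
	fmt.Printf("%s:", label)
	for _, r := range s {
		fmt.Printf(" %U", r)
	}
	fmt.Println()
}

func main() {
	expected := "\u0906\u092a\u0928\u0947" // आपने
	observed := "\u0905\u092a\u0928\u0947" // अपने

	printRunes("expected", expected)
	printRunes("observed", observed)
	fmt.Println("first differing rune index:", firstDiff(expected, observed))
}
```

Running the same comparison on the string stored in the ConfigMap and the string read from the mounted file would show whether the two really differ, or whether they are separate words that both appear in the original text.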
Apologies for not describing the problem properly. Kindly check, and if this is a genuine issue I will open it on GitHub.
Sorry, I still need info. The ConfigMap API is described as:
Data map[string]string
BinaryData map[string][]byte
A string in Go is a series of bytes. I think we assume that any string-encoded field holds valid Unicode data, but we don’t enforce that, as far as I can tell. When we render such a thing to JSON, it should be encoded so as to produce valid JSON.
A []byte is an explicit denotation that we will not try to interpret the values at all, and when we encode it as JSON, it is not assumed to be a valid string.
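To illustrate the point about strings being bytes: a Go string can hold any byte sequence, and checking whether it is valid UTF-8 is a separate step. Below is a minimal sketch using only the standard unicode/utf8 package. The helper name describe and the deliberately truncated sample are made up for illustration, and this is not meant to show how Kubernetes itself treats these fields.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports the length of s in bytes, its length in runes, and
// whether it is well-formed UTF-8. Hypothetical helper for illustration.
func describe(s string) (nBytes, nRunes int, valid bool) {
	return len(s), utf8.RuneCountInString(s), utf8.ValidString(s)
}

func main() {
	// The word प्रिय from the thread, written as escapes: five code points,
	// each of which takes three bytes in UTF-8.
	good := "\u092a\u094d\u0930\u093f\u092f"

	// Assumed damage for the demo: slicing by bytes cuts the second
	// code point in half, which leaves an invalid sequence behind.
	bad := good[:4]

	for _, s := range []string{good, bad} {
		b, r, ok := describe(s)
		fmt.Printf("bytes=%d runes=%d validUTF8=%t\n", b, r, ok)
	}
}
```

The compiler accepts both values as strings; only the explicit check tells them apart, which is why the way the data is viewed matters so much.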
That’s what the object ACTUALLY HOLDS. How you “see” that data matters because different tools have different levels of support for unicode and different error handling.
Can you show me how you are mounting the configmap, just so I can make sure I don’t try something different from what you are doing?
When you say “In configmap”, that does not tell me how you SEE the data. Is it kubectl get -o json or -o yaml or kubectl edit?
This is FAR from my area of expertise, but it seems like something that we SHOULD get right but also that could easily slip through the cracks of a primarily-in-English project. Also, as a non-speaker, it’s hard for me to see when we get it wrong somehow.
This is the ConfigMap API I’m using. I’m checking the ConfigMap with kubectl edit and with kubectl get -o json; the data is the same in both.
deployment.yaml
My ‘cat’ is from inside the container. I will try it with an editor when I get back to my desk, but I think this shows that there is not a systemic issue in how we handle the contents?
But on other Ubuntu machines no such extra characters show up in the text. Why does it produce those random characters inside the container?
It could be the configuration inside - does it have all the right settings and locale support? As an English speaker and ASCII typer, I have never really had to configure a machine for non-ASCII, so I don’t know.
Again, if you hexdump the file, that will tell you for certain if the error is in the data or in the rendering.
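Since the mounted file is already being opened from Go, the same check can be done there. The following is a rough sketch, assuming the file path is passed as the first command-line argument; the helper name inspect is hypothetical. It prints the raw bytes as hex and reports whether they form valid UTF-8. In tools such as xxd, the hex column is the part to read, not the ASCII column on the right.

```go
package main

import (
	"fmt"
	"os"
	"unicode/utf8"
)

// inspect returns the bytes of data as space-separated hex pairs and
// reports whether data is well-formed UTF-8. Hypothetical helper.
func inspect(data []byte) (string, bool) {
	return fmt.Sprintf("% x", data), utf8.Valid(data)
}

func main() {
	// Built-in sample used when no path is given: the character आ (U+0906),
	// which is encoded in UTF-8 as the three bytes e0 a4 86.
	data := []byte("\u0906")

	if len(os.Args) > 1 {
		// Assumed usage: the first argument is the path of the mounted file.
		b, err := os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		data = b
	}

	hexStr, ok := inspect(data)
	fmt.Println(hexStr)
	fmt.Println("valid UTF-8:", ok)
}
```

If the hex output matches the expected UTF-8 bytes and the validity check passes, the data in the file is intact and the stray characters come from how the terminal or viewer renders it.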
Yes, it has locale support. I’m doing the same thing I did on the other machines, only this time inside the container. With hexdump or xxd it shows dots, since these are non-ASCII (Devanagari) characters. Is there something else I can share with you that would help in solving this issue?