How to escape Unicode with JSON.toString()?

Alexander Farber
Hello fellow Jetty users and developers,

Is it possible to escape non-ASCII characters (for example as \u####
sequences) when using the

org.eclipse.jetty.util.ajax.JSON.toString() method?

I understand that it might be an internal library, but until now it
works well for me in a servlet which among other tasks sends push
notifications via FCM (Firebase Cloud Messaging) and ADM (Amazon
Device Messaging).

However my problem with the latter is that ADM does not accept any
UTF-8 chars (in my case Cyrillic) and reproducibly fails with the
cryptic error message:

<SerializationException>
<Message>Could not parse XML</Message>
</SerializationException>

java.lang.IllegalStateException:
unknown char '<'(60) in |||<SerializationException>|  <Message>Could
not parse XML</Message>|</SerializationException>||

So is there maybe some possibility in Jetty 9.4.8.v20171121 to encode the chars?

Here is my Java code:

    // this string is POSTed to ADM server
    public String toAdmBody() {
        Map<String, Object> root  = new HashMap<>();
        Map<String, String> data  = new HashMap<>();
        root.put(KEY_DATA, data);
        data.put(KEY_BODY, mBody);
        // ADM does not accept integers
        data.put(KEY_GID, String.valueOf(mGid));
        // TODO encode utf8 chars
        return JSON.toString(root);
    }

Thank you
Alex
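
One way to get the \u#### form without changing libraries is to post-process the string that JSON.toString() returns. The sketch below is only an illustration and is not part of jetty-util-ajax: the class JsonAsciiEscaper and the method escapeNonAscii are hypothetical names. It relies on the fact that in valid JSON text, characters above U+007F can only occur inside string values, so replacing each such UTF-16 code unit with an escape leaves the structure untouched. Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair).

```java
// Hypothetical helper, not part of Jetty.
final class JsonAsciiEscaper {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private JsonAsciiEscaper() {
    }

    // Returns the given JSON text with every char above 0x7F written as \uXXXX.
    static String escapeNonAscii(String json) {
        StringBuilder sb = new StringBuilder(json.length() + 16);
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c < 0x80) {
                // Plain ASCII is copied through unchanged.
                sb.append(c);
            } else {
                // One escape per UTF-16 code unit, four hex digits each.
                sb.append('\\').append('u')
                  .append(HEX[(c >> 12) & 0xF])
                  .append(HEX[(c >> 8) & 0xF])
                  .append(HEX[(c >> 4) & 0xF])
                  .append(HEX[c & 0xF]);
            }
        }
        return sb.toString();
    }
}
```

In toAdmBody() above, the last line would then read: return JsonAsciiEscaper.escapeNonAscii(JSON.toString(root));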
_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users

Re: How to escape Unicode with JSON.toString()?

Joakim Erdfelt-8
org.eclipse.jetty.util.ajax.JSON.toString()  produces a JSON formatted string.

The error you are getting back is XML?
XML encoding is different than JSON encoding.

org.eclipse.jetty.util.ajax.JSON.toString() tries to follow the guidance at https://tools.ietf.org/html/rfc8259#section-8

Perhaps you have some oddball charset declaration getting in your way.
I don't know how ADM works, but if you are submitting the JSON to them in an HttpClient, make sure your `Content-Type` request header says something like "application/json; charset=utf-8"
If ADM is issuing requests to your server, then make sure your `Content-Type` response header has "application/json; charset=utf-8"


Joakim Erdfelt / [hidden email]
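
The charset advice above can be sketched as follows. This is a minimal illustration that assumes the JDK's HttpURLConnection rather than any particular HttpClient; JsonPoster, toUtf8 and post are hypothetical names, and the endpoint URL is whatever the caller supplies. The point is that the bytes on the wire are produced with the same charset that the Content-Type header declares.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jetty or ADM.
final class JsonPoster {
    private JsonPoster() {
    }

    // Encode explicitly as UTF-8 instead of relying on the platform default.
    static byte[] toUtf8(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // POSTs the JSON text and returns the HTTP status code.
    static int post(URL endpoint, String json) throws IOException {
        byte[] payload = toUtf8(json);
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // The declared charset matches the encoding used for the payload.
        conn.setRequestProperty("Content-Type", "application/json; charset=utf-8");
        conn.setFixedLengthStreamingMode(payload.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload);
        }
        return conn.getResponseCode();
    }
}
```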


Re: How to escape Unicode with JSON.toString()?

Joakim Erdfelt-8
So while Section 7 describes the "\u####" notation as optional behavior, it is discouraged in the same spec (Section 8).

It's obvious that section 7 is old, as it limits the "\u" encoding to 3 bytes, even though the UTF-8 / Unicode spec has passed that 3 byte upper limit a while ago and is now at 4 bytes.

The "\u####" behavior would also have no correlation to encoding for XML as indicated by your initial question.

Guess we need more details on what is happening, and what ADM expects, in order to help.


Joakim Erdfelt / [hidden email]
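
For reference when reading Section 7: RFC 8259 represents a character outside the Basic Multilingual Plane as a 12-character sequence, that is, two \u#### escapes holding the UTF-16 surrogate pair, and gives U+1D11E (G clef) as "\ud834\udd1e". The sketch below reproduces that with plain JDK calls; SurrogateEscapeDemo and escapeCodePoint are hypothetical names used only for illustration.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Jetty.
final class SurrogateEscapeDemo {
    private SurrogateEscapeDemo() {
    }

    // Writes one code point as one or two \uXXXX escapes (UTF-16 code units).
    static String escapeCodePoint(int codePoint) {
        StringBuilder sb = new StringBuilder();
        // Character.toChars yields one char for BMP code points and a
        // high/low surrogate pair for supplementary ones.
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format(Locale.ROOT, "\\u%04x", (int) unit));
        }
        return sb.toString();
    }
}
```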


Re: How to escape Unicode with JSON.toString()?

Alexander Farber
Hi Joakim,

no, I am already sending "application/json; charset=utf-8" - so that
is not a problem.

Also, the ADM error mentioning XML unfortunately means nothing: they
mention XML in all of the error messages I have seen so far while
developing (even though their API expects JSON to be POSTed).

I understand that encoding UTF-8 chars to \u#### is optional.

But is that possible with jetty-util-ajax or should I switch to some
other lib for JSON encoding?

When I look at
https://github.com/eclipse/jetty.project/blob/jetty-9.4.x/jetty-util-ajax/src/main/java/org/eclipse/jetty/util/ajax/JSON.java
I see that it can decode \u####, but can jetty-util-ajax do the reverse too?

I have also posted my question at
https://stackoverflow.com/questions/49538806/how-to-escape-unicode-with-json-tostring-method-in-jetty-util-ajax

Thank you
Alex

Re: How to escape Unicode with JSON.toString()?

Greg Wilkins
Alex,

Which characters do you want us to use \u#### encoding for? Everything above US_ASCII? Above ISO_8859? Or just chars that would encode to 3-byte UTF-8?

Maybe we could... we'll discuss.
The other option is to do that encoding when you convert the json string into bytes to send?

cheers
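
The second option, escaping at the point where the JSON string is turned into output, could look roughly like the sketch below. It is an assumption-laden illustration, not Jetty code: EscapingWriter, its maxLiteral parameter and the escape helper are hypothetical names. The threshold makes the open question explicit: 0x7F escapes everything above US-ASCII, 0xFF everything above ISO-8859-1.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper, not part of Jetty: escapes chars above a threshold
// as \uXXXX while they pass through to the wrapped Writer.
final class EscapingWriter extends FilterWriter {
    private final int maxLiteral;

    EscapingWriter(Writer out, int maxLiteral) {
        super(out);
        this.maxLiteral = maxLiteral;
    }

    @Override
    public void write(int c) throws IOException {
        char ch = (char) c;
        if (ch <= maxLiteral) {
            out.write(ch);
            return;
        }
        // Left-pad the hex value to four digits.
        String hex = Integer.toHexString(ch);
        out.write("\\u");
        for (int i = hex.length(); i < 4; i++) {
            out.write('0');
        }
        out.write(hex);
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(buf[i]);
        }
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(s.charAt(i));
        }
    }

    // Convenience wrapper for trying the writer out on an in-memory string.
    static String escape(String json, int maxLiteral) {
        StringWriter sink = new StringWriter();
        try (Writer w = new EscapingWriter(sink, maxLiteral)) {
            w.write(json);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

When sending, the writer would wrap something like new OutputStreamWriter(outputStream, StandardCharsets.UTF_8), so the JSON string itself stays untouched and only the transmitted form is escaped.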




Re: How to escape Unicode with JSON.toString()?

Bill Ross-2

The only way to escape Unicode is the grave.

Please excuse the faulty calendar.

Bill



Re: How to escape Unicode with JSON.toString()?

Alexander Farber
In reply to this post by Greg Wilkins
Hello Greg,

On Thu, Mar 29, 2018 at 9:47 AM, Greg Wilkins <[hidden email]> wrote:

> Which characters do you want us to use \u#### encoding for? Everything
> above US_ASCII? Above ISO_8859? Or just chars that would encode to
> 3-byte UTF-8?
>
> Maybe we could... we'll discuss.
> The other option is to do that encoding when you convert the json string
> into bytes to send?

I apologize: it has turned out to be an ADM backend problem, and now
suddenly (after a few weeks) all characters I send to them just work.

But maybe my question is still valid and useful for someone:
would it be possible to add optional \u#### encoding to jetty-util-ajax?

However, I can no longer tell exactly which characters that would be.

Regards
Alex