Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Serialize without going through JsonObject #233

Merged
merged 3 commits into from
Sep 20, 2023

Conversation

zyla
Copy link
Contributor

@zyla zyla commented Sep 20, 2023

As described in #152, serialization by constructing a JsonObject has to essentially clone all data, which requires extra allocations.

After this change, we pass data directly to the Serializer.

JsonObject and JsonValue conversion impls were rewritten to delegate to Serialize. This requires some unfortunate panics where we expect Serialize to produce a JSON object, but I think it's better than duplicating the serialization logic.

Benchmark results (compared vs 4cfbf3a):

serialize geojson::FeatureCollection struct (countries.geojson)
                        time:   [990.01 µs 1.0025 ms 1.0172 ms]
                        change: [-74.789% -74.267% -73.763%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

serialize custom struct (countries.geojson)
                        time:   [2.1757 ms 2.2237 ms 2.2718 ms]
                        change: [-40.176% -38.891% -37.671%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

serialize custom struct to geo-types (countries.geojson)
                        time:   [2.3386 ms 2.3570 ms 2.3776 ms]
                        change: [-42.699% -41.782% -40.960%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe

Custom structs still have to go through JsonObject, so there isn't as much improvement there.

Expected JSON outputs in tests had changed, because now we produce JSON object keys in order of the serialize_entry calls, instead of serde_json::Map order.

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

As described in <georust#152>,
serialization by constructing a JsonObject has to essentially clone all
data, which requires extra allocations.

After this change, we pass data directly to the `Serializer`.

`JsonObject` and `JsonValue` conversion impls were rewritten to delegate
to `Serialize`. This requires some unfortunate `panic`s where we expect
`Serialize` to produce a JSON object, but I think it's better than
duplicating the serialization logic.
@urschrei
Copy link
Member

Thanks for the contribution @zyla! Could you say more and / or provide an example (ideally in the form of a test) of the kind of data that can trigger a panic using this PR?

@lnicola
Copy link
Member

lnicola commented Sep 20, 2023

I don't think this can panic. It assumes that serializing a JsonObject (which is a serde_json::Map) Feature or FeatureCollection always results in an object, which seems like a reasonable assumption, given the implementation.

@@ -182,7 +159,26 @@ impl Serialize for Feature {
where
S: Serializer,
{
JsonObject::from(self).serialize(serializer)
let mut map = serializer.serialize_map(None)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to suggest passing in the length here, but serde_json doesn't seem to use it.

Copy link
Member

@michaelkirk michaelkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!

I agree with @lnicola that the panics "should never happen" in this case.

Since serde_json::to_value will attempt to serialize any input, the API needs to be fallible, but our input will never fail because all our fields are serializable given that they are themselves JsonObjects, None (which is serializable), or other primitive types (String, Vec, etc.).

It's probably worth documenting this assumption though. It took me a minute to reason through it.

src/feature.rs Outdated
if let Some(ref properties) = self.properties {
map.serialize_entry("properties", properties)?;
} else {
map.serialize_entry("properties", &JsonObject::new())?;
Copy link
Member

@michaelkirk michaelkirk Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be null instead of an empty object.

I think this reflects a gap in our test coverage. Can you add it if you agree? Something like:

    #[test]    
    fn feature_json_null_properties() {    
        let geojson_str = r#"{
            "geometry": null,    
            "properties": null,    
            "type":"Feature"    
        }"#;    
        let geojson = geojson_str.parse::<GeoJson>().unwrap();
        let GeoJson::Feature(feature) = geojson else {
            panic!("unexpected geojson");
        };    
        assert!(feature.properties.is_none());    
        assert_eq!(feature.to_string(), r#"{"type":"Feature","geometry":null,"properties":null}"#);    
    }   

A Feature object has a member with the name "properties". The
value of the properties member is an object (any JSON object or a
JSON null value).

edit: updated the quote. I initially copied the wrong doc snippet. 🤦

(from https://datatracker.ietf.org/doc/html/rfc7946#section-3.2)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Changed the behavior and added a test.

@zyla
Copy link
Contributor Author

zyla commented Sep 20, 2023

It's probably worth documenting this assumption though. It took me a minute to reason through it.

Thanks for the review. Added some comments about that.

@michaelkirk michaelkirk added this pull request to the merge queue Sep 20, 2023
Merged via the queue into georust:main with commit 085b8e0 Sep 20, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants