Skip to main content
Version: sdf-beta4

Custom Serialization

An Stateful Dataflow Service reads records from topics and push them to topics. The data obtained from the topics is deserialized using either json or raw format. Similarly, the data produced to a topic is serialized using either json or raw.

Raw format

Raw format or binary format, tries to create the types using their binary representation. This is only valid for the following primitive types:

  • string: Bytes are converted assuming they are valid UTF-8.
  • bool: Last bit of byte is used for the conversion.
  • numeric types: Are converted using big endian binary representation.
  • bytes: No conversion needed, data from records is passed as a list of u8.

Json format

With the json format, data from topics is deserialized and serialized into expected types using json. All the serialization/deserialization, is done using Rust serde auto-implementation. This mean, among other things, that for object types are de(serialized) using snake_case and enum types are de(serialized) with UpperCamelCase.

In some cases, we may want to rename fields to use another casing. This can be done in objects and enum types by using the deserialize or serialize configuration in their attributes and variants.

Json default behavior

If we have the following types defined:

types:
  new:
    type: object
    properties:
      title:
        type: string
      image-url:
        type: string
        optional: true
      keywords:
        type: keywords
      visibility:
        type: visibility
  keywords:
    type: list
    items:
      type: keyword
  keyword:
    type: object
    properties:
      slug:
        type: string
      human-label:
        type: string
  visibility:
    type: enum
    oneOf:
      public:
        type: null
      private:
        type: null

A valid example of a record that could be deserialized properly for the new type, would be something like:

{
    "title": "Fluvio is great!",
    "image_url": "https://fluvio.io/img/fluvio.jpg"
    "keywords": [
        {
            "slug": "tech",
            "human-label": "Technology"
        }
    ],
    "visibility": "Public"
}

Json custom deserialization

In some occasions, we may want to use different casing. For that reason, we have the serialize and deserialize configuration on object attributes and enum variants.

The serialize and deserialize configurations facilitate a flexible and robust way to manage data transformations. By clearly defining how properties should be renamed during these processes, the system can seamlessly integrate with external APIs or services while maintaining an internal structure that is coherent and easy to manage.

Serialization configuration

The serialize configuration is used to specify how a property should be transformed during the serialization process. It is defined within the properties section of an object type or within the variants of an enum and can have the following options:

  • rename: Specifies the new name for the property when serialized.

Example:

types:
  new:
    type: object
    properties:
      image-url:
        type: string
        optional: true
        serialize:
          rename: imageUrl

In this example, the image-url property will be serialized as imageUrl. The same configuration could be applied on enum variants.

Deserialization configuration

The deserialize configuration is used to specify how a property should be transformed during the deserialization process. It is also defined within the properties section of an object type or within the variants of an enum and can have the following options:

  • rename: Specifies the new name for the property when deserialized.
types:
  visibility:
    type: enum
    oneOf:
      public:
        deserialize:
          rename: available
        type: null
      private:
        deserialize:
          rename: draft
        type: null

In this example, the visibility type will successfully deserialize when the input is available or draft. The same configuration could be applied on object properties.

Custom deserialization in example

For instance, if we want to parse the following data:

{
    "title": "Fluvio is great!",
    "imageUrl": "https://fluvio.io/img/fluvio.jpg"
    "keywords": [
        {
            "slug": "tech",
            "humanLabel": "Technology"
        }
    ],
    "visibility": "available"
}

Let's look at the full example

types:
  new:
    type: object
    properties:
      title:
        type: string
      image-url:
        type: string
        optional: true
        serialize:
          rename: imageUrl
        deserialize:
          rename: imageUrl
      keywords:
        type: keywords
  keywords:
    type: list
    items:
      type: keyword
  keyword:
    type: object
    properties:
      slug:
        type: string
      human-label:
        type: string
        deserialize:
          rename: humanLabel
        serialize:
          rename: humanLabel

  visibility:
    type: enum
    oneOf:
      public:
        serialize:
          rename: available
        deserialize:
          rename: available
        type: null
      private:
        deserialize:
          rename: draft
        serialize:
          rename: draft
        type: null

Notice, that serialize and deserialize are independent and that if a type is used both in serialization and deserialization, adding a configuration for serialize and deserialize could be needed.