Version: sdf-beta3

FlatMap Operator

The Flat-Map operator splits a single input from the source into multiple entries in the sink. Because it is a map operator, each split entry can also be transformed. Unlike a map operator, however, it creates a one-to-many mapping. It's a powerful tool for digesting arrays, objects, or other nested data. In this example, we will write a dataflow that splits a string in half with Flat-Map.

Prerequisites

This guide uses a local Fluvio cluster. If you need to install one, please follow the Fluvio installation instructions.

Transformation

Below is an example of a transform function using flat-map. The function takes a string and splits it in half. Both halves are inserted into the sink.

    transforms:
      - operator: flat-map
        run: |
          fn halfword(input: String) -> Result<Vec<String>> {
            let mut ret: Vec<String> = Vec::new();
            let mid = input.len() / 2;
            ret.push(format!("first half: {}",&input[..mid]));
            ret.push(format!("second half: {}",&input[mid..]));
            Ok(ret)
          }

In the example function, the return type is a vector of strings: each half of the input is formatted into a new string before being returned. In general, the transform function should return a vector of the sink's type.
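The example function slices the string at a byte index, and in Rust that panics if the index falls inside a multi-byte UTF-8 character. The following is a minimal standalone sketch of one way to guard against that, not the documented SDF behavior. The `Result` alias here is an assumed stand-in for whatever `Result` type SDF provides inside a `run:` block.

```rust
// Assumption: a stand-in for the Result alias available inside SDF `run:` blocks.
type Result<T> = std::result::Result<T, String>;

fn halfword(input: String) -> Result<Vec<String>> {
    // Start from the byte midpoint, as in the original example.
    let mut mid = input.len() / 2;
    // Step back to the nearest character boundary so that slicing
    // cannot panic on multi-byte UTF-8 input. Index 0 is always a boundary.
    while !input.is_char_boundary(mid) {
        mid -= 1;
    }
    Ok(vec![
        format!("first half: {}", &input[..mid]),
        format!("second half: {}", &input[mid..]),
    ])
}

fn main() {
    for line in halfword("0123456789".to_string()).unwrap() {
        println!("{line}");
    }
}
```

For ASCII input such as `0123456789` this behaves like the original function. For input where the midpoint lands inside a character, the first half simply ends one character earlier.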

Running the Example

Copy and paste the following config and save it as dataflow.yaml.

# dataflow.yaml
apiVersion: 0.5.0
meta:
  name: flat-map-example
  version: 0.1.0
  namespace: examples

config:
  converter: raw

topics:
  sentences:
    schema:
      value:
        type: string
  halfword:
    schema:
      value:
        type: string

services:
  flat-map-service:
    sources:
      - type: topic
        id: sentences

    transforms:
      - operator: flat-map
        run: |
          fn halfword(input: String) -> Result<Vec<String>> {
            let mut ret: Vec<String> = Vec::new();
            let mid = input.len() / 2;
            ret.push(format!("first half: {}",&input[..mid]));
            ret.push(format!("second half: {}",&input[mid..]));
            Ok(ret)
          }

    sinks:
      - type: topic
        id: halfword

To run the example:

$ sdf run

Produce a sentence to the sentences topic:

$ echo "0123456789" | fluvio produce sentences

In another terminal, consume the halfword topic to retrieve the result:

$ fluvio consume halfword -Bd
first half: 01234
second half: 56789

Here, two strings are produced from the single input.
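To make the one-to-many behavior concrete, the sketch below simulates the service in plain Rust, outside of SDF. `simulate_sink` is a hypothetical helper, and the sketch assumes records reach the sink in the order the function returns them.

```rust
// Same splitting logic as the example transform, without the Result wrapper.
fn halfword(input: &str) -> Vec<String> {
    let mid = input.len() / 2; // byte midpoint; fine for the ASCII inputs used here
    vec![
        format!("first half: {}", &input[..mid]),
        format!("second half: {}", &input[mid..]),
    ]
}

// Hypothetical stand-in for the sink topic: every source record is
// expanded by `halfword`, and the results are flattened into one list.
fn simulate_sink(source: &[&str]) -> Vec<String> {
    source.iter().flat_map(|&record| halfword(record)).collect()
}

fn main() {
    let sink = simulate_sink(&["0123456789", "abcdef"]);
    for record in &sink {
        println!("{record}");
    }
}
```

Two source records yield four sink records, two per input.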

Cleanup

Exit the sdf terminal and clean up. The --force flag removes the topics:

$ sdf clean --force

Conclusion

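We just covered another basic operator in SDF, the Flat-Map operator. Flat-Map is powerful because the number of outputs per input is up to the function. As a matter of fact, it can be used in place of other operators: a function that always returns a one-element vector behaves like a map, and one that returns an empty vector for unwanted inputs behaves like a filter.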
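The idea that Flat-Map can stand in for other operators can be illustrated with Rust's standard `Iterator::flat_map`. This is a plain-Rust sketch rather than SDF code, and the function names are hypothetical.

```rust
// Map-like: exactly one output per input.
fn upper_as_flat_map(input: String) -> Vec<String> {
    vec![input.to_uppercase()]
}

// Filter-like: zero outputs for rejected inputs, one output otherwise.
fn keep_long_as_flat_map(input: String) -> Vec<String> {
    if input.len() > 3 {
        vec![input]
    } else {
        Vec::new()
    }
}

fn main() {
    let records = vec!["hi".to_string(), "hello".to_string()];

    let mapped: Vec<String> = records
        .iter()
        .cloned()
        .flat_map(upper_as_flat_map)
        .collect();
    let filtered: Vec<String> = records
        .into_iter()
        .flat_map(keep_long_as_flat_map)
        .collect();

    println!("{mapped:?}");   // ["HI", "HELLO"]
    println!("{filtered:?}"); // ["hello"]
}
```

Dedicated map and filter operators remain the clearer choice when the mapping is strictly one-to-one or one-to-zero-or-one. The sketch only shows that Flat-Map covers both cases.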