
Feat: New version of entity_key serDe

Is your feature request related to a problem? Please describe.

The current entity_key serDe (version 2) is below:

def serialize_entity_key(
    entity_key: EntityKeyProto, entity_key_serialization_version=1
) -> bytes:
    """
    Serialize entity key to a bytestring so it can be used as a lookup key in a hash table.

    We need this encoding to be stable; therefore we cannot just use protobuf serialization
    here since it does not guarantee that two proto messages containing the same data will
    serialize to the same byte string[1].

    [1] https://developers.google.com/protocol-buffers/docs/encoding
    """
    sorted_keys, sorted_values = zip(
        *sorted(zip(entity_key.join_keys, entity_key.entity_values))
    )

    output: List[bytes] = []
    for k in sorted_keys:
        output.append(struct.pack("<I", ValueType.STRING))
        output.append(k.encode("utf8"))
    for v in sorted_values:
        val_bytes, value_type = _serialize_val(
            v.WhichOneof("val"),
            v,
            entity_key_serialization_version=entity_key_serialization_version,
        )

        output.append(struct.pack("<I", value_type))

        output.append(struct.pack("<I", len(val_bytes)))
        output.append(val_bytes)

    return b"".join(output)

For example, with a single join key item_id and a single entity value int64_val: 1, the output list is:
[b'\x02\x00\x00\x00', b'item_id', b'\x04\x00\x00\x00', b'\x08\x00\x00\x00', b'\x01\x00\x00\x00\x00\x00\x00\x00']

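This makes deserialization impossible: the join key bytes are written without a length prefix, so a reader cannot tell where a join key ends and the next field begins. To make the format deserializable, we can write the length of each join key just before the key's bytes, the same way values already carry their length. For the same test key and value, the output becomes: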
[b'\x02\x00\x00\x00', b'\x07\x00\x00\x00', b'item_id', b'\x04\x00\x00\x00', b'\x08\x00\x00\x00', b'\x01\x00\x00\x00\x00\x00\x00\x00']

With the length prefix in place, the bytes can be deserialized back into an EntityKeyProto.
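To make the proposal concrete, here is a minimal, self-contained sketch of a length-prefixed encoding and its inverse. It is not the Feast implementation: the function names serialize_with_key_length and deserialize_with_key_length are hypothetical, plain Python dicts stand in for EntityKeyProto, and only int64 values are handled. The type tags 2 (STRING) and 4 (INT64) are taken from the example output above. The reader also assumes that the first half of the fields are join keys and the second half are values, which holds because there is one value per join key.

```python
import struct
from typing import Dict, List, Tuple

# Type tags as they appear in the example output above.
STRING_TAG = 2
INT64_TAG = 4


def serialize_with_key_length(entity: Dict[str, int]) -> bytes:
    """Hypothetical sketch: every field is written as <tag><length><payload>."""
    items = sorted(entity.items())
    output: List[bytes] = []
    for key, _ in items:
        key_bytes = key.encode("utf8")
        output.append(struct.pack("<I", STRING_TAG))
        output.append(struct.pack("<I", len(key_bytes)))  # the proposed addition
        output.append(key_bytes)
    for _, value in items:
        val_bytes = struct.pack("<q", value)  # int64 only in this sketch
        output.append(struct.pack("<I", INT64_TAG))
        output.append(struct.pack("<I", len(val_bytes)))
        output.append(val_bytes)
    return b"".join(output)


def deserialize_with_key_length(data: bytes) -> Dict[str, int]:
    """Walk the buffer field by field, using each length prefix to find the next one."""
    fields: List[Tuple[int, bytes]] = []
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from("<II", data, offset)
        offset += 8
        fields.append((tag, data[offset:offset + length]))
        offset += length

    # Assumption: one value per join key, keys first, then values.
    half = len(fields) // 2
    result: Dict[str, int] = {}
    for (key_tag, key_bytes), (val_tag, val_bytes) in zip(fields[:half], fields[half:]):
        if key_tag != STRING_TAG or val_tag != INT64_TAG:
            raise ValueError("unsupported type tag in this sketch")
        result[key_bytes.decode("utf8")] = struct.unpack("<q", val_bytes)[0]
    return result


encoded = serialize_with_key_length({"item_id": 1})
decoded = deserialize_with_key_length(encoded)
```

For the item_id example, encoded is the concatenation of the six byte strings in the proposed output above, and decoded is the original mapping. A real implementation would also need to cover the other value types handled by _serialize_val and bump entity_key_serialization_version so existing keys remain readable.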
