Define OpenSearch® timestamp data in SQL pipeline
Frequently results in Apache Flink® data pipelines include one or more timestamps, either contained in the source events or generated by window aggregations.
When the output of the Apache Flink® data pipeline is an Aiven for OpenSearch® index, convert the Flink timestamps to a format recognizable by OpenSearch®, otherwise they will be interpreted as strings, losing the benefits of time filtering.
OpenSearch® recognises the following as correct date/time formats:
yyyy/MM/dd
for a date fieldHH:mm:ss
for a time fieldyyyy/MM/dd HH:mm:ss
for a timestamp field
Structure the data pipeline output to follow one of the recognized formats.
Define Apache Flink® target tables including timestamps for OpenSearch®
When the result of the data pipeline contains a timestamp column like the below:
EVENT_TIME TIMESTAMP(3),
HOSTNAME STRING,
CPU DOUBLE
To push the data correctly to an OpenSearch® index, you'll need to set
the target column format as STRING
in the Flink table definition,
like:
EVENT_TIME STRING,
HOSTNAME STRING,
CPU DOUBLE
Assuming the EVENT_TIME
is a timestamp, you'll need to specify
it in the format understood by OpenSearch® using the DATE_FORMAT
function, like:
DATE_FORMAT(EVENT_TIME, 'yyyy/MM/dd HH:mm:ss')
Once the pipeline is running, you can check that the EVENT_TIME
field
in OpenSearch® is recognized as a timestamp.