Description
I can't get numeric.mapping to work with MySQL and Confluent Platform 5.1. Steps to reproduce below.
Create MySQL table:
use demo;
create table transactions (
txn_id INT,
customer_id INT,
amount DECIMAL(5,2),
currency VARCHAR(50),
txn_timestamp VARCHAR(50)
);
insert into transactions (txn_id, customer_id, amount, currency, txn_timestamp) values (3, 2, 17.13, 'EUR', '2018-04-30T21:30:39Z');
Inspect table:
mysql> describe transactions;
+---------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+-------+
| txn_id | int(11) | YES | | NULL | |
| customer_id | int(11) | YES | | NULL | |
| amount | decimal(5,2) | YES | | NULL | |
| currency | varchar(50) | YES | | NULL | |
| txn_timestamp | varchar(50) | YES | | NULL | |
+---------------+--------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
Create connector:
curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d '{
"name": "jdbc_source_mysql_12a",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"connection.url": "jdbc:mysql://mysql:3306/demo",
"connection.user": "connect_user",
"connection.password": "asgard",
"topic.prefix": "mysql-12a-",
"numeric.mapping": "best_fit",
"table.whitelist" : "demo.transactions",
"mode":"bulk",
"poll.interval.ms" : 3600000
}
}'
Even though "numeric.mapping": "best_fit" is set, Kafka Connect stores the DECIMAL(5,2) as a Decimal, serialised to bytes in Avro:
$ curl -s "http://localhost:8081/subjects/mysql-12a-transactions-value/versions/1"|jq '.schema|fromjson.fields[] | select (.name == "amount")'
{
"name": "amount",
"type": [
"null",
{
"type": "bytes",
"scale": 2,
"precision": 64,
"connect.version": 1,
"connect.parameters": {
"scale": "2"
},
"connect.name": "org.apache.kafka.connect.data.Decimal",
"logicalType": "decimal"
}
],
"default": null
}
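For what it's worth, the bytes are not garbage: the Connect Decimal logical type writes the unscaled value as a big-endian two's-complement byte array and carries the scale in the schema (scale = 2 above). A minimal sketch of decoding it back on the consumer side; the byte values shown correspond to the 17.13 row inserted earlier and are for illustration only:

import java.math.BigDecimal;
import java.math.BigInteger;

// Illustrative sketch: turn the Avro "bytes" payload of a Connect Decimal
// field back into a number. The unscaled value is big-endian two's-complement;
// the scale comes from the field's schema (connect.parameters.scale).
public class DecodeConnectDecimal {
  public static void main(String[] args) {
    byte[] raw = {0x06, (byte) 0xB1};  // unscaled value 1713, as it would appear on the wire
    int scale = 2;                     // taken from the field's schema
    BigDecimal amount = new BigDecimal(new BigInteger(raw), scale);
    System.out.println(amount);        // prints 17.13
  }
}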
Connect Worker log excerpt:
INFO Kafka version : 2.1.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
…
INFO JdbcSourceTaskConfig values:
batch.max.rows = 100
catalog.pattern = null
connection.attempts = 3
connection.backoff.ms = 10000
connection.password = [hidden]
connection.url = jdbc:mysql://mysql:3306/demo
connection.user = connect_user
dialect.name =
incrementing.column.name =
mode = bulk
numeric.mapping = best_fit
numeric.precision.mapping = false
poll.interval.ms = 3600000
query =
schema.pattern = null
table.blacklist = []
table.poll.interval.ms = 60000
table.types = [TABLE]
table.whitelist = [demo.transactions]
tables = [`demo`.`transactions`]
timestamp.column.name = []
timestamp.delay.interval.ms = 0
topic.prefix = mysql-12a-
validate.non.null = true
(io.confluent.connect.jdbc.source.JdbcSourceTaskConfig)
…
DEBUG Checking for next block of results from BulkTableQuerier{table='"demo"."transactions"', query='null', topicPrefix='mysql-12a-'} (io.confluent.connect.jdbc.source.JdbcSourceTask)
DEBUG BulkTableQuerier{table='"demo"."transactions"', query='null', topicPrefix='mysql-12a-'} prepared SQL query: SELECT * FROM `demo`.`transactions` (io.confluent.connect.jdbc.source.BulkTableQuerier)
DEBUG DECIMAL with precision: '5' and scale: '2' (io.confluent.connect.jdbc.dialect.MySqlDatabaseDialect)
DEBUG DECIMAL with precision: '5' and scale: '2' (io.confluent.connect.jdbc.dialect.MySqlDatabaseDialect)
DEBUG Returning 100 records for BulkTableQuerier{table='"demo"."transactions"', query='null', topicPrefix='mysql-12a-'} (io.confluent.connect.jdbc.source.JdbcSourceTask)
…
kafka-connect_1_8eb73e80dda1 | [2019-01-07 13:37:50,920] DEBUG Sending POST with input {"schema":"{\"type\":\"record\",\"name\":\"transactions\",\"fields\":[{\"name\":\"txn_id\",\"type\":[\"null\",\"int\"],\"default\":null},{\"name\":\"customer_id\",\"type\":[\"null\",\"int\"],\"default\":null},{\"name\":\"amount\",\"type\":[\"null\",{\"type\":\"bytes\",\"scale\":2,\"precision\":64,\"connect.version\":1,\"connect.parameters\":{\"scale\":\"2\"},\"connect.name\":\"org.apache.kafka.connect.data.Decimal\",\"logicalType\":\"decimal\"}],\"default\":null},{\"name\":\"currency\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"txn_timestamp\",\"type\":[\"null\",\"string\"],\"default\":null}],\"connect.name\":\"transactions\"}"} to http://schema-registry:8081/subjects/mysql-12a-transactions-value/versions (io.confluent.kafka.schemaregistry.client.rest.RestService)
I've tried this with three different settings, each of which still results in the amount field being serialised to bytes in Avro:
"numeric.mapping": "best_fit"
"numeric.mapping": "precision_only"
"numeric.precision.mapping": true
Per docs I am expecting to see the decimal(5,2) serialised to Avro FLOAT64 (I think - but at least, not bytes).
rmoff commented on Jan 7, 2019
Does numeric.mapping only apply to NUMERIC types, not DECIMAL?
kafka-connect-jdbc/src/main/java/io/confluent/connect/jdbc/dialect/GenericDatabaseDialect.java
Lines 1146 to 1189 in b23e312
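From reading that block, the numeric.mapping logic appears to sit in the Types.NUMERIC branch, while Types.DECIMAL always falls back to the Connect Decimal logical type (Avro bytes). A rough, self-contained approximation of that behaviour follows; this is not the actual connector code, and the names and thresholds are simplified:

import java.sql.Types;

// Rough approximation only - NOT the connector source. It mirrors the apparent
// shape of the dialect logic: numeric.mapping is consulted for Types.NUMERIC,
// while Types.DECIMAL always ends up as the Decimal logical type (bytes).
public class NumericMappingSketch {

  enum Mapped { INT8, INT16, INT32, INT64, FLOAT64, DECIMAL_BYTES }

  static Mapped mapColumn(int sqlType, int precision, int scale, String numericMapping) {
    if (sqlType == Types.NUMERIC && "best_fit".equals(numericMapping) && precision < 19) {
      if (scale == 0) {
        // pick the narrowest integer type that can hold the precision
        if (precision > 9) return Mapped.INT64;
        if (precision > 4) return Mapped.INT32;
        if (precision > 2) return Mapped.INT16;
        return Mapped.INT8;
      }
      if (scale > 0) {
        return Mapped.FLOAT64;  // e.g. NUMERIC(5,2) -> Avro double
      }
    }
    // Types.DECIMAL never reaches the mapping above, so it (and any NUMERIC
    // that didn't fit) becomes the Decimal logical type, i.e. Avro bytes.
    return Mapped.DECIMAL_BYTES;
  }

  public static void main(String[] args) {
    System.out.println(mapColumn(Types.NUMERIC, 5, 2, "best_fit")); // FLOAT64
    System.out.println(mapColumn(Types.DECIMAL, 5, 2, "best_fit")); // DECIMAL_BYTES
  }
}

Since the MySQL driver reports DECIMAL columns as java.sql.Types.DECIMAL, whereas Postgres reports its numeric columns as Types.NUMERIC, this would explain why best_fit works against Postgres but not against the MySQL table above.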
rmoff commented on Jan 7, 2019
https://dev.mysql.com/doc/refman/8.0/en/fixed-point-types.html
So a column such as AMOUNT_02 that is declared as NUMERIC in the DDL is created as a DECIMAL. (MySQL Server version: 8.0.13)
rmoff commented on Jan 7, 2019
Contrast to Postgres, where all of the columns are stored as NUMERIC:

|  | DECIMAL(5,2) | NUMERIC(5,2) | DECIMAL(5) | DECIMAL |
|---|---|---|---|---|
| Value in DB | 100.01 | 100.02 | 100 | 100 |
| numeric.mapping = none (same as leaving it unset) | '\u0011 | Øî | d | d |
| numeric.mapping = best_fit | 100.01 | 100.02 | 100 | 100 |
| numeric.mapping = precision_only | '\u0011 | Øî | 100 | 100 |

(Postgres 11.1)
Postgres notes: https://gist.github.com/rmoff/7bb46a0b6d27982a5fb7a103bb7c95b9
Issue title changed from "numeric.mapping doesn't appear to work with Confluent Platform 5.1 and MySQL" to "numeric.mapping not supported for MySQL"
rmoff commented on Jan 8, 2019
MS SQL notes: https://gist.github.com/rmoff/7bb46a0b6d27982a5fb7a103bb7c95b9#testing-numericmapping-in-ms-sql-server-2017

|  | DECIMAL(5,2) | NUMERIC(5,2) | DECIMAL(5) | DECIMAL |
|---|---|---|---|---|
| Type as reported by MS SQL | decimal, length 5, precision 5, scale 2 | numeric, length 5, precision 5, scale 2 | decimal, length 5, precision 5, scale 0 | decimal, length 9, precision 18, scale 0 |
| Value in DB | 100.01 | 100.02 | 100 | 100 |
| numeric.mapping = none (same as leaving it unset) | '\u0011 | Øî | d | d |
| numeric.mapping = best_fit | '\u0011 | 100.02 | d | d |
| numeric.mapping = best_fit (query used to CAST all DECIMAL fields to NUMERIC) | 100.01 | 100.02 | 100 | 100 |
| numeric.mapping = precision_only | '\u0011 | Øî | 100 | 100 |
The same problem exists with DECIMAL fields being ignored. Since MS SQL accepts both DECIMAL and NUMERIC as native data types, use NUMERIC for Kafka Connect to correctly ingest the values when using numeric.mapping=best_fit. If changing the source schema isn't an option then you can use query mode, demonstrated here.
Issue title changed from "numeric.mapping not supported for MySQL" to "numeric.mapping not supported for MySQL / doesn't work for DECIMAL fields"
Issue title changed from "numeric.mapping not supported for MySQL / doesn't work for DECIMAL fields" to "numeric.mapping doesn't work for DECIMAL fields"
anssip commented on Feb 6, 2019
I am facing a problem with MySQL and decimal data types. The values end up as corrupt strings in the Kafka topic. Without using schemas, the values look like this when listing with console-consumer:
I tried registering an Avro schema to see if it would help. I made the type of this revenue field float in the schema and created a JDBC source connector to fill the topic. But this connector fails with the following:
It seems like it tries to register a new schema that is incompatible with my previously created schema. It tries to use type bytes for this revenue field (and for other decimal fields).
My table in MySQL looks like this:
Is there some way to work around this issue now?
rmoff commented on Feb 6, 2019
DECIMAL isn't supported for numeric.mapping. There isn't a way to work around this that I'm aware of. The data isn't "corrupt", it's just a BigDecimal. For more details see https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector#bytes-decimals-numerics
anssip commented on Feb 7, 2019
I tried to work around this issue by using an SMT cast. I changed the type of the column to varchar in the DB view I'm using here, and then cast it with:
But now the connector fails with:
[{"state":"FAILED","trace":"org.apache.kafka.connect.errors.DataException: Invalid Java object for schema type INT64: class java.sql.Timestamp for field: \"started\
Adding this cast transform causes it to fail with datetime fields! I found this issue relating to cast transforms.
My second attempt was to fix the connector code:
kafka-connect-jdbc/src/main/java/io/confluent/connect/jdbc/dialect/GenericDatabaseDialect.java
Line 1146 in b23e312
I changed that line to include both DECIMAL and NUMERIC. Using this hacked-up connector produces a somewhat different kind of bytes data in the topic, but it seems like that is not the correct way to fix the issue :-)
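The essence of that change, sketched in isolation (illustrative only, not the actual patch to GenericDatabaseDialect):

import java.sql.Types;

// Illustrative only: "include both DECIMAL and NUMERIC" means letting
// java.sql.Types.DECIMAL share the case that applies numeric.mapping,
// instead of always falling through to the Decimal (bytes) logical type.
public class DecimalAsNumeric {
  static boolean numericMappingApplies(int sqlType) {
    switch (sqlType) {
      case Types.NUMERIC:
      case Types.DECIMAL:   // added alongside NUMERIC in the hacked-up build
        return true;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(numericMappingApplies(Types.DECIMAL)); // true with the change applied
  }
}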