Discussion:
[freetds] Records containing DATE data types fail to bulk load with freebcp
Stephen Marshall
2014-01-13 16:16:25 UTC
Permalink
I've found a bug in freebcp. Any record containing the DATE data type fails
to bulk load using freebcp. Since the DATE data type is specific to Sybase
data servers, this does not affect use of freebcp with MSSQL.

This email is rather long, but contains all the details needed to reproduce
the error. It also contains my initial exploration of the problem. I'd
really appreciate it is a person with more knowledge of the code than me
could take a look at this issue.

Thanks,
Steve
-----

REPRODUCING THE ERROR:

The following test shows the problem (see files in the
freebcp_date_error.tar.gz)

1. Create a table called bcp_test with a single column of date type
(see attached SQL file create_bcp_test.sql).
2. Bulk load one record into this table using freebcp, like this:
export TDSDUMP="./tdsdump.txt"
freebcp <dbname>..bcp_test in bcp_test.txt -U<user> -P<password>
--S<server> -c

Where <user>, <password>, and <server> are the connection credentials for
your database server, and <dbname> is the name of the database holding the
table. The data file bcp_test.txt contains only a single record with a
single textual date of the form YYYY-mm-dd.

The freebcp commands returns an error of this form (where ### are ids
specific to the database being used):
Bad row data received from the client while bulk copying into object ###
partition ### in database ###.
Received a row of length 10 whilst maximum or expected row length is 6.

I tested this running freebcp on Mac OS X 10.8.5 against a Sybase ASE 15.7
server running on CentOS 6.

CONFIRMATION THAT THIS IS AN ERROR:

Running the same command with the bcp utility provided with Sybase ASE 15.7
succeeds without error. A simple insert of the same data also succeeds,
e.g.
insert into bcp_test values('2014-01-04');

EXPLORATION OF THE ERROR:

Changing the data type in bcp_test from DATE to DATETIME causes freebcp to
succeed. However, this increases the storage size from 4 to 8 bytes and
causes values to be returned with a complete date time (e.g. "Jan 4 2014
12:00:00:000AM"), rather than just a date.

Interestingly, comparison of the cases where DATE and DATETIME are used
shows Sybase ASE 15.7 returns the same column_type and column_size values
for both cases (61 and 8, respectively). Only the usertype column differs
(37 for DATE, 12 for DATETIME). The type and length seem incorrect for
the DATE type, given this information from the systypes system table:
usertype type length name



-------- ---- ------ --------



37 49 4 date



12 61 8 datetime






See tdsdump.txt-date_vs_datetime for details of the comparison. The type
information comes from the TDS packet returned from a "FMTONLY" query to
get all columns (but no rows) from the table bcp_test.

NEXT STEPS:

Frankly, after isolating this problem to these byte codes, I thought fixing
this would be easy. However, there are a number of transformations and
special cases in the data conversion code from the tds and dblib libraries.
After inserting a special case to set type to 49 (SYBDATE) and length to 4
if usertype is 37, freebcp failed with this error, caused by a failure of
the tds_convert function:
Requested data conversion does not exist

It seems additional logic, perhaps in or around tds_convert() is needed to
support the SYBDATE data type. I've spent about day trying to find the
right place to insert this logic, but, at this point, I am chasing my tail.
If someone on the list has insight into how to add support for a new data
type, I'd appreciate some direction.

Thanks,
Steve
-------------- next part --------------
A non-text attachment was scrubbed...
Name: freebcp_date_error.tar.gz
Type: application/x-gzip
Size: 1336 bytes
Desc: not available
Url : http://lists.ibiblio.org/pipermail/freetds/attachments/20140113/b9664777/attachment.gz
Marc Abramowitz
2014-01-13 17:57:58 UTC
Permalink
Post by Stephen Marshall
I've found a bug in freebcp. Any record containing the DATE data type fails
to bulk load using freebcp. Since the DATE data type is specific to Sybase
data servers, this does not affect use of freebcp with MSSQL.
Newer versions of SQL Server do have DATE.

http://msdn.microsoft.com/en-us/library/bb630352.aspx

So I guess that case should be tested. I've never used freebcp myself.
James K. Lowden
2014-01-15 04:40:34 UTC
Permalink
On Mon, 13 Jan 2014 11:16:25 -0500
Post by Stephen Marshall
Frankly, after isolating this problem to these byte codes, I thought
fixing this would be easy. However, there are a number of
transformations and special cases in the data conversion code from
the tds and dblib libraries. After inserting a special case to set
type to 49 (SYBDATE) and length to 4 if usertype is 37, freebcp
failed with this error, caused by a failure of the tds_convert
function: Requested data conversion does not exist
It seems additional logic, perhaps in or around tds_convert() is
needed to support the SYBDATE data type. I've spent about day trying
to find the right place to insert this logic, but, at this point, I
am chasing my tail. If someone on the list has insight into how to
add support for a new data type, I'd appreciate some direction.
You may find the work Frediano has done with ODBC for TDS 7.3 types is
useful to you.

Start with tds_willconvert(), some of which is generated from a perl
script. You'll also need a struct to hold your DATE components in
wire format if it doesn't fit something we already have. Then for
tds_convert() you'll need

1. A function that accepts a DATE and returns at least VARCHAR,
whatever is commensurate with tds_willconvert().

2. For each type that can be converted to DATE, add the logic to the
switch in the function that accepts that type in convert.c.

3. In db-lib, you'll need SYBDATE and the same struct that libtds uses
fo the wire format. You want to extend dbdatecrack(), too, lest you
start parsing strings instead when you want to know the month in your
application.

Conversion should be symmetrical: if A -> B then B -> A.

Now you have a DATE structure and the functionality to convert between
DATE and strings. The changes to freebcp fall out pretty easily, as
you've already seen.

HTH.

--jkl
Frediano Ziglio
2014-01-16 21:03:51 UTC
Permalink
Post by James K. Lowden
On Mon, 13 Jan 2014 11:16:25 -0500
Post by Stephen Marshall
Frankly, after isolating this problem to these byte codes, I thought
fixing this would be easy. However, there are a number of
transformations and special cases in the data conversion code from
the tds and dblib libraries. After inserting a special case to set
type to 49 (SYBDATE) and length to 4 if usertype is 37, freebcp
failed with this error, caused by a failure of the tds_convert
function: Requested data conversion does not exist
It seems additional logic, perhaps in or around tds_convert() is
needed to support the SYBDATE data type. I've spent about day trying
to find the right place to insert this logic, but, at this point, I
am chasing my tail. If someone on the list has insight into how to
add support for a new data type, I'd appreciate some direction.
You may find the work Frediano has done with ODBC for TDS 7.3 types is
useful to you.
Start with tds_willconvert(), some of which is generated from a perl
script. You'll also need a struct to hold your DATE components in
wire format if it doesn't fit something we already have. Then for
tds_convert() you'll need
1. A function that accepts a DATE and returns at least VARCHAR,
whatever is commensurate with tds_willconvert().
2. For each type that can be converted to DATE, add the logic to the
switch in the function that accepts that type in convert.c.
3. In db-lib, you'll need SYBDATE and the same struct that libtds uses
fo the wire format. You want to extend dbdatecrack(), too, lest you
start parsing strings instead when you want to know the month in your
application.
Conversion should be symmetrical: if A -> B then B -> A.
Now you have a DATE structure and the functionality to convert between
DATE and strings. The changes to freebcp fall out pretty easily, as
you've already seen.
HTH.
I think adding a type is quite complicated at this time. I should add
some guide and probably refactor some code. Every type needs (probably
list not complete):
- some changes in the protocol to tell server we support this type;
- a binary representation for this type;
- function to handle data from server;
- function to handle data to server;
- conversion table;
- conversion function (to and from);
- functions (how many I don't know) to ask informations for the type
(like length, precision, scale) which sometimes change a bit from
library to library.

src/tds/data.c actually collect a bit of these information (functions
to read/write) in a single place but the others. Actually BCP is not
on these list of functions (unfortunately).

For DATE/TIME (Sybase). These types are quite easy to implement. DATE
is the date part or TDS_DATETIME while TIME is the time part of
TDS_DATETIME. So conversions are quite easy, just convert to another
already supported type before converting to another type.

Another problem is to define the proper type. If the official
(dblib/ctlib/odbc) offer already a binary type you need to stick to it
for compatibility. If not you can define the type as you like. I think
Sybase defined the type for DATE/TIME (I'm not sure, just question to
check documentation).

Asking information is really the tricky bit as this code is spread all
over the places :(
Too much places too much switches!

Frediano

Loading...