Discussion:
[freetds] iconv works
Frediano Ziglio
2013-05-20 21:06:42 UTC
Permalink
Hi,
trying to sort the problem of supporting protocol 7.2/7.3 in dblib/ctlib
I came to the problem of varchar(max). The problem was that dblib wants
conversion to single char while varchar(max) was implemented not converting
it. This was fine for ODBC which disable conversions in libTDS to support
getting raw data but not for dblib. To fix the issue a new iconv function
which deal with varchar(max) (which can be streamed having multiple chunks
so tds_get_char_data was not helpful in this case (as only support
contiguous data from the net). I then decided to write code which could
handle conversion from an input stream to an output stream so I can reuse
for every conversion (for instance from bcp file to memory!).

I ended up with this branch:

http://gitorious.org/~freddy77/freetds/mars-freetds/commits/unify_iconv

I'm actually running tests again to publish on nightly.

The reason I didn't pushed is that although code is quite good there are
some things to decide and sort out! Structures/functions to deal with
streams are in include/freetds/stream.h. Input stream it's just a function
to read data while output have a buffer contained in it.

Good things of these patches:
- only a single code to convert characters, better for test and maintenance;
- bit less copy of data;
- allow dblib/ctlib to support up to protocol 7.3;
- allow bcp to support direct streaming, no fseek require

Things to sort out:
- I added a new stream.h in include/freetds. The idea is to move freetds
internal includes in such directory to be able to include with
<freetds/name.h>. Are there any objections? The main reason is to avoid
using the tds prefix for every include;
- tds_stream_convert (in the new src/tds/stream.c) should perhaps be moved
in iconv.c?
- tds_iconv_fread (in src/tds/iconv.c) should be moved to only bcp code?
Now handle termination directly so is very bcp only related. It's also true
that was used only for bcp even before. Now however has very few things to
deal with iconv as conversions are done by tds_stream_convert;

Small todos and optimizations:
- check EINVAL error during conversion, should be ok;
- optimize file terminator detection (is not slower than before);
- reuse buffer reading data from file to avoid too much
allocation/reallocation/free;
- limit reallocation starting allocating a suitable buffer (not too big but
not even fixed);
- reuse tds_stream_convert more, even to read into blobs.

Frediano
Frediano Ziglio
2013-05-21 20:47:01 UTC
Permalink
Hi,
I forgot to say. If you look at test results you'll note that
get_send_data (dblib), t0013 and t0014 (dblib) tests fail with tds 7.3.
These tests cannot work with this version of the protocol as Microsoft
changed the network protocol removing textptr when BLOBs (text/image) are
returned.

Frediano



2013/5/20 Frediano Ziglio <freddy77 at gmail.com>
Post by Frediano Ziglio
Hi,
trying to sort the problem of supporting protocol 7.2/7.3 in dblib/ctlib
I came to the problem of varchar(max). The problem was that dblib wants
conversion to single char while varchar(max) was implemented not converting
it. This was fine for ODBC which disable conversions in libTDS to support
getting raw data but not for dblib. To fix the issue a new iconv function
which deal with varchar(max) (which can be streamed having multiple chunks
so tds_get_char_data was not helpful in this case (as only support
contiguous data from the net). I then decided to write code which could
handle conversion from an input stream to an output stream so I can reuse
for every conversion (for instance from bcp file to memory!).
http://gitorious.org/~freddy77/freetds/mars-freetds/commits/unify_iconv
I'm actually running tests again to publish on nightly.
The reason I didn't pushed is that although code is quite good there are
some things to decide and sort out! Structures/functions to deal with
streams are in include/freetds/stream.h. Input stream it's just a function
to read data while output have a buffer contained in it.
- only a single code to convert characters, better for test and maintenance;
- bit less copy of data;
- allow dblib/ctlib to support up to protocol 7.3;
- allow bcp to support direct streaming, no fseek require
- I added a new stream.h in include/freetds. The idea is to move freetds
internal includes in such directory to be able to include with
<freetds/name.h>. Are there any objections? The main reason is to avoid
using the tds prefix for every include;
- tds_stream_convert (in the new src/tds/stream.c) should perhaps be moved
in iconv.c?
- tds_iconv_fread (in src/tds/iconv.c) should be moved to only bcp code?
Now handle termination directly so is very bcp only related. It's also true
that was used only for bcp even before. Now however has very few things to
deal with iconv as conversions are done by tds_stream_convert;
- check EINVAL error during conversion, should be ok;
- optimize file terminator detection (is not slower than before);
- reuse buffer reading data from file to avoid too much
allocation/reallocation/free;
- limit reallocation starting allocating a suitable buffer (not too big
but not even fixed);
- reuse tds_stream_convert more, even to read into blobs.
Frediano
Loading...