Discussion:
[freetds] FreeTDS not giving any results
James Andrewartha
2013-09-06 09:05:58 UTC
Permalink
Hi,

I have a python script that uses pymssql 1.0.2, that in turn uses
FreeTDS 0.91, connecting to a MSSQL 2008 server. It used to work, but
now I don't get any results. The FreeTDS log (below) show the data is
received, but it's not processed for some reason. I did some code diving
and printf debugging, and dblib.c sets the results_state to empty
because of a TDS_COMPUTEFMT_RESULT (line 4698 in dbsqlok). Does anyone
know why this is happening?

16:30:57.256987 4051 (dblib.c:1312):dbcmd(0x98f1b20, SELECT top 50
ContactEmail FROM uvMerge_Current_Students_ICT_CCGS)
16:30:57.256993 4051 (dblib.c:1319):dbcmd() bufsz = 11
16:30:57.256999 4051 (dblib.c:5882):dbfreebuf(0x98f1b20)
16:30:57.257005 4051 (dblib.c:1369):dbsqlexec(0x98f1b20)
16:30:57.257011 4051 (dblib.c:6862):dbsqlsend(0x98f1b20)
16:30:57.257016 4051 (mem.c:615):tds_free_all_results()
16:30:57.257022 4051 (util.c:156):Changed query state from IDLE to QUERYING
16:30:57.257028 4051 (write.c:140):tds_put_string converting 65 bytes of
"SELECT top 50 ContactEmail FROM uvMerge_Current_Students_ICT_CCGS"
16:30:57.257036 4051 (write.c:168):tds_put_string wrote 130 bytes
16:30:57.257042 4051 (util.c:156):Changed query state from QUERYING to
PENDING
16:30:57.257048 4051 (net.c:741):Sending packet
0000 01 01 00 8a 00 00 01 00-53 00 45 00 4c 00 45 00 |........ S.E.L.E.|
0010 43 00 54 00 20 00 74 00-6f 00 70 00 20 00 35 00 |C.T. .t. o.p. .5.|
0020 30 00 20 00 43 00 6f 00-6e 00 74 00 61 00 63 00 |0. .C.o. n.t.a.c.|
0030 74 00 45 00 6d 00 61 00-69 00 6c 00 20 00 46 00 |t.E.m.a. i.l. .F.|
0040 52 00 4f 00 4d 00 20 00-75 00 76 00 4d 00 65 00 |R.O.M. . u.v.M.e.|
0050 72 00 67 00 65 00 5f 00-43 00 75 00 72 00 72 00 |r.g.e._. C.u.r.r.|
0060 65 00 6e 00 74 00 5f 00-53 00 74 00 75 00 64 00 |e.n.t._. S.t.u.d.|
0070 65 00 6e 00 74 00 73 00-5f 00 49 00 43 00 54 00 |e.n.t.s. _.I.C.T.|
0080 5f 00 43 00 43 00 47 00-53 00 |_.C.C.G. S.|

16:30:57.257113 4051 (dblib.c:4639):dbsqlok(0x98f1b20)
16:30:57.257151 4051 (dblib.c:4669):dbsqlok() not done, calling
tds_process_tokens()
16:30:57.257157 4051 (token.c:540):tds_process_tokens(0x9908c08,
0xbfc77b58, 0xbfc77b5c, 0x6914)
16:30:57.257164 4051 (util.c:156):Changed query state from PENDING to
READING
16:30:57.849904 4051 (net.c:555):Received header
0000 04 01 05 10 00 ad 01 00- |........|

16:30:57.850048 4051 (net.c:609):Received packet
0000 04 01 05 10 00 ad 01 00-81 01 00 00 00 20 00 a7 |........ ..... ..|
0010 c9 00 0c 43 00 6f 00 6e-00 74 00 61 00 63 00 74 |...C.o.n .t.a.c.t|
0020 00 45 00 6d 00 61 00 69-00 6c 00 d1 16 00 62 69 |.E.m.a.i .l....bi|
**data removed**
0500 2e 63 6f 6d 2e 61 75 fd-10 00 c1 00 32 00 00 00 |.com.au. ....2...|

16:30:57.850466 4051 (token.c:555):processing result tokens. marker is
81(TDS7_RESULT)
16:30:57.850480 4051 (token.c:1515):processing TDS7 result metadata.
16:30:57.850487 4051 (mem.c:615):tds_free_all_results()
16:30:57.850500 4051 (token.c:1540):set current_results (1 column) to
tds->res_info
16:30:57.850506 4051 (token.c:1547):setting up 1 columns
16:30:57.850513 4051 (token.c:3420):adjust_character_column_size:
Server charset: CP1252
Server column_size: 201
Client charset: ISO-8859-1
Client column_size: 201
16:30:57.850524 4051 (token.c:1486):tds7_get_data_info:
colname = ContactEmail (12 bytes)
type = 39 (varchar)
server's type = 167 (xvarchar)
column_varint_size = 2
column_size = 201 (201 on server)
16:30:57.850531 4051 (token.c:1556): name size/wsize
type/wtype utype
16:30:57.850537 4051 (token.c:1557): --------------------
--------------- --------------- -------
16:30:57.850543 4051 (token.c:1567): ContactEmail 201/201
39/167 0
16:30:57.850551 4051 (util.c:156):Changed query state from READING to
PENDING
16:30:57.850559 4051 (dblib.c:4700):dbsqlok() found result token
16:30:57.850567 4051 (dblib.c:1813):dbnumcols(0x98f1b20)
16:30:57.850573 4051 (dblib.c:2761):dbcount(0x98f1b20)
16:30:57.850579 4051 (dblib.c:1813):dbnumcols(0x98f1b20)
16:30:57.850586 4051 (dblib.c:1839):dbcolname(0x98f1b20, 1)
16:30:57.850593 4051 (dblib.c:2831):dbcoltype(0x98f1b20, 1)
16:30:57.850841 4051 (dblib.c:2018):dbnextrow(0x98f1b20)
16:30:57.850854 4051 (dblib.c:2031):dbnextrow() dbresults_state = 1
(_DB_RES_RESULTSET_EMPTY)
16:30:57.850861 4051 (dblib.c:2036):leaving dbnextrow() returning -2
(NO_MORE_ROWS)
16:30:57.850867 4051 (dblib.c:2761):dbcount(0x98f1b20)
16:31:20.183588 4051 (dblib.c:1443):dbclose(0x98f1b20)
16:31:20.183730 4051 (dblib.c:258):dblib_del_connection(0xb702e400,
0x9908c08)
16:31:20.183742 4051 (mem.c:615):tds_free_all_results()
16:31:20.183801 4051 (util.c:156):Changed query state from PENDING to DEAD
16:31:20.183820 4051 (dblib.c:305):dblib_release_tds_ctx(1)
16:31:20.183829 4051 (dblib.c:5882):dbfreebuf(0x98f1b20)

Thanks,
--
James Andrewartha
Network & Projects Engineer
Christ Church Grammar School
Claremont, Western Australia
Ph. (08) 9442 1757
Mob. 0424 160 877
James Andrewartha
2013-09-06 12:51:06 UTC
Permalink
On 6/09/13 5:05 PM, "James Andrewartha" <jandrewartha at ccgs.wa.edu.au>
Post by James Andrewartha
I have a python script that uses pymssql 1.0.2, that in turn uses
FreeTDS 0.91, connecting to a MSSQL 2008 server. It used to work, but
now I don't get any results. The FreeTDS log (below) show the data is
received, but it's not processed for some reason. I did some code diving
and printf debugging, and dblib.c sets the results_state to empty
because of a TDS_COMPUTEFMT_RESULT (line 4698 in dbsqlok). Does anyone
know why this is happening?
A few other things about that log - it's actually from the development
tarball, and the data processed is exactly the first packet received - the
log doesn't record any others, and the TCP window fills up so it's not
pulling them out of the socket buffer either.
--
James Andrewartha

Network & Projects Engineer
Christ Church Grammar School
Claremont, Western Australia
Ph. (08) 9442 1757
Mob. 0424 160 877
Marc Abramowitz
2013-09-06 14:18:18 UTC
Permalink
Hi James,

I've seen a similar behavior when using pymssql 1.0.2 with newer versions
of FreeTDS. No rows seem to come back. I didn't look in depth at
diagnosing and debugging it though, because 2 things seem to work around
the problem:

1. Use an older version of FreeTDS. E.g.: the one packaged with our Ubuntu
10 systems, which is 0.82-6build1. This is reasonable for production.
2. Use a newer version of pymssql -- e.g.: the pymssql 2.X versions
installed with pip or directly from the source -- I do this for
development because I happen to do a lot of maintenance work on pymssql.
Personally, I'd love to have more folks looking at pymssql 2.X, because
I've invested time in it, but that's my own bias. :-)

I am very curious about the incompatibility between newer FreeTDS and the
older pymssql, so I will be interested to see what you find out.

Marc
Post by James Andrewartha
On 6/09/13 5:05 PM, "James Andrewartha" <jandrewartha at ccgs.wa.edu.au>
Post by James Andrewartha
I have a python script that uses pymssql 1.0.2, that in turn uses
FreeTDS 0.91, connecting to a MSSQL 2008 server. It used to work, but
now I don't get any results. The FreeTDS log (below) show the data is
received, but it's not processed for some reason. I did some code diving
and printf debugging, and dblib.c sets the results_state to empty
because of a TDS_COMPUTEFMT_RESULT (line 4698 in dbsqlok). Does anyone
know why this is happening?
A few other things about that log - it's actually from the development
tarball, and the data processed is exactly the first packet received - the
log doesn't record any others, and the TCP window fills up so it's not
pulling them out of the socket buffer either.
--
James Andrewartha
Network & Projects Engineer
Christ Church Grammar School
Claremont, Western Australia
Ph. (08) 9442 1757
Mob. 0424 160 877
_______________________________________________
FreeTDS mailing list
FreeTDS at lists.ibiblio.org
http://lists.ibiblio.org/mailman/listinfo/freetds
James Andrewartha
2013-09-09 07:56:41 UTC
Permalink
Post by Marc Abramowitz
I've seen a similar behavior when using pymssql 1.0.2 with newer versions
of FreeTDS. No rows seem to come back. I didn't look in depth at
diagnosing and debugging it though, because 2 things seem to work around
1. Use an older version of FreeTDS. E.g.: the one packaged with our Ubuntu
10 systems, which is 0.82-6build1. This is reasonable for production.
This is what I've gone for.
Post by Marc Abramowitz
2. Use a newer version of pymssql -- e.g.: the pymssql 2.X versions
installed with pip or directly from the source -- I do this for
development because I happen to do a lot of maintenance work on pymssql.
Personally, I'd love to have more folks looking at pymssql 2.X, because
I've invested time in it, but that's my own bias. :-)
I am very curious about the incompatibility between newer FreeTDS and the
older pymssql, so I will be interested to see what you find out.
I've reached my limit of understanding, having a workaround and seeing
that it's a well-known problem means I'm not going to do any futher
investigation. One tidbit is that it's reported to work on amd64:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=709210 although I
briefly tested on Ubuntu 12.04 amd64 and it didn't work. That particular
machine also uses Tiny TDS 0.5.0 (which is a ruby binding) without any
problems.
--
James Andrewartha
Network & Projects Engineer
Christ Church Grammar School
Claremont, Western Australia
Ph. (08) 9442 1757
Mob. 0424 160 877
Randy Syring
2013-09-09 12:30:51 UTC
Permalink
Post by James Andrewartha
That particular
machine also uses Tiny TDS 0.5.0 (which is a ruby binding) without any
problems.
I'm pretty sure Tiny TDS uses its own bundled version of FreeTDS and
doesn't rely on the system FreeTDS. We do the same with pymssql.

*Randy Syring*
Husband | Father | Redeemed Sinner

/"For what does it profit a man to gain the whole world
and forfeit his soul?" (Mark 8:36 ESV)/
James Andrewartha
2013-09-09 12:40:43 UTC
Permalink
From: Randy Syring <rsyring at gmail.com<mailto:rsyring at gmail.com>>
On 09/09/2013 03:56 AM, James Andrewartha wrote:

That particular
machine also uses Tiny TDS 0.5.0 (which is a ruby binding) without any
problems.



I'm pretty sure Tiny TDS uses its own bundled version of FreeTDS and doesn't rely on the system FreeTDS. We do the same with pymssql.

Hmm, this one is linked against the system libsybdb.s o:

libsybdb.so.5 => /usr/lib/x86_64-linux-gnu/libsybdb.so.5 (0x00007f7c84849000)

It also uses /etc/freetds/freetds.conf, which is what made me think of it. I can't speak too much to the app or how the gems were compiled for it, it's written and managed by a third party.

--
James Andrewartha
Network & Projects Engineer
Christ Church Grammar School
Claremont, Western Australia
Ph. (08) 9442 1757
Mob. 0424 160 877
Marc Abramowitz
2013-09-09 20:57:09 UTC
Permalink
I believe that I just found the problem and it's in pymssql 1.0.2. The
following patch to the C code in pymssql fixes the problem for me and
makes a simple script return a result row, whereas the unpatched version
doesn't (with FreeTDS XXX):

https://gist.github.com/msabramo/6501240


So this would indicate that the problem is with pymssql and not FreeTDS.

Out of curiosity, does anyone know why the old code worked with older
versions of FreeTDS? As far as I can tell, dbnumcols should normally
return a positive integer with the number of columns in the result. The
old code is looking for dbnumcols to return a value <= 0. Presumably, this
used to mean something in older versions of FreeTDS?

Marc
James Andrewartha
2013-09-10 03:05:22 UTC
Permalink
Hi Marc,
Post by Marc Abramowitz
I believe that I just found the problem and it's in pymssql 1.0.2. The
following patch to the C code in pymssql fixes the problem for me and
makes a simple script return a result row, whereas the unpatched version
https://gist.github.com/msabramo/6501240
This patch works for me too - thanks for tracking it down.
--
James Andrewartha
Network & Projects Engineer
Christ Church Grammar School
Claremont, Western Australia
Ph. (08) 9442 1757
Mob. 0424 160 877
James K. Lowden
2013-09-10 03:32:37 UTC
Permalink
On Mon, 9 Sep 2013 20:57:09 +0000
Post by Marc Abramowitz
https://gist.github.com/msabramo/6501240
So this would indicate that the problem is with pymssql and not FreeTDS.
Out of curiosity, does anyone know why the old code worked with older
versions of FreeTDS? As far as I can tell, dbnumcols should normally
return a positive integer with the number of columns in the result.
The old code is looking for dbnumcols to return a value <= 0.
http://code.google.com/p/pymssql/source/browse/mssqldbmodule.c?name=1.1.0

It looks like the current code reflects your patch?

dbnumcols returns the number of columns in the resultset. That would
be zero if there are no columns. It never returns an error or a
negative number. It's been a long time since there's been any need to
touch it.

The pymssql code looks a little confused by my lights, perhaps because
the ODBC (and probably Python DBI) definition of "results" doesn't
match that of db-lib.

The comment in the code says ""DECLARE @a INT; SELECT 1" returns two
result sets: one with zero columns, followed by one with one column. "
Strictly speaking that's not true: the server returns only one:

$ TDSDUMP=stdout bsqldb -q -S$S <<< 'declare @a int select 1 as a' |
sed -Ene '/packet/,/^$/p' ...
net.c:740:Sending packet
0000 01 01 00 42 00 00 01 00-64 00 65 00 63 00 6c 00 |...B.... d.e.c.l.|
0010 61 00 72 00 65 00 20 00-40 00 61 00 20 00 69 00 |a.r.e. . @.a. .i.|
0020 6e 00 74 00 20 00 73 00-65 00 6c 00 65 00 63 00 |n.t. .s. e.l.e.c.|
0030 74 00 20 00 31 00 20 00-61 00 73 00 20 00 61 00 |t. .1. . a.s. .a.|
0040 0a 00 - |..|

net.c:621:Received packet
0000 04 01 00 21 00 35 01 00-81 01 00 00 00 20 00 38 |...!.5.. ..... .8|
0010 01 61 00 d1 01 00 00 00-fd 10 00 c1 00 01 00 00 |.a...... ........|
0020 00 - |.|

That's one D1(ROW) and one FD(DONE) packet.

In db-lib, things like output variables and result status from stored
procedures are are fetched with special API functions, so you do things
like

while dbresults
dbnumcols, dbbind
while dbnextrow
for each column ....
or
for each dbnumalts /* compute rows */
dbcount
for each dbnumrets /* output variables */
dbretstat

That's a little messy and because not every DBMS has a notion of
compute rows, output variables, and return status, ODBC folds them into
"results". IOW, with ODBC you land at the top of the results loop more
often for each query executed.

At a glance, it looks like the pymssql code is emulating that sort of
thing. The last_results variable gets flung around a lot, and
dbresults is called from more than one place. That's an invitation to
error: easy to get wrong and hard to get right.

Not a FreeTDS version issue, though. Should work equally well with the
current release as with 0.82.

--jkl
Marc Abramowitz
2013-09-10 14:59:23 UTC
Permalink
Thanks James, for going above and beyond the call of duty, by looking at
pymssql code. That was much more detail than I could have reasonably
expected, so thank you!
Post by James K. Lowden
http://code.google.com/p/pymssql/source/browse/mssqldbmodule.c?name=1.1.0
It looks like the current code reflects your patch?
More info then you probably care about pymssql, but yeah, the mercurial
repo you looked at already reflects my patch. In fact, I came up with the
patch by observing that code built from Mercurial worked, whereas the
released
pymssql 1.0.2 on PyPI did not and then diffing and looking for a likely
candidate. So someone fixed this at some point in the repo, but it never
got included in an official pymssql 1.x release. And then folks stopped
maintaining the 1.x code (written in C) and started a whole new 2.X
version,
that is written with Cython. That code doesn't seem to have this problem
either, and that code is more actively maintained now (mostly by me), but
it's
never been officially released, so it isn't as proven in production.


My interest here was in possibly making a very small tweak to the old 1.X
code, so that there could be a 1.0.3 version that fixes only this one bug.

If you get curious about how the new pymssql handles this (any comments
from you here are not expected but would be much appreciated), look here:

https://github.com/pymssql/pymssql/blob/master/_mssql.pyx#L1036


Thanks again for a very thoughtful and helpful reply before.

Marc
James K. Lowden
2013-09-12 03:37:58 UTC
Permalink
On Tue, 10 Sep 2013 14:59:23 +0000
Post by Marc Abramowitz
More info then you probably care about pymssql
My bread and water these days comes from Continuum Analytics, so I have
great interest in things db pythonic.
Post by Marc Abramowitz
And then folks stopped maintaining the 1.x code (written in C) and
started a whole new 2.X version, that is written with Cython. That
code doesn't seem to have this problem either, and that code is more
actively maintained now (mostly by me), but it's
never been officially released, so it isn't as proven in production.
I'm afraid I don't follow. There's a 2.x version not in "production"?
If I'm using garden-variety Python, which pymssql should I have?
Post by Marc Abramowitz
https://github.com/pymssql/pymssql/blob/master/_mssql.pyx#L1036
# Since python doesn't have a do/while loop do it this way
while True:
with nogil:
self.last_dbresults = dbresults(self.dbproc)
self.num_columns = dbnumcols(self.dbproc)
if self.last_dbresults != SUCCEED or self.num_columns
break
check_cancel_and_raise(self.last_dbresults, self)

You're supposed to call dbnumcols() only after dbresults returns
SUCCEED.

Wouldn't this work better as

self.clear_metadata()
while (self.last_dbresults = dbresults(self.dbproc)) == SUCCEED:
self._rows_affected = dbcount(self.dbproc)
if (self.num_columns = dbnumcols(self.dbproc)) > 0:
column_names = list()
column_types = list()
for col in xrange(1, self.num_columns + 1):
column_names.append(dbcolname(self.dbproc, col))
coltype = dbcoltype(self.dbproc, col)
column_types.append(get_api_coltype(coltype))
self.column_names = tuple(column_names)
self.column_types = tuple(column_types)
return True
if (mumble = dbnumalts(self.dbproc)) > 0
# do the needful and
return True

if self.last_dbresults == NO_MORE_RESULTS or \
self.last_dbresults == NO_MORE_RPC_RESULTS:
self.clear_metadata()
if dbhasretstat(self.dbproc):
self.retstatus = dbretstatus(self.dbproc)
# handle output parameters
# (not currently supported)
return False

check_cancel_and_raise(self.last_dbresults, self)
return False

That way, each db-lib function is called just once, and every return
value of dbresults is handled.

Studying this turned up a little bug in FreeTDS. Consider

create table #t( t int )
insert into #t values (1)
insert into #t values (2)

TDS has a variety of DONE packets, and DONE handling is one of the more
difficult parts to get right. The problem is partly that the protocol
has changed over the years, partly in anticipating when the server
has stopped sending data, and partly (as I alluded to in my last
message) because libtds supports different APIs with different notions
of "results".

Be that as it may, DONE packets carry the "rows affected" information
that dbcount returns. It's one of the murkier corners of db-lib,
because there's no correspondence between the application's
opportunities to call dbcount and the number of DONE packets that the
server sends.

When the above SQL is submitted as a single batch, the server returns 3
DONE packets, one for each statement. dbresults returns SUCCEED after
the *first* done packet, the one resulting from the CREATE TABLE
statement, for which no count is valid. (A bit in the DONE packet
signifies whether the count value can be trusted, cf. dbiscount.)

So the poor user calls dbresults, gets SUCCEED, calls dbiscount, gets
FALSE. WTF?

When dbresults is called a second time, the library returns to the
TDS stream and parses out the next two DONE packets, sets the count to
1 (twice, once per packet), and then returns NO_MORE_RESULTS. At that
point dbiscount might return TRUE, but the user isn't supposed to call
dbcount after unless dbresults returned SUCCEED. :-(

This is what happens when the folks writing the library are learning as
they go.

The corrective is both simple and difficult. Simply, every TDS packet
carries a "more results" bit, telling the client whether or not the
server is finished sending. For DONE tokens, db-lib should continue
reading the stream until either that bit is 0 -- indicating SUCCESS,
but no rows returned -- or another kind of packet arrives, say a
row-format packet.

The difficult part is that, as I said, this is one of the gnarlier
parts of the library. For ODBC, each DONE packet constitutes a
result. For db-lib, only the last one does. So the client library
passes into libtds a "read until" parameter, meaning that the libraries
have shared state. It's enough to make you want to take up knitting
instead.

The complexity is entirely unnecessary; it's an accidental outgrowth
the organic way in which libtds evolved. And I have a very nice design
for libtds2. What it needs is someone with time.

--jkl
Marc Abramowitz
2013-09-12 16:42:13 UTC
Permalink
Hi James,

Again, way more info than I could have reasonably expected, so thank you
very much for taking the time!

There is a lot of info here and it's probably going to take me a bit of
time to process, but I wanted to make sure to thank you and answer your
easy questions.
Post by James K. Lowden
My bread and water these days comes from Continuum Analytics, so I have
great interest in things db pythonic.
Apologies for making the assumption that you would only be interested in
FreeTDS-proper and not in stuff that uses it, like pymssql. Folks at CA
are probably extremely comfortable with Cython -- that means that you guys
would be fantastic at reviewing the pymssql 2.x code :-)
Post by James K. Lowden
I'm afraid I don't follow. There's a 2.x version not in "production"?
If I'm using garden-variety Python, which pymssql should I have?
OK, let me clarify. pymssql 1.x is written mostly in C with a Python
wrapper and if you look on PyPI, it's the latest release:

https://pypi.python.org/pypi/pymssql/1.0.2


This version has been around since 2009 and this is the version that my
company (SurveyMonkey) is using in production, because it works fine for
our needs. This version has the buggy code I posted earlier which causes
it to not get any results if you use it with newer versions of FreeTDS,
but it works fine with FreeTDS 0.82 (I don't understand why but it does).

However, if you create a virtualenv and do `pip install pymssql` (not
asking specifically for 1.0.2), you'll actually get a 2.X development
snapshot (because pip currently follows links to our Google Code repo --
my understanding is that new pip versions will stop doing this soonish and
will only download from PyPI, but I digress):

marca at marca-mac:~$ virtualenv /tmp/foobar
?
marca at marca-mac:~$ source /tmp/foobar/bin/activate
(foobar)marca at marca-mac:~$ pip install pymssql
?

(foobar)marca at marca-mac:~$ pip freeze | grep pymssql
Warning: cannot find svn location for pymssql==2.0.0b1-dev-20130403
pymssql==2.0.0b1-dev-20130403


This is a tarball of the 2.X version that I uploaded to Google Code in
April. This is the 100% Cython code. I use this personally for development
and it works very well for me. It works with every version of FreeTDS that
I've tried including FreeTDS that I've built straight from Gitorious and
FreeTDS 0.82 (I think it even works with FreeTDS < 0.7 in some older Linux
distros and FreeBSD 8.1). It also appears to be significantly faster than
the 1.X code, probably because it all compiles down to C and then machine
code and doesn't have any interpreted Python code. It has features that
the 1.X code doesn't have, though not features that everyone needs,
especially if going through an ORM or abstraction layer like SQLAlchemy.
It also has a branch for "python3" which works well -- at some point soon,
I'd like to merge that into the mainline and do a release with it. We
don't use this version in production at SurveyMonkey, because we don't
have a compelling reason to risk upgrading. I don't know if other folks
use this in production or not. I'd guess that a lot of folks would shy
away from something that is not officially on PyPI, but I could be wrong.


If folks are doing new development with pymssql, I'd steer them towards
the 2.x version, as it has more features, better performance, a test
suite, Python 3 compatibility coming soon, and it's been getting commits.
Post by James K. Lowden
Wouldn't this work better as...
Going to review this later when I have more time for open-source hacking.
My guess is that what you suggested is probably better because you know
db-lib intimately well :-)

Speaking of db-lib, what is the best place to read to get advice about how
to use the API -- i.e.: the kind of info that you wrote about (e.g.:
"You're supposed to call dbnumcols() only after dbresults returns
SUCCEED."). I've glanced at the FreeTDS manual in the past and I've
sometimes looked at some of the docs on individual functions from
Microsoft and Sybase, but I don't know if I've ever seen as much detail
about the overall process as you mentioned in your email. A canonical
example of result processing would be handy (though perhaps I have that
now from your email :-))

Thanks again for the boat loads of info and for looking at the pymssql
code and making suggestions!

Cheers,
Marc
Post by James K. Lowden
On Tue, 10 Sep 2013 14:59:23 +0000
Post by Marc Abramowitz
More info then you probably care about pymssql
My bread and water these days comes from Continuum Analytics, so I have
great interest in things db pythonic.
Post by Marc Abramowitz
And then folks stopped maintaining the 1.x code (written in C) and
started a whole new 2.X version, that is written with Cython. That
code doesn't seem to have this problem either, and that code is more
actively maintained now (mostly by me), but it's
never been officially released, so it isn't as proven in production.
I'm afraid I don't follow. There's a 2.x version not in "production"?
If I'm using garden-variety Python, which pymssql should I have?
Post by Marc Abramowitz
https://github.com/pymssql/pymssql/blob/master/_mssql.pyx#L1036
# Since python doesn't have a do/while loop do it this way
self.last_dbresults = dbresults(self.dbproc)
self.num_columns = dbnumcols(self.dbproc)
if self.last_dbresults != SUCCEED or self.num_columns
break
check_cancel_and_raise(self.last_dbresults, self)
You're supposed to call dbnumcols() only after dbresults returns
SUCCEED.
Wouldn't this work better as
self.clear_metadata()
self._rows_affected = dbcount(self.dbproc)
column_names = list()
column_types = list()
column_names.append(dbcolname(self.dbproc, col))
coltype = dbcoltype(self.dbproc, col)
column_types.append(get_api_coltype(coltype))
self.column_names = tuple(column_names)
self.column_types = tuple(column_types)
return True
if (mumble = dbnumalts(self.dbproc)) > 0
# do the needful and
return True
if self.last_dbresults == NO_MORE_RESULTS or \
self.clear_metadata()
self.retstatus = dbretstatus(self.dbproc)
# handle output parameters
# (not currently supported)
return False
check_cancel_and_raise(self.last_dbresults, self)
return False
That way, each db-lib function is called just once, and every return
value of dbresults is handled.
Studying this turned up a little bug in FreeTDS. Consider
create table #t( t int )
insert into #t values (1)
insert into #t values (2)
TDS has a variety of DONE packets, and DONE handling is one of the more
difficult parts to get right. The problem is partly that the protocol
has changed over the years, partly in anticipating when the server
has stopped sending data, and partly (as I alluded to in my last
message) because libtds supports different APIs with different notions
of "results".
Be that as it may, DONE packets carry the "rows affected" information
that dbcount returns. It's one of the murkier corners of db-lib,
because there's no correspondence between the application's
opportunities to call dbcount and the number of DONE packets that the
server sends.
When the above SQL is submitted as a single batch, the server returns 3
DONE packets, one for each statement. dbresults returns SUCCEED after
the *first* done packet, the one resulting from the CREATE TABLE
statement, for which no count is valid. (A bit in the DONE packet
signifies whether the count value can be trusted, cf. dbiscount.)
So the poor user calls dbresults, gets SUCCEED, calls dbiscount, gets
FALSE. WTF?
When dbresults is called a second time, the library returns to the
TDS stream and parses out the next two DONE packets, sets the count to
1 (twice, once per packet), and then returns NO_MORE_RESULTS. At that
point dbiscount might return TRUE, but the user isn't supposed to call
dbcount after unless dbresults returned SUCCEED. :-(
This is what happens when the folks writing the library are learning as
they go.
The corrective is both simple and difficult. Simply, every TDS packet
carries a "more results" bit, telling the client whether or not the
server is finished sending. For DONE tokens, db-lib should continue
reading the stream until either that bit is 0 -- indicating SUCCESS,
but no rows returned -- or another kind of packet arrives, say a
row-format packet.
The difficult part is that, as I said, this is one of the gnarlier
parts of the library. For ODBC, each DONE packet constitutes a
result. For db-lib, only the last one does. So the client library
passes into libtds a "read until" parameter, meaning that the libraries
have shared state. It's enough to make you want to take up knitting
instead.
The complexity is entirely unnecessary; it's an accidental outgrowth
the organic way in which libtds evolved. And I have a very nice design
for libtds2. What it needs is someone with time.
--jkl
_______________________________________________
FreeTDS mailing list
FreeTDS at lists.ibiblio.org
http://lists.ibiblio.org/mailman/listinfo/freetds
James K. Lowden
2013-09-13 05:05:09 UTC
Permalink
Hi Marc,
Post by Marc Abramowitz
Speaking of db-lib, what is the best place to read to get advice
about (e.g.: "You're supposed to call dbnumcols() only after
dbresults returns SUCCEED.").
Everything I know about how db-lib is *supposed* to work comes from
reading the vendors' reference manuals, many times and very carefully.
Over the years I guess I've crawled inside the head of whoever designed
it. (I don't know who that is, but I'd like to meet him!) It's
actually a very simple library with a few strange gargoyles.
Post by Marc Abramowitz
A canonical example of result processing would be handy
(though perhaps I have that now from your email :-))
That is why I wote the Tenderfoot section of the UG. :-)
Post by Marc Abramowitz
Thanks again for the boat loads of info and for looking at the pymssql
code and making suggestions!
Sure thing. Happy hacking!

--jkl
Marc Abramowitz
2013-09-13 18:34:44 UTC
Permalink
Post by James K. Lowden
Post by Marc Abramowitz
A canonical example of result processing would be handy
(though perhaps I have that now from your email :-))
That is why I wote the Tenderfoot section of the UG. :-)
Wow, I somehow missed that. That looks awesome -- will be reading that in
depth later?

Marc

Loading...