Discussion:
[freetds] dbdataready?
Marc Abramowitz
2013-09-27 00:52:02 UTC
Today, a colleague and I got excited about making pymssql "green" by
supporting "green" thread implementations like gevent by adding a callback
to pymssql that it would call after the query has been sent and before the
results are received. One could have this callback call something like
gevent.sleep() to yield control to another greenlet.

I implemented a very basic version of this that calls the callback after
calling dbsqlsend and before calling dbsqlok (which blocks) [1]. This helps
in some cases, because yielding after dbsqlsend gives other coroutines a
chance to run. It's not fully green, though: dbsqlok is a blocking
operation, so once it is called it blocks all coroutines.

What I'd really like to do is use the dbdataready [2] function of
DB-Library so that I can avoid blocking on dbsqlok. Unfortunately, it seems
that it was never implemented in FreeTDS.

Are there any plans to implement dbdataready or dbpoll, the Sybase
flavor of async?

Marc

[1] https://github.com/pymssql/pymssql/pull/133/files
[2] http://technet.microsoft.com/en-us/library/aa937037.aspx
James K. Lowden
2013-09-27 01:52:31 UTC
On Thu, 26 Sep 2013 17:52:02 -0700
Post by Marc Abramowitz
Today, a colleague and I got excited about making pymssql "green" by
supporting "green" thread implementations like gevent by adding a
callback to pymssql that it would call after the query has been sent
and before the results are received. One could have this callback
call something like gevent.sleep() to yield control to another
greenlet.
...
Post by Marc Abramowitz
What I'd really like to get do is use the dbdataready [1] function of
DB-API so that I can avoid blocking on dbsqlok.
There's no plan. In case you're interested in trying, it's either
fairly simple or nearly impossible to implement.

There are only two ways to know if a TCP socket has data waiting to be
read:

1. read(2) & friends
2. select/poll

(There are signals, too, but they don't work well for TCP connections,
nor with threads generally. So they don't count.)

FreeTDS uses poll(2). It could maintain a flag named, say, fready. It
would clear the flag before calling poll, and set it when poll returned
with an indication that data are pending on the socket. dbpoll(),
then, would simply examine the flag.
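
A rough sketch of that flag idea, in Python rather than in libtds itself. The Connection class and its method names are hypothetical stand-ins, not FreeTDS code, and select.select is used in place of poll(2) only because it is available on more platforms. A local socketpair plays the part of the server connection.

```python
import select
import socket


class Connection:
    """Hypothetical stand-in for a libtds connection; not FreeTDS code."""

    def __init__(self, sock):
        self.sock = sock
        self.fready = False  # the flag suggested above

    def wait_for_data(self, timeout):
        # Clear the flag before polling; set it only if the socket
        # is reported readable.
        self.fready = False
        readable, _, _ = select.select([self.sock], [], [], timeout)
        self.fready = bool(readable)
        return self.fready

    def data_ready(self):
        # What a dbpoll()/dbdataready() equivalent would do:
        # simply examine the flag.
        return self.fready


server, client = socket.socketpair()
conn = Connection(client)

conn.wait_for_data(0)        # nothing has been sent yet
before = conn.data_ready()

server.sendall(b"row data")
conn.wait_for_data(1.0)      # data is now pending on the socket
after = conn.data_ready()

server.close()
client.close()
```

Note that the flag only reflects what the last poll saw, which is exactly the limitation discussed next.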

(There is a small vendor incompatibility: Microsoft calls it
dbdataready; Sybase calls it dbpoll.)

It's a little more involved than that, because the notion of data
pending to be read consists of more than what poll(2) last reported.
The question dbpoll() answers is narrower: can dbresults be called now
without blocking? That's deep in the libtds state machine, which isn't
very clearly elucidated in the code.

The question in my mind that I can't look into right now is whether or
not there's always a read pending after a query is issued, or if
libtds begins reading only when requested to do so by the client
library. If there's always a read pending, then the putative flag
would be easy to implement. If not, it would take some work to get
dbpoll the information it needs.

--jkl
Marc Abramowitz
2013-10-04 16:05:55 UTC
An update for anyone who's interested.

I realized that dbpoll and dbdataready are not the droids that I'm looking
for.

What I wanted to do was make pymssql work nicely with gevent; that is, to
yield control to other greenlets (coroutines) after issuing a query, while
waiting for I/O to come back from the server. This allows the program to do
useful work while waiting for results from the database, including but not
limited to issuing more queries and having them run in parallel rather than
serially.

dbpoll or dbdataready would've allowed this, but they would've required the
app to have a polling loop where it polls for results from the server and
yields control if there aren't any.

I can achieve the goal of making pymssql a good cooperative citizen without
polling by simply using dbiordesc from FreeTDS. This gives me the file
descriptor of the socket that FreeTDS is waiting on. I can take this and
pass it to gevent.socket.wait_read and that's it. Gevent will suspend my
greenlet and resume it at some point after data has arrived on the file
descriptor. From some basic testing, this appears to work quite nicely and
allows me to write simple programs that issue multiple slow queries in
parallel.
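
A minimal sketch of the same wait-on-a-descriptor pattern, under some loud assumptions: it uses the standard library's asyncio (loop.add_reader) in place of gevent.socket.wait_read, and a local socketpair in place of the descriptor that dbiordesc would return. The wait_read and run_query helpers are hypothetical names for illustration, not pymssql or FreeTDS functions.

```python
import asyncio
import socket


async def wait_read(fd):
    """Suspend this task until fd is readable (asyncio analogue of wait_read)."""
    loop = asyncio.get_running_loop()
    ready = loop.create_future()

    def on_readable():
        # Guard against the callback firing more than once.
        if not ready.done():
            ready.set_result(None)

    loop.add_reader(fd, on_readable)
    try:
        await ready
    finally:
        loop.remove_reader(fd)


async def run_query(name, delay, log):
    # The socketpair stands in for the server connection; in pymssql the
    # descriptor would come from dbiordesc instead.
    server, client = socket.socketpair()
    loop = asyncio.get_running_loop()
    # Simulate a slow server that replies after `delay` seconds.
    loop.call_later(delay, server.sendall, b"result for " + name.encode())
    log.append("sent " + name)
    await wait_read(client.fileno())   # other tasks run while we wait
    data = client.recv(1024)
    log.append("got " + name)
    server.close()
    client.close()
    return data


async def main():
    log = []
    results = await asyncio.gather(
        run_query("slow", 0.2, log),
        run_query("fast", 0.05, log),
    )
    return log, results


log, results = asyncio.run(main())
```

Both simulated queries are sent before either result is read, and the faster one completes first, so the waits overlap instead of running serially.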

So I don't think that I have a use case for dbpoll and dbdataready anymore.
It appears that you can do async without having these, if you already have
an async engine that allows you to wait on a file descriptor.

Marc