Monday, January 24, 2011

Asynchronous RPC in App Engine Today

While I was laying the groundwork for a new datastore client library with support for asynchronous requests, I added some low-level support for asynchronous RPCs that you can use today. The only App Engine API with documented support for asynchronous RPCs is urlfetch, and it happens to be quite useful with that.

Suppose you want to fetch some data from a remote service. The remote service has two instances, both of which are slightly flaky. What you want to do is send off requests to both servers simultaneous (this is the easy part) and then wait for the first one to give you a result. The latter uses the new API that I'm about to describe here.

from google.appengine.api import urlfetch, apiproxy_stub_map

urls = ['http://service1.com', 'http://service2.com'] # Etc.

rpcs = []
for url in urls:
rpc = urlfetch.create_rpc(deadline=1.0)
urlfetch.make_fetch_call(rpc, url)
rpcs.append(rpc)

rpc = apiproxy_stub_map.UserRPC.wait_any(rpcs)
# Now rpc is the first rpc that returned a result. Have at it!

That's all! If you're interested in learning more about this handy class method, just check out its docstring in the App Engine SDK. Note that technically you should loop until it doesn't return None.

You can also repeatedly call wait_any() to get subsequent result. Make sure to remove the rpc it returns (if any) from the list, since otherwise it will return the same rpc over and over again: the specification of wait_any() says it returns the first rpc in the given list that completes, regardless of whether you have seen it before.

Also note that there currently is no way to cancel the other RPCs, which is why I passed a low deadline to the create_rpc() call. The problem is that even if you completely ignore the other RPCs, the App Engine runtime still waits for them to finish or timeout.

Finally, there is also a similar class method UserRPC.wait_all(), which waits until all RPCs in the list you pass it are complete. (It doesn't return anything.)

PS. Don't look too closely at the implementation of these methods. It may change as we think of a better way to do it. But we're committed to the API.

Friday, January 7, 2011

A new App Engine datastore API

This post is primarily intended for App Engine users (and of those, only Python users :-).

Over the past months I've been working on a new design for the Python datastore API, under the code name Datastore Plus. The new design is very ambitious, and changes a lot of things:
  • New, cleaner implementations of Key, Model, Property and Query classes
  • High-level asynchronous API using Python generators as coroutines (PEP 342)
The design is meant to eventually replace the existing db package in the App Engine runtime library, but for now, it is just an open source project which you have to download and copy into your application.

I am not at all finished with this design, but I believe in listening to users, so I am making a preliminary version of the new API available for review. Please send me your thoughts, either in this blog, or via private mail to guido (at) google.com. Note that the implementation works, but I cannot guarantee that it won't change.

Documentation is here: http://goo.gl/D6Onw

The project to check out is here: http://goo.gl/GapXI

You must use Mercurial to check out the project, but it's fine to check it out anonymously -- I don't require anybody to log in to look or comment. (If using Mercurial is too much of a burden, there's also a zipfile on the site, but I don't plan to update it frequently.)

I'm interested in receiving any kind of feedback at all. It would help me if you could clarify whether your feedback is about an issue with the documentation, an issue with the implementation, or an issue with the API design -- though I realize you can't always tell the difference. :-)

(You can also comment on the thread in the google-appengine-python group here: https://groups.google.com/group/google-appengine-python/browse_thread/thread/454cb81d49e759f2.)