Friday, January 7, 2011

A new App Engine datastore API

This post is primarily intended for App Engine users (and of those, only Python users :-).

Over the past months I've been working on a new design for the Python datastore API, under the code name Datastore Plus. The new design is very ambitious, and changes a lot of things:
  • New, cleaner implementations of Key, Model, Property and Query classes
  • High-level asynchronous API using Python generators as coroutines (PEP 342)
The design is meant to eventually replace the existing db package in the App Engine runtime library, but for now, it is just an open source project which you have to download and copy into your application.
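
For readers unfamiliar with generators used as coroutines, here is a minimal, self-contained sketch of the idea. It is not the Datastore Plus implementation or its API: `Future`, `Loop`, `run_coroutine`, `fake_get_async` and `FAKE_STORE` are hypothetical names invented for this illustration, and the "datastore" is just a dict. The sketch also relies on Python 3, where a generator may `return` a value; Python 2 generators cannot do that, so code written for Python 2 needs another way to hand back a result.

```python
from collections import deque


class Future:
    """Placeholder for a result that is not available yet (hypothetical helper)."""

    def __init__(self):
        self.done = False
        self.result = None
        self._callbacks = []

    def set_result(self, value):
        self.done = True
        self.result = value
        for callback in self._callbacks:
            callback(self)

    def add_callback(self, callback):
        # Call immediately if the result is already in, otherwise remember it.
        if self.done:
            callback(self)
        else:
            self._callbacks.append(callback)


class Loop:
    """A tiny run queue standing in for a real event loop (hypothetical)."""

    def __init__(self):
        self.ready = deque()

    def call_soon(self, fn, *args):
        self.ready.append((fn, args))

    def run(self):
        while self.ready:
            fn, args = self.ready.popleft()
            fn(*args)


def run_coroutine(loop, gen):
    """Trampoline: drive a generator, resuming it whenever a yielded Future completes."""
    outer = Future()

    def step(value=None):
        try:
            waited = gen.send(value)  # the first send(None) starts the generator
        except StopIteration as stop:
            outer.set_result(stop.value)  # the generator's return value
            return
        waited.add_callback(lambda fut: loop.call_soon(step, fut.result))

    loop.call_soon(step)
    return outer


FAKE_STORE = {"alice": 30, "bob": 25}  # stand-in for a datastore


def fake_get_async(loop, key):
    """Pretend RPC: the answer arrives on a later turn of the loop."""
    fut = Future()
    loop.call_soon(fut.set_result, FAKE_STORE.get(key))
    return fut


def total_age(loop, keys):
    # Issue every request first, then wait for each result in turn.
    futures = [fake_get_async(loop, key) for key in keys]
    total = 0
    for fut in futures:
        age = yield fut  # suspended here until the trampoline sends the result
        total += age
    return total


loop = Loop()
result = run_coroutine(loop, total_age(loop, ["alice", "bob"]))
loop.run()
print(result.result)  # prints 55
```

The point of the pattern is that `total_age` reads like straight-line code, while the trampoline in `run_coroutine` decides when to resume it; both fake requests are already in flight before the first `yield`.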

I am not at all finished with this design, but I believe in listening to users, so I am making a preliminary version of the new API available for review. Please send me your thoughts, either as comments on this blog, or via private mail to guido (at) google.com. Note that the implementation works, but I cannot guarantee that the API won't change.

Documentation is here: http://goo.gl/D6Onw

The project to check out is here: http://goo.gl/GapXI

You must use Mercurial to check out the project, but it's fine to check it out anonymously -- I don't require anybody to log in to look or comment. (If using Mercurial is too much of a burden, there's also a zipfile on the site, but I don't plan to update it frequently.)

I'm interested in receiving any kind of feedback at all. It would help me if you could clarify whether your feedback is about an issue with the documentation, an issue with the implementation, or an issue with the API design -- though I realize you can't always tell the difference. :-)

(You can also comment on the thread in the google-appengine-python group here: https://groups.google.com/group/google-appengine-python/browse_thread/thread/454cb81d49e759f2.)

9 comments:

Anthony Mills said...

When's Google App Engine getting updated to a later version of Python?

Sorry, couldn't resist. :)

Sounds like a neat project, kind of like C# 5.0's async stuff.

Chris Tan said...

Very exciting work! I've read about "trampoline" before but haven't yet had a chance to use it.

Will you be experimenting with it in Rietveld?

maSnun said...

A zip file would be great. Thanks.

Guido van Rossum said...

I'm not here to comment on other aspects of App Engine. Stay tuned...

Rietveld is probably not the first thing I'll try this on, it's a rather large program by now...

Zip file uploaded:
http://code.google.com/p/appengine-ndb-experiment/downloads/detail?name=appengine-ndb-experiment.zip

PJE said...

Interesting... it looks like there's some overlap here with the async WSGI+futures discussion on the Web-SIG these last couple days.

I just did a sketch of a WSGI 2 async API using a PEP 380-style coroutine implementation... perhaps an async WSGI API could have some legs after all, at least in contexts like App Engine.

Guido van Rossum said...

@PJE I noticed the same overlap. However App Engine is not friendly to what most people seem to think is the most important use of async in WSGI: producing the response incrementally. App Engine always collects the entire response in a buffer and only starts shipping it to the client when the code is finished. (OTOH for serving and uploading large files, App Engine has different solutions which do not go through the app code at all. So any worries people might have about this are unnecessary. It's just a different philosophy.)

David W. said...

I don't mean to be unnecessarily negative, but this looks like a multi-car pile up involving a pretty computer science project, mode bit-in-waiting[X], and all the existing terminology used for datastore. And because It's Guido, guaranteed some subset of people will start using it over the existing API, and so the Python support forums will forever more require one further step in determining if the Query class referred to is "the old one or the other one".

In the meantime, while it's pretty, it fails to address one of the principal problems with the current API: it's completely Google-specific. For moving to and from App Engine, this is a much more interesting, immediate, and woefully difficult problem than some effortless asynchrony and banishing the evilly clumsy bastion-of-Java design of the current API.

Please consider evolving the current API in some backwards-compatible manner in favour of starting afresh.

[X] http://www.bookjive.com/wiki/Book:The_Soul_of_a_New_Machine

Fh said...

Hi Guido,

Cool stuff, I was eager to test it since I read (and re-read) the docs.

I have one big problem though: No DatetimeProperty!

I'm used to having updated_at and created_at fields on all my models for convenience reasons. I just discovered that Datetime properties haven't been implemented yet.

Why is that? How long should I wait for them? Is using the old db in the meantime the best solution to this problem?

Thanks!

Guido van Rossum said...

Hi Fh, the only reason DatetimeProperty (and a bunch of others) hasn't been implemented yet is that I didn't think they'd be particularly interesting. You could probably code them up yourself -- I'd happily take a patch (against the latest Hg archive please) uploaded to Rietveld (codereview.appspot.com).

As a workaround, I think you could use a GenericProperty for now, which does have Datetime support. Just set the value to a datetime instance.