    Friday, 22 July 2011

    Making Django a better WSGI citizen

    Aka monkey patching Django with paste.registry


    Once upon a time there were two Django projects: a legacy one and a newly developed one. The modern one was developed independently as a separate project. Each worked against its own databases (in fact the Django db backend was not used at all, just MongoDB and Redis backends), and each used a different set of middlewares and different context processors. Two healthy projects.


    Just as important, both used gevent and were deployed as WSGI servers.


    During the production deployment of the modern one we decided that if we could make them work together using wfront, then we could cut the number of Python instances running on the same cluster (read: RAM).


    In this configuration one instance of the wfront WSGI app wraps several WSGI apps within one thread. That thread is then shared by gevent, which manages request dispatch.


       +-----+    +-------------------------------------+
       |     |    |                                     |
       |     +--->| Legacy Django Project as a WSGI App |
       |     |    |                                     |
       |     |    +-------------------------------------+
       |     |
       |  W  |    +-------------------------------------+
       |     |    |                                     |
       |  F  +--->| Modern Django Project as a WSGI App |
       |     |    |                                     |
       |  R  |    +-------------------------------------+
       |     |
       |  O  |    +-------------------------------------+
       |     |    |                                     |
       |  N  +--->| Other WSGI App 1                    |
       |     |    |                                     |
       |  T  |    +-------------------------------------+
       |     |
       |     |            ...
       |     |
       |     |    +-------------------------------------+
       |     |    |                                     |
       |     +--->| Other WSGI App X                    |
       |     |    |                                     |
       +-----+    +-------------------------------------+
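
    The arrangement in the diagram can be imitated with the standard library alone. The sketch below is not wfront's implementation; HostRouter, make_app and call are hypothetical names used only to show several WSGI apps living behind one front app in a single process.

```python
from wsgiref.util import setup_testing_defaults


def make_app(label):
    """Build a trivial WSGI app that answers with its own label."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [label.encode("utf-8")]
    return app


class HostRouter:
    """Hypothetical front app: picks a wrapped WSGI app by Host header."""

    def __init__(self, routes, default):
        self.routes = dict(routes)   # host name -> WSGI app
        self.default = default       # fallback WSGI app

    def __call__(self, environ, start_response):
        # Strip an optional ":port" suffix before the lookup.
        host = environ.get("HTTP_HOST", "").split(":")[0]
        app = self.routes.get(host, self.default)
        return app(environ, start_response)


def call(app, host):
    """Invoke a WSGI app in-process and return the response body as text."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_HOST"] = host
    body = app(environ, lambda status, headers: None)
    return b"".join(body).decode("utf-8")


router = HostRouter({"legacy.example": make_app("legacy")}, make_app("modern"))
```

    Every wrapped app runs in the same interpreter, so anything one of them stores in a module-level global is visible to all the others.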
    
    

    The idea is simple, and as long as we are talking about clean WSGI apps that do not clutter globals, we are fine.


    Unfortunately Django uses globals, and the first one I ran into was the parsed settings object.
    The standard way to read settings from within a Django project is:


    from django.conf import settings

    This creates a lazy, caching proxy singleton that loads the whole content of settings.py into itself on first access. The location of settings.py is taken from os.environ['DJANGO_SETTINGS_MODULE']. Thanks to this there is a single namespace from which all Django apps and modules read their settings.
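
    To see why this is a problem for two projects in one process, here is a much simplified sketch of such a lazy singleton. It is not Django's code: LazySettingsSketch, DEMO_SETTINGS_MODULE and the two fake settings modules are made up for the illustration.

```python
import importlib
import os
import sys
import types

ENVIRONMENT_VARIABLE = "DEMO_SETTINGS_MODULE"  # hypothetical name for this demo


class LazySettingsSketch:
    """Simplified lazy proxy: loads the settings module on first access."""

    def __init__(self):
        self._wrapped = None

    def _setup(self):
        module_name = os.environ[ENVIRONMENT_VARIABLE]
        self._wrapped = importlib.import_module(module_name)

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself,
        # which in practice means the individual settings.
        if self._wrapped is None:
            self._setup()
        return getattr(self._wrapped, name)


def fake_settings_module(name, site_name):
    """Register an in-memory module so the demo needs no files on disk."""
    module = types.ModuleType(name)
    module.SITE_NAME = site_name
    sys.modules[name] = module


fake_settings_module("demo_foo_settings", "foo")
fake_settings_module("demo_bar_settings", "bar")

settings = LazySettingsSketch()

os.environ[ENVIRONMENT_VARIABLE] = "demo_foo_settings"
first = settings.SITE_NAME    # triggers the one-time load

os.environ[ENVIRONMENT_VARIABLE] = "demo_bar_settings"
second = settings.SITE_NAME   # still answered by the cached module
```

    Both reads come from the module that was loaded first; changing the environment variable afterwards has no effect on the cached singleton.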


    What we need is a separate settings.py file loaded for each of our Django-based WSGI apps. That means we have to do something about the way the settings module name is read from os.environ['DJANGO_SETTINGS_MODULE'].


    So we need to hijack the following code in django.conf:


    class LazySettings(LazyObject):
        def _setup(self):
            try:
                settings_module = os.environ[ENVIRONMENT_VARIABLE]
                if not settings_module: # If it's set but is an empty string.
                    raise KeyError
            except KeyError:
                raise ImportError("Settings cannot be imported, because environment variable %s is undefined." % ENVIRONMENT_VARIABLE)
    
            self._wrapped = Settings(settings_module)
    
    settings = LazySettings()

    Then we need to make it configurable, preferably at initialization of, or on each call to, the Django WSGI application created by application = django.core.handlers.wsgi.WSGIHandler().


    I wanted to keep the surgery to a minimum, so, inspired by a great article by Simon Willison about djng, I decided to use the paste.registry module. It is a nice way to have request-specific (thread-safe) objects. Greenlet-level safety was already provided by gevent monkey patching.


    import gevent.monkey; gevent.monkey.patch_all()


    The idea behind paste.registry is to wrap an existing WSGI app with a registry setup middleware. Then you can use another WSGI middleware to inject some data, a further one to edit it, and finally the WSGI app to read it. The data travels with the request (in the WSGI environ), but the data access API is perfect for monkey patching existing code.


                +------------+   +------------+   +-------------+
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                | WSGI       |   | WSGI       |   | WSGI        |
                |            |   |            |   |             |
                | Middleware |   | Middleware |   | Application |
                |            |   |            |   |             |
                | Registry   |   | B          |   |             |
                | Manager    |   |            |   |             |
                |            |   |            |   |             |
       Request  |            |   |            |   |             |
    --------------------------------------------->|             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
       Response |            |   |            |   |             |
    <---------------------------------------------|             |
                |            |   |            |   |             |
                +------------+   +------------+   +-------------+
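
    The flow in the diagram can be imitated with the standard library to show the mechanism. This is a simplified stand-in, not paste.registry itself: RequestLocalProxy, RegisterMiddleware and Config are hypothetical names, and the proxy is backed by threading.local (which gevent's monkey patching makes greenlet-local).

```python
import threading


class RequestLocalProxy:
    """Simplified stand-in for a stacked proxy: forwards attribute reads
    to whatever object is registered for the current thread."""

    def __init__(self, name):
        self._name = name
        self._local = threading.local()

    def _push(self, obj):
        if not hasattr(self._local, "stack"):
            self._local.stack = []
        self._local.stack.append(obj)

    def _pop(self):
        self._local.stack.pop()

    def __getattr__(self, name):
        stack = getattr(self._local, "stack", None)
        if not stack:
            raise RuntimeError("no object registered for %s" % self._name)
        return getattr(stack[-1], name)


class RegisterMiddleware:
    """Hypothetical middleware: binds `obj` to `proxy` for one request."""

    def __init__(self, app, proxy, obj):
        self.app, self.proxy, self.obj = app, proxy, obj

    def __call__(self, environ, start_response):
        self.proxy._push(self.obj)
        try:
            # Materialise the body while the object is still registered.
            return list(self.app(environ, start_response))
        finally:
            self.proxy._pop()


class Config:
    def __init__(self, site_name):
        self.SITE_NAME = site_name


# A module-level global, read by the app exactly like a plain object.
settings = RequestLocalProxy("settings")


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [settings.SITE_NAME.encode("utf-8")]


foo_app = RegisterMiddleware(app, settings, Config("foo"))
bar_app = RegisterMiddleware(app, settings, Config("bar"))


def call(wsgi_app):
    """Invoke a WSGI app in-process and return the response body as text."""
    return b"".join(wsgi_app({}, lambda status, headers: None)).decode("utf-8")
```

    The same app function reads the same global name in both cases, yet sees a different object depending on which middleware wrapped the request.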
    
    

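    Let’s start by monkey patching django.conf so that settings goes through a registry-defined object. The patch has to run before anything else executes from django.conf import settings, because a module that has already imported the name keeps its reference to the original object.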


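    from paste.registry import StackedObjectProxy

    def monkey_patch_conf():
        import django.conf
        # Replace the global settings object with a per-request proxy.
        django.conf.settings = StackedObjectProxy(name="patched Settings")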

    The next step is to initialize the settings object properly using an extra middleware. Fortunately there is already a way to initialize Settings from a specific module: just Settings(settings_module).


    class SettingsMiddleware:
        def __init__(self, app, settings_module):
            self.wrapped_app = app
            self.settings_module = settings_module
    
        def __call__(self, environ, start_response):
            # Imported here so that we pick up the patched django.conf.settings.
            from django.conf import Settings, settings
            real_settings = Settings(self.settings_module)
    
            if 'paste.registry' in environ:
                # Bind the real settings to the proxy for this request only.
                environ['paste.registry'].register(settings, real_settings)
            return self.wrapped_app(environ, start_response)
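
    The middleware above builds a new Settings object on every request. One possible variation, sketched here as an assumption rather than what was deployed, is to build it once and reuse it. To keep the example runnable without Django or Paste, the loader and the registry are passed in as stand-ins; CachingSettingsMiddleware, FakeRegistry and fake_loader are hypothetical names.

```python
class FakeRegistry:
    """Test double exposing only the register(proxy, obj) call used here."""

    def __init__(self):
        self.registered = []

    def register(self, proxy, obj):
        self.registered.append((proxy, obj))


class CachingSettingsMiddleware:
    """Variant that builds the settings object once, on the first request."""

    def __init__(self, app, proxy, settings_module, loader):
        self.wrapped_app = app
        self.proxy = proxy
        self.settings_module = settings_module
        self.loader = loader      # in real use this might be django.conf.Settings
        self._loaded = None

    def __call__(self, environ, start_response):
        if self._loaded is None:
            self._loaded = self.loader(self.settings_module)
        if "paste.registry" in environ:
            environ["paste.registry"].register(self.proxy, self._loaded)
        return self.wrapped_app(environ, start_response)


load_calls = []


def fake_loader(module_name):
    """Stand-in loader that records how often it is asked to load."""
    load_calls.append(module_name)
    return {"module": module_name}


def app(environ, start_response):
    start_response("200 OK", [])
    return [b"ok"]


proxy = object()
middleware = CachingSettingsMiddleware(app, proxy, "foo.settings", fake_loader)
registry = FakeRegistry()
for _ in range(3):
    middleware({"paste.registry": registry}, lambda status, headers: None)
```

    Three requests register the settings three times, but the loader runs only once. Whether that saving matters depends on how expensive loading the settings module is in a given project.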

    The final step is to initialize the registry and wrap it all together.


    server = WSGIServer(address, wfront.route([
                    ('localhost', RegistryManager(SettingsMiddleware(application, "foo.settings")), None),
                    ('*', RegistryManager(SettingsMiddleware(application, "bar.settings")), None),
                 ]))

    Paste's RegistryManager initializes the paste registry. Then our SettingsMiddleware loads the appropriate settings module, and finally the Django application handles the request with the appropriate settings loaded.


                +------------+   +------------+   +-------------+
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                | WSGI       |   | WSGI       |   | WSGI        |
                |            |   |            |   |             |
                | Middleware |   | Middleware |   | Application |
                |            |   |            |   |             |
                | Registry   |   | Settings   |   | Django      |
                | Manager    |   |            |   |             |
                |            |   |            |   |             |
       Request  |            |   |            |   |             |
    --------------------------------------------->|             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
                |            |   |            |   |             |
       Response |            |   |            |   |             |
    <---------------------------------------------|             |
                |            |   |            |   |             |
                +------------+   +------------+   +-------------+
    
    

    And that is all, folks. At least for the first global :) The next global was an optimization for context_processors, and patching it required pretty much the same steps.


    Code Sample


    import gevent.monkey; gevent.monkey.patch_all()
    
    # gevent.pywsgi is used here because gevent.wsgi was removed in later gevent releases.
    from gevent.pywsgi import WSGIServer
    from paste.registry import RegistryManager, StackedObjectProxy
    import wfront
    
    import logging
    log = logging.getLogger(__name__)
    
    def monkey_patch_conf():
        log.debug("monkey_patching django.conf")
        import django.conf
        django.conf.settings = StackedObjectProxy(name="patched Settings")
    
    
    class SettingsMiddleware:
        """
        A middleware that loads settings from a specific module and exposes them
        to the code handling the current request (thread or greenlet).
        Requires:
        - RegistryManager middleware
        - monkey_patch_conf
        Example usage: RegistryManager(SettingsMiddleware(wsgi_application, "foo.settings"))
        """
        def __init__(self, app, settings_module):
            self.wrapped_app = app
            self.settings_module = settings_module
    
        def __call__(self, environ, start_response):
            from django.conf import Settings, settings
            real_settings = Settings(self.settings_module)
    
            if 'paste.registry' in environ:
                environ['paste.registry'].register(settings, real_settings)
            return self.wrapped_app(environ, start_response)
    
    
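    def main():
        # Without a configured handler the log.info() messages below are not shown.
        logging.basicConfig(level=logging.INFO)
        monkey_patch_conf()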
    
        import django.core.handlers.wsgi
        application = django.core.handlers.wsgi.WSGIHandler()
    
        address = "0.0.0.0", 8000
        server = WSGIServer(address, wfront.route([
                    ('localhost', RegistryManager(SettingsMiddleware(application, "foo.settings")), None),
                    ('*', RegistryManager(SettingsMiddleware(application, "bar.settings")), None),
                 ]))
    
        try:
            log.info("Server running on %s:%d. Ctrl+C to quit" % address)
            server.serve_forever()
        except KeyboardInterrupt:
            server.stop()
            log.info("Bye bye")
    
    
    if __name__ == "__main__":
        main()
