Setting up a ZEO Cluster with session tracking.

Revised version.

Meta:

Valid for:  Silva 0.9.x
Author:     Jan-Wijbrand Kolman
Email:      jw@infrae.com
CVS:        $Revision: 1.5 $ $Date: 2003/04/07 09:40:52 $ 

REVISION NOTES:

Although the latest version of the ZEO (Zope Enterprise Objects) software is version 2, this document still covers version 1. Insofar as this version of ZEO or the related software versions are no longer available online, Infrae can provide tar archives of these versions.

In the near future Infrae will start testing version 2 of ZEO (and its related software) and update this document accordingly.

Introduction

These notes were collected during a ZEO Cluster setup process to test Silva in a clustered environment. Most of the information can be found on the www.zope.org website, although it is either slightly outdated or targets a specific platform or usage.

The document assumes knowledge of and experience with setting up Zope from sources, using the Zope Session managers, etc. It also assumes a UNIX-like OS. The instructions use '$ ' for the command line prompt. Commands or code snippets which really should be typed on one line are split with a '\'.

I hope the notes provide enough instructions to set up a ZEO cluster, along with a high-level overview and sufficient pointers to more information. Feedback and corrections are highly appreciated.

Problem description

A ZEO cluster provides building blocks to set up a scalable and highly available platform for CPU intensive web applications.

Typically a ZEO cluster consists of one or more ZEO Clients and one ZEO Server. Each Client, most probably running on a separate network node, serves the actual web application. The Server provides access to one central ZODB (the object database storage) and is responsible for keeping the Clients synchronized when the database changes. Each Client keeps a local object cache in RAM to speed up object reads.

To spread the application load evenly over the different Clients, a 'front-end' or 'load balancing' server is put in place. End users connect to the Balancer, which transparently maps the requests to one of the Clients. Typically the Client and Server nodes are kept on a private network, where the Balancer acts as a "bridge" between the public and the private network.

Most web applications keep track of the end user's "state" within the application. A web application may do so by "tagging" the end user's web browser with a cookie. This cookie merely provides an identifying number to the web application. With this ID the application keeps a 'session' object per end user. This session object is a generic container to store end user state within the application.

Normally (in a non-clustered situation) this session object is not persisted in the ZODB, but kept in a RAM cache. One of the reasons is to prevent the ZODB from growing excessively: for each change to an object (and for session objects this effectively means each HTTP request), data is appended to the ZODB.

In a clustered situation however, if the Balancer maps a request to, say, Client node A in the network, that Client will set a session object in its own RAM cache. A subsequent request might be mapped to Client B, which does not have a session object in RAM identified by the end user's cookie. This will obviously break the application state tracking.

A solution is to actually persist the session object in a central database on the Server. To prevent the excessive growth mentioned above, a separate DB is set up which does not append, but just replaces the data on each (session) object change.
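The problem and its solution can be illustrated with a small stand-alone sketch. It is plain modern Python, not Zope code: the 'Client' class, the round-robin balancer and the 'cookie-42' session ID are all made up for this illustration, and a real Balancer may well use a different mapping strategy.

```python
from itertools import cycle


class Client:
    """A stand-in for one ZEO Client; all names here are illustrative."""

    def __init__(self, name, shared_store=None):
        self.name = name
        # Without a shared store, each Client keeps sessions in its own RAM.
        self.sessions = shared_store if shared_store is not None else {}

    def handle(self, session_id):
        # Count the requests seen for this session ID.
        session = self.sessions.setdefault(session_id, {"hits": 0})
        session["hits"] += 1
        return session["hits"]


def run(clients, session_id, requests):
    """Send requests round-robin over the clients, like a simple Balancer."""
    balancer = cycle(clients)
    return [next(balancer).handle(session_id) for _ in range(requests)]


# Per-Client RAM sessions: each Client only sees half of the requests.
local_hits = run([Client("A"), Client("B")], "cookie-42", 4)

# One central store shared by both Clients: the count is continuous.
central = {}
shared_hits = run([Client("A", central), Client("B", central)], "cookie-42", 4)

print(local_hits)
print(shared_hits)
```

With per-Client sessions the hit counter restarts on every Client, while with the shared store it keeps counting across Clients. The shared dictionary plays the role that the central session database plays in the real Cluster.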

Software versions used:

  • Python 2.1.3
  • Zope 2.5.1
  • ZEO version 1
  • the Berkeley DB libraries
  • Berkeley DB bindings for Zope/Python
  • the External Mount product

Rough instructions

The steps involved roughly are:

  • Set up a ZEO Server and, preferably more than one, ZEO Client(s), all able to use the ExternalMount product.

  • Set up a "scratch" Zope instance backed by a Berkeley DB with at least one user defined folder. This will create a 'bootstrap' Berkeley DB for later use.

  • Set up the Server to use both the "main" ZODB FileStorage and the BerkeleyStorage. This BerkeleyStorage will store the session objects. Put the "bootstrapped" Berkeley DB in the directory where the Server is instructed to find it.

  • Create an 'ExternalMethod' in the 'Extensions' directory of one of the Clients. This 'ExternalMethod' will later 'mount' the session database.

  • Start the Cluster. Use the ZMI of one of the running Clients to add a 'Mount via ExternalMethod'. Point this Mount to the previously created ExternalMethod and specify the id of the folder created in the 'bootstrapped' session DB. This folder is then mounted and available to store objects.

  • Create a 'Transient Object Container' in this mounted folder and instruct the 'session_data_manager' to store its session data in this 'Transient Container'. Since data in this container is actually centrally persisted in the Berkeley DB, these (session) objects are available to all Clients.

  • Set up a (load balancing) front end webserver (the Balancer) as a central point of access to the Cluster.

  • Instruct the Clients (e.g. with the 'SiteRoot' product) to generate absolute URLs corresponding to the Balancer's URL. This will make all requests go through the Balancer.
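The effect of the last step can be sketched as follows. This is only an illustration in plain modern Python of what "absolute URLs corresponding to the Balancer's URL" means. It is not how the 'SiteRoot' product is implemented, and the function name, the Client address and the 'www.example.com' host are assumptions made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def to_balancer_url(client_url, balancer_base):
    """Keep path, query and fragment; take scheme and host from the Balancer."""
    client = urlsplit(client_url)
    balancer = urlsplit(balancer_base)
    return urlunsplit((balancer.scheme, balancer.netloc,
                       client.path, client.query, client.fragment))


# A URL as a Client on the private network would see itself (hypothetical).
internal = "http://192.168.1.21:8080/silva/edit?tab=content"
public = to_balancer_url(internal, "http://www.example.com")
print(public)
```

Every link the application generates then points at the Balancer rather than at an individual Client, so follow-up requests are balanced again.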

Berkeley ZODB Storage

The "out of the box" ZODB backend is called 'FileStorage'. By design, this backend is a versionable and undoable persistent storage for (Zope) objects. Each change to an object is appended to the database file, which continues to grow unless the database is 'packed' every now and then.

This standard FileStorage backend can be replaced by a Berkeley DB backend. This can be set up in a fully versionable and undoable fashion, but also in a so-called 'Packless' fashion, which does not keep any version or undo information for the objects in the DB.
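The difference in growth between the two approaches can be shown with a toy model. The two classes below are made up for this illustration in plain modern Python; they only mimic the "append on every change" and "replace on every change" behaviour described above and say nothing about how FileStorage or the Packless storage are actually implemented.

```python
class AppendOnlyStore:
    """Toy model of a FileStorage-like log: every write appends a record."""

    def __init__(self):
        self.records = []

    def store(self, oid, data):
        self.records.append((oid, data))

    def load(self, oid):
        # The newest record for an object is its current state.
        for record_oid, data in reversed(self.records):
            if record_oid == oid:
                return data
        raise KeyError(oid)

    def size(self):
        return len(self.records)


class ReplacingStore:
    """Toy model of a Packless-like store: a write replaces the old state."""

    def __init__(self):
        self.current = {}

    def store(self, oid, data):
        self.current[oid] = data

    def load(self, oid):
        return self.current[oid]

    def size(self):
        return len(self.current)


append_only, replacing = AppendOnlyStore(), ReplacingStore()
for request_number in range(1000):
    # One session object, changed on every HTTP request.
    for store in (append_only, replacing):
        store.store("session-1", {"last_request": request_number})

print(append_only.size(), replacing.size())
```

Both stores return the same current state for the session object, but the append-only one holds a record for every request until it is packed. The price of the replacing variant is that no undo or version history is kept, which is acceptable for session data.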

A setup like this needs a working installation of the Berkeley DB version 3 libraries, Python bindings to these libraries and a Zope interface to be able to make use of this DB (see [A6], [A8], [A9]).

A 'custom_zodb.py' (see [B1]) file in the Zope instance directory (see [A1]) instructs Zope to use this BerkeleyStorage instead of the standard FileStorage.

Partitioned ZODB Storage (see [A4], [A5])

To make the ZEO Server use two ZODB backends simultaneously (two 'partitions' in the ZODB) a StorageConfig.py (see [B2]) is used (the filename actually is arbitrary). The Server is then started with no 'custom_zodb.py' file in place, but with extra startup arguments (see [B3]).
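The extra startup arguments in [B3] have the form 'name=module:attribute', for example 'main=StorageConfig:main_storage'. The sketch below shows how such an argument breaks down into its parts. 'parse_storage_spec' is a hypothetical helper written in plain modern Python for this document; it is not the parser used by the ZEO start script.

```python
def parse_storage_spec(spec):
    """Split 'name=module:attribute' into its three parts."""
    name, _, target = spec.partition("=")
    module, _, attribute = target.partition(":")
    if not (name and module and attribute):
        raise ValueError("expected name=module:attribute, got %r" % spec)
    return name, module, attribute


# The two storage arguments used in the 'start_zeo' script of [B3].
specs = ["main=StorageConfig:main_storage",
         "sessiondata=StorageConfig:session_storage"]
storages = {}
for spec in specs:
    name, module, attribute = parse_storage_spec(spec)
    storages[name] = (module, attribute)

print(storages)
```

The name on the left ('main', 'sessiondata') is the name under which the storage is served. It is the same name that the ExternalMethod in [B4] passes as 'storage' when it connects to the Server.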

Mounting Session Storage (see [A4])

After the Cluster is started, the BerkeleyStorage can be mounted. Use the ZMI of one of the Clients to add a 'Mount via ExternalMethod' object. This doesn't actually create a Zope object, but will invoke an ExternalMethod (see [B4], [B5]).

This procedure will ask for three parameters: 'Module', 'Function' and 'Path'. 'Module' is the name of the file (without the '.py' extension) in which 'Function' resides.

The 'Path' parameter value should be the name of the folder, say 'session_data', created inside the Berkeley DB using the 'scratch Zope'. By using this name, the BerkeleyStorage is mounted and accessible through the 'session_data' folder in the root of the Zope hierarchy.

The standard 'session_data_manager' puts its session objects in a 'Transient Object Container', which in turn is put inside the 'temp_folder' in the Zope root. Objects in this temp_folder are only stored in RAM.

Create a new 'Transient Object Container' inside the 'session_data' folder and instruct the 'session_data_manager' to use this transient container. Session objects are now persisted in the BerkeleyStorage and thus available throughout the Cluster.

A: References

[A1] Central Zope code base, multiple instances:
http://www.zope.org/Members/4am/instancehome
[A2] ZEO setup:
http://www.zope.org/Members/kedai/UseZeoZope
[A3] www.zope.org cluster setup:
http://www.zope.org/About/
[A4] Session tracking & ZEO:
http://www.zope.org/Members/randy/ZEO-Sessions
http://www.zope.org/Members/jgrewen/ZEO%20with%20CoreSessionTracking%20and%20a%20mounted%20Berkeley%20session%20DB
http://www.zope.org/Members/mcdonc/HowTos/UseExternalMountWithCST
[A5] Berkeley Storage Wiki and code:
http://www.zope.org/Wikis/ZODB/BerkeleyStorage
http://dev.zope.org/Wikis/DevSite/Proposals/BerkeleyStorage
[A6] ExternalMount product:
http://www.zope.org/Members/hathawsh/ExternalMount
[A7] MountedStorage:
http://www.zope.org/Members/natsukashi/Products/MountedStorage
[A8] Bindings:
Zope: http://www.zope.org/Products/bsddb3Storage
Python: http://pybsddb.sourceforge.net/
[A9] Berkeley DB:
http://www.sleepycat.com/

B: Code

[B1] 'custom_zodb.py' to use BerkeleyStorage:

import os
from bsddb3Storage.Packless import Packless

env = os.path.join('var', 'bsddb3Storage')
Storage = Packless(name='BerkeleyStorage', env=env)

[B2] 'StorageConfig.py'

import os
import ZODB.FileStorage
from bsddb3Storage.Packless import Packless

# The "main" ZODB, kept in the standard FileStorage.
main_storage = ZODB.FileStorage.FileStorage(os.path.join('var', 'Data.fs'))

# The session database, kept in a Packless BerkeleyStorage.
session_storage = Packless(
    name='BerkeleyStorage', env=os.path.join('var', 'bsddb3Storage'))

[B3] 'start_zeo' script:

#!/bin/sh
export INST_HOME=`pwd`
export INSTANCE_HOME=`pwd`
export ZOPE_HOME= [ path to zope install directory ]

exec python $ZOPE_HOME/lib/python/ZEO/start.py \
  -p 9000 \
  -D \
  ZEO_SERVER_PID=var/ZEO_SERVER.pid \
  STUPID_LOG_FILE=var/ZEO_EVENTS.log \
  -S main=StorageConfig:main_storage \
  -S sessiondata=StorageConfig:session_storage \
  "$@"

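[B4] 'Mount_Session_Data.py' ExternalMethod:

import ZODB, ZEO.ClientStorage

def mountSessionData():
    # Connect to the 'sessiondata' storage served by the ZEO Server;
    # the port and storage name match the 'start_zeo' script in [B3].
    return ZODB.DB(ZEO.ClientStorage.ClientStorage(
        ('192.168.1.20', 9000), storage='sessiondata'))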

[B5] Add 'Mount via ExternalMethod' 'object', fill in parameters:

Module = Mount_Session_Data
Function = mountSessionData
Path = [ name of mount folder ]

Copyright © 2002-2004 Infrae. All rights reserved.
See also "LICENSE.txt" in the Silva package.
