SMP: Symmetric Multi-Processor Explained (Part 1)

Posted on: April 4th, 2012 by stephenbroeker No Comments

When I was a software engineer at Pyramid Technology, Symmetric Multi-Processors (SMPs) served as our technology backbone. An SMP system essentially consists of multiple cpus that all use the same physical memory.

Each cpu ran some flavor of the Unix operating system.  In Unix, user-space processes have separate physical address spaces, so user processes are not a problem in SMP.  But the kernel maps to hard-wired physical addresses; thus multiple cpus running the same kernel in SMP are a problem.

SMP Locks to Avoid Corruption

A problem occurs when data structures are manipulated or changed.  If the same code or related code (on different cpus) is changing the same data structure, at the same time, then data corruption can occur. To prevent this data corruption, we protect the data structures with locks.  That is, a process must obtain a read or write lock before accessing a data structure.  Such operations thus synchronize data access.
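As a rough analogy only (Python threads rather than kernel code on multiple cpus, and a plain mutex rather than separate read and write locks), here is a minimal sketch of protecting a shared data structure with a lock. The names `counter`, `counter_lock`, `add_many`, and `run` are made up for illustration.

```python
import threading

counter = 0                      # the shared data structure
counter_lock = threading.Lock()  # the lock that protects it


def add_many(n):
    """Increment the shared counter n times, taking the lock for each update."""
    global counter
    for _ in range(n):
        with counter_lock:       # only one thread at a time may run this block
            counter += 1


def run(workers=4, n=10_000):
    """Start several threads that all update the same counter, then return it."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

With the lock in place, `run(4, 10_000)` always ends at 40000, because every update to the counter is synchronized.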

So how many locks do we add?

AT&T was one of the first companies to create an SMP system: the 3B5000.  This system consisted of multiple cpus running Unix (AT&T invented Unix) and a single kernel lock.  This single lock was used to make sure that at most one process was in the kernel at any given time.

If a process was in user mode, then performance was wonderful.  On the other hand, if a process needed to be in kernel mode, then performance could be terrible.  Adding cpus did not help, since only a single process could be in the kernel at a time. Thus this lock did not scale.  That is, adding X cpus did not result in a linear increase in system performance, since all processes need the kernel at some point.
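One way to put rough numbers on this is Amdahl's law. That framing is added here for illustration, and the 25% kernel share below is a made-up figure, not a measurement from any real system. The sketch assumes that kernel time is fully serialized by the single lock while user time spreads evenly across the cpus.

```python
def speedup(cpus, kernel_fraction):
    """Amdahl's-law estimate of speedup over one cpu.

    kernel_fraction is the (assumed) share of run time spent in the kernel,
    which the single kernel lock serializes; the rest runs in parallel.
    """
    user_fraction = 1.0 - kernel_fraction
    return 1.0 / (kernel_fraction + user_fraction / cpus)


# Hypothetical workload that spends 25% of its time in the kernel.
for cpus in (1, 2, 4, 8, 16):
    print(cpus, round(speedup(cpus, 0.25), 2))
```

Under these assumptions the speedup can never reach 4x, no matter how many cpus are added.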

In this blog entry, I have described the SMP problem.  My next blog post will describe the solution.

Openstack Swift TempAuth Module

Posted on: March 28th, 2012 by stephenbroeker No Comments

Last time I presented a Swift REST API example. Now I’ll explain the Openstack Swift Test Authentication and Authorization System (tempauth). This is an excellent authentication module for Swift All In One (SAIO) and for development work.

Add Tempauth to Openstack Swift Proxy Server

The first thing that you will need to do is to add tempauth to the Proxy Server configuration. So make sure that the following is in proxy-server.conf:

[pipeline:main]
pipeline = catch_errors cache tempauth proxy-server

[app:proxy-server]
account_autocreate = true

[filter:tempauth]
use = egg:swift#tempauth
user_admin_admin = admin .admin .reseller_admin
user_test_tester = testing .admin
user_test2_tester2 = testing2 .admin
user_test_tester3 = testing3

The configuration lines under the [filter:tempauth] section that begin with “user_” define a user's login, password, and privileges. These lines are of the form: user_<login1>_<login2> = <password> <privileges>

Four Users, Four Permissions

This configuration enables tempauth and creates the following four users, each with different permissions.

1) login = admin:admin
password = admin
privileges = .admin, .reseller_admin

2) login = test:tester
password = testing
privileges = .admin

3) login = test2:tester2
password = testing2
privileges = .admin

4) login = test:tester3
password = testing3
privileges = None
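To make the line format concrete, here is a small sketch that splits one “user_” line into the login, password, and privileges listed above. It is an illustration only, not tempauth's own parser, and the function name `parse_user_line` is hypothetical. It assumes that neither login part contains an underscore.

```python
def parse_user_line(line):
    """Split 'user_<login1>_<login2> = <password> <privileges>' into its parts."""
    key, _, value = line.partition("=")
    key = key.strip()
    if not key.startswith("user_"):
        raise ValueError("not a tempauth user line: %r" % line)
    # Assumes the two login parts themselves contain no underscore.
    login1, _, login2 = key[len("user_"):].partition("_")
    fields = value.split()
    password, privileges = fields[0], fields[1:]
    return {
        "login": "%s:%s" % (login1, login2),
        "password": password,
        "privileges": privileges,   # empty list means a Non-Admin user
    }


print(parse_user_line("user_admin_admin = admin .admin .reseller_admin"))
print(parse_user_line("user_test_tester3 = testing3"))
```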

Openstack Swift Privileges

So what do the privileges mean?

1) admin
Admin users can do anything within their account.

2) reseller_admin
Reseller Admin users can do anything to any account.

3) None
Non-Admin users can only perform operations per container based on the container’s X-Container-Read and X-Container-Write ACLs.

To allow a “user” to read the objects in a container, set the container header “X-Container-Read: .r:user”.

To allow a “user” to list the contents of a container, set the container header “X-Container-Read: .rlistings”.

To allow a “user” to read and list the objects in a container, set the container header “X-Container-Read: .r:user, .rlistings”.

To allow anyone to write to a container, set the container header “X-Container-Write: .r:*”.
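As a sketch of how such a header might be sent, the helper below only assembles the pieces of a POST request against a container. It does not contact a server. The function name, storage URL, token, and container name are all placeholders I made up for illustration.

```python
def container_acl_request(storage_url, token, container,
                          read_acl=None, write_acl=None):
    """Return (method, url, headers) for a POST that sets container ACL headers."""
    headers = {"X-Auth-Token": token}
    if read_acl is not None:
        headers["X-Container-Read"] = read_acl
    if write_acl is not None:
        headers["X-Container-Write"] = write_acl
    url = "%s/%s" % (storage_url.rstrip("/"), container)
    return "POST", url, headers


# Placeholder values; a real token and URL come from the auth request.
method, url, headers = container_acl_request(
    "http://127.0.0.1:8080/v1/AUTH_test", "example-token", "photos",
    read_acl=".r:user, .rlistings")
print(method, url, headers)
```

The resulting tuple can be handed to whatever HTTP client you prefer.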

For complete ACL details, check out Openstack Swift dev documentation.

Once a Non-Admin user is created, the only way to create the X-Container-Read and X-Container-Write headers is via a Reseller Admin user.

Or, do you have another solution?

Swift REST API Example

Posted on: March 14th, 2012 by stephenbroeker No Comments

Last post I explained REST interfaces, and promised to use a Swift REST API as an example.   This application programming interface (API) supports the following operations.

1)  Swift REST API Authorization

GET Authorization

————————

This operation is used to obtain an authorization token and URL for a  given user login and password.   This token and URL are then used in any subsequent operations.

  • URL Data: None.
  • Required Request Headers:
    • X-Auth-User = user login.
    • X-Auth-Key = user password.
  • Optional Request Headers: None.
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Date = current date.
    • X-Auth-Token = authorization token, input for subsequent operations.
    • X-Storage-Token = storage token, input for subsequent operations.
    • X-Storage-Url = storage URL, input for subsequent operations.
  • HTTP Data: None.
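A minimal sketch of the two halves of this exchange: building the request headers, and pulling the token and storage URL out of the response headers. The helper names and the sample response values are hypothetical, and no network call is made.

```python
def auth_request_headers(login, password):
    """Headers for the GET Authorization request."""
    return {"X-Auth-User": login, "X-Auth-Key": password}


def parse_auth_response(headers):
    """Return (token, storage_url) from the response headers.

    HTTP header names are case-insensitive, so compare in lower case.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["x-auth-token"], lowered["x-storage-url"]


# Made-up response headers, shaped like the list above.
sample_response = {
    "Content-Length": "0",
    "X-Auth-Token": "AUTH_tk_example",
    "X-Storage-Token": "AUTH_tk_example",
    "X-Storage-Url": "http://127.0.0.1:8080/v1/AUTH_test",
}
print(auth_request_headers("test:tester", "testing"))
print(parse_auth_response(sample_response))
```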

2) Swift REST API Account

DELETE Account
————————

Mark an account as deleted.   Swift REST API will clean up the account as time permits.

  • URL Data: None.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers: None.
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.

GET Account
————————

Get the list of containers in an account.

  • URL Data: None.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers: None.
  • Parameters:
    • format=type : return http data in “json” or “xml” format.
    • limit=number : limit the number of returned containers.
    • marker=filter : provide a filter. The returned list of containers will start after “filter”.
  • Response Headers:
    • Accept-Ranges = always “bytes”. This header will eventually comply with the http “range” header.
    • Content-Length = the number of bytes in http data.
    • Content-Type = http data content type.
    • Date = current date.
    • X-Account-Bytes-Used = the number of bytes used in the account. Keep in mind that this field is updated in a lazy fashion.
    • X-Account-Container-Count = the number of containers in the account.
    • X-Account-Object-Count = the number of objects in the account. Keep in mind that this field is updated in a lazy fashion.
  • HTTP Data: The list of containers for the account.
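The format, limit, and marker parameters combine naturally into paging. The sketch below builds the listing URL and picks the next marker from a page of results. It assumes a JSON listing in which each entry has a “name” field. The function names and the sample URL are hypothetical.

```python
from urllib.parse import urlencode


def account_listing_url(storage_url, fmt=None, limit=None, marker=None):
    """Build the GET Account URL with the optional query parameters."""
    params = {}
    if fmt is not None:
        params["format"] = fmt
    if limit is not None:
        params["limit"] = limit
    if marker is not None:
        params["marker"] = marker
    query = urlencode(params)
    return storage_url + ("?" + query if query else "")


def next_marker(page):
    """The next request starts after the last container name of this page."""
    return page[-1]["name"] if page else None


page = [{"name": "c1"}, {"name": "c2"}]   # made-up JSON listing
print(account_listing_url("http://127.0.0.1:8080/v1/AUTH_test",
                          fmt="json", limit=2, marker=next_marker(page)))
```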

HEAD Account
————————

Get account statistics.

  • URL Data: None.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers: None.
  • Parameters: None.
  • Response Headers:
    • Accept-Ranges = always “bytes”. This header will eventually comply with the http “range” header.
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
    • X-Account-Bytes-Used = the number of bytes used in the account. Keep in mind that this field is updated in a lazy fashion.
    • X-Account-Container-Count = the number of containers in the account.
    • X-Account-Object-Count = the number of objects in the account. Keep in mind that this field is updated in a lazy fashion.
  • HTTP Data: None.

POST Account
————————

Post meta-data to an account.

  • URL Data: None.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers:
    • X-Account-Meta-* = the user can create an account meta-data header. Such headers are of the form: “key value”.   The resulting account header will be: “X-Account-Meta-key: value”
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.

PUT Account
————————

Create an account.

  • URL Data: None.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers:
    • X-Account-Meta-* = the user can create an account meta-data header. Such headers are of the form: “key value”.   The resulting account header will be: “X-Account-Meta-key: value”
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.

3) Swift REST API Container

DELETE Container
————————

Mark a container as deleted. Swift will clean up the container as time permits.

  • URL Data: Container name.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers: None.
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.

GET Container
————————

Get the list of objects in a container.

  • URL Data: Container name.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers: None.
  • Parameters:
    • delimiter=char: apply a character as a “delimiter” to the list of objects.
    • format=type : return http data in “json” or “xml” format.
    • limit=number : limit the number of returned objects.
    • marker=filter : provide a filter. The returned list of objects will start after “filter”.
    • path=string : for object names with embedded slashes (/).
    • prefix=string : The returned list of objects will start with “string”.
  • Response Headers:
    • Accept-Ranges = always “bytes”. This header will eventually comply with the http “range” header.
    • Content-Length = the number of bytes in http data.
    • Content-Type = http data content type.
    • Date = current date.
    • X-Container-Bytes-Used = the number of bytes used in the container.
    • X-Container-Object-Count = the number of objects in the container.
  • HTTP Data: The list of objects for the container.

HEAD Container
————————

Get container statistics.

  • URL Data: Container name.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers: None.
  • Parameters: None.
  • Response Headers:
    • Accept-Ranges = always “bytes”. This header will eventually comply with the http “range” header.
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
    • X-Container-Bytes-Used = the number of bytes used in the container.
    • X-Container-Object-Count = the number of objects in the container.
  • HTTP Data: None.

POST Container
————————

Post meta-data to a container.

  • URL Data: Container name.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers:
    • X-Container-Meta-* = the user can create a container meta-data header. Such headers are of the form: “key value”. The resulting container header will be: “X-Container-Meta-key: value”
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.

PUT Container
————————

Create a container.

  • URL Data: Container name.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers:
    • X-Container-Meta-* = the user can create a container meta-data header. Such headers are of the form: “key value”. The resulting container header will be: “X-Container-Meta-key: value”
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.

4) Swift REST API Object

COPY Object
————————

Copy an object from one container to another.

  • URL Data: src_container/src_object.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
    • Destination = /<dest_container>/<dest_object>
  • Optional Request Headers: None.
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
    • Etag = checksum of new object data.
    • Last-Modified = the date that the source object was last changed.
    • X-Copied-From = source container and object.
    • X-Object-Meta-* = user defined meta data.
  • HTTP Data: None.

DELETE Object
————————

Delete an object from a container.

  • URL Data: container/object.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers: None.
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.

GET Object
————————

Get an object from a container.

  • URL Data: container/object.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers:
    • If-Match: etag = return the object data if there is an etag match.
    • If-Modified-Since: date = return the object data if it was modified since “date”.
    • If-None-Match: etag = return the object data if there is no etag match.
    • If-Unmodified-Since: date = return the object data if it was unmodified since “date”.
    • Range: bytes = return the specified object byte range.
  • Parameters: None.
  • Response Headers:
    • Content-Length = number of bytes in http data.
    • Content-Type = http data content type.
    • Date = current date.
    • Etag = checksum of object data.
    • Last-Modified = the date that the object was last changed.
    • X-Object-Meta-* = user defined meta data.
  • HTTP Data: Object data.

POST Object
————————

Post meta-data to an object.

  • URL Data: container/object.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers:
    • X-Object-Meta-* = the user can create an object meta-data header. Such headers are of the form: “key value”.   The resulting object header will be: “X-Object-Meta-key: value”
    • X-Delete-After: seconds = delete the object after “seconds”.
    • X-Delete-At: timestamp = delete the object at the given Unix timestamp (for example, current time + “seconds”).
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
  • HTTP Data: None.
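The two expiry headers differ only in whether the value is relative or absolute. Here is a small sketch, with a hypothetical helper name, that produces either header for an object that should expire a given number of seconds from now.

```python
import time


def expiry_headers(seconds, now=None, absolute=False):
    """Return the header that expires an object `seconds` from now.

    absolute=False -> X-Delete-After with a relative number of seconds.
    absolute=True  -> X-Delete-At with current time + seconds.
    """
    if absolute:
        now = time.time() if now is None else now
        return {"X-Delete-At": str(int(now) + int(seconds))}
    return {"X-Delete-After": str(int(seconds))}


print(expiry_headers(3600))                  # relative form
print(expiry_headers(3600, absolute=True))   # absolute form
```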

PUT Object
————————

Create an object in a container.

  • URL Data: container/object.
  • Required Request Headers:
    • X-Auth-Token = authorization token.
  • Optional Request Headers:
    • X-Object-Manifest = indicates that this object is a manifest, that is, contains an ordered list of data objects.   This header is used to support objects larger than  4GB.
    • X-Object-Meta-* = the user can create an object meta-data header. Such headers are of the form: “key value”.   The resulting object header will be: “X-Object-Meta-key: value”
    • Transfer-Encoding = use transfer encoding.
  • Parameters: None.
  • Response Headers:
    • Content-Length = always zero.
    • Content-Type = http data content type.
    • Date = current date.
    • Etag = checksum of object data.
    • Last-Modified = the date that the object was last changed.
    • X-Object-Meta-* = user defined meta data.
  • HTTP Data: None.

What would you add to this Swift REST API example?

REST Interfaces Explained

Posted on: March 7th, 2012 by stephenbroeker No Comments

REST (REpresentational State Transfer) is an interface type that was created in 2000 by Roy Fielding in his doctoral dissertation. It is based on HTTP, so a REST interface is basically composed of GET, PUT, POST, HEAD, and DELETE operations.  I’ll explain REST Interfaces in this post, and follow up with a Swift REST API example.

REST Interface HTTP Components

First, let’s look at each operation as defined by the HTTP components:

  1. Method
    The operations typically are: COPY, DELETE, GET, HEAD, PUT, and POST.
    The server has complete freedom in defining these operations.
    Typically though, these operations are defined as follows:
    • COPY – create a copy of an item.
    • DELETE – delete an item.
    • GET – read an item.
    • HEAD – read item meta-data.
    • PUT – create an item.
    • POST – write item meta-data.
  2. URL
    The URL is commonly of the form:
    <http>://<ip>:<port>/<version>/<auth>/<data><parameters>
    Where:

    • <http> = “http” or “https”.
    • <ip> = ip address.
    • <port> = port number.
    • <version> = version string.
    • <auth> = authorization token.
    • <data> = operation specific data.
    • <parameters> = options.
      Parameters begin with a “?”.   Multiple parameters are separated by a “&”.
  3. Request Headers
    Operation input headers.
  4. Response Headers
    Headers returned by the interface.
  5. Response Data.
    Data returned by the interface in ASCII format.
  6. Status Code
    HTTP status code. This code is in the range 100 – 599 and can be broken down as follows:

    • 1xx Informational.
    • 2xx Success.
    • 3xx Redirection.
    • 4xx Client Error.
    • 5xx Server Error.
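Two of the components above lend themselves to a short sketch: assembling the URL from its parts and classifying a status code. The function names and sample values are hypothetical. The sketch assumes a “:” between the address and the port, with the parameters appended after a “?”.

```python
from urllib.parse import urlencode


def build_url(scheme, ip, port, version, auth, data="", **params):
    """Assemble a REST URL from the components described above."""
    url = "%s://%s:%d/%s/%s" % (scheme, ip, port, version, auth)
    if data:
        url += "/" + data
    if params:
        url += "?" + urlencode(params)   # parameters joined with "&"
    return url


def status_class(code):
    """Map an HTTP status code to its class by its first digit."""
    classes = {1: "Informational", 2: "Success", 3: "Redirection",
               4: "Client Error", 5: "Server Error"}
    return classes.get(code // 100, "Unknown")


print(build_url("http", "127.0.0.1", 8080, "v1", "AUTH_test",
                "photos", format="json", limit=10))
print(status_class(204), status_class(404))
```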

Next time I’ll present a Swift REST Interface API example. In the meantime, what would you add to this explanation?

Managing a Hive Nucleus

Posted on: March 2nd, 2012 by stephenbroeker No Comments

I just finished reading the February 2012 edition of the American Bee Journal.   In this edition, I discovered a most interesting article:  “The Half-Hive: Setting Up and Managing a Nucleus Hive”.

The author, Larry Connor, does a wonderful job of explaining the proper use of a Hive Nucleus in an apiary. I had never really understood its purpose before. Mr. Connor does such a fine job that I am now totally motivated to build and use a nucleus in my apiary this year.

To start with, a nucleus is essentially a half hive. That is, instead of using a super with 10 frames, a nucleus uses a super with only 5 frames. The reason is that the purpose of a nucleus is to be an apiary nursery, not to produce honey like a normal hive.

Apiary Nursery

So what exactly is an apiary nursery?   An apiary nursery is a nucleus that is used to produce frames of brood. Once the brood is capped, the frames are moved from the nucleus hive to a normal hive. Worker bees and the queen must be removed from the brood frames (shake 'em baby, shake 'em) before they are transferred. Otherwise, you would be moving worker bees from one hive to another and a fight might break out. Besides, you want the nucleus worker bees to stay at home and produce more brood frames.

Adding brood frames to a hive strengthens the hive by quickly adding worker bees. I have tried adding swarms to an existing hive in the past, with minimal results. The worker bees started fighting and the result was lots of dead bees. Adding brood frames is like adding a swarm without a fight.

You can create as many nukes as you want. But I have decided to use a single nucleus this year in my apiary. I want to start simple. I will ensure that the nucleus always has at least one frame of brood. Extra frames will be moved to normal hives, when the brood is capped. Thus the nucleus will never get larger than a single super.

At the end of the season, I will add a new queen to the nucleus, to help it make it through the winter.  Hopefully, the nucleus will result in stronger honey hives that survive the  winter.

B-Haven 2011 Annual Report

Posted on: February 28th, 2012 by stephenbroeker No Comments

I am very excited about the upcoming Honey Flow, that is, the time when the local flowers are blooming and there is sufficient nectar production for bees to grow the hives.  The Honey Flow usually starts some time in April.

2011 Honey Production

In the meantime, I would like to recap last year's (2011) honey production.  We started the season with a single hive that had made it successfully through the winter.  I decided to expand the apiary to four total hives.  I thus needed to create three new hives.

In the past, I had always created new hives by purchasing bee packages.  A Bee Package is a box full of bees (4 pounds worth) with a queen bee (in a cage) and a food source (a reservoir of sugar syrup).  The queen is in a cage so that the worker bees have a chance to bond to her scent and thus adopt her as their queen.  If the worker bees do not accept the queen, then they will kill her.

This year I decided to try a new approach to creating a new hive: splitting an existing hive into two hives.  I had never done this before and had no idea if my attempt would be successful, even though it is standard procedure in the world of beekeeping.  Splitting a hive involves the following:

1)  Count the number of frames with brood.

2)  Count the number of frames with only honey.

3) Count the number of frames with only pollen.

4)  Move half of the brood frames into the new hive.

5)  Move half of the honey frames into the new hive.

6) Move half of the pollen frames into the new hive.

7)  Add a caged queen into both hives.

This procedure resulted in two identical hives.  After three days, I opened the hives and made sure that the queens were out of their cages.  The old hive now had two queens.  They would fight it out; let the best female survive!  The ultimate chick fight! So I started with a single hive, and now I had two hives.  To get to my total of four, I created two more new hives from bee packages.  Once I had the apiary up to full strength, all I had to do was let the bees do their thing.

Spring & Summer Tending

Over the course of the Spring and Summer I checked on the hives once a week.  I made sure that each hive had enough free space.  If a hive runs out of space, then the bees will swarm.  This is essentially hive reproduction, in which the bees create a new queen.  When the new queen emerges from a cell, she mates and leaves to start a new hive.

The new queen cannot survive on her own, so roughly half of the worker population leaves with her, after they have gorged themselves with honey.  The result is that the old hive has approximately half the number of worker bees and the hive thus has more free space.  This process is of no benefit to my apiary and does in fact weaken a hive.  So I wanted to avoid a swarm as much as possible.  Hence I frequently checked each hive and added empty supers if there was not enough free space.  I always erred on the side of caution: better to have too much free space than not enough.

In 2011, the Spring was somewhat delayed, but we made up for it with a long, mild Summer.  This resulted in a tremendous harvest!  We averaged 16 gallons per hive!

The hives had 4 to  5 honey supers.  I had to use a step stool to examine these hives!

I should take this opportunity to explain what a honey super is.  The first two supers are for the hive to raise bees and store honey and pollen for the winter.  Any additional supers are strictly for honey.  I ensure that by placing a screen (called a queen excluder) on top of the second super.  The holes in this screen are large enough for the workers to travel through, but too small for the queen to squeeze through (she is, after all, somewhat larger than a worker bee due to her egg-laying capabilities).

Once the Honey Flow died down in August, I decided to harvest the honey in September.  And as previously stated, the harvest was awesome.  I collected honey, honey comb, and bee bread.  I also melted down the wax trimmings from the pure honey, so that I could easily save it and use it in the future for candles and such.

Colony Collapse Disorder

But once the summer ended and the temperatures cooled down, I started losing hives.  Three of the hives experienced Colony Collapse Disorder (CCD), which means that the worker bees absconded and the hive was left with only a very small number of worker bees (fewer than 50) to keep it healthy and strong through the upcoming Winter.  So why did the worker bees leave?  No one really knows for sure.  The possible reasons include:

1)  The worker bees were killed during foraging by pesticides.

2)  The hive became infected with some kind of disease.

3)  The hive became infested with mites, hive beetles, or some other pest.

4)  Parasitic wasps or flies killed the worker bees.

5)  The hive swarmed and the reduced old hive was too weak.

The bottom line was that once CCD hit, the hive was doomed.  It was not strong enough to make it through the winter.  Two of the hives died in November and a third died in December.  When a hive died, I harvested the remaining honey and bee bread and discarded frames with dead brood.

So I am now left with a single hive.  The hope is that this hive will make it to the Honey Flow this year, and regain the honey production levels reached last year.  My next blog post will be about my trip to Honey Bee Genetics to pick up the queens and packages for this year's apiary.  The pick up date is April 14.

Swift Storage Devices

Posted on: February 28th, 2012 by stephenbroeker No Comments

In a previous blog, I presented the topology of Openstack Swift Object Storage.   That is, how Swift uses XFS  to store Accounts, Containers, and Objects.

A separate Account Database is created for each Account, and an Account Database is a SQLite instantiation in the file system. An Account Database consists of an Account Stat Table and a Container Table. The Account Stat Table contains meta-data for that specific account.   The Container Table contains an entry for each container in that account.

Similarly, a separate Container Database is created for each Container, and a Container Database is a SQLite instantiation in the file system. A Container Database consists of a Container Stat Table and an Object Table. The Container Stat Table contains meta-data for that specific container. The Object Table contains an entry for each object in that container.

And finally, object data is stored in XFS files.
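To picture the layout, here is a deliberately simplified sketch of an Account Database as an SQLite file with a stat table and a container table. The table and column names are my own simplifications for illustration, not Swift's real schema.

```python
import sqlite3


def create_account_db(path=":memory:"):
    """Create a toy Account Database: one stat table, one container table."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE account_stat (
            account         TEXT,
            container_count INTEGER,
            bytes_used      INTEGER
        );
        CREATE TABLE container (
            name         TEXT PRIMARY KEY,
            object_count INTEGER,
            bytes_used   INTEGER
        );
    """)
    return db


db = create_account_db()
db.execute("INSERT INTO account_stat VALUES ('AUTH_test', 1, 0)")
db.execute("INSERT INTO container VALUES ('photos', 0, 0)")
print(db.execute("SELECT name FROM container").fetchall())
```

A Container Database would follow the same pattern, with a container stat table and an object table.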

So what happens when a Swift Storage Device becomes full?   The storage device is managed by XFS,  so the file system will run out of space when this happens.   Thus XFS will NOT be able to perform subsequent writes,  but reads and deletes will still work.

What happens with the Swift REST API? Reads (GETs) continue to work, which is good. Writes (PUTs and POSTs) obviously do not work, and this is as expected. But what about deletes (DELETEs)? If a Swift Account is “full”, then users should be able to free up space in the account.

The sad fact of the matter is that deletes do NOT work at this point. A DELETE Object operation fails with a 503 Service Unavailable error. This error is the result of a Proxy Server exception. The exception occurs because a database update fails. To understand this exception, we must understand how objects are deleted in Swift.

As previously stated, a Container Database has an Object Table,  which contains an entry for each object in it. The Object Data itself is stored in an XFS file. So to delete an object, the appropriate Object Table entry has to be updated  and marked as “deleted”. The Object Data is left for the Object Reaper to clean up at a later time.
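The sketch below shows the shape of that delete: the row is not removed, it is updated and flagged as deleted, which is itself a database write. The table layout and function names are hypothetical simplifications, not Swift's actual schema or code.

```python
import sqlite3


def make_container_db():
    """Toy Container Database with just an object table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE object ("
               "name TEXT PRIMARY KEY, size INTEGER, deleted INTEGER DEFAULT 0)")
    return db


def delete_object(db, name):
    """Mark the object's row as deleted; the data file is cleaned up later.

    Note that this UPDATE is a write to the database file.
    """
    cur = db.execute(
        "UPDATE object SET deleted = 1 WHERE name = ? AND deleted = 0", (name,))
    db.commit()
    return cur.rowcount == 1


def list_objects(db):
    """Only rows that are not marked deleted are visible to users."""
    rows = db.execute("SELECT name FROM object WHERE deleted = 0 ORDER BY name")
    return [row[0] for row in rows]
```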

The Problem

The problem is with the Object Table update. This is a write, and since the file system is full, this update will fail.

So how do we get around this problem?   The Swift Account is full and users CANNOT free up account space. Currently, OpenStack recommends monitoring the Storage Devices so that an alarm is generated when a device becomes 80% full. The Swift Administrator then has to add new Storage Devices to the Swift Rings. This involves modifying the rings and restarting the Proxy Servers. The Account will no longer be full, since it now has additional space, but users will NEVER be able to delete objects and containers from the full storage device.

The Solution?

This solution is less than adequate and is clearly lacking in Enterprise quality. I would now like to propose more robust solutions, two in fact.

Solution 1: create a utility that removes objects from the full Storage Device. This utility would remove the Object Data file and then update the Object Table. The customer would obviously have to identify which objects should be removed. Once there is enough free space, the customer could then use the normal DELETE operations to free up further space. This solution is good in that it allows an account to be quickly repaired, but it is bad in that a cluster administrator has to perform the action.

Solution 2: create a configurable Swift Storage Device High Water Mark (HWM). The HWM would be expressed as a storage percentage, that is, how full can any Storage Device get before PUT operations are turned off?  So when the HWM is hit for any Storage Device, all PUT operations would fail for all accounts. But users could still perform GETs, POSTs, and DELETEs. They could thus still manage their accounts. The cluster admin would then have time to add storage devices, without users clamoring about “full” accounts.
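The HWM check itself could be very small. Here is a sketch of the decision only, with a hypothetical function name and an assumed default of 80%. It is not existing Swift code.

```python
def request_allowed(method, used_bytes, total_bytes, hwm_percent=80.0):
    """Reject PUTs once a device is at or above the high water mark.

    GETs, POSTs, and DELETEs stay allowed so users can still manage
    their accounts and free up space.
    """
    percent_full = 100.0 * used_bytes / total_bytes
    if method.upper() == "PUT" and percent_full >= hwm_percent:
        return False
    return True


print(request_allowed("PUT", 85, 100))      # False: above the HWM
print(request_allowed("DELETE", 85, 100))   # True: deletes still work
```

On a real device, the used and total byte counts could come from something like Python's `shutil.disk_usage`.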

Database Performance with Openstack Swift Improvements – Part 3

Posted on: February 22nd, 2012 by stephenbroeker No Comments

In Part 2 of this blog post series, I proposed replacing SQLite with MySQL as a database engine and changing the database schema to use chunking. These changes ensure that database performance is consistent for Account and Container Databases.

But what about Objects? As previously stated, Object data is stored in files. Are there problems with this?

Yes. Storing Object data in files is nice in that it is simple and Object data is easy to find – just look in the file system. But what about performance?

A file system adds a lot of overhead to the equation, functionality is not free. There are a number of problems with this approach.

Database Performance Problem 1: Buffer Cache

File systems use an in-memory buffer cache to speed up repetitive, sequential, small file IO. But Swift Objects are written once with no partial updates or reads. Objects are always completely rewritten. They are also always completely read. So the buffer cache sucks up a lot of memory and does not improve performance. In fact it degrades performance because of overhead.

Database Performance Problem 2: Multi Use of the File System

You will recall that the file system is also used to store the Account and Container databases. In addition, XFS supports Extended Attributes. These allow name/value pairs to be attached to a file. Swift uses XFS Extended Attributes to store user defined headers (meta data).

The end result is that the file system is multi use: database, meta data, and object data. This results in poor disk performance. All three of these components use the file system in different ways and thus have different effects on the file system, causing constant disk head seeks. For the best performance, we want to minimize seeks and keep the disk heads moving in one direction.

Database Performance Problem 3: Using File System Meta Data

Querying file system meta data performs poorly. File systems are great at moving data. They are terrible at querying data. Traversing inodes and data blocks is expensive. Consider the code to do an “ls”. This is basically a straightforward problem. Get the inode for a directory, perform multiple reads on the inode data, and filter and sort the results. This is a substantial amount of code. Now consider if the meta data were in a database. The “ls” code becomes a single database query.
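To illustrate the “single database query” point, here is a minimal sketch using SQLite from Python. The `file_meta` table, its columns, and the sample rows are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE file_meta (directory TEXT, name TEXT, size INTEGER)")
db.executemany(
    "INSERT INTO file_meta VALUES (?, ?, ?)",
    [("/photos", "dog.jpg", 20), ("/photos", "cat.jpg", 10), ("/music", "a.mp3", 30)],
)


def ls(db, directory):
    """List a directory: filtering and sorting happen inside one query."""
    rows = db.execute(
        "SELECT name, size FROM file_meta WHERE directory = ? ORDER BY name",
        (directory,),
    )
    return rows.fetchall()


print(ls(db, "/photos"))
```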

Database Performance Problem 4: All Components in a Single Repository

The database, meta data, and object data are all in the same file system. They thus are all in the same repository. Thus if the repo is down, then nothing is available. Now consider moving the meta data into the database and moving the database to a different repo. The result is that the database is still available if the file system is down. And the file system is still available if the database is down.

Database Performance Solution

The solution to these four problems is to move the file system meta data into the database and then move the database to another repo. In addition, scrap the file system and change the database to point directly into raw disk partitions. This means that Object Data would no longer be stored as files. Instead, the Object Table (in the Container Database) would point to the partition name, partition offset, and object size. An object would thus be defined by the 3-tuple (partition, offset, size). We thus remove the file system, with all of its complexity and overhead, and use raw disk partitions. To summarize:

  1. Replace SQLite with MySQL.
  2. Separate the database repo from the file system.
  3. Remove the file system.
  4. Move all meta data into the database.
  5. Change Object Table to use raw disk partitions.
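To make the (partition, offset, size) tuple concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: ObjectLocation, write_object, and read_object are hypothetical names, and a preallocated temporary file stands in for a raw disk partition, which in a real deployment would be a device node opened with suitable privileges.

```python
import os
import tempfile
from typing import NamedTuple


class ObjectLocation(NamedTuple):
    """Hypothetical Object Table row: where an object lives on raw storage."""
    partition: str  # path of the partition (a device node in a real deployment)
    offset: int     # byte offset of the object within the partition
    size: int       # object length in bytes


def write_object(partition: str, offset: int, data: bytes) -> ObjectLocation:
    """Write the whole object at the given offset and return its location."""
    with open(partition, "r+b") as dev:
        dev.seek(offset)
        dev.write(data)
    return ObjectLocation(partition, offset, len(data))


def read_object(loc: ObjectLocation) -> bytes:
    """Read the whole object back with one seek and one read."""
    with open(loc.partition, "rb") as dev:
        dev.seek(loc.offset)
        return dev.read(loc.size)


# Stand-in for a raw partition: a preallocated 4 KiB temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 4096)
    demo_partition = tmp.name

loc_a = write_object(demo_partition, 0, b"hello")
loc_b = write_object(demo_partition, loc_a.offset + loc_a.size, b"world!")
data_a = read_object(loc_a)
data_b = read_object(loc_b)
os.remove(demo_partition)
print(loc_b.offset, loc_b.size, data_a, data_b)
```

Because objects are always written and read in full, each access is a single seek followed by one sequential transfer. The sketch leaves out free-space management and crash safety, which a real design would have to address.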

That’s how I’ve fixed the problem. How have you solved these database performance issues?

Openstack Swift Database Performance Improvements Part 2

Posted on: February 15th, 2012 by stephenbroeker No Comments

In my previous Openstack post I laid the groundwork for my proposed solution to improve Openstack Swift database performance. The solution consists of two parts: MySQL and Database Chunking.

Database Performance with Openstack Improvement: MySQL

For the first part of the solution, I propose replacing SQLite with MySQL as a database engine. As the name implies, SQLite is fine for small databases, but has performance problems with larger Openstack databases. MySQL is perfect for this problem, since a database is represented as a file system directory, and database tables are represented as files.

Openstack Performance Improvement: Database Chunking

For the second part of the solution, I propose using Database Chunking. That is, breaking up the Container and Object Tables into chunks. The result would be database queries on reasonably sized tables. The table structure would be tiered and tables would either be of type “index” or “data”. Index Tables would point to the next level of table, which would be “index” or “data”. Data Tables contain the actual table data and are thus leaf nodes in the table schema.

The optimal size of each table chunk would have to be determined through experimentation, but for purposes of argument, let us assume a table chunk size of 100,000 rows. So for an empty Account, the Container Table schema would be:

Container Index 1 -> empty

After the first container is created, the Container Table schema would be:

Container Index 1 -> Container Data 1
Container Data 1 -> 1 row

When container number 100,001 gets created, the Container Table schema would be:

Container Index 1 -> Container Data 1, Container Data 2
Container Data 1 -> 100,000 rows
Container Data 2 -> 1 row

So a full Container Index 1 (100,000 Data Tables of 100,000 rows each) can map 100,000 x 100,000 = 10 billion containers. When container number 10,000,000,001 gets created, the Container Table schema would be:

Container Index 1 -> Container Index 2, Container Index 3
Container Index 2 -> Container Data 1 ... Container Data 100,000
Container Index 3 -> Container Data 100,001
Container Data 1 -> 100,000 rows
...
Container Data 100,000 -> 100,000 rows
Container Data 100,001 -> 1 row
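The arithmetic behind the schemas above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the chunk size of 100,000 rows is the figure assumed for the sake of argument, not a measured optimum.

```python
CHUNK = 100_000  # rows per table chunk, the figure assumed in this post


def data_table_number(container_number, chunk=CHUNK):
    """1-based Data Table that holds the given 1-based container number."""
    return (container_number - 1) // chunk + 1


def index_levels(container_count, chunk=CHUNK):
    """Tiers of Index Tables needed above the Data Tables.

    One index tier maps chunk * chunk rows; each extra tier multiplies
    the capacity by chunk again.
    """
    levels, capacity = 1, chunk * chunk
    while capacity < container_count:
        levels += 1
        capacity *= chunk
    return levels


print(data_table_number(100_001))         # second Data Table
print(data_table_number(10_000_000_001))  # Data Table 100,001
print(index_levels(10_000_000_000))       # still one index tier
print(index_levels(10_000_000_001))       # a second index tier is needed
```

Each additional index tier multiplies capacity by the chunk size, which is why the depth of the schema grows very slowly as containers are added.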

This Database Table Schema will essentially scale forever, since each new index tier multiplies the capacity by the chunk size. Thus these database changes will ensure that database performance is consistent for Accounts and Containers. But what about Objects? In my next blog post, I will first identify problems with Swift Object data storage and then present a solution.

In the meantime, how do you deal with Objects?

Openstack Swift Database Performance Part 1

Posted on: February 8th, 2012 by stephenbroeker No Comments

In my last two posts (Python’s Strengths & Weaknesses) I have been describing the operation of Openstack Swift Storage. Swift storage basically consists of four components: Ring, Database, Zones, and File system. I’m proposing some performance improvements to this design. But first we need to understand the Swift database schema. An Openstack Account Database consists of two tables: Account Stat and Container. And a Container Database consists of two tables: Container Stat and Object.

Openstack Account Stat Table

The following is a detailed view of the Openstack Database Account Stat Table:
account
created_at
put_timestamp
delete_timestamp
container_count
object_count
bytes_used
hash
id
status
status_changed_at
metadata

Openstack Container Table

And the following is a detailed view of the Container Table:
ROWID
name
put_timestamp
delete_timestamp
object_count
bytes_used
deleted

Notice that both the Account Stat Table and the Container Table have deleted attributes. These attributes are required since rows in these tables are never deleted; they are just marked as deleted. The reason that rows are not deleted is that this would require some type of synchronization (locking), in case another thread was accessing the same database. And we all know that locking in the Cloud is a very bad thing: it would destroy scaling. So these tables are append or update only; deletes are not allowed.
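The mark-as-deleted pattern can be sketched with Python's standard sqlite3 module. This is a simplified illustration, not Swift's actual code: the table reuses the Container Table column names listed above, while the column types and the mark_deleted and live_containers helpers are assumptions made for the example.

```python
import sqlite3

soft_db = sqlite3.connect(":memory:")
# Column names follow the Container Table above; the types are assumed.
soft_db.execute(
    """CREATE TABLE container (
           name TEXT,
           put_timestamp TEXT,
           delete_timestamp TEXT,
           object_count INTEGER,
           bytes_used INTEGER,
           deleted INTEGER DEFAULT 0
       )"""
)
soft_db.executemany(
    "INSERT INTO container (name, put_timestamp, object_count, bytes_used)"
    " VALUES (?, ?, 0, 0)",
    [("photos", "1328659200.0"), ("docs", "1328659201.0")],
)


def mark_deleted(db, name, timestamp):
    """Flag the row as deleted instead of removing it."""
    db.execute(
        "UPDATE container SET deleted = 1, delete_timestamp = ?"
        " WHERE name = ?",
        (timestamp, name),
    )


def live_containers(db):
    """Names of containers that have not been marked as deleted."""
    rows = db.execute(
        "SELECT name FROM container WHERE deleted = 0 ORDER BY name"
    )
    return [name for (name,) in rows]


mark_deleted(soft_db, "docs", "1328659300.0")
total_rows = soft_db.execute("SELECT COUNT(*) FROM container").fetchone()[0]
print(live_containers(soft_db), total_rows)
```

The row for the deleted container is still in the table, which is exactly why these tables only ever grow.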

This is all well and good for performance, but what happens when these tables grow? The Account Stat Table will never grow; it will always have one and only one row. But the Container Table will grow with time, as containers are created for the account. So what happens to SQLite performance when the Container Table gets large? First, since an SQLite database is a file, file performance will degrade as the file grows. Second, database query performance will also degrade as the file grows.

Container Database

The following is a detailed view of the Container Stat Table:
account
container
created_at
put_timestamp
delete_timestamp
object_count
reported_put_timestamp
reported_delete_timestamp
reported_object_count
reported_bytes_used
hash
id
status
status_changed_at
metadata

Object Table

And the following is a detailed view of the Object Table:
ROWID
name
created_at
size
content_type
etag
deleted

Notice that both the Container Stat Table and the Object Table have deleted attributes, just like the Account Database, and for the same reasons. So both the Container and Object Tables will have performance problems as containers and objects are created over time.

Next time I’ll propose a solution that consists of two parts: MySQL and Database Chunking.