Over this and the next few posts, I will write about my current contributions to OpenStack Swift as part of the Outreach Program run by the GNOME Foundation.
Swift is an object storage engine written in Python. I am working on making sure clusters are set up with the correct configuration. To do so, the OPTIONS * verb is used to retrieve information about the servers, which the Swift client can then inspect to check that the configuration is correct.
Before diving into the details of the implementation, this post gives a brief overview of the Swift architecture.
Swift object storage allows you to store and retrieve files. It is a distributed, API-accessible storage platform for static data. Data is organised in a three-level hierarchy: account, container, object.
Account server – Account storage holds metadata describing the account itself and the list of containers in the account.
Container server – Container storage holds metadata about the container itself and the list of objects in it.
Object server – Object storage is where the data object and its metadata are stored.
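To make the three levels concrete, here is a minimal sketch of splitting a storage path into its account, container and object parts. The names `SwiftPath` and `parse_path` are my own for illustration and are not Swift's API; the sketch assumes paths of the form /account/container/object, where the object name may itself contain slashes.

```python
from typing import NamedTuple, Optional


class SwiftPath(NamedTuple):
    """Hypothetical holder for the three levels of the hierarchy."""
    account: str
    container: Optional[str] = None
    obj: Optional[str] = None


def parse_path(path: str) -> SwiftPath:
    """Split '/account[/container[/object]]' into its levels.

    Only the first two slashes are treated as separators, so anything
    after the container name is kept whole as the object name.
    """
    parts = path.lstrip("/").split("/", 2)
    if not parts[0]:
        raise ValueError("path must name at least an account")
    account = parts[0]
    container = parts[1] if len(parts) > 1 and parts[1] else None
    obj = parts[2] if len(parts) > 2 and parts[2] else None
    return SwiftPath(account, container, obj)
```

For example, `parse_path("/AUTH_test/photos/2014/cat.jpg")` yields the account `AUTH_test`, the container `photos` and the object `2014/cat.jpg`, while `parse_path("/AUTH_test")` names only an account.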
A storage node is a machine that runs Swift services. A cluster is a collection of one or more nodes, and clusters can be distributed across different regions. By default, Swift stores three replicas of each partition for durability and resilience to failure.
The Auth system:
TempAuth – Here the authentication part can be an external system or a subsystem within Swift. The user passes an auth token to Swift, and Swift validates it with the auth system (external or internal). If the token is valid, the auth system returns an expiration date, which Swift stores in its cache.
Keystone Auth – Swift can authenticate against the OpenStack Keystone system.
Extending Auth – This can be done by writing new WSGI middleware, as the Keystone project does.
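The validate-then-cache flow described for tokens can be sketched roughly as follows. This is not Swift's middleware code: `TokenCache` and the `validate` callable are hypothetical names, and I am assuming the auth system answers with an expiry timestamp for a valid token and `None` for an invalid one.

```python
import time


class TokenCache:
    """Toy cache of token expiry times (illustrative only)."""

    def __init__(self, validate, clock=time.time):
        # validate: hypothetical callable, token -> expiry timestamp or None
        self._validate = validate
        self._clock = clock
        self._expires = {}

    def is_valid(self, token):
        now = self._clock()
        expiry = self._expires.get(token)
        if expiry is not None and expiry > now:
            return True  # cached and not yet expired: no auth round trip
        # Unknown or expired token: ask the auth system again.
        expiry = self._validate(token)
        if expiry is None or expiry <= now:
            self._expires.pop(token, None)
            return False
        self._expires[token] = expiry
        return True
```

The point of the cache is that repeated requests carrying the same token reach the auth system only once, until the stored expiry passes.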
Proxy server – The proxy server lets you interact with the rest of the architecture. It takes each incoming request, looks up the location of the account, container or object in the ring, and routes the request accordingly.
The ring – A ring defines the mapping between entities stored on disk and their physical location. There are separate rings for accounts and containers, and one object ring per storage policy (more on this below). To determine the location of any account, container or object, we need to consult the ring. The ring is also responsible for determining which devices are used for handoff in failure scenarios.
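A ring lookup can be pictured as hashing the entity's path to a partition number and then reading off the devices that hold each replica of that partition. The sketch below is a deliberate simplification under my own assumptions, not Swift's ring code: `ToyRing` is a hypothetical class, devices are assigned round-robin, and there is no weighting, zone awareness or handoff logic.

```python
import hashlib
import struct


class ToyRing:
    """Simplified path -> partition -> devices mapping (illustrative only)."""

    def __init__(self, devices, part_power=4, replicas=3):
        if replicas > len(devices):
            raise ValueError("need at least one device per replica")
        self.devices = list(devices)
        self.part_shift = 32 - part_power  # keep the top part_power bits
        part_count = 2 ** part_power
        # table[r][p] is the index of the device holding replica r of
        # partition p; round-robin keeps the replicas on distinct devices.
        self.table = [
            [(p + r) % len(self.devices) for p in range(part_count)]
            for r in range(replicas)
        ]

    def get_part(self, account, container=None, obj=None):
        names = [n for n in (account, container, obj) if n is not None]
        path = "/" + "/".join(names)
        digest = hashlib.md5(path.encode("utf-8")).digest()
        # Read the first four bytes as a big-endian unsigned integer.
        return struct.unpack(">I", digest[:4])[0] >> self.part_shift

    def get_nodes(self, account, container=None, obj=None):
        part = self.get_part(account, container, obj)
        return part, [self.devices[row[part]] for row in self.table]
```

With four made-up devices, `ToyRing(["sdb1", "sdc1", "sdd1", "sde1"]).get_nodes("AUTH_test", "photos", "cat.jpg")` returns a partition number and three distinct devices, and the same path always maps to the same answer.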
Storage policies – Storage policies provide a way to offer different feature levels and services in how an object is stored. For instance, some containers might use the default 3x replication while newer containers use 2x replication. Once a container is created with a storage policy, all objects in it are created with the same policy.
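The rule that a container's policy is fixed at creation and inherited by its objects can be modelled in a few lines. This is a toy model, not Swift code: the `Container` class and the policy names `default-3x` and `reduced-2x` are invented for illustration.

```python
class Container:
    """Toy container whose storage policy is set once, at creation."""

    def __init__(self, name, policy="default-3x"):
        self.name = name
        self._policy = policy  # hypothetical policy name
        self.objects = {}

    @property
    def policy(self):
        # Read-only: there is no setter, so the policy cannot be changed.
        return self._policy

    def put_object(self, name, data):
        # Every object is stored under the container's policy.
        record = {"data": data, "policy": self._policy}
        self.objects[name] = record
        return record
```

An object put into `Container("new", policy="reduced-2x")` is recorded with the `reduced-2x` policy, and trying to assign `container.policy` afterwards raises `AttributeError`.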
Details of these will be explored as I move ahead with my work.
Sources: Much of my understanding, and this overview, comes from reading the docs and from contributing to the code.