Today I am at a design workshop for Swift – NOT the Apple programming language announced this week but rather the Object storage solution that is a core part of the OpenStack cloud platform.
The training is being given by SwiftStack – http://www.swiftstack.com and Racktop Systems – http://www.racktop.com.
This is a one-day workshop and I will be publishing real-time notes from the session. So here goes…
Swift and SwiftStack
Swift is the object storage platform for OpenStack.
SwiftStack is the leading contributor to Swift and provides a wrapper of services for Swift to make the service easier to implement.
SwiftStack is a venture-backed company that was established in 2011.
SwiftStack provides the operational and management layer for Swift.
There are more than 2,000 contributors to the Swift platform.
Rackspace originated Swift but now IBM, AT&T, HP, Comcast and Time Warner Cable have all built on the Swift/SwiftStack platform.
Swift runs on commodity hardware, as does OpenStack.
Storage in OpenStack includes:
- Cinder (Block)
- Swift (Object)
- Manila (Shared File System)
Swift is the OpenStack equivalent of Amazon S3.
Swift is an API.
– Highly Scalable
– Hardware Proof – it assumes unreliable hardware.
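Since Swift is an API, every object is addressed by a URL of the form /v1/<account>/<container>/<object>, and is read, written and removed with plain HTTP verbs (GET, PUT, DELETE). Here is a minimal sketch of how such a URL is put together. The endpoint hostname, account name and the helper name object_url are made up for illustration and were not part of the workshop.

```python
from urllib.parse import quote


def object_url(endpoint, account, container, obj):
    """Build <endpoint>/v1/<account>/<container>/<object> (hypothetical helper)."""
    # Object names may contain '/', so slashes are kept for the object only.
    parts = [
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj, safe="/"),
    ]
    return endpoint.rstrip("/") + "/v1/" + "/".join(parts)


# Hypothetical endpoint and account, for illustration only.
url = object_url(
    "https://swift.example.com", "AUTH_demo", "photos", "2014/workshop notes.txt"
)
print(url)
# https://swift.example.com/v1/AUTH_demo/photos/2014/workshop%20notes.txt
```

An HTTP PUT to that URL with a valid auth token would upload the object, and a GET would read it back.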
Swift runs on any Linux-based architecture.
Load Balancers are outside Swift. SwiftStack includes load balancing.
Load Balancer (includes SSL and Authentication) talks to:
– Proxy talks to:
– Account / Container / Object
– A replication and consistency layer talks to:
– Standard servers with disks.
Authentication can use OpenStack Keystone but you can integrate other standards such as LDAP or Active Directory.
Guidance is to not use Keystone, since it is designed for lighter loads and doesn’t scale well to support heavily used, large-scale environments.
Keystone can also create a single point of failure for Swift. Better options would be to use robust Active Directory or LDAP that will probably already exist in the environment.
Swift also offers a simple hashed user/password Auth function for quick setup.
– Enforces the default 3x replication
– Enforces a quorum
– Enforces user-set ACLs
– Uses fastest available copy for reads (single read only required)
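The quorum rule in the list above can be sketched as a simple majority check: with the default 3 replicas, a write counts as successful once 2 copies have been acknowledged. This is only an illustration. The function names are made up, and treating quorum as a strict majority is an assumption, since Swift's own calculation may differ for even replica counts.

```python
def quorum_size(replicas):
    """Smallest number of copies that forms a strict majority (assumed rule)."""
    return replicas // 2 + 1


def write_succeeded(acks, replicas=3):
    """True when enough replicas acknowledged the write to reach quorum."""
    return acks >= quorum_size(replicas)


print(quorum_size(3))      # 2
print(write_succeeded(2))  # True
print(write_succeeded(1))  # False
```

Any copies that missed the write are brought up to date later by the replication layer.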
Swift doesn’t want RAID. Data protection is done by replication, not by RAID.
Swift places each copy of the data as uniquely as possible, to avoid putting it in a location shared with another copy.
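To make the "as unique as possible" idea concrete, here is a toy sketch that picks a drive for each replica, preferring a zone that holds no copy yet, then a node that holds no copy yet. It is not Swift's ring builder, which also accounts for things like drive weights. The device layout and the name place_replicas are hypothetical.

```python
def place_replicas(devices, replicas=3):
    """Greedy toy placement: prefer unused zones, then unused nodes."""
    if replicas > len(devices):
        raise ValueError("not enough devices for the requested replicas")
    chosen = []
    for _ in range(replicas):
        used_zones = {d["zone"] for d in chosen}
        used_nodes = {d["node"] for d in chosen}
        candidates = [d for d in devices if d not in chosen]
        # False sorts before True, so devices in unused zones/nodes win.
        best = min(
            candidates,
            key=lambda d: (d["zone"] in used_zones, d["node"] in used_nodes),
        )
        chosen.append(best)
    return chosen


# Hypothetical cluster: two zones, three nodes, four drives.
devices = [
    {"id": 0, "zone": 1, "node": "a"},
    {"id": 1, "zone": 1, "node": "a"},
    {"id": 2, "zone": 1, "node": "b"},
    {"id": 3, "zone": 2, "node": "c"},
]
placed = place_replicas(devices)
print([d["id"] for d in placed])  # [0, 3, 2]
```

With only two zones available, the third copy has to share a zone with the first, but it still lands on a different node.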
The Account and Container servers keep track of containers and objects.
Objects are stored by Object Servers. Metadata is stored with the data using a standard filesystem (XFS).
– No RAID
– Use SATA or SAS drives
– You can use SSD for read-heavy caching
We are working with two things:
– SSH to the node(s) we are creating.
NTP is an important service.
If your nodes use Active Directory for authentication, you should specify your AD servers’ hostnames or IPs for the NTP server settings.
Partition Power – Err on the side of oversizing. This is hard to change once set. Replication happens on a partition basis (regardless of data content in a partition). Replication has a performance overhead.
A Partition Power of 16 gives a pool of up to nearly 2,000 (1,966) drives in a cluster. With 3TB drives, that yields usable storage of 1.97PB.
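The numbers above can be reproduced with a little arithmetic. This sketch assumes the common rule of thumb of roughly 100 partitions per drive, 3 replicas, and 1PB = 1,000TB. The 100-per-drive figure is my assumption to make the numbers work, not something stated in the session, and the function names are made up.

```python
def max_drives(part_power, replicas=3, parts_per_drive=100):
    """Drives supported if each drive should hold about parts_per_drive partitions."""
    total_partition_copies = (2 ** part_power) * replicas
    return total_partition_copies // parts_per_drive


def usable_pb(drives, drive_tb=3, replicas=3):
    """Usable capacity in PB after dividing raw capacity by the replica count."""
    return drives * drive_tb / replicas / 1000


drives = max_drives(16)
print(drives)                       # 1966
print(round(usable_pb(drives), 2))  # 1.97
```

Raising the Partition Power by one doubles the number of partitions, and so doubles the drive count the cluster can grow to.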
A node can only belong to one cluster.
If nodes have multiple interfaces you can assign one interface to the proxy and load balancer and the other interface to inter-node communications.
Note: Linux uses "partition" to mean a section of a drive.
Swift uses "partition" to mean a folder or bucket that objects are grouped into.
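As a rough illustration of how an object ends up in a Swift partition: the object's path is hashed, and the top bits of the hash (as many bits as the Partition Power) give the partition number. This is a simplified sketch. A real cluster also mixes a per-cluster secret into the hash, which is left out here, and the function name and sample path are made up.

```python
import hashlib
import struct


def partition_for(account, container, obj, part_power=16):
    """Map an object path to a partition number (simplified, no cluster secret)."""
    path = "/%s/%s/%s" % (account, container, obj)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    # Read the first 4 bytes as an unsigned big-endian integer,
    # then keep only the top part_power bits.
    return struct.unpack_from(">I", digest)[0] >> (32 - part_power)


# Hypothetical object path, for illustration only.
part = partition_for("AUTH_demo", "photos", "notes.txt")
print(0 <= part < 2 ** 16)  # True
```

The same path always hashes to the same partition, which is why the Partition Power is so hard to change after the cluster is built.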