Ceph Intro & Architectural Overview

Ceph is a free software storage platform designed to present object, block, and file storage from a single distributed computer cluster. Ceph's main goals are to be completely distributed without a single point of failure, scalable to the exabyte level, and freely available. The data is replicated, making it fault tolerant. Ceph software runs on commodity hardware. The system is designed to be both self-healing and self-managing and strives to reduce both administrator and budget overhead.

Ceph employs four distinct kinds of daemons:[4]
- Cluster monitors (ceph-mon) that keep track of active and failed cluster nodes
- Metadata servers (ceph-mds) that store the metadata of inodes and directories
- Object storage devices (ceph-osd) that actually store the content of files. Ideally, OSDs store their data on a local Btrfs filesystem to leverage its built-in copy-on-write capabilities, though other local filesystems can be used instead.[5]
- Representational state transfer (RESTful) gateways (ceph-rgw) that expose the object storage layer as an interface compatible with Amazon S3 or OpenStack Swift APIs

All of these are fully distributed, and may run on the same set of servers. Clients directly interact with all of them.[6]

Ceph stripes individual files across multiple nodes to achieve higher throughput, similar to how RAID0 stripes partitions across multiple hard drives. Adaptive load balancing is supported, whereby frequently accessed objects are replicated over more nodes.[citation needed] As of December 2014, the underlying filesystems recommended for production environments are ext4 (small scale) and XFS (large-scale deployments), while Btrfs and ZFS are recommended for non-production environments.[7]

Object storage

[Figure: an architecture diagram showing the relations between components of the Ceph storage platform]

Ceph implements distributed object storage. Ceph's software libraries provide client applications with direct access to the reliable autonomic distributed object store (RADOS) object-based storage system, and also provide a foundation for some of Ceph's features, including RADOS Block Device (RBD), RADOS Gateway, and the Ceph File System. The librados software libraries provide access in C, C++, Java, Python and PHP. The RADOS Gateway also exposes the object store as a RESTful interface which can present as both native Amazon S3 and OpenStack Swift APIs.

Block storage

Ceph's object storage system allows users to mount Ceph as a thinly provisioned block device. When an application writes data to Ceph using a block device, Ceph automatically stripes and replicates the data across the cluster. Ceph's RADOS Block Device (RBD) also integrates with the Kernel-based Virtual Machine (KVM). Ceph RBD interfaces with the same Ceph object storage system that provides the librados interface and the CephFS file system, and it stores block device images as objects. Since RBD is built on top of librados, RBD inherits librados's capabilities, including read-only snapshots and revert to snapshot. By striping images across the cluster, Ceph improves read access performance for large block device images. The block device is supported in virtualization platforms, including Apache CloudStack, OpenStack, OpenNebula, Ganeti, and Proxmox Virtual Environment. These integrations allow administrators to use Ceph's block device as the storage for their virtual machines in these environments.

File system

Ceph's file system (CephFS) runs on top of the same object storage system that provides object storage and block device interfaces. The Ceph metadata server cluster provides a service that maps the directories and file names of the file system to objects stored within RADOS clusters. The metadata server cluster can expand or contract, and it can rebalance the file system dynamically to distribute data evenly among cluster hosts. This ensures high performance and prevents heavy loads on specific hosts within the cluster.
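
The RAID0-style striping mentioned in the description can be sketched in a few lines. This is a deliberately simplified illustration, not Ceph's actual layout logic: it assumes a single fixed object size and ignores stripe units, stripe counts, and replication. The function name `stripe_extents` and the parameter `object_size` are hypothetical, chosen only for this example.

```python
from typing import List, Tuple


def stripe_extents(offset: int, length: int, object_size: int) -> List[Tuple[int, int, int]]:
    """Split a byte range of a file into per-object pieces.

    Returns a list of (object_index, offset_within_object, piece_length).
    Simplified assumption: the file is cut into consecutive fixed-size
    objects, so byte N lives in object N // object_size.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")

    extents = []
    end = offset + length
    while offset < end:
        index = offset // object_size          # which object holds this byte
        within = offset % object_size          # position inside that object
        piece = min(object_size - within, end - offset)
        extents.append((index, within, piece))
        offset += piece
    return extents


if __name__ == "__main__":
    # A 10-byte write starting at byte 6, with tiny 4-byte objects.
    for extent in stripe_extents(6, 10, 4):
        print(extent)
```

Because consecutive pieces land in different objects, and those objects can live on different nodes, a large read or write can be served by several nodes in parallel, which is where the throughput gain comes from.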

Slides
Comments

3:51 -- I love how every time someone talks about Ceph they have a moment like this. A little Freudian slip. "This is the approach that WE favor.." --throat clear-- Sometimes instead you'll catch a little chuckle, as if the speaker is thinking to themselves how awesome Ceph is, as if they can't help themselves. (they're better than I am, they know a better way than I do and they straight up know it... and for once I actually like that. [Because I can learn from it.]) And it's almost like it's secretly a joke if you aren't already fully aware that it's the future.. I was worried about the unknown but I'm starting to gain a comfortable understanding. (I love YouTube learning.) Enough to deploy a 4 node cluster. Little clues show me that we're still only in the beginning with Ceph but I'm also pretty late to the party. Production ready but the best is still in store. I'm ready to party now! ... this video in particular helped me realize that I have 1 more node that I can add to my cluster. Time to put my resources to REAL use. Thank you.

damonmueller

THE best explanation of Ceph I have seen so far. Great work!

suyashdongre

Great presentation. Exactly what I was looking for.

Bigeinla

Best tech presentation I've seen in a long time

enmanuelh

I've been searching all day for a decent explanation of CEPH and this is by far one of the "BEST" I have seen. Thank you very much. The light bulb just came on and it's burning bright. How can I get more of that type of training and explanation?

OnesimusX

Determinism means that a specific key always generates the same value, no matter how many times the algorithm is run. This basically means there is no randomness involved in the hashing algorithm itself.

amirmohg
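
The determinism described in the comment above can be illustrated with a small placement function. This is only a sketch and is not Ceph's CRUSH algorithm: it swaps in rendezvous (highest-random-weight) hashing as a simple stand-in, and the function name `place`, the node names, and the `replicas` parameter are all hypothetical.

```python
import hashlib
from typing import List


def place(key: str, nodes: List[str], replicas: int = 2) -> List[str]:
    """Pick `replicas` nodes for `key` using rendezvous hashing.

    The result depends only on the key and the set of node names:
    no random numbers and no lookup table are involved, so any client
    that knows the node list computes the same answer.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")

    def score(node: str) -> int:
        # Hash the (node, key) pair; the same inputs always give the same score.
        digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring nodes win; the order of the input list does not matter.
    return sorted(nodes, key=score, reverse=True)[:replicas]


if __name__ == "__main__":
    cluster = ["osd-a", "osd-b", "osd-c", "osd-d"]
    print(place("my-object", cluster))
    print(place("my-object", list(reversed(cluster))))  # same placement
```

A side effect of this style of hashing is that removing a node that does not hold a given key leaves that key's placement unchanged, so only the data on the lost node has to move.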

lol. how the description text below this video is better than the whole ceph website at explaining what ceph is and does.

MistaSmith

I think the description below the video is part of a published scientific paper. Can anyone help me find its title or a link on Google Scholar?

MilorRamadi

Some good info, looking forward to setting up my first CEPH cluster

lostsoulparty

Yeah, overall that's a pretty comprehensive introduction to Ceph, nice

danielkrajnik

I could see some correlation between the Dynamic Subtree Partitioning which CEPH uses and the DNE (Distributed Namespace) used in Lustre. Do they branch out from the same origin?

abhishekkr

Huh. File systems should be fully deterministic. The same data in the same file should put it in the same blocks. That would allow reduction again.

isbestlizard

Great explanation, easy to understand

jarabers

Where can I find the slides used in this great keynote?

MohamedGamil

It reminds me of the 432 chips by Intel, and of distributed computing and distributed voted I/O processing.
Does it allow for offline and non-deterministic OSDs when factoring in recovery/deletion/renaming options?

cemery

This is still a really good presentation

afortiorama

This is cool. What is max performance of Ceph? Can it read/write at 50 GB/sec to a clustered fs?

isbestlizard

Are hard links supported (not soft links)? It's required for things like Cassandra and other high-bandwidth, file-persisted products.

richrein

Great talk... but now I'm baffled as to how Ceph works with the other OpenStack storage solutions like Swift and Cinder?

horizonbrave

The only issue I see here is that as you scale out, you scale failure points: instead of one huge storage appliance with redundancy, you have tons of small appliances with no redundancy.

balla