rbd cache (Part 1)

cache

1、why

　　The existence of cache is based on a mismatch between the performance characteristics of core components of computing architectures, namely that bulk storage cannot keep up with the performance requirements of the CPU and application processing.

2、what

　　Caching is the technique of storing a copy of data temporarily in rapidly accessible storage media (also known as memory) that is local to the CPU and separate from bulk storage.

3、with

Latency is reduced for active data, which results in higher performance levels for the application.
I/O operations to external storage are reduced as much of the I/O is diverted to cache, resulting in lower levels of SAN traffic and contention for the SAN.
Data can sit permanently on external storage arrays or traditional storage, which maintains the consistency and integrity of the data using features provided by the array, such as snapshots or replication.
Flash is targeted at just the part of the workload that benefits from lower latency, resulting in a more cost-effective use of high $/TB storage.

4、classify

Write-through cache directs write I/O onto cache and through to underlying permanent storage before confirming I/O completion to the host. This ensures data updates are safely stored on, for example, a shared storage array, but has the disadvantage that I/O still experiences latency based on writing to that storage. Write-through cache is good for applications that write and then re-read data frequently, because the data is already in cache and reads have low latency. (Writes go to the cache first and then to the backend, so the latest writes can always be read from the cache.)
Write-around cache is a similar technique to write-through cache, but write I/O is written directly to permanent storage, bypassing the cache. This can keep the cache from being flooded with write I/O that will not subsequently be re-read, but has the disadvantage that a read request for recently written data will create a “cache miss”, have to be read from slower bulk storage, and experience higher latency. (Writes bypass the cache and go straight to the backend, so the latest writes cannot be served from the cache.)
Write-back cache is where write I/O is directed to cache and completion is immediately confirmed to the host. This results in low latency and high throughput for write-intensive applications, but there is a data availability risk because the only copy of the written data is in cache. Suppliers have added resiliency with products that duplicate writes. Users need to consider whether write-back cache solutions offer enough protection, as data is exposed until it is staged to external storage. Write-back cache is the best performing solution for mixed workloads, as both read and write I/O have similar response time levels. (Relies on strategies such as replication to avoid data loss.)
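
The three write policies above can be compared in a small in-memory sketch. This is only an illustration under simplifying assumptions: `BackingStore` and `Cache` are hypothetical names invented here (they are not part of librbd or any real product), the "storage" is a plain dict, and there is no eviction or concurrency.

```python
class BackingStore:
    """Hypothetical stand-in for slow permanent storage."""

    def __init__(self):
        self.data = {}

    def write(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Cache:
    """Toy cache supporting the three write policies described above."""

    POLICIES = ("write-through", "write-around", "write-back")

    def __init__(self, backend, policy):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.backend = backend
        self.policy = policy
        self.entries = {}   # cached copies
        self.dirty = set()  # keys not yet written to the backend

    def write(self, key, value):
        if self.policy == "write-through":
            # Update the cache and the backend before acknowledging.
            self.entries[key] = value
            self.backend.write(key, value)
        elif self.policy == "write-around":
            # Bypass the cache; drop any stale cached copy.
            self.entries.pop(key, None)
            self.backend.write(key, value)
        else:
            # write-back: acknowledge once the data is in cache only.
            self.entries[key] = value
            self.dirty.add(key)

    def read(self, key):
        """Return (value, hit), where hit says whether the cache served it."""
        if key in self.entries:
            return self.entries[key], True
        value = self.backend.read(key)
        if value is not None:
            self.entries[key] = value  # populate the cache on a miss
        return value, False

    def flush(self):
        """Stage dirty data to the backend (only relevant for write-back)."""
        for key in sorted(self.dirty):
            self.backend.write(key, self.entries[key])
        self.dirty.clear()


if __name__ == "__main__":
    for policy in Cache.POLICIES:
        store = BackingStore()
        cache = Cache(store, policy)
        cache.write("block0", "v1")
        print(policy, "backend:", store.read("block0"), "read:", cache.read("block0"))
```

With write-back, the backend holds nothing for `block0` until `flush()` is called, which is exactly the exposure window the text warns about.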

5、where

In the server – Some caching solutions are deployed directly in the server, either on RAID cards or Fibre Channel host bus adapter (HBA) cards. Products in the market today include LSI’s range of Nytro MegaRAID PCIe cards and Qlogic’s FabricCache. Both these products aim to accelerate I/O by caching data on the card itself or, in the case of FabricCache, on a connected PCIe SSD device that uses the PCIe bus for power.
- On the server (host): caching is done on the RAID card or HBA card.
Working with the hypervisor – In this case the hypervisor is involved in the caching process, typically through one of two methods.
- Inside the VMM: caching is done in the hypervisor.
In the operating system – Microsoft provides write-back cache within Windows Server 2012 R2 that can be used with Hyper-V. There are other caching software solutions that deploy into the operating system, providing acceleration for Windows and Linux environments, such as FlashSoft from SanDisk. Having caching software integrated with the OS makes it possible to target caching more precisely, for example, by applying it only to certain disk volumes or folders, although these solutions may be less flexible in clustered environments.
- Inside the guest operating system: Windows 2012, for example, provides a write-back caching mechanism.

6、problems

One example is the problem of cache warm-up, where the cache needs to be loaded with enough active data to reduce cache misses before it can start improving I/O response times.
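
The warm-up effect can be seen in a toy simulation. This is a minimal sketch with assumptions made only for illustration: the function name `hit_rates_per_pass` is hypothetical, the cache is an unbounded dict, and the workload simply re-reads the same working set on every pass.

```python
def hit_rates_per_pass(working_set, passes):
    """Read the same working set repeatedly and report the hit rate of each pass."""
    cache = {}
    rates = []
    for _ in range(passes):
        hits = 0
        for block in working_set:
            if block in cache:
                hits += 1
            else:
                # Miss: pretend to fetch the block from slow bulk storage.
                cache[block] = f"data-{block}"
        rates.append(hits / len(working_set))
    return rates


if __name__ == "__main__":
    # The first pass is all misses (cold cache); later passes are all hits.
    print(hit_rates_per_pass(list(range(8)), passes=3))  # [0.0, 1.0, 1.0]
```

A real cache warms up more gradually, since the working set is rarely read in one clean pass and capacity is limited, but the shape is the same: the benefit only appears after the active data has been loaded.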

There will always be a trade-off between latency and resiliency, so it is up to the user to decide whether write caching is an essential requirement of the deployment.

　　One other consideration is the algorithms or logic used to determine what to cache. Some solutions use simple “least recently used” policies to discard data; others are more complex and look at the data for clues as to which should be retained in cache.
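
As an illustration of the simple end of that spectrum, a "least recently used" policy can be sketched in a few lines. The class name `LRUCache` and its methods are hypothetical and chosen for this example; the sketch uses Python's standard `collections.OrderedDict` and is not taken from any particular caching product.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache that discards the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.entries:
            return None  # cache miss
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")     # "a" is now more recent than "b"
    cache.put("c", 3)  # evicts "b"
    print(list(cache.entries))  # ['a', 'c']
```

More complex policies replace the eviction step with logic that also weighs access frequency or hints from the data itself.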

7、new

　　NVDIMM technology, which uses DRAM slots and delivers NAND flash storage, offers a middle ground: performance that comes close to DRAM speeds on a persistent storage medium.

8、difference from cache tier

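location: A cache tier caches data on the OSD side at the RADOS layer, so block storage, object storage, and file storage can all use a tier to speed up reads and writes. rbd cache is a client-side cache at the RBD layer, so it supports block storage only.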

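problem: rbd cache is a client-side cache, so when multiple clients use the same block device (for example, with ocfs2), the clients can see inconsistent data. For example, after user A writes data to the block device, the data stays in A's own cache and is not flushed to disk immediately, so other users cannot read what A wrote. A cache tier does not have this problem: all users' writes go directly to the SSD, and reads are served from the SSD as well, so clients do not see inconsistent data.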
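
The inconsistency can be reproduced with a toy model of two clients sharing one device. This is only a sketch: `SharedDevice` and `CachingClient` are hypothetical names, the private write-back cache is loosely analogous to rbd cache, and nothing here uses the real librbd API.

```python
class SharedDevice:
    """Hypothetical block device shared by several clients."""

    def __init__(self):
        self.blocks = {}


class CachingClient:
    """Client with a private write-back cache in its own memory."""

    def __init__(self, device):
        self.device = device
        self.cache = {}
        self.dirty = set()

    def write(self, block, data):
        # The write is acknowledged once it is in this client's cache.
        self.cache[block] = data
        self.dirty.add(block)

    def read(self, block):
        # Serve from the private cache if present, else from the device.
        # Reads are not cached here, to keep the example small.
        if block in self.cache:
            return self.cache[block]
        return self.device.blocks.get(block)

    def flush(self):
        for block in sorted(self.dirty):
            self.device.blocks[block] = self.cache[block]
        self.dirty.clear()


if __name__ == "__main__":
    device = SharedDevice()
    user_a = CachingClient(device)
    user_b = CachingClient(device)

    user_a.write(0, "hello")
    print(user_b.read(0))  # None: A's write is still only in A's cache
    user_a.flush()
    print(user_b.read(0))  # hello: visible once A has flushed
```

In the cache tier case the equivalent of `device.blocks` is the shared SSD layer itself, so there is no private copy for another client to miss.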

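usage: A cache tier uses SSDs for caching, whereas rbd cache can only use memory. SSDs and memory differ in two respects: read/write speed and power-loss protection. Data in memory is lost when power is lost, while data on an SSD is not.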

References:

1、http://www.computerweekly.com/feature/Write-through-write-around-write-back-Cache-explained

2、rbd cache settings: http://docs.openfans.org/ceph/ceph4e2d658765876863/ceph-1/copy_of_ceph-block-device3010ceph57578bbe59073011/cache-settings3010librbd7f135b588bbe7f6e3011

3、Verifying whether rbd cache is enabled: http://www.zphj1987.com/2015/11/16/%E9%AA%8C%E8%AF%81rbd%E7%9A%84%E7%BC%93%E5%AD%98%E6%98%AF%E5%90%A6%E5%BC%80%E5%90%AF/

4、Ceph rbd optimization, the performance gain from moving rbd cache from memory to SSD: http://blog.csdn.net/lzw06061139/article/details/51203461

5、rbd cache settings in Red Hat's Ceph distribution: https://access.redhat.com/documentation/en/red-hat-ceph-storage/version-1.2.3/red-hat-ceph-storage-123-ceph-block-device/chapter-10-cache-settings

6、Introduction to ceph rbd: http://my.oschina.net/linuxhunter/blog/541997