474095e46c
Highlights: - "experimental" code for managing md/raid1 across a cluster using DLM. Code is not ready for general use and triggers a WARNING if used. However it is looking good and mostly done and having in mainline will help co-ordinate development. - RAID5/6 can now batch multiple (4K wide) stripe_heads so as to handle a full (chunk wide) stripe as a single unit. - RAID6 can now perform read-modify-write cycles which should help performance on larger arrays: 6 or more devices. - RAID5/6 stripe cache now grows and shrinks dynamically. The value set is used as a minimum. - Resync is now allowed to go a little faster than the 'mininum' when there is competing IO. How much faster depends on the speed of the devices, so the effective minimum should scale with device speed to some extent. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIVAwUAVTbIxDnsnt1WYoG5AQIgAA//Z9FlEpkHcJJ75WGjrXgGJPyNTfEZOkoz jnD8PpBY2Afp341vMatd0XKErdEAhuPQmJMAa+tbxht6pk77X3fSzXghGA8FEafg yazn5pfBt6NXmepV4vhl/+LNYuWRxxLSA9EDm7wEg+tiO0UEts+a0w2TSXzcT1w+ 30Yi1EjAlTaJ5yjlBHUOtTXWE43D6RKnUr6FMy2dRlnzlFyDlezRMChWo9v6OkQF YqJ20FcvmJdLHY/6Yif3jvpm7eQecMqdCZENTvW/mJI86zqf6E+ToCYS1VNjfDnK ud61iU9eshu4WtNWBG6KLuHBD0grO1NaEL7/S16w1KdNJMhYgiK8WussvIAJEesA 5SlETM7Y/1XFq8puwlAq2/tuPfhZ+TFxnAwce/C3hMTDcYAACnS/R6INFQXqGvy3 nX1NLogrCycX8oqxv3jTFKLVqIVwlkSlHcUGzIWjcfCF37StcXFKI5q862agyg2+ NNocFMuXhPPM1YcB9JJSo2nCsor4e9tTdVEZlFm2B3cc8LJ9BLWUMSoi1h7VK/1g P7psnPIjz7/cdI2TZTFjGTZ0Kvhx/NTYp41AZealDNxeGWUNM+5xGZnUF8QRBc/E 0dGHtEAah834BDQFvNnJtuuh/s+KwbvswjNP+njoBsHjIQIvngDABpOwpIkdqF6r diQ2gUPnHN0= =OHG6 -----END PGP SIGNATURE----- Merge tag 'md/4.1' of git://neil.brown.name/md Pull md updates from Neil Brown: "More updates that usual this time. A few have performance impacts which hould mostly be positive, but RAID5 (in particular) can be very work-load ensitive... We'll have to wait and see. Highlights: - "experimental" code for managing md/raid1 across a cluster using DLM. Code is not ready for general use and triggers a WARNING if used. However it is looking good and mostly done and having in mainline will help co-ordinate development. - RAID5/6 can now batch multiple (4K wide) stripe_heads so as to handle a full (chunk wide) stripe as a single unit. - RAID6 can now perform read-modify-write cycles which should help performance on larger arrays: 6 or more devices. - RAID5/6 stripe cache now grows and shrinks dynamically. The value set is used as a minimum. - Resync is now allowed to go a little faster than the 'mininum' when there is competing IO. How much faster depends on the speed of the devices, so the effective minimum should scale with device speed to some extent" * tag 'md/4.1' of git://neil.brown.name/md: (58 commits) md/raid5: don't do chunk aligned read on degraded array. md/raid5: allow the stripe_cache to grow and shrink. md/raid5: change ->inactive_blocked to a bit-flag. md/raid5: move max_nr_stripes management into grow_one_stripe and drop_one_stripe md/raid5: pass gfp_t arg to grow_one_stripe() md/raid5: introduce configuration option rmw_level md/raid5: activate raid6 rmw feature md/raid6 algorithms: xor_syndrome() for SSE2 md/raid6 algorithms: xor_syndrome() for generic int md/raid6 algorithms: improve test program md/raid6 algorithms: delta syndrome functions raid5: handle expansion/resync case with stripe batching raid5: handle io error of batch list RAID5: batch adjacent full stripe write raid5: track overwrite disk count raid5: add a new flag to track if a stripe can be batched raid5: use flex_array for scribble data md raid0: access mddev->queue (request queue member) conditionally because it is not set when accessed from dm-raid md: allow resync to go faster when there is competing IO. md: remove 'go_faster' option from ->sync_request() ...
479 lines
16 KiB
Plaintext
479 lines
16 KiB
Plaintext
#
|
|
# Block device driver configuration
|
|
#
|
|
|
|
menuconfig MD
|
|
bool "Multiple devices driver support (RAID and LVM)"
|
|
depends on BLOCK
|
|
select SRCU
|
|
help
|
|
Support multiple physical spindles through a single logical device.
|
|
Required for RAID and logical volume management.
|
|
|
|
if MD
|
|
|
|
config BLK_DEV_MD
|
|
tristate "RAID support"
|
|
---help---
|
|
This driver lets you combine several hard disk partitions into one
|
|
logical block device. This can be used to simply append one
|
|
partition to another one or to combine several redundant hard disks
|
|
into a RAID1/4/5 device so as to provide protection against hard
|
|
disk failures. This is called "Software RAID" since the combining of
|
|
the partitions is done by the kernel. "Hardware RAID" means that the
|
|
combining is done by a dedicated controller; if you have such a
|
|
controller, you do not need to say Y here.
|
|
|
|
More information about Software RAID on Linux is contained in the
|
|
Software RAID mini-HOWTO, available from
|
|
<http://www.tldp.org/docs.html#howto>. There you will also learn
|
|
where to get the supporting user space utilities raidtools.
|
|
|
|
If unsure, say N.
|
|
|
|
config MD_AUTODETECT
|
|
bool "Autodetect RAID arrays during kernel boot"
|
|
depends on BLK_DEV_MD=y
|
|
default y
|
|
---help---
|
|
If you say Y here, then the kernel will try to autodetect raid
|
|
arrays as part of its boot process.
|
|
|
|
If you don't use raid and say Y, this autodetection can cause
|
|
a several-second delay in the boot time due to various
|
|
synchronisation steps that are part of this step.
|
|
|
|
If unsure, say Y.
|
|
|
|
config MD_LINEAR
|
|
tristate "Linear (append) mode"
|
|
depends on BLK_DEV_MD
|
|
---help---
|
|
If you say Y here, then your multiple devices driver will be able to
|
|
use the so-called linear mode, i.e. it will combine the hard disk
|
|
partitions by simply appending one to the other.
|
|
|
|
To compile this as a module, choose M here: the module
|
|
will be called linear.
|
|
|
|
If unsure, say Y.
|
|
|
|
config MD_RAID0
|
|
tristate "RAID-0 (striping) mode"
|
|
depends on BLK_DEV_MD
|
|
---help---
|
|
If you say Y here, then your multiple devices driver will be able to
|
|
use the so-called raid0 mode, i.e. it will combine the hard disk
|
|
partitions into one logical device in such a fashion as to fill them
|
|
up evenly, one chunk here and one chunk there. This will increase
|
|
the throughput rate if the partitions reside on distinct disks.
|
|
|
|
Information about Software RAID on Linux is contained in the
|
|
Software-RAID mini-HOWTO, available from
|
|
<http://www.tldp.org/docs.html#howto>. There you will also
|
|
learn where to get the supporting user space utilities raidtools.
|
|
|
|
To compile this as a module, choose M here: the module
|
|
will be called raid0.
|
|
|
|
If unsure, say Y.
|
|
|
|
config MD_RAID1
|
|
tristate "RAID-1 (mirroring) mode"
|
|
depends on BLK_DEV_MD
|
|
---help---
|
|
A RAID-1 set consists of several disk drives which are exact copies
|
|
of each other. In the event of a mirror failure, the RAID driver
|
|
will continue to use the operational mirrors in the set, providing
|
|
an error free MD (multiple device) to the higher levels of the
|
|
kernel. In a set with N drives, the available space is the capacity
|
|
of a single drive, and the set protects against a failure of (N - 1)
|
|
drives.
|
|
|
|
Information about Software RAID on Linux is contained in the
|
|
Software-RAID mini-HOWTO, available from
|
|
<http://www.tldp.org/docs.html#howto>. There you will also
|
|
learn where to get the supporting user space utilities raidtools.
|
|
|
|
If you want to use such a RAID-1 set, say Y. To compile this code
|
|
as a module, choose M here: the module will be called raid1.
|
|
|
|
If unsure, say Y.
|
|
|
|
config MD_RAID10
|
|
tristate "RAID-10 (mirrored striping) mode"
|
|
depends on BLK_DEV_MD
|
|
---help---
|
|
RAID-10 provides a combination of striping (RAID-0) and
|
|
mirroring (RAID-1) with easier configuration and more flexible
|
|
layout.
|
|
Unlike RAID-0, but like RAID-1, RAID-10 requires all devices to
|
|
be the same size (or at least, only as much as the smallest device
|
|
will be used).
|
|
RAID-10 provides a variety of layouts that provide different levels
|
|
of redundancy and performance.
|
|
|
|
RAID-10 requires mdadm-1.7.0 or later, available at:
|
|
|
|
ftp://ftp.kernel.org/pub/linux/utils/raid/mdadm/
|
|
|
|
If unsure, say Y.
|
|
|
|
config MD_RAID456
|
|
tristate "RAID-4/RAID-5/RAID-6 mode"
|
|
depends on BLK_DEV_MD
|
|
select RAID6_PQ
|
|
select ASYNC_MEMCPY
|
|
select ASYNC_XOR
|
|
select ASYNC_PQ
|
|
select ASYNC_RAID6_RECOV
|
|
---help---
|
|
A RAID-5 set of N drives with a capacity of C MB per drive provides
|
|
the capacity of C * (N - 1) MB, and protects against a failure
|
|
of a single drive. For a given sector (row) number, (N - 1) drives
|
|
contain data sectors, and one drive contains the parity protection.
|
|
For a RAID-4 set, the parity blocks are present on a single drive,
|
|
while a RAID-5 set distributes the parity across the drives in one
|
|
of the available parity distribution methods.
|
|
|
|
A RAID-6 set of N drives with a capacity of C MB per drive
|
|
provides the capacity of C * (N - 2) MB, and protects
|
|
against a failure of any two drives. For a given sector
|
|
(row) number, (N - 2) drives contain data sectors, and two
|
|
drives contains two independent redundancy syndromes. Like
|
|
RAID-5, RAID-6 distributes the syndromes across the drives
|
|
in one of the available parity distribution methods.
|
|
|
|
Information about Software RAID on Linux is contained in the
|
|
Software-RAID mini-HOWTO, available from
|
|
<http://www.tldp.org/docs.html#howto>. There you will also
|
|
learn where to get the supporting user space utilities raidtools.
|
|
|
|
If you want to use such a RAID-4/RAID-5/RAID-6 set, say Y. To
|
|
compile this code as a module, choose M here: the module
|
|
will be called raid456.
|
|
|
|
If unsure, say Y.
|
|
|
|
config MD_MULTIPATH
|
|
tristate "Multipath I/O support"
|
|
depends on BLK_DEV_MD
|
|
help
|
|
MD_MULTIPATH provides a simple multi-path personality for use
|
|
the MD framework. It is not under active development. New
|
|
projects should consider using DM_MULTIPATH which has more
|
|
features and more testing.
|
|
|
|
If unsure, say N.
|
|
|
|
config MD_FAULTY
|
|
tristate "Faulty test module for MD"
|
|
depends on BLK_DEV_MD
|
|
help
|
|
The "faulty" module allows for a block device that occasionally returns
|
|
read or write errors. It is useful for testing.
|
|
|
|
In unsure, say N.
|
|
|
|
|
|
config MD_CLUSTER
|
|
tristate "Cluster Support for MD (EXPERIMENTAL)"
|
|
depends on BLK_DEV_MD
|
|
depends on DLM
|
|
default n
|
|
---help---
|
|
Clustering support for MD devices. This enables locking and
|
|
synchronization across multiple systems on the cluster, so all
|
|
nodes in the cluster can access the MD devices simultaneously.
|
|
|
|
This brings the redundancy (and uptime) of RAID levels across the
|
|
nodes of the cluster.
|
|
|
|
If unsure, say N.
|
|
|
|
source "drivers/md/bcache/Kconfig"
|
|
|
|
config BLK_DEV_DM_BUILTIN
|
|
bool
|
|
|
|
config BLK_DEV_DM
|
|
tristate "Device mapper support"
|
|
select BLK_DEV_DM_BUILTIN
|
|
---help---
|
|
Device-mapper is a low level volume manager. It works by allowing
|
|
people to specify mappings for ranges of logical sectors. Various
|
|
mapping types are available, in addition people may write their own
|
|
modules containing custom mappings if they wish.
|
|
|
|
Higher level volume managers such as LVM2 use this driver.
|
|
|
|
To compile this as a module, choose M here: the module will be
|
|
called dm-mod.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_MQ_DEFAULT
|
|
bool "request-based DM: use blk-mq I/O path by default"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
This option enables the blk-mq based I/O path for request-based
|
|
DM devices by default. With the option the dm_mod.use_blk_mq
|
|
module/boot option defaults to Y, without it to N, but it can
|
|
still be overriden either way.
|
|
|
|
If unsure say N.
|
|
|
|
config DM_DEBUG
|
|
bool "Device mapper debugging support"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
Enable this for messages that may help debug device-mapper problems.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_BUFIO
|
|
tristate
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
This interface allows you to do buffered I/O on a device and acts
|
|
as a cache, holding recently-read blocks in memory and performing
|
|
delayed writes.
|
|
|
|
config DM_BIO_PRISON
|
|
tristate
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
Some bio locking schemes used by other device-mapper targets
|
|
including thin provisioning.
|
|
|
|
source "drivers/md/persistent-data/Kconfig"
|
|
|
|
config DM_CRYPT
|
|
tristate "Crypt target support"
|
|
depends on BLK_DEV_DM
|
|
select CRYPTO
|
|
select CRYPTO_CBC
|
|
---help---
|
|
This device-mapper target allows you to create a device that
|
|
transparently encrypts the data on it. You'll need to activate
|
|
the ciphers you're going to use in the cryptoapi configuration.
|
|
|
|
For further information on dm-crypt and userspace tools see:
|
|
<http://code.google.com/p/cryptsetup/wiki/DMCrypt>
|
|
|
|
To compile this code as a module, choose M here: the module will
|
|
be called dm-crypt.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_SNAPSHOT
|
|
tristate "Snapshot target"
|
|
depends on BLK_DEV_DM
|
|
select DM_BUFIO
|
|
---help---
|
|
Allow volume managers to take writable snapshots of a device.
|
|
|
|
config DM_THIN_PROVISIONING
|
|
tristate "Thin provisioning target"
|
|
depends on BLK_DEV_DM
|
|
select DM_PERSISTENT_DATA
|
|
select DM_BIO_PRISON
|
|
---help---
|
|
Provides thin provisioning and snapshots that share a data store.
|
|
|
|
config DM_CACHE
|
|
tristate "Cache target (EXPERIMENTAL)"
|
|
depends on BLK_DEV_DM
|
|
default n
|
|
select DM_PERSISTENT_DATA
|
|
select DM_BIO_PRISON
|
|
---help---
|
|
dm-cache attempts to improve performance of a block device by
|
|
moving frequently used data to a smaller, higher performance
|
|
device. Different 'policy' plugins can be used to change the
|
|
algorithms used to select which blocks are promoted, demoted,
|
|
cleaned etc. It supports writeback and writethrough modes.
|
|
|
|
config DM_CACHE_MQ
|
|
tristate "MQ Cache Policy (EXPERIMENTAL)"
|
|
depends on DM_CACHE
|
|
default y
|
|
---help---
|
|
A cache policy that uses a multiqueue ordered by recent hit
|
|
count to select which blocks should be promoted and demoted.
|
|
This is meant to be a general purpose policy. It prioritises
|
|
reads over writes.
|
|
|
|
config DM_CACHE_CLEANER
|
|
tristate "Cleaner Cache Policy (EXPERIMENTAL)"
|
|
depends on DM_CACHE
|
|
default y
|
|
---help---
|
|
A simple cache policy that writes back all data to the
|
|
origin. Used when decommissioning a dm-cache.
|
|
|
|
config DM_ERA
|
|
tristate "Era target (EXPERIMENTAL)"
|
|
depends on BLK_DEV_DM
|
|
default n
|
|
select DM_PERSISTENT_DATA
|
|
select DM_BIO_PRISON
|
|
---help---
|
|
dm-era tracks which parts of a block device are written to
|
|
over time. Useful for maintaining cache coherency when using
|
|
vendor snapshots.
|
|
|
|
config DM_MIRROR
|
|
tristate "Mirror target"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
Allow volume managers to mirror logical volumes, also
|
|
needed for live data migration tools such as 'pvmove'.
|
|
|
|
config DM_LOG_USERSPACE
|
|
tristate "Mirror userspace logging"
|
|
depends on DM_MIRROR && NET
|
|
select CONNECTOR
|
|
---help---
|
|
The userspace logging module provides a mechanism for
|
|
relaying the dm-dirty-log API to userspace. Log designs
|
|
which are more suited to userspace implementation (e.g.
|
|
shared storage logs) or experimental logs can be implemented
|
|
by leveraging this framework.
|
|
|
|
config DM_RAID
|
|
tristate "RAID 1/4/5/6/10 target"
|
|
depends on BLK_DEV_DM
|
|
select MD_RAID1
|
|
select MD_RAID10
|
|
select MD_RAID456
|
|
select BLK_DEV_MD
|
|
---help---
|
|
A dm target that supports RAID1, RAID10, RAID4, RAID5 and RAID6 mappings
|
|
|
|
A RAID-5 set of N drives with a capacity of C MB per drive provides
|
|
the capacity of C * (N - 1) MB, and protects against a failure
|
|
of a single drive. For a given sector (row) number, (N - 1) drives
|
|
contain data sectors, and one drive contains the parity protection.
|
|
For a RAID-4 set, the parity blocks are present on a single drive,
|
|
while a RAID-5 set distributes the parity across the drives in one
|
|
of the available parity distribution methods.
|
|
|
|
A RAID-6 set of N drives with a capacity of C MB per drive
|
|
provides the capacity of C * (N - 2) MB, and protects
|
|
against a failure of any two drives. For a given sector
|
|
(row) number, (N - 2) drives contain data sectors, and two
|
|
drives contains two independent redundancy syndromes. Like
|
|
RAID-5, RAID-6 distributes the syndromes across the drives
|
|
in one of the available parity distribution methods.
|
|
|
|
config DM_ZERO
|
|
tristate "Zero target"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
A target that discards writes, and returns all zeroes for
|
|
reads. Useful in some recovery situations.
|
|
|
|
config DM_MULTIPATH
|
|
tristate "Multipath target"
|
|
depends on BLK_DEV_DM
|
|
# nasty syntax but means make DM_MULTIPATH independent
|
|
# of SCSI_DH if the latter isn't defined but if
|
|
# it is, DM_MULTIPATH must depend on it. We get a build
|
|
# error if SCSI_DH=m and DM_MULTIPATH=y
|
|
depends on SCSI_DH || !SCSI_DH
|
|
---help---
|
|
Allow volume managers to support multipath hardware.
|
|
|
|
config DM_MULTIPATH_QL
|
|
tristate "I/O Path Selector based on the number of in-flight I/Os"
|
|
depends on DM_MULTIPATH
|
|
---help---
|
|
This path selector is a dynamic load balancer which selects
|
|
the path with the least number of in-flight I/Os.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_MULTIPATH_ST
|
|
tristate "I/O Path Selector based on the service time"
|
|
depends on DM_MULTIPATH
|
|
---help---
|
|
This path selector is a dynamic load balancer which selects
|
|
the path expected to complete the incoming I/O in the shortest
|
|
time.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_DELAY
|
|
tristate "I/O delaying target"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
A target that delays reads and/or writes and can send
|
|
them to different devices. Useful for testing.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_UEVENT
|
|
bool "DM uevents"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
Generate udev events for DM events.
|
|
|
|
config DM_FLAKEY
|
|
tristate "Flakey target"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
A target that intermittently fails I/O for debugging purposes.
|
|
|
|
config DM_VERITY
|
|
tristate "Verity target support"
|
|
depends on BLK_DEV_DM
|
|
select CRYPTO
|
|
select CRYPTO_HASH
|
|
select DM_BUFIO
|
|
---help---
|
|
This device-mapper target creates a read-only device that
|
|
transparently validates the data on one underlying device against
|
|
a pre-generated tree of cryptographic checksums stored on a second
|
|
device.
|
|
|
|
You'll need to activate the digests you're going to use in the
|
|
cryptoapi configuration.
|
|
|
|
To compile this code as a module, choose M here: the module will
|
|
be called dm-verity.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_SWITCH
|
|
tristate "Switch target support (EXPERIMENTAL)"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
This device-mapper target creates a device that supports an arbitrary
|
|
mapping of fixed-size regions of I/O across a fixed set of paths.
|
|
The path used for any specific region can be switched dynamically
|
|
by sending the target a message.
|
|
|
|
To compile this code as a module, choose M here: the module will
|
|
be called dm-switch.
|
|
|
|
If unsure, say N.
|
|
|
|
config DM_LOG_WRITES
|
|
tristate "Log writes target support"
|
|
depends on BLK_DEV_DM
|
|
---help---
|
|
This device-mapper target takes two devices, one device to use
|
|
normally, one to log all write operations done to the first device.
|
|
This is for use by file system developers wishing to verify that
|
|
their fs is writing a consitent file system at all times by allowing
|
|
them to replay the log in a variety of ways and to check the
|
|
contents.
|
|
|
|
To compile this code as a module, choose M here: the module will
|
|
be called dm-log-writes.
|
|
|
|
If unsure, say N.
|
|
|
|
endif # MD
|