InfiniBox supports filesystems and volumes. Filesystems and volumes share the backend storage and can co-exist within the same storage pool. This release includes support for NFS protocol version 3 and many storage capabilities of volumes, such as snapshots and flash cache (also called SSD cache).
InfiniBox filesystems' main design points
InfiniBox filesystems can be as large as the usable capacity of the InfiniBox system and as fast as the network supports. They benefit from the system SSD cache, can be thinly provisioned, support snapshots, use self-encrypting drives, and can be mounted on any client.
InfiniBox implements NAS directly on top of the data layer, which gives it data services similar to those of our SAN implementation. It features the following:
- User mode: for enhanced stability, InfiniBox NAS does not use kernel resources
- Shared cache: the filesystem shares the cache with the SAN volumes to reduce memory consumption and increase performance
- Transaction protection: a snapshot can be taken during a transaction, without performance impact
- RAID, data placement, and provisioning operations are handled by the block device
- Built for scale:
  - High capacity
  - High file count
  - Large number of filesystems
  - High performance
InfiniBox NAS terminology
- NAS - Network-attached storage provides file-level storage that serves files. In NAS, as opposed to SAN, the filesystem is managed by the storage array rather than by the hosts.
- IFS - The Infinidat File System is a B-tree based filesystem. The B-tree allows IFS to translate a host's request for a file to its location inside the storage with a single operation, allowing the filesystem to scale to billions of files.
- Filesystems that use the UNIX security style contain files that have UNIX permissions.
Each file has its own read/write/execute permission per file owner, group owner, and everyone else.
- Filesystems that use the Windows security style contain files that have ACL permissions.
Each file has a list of access permission entries per user or group.
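To make the UNIX security style concrete, the following minimal Python sketch splits a file mode into the three permission sets described above (file owner, group owner, and everyone else). It uses only the standard `stat` module; the helper name `describe_mode` is hypothetical and is not part of any InfiniBox interface.

```python
import stat


def describe_mode(mode):
    """Split a UNIX file mode into rwx strings for owner, group, and others."""
    text = stat.filemode(mode)  # for example '-rwxr-x---'
    return {"owner": text[1:4], "group": text[4:7], "other": text[7:10]}


# A regular file with mode 0750: owner rwx, group r-x, everyone else none.
example_mode = stat.S_IFREG | 0o750
print(describe_mode(example_mode))
# {'owner': 'rwx', 'group': 'r-x', 'other': '---'}
```

On a client, the same helper can be applied to `os.stat(path).st_mode` for a file on a mounted filesystem. Windows-style ACLs do not fit this fixed three-set model, since each file carries a variable-length list of entries.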
How filesystems compare to volumes
Filesystems and SAN volumes have many similarities and some differences:
|Feature||Volumes||Filesystems|
|Provisioned inside pools||√||√|
|Can be moved between pools non-disruptively||√||√|
|Thin / Thick provisioned||√||√|
|Capable of snapshots||√||√|
|Can be write-enabled or write-protected (read-only)||√||√|
|Can benefit from SSD cache||√||√|
|Access||Map a volume to a host / cluster (normally only 1 host / cluster)||Export the filesystem to clients (many clients can access the same filesystem)|
|Protocol||Fibre Channel (FC), iSCSI||NFS v3, SMB|
|Protocol semantics||Writing to blocks (LBAs) inside a volume||Writing to offsets inside files|
|Protocol Transport||Fibre Channel||TCP/IP|
|Growing capacity||Requires host side actions||No host side action required|
|Link Redundancy||Multipath: the host is in charge of using all available paths||LACP provides redundancy on layer 2; when LACP fails, IP addresses automatically fail over to another LACP group|
|Physical fabric||Fibre Channel||Ethernet|
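The difference in protocol semantics in the comparison above can be sketched in Python. The first helper shows how a byte range maps to logical block addresses (LBAs), which is how a host addresses a volume; the second writes to an offset inside a file, which is how a NAS client addresses a filesystem. The 512-byte block size and both function names are assumptions made for illustration only.

```python
BLOCK_SIZE = 512  # assumed logical block size; the real value depends on the volume


def lba_range(byte_offset, length, block_size=BLOCK_SIZE):
    """Return (first_lba, block_count) for the blocks covering a byte range."""
    first = byte_offset // block_size
    last = (byte_offset + length - 1) // block_size
    return first, last - first + 1


def write_at_offset(path, offset, data):
    """Write data at a byte offset inside an existing file; return bytes written."""
    with open(path, "r+b") as f:
        f.seek(offset)
        return f.write(data)


# Block semantics: an 8 KiB write starting at byte 4096 covers 16 blocks from LBA 8.
print(lba_range(4096, 8192))
# (8, 16)
```

With file semantics the client only names a file and an offset, for example `write_at_offset("/mnt/export/data.bin", 4096, b"payload")` on a hypothetical mount point; the storage array, not the host, decides where those bytes are placed.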
Small file capacity accounting
- InfiniBox optimizes data layout per file size:
  - Very small files (below 128 bytes) are stored inside the metadata structure (B-tree) and do not consume additional space
  - Small files (below 64KB):
    - Are allocated 64KB of space; the empty space within the 64KB is filled with zeroes, which our compression removes
    - Are stored in 4KB increments; for example, a 7KB file consumes either:
      - 4KB if it compressed to less than 4KB
      - 8KB if it did not compress at all, or compressed to more than 4KB
  - Large files (above 64KB) are compressed and their compressed size is rounded up to 4KB increments
- Compression affects the "Allocated" and "Snapshots" capacity counters but does not reduce the "Used" capacity counter, so small files are still reported back to the client as consuming 64KB
- For filesystems that are provisioned with a size that is a multiple of 4KB, the "Used" and "Allocated" counters are identical
- The best practice is to thin-provision the filesystem and increase its size to compensate for the virtually inflated files
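The accounting rules above can be worked through with a short Python sketch. It is an illustration only, not InfiniBox code: the function name is hypothetical, the compressed size is supplied by the caller, a file of exactly 64KB is treated as large, and the "Used" values for very small and large files are assumptions, since the text above states the "Used" behavior only for small files.

```python
KIB = 1024
INLINE_LIMIT = 128            # files below this size live inside the B-tree metadata
SMALL_FILE_LIMIT = 64 * KIB   # files below this size are reported as 64KB
BLOCK = 4 * KIB               # physical space is consumed in 4KB increments


def round_up(n, step):
    """Round n up to the next multiple of step."""
    return -(-n // step) * step


def file_accounting(size, compressed_size):
    """Return (used, allocated) in bytes for one file of the given sizes."""
    if size < INLINE_LIMIT:
        # Stored in metadata; treating "Used" as 0 here is an assumption.
        return 0, 0
    allocated = round_up(compressed_size, BLOCK)
    if size < SMALL_FILE_LIMIT:
        # Small files are reported to the client as consuming 64KB.
        return SMALL_FILE_LIMIT, allocated
    # Assumption: "Used" for large files is the logical size rounded up to 4KB.
    return round_up(size, BLOCK), allocated


# The 7KB example: compressed to 3KB versus not compressible at all.
print(file_accounting(7 * KIB, 3 * KIB))
# (65536, 4096)
print(file_accounting(7 * KIB, 7 * KIB))
# (65536, 8192)
```

Summing the first value over many small files shows why a thinly provisioned filesystem needs a larger provisioned size than the data actually written: each 7KB file counts as 64KB of "Used" capacity while consuming at most 8KB of "Allocated" capacity.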
Symbolic link capacity accounting
As symbolic links are small enough to be stored within the filesystem metadata structure, they do not require an actual file. The symbolic link capacity is accounted for as follows:
- 1 inode
- 0 capacity
- Maximum length of the Permissions field for a single export: 3,000 client entries