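Chapter 4: ZFS Datasets

With ordinary filesystems you create partitions to separate different types of data, apply different optimizations to them, and limit how much space each partition can consume. Each partition receives a specific amount of space from the disk. We’ve all been there. We make our best guesses at how much disk space each partition on this system will need next month, next year, and five years from now. Fast forward to the future, and the amount of space you decided to give each partition is more than likely wrong. A partition without enough space for all its data sends you adding disks or moving data, complicating system management. When a partition has too much space, you end up dumping data on it that would be better off elsewhere. More than one of Lucas’ UFS2 systems has /usr/ports as a symlink to somewhere in /home. Jude usually ends up with some part of /var living in /usr/local/var.

ZFS solves this problem by pooling free space, giving your partitions flexibility impossible with more common filesystems. Each ZFS dataset you create consumes only the space required to store the files within it. Each dataset has access to all of the free space in the pool, easing your worries about partition size. You can limit the size of a dataset with a quota or guarantee it a minimum amount of space with a reservation, as discussed in Chapter 6.

Regular filesystems use separate partitions to establish different policies and optimizations for different types of data. /var contains often-changing files such as logs and databases. The root filesystem wants consistency and safety over performance. Over in /home, anything goes. Once you establish a policy for a traditional filesystem, though, it’s really hard to change. The tunefs(8) utility for UFS requires that the filesystem be unmounted to make changes. Some characteristics, such as the number of inodes, cannot be changed after the filesystem has been created.

The core problem of traditional filesystems distills to inflexibility. ZFS datasets are almost infinitely flexible.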

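Datasets

A dataset is a named chunk of data. This data might resemble a traditional filesystem, with files, directories, and permissions. It could be a raw block device, a copy of other data, or anything else you can cram onto a disk.

ZFS uses datasets much as a traditional filesystem uses partitions. Need a policy for /usr and a separate policy for /home? Make each a dataset. Need a block device for an iSCSI target? That’s a dataset. Want a copy of a dataset? That’s another dataset.

Datasets have a hierarchical relationship. A single storage pool is the parent of each top-level dataset. Each dataset can have child datasets. Datasets inherit many characteristics from their parent, as we’ll see throughout this chapter.

You’ll perform all dataset operations with the zfs(8) command. This command has all sorts of sub-commands.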

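Dataset Types

ZFS currently has five types of datasets: filesystems, volumes, snapshots, clones, and bookmarks.

A filesystem dataset resembles a traditional filesystem. It stores files and directories. A ZFS filesystem has a mount point and supports traditional filesystem characteristics such as read-only, restricting setuid binaries, and more. Filesystem datasets also hold other information, including permissions, timestamps for file creation and modification, NFSv4 access control flags, chflags(2), and the like.

A ZFS volume, or zvol, is a block device. On an ordinary filesystem, you might create a file-backed filesystem for iSCSI or a special-purpose UFS partition. On ZFS, these block devices bypass all the overhead of files and directories and reside directly on the underlying pool. Zvols get a device node, skipping the FreeBSD memory devices used to mount disk images.

A snapshot is a read-only copy of a dataset from a specific point in time. Snapshots let you retain previous versions of your filesystem and the files therein for later use. Snapshots use an amount of space based on the difference between the current filesystem and what’s in the snapshot.

A clone is a new dataset based on a snapshot of an existing dataset, allowing you to fork a filesystem. You get an extra copy of everything in the dataset. You might clone the dataset containing your production web site, giving you a copy of the site that you can hack on without touching the production site. A clone only consumes space to store the differences from the original snapshot it was created from. Chapter 7 covers snapshots, clones, and bookmarks.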

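Why Do I Want Datasets?

You obviously need datasets. Putting files on the disk requires a filesystem dataset. And you probably want a dataset for each traditional Unix partition, like /usr and /var. But with ZFS, you want a lot of datasets. Lots and lots of datasets. That is out of the question with a traditional filesystem, with its hard-coded limits on the number of partitions and the inflexibility of those partitions. But using many datasets increases the control you have over your data.

Each ZFS dataset has a series of properties that control its operation, allowing the administrator to control how the dataset performs and how carefully it protects its data. You can tune each dataset exactly as you can with a traditional filesystem. Dataset properties work much like pool properties.

The sysadmin can delegate control over individual datasets to another user, allowing that user to manage the dataset without root privileges. If your organization has a whole bunch of project teams, you can give each project manager their own chunk of space and say, “Here, arrange it however you want.” Anything that reduces our workload is a good thing.

Many ZFS features, such as replication and snapshots, operate on a per-dataset basis. Separating your data into logical groups makes it easier to use these ZFS features to support your organization.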

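Take the example of a web server with dozens of sites, each maintained by different teams. Some teams are responsible for multiple sites, while others have only one. Some people belong to multiple teams. If you follow the traditional filesystem model, you might create a /webserver dataset, put everything in it, and control access with group permissions and sudo(8). You’ve lived like this for decades, and it works, so why change?

But create a dataset for each team, and give each site its own dataset within that parent dataset, and possibilities multiply.

A team needs a copy of a web site for testing? Clone it. With traditional filesystems, you’d have to copy the whole site directory, doubling the amount of disk needed for the site and taking much, much longer. A clone uses only the amount of space for the differences between the sites and appears instantaneously.

The team is about to deploy a new version of a site, but wants a backup of the old site? Create a snapshot. This new site probably uses a whole bunch of the same files as the old one, so you’ll reduce disk space usage. Plus, when the deployment goes horribly wrong, you can restore the old version by rolling back to the snapshot.

A particular web site needs filesystem-level performance tweaks, or compression, or some locally created property? Set it for that site.

You might create a dataset for each team, and then let the teams create their own child datasets for their own sites. You can organize your datasets to fit your people, rather than organizing your people to fit your technology.

When you must change a filesystem setting (property) on all of the sites, make the change to the parent dataset and let the children inherit it.

The same advantages apply to user home directories.

You can also move datasets between machines. Your web sites overflow the web server? Send half the datasets, along with their custom settings and all their clones and snapshots, to the new server.

There is one disadvantage to using many filesystem datasets. When you move a file within a filesystem, the file is renamed. Moving files between separate filesystems requires copying the file to a new location and deleting it from the old, rather than just renaming it. Inter-dataset file copies take more time and require more free space. But that’s trivial against all the benefits ZFS gives you with multiple datasets. This problem exists on other filesystems as well, but hosts using most other filesystems have only a few partitions, making it less obvious.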

Viewing Datasets

The zfs list command shows all of the datasets, and some basic information about them.

$ zfs list 
NAME                USED  AVAIL  REFER  MOUNTPOINT 
mypool              420M  17.9G    96K  none 
mypool/ROOT         418M  17.9G    96K  none 
mypool/ROOT/default 418M  17.9G   418M  / 
...

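The first field shows the dataset’s name.

Under USED and REFER you find information about how much disk space the dataset uses. One downside to ZFS’ incredible flexibility and efficiency is that its interpretation of disk space usage seems somewhat surreal if you don’t understand it. Chapter 6 discusses disk space and strategies to use it.

The AVAIL column shows how much space remains free in the pool or dataset.

Finally, MOUNTPOINT shows where the dataset should be mounted. That doesn’t mean that the dataset is mounted, merely that if it were mounted, this is where it would go. Use zfs mount to see all mounted ZFS filesystems.

If you give a dataset as an argument, zfs list shows only that specific dataset.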

$ zfs list mypool/lamb 
NAME         USED  AVAIL  REFER  MOUNTPOINT 
mypool/lamb  192K  17.9G    96K  /lamb

Restrict the type of dataset shown with the -t flag and the type. You can show filesystems, volumes, or snapshots. Here we display snapshots, and only snapshots.

$ zfs list -t snapshot 
NAME                    USED  AVAIL  REFER  MOUNTPOINT 
zroot/var/log/db@backup    0      -  10.0G  -
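
When the listing feeds a script rather than a person, zfs list has a scripted mode. What follows is a minimal sketch, assuming the -H flag (no header line, tab-separated fields) together with the -o flag covered later in this chapter. The function name has_mountpoint is our own invention, not part of ZFS. It reads name and mountpoint pairs on standard input and prints only the datasets whose mount point is a real path.

```shell
# Hypothetical helper: reads "name<TAB>mountpoint" lines, such as those
# produced by `zfs list -H -o name,mountpoint`, and prints the names of
# datasets whose mountpoint is an absolute path (not "none", "legacy",
# or "-").
has_mountpoint() {
    awk -F '\t' '$2 ~ /^\// { print $1 }'
}
```

On a live system this might be used as: zfs list -H -o name,mountpoint | has_mountpoint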

Now that you can see filesystems, let’s make some.

Creating, Moving, and Destroying Datasets

Use the zfs create command to create any dataset. We’ll look at snapshots, clones, and bookmarks in Chapter 7, but let’s discuss filesystems and volumes now.

Creating Filesystems

Filesystems are the most common type of dataset on most systems. Everyone needs a place to store and organize their files. Create a filesystem dataset by specifying the pool and the filesystem name.

$ zfs create mypool/lamb

This creates a new dataset, lamb, on the ZFS pool called mypool. If the pool has a default mount point, the new dataset is mounted by default (see “Mounting ZFS Filesystems” later in this chapter).

$ mount | grep lamb 
mypool/lamb on /lamb (zfs, local, noatime, nfsv4acls)

The mount settings in parentheses are usually ZFS properties, inherited from the parent dataset. To create a child filesystem, give the full path to the parent filesystem.

$ zfs create mypool/lamb/baby

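The dataset inherits many of its characteristics, including its mount point, from the parent, as we’ll see in “Parent/Child Relationships” later in this chapter.

Creating Volumes

Use the -V flag and a volume size to tell zfs create that you want to create a volume. Give the full path to the volume dataset.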

$ zfs create -V 4G mypool/avolume

Zvols show up in a dataset list like any other dataset. You can tell zfs list to show only zvols by adding the -t volume option.

$ zfs list mypool/avolume 
NAME             USED  AVAIL  REFER  MOUNTPOINT 
mypool/avolume  4.13G  17.9G    64K  -

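Zvols automatically reserve an amount of space equal to the size of the volume plus the ZFS metadata. This 4 GB zvol uses 4.13 GB of space.

As block devices, zvols do not have a mount point. They do get a device node under /dev/zvol, so you can access them as you would any other block device.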

$ ls -al /dev/zvol/mypool/avolume 
crw-r-----  1 root  operator  0x4d Mar 27 20:22 /dev/zvol/mypool/avolume

You can run newfs(8) on this device node, copy a disk image to it, and generally use it like any other block device.

Renaming Datasets

You can rename a dataset with, oddly enough, the zfs rename command. Give the dataset’s current name as the first argument and the new location as the second.

$ zfs rename db/production db/old 
$ zfs rename db/testing db/production

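Use the -f flag to forcibly rename the dataset. You cannot unmount a filesystem with processes running in it, but the -f flag forces the unmount. Any process using the dataset loses access to whatever it was using, and reacts however it will.

Probably badly.

Moving Datasets

You can move a dataset from one part of the ZFS tree to another, making the dataset a child of its new parent. This may cause many of the dataset’s properties to change, since children inherit properties from their parent. Any properties set specifically on the dataset will not change.

Here we move a database out from under the zroot/var/db dataset to a new parent, where we have set some properties to improve fault tolerance.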

$ zfs rename zroot/var/db/mysql zroot/important/mysql

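Since mount points are inherited, this will likely change the dataset’s mount point. Adding the -u flag to the rename command causes ZFS to not immediately change the mount point, giving you time to reset the property to the intended value. Remember that if the machine is restarted or the dataset is manually remounted, it will use its new mount point.

You can rename a snapshot, but you cannot move snapshots out of their parent dataset. Snapshots are covered in detail in Chapter 7.

Destroying Datasets

Sick of that dataset? Drag it out behind the barn and put it out of your misery with zfs destroy.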

$ zfs destroy db/old

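If you add the -r flag, you recursively destroy all children (datasets, snapshots, and so on) of the dataset. To destroy any cloned datasets while you’re at it, use -R. Be very careful when recursively destroying datasets, as you can frequently be surprised by what, exactly, is a child of a dataset.

You might use the -v and -n flags to see exactly what will happen when you destroy a dataset. The -v flag prints verbose information about what gets destroyed, while -n tells zfs(8) to perform a dry run. Between the two, they show what this command would actually destroy before you pull the trigger.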
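
The dry-run habit is easy to wrap in a small function. This is only a sketch: preview_destroy is a hypothetical name, and ZFS_CMD is a hypothetical variable introduced here so the function can be tried on a machine without a pool (set it to echo to print the command line instead of running it). The flags themselves are the ones described above.

```shell
# ZFS_CMD is a hypothetical override, not a ZFS feature. It defaults to
# the real zfs(8) binary; set ZFS_CMD=echo to print instead of execute.
ZFS_CMD=${ZFS_CMD:-zfs}

# Show what a recursive destroy would remove, without removing anything.
# -r recurses into children, -n requests a dry run, -v lists each item.
preview_destroy() {
    "$ZFS_CMD" destroy -rnv "$1"
}
```

Running preview_destroy db/old first, reading the list, and only then typing the real zfs destroy -r by hand keeps the destructive step deliberate.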

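ZFS Properties

ZFS datasets have a number of settings, called properties, that control how the dataset works. While you can set a few of these only when you create the dataset, most of them are tunable while the dataset is live. ZFS also offers a number of read-only properties that provide information such as the amount of space consumed by the dataset, the compression or deduplication ratios, and the creation time of the dataset.

Each dataset inherits its properties from its parent, unless the property is specifically set on that dataset.

Viewing Properties

The zfs(8) tool can retrieve a specific property, or all properties for a dataset. Use the zfs get command, the desired property, and if desired, a dataset name.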

$ zfs get compression mypool/lamb
NAME         PROPERTY     VALUE    SOURCE
mypool/lamb  compression  lz4      inherited from mypool

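Under NAME we see the dataset you asked about, and PROPERTY shows the property you requested. The VALUE is what the property is set to.

The SOURCE is a little more complicated. A source of default means that this property is set to ZFS’ default. A local source means that someone deliberately set this property on this dataset. A temporary property was set when the dataset was mounted, and this property reverts to its usual value when the dataset is unmounted. An inherited property comes from a parent dataset, as discussed in “Parent/Child Relationships” later in this chapter.

Some properties have no source because the source is either irrelevant or inherently obvious. The creation property, which records the date and time the dataset was created, has no source. The value came from the system clock.

If you don’t specify a dataset name, zfs get shows the value of this property for all datasets. The special property keyword all retrieves all of a dataset’s properties.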

$ zfs get all mypool/lamb 
NAME         PROPERTY   VALUE                 SOURCE 
mypool/lamb  type       filesystem            - 
mypool/lamb  creation   Fri Mar 27 20:05 2015 - 
mypool/lamb  used       192K                  - 
...

If you use all and don’t give a dataset name, you get all the properties for all datasets. This is a lot of information. Show multiple properties by separating the property names with commas.

$ zfs get quota,reservation zroot/home 
NAME        PROPERTY     VALUE   SOURCE 
zroot/home  quota        none    local 
zroot/home  reservation  none    default

You can also view properties with zfs list and the -o modifier. This is most suited for when you want to view several properties from multiple datasets. Use the special property name to show the dataset’s name.

$ zfs list -o name,quota,reservation 
NAME                QUOTA  RESERV 
db                   none    none 
zroot                none    none 
zroot/ROOT           none    none 
zroot/ROOT/default   none    none 
... 
zroot/var/log        100G     20G 
...

You can also add a dataset name to see these properties in this format for that dataset.

Changing Properties

Change properties with the zfs set command. Give the property name, the new setting, and the dataset name. Here we change the compression property to off.

$ zfs set compression=off mypool/lamb/baby

Confirm your change with zfs get.

$ zfs get compression mypool/lamb/baby 
NAME              PROPERTY     VALUE  SOURCE 
mypool/lamb/baby  compression  off    local

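Most properties apply only to data written after the property is changed. The compression property tells ZFS to compress data before writing it to disk. We talk about compression in Chapter 6. Disabling compression doesn’t uncompress any data written before the change was made. Similarly, enabling compression doesn’t magically compress data already on the disk. To get the full benefit of enabling compression, you must rewrite every file. You’re better off creating a new dataset, copying the data over with zfs send, and destroying the original dataset.

Read-Only Properties

ZFS uses read-only properties to offer basic information about the dataset. Disk space usage is expressed as properties. You can’t change how much data you’re using by changing the property that says “your disk is half-full.” (Chapter 6 covers ZFS disk space usage.) The creation property records when this dataset was created. You can change many read-only properties by adding or removing data to the disk, but you can’t write these properties directly.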

Filesystem Properties

One key tool for managing the performance and behavior of traditional filesystems is mount options. You can mount traditional filesystems read-only, or use the noexec flag to disable running programs from them. ZFS uses properties to achieve the same effects. Here are the properties used to accomplish these familiar goals.

atime

A file’s atime indicates when the file was last accessed. ZFS’ atime property controls whether the dataset tracks access times. The default value, on, updates the file’s atime metadata every time the file is accessed. Using atime means writing to the disk every time it’s read.

Turning this property off avoids writing to the disk when you read a file, and can result in significant performance gains. It might confuse mailers and other similar utilities that depend on being able to determine when a file was last read.

Leaving atime on, which is the default, increases snapshot size. The first time a file is accessed, its atime is updated. The snapshot retains the original access time, while the live filesystem contains the newly updated access time.

exec

The exec property determines if anyone can run binaries and commands on this filesystem. The default is on, which permits execution. Some environments don’t permit users to execute programs from their personal or temporary directories. Set the exec property to off to disable execution of programs on the filesystem.

The exec property doesn’t prohibit people from running interpreted scripts, however. If a user can run /bin/sh, they can run /bin/sh /home/mydir/script.sh. The shell is what’s actually executing—it only takes instructions from the script.

readonly

If you don’t want anything writing to this dataset, set the readonly property to on. The default, off, lets users modify the dataset within administrative permissions.

setuid

Many people consider setuid programs risky. While some setuid programs must be setuid, such as passwd(1) and login(1), there’s rarely a need to have setuid programs on filesystems like /home and /tmp. Many sysadmins disallow setuid programs except on specific filesystems.

ZFS’ setuid property toggles setuid support. If set to on, the filesystem supports setuid. If set to off, the setuid flag is ignored.

User-Defined Properties

ZFS properties are great, and you can’t get enough of them, right? Well, start adding your own. The ability to store your own metadata along with your datasets lets you develop whole new realms of automation. The fact that children automatically inherit these properties makes life even easier.

To make sure your custom properties remain yours, and don’t conflict with other people’s custom properties, create a namespace. Most people prefix their custom properties with an organizational identifier and a colon. For example, FreeBSD-specific properties have the format “org.freebsd:propertyname,” such as org.freebsd:swap. If the illumos project creates its own property named swap, they’d call it org.illumos:swap. The two values won’t collide.

For example, suppose Jude wants to control which datasets get backed up via a dataset property. He creates the namespace com.allanjude. Within that namespace, he creates the property backup_ignore.

# zfs set com.allanjude:backup_ignore=on mypool/lamb

Jude’s backup script checks the value of this property. If it’s set to on, the backup process skips this dataset.
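
The backup script itself is not shown in this chapter, so what follows is only a sketch of how such a check might look, assuming the script skips datasets whose value is on, as set by the command above. The function name datasets_to_back_up is hypothetical. It reads name and value pairs on standard input; datasets where a user property has never been set report a value of "-".

```shell
# Hypothetical filter: reads "name<TAB>value" lines, such as those from
#   zfs get -H -o name,value -t filesystem com.allanjude:backup_ignore
# and prints every dataset whose property is anything other than "on".
datasets_to_back_up() {
    awk -F '\t' '$2 != "on" { print $1 }'
}
```

Because children inherit user-defined properties, setting the property once on a parent would exclude the whole subtree from this list.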

Parent/Child Relationships

Datasets inherit properties from their parent datasets. When you set a property on a dataset, that property applies to that dataset and all of its children. For convenience, you can run zfs(8) commands on a dataset and all of its children by adding the -r flag. Here, we query the compression property on a dataset and all of its children.

# zfs get -r compression mypool/lamb
NAME              PROPERTY     VALUE  SOURCE
mypool/lamb       compression  lz4    inherited from mypool
mypool/lamb/baby  compression  off    local

Look at the source values. The first dataset, mypool/lamb, inherited this property from the parent pool. In the second dataset, this property has a different value. The source is local, meaning that the property was set specifically on this dataset.

We can restore the original setting with the zfs inherit command.

# zfs inherit compression mypool/lamb/baby
# zfs get -r compression mypool/lamb
NAME              PROPERTY     VALUE  SOURCE
mypool/lamb       compression  lz4    inherited from mypool
mypool/lamb/baby  compression  lz4    inherited from mypool

The child now inherits the compression properties from the parent, which inherits from the grandparent.

When you change a parent’s properties, the new properties automatically propagate down to the child.

# zfs set compression=gzip-9 mypool/lamb
# zfs get -r compression mypool/lamb
NAME              PROPERTY     VALUE   SOURCE
mypool/lamb       compression  gzip-9  local
mypool/lamb/baby  compression  gzip-9  inherited from mypool/lamb

I told the parent dataset to use gzip-9 compression. That percolated down to the child.
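
One way to picture inheritance is as a walk up the dataset name: when a property has no local value, the nearest ancestor that has one supplies it. The function below is a toy model of that walk, not how ZFS implements it, and the name ancestors is our own. It prints each ancestor of a dataset, nearest first.

```shell
# Toy model: print each ancestor of a dataset name, nearest first,
# ending at the pool's root dataset. Pure string handling; it never
# touches a pool.
ancestors() {
    name=$1
    # Strip the last "/component" until no slash remains.
    while [ "${name%/*}" != "$name" ]; do
        name=${name%/*}
        printf '%s\n' "$name"
    done
}
```

For mypool/lamb/baby this yields mypool/lamb and then mypool, which matches the order of the "inherited from" sources seen in the listings above.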

Inheritance and Renaming

When you move or rename a dataset so that it has a new parent, the parent’s properties automatically propagate down to the child. Locally set properties remain unchanged, but inherited ones switch to those from the new parent.

Here we create a new parent dataset and check its compression property.

# zfs create mypool/second
# zfs get compress mypool/second
NAME           PROPERTY     VALUE  SOURCE
mypool/second  compression  lz4    inherited from mypool

Our baby dataset uses gzip-9 compression. It’s inherited this property from mypool/lamb. Now let’s move baby to be a child of second, and see what happens to the compression property.

# zfs rename mypool/lamb/baby mypool/second/baby
# zfs get -r compression mypool/second
NAME                PROPERTY     VALUE  SOURCE
mypool/second       compression  lz4    inherited from mypool
mypool/second/baby  compression  lz4    inherited from mypool

The child dataset now belongs to a different parent, and inherits its properties from the new parent. The child keeps any local properties.

Data on the baby dataset is a bit of a tangle, however. Data written before compression was turned on is uncompressed. Data written while the dataset used gzip-9 compression is compressed with gzip-9. Any data written now will be compressed with lz4. ZFS sorts all this out for you automatically, but thinking about it does make one's head hurt.

Removing Properties

While you can set a property back to its default value, it’s not obvious how to change the source back to inherit or default, or how to remove custom properties once they’re set.

To remove a custom property, inherit it.

# zfs inherit com.allanjude:backup_ignore mypool/lamb

This works even if you set the property on the root dataset.

To reset a property to its default value on a dataset and all its children, or totally remove custom properties, use the zfs inherit command on the pool’s root dataset.

# zfs inherit -r compression mypool

It’s counterintuitive, but it knocks the custom setting off of the root dataset.

Mounting ZFS Filesystems

With traditional filesystems you listed each partition, its type, and where it should be mounted in /etc/fstab. You even listed temporary mounts such as floppies and CD-ROM drives, just for convenience. ZFS allows you to create such a large number of filesystems that this quickly grows impractical.

Each ZFS filesystem has a mountpoint property that defines where it should be mounted. The default mountpoint is built from the pool’s mountpoint. If a pool doesn’t have a mount point, you must assign a mount point to any datasets you want to mount.

# zfs get mountpoint zroot/usr/home
NAME            PROPERTY    VALUE      SOURCE
zroot/usr/home  mountpoint  /usr/home  inherited from zroot/usr

The filesystem normally gets mounted at /usr/home. You could override this when manually mounting the filesystem.

The zroot pool used for a default FreeBSD install doesn’t have a mount point set. If you create new datasets directly under zroot, they won’t have a mount point. Datasets created on zroot under, say, /usr, inherit a mount point from their parent dataset.

Any pool other than the pool with the root filesystem normally has a mount point named after the pool. If you create a pool named db, it gets mounted at /db. All children inherit their mount point from that pool unless you change them.

When you change the mountpoint property for a filesystem, the filesystem and any children that inherit the mount point are unmounted. If the new value is legacy, then they remain unmounted. Otherwise, they are automatically remounted in the new location if the property was previously legacy or none, or if they were mounted before the property was changed. In addition, any shared filesystems are unshared and shared in the new location.

Just like ordinary filesystems, ZFS filesystems aren’t necessarily mounted. The canmount property controls a filesystem’s mount behavior. If canmount is set to yes, running zfs mount -a mounts the filesystem, just like mount -a. When you enable ZFS in /etc/rc.conf, FreeBSD runs zfs mount -a at startup.

When the canmount property is set to noauto, a dataset can only be mounted and unmounted explicitly. The dataset is not mounted automatically when the dataset is created or imported, nor is it mounted by the zfs mount -a command or unmounted by zfs unmount -a.

Things can get interesting when you set canmount to off. You might have two non-mountable datasets with the same mount point. A dataset can exist solely for the purpose of being the parent to future datasets, but not actually store files, as we’ll see below.

Child datasets do not inherit the canmount property.

Changing the canmount property does not automatically unmount or mount the filesystem. If you disable mounting on a mounted filesystem, you’ll need to manually unmount the filesystem or reboot.
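
Finding filesystems left in that in-between state can be scripted. A minimal sketch, assuming the read-only mounted property (which reports yes or no) and the scripted -H output mode of zfs list; the function name stale_mounts is hypothetical.

```shell
# Hypothetical filter: reads "name<TAB>canmount<TAB>mounted" lines, such
# as those from
#   zfs list -H -t filesystem -o name,canmount,mounted
# and prints datasets that are still mounted although canmount is off.
stale_mounts() {
    awk -F '\t' '$2 == "off" && $3 == "yes" { print $1 }'
}
```

Each name it prints is a candidate for a manual zfs unmount.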

Datasets without Mount Points

ZFS datasets are hierarchical. You might need to create a dataset that will never contain any files only so it can be the common parent of a number of other datasets. Consider a default install of FreeBSD 10.1 or newer.

# zfs mount
zroot/ROOT/default  /
zroot/tmp           /tmp
zroot/usr/home      /usr/home
zroot/usr/ports     /usr/ports
zroot/usr/src       /usr/src
...

We have all sorts of datasets under /usr, but there’s no /usr dataset mounted. What’s going on?

A zfs list shows that a dataset exists, and it has a mount point of /usr. But let’s check the mountpoint and canmount properties of zroot/usr and all its children.

# zfs list -o name,canmount,mountpoint -r zroot/usr
NAME             CANMOUNT  MOUNTPOINT
zroot/usr        off       /usr
zroot/usr/home   on        /usr/home
zroot/usr/ports  on        /usr/ports
zroot/usr/src    on        /usr/src

With canmount set to off, the zroot/usr dataset is never mounted. Any files written in /usr, such as the commands in /usr/bin and the packages in /usr/local, go into the root filesystem. Lower-level mount points such as /usr/src have their own datasets, which are mounted.

The dataset exists only to be a parent to the child datasets. You’ll see something similar with the /var partitions.

Multiple Datasets with the Same Mount Point

Setting canmount to off allows datasets to be used solely as a mechanism to inherit properties. One reason to set canmount to off is to have two datasets with the same mount point, so that the children of both datasets appear in the same directory, but might have different inherited characteristics.

FreeBSD’s installer does not set a mountpoint on the default pool, zroot. When you create a new dataset directly under that pool, you must assign a mount point to it.

If you don’t want to assign a mount point to every dataset you create right under the pool, you might assign a mountpoint of / to the zroot pool and leave canmount set to off. This way, when you create a new dataset, it has a mountpoint to inherit. This is a very simple example of using multiple datasets with the same mount point.

Imagine you want an /opt directory with two sets of subdirectories. Some of these directories contain programs, and should never be written to after installation. Other directories contain data. You must lock down the ability to run programs at the filesystem level.

# zfs create db/programs
# zfs create db/data

Now give both of these datasets the mountpoint of /opt and tell them that they cannot be mounted.

# zfs set canmount=off db/programs
# zfs set mountpoint=/opt db/programs

Install your programs to the dataset, and then make it read-only.

# zfs set readonly=on db/programs

You can’t run programs from the db/data dataset, so turn off exec and setuid. We need to write data to these directories, however.

# zfs set canmount=off db/data
# zfs set mountpoint=/opt db/data
# zfs set setuid=off db/data
# zfs set exec=off db/data

Now create some child datasets. The children of the db/programs dataset inherit that dataset’s properties, while the children of the db/data dataset inherit the other set of properties.

# zfs create db/programs/bin
# zfs create db/programs/sbin
# zfs create db/data/test
# zfs create db/data/production

We now have four datasets mounted inside /opt, two for binaries and two for data. As far as users know, these are normal directories. No matter what the file permissions say, though, nobody can write to two of these directories. Regardless of what trickery people pull, the system won’t recognize executables and setuid files in the other two. When you need another dataset for data or programs, create it as a child of the dataset with the desired settings. Changes to the parent datasets propagate immediately to all the children.
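
The create-then-set sequence for db/data can also be collapsed into a single command, since zfs create accepts -o property=value pairs at creation time. The sketch below assumes that form; create_data_parent is a hypothetical function name and ZFS_CMD a hypothetical variable that lets the command be printed (ZFS_CMD=echo) rather than executed.

```shell
# ZFS_CMD is a hypothetical override; it defaults to the real zfs(8).
ZFS_CMD=${ZFS_CMD:-zfs}

# Create a non-mountable parent whose children appear under /opt and
# may hold data but cannot run programs or honor setuid bits.
create_data_parent() {
    "$ZFS_CMD" create -o canmount=off -o mountpoint=/opt \
        -o exec=off -o setuid=off "$1"
}
```

The same approach would not suit db/programs as written above, because readonly has to wait until the programs are installed.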

Pools without Mount Points

While a pool is normally mounted at a directory named after the pool, that isn’t necessarily so.

# zfs set mountpoint=none mypool

This pool no longer gets mounted. Neither does any dataset on the pool unless you specify a mount point. This is how the FreeBSD installer creates the pool for the OS.

# zfs set mountpoint=/someplace mypool/lamb

The directory will be created if necessary and the filesystem mounted.

Manually Mounting and Unmounting Filesystems

To manually mount a filesystem, use zfs mount and the dataset name. This is most commonly used for filesystems with canmount set to noauto.

# zfs mount mypool/usr/src

To unmount a filesystem and all of its children, use zfs unmount.

# zfs unmount mypool/second

If you want to temporarily mount a dataset at a different location, use the -o flag to specify a new mount point. This mount point only lasts until you unmount the dataset.

# zfs mount -o mountpoint=/mnt mypool/lamb

You can only mount a dataset if it has a mountpoint defined. Defining a temporary mount point when the dataset has no mount point gives you an error.

ZFS and /etc/fstab

You can choose to manage some or all of your ZFS filesystem mount points with /etc/fstab if you prefer. Set the dataset’s mountpoint property to legacy. This unmounts the filesystem.

# zfs set mountpoint=legacy mypool/second

Now you can mount this dataset with the mount(8) command:

# mount -t zfs mypool/second /tmp/second

You can also add ZFS datasets to the system’s /etc/fstab. Use the full dataset name as the device node. Set the type to zfs. You can use the standard filesystem options of noatime, noexec, readonly or ro, and nosuid. (You could also explicitly give the default behaviors of atime, exec, rw, and suid, but these are ZFS’ defaults.) The mount order is normal, but the fsck field is ignored. Here’s an /etc/fstab entry that mounts the dataset scratch/junk nosuid at /tmp.

scratch/junk /tmp zfs rw,nosuid 2 0

Tweaking ZFS Volumes

Zvols are pretty straightforward—here’s a chunk of space as a block device; use it. You can adjust how a volume uses space and what kind of device node it offers.

Space Reservations

The volsize property of a zvol specifies the volume’s logical size. By default, creating a volume reserves an amount of space for the dataset equal to the volume size. (If you look ahead to Chapter 6, it establishes a refreservation of equal size.) Changing volsize changes the reservation. The volsize can only be set to a multiple of the volblocksize property, and cannot be zero.

Without the reservation, the volume could run out of space, resulting in undefined behavior or data corruption, depending on how the volume is used. These effects can also occur when the volume size is changed while it is in use, particularly when shrinking the size. Adjusting the volume size can confuse applications using the block device.

Zvols also support sparse volumes, also known as thin provisioning. A sparse volume is a volume where the reservation is less than the volume size. Essentially, using a sparse volume permits allocating more space than the dataset has available. With sparse provisioning you could, say, create ten 1 TB sparse volumes on your 5 TB dataset. So long as your volumes are never heavily used, nobody will notice that you’re overcommitted.

Sparse volumes are not recommended. Writes to a sparse volume can fail with an “out of space” error even if the volume itself looks only partially full.

Specify a sparse volume at creation time by specifying the -s option to the zfs create -V command. Changes to volsize are not reflected in the reservation. You can also reduce the reservation after the volume has been created.
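
A short sketch of creating a sparse volume and then inspecting its reservation. The function names and the ZFS_CMD variable are hypothetical conveniences introduced here (ZFS_CMD=echo prints the command instead of running it); the -s and -V flags and the volsize and refreservation properties are the ones named in this section.

```shell
# ZFS_CMD is a hypothetical override; it defaults to the real zfs(8).
ZFS_CMD=${ZFS_CMD:-zfs}

# Create a thin-provisioned volume: $1 is a size such as 1T, $2 is the
# full dataset name of the new volume.
create_sparse_volume() {
    "$ZFS_CMD" create -s -V "$1" "$2"
}

# Show the logical size next to the space actually reserved for it.
show_volume_reservation() {
    "$ZFS_CMD" get volsize,refreservation "$1"
}
```

Comparing the two properties side by side makes it obvious whether a given volume is fully backed or overcommitted.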

Zvol Mode

FreeBSD normally exposes zvols to the operating system as geom(4) providers, giving them maximum flexibility. You can change this with the volmode property.

Setting a volume’s volmode to dev exposes volumes only as a character device in /dev. Such volumes can be accessed only as raw disk device files. They cannot be partitioned or mounted, and they cannot participate in RAIDs or other GEOM features. They are faster. In some cases where you don’t trust the device using the volume, dev mode can be safer.

Setting volmode to none means that the volume is not exposed outside ZFS. These volumes can be snapshotted, cloned, and replicated, however. These volumes can be suitable for backup purposes.

Setting volmode to default means that volume exposure is controlled by the sysctl vfs.zfs.vol.mode. You can set the default zvol mode system-wide. A value of 1 means the default is geom, 2 means dev, and 3 means none.

While you can change the property on a live volume, it has no effect. This property is processed only during volume creation and pool import. You can recreate the zvol device by renaming the volume with zfs rename.
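
The numeric sysctl values are easy to mix up, so a script might translate them back into names. This sketch encodes only the mapping given above (1 is geom, 2 is dev, 3 is none); the function name volmode_name is hypothetical.

```shell
# Translate the numeric value of the vfs.zfs.vol.mode sysctl into the
# volmode name it stands for, per the mapping described in the text.
volmode_name() {
    case "$1" in
        1) echo geom ;;
        2) echo dev ;;
        3) echo none ;;
        *) echo unknown ;;
    esac
}
```

On a FreeBSD host it could be fed the live setting with: volmode_name "$(sysctl -n vfs.zfs.vol.mode)"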

Dataset Integrity

Most of ZFS’ protections work at the VDEV layer. That’s where blocks and disks go bad, after all. Some hardware limits pool redundancy, however. Very few laptops have enough hard drives to use mirroring, let alone RAID-Z. You can do some things at the dataset layer to offer some redundancy, however, by using checksums, metadata redundancy, and copies. Most users should never touch the first two, and users with redundant virtual devices probably want to leave all three alone.

Checksums

ZFS computes and stores checksums for every block that it writes. This ensures that when a block is read back, ZFS can verify that it is the same as when it was written, and has not been silently corrupted in one way or another. The checksum property controls which checksum algorithm the dataset uses. Valid settings are on, fletcher2, fletcher4, sha256, off, and noparity.

The default value, on, uses the algorithm selected by the OpenZFS developers. In 2015 that algorithm is fletcher4, but it might change in future releases.

The fletcher4 algorithm is good enough for most uses and is very fast. If you want to use fletcher4 forever and ever, you could set this property to fletcher4. We recommend keeping the default of on, however, and letting ZFS upgrade your pool’s checksum algorithm when it’s time.

The value off disables integrity checking on user data.

The value noparity not only disables integrity but also disables maintaining parity for user data. This setting is used internally by a dump device residing on a RAID-Z pool and should not be used by any other dataset. Disabling checksums is not recommended.

Older versions of ZFS used the fletcher2 algorithm. While it’s supported for older pools, it’s certainly not encouraged. The sha256 algorithm is slower than fletcher4, but less likely to result in a collision. In most cases, a collision is not harmful.

The sha256 algorithm is frequently recommended when doing deduplication.

Copies

ZFS stores two or three copies of important metadata, and can give the same treatment to your important user data. The copies property tells ZFS how many copies of user data to keep. ZFS attempts to put those copies on different disks, or failing that, as far apart on the physical disk as possible, to help guard against hardware failure. When you increase the copies property, ZFS also increases the number of copies of the metadata for that dataset, to a maximum of three.

If your pool runs on two mirrored disks, and you set copies to 3, you’ll have six copies of your data. One of them should survive your ill-advised use of dd(1) on the raw provider device or that plunge off the roof.

Increasing or decreasing copies only affects data written after the setting change. Changing copies from 1 to 2 doesn’t suddenly create duplicate copies of all your existing data, as this example shows. Create a 10 MB file of random data, then set copies to 2:

# dd if=/dev/random of=/lamb/random1 bs=1m count=10
10+0 records in
10+0 records out
10485760 bytes transferred in 0.144787 secs (72421935 bytes/sec)
# zfs set copies=2 mypool/lamb

From now on, ZFS stores every block written to this dataset twice. If one of the copies becomes corrupt, ZFS can still read your file. It knows which of the blocks is corrupt because its checksum won’t match. But look at the dataset’s space usage (the REFER column in the listing).

# zfs list mypool/lamb
NAME          USED  AVAIL  REFER  MOUNTPOINT
mypool/lamb  10.2M  13.7G  10.1M  /lamb

Only the 10 MB we wrote is used. No extra copy was made of this file, as we wrote it before changing the copies property. With copies set to 2, however, if we either write another file or overwrite the original file, we’ll see different disk usage.

# dd if=/dev/random of=/lamb/random2 bs=1m count=10
10+0 records in
10+0 records out
10485760 bytes transferred in 0.141795 secs (73950181 bytes/sec)

Look at disk usage now.

# zfs list mypool/lamb
NAME          USED  AVAIL  REFER  MOUNTPOINT
mypool/lamb  30.2M  13.7G  30.1M  /lamb

The total space usage is 30 MB: 10 for the first file of random data, and 20 for two copies of the second 10 MB file. When we look at the files with ls(1), they only show the actual size:

# ls -l /lamb/random*
-rw-r--r--  1 root  wheel  10485760 Apr  6 15:27 /lamb/random1
-rw-r--r--  1 root  wheel  10485760 Apr  6 15:29 /lamb/random2
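The 30 MB figure is simple arithmetic: each file is charged its size multiplied by the copies value that was in effect when it was written. A rough sketch in sh, ignoring the small metadata overhead and assuming the data does not compress (random data won’t). The name charged_mb is made up for this illustration.

```shell
#!/bin/sh
# charged_mb SIZE_MB COPIES: space charged for one file, in MB.
# Hypothetical helper; ignores metadata overhead and compression.
charged_mb() {
    echo $(( $1 * $2 ))
}

first=$(charged_mb 10 1)     # random1, written while copies was 1
second=$(charged_mb 10 2)    # random2, written after copies was set to 2
total=$(( first + second ))

echo "expected usage: ${total} MB"
```

The result lines up with the USED and REFER columns in the listing, which run slightly higher because of metadata.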

If you really want to muck with your dataset’s resilience, look at metadata redundancy.

Metadata Redundancy

Each dataset stores an extra copy of its internal metadata, so that if a single block is corrupted, the amount of user data lost is limited. This extra copy is in addition to any redundancy provided at the VDEV level (e.g., by mirroring or RAID-Z). It’s also in addition to any extra copies specified by the copies property (above), up to a total of three copies.

The redundant_metadata property lets you decide how redundant you want your dataset metadata to be. Most users should never change this property.

When redundant_metadata is set to all (the default), ZFS stores an extra copy of all metadata. If a single on-disk block is corrupt, at worst a single block of user data can be lost.

When you set redundant_metadata to most, ZFS stores an extra copy of most, but not all, types of metadata. This can improve performance of random writes, because less metadata must be written. When only most metadata is redundant, at worst about 100 blocks of user data can be lost if a single on-disk block is corrupt. Exactly which metadata blocks are stored redundantly may change in future releases.

If you set redundant_metadata to most and copies to 2, and the dataset lives on a mirrored pool, then ZFS stores six copies of most metadata, and four copies of data and some metadata.
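The copy counts in this section multiply out in a predictable way. Here is a minimal sketch in sh, assuming a pool built from a single two-way mirror. The helper names are ours, not part of ZFS. The rule that redundantly stored metadata gets one more copy than the copies property, capped at three, comes from the text above.

```shell
#!/bin/sh
# Hypothetical helpers for counting copies; not part of zfs(8).

# metadata_ditto COPIES: copies of redundantly stored metadata,
# one more than the copies property, capped at three.
metadata_ditto() {
    n=$(( $1 + 1 ))
    if [ "$n" -gt 3 ]; then
        n=3
    fi
    echo "$n"
}

# on_disk LOGICAL WIDTH: every logical copy lands on each disk of the mirror.
on_disk() {
    echo $(( $1 * $2 ))
}

width=2     # assumed: one two-way mirror
copies=2

data=$(on_disk "$copies" "$width")
meta=$(on_disk "$(metadata_ditto "$copies")" "$width")

echo "copies=$copies on a ${width}-way mirror: data $data, redundant metadata $meta"
```

With copies set to 3, the same helpers give six on-disk copies of user data, which matches the mirrored-disk example in the Copies section.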

This property was designed for specific use cases that frequently update metadata, such as databases. If the data is already protected by sufficiently strong fault tolerance, reducing the number of copies of the metadata that must be written each time the database changes can improve performance. Change this value only if you know what you are doing.

Now that you have a grip on datasets, let’s talk about pool maintenance.

2 Properly written setuid programs are not risky. That’s why real setuid programs are risky.

3 When you name ZFS properties after yourself, you are immortalized by your work. Whether this is good or bad depends on your work.