About two years ago I wrote a guide for the really old GlusterFS 3.11 version that was available back then on FreeBSD 12.0. Recently I noticed that the GlusterFS version in the FreeBSD Ports tree (and packages) is now finally up to date with the upstream GlusterFS releases.
This guide will show you how to create a GlusterFS 8 distributed filesystem on the latest FreeBSD 13. At the time of writing this article FreeBSD 13 is at the RC1 stage, but it should be released within a month.
In the earlier guide I created a dispersed volume with redundancy comparable to RAID6, but spread across 6 nodes instead of disks, which means that 2 of the 6 nodes can crash and GlusterFS will still work without a problem. Today I will show you a more minimalistic approach: a 3-node setup with a volume that takes space only on node0 and node1, while node2 is used as an arbiter only and does not hold any data. The arbiter greatly reduces the risk of split-brain because instead of a vulnerable two-node cluster we have three nodes in the cluster, so even if one of them fails we still have 2 of 3 votes.
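To see what the arbiter actually stores, you can later (once the volume from this guide is mounted) write a test file through the mount and compare the bricks directly. This is only an illustrative sketch: the file name is made up and the brick paths are the ones we create further below.

[node0] # ls -lh /bricks/data/*/somefile.txt   # data nodes hold the full-size copy
[node2] # ls -lh /bricks/data/*/somefile.txt   # arbiter holds a zero-length entry (metadata only)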
I will not repeat all the ‘initial’ steps needed to prepare these three FreeBSD hosts, as they are already described in the older article on this topic – GlusterFS Cluster on FreeBSD with Ansible and GNU Parallel. I will focus on the GlusterFS commands that need to be executed to achieve our goal.
We will use several prompts in this guide to show which commands will be executed on which nodes.
[ALL]   # command that will be executed on all node0/node1/node2 nodes
[node0] # command that will be executed on node0 only
GlusterFS
We have three nodes in our lab.
- node0 - 10.0.10.200 - DATA NODE 'A'
- node1 - 10.0.10.201 - DATA NODE 'B'
- node2 - 10.0.10.202 - ARBITER NODE
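If you did not follow the older article, make sure these names resolve on all three hosts. The simplest way, assuming no local DNS in the lab, is to add them to /etc/hosts so that a check like this shows all three entries on every node.

[ALL] # grep node /etc/hosts
10.0.10.200 node0
10.0.10.201 node1
10.0.10.202 node2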
Install GlusterFS, then enable and start the glusterd service.
[ALL] # pkg install glusterfs

[ALL] # sysrc glusterd_enable=YES
glusterd_enable: -> YES

[ALL] # service glusterd start
Starting glusterd.
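Optionally you can verify that the daemon is really running and which GlusterFS version the package installed. These are standard commands; the exact version string will of course depend on when you install it.

[ALL] # service glusterd status
[ALL] # gluster --version | head -1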
Enable and mount the /proc filesystem and create the directories needed for the GlusterFS bricks.
[ALL] # grep procfs /etc/fstab
proc  /proc  procfs  rw  0  0

[ALL] # mount /proc

[ALL] # mkdir -p /bricks/data/{01,02,03,04}
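A quick sanity check, just to be sure every node ended up with the same brick layout before we tie them together:

[ALL] # ls -ld /bricks/data/*   # should list the 01 02 03 04 directories on each node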
Now connect all these nodes into one cluster and create the GlusterFS volume.
[node0] # gluster peer status
Number of Peers: 0

[node0] # gluster peer probe node1
peer probe: success

[node0] # gluster peer probe node2
peer probe: success

[node0] # gluster peer status
Number of Peers: 2

Hostname: node1
Uuid: b5bc1602-a7bb-4f62-8149-98ca97be1784
State: Peer in Cluster (Connected)

Hostname: node2
Uuid: 2bfa0c71-04b4-4660-8a5c-373efc5da15c
State: Peer in Cluster (Connected)

[node0] # gluster volume create data \
            replica 2 \
            arbiter 1 \
            node0:/bricks/data/01 \
            node1:/bricks/data/01 \
            node2:/bricks/data/01 \
            node0:/bricks/data/02 \
            node1:/bricks/data/02 \
            node2:/bricks/data/02 \
            node0:/bricks/data/03 \
            node1:/bricks/data/03 \
            node2:/bricks/data/03 \
            node0:/bricks/data/04 \
            node1:/bricks/data/04 \
            node2:/bricks/data/04 \
            force
volume create: data: success: please start the volume to access data

[node0] # gluster volume start data
volume start: data: success

[node0] # gluster volume info

Volume Name: data
Type: Distributed-Replicate
Volume ID: f73d57ea-6f10-4840-86e7-f8178540e948
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: node0:/bricks/data/01
Brick2: node1:/bricks/data/01
Brick3: node2:/bricks/data/01 (arbiter)
Brick4: node0:/bricks/data/02
Brick5: node1:/bricks/data/02
Brick6: node2:/bricks/data/02 (arbiter)
Brick7: node0:/bricks/data/03
Brick8: node1:/bricks/data/03
Brick9: node2:/bricks/data/03 (arbiter)
Brick10: node0:/bricks/data/04
Brick11: node1:/bricks/data/04
Brick12: node2:/bricks/data/04 (arbiter)
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

[node0] # gluster volume status
Status of volume: data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node0:/bricks/data/01                 49152     0          Y       4595
Brick node1:/bricks/data/01                 49152     0          Y       1022
Brick node2:/bricks/data/01                 49152     0          Y       3356
Brick node0:/bricks/data/02                 49153     0          Y       4597
Brick node1:/bricks/data/02                 49153     0          Y       1024
Brick node2:/bricks/data/02                 49153     0          Y       3358
Brick node0:/bricks/data/03                 49154     0          Y       4599
Brick node1:/bricks/data/03                 49154     0          Y       1026
Brick node2:/bricks/data/03                 49154     0          Y       3360
Brick node0:/bricks/data/04                 49155     0          Y       4601
Brick node1:/bricks/data/04                 49155     0          Y       1028
Brick node2:/bricks/data/04                 49155     0          Y       3362
Self-heal Daemon on localhost               N/A       N/A        Y       4604
Self-heal Daemon on node1                   N/A       N/A        Y       1031
Self-heal Daemon on node2                   N/A       N/A        Y       3365

Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks

[node0] # ps aux | grep -e gluster -e RSS | cut -d ' ' -f 1-27
USER   PID %CPU %MEM   VSZ   RSS TT  STAT STARTED     TIME COMMAND
root  4604  4.0  0.7 64364 22520  -  Rs   21:15   53:50.30 /usr/local/sbin/glusterfs -s localhost --volfile-id shd/data -p
root  4585  3.0  0.7 48264 21296  -  Rs   21:14   56:13.25 /usr/local/sbin/glusterd --pid-file=/var/run/glusterd.pid (glusterfsd)
root  4597  3.0  0.7 66472 22484  -  Rs   21:15   48:54.63 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-02 -p
root  4599  3.0  0.7 62376 22464  -  Rs   21:15   48:23.41 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-03 -p
root  4595  2.0  0.8 66864 23724  -  Rs   21:15   49:03.23 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-01 -p
root  4601  2.0  0.7 62376 22444  -  Rs   21:15   49:17.01 /usr/local/sbin/glusterfsd -s node0 --volfile-id data.node0.bricks-data-04 -p
root  6748  0.0  0.1 12868  2560  2  S+   19:59    0:00.00 grep -e gluster -e
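Since this is a replicated volume, it is also worth knowing the self-heal status command. Right after creation it should report zero entries for every brick, and it is the first thing to check after a node comes back from a failure.

[node0] # gluster volume heal data info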
The GlusterFS data volume is now created and started. You can mount it and use it the way you like.
[node2] # mkdir /data

[node2] # kldload fusefs

[node2] # mount_glusterfs node0:/data /data

[node2] # echo $?
0

[node2] # df -h /data
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/fuse     123G    2.5G    121G     2%    /data
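To have the fusefs kernel module loaded automatically after a reboot (so the mount can be recreated without the manual kldload step), one option, not part of the original setup, is to add it to kld_list in rc.conf:

[node2] # sysrc kld_list+="fusefs"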
Voila! Mounted and ready to serve.
Tuning
GlusterFS comes without any tuning applied, so here are some settings I suggest as a starting point.
[node0] # gluster volume set data client.event-threads 8
[node0] # gluster volume set data cluster.lookup-optimize on
[node0] # gluster volume set data cluster.readdir-optimize on
[node0] # gluster volume set data features.cache-invalidation on
[node0] # gluster volume set data group metadata-cache
[node0] # gluster volume set data network.inode-lru-limit 200000
[node0] # gluster volume set data performance.cache-invalidation on
[node0] # gluster volume set data performance.cache-refresh-timeout 10
[node0] # gluster volume set data performance.cache-size 1GB
[node0] # gluster volume set data performance.io-thread-count 16
[node0] # gluster volume set data performance.parallel-readdir on
[node0] # gluster volume set data performance.stat-prefetch on
[node0] # gluster volume set data performance.write-behind-trickling-writes on
[node0] # gluster volume set data performance.write-behind-window-size 100MB
[node0] # gluster volume set data server.event-threads 8
[node0] # gluster volume set data server.outstanding-rpc-limit 256
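To confirm that the options were applied you can list them under the ‘Options Reconfigured’ section of gluster volume info, or query a single option directly, for example:

[node0] # gluster volume get data performance.cache-size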
That is all in this rather short guide.
Treat it as an addendum to the original GlusterFS article linked earlier.