Monthly Archives: June 2019

Valuable News – 2019/06/24

The Valuable News weekly series is dedicated to providing a summary of news, articles and other interesting stuff mostly, but not always, related to UNIX or BSD systems. Whenever I stumble upon something worth mentioning on the Internet I just put it here.

Today the amount of information that we get from various information streams is a massive overload. Thus one needs to focus only on what is important without the need to grep(1) the Internet every day. Hence the idea of providing such information in ‘bulk’, as I already do that grep(1).

UNIX

DragonFly BSD 5.6 Released.
https://www.dragonflybsd.org/release56/
http://lists.dragonflybsd.org/pipermail/users/2019-June/358196.html

FreeBSD and Linux Kernel – Multiple TCP Based Remote Denial of Service Vulnerabilities from Netflix.
By default FreeBSD is not vulnerable – you need to recompile the kernel to use the RACK TCP stack.
https://github.com/Netflix/security-bulletins/blob/master/advisories/third-party/2019-001.md

Journaling Kafka Messages with S3 Connector and Minio.
https://blog.minio.io/journaling-kafka-messages-with-s3-connector-and-minio-83651a51045d

Pock Can Place macOS Dock Inside MacBook TouchBar.
https://pock.dev/

RAMBleed on Solaris/SPARC?
https://www.solaris.wtf/blog/rambleed-on-solaris-sparc/

XigmaNAS 12.0.0.4.6766 and 11.2.0.4.6766 Released.
https://twitter.com/XigmaNAS/status/1141716786376839168
https://twitter.com/XigmaNAS/status/1141716836234518528

New Features in OpenZFS 0.8.1.
https://arstechnica.com/gadgets/2019/06/zfs-features-bugfixes-0-8-1/

FreeBSD Reworks Its W^X Implementation – Extending mmap/mprotect API to Specify Max Page Protections.
https://twitter.com/lattera/status/1141795840748609537
https://svnweb.freebsd.org/base?view=revision&revision=349240

BSD Now 303 – OpenZFS in Ports.
https://www.bsdnow.tv/303

FreeBSD from Linux Developer Perspective.
https://www.bsdcan.org/2019/schedule/events/1057.en.html
https://youtu.be/ml8vo1FWT1I

DragonFly BSD VM Work in 5.6 Release.
http://lists.dragonflybsd.org/pipermail/users/2019-June/358196.html

DragonFly BSD HAMMER1 versus HAMMER2 Performance.
https://www.phoronix.com/scan.php?page=news_item&px=DragonFlyBSD-5.6-HAMMER2-Perf

FreeBSD reworks random(4) to Make Fortuna Allow Increased Concurrency.
https://svnweb.freebsd.org/base?view=revision&revision=349154

FreeBSD Adds ACPI Support for USB Driver.
https://svnweb.freebsd.org/base?view=revision&revision=349161

OpenSSH Gets Protection Against Side Channel Attacks.
https://undeadly.org/cgi?action=article;sid=20190621081455

The vmtouch – Virtual Memory Toucher.
https://hoytech.com/vmtouch/
https://github.com/hoytech/vmtouch

Managing Jails with Ansible.
https://twitter.com/BSDTV/status/1142126677885095941
PDF: 512_handout.pdf
https://youtu.be/FUxdm4rdrf8

Intel has become 1st Uranium FreeBSD donor of 2019.
https://twitter.com/freebsdfndation/status/1142125266543755267

8 Popular Products You Did Not Know Were Built with FreeBSD.
https://www.designnews.com/design-hardware-software/8-popular-products-you-didnt-know-were-built-open-source

Why FreeBSD’s VIMAGE is Awesome.
https://twitter.com/BSDTV/status/1142093221599895552
https://www.bsdcan.org/2019/schedule/events/1036.en.html
PDF: 507_Automated_firewall_testing.pdf
https://youtu.be/gTyt7KLz1mw

FreeBSD 11.3-RC2 Available.
https://lists.freebsd.org/pipermail/freebsd-stable/2019-June/091293.html

The xsv is a fast CSV command line toolkit written in Rust.
https://github.com/BurntSushi/xsv

In Other BSDs for 2019/06/22.
https://www.dragonflydigest.com/2019/06/22/23086.html

DragonFly BSD 5.6.1 Released.
https://www.dragonflybsd.org/

OpenBSD Merges LLVM 8.0.0 Release.
https://marc.info/?l=openbsd-cvs&m=156132754603489&w=2

Hardware

WD Ultrastar DC HC510 10TB SATA Hard Drive Review.
https://www.servethehome.com/hgst-wd-ultrastar-dc-hc510-10tb-sata-hdd-review/

Life

How to Be Great? Just Be Good – Repeatably.
https://blog.stephsmith.io/how-to-be-great/

CIA Spied on People Through Their Smart TVs.
https://www.vice.com/en_us/article/8qbq5x/the-cia-spied-on-people-through-their-smart-tvs-leaked-documents-reveal

How Information is Like Snacks/Money/Drugs to Your Brain.
http://newsroom.haas.berkeley.edu/how-information-is-like-snacks-money-and-drugs-to-your-brain/

Swedish Couple Builds Greenhouse Around Home to Stay Warm and Grow Food All Year Long.
https://returntonow.net/2019/03/04/swedish-couple-builds-greenhouse-around-home-to-stay-warm-and-grow-food-all-year-long/

Other

Google Chrome has Become Surveillance Software – Time to Switch.
https://www.washingtonpost.com/technology/2019/06/21/google-chrome-has-become-surveillance-software-its-time-switch/

Why You Need to Give Firefox a Chance.
https://dev.to/dtroode/why-you-need-to-give-firefox-a-chance-5g5a

Will Smith Invests in App that Helps Teens with Financial Literacy.
https://www.blackenterprise.com/will-smith-invests-in-app-that-helps-teens-with-financial-literacy/

EOF

FreeBSD Enterprise 1 PB Storage

Today the FreeBSD operating system turns 26 years old. 19 June is International FreeBSD Day. This is why I got something special today :). How about using FreeBSD as an Enterprise Storage solution on real hardware? This is where FreeBSD shines with all its storage features, ZFS included.

Today I will show you how I have built a so-called Enterprise Storage based on the FreeBSD system with more than 1 PB (Petabyte) of raw capacity.

I have built various storage related systems based on FreeBSD:

This project is different. How much storage space can you squeeze from a single 4U system? It turns out a lot! Definitely more than 1 PB (1024 TB) of raw storage space.

Here is the (non clickable) Table of Contents.

  • Hardware
  • Management Interface
  • BIOS/UEFI
  • FreeBSD System
    • Disks Preparation
    • ZFS Pool Configuration
    • ZFS Settings
    • Network Configuration
    • FreeBSD Configuration
  • Purpose
  • Performance
    • Network Performance
    • Disk Subsystem Performance
  • FreeNAS
  • UPDATE 1 – BSD Now 305
  • UPDATE 2 – Real Life Pictures in Data Center

Hardware

There are 4U servers with 90-100 3.5″ drive slots which will allow you to pack 1260-1400 Terabytes of data (with 14 TB drives). Examples of such systems are:

I will use the first one – the TYAN FA100 for short.

While both the GlusterFS and Minio clusters were done on virtual hardware (or even FreeBSD Jails containers), this one uses real physical hardware.

The build has the following specifications.

 2 x 10-Core Intel Xeon Silver 4114 CPU @ 2.20GHz
 4 x 32 GB RAM DDR4 (128 GB Total)
 2 x Intel SSD DC S3500 240 GB (System)
90 x Toshiba HDD MG07ACA12TE 12 TB (Data)
 2 x Broadcom SAS3008 Controller
 2 x Intel X710 DA-2 10GE Card
 2 x Power Supply

Price of the whole system is about $65 000 – drives included. Here is how it looks.

tyan-fa100-small.jpg

One thing that you will need is a rack cabinet that is 1200 mm long to fit that monster 🙂

Management Interface

The so-called Lights Out management interface is really nice. It is not bloated, it is well organized and it works quite fast. You can create several separate user accounts or connect to external user services like LDAP/AD/RADIUS, for example.

n01.png

After logging in a simple Dashboard welcomes us.

n02.png

We have access to various Sensor information, including the temperatures of system components.

n03

We have System Inventory information with installed hardware.

n04.png

There is a separate Settings menu for various setup options.

n05.png

I know it is 2019, but an HTML5-only Remote Control (remote console) without the need for any third party plugins like Java/Silverlight/Flash/… is very welcome. It works very well too.

n06.png

n07.png

One is of course allowed to power on/off/cycle the box remotely.

n08.png

The Maintenance menu for BIOS updates.

n09.png

BIOS/UEFI

After booting into the BIOS/UEFI setup it is possible to select which drives to boot from. The screenshot shows the two SSD drives prepared for the system.

nas01.png

The BIOS/UEFI interface shows two Enclosures, but these are the two Broadcom SAS3008 controllers. Some drives are attached via the first Broadcom SAS3008 controller, the rest are attached via the second one, and for some reason they are called Enclosures instead of controllers.

nas05.png

FreeBSD System

I have chosen the latest FreeBSD 12.0-RELEASE for this installation. It is generally a very ‘default’ installation with a ZFS mirror on two SSD disks. Nothing special.

The installation of course supports the ZFS Boot Environments bulletproof upgrades/changes feature.

# zpool list zroot
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot   220G  3.75G   216G        -         -     0%     1%  1.00x  ONLINE  -

# zpool status zroot
  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da91p4  ONLINE       0     0     0
            da11p4  ONLINE       0     0     0

errors: No known data errors

# df -g
Filesystem              1G-blocks Used  Avail Capacity  Mounted on
zroot/ROOT/default            211    2    209     1%    /
devfs                           0    0      0   100%    /dev
zroot/tmp                     209    0    209     0%    /tmp
zroot/usr/home                209    0    209     0%    /usr/home
zroot/usr/ports               210    0    209     0%    /usr/ports
zroot/usr/src                 210    0    209     0%    /usr/src
zroot/var/audit               209    0    209     0%    /var/audit
zroot/var/crash               209    0    209     0%    /var/crash
zroot/var/log                 209    0    209     0%    /var/log
zroot/var/mail                209    0    209     0%    /var/mail
zroot/var/tmp                 209    0    209     0%    /var/tmp

# beadm list
BE      Active Mountpoint  Space Created
default NR     /            2.4G 2019-05-24 13:24

Disks Preparation

From all the possible setups with 90 disks of 12 TB capacity I have chosen to go the RAID60 way – its ZFS equivalent of course. With 12 disks in each RAID6 (raidz2) group there will be 7 such groups, so 84 disks are used for the ZFS pool and 6 drives are left as SPARE disks, which plays well for me. The disk distribution will look more or less like this.

DISKS  CONTENT
   12  raidz2-0
   12  raidz2-1
   12  raidz2-2
   12  raidz2-3
   12  raidz2-4
   12  raidz2-5
   12  raidz2-6
    6  spares
   90  TOTAL
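
The layout arithmetic can be double-checked with a few lines of sh(1). This is only a sketch: the variable names are made up for illustration, and the figures are decimal TB before any ZFS overhead.

```sh
#!/bin/sh
# Layout from the table above; variable names are illustrative only.
vdevs=7        # raidz2 groups
width=12       # disks per raidz2 group
parity=2       # raidz2 keeps two disks' worth of parity per group
spares=6       # hot spares
disk_tb=12     # decimal TB per disk

pool_disks=$(( vdevs * width ))
total_disks=$(( pool_disks + spares ))
data_disks=$(( vdevs * (width - parity) ))

echo "disks in pool:  ${pool_disks}"
echo "disks total:    ${total_disks}"
echo "raw capacity:   $(( total_disks * disk_tb )) TB"
echo "data capacity:  $(( data_disks * disk_tb )) TB (before ZFS overhead)"
```

With these numbers the chassis holds 1080 TB raw, of which 840 TB is left for data once parity and spares are subtracted.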

Here is how the FreeBSD system sees these drives with the camcontrol(8) command, sorted by the attached SAS controller – scbus(4).

# camcontrol devlist | sort -k 6
(AHCI SGPIO Enclosure 1.00 0001)   at scbus2 target 0 lun 0 (pass0,ses0)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 50 lun 0 (pass1,da0)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 52 lun 0 (pass2,da1)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 54 lun 0 (pass3,da2)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 56 lun 0 (pass5,da4)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 57 lun 0 (pass6,da5)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 59 lun 0 (pass7,da6)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 60 lun 0 (pass8,da7)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 66 lun 0 (pass9,da8)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 67 lun 0 (pass10,da9)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 74 lun 0 (pass11,da10)
(ATA INTEL SSDSC2KB24 0100)        at scbus3 target 75 lun 0 (pass12,da11)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 76 lun 0 (pass13,da12)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 82 lun 0 (pass14,da13)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 83 lun 0 (pass15,da14)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 85 lun 0 (pass16,da15)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 87 lun 0 (pass17,da16)
(Tyan B7118 0500)                  at scbus3 target 88 lun 0 (pass18,ses1)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 89 lun 0 (pass19,da17)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 90 lun 0 (pass20,da18)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 91 lun 0 (pass21,da19)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 92 lun 0 (pass22,da20)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 93 lun 0 (pass23,da21)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 94 lun 0 (pass24,da22)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 95 lun 0 (pass25,da23)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 96 lun 0 (pass26,da24)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 97 lun 0 (pass27,da25)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 98 lun 0 (pass28,da26)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 99 lun 0 (pass29,da27)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 100 lun 0 (pass30,da28)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 101 lun 0 (pass31,da29)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 102 lun 0 (pass32,da30)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 103 lun 0 (pass33,da31)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 104 lun 0 (pass34,da32)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 105 lun 0 (pass35,da33)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 106 lun 0 (pass36,da34)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 107 lun 0 (pass37,da35)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 108 lun 0 (pass38,da36)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 109 lun 0 (pass39,da37)
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 110 lun 0 (pass40,da38)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 48 lun 0 (pass41,da39)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 49 lun 0 (pass42,da40)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 51 lun 0 (pass43,da41)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 53 lun 0 (pass44,da42)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 55 lun 0 (da43,pass45)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 59 lun 0 (pass46,da44)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 64 lun 0 (pass47,da45)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 67 lun 0 (pass48,da46)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 68 lun 0 (pass49,da47)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 69 lun 0 (pass50,da48)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 73 lun 0 (pass51,da49)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 76 lun 0 (pass52,da50)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 77 lun 0 (pass53,da51)
(Tyan B7118 0500)                  at scbus4 target 80 lun 0 (pass54,ses2)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 81 lun 0 (pass55,da52)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 82 lun 0 (pass56,da53)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 83 lun 0 (pass57,da54)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 84 lun 0 (pass58,da55)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 85 lun 0 (pass59,da56)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 86 lun 0 (pass60,da57)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 87 lun 0 (pass61,da58)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 88 lun 0 (pass62,da59)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 89 lun 0 (da63,pass66)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 90 lun 0 (pass64,da61)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 91 lun 0 (pass65,da62)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 92 lun 0 (da60,pass63)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 94 lun 0 (pass67,da64)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 97 lun 0 (pass68,da65)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 98 lun 0 (pass69,da66)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 99 lun 0 (pass70,da67)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 100 lun 0 (pass71,da68)
(Tyan B7118 0500)                  at scbus4 target 101 lun 0 (pass72,ses3)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 102 lun 0 (pass73,da69)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 103 lun 0 (pass74,da70)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 104 lun 0 (pass75,da71)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 105 lun 0 (pass76,da72)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 106 lun 0 (pass77,da73)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 107 lun 0 (pass78,da74)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 108 lun 0 (pass79,da75)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 109 lun 0 (pass80,da76)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 110 lun 0 (pass81,da77)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 111 lun 0 (pass82,da78)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 112 lun 0 (pass83,da79)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 113 lun 0 (pass84,da80)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 114 lun 0 (pass85,da81)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 115 lun 0 (pass86,da82)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 116 lun 0 (pass87,da83)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 117 lun 0 (pass88,da84)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 118 lun 0 (pass89,da85)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 119 lun 0 (pass90,da86)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 120 lun 0 (pass91,da87)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 121 lun 0 (pass92,da88)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 122 lun 0 (pass93,da89)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 123 lun 0 (pass94,da90)
(ATA INTEL SSDSC2KB24 0100)        at scbus4 target 124 lun 0 (pass95,da91)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 125 lun 0 (da3,pass4)

One may ask how to identify which disk is which when a FAILURE comes … this is where FreeBSD’s sesutil(8) command comes in handy.

# sesutil locate all off
# sesutil locate da64 on

The first sesutil(8) command disables all location lights in the enclosure. The second one turns on the identification for disk da64.

I will also make sure NOT to use the whole space of each drive. Such an idea may seem pointless, but imagine the following situation. Five 12 TB disks fail after 3 years. You cannot get the same model of drive, so you get other 12 TB drives, maybe even from another manufacturer.

# grep da64 /var/run/dmesg.boot
da64 at mpr1 bus 0 scbus4 target 93 lun 0
da64:  Fixed Direct Access SPC-4 SCSI device
da64: Serial Number 98G0A1EQF95G
da64: 1200.000MB/s transfers
da64: Command Queueing enabled
da64: 11444224MB (23437770752 512 byte sectors)

A single 12 TB drive has 23437770752 sectors of 512 bytes each, which equals 12000138625024 bytes of raw capacity.

# expr 23437770752 \* 512
12000138625024

Now imagine that these other 12 TB drives from another manufacturer come with a size that is 4 bytes smaller … ZFS will not allow their usage because their size is smaller.

This is why I will use exactly 11175 GB of each drive, which is more or less 1 GB short of its total size of 11176 GB.
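
These numbers can be verified with shell arithmetic. A small sketch, assuming 64-bit shell arithmetic and that the G suffix used with gpart(8) means 2^30 bytes; the variable names are illustrative.

```sh
#!/bin/sh
sectors=23437770752      # from dmesg: 512 byte sectors on one drive
sector_size=512

bytes=$(( sectors * sector_size ))
gib=$(( bytes / 1024 / 1024 / 1024 ))         # whole drive in GiB
part_gib=$(( gib - 1 ))                       # leave about 1 GiB unused
part_sectors=$(( part_gib * 1024 * 1024 * 1024 / sector_size ))

echo "drive bytes:       ${bytes}"
echo "drive GiB:         ${gib}"
echo "partition GiB:     ${part_gib}"
echo "partition sectors: ${part_sectors}"
```

The last figure is the size in sectors that gpart show should report for a partition created with -s 11175G.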

Below is the command that will do that for me for all 90 disks.

# camcontrol devlist \
    | grep TOSHIBA \
    | grep -o 'da[0-9][0-9]*' \
    | while read DISK
      do
        gpart destroy -F                   ${DISK} 1> /dev/null 2> /dev/null
        gpart create -s GPT                ${DISK}
        gpart add -t freebsd-zfs -s 11175G ${DISK}
      done

# gpart show da64
=>         40  23437770672  da64  GPT  (11T)
           40  23435673600     1  freebsd-zfs  (11T)
  23435673640      2097072        - free -  (1.0G)
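
Since the loop above destroys partition tables, it may be worth previewing what it would do first. The sketch below is a dry run: a hypothetical helper, disk_names, pulls the daN names out of camcontrol devlist output regardless of whether the pass or the da device is listed first, and the loop only prints the gpart commands. The sample lines are copied from the listing above; on the real system one would pipe camcontrol devlist into the function instead.

```sh
#!/bin/sh
# Print the daN name of every TOSHIBA disk found on stdin.
# Handles both "(pass1,da0)" and "(da43,pass45)" orderings.
disk_names() {
    grep TOSHIBA \
        | sed -n 's/.*[(,]\(da[0-9][0-9]*\)[,)].*/\1/p'
}

# Sample input; on the real system use: camcontrol devlist
sample() {
cat <<'EOF'
(ATA TOSHIBA MG07ACA1 0101)        at scbus3 target 50 lun 0 (pass1,da0)
(ATA INTEL SSDSC2KB24 0100)        at scbus3 target 75 lun 0 (pass12,da11)
(Tyan B7118 0500)                  at scbus3 target 88 lun 0 (pass18,ses1)
(ATA TOSHIBA MG07ACA1 0101)        at scbus4 target 55 lun 0 (da43,pass45)
EOF
}

# Dry run: echo the commands instead of executing them.
sample | disk_names | while read -r DISK
do
    echo "gpart destroy -F ${DISK}"
    echo "gpart create -s GPT ${DISK}"
    echo "gpart add -t freebsd-zfs -s 11175G ${DISK}"
done
```

The system SSD and the enclosure devices are skipped because only TOSHIBA lines pass the filter.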


ZFS Pool Configuration

Next, we will have to create our ZFS pool. It is probably the longest zpool command I have ever executed 🙂

As the Toshiba 12 TB disks have 4k sectors we will need to set vfs.zfs.min_auto_ashift to 12 so that the pool is created with 4k blocks (ashift=12).

# sysctl vfs.zfs.min_auto_ashift=12
vfs.zfs.min_auto_ashift: 12 -> 12
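
The value 12 is simply the base-2 logarithm of the sector size, as 2^12 = 4096. A tiny sketch (the function name is made up for illustration) that derives the ashift for a given power-of-two sector size:

```sh
#!/bin/sh
# Print the ashift (log2) for a power-of-two sector size in bytes.
ashift_for() {
    size=$1
    shift_bits=0
    while [ "$size" -gt 1 ]
    do
        size=$(( size / 2 ))
        shift_bits=$(( shift_bits + 1 ))
    done
    echo "$shift_bits"
}

echo "512  byte sectors -> ashift $(ashift_for 512)"
echo "4096 byte sectors -> ashift $(ashift_for 4096)"
```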

# zpool create nas02 \
    raidz2  da0p1  da1p1  da2p1  da3p1  da4p1  da5p1  da6p1  da7p1  da8p1  da9p1 da10p1 da12p1 \
    raidz2 da13p1 da14p1 da15p1 da16p1 da17p1 da18p1 da19p1 da20p1 da21p1 da22p1 da23p1 da24p1 \
    raidz2 da25p1 da26p1 da27p1 da28p1 da29p1 da30p1 da31p1 da32p1 da33p1 da34p1 da35p1 da36p1 \
    raidz2 da37p1 da38p1 da39p1 da40p1 da41p1 da42p1 da43p1 da44p1 da45p1 da46p1 da47p1 da48p1 \
    raidz2 da49p1 da50p1 da51p1 da52p1 da53p1 da54p1 da55p1 da56p1 da57p1 da58p1 da59p1 da60p1 \
    raidz2 da61p1 da62p1 da63p1 da64p1 da65p1 da66p1 da67p1 da68p1 da69p1 da70p1 da71p1 da72p1 \
    raidz2 da73p1 da74p1 da75p1 da76p1 da77p1 da78p1 da79p1 da80p1 da81p1 da82p1 da83p1 da84p1 \
    spare  da85p1 da86p1 da87p1 da88p1 da89p1 da90p1

# zpool status
  pool: nas02
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:05 with 0 errors on Fri May 31 10:26:29 2019
config:

        NAME        STATE     READ WRITE CKSUM
        nas02       ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            da0p1   ONLINE       0     0     0
            da1p1   ONLINE       0     0     0
            da2p1   ONLINE       0     0     0
            da3p1   ONLINE       0     0     0
            da4p1   ONLINE       0     0     0
            da5p1   ONLINE       0     0     0
            da6p1   ONLINE       0     0     0
            da7p1   ONLINE       0     0     0
            da8p1   ONLINE       0     0     0
            da9p1   ONLINE       0     0     0
            da10p1  ONLINE       0     0     0
            da12p1  ONLINE       0     0     0
          raidz2-1  ONLINE       0     0     0
            da13p1  ONLINE       0     0     0
            da14p1  ONLINE       0     0     0
            da15p1  ONLINE       0     0     0
            da16p1  ONLINE       0     0     0
            da17p1  ONLINE       0     0     0
            da18p1  ONLINE       0     0     0
            da19p1  ONLINE       0     0     0
            da20p1  ONLINE       0     0     0
            da21p1  ONLINE       0     0     0
            da22p1  ONLINE       0     0     0
            da23p1  ONLINE       0     0     0
            da24p1  ONLINE       0     0     0
          raidz2-2  ONLINE       0     0     0
            da25p1  ONLINE       0     0     0
            da26p1  ONLINE       0     0     0
            da27p1  ONLINE       0     0     0
            da28p1  ONLINE       0     0     0
            da29p1  ONLINE       0     0     0
            da30p1  ONLINE       0     0     0
            da31p1  ONLINE       0     0     0
            da32p1  ONLINE       0     0     0
            da33p1  ONLINE       0     0     0
            da34p1  ONLINE       0     0     0
            da35p1  ONLINE       0     0     0
            da36p1  ONLINE       0     0     0
          raidz2-3  ONLINE       0     0     0
            da37p1  ONLINE       0     0     0
            da38p1  ONLINE       0     0     0
            da39p1  ONLINE       0     0     0
            da40p1  ONLINE       0     0     0
            da41p1  ONLINE       0     0     0
            da42p1  ONLINE       0     0     0
            da43p1  ONLINE       0     0     0
            da44p1  ONLINE       0     0     0
            da45p1  ONLINE       0     0     0
            da46p1  ONLINE       0     0     0
            da47p1  ONLINE       0     0     0
            da48p1  ONLINE       0     0     0
          raidz2-4  ONLINE       0     0     0
            da49p1  ONLINE       0     0     0
            da50p1  ONLINE       0     0     0
            da51p1  ONLINE       0     0     0
            da52p1  ONLINE       0     0     0
            da53p1  ONLINE       0     0     0
            da54p1  ONLINE       0     0     0
            da55p1  ONLINE       0     0     0
            da56p1  ONLINE       0     0     0
            da57p1  ONLINE       0     0     0
            da58p1  ONLINE       0     0     0
            da59p1  ONLINE       0     0     0
            da60p1  ONLINE       0     0     0
          raidz2-5  ONLINE       0     0     0
            da61p1  ONLINE       0     0     0
            da62p1  ONLINE       0     0     0
            da63p1  ONLINE       0     0     0
            da64p1  ONLINE       0     0     0
            da65p1  ONLINE       0     0     0
            da66p1  ONLINE       0     0     0
            da67p1  ONLINE       0     0     0
            da68p1  ONLINE       0     0     0
            da69p1  ONLINE       0     0     0
            da70p1  ONLINE       0     0     0
            da71p1  ONLINE       0     0     0
            da72p1  ONLINE       0     0     0
          raidz2-6  ONLINE       0     0     0
            da73p1  ONLINE       0     0     0
            da74p1  ONLINE       0     0     0
            da75p1  ONLINE       0     0     0
            da76p1  ONLINE       0     0     0
            da77p1  ONLINE       0     0     0
            da78p1  ONLINE       0     0     0
            da79p1  ONLINE       0     0     0
            da80p1  ONLINE       0     0     0
            da81p1  ONLINE       0     0     0
            da82p1  ONLINE       0     0     0
            da83p1  ONLINE       0     0     0
            da84p1  ONLINE       0     0     0
        spares
          da85p1    AVAIL
          da86p1    AVAIL
          da87p1    AVAIL
          da88p1    AVAIL
          da89p1    AVAIL
          da90p1    AVAIL

errors: No known data errors

# zpool list nas02
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
nas02   915T  1.42M   915T        -         -     0%     0%  1.00x  ONLINE  -

# zfs list nas02
NAME    USED  AVAIL  REFER  MOUNTPOINT
nas02    88K   675T   201K  none
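
The 915T reported by zpool list is close to what the partition sizes predict. A rough sketch, assuming the 11175 GiB partitions created earlier. Note that zpool list counts parity space as well, while zfs list shows less because ZFS estimates the raidz parity and padding overhead and keeps some space in reserve.

```sh
#!/bin/sh
part_gib=11175     # size of each freebsd-zfs partition in GiB
pool_disks=84      # 7 x 12 disks in raidz2 groups (spares excluded)
data_disks=70      # 7 x 10 after subtracting two parity disks per group

pool_tib=$(( pool_disks * part_gib / 1024 ))
data_tib=$(( data_disks * part_gib / 1024 ))

echo "pool members total: about ${pool_tib} TiB"
echo "data disks only:    about ${data_tib} TiB (upper bound)"
```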

ZFS Settings

As the primary role of this storage would be keeping files I will use one of the largest values for recordsize – 1 MB – which helps to get a better compression ratio.

… but it will also serve as an iSCSI Target, in which we will try to fit the native 4k blocks – thus the 4096 bytes setting for iSCSI.

# zfs set compression=lz4         nas02
# zfs set atime=off               nas02
# zfs set mountpoint=none         nas02
# zfs set recordsize=1m           nas02
# zfs set redundant_metadata=most nas02
# zfs create                      nas02/nfs
# zfs create                      nas02/smb
# zfs create                      nas02/iscsi
# zfs set recordsize=4k           nas02/iscsi

Also one word on redundant_metadata, as it is not that obvious a parameter. To quote the zfs(8) man page.

# man zfs
(...)
redundant_metadata=all | most
  Controls what types of metadata are stored redundantly.  ZFS stores
  an extra copy of metadata, so that if a single block is corrupted,
  the amount of user data lost is limited.  This extra copy is in
  addition to any redundancy provided at the pool level (e.g. by
  mirroring or RAID-Z), and is in addition to an extra copy specified
  by the copies property (up to a total of 3 copies).  For example if
  the pool is mirrored, copies=2, and redundant_metadata=most, then ZFS
  stores 6 copies of most metadata, and 4 copies of data and some
  metadata.

  When set to all, ZFS stores an extra copy of all metadata.  If a
  single on-disk block is corrupt, at worst a single block of user data
  (which is recordsize bytes long can be lost.)

  When set to most, ZFS stores an extra copy of most types of metadata.
  This can improve performance of random writes, because less metadata
  must be written.  In practice, at worst about 100 blocks (of
  recordsize bytes each) of user data can be lost if a single on-disk
  block is corrupt.  The exact behavior of which metadata blocks are
  stored redundantly may change in future releases.

  The default value is all.
(...)

From the description above we can see that the additional metadata copies are mostly useful on single device pools, because when we have redundancy based on RAIDZ2 (RAID6 equivalent) we do not need to keep additional redundant copies of all metadata. Setting it to most helps to increase write performance.

For the record – iSCSI ZFS zvols are created with a command like the one below, as sparse volumes, also called Thin Provisioning mode.

# zfs create -s -V 16T nas02/iscsi/test

As we have SPARE disks we will also need to enable the zfsd(8) daemon by adding zfsd_enable=YES to the /etc/rc.conf file.

We also need to enable the autoreplace property for our pool because by default it is set to off.

# zpool get autoreplace nas02
NAME   PROPERTY     VALUE    SOURCE
nas02  autoreplace  off      default

# zpool set autoreplace=on nas02

# zpool get autoreplace nas02
NAME   PROPERTY     VALUE    SOURCE
nas02  autoreplace  on       local

Other ZFS settings are in the /boot/loader.conf file. As this system has 128 GB RAM we will let ZFS use 50 to 75% of that amount for ARC.

# grep vfs.zfs /boot/loader.conf
  vfs.zfs.prefetch_disable=1
  vfs.zfs.cache_flush_disable=1
  vfs.zfs.vdev.cache.size=16M
  vfs.zfs.arc_min=64G
  vfs.zfs.arc_max=96G
  vfs.zfs.deadman_enabled=0
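
The two ARC limits follow directly from the RAM size. A minimal sketch with a made-up helper name, assuming whole gigabytes:

```sh
#!/bin/sh
# Print loader.conf lines that bound the ARC to 50-75% of RAM.
arc_limits() {
    ram_gb=$1
    echo "vfs.zfs.arc_min=$(( ram_gb * 50 / 100 ))G"
    echo "vfs.zfs.arc_max=$(( ram_gb * 75 / 100 ))G"
}

arc_limits 128
```

For 128 GB of RAM this prints the same arc_min and arc_max values as used above.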

Network Configuration

This is what I really like about FreeBSD. To set up LACP link aggregation you just need 5 lines in the /etc/rc.conf file. On Red Hat Enterprise Linux you would need several files with many lines each.

# head -5 /etc/rc.conf
  defaultrouter="10.20.30.254"
  ifconfig_ixl0="up"
  ifconfig_ixl1="up"
  cloned_interfaces="lagg0"
  ifconfig_lagg0="laggproto lacp laggport ixl0 laggport ixl1 10.20.30.2/24 up"

# ifconfig lagg0
lagg0: flags=8843 metric 0 mtu 1500
        options=e507bb
        ether a0:42:3f:a0:42:3f
        inet 10.20.30.2 netmask 0xffffff00 broadcast 10.20.30.255
        laggproto lacp lagghash l2,l3,l4
        laggport: ixl0 flags=1c
        laggport: ixl1 flags=1c
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=29
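
For more than two ports the same lines can be generated rather than typed. The helper below is a hypothetical sketch, not a FreeBSD tool: it only prints rc.conf lines for a given lagg name, address and list of member ports, in the same form as shown above.

```sh
#!/bin/sh
# usage: lagg_rc_conf LAGG ADDRESS PORT...
lagg_rc_conf() {
    lagg=$1
    addr=$2
    shift 2
    ports=""
    for port in "$@"
    do
        # every member port only needs to be brought up
        echo "ifconfig_${port}=\"up\""
        ports="${ports} laggport ${port}"
    done
    echo "cloned_interfaces=\"${lagg}\""
    echo "ifconfig_${lagg}=\"laggproto lacp${ports} ${addr} up\""
}

lagg_rc_conf lagg0 10.20.30.2/24 ixl0 ixl1
```

Called with the two ixl(4) ports it reproduces the interface lines from /etc/rc.conf above; the defaultrouter line is left out on purpose.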

The Intel X710 DA-2 10GE network adapter is fully supported under FreeBSD by the ixl(4) driver.

intel-x710-da-2.jpg

Cisco Nexus Configuration

This is the Cisco Nexus configuration needed to enable LACP aggregation.

First the ports.

NEXUS-1  Eth1/32  NAS02_IXL0  connected 3  full  a-10G  SFP-H10GB-A
NEXUS-2  Eth1/32  NAS02_IXL1  connected 3  full  a-10G  SFP-H10GB-A

… and now aggregation.

interface Ethernet1/32
  description NAS02_IXL1
  switchport
  switchport access vlan 3
  mtu 9216
  channel-group 128 mode active
  no shutdown
!
interface port-channel128
  description NAS02
  switchport
  switchport access vlan 3
  mtu 9216
  vpc 128

… and the same/similar on the second Cisco Nexus NEXUS-2 switch.

FreeBSD Configuration

These are the three most important configuration files on any FreeBSD system: /etc/rc.conf, /boot/loader.conf and /etc/sysctl.conf.

I will now post all settings I use on this storage system.

The /etc/rc.conf file.

# cat /etc/rc.conf
# NETWORK
  hostname="nas02.local"
  defaultrouter="10.20.30.254"
  ifconfig_ixl0="up"
  ifconfig_ixl1="up"
  cloned_interfaces="lagg0"
  ifconfig_lagg0="laggproto lacp laggport ixl0 laggport ixl1 10.20.30.2/24 up"

# KERNEL MODULES
  kld_list="${kld_list} aesni"

# DAEMON | YES
  zfs_enable=YES
  zfsd_enable=YES
  sshd_enable=YES
  ctld_enable=YES
  powerd_enable=YES

# DAEMON | NFS SERVER
  nfs_server_enable=YES
  nfs_client_enable=YES
  rpc_lockd_enable=YES
  rpc_statd_enable=YES
  rpcbind_enable=YES
  mountd_enable=YES
  mountd_flags="-r"

# OTHER
  dumpdev=NO

The /boot/loader.conf file.

# cat /boot/loader.conf
# BOOT OPTIONS
  autoboot_delay=3
  kern.geom.label.disk_ident.enable=0
  kern.geom.label.gptid.enable=0

# DISABLE INTEL HT
  machdep.hyperthreading_allowed=0

# UPDATE INTEL CPU MICROCODE AT BOOT BEFORE KERNEL IS LOADED
  cpu_microcode_load=YES
  cpu_microcode_name=/boot/firmware/intel-ucode.bin

# MODULES
  zfs_load=YES
  aio_load=YES

# RACCT/RCTL RESOURCE LIMITS
  kern.racct.enable=1

# DISABLE MEMORY TEST @ BOOT
  hw.memtest.tests=0

# PIPE KVA LIMIT | 320 MB
  kern.ipc.maxpipekva=335544320

# IPC
  kern.ipc.shmseg=1024
  kern.ipc.shmmni=1024
  kern.ipc.shmseg=1024
  kern.ipc.semmns=512
  kern.ipc.semmnu=256
  kern.ipc.semume=256
  kern.ipc.semopm=256
  kern.ipc.semmsl=512

# LARGE PAGE MAPPINGS
  vm.pmap.pg_ps_enabled=1

# ZFS TUNING
  vfs.zfs.prefetch_disable=1
  vfs.zfs.cache_flush_disable=1
  vfs.zfs.vdev.cache.size=16M
  vfs.zfs.arc_min=64G
  vfs.zfs.arc_max=96G

# ZFS DISABLE PANIC ON STALE I/O
  vfs.zfs.deadman_enabled=0

# NEWCONS SUSPEND
  kern.vt.suspendswitch=0

The /etc/sysctl.conf file.

# cat /etc/sysctl.conf
# ZFS ASHIFT
  vfs.zfs.min_auto_ashift=12

# SECURITY
  security.bsd.stack_guard_page=1

# SECURITY INTEL MDS (MICROARCHITECTURAL DATA SAMPLING) MITIGATION
  hw.mds_disable=3

# DISABLE ANNOYING THINGS
  kern.coredump=0
  hw.syscons.bell=0

# IPC
  kern.ipc.shmmax=4294967296
  kern.ipc.shmall=2097152
  kern.ipc.somaxconn=4096
  kern.ipc.maxsockbuf=5242880
  kern.ipc.shm_allow_removed=1

# NETWORK
  kern.ipc.maxsockbuf=16777216
  kern.ipc.soacceptqueue=1024
  net.inet.tcp.recvbuf_max=8388608
  net.inet.tcp.sendbuf_max=8388608
  net.inet.tcp.mssdflt=1460
  net.inet.tcp.minmss=1300
  net.inet.tcp.syncache.rexmtlimit=0
  net.inet.tcp.syncookies=0
  net.inet.tcp.tso=0
  net.inet.ip.process_options=0
  net.inet.ip.random_id=1
  net.inet.ip.redirect=0
  net.inet.icmp.drop_redirect=1
  net.inet.tcp.always_keepalive=0
  net.inet.tcp.drop_synfin=1
  net.inet.tcp.fast_finwait2_recycle=1
  net.inet.tcp.icmp_may_rst=0
  net.inet.tcp.msl=8192
  net.inet.tcp.path_mtu_discovery=0
  net.inet.udp.blackhole=1
  net.inet.tcp.blackhole=2
  net.inet.tcp.hostcache.expire=7200
  net.inet.tcp.delacktime=20

Purpose

Why would one build such an appliance? Because it is a lot cheaper than getting the ‘branded’ one. Think about Dell EMC Data Domain for example – and not just ‘any’ Data Domain but almost the highest one – the Data Domain DD9300 at least. It would cost at least about ten times more … with smaller capacity and taking not 4U but closer to 14U with three DS60 expanders.

But you can actually make this FreeBSD Enterprise Storage behave like Dell EMC Data Domain … or like their Dell EMC Elastic Cloud Storage for example.

The Dell EMC CloudBoost can be deployed somewhere on your VMware stack to provide the DDBoost deduplication. Then you would need OpenStack Swift as it is one of the supported backend devices.

emc-cloudboost-swift-cover.png

emc-cloudboost-swift-support.png

The OpenStack Swift package in FreeBSD is about 4-5 years behind reality (2.2.2) so you will have to use Bhyve here.

# pkg search swift
(...)
py27-swift-2.2.2_1             Highly available, distributed, eventually consistent object/blob store
(...)

Create a Bhyve virtual machine on this FreeBSD Enterprise Storage with a CentOS 7.6 system for example, then set up Swift there, and it will work. With 20 physical cores to spare and 128 GB RAM you would not even notice it is there.

This way you can use Dell EMC Networker with more than ten times cheaper storage.

In the past I also wrote about IBM Spectrum Protect (TSM) which would also greatly benefit from FreeBSD Enterprise Storage. I actually also use this FreeBSD based storage as space for IBM Spectrum Protect (TSM) container pool directories. Exported via iSCSI it works like a charm.

You can also compare this FreeBSD Enterprise Storage to other storage appliances like iXsystems TrueNAS or EXAGRID.

Performance

You surely want to know how fast this FreeBSD Enterprise Storage performs 🙂

I will gladly share all the performance data that I gathered.

Network Performance

First the network performance.

I used iperf3 as the benchmark.

I started the server on the FreeBSD side.

# iperf3 -s

… and then I started the client on the Windows Server 2016 machine.

C:\iperf-3.1.3-win64>iperf3.exe -c nas02 -P 8
(...)
[SUM]   0.00-10.00  sec  10.8 GBytes  9.26 Gbits/sec                  receiver
(...)

This is with MTU 1500 – no Jumbo frames unfortunately 😦

Unfortunately this Windows system has only one physical 10GE interface, but I also ran another test using two such boxes, each with a single 10GE interface. That saturated the dual 10GE LACP on the FreeBSD side nicely.

I also exported NFS and iSCSI to a Red Hat Enterprise Linux system. The network performance was about 500-600 MB/s on a single 10GE interface. That would be 1000-1200 MB/s on the LACP aggregation.
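The iperf3 result is reported in Gbit/s while the NFS and iSCSI results are in MB/s, so a small conversion makes them comparable. The sketch below is only an illustration. The function name gbit_to_mbyte is hypothetical and it assumes decimal units (1 Gbit/s = 1000 Mbit/s, 8 bits per byte).

```shell
#!/bin/sh
# Hypothetical helper: convert a rate in Gbit/s to MB/s.
# Assumes decimal units: 1 Gbit/s = 1000 Mbit/s = 125 MB/s.
gbit_to_mbyte() {
  awk -v g="$1" 'BEGIN { printf "%.1f\n", g * 1000 / 8 }'
}

# Example: the 9.26 Gbit/s measured by iperf3 above.
gbit_to_mbyte 9.26   # prints 1157.5
```

By that measure the 500-600 MB/s seen over NFS and iSCSI is roughly half of what a single 10GE link delivered in the iperf3 test.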

Disk Subsystem Performance

Now the disk subsystem.

First a naive test using diskinfo(8), FreeBSD’s built-in tool.

# diskinfo -ctv /dev/da12
/dev/da12
        512             # sectorsize
        12000138625024  # mediasize in bytes (11T)
        23437770752     # mediasize in sectors
        4096            # stripesize
        0               # stripeoffset
        1458933         # Cylinders according to firmware.
        255             # Heads according to firmware.
        63              # Sectors according to firmware.
        ATA TOSHIBA MG07ACA1    # Disk descr.
        98H0A11KF95G    # Disk ident.
        id1,enc@n500e081010445dbd/type@0/slot@c/elmdesc@ArrayDevice11   # Physical path
        No              # TRIM/UNMAP support
        7200            # Rotation rate in RPM
        Not_Zoned       # Zone Mode

I/O command overhead:
        time to read 10MB block      0.067031 sec       =    0.003 msec/sector
        time to read 20480 sectors   2.619989 sec       =    0.128 msec/sector
        calculated command overhead                     =    0.125 msec/sector

Seek times:
        Full stroke:      250 iter in   5.665880 sec =   22.664 msec
        Half stroke:      250 iter in   4.263047 sec =   17.052 msec
        Quarter stroke:   500 iter in   6.867914 sec =   13.736 msec
        Short forward:    400 iter in   3.057913 sec =    7.645 msec
        Short backward:   400 iter in   1.979287 sec =    4.948 msec
        Seq outer:       2048 iter in   0.169472 sec =    0.083 msec
        Seq inner:       2048 iter in   0.469630 sec =    0.229 msec

Transfer rates:
        outside:       102400 kbytes in   0.478251 sec =   214114 kbytes/sec
        middle:        102400 kbytes in   0.605701 sec =   169060 kbytes/sec
        inside:        102400 kbytes in   1.303909 sec =    78533 kbytes/sec

So now we know how fast a single disk is.

Let’s repeat the same test on the ZFS zvol device.

# diskinfo -ctv /dev/zvol/nas02/iscsi/test
/dev/zvol/nas02/iscsi/test
        512             # sectorsize
        17592186044416  # mediasize in bytes (16T)
        34359738368     # mediasize in sectors
        65536           # stripesize
        0               # stripeoffset
        Yes             # TRIM/UNMAP support
        Unknown         # Rotation rate in RPM

I/O command overhead:
        time to read 10MB block      0.004512 sec       =    0.000 msec/sector
        time to read 20480 sectors   0.196824 sec       =    0.010 msec/sector
        calculated command overhead                     =    0.009 msec/sector

Seek times:
        Full stroke:      250 iter in   0.006151 sec =    0.025 msec
        Half stroke:      250 iter in   0.008228 sec =    0.033 msec
        Quarter stroke:   500 iter in   0.014062 sec =    0.028 msec
        Short forward:    400 iter in   0.010564 sec =    0.026 msec
        Short backward:   400 iter in   0.011725 sec =    0.029 msec
        Seq outer:       2048 iter in   0.028198 sec =    0.014 msec
        Seq inner:       2048 iter in   0.028416 sec =    0.014 msec

Transfer rates:
        outside:       102400 kbytes in   0.036938 sec =  2772213 kbytes/sec
        middle:        102400 kbytes in   0.043076 sec =  2377194 kbytes/sec
        inside:        102400 kbytes in   0.034260 sec =  2988908 kbytes/sec

Almost 3 GB/s – not bad.

Time for an even more old-school test – the immortal dd(1) command.

This is with the compression=off setting.

One process.

# dd if=/dev/zero of=FILE bs=128m status=progress
26172456960 bytes (26 GB, 24 GiB) transferred 16.074s, 1628 MB/s
202+0 records in
201+0 records out
26977763328 bytes transferred in 16.660884 secs (1619227644 bytes/sec)

Four concurrent processes.

# dd if=/dev/zero of=FILE${X} bs=128m status=progress
80933289984 bytes (81 GB, 75 GiB) transferred 98.081s, 825 MB/s
608+0 records in
608+0 records out
81604378624 bytes transferred in 98.990579 secs (824365101 bytes/sec)

Eight concurrent processes.

# dd if=/dev/zero of=FILE${X} bs=128m status=progress
174214610944 bytes (174 GB, 162 GiB) transferred 385.042s, 452 MB/s
1302+0 records in
1301+0 records out
174617264128 bytes transferred in 385.379296 secs (453104943 bytes/sec)
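The multi-stream runs above write to FILE${X}, which suggests a loop over a counter. A minimal sketch of how such concurrent writers could be started follows. The run_streams function, its arguments and the use of count= to bound the file size are assumptions made for illustration, not the exact commands used for the results above.

```shell
#!/bin/sh
# Hypothetical helper: start N concurrent dd writers and wait for all of them.
# Arguments: number of streams, number of 1 MiB blocks per stream, target directory.
# The runs above used bs=128m and no count limit; bs=1048576 is used here
# only because it is accepted by both BSD and GNU dd.
run_streams() {
  streams=$1
  count=$2
  dir=$3
  x=1
  while [ "$x" -le "$streams" ]; do
    # Each writer gets its own file and runs in the background.
    dd if=/dev/zero of="$dir/FILE$x" bs=1048576 count="$count" 2>/dev/null &
    x=$((x + 1))
  done
  # Block until every background writer has finished.
  wait
}

# Example (not executed here): four streams of 1024 MiB each in the current directory.
# run_streams 4 1024 .
```

Running the writers in the background and calling wait once keeps all streams active at the same time, which is what distinguishes this from running dd four times in a row.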

Let’s summarize that data.

1 STREAM(s) ~ 1600 MB/s ~ 1.5 GB/s
4 STREAM(s) ~ 3300 MB/s ~ 3.2 GB/s
8 STREAM(s) ~ 3600 MB/s ~ 3.5 GB/s

So the disk subsystem is able to squeeze 3.5 GB/s of sustained speed in sequential writes. That means that if we wanted to saturate it we would need to add two additional 10GE interfaces.

The disks were stressed only to about 55% which you can see in another useful FreeBSD tool – the gstat(8) command.
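The summary figures follow from multiplying the per-process rate reported by dd by the number of concurrent processes, and the number of 10GE links follows from dividing the total by what one link can carry. The sketch below reproduces that arithmetic. The function names are hypothetical, and the per-link figure of 1157 MB/s is an assumption derived from the 9.26 Gbit/s iperf3 result.

```shell
#!/bin/sh
# Hypothetical helpers reproducing the arithmetic behind the summary.

# Aggregate rate in MB/s: number of streams times the per-stream rate.
aggregate_mbs() {
  echo $(( $1 * $2 ))
}

# Smallest whole number of links able to carry total_mbs (rounds up).
links_needed() {
  awk -v t="$1" -v l="$2" 'BEGIN { n = int(t / l); if (n * l < t) n++; print n }'
}

aggregate_mbs 4 825      # four streams at 825 MB/s each, prints 3300
aggregate_mbs 8 452      # eight streams at 452 MB/s each, prints 3616
# Assumed per-link rate: 9.26 Gbit/s from iperf3 is about 1157 MB/s.
links_needed 3616 1157   # prints 4
```

Under those assumptions four links are needed, which is two more than the existing dual 10GE LACP.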

n10.png

Time for more ‘intelligent’ tests. The blogbench test.

First with compression disabled.

# time blogbench -d .
Frequency = 10 secs
Scratch dir = [.]
Spawning 3 writers...
Spawning 1 rewriters...
Spawning 5 commenters...
Spawning 100 readers...
Benchmarking for 30 iterations.
The test will run during 5 minutes.
(...)
Final score for writes:          6476
Final score for reads :        660436

blogbench -d .  280.58s user 4974.41s system 1748% cpu 5:00.54 total

Second with compression set to LZ4.

# time blogbench -d .
Frequency = 10 secs
Scratch dir = [.]
Spawning 3 writers...
Spawning 1 rewriters...
Spawning 5 commenters...
Spawning 100 readers...
Benchmarking for 30 iterations.
The test will run during 5 minutes.
(...)
Final score for writes:          7087
Final score for reads :        733932

blogbench -d .  299.08s user 5415.04s system 1900% cpu 5:00.68 total

Compression did not help much, but it helped.

To have some comparison we will run the same test on the system ZFS pool – two Intel SSD DC S3500 240 GB drives in a mirror, which have the following features.

The Intel SSD DC S3500 240 GB drives:

  • Sequential Read (up to) 500 MB/s
  • Sequential Write (up to) 260 MB/s
  • Random Read (100% Span) 75000 IOPS
  • Random Write (100% Span) 7500 IOPS

# time blogbench -d .
Frequency = 10 secs
Scratch dir = [.]
Spawning 3 writers...
Spawning 1 rewriters...
Spawning 5 commenters...
Spawning 100 readers...
Benchmarking for 30 iterations.
The test will run during 5 minutes.
(...)
Final score for writes:          6109
Final score for reads :        654099

blogbench -d .  278.73s user 5058.75s system 1777% cpu 5:00.30 total

Now the randomio test. It is a multithreaded disk I/O microbenchmark.

The usage is as follows.

usage: randomio filename nr_threads write_fraction_of_io fsync_fraction_of_writes io_size nr_seconds_between_samples

filename                    Filename or device to read/write.
write_fraction_of_io        What fraction of I/O should be writes - for example 0.25 for 25% write.
fsync_fraction_of_writes    What fraction of writes should be fsync'd.
io_size                     How many bytes to read/write (multiple of 512 bytes).
nr_seconds_between_samples  How many seconds to average samples over.

The randomio test with a 4k block.

# zfs create -s -V 1T nas02/iscsi/test
# randomio /dev/zvol/nas02/iscsi/test 8 0.25 1 4096 10
  total |  read:         latency (ms)       |  write:        latency (ms)
   iops |   iops   min    avg    max   sdev |   iops   min    avg    max   sdev
--------+-----------------------------------+----------------------------------
54137.7 |40648.4   0.0    0.1  575.8    2.2 |13489.4   0.0    0.3  405.8    2.6
66248.4 |49641.5   0.0    0.1   19.6    0.3 |16606.9   0.0    0.2   26.4    0.7
66411.0 |49817.2   0.0    0.1   19.7    0.3 |16593.8   0.0    0.2   20.3    0.7
64158.9 |48142.8   0.0    0.1  254.7    0.7 |16016.1   0.0    0.2  130.4    1.0
48454.1 |36390.8   0.0    0.1  542.8    2.7 |12063.3   0.0    0.3  507.5    3.2
66796.1 |50067.4   0.0    0.1   24.1    0.3 |16728.7   0.0    0.2   23.4    0.7
58512.2 |43851.7   0.0    0.1  576.5    1.7 |14660.5   0.0    0.2  307.2    1.7
63195.8 |47341.8   0.0    0.1  261.6    0.9 |15854.1   0.0    0.2  361.1    1.9
67086.0 |50335.6   0.0    0.1   20.4    0.3 |16750.4   0.0    0.2   25.1    0.8
67429.8 |50549.6   0.0    0.1   21.8    0.3 |16880.3   0.0    0.2   20.6    0.7
^C

… and with a 512-byte sector.

# zfs create -s -V 1T nas02/iscsi/test
# randomio /dev/zvol/nas02/iscsi/test 8 0.25 1 512 10
  total |  read:         latency (ms)       |  write:        latency (ms)
   iops |   iops   min    avg    max   sdev |   iops   min    avg    max   sdev
--------+-----------------------------------+----------------------------------
58218.9 |43712.0   0.0    0.1  501.5    2.1 |14506.9   0.0    0.2  272.5    1.6
66325.3 |49703.8   0.0    0.1  352.0    0.9 |16621.4   0.0    0.2  352.0    1.5
68130.5 |51100.8   0.0    0.1   24.6    0.3 |17029.7   0.0    0.2   24.4    0.7
68465.3 |51352.3   0.0    0.1   19.9    0.3 |17112.9   0.0    0.2   23.8    0.7
54903.5 |41249.1   0.0    0.1  399.3    1.9 |13654.4   0.0    0.3  335.8    2.2
61259.8 |45898.7   0.0    0.1  574.6    1.7 |15361.0   0.0    0.2  371.5    1.7
68483.3 |51313.1   0.0    0.1   22.9    0.3 |17170.3   0.0    0.2   26.1    0.7
56713.7 |42524.7   0.0    0.1  373.5    1.8 |14189.1   0.0    0.2  438.5    2.7
68861.4 |51657.0   0.0    0.1   21.0    0.3 |17204.3   0.0    0.2   21.7    0.7
68602.0 |51438.4   0.0    0.1   19.5    0.3 |17163.7   0.0    0.2   23.7    0.7
^C

Both randomio tests were run with compression set to LZ4.
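randomio reports IOPS, which can be turned into an approximate throughput by multiplying by the I/O size. The sketch below does that for the last sample of each run. The function name iops_to_mbs is hypothetical and decimal megabytes (1 MB = 1000000 bytes) are assumed.

```shell
#!/bin/sh
# Hypothetical helper: approximate throughput in MB/s from IOPS and I/O size in bytes.
# Assumes decimal megabytes (1 MB = 1000000 bytes).
iops_to_mbs() {
  awk -v i="$1" -v b="$2" 'BEGIN { printf "%.1f\n", i * b / 1000000 }'
}

iops_to_mbs 67429.8 4096   # last sample of the 4k run, prints 276.2
iops_to_mbs 68602.0 512    # last sample of the 512-byte run, prints 35.1
```

The IOPS figures are nearly identical for both I/O sizes, so the resulting throughput scales almost directly with the I/O size.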

Next is the bonnie++ benchmark. It has been run with compression set to LZ4.

# bonnie++ -d . -u root
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
nas02.local 261368M   139  99 775132  99 589190  99   383  99 1638929  99 12930 2046
Latency             60266us    7030us    7059us   21553us    3844us    5710us
Version  1.97       ------Sequential Create------ --------Random Create--------
nas02.local         -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ 12680  44 +++++ +++ +++++ +++ 30049  99
Latency              2619us      43us     714ms    2748us      28us      58us

… and last but not least the fio benchmark. Also with LZ4 compression enabled.

# fio --randrepeat=1 --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=64
fio-3.13
Starting 1 process
Jobs: 1 (f=1): [m(1)][98.0%][r=38.0MiB/s,w=12.2MiB/s][r=9735,w=3128 IOPS][eta 00m:05s]
test: (groupid=0, jobs=1): err= 0: pid=35368: Tue Jun 18 15:14:44 2019
  read: IOPS=3157, BW=12.3MiB/s (12.9MB/s)(3070MiB/248872msec)
   bw (  KiB/s): min= 9404, max=57732, per=98.72%, avg=12469.84, stdev=3082.99, samples=497
   iops        : min= 2351, max=14433, avg=3117.15, stdev=770.74, samples=497
  write: IOPS=1055, BW=4222KiB/s (4323kB/s)(1026MiB/248872msec)
   bw (  KiB/s): min= 3179, max=18914, per=98.71%, avg=4166.60, stdev=999.23, samples=497
   iops        : min=  794, max= 4728, avg=1041.25, stdev=249.76, samples=497
  cpu          : usr=1.11%, sys=88.64%, ctx=677981, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=12.3MiB/s (12.9MB/s), 12.3MiB/s-12.3MiB/s (12.9MB/s-12.9MB/s), io=3070MiB (3219MB), run=248872-248872msec
  WRITE: bw=4222KiB/s (4323kB/s), 4222KiB/s-4222KiB/s (4323kB/s-4323kB/s), io=1026MiB (1076MB), run=248872-248872msec

Dunno about you, but I am satisfied with the performance 🙂

FreeNAS

Originally I really wanted to use FreeNAS on these boxes and I even installed FreeNAS on them. It ran nicely but … the security part of FreeNAS was not the best.

This is the output of the pkg audit command. Quite scary.

root@freenas[~]# pkg audit -F
Fetching vuln.xml.bz2: 100%  785 KiB 804.3kB/s    00:01
python27-2.7.15 is vulnerable:
Python -- NULL pointer dereference vulnerability
CVE: CVE-2019-5010
WWW: https://vuxml.FreeBSD.org/freebsd/d74371d2-4fee-11e9-a5cd-1df8a848de3d.html

curl-7.62.0 is vulnerable:
curl -- multiple vulnerabilities
CVE: CVE-2019-3823
CVE: CVE-2019-3822
CVE: CVE-2018-16890
WWW: https://vuxml.FreeBSD.org/freebsd/714b033a-2b09-11e9-8bc3-610fd6e6cd05.html

libgcrypt-1.8.2 is vulnerable:
libgcrypt -- side-channel attack vulnerability
CVE: CVE-2018-0495
WWW: https://vuxml.FreeBSD.org/freebsd/9b5162de-6f39-11e8-818e-e8e0b747a45a.html

python36-3.6.5_1 is vulnerable:
Python -- NULL pointer dereference vulnerability
CVE: CVE-2019-5010
WWW: https://vuxml.FreeBSD.org/freebsd/d74371d2-4fee-11e9-a5cd-1df8a848de3d.html

pango-1.42.0 is vulnerable:
pango -- remote DoS vulnerability
CVE: CVE-2018-15120
WWW: https://vuxml.FreeBSD.org/freebsd/5a757a31-f98e-4bd4-8a85-f1c0f3409769.html

py36-requests-2.18.4 is vulnerable:
www/py-requests -- Information disclosure vulnerability
WWW: https://vuxml.FreeBSD.org/freebsd/50ad9a9a-1e28-11e9-98d7-0050562a4d7b.html

libnghttp2-1.31.0 is vulnerable:
nghttp2 -- Denial of service due to NULL pointer dereference
CVE: CVE-2018-1000168
WWW: https://vuxml.FreeBSD.org/freebsd/1fccb25e-8451-438c-a2b9-6a021e4d7a31.html

gnupg-2.2.6 is vulnerable:
gnupg -- unsanitized output (CVE-2018-12020)
CVE: CVE-2017-7526
CVE: CVE-2018-12020
WWW: https://vuxml.FreeBSD.org/freebsd/7da0417f-6b24-11e8-84cc-002590acae31.html

py36-cryptography-2.1.4 is vulnerable:
py-cryptography -- tag forgery vulnerability
CVE: CVE-2018-10903
WWW: https://vuxml.FreeBSD.org/freebsd/9e2d0dcf-9926-11e8-a92d-0050562a4d7b.html

perl5-5.26.1 is vulnerable:
perl -- multiple vulnerabilities
CVE: CVE-2018-6913
CVE: CVE-2018-6798
CVE: CVE-2018-6797
WWW: https://vuxml.FreeBSD.org/freebsd/41c96ffd-29a6-4dcc-9a88-65f5038fa6eb.html

libssh2-1.8.0,3 is vulnerable:
libssh2 -- multiple issues
CVE: CVE-2019-3862
CVE: CVE-2019-3861
CVE: CVE-2019-3860
CVE: CVE-2019-3858
WWW: https://vuxml.FreeBSD.org/freebsd/6e58e1e9-2636-413e-9f84-4c0e21143628.html

git-lite-2.17.0 is vulnerable:
Git -- Fix memory out-of-bounds and remote code execution vulnerabilities (CVE-2018-11233 and CVE-2018-11235)
CVE: CVE-2018-11235
CVE: CVE-2018-11233
WWW: https://vuxml.FreeBSD.org/freebsd/c7a135f4-66a4-11e8-9e63-3085a9a47796.html

gnutls-3.5.18 is vulnerable:
GnuTLS -- double free, invalid pointer access
CVE: CVE-2019-3836
CVE: CVE-2019-3829
WWW: https://vuxml.FreeBSD.org/freebsd/fb30db8f-62af-11e9-b0de-001cc0382b2f.html

13 problem(s) in the installed packages found.

root@freenas[~]# uname -a
FreeBSD freenas.local 11.2-STABLE FreeBSD 11.2-STABLE #0 r325575+95cc58ca2a0(HEAD): Mon May  6 19:08:58 EDT 2019     root@mp20.tn.ixsystems.com:/freenas-releng/freenas/_BE/objs/freenas-releng/freenas/_BE/os/sys/FreeNAS.amd64  amd64

root@freenas[~]# freebsd-version -uk
11.2-STABLE
11.2-STABLE

root@freenas[~]# sockstat -l4
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
root     uwsgi-3.6  4006  3  tcp4   127.0.0.1:9042        *:*
root     uwsgi-3.6  3188  3  tcp4   127.0.0.1:9042        *:*
nobody   mdnsd      3144  4  udp4   *:31417               *:*
nobody   mdnsd      3144  6  udp4   *:5353                *:*
www      nginx      3132  6  tcp4   *:443                 *:*
www      nginx      3132  8  tcp4   *:80                  *:*
root     nginx      3131  6  tcp4   *:443                 *:*
root     nginx      3131  8  tcp4   *:80                  *:*
root     ntpd       2823  21 udp4   *:123                 *:*
root     ntpd       2823  22 udp4   10.49.13.99:123       *:*
root     ntpd       2823  25 udp4   127.0.0.1:123         *:*
root     sshd       2743  5  tcp4   *:22                  *:*
root     syslog-ng  2341  19 udp4   *:1031                *:*
nobody   mdnsd      2134  3  udp4   *:39020               *:*
nobody   mdnsd      2134  5  udp4   *:5353                *:*
root     python3.6  236   22 tcp4   *:6000                *:*


I even tried to get an explanation of why FreeNAS has such outdated and insecure packages in their latest version – FreeNAS 11.2-U3 Vulnerabilities – a thread I started on their forums.

Unfortunately it is their policy, which you can summarize as ‘do not touch/change versions if it is working’ – at least I got this impression.

Because of these security holes I cannot recommend the use of FreeNAS and I moved to the original – the FreeBSD system.

One other interesting note. After I installed FreeBSD I wanted to import the ZFS pool created by FreeNAS. This is what I got after executing the zpool import command.

# zpool import
   pool: nas02_gr06
     id: 1275660523517109367
  state: ONLINE
 status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-EY
 config:

        nas02_gr06  ONLINE
          raidz2-0  ONLINE
            da58p2  ONLINE
            da59p2  ONLINE
            da60p2  ONLINE
            da61p2  ONLINE
            da62p2  ONLINE
            da63p2  ONLINE
            da64p2  ONLINE
            da26p2  ONLINE
            da65p2  ONLINE
            da23p2  ONLINE
            da29p2  ONLINE
            da66p2  ONLINE
            da67p2  ONLINE
            da68p2  ONLINE
        spares
          da69p2

   pool: nas02_gr05
     id: 5642709896812665361
  state: ONLINE
 status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-EY
 config:

        nas02_gr05  ONLINE
          raidz2-0  ONLINE
            da20p2  ONLINE
            da30p2  ONLINE
            da34p2  ONLINE
            da50p2  ONLINE
            da28p2  ONLINE
            da38p2  ONLINE
            da51p2  ONLINE
            da52p2  ONLINE
            da27p2  ONLINE
            da32p2  ONLINE
            da53p2  ONLINE
            da54p2  ONLINE
            da55p2  ONLINE
            da56p2  ONLINE
        spares
          da57p2

   pool: nas02_gr04
     id: 2460983830075205166
  state: ONLINE
 status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-EY
 config:

        nas02_gr04  ONLINE
          raidz2-0  ONLINE
            da44p2  ONLINE
            da37p2  ONLINE
            da18p2  ONLINE
            da36p2  ONLINE
            da45p2  ONLINE
            da19p2  ONLINE
            da22p2  ONLINE
            da33p2  ONLINE
            da35p2  ONLINE
            da21p2  ONLINE
            da31p2  ONLINE
            da47p2  ONLINE
            da48p2  ONLINE
            da49p2  ONLINE
        spares
          da46p2

   pool: nas02_gr03
     id: 4878868173820164207
  state: ONLINE
 status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-EY
 config:

        nas02_gr03  ONLINE
          raidz2-0  ONLINE
            da81p2  ONLINE
            da71p2  ONLINE
            da14p2  ONLINE
            da15p2  ONLINE
            da80p2  ONLINE
            da16p2  ONLINE
            da88p2  ONLINE
            da17p2  ONLINE
            da40p2  ONLINE
            da41p2  ONLINE
            da25p2  ONLINE
            da42p2  ONLINE
            da24p2  ONLINE
            da43p2  ONLINE
        spares
          da39p2

   pool: nas02_gr02
     id: 3299037437134217744
  state: ONLINE
 status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-EY
 config:

        nas02_gr02  ONLINE
          raidz2-0  ONLINE
            da84p2  ONLINE
            da76p2  ONLINE
            da85p2  ONLINE
            da8p2   ONLINE
            da9p2   ONLINE
            da78p2  ONLINE
            da73p2  ONLINE
            da74p2  ONLINE
            da70p2  ONLINE
            da77p2  ONLINE
            da11p2  ONLINE
            da13p2  ONLINE
            da79p2  ONLINE
            da89p2  ONLINE
        spares
          da90p2

   pool: nas02_gr01
     id: 1132383125952900182
  state: ONLINE
 status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-EY
 config:

        nas02_gr01  ONLINE
          raidz2-0  ONLINE
            da91p2  ONLINE
            da75p2  ONLINE
            da0p2   ONLINE
            da82p2  ONLINE
            da1p2   ONLINE
            da83p2  ONLINE
            da2p2   ONLINE
            da3p2   ONLINE
            da4p2   ONLINE
            da5p2   ONLINE
            da86p2  ONLINE
            da6p2   ONLINE
            da7p2   ONLINE
            da72p2  ONLINE
        spares
          da87p2



It seems that FreeNAS does ZFS a little differently and they create a separate pool for every RAIDZ2 target with dedicated spares. Interesting …

UPDATE 1 – BSD Now 305

The FreeBSD Enterprise 1 PB Storage article was featured in the BSD Now 305 – Changing Face of Unix episode.

Thanks for mentioning!

UPDATE 2 – Real Life Pictures in Data Center

Some of you asked for real life pictures of this monster. Below you will find several pics taken at the data center.

Front case with cabling.

tyan-real-01.jpg

Alternate front view.

tyan-real-09.jpg

Back of the case with cabling.

tyan-real-02.jpg

Top view with disks.

tyan-real-03

Alternate top view.

tyan-real-07.jpg

Disks slots zoom.

tyan-real-08.jpg

SSD and HDD disks.

tyan-real-06.jpg

EOF

Valuable News – 2019/06/17

The Valuable News weekly series is dedicated to providing a summary of news, articles and other interesting stuff mostly but not always related to the UNIX or BSD systems. Whenever I stumble upon something worth mentioning on the Internet I just put it here.

Today the amount of information that we get using various information streams is at massive overload. Thus one needs to focus only on what is important without the need to grep(1) the Internet every day. Hence the idea of providing such information in ‘bulk’ as I already do that grep(1).

UNIX

FreeBSD sysctlview 1.3 is Out.
https://twitter.com/alfsiciliano/status/1138583963423952896
https://gitlab.com/alfix/sysctlview

Desktop Neo – Rethinking the Desktop Interface for Productivity.
https://desktopneo.com/

OpenZFS (ZoL) on FreeBSD renamed from sysutils/zol to sysutils/openzfs port.
https://svnweb.freebsd.org/ports?view=revision&revision=503975

Arduino Development Using OpenBSD CLI.
https://playground.arduino.cc/OpenBSD/CLI/

RAMBleed – Reading Bits in Memory w/o Accessing Them.
https://rambleed.com/

FreeBSD psm(4) Driver Enables Touchpads and Trackpads by Default.
https://svnweb.freebsd.org/base?view=revision&revision=348873

Creating FreeBSD Kernel Development Environment with bhyve and Nested Virtualization.
https://hacking-on.systems/index.php?id=1

Mount exFAT Filesystem on OpenBSD.
https://www.romanzolotarev.com/openbsd/exfat.html

New OpenBSD Service – systemd.
https://redmine.ungleich.ch/issues/6751

Installing Accessible OpenBSD Laptop.
https://stsp.name/maurice-laptop.html

FreeBSD Performance Change – Reduces Data Cache Miss Rate by 10-14% and Running Time by 5-7%.
https://svnweb.freebsd.org/base?view=revision&revision=348881

Why Red Hat (RHEL) Deprecated BTRFS.
https://news.ycombinator.com/item?id=14907771

Battle Testing Data Integrity Verification with ZFS/BTRFS/mdadm+dm-integrity Solutions.
http://www.unixsheikh.com/articles/battle-testing-data-integrity-verification-with-zfs-btrfs-and-mdadm-dm-integrity.html

OmniOS Community Edition r151030f/r151028af/r151022dd Available.
https://omniosce.org/article/030f-028af-022dd

Why You Should Learn Just a Little AWK.
https://gregable.com/2010/09/why-you-should-know-just-little-awk.html

Why Use Package Managers?
https://uwm.edu/hpc/software-management/

BSD Now 302 – Contention Reduction.
https://www.bsdnow.tv/302

FreeBSD Upgrades Clang/LLVM to 8.0.1 Version.
https://svnweb.freebsd.org/base?view=revision&revision=349004

FreeBSD Modifies ZFS so Multi Threaded Writes Use 15%-35% Less CPU Time and Increase Throughput by 10% to 40%.
https://twitter.com/FreeBSDHelp/status/1139235988637663235

FreeBSD Adds sys/class/net Devices to linsysfs.
https://github.com/freebsd/freebsd/commit/2c9faf1048125eb00f2662bf826b10eea67f1c80

FreeBSD Adds macOS-like Three Finger Drag Trackpad Gesture to psm(4) Driver.
https://svnweb.freebsd.org/base?view=revision&revision=349098

FreeBSD with ZFS without Drives.
FreeBSD+ZFS Without Drives

In Other BSDs for 2019/06/15.
https://www.dragonflydigest.com/2019/06/15/23048.html

ZFS on Linux 0.8.1 Released.
https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.8.1

FreeBSD 11.3-RC1 Available.
https://lists.freebsd.org/pipermail/freebsd-stable/2019-June/091281.html

FreeBSD CTF – UEFI HTTP Boot Support.
https://lists.freebsd.org/pipermail/freebsd-current/2019-June/073593.html

FreeBSD Day is Almost Here.
Help us celebrate the 26th anniversary of your favorite open source operating system.
https://www.freebsdfoundation.org/blog/freebsd-day-is-almost-here/

The systemd-openbsd is systemd style init for OpenBSD.
https://github.com/reyk/systemd-openbsd

Building Security Appliance Based on FreeBSD.
https://www.bsdcan.org/2019/schedule/events/1064.en.html
https://twitter.com/BSDTV/status/1140339903118659585

Hardware

AMD Ryzen 3000 APUs.
https://www.anandtech.com/show/14523/amd-ryzen-3000-apus-up-to-vega-11-more-mhz-under-150

AMD 16-Core Ryzen 9 3950X – 3.5GHz with 105W TDP.
https://www.anandtech.com/show/14516/amd-16-core-ryzen-9-3950x-up-to-4-7-ghz-105w-coming-september

AMD Zen 2 Microarchitecture Analysis – Ryzen 3000 and EPYC Rome.
https://www.anandtech.com/show/14525/amd-zen-2-microarchitecture-analysis-ryzen-3000-and-epyc-rome

AMD 16-Core Ryzen 9 3950X with 61K Points – Intel 18-Core i9-9980XE Only Got 46K.
AMD is Currently the Fastest Processor on Geekbench.
https://www.techquila.co.in/amd-ryzen-9-3950x-vs-intels-18-core-i9/

Life

Secret Surveillance Tracks Your Every Move in Stores.
https://www.nytimes.com/interactive/2019/06/14/opinion/bluetooth-wireless-tracking-privacy.html

Other

Mozilla – Technology with Respect and Honesty – Here’s How We Do It.
https://blog.mozilla.org/firefox/firefox-data-privacy-promise/

Firefox Extensions.
https://enchiridion.red/2019/1/18/firefox-extensions/

Firefox – Privacy Related about:config Tweaks.
https://www.privacytools.io/browsers/#about_config

You (Probably) Do Not Need ReCAPTCHA.
https://kevv.net/you-probably-dont-need-recaptcha/

FireEye Exploitation – Project Zero Vulnerability of the Beast.
https://googleprojectzero.blogspot.com/2015/12/fireeye-exploitation-project-zeros.html

EOF

Manage Photography the UNIX Way

After using UNIX for so many years you start to think the UNIX way. This article aims to automate and accelerate the flow of importing photos from a camera and storing them for future use.

When I had a lot of time I shot both RAW and JPEG images at the same time (a RAW and a JPEG file were written for every picture). Then I used one of the DxO Optics Pro/RawTherapee/Darktable applications to make these RAW files shine even more with mass conversion. Then I compared these to the out of camera JPEG files and left only the one that suited me best. It was probably the best way of having ‘the best version’ of each photo but it also took a whole lot of time. Now as I do not have that much time I needed to find a way to make this process fast and almost seamless.

Hardware

I use SONY cameras because they are superior to other brands when it comes to price/performance ratio and also have some important features that are absent in other brands. Take the SONY A-mount based cameras for example – the SONY a68 offers so much more for a very small amount of money than any comparable Nikon or Canon competitor. If you want to get a grip on these differences take a look at my SONY a68 review on the DPReview site – https://www.dpreview.com/forums/thread/4152155.

a68-lcd.jpg

Besides the price/performance ratio SONY cameras are just too fun/too comfortable for me to use something different – while providing similar or better results than the Nikon/Canon competition. Take the viewfinder for example. Nikon/Canon cameras use the optical viewfinder ‘by default’ and to switch to the LCD panel you need to manually push a button and switch into the PAINFULLY SLOW (autofocus is actually unusable) mode called Live View … and if you want to use the viewfinder again then you need to switch that mode off with a button. How is it implemented in SONY? A SONY camera just automatically switches to the EVF when you put your eye to the viewfinder and switches back to the LCD automatically when you take your eye off of it … and autofocus is equally fast on both the viewfinder and the LCD. This is just one of the examples of course. Another one is that Nikon cameras can not record a movie when you are using the viewfinder – you can only do it with the LCD.

a68-flash.jpg

There is also the SONY E-mount system which utilizes newer/different ideas – it is generally much more expensive than the older A-mount system but has even more features than Canon/Nikon cameras. One of the selling points of SONY E-mount cameras is also their small size – which is why I recently switched from the SONY a68 (A-mount) to the SONY a5100 (E-mount) camera.

Approach

I basically use two SONY cameras.

The small and ultra portable SONY RX100 III is probably the best pocket/compact camera in the world when it comes to price/performance ratio. As it has a quite large 1 inch sensor (2.7 crop factor) it can use high ISO values without that much noise, which allows shooting indoors in low light without much loss of quality. It also has a tiltable flash which you can point at the ceiling to get extra bounced light in low light situations indoors. This small gem generally has all the features that SONY APS-C/Full Frame cameras have, with the same menu interface. It is not crippled like a lot of compact cameras. And it is fast too. It even features an EVF! It also features the XAVC S 50 Mbit video codec which helps greatly in low light situations. Of course in good light conditions this camera shines even more. With its bright 24-70mm f/1.8-2.8 lens it is very universal. Its Full Frame depth of field equivalent is f/4.9-7.6 (f/1.8-2.8 multiplied by the 2.7 crop factor), which is better than many APS-C cameras with a kit lens – for example the SONY a6400 with its f/3.5-5.6 kit lens only reaches f/5.3-8.4 (because of the 1.5 crop factor for APS-C).

rx100-evf-lcd-on.jpg

You can read more about depth of field equivalence in this good DPReview article – https://www.dpreview.com/articles/2666934640/what-is-equivalence-and-why-should-i-care.

The other SONY camera I used was the SONY a68 with the following lenses:

  • TAMRON 18-270mm f/3.5-5.6 – all-rounder
  • SONY 35mm f/1.8 – small bokeh low light friend
  • SIGMA 50-150mm f/2.8 – large bokeh friend
  • SAMYANG 85mm f/1.4 – manual focus bokeh master

… but when I checked my ‘habits’ it was like this most of the time:
– take the small/portable SONY RX100 III because it is convenient
– grab the SONY a68 with the 35mm f/1.8 at home for some bokeh pictures

If you are not sure what ‘bokeh’ means then please check the Wikipedia article about it – https://en.wikipedia.org/wiki/Bokeh.

I very rarely used the other lenses, which made me think how to ‘optimize’ the SONY a68 A-mount camera. Also because the SONY a68 built-in flash is not able to point up (to get extra light from the ceiling indoors) I needed the dedicated external SONY HVL F20M flash on the ISO hot shoe which made this large camera even bigger.

I checked the SONY portfolio and got the older SONY a5100 E-mount camera instead. It has the nice and fast autofocus from the SONY a6000 camera along with the XAVC S video codec and a useful tilting LCD screen. It even has a touch screen which allows you to focus and take a photo at the point where you touched the screen! It works similarly in movies – just touch where you want it to focus. It is probably the smallest SONY APS-C body – very close in size to the SONY RX100 III … and I got the SONY E-mount 35mm f/1.8 lens for it. I also missed the 85mm f/1.4 lens so I took a different route. As the E-mount system allows one to adapt older lenses with Lens Turbo adapters (about 0.7 ratio) I got an old used Minolta MD 56mm f/1.4 lens and an E-mount to MD Lens Turbo adapter from ALIEXPRESS. This way I got a small ultimate bokeh machine with only one downside – manual focus – but the SONY a5100 provides a very nice implementation of Focus Peaking so it is still a pleasure to use.

a5100-lcd.jpg

Of course the SONY a5100 has its limitations – no viewfinder for example – but I VERY rarely used one anyway. Intensive outdoor light can sometimes be problematic without an EVF, so if someone wants to have one then one of the SONY a6000/a6300/a6400/a6500 cameras is the better choice – they are not much larger and provide both an EVF and a hot shoe.

a5100-flash.jpg

Generally the SONY RX100 III when powered on is comparable in size with the SONY a5100 with the SONY 35mm f/1.8 lens. It is the powered off state and the lens range (24-70mm on the SONY RX100 III) that make a difference – the SONY RX100 III even fits in a pocket while the SONY a5100 does not – maybe it would with the SONY 20mm f/2.8 lens.

If you have more budget to spend I also recommend the SONY RX100 V/VA which adds very fast phase detection autofocus and 4k video. The SONY RX100 IV offers 4k video but still has the slower contrast autofocus – thus it is IMHO pointless to get it. For the record – the SONY RX100 III also uses the slower contrast based autofocus and has video up to FullHD (1080p).

top-a5100-a68.jpg

These cameras also share a nice feature – they can be charged directly by attaching a micro USB cable to them – very convenient – no need for dedicated external battery chargers. I really liked the SONY a68 grip and its many direct controls but I really like the size/compactness of the SONY a5100. While the SONY a5100 body weighs 283 grams the SONY a68 weighs 690 grams – for the body alone. Add a flash and a larger lens to it and you get the idea.

top-rx100-a5100-with-lens-size.jpg

Comparing in the other direction, the SONY RX100 III weighs 290 grams while the SONY a5100 weighs 437 grams with the SONY 35mm f/1.8 lens attached – not bad.

Gear Summary

I have settled on these two cameras for now.

  • SONY RX100 III – gives 24-70mm f/4.9-7.6 depth of field Full Frame equivalent
  • SONY a5100 with these lenses:
    • Sony 35mm f/1.8 OSS – gives 53mm f/2.7 depth of field Full Frame equivalent
    • Minolta MD 56mm f/1.4 with Lens Turbo 0.7x adapter – gives 59mm f/1.5 depth of field Full Frame equivalent
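
The Full Frame equivalents in the list above are plain multiplication by the crop factor (and by the focal reducer ratio when an adapter is used). A minimal sketch of that arithmetic follows, assuming awk is available; the ff_equiv function name and its argument order are made up for this illustration.

```sh
#!/bin/sh
# ff_equiv FOCAL APERTURE CROP [REDUCER]
# Prints the Full Frame equivalent focal length and aperture.
# REDUCER is the focal reducer ratio (defaults to 1, meaning no adapter).
# Hypothetical helper for illustration only.
ff_equiv() {
  awk -v f="$1" -v a="$2" -v c="$3" -v r="${4:-1}" 'BEGIN {
    k  = c * r                         # effective crop factor
    mm = int(f * k + 0.5)              # round to whole millimetres
    fn = int(a * k * 10 + 0.5) / 10    # round to one decimal place
    printf "%dmm f/%.1f\n", mm, fn
  }'
}

ff_equiv 35 1.8 1.5        # SONY 35mm f/1.8 on APS-C
ff_equiv 56 1.4 1.5 0.7    # Minolta MD 56mm f/1.4 with Lens Turbo 0.7x
```

The two calls print 53mm f/2.7 and 59mm f/1.5, which matches the values listed for the SONY a5100 lenses.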

Scripts

I switched off shooting RAW+JPEG images and now I only shoot EXTRA FINE JPEG images with the Vivid profile and -0.7 EV (to avoid overexposed images).

The first part is copying the files to a new directory. That means pictures from the DCIM directory and movies from the PRIVATE directory.

Now the first two scripts come into play – to rename the files to something useful. Each picture and video will have a YYYY.MM.DD.HHMM(x) name.

The renaming is done by these two scripts:

  • photo-rename-images.sh
  • photo-rename-movies.sh

Links to the scripts will be posted later in the article.

The photo-rename-images.sh uses jhead as dependency.
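
The (x) part of the name is what keeps several pictures taken within the same minute apart. The sketch below shows only that suffix selection and assumes the timestamp has already been read from the file (for pictures that is the job jhead does). The next_name helper is hypothetical and is not taken from photo-rename-images.sh.

```sh
#!/bin/sh
# next_name DIR STAMP EXT
# Prints the first unused name of the form STAMP.EXT, STAMPa.EXT,
# STAMPb.EXT ... inside DIR. Hypothetical helper for illustration only.
next_name() {
  dir=$1
  stamp=$2
  ext=$3
  for suffix in '' a b c d e f g h i j k l m n o p q r s t u v w x y z
  do
    if [ ! -e "${dir}/${stamp}${suffix}.${ext}" ]
    then
      echo "${stamp}${suffix}.${ext}"
      return 0
    fi
  done
  return 1   # every name for this minute is already taken
}

# Example (assumes ./DUMP exists):
#   mv DSC00391.JPG "DUMP/$(next_name DUMP 2019.05.08.0732 jpg)"
```

With 2019.05.08.0732.jpg already present in the directory the helper prints 2019.05.08.0732a.jpg, then 2019.05.08.0732b.jpg for the next file, and so on.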

Now as we have everything named as it should be the size needs to be addressed. The videos will be converted using ffmpeg and images will be compressed to 92% JPEG quality with convert utility from ImageMagick suite.

  • photo-requality.sh
  • photo-movie-audio-ac3.sh
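
Both conversions boil down to one convert(1) call and one ffmpeg(1) call per file. The sketch below is not the content of photo-requality.sh or photo-movie-audio-ac3.sh. The function names and the DRY_RUN switch are hypothetical, and the ffmpeg options are only inferred from the ffmpeg log shown later in this article (libx264 with crf 23 and a 25 Mbit cap, AC3 audio at 160 kbit, Matroska container).

```sh
#!/bin/sh
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]
  then
    echo "$*"
  else
    "$@"
  fi
}

# requality FILE [QUALITY] - recompress a JPEG in place (default 92).
requality() {
  run convert "$1" -quality "${2:-92}" "$1"
}

# removie FILE - re-encode video to H.264 and audio to AC3 into FILE.mkv.
# The option values are assumptions inferred from the log, not the script.
removie() {
  run ffmpeg -i "$1" -c:v libx264 -crf 23 -maxrate 25M -bufsize 25M \
    -c:a ac3 -b:a 160k "$1.mkv"
}
```

Running with DRY_RUN=1 first is a cheap way to check which files would be touched before any original is overwritten.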

One may ask why convert JPEG from 99% to 92% and lose even more quality? Well, you should check the differences – one has to try really hard with very large zoom to find any. For most purposes these differences are negligible. You can also use a larger value to get somewhat better quality with smaller storage savings – run photo-requality.sh 95 for example as a compromise.

This is the comparison between the original out of camera JPEG file and the same file compressed to 92% quality using the convert utility. I was not able to spot any differences – maybe you will.

diff-crop.jpg

One may also be worried about quality loss in the videos as the size savings are that big. I also tried to find these differences, and as it is really hard to find them the storage savings are justified – at least for me.

I also recently added photo-flow.sh which takes two arguments. The first is the device under which the camera SD card is mounted – most of the time it is mmcsd0s1 on FreeBSD. The second is the ~/photo.NEW directory in which the pictures and videos will be dumped, renamed and (re)compressed.

I have put these scripts on my GitHub account (external to WordPress) – https://github.com/vermaden/scripts.
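
A rough sketch of how such a wrapper could handle its two arguments is shown below. It is an assumption based only on the example run in the Flow section (a dated YYYY.MM.DD.DUMP directory under the target, pictures taken from DCIM on the card). The dump_dir and check_source helpers are hypothetical and not copied from photo-flow.sh.

```sh
#!/bin/sh
# dump_dir TARGET [DATE] - prints TARGET/YYYY.MM.DD.DUMP
# DATE defaults to today and is only a parameter to make testing easy.
dump_dir() {
  echo "${1%/}/${2:-$(date +%Y.%m.%d)}.DUMP"
}

# check_source SOURCE - succeeds only if SOURCE contains a DCIM directory.
check_source() {
  if [ -d "$1/DCIM" ]
  then
    return 0
  fi
  echo "no DCIM directory under $1" >&2
  return 1
}

# Example:
#   check_source /media/mmcsd0s1 && mkdir -p "$(dump_dir ~/photo.NEW)"
```

Checking for DCIM before creating anything avoids leaving an empty dump directory behind when the card was not mounted.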

Flow

When I attached the SD card from one of my cameras to my laptop it was automounted by my automount solution (described in the Automount Removable Media article) as the /media/mmcsd0s1 directory – that is the first argument for the import script. I import new pictures to the ~/photo.NEW directory – that is the second argument.

Below you will find example output of such an import/conversion process. It took about half an hour on a 2011 dual-core laptop (ThinkPad T420s). I cut large repeated parts of the output and marked them with (...).

% photo-flow.sh /media/mmcsd0s1 ~/photo.NEW
/media/mmcsd0s1/DCIM/100MSDCF/DSC00390.JPG -> /home/vermaden/photo.NEW/2019.06.10.DUMP/DSC00390.JPG
/media/mmcsd0s1/DCIM/100MSDCF/DSC00391.JPG -> /home/vermaden/photo.NEW/2019.06.10.DUMP/DSC00391.JPG
/media/mmcsd0s1/DCIM/100MSDCF/DSC00393.JPG -> /home/vermaden/photo.NEW/2019.06.10.DUMP/DSC00393.JPG
(...)
/media/mmcsd0s1/DCIM/100MSDCF/DSC00462.JPG -> /home/vermaden/photo.NEW/2019.06.10.DUMP/DSC00462.JPG
/media/mmcsd0s1/DCIM/100MSDCF/DSC00463.JPG -> /home/vermaden/photo.NEW/2019.06.10.DUMP/DSC00463.JPG
/media/mmcsd0s1/DCIM/100MSDCF/DSC00464.JPG -> /home/vermaden/photo.NEW/2019.06.10.DUMP/DSC00464.JPG
/media/mmcsd0s1/PRIVATE/M4ROOT/CLIP/C0015.MP4 -> /home/vermaden/photo.NEW/2019.06.10.DUMP/C0015.MP4
/media/mmcsd0s1/PRIVATE/M4ROOT/CLIP/C0015M01.XML -> /home/vermaden/photo.NEW/2019.06.10.DUMP/C0015M01.XML

DSC00390.JPG --> 2019.05.08.0732.jpg
DSC00391.JPG --> 2019.05.08.0732a.jpg
DSC00393.JPG --> 2019.05.08.0732b.jpg
(...)
DSC00462.JPG --> 2019.06.07.2110c.jpg
DSC00463.JPG --> 2019.06.07.2110d.jpg
DSC00464.JPG --> 2019.06.07.2110e.jpg
C0015.MP4 -> 2019.06.01.2140.MP4
C0015M01.XML -> 2019.06.01.2140.XML
File './2019.05.22.0543.jpg' converted to '92' quality.
File './2019.06.07.0508a.jpg' converted to '92' quality.
File './2019.06.01.2141.jpg' converted to '92' quality.
(...)
File './2019.05.23.0124c.jpg' converted to '92' quality.
File './2019.06.01.2140e.jpg' converted to '92' quality.
File './2019.05.22.0548a.jpg' converted to '92' quality.
ffmpeg version 4.1.3 Copyright (c) 2000-2019 the FFmpeg developers
(...)
Guessed Channel Layout for Input Stream #0.1 : stereo
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '2019.06.01.2140.MP4':
  Metadata:
    major_brand     : XAVC
    minor_version   : 16785407
    compatible_brands: XAVCmp42iso2
    creation_time   : 2019-06-01T19:40:52.000000Z
  Duration: 00:00:21.60, start: 0.000000, bitrate: 52049 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709/bt709/iec61966-2-4), 1920x1080 [SAR 1:1 DAR 16:9], 50101 kb/s, 50 fps, 50 tbr, 50k tbn, 100 tbc (default)
    Metadata:
      creation_time   : 2019-06-01T19:40:52.000000Z
      handler_name    : Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(und): Audio: pcm_s16be (twos / 0x736F7774), 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      creation_time   : 2019-06-01T19:40:52.000000Z
      handler_name    : Sound Media Handler
    Stream #0:2(und): Data: none (rtmd / 0x646D7472), 409 kb/s (default)
    Metadata:
      creation_time   : 2019-06-01T19:40:52.000000Z
      handler_name    : Timed Metadata Media Handler
      timecode        : 83:01:01;02
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (pcm_s16be (native) -> ac3 (native))
Press [q] to stop, [?] for help
[libx264 @ 0x80ddfb400] using SAR=1/1
[libx264 @ 0x80ddfb400] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 0x80ddfb400] profile High, level 4.2, 4:2:0, 8-bit
[libx264 @ 0x80ddfb400] 264 - core 157 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 vbv_maxrate=25000 vbv_bufsize=25000 crf_max=0.0 nal_hrd=none filler=0 ip_ratio=1.40 aq=1:1.00
Output #0, matroska, to '2019.06.01.2140.MP4.mkv':
  Metadata:
    major_brand     : XAVC
    minor_version   : 16785407
    compatible_brands: XAVCmp42iso2
    encoder         : Lavf58.20.100
    Stream #0:0(und): Video: h264 (libx264) (H264 / 0x34363248), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 50 fps, 1k tbn, 50 tbc (default)
    Metadata:
      creation_time   : 2019-06-01T19:40:52.000000Z
      handler_name    : Video Media Handler
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 25000000/0/0 buffer size: 25000000 vbv_delay: -1
    Stream #0:1(und): Audio: ac3 ([0] [0][0] / 0x2000), 48000 Hz, stereo, fltp, 160 kb/s (default)
    Metadata:
      creation_time   : 2019-06-01T19:40:52.000000Z
      handler_name    : Sound Media Handler
      encoder         : Lavc58.35.100 ac3
frame= 1080 fps=4.1 q=31.0 Lsize=   30522kB time=00:00:21.59 bitrate=11578.4kbits/s speed=0.0815x    
video:30086kB audio:422kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.046764%
(...)

This is how the directory looks after running the import flow. We still have the original 2019.06.01.2140.MP4 movie but we can delete it of course.

% exa ~/photo.NEW/2019.06.10.DUMP
2019.05.08.0732.jpg   2019.05.22.0548.jpg   2019.05.25.2111.jpg   2019.06.01.0914.jpg   2019.06.01.2140.jpg      2019.06.07.0509.jpg
2019.05.08.0732a.jpg  2019.05.22.0548a.jpg  2019.05.25.2111a.jpg  2019.06.01.0915.jpg   2019.06.01.2140.MP4      2019.06.07.0509a.jpg
2019.05.08.0732b.jpg  2019.05.22.0548b.jpg  2019.05.25.2111b.jpg  2019.06.01.2043.jpg   2019.06.01.2140.MP4.mkv  2019.06.07.0509b.jpg
2019.05.08.0733.jpg   2019.05.22.0549.jpg   2019.05.25.2111c.jpg  2019.06.01.2043a.jpg  2019.06.01.2140.XML      2019.06.07.2110.jpg
2019.05.22.0541.jpg   2019.05.22.0550.jpg   2019.05.27.0712.jpg   2019.06.01.2043b.jpg  2019.06.01.2140a.jpg     2019.06.07.2110a.jpg
2019.05.22.0541a.jpg  2019.05.22.0551.jpg   2019.05.27.0712a.jpg  2019.06.01.2043c.jpg  2019.06.01.2140b.jpg     2019.06.07.2110b.jpg
2019.05.22.0542.jpg   2019.05.23.0124.jpg   2019.05.27.0712b.jpg  2019.06.01.2043d.jpg  2019.06.01.2140c.jpg     2019.06.07.2110c.jpg
2019.05.22.0542a.jpg  2019.05.23.0124a.jpg  2019.05.27.0712c.jpg  2019.06.01.2043e.jpg  2019.06.01.2140d.jpg     2019.06.07.2110d.jpg
2019.05.22.0542b.jpg  2019.05.23.0124b.jpg  2019.05.27.0712d.jpg  2019.06.01.2043f.jpg  2019.06.01.2140e.jpg     2019.06.07.2110e.jpg
2019.05.22.0542c.jpg  2019.05.23.0124c.jpg  2019.05.27.0712e.jpg  2019.06.01.2043g.jpg  2019.06.01.2141.jpg
2019.05.22.0543.jpg   2019.05.23.1831.jpg   2019.05.27.0712f.jpg  2019.06.01.2043h.jpg  2019.06.01.2141a.jpg
2019.05.22.0543a.jpg  2019.05.25.2110.jpg   2019.05.27.0713.jpg   2019.06.01.2043i.jpg  2019.06.07.0508.jpg
2019.05.22.0543b.jpg  2019.05.25.2110a.jpg  2019.05.27.0713a.jpg  2019.06.01.2044.jpg   2019.06.07.0508a.jpg

These are differences in size before and after conversion – both for example picture and video.

% ls -lh ~/photo.NEW/2019.06.10.DUMP/2019.06.01.2140.MP4*
-rw-r--r--  1 vermaden  vermaden   134M 2019.06.01 21:41 /home/vermaden/photo.NEW/2019.06.10.DUMP/2019.06.01.2140.MP4
-rw-r--r--  1 vermaden  vermaden    30M 2019.06.10 22:57 /home/vermaden/photo.NEW/2019.06.10.DUMP/2019.06.01.2140.MP4.mkv

% ls -lh /media/mmcsd0s1/DCIM/100MSDCF/DSC00430.JPG ~/photo.NEW/2019.06.10.DUMP/2019.05.27.0712f.jpg
-rw-r--r--  1 vermaden  vermaden   4.4M 2019.06.10 22:53 /home/vermaden/photo.NEW/2019.06.10.DUMP/2019.05.27.0712f.jpg
-rw-r--r--  1 vermaden  vermaden   6.4M 2019.05.27 07:12 /media/mmcsd0s1/DCIM/100MSDCF/DSC00430.JPG

The best savings are in the video – a more than 4 times smaller file. The pictures are about 30% smaller.

Totals of the size differences for the whole import are below. First the original dump from camera SD card.

% du -scm /media/mmcsd0s1/DCIM /media/mmcsd0s1/PRIVATE/M4ROOT/CLIP
400     /media/mmcsd0s1/DCIM
135     /media/mmcsd0s1/PRIVATE/M4ROOT/CLIP
534     total

… and converted/imported size.

% rm ~/photo.NEW/2019.06.10.DUMP/2019.06.01.2140.MP4

% du -scm /home/vermaden/photo.NEW/2019.06.10.DUMP/*jpg | tail -1
265     total

% du -scm /home/vermaden/photo.NEW/2019.06.10.DUMP/*mkv | tail -1
30      total

% du -scm ~/photo.NEW/2019.06.10.DUMP
295     /home/vermaden/photo.NEW/2019.06.10.DUMP
295     total

So after import and conversion the pictures went from 400 to 265 MB and the movies (actually one movie) went from 135 to 30 MB. The most important thing is that I can import and convert this content without any interactive and lengthy process.

These scripts (definitely the video renamer one) may be SONY specific but nothing stops you from adapting them to the files produced by your camera.

Feel free to share your photography flow 🙂
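
For a quick check of the savings, the du totals above can be turned into percentages with plain shell arithmetic. This is only a small sketch; the saved_percent name is made up, and because the shell uses integer arithmetic the result is truncated to whole percent.

```sh
#!/bin/sh
# saved_percent BEFORE AFTER - prints how much smaller AFTER is than
# BEFORE in whole percent (integer arithmetic, so the value is truncated).
saved_percent() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

saved_percent 400 265   # pictures, MB
saved_percent 135 30    # movie, MB
saved_percent 534 295   # whole import, MB
```

With the numbers from this import the three calls print 33, 77 and 44.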

EOF

Valuable News – 2019/06/10

The Valuable News weekly series is dedicated to providing a summary of news, articles and other interesting stuff mostly but not always related to the UNIX or BSD systems. Whenever I stumble upon something worth mentioning on the Internet I just put it here.

Today the amount of information that we get using various information streams is at massive overload. Thus one needs to focus only on what is important without the need to grep(1) the Internet everyday. Hence the idea of providing such information ‘bulk’ as I already do that grep(1).

UNIX

Use OpenBSD pledge/unveil syscalls in PHP.
https://gist.github.com/tvlooy/0e28e59178be86a5c12096abde6f4bb3

FreeBSD adds natural scrolling support to psm(4) with sysmouse protocol.
https://svnweb.freebsd.org/base?view=revision&revision=348529

FreeBSD adds Elantech touchpad IC type 15 to psm(4) found on ThinkPad L480.
https://svnweb.freebsd.org/base?view=revision&revision=348520

Next macOS Catalina will use ZSH as default login and interactive shell.
https://support.apple.com/en-ca/HT208050

Install Vanilla Forum on FreeBSD 12.
https://www.vultr.com/docs/how-to-install-vanilla-forum-on-freebsd-12

OmniOS Community Edition r151030e/r151028ae/r151022dc Available.
https://omniosce.org/article/030e-028ae-022dc

Treating Openbox like Tiling Window Manager.
https://thomashunter.name/posts/2019-01-27-treating-openbox-like-a-tiling-windowmanager

FreeBSD 2019 Q1 Status Report is Available.
https://www.freebsd.org/news/status/report-2019-01-2019-03.html

NetBSD Validation and Improvements of Debugging Interfaces.
https://blog.netbsd.org/tnf/entry/validation_and_improvements_of_debugging

NetBSD 8.1 Available.
https://www.netbsd.org/releases/formal-8/NetBSD-8.1.html

NetBSD XSAVE and compat32 Kernel Work for LLDB.
https://blog.netbsd.org/tnf/entry/xsave_and_compat32_kernel_work

OPNsense 19.1.9 Released.
https://forum.opnsense.org/index.php?topic=12993.0

The End of Joyent Cloud.
https://www.joyent.com/blog/joyent-announces-strategic-change-to-their-public-cloud-business

Farewell Joyent Public Cloud.
https://chabik.com/2019/06/farewell-joyent-public-cloud/

BSD Now 301 – GPU Passthrough.
https://www.bsdnow.tv/301

FreeBSD 11.3-BETA3 Available.
https://lists.freebsd.org/pipermail/freebsd-stable/2019-June/091257.html

FreeBSD Linux Ports Roadmap.
https://lists.freebsd.org/pipermail/freebsd-emulation/2019-June/017006.html

FreeBSD adds SDIO support.
https://freshbsd.org/commit/freebsd/src/348805

OpenBSD-stable up-to-date packages for amd64/i386/powerpc architectures.
https://dataswamp.org/~solene/2019-06-01-packages-stable.html

Brief History of Solaris (SunOS) Ports.
http://rabbs.com/uuasc/SOLARIS_PPC
https://archive.org/details/solaris251ppc

After Decades in BSDs – TTY Keyboard Status Request Feature Being Proposed For Linux.
https://www.phoronix.com/scan.php?page=news_item&px=TTY-Keyboard-Status-Request-RFC

In Other BSDs for 2019/06/08
https://www.dragonflydigest.com/2019/06/08/23024.html

Tiling Desktop Environment.
https://bitcannon.net/post/pro-desktop/

Hardware

ODROID-H2 Schematics.
https://wiki.odroid.com/odroid-h2/hardware#odroid-h2_schematics

Seagate IronWolf (Pro) 16TB Available.
https://www.servethehome.com/seagate-exos-x16-ironwolf-16tb-and-ironwolf-pro-16tb-shipping/

Drupal on OpenBSD.
https://dev.to/nabbisen/drupal-on-openbsd-4n39

HoneyComb LX2K – Powerful 16 Core Mini ITX ARM Workstation.
https://www.solid-run.com/nxp-lx2160a-family/honeycomb-workstation/

AMD EPYC Rome NAMD and Intel Xeon Response at Computex 2019.
https://www.servethehome.com/amd-epyc-rome-namd-intel-xeon-computex-2019/

PINE64 News – PinePhone/Pinebook Pro/PineTab.
https://www.pine64.org/2019/06/06/june-2019-news-pinephone-pinebook-pro-and-pinetab/

LackRack.
https://wiki.eth0.nl/index.php/LackRack

Semi Review of Raptor Blackbird – POWER9 on the Cheap.
https://www.talospace.com/2019/06/a-semi-review-of-raptor-blackbird.html

XigmaNAS 12.0.0.4.6743 Released.
https://sourceforge.net/projects/xigmanas/files/XigmaNAS-12.0.0.4/12.0.0.4.6743/

XigmaNAS 11.2.0.4.6743 Released.
https://sourceforge.net/projects/xigmanas/files/XigmaNAS-11.2.0.4/11.2.0.4.6743/

Life

Kids Raised without Religion are Kinder and More Empathetic.
https://www.disclose.tv/kids-raised-without-religion-are-kinder-and-more-empathetic-study-discovered-368306

The Best Countries for Female Workers.
https://www.weforum.org/agenda/2019/03/best-countries-for-female-workers

Alone – Decline of family has unleashed epidemic of loneliness.
https://www.city-journal.org/decline-of-family-loneliness-epidemic

Other

Firefox Now Available with Enhanced Tracking Protection with Updates to Facebook Container and Firefox Monitor.
https://blog.mozilla.org/blog/2019/06/04/firefox-now-available-with-enhanced-tracking-protection-by-default/

How Ledger Hacked HSM.
https://cryptosense.com/blog/how-ledger-hacked-an-hsm/

Wasteland 3 in development with target release in 2020 Q2.
https://www.inxile-entertainment.com/wasteland3

EOF

RabbitMQ Cluster on FreeBSD Containers

I really like small and simple dedicated solutions that do one thing and do it really well – maybe it is because I like UNIX that much. A good example of such an approach is the Minio object storage which implements the S3 protocol with distributed clustering, erasure coding and a builtin web interface along with many other features about which I wrote in the Distributed Object Storage with Minio on FreeBSD article.

RabbitMQ is another such example – currently probably the most popular implementation of the AMQP protocol – it also comes with a small and sleek web interface. The difference is power. Minio comes with a very basic user oriented web interface while most administrative and configuration tasks need to be done from the CLI. The Minio web interface allows one to create/delete buckets and to download/upload files. RabbitMQ has such a sophisticated web interface that after you enable it you do not need the command line anymore. Everything can be accomplished using just the web interface.

rabbitmq-logo.png

Compared to other messaging solutions like ActiveMQ or Apache Kafka it is very popular according to Google Trends.

rabbitmq-trends.jpg

Today I would like to show you RabbitMQ messaging with quite redundant clustered setup with mirrored queues.

You will find Table of Contents below.

  • Jails Setup
  • RabbitMQ Installation
  • RabbitMQ Setup
  • RabbitMQ Plugins
  • RabbitMQ Administrative User
  • RabbitMQ Cluster Setup
  • RabbitMQ Highly Available Policy
  • Feed the Queue
  • Go Language Installation
  • Simple Benchmark
  • High Availability
  • UPDATE 1 – This Month in RabbitMQ
  • UPDATE 2 – Make RabbitMQ Use Less CPU

From all the virtualization options available on FreeBSD (VirtualBox/Bhyve/QEMU/Jails/Docker) I have chosen the lightweight FreeBSD Containers – Jails 🙂

The legend is the same as usual.

Command run on the host system as root user.

host # command

Command run on the host system as regular user.

host % command

Command run on the rabbitX Jail.

rabbitX # command

Jails Setup

First we will create the base Jails for our setup. Both the host system and the Jails use FreeBSD 11.2-RELEASE.

host # mkdir -p /jail/BASE
host # fetch -o /jail/BASE/11.2-RELEASE.base.txz http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/base.txz
host # for I in 1 2; do echo ${I}; mkdir -p /jail/rabbit${I}; tar --unlink -xpJf /jail/BASE/11.2-RELEASE.base.txz -C /jail/rabbit${I}; done
1
2
host #

We now have 2 empty clean Jails.

We will now add Jails configuration to the /etc/jail.conf file.

I have used my laptop as the Jail host thus the Jails will be configured to use the wireless wlan0 interface and 192.168.43.10X addresses. I also added 10.0.0.10X network addresses as this makes it more convenient for me for the purposes of writing this article.

host # for I in 1 2
do
  cat >> /etc/jail.conf << __EOF
rabbit${I} {
  host.hostname = rabbit${I}.local;
  ip4.addr += 192.168.43.10${I};
  ip4.addr += 10.0.0.10${I};
  interface = wlan0;
  path = /jail/rabbit${I};
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.clean;
  mount.devfs;
  allow.raw_sockets;
}

__EOF
done
host #

This is how the /etc/jail.conf file looks after it is configured.

host # cat /etc/jail.conf
rabbit1 {
  host.hostname = rabbit1.local;
  ip4.addr += 192.168.43.101;
  ip4.addr += 10.0.0.101;
  interface = wlan0;
  path = /jail/rabbit1;
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.clean;
  mount.devfs;
  allow.raw_sockets;
}

rabbit2 {
  host.hostname = rabbit2.local;
  ip4.addr += 192.168.43.102;
  ip4.addr += 10.0.0.102;
  interface = wlan0;
  path = /jail/rabbit2;
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.clean;
  mount.devfs;
  allow.raw_sockets;
}

Now we can start our Jails.

host # for I in 1 2; do service jail onestart rabbit${I}; done
Starting jails: rabbit1.
Starting jails: rabbit2.

Jails are running properly.

host # jls
   JID  IP Address      Hostname                      Path
     1  192.168.43.101  rabbit1.local                 /jail/rabbit1
     2  192.168.43.102  rabbit2.local                 /jail/rabbit2

Time to add DNS server to the Jails so they will have Internet connectivity.

host # for I in 1 2; do cat /jail/rabbit${I}/etc/resolv.conf; done
nameserver 1.1.1.1
nameserver 1.1.1.1
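
The command above only displays the file. One possible way to write it from the host is sketched below, under the assumption that each Jail should get a single nameserver line. The set_jail_dns helper is hypothetical and it overwrites any existing resolv.conf in the Jail.

```sh
#!/bin/sh
# set_jail_dns JAILROOT NAMESERVER
# (Over)writes JAILROOT/etc/resolv.conf with one nameserver line.
# Hypothetical helper for illustration only.
set_jail_dns() {
  mkdir -p "$1/etc" && echo "nameserver $2" > "$1/etc/resolv.conf"
}

# Example, run on the host as root:
#   for I in 1 2; do set_jail_dns /jail/rabbit${I} 1.1.1.1; done
```

Writing the file from the host works because the Jail root is just a directory on the host filesystem.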

Now we will switch from 'quarterly' to 'latest' packages.

host # for I in 1 2; do sed -i '' s/quarterly/latest/g /jail/rabbit${I}/etc/pkg/FreeBSD.conf; done

host # for I in 1 2; do grep latest /jail/rabbit${I}/etc/pkg/FreeBSD.conf; done
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/latest",
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/latest",

RabbitMQ Installation

We can now install RabbitMQ package.

host # for I in 1 2; do jexec rabbit${I} env ASSUME_ALWAYS_YES=yes pkg install -y rabbitmq; echo; done
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:11:amd64/latest, please wait...
Verifying signature with trusted certificate pkg.freebsd.org.2013102301... done
[rabbit1.local] Installing pkg-1.10.5_5...
[rabbit1.local] Extracting pkg-1.10.5_5: 100%
Updating FreeBSD repository catalogue...
pkg: Repository FreeBSD load error: access repo file(/var/db/pkg/repo-FreeBSD.sqlite) failed: No such file or directory
[rabbit1.local] Fetching meta.txz: 100%    944 B   0.9kB/s    00:01    
[rabbit1.local] Fetching packagesite.txz: 100%    6 MiB 745.4kB/s    00:09    
Processing entries: 100%
FreeBSD repository update completed. 32114 packages processed.
All repositories are up to date.
Updating database digests format: 100%
The following 2 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        rabbitmq: 3.7.15
        erlang-runtime19: 21.3.8.2

Number of packages to be installed: 2

The process will require 104 MiB more space.
41 MiB to be downloaded.
[rabbit1.local] [1/2] Fetching rabbitmq-3.7.15.txz: 100%    9 MiB 762.2kB/s    00:12    
[rabbit1.local] [2/2] Fetching erlang-runtime19-21.3.8.2.txz: 100%   33 MiB 978.8kB/s    00:35    
Checking integrity... done (0 conflicting)
[rabbit1.local] [1/2] Installing erlang-runtime19-21.3.8.2...
[rabbit1.local] [1/2] Extracting erlang-runtime19-21.3.8.2: 100%
[rabbit1.local] [2/2] Installing rabbitmq-3.7.15...
===> Creating groups.
Creating group 'rabbitmq' with gid '135'.
===> Creating users
Creating user 'rabbitmq' with uid '135'.
[rabbit1.local] [2/2] Extracting rabbitmq-3.7.15: 100%
Message from erlang-runtime19-21.3.8.2:

===========================================================================

To use this runtime port for development or testing, just prepend
its binary path ("/usr/local/lib/erlang19/bin") to your PATH variable.

===========================================================================

(...)

// SAME MESSAGES FOR THE OTHER rabbit2 JAIL //

Let's verify that the RabbitMQ package has been installed successfully.

host # for I in 1 2; do jexec rabbit${I} which rabbitmqctl; done
/usr/local/sbin/rabbitmqctl
/usr/local/sbin/rabbitmqctl

RabbitMQ Setup

We will now configure /etc/hosts files on our Jails.

host # for I in 1 2; do cat >> /jail/rabbit${I}/etc/hosts << __EOF
192.168.43.101 rabbit1
192.168.43.102 rabbit2

__EOF
done

… and a quick verification.

host # cat /jail/rabbit?/etc/hosts | grep 192.168.43 | sort -n | uniq -c
2 192.168.43.101 rabbit1
2 192.168.43.102 rabbit2

As we have RabbitMQ package installed we need to enable it and start it.

host # jexec rabbit1 /usr/local/etc/rc.d/rabbitmq rcvar
# rabbitmq
#
rabbitmq_enable="NO"
#   (default: "")

As we see we need to set rabbitmq_enable=YES value in /etc/rc.conf file within each of our Jails.

host # for I in 1 2; do jexec rabbit${I} sysrc rabbitmq_enable=YES; done
rabbitmq_enable:  -> YES
rabbitmq_enable:  -> YES

Now we can start the RabbitMQ in the Jails.

host # for I in 1 2; do jexec rabbit${I} service rabbitmq start; done
Starting rabbitmq.
Starting rabbitmq.

Now we have two RabbitMQ instances up and running.

RabbitMQ Plugins

This is the list of plugins enabled by default: none.

rabbit1 # rabbitmq-plugins list
 Configured: E = explicitly enabled; e = implicitly enabled
 | Status: * = running on rabbit@rabbit1
 |/
[  ] rabbitmq_amqp1_0                  3.7.15
[  ] rabbitmq_auth_backend_cache       3.7.15
[  ] rabbitmq_auth_backend_http        3.7.15
[  ] rabbitmq_auth_backend_ldap        3.7.15
[  ] rabbitmq_auth_mechanism_ssl       3.7.15
[  ] rabbitmq_consistent_hash_exchange 3.7.15
[  ] rabbitmq_event_exchange           3.7.15
[  ] rabbitmq_federation               3.7.15
[  ] rabbitmq_federation_management    3.7.15
[  ] rabbitmq_jms_topic_exchange       3.7.15
[  ] rabbitmq_management               3.7.15
[  ] rabbitmq_management_agent         3.7.15
[  ] rabbitmq_mqtt                     3.7.15
[  ] rabbitmq_peer_discovery_aws       3.7.15
[  ] rabbitmq_peer_discovery_common    3.7.15
[  ] rabbitmq_peer_discovery_consul    3.7.15
[  ] rabbitmq_peer_discovery_etcd      3.7.15
[  ] rabbitmq_peer_discovery_k8s       3.7.15
[  ] rabbitmq_random_exchange          3.7.15
[  ] rabbitmq_recent_history_exchange  3.7.15
[  ] rabbitmq_sharding                 3.7.15
[  ] rabbitmq_shovel                   3.7.15
[  ] rabbitmq_shovel_management        3.7.15
[  ] rabbitmq_stomp                    3.7.15
[  ] rabbitmq_top                      3.7.15
[  ] rabbitmq_tracing                  3.7.15
[  ] rabbitmq_trust_store              3.7.15
[  ] rabbitmq_web_dispatch             3.7.15
[  ] rabbitmq_web_mqtt                 3.7.15
[  ] rabbitmq_web_mqtt_examples        3.7.15
[  ] rabbitmq_web_stomp                3.7.15
[  ] rabbitmq_web_stomp_examples       3.7.15

Time to enable the web interface plugin.

host # for I in 1 2; do jexec rabbit${I} rabbitmq-plugins enable rabbitmq_management; done
The following plugins have been configured:
  rabbitmq_management
  rabbitmq_management_agent
  rabbitmq_web_dispatch
Applying plugin configuration to rabbit@rabbit1...
The following plugins have been enabled:
  rabbitmq_management
  rabbitmq_management_agent
  rabbitmq_web_dispatch

started 3 plugins.

(...)

// SAME MESSAGES FOR THE OTHER rabbit2 JAIL //

Now we have the web interface plugin enabled in each RabbitMQ FreeBSD Jail.

A capital ‘E’ means that we enabled the plugin explicitly, while a lowercase ‘e’ means that the plugin was enabled implicitly as a dependency of another plugin we requested.

rabbit1 # rabbitmq-plugins list
 Configured: E = explicitly enabled; e = implicitly enabled
 | Status: * = running on rabbit@rabbit1
 |/
[  ] rabbitmq_amqp1_0                  3.7.15
[  ] rabbitmq_auth_backend_cache       3.7.15
[  ] rabbitmq_auth_backend_http        3.7.15
[  ] rabbitmq_auth_backend_ldap        3.7.15
[  ] rabbitmq_auth_mechanism_ssl       3.7.15
[  ] rabbitmq_consistent_hash_exchange 3.7.15
[  ] rabbitmq_event_exchange           3.7.15
[  ] rabbitmq_federation               3.7.15
[  ] rabbitmq_federation_management    3.7.15
[  ] rabbitmq_jms_topic_exchange       3.7.15
[E*] rabbitmq_management               3.7.15
[e*] rabbitmq_management_agent         3.7.15
[  ] rabbitmq_mqtt                     3.7.15
[  ] rabbitmq_peer_discovery_aws       3.7.15
[  ] rabbitmq_peer_discovery_common    3.7.15
[  ] rabbitmq_peer_discovery_consul    3.7.15
[  ] rabbitmq_peer_discovery_etcd      3.7.15
[  ] rabbitmq_peer_discovery_k8s       3.7.15
[  ] rabbitmq_random_exchange          3.7.15
[  ] rabbitmq_recent_history_exchange  3.7.15
[  ] rabbitmq_sharding                 3.7.15
[  ] rabbitmq_shovel                   3.7.15
[  ] rabbitmq_shovel_management        3.7.15
[  ] rabbitmq_stomp                    3.7.15
[  ] rabbitmq_top                      3.7.15
[  ] rabbitmq_tracing                  3.7.15
[  ] rabbitmq_trust_store              3.7.15
[e*] rabbitmq_web_dispatch             3.7.15
[  ] rabbitmq_web_mqtt                 3.7.15
[  ] rabbitmq_web_mqtt_examples        3.7.15
[  ] rabbitmq_web_stomp                3.7.15
[  ] rabbitmq_web_stomp_examples       3.7.15
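
The flag column can also be read programmatically, which is handy when checking many Jails at once. Below is a minimal Go sketch that assumes the line format shown in the listing above; the pluginState type and the parsePluginLine helper are hypothetical names made up for this illustration, not part of RabbitMQ.

```go
package main

import (
	"fmt"
	"strings"
)

// pluginState describes one entry of the `rabbitmq-plugins list` output.
// This type is a hypothetical helper for illustration only.
type pluginState struct {
	Name     string
	Version  string
	Explicit bool // 'E': explicitly enabled
	Implicit bool // 'e': implicitly enabled as a dependency
	Running  bool // '*': running on the node
}

// parsePluginLine parses a line such as "[E*] rabbitmq_management 3.7.15".
// It returns false for lines that are not plugin entries (legend, blanks).
func parsePluginLine(line string) (pluginState, bool) {
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, "[") {
		return pluginState{}, false
	}
	end := strings.Index(line, "]")
	if end < 0 {
		return pluginState{}, false
	}
	flags := line[1:end]
	fields := strings.Fields(line[end+1:])
	if len(fields) < 2 {
		return pluginState{}, false
	}
	return pluginState{
		Name:     fields[0],
		Version:  fields[1],
		Explicit: strings.Contains(flags, "E"),
		Implicit: strings.Contains(flags, "e"),
		Running:  strings.Contains(flags, "*"),
	}, true
}

func main() {
	// Three lines copied from the listing above.
	sample := []string{
		"[E*] rabbitmq_management               3.7.15",
		"[e*] rabbitmq_management_agent         3.7.15",
		"[  ] rabbitmq_mqtt                     3.7.15",
	}
	for _, l := range sample {
		if p, ok := parsePluginLine(l); ok {
			fmt.Printf("%-28s explicit=%t implicit=%t running=%t\n",
				p.Name, p.Explicit, p.Implicit, p.Running)
		}
	}
}
```

Legend lines such as ‘Configured: …’ do not start with a bracket, so the helper skips them.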

Now – in order to create a cluster – we need these RabbitMQ instances to share the same ERLANG cookie. On a FreeBSD system the ERLANG cookie can be found at /var/db/rabbitmq/.erlang.cookie.

rabbit1 # cat /var/db/rabbitmq/.erlang.cookie; echo
NOEVQNXJDNLAJOSVWNIW
rabbit1 # 

We need to stop RabbitMQ before changing the ERLANG cookie.

host # for I in 1 2; do jexec rabbit${I} service rabbitmq stop; done
Stopping rabbitmq.
Waiting for PIDS: 88684.
Stopping rabbitmq.
Waiting for PIDS: 20976.

Let’s set the same ERLANG cookie on each FreeBSD Jail then.

host # for I in 1 2; do cat > /jail/rabbit${I}/var/db/rabbitmq/.erlang.cookie << __EOF
RABBITMQFREEBSDJAILS
__EOF
done

… and now we need to start them again.

host # for I in 1 2; do jexec rabbit${I} service rabbitmq start; done
Starting rabbitmq.
Starting rabbitmq.

Fast verification.

host # for I in 1 2; do jexec rabbit${I} cat /var/db/rabbitmq/.erlang.cookie; done
RABBITMQFREEBSDJAILS
RABBITMQFREEBSDJAILS

RabbitMQ Administrative User

Now we will create an administrative user called admin for the RabbitMQ instances.

host # for I in 1 2; do jexec rabbit${I} rabbitmqctl add_user admin ADMINPASSWORD; done
Adding user "admin" ...
Adding user "admin" ...

host # for I in 1 2; do jexec rabbit${I} rabbitmqctl set_user_tags admin administrator; done
Setting tags for user "admin" to [administrator] ...
Setting tags for user "admin" to [administrator] ...

host # for I in 1 2; do jexec rabbit${I} rabbitmqctl set_permissions -p / admin ".*" ".*" ".*" ; done
Setting permissions for user "admin" in vhost "/" ...
Setting permissions for user "admin" in vhost "/" ...

We should now be able to log in to the RabbitMQ management page at http://192.168.43.101:15672/ (or at http://10.0.0.101:15672/).

01-rabbitmq-login.png

After logging in, a useful RabbitMQ dashboard will welcome you.

02-rabbitmq-dashboard.png

RabbitMQ Cluster Setup

We will now create the RabbitMQ cluster.

rabbit1 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]}]},
 {running_nodes,[rabbit@rabbit1]},
 {cluster_name,},
 {partitions,[]},
 {alarms,[{rabbit@rabbit1,[]}]}]

rabbit2 # hostname
rabbit2.local

rabbit2 # rabbitmqctl join_cluster rabbit@rabbit1
Error: this command requires the 'rabbit' app to be stopped on the target node. Stop it with 'rabbitmqctl stop_app'.
Arguments given:
        join_cluster rabbit@rabbit1

Usage

rabbitmqctl [--node ] [--longnames] [--quiet] join_cluster [--disc|--ram] 

We first need to stop the RabbitMQ ‘application’ to join the cluster.

rabbit2 # rabbitmqctl stop_app
Stopping rabbit application on node rabbit@rabbit2 ...

rabbit2 # rabbitmqctl join_cluster rabbit@rabbit1
Clustering node rabbit@rabbit2 with rabbit@rabbit1

rabbit2 # rabbitmqctl start_app
Starting node rabbit@rabbit2 ...
 completed with 5 plugins.

rabbit2 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]},
 {cluster_name,},
 {partitions,[]},
 {alarms,[{rabbit@rabbit1,[]},{rabbit@rabbit2,[]}]}]

rabbit1 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]},
 {cluster_name,},
 {partitions,[]},
 {alarms,[{rabbit@rabbit2,[]},{rabbit@rabbit1,[]}]}]

We have now formed a two-node RabbitMQ cluster. Next we will rename it to rabbit@cluster.

rabbit1 # rabbitmqctl set_cluster_name rabbit@cluster
Setting cluster name to rabbit@cluster ...

rabbit1 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]},
 {cluster_name,<<"rabbit@cluster">>},
 {partitions,[]},
 {alarms,[{rabbit@rabbit2,[]},{rabbit@rabbit1,[]}]}]

Here is how our cluster looks in the web interface.

08-rabbitmq-cluster.png

RabbitMQ Highly Available Policy

To have Highly Available (Mirrored) Queues in RabbitMQ you need to create a Policy. We will declare a Policy named ha that matches queues whose names begin with the ‘ha-’ prefix, so they will be mirrored to both nodes in the cluster.

This is the command you need to execute to create such Policy.

rabbit1 # rabbitmqctl set_policy ha "^ha-" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
Setting policy "ha" for pattern "^ha-" to "{"ha-mode":"all","ha-sync-mode":"automatic"}" with priority "0" for vhost "/" ...

… or alternatively you can use the web interface to create it.

No matter which method you choose, you will end up with the needed ha Policy as shown below.

03-rabbitmq-policy.png
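
Before relying on a Policy it is worth checking which queue names its pattern really matches. Here is a small Go sketch for a prefix pattern such as ^ha-. It is only an approximation: RabbitMQ evaluates Policy patterns with Erlang regular expressions while Go uses RE2 syntax, but for a simple anchored prefix both behave the same. The matchesPolicy helper and all queue names except ha-default are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesPolicy reports whether a queue name matches a policy pattern.
// The pattern is not anchored at the end, so "^ha-" acts as a prefix test.
func matchesPolicy(pattern, queue string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(queue), nil
}

func main() {
	// Only ha-default comes from the article; the rest are made-up names.
	queues := []string{"ha-default", "ha-orders", "default", "alpha-ha-"}
	for _, q := range queues {
		ok, err := matchesPolicy("^ha-", q)
		if err != nil {
			fmt.Println("ER:", err)
			return
		}
		fmt.Printf("%-12s mirrored=%t\n", q, ok)
	}
}
```

Only names that start with ha- match, so a queue called default or alpha-ha- would stay unmirrored under such a Policy.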

Feed the Queue

We now have a two-node RabbitMQ cluster with HA for queues whose names start with the ha- prefix. We will now test our RabbitMQ setup by creating and feeding a queue with the send.go script which, as you probably guessed, is written in Go. First we need to add the Go language to our host system.

Go Language Installation

host # pkg install go
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        go: 1.12.5,1

Number of packages to be installed: 1

The process will require 262 MiB more space.
75 MiB to be downloaded.

Proceed with this action? [y/N]: y
(...)

host % go version
go version go1.12.5 freebsd/amd64

This is the send.go script – we will use it to send 10 messages to the ha-default queue. It is based on the RabbitMQ Hello World tutorial.

host % cat send.go
package main

import (
  "log"
  "amqp"
)

func FAIL_ON_ERROR(err error, msg string) {
  if err != nil {
    log.Fatalf("%s: %s", msg, err)
  }
}

func main() {
  conn, err := amqp.Dial("amqp://admin:ADMINPASSWORD@10.0.0.101:5672/")
  FAIL_ON_ERROR(err, "ER: failed to connect to RabbitMQ")
  defer conn.Close()

  ch, err := conn.Channel()
  FAIL_ON_ERROR(err, "ER: failed to open channel")
  defer ch.Close()

  q, err := ch.QueueDeclare(
    "ha-default", // name
    false,        // durable
    false,        // delete when unused
    false,        // exclusive
    false,        // no-wait
    nil,          // arguments
  )
  FAIL_ON_ERROR(err, "ER: failed to declare queue")

  body := "Hello World!"

  for i := 1; i <= 10; i++ {
    err = ch.Publish(
      "",     // exchange
      q.Name, // routing key
      false,  // mandatory
      false,  // immediate
      amqp.Publishing{
        ContentType: "text/plain",
        Body:        []byte(body),
      })
    // check the publish result before logging success
    FAIL_ON_ERROR(err, "ER: failed to publish message")
    log.Printf("IN: sent message '%s' (%d)", body, i)
  }
}
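
The connection string passed to amqp.Dial is a plain URL, so a password containing characters such as ‘@’ or ‘/’ has to be percent-encoded. A minimal Go sketch of building it with the standard net/url package follows; the amqpURL helper and the second password are hypothetical and only illustrate the escaping.

```go
package main

import (
	"fmt"
	"net/url"
)

// amqpURL builds an AMQP connection URL with the trailing "/" path
// used in send.go, percent-encoding the credentials where needed.
func amqpURL(user, pass, hostport string) string {
	u := url.URL{
		Scheme: "amqp",
		User:   url.UserPassword(user, pass),
		Host:   hostport,
		Path:   "/",
	}
	return u.String()
}

func main() {
	// Same values as in send.go.
	fmt.Println(amqpURL("admin", "ADMINPASSWORD", "10.0.0.101:5672"))
	// A made-up password with characters that must be escaped.
	fmt.Println(amqpURL("admin", "p@ss/word", "10.0.0.101:5672"))
}
```

The resulting string can be passed to amqp.Dial unchanged.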


We will now run it.

host % go run send.go
send.go:5:3: cannot find package "amqp" in any of:
        /usr/local/go/src/amqp (from $GOROOT)
        /home/vermaden/.gopkg/src/amqp (from $GOPATH)

We lack the amqp package for the Go language.

We need to download it from its https://github.com/streadway/amqp page. We will get it by downloading the whole repository as a ZIP archive.

host % mkdir -p ~/.gopkg/src
host % cd !$
host % pwd
/home/vermaden/.gopkg/src
host % fetch https://github.com/streadway/amqp/archive/master.zip
host % unzip master.zip 
Archive:  /home/vermaden/.gopkg/src/master.zip
   creating: amqp-master/
 extracting: amqp-master/.gitignore
 extracting: amqp-master/.travis.yml
 (...)
 extracting: amqp-master/uri.go
 extracting: amqp-master/uri_test.go
 extracting: amqp-master/write.go
host % rm master.zip
host % mv amqp-master amqp
host % cd amqp
host % pwd
/home/vermaden/.gopkg/src/amqp
host % exa
_examples          confirms.go         delivery_test.go        LICENSE            spec091.go
spec               confirms_test.go    doc.go                  pre-commit         tls_test.go
allocator.go       connection.go       example_client_test.go  read.go            types.go
allocator_test.go  connection_test.go  examples_test.go        read_test.go       uri.go
auth.go            consumers.go        fuzz.go                 README.md          uri_test.go
certs.sh           consumers_test.go   gen.sh                  reconnect_test.go  write.go
channel.go         CONTRIBUTING.md     go.mod                  return.go          
client_test.go     delivery.go         integration_test.go     shared_test.go     

We also need to make sure that GOPATH and PATH are properly configured. To do so, put these lines in your interactive shell config. Note that ~ is not expanded inside double quotes, so ${HOME} is used instead.

# GO SHELL SETUP
mkdir -p ~/.gopkg
export GOPATH="${HOME}/.gopkg"
export PATH="${PATH}:${GOPATH}/bin"

Now we can get back to feeding our queue.

host % go run send.go
2019/06/05 13:53:59 IN: sent message 'Hello World!' (1)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (2)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (3)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (4)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (5)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (6)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (7)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (8)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (9)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (10)
% 

The ha-default queue has been created and fed with 10 messages.

04-rabbitmq-queue

Now we need to ‘receive’ these messages from the queue. This is where the receive.go script comes in. It is also based on the RabbitMQ Hello World tutorial.

host % cat receive.go
package main

import (
  "log"
  "amqp"
)

func FAIL_ON_ERROR(err error, msg string) {
  if err != nil {
    log.Fatalf("%s: %s", msg, err)
  }
}

func main() {
  conn, err := amqp.Dial("amqp://admin:ADMINPASSWORD@10.0.0.102:5672/")
  FAIL_ON_ERROR(err, "ER: failed to connect to RabbitMQ")
  defer conn.Close()

  ch, err := conn.Channel()
  FAIL_ON_ERROR(err, "ER: failed to open channel")
  defer ch.Close()

  q, err := ch.QueueDeclare(
    "ha-default", // name
    false,        // durable
    false,        // delete when unused
    false,        // exclusive
    false,        // no-wait
    nil,          // arguments
  )
  FAIL_ON_ERROR(err, "ER: failed to declare queue")

  msgs, err := ch.Consume(
    q.Name, // queue
    "",     // consumer
    true,   // auto-ack
    false,  // exclusive
    false,  // no-local
    false,  // no-wait
    nil,    // args
  )
  FAIL_ON_ERROR(err, "ER: failed to register consumer")

  forever := make(chan bool)

  go func() {
    for d := range msgs {
      log.Printf("IN: received message: %s", d.Body)
    }
  }()

  log.Printf("IN: waiting for messages")
  log.Printf("IN: to exit press CTRL+C")
  <-forever
}

Here is its output. It will keep running until you stop it with the CTRL-C sequence.

host % go run receive.go
2019/06/05 13:54:34 IN: waiting for messages
2019/06/05 13:54:34 IN: to exit press CTRL+C
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
^C
%

If you checked the source code carefully then you probably noticed that I ‘sent’ messages to the rabbit1 node (10.0.0.101) while I ‘received’ the messages at the rabbit2 node (10.0.0.102).

Simple Benchmark

We will now run a simple benchmark with the receive.go script left running and a modified send.go script whose for loop sends 100000 messages.

host % go run receive.go
2019/06/05 13:52:34 IN: waiting for messages
2019/06/05 13:52:34 IN: to exit press CTRL+C

… and now the messages.

host % go run send.go
2019/06/05 13:53:59 IN: sent message 'Hello World!' (1)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (2)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (3)
(...)
2019/06/05 13:56:26 IN: sent message 'Hello World!' (99998)
2019/06/05 13:56:26 IN: sent message 'Hello World!' (99999)
2019/06/05 13:56:26 IN: sent message 'Hello World!' (100000)
% 

The results of this simple benchmark are below.

05-rabbitmq-benchmark.png

About 4000-5000 messages per second are handled by this RabbitMQ clustered instance within two FreeBSD Jails.

High Availability

Now we will test the high availability of our RabbitMQ cluster.

Currently the ha-default queue is on the rabbit1 node. We will now kill the rabbit1 Jail and see how the RabbitMQ web interface reacts.

host # jls
   JID  IP Address      Hostname                      Path
     1  192.168.43.101  rabbit1.local                 /jail/rabbit1
     2  192.168.43.102  rabbit2.local                 /jail/rabbit2

host # killall -9 -j 1

host # umount /jail/rabbit1/dev

Within a matter of seconds our ha-default queue switched to the rabbit2 node – HA works as desired.

06-rabbitmq-ha-node-fail.png

Let’s start rabbit1 Jail to get redundancy back.

host # service jail onestart rabbit1
Starting jails: rabbit1.
host # 

07-rabbitmq-ha-node-back.png

The ha-default queue got redundancy back with +1 mark but it remained on the rabbit2 node.

… and last but not least – little anniversary at the end – this is the 50th article (not counting Valuable News series) on my blog 🙂

UPDATE 1 – This Month in RabbitMQ

The RabbitMQ Cluster on FreeBSD Containers article was featured in the This Month in RabbitMQ – July 2019 episode.

Thanks for mentioning!

UPDATE 2 – Make RabbitMQ Use Less CPU

As reported by Felix Ehlers on Twitter – the RabbitMQ CPU usage will be reduced by setting RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none" variable.

EOF

Valuable News – 2019/06/03

The Valuable News weekly series is dedicated to providing a summary of news, articles and other interesting stuff mostly but not always related to UNIX or BSD systems. Whenever I stumble upon something worth mentioning on the Internet I just put it here.

Today the amount of information that we get from various information streams is at massive overload. Thus one needs to focus only on what is important without the need to grep(1) the Internet every day. Hence the idea of providing such information in ‘bulk’ as I already do that grep(1).

UNIX

OmniOS Installation in the Amazon Cloud.
https://omniosce.org/setup/aws

Golang Web Server in chroot(8) in OpenBSD.
https://vetelko.gitlab.io/openbsd-golang-web-server-chroot.html

Docker Bug Allows Root Access to Host File System.
https://duo.com/decipher/docker-bug-allows-root-access-to-host-file-system

Broken by Default – You Should Avoid Most DOCKERFILE Examples.
https://pythonspeed.com/articles/dockerizing-python-is-hard/

Running WordPress on OpenBSD.
https://besirovic.com/post/wordpress-on-openbsd/

FreeBSD 12-STABLE has Working Console on EC2 a1.* and *.metal Instances.
https://twitter.com/cperciva/status/1133499607835537408
https://svnweb.freebsd.org/base?view=revision&revision=348342

BSD Now 300 – Big Three.
https://www.bsdnow.tv/300

FreeBSD Foundation 2019 Q1 Status Update.
https://www.freebsdfoundation.org/blog/freebsd-foundation-q1-2019-status-update/

Smartisan.com made $400,000 CAD donation to OpenBSD.
https://twitter.com/canadianbryan/status/1134442873716387840
https://www.openbsdfoundation.org/contributors.html

DragonFly BSD | FreeBSD | Linux Benchmarks on AMD Threadripper.
https://www.phoronix.com/scan.php?page=article&item=dragonfly-55-threadripper

Drupal on OpenBSD.
https://dev.to/nabbisen/drupal-on-openbsd-4n39

Jailer is simple Proof of Concept of build system for building FreeBSD Jails from JAILFILES.
https://gitlab.com/kwiat/jailer

FreeBSD 11.3-BETA2 Available.
https://lists.freebsd.org/pipermail/freebsd-stable/2019-May/091227.html

GOG.com Summer Sale – OpenBSD Highlights.
https://www.reddit.com/r/openbsd_gaming/comments/bvagkt/gogcom_summer_sale_openbsd_highlights/

Check Hard Drive Health on FreeBSD.
https://www.cyberciti.biz/faq/how-to-check-hard-drive-health-on-freebsd/

Hardware

USB Stick as SSD – New Silicon Motion SM3282 Single-Chip Controller for USB SSDs.
https://www.anandtech.com/show/14439/silicon-motion-sm3282-usb-ssds

Life

Kids of 1% are 10 Times More Likely to Become Inventors.
https://bigthink.com/technology-innovation/lost-einsteins-which-kids-become-innovators

Twitter Bans Analyst Who Revealed AntiFa Connections with Journalists.
https://humanevents.com/2019/05/29/twitter-bans-analyst-who-revealed-journalists-antifa-connections/

Google uses Gmail to Track History of Things You Buy.
https://www.cnbc.com/2019/05/17/google-gmail-tracks-purchase-history-how-to-delete-it.html

USA Demands Social Media Details from VISA Applicants.
https://www.bbc.com/news/world-us-canada-48486672

Other

Why I Still Use jQuery in 2019.
https://arp242.net/jquery.html

Temporary Staging Ground for Firefox ppc64 JIT.
https://github.com/classilla/jitpower

Block Fingerprinting with Firefox.
https://blog.mozilla.org/firefox/how-to-block-fingerprinting-with-firefox/

Switch from Chrome to Firefox in Just Few Minutes.
https://www.mozilla.org/en-US/firefox/switch/

Google Just Gave 2 Billion Chrome Users Reason to Switch to Firefox.
https://www.forbes.com/sites/kateoflahertyuk/2019/05/30/google-just-gave-2-billion-chrome-users-a-reason-to-switch-to-firefox/#1b219e42751f

Serious Google Cloud Platform Outage – Status Dashboard.
https://status.cloud.google.com/incident/compute/19003

EOF