Tag Archives: HA

RabbitMQ Cluster on FreeBSD Containers

I really like small and simple dedicated solutions that do one thing and do it really well – maybe it's because I like UNIX that much. A good example of such an approach is the Minio object storage which implements the S3 protocol with distributed clustering, erasure coding and a builtin web interface along with many other features about which I wrote in the Distributed Object Storage with Minio on FreeBSD article.

RabbitMQ is another such example – currently probably the most popular implementation of the AMQP protocol – and it also comes with a small and sleek web interface. The difference is power. Minio comes with a very basic user oriented web interface while most administrative and configuration tasks need to be done from the CLI. The Minio web interface allows one to create/delete buckets and to download/upload files. RabbitMQ has such a sophisticated web interface that after you enable it you do not need the command line anymore. Everything can be accomplished using just the web interface.

rabbitmq-logo.png

Compared to other messaging solutions like ActiveMQ or Apache Kafka it is very popular, as a quick Google Trends query shows.

rabbitmq-trends.jpg

Today I would like to show you RabbitMQ messaging in a quite redundant clustered setup with mirrored queues.

You will find Table of Contents below.

  • Jails Setup
  • RabbitMQ Installation
  • RabbitMQ Setup
  • RabbitMQ Plugins
  • RabbitMQ Administrative User
  • RabbitMQ Cluster Setup
  • RabbitMQ Highly Available Policy
  • Feed the Queue
  • Go Language Installation
  • Simple Benchmark
  • High Availability
  • UPDATE 1 – This Month in RabbitMQ
  • UPDATE 2 – Make RabbitMQ Use Less CPU

From all the virtualization options available on FreeBSD (VirtualBox/Bhyve/QEMU/Jails/Docker) I have chosen the lightweight FreeBSD Containers – Jails πŸ™‚

The legend is the same as usual.

Command run on the host system as root user.

host # command

Command run on the host system as regular user.

host % command

Command run on the rabbitX Jail.

rabbitX # command

Jails Setup

First we will create the base Jails for our setup. Both the host system and the Jail Containers use the FreeBSD 11.2-RELEASE system.

host # mkdir -p /jail/BASE
host # fetch -o /jail/BASE/11.2-RELEASE.base.txz http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/base.txz
host # for I in 1 2; do echo ${I}; mkdir -p /jail/rabbit${I}; tar --unlink -xpJf /jail/BASE/11.2-RELEASE.base.txz -C /jail/rabbit${I}; done
1
2
host #

We now have 2 empty clean Jails.

We will now add Jails configuration to the /etc/jail.conf file.

I have used my laptop as the Jail host, thus the Jails will be configured to use the wireless wlan0 interface and 192.168.43.10X addresses. I have also added 10.0.0.10X network addresses as this will be more convenient for the purposes of writing this article.

host # for I in 1 2
do
  cat >> /etc/jail.conf << __EOF
rabbit${I} {
  host.hostname = rabbit${I}.local;
  ip4.addr += 192.168.43.10${I};
  ip4.addr += 10.0.0.10${I};
  interface = wlan0;
  path = /jail/rabbit${I};
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.clean;
  mount.devfs;
  allow.raw_sockets;
}

__EOF
done
host #

This is how the /etc/jail.conf file looks after it is configured.

host # cat /etc/jail.conf
rabbit1 {
  host.hostname = rabbit1.local;
  ip4.addr += 192.168.43.101;
  ip4.addr += 10.0.0.101;
  interface = wlan0;
  path = /jail/rabbit1;
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.clean;
  mount.devfs;
  allow.raw_sockets;
}

rabbit2 {
  host.hostname = rabbit2.local;
  ip4.addr += 192.168.43.102;
  ip4.addr += 10.0.0.102;
  interface = wlan0;
  path = /jail/rabbit2;
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.clean;
  mount.devfs;
  allow.raw_sockets;
}

Now we can start our Jails.

host # for I in 1 2; do service jail onestart rabbit${I}; done
Starting jails: rabbit1.
Starting jails: rabbit2.

Jails are running properly.

# jls
   JID  IP Address      Hostname                      Path
     1  192.168.43.101  rabbit1.local                 /jail/rabbit1
     2  192.168.43.102  rabbit2.local                 /jail/rabbit2

Time to add a DNS server to the Jails so they will have Internet connectivity.

host # for I in 1 2; do echo 'nameserver 1.1.1.1' > /jail/rabbit${I}/etc/resolv.conf; done

host # for I in 1 2; do cat /jail/rabbit${I}/etc/resolv.conf; done
nameserver 1.1.1.1
nameserver 1.1.1.1

Now we will switch from 'quarterly' to 'latest' packages.

host # for I in 1 2; do sed -i '' s/quarterly/latest/g /jail/rabbit${I}/etc/pkg/FreeBSD.conf; done

host # for I in 1 2; do grep latest /jail/rabbit${I}/etc/pkg/FreeBSD.conf; done
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/latest",
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/latest",

RabbitMQ Installation

We can now install RabbitMQ package.

host # for I in 1 2; do jexec rabbit${I} env ASSUME_ALWAYS_YES=yes pkg install -y rabbitmq; echo; done
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:11:amd64/latest, please wait...
Verifying signature with trusted certificate pkg.freebsd.org.2013102301... done
[rabbit1.local] Installing pkg-1.10.5_5...
[rabbit1.local] Extracting pkg-1.10.5_5: 100%
Updating FreeBSD repository catalogue...
pkg: Repository FreeBSD load error: access repo file(/var/db/pkg/repo-FreeBSD.sqlite) failed: No such file or directory
[rabbit1.local] Fetching meta.txz: 100%    944 B   0.9kB/s    00:01    
[rabbit1.local] Fetching packagesite.txz: 100%    6 MiB 745.4kB/s    00:09    
Processing entries: 100%
FreeBSD repository update completed. 32114 packages processed.
All repositories are up to date.
Updating database digests format: 100%
The following 2 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        rabbitmq: 3.7.15
        erlang-runtime19: 21.3.8.2

Number of packages to be installed: 2

The process will require 104 MiB more space.
41 MiB to be downloaded.
[rabbit1.local] [1/2] Fetching rabbitmq-3.7.15.txz: 100%    9 MiB 762.2kB/s    00:12    
[rabbit1.local] [2/2] Fetching erlang-runtime19-21.3.8.2.txz: 100%   33 MiB 978.8kB/s    00:35    
Checking integrity... done (0 conflicting)
[rabbit1.local] [1/2] Installing erlang-runtime19-21.3.8.2...
[rabbit1.local] [1/2] Extracting erlang-runtime19-21.3.8.2: 100%
[rabbit1.local] [2/2] Installing rabbitmq-3.7.15...
===> Creating groups.
Creating group 'rabbitmq' with gid '135'.
===> Creating users
Creating user 'rabbitmq' with uid '135'.
[rabbit1.local] [2/2] Extracting rabbitmq-3.7.15: 100%
Message from erlang-runtime19-21.3.8.2:

===========================================================================

To use this runtime port for development or testing, just prepend
its binary path ("/usr/local/lib/erlang19/bin") to your PATH variable.

===========================================================================

(...)

// SAME MESSAGES FOR THE OTHER rabbit2 JAIL //

Let's verify that the RabbitMQ package has been installed successfully.

host # for I in 1 2; do jexec rabbit${I} which rabbitmqctl; done
/usr/local/sbin/rabbitmqctl
/usr/local/sbin/rabbitmqctl

RabbitMQ Setup

We will now configure /etc/hosts files on our Jails.

host # for I in 1 2; do cat >> /jail/rabbit${I}/etc/hosts << __EOF
192.168.43.101 rabbit1
192.168.43.102 rabbit2

__EOF
done

… and a quick verification.

host # cat /jail/rabbit?/etc/hosts | grep 192.168.43 | sort -n | uniq -c
2 192.168.43.101 rabbit1
2 192.168.43.102 rabbit2

Now that we have the RabbitMQ package installed we need to enable it and start it.

host # jexec rabbit1 /usr/local/etc/rc.d/rabbitmq rcvar
# rabbitmq
#
rabbitmq_enable="NO"
#   (default: "")

As we can see we need to set the rabbitmq_enable=YES value in the /etc/rc.conf file within each of our Jails.

host # for I in 1 2; do jexec rabbit${I} sysrc rabbitmq_enable=YES; done
rabbitmq_enable:  -> YES
rabbitmq_enable:  -> YES

Now we can start the RabbitMQ in the Jails.

host # for I in 1 2; do jexec rabbit${I} service rabbitmq start; done
Starting rabbitmq.
Starting rabbitmq.

Now we have two RabbitMQ instances up and running.

RabbitMQ Plugins

This is the list of available plugins – none of them are enabled by default.

rabbit1 # rabbitmq-plugins list
 Configured: E = explicitly enabled; e = implicitly enabled
 | Status: * = running on rabbit@rabbit1
 |/
[  ] rabbitmq_amqp1_0                  3.7.15
[  ] rabbitmq_auth_backend_cache       3.7.15
[  ] rabbitmq_auth_backend_http        3.7.15
[  ] rabbitmq_auth_backend_ldap        3.7.15
[  ] rabbitmq_auth_mechanism_ssl       3.7.15
[  ] rabbitmq_consistent_hash_exchange 3.7.15
[  ] rabbitmq_event_exchange           3.7.15
[  ] rabbitmq_federation               3.7.15
[  ] rabbitmq_federation_management    3.7.15
[  ] rabbitmq_jms_topic_exchange       3.7.15
[  ] rabbitmq_management               3.7.15
[  ] rabbitmq_management_agent         3.7.15
[  ] rabbitmq_mqtt                     3.7.15
[  ] rabbitmq_peer_discovery_aws       3.7.15
[  ] rabbitmq_peer_discovery_common    3.7.15
[  ] rabbitmq_peer_discovery_consul    3.7.15
[  ] rabbitmq_peer_discovery_etcd      3.7.15
[  ] rabbitmq_peer_discovery_k8s       3.7.15
[  ] rabbitmq_random_exchange          3.7.15
[  ] rabbitmq_recent_history_exchange  3.7.15
[  ] rabbitmq_sharding                 3.7.15
[  ] rabbitmq_shovel                   3.7.15
[  ] rabbitmq_shovel_management        3.7.15
[  ] rabbitmq_stomp                    3.7.15
[  ] rabbitmq_top                      3.7.15
[  ] rabbitmq_tracing                  3.7.15
[  ] rabbitmq_trust_store              3.7.15
[  ] rabbitmq_web_dispatch             3.7.15
[  ] rabbitmq_web_mqtt                 3.7.15
[  ] rabbitmq_web_mqtt_examples        3.7.15
[  ] rabbitmq_web_stomp                3.7.15
[  ] rabbitmq_web_stomp_examples       3.7.15

Time to enable web interface plugin.

host # for I in 1 2; do jexec rabbit${I} rabbitmq-plugins enable rabbitmq_management; done
The following plugins have been configured:
  rabbitmq_management
  rabbitmq_management_agent
  rabbitmq_web_dispatch
Applying plugin configuration to rabbit@rabbit1...
The following plugins have been enabled:
  rabbitmq_management
  rabbitmq_management_agent
  rabbitmq_web_dispatch

started 3 plugins.

(...)

// SAME MESSAGES FOR THE OTHER rabbit2 JAIL //

Now we have web interface plugin enabled in each RabbitMQ FreeBSD Jail.

A big ‘E‘ letter means that this is a plugin that we explicitly enabled and a small ‘e‘ letter means that the plugin is enabled only as a dependency of some other plugin we requested to be enabled.

rabbit1 # rabbitmq-plugins list
 Configured: E = explicitly enabled; e = implicitly enabled
 | Status: * = running on rabbit@rabbit1
 |/
[  ] rabbitmq_amqp1_0                  3.7.15
[  ] rabbitmq_auth_backend_cache       3.7.15
[  ] rabbitmq_auth_backend_http        3.7.15
[  ] rabbitmq_auth_backend_ldap        3.7.15
[  ] rabbitmq_auth_mechanism_ssl       3.7.15
[  ] rabbitmq_consistent_hash_exchange 3.7.15
[  ] rabbitmq_event_exchange           3.7.15
[  ] rabbitmq_federation               3.7.15
[  ] rabbitmq_federation_management    3.7.15
[  ] rabbitmq_jms_topic_exchange       3.7.15
[E*] rabbitmq_management               3.7.15
[e*] rabbitmq_management_agent         3.7.15
[  ] rabbitmq_mqtt                     3.7.15
[  ] rabbitmq_peer_discovery_aws       3.7.15
[  ] rabbitmq_peer_discovery_common    3.7.15
[  ] rabbitmq_peer_discovery_consul    3.7.15
[  ] rabbitmq_peer_discovery_etcd      3.7.15
[  ] rabbitmq_peer_discovery_k8s       3.7.15
[  ] rabbitmq_random_exchange          3.7.15
[  ] rabbitmq_recent_history_exchange  3.7.15
[  ] rabbitmq_sharding                 3.7.15
[  ] rabbitmq_shovel                   3.7.15
[  ] rabbitmq_shovel_management        3.7.15
[  ] rabbitmq_stomp                    3.7.15
[  ] rabbitmq_top                      3.7.15
[  ] rabbitmq_tracing                  3.7.15
[  ] rabbitmq_trust_store              3.7.15
[e*] rabbitmq_web_dispatch             3.7.15
[  ] rabbitmq_web_mqtt                 3.7.15
[  ] rabbitmq_web_mqtt_examples        3.7.15
[  ] rabbitmq_web_stomp                3.7.15
[  ] rabbitmq_web_stomp_examples       3.7.15

Now – in order to create a cluster – we need these RabbitMQ instances to share the same ERLANG cookie. The ERLANG cookie can be found at /var/db/rabbitmq/.erlang.cookie on FreeBSD system.

rabbit1 # cat /var/db/rabbitmq/.erlang.cookie; echo
NOEVQNXJDNLAJOSVWNIW
rabbit1 # 

We will need to stop RabbitMQ to change the ERLANG cookie.

host # for I in 1 2; do jexec rabbit${I} service rabbitmq stop; done
Stopping rabbitmq.
Waiting for PIDS: 88684.
Stopping rabbitmq.
Waiting for PIDS: 20976.

Let’s set the same ERLANG cookie on each FreeBSD Jail then.

host # for I in 1 2; do cat > /jail/rabbit${I}/var/db/rabbitmq/.erlang.cookie << __EOF
RABBITMQFREEBSDJAILS
__EOF
done

… and now we need to start them again.

host # for I in 1 2; do jexec rabbit${I} service rabbitmq start; done
Starting rabbitmq.
Starting rabbitmq.

Fast verification.

host # for I in 1 2; do jexec rabbit${I} cat /var/db/rabbitmq/.erlang.cookie; done
RABBITMQFREEBSDJAILS
RABBITMQFREEBSDJAILS

RabbitMQ Administrative User

Now we will create an administrative user called admin for the RabbitMQ instances.

host # for I in 1 2; do jexec rabbit${I} rabbitmqctl add_user admin ADMINPASSWORD; done
Adding user "admin" ...
Adding user "admin" ...

host # for I in 1 2; do jexec rabbit${I} rabbitmqctl set_user_tags admin administrator; done
Setting tags for user "admin" to [administrator] ...
Setting tags for user "admin" to [administrator] ...

host # for I in 1 2; do jexec rabbit${I} rabbitmqctl set_permissions -p / admin ".*" ".*" ".*" ; done
Setting permissions for user "admin" in vhost "/" ...
Setting permissions for user "admin" in vhost "/" ...

We should now be able to login to the http://192.168.43.101:15672/ (or http://10.0.0.101:15672/) RabbitMQ management page.

01-rabbitmq-login.png

After logging in, a useful RabbitMQ dashboard will welcome you.

02-rabbitmq-dashboard.png
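
The management plugin also exposes an HTTP API on the same port, so the new admin user can be checked from the command line as well – just a quick sketch with curl (assuming the curl package is installed on the host); the /api/overview endpoint returns a JSON summary of the node.

host % curl -s -u admin:ADMINPASSWORD http://192.168.43.101:15672/api/overview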

RabbitMQ Cluster Setup

We will now create the RabbitMQ cluster.

rabbit1 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]}]},
 {running_nodes,[rabbit@rabbit1]},
 {cluster_name,},
 {partitions,[]},
 {alarms,[{rabbit@rabbit1,[]}]}]

rabbit2 # hostname
rabbit2.local

rabbit2 # rabbitmqctl join_cluster rabbit@rabbit1
Error: this command requires the 'rabbit' app to be stopped on the target node. Stop it with 'rabbitmqctl stop_app'.
Arguments given:
        join_cluster rabbit@rabbit1

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] join_cluster [--disc|--ram] <seed-node>

We first need to stop the RabbitMQ ‘application’ to join the cluster.

rabbit2 # rabbitmqctl stop_app
Stopping rabbit application on node rabbit@rabbit2 ...

rabbit2 # rabbitmqctl join_cluster rabbit@rabbit1
Clustering node rabbit@rabbit2 with rabbit@rabbit1

rabbit2 # rabbitmqctl start_app
Starting node rabbit@rabbit2 ...
 completed with 5 plugins.

rabbit2 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]},
 {cluster_name,},
 {partitions,[]},
 {alarms,[{rabbit@rabbit1,[]},{rabbit@rabbit2,[]}]}]

rabbit1 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]},
 {cluster_name,},
 {partitions,[]},
 {alarms,[{rabbit@rabbit2,[]},{rabbit@rabbit1,[]}]}]

Now we have formed a two node RabbitMQ cluster. We will now rename it to rabbit@cluster.

rabbit1 # rabbitmqctl set_cluster_name rabbit@cluster
Setting cluster name to rabbit@cluster ...

rabbit1 # rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]},
 {cluster_name,<<"rabbit@cluster">>},
 {partitions,[]},
 {alarms,[{rabbit@rabbit2,[]},{rabbit@rabbit1,[]}]}]

Here is how our cluster looks in the web interface.

08-rabbitmq-cluster.png

RabbitMQ Highly Available Policy

To have Highly Available (Mirrored) Queues in RabbitMQ you need to create a Policy. We will declare a Policy named ha which will match queues whose names begin with the ‘ha-‘ prefix so they will be mirrored across both nodes in the cluster.

This is the command you need to execute to create such a Policy.

rabbit1 # rabbitmqctl set_policy ha "^ha-\.*" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
Setting policy "ha" for pattern "^ha-\.*" to "{"ha-mode":"all","ha-sync-mode":"automatic"}" with priority "0" for vhost "/" ...

… or alternatively you can use the web interface to create it.

No matter which method you have chosen you will end up with the needed ha Policy as shown below.

03-rabbitmq-policy.png
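
If you prefer the CLI over screenshots, the policy can also be verified with rabbitmqctl (the exact output format differs between RabbitMQ versions, so treat this as a sketch):

rabbit1 # rabbitmqctl list_policies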

Feed the Queue

We now have a two node RabbitMQ cluster with HA for queues whose names start with the ha- prefix. We will now test our RabbitMQ setup by creating and feeding a queue with the send.go script – as you probably guessed – written in Go. We will need to add the Go language to our host system.

Go Language Installation

host # pkg install go
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        go: 1.12.5,1

Number of packages to be installed: 1

The process will require 262 MiB more space.
75 MiB to be downloaded.

Proceed with this action? [y/N]: y
(...)

host % go version
go version go1.12.5 freebsd/amd64

This is the send.go script – we will use it to send 10 messages to the ha-default queue. It is based on the RabbitMQ Hello World tutorial.

host % cat send.go
package main

import (
  "log"
  "amqp"
)

func FAIL_ON_ERROR(err error, msg string) {
  if err != nil {
    log.Fatalf("%s: %s", msg, err)
  }
}

func main() {
  conn, err := amqp.Dial("amqp://admin:ADMINPASSWORD@10.0.0.101:5672/")
  FAIL_ON_ERROR(err, "ER: failed to connect to RabbitMQ")
  defer conn.Close()

  ch, err := conn.Channel()
  FAIL_ON_ERROR(err, "ER: failed to open channel")
  defer ch.Close()

  q, err := ch.QueueDeclare(
    "ha-default", // name
    false,        // durable
    false,        // delete when unused
    false,        // exclusive
    false,        // no-wait
    nil,          // arguments
  )
  FAIL_ON_ERROR(err, "ER: failed to declare queue")

  body := "Hello World!"

  for i := 1; i <= 10; i++ {
    err = ch.Publish(
      "",     // exchange
      q.Name, // routing key
      false,  // mandatory
      false,  // immediate
      amqp.Publishing{
        ContentType: "text/plain",
        Body:        []byte(body),
      })
    log.Printf("IN: sent message '%s' (%d)", body, i)
    FAIL_ON_ERROR(err, "ER: failed to publish message")
  }

}


We will now run it.

host % go run send.go
send.go:5:3: cannot find package "amqp" in any of:
        /usr/local/go/src/amqp (from $GOROOT)
        /home/vermaden/.gopkg/src/amqp (from $GOPATH)

We lack the amqp package for the Go language.

We will need to download it from its https://github.com/streadway/amqp page. We will get it by downloading everything as a ZIP archive.

host % mkdir -p ~/.gopkg/src
host % cd !$
host % pwd
/home/vermaden/.gopkg/src
host % fetch https://github.com/streadway/amqp/archive/master.zip
host % unzip master.zip 
Archive:  /home/vermaden/.gopkg/src/master.zip
   creating: amqp-master/
 extracting: amqp-master/.gitignore
 extracting: amqp-master/.travis.yml
 (...)
 extracting: amqp-master/uri.go
 extracting: amqp-master/uri_test.go
 extracting: amqp-master/write.go
host % rm master.zip
host % mv amqp-master amqp
host % cd amqp
host % pwd
/home/vermaden/.gopkg/src/amqp
host % exa
_examples          confirms.go         delivery_test.go        LICENSE            spec091.go
spec               confirms_test.go    doc.go                  pre-commit         tls_test.go
allocator.go       connection.go       example_client_test.go  read.go            types.go
allocator_test.go  connection_test.go  examples_test.go        read_test.go       uri.go
auth.go            consumers.go        fuzz.go                 README.md          uri_test.go
certs.sh           consumers_test.go   gen.sh                  reconnect_test.go  write.go
channel.go         CONTRIBUTING.md     go.mod                  return.go          
client_test.go     delivery.go         integration_test.go     shared_test.go     

We also need to make sure that PATH and GOPATH are properly configured. To do so you need to put these lines in your interactive shell config.

# GO SHELL SETUP
mkdir -p ~/.gopkg
export GOPATH=~/.gopkg
export PATH="${PATH}:~/.gopkg"
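
One quick way to confirm that the new shell configuration is picked up (assuming you opened a new shell or sourced the config) is to ask Go itself.

host % go env GOPATH
/home/vermaden/.gopkg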

Now we can get back to feeding our queue.

host % go run send.go
2019/06/05 13:53:59 IN: sent message 'Hello World!' (1)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (2)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (3)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (4)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (5)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (6)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (7)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (8)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (9)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (10)
% 

The ha-default queue has been created and fed with 10 messages.

04-rabbitmq-queue

Now we need to ‘receive’ these messages from the queue. This is where the receive.go script comes in handy. It is also based on the RabbitMQ Hello World tutorial.

host % cat receive.go
package main

import (
  "log"
  "amqp"
)

func FAIL_ON_ERROR(err error, msg string) {
  if err != nil {
    log.Fatalf("%s: %s", msg, err)
  }
}

func main() {
  conn, err := amqp.Dial("amqp://admin:ADMINPASSWORD@10.0.0.102:5672/")
  FAIL_ON_ERROR(err, "ER: failed to connect to RabbitMQ")
  defer conn.Close()

  ch, err := conn.Channel()
  FAIL_ON_ERROR(err, "ER: failed to open channel")
  defer ch.Close()

  q, err := ch.QueueDeclare(
    "ha-default", // name
    false,        // durable
    false,        // delete when unused
    false,        // exclusive
    false,        // no-wait
    nil,          // arguments
  )
  FAIL_ON_ERROR(err, "ER: failed to declare queue")

  msgs, err := ch.Consume(
    q.Name, // queue
    "",     // consumer
    true,   // auto-ack
    false,  // exclusive
    false,  // no-local
    false,  // no-wait
    nil,    // args
  )
  FAIL_ON_ERROR(err, "ER: failed to register consumer")

  forever := make(chan bool)

  go func() {
    for d := range msgs {
      log.Printf("IN: received message: %s", d.Body)
    }
  }()

  log.Printf("IN: waiting for messages")
  log.Printf("IN: to exit press CTRL+C")
  <-forever
}

Here is its output. It will not stop running until you end it with the CTRL-C sequence.

host % go run receive.go
2019/06/05 13:54:34 IN: waiting for messages
2019/06/05 13:54:34 IN: to exit press CTRL+C
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
2019/06/05 13:54:34 IN: received message: Hello World!
^C
%

If you checked the source code carefully then you probably noticed that I ‘sent’ messages to the rabbit1 node (10.0.0.101) while I ‘received’ the messages at the rabbit2 node (10.0.0.102).

Simple Benchmark

We will now run a simple benchmark with the receive.go script left running and a modified send.go script whose for loop sends 100000 messages.
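
The only change needed in send.go is the upper bound of the for loop – a one-liner like this would do (just a sketch, you can of course edit the file by hand):

host % sed -i '' 's/i <= 10;/i <= 100000;/' send.go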

host % go run receive.go
2019/06/05 13:52:34 IN: waiting for messages
2019/06/05 13:52:34 IN: to exit press CTRL+C

… and now the messages.

host % go run send.go
2019/06/05 13:53:59 IN: sent message 'Hello World!' (1)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (2)
2019/06/05 13:53:59 IN: sent message 'Hello World!' (3)
(...)
2019/06/05 13:56:26 IN: sent message 'Hello World!' (99998)
2019/06/05 13:56:26 IN: sent message 'Hello World!' (99999)
2019/06/05 13:56:26 IN: sent message 'Hello World!' (100000)
% 

The results of this simple benchmark are below.

05-rabbitmq-benchmark.png

About 4000-5000 messages per second are handled by this clustered RabbitMQ instance running within two FreeBSD Jails.

High Availability

Now we will test the high availability of our RabbitMQ cluster.

Currently the ha-default queue is on the rabbit1 node. We will now kill the rabbit1 Jail and see how the RabbitMQ web interface reacts.

host # jls
   JID  IP Address      Hostname                      Path
     1  192.168.43.101  rabbit1.local                 /jail/rabbit1
     2  192.168.43.102  rabbit2.local                 /jail/rabbit2

host # killall -9 -j 1

host # umount /jail/rabbit1/dev

Our ha-default queue switched to the rabbit2 node in a matter of seconds – HA works as desired.

06-rabbitmq-ha-node-fail.png
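
You could also confirm the failover from the command line on the surviving node – cluster_status should now list only rabbit2 under running_nodes and the mirrored queue should still be reachable there (a quick sketch, output omitted):

rabbit2 # rabbitmqctl cluster_status
rabbit2 # rabbitmqctl list_queues name messages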

Let’s start rabbit1 Jail to get redundancy back.

host # service jail onestart rabbit1
Starting jails: rabbit1.
host # 

07-rabbitmq-ha-node-back.png

The ha-default queue got its redundancy back (the +1 mark) but it remained on the rabbit2 node.

… and last but not least – a little anniversary at the end – this is the 50th article (not counting the Valuable News series) on my blog πŸ™‚

UPDATE 1 – This Month in RabbitMQ

The RabbitMQ Cluster on FreeBSD Containers article was featured in the This Month in RabbitMQ – July 2019 episode.

Thanks for mentioning!

UPDATE 2 – Make RabbitMQ Use Less CPU

As reported by Felix Ehlers on Twitter – the RabbitMQ CPU usage can be reduced by setting the RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+sbwt none" variable.
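
One way to apply this in our Jail based setup is to put the variable (without the RABBITMQ_ prefix) into the rabbitmq-env.conf file of each Jail and restart the service – a quick sketch below, assuming the FreeBSD package keeps its configuration under /usr/local/etc/rabbitmq:

host # for I in 1 2; do echo 'SERVER_ADDITIONAL_ERL_ARGS="+sbwt none"' >> /jail/rabbit${I}/usr/local/etc/rabbitmq/rabbitmq-env.conf; done
host # for I in 1 2; do jexec rabbit${I} service rabbitmq restart; done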

EOF

GlusterFS Cluster on FreeBSD with Ansible and GNU Parallel

Today I would like to present an article about setting up a GlusterFS cluster on FreeBSD with the Ansible and GNU Parallel tools.

gluster-logo.png

To cite Wikipedia “GlusterFS is a scale-out network-attached storage file system. It has found applications including cloud computing, streaming media services, and content delivery networks.” The GlusterFS page describes it similarly “Gluster is a scalable, distributed file system that aggregates disk storage resources from multiple servers into a single global namespace.”

Here are its advantages:

  • Scales to several petabytes.
  • Handles thousands of clients.
  • POSIX compatible.
  • Uses commodity hardware.
  • Can use any ondisk filesystem that supports extended attributes.
  • Accessible using industry standard protocols like NFS and SMB.
  • Provides replication/quotas/geo-replication/snapshots/bitrot detection.
  • Allows optimization for different workloads.
  • Open Source.

Lab Setup

It will be entirely VirtualBox based and it will consist of 6 hosts. To avoid creating 6 identical FreeBSD installations I used the 12.0-RELEASE virtual machine image available directly from the FreeBSD Project.

There are several formats available – qcow2/raw/vhd/vmdk – but as I will be using VirtualBox I used the VMDK one.

I will use different prompts depending on where the command is executed to make the article more readable. Also, when there is a ‘%‘ at the prompt a regular user is needed, and when there is a ‘#‘ at the prompt a superuser is needed.

gluster1 #    // command run on the gluster1 node
gluster* #    // command run on all gluster nodes
client #      // command run on gluster client
vbhost %      // command run on the VirtualBox host

Here is the list of the machines for the GlusterFS cluster:

10.0.10.11 gluster1
10.0.10.12 gluster2
10.0.10.13 gluster3
10.0.10.14 gluster4
10.0.10.15 gluster5
10.0.10.16 gluster6

Each VirtualBox virtual machine for FreeBSD is the default one (as suggested in the VirtualBox wizard) with 512 MB RAM and NAT Network as shown on the image below.

virtualbox-freebsd-gluster-host.jpg

Here is the configuration of the NAT Network on VirtualBox.

virtualbox-nat-network.jpg

The cloned/copied FreeBSD-12.0-RELEASE-amd64.vmdk image will need to have different UUIDs so we will use VBoxManage internalcommands sethduuid command to achieve this.

vbhost % for I in $( seq 6 ); do cp FreeBSD-12.0-RELEASE-amd64.vmdk    vbox_GlusterFS_${I}.vmdk; done
vbhost % for I in $( seq 6 ); do VBoxManage internalcommands sethduuid vbox_GlusterFS_${I}.vmdk; done

To start the whole GlusterFS environment on VirtualBox use these commands.

vbhost % VBoxManage list vms | grep GlusterFS
"FreeBSD GlusterFS 1" {162a3b6f-4ec9-4709-bff8-162b0c8c9c41}
"FreeBSD GlusterFS 2" {2e30326c-ac5d-41d2-9b28-483375df38f6}
"FreeBSD GlusterFS 3" {6b2747ab-3ec6-4b1a-a28e-5d871d7891b3}
"FreeBSD GlusterFS 4" {12379cf8-31d9-4ff1-9945-465fc3ed15f0}
"FreeBSD GlusterFS 5" {a4b0d515-5924-4517-9052-df238c366f2b}
"FreeBSD GlusterFS 6" {66621755-1b97-4486-aa15-a7bec9edb343}

Check which GlusterFS machines are running.

vbhost % VBoxManage list runningvms | grep GlusterFS
vbhost %

Start the machines in VirtualBox Headless mode in parallel.

vbhost % VBoxManage list vms \
           | grep GlusterFS \
           | awk -F \" '{print $2}' \
           | while read I; do VBoxManage startvm "${I}" --type headless & done

After that command you should see these machines running.

vbhost % VBoxManage list runningvms
"FreeBSD GlusterFS 1" {162a3b6f-4ec9-4709-bff8-162b0c8c9c41}
"FreeBSD GlusterFS 2" {2e30326c-ac5d-41d2-9b28-483375df38f6}
"FreeBSD GlusterFS 3" {6b2747ab-3ec6-4b1a-a28e-5d871d7891b3}
"FreeBSD GlusterFS 4" {12379cf8-31d9-4ff1-9945-465fc3ed15f0}
"FreeBSD GlusterFS 5" {a4b0d515-5924-4517-9052-df238c366f2b}
"FreeBSD GlusterFS 6" {66621755-1b97-4486-aa15-a7bec9edb343}

Before we try to connect to our FreeBSD machines we need to do a minimal network configuration. Each FreeBSD machine will have a minimal /etc/rc.conf file as shown in the example for the gluster1 host.

gluster1 # cat /etc/rc.conf
hostname=gluster1
ifconfig_DEFAULT="inet 10.0.10.11/24 up"
defaultrouter=10.0.10.1
sshd_enable=YES

For the setup purposes we will need to allow root login on these FreeBSD GlusterFS machines with PermitRootLogin yes option in the /etc/ssh/sshd_config file. You will also need to restart the sshd(8) service after the changes.

gluster1 # grep '^PermitRootLogin' /etc/ssh/sshd_config
PermitRootLogin yes
gluster1 # service sshd restart

By using NAT Network with Port Forwarding the FreeBSD machines will be accessible on the localhost ports. For example the gluster1 machine will be available on port 2211, the gluster2 machine will be available on port 2212 and so on. This is shown in the sockstat utility output below.

vbhost % sockstat -l4
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
vermaden VBoxNetNAT 57622 17 udp4   *:*                   *:*
vermaden VBoxNetNAT 57622 19 tcp4   *:2211                *:*
vermaden VBoxNetNAT 57622 20 tcp4   *:2212                *:*
vermaden VBoxNetNAT 57622 21 tcp4   *:2213                *:*
vermaden VBoxNetNAT 57622 22 tcp4   *:2214                *:*
vermaden VBoxNetNAT 57622 23 tcp4   *:2215                *:*
vermaden VBoxNetNAT 57622 24 tcp4   *:2216                *:*
vermaden VBoxNetNAT 57622 28 tcp4   *:2240                *:*
vermaden VBoxNetNAT 57622 29 tcp4   *:9140                *:*
vermaden VBoxNetNAT 57622 30 tcp4   *:2220                *:*
root     sshd       96791 4  tcp4   *:22                  *:*

I think the correlation between the IP address and the port on the host is obvious πŸ™‚

Here is the list of the machines with ports on localhost:

10.0.10.11 gluster1 2211
10.0.10.12 gluster2 2212
10.0.10.13 gluster3 2213
10.0.10.14 gluster4 2214
10.0.10.15 gluster5 2215
10.0.10.16 gluster6 2216
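
For reference – if you create such a NAT Network from the command line instead of the GUI – the port forwarding rules could be added with VBoxManage. A sketch for the first two machines, assuming the NAT network is named NatNetwork:

vbhost % VBoxManage natnetwork modify --netname NatNetwork --port-forward-4 "gluster1-ssh:tcp:[]:2211:[10.0.10.11]:22"
vbhost % VBoxManage natnetwork modify --netname NatNetwork --port-forward-4 "gluster2-ssh:tcp:[]:2212:[10.0.10.12]:22"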

To connect to such a machine from the VirtualBox host system you will need this command:

vbhost % ssh -l root localhost -p 2211

To avoid typing that every time you need to login to gluster1 let's make some changes to the ~/.ssh/config file for convenience. This way it will be possible to login in a very short way.

vbhost % ssh gluster1

Here is the modified ~/.ssh/config file.

vbhost % cat ~/.ssh/config
# GENERAL
  StrictHostKeyChecking no
  LogLevel              quiet
  KeepAlive             yes
  ServerAliveInterval   30
  VerifyHostKeyDNS      no

# ALL HOSTS SETTINGS
Host *
  StrictHostKeyChecking no
  Compression           yes

# GLUSTER
Host gluster1
  User root
  Hostname 127.0.0.1
  Port 2211

Host gluster2
  User root
  Hostname 127.0.0.1
  Port 2212

Host gluster3
  User root
  Hostname 127.0.0.1
  Port 2213

Host gluster4
  User root
  Hostname 127.0.0.1
  Port 2214

Host gluster5
  User root
  Hostname 127.0.0.1
  Port 2215

Host gluster6
  User root
  Hostname 127.0.0.1
  Port 2216

I assume that you already have some SSH keys generated (with ~/.ssh/id_rsa as the private key) so let's remove the need to type the password on each SSH login.

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster1
Password for root@gluster1:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster2
Password for root@gluster2:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster3
Password for root@gluster3:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster4
Password for root@gluster4:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster5
Password for root@gluster5:

vbhost % ssh-copy-id -i ~/.ssh/id_rsa gluster6
Password for root@gluster6:

Ansible Setup

As we already have SSH integration in place, we will now configure Ansible to connect to our ‘localhost’ ports for the FreeBSD machines.

Here is the Ansible hosts file.

vbhost % cat hosts
[gluster]
gluster1 ansible_port=2211 ansible_host=127.0.0.1 ansible_user=root
gluster2 ansible_port=2212 ansible_host=127.0.0.1 ansible_user=root
gluster3 ansible_port=2213 ansible_host=127.0.0.1 ansible_user=root
gluster4 ansible_port=2214 ansible_host=127.0.0.1 ansible_user=root
gluster5 ansible_port=2215 ansible_host=127.0.0.1 ansible_user=root
gluster6 ansible_port=2216 ansible_host=127.0.0.1 ansible_user=root

[gluster:vars]
ansible_python_interpreter=/usr/local/bin/python2.7

Here is the listing of these machines using the ansible command.

vbhost % ansible -i hosts --list-hosts gluster
  hosts (6):
    gluster1
    gluster2
    gluster3
    gluster4
    gluster5
    gluster6

Let's verify that our Ansible setup works correctly.

vbhost % ansible -i hosts -m raw -a 'echo' gluster
gluster1 | CHANGED | rc=0 >>



gluster3 | CHANGED | rc=0 >>



gluster2 | CHANGED | rc=0 >>



gluster5 | CHANGED | rc=0 >>



gluster4 | CHANGED | rc=0 >>



gluster6 | CHANGED | rc=0 >>

It works as desired.

We are not able to use Ansible modules other than raw because by default Python is not installed on FreeBSD, as shown below.

vbhost % ansible -i hosts -m ping gluster
gluster1 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster2 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster4 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster5 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster3 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}
gluster6 | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "/bin/sh: /usr/local/bin/python2.7: not found\r\n",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}

We need to get Python installed on FreeBSD.

We will partially use Ansible for this and partially GNU Parallel.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do ssh ${I} env ASSUME_ALWAYS_YES=yes pkg install python; done
pkg: Error fetching http://pkg.FreeBSD.org/FreeBSD:12:amd64/quarterly/Latest/pkg.txz: No address record
A pre-built version of pkg could not be found for your system.
Consider changing PACKAGESITE or installing it from ports: 'ports-mgmt/pkg'.
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:12:amd64/quarterly, please wait...

… we forgot about setting up DNS in the FreeBSD machines, let’s fix that.

It is as easy as executing echo nameserver 1.1.1.1 > /etc/resolv.conf command on each FreeBSD machine.

Let's verify what input will be sent to GNU Parallel before executing it.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh ${I} 'echo nameserver 1.1.1.1 > /etc/resolv.conf'"; done
ssh gluster1 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster2 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster3 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster4 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster5 'echo nameserver 1.1.1.1 > /etc/resolv.conf'
ssh gluster6 'echo nameserver 1.1.1.1 > /etc/resolv.conf'

Looks reasonable, let's engage GNU Parallel then.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh ${I} 'echo nameserver 1.1.1.1 > /etc/resolv.conf'"; done | parallel

Computers / CPU cores / Max jobs to run
1:local / 2 / 2

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:0/6/100%/1.0s

We will now verify that the DNS is configured properly on the FreeBSD machines.

vbhost % for I in $( jot 6 ); do echo -n "gluster${I} "; ssh gluster${I} 'cat /etc/resolv.conf'; done
gluster1 nameserver 1.1.1.1
gluster2 nameserver 1.1.1.1
gluster3 nameserver 1.1.1.1
gluster4 nameserver 1.1.1.1
gluster5 nameserver 1.1.1.1
gluster6 nameserver 1.1.1.1

Verification of the DNS resolution by querying freebsd.org with host(1) to test Internet connectivity.

vbhost % for I in $( jot 6 ); do echo; echo "gluster${I}"; ssh gluster${I} host freebsd.org; done

gluster1
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 10 mx1.freebsd.org.
freebsd.org mail is handled by 30 mx66.freebsd.org.

gluster2
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 30 mx66.freebsd.org.
freebsd.org mail is handled by 10 mx1.freebsd.org.

gluster3
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 30 mx66.freebsd.org.
freebsd.org mail is handled by 10 mx1.freebsd.org.

gluster4
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 30 mx66.freebsd.org.
freebsd.org mail is handled by 10 mx1.freebsd.org.

gluster5
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 10 mx1.freebsd.org.
freebsd.org mail is handled by 30 mx66.freebsd.org.

gluster6
freebsd.org has address 96.47.72.84
freebsd.org has IPv6 address 2610:1c1:1:606c::50:15
freebsd.org mail is handled by 10 mx1.freebsd.org.
freebsd.org mail is handled by 30 mx66.freebsd.org.

The DNS resolution works properly, now we will switch from the default quarterly pkg(8) repository to the latest one which has more frequent updates as the name suggests. We will need to use sed -i '' s/quarterly/latest/g /etc/pkg/FreeBSD.conf command on each FreeBSD machine.

Verification of what will be sent to GNU Parallel.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh ${I} 'sed -i \"\" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'"; done
ssh gluster1 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster2 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster3 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster4 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster5 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'
ssh gluster6 'sed -i "" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'

Let’s send the command to FreeBSD machines then.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo "ssh $I 'sed -i \"\" s/quarterly/latest/g /etc/pkg/FreeBSD.conf'"; done | parallel

Computers / CPU cores / Max jobs to run
1:local / 2 / 2

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:0/6/100%/1.0s

As shown below the latest repository is configured in the /etc/pkg/FreeBSD.conf file on each FreeBSD machine.

vbhost % ssh gluster3 tail -7 /etc/pkg/FreeBSD.conf
FreeBSD: {
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/latest",
  mirror_type: "srv",
  signature_type: "fingerprints",
  fingerprints: "/usr/share/keys/pkg",
  enabled: yes
}

We may now get back to Python.

vbhost % ansible -i hosts --list-hosts gluster \
           | sed 1d \
           | while read I; do echo ssh ${I} env ASSUME_ALWAYS_YES=yes pkg install python; done
ssh gluster1 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster2 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster3 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster4 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster5 env ASSUME_ALWAYS_YES=yes pkg install python
ssh gluster6 env ASSUME_ALWAYS_YES=yes pkg install python

… and execution on the FreeBSD machines with GNU Parallel.

vbhost % ansible -i hosts --list-hosts gluster \ 
           | sed 1d \
           | while read I; do echo ssh ${I} env ASSUME_ALWAYS_YES=yes pkg install python; done | parallel

Computers / CPU cores / Max jobs to run
1:local / 2 / 2

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
local:0/6/100%/156.0s

The Python package and its dependencies are installed.

vbhost % ssh gluster3 pkg info
gettext-runtime-0.19.8.1_2     GNU gettext runtime libraries and programs
indexinfo-0.3.1                Utility to regenerate the GNU info page index
libffi-3.2.1_3                 Foreign Function Interface
pkg-1.10.5_5                   Package manager
python-2.7_3,2                 "meta-port" for the default version of Python interpreter
python2-2_3                    The "meta-port" for version 2 of the Python interpreter
python27-2.7.15                Interpreted object-oriented programming language
readline-7.0.5                 Library for editing command lines as they are typed

Now the Ansible ping module works as desired.

vbhost % ansible -i hosts -m ping gluster
gluster1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster4 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster5 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster3 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
gluster6 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

GlusterFS Volume Options

GlusterFS has a lot of options to set up a volume. They are described in the GlusterFS Administration Guide in the Setting up GlusterFS Volumes part. Here they are:

Distributed – Distributed volumes distribute files across the bricks in the volume. You can use distributed volumes where the requirement is to scale storage and the redundancy is either not important or is provided by other hardware/software layers.

Replicated – Replicated volumes replicate files across bricks in the volume. You can use replicated volumes in environments where high-availability and high-reliability are critical.

Distributed Replicated – Distributed replicated volumes distribute files across replicated bricks in the volume. You can use distributed replicated volumes in environments where the requirement is to scale storage and high-reliability is critical. Distributed replicated volumes also offer improved read performance in most environments.

Dispersed – Dispersed volumes are based on erasure codes, providing space-efficient protection against disk or server failures. It stores an encoded fragment of the original file to each brick in a way that only a subset of the fragments is needed to recover the original file. The number of bricks that can be missing without losing access to data is configured by the administrator on volume creation time.

Distributed Dispersed – Distributed dispersed volumes distribute files across dispersed subvolumes. This has the same advantages of distribute replicate volumes, but using disperse to store the data into the bricks.

Striped [Deprecated] – Striped volumes stripes data across bricks in the volume. For best results, you should use striped volumes only in high concurrency environments accessing very large files.

Distributed Striped [Deprecated] – Distributed striped volumes stripe data across two or more nodes in the cluster. You should use distributed striped volumes where the requirement is to scale storage and in high concurrency environments accessing very large files is critical.

Distributed Striped Replicated [Deprecated] – Distributed striped replicated volumes distributes striped data across replicated bricks in the cluster. For best results, you should use distributed striped replicated volumes in highly concurrent environments where parallel access of very large files and performance is critical. In this release, configuration of this volume type is supported only for Map Reduce workloads.

Striped Replicated [Deprecated] – Striped replicated volumes stripes data across replicated bricks in the cluster. For best results, you should use striped replicated volumes in highly concurrent environments where there is parallel access of very large files and performance is critical. In this release, configuration of this volume type is supported only for Map Reduce workloads.

From all of the above still supported types, the Dispersed volume seems to be the best choice. Like Minio, Dispersed volumes are based on erasure codes.

As we have 6 servers we will use a 4 + 2 setup which is a logical RAID6 across these 6 servers. This means that we will be able to lose 2 of them without a service outage. This also means that if we upload a 100 MB file to our volume we will use 150 MB of space across these 6 servers, with 25 MB on each node.

We can visualize this with the following ASCII diagram.

+-----------+ +-----------+ +-----------+ +-----------+ +-----------+ +-----------+
|  gluster1 | |  gluster2 | |  gluster3 | |  gluster4 | |  gluster5 | |  gluster6 |
|           | |           | |           | |           | |           | |           |
|    brick1 | |    brick2 | |    brick3 | |    brick4 | |    brick5 | |    brick6 |
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
      |             |             |             |             |             |
    25|MB         25|MB         25|MB         25|MB         25|MB         25|MB
      |             |             |             |             |             |
      +-------------+-------------+------+------+-------------+-------------+
                                         |
                                      100|MB
                                         |
                                     +---+---+
                                     | file0 |
                                     +-------+

Deploy GlusterFS Cluster

We will use gluster-setup.yml as our Ansible playbook.

Let's create something for a start, for example a task to always install the latest Python package.

vbhost % cat gluster-setup.yml
---
- name: Install and Setup GlusterFS on FreeBSD
  hosts: gluster
  user: root
  tasks:

  - name: Install Latest Python Package
    pkgng:
      name: python
      state: latest

We will now execute it.

vbhost % ansible-playbook -i hosts gluster-setup.yml

PLAY [Install and Setup GlusterFS on FreeBSD] **********************************

TASK [Gathering Facts] *********************************************************
ok: [gluster3]
ok: [gluster5]
ok: [gluster1]
ok: [gluster4]
ok: [gluster2]
ok: [gluster6]

TASK [Install Latest Python Package] *******************************************
ok: [gluster4]
ok: [gluster2]
ok: [gluster5]
ok: [gluster3]
ok: [gluster1]
ok: [gluster6]

PLAY RECAP *********************************************************************
gluster1                   : ok=2    changed=0    unreachable=0    failed=0
gluster2                   : ok=2    changed=0    unreachable=0    failed=0
gluster3                   : ok=2    changed=0    unreachable=0    failed=0
gluster4                   : ok=2    changed=0    unreachable=0    failed=0
gluster5                   : ok=2    changed=0    unreachable=0    failed=0
gluster6                   : ok=2    changed=0    unreachable=0    failed=0

We had just installed Python on these machines, so no update was needed.

As we will be creating a cluster we need to add time synchronization between the nodes of the cluster. We will use the most obvious solution – the ntpd(8) daemon that is in the FreeBSD base system. These lines are added to our gluster-setup.yml playbook to achieve this goal.

  - name: Enable NTPD Service
    raw: sysrc ntpd_enable=YES

  - name: Start NTPD Service
    service:
      name: ntpd
      state: started

After executing the playbook again with the ansible-playbook -i hosts gluster-setup.yml command we will see additional output like the one shown below.

TASK [Enable NTPD Service] ************************************************
changed: [gluster2]
changed: [gluster1]
changed: [gluster4]
changed: [gluster5]
changed: [gluster3]
changed: [gluster6]

TASK [Start NTPD Service] ******************************************************
changed: [gluster5]
changed: [gluster4]
changed: [gluster2]
changed: [gluster1]
changed: [gluster3]
changed: [gluster6]

Random verification of the NTP service.

vbhost % ssh gluster1 ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 0.freebsd.pool. .POOL.          16 p    -   64    0    0.000    0.000   0.000
 ntp.ifj.edu.pl  10.0.2.4         3 u    1   64    1  119.956  -345759  32.552
 news-archive.ic 229.30.220.210   2 u    -   64    1   60.533  -345760  21.104

Now we need to install GlusterFS on FreeBSD machines – the glusterfs package.

We will add appropriate section to the playbook.

  - name: Install Latest GlusterFS Package
    pkgng:
      state: latest
      name:
      - glusterfs
      - ncdu

You can add more than one package to the pkgng Ansible module – for example I have also added the ncdu package.

You can read more about the pkgng Ansible module by typing the ansible-doc pkgng command, or view its short version with the -s argument.

vbhost % ansible-doc -s pkgng
- name: Package manager for FreeBSD >= 9.0
  pkgng:
      annotation:            # A comma-separated list of keyvalue-pairs of the form `[=]'. A `+' denotes adding
                               an annotation, a `-' denotes removing an annotation, and `:' denotes
                               modifying an annotation. If setting or modifying annotations, a value
                               must be provided.
      autoremove:            # Remove automatically installed packages which are no longer needed.
      cached:                # Use local package base instead of fetching an updated one.
      chroot:                # Pkg will chroot in the specified environment. Can not be used together with `rootdir' or `jail'
                               options.
      jail:                  # Pkg will execute in the given jail name or id. Can not be used together with `chroot' or `rootdir'
                               options.
      name:                  # (required) Name or list of names of packages to install/remove.
      pkgsite:               # For pkgng versions before 1.1.4, specify packagesite to use for downloading packages. If not
                               specified, use settings from `/usr/local/etc/pkg.conf'. For newer
                               pkgng versions, specify a the name of a repository configured in
                               `/usr/local/etc/pkg/repos'.
      rootdir:               # For pkgng versions 1.5 and later, pkg will install all packages within the specified root directory.
                               Can not be used together with `chroot' or `jail' options.
      state:                 # State of the package. Note: "latest" added in 2.7

You can read more about this particular module on the following – https://docs.ansible.com/ansible/latest/modules/pkgng_module.html – Ansible page.

We will now add GlusterFS nodes to the /etc/hosts file and add autoboot_delay=1 parameter to the /boot/loader.conf file so our systems will boot 9 seconds faster as 10 is the default delay setting.

Here is our gluster-setup.yml Ansible playbook so far.

vbhost % cat gluster-setup.yml
---
- name: Install and Setup GlusterFS on FreeBSD
  hosts: gluster
  user: root
  tasks:

  - name: Install Latest Python Package
    pkgng:
      name: python
      state: latest

  - name: Enable NTPD Service
    raw: sysrc ntpd_enable=YES

  - name: Start NTPD Service
    service:
      name: ntpd
      state: started

  - name: Install Latest GlusterFS Package
    pkgng:
      state: latest
      name:
      - glusterfs
      - ncdu

  - name: Add Nodes to /etc/hosts File
    blockinfile:
      path: /etc/hosts
      block: |
        10.0.10.11 gluster1
        10.0.10.12 gluster2
        10.0.10.13 gluster3
        10.0.10.14 gluster4
        10.0.10.15 gluster5
        10.0.10.16 gluster6

  - name: Add autoboot_delay to /boot/loader.conf File
    lineinfile:
      path: /boot/loader.conf
      line: autoboot_delay=1
      create: yes

Here is the result of the execution of this playbook.

vbhost % ansible-playbook -i hosts gluster-setup.yml

PLAY [Install and Setup GlusterFS on FreeBSD] **********************************

TASK [Gathering Facts] *********************************************************
ok: [gluster3]
ok: [gluster5]
ok: [gluster1]
ok: [gluster4]
ok: [gluster2]
ok: [gluster6]

TASK [Install Latest Python Package] *******************************************
ok: [gluster4]
ok: [gluster2]
ok: [gluster5]
ok: [gluster3]
ok: [gluster1]
ok: [gluster6]

TASK [Install Latest GlusterFS Package] ****************************************
ok: [gluster2]
ok: [gluster1]
ok: [gluster3]
ok: [gluster5]
ok: [gluster4]
ok: [gluster6]

TASK [Add Nodes to /etc/hosts File] ********************************************
changed: [gluster5]
changed: [gluster4]
changed: [gluster2]
changed: [gluster3]
changed: [gluster1]
changed: [gluster6]

TASK [Enable GlusterFS Service] ************************************************
changed: [gluster1]
changed: [gluster4]
changed: [gluster2]
changed: [gluster3]
changed: [gluster5]
changed: [gluster6]

TASK [Add autoboot_delay to /boot/loader.conf File] ****************************
changed: [gluster3]
changed: [gluster2]
changed: [gluster5]
changed: [gluster1]
changed: [gluster4]
changed: [gluster6]

PLAY RECAP *********************************************************************
gluster1                   : ok=6    changed=3    unreachable=0    failed=0
gluster2                   : ok=6    changed=3    unreachable=0    failed=0
gluster3                   : ok=6    changed=3    unreachable=0    failed=0
gluster4                   : ok=6    changed=3    unreachable=0    failed=0
gluster5                   : ok=6    changed=3    unreachable=0    failed=0
gluster6                   : ok=6    changed=3    unreachable=0    failed=0

Let's check that the FreeBSD machines can now ping each other by name.

vbhost % ssh gluster6 cat /etc/hosts
# LOOPBACK
127.0.0.1      localhost localhost.my.domain
::1            localhost localhost.my.domain

# BEGIN ANSIBLE MANAGED BLOCK
10.0.10.11 gluster1
10.0.10.12 gluster2
10.0.10.13 gluster3
10.0.10.14 gluster4
10.0.10.15 gluster5
10.0.10.16 gluster6
# END ANSIBLE MANAGED BLOCK

vbhost % ssh gluster1 ping -c 1 gluster3
PING gluster3 (10.0.10.13): 56 data bytes
64 bytes from 10.0.10.13: icmp_seq=0 ttl=64 time=1.924 ms

--- gluster3 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.924/1.924/1.924/0.000 ms

… and our /boot/loader.conf file.

vbhost % ssh gluster4 cat /boot/loader.conf
autoboot_delay=1

Now we need to create directories for the GlusterFS data. For lack of a better idea we will use the /data directory, with /data/volume1 as the directory for volume1, and the bricks will be placed in /data/volume1/brickN directories (N matching the node number). In this setup I will use just one brick per server, but in a production environment you would probably use one brick per physical disk.
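
In such a production setup, with one brick per physical disk, the layout could look more or less like the sketch below – the da1/da2 device names and the /bricks paths are only assumptions for illustration.

gluster1 # mkdir -p /bricks/da1 /bricks/da2
gluster1 # newfs -U /dev/da1
gluster1 # newfs -U /dev/da2
gluster1 # mount /dev/da1 /bricks/da1
gluster1 # mount /dev/da2 /bricks/da2
gluster1 # mkdir -p /bricks/da1/volume1/brick /bricks/da2/volume1/brick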

Here is the playbook command we will use to create these directories on FreeBSD machines.

  - name: Create brick* Directories for volume1
    raw: mkdir -p /data/volume1/brick` hostname | grep -o -E '[0-9]+' `

After executing it with the ansible-playbook -i hosts gluster-setup.yml command the directories have been created.

vbhost % ssh gluster2 find /data -ls | column -t
2247168  8  drwxr-xr-x  3  root  wheel  512  Dec  28  17:48  /data
2247169  8  drwxr-xr-x  3  root  wheel  512  Dec  28  17:48  /data/volume1
2247170  8  drwxr-xr-x  2  root  wheel  512  Dec  28  17:48  /data/volume1/brick2


We now need to add glusterd_enable=YES to the /etc/rc.conf file on the GlusterFS nodes and then start the GlusterFS service.

This is the snippet we will add to our playbook.

  - name: Enable GlusterFS Service
    raw: sysrc glusterd_enable=YES

  - name: Start GlusterFS Service
    service:
      name: glusterd
      state: started

Let’s make a quick random verification.

vbhost % ssh gluster4 service glusterd status
glusterd is running as pid 2684.
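
To verify the service on all six nodes at once you can also use an Ansible ad-hoc command with the same raw module – I omit its output here.

vbhost % ansible -i hosts gluster -u root -m raw -a "service glusterd status"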

Now we can proceed to the last part of the GlusterFS setup – creating the volume.

We will do this from gluster1 – the 1st node of the GlusterFS cluster.

First we need to peer probe the other nodes.

gluster1 # gluster peer probe gluster1
peer probe: success. Probe on localhost not needed
gluster1 # gluster peer probe gluster2
peer probe: success.
gluster1 # gluster peer probe gluster3
peer probe: success.
gluster1 # gluster peer probe gluster4
peer probe: success.
gluster1 # gluster peer probe gluster5
peer probe: success.
gluster1 # gluster peer probe gluster6
peer probe: success.
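
The peers can be verified with the gluster peer status or gluster pool list commands – I omit their output here.

gluster1 # gluster peer status
gluster1 # gluster pool list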

Then we can create the volume. We will need to use the force option because in our example setup we use directories on the root partition.

gluster1 # gluster volume create volume1 \
             disperse-data 4 \
             redundancy 2 \
             transport tcp \
             gluster1:/data/volume1/brick1 \
             gluster2:/data/volume1/brick2 \
             gluster3:/data/volume1/brick3 \
             gluster4:/data/volume1/brick4 \
             gluster5:/data/volume1/brick5 \
             gluster6:/data/volume1/brick6 \
             force
volume create: volume1: success: please start the volume to access data

We can now start the volume1 GlusterFS volume.

gluster1 # gluster volume start volume1
volume start: volume1: success

gluster1 # gluster volume status volume1
Status of volume: volume1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster1:/data/volume1/brick1         N/A       N/A        N       N/A
Brick gluster2:/data/volume1/brick2         N/A       N/A        N       N/A
Brick gluster3:/data/volume1/brick3         N/A       N/A        N       N/A
Brick gluster4:/data/volume1/brick4         N/A       N/A        N       N/A
Brick gluster5:/data/volume1/brick5         N/A       N/A        N       N/A
Brick gluster6:/data/volume1/brick6         N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        N       644
Self-heal Daemon on gluster6                N/A       N/A        N       643
Self-heal Daemon on gluster5                N/A       N/A        N       647
Self-heal Daemon on gluster2                N/A       N/A        N       645
Self-heal Daemon on gluster3                N/A       N/A        N       645
Self-heal Daemon on gluster4                N/A       N/A        N       645

Task Status of Volume volume1
------------------------------------------------------------------------------
There are no active volume tasks

gluster1 # gluster volume info volume1

Volume Name: volume1
Type: Disperse
Volume ID: 68cf9607-16bc-4550-9b6b-16a5c7656f51
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: gluster1:/data/volume1/brick1
Brick2: gluster2:/data/volume1/brick2
Brick3: gluster3:/data/volume1/brick3
Brick4: gluster4:/data/volume1/brick4
Brick5: gluster5:/data/volume1/brick5
Brick6: gluster6:/data/volume1/brick6
Options Reconfigured:
nfs.disable: on
transport.address-family: inet

Here are the contents of a currently unused/empty brick.

gluster1 # find /data/volume1/brick1
/data/volume1/brick1
/data/volume1/brick1/.glusterfs
/data/volume1/brick1/.glusterfs/indices
/data/volume1/brick1/.glusterfs/indices/xattrop
/data/volume1/brick1/.glusterfs/indices/entry-changes
/data/volume1/brick1/.glusterfs/quarantine
/data/volume1/brick1/.glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008
/data/volume1/brick1/.glusterfs/changelogs
/data/volume1/brick1/.glusterfs/changelogs/htime
/data/volume1/brick1/.glusterfs/changelogs/csnap
/data/volume1/brick1/.glusterfs/brick1.db
/data/volume1/brick1/.glusterfs/brick1.db-wal
/data/volume1/brick1/.glusterfs/brick1.db-shm
/data/volume1/brick1/.glusterfs/00
/data/volume1/brick1/.glusterfs/00/00
/data/volume1/brick1/.glusterfs/00/00/00000000-0000-0000-0000-000000000001
/data/volume1/brick1/.glusterfs/landfill
/data/volume1/brick1/.glusterfs/unlink
/data/volume1/brick1/.glusterfs/health_check

The 6-node GlusterFS cluster is now complete and volume1 is available to use.

Alternative

The GlusterFS documentation Quick Start Guide also suggests using Ansible to deploy and manage GlusterFS with the gluster-ansible repository or gluster-ansible-cluster, but they have the requirements listed below.

  • Ansible version 2.5 or above.
  • GlusterFS version 3.2 or above.

As GlusterFS on FreeBSD is at version 3.11.1 I did not use them.

FreeBSD Client

We will now use another VirtualBox machine – also based on the same FreeBSD 12.0-RELEASE image – to create a FreeBSD Client machine that will mount our volume1 volume.

We will need to install the glusterfs package with the pkg(8) command. Then we will use the mount_glusterfs command to mount the volume. Keep in mind that in order to mount a GlusterFS volume the FUSE (fuse.ko) kernel module is needed.

client # pkg install glusterfs

client # kldload fuse
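
To have the FUSE module loaded automatically at boot you may also add it to the /boot/loader.conf file – on FreeBSD 12.0 the module is still named fuse (it was renamed to fusefs later).

client # echo 'fuse_load="YES"' >> /boot/loader.conf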

client # mount_glusterfs 10.0.10.11:volume1 /mnt

client # echo $?
0

client # mount
/dev/gpt/rootfs on / (ufs, local, soft-updates)
devfs on /dev (devfs, local, multilabel)
/dev/fuse on /mnt (fusefs, local, synchronous)

client # ls /mnt
ls: /mnt: Socket is not connected

It is mounted but does not work. The solution to this problem is to add appropriate /etc/hosts entries for the GlusterFS nodes on the client.
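
If you are curious what exactly fails, the GlusterFS FUSE client writes a log file named after the mount point – for /mnt it should land in /var/log/glusterfs/mnt.log (the exact path is an assumption based on the default log directory).

client # tail /var/log/glusterfs/mnt.log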

client # cat /etc/hosts
::1                     localhost localhost.my.domain
127.0.0.1               localhost localhost.my.domain

10.0.10.11 gluster1
10.0.10.12 gluster2
10.0.10.13 gluster3
10.0.10.14 gluster4
10.0.10.15 gluster5
10.0.10.16 gluster6

Let’s mount it again, now with the needed /etc/hosts entries.

client # umount /mnt

client # mount_glusterfs gluster1:volume1 /mnt

client # ls /mnt
client #

We now have our GlusterFS volume properly mounted and working on the FreeBSD Client machine.

Let’s write a file there with dd(8) to see how it works.

client # dd < /dev/zero > FILE bs=1m count=100 status=progress
  73400320 bytes (73 MB, 70 MiB) transferred 1.016s, 72 MB/s
100+0 records in
100+0 records out
104857600 bytes transferred in 1.565618 secs (66975227 bytes/sec)

Let’s see how it looks in the brick directory – with disperse-data 4 the 100 MB file should occupy about 25 MB on each brick.

gluster1 # ls -lh /data/volume1/brick1
total 25640
drw-------  10 root  wheel   512B Jan  3 18:31 .glusterfs
-rw-r--r--   2 root  wheel    25M Jan  3 18:31 FILE

gluster1 # find /data
/data/
/data/volume1
/data/volume1/brick1
/data/volume1/brick1/.glusterfs
/data/volume1/brick1/.glusterfs/indices
/data/volume1/brick1/.glusterfs/indices/xattrop
/data/volume1/brick1/.glusterfs/indices/xattrop/xattrop-aed814f1-0eb0-46a1-b569-aeddf5048e06
/data/volume1/brick1/.glusterfs/indices/entry-changes
/data/volume1/brick1/.glusterfs/quarantine
/data/volume1/brick1/.glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008
/data/volume1/brick1/.glusterfs/changelogs
/data/volume1/brick1/.glusterfs/changelogs/htime
/data/volume1/brick1/.glusterfs/changelogs/csnap
/data/volume1/brick1/.glusterfs/brick1.db
/data/volume1/brick1/.glusterfs/brick1.db-wal
/data/volume1/brick1/.glusterfs/brick1.db-shm
/data/volume1/brick1/.glusterfs/00
/data/volume1/brick1/.glusterfs/00/00
/data/volume1/brick1/.glusterfs/00/00/00000000-0000-0000-0000-000000000001
/data/volume1/brick1/.glusterfs/landfill
/data/volume1/brick1/.glusterfs/unlink
/data/volume1/brick1/.glusterfs/health_check
/data/volume1/brick1/.glusterfs/ac
/data/volume1/brick1/.glusterfs/ac/b4
/data/volume1/brick1/.glusterfs/11
/data/volume1/brick1/.glusterfs/11/50
/data/volume1/brick1/.glusterfs/11/50/115043ca-420f-48b5-af05-c9552db2e585
/data/volume1/brick1/FILE

Linux Client

I will also show how to mount a GlusterFS volume on the Red Hat clone CentOS in its latest 7.6 incarnation. It will require the glusterfs-fuse package installation.

[root@localhost ~]# yum install glusterfs-fuse


[root@localhost ~]# rpm -q --filesbypkg glusterfs-fuse | grep /sbin/mount.glusterfs
glusterfs-fuse            /sbin/mount.glusterfs

[root@localhost ~]# mount.glusterfs 10.0.10.11:volume1 /mnt
Mount failed. Please check the log file for more details.

Similarly to the FreeBSD Client, the /etc/hosts entries are needed.

[root@localhost ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.0.10.11 gluster1
10.0.10.12 gluster2
10.0.10.13 gluster3
10.0.10.14 gluster4
10.0.10.15 gluster5
10.0.10.16 gluster6

[root@localhost ~]# mount.glusterfs 10.0.10.11:volume1 /mnt

[root@localhost ~]# ls /mnt
FILE

[root@localhost ~]# mount
10.0.10.11:volume1 on /mnt type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

With appropriate /etc/hosts entries it works as desired. We see the FILE file generated from the FreeBSD Client machine.

GlusterFS Cluster Redundancy

After messing with the volume and creating and deleting various files I also tested its redundancy. In theory this RAID6-equivalent protection should protect us from the loss of two of the six servers. After shutting down two of the VirtualBox machines the volume is still available and ready to use.
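
A minimal sketch of such a test from the VirtualBox host – assuming the two powered-off VMs are named gluster5 and gluster6 in VirtualBox.

vbhost % VBoxManage controlvm gluster5 poweroff
vbhost % VBoxManage controlvm gluster6 poweroff
client # ls /mnt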

Closing Thoughts

It is a pity that FreeBSD does not provide a more modern GlusterFS package, as currently only version 3.11.1 is available.

EOF

IBM TSM (Spectrum Protect) on Veritas Cluster Server

Until today I have mostly shared articles about free and open systems. Now it’s time to share some so-called enterprise experience πŸ™‚ Not so long ago I set up an IBM TSM instance as a highly available service on Symantec Veritas Cluster Server.

ibm-tsm-logo.png

If you prefer to use open and free backup solution then check Bareos Backup Server on FreeBSD article.

The IBM TSM (Tivoli Storage Manager) has been rebranded by IBM into IBM Spectrum Protect, and in a similar period of time Symantec moved Veritas Cluster Server into InfoScale Availability while creating a separate/dedicated Veritas company for this purpose.

The instructions I want to share today are for sure the same for the latest versions of Veritas Cluster Server and its later InfoScale Availability incarnations. The introduction of the IBM Spectrum Protect 8.1 family was mostly related to rebranding/cleaning of the whole Spectrum Protect/TSM modules and additions so that they all carry a common 8.1 label. As these instructions were made for IBM TSM (Spectrum Protect) version 7.1.6 they should still be very similar for current versions.

This highly available IBM TSM instance is part of a larger Backup Consolidation project which uses two physical servers to serve both this IBM TSM service and a Dell/EMC Networker backup server. When everything is OK one of the nodes is dedicated to IBM TSM and the other one is used by Dell/EMC Networker, so all physical resources are well saturated and we do not ‘waste’ a whole node that sits empty 99% of the time waiting for the first node to crash. Of course if the first node misbehaves or has a hardware failure, then both IBM TSM and Dell/EMC Networker run nicely on a single node. It is also very convenient for various maintenance tasks to be able to switch all services to the other node and work in peace on the first one, but I do not have to tell you that. The third and last service shared between these two nodes is the Oracle RMAN Catalog, which holds metadata for the Oracle databases – also for backup/restore purposes.

I will not write here instructions to install the operating system (we use amd64 RHEL 6.x here) or to set up the Veritas Cluster Server, as I installed it earlier and it’s quite simple to set up. These instructions focus on creating the IBM TSM highly available service and on using/allocating resources from the IBM Storwize V5030 storage array, where 400 GB SSD disks are dedicated to the IBM TSM DB2 database instance and 1.8 TB 10K SAS disks are dedicated to DRAID groups that will serve space for the IBM TSM storage pools, implemented as the latest IBM TSM container pools with deduplication and compression enabled. The head of the IBM Storwize V5030 storage array is shown below.

ibm-tsm-v5030-photo.jpg

Each node is an IBM System x3650 M4 server with two dual-port 8Gb FC cards and one dual-port 10GE card … along with the built-in 1GE ports used for Veritas Cluster Server heartbeats. Each has 192 GB RAM and dual 6-core CPUs @ 3.5 GHz, which translates to 12 physical cores or 24 HTT threads per node. The three internal SSD drives are used for the system only, in a RAID1 + SPARE configuration. All clustered resources come from the IBM Storwize V5030 FC/SAN storage array. The operating system installed on these nodes is amd64 RHEL 6.x and the Veritas Cluster Server is at version 6.2.x. The IBM System x3650 M4 server is shown below.

ibm-tsm-x3650-m4.jpg

All of the settings/tuning decisions were made based on the IBM TSM documentation and the great IBM Spectrum Protect Blueprints resources from the valuable IBM developerWorks wiki.

Storage Array Setup

First we need to create the MDISKs. We used DRAID with double parity protection + spare for each MDISK, with 17 SAS 1.8 TB 10K disks each. That gives 14 disks for data, 2 for parity and 1 spare, all of which provide I/O thanks to the DRAID setup. We have three such MDISKs with ~21.7 TB each, for a total of 65.1 TB for the IBM TSM containers. Of course all these 3 ‘pool’ MDISKs are in one Storage Group. The LUNs for the IBM TSM DB2 database were carved from 5 SSD 400 GB disks set up in a DRAID array with 1 parity and 1 spare disk. This gives 3 disks for data, 1 for parity and 1 for spare space, which provides about 1.1 TB for the IBM TSM DB2 database.

Here are LUNs created from these MDISKs.

ibm-tsm-v5030.png

I needed to remove some names of course πŸ™‚

LUNs Initialization

Veritas Cluster Server needs to have storage prepared as disk groups, which are similar in concept to (but more powerful than) LVM. Below are the instructions to first detect and then initialize these LUNs from the IBM Storwize V5030 storage array.

[root@300 ~]# haconf -makerw
[root@300 ~]# vxdisk -o alldgs list
DEVICE                TYPE            DISK         GROUP        STATUS
disk_0                auto:LVM        -            -            online invalid
storwizev70000_00000a auto:cdsdisk    -            (dg_fencing) online
storwizev70000_00000b auto:cdsdisk    stgFC_00B    NSR_dg_nsr   online
storwizev70000_00000c auto:cdsdisk    stgFC_00C    NSR_dg_nsr   online
storwizev70000_00000d auto:cdsdisk    stgFC_00D    NSR_dg_nsr   online
storwizev70000_00000e auto:cdsdisk    stgFC_00E    NSR_dg_nsr   online
storwizev70000_00000f auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_00001a auto:none       -            -            online invalid
storwizev70000_00001b auto:none       -            -            online invalid
storwizev70000_00001c auto:none       -            -            online invalid
storwizev70000_00001d auto:none       -            -            online invalid
storwizev70000_00001e auto:none       -            -            online invalid
storwizev70000_00001f auto:none       -            -            online invalid
storwizev70000_000008 auto:cdsdisk    -            (dg_fencing) online
storwizev70000_000009 auto:cdsdisk    -            (dg_fencing) online
storwizev70000_000010 auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_000011 auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_000012 auto:none       -            -            online invalid
storwizev70000_000013 auto:none       -            -            online invalid
storwizev70000_000014 auto:none       -            -            online invalid
storwizev70000_000015 auto:none       -            -            online invalid
storwizev70000_000016 auto:none       -            -            online invalid
storwizev70000_000017 auto:none       -            -            online invalid
storwizev70000_000018 auto:none       -            -            online invalid
storwizev70000_000019 auto:none       -            -            online invalid
storwizev70000_000020 auto:none       -            -            online invalid
[root@300 ~]# vxdisksetup -i storwizev70000_00001a
[root@300 ~]# vxdisksetup -i storwizev70000_00001b
[root@300 ~]# vxdisksetup -i storwizev70000_00001c
[root@300 ~]# vxdisksetup -i storwizev70000_00001d
[root@300 ~]# vxdisksetup -i storwizev70000_00001e
[root@300 ~]# vxdisksetup -i storwizev70000_00001f
[root@300 ~]# vxdisksetup -i storwizev70000_000012
[root@300 ~]# vxdisksetup -i storwizev70000_000013
[root@300 ~]# vxdisksetup -i storwizev70000_000014
[root@300 ~]# vxdisksetup -i storwizev70000_000015
[root@300 ~]# vxdisksetup -i storwizev70000_000016
[root@300 ~]# vxdisksetup -i storwizev70000_000017
[root@300 ~]# vxdisksetup -i storwizev70000_000018
[root@300 ~]# vxdisksetup -i storwizev70000_000019
[root@300 ~]# vxdisksetup -i storwizev70000_000020
[root@300 ~]# vxdisk -o alldgs list
DEVICE                TYPE            DISK         GROUP        STATUS
disk_0                auto:LVM        -            -            online invalid
storwizev70000_00000a auto:cdsdisk    -            (dg_fencing) online
storwizev70000_00000b auto:cdsdisk    stgFC_00B    NSR_dg_nsr   online
storwizev70000_00000c auto:cdsdisk    stgFC_00C    NSR_dg_nsr   online
storwizev70000_00000d auto:cdsdisk    stgFC_00D    NSR_dg_nsr   online
storwizev70000_00000e auto:cdsdisk    stgFC_00E    NSR_dg_nsr   online
storwizev70000_00000f auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_00001a auto:cdsdisk    -            -            online
storwizev70000_00001b auto:cdsdisk    -            -            online
storwizev70000_00001c auto:cdsdisk    -            -            online
storwizev70000_00001d auto:cdsdisk    -            -            online
storwizev70000_00001e auto:cdsdisk    -            -            online
storwizev70000_00001f auto:cdsdisk    -            -            online
storwizev70000_000008 auto:cdsdisk    -            (dg_fencing) online
storwizev70000_000009 auto:cdsdisk    -            (dg_fencing) online
storwizev70000_000010 auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_000011 auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_000012 auto:cdsdisk    -            -            online
storwizev70000_000013 auto:cdsdisk    -            -            online
storwizev70000_000014 auto:cdsdisk    -            -            online
storwizev70000_000015 auto:cdsdisk    -            -            online
storwizev70000_000016 auto:cdsdisk    -            -            online
storwizev70000_000017 auto:cdsdisk    -            -            online
storwizev70000_000018 auto:cdsdisk    -            -            online
storwizev70000_000019 auto:cdsdisk    -            -            online
storwizev70000_000020 auto:cdsdisk    -            -            online
[root@300 ~]# vxdg init TSM0_dg \
                stgFC_020=storwizev70000_000020 \
                stgFC_012=storwizev70000_000012 \
                stgFC_016=storwizev70000_000016 \
                stgFC_013=storwizev70000_000013 \
                stgFC_014=storwizev70000_000014 \
                stgFC_015=storwizev70000_000015 \
                stgFC_017=storwizev70000_000017 \
                stgFC_018=storwizev70000_000018 \
                stgFC_019=storwizev70000_000019 \
                stgFC_01A=storwizev70000_00001a \
                stgFC_01B=storwizev70000_00001b \
                stgFC_01C=storwizev70000_00001c \
                stgFC_01D=storwizev70000_00001d \
                stgFC_01E=storwizev70000_00001e \
                stgFC_01F=storwizev70000_00001f
[root@300 ~]# vxdisk -o alldgs list
DEVICE                TYPE            DISK         GROUP        STATUS
disk_0                auto:LVM        -            -            online invalid
storwizev70000_00000a auto:cdsdisk    -            (dg_fencing) online
storwizev70000_00000b auto:cdsdisk    stgFC_00B    NSR_dg_nsr   online
storwizev70000_00000c auto:cdsdisk    stgFC_00C    NSR_dg_nsr   online
storwizev70000_00000d auto:cdsdisk    stgFC_00D    NSR_dg_nsr   online
storwizev70000_00000e auto:cdsdisk    stgFC_00E    NSR_dg_nsr   online
storwizev70000_00000f auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_00001a auto:cdsdisk    stgFC_01A    TSM0_dg      online
storwizev70000_00001b auto:cdsdisk    stgFC_01B    TSM0_dg      online
storwizev70000_00001c auto:cdsdisk    stgFC_01C    TSM0_dg      online
storwizev70000_00001d auto:cdsdisk    stgFC_01D    TSM0_dg      online
storwizev70000_00001e auto:cdsdisk    stgFC_01E    TSM0_dg      online
storwizev70000_00001f auto:cdsdisk    stgFC_01F    TSM0_dg      online
storwizev70000_000008 auto:cdsdisk    -            (dg_fencing) online
storwizev70000_000009 auto:cdsdisk    -            (dg_fencing) online
storwizev70000_000010 auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_000011 auto:cdsdisk    -            (RMAN_dg)    online
storwizev70000_000012 auto:cdsdisk    stgFC_012    TSM0_dg      online
storwizev70000_000013 auto:cdsdisk    stgFC_013    TSM0_dg      online
storwizev70000_000014 auto:cdsdisk    stgFC_014    TSM0_dg      online
storwizev70000_000015 auto:cdsdisk    stgFC_015    TSM0_dg      online
storwizev70000_000016 auto:cdsdisk    stgFC_016    TSM0_dg      online
storwizev70000_000017 auto:cdsdisk    stgFC_017    TSM0_dg      online
storwizev70000_000018 auto:cdsdisk    stgFC_018    TSM0_dg      online
storwizev70000_000019 auto:cdsdisk    stgFC_019    TSM0_dg      online
storwizev70000_000020 auto:cdsdisk    stgFC_020    TSM0_dg      online
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_instance     maxsize=32G   stgFC_020
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_active_log   maxsize=128G  stgFC_012
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_archive_log  maxsize=384G  stgFC_016
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_db_01        maxsize=300G  stgFC_013
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_db_02        maxsize=300G  stgFC_014
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_db_03        maxsize=300G  stgFC_015
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_db_backup_01 maxsize=900G  stgFC_017
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_db_backup_02 maxsize=900G  stgFC_018
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_db_backup_03 maxsize=900G  stgFC_019
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_pool0_01     maxsize=6700G stgFC_01A
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_pool0_02     maxsize=6700G stgFC_01B
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_pool0_03     maxsize=6700G stgFC_01C
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_pool0_04     maxsize=6700G stgFC_01D
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_pool0_05     maxsize=6700G stgFC_01E
[root@300 ~]# vxassist -g TSM0_dg make TSM0_vol_pool0_06     maxsize=6700G stgFC_01F
[root@300 ~]# vxprint -u h | grep ^sd | column -t
sd  stgFC_00B-01  NSR_vol_index-02          ENABLED  399.95g  0.00  -  -  -
sd  stgFC_00C-01  NSR_vol_media-02          ENABLED  9.96g    0.00  -  -  -
sd  stgFC_00D-01  NSR_vol_nsr-02            ENABLED  79.96g   0.00  -  -  -
sd  stgFC_00E-01  NSR_vol_res-02            ENABLED  9.96g    0.00  -  -  -
sd  stgFC_012-01  TSM0_vol_active_log-01    ENABLED  127.96g  0.00  -  -  -
sd  stgFC_016-01  TSM0_vol_archive_log-01   ENABLED  383.95g  0.00  -  -  -
sd  stgFC_017-01  TSM0_vol_db_backup_01-01  ENABLED  899.93g  0.00  -  -  -
sd  stgFC_018-01  TSM0_vol_db_backup_02-01  ENABLED  899.93g  0.00  -  -  -
sd  stgFC_019-01  TSM0_vol_db_backup_03-01  ENABLED  899.93g  0.00  -  -  -
sd  stgFC_013-01  TSM0_vol_db_01-01         ENABLED  299.95g  0.00  -  -  -
sd  stgFC_014-01  TSM0_vol_db_02-01         ENABLED  299.95g  0.00  -  -  -
sd  stgFC_015-01  TSM0_vol_db_03-01         ENABLED  299.95g  0.00  -  -  -
sd  stgFC_020-01  TSM0_vol_instance-01      ENABLED  31.96g   0.00  -  -  -
sd  stgFC_01A-01  TSM0_vol_pool0_01-01      ENABLED  6.54t    0.00  -  -  -
sd  stgFC_01B-01  TSM0_vol_pool0_02-01      ENABLED  6.54t    0.00  -  -  -
sd  stgFC_01C-01  TSM0_vol_pool0_03-01      ENABLED  6.54t    0.00  -  -  -
sd  stgFC_01D-01  TSM0_vol_pool0_04-01      ENABLED  6.54t    0.00  -  -  -
sd  stgFC_01E-01  TSM0_vol_pool0_05-01      ENABLED  6.54t    0.00  -  -  -
sd  stgFC_01F-01  TSM0_vol_pool0_06-01      ENABLED  6.54t    0.00  -  -  -
[root@300 ~]# vxprint -u h -g TSM0_dg | column -t
TY  NAME                      ASSOC                     KSTATE   LENGTH   PLOFFS  STATE   TUTIL0  PUTIL0
dg  TSM0_dg                   TSM0_dg                   -        -        -       -       -       -
dm  stgFC_01A                 storwizev70000_00001a     -        6.54t    -       -       -       -
dm  stgFC_01B                 storwizev70000_00001b     -        6.54t    -       -       -       -
dm  stgFC_01C                 storwizev70000_00001c     -        6.54t    -       -       -       -
dm  stgFC_01D                 storwizev70000_00001d     -        6.54t    -       -       -       -
dm  stgFC_01E                 storwizev70000_00001e     -        6.54t    -       -       -       -
dm  stgFC_01F                 storwizev70000_00001f     -        6.54t    -       -       -       -
dm  stgFC_012                 storwizev70000_000012     -        127.96g  -       -       -       -
dm  stgFC_013                 storwizev70000_000013     -        299.95g  -       -       -       -
dm  stgFC_014                 storwizev70000_000014     -        299.95g  -       -       -       -
dm  stgFC_015                 storwizev70000_000015     -        299.95g  -       -       -       -
dm  stgFC_016                 storwizev70000_000016     -        383.95g  -       -       -       -
dm  stgFC_017                 storwizev70000_000017     -        899.93g  -       -       -       -
dm  stgFC_018                 storwizev70000_000018     -        899.93g  -       -       -       -
dm  stgFC_019                 storwizev70000_000019     -        899.93g  -       -       -       -
dm  stgFC_020                 storwizev70000_000020     -        31.96g   -       -       -       -

v   TSM0_vol_active_log       fsgen                     ENABLED  127.96g  -       ACTIVE  -       -
pl  TSM0_vol_active_log-01    TSM0_vol_active_log       ENABLED  127.96g  -       ACTIVE  -       -
sd  stgFC_012-01              TSM0_vol_active_log-01    ENABLED  127.96g  0.00    -       -       -

v   TSM0_vol_archive_log      fsgen                     ENABLED  383.95g  -       ACTIVE  -       -
pl  TSM0_vol_archive_log-01   TSM0_vol_archive_log      ENABLED  383.95g  -       ACTIVE  -       -
sd  stgFC_016-01              TSM0_vol_archive_log-01   ENABLED  383.95g  0.00    -       -       -

v   TSM0_vol_db_backup_01     fsgen                     ENABLED  899.93g  -       ACTIVE  -       -
pl  TSM0_vol_db_backup_01-01  TSM0_vol_db_backup_01     ENABLED  899.93g  -       ACTIVE  -       -
sd  stgFC_017-01              TSM0_vol_db_backup_01-01  ENABLED  899.93g  0.00    -       -       -

v   TSM0_vol_db_backup_02     fsgen                     ENABLED  899.93g  -       ACTIVE  -       -
pl  TSM0_vol_db_backup_02-01  TSM0_vol_db_backup_02     ENABLED  899.93g  -       ACTIVE  -       -
sd  stgFC_018-01              TSM0_vol_db_backup_02-01  ENABLED  899.93g  0.00    -       -       -

v   TSM0_vol_db_backup_03     fsgen                     ENABLED  899.93g  -       ACTIVE  -       -
pl  TSM0_vol_db_backup_03-01  TSM0_vol_db_backup_03     ENABLED  899.93g  -       ACTIVE  -       -
sd  stgFC_019-01              TSM0_vol_db_backup_03-01  ENABLED  899.93g  0.00    -       -       -

v   TSM0_vol_db_01            fsgen                     ENABLED  299.95g  -       ACTIVE  -       -
pl  TSM0_vol_db_01-01         TSM0_vol_db_01            ENABLED  299.95g  -       ACTIVE  -       -
sd  stgFC_013-01              TSM0_vol_db_01-01         ENABLED  299.95g  0.00    -       -       -

v   TSM0_vol_db_02            fsgen                     ENABLED  299.95g  -       ACTIVE  -       -
pl  TSM0_vol_db_02-01         TSM0_vol_db_02            ENABLED  299.95g  -       ACTIVE  -       -
sd  stgFC_014-01              TSM0_vol_db_02-01         ENABLED  299.95g  0.00    -       -       -

v   TSM0_vol_db_03            fsgen                     ENABLED  299.95g  -       ACTIVE  -       -
pl  TSM0_vol_db_03-01         TSM0_vol_db_03            ENABLED  299.95g  -       ACTIVE  -       -
sd  stgFC_015-01              TSM0_vol_db_03-01         ENABLED  299.95g  0.00    -       -       -

v   TSM0_vol_instance         fsgen                     ENABLED  31.96g   -       ACTIVE  -       -
pl  TSM0_vol_instance-01      TSM0_vol_instance         ENABLED  31.96g   -       ACTIVE  -       -
sd  stgFC_020-01              TSM0_vol_instance-01      ENABLED  31.96g   0.00    -       -       -

v   TSM0_vol_pool0_01         fsgen                     ENABLED  6.54t    -       ACTIVE  -       -
pl  TSM0_vol_pool0_01-01      TSM0_vol_pool0_01         ENABLED  6.54t    -       ACTIVE  -       -
sd  stgFC_01A-01              TSM0_vol_pool0_01-01      ENABLED  6.54t    0.00    -       -       -

v   TSM0_vol_pool0_02         fsgen                     ENABLED  6.54t    -       ACTIVE  -       -
pl  TSM0_vol_pool0_02-01      TSM0_vol_pool0_02         ENABLED  6.54t    -       ACTIVE  -       -
sd  stgFC_01B-01              TSM0_vol_pool0_02-01      ENABLED  6.54t    0.00    -       -       -

v   TSM0_vol_pool0_03         fsgen                     ENABLED  6.54t    -       ACTIVE  -       -
pl  TSM0_vol_pool0_03-01      TSM0_vol_pool0_03         ENABLED  6.54t    -       ACTIVE  -       -
sd  stgFC_01C-01              TSM0_vol_pool0_03-01      ENABLED  6.54t    0.00    -       -       -

v   TSM0_vol_pool0_04         fsgen                     ENABLED  6.54t    -       ACTIVE  -       -
pl  TSM0_vol_pool0_04-01      TSM0_vol_pool0_04         ENABLED  6.54t    -       ACTIVE  -       -
sd  stgFC_01D-01              TSM0_vol_pool0_04-01      ENABLED  6.54t    0.00    -       -       -

v   TSM0_vol_pool0_05         fsgen                     ENABLED  6.54t    -       ACTIVE  -       -
pl  TSM0_vol_pool0_05-01      TSM0_vol_pool0_05         ENABLED  6.54t    -       ACTIVE  -       -
sd  stgFC_01E-01              TSM0_vol_pool0_05-01      ENABLED  6.54t    0.00    -       -       -

v   TSM0_vol_pool0_06         fsgen                     ENABLED  6.54t    -       ACTIVE  -       -
pl  TSM0_vol_pool0_06-01      TSM0_vol_pool0_06         ENABLED  6.54t    -       ACTIVE  -       -
sd  stgFC_01F-01              TSM0_vol_pool0_06-01      ENABLED  6.54t    0.00    -       -       -
[root@300 ~]# vxinfo -p -g TSM0_dg | column -t
vol   TSM0_vol_instance         fsgen   Started
plex  TSM0_vol_instance-01      ACTIVE
vol   TSM0_vol_active_log       fsgen   Started
plex  TSM0_vol_active_log-01    ACTIVE
vol   TSM0_vol_archive_log      fsgen   Started
plex  TSM0_vol_archive_log-01   ACTIVE
vol   TSM0_vol_db_01            fsgen   Started
plex  TSM0_vol_db_01-01         ACTIVE
vol   TSM0_vol_db_02            fsgen   Started
plex  TSM0_vol_db_02-01         ACTIVE
vol   TSM0_vol_db_03            fsgen   Started
plex  TSM0_vol_db_03-01         ACTIVE
vol   TSM0_vol_db_backup_01     fsgen   Started
plex  TSM0_vol_db_backup_01-01  ACTIVE
vol   TSM0_vol_db_backup_02     fsgen   Started
plex  TSM0_vol_db_backup_02-01  ACTIVE
vol   TSM0_vol_db_backup_03     fsgen   Started
plex  TSM0_vol_db_backup_03-01  ACTIVE
vol   TSM0_vol_pool0_01         fsgen   Started
plex  TSM0_vol_pool0_01-01      ACTIVE
vol   TSM0_vol_pool0_02         fsgen   Started
plex  TSM0_vol_pool0_02-01      ACTIVE
vol   TSM0_vol_pool0_03         fsgen   Started
plex  TSM0_vol_pool0_03-01      ACTIVE
vol   TSM0_vol_pool0_04         fsgen   Started
plex  TSM0_vol_pool0_04-01      ACTIVE
vol   TSM0_vol_pool0_05         fsgen   Started
plex  TSM0_vol_pool0_05-01      ACTIVE
vol   TSM0_vol_pool0_06         fsgen   Started
plex  TSM0_vol_pool0_06-01      ACTIVE
[root@300 ~]# find /dev/vx/dsk -name TSM0_\*
/dev/vx/dsk/TSM0_dg
/dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_06
/dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_05
/dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_04
/dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_03
/dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_02
/dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_01
/dev/vx/dsk/TSM0_dg/TSM0_vol_db_backup_03
/dev/vx/dsk/TSM0_dg/TSM0_vol_db_backup_02
/dev/vx/dsk/TSM0_dg/TSM0_vol_db_backup_01
/dev/vx/dsk/TSM0_dg/TSM0_vol_db_03
/dev/vx/dsk/TSM0_dg/TSM0_vol_db_02
/dev/vx/dsk/TSM0_dg/TSM0_vol_db_01
/dev/vx/dsk/TSM0_dg/TSM0_vol_archive_log
/dev/vx/dsk/TSM0_dg/TSM0_vol_active_log
/dev/vx/dsk/TSM0_dg/TSM0_vol_instance
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_pool0_06     &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_pool0_05     &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_pool0_04     &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_pool0_03     &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_pool0_02     &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_pool0_01     &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_db_backup_03 &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_db_backup_02 &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_db_backup_01 &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_db_03        &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_db_02        &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_db_01        &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_archive_log  &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_active_log   &
[root@300 ~]# mkfs -t vxfs -o bsize=8192,largefiles /dev/vx/rdsk/TSM0_dg/TSM0_vol_instance     &
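
As all the mkfs commands above were started in the background, a plain shell wait makes sure they have all finished before we continue.

[root@300 ~]# wait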

[root@300 ~]# haconf -dump -makero

Veritas Cluster Server Group

Now that we have the LUNs initialized into a Disk Group we can create the cluster service.

[root@300 ~]# haconf -makerw
[root@300 ~]# hagrp -add TSM0_site
VCS NOTICE V-16-1-10136 Group added; populating SystemList and setting the Parallel attribute recommended before adding resources
[root@300 ~]# hagrp -modify TSM0_site SystemList 300 0 301 1
[root@300 ~]# hagrp -modify TSM0_site AutoStartList 300 301
[root@300 ~]# hagrp -modify TSM0_site Parallel 0
[root@300 ~]# hares -add    TSM0_nic_bond0 NIC TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_nic_bond0 Critical 1
[root@300 ~]# hares -modify TSM0_nic_bond0 PingOptimize 1
[root@300 ~]# hares -modify TSM0_nic_bond0 Device bond0
[root@300 ~]# hares -modify TSM0_nic_bond0 Enabled 1
[root@300 ~]# hares -probe  TSM0_nic_bond0 -sys 301
[root@300 ~]# hares -add    TSM0_ip_bond0 IP TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_ip_bond0 Critical 1
[root@300 ~]# hares -modify TSM0_ip_bond0 Device bond0
[root@300 ~]# hares -modify TSM0_ip_bond0 Address 10.20.30.44
[root@300 ~]# hares -modify TSM0_ip_bond0 NetMask 255.255.255.0
[root@300 ~]# hares -modify TSM0_ip_bond0 Enabled 1
[root@300 ~]# hares -link   TSM0_ip_bond0 TSM0_nic_bond0
[root@300 ~]# hares -add    TSM0_dg DiskGroup TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_dg Critical 1
[root@300 ~]# hares -modify TSM0_dg DiskGroup TSM0_dg
[root@300 ~]# hares -modify TSM0_dg Enabled 1
[root@300 ~]# hares -probe  TSM0_dg -sys 301
[root@300 ~]# mkdir /tsm0
[root@301 ~]# mkdir /tsm0

I did not want to type all of these over and over again so I generated these commands as shown below.

[LOCAL] % cat > LIST << __EOF
stgFC_020    32  /tsm0                         TSM0_vol_instance      TSM0_mnt_instance
stgFC_012   128  /tsm0/active_log              TSM0_vol_active_log    TSM0_mnt_active_log
stgFC_016   384  /tsm0/archive_log             TSM0_vol_archive_log   TSM0_mnt_archive_log
stgFC_013   300  /tsm0/db/db_01                TSM0_vol_db_01         TSM0_mnt_db_01
stgFC_014   300  /tsm0/db/db_02                TSM0_vol_db_02         TSM0_mnt_db_02
stgFC_015   300  /tsm0/db/db_03                TSM0_vol_db_03         TSM0_mnt_db_03
stgFC_017   900  /tsm0/db_backup/db_backup_01  TSM0_vol_db_backup_01  TSM0_mnt_db_backup_01
stgFC_018   900  /tsm0/db_backup/db_backup_02  TSM0_vol_db_backup_02  TSM0_mnt_db_backup_02
stgFC_019   900  /tsm0/db_backup/db_backup_03  TSM0_vol_db_backup_03  TSM0_mnt_db_backup_03
stgFC_01A  6700  /tsm0/pool0/pool0_01          TSM0_vol_pool0_01      TSM0_mnt_pool0_01
stgFC_01B  6700  /tsm0/pool0/pool0_02          TSM0_vol_pool0_02      TSM0_mnt_pool0_02
stgFC_01C  6700  /tsm0/pool0/pool0_03          TSM0_vol_pool0_03      TSM0_mnt_pool0_03
stgFC_01D  6700  /tsm0/pool0/pool0_04          TSM0_vol_pool0_04      TSM0_mnt_pool0_04
stgFC_01E  6700  /tsm0/pool0/pool0_05          TSM0_vol_pool0_05      TSM0_mnt_pool0_05
stgFC_01F  6700  /tsm0/pool0/pool0_06          TSM0_vol_pool0_06      TSM0_mnt_pool0_06
__EOF
[LOCAL] % cat LIST \
  | while read STG SIZE MNTPOINT VOL MNTNAME
    do
      echo sleep 0.2; echo hares -add    ${MNTNAME} Mount TSM0_site
      echo sleep 0.2; echo hares -modify ${MNTNAME} Critical 1
      echo sleep 0.2; echo hares -modify ${MNTNAME} SnapUmount 0
      echo sleep 0.2; echo hares -modify ${MNTNAME} MountPoint ${MNTPOINT}
      echo sleep 0.2; echo hares -modify ${MNTNAME} BlockDevice /dev/vx/dsk/TSM0_dg/${VOL}
      echo sleep 0.2; echo hares -modify ${MNTNAME} FSType vxfs
      echo sleep 0.2; echo hares -modify ${MNTNAME} MountOpt largefiles
      echo sleep 0.2; echo hares -modify ${MNTNAME} FsckOpt %-y
      echo sleep 0.2; echo hares -modify ${MNTNAME} Enabled 1
      echo sleep 0.2; echo hares -probe  ${MNTNAME} -sys 301
      echo sleep 0.2; echo hares -link   ${MNTNAME} TSM0_dg
      echo
    done
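
Instead of pasting the generated commands one by one you could also redirect them into a file and run it with sh(1) on the first node – the TSM0-mounts.cmds file name below is just an example.

[LOCAL] % scp TSM0-mounts.cmds root@300:/root/
[root@300 ~]# sh /root/TSM0-mounts.cmds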
[root@300 ~]# hares -add    TSM0_mnt_instance Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_instance Critical 1
[root@300 ~]# hares -modify TSM0_mnt_instance SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_instance MountPoint /tsm0
[root@300 ~]# hares -modify TSM0_mnt_instance BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_instance
[root@300 ~]# hares -modify TSM0_mnt_instance FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_instance MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_instance FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_instance Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_instance -sys 301
[root@300 ~]# hares -link   TSM0_mnt_instance TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_active_log Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_active_log Critical 1
[root@300 ~]# hares -modify TSM0_mnt_active_log SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_active_log MountPoint /tsm0/active_log
[root@300 ~]# hares -modify TSM0_mnt_active_log BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_active_log
[root@300 ~]# hares -modify TSM0_mnt_active_log FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_active_log MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_active_log FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_active_log Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_active_log -sys 301
[root@300 ~]# hares -link   TSM0_mnt_active_log TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_archive_log Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_archive_log Critical 1
[root@300 ~]# hares -modify TSM0_mnt_archive_log SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_archive_log MountPoint /tsm0/archive_log
[root@300 ~]# hares -modify TSM0_mnt_archive_log BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_archive_log
[root@300 ~]# hares -modify TSM0_mnt_archive_log FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_archive_log MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_archive_log FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_archive_log Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_archive_log -sys 301
[root@300 ~]# hares -link   TSM0_mnt_archive_log TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_db_01 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_db_01 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_db_01 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_db_01 MountPoint /tsm0/db/db_01
[root@300 ~]# hares -modify TSM0_mnt_db_01 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_db_01
[root@300 ~]# hares -modify TSM0_mnt_db_01 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_db_01 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_db_01 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_db_01 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_db_01 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_db_01 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_db_02 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_db_02 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_db_02 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_db_02 MountPoint /tsm0/db/db_02
[root@300 ~]# hares -modify TSM0_mnt_db_02 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_db_02
[root@300 ~]# hares -modify TSM0_mnt_db_02 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_db_02 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_db_02 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_db_02 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_db_02 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_db_02 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_db_03 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_db_03 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_db_03 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_db_03 MountPoint /tsm0/db/db_03
[root@300 ~]# hares -modify TSM0_mnt_db_03 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_db_03
[root@300 ~]# hares -modify TSM0_mnt_db_03 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_db_03 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_db_03 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_db_03 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_db_03 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_db_03 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_db_backup_01 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 MountPoint /tsm0/db_backup/db_backup_01
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_db_backup_01
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_db_backup_01 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_db_backup_01 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_db_backup_01 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_db_backup_02 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 MountPoint /tsm0/db_backup/db_backup_02
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_db_backup_02
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_db_backup_02 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_db_backup_02 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_db_backup_02 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_db_backup_03 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 MountPoint /tsm0/db_backup/db_backup_03
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_db_backup_03
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_db_backup_03 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_db_backup_03 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_db_backup_03 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_pool0_01 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 MountPoint /tsm0/pool0/pool0_01
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_01
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_pool0_01 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_pool0_01 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_pool0_01 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_pool0_02 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 MountPoint /tsm0/pool0/pool0_02
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_02
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_pool0_02 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_pool0_02 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_pool0_02 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_pool0_03 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 MountPoint /tsm0/pool0/pool0_03
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_03
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_pool0_03 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_pool0_03 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_pool0_03 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_pool0_04 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 MountPoint /tsm0/pool0/pool0_04
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_04
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_pool0_04 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_pool0_04 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_pool0_04 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_pool0_05 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 MountPoint /tsm0/pool0/pool0_05
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_05
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_pool0_05 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_pool0_05 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_pool0_05 TSM0_dg
[root@300 ~]# hares -add    TSM0_mnt_pool0_06 Mount TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 Critical 1
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 SnapUmount 0
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 MountPoint /tsm0/pool0/pool0_06
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 BlockDevice /dev/vx/dsk/TSM0_dg/TSM0_vol_pool0_06
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 FSType vxfs
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 MountOpt largefiles
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 FsckOpt %-y
[root@300 ~]# hares -modify TSM0_mnt_pool0_06 Enabled 1
[root@300 ~]# hares -probe  TSM0_mnt_pool0_06 -sys 301
[root@300 ~]# hares -link   TSM0_mnt_pool0_06 TSM0_dg
[root@300 ~]# hares -state | grep TSM0 | grep _mnt_ | \
                while read I; do hares -display $I 2>&1 | grep -v ArgListValues | grep 'largefiles'; done | column -t
TSM0_mnt_active_log    MountOpt  localclus  largefiles
TSM0_mnt_active_log    MountOpt  localclus  largefiles
TSM0_mnt_archive_log   MountOpt  localclus  largefiles
TSM0_mnt_archive_log   MountOpt  localclus  largefiles
TSM0_mnt_db_01         MountOpt  localclus  largefiles
TSM0_mnt_db_01         MountOpt  localclus  largefiles
TSM0_mnt_db_02         MountOpt  localclus  largefiles
TSM0_mnt_db_02         MountOpt  localclus  largefiles
TSM0_mnt_db_03         MountOpt  localclus  largefiles
TSM0_mnt_db_03         MountOpt  localclus  largefiles
TSM0_mnt_db_backup_01  MountOpt  localclus  largefiles
TSM0_mnt_db_backup_01  MountOpt  localclus  largefiles
TSM0_mnt_db_backup_02  MountOpt  localclus  largefiles
TSM0_mnt_db_backup_02  MountOpt  localclus  largefiles
TSM0_mnt_db_backup_03  MountOpt  localclus  largefiles
TSM0_mnt_db_backup_03  MountOpt  localclus  largefiles
TSM0_mnt_instance      MountOpt  localclus  largefiles
TSM0_mnt_instance      MountOpt  localclus  largefiles
TSM0_mnt_pool0_01      MountOpt  localclus  largefiles
TSM0_mnt_pool0_01      MountOpt  localclus  largefiles
TSM0_mnt_pool0_02      MountOpt  localclus  largefiles
TSM0_mnt_pool0_02      MountOpt  localclus  largefiles
TSM0_mnt_pool0_03      MountOpt  localclus  largefiles
TSM0_mnt_pool0_03      MountOpt  localclus  largefiles
TSM0_mnt_pool0_04      MountOpt  localclus  largefiles
TSM0_mnt_pool0_04      MountOpt  localclus  largefiles
TSM0_mnt_pool0_05      MountOpt  localclus  largefiles
TSM0_mnt_pool0_05      MountOpt  localclus  largefiles
TSM0_mnt_pool0_06      MountOpt  localclus  largefiles
TSM0_mnt_pool0_06      MountOpt  localclus  largefiles
[root@300 ~]# hares -add    TSM0_server Application TSM0_site
VCS NOTICE V-16-1-10242 Resource added. Enabled attribute must be set before agent monitors
[root@300 ~]# hares -modify TSM0_server StartProgram   "/etc/init.d/tsm0 start"
[root@300 ~]# hares -modify TSM0_server StopProgram    "/etc/init.d/tsm0 stop"
[root@300 ~]# hares -modify TSM0_server MonitorProgram "/etc/init.d/tsm0 status"
[root@300 ~]# hares -modify TSM0_server Enabled 1
[root@300 ~]# hares -probe  TSM0_server -sys 301
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_active_log
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_archive_log
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_db_01
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_db_02
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_db_03
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_db_backup_01
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_db_backup_02
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_db_backup_03
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_pool0_01
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_pool0_02
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_pool0_03
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_pool0_04
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_pool0_05
[root@300 ~]# hares -link   TSM0_server           TSM0_mnt_pool0_06
[root@300 ~]# hares -link   TSM0_server           TSM0_ip_bond0
[root@300 ~]# hares -link   TSM0_mnt_active_log   TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_archive_log  TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_db_01        TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_db_02        TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_db_03        TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_db_backup_01 TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_db_backup_02 TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_db_backup_03 TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_pool0_01     TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_pool0_02     TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_pool0_03     TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_pool0_04     TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_pool0_05     TSM0_mnt_instance
[root@300 ~]# hares -link   TSM0_mnt_pool0_06     TSM0_mnt_instance
[root@300 ~]# vxdg import TSM0_dg
[root@300 ~]# mount -t vxfs /dev/vx/dsk/TSM0_dg/TSM0_vol_instance /tsm0
[root@301 ~]# mkdir -p /tsm0/active_log
[root@301 ~]# mkdir -p /tsm0/archive_log
[root@300 ~]# mkdir -p /tsm0/db/db_01
[root@300 ~]# mkdir -p /tsm0/db/db_02
[root@300 ~]# mkdir -p /tsm0/db/db_03
[root@300 ~]# mkdir -p /tsm0/db_backup/db_backup_01
[root@300 ~]# mkdir -p /tsm0/db_backup/db_backup_02
[root@300 ~]# mkdir -p /tsm0/db_backup/db_backup_03
[root@300 ~]# mkdir -p /tsm0/pool0/pool0_01
[root@300 ~]# mkdir -p /tsm0/pool0/pool0_02
[root@300 ~]# mkdir -p /tsm0/pool0/pool0_03
[root@300 ~]# mkdir -p /tsm0/pool0/pool0_04
[root@300 ~]# mkdir -p /tsm0/pool0/pool0_05
[root@300 ~]# mkdir -p /tsm0/pool0/pool0_06
[root@300 ~]# find /tsm0
/tsm0
/tsm0/lost+found
/tsm0/active_log
/tsm0/archive_log
/tsm0/db
/tsm0/db/db_01
/tsm0/db/db_02
/tsm0/db/db_03
/tsm0/db_backup
/tsm0/db_backup/db_backup_01
/tsm0/db_backup/db_backup_02
/tsm0/db_backup/db_backup_03
/tsm0/pool0
/tsm0/pool0/pool0_01
/tsm0/pool0/pool0_02
/tsm0/pool0/pool0_03
/tsm0/pool0/pool0_04
/tsm0/pool0/pool0_05
/tsm0/pool0/pool0_06
[root@300 ~]# umount /tsm0
[root@300 ~]# vxdg deport TSM0_dg
[root@300 ~]# haconf -dump -makero
[root@300 ~]# grep TSM0_server /etc/VRTSvcs/conf/config/main.cf
        Application TSM0_server (
        TSM0_server requires TSM0_ip_bond0
        TSM0_server requires TSM0_mnt_active_log
        TSM0_server requires TSM0_mnt_archive_log
        TSM0_server requires TSM0_mnt_db_01
        TSM0_server requires TSM0_mnt_db_02
        TSM0_server requires TSM0_mnt_db_03
        TSM0_server requires TSM0_mnt_db_backup_01
        TSM0_server requires TSM0_mnt_db_backup_02
        TSM0_server requires TSM0_mnt_db_backup_03
        TSM0_server requires TSM0_mnt_instance
        TSM0_server requires TSM0_mnt_pool0_01
        TSM0_server requires TSM0_mnt_pool0_02
        TSM0_server requires TSM0_mnt_pool0_03
        TSM0_server requires TSM0_mnt_pool0_04
        TSM0_server requires TSM0_mnt_pool0_05
        TSM0_server requires TSM0_mnt_pool0_06
        //      Application TSM0_server

Local Per Node Resources

[root@300 ~]# lvcreate -n lv_tmp        -L  4G vg_local
[root@300 ~]# lvcreate -n lv_opt_tivoli -L 16G vg_local
[root@300 ~]# lvcreate -n lv_home       -L  4G vg_local
[root@300 ~]# mkfs.ext3 /dev/vg_local/lv_tmp
[root@300 ~]# mkfs.ext3 /dev/vg_local/lv_opt_tivoli
[root@300 ~]# mkfs.ext3 /dev/vg_local/lv_home
[root@301 ~]# lvcreate -n lv_tmp        -L  4G vg_local
[root@301 ~]# lvcreate -n lv_opt_tivoli -L 16G vg_local
[root@301 ~]# lvcreate -n lv_home       -L  4G vg_local
[root@301 ~]# mkfs.ext3 /dev/vg_local/lv_tmp
[root@301 ~]# mkfs.ext3 /dev/vg_local/lv_opt_tivoli
[root@301 ~]# mkfs.ext3 /dev/vg_local/lv_home
[root@300 ~]# cat /etc/fstab
/dev/mapper/vg_local-lv_root              /           ext3 rw,noatime,nodiratime      1 1
UUID=28d0988a-e6d7-48d8-b0e5-0f70f8eb681e /boot       ext3 defaults                   1 2
UUID=D401-661A                            /boot/efi   vfat umask=0077,shortname=winnt 0 0
/dev/vg_local/lv_swap                     swap        swap defaults                   0 0
/dev/vg_local/lv_tmp                      /tmp        ext3 rw,noatime,nodiratime      2 2
/dev/vg_local/lv_opt_tivoli               /opt/tivoli ext3 rw,noatime,nodiratime      2 2
/dev/vg_local/lv_home                     /home       ext3 rw,noatime,nodiratime      2 2

# VIRT
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0

Install IBM TSM Server Dependencies.

[root@ANY ~]# yum install numactl
[root@ANY ~]# yum install /usr/lib/libgtk-x11-2.0.so.0
[root@ANY ~]# yum install /usr/lib64/libgtk-x11-2.0.so.0
[root@ANY ~]# yum install xorg-x11-xauth xterm fontconfig libICE \
                          libX11-common libXau libXmu libSM libX11 libXt

System /etc/sysctl.conf parameters for both nodes.

[root@300 ~]# cat /etc/sysctl.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the default maxmimum size of a mesage queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 206158430208

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

# For SF HA
kernel.hung_task_panic=0

# NetWorker
# connection backlog (hash tables) to the maximum value allowed
net.ipv4.tcp_max_syn_backlog = 8192
net.core.netdev_max_backlog = 8192

# increase the memory size available for TCP buffers
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 8192 524288 16777216
net.ipv4.tcp_wmem = 8192 524288 16777216

# recommended keepalive values
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 20
net.ipv4.tcp_keepalive_time = 600

# recommended timeout after improper close
net.ipv4.tcp_fin_timeout = 60
sunrpc.tcp_slot_table_entries = 64

# for RDBMS 11.2.0.4 rman cat
fs.suid_dumpable = 1
fs.aio-max-nr = 1048576
fs.file-max = 6815744

# support EMC 2016.04.20
net.core.somaxconn = 1024

# 256 * RAM in GB
kernel.shmmni = 65536

# TSM/NSR
kernel.sem = 250 256000 32 65536

# RAM in GB * 1024
kernel.msgmni = 262144

# TSM
kernel.randomize_va_space = 0
vm.swappiness = 0
vm.overcommit_memory = 0
[root@301 ~]# cat /etc/sysctl.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the default maxmimum size of a mesage queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 206158430208

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

# For SF HA
kernel.hung_task_panic=0

# NetWorker
# connection backlog (hash tables) to the maximum value allowed
net.ipv4.tcp_max_syn_backlog = 8192
net.core.netdev_max_backlog = 8192

# increase the memory size available for TCP buffers
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 8192 524288 16777216
net.ipv4.tcp_wmem = 8192 524288 16777216

# recommended keepalive values
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 20
net.ipv4.tcp_keepalive_time = 600

# recommended timeout after improper close
net.ipv4.tcp_fin_timeout = 60
sunrpc.tcp_slot_table_entries = 64

# for RDBMS 11.2.0.4 rman cat
fs.suid_dumpable = 1
fs.aio-max-nr = 1048576
fs.file-max = 6815744

# support EMC 2016.04.20
net.core.somaxconn = 1024

# 256 * RAM in GB
kernel.shmmni = 65536

# TSM/NSR
kernel.sem = 250 256000 32 65536

# RAM in GB * 1024
kernel.msgmni = 262144

# TSM
kernel.randomize_va_space = 0
vm.swappiness = 0
vm.overcommit_memory = 0
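
These parameters can be applied on both nodes without a reboot by simply reloading the file.

# Load the kernel parameters from /etc/sysctl.conf into the running kernel.
sysctl -p /etc/sysctl.conf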

Install IBM TSM Server

Connect to each node with SSH Forwarding enabled and install IBM TSM server.
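
The IBM installers are graphical, so the SSH sessions need X11 forwarding. A minimal sketch, assuming the nodes are reachable under their short names 300 and 301, could be:

# X11-forwarded SSH sessions to both cluster nodes.
ssh -X root@300
ssh -X root@301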

[root@300 ~]# chmod +x 7.1.6.000-TIV-TSMSRV-Linuxx86_64.bin
[root@300 ~]# ./7.1.6.000-TIV-TSMSRV-Linuxx86_64.bin
[root@300 ~]# ./install.sh

… and the second node.

[root@301 ~]# chmod +x 7.1.6.000-TIV-TSMSRV-Linuxx86_64.bin
[root@301 ~]# ./7.1.6.000-TIV-TSMSRV-Linuxx86_64.bin
[root@301 ~]# ./install.sh

Options chosen during installation.

INSTALL | DESELECT 'Languages' and DESELECT 'Operations Center'
INSTALL | /opt/tivoli/IBM/IBMIMShared
INSTALL | /opt/tivoli/IBM/InstallationManager/eclipse
INSTALL | /opt/tivoli/tsm

Screenshots from the installation process.

ibm-tsm-install-01

ibm-tsm-install-02

ibm-tsm-install-03

ibm-tsm-install-04

ibm-tsm-install-05

ibm-tsm-install-06

Install IBM TSM Client

[root@300 ~]# yum localinstall gskcrypt64-8.0.50.66.linux.x86_64.rpm \
                               gskssl64-8.0.50.66.linux.x86_64.rpm \
                               TIVsm-API64.x86_64.rpm \
                               TIVsm-BA.x86_64.rpm
[root@301 ~]# yum localinstall gskcrypt64-8.0.50.66.linux.x86_64.rpm \
                               gskssl64-8.0.50.66.linux.x86_64.rpm \
                               TIVsm-API64.x86_64.rpm \
                               TIVsm-BA.x86_64.rpm

Nodes Configuration for IBM TSM Server

[root@300 ~]# useradd -u 1500 -m tsm0
[root@301 ~]# useradd -u 1500 -m tsm0
[root@300 ~]# passwd tsm0
Changing password for user tsm0.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.

[root@301 ~]# passwd tsm0
Changing password for user tsm0.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@300 ~]# tail -1 /etc/passwd
tsm0:x:1500:1500::/home/tsm0:/bin/bash

[root@301 ~]# tail -1 /etc/passwd
tsm0:x:1500:1500::/home/tsm0:/bin/bash
[root@300 ~]# tail -1 /etc/group
tsm0:x:1500:

[root@301 ~]# tail -1 /etc/group
tsm0:x:1500:
[root@300 ~]# cat /etc/security/limits.conf
# ORACLE
oracle              soft    nproc   16384
oracle              hard    nproc   16384
oracle              soft    nofile  4096
oracle              hard    nofile  65536
oracle              soft    stack   10240

# TSM
tsm0                soft    nofile  32768
tsm0                hard    nofile  32768

[root@301 ~]# cat /etc/security/limits.conf
# ORACLE
oracle              soft    nproc   16384
oracle              hard    nproc   16384
oracle              soft    nofile  4096
oracle              hard    nofile  65536
oracle              soft    stack   10240

# TSM
tsm0                soft    nofile  32768
tsm0                hard    nofile  32768
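
The new nofile limits can be verified from a fresh login of the instance user, assuming pam_limits(8) kicks in for su(1) on these systems; an SSH login as tsm0 works just as well.

# Show the hard and soft open-files limits as seen by the tsm0 user.
su - tsm0 -c 'ulimit -Hn; ulimit -Sn'
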
[root@300 ~]# :> /var/run/dsmserv_tsm0.pid
[root@301 ~]# :> /var/run/dsmserv_tsm0.pid
[root@300 ~]# chown tsm0:tsm0 /var/run/dsmserv_tsm0.pid
[root@301 ~]# chown tsm0:tsm0 /var/run/dsmserv_tsm0.pid
[root@300 ~]# hares -state | grep TSM
TSM0_dg               State                 300  OFFLINE
TSM0_dg               State                 301  OFFLINE
TSM0_ip_bond0         State                 300  OFFLINE
TSM0_ip_bond0         State                 301  OFFLINE
TSM0_mnt_active_log   State                 300  OFFLINE
TSM0_mnt_active_log   State                 301  OFFLINE
TSM0_mnt_archive_log  State                 300  OFFLINE
TSM0_mnt_archive_log  State                 301  OFFLINE
TSM0_mnt_db_01        State                 300  OFFLINE
TSM0_mnt_db_01        State                 301  OFFLINE
TSM0_mnt_db_02        State                 300  OFFLINE
TSM0_mnt_db_02        State                 301  OFFLINE
TSM0_mnt_db_03        State                 300  OFFLINE
TSM0_mnt_db_03        State                 301  OFFLINE
TSM0_mnt_db_backup_01 State                 300  OFFLINE
TSM0_mnt_db_backup_01 State                 301  OFFLINE
TSM0_mnt_db_backup_02 State                 300  OFFLINE
TSM0_mnt_db_backup_02 State                 301  OFFLINE
TSM0_mnt_db_backup_03 State                 300  OFFLINE
TSM0_mnt_db_backup_03 State                 301  OFFLINE
TSM0_mnt_instance     State                 300  OFFLINE
TSM0_mnt_instance     State                 301  OFFLINE
TSM0_mnt_pool0_01     State                 300  OFFLINE
TSM0_mnt_pool0_01     State                 301  OFFLINE
TSM0_mnt_pool0_02     State                 300  OFFLINE
TSM0_mnt_pool0_02     State                 301  OFFLINE
TSM0_mnt_pool0_03     State                 300  OFFLINE
TSM0_mnt_pool0_03     State                 301  OFFLINE
TSM0_mnt_pool0_04     State                 300  OFFLINE
TSM0_mnt_pool0_04     State                 301  OFFLINE
TSM0_mnt_pool0_05     State                 300  OFFLINE
TSM0_mnt_pool0_05     State                 301  OFFLINE
TSM0_mnt_pool0_06     State                 300  OFFLINE
TSM0_mnt_pool0_06     State                 301  OFFLINE
TSM0_nic_bond0        State                 300  ONLINE
TSM0_nic_bond0        State                 301  ONLINE
TSM0_server           State                 300  OFFLINE
TSM0_server           State                 301  OFFLINE
[root@300 ~]# hares -online TSM0_mnt_instance -sys $( hostname -s )
[root@300 ~]# hares -online TSM0_ip_bond0     -sys $( hostname -s )
[root@300 ~]# hares -state | grep TSM0 | grep 301 | grep mnt | grep -v instance | awk '{print $1}' \
                | while read I; do hares -online ${I} -sys $( hostname -s ); done
[root@300 ~]# hares -state | grep 301 | grep TSM0
TSM0_dg               State                 301  ONLINE
TSM0_ip_bond0         State                 301  ONLINE
TSM0_mnt_active_log   State                 301  ONLINE
TSM0_mnt_archive_log  State                 301  ONLINE
TSM0_mnt_db_01        State                 301  ONLINE
TSM0_mnt_db_02        State                 301  ONLINE
TSM0_mnt_db_03        State                 301  ONLINE
TSM0_mnt_db_backup_01 State                 301  ONLINE
TSM0_mnt_db_backup_02 State                 301  ONLINE
TSM0_mnt_db_backup_03 State                 301  ONLINE
TSM0_mnt_instance     State                 301  ONLINE
TSM0_mnt_pool0_01     State                 301  ONLINE
TSM0_mnt_pool0_02     State                 301  ONLINE
TSM0_mnt_pool0_03     State                 301  ONLINE
TSM0_mnt_pool0_04     State                 301  ONLINE
TSM0_mnt_pool0_05     State                 301  ONLINE
TSM0_mnt_pool0_06     State                 301  ONLINE
TSM0_nic_bond0        State                 301  ONLINE
TSM0_server           State                 301  OFFLINE
[root@300 ~]# find /tsm0 | grep -v 'lost+found'
/tsm0
/tsm0/active_log
/tsm0/archive_log
/tsm0/db
/tsm0/db/db_01
/tsm0/db/db_02
/tsm0/db/db_03
/tsm0/db_backup
/tsm0/db_backup/db_backup_01
/tsm0/db_backup/db_backup_02
/tsm0/db_backup/db_backup_03
/tsm0/pool0
/tsm0/pool0/pool0_01
/tsm0/pool0/pool0_02
/tsm0/pool0/pool0_03
/tsm0/pool0/pool0_04
/tsm0/pool0/pool0_05
/tsm0/pool0/pool0_06
[root@300 ~]# chown -R tsm0:tsm0 /tsm0

IBM TSM Server Configuration

Connect to one of the nodes with SSH Forwarding enabled.

[root@300 ~]# cd /opt/tivoli/tsm/server/bin
[root@300 /opt/tivoli/tsm/server/bin]# ./dsmicfgx
Preparing to install...
Extracting the JRE from the installer archive...
Unpacking the JRE...
Extracting the installation resources from the installer archive...
Configuring the installer for this system's environment...

Launching installer...

Options chosen during configuration.

INSTALL | Instance user ID:
INSTALL |     tsm0
INSTALL |
INSTALL | Instance directory:
INSTALL |     /tsm0
INSTALL |
INSTALL | Database directories:
INSTALL |     /tsm0/db/db_01
INSTALL |     /tsm0/db/db_02
INSTALL |     /tsm0/db/db_03
INSTALL |
INSTALL | Active log directory:
INSTALL |     /tsm0/active_log
INSTALL |
INSTALL | Primary archive log directory:
INSTALL |     /tsm0/archive_log
INSTALL |
INSTALL | Instance autostart setting:
INSTALL |     Start automatically using the instance user ID

Screenshots from the configuration process.

ibm-tsm-configure-01

ibm-tsm-configure-02

ibm-tsm-configure-03

ibm-tsm-configure-04

ibm-tsm-configure-05

ibm-tsm-configure-06

ibm-tsm-configure-07

ibm-tsm-configure-08

ibm-tsm-configure-09

Log from the IBM TSM DB2 instance creation.

Creating the database manager instance...
The database manager instance was created successfully.

Formatting the server database...

ANR7800I DSMSERV generated at 16:39:04 on Jun  8 2016.

IBM Tivoli Storage Manager for Linux/x86_64
Version 7, Release 1, Level 6.000

Licensed Materials - Property of IBM

(C) Copyright IBM Corporation 1990, 2016.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR7801I Subsystem process ID is 5208.
ANR0900I Processing options file /tsm0/dsmserv.opt.
ANR0010W Unable to open message catalog for language en_US.UTF-8. The default
language message catalog will be used.
ANR7814I Using instance directory /tsm0.
ANR4726I The ICC support module has been loaded.
ANR0152I Database manager successfully started.
ANR2976I Offline DB backup for database TSMDB1 started.
ANR2974I Offline DB backup for database TSMDB1 completed successfully.
ANR0992I Server's database formatting complete.
ANR0369I Stopping the database manager because of a server shutdown.

Format completed with return code 0
Beginning initial configuration...

ANR7800I DSMSERV generated at 16:39:04 on Jun  8 2016.

IBM Tivoli Storage Manager for Linux/x86_64
Version 7, Release 1, Level 6.000

Licensed Materials - Property of IBM

(C) Copyright IBM Corporation 1990, 2016.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR7801I Subsystem process ID is 8741.
ANR0900I Processing options file /tsm0/dsmserv.opt.
ANR0010W Unable to open message catalog for language en_US.UTF-8. The default
language message catalog will be used.
ANR7814I Using instance directory /tsm0.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.
ANR0152I Database manager successfully started.
ANR1628I The database manager is using port 51500 for server connections.
ANR1636W The server machine GUID changed: old value (), new value (f0.8a.27.61-
.e5.43.b6.11.92.b5.00.0a.f7.49.31.18).
ANR2100I Activity log process has started.
ANR3733W The master encryption key cannot be generated because the server
password is not set.
ANR3339I Default Label in key data base is TSM Server SelfSigned Key.
ANR4726I The NAS-NDMP support module has been loaded.
ANR1794W TSM SAN discovery is disabled by options.
ANR2200I Storage pool BACKUPPOOL defined (device class DISK).
ANR2200I Storage pool ARCHIVEPOOL defined (device class DISK).
ANR2200I Storage pool SPACEMGPOOL defined (device class DISK).
ANR2560I Schedule manager started.
ANR0993I Server initialization complete.
ANR0916I TIVOLI STORAGE MANAGER distributed by Tivoli is now ready for use.
ANR2094I Server name set to TSM0.
ANR4865W The server name has been changed. Windows clients that use "passworda-
ccess generate" may be unable to authenticate with the server.
ANR2068I Administrator ADMIN registered.
ANR2076I System privilege granted to administrator ADMIN.
ANR1912I Stopping the activity log because of a server shutdown.
ANR0369I Stopping the database manager because of a server shutdown.

Configuration is complete.

Modify IBM TSM Server Startup Script

Below is the startup script modified to work properly with the Veritas Cluster Server. The exact changes are shown in the diff(1) after the full listing, and the reason for the new exit codes is explained right after that diff.

[root@300 ~]# cat /etc/init.d/tsm0
#!/bin/bash
#
# dsmserv       Start/Stop IBM Tivoli Storage Manager
#
# chkconfig: - 90 10
# description: Starts/Stops an IBM Tivoli Storage Manager Server instance
# processname: dsmserv
# pidfile: /var/run/dsmserv_instancename.pid

#***********************************************************************
# Distributed Storage Manager (ADSM)                                   *
# Server Component                                                     *
#                                                                      *
# IBM Confidential                                                     *
# (IBM Confidential-Restricted when combined with the Aggregated OCO   *
# Source Modules for this Program)                                     *
#                                                                      *
# OCO Source Materials                                                 *
#                                                                      *
# 5765-303 (C) Copyright IBM Corporation 1990, 2009                    *
#***********************************************************************

#
# This init script is designed to start a single Tivoli Storage Manager
# server instance on a system where multiple instances might be running.
# It assumes that the name of the script is also the name of the instance
# to be started (or, if the script name starts with Snn or Knn, where 'n'
# is a digit, that the name of the instance is the script name with the
# three letter prefix removed).
#
# To use the script to start multiple instances, install multiple copies
# of the script in /etc/rc.d/init.d, naming each copy after the instance
# it will start.
#
# The script makes a number of simplifying assumptions about the way
# the instance is set up.
# - The Tivoli Storage Manager Server instance runs as a non-root user whose
#   name is the instance name
# - The server's instance directory (the directory in which it keeps all of
#   its important state information) is in a subdirectory of the home
#   directory called tsminst1.
# If any of these assumptions are not valid, then the script will require
# some modifications to work.  To start with, look at the
# instance, instance_user, and instance_dir variables set below...

# First of all, check for syntax
if [[ $# != 1 ]]
then
  echo $"Usage: $0 {start|stop|status|restart}"
  exit 1
fi

prog="dsmserv"
instance=tsm0
serverBinDir="/opt/tivoli/tsm/server/bin"

if [[ ! -e $serverBinDir/$prog ]]
then
   echo "IBM Tivoli Storage Manager Server not found on this system ($serverBinDir/$prog)"
   exit -1
fi

# see if $0 starts with Snn or Knn, where 'n' is a digit.  If it does, then
# strip off the prefix and use the remainder as the instance name.
if [[ ${instance:0:1} == S ]]
then
  instance=${instance#S[0123456789][0123456789]}
elif [[ ${instance:0:1} == K ]]
then
  instance=${instance#K[0123456789][0123456789]}
fi

instance_home=`${serverBinDir}/dsmfngr $instance 2>/dev/null`
if [[ -z "$instance_home" ]]
then
  instance_home="/home/${instance}"
fi
instance_user=tsm0
instance_dir=/tsm0
pidfile="/var/run/${prog}_${instance}.pid"

PATH=/sbin:/bin:/usr/bin:/usr/sbin:$serverBinDir

#
# Do some basic error checking before starting the server
#
# Is the server installed?
if [[ ! -e $serverBinDir/$prog ]]
then
   echo "IBM Tivoli Storage Manager Server not found on this system"
   exit 0
fi

# Does the instance directory exist?
if [[ ! -d $instance_dir ]]
then
 echo "Instance directory ${instance_dir} does not exist"
 exit -1
fi
rc=0

SLEEP_INTERVAL=5
MAX_SLEEP_TIME=10

function check_pid_file()
{
    test -f $pidfile
}

function check_process()
{
    ps -p `cat $pidfile` > /dev/null
}

function check_running()
{
    check_pid_file && check_process
}

start() {
        # set the standard value for the user limits
        ulimit -c unlimited
        ulimit -d unlimited
        ulimit -f unlimited
        ulimit -n 65536
        ulimit -t unlimited
        ulimit -u 16384

        echo -n "Starting $prog instance $instance ... "
        #if we're already running, say so
        status 0
        if [[ $g_status == "running" ]]
        then
           echo "$prog instance $instance already running..."
           exit 0
        else
           $serverBinDir/rc.dsmserv -u $instance_user -i $instance_dir -q >/dev/null 2>&1 &
           # give enough time to server to start
           sleep 5
           # if the lock file got created, we did ok
           if [[ -f $instance_dir/dsmserv.v6lock ]]
           then
              gawk --source '{print $4}' $instance_dir/dsmserv.v6lock>$pidfile
              [ $? = 0 ] && echo "Succeeded" || echo "Failed"
              rc=$?
              echo
              [ $rc -eq 0 ] && touch /var/lock/subsys/${instance}
              return $rc
           else
              echo "Failed"
              return 1
           fi
       fi
}

stop() {
        echo  "Stopping $prog instance $instance ..."
        if [[ -e $pidfile ]]
        then
           # make sure someone else didn't kill us already
           progpid=`cat $pidfile`
           running=`ps -ef | grep $prog | grep -w $progpid | grep -v grep`
           if [[ -n $running ]]
           then
              #echo "executing cmd kill `cat $pidfile`"
              kill `cat $pidfile`

              total_slept=0
              while check_running; do \
                  echo  "$prog instance $instance still running, will check after $SLEEP_INTERVAL seconds"
                  sleep $SLEEP_INTERVAL
                  total_slept=`expr $total_slept + 1`

                  if [ "$total_slept" -gt "$MAX_SLEEP_TIME" ]; then \
                      break
                  fi
              done

              if  check_running
              then
                echo "Unable to stop $prog instance $instance"
                exit 1
              else
                echo "$prog instance $instance stopped Successfully"
              fi
           fi
           # remove the pid file so that we don't try to kill same pid again
           rm $pidfile
           if [[ $? != 0 ]]
           then
              echo "Process $prog instance $instance stopped, but unable to remove $pidfile"
              echo "Be sure to remove $pidfile."
              exit 1
           fi
        else
           echo "$prog instance $instance is not running."
        fi
        rc=$?
        echo
        [ $rc -eq 0 ] && rm -f /var/lock/subsys/${instance}
        return $rc
}

status() {
      # check usage
      if [[ $# != 1 ]]
      then
         echo "$0: Invalid call to status routine. Expected argument: "
         echo "where display_to_screen is 0 or 1 and indicates whether output will be sent to screen."
         exit 100
         # exit 1
      fi
      #see if file $pidfile exists
      # if it does, see if process is running
      # if it doesn't, it's not running - or at least was not started by dsmserv.rc
      if [[ -e $pidfile ]]
      then
         progpid=`cat $pidfile`
         running=`ps -ef | grep $prog | grep -w $progpid | grep -v grep`
         if [[ -n $running ]]
         then
            g_status="running"
         else
            g_status="stopped"
            # remove the pidfile if stopped.
            if [[ -e $pidfile ]]
            then
                rm $pidfile
                if [[ $? != 0 ]]
                then
                    echo "$prog instance $instance stopped, but unable to remove $pidfile"
                    echo "Be sure to remove $pidfile."
                fi
            fi
         fi
      else
        g_status="stopped"
      fi
      if [[ $1 == 1 ]]
      then
            echo "Status of $prog instance $instance: $g_status"
      fi

      if [ "${1}" = "1" ]
      then
        case ${g_status} in
          (stopped) EXIT=100 ;;
          (running) EXIT=110 ;;
        esac
        exit ${EXIT}
      fi
}

restart() {
        stop
        start
}

case "$1" in
  start)
        start
        ;;
  stop)
        stop
        ;;
  status)
        status 1
        ;;
  restart|reload)
        restart
        ;;
  *)
        echo $"Usage: $0 {start|stop|status|restart}"
        exit 1
esac

exit $?

… and the diff(1) between the original and the modified one.

[root@300 ~]# diff -u /etc/init.d/tsm0 /root/tsm0
--- /etc/init.d/tsm0    2016-07-13 13:20:43.000000000 +0200
+++ /root/tsm0          2016-07-13 13:27:41.000000000 +0200
@@ -207,7 +207,8 @@
       then
          echo "$0: Invalid call to status routine. Expected argument: "
          echo "where display_to_screen is 0 or 1 and indicates whether output will be sent to screen."
-         exit 1
+         exit 100
+         # exit 1
       fi
       #see if file $pidfile exists
       # if it does, see if process is running
@@ -239,6 +240,15 @@
       then
             echo "Status of $prog instance $instance: $g_status"
       fi
+
+      if [ "${1}" = "1" ]
+      then
+        case ${g_status} in
+          (stopped) EXIT=100 ;;
+          (running) EXIT=110 ;;
+        esac
+        exit ${EXIT}
+      fi
 }

 restart() {
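
The reason for the changed exit codes is that the VCS Application agent decides the resource state from the MonitorProgram exit status rather than from its text output; as far as I know 100 means offline and 110 means online with full confidence. A quick hedged check of the modified script could look like this:

# Check what the VCS Application agent would conclude from the status routine
# (exit code 100 = resource offline, 110 = resource online).
/etc/init.d/tsm0 status
case $? in
  110) echo "Application agent would report TSM0_server as ONLINE"  ;;
  100) echo "Application agent would report TSM0_server as OFFLINE" ;;
  *)   echo "unexpected exit code" ;;
esac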

Copy tsm0 Profile to the Other Node

[root@300 ~]# pwd
/home
[root@300 /home]# tar -czf - tsm0 | ssh 301 'tar -C /home -xzf -'
[root@300 ~]# cat /home/tsm0/sqllib/db2nodes.cfg
0 TSM0.domain.com 0
[root@301 ~]# cat /home/tsm0/sqllib/db2nodes.cfg
0 TSM0.domain.com 0

IBM TSM Server Start

[root@300 ~]# hares -online TSM0_ip_bond0         -sys 300
[root@300 ~]# hares -online TSM0_mnt_active_log   -sys 300
[root@300 ~]# hares -online TSM0_mnt_archive_log  -sys 300
[root@300 ~]# hares -online TSM0_mnt_db_01        -sys 300
[root@300 ~]# hares -online TSM0_mnt_db_02        -sys 300
[root@300 ~]# hares -online TSM0_mnt_db_03        -sys 300
[root@300 ~]# hares -online TSM0_mnt_db_backup_01 -sys 300
[root@300 ~]# hares -online TSM0_mnt_db_backup_02 -sys 300
[root@300 ~]# hares -online TSM0_mnt_db_backup_03 -sys 300
[root@300 ~]# hares -online TSM0_mnt_instance     -sys 300
[root@300 ~]# hares -online TSM0_mnt_pool0_01     -sys 300
[root@300 ~]# hares -online TSM0_mnt_pool0_02     -sys 300
[root@300 ~]# hares -online TSM0_mnt_pool0_03     -sys 300
[root@300 ~]# hares -online TSM0_mnt_pool0_04     -sys 300
[root@300 ~]# hares -online TSM0_mnt_pool0_05     -sys 300
[root@300 ~]# hares -online TSM0_mnt_pool0_06     -sys 300
[root@300 ~]# hares -state | grep TSM0 | grep 300
TSM0_dg               State                 300  ONLINE
TSM0_ip_bond0         State                 300  ONLINE
TSM0_mnt_active_log   State                 300  ONLINE
TSM0_mnt_archive_log  State                 300  ONLINE
TSM0_mnt_db_01        State                 300  ONLINE
TSM0_mnt_db_02        State                 300  ONLINE
TSM0_mnt_db_03        State                 300  ONLINE
TSM0_mnt_db_backup_01 State                 300  ONLINE
TSM0_mnt_db_backup_02 State                 300  ONLINE
TSM0_mnt_db_backup_03 State                 300  ONLINE
TSM0_mnt_instance     State                 300  ONLINE
TSM0_mnt_pool0_01     State                 300  ONLINE
TSM0_mnt_pool0_02     State                 300  ONLINE
TSM0_mnt_pool0_03     State                 300  ONLINE
TSM0_mnt_pool0_04     State                 300  ONLINE
TSM0_mnt_pool0_05     State                 300  ONLINE
TSM0_mnt_pool0_06     State                 300  ONLINE
TSM0_nic_bond0        State                 300  ONLINE
TSM0_server           State                 300  OFFLINE

[root@300 ~]# cat >> /etc/services << __EOF
DB2_tsm0        60000/tcp
DB2_tsm0_1      60001/tcp
DB2_tsm0_2      60002/tcp
DB2_tsm0_3      60003/tcp
DB2_tsm0_4      60004/tcp
DB2_tsm0_END    60005/tcp
__EOF
[root@300 ~]# hagrp -freeze TSM0_site
[root@300 ~]# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  300            RUNNING              0
A  301            RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  NSR_site        300            Y          N               OFFLINE
B  NSR_site        301            Y          N               ONLINE
B  RMAN_site       300            Y          N               OFFLINE
B  RMAN_site       301            Y          N               ONLINE
B  TSM0_site       300            Y          N               PARTIAL
B  TSM0_site       301            Y          N               OFFLINE
B  VCS_site        300            Y          N               OFFLINE
B  VCS_site        301            Y          N               ONLINE

-- GROUPS FROZEN
-- Group

C  TSM0_site

-- RESOURCES DISABLED
-- Group           Type            Resource

H  TSM0_site      Application     TSM0_server
H  TSM0_site      DiskGroup       TSM0_dg
H  TSM0_site      IP              TSM0_ip_bond0
H  TSM0_site      Mount           TSM0_mnt_active_log
H  TSM0_site      Mount           TSM0_mnt_archive_log
H  TSM0_site      Mount           TSM0_mnt_db_01
H  TSM0_site      Mount           TSM0_mnt_db_02
H  TSM0_site      Mount           TSM0_mnt_db_03
H  TSM0_site      Mount           TSM0_mnt_db_backup_01
H  TSM0_site      Mount           TSM0_mnt_db_backup_02
H  TSM0_site      Mount           TSM0_mnt_db_backup_03
H  TSM0_site      Mount           TSM0_mnt_instance
H  TSM0_site      Mount           TSM0_mnt_pool0_01
H  TSM0_site      Mount           TSM0_mnt_pool0_02
H  TSM0_site      Mount           TSM0_mnt_pool0_03
H  TSM0_site      Mount           TSM0_mnt_pool0_04
H  TSM0_site      Mount           TSM0_mnt_pool0_05
H  TSM0_site      Mount           TSM0_mnt_pool0_06
H  TSM0_site      NIC             TSM0_nic_bond0

[root@300 ~]# su - tsm0 -c '/opt/tivoli/tsm/server/bin/dsmserv -i /tsm0'
ANR7800I DSMSERV generated at 16:39:04 on Jun  8 2016.

IBM Tivoli Storage Manager for Linux/x86_64
Version 7, Release 1, Level 6.000

Licensed Materials - Property of IBM

(C) Copyright IBM Corporation 1990, 2016.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR7801I Subsystem process ID is 9834.
ANR0900I Processing options file /tsm0/dsmserv.opt.
ANR0010W Unable to open message catalog for language en_US.UTF-8. The default language message
catalog will be used.
ANR7814I Using instance directory /tsm0.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.
ANR0152I Database manager successfully started.
ANR1628I The database manager is using port 51500 for server connections.
ANR1635I The server machine GUID, 54.80.e8.50.e4.48.e6.11.8e.6d.00.0a.f7.49.2b.08, has
initialized.
ANR2100I Activity log process has started.
ANR3733W The master encryption key cannot be generated because the server password is not set.
ANR3339I Default Label in key data base is TSM Server SelfSigned Key.
ANR4726I The NAS-NDMP support module has been loaded.
ANR1794W TSM SAN discovery is disabled by options.
ANR2803I License manager started.
ANR8200I TCP/IP Version 4 driver ready for connection with clients on port 1500.
ANR9639W Unable to load Shared License File dsmreg.sl.
ANR9652I An EVALUATION LICENSE for IBM System Storage Archive Manager will expire on
08/13/2016.
ANR9652I An EVALUATION LICENSE for Tivoli Storage Manager Basic Edition will expire on
08/13/2016.
ANR9652I An EVALUATION LICENSE for Tivoli Storage Manager Extended Edition will expire on
08/13/2016.
ANR2828I Server is licensed to support IBM System Storage Archive Manager.
ANR2828I Server is licensed to support Tivoli Storage Manager Basic Edition.
ANR2828I Server is licensed to support Tivoli Storage Manager Extended Edition.
ANR2560I Schedule manager started.
ANR0984I Process 1 for EXPIRE INVENTORY (Automatic) started in the BACKGROUND at 01:58:03 PM.
ANR0811I Inventory client file expiration started as process 1.
ANR0167I Inventory file expiration process 1 processed for 0 minutes.
ANR0812I Inventory file expiration process 1 completed: processed 0 nodes, examined 0 objects,
deleting 0 backup objects, 0 archive objects, 0 DB backup volumes, and 0 recovery plan files. 0
objects were retried and 0 errors were encountered.
ANR0985I Process 1 for EXPIRE INVENTORY (Automatic) running in the BACKGROUND completed with
completion state SUCCESS at 01:58:03 PM.
ANR0993I Server initialization complete.
ANR0916I TIVOLI STORAGE MANAGER distributed by Tivoli is now ready for use.
TSM:TSM0>q admin
ANR2017I Administrator SERVER_CONSOLE issued command: QUERY ADMIN

Administrator        Days Since       Days Since      Locked?       Privilege Classes
Name                Last Access     Password Set
--------------     ------------     ------------     ----------     -----------------------
ADMIN                        <1               <1         No         System
ADMIN_CENTER

TSM:TSM0>halt
ANR2017I Administrator SERVER_CONSOLE issued command: HALT
ANR1912I Stopping the activity log because of a server shutdown.
ANR0369I Stopping the database manager because of a server shutdown.
ANR0991I Server shutdown complete.


[root@300 ~]# hagrp -unfreeze TSM0_site

[root@300 ~]# hares -state | grep TSM0 | grep 300
TSM0_dg               State                 300  ONLINE
TSM0_ip_bond0         State                 300  ONLINE
TSM0_mnt_active_log   State                 300  ONLINE
TSM0_mnt_archive_log  State                 300  ONLINE
TSM0_mnt_db_01        State                 300  ONLINE
TSM0_mnt_db_02        State                 300  ONLINE
TSM0_mnt_db_03        State                 300  ONLINE
TSM0_mnt_db_backup_01 State                 300  ONLINE
TSM0_mnt_db_backup_02 State                 300  ONLINE
TSM0_mnt_db_backup_03 State                 300  ONLINE
TSM0_mnt_instance     State                 300  ONLINE
TSM0_mnt_pool0_01     State                 300  ONLINE
TSM0_mnt_pool0_02     State                 300  ONLINE
TSM0_mnt_pool0_03     State                 300  ONLINE
TSM0_mnt_pool0_04     State                 300  ONLINE
TSM0_mnt_pool0_05     State                 300  ONLINE
TSM0_mnt_pool0_06     State                 300  ONLINE
TSM0_nic_bond0        State                 300  ONLINE
TSM0_server           State                 300  OFFLINE

[root@301 ~]# hares -online TSM0_server -sys 300

Ignore the errors below during the first IBM TSM server startup.

IGNORE | ERRORS TO IGNORE DURING FIRST IBM TSM SERVER START
IGNORE | 
IGNORE | DBI1306N  The instance profile is not defined.
IGNORE |
IGNORE | Explanation:
IGNORE |
IGNORE | The instance is not defined in the target machine registry.
IGNORE |
IGNORE | User response:
IGNORE |
IGNORE | Specify an existing instance name or create the required instance.

Install IBM TSM Server Licenses

Screenshots from that process below.

ibm-tsm-install-license-01

ibm-tsm-install-license-02

ibm-tsm-install-license-03

ibm-tsm-install-license-04

Let's now register the licenses for the IBM TSM.

tsm: TSM0_SITE>register license file=/opt/tivoli/tsm/server/bin/tsmee.lic
ANR2852I Current license information:
ANR2853I New license information:
ANR2828I Server is licensed to support Tivoli Storage Manager Basic Edition.
ANR2828I Server is licensed to support Tivoli Storage Manager Extended Edition.

IBM TSM Client Configuration on the IBM TSM Server Nodes

[root@300 ~]# cat > /opt/tivoli/tsm/client/ba/bin/dsm.opt << __EOF
SERVERNAME TSM0
__EOF

[root@301 ~]# cat > /opt/tivoli/tsm/client/ba/bin/dsm.opt << __EOF
SERVERNAME TSM0
__EOF

[root@300 ~]# cat > /opt/tivoli/tsm/client/ba/bin/dsm.sys << __EOF
SERVERNAME TSM0
COMMMethod TCPip
TCPPort 1500
TCPSERVERADDRESS localhost
SCHEDLOGNAME /opt/tivoli/tsm/client/ba/bin/dsmsched.log
ERRORLOGNAME /opt/tivoli/tsm/client/ba/bin/dsmerror.log
SCHEDLOGRETENTION 7 D
ERRORLOGRETENTION 7 D
__EOF

[root@301 ~]# cat > /opt/tivoli/tsm/client/ba/bin/dsm.sys << __EOF
SERVERNAME TSM0
COMMMethod TCPip
TCPPort 1500
TCPSERVERADDRESS localhost
SCHEDLOGNAME /opt/tivoli/tsm/client/ba/bin/dsmsched.log
ERRORLOGNAME /opt/tivoli/tsm/client/ba/bin/dsmerror.log
SCHEDLOGRETENTION 7 D
ERRORLOGRETENTION 7 D
__EOF
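
With dsm.opt and dsm.sys in place you can later verify the client configuration against the locally running server. A minimal check, which requires an already registered node and will prompt for its name and password, could be:

# Open a client session to the server defined in dsm.sys (TSM0 on localhost).
dsmc query session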

Install lin_tape on IBM TSM Server

[root@ALL]# uname -r
2.6.32-504.el6.x86_64

[root@ALL]# uname -r | sed 's|.x86_64||g'
2.6.32-504.el6

[root@ALL]# yum --showduplicates list kernel-devel | grep 2.6.32-504.el6
kernel-devel.x86_64            2.6.32-504.el6                 rhel-6-server-rpms

[root@ALL]# yum install rpm-build kernel-devel-2.6.32-504.el6
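
IBM ships lin_tape as a source RPM, so the binary package used below first has to be rebuilt against the just-installed kernel-devel headers. A sketch, assuming the 3.0.10 source RPM sits in the current directory:

# Rebuild the lin_tape source RPM against the installed kernel headers;
# the resulting binary package lands in /root/rpmbuild/RPMS/x86_64/.
rpmbuild --rebuild lin_tape-3.0.10-1.src.rpm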

[root@ALL]# rpm -Uvh /root/rpmbuild/RPMS/x86_64/lin_tape-3.0.10-1.x86_64.rpm
Preparing...                ########################################### [100%]
   1:lin_tape               ########################################### [100%]
Starting lin_tape...
lin_tape loaded

[root@ALL]# rpm -Uvh lin_taped-3.0.10-rhel6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:lin_taped              ########################################### [100%]
Starting lin_tape...
lin_taped loaded

[root@ALL]# /etc/init.d/lin_tape start
Starting lin_tape... lin_taped already running. Abort!

[root@ALL]# /etc/init.d/lin_tape restart
Shutting down lin_tape... lin_taped unloaded
Starting lin_tape...

Library Configuration

This is a quite unusual configuration, as the IBM TS3310 library with 4 LTO4 drives is logically partitioned into two logical libraries, with 2 drives dedicated to the Dell/EMC Networker and 2 drives dedicated to the IBM TSM server. Such a library is shown below.

ibm-tsm-ts3310.jpg

The changers and tape drives for each backup system.

Networker | (L) 000001317577_LLA changer0
TSM       | (L) 000001317577_LLB changer1_persistent_TSM0
Networker | (1) 7310132058       tape0
Networker | (2) 7310295146       tape1
TSM       | (3) 7310214751       tape2_persistent_TSM0
TSM       | (4) 7310214904       tape3_persistent_TSM0
[root@300 ~]# find /dev/IBM*
/dev/IBMchanger0
/dev/IBMchanger1
/dev/IBMSpecial
/dev/IBMtape
/dev/IBMtape0
/dev/IBMtape0n
/dev/IBMtape1
/dev/IBMtape1n
/dev/IBMtape2
/dev/IBMtape2n
/dev/IBMtape3
/dev/IBMtape3n

We will use UDEV for persistent configuration.

[root@300 ~]# udevadm info -a -p $(udevadm info -q path -n /dev/IBMtape0)    | grep -i serial
    ATTR{serial_num}=="7310132058"
[root@300 ~]# udevadm info -a -p $(udevadm info -q path -n /dev/IBMtape1)    | grep -i serial
    ATTR{serial_num}=="7310295146"
[root@300 ~]# udevadm info -a -p $(udevadm info -q path -n /dev/IBMtape2)    | grep -i serial
    ATTR{serial_num}=="7310214751"
[root@300 ~]# udevadm info -a -p $(udevadm info -q path -n /dev/IBMtape3)    | grep -i serial
    ATTR{serial_num}=="7310214904"
[root@300 ~]# udevadm info -a -p $(udevadm info -q path -n /dev/IBMchanger0) | grep -i serial
    ATTR{serial_num}=="000001317577_LLA"
[root@300 ~]# udevadm info -a -p $(udevadm info -q path -n /dev/IBMchanger1) | grep -i serial
    ATTR{serial_num}=="000001317577_LLB"
[root@300 ~]# cat /proc/scsi/IBM*
lin_tape version: 3.0.10
lin_tape major number: 239
Attached Changer Devices:
Number  model       SN                HBA             SCSI            FO Path
0       3576-MTL    000001317577_LLA  qla2xxx         2:0:1:1         NA
1       3576-MTL    000001317577_LLB  qla2xxx         4:0:1:1         NA
lin_tape version: 3.0.10
lin_tape major number: 239
Attached Tape Devices:
Number  model       SN                HBA             SCSI            FO Path
0       ULT3580-TD4 7310132058        qla2xxx         2:0:0:0         NA
1       ULT3580-TD4 7310295146        qla2xxx         2:0:1:0         NA
2       ULT3580-TD4 7310214751        qla2xxx         4:0:0:0         NA
3       ULT3580-TD4 7310214904        qla2xxx         4:0:1:0         NA

[root@300 ~]# cat /etc/udev/rules.d/98-lin_tape.rules
KERNEL=="IBMtape*", SYSFS{serial_num}=="7310132058", MODE="0660", SYMLINK="IBMtape0"
KERNEL=="IBMtape*", SYSFS{serial_num}=="7310295146", MODE="0660", SYMLINK="IBMtape1"
KERNEL=="IBMtape*", SYSFS{serial_num}=="7310214751", MODE="0660", SYMLINK="IBMtape2_persistent_TSM0"
KERNEL=="IBMtape*", SYSFS{serial_num}=="7310214904", MODE="0660", SYMLINK="IBMtape3_persistent_TSM0"
KERNEL=="IBMchanger*", ATTR{serial_num}=="000001317577_LLB", MODE="0660", SYMLINK="IBMchanger1_persistent_TSM0"

[root@301 ~]# /etc/init.d/lin_tape stop
Shutting down lin_tape... lin_taped unloaded

[root@301 ~]# rmmod lin_tape

[root@301 ~]# /etc/init.d/lin_tape start
Starting lin_tape...

New persistent devices.

[root@301 ~]# find /dev/IBM*
/dev/IBMchanger0
/dev/IBMchanger1
/dev/IBMchanger1_persistent_TSM0
/dev/IBMSpecial
/dev/IBMtape
/dev/IBMtape0
/dev/IBMtape0n
/dev/IBMtape1
/dev/IBMtape1n
/dev/IBMtape2
/dev/IBMtape2n
/dev/IBMtape2_persistent_TSM0
/dev/IBMtape3
/dev/IBMtape3n
/dev/IBMtape3_persistent_TSM0

Let's update the paths to the tape drives now.

tsm: TSM0_SITE>query path f=d

                   Source Name: TSM0_SITE
                   Source Type: SERVER
              Destination Name: TS3310
              Destination Type: LIBRARY
                       Library:
                     Node Name:
                        Device: /dev/IBMchanger0
              External Manager:
              ZOS Media Server:
                  Comm. Method:
                           LUN:
                     Initiator: 0
                     Directory:
                       On-Line: Yes
Last Update by (administrator): ADMIN
         Last Update Date/Time: 09/16/2014 13:36:14

                   Source Name: TSM0_SITE
                   Source Type: SERVER
              Destination Name: DRIVE0
              Destination Type: DRIVE
                       Library: TS3310
                     Node Name:
                        Device: /dev/IBMtape0
              External Manager:
              ZOS Media Server:
                  Comm. Method:
                           LUN:
                     Initiator: 0
                     Directory:
                       On-Line: Yes
Last Update by (administrator): SERVER_CONSOLE
         Last Update Date/Time: 07/14/2016 14:02:02

                   Source Name: TSM0_SITE
                   Source Type: SERVER
              Destination Name: DRIVE1
              Destination Type: DRIVE
                       Library: TS3310
                     Node Name:
                        Device: /dev/IBMtape1
              External Manager:
              ZOS Media Server:
                  Comm. Method:
                           LUN:
                     Initiator: 0
                     Directory:
                       On-Line: Yes
Last Update by (administrator): SERVER_CONSOLE
         Last Update Date/Time: 07/14/2016 13:59:48

tsm: TSM0_SITE>update path TSM0_SITE TS3310 SRCType=SERVER DESTType=LIBRary online=no
ANR1722I A path from TSM0_SITE to TS3310 has been updated.

tsm: TSM0_SITE>update path TSM0_SITE TS3310 SRCType=SERVER DESTType=LIBRary device=/dev/IBMchanger1_persistent_TSM0
ANR1722I A path from TSM0_SITE to TS3310 has been updated.

tsm: TSM0_SITE>update path TSM0_SITE TS3310 SRCType=SERVER DESTType=LIBRary online=yes
ANR1722I A path from TSM0_SITE to TS3310 has been updated.

tsm: TSM0_SITE>update drive TS3310           DRIVE1           SERial=AUTODetect element=AUTODetect
ANR8467I Drive DRIVE1 in library TS3310 updated.

tsm: TSM0_SITE>update drive TS3310           DRIVE1         online=no
ANR8467I Drive DRIVE1 in library TS3310 updated.

tsm: TSM0_SITE>update drive TS3310           DRIVE1           SERial=AUTODetect element=AUTODetect
ANR8467I Drive DRIVE1 in library TS3310 updated.

tsm: TSM0_SITE>update drive TS3310           DRIVE1         online=yes
ANR8467I Drive DRIVE1 in library TS3310 updated.

tsm: TSM0_SITE>update drive TS3310           DRIVE1           SERial=AUTODetect element=AUTODetect
ANR8467I Drive DRIVE1 in library TS3310 updated.

tsm: TSM0_SITE>update drive TS3310           DRIVE1         online=yes
ANR8467I Drive DRIVE1 in library TS3310 updated.

tsm: TSM0_SITE>update path TSM0_SITE DRIVE0 SRCType=SERVER autodetect=yes DESTType=DRIVE library=ts3310 device=/dev/IBMtape2_persistent_TSM0
ANR1722I A path from TSM0_SITE to TS3310 DRIVE0 has been updated.

tsm: TSM0_SITE>update drive TS3310           DRIVE0           SERial=AUTODetect element=AUTODetect
ANR8467I Drive DRIVE0 in library TS3310 updated.

tsm: TSM0_SITE>update path TSM0_SITE DRIVE1 SRCType=SERVER autodetect=yes DESTType=DRIVE library=ts3310 device=/dev/IBMtape3_persistent_TSM0
ANR1722I A path from TSM0_SITE to TS3310 DRIVE1 has been updated.

tsm: TSM0_SITE>update path TSM0_SITE DRIVE1 SRCType=SERVER DESTType=DRIVE library=ts3310 online=yes
ANR1722I A path from TSM0_SITE to TS3310 DRIVE1 has been updated.

tsm: TSM0_SITE>update path TSM0_SITE DRIVE0 SRCType=SERVER DESTType=DRIVE library=ts3310 online=yes
ANR1722I A path from TSM0_SITE to TS3310 DRIVE0 has been updated.


Let's verify that our library works properly.

tsm: TSM0_SITE>audit library TS3310 checklabel=barcode
ANS8003I Process number 2 started.

tsm: TSM0_SITE>query proc

Process      Process Description      Process Status
  Number
--------     --------------------     -------------------------------------------------
       2     AUDIT LIBRARY            ANR8459I Auditing volume inventory for library
                                       TS3310.


tsm: TSM0_SITE>query act
(...)

08/04/2016 14:30:41      ANR2017I Administrator ADMIN issued command: AUDIT
                          LIBRARY TS3310 checklabel=barcode  (SESSION: 8)
08/04/2016 14:30:41      ANR0984I Process 2 for AUDIT LIBRARY started in the
                          BACKGROUND at 02:30:41 PM. (SESSION: 8, PROCESS: 2)
08/04/2016 14:30:41      ANR8457I AUDIT LIBRARY: Operation for library TS3310
                          started as process 2. (SESSION: 8, PROCESS: 2)
08/04/2016 14:30:46      ANR8358E Audit operation is required for library TS3310.
                          (SESSION: 8, PROCESS: 2)
08/04/2016 14:30:51      ANR8439I SCSI library TS3310 is ready for operations.
                          (SESSION: 8, PROCESS: 2)

(...)

08/04/2016 14:31:26      ANR0985I Process 2 for AUDIT LIBRARY running in the
                          BACKGROUND completed with completion state SUCCESS at
                          02:31:26 PM. (SESSION: 8, PROCESS: 2)

(...)

IBM TSM Storage Pool Configuration

IBM TSM container storage pool creation.

tsm: TSM0_SITE>define stgpool POOL0_stgFC stgtype=directory
ANR2249I Storage pool POOL0_stgFC is defined.

tsm: TSM0_SITE>define stgpooldirectory POOL0_stgFC /tsm0/pool0/pool0_01,/tsm0/pool0/pool0_02,/tsm0/pool0/pool0_03,/tsm0/pool0/pool0_04,/tsm0/pool0/pool0_05,/tsm0/pool0/pool0_06
ANR3254I Storage pool directory /tsm0/pool0/pool0_01 was defined in storage pool POOL0_stgFC.
ANR3254I Storage pool directory /tsm0/pool0/pool0_02 was defined in storage pool POOL0_stgFC.
ANR3254I Storage pool directory /tsm0/pool0/pool0_03 was defined in storage pool POOL0_stgFC.
ANR3254I Storage pool directory /tsm0/pool0/pool0_04 was defined in storage pool POOL0_stgFC.
ANR3254I Storage pool directory /tsm0/pool0/pool0_05 was defined in storage pool POOL0_stgFC.
ANR3254I Storage pool directory /tsm0/pool0/pool0_06 was defined in storage pool POOL0_stgFC.

tsm: TSM0_SITE>q stgpooldirectory

Storage Pool Name     Directory                                         Access
-----------------     ---------------------------------------------     ------------
POOL0_stgFC           /tsm0/pool0/pool0_01                              Read/Write
POOL0_stgFC           /tsm0/pool0/pool0_02                              Read/Write
POOL0_stgFC           /tsm0/pool0/pool0_03                              Read/Write
POOL0_stgFC           /tsm0/pool0/pool0_04                              Read/Write
POOL0_stgFC           /tsm0/pool0/pool0_05                              Read/Write
POOL0_stgFC           /tsm0/pool0/pool0_06                              Read/Write


IBM TSM Backup Policies Configuration

Below is an example policy.

tsm: TSM0_SITE>def dom  FS backret=30 archret=30
ANR1500I Policy domain FS defined.

tsm: TSM0_SITE>def pol  FS FS
ANR1510I Policy set FS defined in policy domain FS.

tsm: TSM0_SITE>def mg   FS FS FS_1DAY
ANR1520I Management class FS_1DAY defined in policy domain FS, set FS.

tsm: TSM0_SITE>def co   FS FS FS_1DAY   STANDARD type=backup destination=POOL0_STGFC verexists=32 verdeleted=1 retextra=31 retonly=14
ANR1530I Backup copy group STANDARD defined in policy domain FS, set FS, management class FS_1DAY.

tsm: TSM0_SITE>def mg   FS FS FS_1MONTH
ANR1520I Management class FS_1MONTH defined in policy domain FS, set FS.

tsm: TSM0_SITE>def co   FS FS FS_1MONTH STANDARD type=backup destination=POOL0_STGFC  verexists=4 verdeleted=1 retextra=91 retonly=14
ANR1530I Backup copy group STANDARD defined in policy domain FS, set FS, management class FS_1MONTH.

tsm: TSM0_SITE>as defmg FS FS FS_1DAY
ANR1538I Default management class set to FS_1DAY for policy domain FS, set FS.

tsm: TSM0_SITE>act pol  FS FS
ANR1554W DEFAULT Management class FS_1DAY in policy set FS FS does not have an ARCHIVE copygroup:  files will not be archived by default if this set is activated.

Do you wish to proceed? (Yes (Y)/No (N)) y
ANR1554W DEFAULT Management class FS_1DAY in policy set FS FS does not have an ARCHIVE copygroup:  files will not be archived by default if this set is activated.
ANR1514I Policy set FS activated in policy domain FS.
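
With the policy set activated, a client node could then be registered into the new FS domain. The node name and password below are hypothetical and only show the idea.

tsm: TSM0_SITE>register node CLIENT01 SomeSecretPassword domain=FS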



I hope that the amount of instructions did not discourage you from one of the best enterprise backup systems – the IBM TSM (now IBM Spectrum Protect) – and one of the best high availability clusters – the Veritas Cluster Server 🙂

EOF

Highly Available DHCP Server on FreeBSD

Today I would like to share a highly available DHCP server setup on FreeBSD system, but it should be similarly simple on other UNIX and Unix-like systems. I will use the most obvious choice here – the Internet Systems Consortium implementation – ISC DHCP server – available in the FreeBSD Ports and packages as well.

ISC

For some time now ISC has been developing a new DHCP server – Kea – with which they intend to eventually replace ISC DHCP in most server implementations. They also recommend that new implementers consider using Kea instead of ISC DHCP and implement ISC DHCP only if Kea does not meet their needs – for example, Kea currently includes neither a client nor a relay. Maybe I will make an UPDATE to this post or write a separate article some time.

Kea also gained a high availability mode just a month ago, so had I been writing this article a little earlier, such a setup would not have been possible with Kea at all. It also shows how young the Kea implementation is, so I will stick to the ISC DHCP server for now and ‘watch’ Kea development for the future.

Architecture

Below is the POOR MAN’S ASCII ARCHITECT diagram showing our ISC DHCP setup.

  +-------------+              +-------------+
  | {primary}   |              | {secondary} |
  | DHCPs1      | ==== HA ==== | DHCPs2      |
  | 10.0.10.251 |              | 10.0.10.252 |
  +-------------+              +-------------+
                 \            /
  +------------------------------------------+
  | ADDRESS POOL  10.0.10.x/24  ADDRESS POOL |
  +------------------------------------------+
              \                  /
               +----------------+
               | {DHCP CLIENTS} |
               +----------------+

The setup of each DHCP server node is very simple. It is FreeBSD 11.2-RELEASE installed on a 4 GB GPT partition using UFS for the / filesystem, and only 666 MB is used, as shown below.

root@DHCPs1:/ # uname -v
FreeBSD 11.2-RELEASE #0 r335510: Fri Jun 22 04:32:14 UTC 2018     root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC 

root@DHCPs1:/ # gpart show
=>     40  8388528  ada0  GPT  (4.0G)
       40     1024     1  freebsd-boot  (512K)
     1064  8386560     2  freebsd-ufs  (4.0G)
  8387624      944        - free -  (472K)

root@DHCPs1:/ # du -smc * | sort -n
0       sys
1       COPYRIGHT
1       dev
1       entropy
1       libexec
1       media
1       mnt
1       net
1       proc
1       root
1       tmp
2       bin
4       etc
7       sbin
8       var
10      rescue
12      lib
128     boot
499     usr
666     total

The 128 MB of RAM is enough for a small number of clients. There is still 32 MB of free memory, along with 32 MB of Inactive and Buffered memory that can be swapped out. Not to mention that each getty process takes about 2 MB of RAM and instead of 8 of them you really only need 1 (a sketch of trimming them follows the top(1) output below). In other words, you would be able to run it even with as little as 64 MB of RAM.

root@DHCPs1:~ # top -b -o res
last pid: 15205;  load averages:  0.13,  0.25,  0.29  up 0+07:39:11    20:03:48
16 processes:  2 running, 14 sleeping

Mem: 1688K Active, 30M Inact, 26M Wired, 3800K Buf, 32M Free
Swap:


  PID USERNAME    THR PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
38897 dhcpd         1  20    0 16424K 10724K select   0:00   0.00% dhcpd
30199 root          1  20    0 13160K  8036K RUN      0:00   0.00% sshd
15106 root          1  28    0 12848K  7136K select   0:00   0.00% sshd
53100 root          1  20    0  9180K  5040K select   0:02   0.00% devd
31079 root          1  20    0  7412K  3640K pause    0:00   0.00% csh
15205 root          1  20    0  7916K  3060K RUN      0:00   0.00% top
15960 root          1  20    0  6464K  2480K nanslp   0:00   0.00% cron
69084 root          1  20    0  6412K  2364K select   0:01   0.00% syslogd
28412 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty
28188 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty
28504 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty
28972 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty
29736 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty
29080 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty
30106 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty
29392 root          1  52    0  6408K  2124K ttyin    0:00   0.00% getty



The /etc/rc.conf file for DHCP nodes DHCPs1 and DHCPs2 is the same (besides hostname and address).

root@DHCPs1:/ # cat /etc/rc.conf
hostname=DHCPs1
ifconfig_em0="inet 10.0.10.251/24 up"
sshd_enable=YES
sendmail_enable=NONE
clear_tmp_enable=YES
syslogd_flags="-ss"
dumpdev=NO
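
For completeness, the DHCPs2 node differs only in those two values, so its /etc/rc.conf looks like this (with the 10.0.10.252 address from the diagram above):

root@DHCPs2:/ # cat /etc/rc.conf
hostname=DHCPs2
ifconfig_em0="inet 10.0.10.252/24 up"
sshd_enable=YES
sendmail_enable=NONE
clear_tmp_enable=YES
syslogd_flags="-ss"
dumpdev=NO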

No modifications to the /etc/sysctl.conf and /boot/loader.conf files are needed.

Now you will have to install the ISC DHCP server. As the current version is 4.4.x, the package is named accordingly – isc-dhcp44-server – let’s add it using the pkg(8) command.

root@DHCPs1:/ # pkg update -f -y
The package management tool is not yet installed on your system.
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:11:amd64//quarterly, please wait...
Verifying signature with trusted certificate pkg.freebsd.org.2013102301... done
[nextcloud] Installing pkg-1.10.5...
[nextcloud] Extracting pkg-1.10.5: 100%
Updating FreeBSD repository catalogue...
pkg: Repository FreeBSD load error: access repo file(/var/db/pkg/repo-FreeBSD.sqlite) failed: No such file or directory
[nextcloud] Fetching meta.txz: 100%    944 B   0.9kB/s    00:01
[nextcloud] Fetching packagesite.txz: 100%    6 MiB 530.8kB/s    00:12
Processing entries: 100%
FreeBSD repository update completed. 31134 packages processed.
All repositories are up to date.
root@DHCPs1:/ # echo $?
0
root@DHCPs1:/ #

Now let’s install the isc-dhcp44-server package.

root@DHCPs1:/ # pkg install isc-dhcp44-server
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        isc-dhcp44-server: 4.4.1_3 [FreeBSD]

Number of packages to be installed: 1

The process will require 6 MiB more space.

Proceed with this action? [y/N]: y
[1/1] Installing isc-dhcp44-server-4.4.1_3...
===> Creating groups.
Creating group 'dhcpd' with gid '136'.
===> Creating users
Creating user 'dhcpd' with uid '136'.
[1/1] Extracting isc-dhcp44-server-4.4.1_3: 100%
Message from isc-dhcp44-server-4.4.1_3:

****  To setup dhcpd, please edit /usr/local/etc/dhcpd.conf.

****  This port installs the dhcp daemon, but doesn't invoke dhcpd by default.
      If you want to invoke dhcpd at startup, add these lines to /etc/rc.conf:

            dhcpd_enable="YES"                          # dhcpd enabled?
            dhcpd_flags="-q"                            # command option(s)
            dhcpd_conf="/usr/local/etc/dhcpd.conf"      # configuration file
            dhcpd_ifaces=""                             # ethernet interface(s)
            dhcpd_withumask="022"                       # file creation mask

****  If compiled with paranoia support (the default), the following rc.conf
      options are also supported:

            dhcpd_chuser_enable="YES"           # runs w/o privileges?
            dhcpd_withuser="dhcpd"              # user name to run as
            dhcpd_withgroup="dhcpd"             # group name to run as
            dhcpd_chroot_enable="YES"           # runs chrooted?
            dhcpd_devfs_enable="YES"            # use devfs if available?
            dhcpd_rootdir="/var/db/dhcpd"       # directory to run in
            dhcpd_includedir=""       # directory with config-
                                                  files to include

****  WARNING: never edit the chrooted or jailed dhcpd.conf file but
      /usr/local/etc/dhcpd.conf instead which is always copied where
      needed upon startup.

Now update the pkg(8) repository data and install the isc-dhcp44-server package on the DHCPs2 node in the same way, as sketched below.
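
That boils down to the same two commands on the second node, for example:

root@DHCPs2:/ # pkg update -f -y
root@DHCPs2:/ # pkg install -y isc-dhcp44-server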

The configuration uses a single network segment – 10.0.10.0/24 – for the clients, with addresses in the 10-250 range of the last octet. The split 128 parameter splits the load equally between the DHCP server nodes. As this is just an example, we will use the 1.1.1.1 and 9.9.9.9 DNS servers and the ‘domain.com‘ domain. For the record, the split 128 parameter is set only on the primary node – DHCPs1 in our case. As the man dhcpd.conf page suggests, we will “use the same master configuration file for both servers, and have a separate file that contains the peer declaration and includes the master file.” as “This will help you to avoid configuration mismatches.”

root@DHCPs1:/ # cat /usr/local/etc/dhcpd.conf
# CORE
failover peer "ha-dhcp" {
  primary;
  address 10.0.10.251;
  port 678;
  peer address 10.0.10.252;
  peer port 678;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  split 128;
  load balance max seconds 3;
}

include "/usr/local/etc/dhcpd.conf.SHARED";
root@DHCPs1:/ # cat /usr/local/etc/dhcpd.conf.SHARED
# CLIENTS
subnet 10.0.10.0 netmask 255.255.255.0 {
  default-lease-time         604800;
  max-lease-time             604800;
  option routers             10.0.10.254;
  option broadcast-address   10.0.10.255;
  option subnet-mask         255.255.255.0;
  option domain-search       "domain.com";
  option domain-name-servers 1.1.1.1,9.9.9.9;

  pool {
    failover peer "ha-dhcp";
    range 10.0.10.10 10.0.10.250;
  }
}

… and the secondary node.

root@DHCPs2:~ # cat /usr/local/etc/dhcpd.conf
# CORE
failover peer "ha-dhcp" {
  secondary;
  address 10.0.10.252;
  port 678;
  peer address 10.0.10.251;
  peer port 678;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  load balance max seconds 3;
}

include "/usr/local/etc/dhcpd.conf.SHARED";
root@DHCPs2:/ # cat /usr/local/etc/dhcpd.conf.SHARED
# CLIENTS
subnet 10.0.10.0 netmask 255.255.255.0 {
  default-lease-time         604800;
  max-lease-time             604800;
  option routers             10.0.10.254;
  option broadcast-address   10.0.10.255;
  option subnet-mask         255.255.255.0;
  option domain-search       "domain.com";
  option domain-name-servers 1.1.1.1,9.9.9.9;

  pool {
    failover peer "ha-dhcp";
    range 10.0.10.10 10.0.10.250;
  }
}

The /usr/local/etc/dhcpd.conf.SHARED file is identical on both nodes.
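
Before starting anything it is worth letting dhcpd(8) check the configuration on both nodes – the -t flag only tests the syntax and -cf points it at our config file, so nothing is started yet. It will complain and exit non-zero if anything in the config (or the included shared file) does not parse.

root@DHCPs1:/ # dhcpd -t -cf /usr/local/etc/dhcpd.conf
root@DHCPs2:/ # dhcpd -t -cf /usr/local/etc/dhcpd.conf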

Now let’s start the DHCP server on both nodes.

root@DHCPs1:~ # sysrc dhcpd_enable=YES
dhcpd_enable:  -> YES
root@DHCPs1:~ # service isc-dhcpd start
Starting dhcpd.
Internet Systems Consortium DHCP Server 4.4.1
Copyright 2004-2018 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
Config file: /usr/local/etc/dhcpd.conf
Database file: /var/db/dhcpd/dhcpd.leases
PID file: /var/run/dhcpd/dhcpd.pid
Wrote 122 leases to leases file.
Listening on BPF/em0/08:00:27:3c:ab:c8/10.0.10.0/24
Sending on   BPF/em0/08:00:27:3c:ab:c8/10.0.10.0/24
Sending on   Socket/fallback/fallback-net
failover peer ha-dhcp: I move from normal to startup

… and the same on the secondary node.

root@DHCPs2:~ # sysrc dhcpd_enable=YES
dhcpd_enable:  -> YES
root@DHCPs2:~ # service isc-dhcpd onestart
Starting dhcpd.
Internet Systems Consortium DHCP Server 4.4.1
Copyright 2004-2018 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
Config file: /usr/local/etc/dhcpd.conf
Database file: /var/db/dhcpd/dhcpd.leases
PID file: /var/run/dhcpd/dhcpd.pid
Wrote 122 leases to leases file.
Listening on BPF/em0/08:00:27:de:9b:3d/10.0.10.0/24
Sending on   BPF/em0/08:00:27:de:9b:3d/10.0.10.0/24
Sending on   Socket/fallback/fallback-net
failover peer ha-dhcp: I move from communications-interrupted to startup
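
The failover state machine logs its transitions to syslog and also records them in the leases database (/var/db/dhcpd/dhcpd.leases, as shown in the startup output above). A quick – if crude – way to confirm that the peers have synchronized is to grep for the state lines; after a short while both ‘my state’ and ‘partner state’ should read normal:

root@DHCPs1:~ # grep -E 'my state|partner state' /var/db/dhcpd/dhcpd.leases | tail -2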

Now, as both nodes of the highly available DHCP server are started, let’s try to get a DHCP lease on the DHCP client – DHCPc in our example.

root@DHCPc:~ # dhclient em0
DHCPREQUEST on em0 to 255.255.255.255 port 67
DHCPREQUEST on em0 to 255.255.255.255 port 67
DHCPACK from 10.0.10.251
bound to 10.0.10.131 -- renewal in 302119 seconds.
root@DHCPc:~ # ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 08:00:27:d9:45:96
        hwaddr 08:00:27:d9:45:96
        inet 10.0.10.131 netmask 0xffffff00 broadcast 10.0.10.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

We can see that the DHCP client – DHCPc – got the 10.0.10.131 address.

We can of course set a permanent address for it with the host option in the /usr/local/etc/dhcpd.conf.SHARED config file.

The needed ‘addon’ is shown below.

  group
  {
    host DHCPc {
      hardware ethernet 08:00:27:d9:45:96;
      fixed-address 10.0.10.9;
    }
  }

It needs to be added to the /usr/local/etc/dhcpd.conf.SHARED config file on both nodes – here is how the new shared config file looks.

root@DHCPs1:~ # cat /usr/local/etc/dhcpd.conf.SHARED
# CLIENTS
subnet 10.0.10.0 netmask 255.255.255.0 {
  default-lease-time         604800;
  max-lease-time             604800;
  option routers             10.0.10.254;
  option broadcast-address   10.0.10.255;
  option subnet-mask         255.255.255.0;
  option domain-search       "domain.com";
  option domain-name-servers 1.1.1.1,9.9.9.9;

  group
  {
    host DHCPc {
      hardware ethernet 08:00:27:d9:45:96;
      fixed-address 10.0.10.9;
    }
  }

  pool {
    failover peer "ha-dhcp";
    range 10.0.10.10 10.0.10.250;
  }
}

Now copy the /usr/local/etc/dhcpd.conf.SHARED file to the second node.
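
Assuming root ssh access between the nodes, that can be as simple as the commands below – both daemons need to be restarted so they pick up the new host declaration:

root@DHCPs1:~ # scp /usr/local/etc/dhcpd.conf.SHARED 10.0.10.252:/usr/local/etc/
root@DHCPs1:~ # service isc-dhcpd restart
root@DHCPs2:~ # service isc-dhcpd restart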

Let’s try again to get an address on the same DHCP client.

root@DHCPc:~ # pkill dhclient
root@DHCPc:~ # service netif restart
root@DHCPc:~ # dhclient em0
DHCPREQUEST on em0 to 255.255.255.255 port 67
DHCPREQUEST on em0 to 255.255.255.255 port 67
DHCPACK from 10.0.10.252
bound to 10.0.10.131 -- renewal in 1665 seconds.
DHCPREQUEST on em0 to 255.255.255.255 port 67
DHCPREQUEST on em0 to 255.255.255.255 port 67
DHCPNAK from 10.0.10.252
DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 3
DHCPOFFER from 10.0.10.251
DHCPOFFER from 10.0.10.252
DHCPOFFER already seen.
DHCPREQUEST on em0 to 255.255.255.255 port 67
DHCPACK from 10.0.10.252
bound to 10.0.10.9 -- renewal in 302400 seconds.
root@DHCPc:~ # ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 08:00:27:d9:45:96
        hwaddr 08:00:27:d9:45:96
        inet 10.0.10.9 netmask 0xffffff00 broadcast 10.0.10.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

Now we got the permanent 10.0.10.9 address.

You can now experiment with these values in the /etc/rc.conf file:

  • dhcpd_flags
  • dhcpd_ifaces
  • dhcpd_withumask
  • dhcpd_chuser_enable
  • dhcpd_withuser
  • dhcpd_withgroup
  • dhcpd_chroot_enable
  • dhcpd_devfs_enable
  • dhcpd_rootdir
  • dhcpd_includedir

… along with all the other possible options from the man dhcpd.conf page πŸ™‚
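
For example, to try dhcpd chrooted and running without root privileges, a minimal sketch (untested here, so verify against the pkg-message above) could be:

root@DHCPs1:~ # sysrc dhcpd_chuser_enable=YES dhcpd_chroot_enable=YES dhcpd_devfs_enable=YES
root@DHCPs1:~ # service isc-dhcpd restart

Keep in mind the pkg-message warning above – always edit /usr/local/etc/dhcpd.conf, never the copy inside the chroot.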

EOF