Ghost in the Shell – Part 3

Time to bring some life into the Ghost in the Shell series with this Part 3 article.

You may want to check other articles in the Ghost in the Shell series on the Ghost in the Shell – Global Page, where you will find links to all episodes of the series along with a table of contents for each episode.

Query Functions

I haven't found a better name for this solution. There are generally two types of UNIX people: those who prefer to navigate and operate with the basic ls/cd/mv/mkdir/rm commands, and those who use some file manager like Midnight Commander (mc) or ranger or vifm or … you get the idea. I have tried various CLI file managers but always came back to navigating without them. If you are one of those people, then these Query Functions are for you 🙂

The so-called Query Functions filter for the information you are looking for. For example, if you have a directory with a large number of files, you would probably do something like this.

% ls | grep QUERY

… or if you also want to include subdirectories, then something like this.

% find . | grep QUERY

For both of these examples you would also sometimes want to search case sensitively or insensitively, depending on the need.

That leads us to four Query Functions:

  • q is an equivalent of ls | grep -i QUERY command.
  • Q is an equivalent of ls | grep QUERY command.
  • qq is an equivalent of find . | grep -i QUERY command.
  • QQ is an equivalent of find . | grep QUERY command.

Thus, when I need to query the contents of a directory while searching for something, it is very fast with q SOMETHING.

Here are the definitions of these Query Functions:

# SHORT QUERY FUNCTIONS q()
  q() {
    if [ ${#} -eq 1 ]
    then
      ls | grep --color -i "${1}" 2> /dev/null
    else
      echo "usage: q string"
    fi
  }

# SHORT QUERY FUNCTIONS Q()
  Q() {
    if [ ${#} -eq 1 ]
    then
      ls | grep --color "${1}" 2> /dev/null
    else
      echo "usage: Q string"
    fi
  }

# SHORT QUERY FUNCTIONS qq()
  qq() {
    if [ ${#} -eq 1 ]
    then
      find . \
        | grep -i "${1}" 2> /dev/null \
        | cut -c 3-999 \
        | grep --color -i "${1}" 2> /dev/null
    else
      echo "usage: qq string"
    fi
  }

# SHORT QUERY FUNCTIONS QQ()
  QQ() {
    if [ ${#} -eq 1 ]
    then
      find . \
        | grep "${1}" 2> /dev/null \
        | cut -c 3-999 \
        | grep --color "${1}" 2> /dev/null
    else
      echo "usage: QQ string"
    fi
  }

The qq and QQ functions use grep(1) twice: the first pass filters the find(1) output, and the second one, run after cut(1) strips the leading './', makes sure the matches are colored.

I assume that you use the colored grep(1) described in the Ghost in the Shell – Part 2 article.

If you prefer to use alias(1) instead, then they would look like this.

# SHORT QUERY FUNCTIONS q() Q() qq() QQ()
  alias q="ls | grep --color -i"
  alias Q="ls | grep --color"
  alias qq="find . | grep -i"
  alias QQ="find . | grep"

The qq and QQ aliases will be a little more limited, as with functions it is possible to trim the output to the exact needs with cut(1).

[Screenshot: q.png]

[Screenshot: qq.png]

Lots of people use recursive history search, which also helps, but what if you typed the needed command long ago, with the arguments you need now? You would probably search for it with the history(1) command and then use grep(1) to limit the results to what you are looking for. I keep an enormously large list of commands in history: with my current setting of 655360 the ~/.zhistory (ZSH) file takes about 2.7 MB. I also wanted to be sure that two identical commands would not be kept in history, hence the setopt hist_ignore_all_dups ZSH option is enabled. When I wc -l my ~/.zhistory file it currently has 75695 lines of commands.

% grep HISTSIZE /usr/local/etc/zshrc
export HISTSIZE=655360
export SAVEHIST=${HISTSIZE}

% grep dups /usr/local/etc/zshrc
setopt hist_ignore_all_dups

Now back to Query Functions for history:

  • h is an equivalent of cat ~/.zhistory | grep -i QUERY command.
  • H is an equivalent of cat ~/.zhistory | grep QUERY command.

They fit into aliases this time. Instead of cat(1) we will use input redirection, to avoid the Useless Use of Cat.

Here are the Query Functions for history.

# SHORT HISTORY ALIASES h() H()
  alias h='< ~/.zhistory grep -i'
  alias H='< ~/.zhistory grep'
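
Usage is analogous to the q family; a quick hypothetical example (assuming some rsync invocations are present in the history):

% h rsync
% H Rsync

The first one prints all history lines containing rsync in any case, the second only those with the exact Rsync spelling.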

[Screenshot: h alias]

… but what if we would like to filter the output of the q family and h family Query Functions? The obvious answer is to use grep(1), like q QUERY | grep ANOTHER or h QUERY | grep ANOTHER. To make that faster we will make the g and G shortcuts.

  • g is an equivalent of grep -i command.
  • G is an equivalent of just grep command.

Here they are.

# SHORT GREP FUNCTIONS g() G()
  alias g='grep -i'
  alias G='grep'

Now it will be just q QUERY | g ANOTHER and h QUERY | G ANOTHER for example.

To clear the terminal output you may use the clear(1) command; some prefer the [CTRL]-[L] shortcut, but I find the 'c' alias to be the fastest solution.

# SHORT CLEAR ALIAS c()
  alias c='clear'

To make the solution complete I would also add exa(1) here, with an alias of 'e'.

# SHORT LISTING WITH e()
  alias e='exa --time-style=long-iso --group-directories-first'

Why exa(1), you may ask, when there are BSD ls(1) and GNU ls(1) (installed as gls(1) on FreeBSD, to avoid confusion)? To add GNU ls(1) to a FreeBSD system, install the coreutils package.

Well, the BSD ls(1) has two major cons:

  • It is not able to sort directories first.
• It selects the width of ALL columns based on the single longest file name.

[Screenshot: BSD-ls.png]

The BSD ls(1) was used with the following alias:

alias ls='ls -p -G -D "%Y.%m.%d %H:%M"'

The GNU ls(1) does not have these two problems, but it colors the output based on only a very limited set of file types:

  • Non-executable file.
  • Executable file.
  • Directory.
  • Link.
  • Device.

[Screenshot: GNU-ls.png]

The GNU ls(1) was used with the following command:

gls -p -G --color --time-style=long-iso --group-directories-first --quoting-style=literal

Here is where exa(1) comes in handy, as it has none of these cons and it colors a lot more types of files.

[Screenshot: e.png]

exa --time-style=long-iso --group-directories-first

It is still very simple coloring, based on the file extension and not the magic number, as a plain (empty) text file named SOME-NOT-FILE.pdf is colored like a PDF document.

[Screenshot: e-pdf.png]

But even this 'limited' coloring helps in 99% of the cases, and while with BSD ls(1) and GNU ls(1) all of these files 'seem' like plain text files, with exa(1) it is obvious from the start which are plain files, which are images, and which are 'documents' like PDF files.

Where Is My Space

On all UNIX and Linux systems there exists the du(1) command. Combined with sort(1) it is a universal way of searching for space eaters. Here is an example for the / root directory, with the -g flag to display units in gigabytes.

# cd /
# du -sg * | sort -n
1       bin
1       boot
1       compat
1       COPYRIGHT
1       data
1       dev
1       entropy
1       etc
1       lib
1       libexec
1       media
1       mnt
1       net
1       proc
1       rescue
1       root
1       sbin
1       sys
1       tmp
1       var
2       jail
8       usr
305     home

Here are the contents of the UNIX System Resources (/usr) directory, with the -m flag to display units in megabytes.

# cd /usr
# du -sm * | sort -n
1       libdata
1       obj
1       tests
3       libexec
11      sbin
13      include
45      lib32
56      lib
58      share
105     bin
1080    ports
1343    src
5274    local

But it is a PITA to type cd and du all the time, not to mention that some old-school UNIX systems do not provide the -g or -m flags, so on HP-UX you are limited to kilobytes at most.
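
If you prefer to stay with the du | sort tandem but avoid the retyping, a small helper function can wrap it. Here is a minimal sketch (the biggest name and the top-10 limit are my own invention):

# HYPOTHETICAL HELPER THAT WRAPS THE du | sort TANDEM
  biggest() {
    du -sm "${1:-.}"/* 2> /dev/null | sort -n | tail -n 10
  }

With that, biggest /usr prints the ten largest entries under /usr in megabytes.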

You may also try the -h (human readable) du(1) variant together with sort -h (sort human readable values).

# du -smh * | sort -h
512B    data
512B    net
512B    proc
512B    sys
4.5K    COPYRIGHT
4.5K    entropy
5.5K    dev
6.5K    mnt
 53K    media
143K    tmp
205K    libexec
924K    bin
2.2M    etc
3.9M    root
4.6M    sbin
6.2M    rescue
6.6M    lib
 90M    boot
117M    compat
564M    jail
667M    var
5.4G    usr
297G    home

This is where ncdu(1) comes in handy. It is an ncurses-based disk usage analyzer which helps you find those space eaters very quickly, without typing the same commands over and over again. Here is ncdu(1) in action.
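
Starting it is as simple as pointing it at a directory (with no argument it scans the current one).

% ncdu /usr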

First it calculates the sizes of the files.

[Screenshot: ncdu.png]

After a while you get the output sorted by size.

[Screenshot: ncdu-usr.png]

If you hit [ENTER] on the directory you will be instantly moved into that directory.

[Screenshot: ncdu-usr-local.png]

If you delete something with the 'd' key, then remember to recalculate the output with the 'r' key.

It also has great options, such as spawning a shell in the current directory with 'b', or toggling between apparent size and disk usage with 'a'. The latter is very useful when you use a filesystem with built-in compression like ZFS.

       up, k  Move cursor up
     down, j  Move cursor down
 right/enter  Open selected directory
  left, <, h  Open parent directory
           n  Sort by name (ascending/descending)
           s  Sort by size (ascending/descending)
           C  Sort by items (ascending/descending)
           d  Delete selected file or directory
           t  Toggle dirs before files when sorting
           g  Show percentage and/or graph
           a  Toggle between apparent size and disk usage
           c  Toggle display of child item counts
           e  Show/hide hidden or excluded files
           i  Show information about selected item
           r  Recalculate the current directory
           b  Spawn shell in current directory
           q  Quit ncdu

Here is the same comparison using the du(1) command.

Disk usage.

% du -sm books
39145   books

Apparent size.

% du -smA books
44438   books
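
The compression ratio is just the apparent size divided by the disk usage, which bc(1) can confirm:

% echo "scale=2; 44438 / 39145" | bc
1.13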

So I have a 1.13 compression ratio on the ZFS filesystem. More than 5 GB saved just in that directory 🙂

Where Are My Files

Once I got some space back I also wanted to know if there were some directories with an enormous number of very small files.

First I came up with my own files-count.sh script, which is not that long.

#! /bin/sh

export LC_ALL=C

if [ ${#} -eq 0 ]
then
  DIR=.
else
  DIR="${1}"
fi

# work from inside DIR so that cut(1) can safely strip the leading './'
cd "${DIR}" || exit 1

# print 'COUNT NAME' for each direct subdirectory, sorted by count
find . -mindepth 1 -maxdepth 1 -type d \
  | cut -c 3- \
  | while read -r I
    do
      find "${I}" | wc -l | tr -d '\n'
      echo " ${I}"
    done | sort -n

It works reliably, but just as with the du | sort tandem you have to retype it (or at least use cd(1) and hit the [UP] arrow again) … but then I discovered that ncdu(1) also counts files! It does not provide a 'startup' argument to start in this count-files mode, but when you hit the 'c' key it will instantly display the count of files in each scanned directory. To sort this output by the count of files hit the 'C' key (capital 'C').

[Screenshot: ncdu-files.png]

The files-count.sh script still has one advantage over ncdu(1) – the latter stops counting files at 100k, which is shown on the screenshot, so if you need to search for a really big number of files, or just around 100k of them, the files-count.sh script will be more accurate.

% cd /usr
% files-count.sh 
       1 obj
      36 libdata
     299 sbin
     312 libexec
     390 tests
     498 bin
     723 lib32
     855 lib
    2127 include
   16936 share
  159945 src
  211854 ports
  266021 local

… but what if there were some very big files hidden somewhere deep in the directory tree? du(1) or ncdu(1) will not help here. As usual I thought about a short files-big.sh script that will do the job.

#! /bin/sh

export LC_ALL=C

if [ ${#} -eq 0 ]
then
  DIR=.
else
  DIR="${1}"
fi

find "${DIR}" -type f -exec stat -f "%16z; doas rm -f \"%N\"" {} ';' | sort -n

An example usage on the /var directory.

# cd /var
# files-big.sh | tail
        10547304; doas rm -f "./tmp/kdecache-vermaden/icon-cache.kcache"
        29089823; doas rm -f "./db/clamav/clamav-2671b72fce703c2133c61e5bf85aad19.tmp/clamav-373e311ca7f610a39c7cf5c5c5a4fd83.tmp/daily.hdb"
        30138884; doas rm -f "./tmp/pkg-provides-wyK2"
        48271360; doas rm -f "./db/pkg/repo-HardenedBSD.sqlite"
        54816768; doas rm -f "./db/pkg/repo-FreeBSD.sqlite"
        66433024; doas rm -f "./db/pkg/local.sqlite"
        82313216; doas rm -f "./db/clamav/clamav-2671b72fce703c2133c61e5bf85aad19.tmp/clamav-373e311ca7f610a39c7cf5c5c5a4fd83.tmp/daily.hsb"
       117892267; doas rm -f "./db/clamav/main.cvd"
       132431872; doas rm -f "./db/clamav/daily.cld"
       614839082; doas rm -f "./db/pkg/provides/provides.db"

The output is in an 'executable' format, so if you select a whole line and paste it into the terminal, then that file will be deleted. By default it uses doas(1) but nothing stops you from putting sudo(8) there. Not sure if you will find it useful but it has helped me at least a dozen times.

How Many Copies Do You Keep

I often find myself keeping the same files in several places which also wastes space (unless you use ZFS deduplication of course).

The dedup.sh script I once made is a little larger, so I will not paste it here and will just put a link to it instead.

It has the following options available. You may search/compare files by name or size (fast) or by their MD5 checksum (slow).

% dedup.sh
usage: dedup.sh OPTION DIRECTORY
  OPTIONS: -n   check by name (fast)
           -s   check by size (medium)
           -m   check by md5  (slow)
           -N   same as '-n' but with delete instructions printed
           -S   same as '-s' but with delete instructions printed
           -M   same as '-m' but with delete instructions printed
  EXAMPLE: dedup.sh -s /mnt

Simple usage example.

% cd misc/man
% cp zfs-notes zfs-todo
% dedup.sh -M .
count: 2 | md5: 4ff4be66ab7e5484de2bf7c168ff995a
  doas rm -rf "./zfs-notes"
  doas rm -rf "./zfs-todo"

count: 2 | md5: 6d87f5b1317ea189165fcdc71380735c
  doas rm -rf "./x11"
  doas rm -rf "./xinit"

By copying the zfs-notes file into the zfs-todo file I wanted to show you what dedup.sh will print on the screen, but accidentally I also found another duplicate 🙂

The output of dedup.sh is simple, and as with the files-big.sh script, selecting a whole line and pasting it into the terminal will remove the duplicate. By default it uses doas(1) but you can change it to sudo(8) if that works better for you.
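
If you wonder how the slow '-m' mode could work internally, here is a minimal sketch of the idea (my own illustration, not the actual dedup.sh code) using FreeBSD's md5(1). It prints every copy after the first one in each group of identical files.

#! /bin/sh
# MINIMAL SKETCH OF THE '-m' (CHECK BY MD5) IDEA - NOT THE ACTUAL dedup.sh
# md5 -r prints 'HASH FILE' - the hash is 32 characters long
# so the file name starts at column 34
find "${1:-.}" -type f -exec md5 -r {} ';' \
  | sort \
  | awk 'h == $1 { print substr($0, 34) } { h = $1 }'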

Unusual cron(1) Intervals

Most of us already remember what the five fields of the crontab(5) file mean, but what if you would like to run a command every second … or only once after a reboot? The answer lies in the man 5 crontab page. Here are these exotic options.

string          meaning
------          -------
@reboot         Run once, at startup of cron.
@yearly         Run once a year, "0 0 1 1 *".
@annually       (same as @yearly)
@monthly        Run once a month, "0 0 1 * *".
@weekly         Run once a week, "0 0 * * 0".
@daily          Run once a day, "0 0 * * *".
@midnight       (same as @daily)
@hourly         Run once an hour, "0 * * * *".
@every_minute   Run once a minute, "*/1 * * * *".
@every_second   Run once a second.
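
For example, to make a backup of your home directory once a week and to start an SSH tunnel after every reboot, you could use entries like these (both script paths are hypothetical):

@weekly  ~/scripts/backup-home.sh
@reboot  ~/scripts/start-tunnel.sh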

Check cron(1) Environment

Many times I have lost lots of time debugging what went wrong when my script was run from the crontab(5) file. Often it was some missing variable, or some command or script I used was not in the PATH variable.

To make that debugging faster you can use an ENV.sh script to just store the cron(1) environment.

% cat ENV.sh
env > /tmp/ENV.out

The ENV.sh script will write the current environment to the /tmp/ENV.out file.

Let's put it into the crontab(5) for a test.

% crontab -l | grep ENV
@every_second ~/ENV.sh

Now, after at most a second, you can check the contents of the /tmp/ENV.out file.

% cat /tmp/ENV.out
LOGNAME=vermaden
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin
PWD=/home/vermaden
HOME=/home/vermaden
USER=vermaden
SHELL=/bin/sh

Now you can easily debug the scripts run from crontab(5) … at least on the environment part 🙂

Simple HTTP Server

I have found myself many times in a situation where I wanted to allow downloading some files from my machine and SSH could not be used.

This is when python(1) comes in handy. It has SimpleHTTPServer (or http.server in the Python 3 version) so you can instantly start an HTTP server in any directory!

Here are the commands for both Python versions.

  • Python 2.x – python -m SimpleHTTPServer PORT
  • Python 3.x – python -m http.server PORT

I even made a simple http.sh wrapper script to make it even easier.

#! /bin/sh

if [ ${#} -ne 1 ]
then
  echo "usage: ${0##*/} PORT"
  exit 1
fi

python -m SimpleHTTPServer "${1}"
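
If only Python 3 is available on the system, a hypothetical variant of the same wrapper differs only in the last line:

#! /bin/sh

if [ ${#} -ne 1 ]
then
  echo "usage: ${0##*/} PORT"
  exit 1
fi

python3 -m http.server "${1}"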

Example usage.

% cd misc/man
% http.sh 8080
Serving HTTP on 0.0.0.0 port 8080 ...
127.0.0.1 - - [14/Sep/2018 23:06:50] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [14/Sep/2018 23:06:50] code 404, message File not found
127.0.0.1 - - [14/Sep/2018 23:06:50] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [14/Sep/2018 23:09:15] "GET /bhyve HTTP/1.1" 200 -

To stop it simply hit the [CTRL]-[C] interrupt sequence.

Here is how it looks in the Epiphany browser.

[Screenshot: http.png]

Simple FTP Server

Similarly for the FTP service, another Python goodie called pyftpdlib (Python FTP Server Library) provides that.
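
If you do not have pyftpdlib installed, it is available from PyPI:

% pip install --user pyftpdlib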

My ftp.py wrapper is a little bigger, as you can write quite complicated setups with pyftpdlib, but mine is simple: it starts in the current directory, adds a read-only anonymous user, and adds a read/write user named writer with the WRITER password.

#! /usr/bin/env python

from sys                   import argv,exit
from pyftpdlib.authorizers import DummyAuthorizer
from pyftpdlib.handlers    import FTPHandler
from pyftpdlib.servers     import FTPServer

if len(argv) != 2:
  print "usage:", argv[0], "PORT"
  print
  exit(1)

# read/write user 'writer' (password WRITER) and a read-only
# anonymous user, both rooted in the current directory
authorizer = DummyAuthorizer()
authorizer.add_user("writer", "WRITER", ".", perm="elradfmw")
authorizer.add_anonymous(".")

# a single passive port is enough for this simple setup
handler = FTPHandler
handler.authorizer = authorizer
handler.passive_ports = range(60000, 60001)

address = ("0.0.0.0", argv[1])
ftpd = FTPServer(address, handler)
ftpd.serve_forever()

The ftp.py is handy if you want to enable someone to upload something for you (or you are doing it on the other machine) when SSH/SCP is not possible for some reason.

To stop it simply hit the [CTRL]-[C] interrupt sequence.

Here is its terminal startup and logs.

% cd misc/man
% ftp.py 2121
[I 2018-09-14 23:21:53] }}} starting FTP server on 0.0.0.0:2121, pid=64399 {{{
[I 2018-09-14 23:21:53] concurrency model: async
[I 2018-09-14 23:21:53] masquerade (NAT) address: None
[I 2018-09-14 23:21:53] passive ports: 60000->60000

… and how Firefox renders its contents.

[Screenshot: ftp.png]

Hope you will find some of these useful, see you in Part 4 some day.

UPDATE 1 – More Short Functions

As time flies by I have also added several other 'short functions' that make my life easier. They are related to my Universal File Opener named see.sh.
This is the part that I added to my ~/.zshrc shell config.

# SHORT see.sh OPEN ALIASES
  alias s='see.sh'
  alias o='see-pipe-open.sh'

The additional see-pipe-open.sh helper script is meant to be used in pipes, to open all files from stdin.
Example below.

% ls
bsd.1.pdf  bsd.2.pdf  bsd.png  unix.1.pdf  unix.2.pdf  NOTES.txt

% q bsd
bsd.1.pdf
bsd.2.pdf
bsd.png

% q bsd | g pdf
bsd.1.pdf
bsd.2.pdf

% q bsd | g pdf | o
// see.sh will open bsd.1.pdf with mupdf(1)
// see.sh will open bsd.2.pdf with mupdf(1)

Now the q bsd | g pdf | o command will open the bsd.1.pdf and bsd.2.pdf files according to what is configured in the see.sh handler. In my case mupdf(1) would be used to open both of them.

As for the other s shortcut: it is just faster to type s bsd.1.pdf than see.sh bsd.1.pdf to open a file in the terminal 🙂

Regards.

EOF

5 thoughts on "Ghost in the Shell – Part 3"


  Filip H.F. Slagter

    Instead of using `ls | grep`, why not use the more portable alternative of `find $path -regex pattern` (or -iregex for case-insensitive matches)?
    AFAIK the output of find is less prone to differ between implementations, compared to ls, and it's one less pipe to use. Also, if you don't need the use of regular expressions, you can use '-name' or '-iname' (case-insensitive) instead, which allow for literal string matches with globbing support (i.e. -name '*.jpg').
    If you want an output similar to ls, you can use `find $path -ls`, which gives an output similar to `ls -dils` (for GNU find), or `ls -dgils` (for BSD find on macOS).
    If you only want it to return the basename, rather than the full path, you can use: `find $path -exec basename {} \;`

    As you can see, another benefit of `find` is that you can directly execute commands with its output, without needing to pipe its output to another command.

    Thanks for the recommendation of `ncdu` btw; that looks like a better option than writing my own solution with `find` and `du`.

