I recently needed to split quite a large number of audio files into smaller equal parts. The first thought that came to my mind was that probably a thousand or more people had a similar problem in the past, so it's already solved, and I went directly to the web search engine.
The solutions I found were not that great: they worked only partially … or did not work the way I expected. After looking at one of them, a bash(1) script, I started to modify it … but it turned out that writing my own solution was faster, easier … and simpler.
Today I will share with you my solution to automatically split audio files into small equal parts.
Existing Solutions
In my search for existing solutions I did find some tools that should allow me to achieve what I need. I will now try to talk about them one after another.
mp3splt
The first one I found was the audio/mp3splt port (and package) available on FreeBSD. So I installed it with the typical pkg(8) command as shown below.
# pkg install mp3splt
It installed properly … but crashed with a Segmentation Fault instead of actually working. I even submitted a PR for that in the FreeBSD Bugzilla – 264866 – but there has been no update so far.
Thus I removed that package and went to search for something that works.
Brasero
Someone on some forum suggested using the CD/DVD burning software Brasero, because one of its features is audio splitting, so I installed the sysutils/brasero package next.
# pkg install brasero
It turns out that it really works. Some screenshots below.
… but that did not satisfy me because I wanted an automated/unattended solution instead of ‘clicking’ through each file separately to split it. I also did not like the fact that I needed to specify the time in seconds.
mp3split
Do not confuse it with the mp3splt command mentioned earlier. The mp3split is an unattended tool written as a bash(1) script. It is available and described here: https://diegosanchezp.github.io/blog/mp3split/ One of its downsides (for me) was that it needs an additional external ‘list’ file with times and titles for the parts.
I did not want to write this list each time, so I generated a list file long enough to cover any possible file, no matter the length, with the following loop.
% seq 0 10 10000 \
    | while read MIN
      do
        seq 0 10 50 \
          | while read SEC
            do
              echo ${MIN}:${SEC}
            done
      done > list.txt
% head list.txt
0:0
0:10
0:20
0:30
0:40
0:50
10:0
10:10
10:20
10:30
I needed to split these audio files every 10 minutes. I redirected that output into the list.txt file. I then fetched the mentioned mp3split script and made it executable.
% fetch https://raw.githubusercontent.com/diegosanchezp/mp3split/master/mp3split.sh
% chmod +x mp3split.sh
% ./mp3split.sh --help
zsh: ./mp3split.sh: bad interpreter: /bin/bash: no such file or directory
% head -1 ./mp3split.sh
#!/bin/bash
So now we will have to remove the linuxisms from the script. Let's hope it's only the interpreter part.
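The interpreter line can of course be changed by hand in any editor. If there are more scripts like that, it could also be done with a small helper. The sketch below is only an illustration: the fix_shebang name is made up, and it assumes that bash(1) is installed from packages and that the first line is exactly #!/bin/bash (optionally with spaces after the #! part).

```shell
#!/bin/sh
# Hypothetical helper (not part of mp3split): replace a hard-coded
# /bin/bash interpreter line with a lookup through env(1).
# $1 = path to the script to fix
fix_shebang() {
  tmp="$1.tmp.$$"
  # Only line 1 is touched, and only if it is exactly "#!/bin/bash".
  # Writing back with cat(1) keeps the mode (executable bit) of the file.
  sed '1s|^#! */bin/bash$|#! /usr/bin/env bash|' "$1" > "$tmp" \
    && cat "$tmp" > "$1" \
    && rm -f "$tmp"
}
```

It would be used as fix_shebang ./mp3split.sh and the result can be checked with head -1 afterwards.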
% head -1 ./mp3split.sh
#! /usr/bin/env bash
% ./mp3split.sh --help
./mp3split.sh: illegal option -- -
Invalid option: -
Usage: mp3split [OPTIONS] inputaudio tracklist
Options:
-s: do a simulation without writing anything to disk
-h: print this help
-e extension: set output extension, if extension is equal to "" keep extension of input file
The script will output all the splitted files in the current/working directory.
Better. Let's try to use it.
% ./mp3split.sh LARGE-AUDIO-FILE.mp3 list.txt
=== Begin to create mp3 split files ===
0:0.mp3: Protocol not found
Processed 0:0 to 0:10; 0:0.mp3
0:10.mp3: Protocol not found
Processed 0:10 to 0:20; 0:10.mp3
0:20.mp3: Protocol not found
Processed 0:20 to 0:30; 0:20.mp3
0:30.mp3: Protocol not found
Processed 0:30 to 0:40; 0:30.mp3
0:40.mp3: Protocol not found
Processed 0:40 to 0:50; 0:40.mp3
0:50.mp3: Protocol not found
Processed 0:50 to 10:0; 0:50.mp3
10:0.mp3: Protocol not found
Processed 10:0 to 10:10; 10:0.mp3
10:10.mp3: Protocol not found
Processed 10:10 to 10:20; 10:10.mp3
10:20.mp3: Protocol not found
Processed 10:20 to 10:30; 10:20.mp3
10:30.mp3: Protocol not found
Processed 10:30 to 10:40; 10:30.mp3
10:40.mp3: Protocol not found
Processed 10:40 to 10:50; 10:40.mp3
10:50.mp3: Protocol not found
Processed 10:50 to 20:0; 10:50.mp3
20:0.mp3: Protocol not found
Processed 20:0 to 20:10; 20:0.mp3
20:10.mp3: Protocol not found
Processed 20:10 to 20:20; 20:10.mp3
20:20.mp3: Protocol not found
Processed 20:20 to 20:30; 20:20.mp3
^C
A strange error message: Protocol not found … After a small investigation it turned out that a two-character fix to the ffmpeg(1) command will do. The diff(1) is available below.
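The background, as far as I can tell, is that ffmpeg(1) treats the part of a name before the first colon as a protocol (like in http:), so a name such as 0:10.mp3 is not seen as a plain file. Putting ./ in front of it makes it an ordinary path again. The helper below is a minimal sketch of that idea only; the ffmpeg_safe_name name is made up and it is not part of mp3split.

```shell
#!/bin/sh
# Hypothetical helper (not part of mp3split): make a file name safe to
# pass to ffmpeg(1) when it may contain a colon, e.g. "0:10.mp3".
# $1 = file name or path
ffmpeg_safe_name() {
  case "$1" in
    /*|./*|../*) printf '%s\n' "$1" ;;    # already starts with a path part
    *)           printf './%s\n' "$1" ;;  # plain name: prefix with ./
  esac
}
```

With that, something like ffmpeg ... "$(ffmpeg_safe_name "$outfile")" would behave the same as the two-character fix.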
% diff -u mp3split.sh mp3split.sh.FIXED.sh
--- mp3split.sh 2022-06-25 22:34:25.499718000 +0200
+++ mp3split.sh.FIXED.sh 2022-06-25 22:37:45.580845000 +0200
@@ -25,7 +25,7 @@
 outfile="$tracktitle.$ext"
 # Begin splitting files with ffmpeg
- [ ! "$simulate" = true ] && ffmpeg -nostdin -y -loglevel error -i "$inputaudio" -ss "$start" -to "$end" -acodec copy "$outfile"
+ [ ! "$simulate" = true ] && ffmpeg -nostdin -y -loglevel error -i "$inputaudio" -ss "$start" -to "$end" -acodec copy ./"$outfile"
 echo "Processed $start to $end; $outfile"
 }
Now let's try to use the fixed version.
% ./mp3split.sh.FIXED.sh LARGE-AUDIO-FILE.mp3 list.txt
=== Begin to create mp3 split files ===
Processed 0:0 to 0:10; 0:0.mp3
Processed 0:10 to 0:20; 0:10.mp3
Processed 0:20 to 0:30; 0:20.mp3
Processed 0:30 to 0:40; 0:30.mp3
Processed 0:40 to 0:50; 0:40.mp3
Processed 0:50 to 10:0; 0:50.mp3
Processed 10:0 to 10:10; 10:0.mp3
Processed 10:10 to 10:20; 10:10.mp3
Processed 10:20 to 10:30; 10:20.mp3
Processed 10:30 to 10:40; 10:30.mp3
Processed 10:40 to 10:50; 10:40.mp3
Processed 10:50 to 20:0; 10:50.mp3
Processed 20:0 to 20:10; 20:0.mp3
Processed 20:10 to 20:20; 20:10.mp3
Processed 20:20 to 20:30; 20:20.mp3
Processed 20:30 to 20:40; 20:30.mp3
Processed 20:40 to 20:50; 20:40.mp3
Processed 20:50 to 30:0; 20:50.mp3
Processed 30:0 to 30:10; 30:0.mp3
Processed 30:10 to 30:20; 30:10.mp3
Processed 30:20 to 30:30; 30:20.mp3
Processed 30:30 to 30:40; 30:30.mp3
Processed 30:40 to 30:50; 30:40.mp3
Processed 30:50 to 40:0; 30:50.mp3
Processed 40:0 to 40:10; 40:0.mp3
Processed 40:10 to 40:20; 40:10.mp3
Processed 40:20 to 40:30; 40:20.mp3
Processed 40:30 to 40:40; 40:30.mp3
Processed 40:40 to 40:50; 40:40.mp3
Processed 40:50 to 50:0; 40:50.mp3
Processed 50:0 to 50:10; 50:0.mp3
Processed 50:10 to 50:20; 50:10.mp3
Processed 50:20 to 50:30; 50:20.mp3
Processed 50:30 to 50:40; 50:30.mp3
Processed 50:40 to 50:50; 50:40.mp3
Invalid duration specification for to: 60:0
Processed 50:50 to 60:0; 50:50.mp3
Invalid duration specification for ss: 60:0
Processed 60:0 to 60:10; 60:0.mp3
Invalid duration specification for ss: 60:10
Processed 60:10 to 60:20; 60:10.mp3
Invalid duration specification for ss: 60:20
Processed 60:20 to 60:30; 60:20.mp3
Invalid duration specification for ss: 60:30
Processed 60:30 to 60:40; 60:30.mp3
Invalid duration specification for ss: 60:40
Processed 60:40 to 60:50; 60:40.mp3
Invalid duration specification for ss: 60:50
Processed 60:50 to 70:0; 60:50.mp3
Invalid duration specification for ss: 70:0
Processed 70:0 to 70:10; 70:0.mp3
Invalid duration specification for ss: 70:10
Processed 70:10 to 70:20; 70:10.mp3
^C
Great … so after the file has ended it still tries EVERY goddamn position from the list.txt file. It was also not able to reach the final ‘ending’ part without ‘visiting’ each time from the list.txt file. Enough is enough. I tried.
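One way around the endless list would be to generate the cut points only up to the length of the given file, and in the HH:MM:SS form that ffmpeg(1) documents for time values. Below is a minimal sketch of that idea. The cut_points name is made up, the length in seconds has to be supplied by the caller, and I did not check whether mp3split accepts a list in exactly this shape.

```shell
#!/bin/sh
# Hypothetical helper: print one HH:MM:SS cut point per line, starting at
# zero and stopping before the end of the file.
# $1 = total length in seconds, $2 = step in seconds
cut_points() {
  t=0
  while [ "$t" -lt "$1" ]; do
    printf '%02d:%02d:%02d\n' $(( t / 3600 )) $(( t % 3600 / 60 )) $(( t % 60 ))
    t=$(( t + $2 ))
  done
}
```

For a file of about 45 minutes and a 10 minute step that would be cut_points 2695 600 > list.txt with five lines in the list.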
Custom Script Solution
After trying to modify the mp3split script even more I came to the conclusion that it would take less time to write my own solution from scratch … and this is exactly what I did. I wrote audio-split.sh for the POSIX /bin/sh interpreter for portability. About an hour later, 50 lines of code did exactly what I needed, not counting the __usage() function with the help information.
Here is the __usage() contents by the way.
The idea/needs were:
- split large file automatically/unattended into equal parts
- create new dir in which these parts are created
- new dir must have same name as specified file (without extension)
- each part will get a ' - xxx' suffix (like ' - 001' for first part) with original extension
… and they were met.
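To make the list above more concrete, here is a minimal sketch of how such a split could be done in plain /bin/sh. It is NOT the audio-split.sh script itself. The part_count, part_path and split_audio names are made up for this illustration, and it assumes that ffmpeg(1) and ffprobe(1) are installed and that the file is given as a relative path with an extension.

```shell
#!/bin/sh
# Hypothetical sketch only; this is NOT the author's audio-split.sh.
# Assumes ffmpeg and ffprobe are installed and the input is a relative
# path to a file that has an extension.

# Parts needed to cover $1 seconds with pieces of $2 seconds
# (ceiling division).
part_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Output path for part number $2 of input file $1:
# "<name>/<name> - NNN.<ext>", e.g. "talk/talk - 001.mp3" for talk.mp3.
part_path() {
  dir=${1%.*}      # input path without extension = new directory
  base=${1##*/}    # file name without leading directories
  printf '%s/%s - %03d.%s\n' "$dir" "${base%.*}" "$2" "${base##*.}"
}

# split_audio MINUTES FILE
split_audio() {
  len=$(( $1 * 60 ))
  # ffprobe prints the length in seconds with a fraction; the fraction
  # is cut off here, so a tail shorter than one second may be dropped
  # when the length is just above a multiple of the part length.
  total=$(ffprobe -v error -show_entries format=duration \
            -of default=noprint_wrappers=1:nokey=1 "$2") || return 1
  total=${total%.*}
  [ -n "$total" ] || return 1
  mkdir -p "${2%.*}" || return 1
  n=$(part_count "$total" "$len")
  i=1
  while [ "$i" -le "$n" ]; do
    out=$(part_path "$2" "$i")
    # Stream copy (no re-encoding); ./ guards against colons in names.
    ffmpeg -nostdin -y -loglevel error -i "$2" \
      -ss "$(( (i - 1) * len ))" -t "$len" -acodec copy "./$out" || return 1
    echo "$out"
    i=$(( i + 1 ))
  done
}
```

A call like split_audio 10 LARGE-AUDIO-FILE.mp3 would then create the LARGE-AUDIO-FILE directory and print the path of each part it writes. The last part simply ends where the file ends, so it can be shorter than the others.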
Here is the output of running audio-split.sh command.
% ffmpeg -i LARGE-AUDIO-FILE.mp3 2>&1 | grep Duration
Duration: 00:44:55.99, start: 0.025057, bitrate: 171 kb/s
% audio-split.sh 10 LARGE-AUDIO-FILE.mp3
LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 001.mp3
LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 002.mp3
LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 003.mp3
LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 004.mp3
LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 005.mp3
% du -sm LARGE-AUDIO-FILE.mp3
56 LARGE-AUDIO-FILE.mp3
% du -smc LARGE-AUDIO-FILE/*
13 LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 001.mp3
13 LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 002.mp3
13 LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 003.mp3
13 LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 004.mp3
7 LARGE-AUDIO-FILE/LARGE-AUDIO-FILE - 005.mp3
56 total
The total size is the same (or similar for larger files). After listening to the parts I came to the conclusion that it works properly. The audio file is about 45 minutes long and the script created four 10-minute files and one that is less than 5 minutes long. Not sure if you also have such needs, but if you do, then you may now use another solution for it: audio-split.sh :)