Fragmented Development

Removing leading/trailing spaces in the shell

Whitespace causes lots of interesting issues on the command line, whether it is present in file names, arguments, or any other data flowing between commands. Quoting can help, but there are some very particular edge cases I've encountered in my scripts.

Leading and trailing whitespace

If you are piping output directly into another command, you can occasionally get whitespace characters before or after your data. You have to use quotes to make sure internal whitespace is preserved, but quoting also preserves any whitespace characters at the beginning and/or end of your data.

I use three methods of removing leading or trailing whitespace: echo, xargs, and sed.


The echo method involves echoing a value without quotes, and re-assigning it to a variable:

TEST='   lousy  spaces!     '
TEST="$( echo $TEST )"

echo "$TEST"
# Prints:
#lousy spaces!
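One thing worth seeing in action: the unquoted expansion that does the trimming also collapses internal whitespace and expands glob characters such as *. Below is a minimal sketch of the same echo trick wrapped in a function that turns globbing off first. The name trim_echo is hypothetical, not something from the original scripts.

```shell
# trim_echo is a hypothetical helper name used only for illustration.
trim_echo() {
    (
        # Disable pathname expansion inside this subshell so a literal *
        # in the data is not replaced by file names.
        set -f
        # Unquoted on purpose: word splitting drops leading/trailing
        # whitespace and collapses internal runs into single spaces.
        echo $1
    )
}

trimmed="$(trim_echo '   lousy  *  spaces!     ')"
printf '[%s]\n' "$trimmed"
# Prints:
#[lousy * spaces!]
```

Without the set -f line, the * would be replaced by whatever files happen to be in the current directory.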

Benefits:

Drawbacks:


The xargs method is my personal favorite, although it's a bizarre one.

echo '   lousy  spaces!     ' | xargs
# Prints:
#lousy spaces!

The xargs command, as a side effect, strips out leading and trailing whitespace.

If you think about what it's doing, it makes sense: it takes each unescaped "word" string and passes it to a command (echo, by default), which sends it to stdout. It doesn't care about the whitespace between words, so that whitespace gets dropped along the way.
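The "unescaped word" detail matters in practice, because xargs treats quotes and backslashes in its input specially. The sketch below, with made-up sample strings, shows three consequences: input lines are joined, quoted text keeps its internal whitespace but loses the quotes, and a lone apostrophe makes xargs fail.

```shell
# Leading/trailing whitespace is dropped and lines are joined with single spaces.
joined="$(printf '  one  \n  two  \n' | xargs)"
printf '[%s]\n' "$joined"
# Prints:
#[one two]

# Quotes are consumed by xargs: whitespace inside them survives, the quotes do not.
quoted="$(printf '%s\n' '   "lousy   spaces!"   ' | xargs)"
printf '[%s]\n' "$quoted"
# Prints:
#[lousy   spaces!]

# A lone apostrophe looks like an unterminated quote, so xargs exits with an error.
if printf '%s\n' "  it's broken  " | xargs >/dev/null 2>&1; then
    apostrophe_ok=yes
else
    apostrophe_ok=no
fi
printf '%s\n' "$apostrophe_ok"
# Prints:
#no
```

So the xargs trick is best kept for data that is known not to contain quotes or backslashes.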

Benefits:

Drawbacks:


The sed method is not something I've used personally. It's something sed was created for - stream editing - but seems like a costly solution to the problem.

It relies on regular expressions, which are usually slower than other forms of text manipulation. This method is probably plenty performant, but I tend to leave any regex as a last resort.

echo '   lousy  spaces!     ' | sed 's/^[[:space:]]*//' \
 | sed 's/[[:space:]]*$//'

I've included the long [[:space:]] format because it's compatible with non-GNU sed (the same goes for * rather than \+, which is a GNU extension), but if you're rockin' the GNU toolchain, you can use \s as a terse and sensible replacement.
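The two substitutions don't need two sed processes. As a sketch, they can be passed to a single sed call with one -e option each; the function name trim_sed is hypothetical and only there to make the example reusable.

```shell
# trim_sed is a hypothetical helper name; it reads stdin and writes trimmed lines.
trim_sed() {
    # First expression strips leading whitespace, second strips trailing whitespace.
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

sed_result="$(printf '%s\n' '   lousy  spaces!     ' | trim_sed)"
printf '[%s]\n' "$sed_result"
# Prints:
#[lousy  spaces!]
```

Unlike the echo and xargs methods, this leaves the internal double space alone.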

Benefits:

Drawbacks:


The real solution would be to quit doing so much data processing in the shell and move to an actual scripting/programming language. However, there's a delicate balance between what belongs in the shell and what needs its own fully-fledged script, and these techniques have let me accomplish a lot in scripts that didn't need a full programming language backing them.

Update

hackerdefo has submitted several other excellent methods for your consideration. I'm particularly fond of the awk implementation myself.

Note: The tr method removes all spaces, not just leading and trailing.

echo -e " Fragmented Development " | tr -d "[:blank:]"
echo -e " Fragmented Development " | awk '{$1=$1};1'
echo -e " Fragmented Development " | ruby -pe 'gsub(/^\s+/, ""); gsub(/\s+$/, $/)'
echo -e " Fragmented Development " | perl -plne 's/^\s*//;s/\s*$//;s/\s+/ /;'
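To make the note about tr concrete, here is a small side-by-side sketch of the tr and awk one-liners on the same input. The variable names are illustrative, and printf stands in for echo -e only to keep the example portable.

```shell
input=' Fragmented Development '

# tr deletes every blank, including the one between the two words.
tr_out="$(printf '%s\n' "$input" | tr -d '[:blank:]')"

# awk rebuilds the record from its fields, so only the outer whitespace
# disappears (internal runs are collapsed to a single space).
awk_out="$(printf '%s\n' "$input" | awk '{$1=$1};1')"

printf '[%s]\n' "$tr_out"
printf '[%s]\n' "$awk_out"
# Prints:
#[FragmentedDevelopment]
#[Fragmented Development]
```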

Tags: linux terminal


Comments

You can have multiple pattern commands with sed, so you only need to call it once. If sed isn't installed, the "actual" programming language might not be either.

Nat!

