Unix Shell Patterns

Introduction

Whenever a pattern can be used to select files with wpkg, a full Unix-like shell pattern is accepted unless otherwise specified.

Examples:

*.txt
dev/sda[0-9]
home/*/.profile
usr/bin/*
usr/share/doc/*/copyright
var/log/*.???

The following sections describe the pattern features supported by the wpkg implementation.

API Detail: The exact function used for this purpose is called memfile::glob().

Asterisk (*)

The asterisk matches any sequence of characters, including an empty one. It is often used to select all the files with a specific extension. For example, to select all the dynamic library files from a Linux project, the following pattern would be used:

*.so

This pattern also matches a file named ".so". Since such a file generally doesn't exist, the pattern still works as expected. In most cases, a Unix shell ignores (or hides) files that start with a period (.), so in a shell it would look like ".so" doesn't match. This is not the case in the packager environment, where in most cases all the files are included (although "." and ".." are automatically skipped as required).
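As an illustration only, the behavior described above can be reproduced with Python's standard fnmatch module, which also does not treat a leading period specially. This is not the wpkg implementation (that is memfile::glob()), and the file names below are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, chosen only to illustrate the pattern.
names = ["libz.so", "libm.so", ".so", "README.txt", "libz.so.1"]

# The asterisk may match an empty sequence, so ".so" is selected too.
selected = [n for n in names if fnmatchcase(n, "*.so")]
print(selected)  # ['libz.so', 'libm.so', '.so']
```

Note that "libz.so.1" is not selected because the pattern must match the whole name, not just a part of it.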

Question Mark (?)

The question mark matches exactly one character, whatever that character is. If the end of the name was already reached, then it is not a match (i.e. the question mark can only match an existing character, never the null terminator).

Going back to the "*.so" example, either of the following patterns ensures that the ".so" file is not a match:

?*.so
*?.so
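The interaction between the asterisk and the question mark can be sketched with a small recursive matcher. The function name match() is hypothetical and the code only handles "*", "?", and literal characters; it is a minimal sketch of the rules described here, not the code used by wpkg.

```python
def match(pattern: str, name: str) -> bool:
    """Match name against a pattern made of '*', '?' and literal characters."""
    if not pattern:
        # The whole pattern was consumed: the name must be consumed too.
        return not name
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # '*' may consume zero or more characters of the name.
        return any(match(rest, name[i:]) for i in range(len(name) + 1))
    if not name:
        # '?' and literal characters both need an existing character.
        return False
    if head == "?" or head == name[0]:
        return match(rest, name[1:])
    return False


print(match("*.so", ".so"))      # True
print(match("?*.so", ".so"))     # False
print(match("*?.so", ".so"))     # False
print(match("?*.so", "libz.so")) # True
```

Both "?*.so" and "*?.so" require at least one character before the ".so" extension, which is why they reject the bare ".so" name.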

Character Set ([...])

A character set is delimited by square brackets. The brackets can contain the following:

1) As the very first character: "!" or "^", which inverts the meaning of the set from include to exclude. In other words, if the character matches the content of the set, then false is returned. When the character doesn't match, true is returned.

2) As the very first matching character: "-". When the set starts with a dash, the dash is taken literally, meaning that the input filename may include a dash at that location.

3) A range defined as character, dash, character. For example, to accept all lowercase letters, write:

[a-z]

This character set means all the letters from 'a' to 'z'. In such a range, the character on the right has to be greater than or equal to the character on the left. Any number of ranges or single characters can be added. For example, to accept hexadecimal digits one can use:

[0-9a-fA-F]

In this case we have 3 ranges: 0-9 for all digits, and then a-f and A-F for the first 6 letters of the alphabet.

Since the dash character is used for ranges, there are only two ways to include it in a set: (1) place it at the very beginning; or (2) use it as one or both ends of a range:

[-0-9]
[0-9---]

These two sets are equivalent: both accept a dash or a digit.

4) Any other character found in the character set has to be an exact match.
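The four rules above can be sketched as a function that tests a single character against the content of a set (the text between the square brackets). The name set_matches() is hypothetical, and raising an error on a reversed range is an assumption of this sketch; the text only states that the right side of a range must not be smaller than the left side.

```python
def set_matches(body: str, ch: str) -> bool:
    """Return True if ch is accepted by the character set written as [body]."""
    # Rule 1: a leading '!' or '^' inverts the meaning of the set.
    negate = body[:1] in ("!", "^")
    if negate:
        body = body[1:]

    found = False
    i = 0
    # Rule 2: a dash in the first matching position is taken literally.
    if body[:1] == "-":
        found = ch == "-"
        i = 1

    while i < len(body):
        lo = body[i]
        if i + 2 < len(body) and body[i + 1] == "-":
            # Rule 3: a range is character, dash, character.
            hi = body[i + 2]
            if hi < lo:
                raise ValueError(f"invalid range {lo}-{hi}")
            found = found or lo <= ch <= hi
            i += 3
        else:
            # Rule 4: any other character has to match exactly.
            found = found or ch == lo
            i += 1

    return found != negate


print(set_matches("a-z", "m"))         # True
print(set_matches("!a-z", "m"))        # False
print(set_matches("0-9a-fA-F", "g"))   # False
print(set_matches("-0-9", "-"))        # True
print(set_matches("0-9---", "-"))      # True
```

The last two calls show that the sets [-0-9] and [0-9---] accept the same characters: in the second form, "---" is read as a range going from a dash to a dash.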

Exact Match

The other characters in the pattern are expected to exactly match the corresponding characters in the input filenames being checked.

For example, the following pattern matches a path that ends with a slash:

*/
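As a quick check of this pattern, the sketch below uses Python's standard fnmatch module as a stand-in matcher. It is not the wpkg code, and the paths are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical paths, chosen only to illustrate the pattern.
paths = ["usr/bin/", "usr/bin", "/", "var/log/"]

# The trailing "/" in the pattern has to match the last character exactly.
ending_with_slash = [p for p in paths if fnmatchcase(p, "*/")]
print(ending_with_slash)  # ['usr/bin/', '/', 'var/log/']
```

The path "usr/bin" is rejected because its last character is not a slash, while "/" is accepted because the asterisk can match an empty sequence.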