CodeGnome Consulting, LTD

Programming - DevOps - Project Management - Information Security

Parsing Initials From GECOS


Why Would You Need a User’s Initials?

Some programs want you to enter someone’s initials for tracking edits in a word-processing document, or perhaps you need to find all the user stories assigned to one person using their initials as search criteria for the Pivotal Tracker API. Whatever the reason, I’ve occasionally needed to have a programmatic way to extract a user’s initials.

Use the GECOS Field

The easiest way to do this is to use some basic shell utilities to pull the data you need from the user's passwd record, whether that record lives in the /etc/passwd file or in another source that getent(1) can query. The fifth colon-delimited field of each record is called the GECOS field. On Linux systems, it usually contains five comma-delimited subfields:

  1. Full Name
  2. Room Number
  3. Work Phone
  4. Home Phone
  5. Other
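
To make the subfield layout concrete, here is a minimal sketch that splits a sample GECOS string into the five subfields listed above. The helper name split_gecos and the sample record are made up for illustration; the only real mechanism is the shell's read builtin with IFS temporarily set to a comma.

```shell
# Split a GECOS string into its five conventional subfields.
# split_gecos is a hypothetical helper, not part of any standard tool.
split_gecos () {
    # IFS=, applies only to this read; -r keeps backslashes literal.
    # The last variable (other) receives the rest of the line,
    # including any commas it contains.
    IFS=, read -r full_name room work_phone home_phone other <<EOF
$1
EOF
    printf 'Full Name:   %s\n' "$full_name"
    printf 'Room Number: %s\n' "$room"
    printf 'Work Phone:  %s\n' "$work_phone"
    printf 'Home Phone:  %s\n' "$home_phone"
    printf 'Other:       %s\n' "$other"
}

# Sample record, invented for this example.
split_gecos "Miles O'Brien,Ops 1,555-0100,555-0199,acct=42,dept=eng"
```

Because the other subfield is read last, any commas inside it survive intact, which matches the chfn(1) note that only the other field may contain them.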

There are some limitations—and some non-limitations—on the format of the GECOS field. According to the chfn(1) manual page:

These fields must not contain any colons. Except for the other field, they should not contain any comma or equal sign. It is also recommended to avoid non-US-ASCII characters, but this is only enforced for the phone numbers. The other field is used to store accounting information used by other applications.

However, you may also run into problems if you use quotes around your comma-separated values, so don’t do that. For example:

Problematic Full-Name Field
sudo chfn -f \""Chief Miles O'Brien\"" mobrien

Furthermore, what would you do with a full-name field that contained 'Chief Miles O'Brien' that has single-quotes both within and surrounding the name—besides bang your head on the desk, I mean? Just don’t do it.

Parsing the GECOS Field

Okay, so the GECOS field is what we want. How do we parse it?

Parsing the GECOS Field
# Get the full_name from the GECOS field. See chfn(1) and
# CHFN_RESTRICT from login.defs(5) for additional details.
grab-fullname () {
    local login="${1:-$LOGNAME}"
    getent passwd "$login" | cut -d: -f5 | cut -d, -f1
}

# Break names on spaces and hyphens.
grab-initials () {
    grep -E -o '[^[:space:]-]+' | cut -c1 | tr -d "\n" | tr a-z A-Z
}

# Pass a different login name to grab-fullname if you want initials
# for someone other than the current user.
export INITIALS=$(grab-fullname | grab-initials)

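Note that I’m using a mix of Bash conveniences (hyphenated function names and the local builtin) and external utilities: getent, cut, grep, and tr. Only cut and tr come from GNU coreutils, and the -o option to grep is a widely supported extension rather than part of POSIX. If you want a pure shell solution, or a completely portable script that’s free of bashisms, feel free to tinker.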
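
As a starting point for that tinkering, here is one possible sketch that avoids grep and cut and uses only POSIX shell word splitting to find the initials. The function name initials_posix is hypothetical. It still calls tr for upper-casing, because POSIX sh has no built-in case conversion, so it is not quite a pure-shell solution.

```shell
# Print the initials of the name passed as the first argument.
# initials_posix is a hypothetical name used for illustration.
# The body runs in a subshell ( ... ) so that the IFS and "set -f"
# changes below never leak into the caller's environment.
initials_posix () (
    IFS=' -'    # break names on spaces and hyphens
    set -f      # disable globbing while the name is word-split
    set -- $1   # intentionally unquoted: split the name into words
    result=
    for word in "$@"; do
        rest=${word#?}                  # everything after the first character
        result=$result${word%"$rest"}   # keep only the first character
    done
    printf '%s\n' "$result" | tr '[:lower:]' '[:upper:]'
)

initials_posix "John Jacob Jingleheimer Schmidt"   # JJJS
initials_posix "jean-luc picard"                   # JLP
```

Unlike the pipeline version, this one takes the name as an argument instead of reading standard input, and it prints a trailing newline. Command substitution strips that newline, so INITIALS=$(initials_posix "$name") behaves the same way.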

Some Minor Caveats

This technique works pretty well, but has a few minor caveats related to the source data.

  1. If your name is “John Jacob Jingleheimer Schmidt” then your initials will be JJJS. We’re not enforcing a maximum length here.
  2. If your name is “Madonna” then your sole initial will be M. We’re not enforcing a minimum length, either.
  3. If your name is “Miles O'Brien” then your initials will be MO. That may or may not be what you expect, but it seems sensible to me.
  4. If you ignored the advice about quoting, and your GECOS field contains "John Jacob Jingleheimer Schmidt",,, then you will end up with "JJS. This is most certainly not what you want, and you will need to take the additional step of stripping off any double-quotes before splitting the name.
  5. If you populate your GECOS field with silly things like John "Q for Cute" Public,,, then you’ll get what you deserve: J"FCP.
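
If fixing the data is not an option, the algorithm can be adjusted instead. The following is a minimal sketch, not a definitive fix: it strips double-quotes before splitting the name, as suggested in the fourth caveat, and optionally caps the number of initials. The function name clean_initials and its optional maximum-length argument are assumptions made for this example, and the sketch relies on the same grep -o extension as the original function.

```shell
# Read a full name on standard input and print its initials.
# clean_initials and its optional max-length argument are hypothetical.
clean_initials () {
    local max_len="${1:-0}"   # 0 means "no maximum length"
    local initials

    # Delete double-quotes first, then break on spaces and hyphens.
    initials=$(tr -d '"' |
        grep -E -o '[^[:space:]-]+' |
        cut -c1 | tr -d '\n' | tr '[:lower:]' '[:upper:]')

    # Optionally keep only the first max_len initials.
    if [ "$max_len" -gt 0 ]; then
        initials=$(printf '%s' "$initials" | cut -c1-"$max_len")
    fi
    printf '%s\n' "$initials"
}

printf '%s\n' '"John Jacob Jingleheimer Schmidt"' | clean_initials    # JJJS
printf '%s\n' 'John "Q for Cute" Public' | clean_initials             # JQFCP
printf '%s\n' 'John Jacob Jingleheimer Schmidt' | clean_initials 3    # JJJ
```

Single quotes are deliberately left alone so that a name such as Miles O'Brien still yields MO. A name wrapped in single quotes would need its own handling.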

As always, if you have special needs then you can fix your data, modify your algorithm, or adjust your expectations. Pick at least one.
