
How to Get the List You Really Want: Why Identifiers Matter in List Enhancement Data Requests

By Alex Yeh-Quevedo, Data Analytics Engineer 

Requests to add data to a list are common and often straightforward. The information exists, the desired output is clear, and the goal is simple: connect one dataset to another and return a more complete picture.

Whether that connection happens smoothly depends on the structure of the file itself, and specifically on the identifier it contains.

University data systems (official databases like PeopleSoft, HCM, or FIS) link records using a stable key. A stable key, also called an identifier, is a value that uniquely points to one person and stays consistent even when names or email addresses change, which lets us line records up reliably across sources. When a list includes that key, the match is direct and dependable. When it does not, the connection becomes less certain. That is not because the request is unreasonable, but because names and emails (which requesters often suggest as ways to link lists) were never intended to be system-wide anchors.

A spreadsheet with first name, last name, and email looks complete. In everyday work, those fields are usually enough to identify a person. In enterprise data environments, however, these data points are descriptive attributes, not the stable keys required to make a definitive match.

We know, of course, that names are neither unique nor stable. Individuals share them and change them. They also use nicknames and can opt to include or omit middle initials. Systems may also store names in different formats. For example, one system might show “Katherine O’Brien-Smith” while another shows “Smith, Katherine OBrien” or “Katie Smith,” which makes automated matching less predictable. 
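
To make the formatting problem concrete, here is a minimal Python sketch that runs the three name variants above through a simple cleanup step. The normalize function and its rules are assumptions made for illustration, not how any Vanderbilt system processes names.

```python
import re

def normalize(name: str) -> str:
    """Reorder 'Last, First' to 'First Last', lowercase, and drop punctuation."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    # \u2019 is the typographic apostrophe; hyphens become spaces.
    name = name.replace("\u2019", "'").replace("-", " ").lower()
    name = re.sub(r"[^a-z ]", "", name)
    return " ".join(name.split())

# The three variants from the paragraph above.
variants = [
    "Katherine O\u2019Brien-Smith",
    "Smith, Katherine OBrien",
    "Katie Smith",
]

normalized = [normalize(v) for v in variants]
distinct = set(normalized)
```

With these rules, the first two variants collapse to the same string, but "Katie Smith" remains a separate value. Cleanup can narrow the gap between formats, but it cannot tell us that a nickname and a formal name belong to the same person.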

Emails are often close to unique, but they are still not considered stable keys. Email addresses can change over time, forward to other accounts, or be stored differently across systems. Even when correct, an email address may not be the field a source system uses as its primary reference. 

When a file lacks a stable key, the task of joining datasets shifts into record comparison. Instead of matching one key to one record, we must decide which record “looks right” based on partial clues (name variants, departments, older emails, etc.). We may still get results, but, without a stable key, there is no built-in safeguard confirming that each match is the person you intended. 
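
The difference between a key match and record comparison can be sketched in a few lines of Python. Everything here is hypothetical: the records, the VUNetID values, and the function names are invented for illustration and do not reflect real data or our actual tooling.

```python
# Hypothetical source-system extract; every value is invented for illustration.
source = [
    {"vunetid": "smithk1", "name": "Katherine Smith", "department": "History"},
    {"vunetid": "smithk2", "name": "Katherine Smith", "department": "Physics"},
    {"vunetid": "lees3", "name": "Sam Lee", "department": "Biology"},
]

def match_by_key(vunetid):
    """Return the single source record with this key, or None if absent."""
    by_key = {row["vunetid"]: row for row in source}
    return by_key.get(vunetid)

def match_by_name(name):
    """Return every source record with this name: zero, one, or many."""
    return [row for row in source if row["name"].lower() == name.lower()]

key_match = match_by_key("smithk2")
name_matches = match_by_name("Katherine Smith")
```

The key lookup returns exactly one record (the Physics entry). The name lookup returns two candidates, and nothing in the file says which one the requester meant.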

When the appropriate identifier is present, the outcome is faster turnaround and far more confidence in the match. We can connect records directly, reduce follow-up questions, and make it easier to validate the results later if questions come up. 

Identifiers That Work Across Vanderbilt Systems 

For most requests to add data to a list, VUNetID is the most versatile identifier because it is used by all students, staff, and faculty. A “Person ID” from a source system, labeled Emplid (PeopleSoft), Person Number (HCM), or FIS ID (FIS), will work just as well as a VUNetID in most cases. However, if the request involves faculty data coming from the Faculty Information System (FIS), the FIS ID is the most dependable key.

Note: HCM also includes a field named Person_ID, which exists only in HCM and is not typically used to join across systems.

Selecting the identifier that aligns with the data source you are pulling from ensures that the match is both accurate and repeatable. 

Because VUNetID serves as a unique identifier, it is almost never changed, even when a person’s name changes and their email address is updated. VUNetIDs function in Vanderbilt’s systems as arbitrary unique identifiers that happen to be associated with one version of the person’s name.

Preparing a File for Additional Data Requests 

To support timely and accurate results, include the identifier that corresponds to the source system. Review the following points and questions before submitting a request to match data.

Recommended columns: 

  • Identifier (VUNetID, Emplid, Person Number, or FIS ID, as appropriate) 
  • Name (helpful for readability and verification) 
  • Email (useful as a reference field, though not ideal as a join key) 
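
For requesters comfortable with a little scripting, a quick self-check along these lines can catch common problems before a file is submitted. This is a sketch under assumptions: the accepted header spellings, the check_file function, and the sample rows are all invented for illustration, and no such script is required.

```python
import csv
import io

# Assumed header spellings; adjust to match how your file labels its columns.
IDENTIFIER_COLUMNS = {"vunetid", "emplid", "person number", "fis id"}

def check_file(csv_text):
    """Return a list of problems; an empty list means the file looks ready."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = {h.strip().lower(): h for h in (reader.fieldnames or [])}
    id_headers = [headers[h] for h in headers if h in IDENTIFIER_COLUMNS]
    if not id_headers:
        return ["no identifier column found"]
    id_col = id_headers[0]
    problems = []
    seen = set()
    for line_no, row in enumerate(reader, start=2):  # row 1 is the header
        value = (row[id_col] or "").strip()
        if not value:
            problems.append(f"row {line_no}: blank {id_col}")
        elif value in seen:
            problems.append(f"row {line_no}: duplicate {id_col} {value}")
        else:
            seen.add(value)
    return problems

# Invented sample data: one blank identifier and one repeated identifier.
sample = """VUNetID,Name,Email
doej1,Jane Doe,jane.doe@example.edu
,John Roe,john.roe@example.edu
doej1,Jane Doe,jane.doe@example.edu
"""
names_only = "Name,Email\nJane Doe,jane.doe@example.edu\n"
```

Calling check_file(sample) reports the blank identifier on row 3 and the duplicate on row 4, while check_file(names_only) reports that no identifier column was found.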

Providing brief context is also helpful: 

  • Who is represented on the list? (faculty, staff, students, or a combination) 
  • What information should be added (e.g., faculty title, tenure status, department, job title)? If you know where that information lives (FIS, HCM, Student, etc.), include it. If you do not, a plain-language description of what you need is enough for us to confirm the right source.
  • If timing matters, include an “as of” date or academic term to ensure the data reflects the correct point in time. 
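
To illustrate why an “as of” date matters, the sketch below assumes a source that keeps dated history rows for each person. That structure, the sample titles and dates, and the title_as_of function are all assumptions for illustration, not a description of how any particular Vanderbilt system stores its data.

```python
from datetime import date

# Hypothetical dated job history for one person; all values are invented.
history = [
    {"effective": date(2021, 8, 16), "title": "Assistant Professor"},
    {"effective": date(2024, 8, 16), "title": "Associate Professor"},
]

def title_as_of(rows, as_of):
    """Return the title from the latest row effective on or before as_of."""
    current = [r for r in rows if r["effective"] <= as_of]
    if not current:
        return None  # the person had no record yet on that date
    return max(current, key=lambda r: r["effective"])["title"]

earlier = title_as_of(history, date(2023, 1, 1))
later = title_as_of(history, date(2025, 1, 1))
```

The same person yields two different answers depending on the date supplied, which is why stating the intended point in time up front avoids a round of follow-up questions.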

This information allows the work to proceed efficiently while maintaining confidence in the results.