A data dictionary

What Is a Data Dictionary?

A data dictionary is the set of data names, attributes, rules, locations, types, characteristics, allowable entries and other elements that go into specifying the contents of a database. It is often referred to as the Metadata of the database — this is the detail about the data in the database. A data dictionary can also specify if a field is mandatory or optional. Mandatory fields lead to those “hard stops” you might encounter, for example, when filling out an internet form or other computerized data form. In the database, all items should be defined and this is completed via a Data Dictionary.

Why Use a Data Dictionary?

Names matter. What we call the data items, how each one is stored, and standards for completion, for example, are all included as details in a data dictionary. One system might call a patient’s admission date the “Admit date” while another names the field the “Admission date” and yet another calls it just “ADdate.” This is the detail about “the data in the database.” These names will matter when we go to transfer or merge data sets, for example. If the names are not consistent, then it takes a great deal more programming to make the systems merge and communicate.

You may have observed some “JOIN” statements in SQL. Those Join statements absolutely depended on being able to have the same names across different tables, to make the connections between tables and pull all the data as asked for/queried from the database.

Data dictionaries should be available to departments entering and using the data, not just to the Database Administrator. This will assure consistency from manager to manager, and department to department, as each element is used. Also, any data analyst who fulfills management requests for data pulls will need the detailed data dictionary and data descriptions.

No employee should be able to edit and change the data dictionary, though, except for authorized database and systems administrators. Other departments may be able to submit suggestions, and even be asked to participate in that process, but ultimately the data dictionary must reflect the actual structure of the system. The database and data dictionary should be updated periodically, in fact, but editing should remain under tight control.

What Does a Data Dictionary Look Like?

It is presented as a table — actually, a table with information about the data tables. Instead of the attribute/field names going across the top row though, each field name is listed down the first column on the left. The other details about each field, are specified in in the same row as the field/attribute’s name, within appropriately named columns.

Below is one example of a data dictionary.

Maternity Care Data Dictionary

Attribute Type Length Definition, format Mandatory field y/n
Patient_ID autonumber 4 Unique patient identifier Y
Patient Lname text 35 Legal last name, hyphens allowed Y
Patient Fname text 30 Legal first name Y
Middle name text 30 Middle name N
DOB Date/time 8 MMDDYYYY Y
Date pregnancy confirmed Date/time 8 Date pregnancy confirmed by self-test or visit N
Method pregnancy confirmed text 2 Self-test (ST) or by practitioner (PR) N
Date first maternal medical visit Date/time 8 Date first seen in clinic Y
Due date Estimate Date/time 9 Estimated due date for full term pregnancy Y
Marital status text 1 M=married S= single C= other committed/domestic partnership N
Para Number   Number prior pregnancies carried to term Y
Number living children Number   Children currently alive, who were born to this mother N
Maternal vitamins started text 8 Y/N N

Benefits of a Data Dictionary

The data dictionary is the first line of effort in assuring data quality and consistency. It improves our standardization and control over how items are defined in the database. When new items are added to the database, they will also be standardized across the hospital or healthcare system.

With improved documentation, better decision making can occur. Also, those who run data analysis and reports will know what is and is not included in each field. This is especially important when a data item is not self-explanatory.

As a department focused on assuring data integrity, the HIM department plays a key role in making sure that database definitions are well-maintained and accurate. Social, legal and political changes may result in changes to the database. For example, historically, it was not always possible for a woman to test and confirm her own pregnancy prior to seeing a medical practitioner. Now, confirming a pregnancy prior to a doctor visit is a common occurrence. The database needs to reflect current medical practice and guidelines.

Data Integrity

Overall, our data documentation and security actions will work together to assure data integrity. Key concepts of medical record integrity and accountability include its:

· Confidentiality

· Availability, including need-to-know basis

· Accountability

· Non-repudiation

Non-repudiation involves being able to prove that an action took place and is verifiable. Database system usage logs are good records of who accessed the data and when. The individual who completed a system entry, if that medical Order or action occurred under their authenticated signature, should not be able to deny that they were responsible for the entry.

Thus, we can see that all these elements work together to assure the usability and integrity of the information developed and maintained in the process of patient care.