Privacy and Health Research. "Data" Vocabulary

05/01/1997

Although definitions need not be belabored here, a few concepts and items of vocabulary are necessary.

Data is taken to mean discrete bits of information. As one dictionary has it: "Data are facts or figures from which conclusions may be inferred." For most research now, data are converted into numerical form for processing by computers.

Data-subjects are the people about whom data are collected.

Databases are collections of data, recorded in standardized fashion, ordered for reference or research purposes.

Database research, then, is research that analyzes data in such collections.

Information is data set within a context of meaning. Raw data (such as lists of numbers that stand for blood-enzyme concentrations, or units on a mental-depression scale) make no "sense" as facts unless the measurement method and descriptive scale are known. And before any scientific meaning can be inferred, the data must be tied with data on other characteristics of the data-subjects and the circumstances.
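The distinction can be illustrated with a minimal sketch, in which a bare number is stored alongside the context needed to read it. The class name, field names, unit, and assay label below are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    """A recorded value plus the context needed to interpret it (illustrative only)."""
    value: float
    quantity: Optional[str] = None   # what was measured
    unit: Optional[str] = None       # descriptive scale
    method: Optional[str] = None     # measurement method

    def is_interpretable(self) -> bool:
        # A value makes "sense" as a fact only when every piece of context is present.
        return all(x is not None for x in (self.quantity, self.unit, self.method))


# A raw number with no context, and the same number with hypothetical context attached.
raw = Measurement(42.0)
in_context = Measurement(42.0, "blood-enzyme concentration", "U/L", "hypothetical assay A")

print(raw.is_interpretable())         # False
print(in_context.is_interpretable())  # True
```

Even the second record is only a fact about one measurement; inferring scientific meaning would still require linking it with other characteristics of the data-subject and the circumstances.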

Personally identifiable data are data that are associated with real persons, or that can be associated with real persons by deduction from descriptors such as birthdate, physical characteristics, occupation, residential location, social identification number, or history. Synonyms are "personal data" and "individually identifiable data." For brevity, the descriptors that associate the data with a real person, such as the person's name, are often referred to simply as "identifiers."
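Association "by deduction" can be made concrete with a small sketch. It assumes a handful of invented records from which names have been removed, and counts how many records share each combination of descriptors; a record whose combination is unique could in principle be tied to one real person. The records, field names, and the function name singled_out are hypothetical.

```python
from collections import Counter

# Hypothetical records: names removed, descriptors retained.
records = [
    {"birthdate": "1950-03-14", "occupation": "teacher", "town": "Ames"},
    {"birthdate": "1950-03-14", "occupation": "teacher", "town": "Ames"},
    {"birthdate": "1962-11-02", "occupation": "pharmacist", "town": "Ames"},
    {"birthdate": "1971-07-30", "occupation": "teacher", "town": "Boone"},
]


def singled_out(records, descriptors):
    """Return the indexes of records whose combination of descriptors is unique."""
    keys = [tuple(r[d] for d in descriptors) for r in records]
    counts = Counter(keys)
    return [i for i, key in enumerate(keys) if counts[key] == 1]


print(singled_out(records, ["town"]))                             # [3]
print(singled_out(records, ["birthdate", "occupation", "town"]))  # [2, 3]
```

In this toy set a single descriptor singles out one record, while the combination singles out two; the more descriptors are retained, the more records tend to become identifiable.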

Processing or handling of data, in an ethical or legal sense, may refer to recording, storing, retrieving, duplicating, transferring, or destroying data: in effect, any action through which someone may become cognizant of, or move, or alter, data.(34) Verb lists of this kind are unavoidable, because the privacy of the data-subject can be affected by any such operation.


(34) The European Union Data Privacy Directive (95/46/EC) at Article 2(b) defines "processing" as being "any operation or set of operations which is performed upon personal data, whether or not by automatic means, such as collection, recording, organization, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, blocking, erasure or destruction."