Citation: Salimah H. Meghani, Eeeseung Byun, Jesse Chittams. Conducting Research with Vulnerable Populations: Cautions and Considerations in Interpreting Outliers in Disparities Research[J]. AIMS Public Health, 2014, 1(1): 25-32. doi: 10.3934/publichealth.2014.1.25
Addressing the needs of understudied and vulnerable populations first and foremost necessitates the correct application and interpretation of research designed to understand sources of disparities in healthcare or health systems outcomes. In this brief research report, we discuss some important concerns and considerations in handling “outliers” when conducting disparities-related research. To illustrate these concerns, we use data from our recently completed study that investigated sources of disparities in cancer pain outcomes between African Americans and Whites with cancer-related pain.
Undertreatment of pain in the United States has been characterized by the recent Institute of Medicine report as a public health “crisis,” with an accompanying fiscal burden of up to $635 billion annually [1]. Approximately 14 million Americans are living with a diagnosis of cancer, and an additional 1.6 million people are diagnosed with cancer each year [2]. While adequate pain management remains a challenge for all cancer patients, African Americans represent a unique group suffering disproportionately as a result of cancer and cancer pain. Compared to Whites, African Americans have higher rates of cancer and co-morbid conditions and are more likely to seek health care in advanced stages of their disease [3]. Despite this, consistent evidence suggests that African American patients have the worst cancer pain outcomes of all racial and ethnic groups, owing not only to inadequate prescription [4,5,6,7,8] but also to lack of adherence to analgesics even when they are prescribed [9,10]. The reasons for lack of adherence to analgesia, however, have not been fully investigated. To this end, we designed a choice-based conjoint analysis (CBC) experiment to understand the heuristics and salient concerns underlying analgesic treatment decision-making for African Americans and Whites with cancer-related pain.
CBC is a trade-off analysis technique to understand what people value and what drives them to choose one set of alternatives over another when faced with competing choices [11]. By asking individuals to make trade-offs between an important but limited set of attributes, a unique set of values (“part-worth utilities”) can be derived. These part-worth utilities model the underlying latent preference function such that a higher part-worth utility represents a higher value an individual assigns to that attribute [12].
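To make the link between part-worth utilities and attribute importance concrete, the sketch below uses a range-based convention that is commonly described in the conjoint literature: an attribute's importance is the range of its part-worths divided by the sum of ranges across attributes, expressed as a percentage. The source does not state the exact computation used in the study, so this is an assumption, and the attribute names, level names, and utility values are hypothetical.

```python
def attribute_importance(part_worths):
    """Convert one respondent's part-worth utilities into importance scores.

    `part_worths` maps attribute name -> {level name: part-worth utility}.
    Importance = range of an attribute's part-worths / sum of all ranges * 100.
    This range-based convention is an assumption, not the study's stated method.
    """
    ranges = {
        attribute: max(levels.values()) - min(levels.values())
        for attribute, levels in part_worths.items()
    }
    total_range = sum(ranges.values())
    return {attribute: 100 * r / total_range for attribute, r in ranges.items()}


# Hypothetical part-worths for a single respondent (illustrative values only).
respondent = {
    "pain_relief": {"low": -1.5, "high": 1.5},
    "side_effect_type": {"nausea": -0.5, "drowsiness": 0.5},
    "out_of_pocket_cost": {"lower": 0.5, "higher": -0.5},
}

importance = attribute_importance(respondent)
```

Under this convention the scores for one respondent sum to 100, which is consistent with the column totals of the importance scores reported for each group.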
In our study, the construct of interest was preferences for analgesic treatment for cancer pain. Based on pilot work, a randomized-design, computer-assisted CBC experiment was developed using 5 key attributes: type of analgesic; expected pain relief; type of side-effects; severity of side-effects; and out-of-pocket cost (see Meghani, Chittams, Hanlon, et al., 2013, for a detailed description of CBC methods) [13]. The relative importance scores (utilities) of each of these 5 attributes were measured on a continuous scale. The main finding was that, on average, African Americans and Whites employed different heuristics in pain treatment decision-making. African Americans were more likely than Whites to make cancer pain treatment decisions based on type of analgesic side-effects (see Table 1).
CBC Attribute | Whites (N = 139) | African Americans (N = 102) | p-values |
Pain Relief with Analgesics | 36.71‡ | 26.83‡ | < 0.001 |
Type of Analgesic Side-effects | 19.29‡ | 28.72‡ | < 0.001 |
Severity of Side-effects | 18.55‡ | 16.81‡ | 0.225 |
Type of Analgesic | 13.52‡ | 16.66‡ | 0.176 |
Out of Pocket Cost | 11.93‡ | 10.98‡ | 0.355 |
CBC = Choice-based Conjoint Analysis.
Pertinent to the present report, we evaluated the CBC utilities statistically to understand whether there were any outliers or systematic patterns in the distribution of these salient variables by racial subgroup. An outlier is an observation that lies far from the rest of the data, usually at least 3 standard deviations from the mean on the standardized scale. Outliers and influential points can be caused by random variations, measurement errors, or “true heterogeneity” in a phenomenon [14]. For those conducting disparities-related research, it is critical to investigate the “true heterogeneity” hypothesis by examining any systematic patterns within the distribution of extreme values. This has implications for correct statistical handling of outliers but, more importantly, for appropriate interpretation of the subgroup data and subsequent intervention/program development.
Participants were recruited from two outpatient oncology clinics of a tertiary academic medical center in Philadelphia. Patients were included in the study if they were self-identified African Americans or Whites, were at least 18 years of age, and had a diagnosis of solid tumor or myeloma, and cancer-related pain. All patients provided informed consent. The study was approved by the institutional review board of the University of Pennsylvania.
The CBC utilities were estimated using Sawtooth Software CBC/HB system [15]. To understand systematic differences in the distribution of outliers between the two groups, we conducted a test for influential points labeling them by respondent’s race/ethnicity and compared these values using histograms and box plots as well as checking highest or lowest values. The assessment was conducted in SPSS for Windows, version 20.0 (IBM Corp., NY, USA).
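The idea of labeling extreme values by group can be sketched in a few lines. The example below is an illustration only, not the SPSS procedure used in the study: it applies box-plot (Tukey) fences to the pooled values and counts how many flagged observations fall in each group. The group labels, the data, and the fence multiplier `k = 1.5` are hypothetical choices.

```python
from collections import Counter
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) box-plot fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def extremes_by_group(records, k=1.5):
    """Count observations outside the pooled fences, tallied by group label.

    `records` is a list of (group_label, value) pairs.
    """
    lower, upper = tukey_fences([value for _, value in records], k)
    return Counter(
        group for group, value in records if value < lower or value > upper
    )


# Hypothetical utility scores for two groups (illustrative values only).
records = [
    ("group_a", 18), ("group_a", 20), ("group_a", 21),
    ("group_a", 19), ("group_a", 22), ("group_a", 20),
    ("group_b", 19), ("group_b", 21), ("group_b", 20),
    ("group_b", 45), ("group_b", 50), ("group_b", 22),
]

counts = extremes_by_group(records)
```

A tally that is heavily concentrated in one group is the kind of systematic pattern that warrants a closer look before any outliers are removed or transformed.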
We define an outlier in a set of data to be an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data. Statistical calculations can answer this question: If the values were all sampled from a Gaussian (“normal”) distribution, what is the chance that one value will be far away from the rest? Thus, a useful way to quantify an extreme value is by the number of standard deviations that a value is from the mean. This statistic applied to the most extreme value in a sample is called the Extreme Studentized Deviate (or ESD) and is defined as follows: $\mathrm{ESD} = \max_{i=1,\ldots,n} |Y_i - \bar{y}| / S$, where $\bar{y}$ is the sample mean and $S$ is the sample standard deviation [16]. The appropriate critical values depend on the sampling distribution of the ESD statistic for samples of size n from a normal distribution. A more general rule of thumb is to consider any observation greater than 3 standard deviations from the mean as a potential outlier.
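The ESD statistic and the 3-standard-deviation rule of thumb described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the data are hypothetical, and the sketch does not look up ESD critical values, which depend on the sample size.

```python
from statistics import mean, stdev


def extreme_studentized_deviate(values):
    """Return max_i |Y_i - mean| / S, using the sample standard deviation S."""
    center = mean(values)
    spread = stdev(values)
    return max(abs(v - center) for v in values) / spread


def flag_beyond_sd(values, threshold=3.0):
    """Return observations more than `threshold` sample SDs from the mean."""
    center = mean(values)
    spread = stdev(values)
    return [v for v in values if abs(v - center) > threshold * spread]


# Hypothetical scores with one extreme value (illustrative values only).
scores = [18, 20, 21, 19, 22, 20, 19, 21, 20, 18, 22, 20, 19, 21, 60]

esd = extreme_studentized_deviate(scores)
potential_outliers = flag_beyond_sd(scores)
```

Because the extreme value itself inflates both the mean and the standard deviation, the rule of thumb is a screening device rather than a formal test.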
The sample size was 241 (African Americans = 102; Whites = 139). There was no difference in age between African Americans and Whites (p = 0.194). However, compared to Whites, African Americans were more likely to be female (p = 0.019), more likely to belong to a lower income bracket (p < 0.001), and less likely to carry private health insurance (p < 0.001; see Table 2).
Variable | Total (N = 241) | African Americans (N = 102) | Whites (N = 139) | p-values† |
Mean (SD) | | | | |
Age | 53.7 (11.0) | 52.7 (10.1) | 54.5 (11.6) | 0.194 |
Frequency (%) | | | | |
Gender | | | | 0.019 |
Male | 111 (46) | 38 (37) | 73 (53) | |
Female | 130 (54) | 64 (63) | 66 (47) | |
Marital Status | | | | < 0.001 |
Married | 133 (55) | 33 (32) | 100 (72) | |
Separated/Divorced/Widowed | 62 (26) | 42 (41) | 20 (14) | |
Never Married | 46 (19) | 27 (27) | 19 (14) | |
Education | | | | 0.011 |
Elementary | 3 (1) | 2 (2) | 1 (2) | |
High School | 84 (35) | 42 (41) | 42 (42) | |
College/Trade School | 117 (49) | 51 (50) | 66 (51) | |
More Than College | 37 (15) | 7 (7) | 30 (7) | |
Income | | | | < 0.001 |
< 30,000 | 85 (35) | 57 (56) | 28 (20) | |
30–50,000 | 44 (18) | 26 (25) | 18 (13) | |
50–70,000 | 41 (17) | 13 (13) | 28 (20) | |
70–90,000 | 25 (11) | 3 (3) | 22 (16) | |
Health Insurance | | | | < 0.001 |
Private | 123 (51) | 30 (29) | 93 (67) | |
Medicaid | 33 (14) | 28 (27) | 5 (4) | |
Medicare | 50 (21) | 25 (25) | 25 (18) | |
Other | 34 (14) | 19 (19) | 15 (10) | |
†p-values are based on t-tests for continuous variables and chi-squared tests for categorical variables.
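As a worked check of how a categorical p-value in Table 2 arises, the sketch below computes a Pearson chi-squared test on the gender counts reported in the table. It assumes no continuity correction, which the source does not specify; the function name is hypothetical, and the closed-form p-value shown applies only to a 2 × 2 table (1 degree of freedom).

```python
from math import erfc, sqrt


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2 x 2 table.

    No continuity correction is applied (an assumption). With 1 degree of
    freedom, the upper-tail probability equals erfc(sqrt(statistic / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic, erfc(sqrt(statistic / 2))


# Gender counts from Table 2: rows are African Americans and Whites,
# columns are male and female.
gender_by_group = [[38, 64], [73, 66]]

statistic, p_value = chi_squared_2x2(gender_by_group)
```

With these counts the statistic is about 5.5 and the p-value rounds to 0.019, matching the value reported for gender.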
CBC utilities had a very clear pattern of extreme values by racial subgroups. For instance, when compared to Whites, African Americans were disproportionately more likely to have extreme values for the utility of “side-effects” (see Figure 1). The systematic patterns of extreme values are consistent with the earlier findings of poor clinical management of pain and side-effects in African Americans [5,17,18]. We observed this pattern in other variables (e.g., pain levels and analgesic barriers) that pertained to the phenomenon of interest. These findings raise the need for additional conceptual and methodological considerations in handling outliers in disparities research.
As noted above, an outlier is an observation that lies far from the rest of the data, usually at least 3 standard deviations from the mean on the standardized scale. When outliers on the higher end of the distribution remain in the estimated models, they can inflate the mean relative to models without outliers, giving a poor estimate of the central tendency of the population. A histogram of the data may reveal the appearance of a lognormal (right-skewed) distribution (see Figure 2). The variance of a lognormal distribution is a function of the expected mean [19]. For instance, if a subgroup (such as African Americans in our study) has a significantly larger expected mean for a particular lognormal outcome variable, then the researcher may expect the subgroup to have more variability around its mean and more outliers.
It is critical to distinguish whether these outliers result from measurement errors, reflect random variations, or represent true heterogeneity in the phenomenon. If outliers are accurate observations that reflect true heterogeneity in the phenomenon, they could be interesting outliers. Interesting outliers are observations that have been identified as outlying but that do not result from inaccuracies, such as errors in observation or coding [20]. These considerations are particularly salient in health disparities research, where extreme values may actually be representative of a clinical reality, such as unequal treatment or a disproportionate burden of symptoms in certain subgroups. Below, we suggest ways to identify and handle outliers in disparities research.
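The dependence of the lognormal variance on the mean can be written out explicitly. The notation below is an assumption introduced for illustration: $X$ is the outcome, and $\mu$ and $\sigma^2$ are the mean and variance of $\ln X$.

```latex
% Assumption: \ln X \sim \mathcal{N}(\mu, \sigma^2), with \mu and \sigma on the log scale.
\begin{align}
\mathbb{E}[X] &= \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right), \\
\operatorname{Var}(X) &= \left(e^{\sigma^2} - 1\right)\exp\!\left(2\mu + \sigma^2\right)
                       = \left(e^{\sigma^2} - 1\right)\,\mathbb{E}[X]^2 .
\end{align}
```

For a fixed log-scale variance $\sigma^2$, the standard deviation of $X$ is therefore proportional to its mean, so a subgroup with a larger mean will also show a wider spread on the original scale.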
From a statistical perspective, a careful examination of the distribution of the outcome variable of interest could help reveal important outliers related to racial disparities. Effective visual methods include box-and-whisker plots or stem-and-leaf plots with the race of the extreme observations displayed. Plotting residuals after estimating a model may also identify residuals that appear out of range; the observations responsible for these large residuals are likely to be outliers [20]. When outliers are detected, it is important to make sure that they are not coding errors (such as a missing data code of 99). If the outliers are not coding errors, estimating the model both with and without the outlying cases can generally be considered. Explaining why these cases lie far from the rest of the population of interest, rather than removing them from the model, may reveal important findings in disparities research.
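The two checks described above, screening for missing-data codes and comparing estimates with and without the outlying cases, can be sketched as follows. The example is a simplified illustration that compares means rather than a fitted model; the sentinel code 99 follows the example in the text, while the function names, data, and 3-SD threshold are hypothetical.

```python
from statistics import mean, stdev

MISSING_CODES = {99}  # hypothetical missing-data code, as in the text's example


def drop_missing_codes(values, codes=MISSING_CODES):
    """Remove sentinel codes so they are not mistaken for extreme observations."""
    return [v for v in values if v not in codes]


def compare_with_without(values, threshold=3.0):
    """Report the mean with all cases and with flagged cases set aside."""
    center = mean(values)
    spread = stdev(values)
    flagged = [v for v in values if abs(v - center) > threshold * spread]
    kept = [v for v in values if abs(v - center) <= threshold * spread]
    return {"mean_all": center, "mean_without": mean(kept), "flagged": flagged}


# Hypothetical raw scores containing two missing-data codes and one extreme value.
raw = [20, 18, 99, 21, 19, 22, 20, 19, 21, 20, 18, 22, 20, 19, 21, 60, 99]

clean = drop_missing_codes(raw)
result = compare_with_without(clean)
```

Reporting both estimates side by side keeps the flagged cases visible, so they can be examined and explained rather than silently discarded.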
There may be a mediation effect of extreme values affecting the relationship between race and the outcome. The distribution issue should be addressed before considering the mediation theory. Initially, researchers may examine estimates of central tendency (the median, geometric mean, and arithmetic mean), normality tests with and without outliers, or even a t-test between racial/ethnic groups. With many common inferential statistical methods, the focus is on measuring the central tendency, the area where most of the data are centered. A normal distribution assumption is required for a t-test when comparing two groups. When this assumption is not met, the impact of outliers and influential data can be diminished by a log transformation of the outcome variable or by a non-parametric method (e.g., the Wilcoxon rank sum test). Since the arithmetic mean is influenced by outliers, it is often replaced by the median or geometric mean when the data are skewed. Robust approaches, such as generalized estimating equation methods focused on estimating mean population effects, can also be considered to handle outliers [20].
On the other hand, researchers may feel that these outliers represent an important sub-population deserving careful examination to determine whether something explains their poor outcome that could potentially be addressed with an intervention. The researchers may actually choose to conduct a case study on these outliers. Thus, removing or log-transforming clinically important extreme values, or applying robust approaches, may represent a missed opportunity to understand a potentially targetable area of intervention.
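The sensitivity of the arithmetic mean to extreme values, relative to the median and geometric mean, can be seen in a small example. The data below are hypothetical and right-skewed; the sketch uses only the Python standard library and does not implement the rank-based or generalized estimating equation approaches mentioned above.

```python
from math import log
from statistics import geometric_mean, mean, median


def central_tendency(values):
    """Return the arithmetic mean, median, and geometric mean of positive values."""
    return {
        "arithmetic_mean": mean(values),
        "median": median(values),
        "geometric_mean": geometric_mean(values),
    }


# Hypothetical right-skewed outcome with one extreme value (illustrative only).
skewed = [1, 2, 2, 3, 3, 4, 5, 8, 40]

summary = central_tendency(skewed)

# A log transformation pulls the extreme value toward the bulk of the data;
# the mean of the logged values is the log of the geometric mean.
log_values = [log(v) for v in skewed]
```

Here the arithmetic mean is pulled well above the median and the geometric mean by the single extreme value. For a rank-based group comparison, a library routine such as `scipy.stats.mannwhitneyu` could be used instead of a t-test.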
This study was supported by the ARRA Challenge Grant to Dr. Salimah H. Meghani from the National Institutes of Health/National Institute of Nursing Research (NIH RC1NR011591). The corresponding author, Dr. Eeeseung Byun, is currently supported by a training grant from the National Institutes of Health/National Institute of Nursing Research (T32 NR007088).
The authors have no conflicts of interest to disclose.
[1] | Institute of Medicine (2011) Relieving Pain in America: A Blueprint for Transforming Prevention, Care, Education, and Research. Washington, DC: The National Academies Press. |
[2] | National Research Council (2013) Delivering High-Quality Cancer Care: Charting a New Course for a System in Crisis. Washington, DC: The National Academies Press. |
[3] | American Cancer Society (2011) Cancer Facts & Figures for African Americans 2011-2012. Atlanta: American Cancer Society Inc. |
[4] | Smedley BD, Stith AY, Nelson AR (2002) Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington, DC: The National Academies Press. |
[5] | Meghani SH, Byun E, Gallagher RM (2012) Time to take stock: a meta-analysis and systematic review of analgesic treatment disparities for pain in the United States. Pain Med 13: 150-174. doi: 10.1111/j.1526-4637.2011.01310.x |
[6] | Anderson KO, Green CR, Payne R (2009) Racial and ethnic disparities in pain: causes and consequences of unequal care. J Pain 10: 1187-1204. doi: 10.1016/j.jpain.2009.10.002 |
[7] | Cintron A, Morrison RS (2006) Pain and ethnicity in the United States: A systematic review. J Palliat Med 9: 1454-1473. doi: 10.1089/jpm.2006.9.1454 |
[8] | Meghani SH, Polomano RC, Tait RC, et al. (2012) Advancing a national agenda to eliminate disparities in pain care: directions for health policy, education, practice, and research. Pain Med 13: 5-28. |
[9] | Meghani SH, Hanlon A, Bubanj J, et al. (2013) Do self-reported analgesic barriers translate into objective analgesic adherence for cancer pain? J Pain 14: S38. |
[10] | Rhee YO, Kim E, Kim B (2012) Assessment of pain and analgesic use in African American cancer patients: factors related to adherence to analgesics. J Immigr Minor Health 14: 1045-1051. doi: 10.1007/s10903-012-9582-x |
[11] | Green P, Rao V (1971) Conjoint measurement for quantifying judgmental data. J Mark Res 8: 355-363. doi: 10.2307/3149575 |
[12] | Orme BK (2006) Getting started with conjoint analysis: Strategies for product design and pricing research. Madison: Research Publishers LLC. |
[13] | Meghani SH, Chittams J, Hanlon A, et al. (2013) Measuring preferences for analgesic treatment for cancer pain: How do African Americans and Whites perform on choice-based conjoint analysis experiments. BMC Med Inform Decis Mak 12: 118. |
[14] | Barnett V, Lewis T (1994) Outliers in Statistical Data, 3rd ed., Chichester: John Wiley & Sons Ltd. |
[15] | Sawtooth Software, Inc (2009) The CBC/HB System for Hierarchical Bayes Estimation Version 5.0 Technical Paper. Sequim: Sawtooth Software, Inc. |
[16] | Rosner B (2006) Fundamentals of Biostatistics, 6th ed., Belmont: Thomson Brooks/Cole, 325. |
[17] | Anderson KO, Green CR, Payne R (2009) Racial and ethnic disparities in pain: causes and consequences of unequal care. J Pain 10: 1187-1204. doi: 10.1016/j.jpain.2009.10.002 |
[18] | Cintron A, Morrison RS (2006) Pain and ethnicity in the United States: A systematic review. J Palliat Med 9: 1454-1473. doi: 10.1089/jpm.2006.9.1454 |
[19] | Krishnan V (2006) Probability and Random Processes. Hoboken: John Wiley & Sons, Inc. |
[20] | Aguinis H, Gottfredson RK, Joo H (2013) Best-Practice Recommendations for Defining, Identifying, and Handling Outliers. Organ Res Meth 16: 270-301. doi: 10.1177/1094428112470848 |