Cancer Among Hispanics In New Jersey

Technical Notes

The New Jersey State Cancer Registry

The objectives of the New Jersey State Cancer Registry (NJSCR) are to:

The New Jersey State Cancer Registry is a population-based incidence registry that serves the entire State of New Jersey, with a population of approximately eight million people. The NJSCR was established by legislation (NJSA 26:2-104 et seq.) and includes all cases of cancer diagnosed in New Jersey residents since October 1, 1978. New Jersey regulations (NJAC 8:57A) require the reporting of all newly diagnosed cancer cases to the NJSCR within three months of hospital discharge or six months of diagnosis, whichever is sooner. Reports are filed by hospitals, diagnosing physicians, dentists, and independent clinical laboratories. Every hospital in New Jersey now reports cancer cases electronically. In addition, reporting agreements are maintained with New York, Pennsylvania, Delaware, Florida, Maryland, and other states so that New Jersey residents diagnosed with cancer outside New Jersey can be identified.

The NJSCR has collected data on Hispanic origin since 1979. The definition of this variable has been revised several times, most recently in 1987. The current Hispanic origin data element has nine categories: non-Spanish/non-Hispanic (0), Mexican (1), Puerto Rican (2), Cuban (3), South or Central American (4), Other Spanish, includes European (5), Spanish, not otherwise specified (6), Spanish surname only (7), and Unknown whether Spanish or not (9).
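The nine codes listed above can be written as a simple lookup table. The following is only an illustrative sketch: the names HISPANIC_ORIGIN_LABELS and describe_hispanic_origin are hypothetical and are not part of any NJSCR software.

```python
# Hispanic origin codes as listed in the text (there is no code 8).
# The dictionary and function names are illustrative assumptions.
HISPANIC_ORIGIN_LABELS = {
    0: "Non-Spanish/non-Hispanic",
    1: "Mexican",
    2: "Puerto Rican",
    3: "Cuban",
    4: "South or Central American",
    5: "Other Spanish (includes European)",
    6: "Spanish, not otherwise specified",
    7: "Spanish surname only",
    9: "Unknown whether Spanish or not",
}


def describe_hispanic_origin(code: int) -> str:
    """Return the label for a Hispanic origin code, rejecting undefined codes."""
    try:
        return HISPANIC_ORIGIN_LABELS[code]
    except KeyError:
        raise ValueError(f"{code!r} is not a defined Hispanic origin code") from None
```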

All primary invasive and in situ neoplasms, except certain carcinomas of the skin and cervical cancer in situ diagnosed after 1994, are reportable to the NJSCR. The information collected by the NJSCR includes basic patient identification, demographic characteristics of the patient, medical information on each cancer diagnosis (such as the anatomic site, histologic type, and summary stage of disease), and vital status (alive or deceased), determined annually. For deceased cases, the underlying cause of death is also included. The primary site, behavior, grade, and histology of each cancer are coded according to the International Classification of Diseases for Oncology, 2nd edition.27 The NJSCR follows the data standards set by the North American Association of Central Cancer Registries (NAACCR), including the use of the Surveillance, Epidemiology, and End Results (SEER) multiple primary rules.28-32

The NJSCR is a member of NAACCR, an organization that sets standards for cancer registries, facilitates data exchange, and publishes cancer data. The NJSCR has also participated in the National Program of Cancer Registries, sponsored by the Centers for Disease Control and Prevention, since the program began in 1994. In 2000, the NJSCR attained the NAACCR Gold Medal for high quality data for the third consecutive year.

Description of Algorithm for Designating Hispanic Ethnicity

The NJSCR used data on birthplace, marital status, race and surname to augment the number of reported cases and decedents with Hispanic ethnicity in the registry during 1990-1996. These years were selected because, beginning in 1990, reliable estimates of the Hispanic population by gender and age became available. At the time work on this report began, the most recent complete year of data available from the NJSCR was 1996.

The method used to assign Hispanic ethnicity to cases was adapted from algorithms developed by the Illinois State Cancer Registry (ISCR)1,2 and by the NJSCR. The ISCR used the 1990 Census surname list to classify surnames according to the percent of persons with that surname in the U.S. Census who identified themselves as Hispanic.

The ISCR evaluation of their algorithm concluded that 1) surnames and their relationships to Hispanic status presented in the 1990 Census surname list were very similar to those observed for Illinois cancer patients and decedents during years 1986-1996, 2) Hispanic non-U.S. birthplaces were demonstrated to be valid indirect identifiers of Hispanic status, and 3) exclusion of patients and decedents based on race, birthplace and/or surname status from indirect identification was shown to increase positive predictive values for Hispanic status.

The ISCR used the 1990 U.S. Census surname list to assign Hispanic ethnicity. The Census list includes 25,276 Spanish surnames, which were classified into 28 categories based upon the proportion of householders who identified themselves as Hispanic in the 1990 census. These categories were then collapsed into six broad categories: "heavily Hispanic", "generally Hispanic", "moderately Hispanic", "occasionally Hispanic", "rarely Hispanic", and "no match." These categories are defined as follows:

Spanish Surname Classification    Proportion of Householders Who Identified Themselves as Hispanic
Heavily Hispanic                  > 75%
Generally Hispanic                51% - 75%
Moderately Hispanic               26% - 50%
Occasionally Hispanic             6% - 25%
Rarely Hispanic                   <= 5%
No match                          No matching surname on the census list
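The collapsed categories above amount to a threshold lookup on the percentage of householders with a given surname who identified as Hispanic. The sketch below is illustrative only. The function name classify_surname is hypothetical, and because the table gives whole-percent ranges, the handling of fractional values (for example 50.5%) is an assumption: each cutoff is treated as "greater than 75, 50, 25, or 5".

```python
from typing import Optional


def classify_surname(pct_hispanic: Optional[float]) -> str:
    """Map a surname's percent-Hispanic value to one of the six broad categories.

    pct_hispanic is the percentage (0-100) of householders with the surname
    who identified as Hispanic, or None if the surname is not on the list.
    """
    if pct_hispanic is None:
        return "no match"
    if not 0 <= pct_hispanic <= 100:
        raise ValueError("pct_hispanic must be between 0 and 100")
    if pct_hispanic > 75:
        return "heavily Hispanic"
    if pct_hispanic > 50:  # table range: 51% - 75%
        return "generally Hispanic"
    if pct_hispanic > 25:  # table range: 26% - 50%
        return "moderately Hispanic"
    if pct_hispanic > 5:   # table range: 6% - 25%
        return "occasionally Hispanic"
    return "rarely Hispanic"  # table range: <= 5%
```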

Birthplace also plays a role in assigning Hispanic ethnicity. There were two groups of birthplaces pertaining to Hispanic ethnicity: (a) birthplaces associated with a high probability of Hispanic ethnicity, and (b) birthplaces associated with a high prevalence of Spanish surnames but low probability of Hispanic ethnicity. The groups are as follows:

High Probability of Hispanic Ethnicity:
Puerto Rico; Mexico; Cuba; Central America (Guatemala, Belize, Honduras, El Salvador, Nicaragua, Costa Rica, Panama); South America (Colombia, Venezuela, Ecuador, Peru, Bolivia, Chile, Argentina, Paraguay, and Uruguay); and Spain, including the Canary Islands, Balearic Islands, and Andorra.

High Prevalence of Spanish Surnames but Low Probability of Hispanic Ethnicity:
Atlantic/Caribbean Area (except Cuba and Puerto Rico); Panama Canal Zone; Brazil; Guyana; Surinam; Hawaii; French Guiana; Europe (except Spain), including Portugal; and Asia, including the Philippines.

The procedures of the algorithm are summarized as follows.

  1. If the information received from the cancer reporting source has already identified the patient as Hispanic, then the case retains the classification of Hispanic ethnicity.
  2. If individuals have heavily Hispanic surnames (maiden names for ever-married women, last names for males, and last names for never-married women or ever-married women without maiden names), they are assigned Hispanic ethnicity with the following exceptions: 1) those who were born in a birthplace associated with high Spanish surname prevalence but low probability of Hispanic ethnicity are non-Hispanic, and 2) those who were American Indian, Filipino or Hawaiian are non-Hispanic.
  3. The algorithm assigns those whose birthplace is associated with a high probability of Hispanic ethnicity as Hispanic, except for cases whose surname appears in the rarely Hispanic or no match census Spanish surname categories.

As a result of using the above algorithm, the NJSCR was able to classify an additional 33% of cases as Hispanic in the incidence data and an additional 29% in the mortality data for the period 1990-1996. This enhancement is consistent with that reported by the ISCR.
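The three steps of the algorithm can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the NJSCR or ISCR implementation. The Case record, the field names, and the function assign_hispanic are hypothetical, the birthplace sets are abbreviated examples drawn from the two groups listed earlier, and the surname category is assumed to have been determined already from the appropriate name (maiden name for ever-married women where available, otherwise last name).

```python
from dataclasses import dataclass
from typing import Optional

# Abbreviated, illustrative subsets of the two birthplace groups in the text.
HIGH_PROBABILITY_BIRTHPLACES = {"Puerto Rico", "Mexico", "Cuba", "Colombia", "Spain"}
LOW_PROBABILITY_BIRTHPLACES = {"Brazil", "Guyana", "Portugal", "Philippines", "Hawaii"}
EXCLUDED_RACES = {"American Indian", "Filipino", "Hawaiian"}


@dataclass
class Case:
    reported_hispanic: bool    # reporting source already identified the patient as Hispanic
    surname_category: str      # one of the six broad surname categories
    birthplace: Optional[str]  # None if unknown
    race: Optional[str]        # None if unknown


def assign_hispanic(case: Case) -> bool:
    """Return True if the case is classified as Hispanic under the three steps."""
    # Step 1: a reported Hispanic classification is retained.
    if case.reported_hispanic:
        return True
    # Step 2: heavily Hispanic surname, with birthplace and race exceptions.
    if case.surname_category == "heavily Hispanic":
        if case.birthplace in LOW_PROBABILITY_BIRTHPLACES:
            return False
        if case.race in EXCLUDED_RACES:
            return False
        return True
    # Step 3: high-probability birthplace, unless the surname is rarely
    # Hispanic or has no match on the census list.
    if case.birthplace in HIGH_PROBABILITY_BIRTHPLACES:
        return case.surname_category not in {"rarely Hispanic", "no match"}
    return False
```

In this sketch, a case excluded under one of the step 2 exceptions is treated as non-Hispanic without being considered under step 3, which follows the wording of the exceptions above.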

Data Sources

The cancer incidence data contained in this report are from the New Jersey State Cancer Registry (NJSCR), New Jersey Department of Health and Senior Services. For this report, only invasive incident cancer cases are included (except for bladder cancer and for figures and tables involving stage at diagnosis); in situ cases are excluded. The reason for excluding in situ cases is that the cancer incidence data published by the federal government for the U.S. and for other cancer registries either exclude in situ cases or report them separately from invasive cases. Following the SEER multiple primary rules, individuals could be counted more than once if they were diagnosed with two or more primary cancers.

The mortality data in this report originate in Vital Statistics, NJ Department of Health and Senior Services, and are subsequently processed by the Center for Health Statistics, NJ Department of Health and Senior Services.

Annual population estimates for New Jersey for the years 1990 through 1996, used to calculate the incidence rates, are from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program.

Data Quality

In 1998, 1999, and 2000, NAACCR awarded the NJSCR its Gold Standard, the highest standard possible, for the quality of the 1995, 1996, and 1997 data, respectively. The measures used to judge the quality of data are nationally based criteria assessing completeness, timeliness, quality, and accuracy. These same quality indicators, applied to earlier NJSCR data, have also demonstrated a high degree of accuracy and reliability of the data presented in this report.

While our estimates of completeness are very high, some cases of cancer among New Jersey residents who were diagnosed and/or treated in other states may not yet have been reported to us by other state registries. This fact should be considered in interpreting the data for the more recent years. However, these relatively few cases would not significantly affect the cancer rates in these years, or alter the overall trends presented in this report.

Calculation of Rates

All the incidence rates were age-adjusted using the 1970 U.S. Standard Population. This allows comparisons among the rates by year, race, and geographic area. An explanation of why and how the incidence rates were age-adjusted follows.

Cancer occurs at different rates in different age groups, making age a very important risk factor for cancer. Therefore, incidence rates are frequently calculated separately for specific age groups. These rates are referred to as age-specific rates. The age-specific rate for a time period of length t is calculated as follows:

$$ r_a = \frac{n_a}{P_a \times t} $$

where

$r_a$ = the age-specific rate for age-group a,
$n_a$ = the number of events (cancer diagnoses or deaths, for example) in age-group a during the time period,
$t$ = the length of time in years, and
$P_a$ = average size of the population in age-group a during time t (mid-year population or average of the mid-year populations).

Multiplying $r_a$ by 100,000 expresses the rate as the number of cases per 100,000 persons.
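As an illustration of the age-specific rate calculation, the following sketch divides the number of events by the person-years at risk (average population times years) and scales the result to a rate per 100,000. The function name and the example figures are hypothetical and are not taken from NJSCR data.

```python
def age_specific_rate(n_events: float, avg_population: float, years: float,
                      per: float = 100_000) -> float:
    """Age-specific rate: events / (average population * years), scaled by `per`."""
    if avg_population <= 0 or years <= 0:
        raise ValueError("avg_population and years must be positive")
    return n_events / (avg_population * years) * per


# Hypothetical example: 42 diagnoses over 7 years in an age group whose
# average mid-year population was 120,000.
example_rate = age_specific_rate(n_events=42, avg_population=120_000, years=7)
```

With these made-up figures the rate works out to about 5 cases per 100,000 persons per year.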

When comparing rates across different population subgroups, e.g. by race, or across different years, it is important to account for differences in age distributions. We calculate an age-adjusted rate using a weighted average of the age-specific rates. This method of age adjustment is known as direct age-standardization. The age-adjusted rate is obtained by using the age distribution of a standard population as the weights:

$$ R = \frac{\sum_a r_a \times \mathrm{Std.P}_a}{\sum_a \mathrm{Std.P}_a} $$

where

$R$ = the age-adjusted rate,
$r_a$ = the age-specific rate for age-group a, and
$\mathrm{Std.P}_a$ = the number of people in age group a of the standard population.

Multiplying the age-adjusted rate by 100,000 expresses it as the number of cases per 100,000 persons.
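Direct age-standardization as described above can be sketched as a weighted average in which the standard population counts serve as the weights. The function name, the age-group labels, and all figures below are hypothetical. In particular, the example weights are made-up numbers and are not the 1970 U.S. Standard Population.

```python
from typing import Mapping


def age_adjusted_rate(age_specific_rates: Mapping[str, float],
                      standard_population: Mapping[str, float],
                      per: float = 100_000) -> float:
    """Directly standardized rate: sum(r_a * Std.P_a) / sum(Std.P_a), scaled by `per`.

    age_specific_rates holds unscaled rates (events per person-year) by age group;
    standard_population holds the standard population count for the same groups.
    """
    if set(age_specific_rates) != set(standard_population):
        raise ValueError("rates and standard population must cover the same age groups")
    total_standard = sum(standard_population.values())
    if total_standard <= 0:
        raise ValueError("standard population must be positive")
    weighted_sum = sum(rate * standard_population[group]
                       for group, rate in age_specific_rates.items())
    return weighted_sum / total_standard * per


# Hypothetical three-group example (not real rates or real standard weights).
example_rates = {"0-39": 0.0002, "40-64": 0.0030, "65+": 0.0150}
example_standard = {"0-39": 600_000, "40-64": 300_000, "65+": 100_000}
example_adjusted = age_adjusted_rate(example_rates, example_standard)
```

Because the same weights are applied to every population being compared, differences between two age-adjusted rates reflect differences in the age-specific rates rather than in the age structure of the populations.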

The standard population used for age adjustment throughout this report is the 1970 U.S. Standard Population. This is the traditional standard population used in much of the published cancer incidence data.

Also, rates based on low counts are very unstable and may fluctuate greatly from year to year due to chance and other factors. For this reason, rates based on low counts should be interpreted cautiously.
