Air toxics and the risk of autism spectrum disorder: the results of a population based case–control study in southwestern Pennsylvania

This study was approved by the University of Pittsburgh Institutional Review Board
(IRB number PRO10010240).

Ascertainment of cases

Cases of ASD for this study were children born between January 1, 2005 and December
31, 2009 in Allegheny, Armstrong, Beaver, Butler, Washington, or Westmoreland County
in southwestern Pennsylvania and who were currently residing in the six-county area.
Our goal was to enroll approximately half of the prevalent cases among this birth
cohort. Based on 23,399 births in 2007 in the six-county area [22] and a prevalence of ASD of 6 per 1,000 (one in 166), we estimated that 140–141 children
per year would be diagnosed with ASD in the study area. We anticipated enrolling half
(70–71) of these children each year to result in 250 cases during the 3½-year period
of the study.
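The enrollment target above follows from simple arithmetic, which can be reproduced as a rough check. The variable names below are illustrative only; the numbers are the ones quoted in the text.

```python
# Figures quoted in the text; variable names are illustrative.
births_per_year = 23_399      # 2007 births in the six-county area
prevalence = 6 / 1000         # assumed ASD prevalence (6 per 1,000)
enrolled_fraction = 0.5       # goal: about half of the prevalent cases
study_years = 3.5             # length of the enrollment period

expected_cases_per_year = births_per_year * prevalence
expected_enrolled_per_year = expected_cases_per_year * enrolled_fraction
expected_enrolled_total = expected_enrolled_per_year * study_years

print(round(expected_cases_per_year, 1))     # 140.4
print(round(expected_enrolled_per_year, 1))  # 70.2
print(round(expected_enrolled_total))        # 246
```

The computed total of about 246 is consistent with the rounded target of 250 cases.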

There is no autism registry in Pennsylvania, and therefore no centralized agency that
could be accessed for permission to contact parents of children with ASD for the purposes
of conducting a study. Our investigation used an extensive outreach campaign to recruit
ASD cases from a combination of 1) ASD specialty diagnostic and treatment centers,
2) private pediatric and psychiatry practices, 3) school-based special needs programs
(starting at age 5), and 4) autism support groups. Per our IRB guidelines, we were
not allowed to directly contact parents of children with ASD. Therefore, we provided
informational packets to these agencies and organizations with a letter, contact sheet
and pre-addressed envelope to be returned to our office, and these agencies mailed
the information to families of children with ASD. When a contact sheet was returned, we were permitted
to contact the mother to describe the study and request consent to participate.

A case of ASD was defined as any child 1) who scored a 15 or above on the Social Communication
Questionnaire (SCQ), a positive screen for the presence of autistic features, and
2) for whom there was written documentation, including ADOS or other test results,
of a diagnosis of an ASD from a child psychologist or psychiatrist. Cases were not
included in the study if the child was adopted, parents were not English speaking,
or a parent was not available for interview. A total of 217 cases were consented and
interviewed for the study.

Ascertainment of controls

The study was designed to have two different sets of controls. The first control group
(interviewed controls) was recruited from a random selection of 5,007 births weighted
by sex (4:1 male:female) from the Pennsylvania Department of Health (PA DOH) state
birth registry files for 2005 to 2009 in the six-county area. Interviewed controls
were frequency matched to the cases on year of birth, sex, and race. We recruited
through a direct letter appeal signed by the Pennsylvania Secretary of Health. The
rules of both the PA DOH IRB and the University of Pittsburgh IRB mandated that
there would be no direct contact with potential controls, except for the opportunity
to return an envelope and contact sheet indicating refusal to be in the study or an
indication that the parent was interested in enrolling his or her child in the study
by providing contact information. We requested a postal service return to sender,
address correction requested. However, since we used an address from the birth certificate
that was several years old, it is likely that many of the letters were never delivered
to the intended resident or were simply ignored, making it difficult to determine
a true response rate.

After we obtained informed consent, parents were screened for inclusion criteria and
administered the SCQ for their child. Children with an SCQ score ≥ 15 or with a reported
diagnosis of ASD were not included as controls. Other exclusion criteria were the
same as those for cases. The first control group consisted of 226 eligible controls
who were consented and interviewed.

For each of the cases and the interviewed controls, a personal interview with the
mother was conducted by trained interviewers using a structured questionnaire, adapted
from the CDC’s Study to Explore Early Development (SEED). The questionnaire included
parental demographic and socioeconomic information, a detailed residential history,
maternal and paternal occupational history, family history of ASD, smoking history,
maternal reproductive and pregnancy history, and child’s medical history. Data were
obtained on all residential addresses and the corresponding start and end dates that
the mother/child lived at those addresses from three months prior to last menstrual
period (LMP) until the child’s second birthday.

The second control group (birth certificate or BC controls) consisted of a random
sample of births occurring from 2005 to 2009 in the six-county study area, weighted
by sex (male to female ratio of 4:1) and year of birth. Birth certificate information
on the cases and controls, consisting of residence at birth, age of mother, smoking
history, maternal education, race and other infant characteristics, was then used
for the second case–control analysis. Of the total sample of 5,007 birth certificates,
16 were identified as being in our case (ASD) population and were removed from the
control group.

Exposure assessment

Exposure to ambient hazardous air pollution concentrations was estimated using modeled
data from the 2005 NATA assessment. The 2005 NATA estimates are an annual average
by census tract and were downloaded from the US EPA website (http://www.epa.gov/ttn/atw/nata2005/tables.html accessed April 16, 2014). Out of the 177 air toxics available through NATA, we examined
the distribution, variability, and correlations of 37 air toxics characterized as
having neurological, developmental or endocrine-disrupting effects by one of the previous
studies [17]–[19] or the US EPA [20]. Seven chemicals (carbon tetrachloride, chloroform, ethylene dibromide, ethylene
dichloride, hexachlorobenzene, methyl chloride, and PCBs) were excluded from further
analysis due to little diversity in their distributions within the six-county area,
leaving a total of 30 NATA compounds for analysis.

For the analyses of the interviewed cases and controls, the residential addresses
obtained during the interview were geocoded to an X, Y coordinate using ArcGIS (version
10.1; ESRI Inc., Redlands, CA) and verified manually. When an address could not be
successfully geocoded in ArcGIS, other methods were used, including MapQuest Latitude/Longitude
Finder (http://developer.mapquest.com/web/tools/lat-long-finder). Year 2000 census tracts (11-digit FIPS codes) for each address were assigned using
ArcGIS 10.1, linking to 2009 TIGER/Line files for the 2000 United States census. We
calculated person-specific exposure estimates for each of the air toxic compounds,
taking into account the locations of and changes in residence and the time spent at
each residence. For each child, average exposure estimates were computed for the time
periods of pregnancy, first year of life, and second year of life. Two participants
who lived at a residence outside of the United States for which no NATA data were available
were excluded from analysis, leaving an analytic group of 217 cases and 224 controls.
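One plausible way to combine residential history with tract-level concentrations is a day-weighted average over each exposure window. The sketch below assumes that weighting scheme; the text does not specify the exact formula, and the function name, tuple layout, dates, and concentrations are all hypothetical.

```python
from datetime import date

def time_weighted_exposure(residences, window_start, window_end):
    """Day-weighted mean concentration over [window_start, window_end).

    residences: list of (move_in, move_out, concentration) tuples, where
    move_out is exclusive and concentration is the tract-level estimate
    for that address. Returns None if no residence overlaps the window.
    """
    total_days = 0
    weighted_sum = 0.0
    for move_in, move_out, concentration in residences:
        # Clip each residence period to the exposure window.
        start = max(move_in, window_start)
        end = min(move_out, window_end)
        days = (end - start).days
        if days > 0:
            total_days += days
            weighted_sum += concentration * days
    return weighted_sum / total_days if total_days else None

# Hypothetical child who moved once during a one-year window.
history = [
    (date(2006, 1, 1), date(2006, 7, 1), 2.0),  # 181 days at 2.0
    (date(2006, 7, 1), date(2007, 1, 1), 4.0),  # 184 days at 4.0
]
example = time_weighted_exposure(history, date(2006, 1, 1), date(2007, 1, 1))
# example equals (2.0 * 181 + 4.0 * 184) / 365
```

The same function could be called once per window (pregnancy, first year, second year) with the appropriate start and end dates.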

For the birth certificate data analysis, NATA concentrations were linked to census
tract of residence at birth. Every birth that could be linked to a PA DOH birth certificate
had either the census tract or the zip code of birth recorded. When only zip
code was provided, the 2010 ZCTA shapefile was used to calculate the geographical
center of each zip code in ArcGIS 10.2. Then, each ZCTA centroid was spatially linked
to the 2000 census tract that contains it. Of the 217 cases, one of the births could
not be linked to its birth certificate, 187 had a census tract on the birth certificate,
and 29 only had a zip code of birth. Of the 5,007 potential controls, 16 births were
actually in our case population, 4,194 had a census tract on the birth certificate,
and 797 only had a zip code of birth. However, 20 control births could not be assigned
a NATA exposure: eighteen could not be linked to a census tract as the documented
zip code was not in the 2010 ZCTA shapefile, and two had census tracts documented
on the birth certificate that did not match a census tract in the 2005 NATA database.
Therefore, the final population in the analysis of BC controls was 216 cases and 4,971
controls.
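The sample accounting for the birth certificate comparison can be checked with a few lines of arithmetic. The counts are those reported above; the variable names are illustrative.

```python
# Counts quoted in the text for the birth certificate (BC) comparison.
potential_controls = 5007
controls_found_in_cases = 16
controls_with_tract = 4194
controls_zip_only = 797
controls_zip_not_in_zcta = 18
controls_tract_not_in_nata = 2

eligible_controls = potential_controls - controls_found_in_cases
# Tract-coded and zip-only births should account for every eligible control.
assert controls_with_tract + controls_zip_only == eligible_controls
final_controls = (eligible_controls
                  - controls_zip_not_in_zcta
                  - controls_tract_not_in_nata)

cases_total = 217
cases_unlinked = 1
cases_with_tract = 187
cases_zip_only = 29
final_cases = cases_total - cases_unlinked
assert cases_with_tract + cases_zip_only == final_cases

print(final_cases, final_controls)  # 216 4971
```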

Statistical analysis

We used logistic regression to investigate the association between exposure to NATA
air pollutants and the risk of autism spectrum disorder. In order to calculate individual
odds ratios, quartile cut points were calculated for each of the 30 NATA pollutants.
These were based on the distribution among the interviewed controls for use in each
respective case–control comparison. The three highest quartiles were individually
compared to the lowest quartile. For the interviewed cases and controls, separate
logistic regression models were fitted for each pollutant during the pregnancy
period and secondarily for the first and second year of life. For the birth certificate
control comparison, only residence at the time of birth was available. All analyses
were adjusted for maternal age, education, race, smoking, child’s birth year and child’s
sex.
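The quartile comparison can be illustrated with a simplified sketch. It derives cut points from the control distribution, assigns each subject to a quartile, and computes crude (unadjusted) odds ratios against the lowest quartile. This is not the adjusted logistic regression used in the study: covariates are omitted, the quantile interpolation method is an assumption, and all function names and data values are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles

def quartile_cut_points(control_values):
    # Three cut points from the control distribution (interpolation
    # method "inclusive" is an assumption, not taken from the text).
    return quantiles(control_values, n=4, method="inclusive")

def assign_quartile(value, cuts):
    # 1 = lowest quartile, 4 = highest; a value equal to a cut point
    # is placed in the lower of the two adjacent quartiles.
    return bisect_left(cuts, value) + 1

def crude_odds_ratios(case_values, control_values):
    """Unadjusted odds ratios for quartiles 2-4 versus quartile 1.

    Raises ZeroDivisionError if quartile 1 has no cases or any
    quartile has no controls.
    """
    cuts = quartile_cut_points(control_values)
    case_counts = [0, 0, 0, 0]
    control_counts = [0, 0, 0, 0]
    for v in case_values:
        case_counts[assign_quartile(v, cuts) - 1] += 1
    for v in control_values:
        control_counts[assign_quartile(v, cuts) - 1] += 1
    reference_odds = case_counts[0] / control_counts[0]
    return {q: (case_counts[q - 1] / control_counts[q - 1]) / reference_odds
            for q in (2, 3, 4)}

# Hypothetical exposure values, for illustration only.
controls = [1, 2, 3, 4, 5, 6, 7, 8]
cases = [1, 3, 5, 5, 7, 7, 7, 8]
odds_ratios = crude_odds_ratios(cases, controls)
```

With a single categorical exposure and no covariates, these crude odds ratios match what an unadjusted logistic regression would return; the adjusted estimates reported in the study additionally account for maternal age, education, race, smoking, birth year, and sex.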

In addition to examining compounds individually, we also grouped compounds by structural
properties into three classifications: metals excluding selenium (arsenic, cadmium,
chromium, lead, manganese, mercury, and nickel), aromatic solvents (benzene, ethyl
benzene, styrene, toluene and xylenes), and chlorinated solvents (methylene chloride,
perchloroethylene, trichloroethylene, trichloroethane, and vinyl chloride). Index
scores were computed for each of the structural groups of metals, aromatic solvents,
and chlorinated solvents by summing the quartiles for the compounds in each group.
Similar to what was done for the individual compounds, quartile cut points of these
scores were calculated based on the distribution of the index scores among the interviewed
controls. Logistic regression models comparing the higher quartiles to the lowest quartile
were fitted for each of the indices for both the interviewed and BC comparisons,
controlling for mother’s age, education, race, smoking, child’s year of birth and
sex. IBM SPSS Statistics 20 and 22 were used for all analyses. No formal adjustment
was made for multiple comparisons.
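The index score for a structural group is the sum of a subject's quartile ranks across the compounds in that group. A minimal sketch for the metals group follows; the compound list is taken from the text, while the function name and the example quartile values are hypothetical.

```python
# Metals group as listed in the text (selenium excluded).
METALS = ["arsenic", "cadmium", "chromium", "lead",
          "manganese", "mercury", "nickel"]

def index_score(quartile_by_compound, group):
    """Sum the quartile ranks (1-4) of the compounds in a group."""
    return sum(quartile_by_compound[name] for name in group)

# Hypothetical subject: quartile rank for each metal.
subject_quartiles = {
    "arsenic": 1, "cadmium": 2, "chromium": 4, "lead": 3,
    "manganese": 2, "mercury": 1, "nickel": 4,
}
metals_score = index_score(subject_quartiles, METALS)
print(metals_score)  # 17
```

For a seven-compound group the score ranges from 7 (lowest quartile for every compound) to 28 (highest quartile for every compound); the scores themselves are then divided into quartiles using the control distribution.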

Additionally, we noted a significantly higher proportion of multiple births reported among
cases compared to controls (8.4% among the cases; 4.0% and 3.8% among the interviewed
and birth certificate control groups, respectively). As there is a high rate of prematurity
and other problems associated with multiple births, we conducted a sensitivity analysis
with and without the inclusion of multiple births for both case–control comparisons.

One of the last steps involved a backward multiple logistic analysis of all agents
identified as significant in either case–control comparison with adjustment for mother’s
age, race, education, smoking, child’s birth year, and child’s sex. This was done
in order to consider the most significant effects of NATA compounds while controlling
for the same covariates that were used in the previous logistic regression models
for individual pollutants.

Finally, air toxics are often correlated with each other, and people are often simultaneously
exposed to a complex mixture of air pollutants. In our study, the Spearman correlation
matrix revealed that many of the air toxics were highly correlated (p < 0.01). Similar to the methodology detailed by von Ehrenstein et al. [21], we conducted a factor analysis to further examine the correlation structure of our
set of 30 air toxics. Factors were extracted using Principal Component Analysis (PCA)
and rotated using varimax rotation. The eigenvalue > 1 rule was used to determine which
factors to retain [21].
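The Spearman correlation that underlies the correlation matrix is the Pearson correlation of the ranked values. The sketch below is a generic standard-library implementation with average ranks for ties, not the SPSS routine used in the study; the function names and the example concentrations are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical tract-level concentrations for two air toxics.
toxic_a = [0.10, 0.25, 0.40, 0.80]
toxic_b = [1.0, 2.0, 3.0, 4.5]
rho = spearman(toxic_a, toxic_b)  # monotone increasing, so rho is 1.0
```

Applying `spearman` to every pair of the 30 compounds would give the full correlation matrix, which is the input to the factor analysis described above.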