SFARI Base
SFARI Base is a central database of clinical and genetic information about families affected by autism and other neurodevelopmental disorders, provided as part of the Simons Foundation Autism Research Initiative (SFARI). At present, the database contains phenotype data from the Simons Simplex Collection and the Simons Variation in Individuals Project.
SIMONS SIMPLEX COLLECTION (SSC)
By the time recruitment ended, SFARI had recruited 2,700 families into the simplex collection. You can review the list of instruments used in the SSC. For additional information about the collection, please review the SSC Researcher Welcome Packet.
Parallel genome-wide scans of the collection are being carried out by Michael Wigler’s laboratory at Cold Spring Harbor Laboratory in New York (using a NimbleGen platform) and by Matthew State at Yale University (using an Illumina platform). Their findings will be incorporated into future data distributions as they become available.
Approximately 4,000 to 6,000 phenotypic data points are available for each SSC family; these variables are defined in the Data Dictionary. SFARI Base Variable Search is a tool open to all researchers who wish to explore the Simons Simplex Collection dataset before submitting an application. It is designed to help researchers identify variables relevant to their work, and interested researchers are encouraged to use it to become familiar with the available variables.
All information related to phenotype data and biological materials has been de-identified, i.e., recorded in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects (see 45 CFR 46.101(b)). The Institutional Review Protocol governing this collection is available, as is an example of a consent form used at one of the collection sites.
SIMONS VARIATION IN INDIVIDUALS PROJECT (SIMONS VIP)
Data collection is ongoing for the Simons VIP, with the goal of recruiting 100 individuals with a 16p11.2 deletion and 100 individuals with a 16p11.2 duplication.
DNA is being extracted from whole blood, and fibroblasts are being cultured from skin biopsy samples, at the Rutgers University Cell and DNA Repository in New Jersey.
The initial data release is now available to approved researchers, and data will be updated roughly quarterly. You can access the overall characteristics of the current data and biospecimens on SFARI Base.
As with the SSC, all information related to phenotype data and biological materials has been de-identified, i.e., recorded in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects (see 45 CFR 46.101(b)). The Institutional Review Protocol governing this collection is available, as is an example of a consent form used at one of the collection sites.
Both collections are available to researchers through an application process.






