- Expectations Test

The Expectations Test is a child self-report measure of expectations for emotions, experiences, and efficacy in social situations in general, and especially when there is a concern about sexual or physical abuse. In response to 16 ambiguous photographs of children, children are asked to describe how they think the pictured child will feel, what will happen to the child, and whether the child can control what will happen to him or her in each photograph.

Responses are scored for 9 categories of expectations: 1) Sexual Abuse, 2) Physical Harm, 3) Separation from a Parent, 4) Other Negative Experience (e.g., time-out), 5) Emotional Distress, 6) Neutral, 7) Physical Contact (coded as a positive experience, e.g., being hugged or kissed), 8) Other Positive Situations, and 9) Unknown Outcome. Responses are also scored for five emotions (scared, sad, angry, fine, and happy). Scores for each of the five emotions and nine expectations are derived by summing answers across the 16 photographs. Scales for predicting a history of sexual abuse, a history of physical abuse, a history of exposure to family violence, and the level of posttraumatic stress can also be calculated.
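
The summing step described above can be sketched in a few lines of Python. This is only an illustration, not the scoring procedure from the manual: the category code names, the function `score_protocol`, and the assumption that each photograph receives exactly one emotion code and one expectation code are all hypothetical.

```python
from collections import Counter

# Code names are illustrative labels for the categories listed in the review.
EMOTIONS = ("scared", "sad", "angry", "fine", "happy")
EXPECTATIONS = (
    "sexual_abuse", "physical_harm", "separation_from_parent",
    "other_negative", "emotional_distress", "neutral",
    "physical_contact", "other_positive", "unknown_outcome",
)

def score_protocol(responses):
    """Sum coded responses across photographs.

    `responses` holds one (emotion, expectation) pair per photograph.
    Assumes a single code of each kind per photograph (an assumption).
    """
    emotion_counts = Counter()
    expectation_counts = Counter()
    for emotion, expectation in responses:
        if emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion code: {emotion!r}")
        if expectation not in EXPECTATIONS:
            raise ValueError(f"unknown expectation code: {expectation!r}")
        emotion_counts[emotion] += 1
        expectation_counts[expectation] += 1
    # Report every category, including those never chosen (score 0).
    emotions = {name: emotion_counts[name] for name in EMOTIONS}
    expectations = {name: expectation_counts[name] for name in EXPECTATIONS}
    return emotions, expectations

# Made-up protocol of 16 coded photographs.
example = [("scared", "physical_harm")] * 3 + [("happy", "physical_contact")] * 13
emotion_scores, expectation_scores = score_protocol(example)
print(emotion_scores["scared"], expectation_scores["physical_contact"])  # 3 13
```

Under these assumptions each set of scores sums to 16, the number of photographs.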


Gully, Kevin J., Ph.D.

Gully, K. J. (2003). Expectations Test: Professional Manual. Salt Lake City, UT: PEAK Ascent, L.L.C.

Cost Involved
Age Range: 
Measure Type: 
In-depth Assessment
Measure Format: 
Administered Assessment


Number of Items: 
Average Time to Complete (min): 
Reporter Type: 
Can readminister whenever deemed necessary.
Response Format: 

Children look at the photographs, choose from a series of emotions to describe how the child in the photograph feels, tell what will happen to the child, and state whether the child can control what is happening. The emotions are scared, sad, angry, fine, and happy.

Responses are classified into 9 categories of expected experience: 1) Sexual Abuse, 2) Physical Harm, 3) Separation from a Parent, 4) Other Negative Experiences, 5) Emotional Distress, 6) Neutral, 7) Physical Contact (as a positive experience), 8) Other Positive Situations, and 9) Unknown Outcome.

Materials Needed: 
Testing Stimuli
Sample Items: 
Domains | Scale | Sample Items
Social Expectations | Emotions | Response to 16 photographic images.
Social Expectations | Experiences | Response to 16 photographic images.
Social Expectations | Efficacy | Do you think the child can stop that (expected experience) from happening?
Trauma Related Indices | Sexual abuse |
Trauma Related Indices | Physical abuse |
Trauma Related Indices | Exposure to family violence |
Trauma Related Indices | Posttraumatic stress |
Information Provided: 
Areas of Concern/Risks
Continuous Assessment
Graphs (e.g. of elevated scale)
Program Evaluation Information
Raw Scores


Training to Administer: 
Training to Interpret: 
Prior Experience in Psych Testing/Interpretation

Parallel or Alternate Forms

Parallel Forms: 
Alternate Forms: 
Different Age Forms: 
Altered Version Forms: 


Clinical Cutoffs: 
Clinical Cutoffs Description: 

Borderline = 7th to 16th percentile or 84th to 93rd percentile; Clinical = below 7th percentile or above 93rd percentile
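
As a rough sketch, the cutoff rule above can be written as a small Python function. The function name `classify_percentile` and the label "Non-elevated" for scores between the 16th and 84th percentiles are assumptions made for illustration; the source names only the Borderline and Clinical ranges.

```python
def classify_percentile(percentile):
    """Map a scale percentile to the cutoff ranges quoted in the review.

    Clinical:   below the 7th or above the 93rd percentile.
    Borderline: 7th to 16th or 84th to 93rd percentile.
    "Non-elevated" is a hypothetical label for everything in between.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 7 or percentile > 93:
        return "Clinical"
    if percentile <= 16 or percentile >= 84:
        return "Borderline"
    return "Non-elevated"

for p in (5, 7, 50, 93, 94):
    print(p, classify_percentile(p))
```

The rule is two-sided because both unusually high and unusually low scores are treated as noteworthy.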

Internal Consistency: Acceptable (Cronbach's Alpha: min = 0.42, max = 0.82, avg = 0.65)
Parallel/Alternate Forms
References for Reliability: 

Efficacy Test-Retest on clinical sample = .88 (Gully, 2003a). However, Theus (2002) reported test-retest for school sample: min=.25, max=.76, avg=.58 (depending on scale).

The difference in Theus’s study may be related to a deviation from the protocol.

INTERNAL CONSISTENCY (Cronbach’s alpha):
Emotions: Scared (.71), Sad (.65), Angry (.42), Fine (.65), Happy (.67)
Experience: Sexual Abuse (.82), Physical Harm (.77), Separation (.68), Other Negative (.51), Distress (.52), Neutral (.72), Physical Contact (.45), Other Positive (.69), Unknown
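
For readers unfamiliar with the statistic, the standard formula for Cronbach's alpha can be sketched as follows. The data layout is an assumption for illustration: each row is one child and each column one item (for example, whether a given category was coded on each photograph). The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the studies cited here.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows (one row per respondent).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Toy data: three items that rise and fall together give alpha = 1.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(toy), 2))  # 1.0
```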

INTERRATER: In Gully (2000), 21 children undergoing court evaluations had their administrations of the Expectations Test videotaped. The videotapes were used to evaluate interrater reliability.

Expected emotions: average K=.94 (range=.85-1); expected experiences: average K=.86 (range=.86-1). The range for expected experiences is given above. Gully (2003a) reported on the development of scales to predict histories of abuse.

TEST-RETEST RELIABILITIES (Intraclass correlation)
For 27 children beginning therapy or evaluation, assessed over one week:
Predicting a history of sexual abuse (.88)
Predicting a history of physical abuse (.94)
Predicting a history of exposure to family violence (.91)
Predicting posttraumatic stress (.92)

The internal consistency of the abuse-specific scales is as follows:
Predicting a history of sexual abuse (.79)
Predicting a history of physical abuse (.83)
Predicting a history of exposure to family violence (.78)
Predicting posttraumatic stress (.75)

References for Content Validity: 

There were initially 384 black-and-white photographs that were staged to be ambiguous. About 40 photographs were pilot tested based on a subjective judgment that some, but not all, children would expect that the child in the photograph was experiencing a distressing or abusive event. Approximately 130 children participated in the pilot testing.

A photograph was included in the final set of 16 photographs if the photograph: (a) provided a breadth of responses matching the histories of the children for psychopathology and abuse, (b) helped balance for gender, and (c) promoted a perception of ethnic diversity.

Construct Validity Evaluated: 
Construct Validity: 
Validity Type: Not known / Not found / Nonclinical Samples / Clinical Samples / Diverse Samples
Sensitive to Change: Yes
Intervention Effects: Yes
Longitudinal/Maturation Effects: Yes
Sensitive to Theoretically Distinct Groups: Yes, Yes
Factorial Validity: Yes
References for Construct Validity: 

Gully (2000) reports correlations between TSCC scales and scores on specific feelings and expectations that are in the expected direction. For example, expectations for Scared correlates with TSCC scales of Anger, Anxiety, Depression, and Posttraumatic Stress; and expectations for Sexual Abuse correlates with TSCC scales of Anxiety, Depression, Posttraumatic Stress, and Sexual Concerns.

Children receiving mental health or court evaluation services had higher expectations for sexual abuse, physical harm, and separation, and fewer fine, neutral, or other positive expectations, than did children recruited from an elementary school. However, the number of children in the elementary school sample was small (n=39).

Gully (2003a) examined the psychometrics of 4 additional Expectations Test scales. He reported some age differences, with older children more likely to have higher scores on the Sexual Abuse and the Exposure to Family Violence scales, and no correlations with gender or ethnicity. The sample included a substantial number of Hispanic/Latino children but was otherwise predominantly White. Sensitivity and specificity rates are reported below.

Criterion Validity: 
Validity Type: Not Known / Not Found / Nonclinical Samples / Clinical Samples / Diverse Samples
Predictive Validity: Yes
Postdictive Validity: Yes
References for Criterion Validity: 

Gully (2003a):
Sensitivity and Specificity rates are provided for 4 trauma-related indices: 1) Sexual Abuse (Sensitivity = 81%; Specificity = 78%), 2) Physical Abuse (Sensitivity = 84%; Specificity = 78%), 3) Exposure to Family Violence (Sensitivity = 65%; Specificity = 63%), and 4) Posttraumatic Stress (Sensitivity = 86%; Specificity = 54%).

A Sexual-versus-Physical Abuse Index correctly classified 73% of the sexually abused children versus 61% of the physically abused children.
A Suicide Risk Index is provided (Sensitivity = 92%; Specificity = 46%). Across the classification functions the results were not as definitive when applied to a new sample. The author recommends caution in use of the classification functions.
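
To make the sensitivity and specificity figures above concrete, here is a minimal Python sketch of how the two rates are computed from classification counts. The function name and the counts in the example are hypothetical; the counts were chosen only so that the result matches the 81% and 78% reported for the Sexual Abuse index and are not the actual study data.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from classification counts.

    sensitivity = TP / (TP + FN): share of affected children flagged.
    specificity = TN / (TN + FP): share of unaffected children not flagged.
    """
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("each group needs at least one case")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical counts for 100 abused and 100 non-abused children.
sens, spec = sensitivity_specificity(true_pos=81, false_neg=19,
                                     true_neg=78, false_pos=22)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A high sensitivity paired with a low specificity, as reported for the Suicide Risk Index, means most at-risk children are flagged but many children who are not at risk are flagged as well.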

Overall Psychometric Limitations: 

1. Psychometrics obtained primarily by the author of the measure.
2. Low test-retest reliability and internal consistency found for some scales and in the Theus (2002) study. However, the trauma scales for sexual abuse, physical abuse, exposure to family violence, and posttraumatic stress have good internal consistency and test-retest reliability.



Population Information

Population Used for Measure Development: 

From a nonclinical sample, 300 children were selected. Inclusion criteria were that, according to parents, the children had never received mental health services and had not been sexually or physically abused.

Of these, 150 children were from two regular grade schools, 42 children were in a grade school for high-risk families, 21 children were selected from a child-supervision-and-activity program for high-risk families, and 87 children were from two high schools.

The majority of the sample (290/300) came from the Salt Lake City, Utah, urban area, while a small percentage (10/300) lived in a rural community in Nevada.

The sample of 300 children is balanced across age (4-17 years; M=10.4, SD=3.94), gender (56% female, 44% male), and ethnicity (Anglo-European = 54%, Hispanic/Latino = 28%, Other/Mixed/Biracial = 9%, Not Indicated = 4%, African-American = 3%, Asian = 2%, and Pacific Islander = 1%).

Of the 300 children, 261 provided data on Efficacy because procedures for assessing efficacy were added during the initial data-collection phase.

Populations with which Measure Has Demonstrated Reliability and Validity: 
Physical Abuse
Sexual Abuse
Domestic Violence
Use with Diverse Populations: 
Population Type: Measure Used with Members of this Group / Members of this Group Studied in Peer-Reviewed Journals / Reliable / Good Psychometrics / Norms Available / Measure Developed for this Group
1. Netherlands: Yes
2. Mexico: Yes

Pros & Cons/References


Pros:
1. Simple to administer.
2. May be helpful in identifying children who have been sexually and physically abused.
3. Provides clinically relevant information that can be further explored in treatment.
4. A useful way to gather information from children across a wide age range that does not require reading skills or input from a caretaker.


Cons:
1. This is a relatively new measure that is still in its beginning stages and is currently not widely used.
2. Basic psychometrics have been examined primarily by the measure's author. Additional studies are needed to examine the psychometrics, especially in light of the wide variation in the values reported for test-retest reliability and internal consistency.
3. Going back to the photographs at the end of testing to ask about efficacy makes the testing choppy.
4. Administering this test requires more training than do many standardized measures.
5. While the trauma-related indices are promising, they require more research, especially given that the results were not as definitive when applied to a new sample.
6. The measure’s cost, given that it is still under development.

Author Comments: 

1. The Expectations Test was designed in part to provide valuable information about a child when a reliable informant might not be available (e.g., children entering foster care or children being examined as part of a custody evaluation) and to better ensure assessment is based on multiple methods and sources.
2. Chapter 3 in the Manual provides recommendations for interpretation of the results that are grounded in research findings and easy to understand.


The reference for the manual is:

Gully, K. J. (2003b). Expectations Test: Professional Manual. Salt Lake City, UT: Peak Ascent, L.L.C.
A PsycINFO search for the words “Expectations Test” and “Gully” anywhere revealed that the measure has been referenced in 3 peer-reviewed journal articles.

1. Brilleslijper-Kater, S. N., Friedrich, W. N., & Corwin, D. L. (2004). Sexual knowledge and emotional reaction as indicators of sexual abuse in young children: Theory and research challenges. Child Abuse and Neglect, 28(10), 1007-1017.
2. Gully, K. J. (2003a). Expectations Test: Trauma scales for sexual abuse, physical abuse, exposure to family violence, and posttraumatic stress. Child Maltreatment, 8(3), 218-229.
3. Gully, K. J. (2000). Initial development of the Expectations Test for children: A tool to investigate social information-processing. Journal of Clinical Psychology, 56(12), 1551-1563.

Developer of Review: 
Kelly M. Ginn, M.A.
Editor of Review: 
Nicole Taylor, Ph.D., Robyn Igelman, M.A., Madhur Kulkarni, M.S., Chandra Ghosh Ippen, Ph.D.
Last Updated: 
Monday, December 2, 2013