BDI-II: Beck Depression Inventory-II

Reviewed by: Constantin Rezlescu | Associate Professor | UCL Psychology

TL;DR

  • The BDI-II is a 21-item self-report measure, widely treated as a gold standard for assessing depression severity in adolescents and adults (ages 13+). It is grounded in Beck's cognitive theory, was revised to match DSM-IV criteria (and remains consistent with DSM-5), and yields total scores of 0-63 spanning four severity categories from minimal to severe.
  • With over 9,000 published studies, the BDI-II demonstrates excellent psychometric properties including internal consistency (α=0.92-0.93), test-retest reliability (r=0.93), and diagnostic accuracy (81% sensitivity, 92% specificity at cutoff ≥14), validated across 25+ countries and diverse populations.
  • The inventory is copyrighted by Pearson Assessments and must be purchased for legal use. It takes 10-15 minutes to complete and is used in clinical trials, treatment monitoring, cognitive therapy, and comprehensive depression assessment in research and clinical settings.

Introduction

The Beck Depression Inventory-II (BDI-II) is one of the most extensively researched and widely used self-report measures for assessing the severity of depressive symptoms in adolescents and adults. Developed by Aaron T. Beck and colleagues in 1996 as a revision of the original 1961 BDI, this 21-item questionnaire represents the gold standard for measuring depression severity in clinical practice and research settings.

Dr. Aaron Beck, widely regarded as the father of cognitive therapy, created the BDI-II to capture the full spectrum of depressive symptoms with particular emphasis on the cognitive and affective components that are central to his cognitive theory of depression. Unlike brief screening tools designed primarily for case detection, the BDI-II was designed specifically as a comprehensive severity measure that provides detailed assessment of depression’s impact across multiple domains.

The BDI-II is the most cited depression assessment instrument in psychological research, with over 9,000 published studies demonstrating its clinical utility and validity across diverse populations and settings.

Depression Through the Cognitive Lens

Beck’s cognitive model of depression, which forms the theoretical foundation for the BDI-II, posits that depression arises from and is maintained by characteristic patterns of negative thinking. This includes the cognitive triad of negative views about:

  • The self – “I am worthless and inadequate”
  • The world – “Life is full of insurmountable obstacles”
  • The future – “Things will never get better”

These cognitive distortions lead to the emotional, behavioral, and physical symptoms that characterize clinical depression. The BDI-II systematically assesses both the cognitive-affective symptoms (sadness, pessimism, guilt, self-dislike) and somatic symptoms (fatigue, sleep disturbance, appetite changes) that comprise the depressive syndrome.

Theoretical Foundation

The BDI-II is grounded in Beck’s cognitive model of depression, which emphasizes the role of negative cognitive patterns in the development and maintenance of depressive symptoms. The inventory systematically evaluates both cognitive-affective symptoms and somatic symptoms, providing a comprehensive picture of depression severity that aligns with cognitive-behavioral therapy (CBT) principles.

The 1996 revision (BDI-II) updated the original inventory to better align with DSM-IV criteria (and subsequently DSM-5), while maintaining its foundation in cognitive theory. Key changes included:

Updated content:

  • Replaced items on body image, work difficulty, weight loss, and somatic preoccupation
  • Added items on agitation, worthlessness, concentration difficulty, and loss of energy
  • Better coverage of atypical symptoms (increased sleep, increased appetite)

Improved timeframe:

  • Extended reference period from one week to two weeks to match DSM criteria
  • Changed response format to better capture symptom severity gradations

Enhanced clinical utility:

  • Strengthened correspondence with diagnostic criteria while maintaining cognitive focus
  • Improved sensitivity to change for treatment monitoring
  • Better differentiation across severity levels

This theoretical grounding makes the BDI-II particularly valuable not just for assessment, but for understanding the cognitive mechanisms underlying depression and planning cognitive-behavioral interventions.

Key Features

Assessment Characteristics

  • 21 items covering comprehensive symptom domains
  • 10-15 minutes administration time
  • Ages 13 and older, with extensive validation across age groups
  • 4-point severity scale (0-3) for each item
  • Two-week timeframe matching DSM-5 diagnostic criteria
  • Copyrighted measure requiring purchase from Pearson Assessments

Depression Dimensions Assessed

Cognitive-affective symptoms:

  • Sadness, pessimism, past failure
  • Loss of pleasure, guilty feelings
  • Punishment feelings, self-dislike
  • Self-criticalness, suicidal thoughts
  • Crying, agitation, loss of interest
  • Indecisiveness

Somatic symptoms:

  • Loss of energy, changes in sleeping pattern
  • Irritability, changes in appetite
  • Concentration difficulty, tiredness or fatigue
  • Loss of interest in sex, worthlessness

Research and Clinical Applications

  • Clinical trials – Gold standard outcome measure in depression research
  • Treatment monitoring – Track symptom changes across therapy sessions
  • Cognitive therapy – Close alignment with Beck’s therapeutic model
  • Severity assessment – Comprehensive evaluation beyond simple screening
  • Research standard – Among the most extensively validated depression measures worldwide
  • Cross-cultural studies – Validated in 25+ languages and countries

Scoring and Interpretation

Response Format

Each of the 21 items presents four statements reflecting increasing levels of symptom severity (scored 0-3). Participants select the statement that best describes how they have felt during the past two weeks, including today.

Sample BDI-II Item Structures

Item 1 (Sadness):

  • 0: I do not feel sad
  • 1: I feel sad much of the time
  • 2: I am sad all the time
  • 3: I am so sad or unhappy that I can’t stand it

Item 2 (Pessimism):

  • 0: I am not discouraged about my future
  • 1: I feel more discouraged about my future than I used to
  • 2: I do not expect things to work out for me
  • 3: I feel my future is hopeless and will only get worse

Item 9 (Suicidal Thoughts or Wishes):

  • 0: I don’t have any thoughts of killing myself
  • 1: I have thoughts of killing myself, but I would not carry them out
  • 2: I would like to kill myself
  • 3: I would kill myself if I had the chance

Complete BDI-II Content Domains

Items 1-13 (Cognitive-Affective): Sadness, pessimism, past failure, loss of pleasure, guilty feelings, punishment feelings, self-dislike, self-criticalness, suicidal thoughts or wishes, crying, agitation, loss of interest, indecisiveness

Items 14-21 (Somatic, plus worthlessness): Worthlessness, loss of energy, changes in sleeping pattern, irritability, changes in appetite, concentration difficulty, tiredness or fatigue, loss of interest in sex

Note: Items 16 (changes in sleeping pattern) and 18 (changes in appetite) offer separate response options for increases and decreases (e.g., 1a and 1b); both variants at a given level receive the same score.

Scoring Procedure

  1. Sum all item responses (range: 0-63)
  2. Higher scores indicate greater depression severity
  3. Individual items can be examined for specific symptom patterns
  4. Item 9 requires immediate clinical attention if scored >0
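
The scoring steps above can be sketched in Python. This is a minimal illustration rather than Pearson's official scoring software: the function name `score_bdi2` and the dictionary keys are hypothetical, and responses are assumed to arrive as 21 integers (0-3) in item order.

```python
def score_bdi2(responses):
    """Sum 21 item responses (each 0-3) and flag Item 9.

    Hypothetical helper for illustration; not an official scoring routine.
    """
    if len(responses) != 21:
        raise ValueError("expected 21 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")

    total = sum(responses)        # possible range: 0-63
    item9 = responses[8]          # Item 9 sits at index 8 (zero-based)
    return {
        "total": total,
        "item9_flag": item9 > 0,  # any nonzero response needs clinical follow-up
    }


# Made-up responses for illustration (not real data)
example = [1] * 21
example[8] = 0
result = score_bdi2(example)
print(result)  # {'total': 20, 'item9_flag': False}
```

Returning the Item 9 flag alongside the total keeps the safety check from being lost when only the summed score is reported.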

BDI-II Severity Classification

Total Score    Severity Level
0-13           Minimal depression
14-19          Mild depression
20-28          Moderate depression
29-63          Severe depression
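
As a sketch, the severity bands can be written as a lookup. The names `SEVERITY_BANDS` and `classify_severity` are hypothetical; the band boundaries are the ones listed in the classification above.

```python
# Severity bands from the classification above: (lower bound, upper bound, label)
SEVERITY_BANDS = [
    (0, 13, "Minimal depression"),
    (14, 19, "Mild depression"),
    (20, 28, "Moderate depression"),
    (29, 63, "Severe depression"),
]


def classify_severity(total_score):
    """Return the severity label for a BDI-II total score (0-63)."""
    for low, high, label in SEVERITY_BANDS:
        if low <= total_score <= high:
            return label
    raise ValueError("total score must be between 0 and 63")


# Print the label on each side of every band boundary
for score in (13, 14, 19, 20, 28, 29):
    print(score, classify_severity(score))
```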

Clinical Cut-offs and Recommendations

  • Score ≥14: Suggests clinically significant depression requiring clinical attention
  • Score ≥20: Indicates moderate-to-severe depression requiring treatment intervention
  • Score ≥29: Severe depression; intensive treatment and close monitoring needed
  • Item 9 > 0: Immediate suicide risk assessment and safety planning required

Meaningful Change

  • ≥5 point reduction: Suggests clinically meaningful improvement (Dozois et al., 1998)
  • Score <14: Common remission criterion in clinical trials
  • 50% reduction: Often used as “responder” criterion in research
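
The three change criteria can be checked together for a pair of total scores. The following is a minimal sketch; `evaluate_change` and its parameter names are hypothetical, and the thresholds default to the values listed above.

```python
def evaluate_change(baseline, current, remission_cutoff=14, min_drop=5):
    """Apply the three change criteria listed above to a pair of total scores."""
    drop = baseline - current
    return {
        "point_reduction": drop,
        "meaningful_improvement": drop >= min_drop,
        "remission": current < remission_cutoff,
        # 50% reduction relative to baseline; treated as False for a baseline of 0
        "responder": baseline > 0 and current <= 0.5 * baseline,
    }


# Hypothetical patient: baseline 30, later score 12
print(evaluate_change(30, 12))
# {'point_reduction': 18, 'meaningful_improvement': True, 'remission': True, 'responder': True}
```

The criteria can disagree: a drop from 20 to 14 meets the 5-point threshold but is neither a 50% reduction nor below the remission cutoff, so it helps to report all three.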

Research Evidence and Psychometric Properties

Reliability Evidence

  • Internal consistency: α = 0.92 for psychiatric outpatients, α = 0.93 for college students (Beck et al., 1996)
  • Test-retest reliability: r = 0.93 over one-week interval in outpatient sample (Beck et al., 1996)
  • Split-half reliability: r = 0.91-0.93 across diverse samples (Dozois et al., 1998)
  • Cross-cultural consistency: α = 0.84-0.94 across international samples (Wang & Gorenstein, 2013)
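
For readers unfamiliar with the internal consistency statistic, coefficient alpha is computed from the item variances and the variance of the total score. The sketch below uses the standard formula α = k/(k-1) × (1 - Σ item variances / total-score variance); the function name and the toy responses are made up for illustration and are not BDI-II data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*rows)]  # sample variance of each item
    total_var = variance([sum(r) for r in rows])       # sample variance of total scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up toy responses (4 respondents x 3 items), for illustration only
toy = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 3],
]
print(round(cronbach_alpha(toy), 3))
```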

Validity Evidence

Convergent validity:

  • BDI-I correlation: r = 0.93, demonstrating excellent correspondence with original version (Beck et al., 1996)
  • Hamilton Depression Rating Scale: r = 0.71, strong correlation with clinician-rated measure (Beck et al., 1996)
  • Clinical diagnosis: Significant differences between depressed and non-depressed groups (Beck et al., 1996)
  • PHQ-9: r = 0.73-0.84 with DSM-based screening measure (various studies)

Discriminant validity:

  • Anxiety measures: r = 0.47-0.60, showing overlap but distinctiveness (Beck et al., 1996)
  • Hopelessness Scale: r = 0.68, expected relationship but distinct constructs (Beck et al., 1996)
  • Diagnostic specificity: Better discrimination of depression severity than other measures (Wang & Gorenstein, 2013)

Factor structure:

  • Two-factor model most common: Cognitive-affective and somatic factors (Dozois et al., 1998)
  • Alternative models: Some studies support single general factor or three-factor solutions
  • Clinical utility: Factor structure varies somewhat by population but total score remains most reliable (Wang & Gorenstein, 2013)

Diagnostic Accuracy

  • Optimal cutoff (≥14): Sensitivity 81%, specificity 92% for detecting major depression in primary care (Wang & Gorenstein, 2013)
  • Higher cutoffs: ≥20 provides higher specificity (96%) with reduced sensitivity (71%)
  • ROC analysis: Area under curve typically 0.85-0.95 across studies, indicating excellent discrimination (Wang & Gorenstein, 2013)
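
Sensitivity and specificity alone do not say how likely a positive screen is to be a true case; that also depends on how common depression is in the setting. The sketch below applies Bayes' rule to the ≥14 figures quoted above. The 10% prevalence is an assumed value chosen for illustration, not a figure from the cited studies, and `predictive_values` is a hypothetical helper.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    tp = sensitivity * prevalence              # true-positive proportion
    fp = (1 - specificity) * (1 - prevalence)  # false-positive proportion
    tn = specificity * (1 - prevalence)        # true-negative proportion
    fn = (1 - sensitivity) * prevalence        # false-negative proportion
    return tp / (tp + fp), tn / (tn + fn)


# Cutoff >=14 figures from the text; the 10% prevalence is an assumption
ppv, npv = predictive_values(0.81, 0.92, 0.10)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV = 0.53, NPV = 0.98
```

Under that assumed prevalence, only about half of positive screens would be true cases, which is one reason a score above the cutoff calls for clinical follow-up rather than a diagnosis.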

Treatment Sensitivity

  • Effect size detection: Large effect sizes (d = 0.8-1.5) for treatment response across therapy and medication trials (Beck et al., 1996)
  • Change sensitivity: More sensitive to treatment effects than clinician-rated measures in some studies (Dozois et al., 1998)
  • Progress tracking: Reliable indicator of symptom improvement across CBT, medication, and combined treatments (various studies)

Cross-Cultural Validation

  • Global validation: Validated in 25+ countries including Spain, China, Japan, Brazil, Iran, Turkey, and many others (Wang & Gorenstein, 2013)
  • Translation equivalence: Consistent psychometric properties across language versions (Wang & Gorenstein, 2013)
  • Cultural adaptations: Some variation in optimal cutoff scores (14-23 range) across cultures
  • International research: Standard measure in multinational treatment trials (various studies)

Special Populations

Adolescents (13-18):

  • Good reliability (α = 0.89-0.92) and validity for teen populations (Steer & Clark, 1997)
  • Same cutoffs generally applicable with clinical judgment

Older adults (65+):

  • Valid but may require higher cutoffs (≥16) due to medical comorbidity (various studies)
  • Somatic items may be elevated due to normal aging or medical conditions

Medical populations:

  • Elevated scores common due to somatic symptom overlap (various studies)
  • Consider using cognitive-affective subscale alone or higher cutoffs (≥16-18)
  • Remains valid for detecting depression in medical illness despite symptom overlap
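
Separating the two symptom groupings makes the overlap problem visible. The sketch below splits responses using the grouping given in this document (Items 1-13 and Items 14-21); published factor analyses assign some items differently, so the index ranges are an assumption, and `subscale_scores` is a hypothetical helper.

```python
def subscale_scores(responses):
    """Split 21 responses (each 0-3) into the two groupings used in this document."""
    if len(responses) != 21:
        raise ValueError("expected 21 item responses")
    cognitive_affective = sum(responses[:13])  # Items 1-13
    somatic = sum(responses[13:])              # Items 14-21
    return {"cognitive_affective": cognitive_affective, "somatic": somatic}


# Made-up profile: no cognitive-affective symptoms, elevated Items 14-21
profile = [0] * 13 + [2] * 8
print(subscale_scores(profile))  # {'cognitive_affective': 0, 'somatic': 16}
```

In this made-up profile the total of 16 crosses the standard ≥14 cutoff even though every point comes from Items 14-21, which is the pattern to interpret cautiously in medical illness.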

Meta-Analytic Evidence

  • Comprehensive review: 118 studies examining BDI-II psychometric properties showing consistent excellence across clinical, psychiatric, and medical populations (Wang & Gorenstein, 2013)
  • Reliability generalization: Mean coefficient alpha 0.90 across all populations studied (Wang & Gorenstein, 2013)
  • Construct validity: Strong evidence for measuring depressive symptom severity as intended (Richter et al., 1998)

Usage Guidelines and Applications

Primary Clinical Applications

  • Comprehensive depression assessment in mental health settings for detailed severity measurement
  • Treatment outcome monitoring as gold standard for tracking therapy and medication response
  • Clinical trials research as primary or secondary outcome measure
  • Cognitive therapy alignment providing direct support for Beck’s therapeutic model
  • Symptom profiling for identifying specific intervention targets in treatment planning

Clinical Decision Support by Severity

Minimal depression (0-13):

  • Routine monitoring and preventive interventions
  • Psychoeducation about depression risk factors
  • Lifestyle recommendations (exercise, sleep hygiene, stress management)

Mild depression (14-19):

  • Consider psychoeducation and self-help resources
  • Brief counseling or supportive therapy
  • Monitor closely; may progress without intervention
  • Lifestyle interventions and behavioral activation

Moderate depression (20-28):

  • Psychotherapy (CBT, IPT) or medication recommended
  • Structured treatment plan with regular monitoring
  • Consider combination therapy if prior treatments unsuccessful
  • Weekly or biweekly assessment during active treatment

Severe depression (29+):

  • Intensive treatment required (combination therapy often warranted)
  • More frequent monitoring (weekly)
  • Consider hospitalization if severe functional impairment or safety concerns
  • Immediate psychiatric consultation recommended

Treatment Monitoring Guidelines

Baseline assessment:

  • Establish pre-treatment severity and symptom profile
  • Individual item analysis to identify specific treatment targets
  • Set treatment goals based on symptom presentation

During treatment:

  • Administer every 1-2 weeks during acute phase
  • Track total score and individual symptom changes
  • ≥5 point reduction indicates meaningful progress (Dozois et al., 1998)
  • Adjust treatment plan if insufficient improvement after 4-6 weeks

Outcome evaluation:

  • Post-treatment assessment at therapy conclusion
  • Follow-up assessments (3, 6, 12 months) for relapse monitoring
  • Score <14 commonly used as remission criterion
  • Early detection of symptom return for relapse prevention

Cognitive-Behavioral Therapy Integration

Symptom identification:

  • Item-by-item review with client to identify target symptoms
  • Focus on cognitive items (pessimism, self-dislike, guilt) as cognitive therapy targets
  • Track changes in specific cognitive distortions over treatment

Progress monitoring:

  • Graph scores across sessions to visualize improvement
  • Celebrate gains and identify areas needing more focus
  • Use setbacks as learning opportunities

Homework integration:

  • Self-monitoring assignments between sessions
  • Tracking relationship between thoughts and mood
  • Behavioral experiments informed by symptom profile

Research Applications

Clinical trials:

  • Primary outcome measure for depression treatment efficacy studies
  • Change scores or remission rates (<14) as endpoints
  • Responder analysis (≥50% reduction or ≥5 point change)

Mechanism research:

  • Understanding cognitive and behavioral processes underlying treatment effects
  • Mediator and moderator analyses of treatment response

Psychometric research:

  • Gold standard for validating new depression measures
  • Benchmark for establishing concurrent validity

Special Populations Considerations

Medical patients:

  • Consider elevated cutoffs (≥16-18) due to somatic symptom overlap
  • May use cognitive-affective subscale alone
  • Interpret somatic items cautiously in chronic illness

Elderly populations:

  • Somatic item scores may be inflated by age-related changes
  • Higher cutoffs (≥16) sometimes recommended
  • Consider medical comorbidity in interpretation
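
One way to apply population-specific cutoffs is a small lookup table. This is a sketch only: the dictionary keys and function name are hypothetical, the values are the cutoffs mentioned in this document, and the choice of 16 for medical patients simply takes the lower end of the stated 16-18 range.

```python
# Screening cutoffs mentioned in this document; keys are hypothetical labels
CUTOFFS = {
    "general": 14,
    "older_adult": 16,
    "medical": 16,  # the text gives a 16-18 range; the lower bound is used here
}


def screens_positive(total_score, population="general"):
    """Return True if the total score meets the cutoff for the given population."""
    return total_score >= CUTOFFS[population]


print(screens_positive(15))                 # True
print(screens_positive(15, "older_adult"))  # False
```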

Adolescents (13+):

  • Valid using same cutoffs as adults
  • Consider developmental context and family factors
  • Integrate with parent/teacher reports when appropriate

Diverse cultural backgrounds:

  • Use culturally adapted versions when available
  • Consider cultural expression of depression symptoms
  • Use population-specific norms when established

Limitations and Cautions

  • Not diagnostic alone: Clinical interview required for definitive MDD diagnosis
  • Copyright restrictions: Must be purchased from Pearson Assessments
  • Administration time: Longer than brief screening tools (10-15 minutes)
  • Somatic symptom overlap: May inflate scores in medical illness or chronic pain
  • Reading level: Requires 5th-6th grade reading ability
  • Response bias: Can be influenced by motivation to appear more or less depressed

Copyright and Usage Responsibility: Check that you have the proper rights and permissions to use this assessment tool in your research. This may include purchasing appropriate licenses, obtaining permissions from authors/copyright holders, or ensuring your usage falls within fair use guidelines.

The BDI-II is copyrighted by Pearson Assessments and must be purchased for legal use. Unlike some public domain instruments, the BDI-II requires proper licensing for administration in clinical practice and research settings. Pearson Assessments holds exclusive rights to the inventory and provides official scoring materials, normative data, and interpretation guidelines.

Proper Attribution: When using or referencing this scale, cite the original development manual:

  • Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Beck Depression Inventory-II Manual. San Antonio, TX: Psychological Corporation.

Usage Requirements: Researchers and clinicians must obtain proper licensing through Pearson Assessments before administering the BDI-II. This includes purchasing official test materials, scoring keys, and normative data. Unauthorized reproduction or administration constitutes copyright infringement.

Academic and Research Use: Educational institutions and research organizations can obtain appropriate licensing for academic use, including student training and research projects. Contact Pearson Assessments for specific academic pricing and usage agreements.

References

Primary Development Manual:

  • Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Beck Depression Inventory-II Manual. San Antonio, TX: Psychological Corporation.

Original BDI Development:

  • Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4(6), 561-571.

Psychometric Validation Studies:

  • Dozois, D. J., Dobson, K. S., & Ahnberg, J. L. (1998). A psychometric evaluation of the Beck Depression Inventory-II. Psychological Assessment, 10(2), 83-89.
  • Steer, R. A., & Clark, D. A. (1997). Psychometric characteristics of the Beck Depression Inventory-II with college students. Measurement and Evaluation in Counseling and Development, 30(3), 128-136.
  • Beck, A. T., Steer, R. A., Ball, R., & Ranieri, W. F. (1996). Comparison of Beck Depression Inventories-IA and -II in psychiatric outpatients. Journal of Personality Assessment, 67(3), 588-597.

Comprehensive Reviews:

  • Wang, Y. P., & Gorenstein, C. (2013). Psychometric properties of the Beck Depression Inventory-II: A comprehensive review. Revista Brasileira de Psiquiatria, 35(4), 416-431.
  • Richter, P., Werner, J., Heerlein, A., Kraus, A., & Sauer, H. (1998). On the validity of the Beck Depression Inventory: A review. Psychopathology, 31(3), 160-168.

Frequently Asked Questions

What does the BDI-II measure?

The BDI-II measures the severity of depressive symptoms in adolescents and adults aged 13+. It assesses 21 symptom domains including cognitive-affective symptoms (sadness, pessimism, guilt, self-dislike) and somatic symptoms (fatigue, sleep disturbance, appetite changes) over a two-week period, providing a comprehensive depression severity score ranging from 0-63.

How long does the BDI-II take to complete?

The BDI-II typically takes 10-15 minutes to complete. It consists of 21 items, each presenting four statements reflecting increasing symptom severity. Respondents select the statement that best describes how they have felt during the past two weeks, including today.

Is the BDI-II free to use?

No, the BDI-II is not free. It is copyrighted by Pearson Assessments and must be purchased for legal use in clinical practice and research settings. Researchers and clinicians need proper licensing before administering the inventory. Unauthorized reproduction or administration constitutes copyright infringement.

How is the BDI-II scored?

The BDI-II is scored by summing all 21 item responses, each rated 0-3, yielding a total score of 0-63. Severity classifications are: 0-13 (minimal), 14-19 (mild), 20-28 (moderate), and 29-63 (severe depression). Scores ≥14 suggest clinically significant depression, and any nonzero score on Item 9 (suicidal thoughts or wishes) requires immediate clinical attention.

What's the difference between BDI-II and PHQ-9?

The BDI-II is a comprehensive 21-item severity measure grounded in Beck's cognitive theory, taking 10-15 minutes, while the PHQ-9 is a brief 9-item DSM-based screening tool taking 2-3 minutes. The BDI-II provides detailed symptom profiling and is the research gold standard, whereas the PHQ-9 is designed for quick screening in primary care. They correlate highly (r=0.73-0.84).

How reliable is the BDI-II?

The BDI-II demonstrates excellent reliability across populations. Internal consistency is α = 0.92 in psychiatric outpatients and α = 0.93 in college students, with test-retest reliability of r = 0.93 over one week. Cross-cultural studies show α = 0.84-0.94 internationally. A comprehensive review of 118 studies reports a mean coefficient alpha of 0.90, confirming consistently strong reliability across diverse populations and settings.