The Berg Balance Test: A Tool to Predict Falls in the Elderly

The Berg Balance Test is a useful tool to identify the risk of falling in the elderly.

Citation/s:
1. Riddle DL, Stratford PW. Interpreting validity indexes for diagnostic tests: an illustration using the Berg Balance Test. Phys Ther. 1999; 79: 939-948.
2. Shumway-Cook A, Baldwin M, Polissar NL, Gruber W. Predicting the probability for falls in community-dwelling older adults. Phys Ther. 1997; 77: 812-819.
3. Bogle Thorbahn LD, Newton RA. Use of the Berg Balance Test to predict falls in elderly persons. Phys Ther. 1996; 76: 576-583.

Lead Author's name and fax: Daniel Riddle (fax/e-mail not provided).


Three-part Clinical Question: In the elderly (aged 65 or older), is the Berg Balance Test (BBT) a valid test to predict the risk of falls?


Search Terms: Not applicable. The article was assigned.

The Study: A clinical perspective/review of two previous research reports.


The Study Patients: Riddle & Stratford (1999) chose two previous studies that used the BBT as a tool to indicate risk for falls.

In the study conducted by Bogle Thorbahn and Newton (1996), there were a total of 66 subjects. The mean age of the subjects was 79.2, SD 6.2, range 69-94 (in years). 24% were males and 76% were females. The average Berg Balance Test (BBT) score reported for this group was 48.2, SD 9.9, range 9-56. 17% were classified as fallers (per the gold standard).

In the study conducted by Shumway-Cook, Baldwin, Polissar and Gruber (1997), there were 44 subjects with an average age of 76.1 years, SD 6.6, range 65-94. 27% were males and 73% were females. The average BBT score for this group was 46.1, SD 10.5, with a range of 18-56. The proportion of fallers for this group was 50%. Subjects in the two studies were similar in age, gender distribution and BBT scores. However, there were more fallers (gold standard classification) in the study conducted by Shumway-Cook and colleagues.
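The pooled proportion of fallers across the two studies can be checked with a few lines of arithmetic. The sketch below assumes that each study's faller count is recovered by rounding the reported percentage to a whole number of subjects; the variable names are illustrative and do not come from the article.

```python
# Sample sizes and faller proportions as reported for each study above.
studies = {
    "Bogle Thorbahn & Newton (1996)": (66, 0.17),
    "Shumway-Cook et al. (1997)": (44, 0.50),
}

# Assumption: faller counts are the reported proportions rounded to whole subjects.
fallers = sum(round(n * p) for n, p in studies.values())
total = sum(n for n, _ in studies.values())
pooled = fallers / total

print(f"{fallers} fallers of {total} subjects = {pooled:.0%}")
```

Under that rounding assumption, the pooled figure agrees with the 30% prevalence used in the rest of this appraisal.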


Study Features: Independent, non-blind comparison with a reference (gold) standard. There was an appropriate spectrum of patients. The gold standard was applied regardless of the test result.

Target disorder and Gold Standard: Risk for falls; falling in the past 6 months.


Diagnostic test: The Berg Balance Test. Physical and occupational therapists use this test to measure a person's balance and to monitor their status or response to treatment over the course of a disease or disorder. Patients are asked to perform 14 tasks that are representative of daily activities requiring balance, such as sitting, standing, leaning over and stepping. Each task is rated on a 5-point scale, where 0 = unable to perform and 4 = normal performance. Overall scores can range from 0 (i.e., severely impaired balance) to 56 (excellent balance).

For the purpose of the study, the test results are two-level (positive or negative, based upon the score obtained).
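The scoring and the two-level rule described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the default cut-off of 40 is the one used in the comments of this appraisal, and the example ratings are invented.

```python
from typing import Sequence

N_TASKS = 14         # number of BBT tasks
MAX_ITEM_SCORE = 4   # each task is rated 0 (unable) to 4 (normal)

def bbt_total(item_scores: Sequence[int]) -> int:
    """Sum the 14 item ratings into a total score between 0 and 56."""
    if len(item_scores) != N_TASKS:
        raise ValueError(f"expected {N_TASKS} item scores, got {len(item_scores)}")
    if any(not 0 <= s <= MAX_ITEM_SCORE for s in item_scores):
        raise ValueError("each item score must be between 0 and 4")
    return sum(item_scores)

def bbt_positive(total: int, cutoff: int = 40) -> bool:
    """Return True (test positive) when the total score is below the cut-off."""
    return total < cutoff

# Hypothetical patient: these ratings are invented for illustration only.
ratings = [4, 4, 3, 3, 3, 2, 3, 2, 3, 2, 2, 2, 2, 2]
total = bbt_total(ratings)
print(total, "positive" if bbt_positive(total) else "negative")
```

Passing a different `cutoff` shows how the same total score can change from positive to negative as the threshold moves.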

The Evidence:

 

Target Disorder: Falling
Test: Berg Balance Test

                 Falling Present   Falling Absent   Likelihood Ratio
Test Result      Num (cell)        Num (cell)       (95% CI)
Positive         15 (a)            3 (b)            11.67 (3.62 to 37.61)
Negative         18 (c)            74 (d)           0.57 (0.41 to 0.78)

Sensitivity: 45%; CI: 28 to 62

Specificity: 96%; CI: 92 to 100

Prevalence: 30%; CI: 21 to 39

Positive Predictive Value: 83%; CI: 66 to 100

Negative Predictive Value: 80%; CI: 72 to 89
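The indices above follow directly from the four cell counts (a = 15, b = 3, c = 18, d = 74). The sketch below recomputes them; the appraisal does not state how the intervals were obtained, so the normal-approximation (Wald) interval, truncated to the 0-100% range, is an assumption that happens to agree with the reported figures after rounding.

```python
import math
from typing import Tuple

# Cell counts from the 2x2 table (positive test = BBT score <40).
a, b, c, d = 15, 3, 18, 74

def wald_ci(successes: int, n: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Proportion with a Wald 95% CI, truncated to the range [0, 1]."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half), min(1.0, p + half)

indices = {
    "Sensitivity": wald_ci(a, a + c),                     # true positives / all fallers
    "Specificity": wald_ci(d, b + d),                     # true negatives / all non-fallers
    "Prevalence": wald_ci(a + c, a + b + c + d),          # fallers / all subjects
    "Positive Predictive Value": wald_ci(a, a + b),       # fallers / all positive tests
    "Negative Predictive Value": wald_ci(d, c + d),       # non-fallers / all negative tests
}

for name, (p, lo, hi) in indices.items():
    print(f"{name}: {p:.0%}; CI: {lo:.0%} to {hi:.0%}")
```

The Wald interval is a simple choice for illustration; with cells as small as b = 3, other interval methods may behave better near 0% or 100%.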

Comments:

1.      83% of subjects with a positive test were correctly identified as fallers (per the gold standard) using the dichotomous rule that classifies a BBT score of <40 as positive; thus, 17% were misclassified as fallers. That is, the positive predictive value of the BBT with a cut-off score of <40 is 83%.

2.      Similarly, 80% of subjects with a negative test (using the same cut-off score) were correctly classified as non-fallers (per the gold standard). Thus, the negative predictive value of the test is 80%. (Note: In the article this was wrongly calculated as 67%.)

3.      The BBT has high specificity when a positive test is defined as a score below 40. Sensitivity (and thus the negative predictive value) increases as the cut-off score is raised. With the cut-off set at <40, a positive test identifies fallers with high confidence; with higher cut-off scores, a negative test identifies non-fallers with increasing confidence. Specificity increases, and its CI narrows, as the cut-off score is lowered, i.e., one can be more confident that the target disorder is present if the test is positive. Conversely, sensitivity increases as the cut-off score is raised, although its CI is wider, i.e., one can be more confident that the disorder is absent if the test is negative. To correctly identify non-fallers and avoid classifying a faller as a non-faller (and thus prevent adverse effects), sensitivity must be optimized, i.e., the BBT should be used with a higher cut-off score.

4.      The validity of the test was strengthened by the test being performed independently of the gold standard, and the spectrum of patients was appropriate. The threats to validity stem from the fact that the methodology does not discuss blinding of the raters in the two studies. Was this because blinding was not possible for individual tests? It would be hard for the person administering the BBT not to know its score, or not to know when a subject falls. It is also unclear whether the raters of the BBT in the two studies were blinded to whether or not a subject had fallen in the previous six months. Was there a work-up bias? Could there have been a work-up bias due to the age range of the subjects, i.e., when rating a 94-year-old versus a 65-year-old? Another unanswered question is why, although the studies had similar definitions, criteria and subjects, the study by Shumway-Cook and colleagues yielded more fallers (50%) than that of Bogle Thorbahn and Newton (17%). This difference is also interesting because in the former study, people with co-morbidities that could affect balance were excluded, while they were included in the latter.

5.      With a positive BBT (cut-off score of <40), the post-test probability of the target disorder rises to >80%, compared with the pre-test probability of 30%, which is a substantial increase. Thus, the BBT with a cut-off score of <40 is a valid tool to predict falls risk. Likewise, with a negative BBT (score of 40 or more), the negative likelihood ratio of 0.6 (CI 0.4 to 0.8) indicates that a negative result is 0.6 times as likely in a faller as in a non-faller.

6.      If a patient scores between 40 and 44 on the BBT, the positive likelihood ratio is 2.8. That is, a score in that range is 2.8 times as likely in a faller as in a non-faller. However, the 95% CI ranges from 0.9 to 8.5 and includes 1, the value at which a test result does not change the probability of the disorder. Therefore, one cannot be confident that a BBT score between 40 and 44 is predictive of falls. With a cut-off score of <40, however, the likelihood ratio increases to 11.7 (95% CI 3.6 to 37.6); a score of <40 is about 12 times as likely in a faller as in a non-faller. The positive likelihood ratio is optimal at a cut-off score of <40 and decreases with increasing cut-off scores thereafter. Therefore, with higher cut-off scores, the probability that the target disorder is actually present is diminished. Conversely, the negative likelihood ratio is optimal at a cut-off score of <35 and decreases with increasing cut-off scores. This shows that with higher cut-off scores, a negative test demonstrates that the target disorder is absent with greater certainty.

7.      Clinical Relevance of the Test:

A test that increases the post-test probability to >80% (as with the cut-off score of <40, where the positive likelihood ratio is optimal at 11.7), compared with the pre-test probability of 30% in the study, is certainly desirable and useful for identifying patients at high risk for falls. Higher cut-off scores decrease the post-test probability to below 60%, thus increasing the uncertainty of the diagnosis. However, as discussed earlier, higher cut-off scores (greater sensitivity) will more reliably avoid misclassification of a faller as a non-faller. A negative test with a higher cut-off score will mean that a patient is less likely to be a faller.

In my opinion, given its ease, simplicity, safety and cost-effectiveness of administration, the Berg Balance Test could be a strong diagnostic tool to identify fallers in the elderly population. It can be used as a strong prognostic tool to measure outcomes of interventions as well.
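The likelihood ratios and intervals reported for the <40 cut-off can be recomputed from the 2x2 cell counts. The sketch below uses the common log-method interval for a ratio of two proportions; the appraisal does not name its interval method, so this choice is an assumption, and the function name is illustrative. The cell counts for the 40 to 44 score band are not given here, so that interval is not recomputed.

```python
import math
from typing import Tuple

def likelihood_ratio_ci(num_events: int, num_total: int,
                        den_events: int, den_total: int,
                        z: float = 1.96) -> Tuple[float, float, float]:
    """Ratio of two proportions with a log-method 95% CI."""
    lr = (num_events / num_total) / (den_events / den_total)
    # Standard error of ln(LR) for two independent binomial proportions.
    se = math.sqrt(1 / num_events - 1 / num_total + 1 / den_events - 1 / den_total)
    lower = math.exp(math.log(lr) - z * se)
    upper = math.exp(math.log(lr) + z * se)
    return lr, lower, upper

# Cell counts from the 2x2 table (positive test = BBT score <40).
a, b, c, d = 15, 3, 18, 74

# LR+ = sensitivity / (1 - specificity); LR- = (1 - sensitivity) / specificity.
lr_pos = likelihood_ratio_ci(a, a + c, b, b + d)
lr_neg = likelihood_ratio_ci(c, a + c, d, b + d)

print("LR+ = {:.2f} ({:.2f} to {:.2f})".format(*lr_pos))
print("LR- = {:.2f} ({:.2f} to {:.2f})".format(*lr_neg))
```

An interval that includes 1 would indicate a result that may not shift the probability of falling in either direction, which is the concern raised for scores between 40 and 44.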

Appraised by: Joe Wells, OT.

AmeriCare Health Services, LLC. 502 Clinton Street, Defiance, OH 43512. Fax: (419) 782 0105. Email: joewells@americare-health.com

Tuesday, June 14, 2005

Kill or update by: 01/01/2006

Particular to my patient:

Pre-test probability: 30%

Test Result   Post-test probability
Positive      83%
Negative      20%
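The post-test probabilities follow from the pre-test probability and the likelihood ratios by converting to odds and back. A minimal sketch, assuming the 30% pre-test probability and the cell counts from the 2x2 table; the function name is illustrative.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

pre_test = 0.30                      # pre-test probability (prevalence in the studies)
lr_pos = (15 / 33) / (3 / 77)        # sensitivity / (1 - specificity)
lr_neg = (18 / 33) / (74 / 77)       # (1 - sensitivity) / specificity

prob_if_positive = post_test_probability(pre_test, lr_pos)
prob_if_negative = post_test_probability(pre_test, lr_neg)

print(f"Positive test: {prob_if_positive:.0%}")
print(f"Negative test: {prob_if_negative:.0%}")
```

For a patient whose pre-test probability differs from 30%, changing `pre_test` gives the corresponding post-test estimates under the same likelihood ratios.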