Application of the velocity profile method is recommended for reliable measurement of flow volume in larger vessels, and ultrasonic flowmetry is a useful clinical tool for this purpose. We used the velocity profile method with a minor modification of the conventional technique and examined the reproducibility of flowmetry from color Doppler data. Data from three examiners were used to analyze intraobserver reproducibility and interobserver agreement in the common carotid artery, and we measured flow volume in the peripheral vessels of healthy individuals. Estimated flow volumes in five healthy examinees were 350 to 550 ml/min and did not vary significantly between examiners. Interobserver correlation was good (r 1=0.63), but intraobserver correlation in two sonographers was excellent (r 1=0.85) in the one who was experienced in this method and poor (r 1=0.32) in the other. Good interobserver agreement and intraobserver reproducibility of experienced examiners suggest that this
BACKGROUND: The incidence of incidental pulmonary embolism (IPE) in cancer patients is increasing. There is scant information on the interobserver agreement among radiologists about the diagnosis of distal incidental clots and the actual radiologic extension of IPE. METHODS: A total of 88 contrast-enhanced computed tomography (CT) scans of cancer patients with IPE were reassessed blindly by two expert thoracic radiologists. First, 62 scans were reassessed and the interobserver agreement on most proximal extent of IPE was calculated between the two expert radiologists as well as between the initial and expert reading, using the kappa statistic. The sample was enriched with 26 additional scans for a total of 30 segmental and 29 subsegmental IPE to determine the interobserver agreement on distal clots. RESULTS: The level of agreement regarding the most proximal extent of IPE between the expert radiologists was very good (kappa 0.84; 95% CI, 0.73-0.95) and poor between the original radiologist and ...
Five observers using the Jensen modification of the Evans classification and the AO classification (with and without subgroups) classified the radiographs of 88 trochanteric hip fractures. Each observer classified the radiographs independently on two occasions 3 months apart. Kappa statistical analysis was used for determination of intra- and inter-observer variation. For the Jensen classification, the mean kappa value was 0.52 (range: 0.44-0.60) for intra-observer variation and 0.34 (range: 0.17-0.38) for inter-observer variation. For the AO system with subgroups, the mean kappa value was 0.42 (range: 0.20-0.65) for intra-observer variation and 0.33 (range: 0.14-0.48) for inter-observer variation. For the AO classification system without subgroups, the mean kappa value was 0.71 (range: 0.60-0.81) for intra-observer variation and 0.62 (range: 0.50-0.71) for inter-observer variation. We recommend classifying trochanteric fractures into three groups as that of the AO system without the subgroups. For ease
Interobserver variation in target volume for salvage radiotherapy in recurrent prostate cancer patients after radical prostatectomy using CT versus combined CT and MRI: A multicenter study (KROG 13-11). Authors: Eonju Lee, Won Park, Sung Hwan Ahn, Jae Ho Cho, Jin Hee Kim, Kwan Ho Cho, Young Min Choi, Jae Sung Kim, Jin Ho Kim, Hong Seok Jang, Young Seok Kim, Taek Keun Nam. Published March 2018. Publisher copyright: © 2018 The Korean Society for Radiation Oncology. Purpose: To investigate interobserver variation in target volume delineations for prostate cancer salvage radiotherapy using planning computed tomography (CT) versus combined planning CT and magnetic resonance imaging (MRI). Materials and Methods: Ten radiation oncologists independently delineated a target volume on the planning CT scans of five cases with different pathological status after radical prostatectomy. Two weeks later, this ...
Intra- and inter-observer variability in contouring prostate and seminal vesicles: Implications for conformal treatment planning. Authors: Claudio Fiorino, Michele Reni, Angelo Bolognesi, Giovanni Mauro Cattaneo, Riccardo Calandrino. Published June 1998. Background and purpose: Accurate contouring of the clinical target volume (CTV) is a fundamental prerequisite for successful conformal radiotherapy of prostate cancer. The purpose of this study was to investigate intra- and inter-observer variability in contouring prostate (P) and seminal vesicles (SV) and its impact on conformal treatment planning in our working conditions. Materials and methods: Inter-observer variability was investigated by asking five well-trained radiotherapists to contour the P and the SV on CT images of six supine-positioned patients previously treated with conformal techniques. Short-term intra-observer variability was assessed by asking the radiotherapists to ...
Objectives: To evaluate inter-observer agreement for microscopic measurement of inflammation in synovial tissue using manual quantitative, semiquantitative and computerised digital image analysis. Methods: Paired serial sections of synovial tissue, obtained at arthroscopic biopsy of the knee from patients with rheumatoid arthritis (RA), were stained immunohistochemically for T lymphocyte (CD3) and macrophage (CD68) markers. Manual quantitative and semiquantitative scores for sub-lining layer CD3+ and CD68+ cell infiltration were independently derived in 6 international centres. Three centres derived scores using computerised digital image analysis. Inter-observer agreement was evaluated using Spearman's rho and intraclass correlation coefficients (ICCs). Results: Paired tissue sections from 12 patients were selected for evaluation. Satisfactory inter-observer agreement was demonstrated for all 3 methods of analysis. Using manual methods, ICCs for measurement of CD3+ and CD68+ cell infiltration ...
OBJECTIVE: Accurate and consistent outcome assessment is essential to randomized clinical trials. We aimed to explore observer variation in the assessment of outcome in a recently completed trial of dexanabinol in head injury and to consider steps to reduce such variation. METHODS: Eight hundred sixty-one patients with severe traumatic brain injury who were admitted to 86 centers were included in a multicenter, placebo-controlled, Phase III trial. Outcome was assessed at 3 and 6 months postinjury using the extended Glasgow Outcome Scale; standardized assessment was facilitated by the use of a structured interview. Before initiation of trial centers, outcome ratings were obtained for sample cases to establish initial levels of agreement. Training sessions in outcome assessment were held, and problems in assigning outcome were investigated. During the trial, a process of central review was established to monitor performance. Interobserver variation was analyzed using the κ statistic. RESULTS: ...
Radiological sacroiliitis in Behçet's syndrome (BS) has been a subject of controversy. We have examined pelvic radiographs of 38 patients with BS and 28 age and sex matched controls which we reported previously, and also 17 with ankylosing spondylitis (AS), 27 with non-renal familial Mediterranean fever (FMF), and 33 with primary osteoarthrosis (OA). Initially, five observers assessed radiographs on two different occasions according to the New York criteria for sacroiliitis in a blind protocol. Later, three of them examined the various possible abnormalities of the sacroiliac (SI) joints after training sessions. Although the inter- and intraobserver variation was quite high, all observers found the expected changes in patients with AS. The abnormalities detected in the other diseases were either mild, inconsistent, or both. Erosions were confined to patients with AS, and osteophytes and glenoid sulci to patients with OA. We conclude that high observer variation in interpreting a film of the ...
Inter-observer variation is a well-known problem in medical practice. Gardenia et al. first reported on this issue in the 1950s [3], and it became a subject for discussion in the radiotherapeutic community in the 1970s. In the 1990s, many articles were published about inter-observer variation for a variety of cancers: prostate cancer [4], brain tumors [5], breast cancer [6], head and neck cancer [12, 13], and lung cancer [14, 15]. However, we were unable to find any papers that examined inter-observer variation for pituitary adenoma and meningioma; to the best of our knowledge, this is the first such report. DVH analysis by superimposing different contours from multiple clinicians onto the default treatment plan showed a higher maximal dose for the optic tract (Figure 3). It increased to 23.64 Gy (268% higher dose than the default plan) for the pituitary adenoma and 19.39 Gy (131%) for the meningioma. These results imply that contour deviations across plans could easily cause unexpectedly higher ...
The variation between two observers in grading 100 biopsies and the corresponding main specimens of rectal carcinomas has been examined. Using kappa statistics, which take account of chance agreement, we found a highly significant level of agreement. As expected, higher levels were obtained for intraobserver agreement. However, disagreements between observers were in many instances haphazard and there were differences in bias between them. Fifty paired biopsies and main tumours were graded by five observers and the results analysed for bias and by kappa statistics for overall and conditional agreement. These methods revealed significant overall agreement but the levels for some observer pairs did not differ significantly from chance. Examination for observer bias indicated differing standards of grading, and haphazard disagreements reached high levels for some observer pairs. The intraobserver agreement between the grade of the biopsy and the corresponding main tumour varied from 56-69% but ...
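The idea that kappa "takes account of chance agreement" can be made concrete with a minimal sketch of Cohen's kappa for two observers. The grades below are hypothetical and are not data from the study; the function name `cohen_kappa` is chosen here for illustration only.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters grading the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same grade.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal grade frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical grades for ten biopsies (illustrative only).
observer_1 = ["well", "mod", "mod", "poor", "mod", "well", "poor", "mod", "mod", "well"]
observer_2 = ["well", "mod", "poor", "poor", "mod", "mod", "poor", "mod", "well", "well"]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

Here the two observers agree on 7 of 10 biopsies, but 35% agreement would be expected by chance from their marginal frequencies, so kappa is noticeably lower than the raw agreement rate.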
Background and purpose: Substantial inter-observer variations in target delineation have been presented previously. Target delineation for paediatric cases is difficult due to the small number of children, the variation in paediatric targets, the number of study protocols, and the individual patient's specific needs and demands. Uncertainties in target delineation might lead to under-dosage or over-dosage. The aim of this work is to apply the concept of a consensus volume and good quality treatment plans to visualise and quantify inter-observer target delineation variations in dosimetric terms in addition to conventional geometrically based volume concordance indices. Material and methods: Two paediatric cases were used to demonstrate the potential of adding dose metrics when evaluating target delineation diversity: Hodgkin's disease (case 1) and rhabdomyosarcoma of the parotid gland (case 2). The variability in target ...
BACKGROUND AND PURPOSE: To determine the extent of inter-observer variation in delineation of the heart and left anterior descending coronary artery (LADCA) and its impact on estimated doses. METHODS AND MATERIALS: Nine observers from five centres delineated the heart and LADCA on fifteen patients receiving left breast radiotherapy. The delineations were carried out twice, first without guidelines and then with a set of common guidelines. RESULTS: For the heart, most spatial variation in delineation was near the base of the heart whereas for the LADCA most variation was in its length at the apex of the heart. Common guidelines reduced the spatial variation for the heart and the length of the LAD, but increased the variation in the anterior-posterior/right-left plane. The coefficients of variation (CV) in the estimated doses to the heart were: mean dose 7.5% without and 3.6% with guidelines, maximum dose 8.7% without and 4.0% with guidelines. The CVs in the estimated doses to the LADCA were: mean dose 27
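As a rough illustration of the coefficient of variation (CV) reported above, the sketch below computes it as the standard deviation divided by the mean, expressed as a percentage. The dose values are hypothetical, and the use of the sample (n-1) standard deviation is an assumption; the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical mean heart doses (Gy) estimated from nine observers'
# delineations of one patient (illustrative values only).
mean_heart_doses = [2.0, 2.1, 1.9, 2.2, 2.0, 1.8, 2.1, 2.0, 1.9]
cv_percent = coefficient_of_variation(mean_heart_doses)
print(f"CV = {cv_percent:.1f}%")
```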
Objective: To analyse inter-observer variation between a neuroradiologist and neurosurgeon in the MRI diagnosis of lumbar nerve root compression. Although lumbar MRI is primarily analyzed and reported by a radiologist, neurosurgeons often analyse it independently as they have sufficient clinical background as well as radiological expertise to diagnose most spinal pathologies on Magnetic Resonance Imaging (MRI). Methods: Retrospective analysis was carried out for images of 54 patients who underwent MRI between March and July 2010 of lumbar spine with suspected lumbar disc herniation and nerve root compression, at Aga Khan Hospital, Karachi, Pakistan. One fellowship trained neuroradiologist and one neurosurgeon evaluated the images on PACS system separately. Both observers were unaware of the patient's clinical history and each other's findings. Lumbar discs at L3-L4, L4-L5 and L5-S1 levels were evaluated by both observers for disc disease and nerve compression. Findings were recorded on a proforma and
The most important issue in the current study is the presentation of a novel eHealth tool, the D-Foot, intended for use by CPOs. The results of the current study show a high level of agreement for the risk classification (inter-rater agreement 0.83, pooled kappa 0.31, varying from 0.16 to 1.00 at single departments) (Table 2). The corresponding intra-rater agreement was 0.88 (pooled kappa 0.63, varying from 0.42 to 1.00 at single departments). A high degree of inter- and intra-rater reliability was found for the presence of Charcot foot deformity and amputation (agreement of >0.90, kappa >0.73) [30]. These risk factors are easy to detect by visual inspection. The agreement between the observers was adequate when it came to the Ipswich Touch Test and hallux valgus/varus, all of which showed an agreement between 0.79 and 0.86 and a kappa of >0.56. As expected, the intra-rater agreement was generally higher than the inter-rater agreement. Measurements of foot length and width (Table 3) using a foot ...
Purpose: PET/CT is a standard imaging modality used in the delineation of gross tumor volume (GTV) for radiation therapy of lung tumors. However, PET/CT has some limitations, such as resolution and the standardized uptake value threshold. Moreover, chest MRI has shown good potential in diagnosis for thoracic oncology. Therefore, we investigated the influence of chest MRI on inter-observer variability of GTV delineation. Methods and Materials: Five observers contoured the GTV on CT for 14 poorly defined lung tumors during three contouring phases based on true daily clinical routine and acquisition: CT phase, with only CT images; PET phase, with PET/CT; and MRI phase, with both PET/CT and MRI. Observers waited at least 1 week between phases to decrease memory bias. Contours were compared using descriptive statistics of volume, coefficient of variation (COV), and Dice similarity coefficient (DSC). Results: MRI phase volumes (median 4.8 cm³) were significantly smaller than PET phase volumes
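The Dice similarity coefficient used to compare contours can be sketched as twice the overlap divided by the sum of the two volumes. The example below assumes each contour has already been rasterised into a set of voxel coordinates; the contours themselves are hypothetical squares on a small grid, not study data.

```python
def dice_similarity(contour_a, contour_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        return 1.0  # two empty contours are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


# Two hypothetical 4 x 4 voxel contours, shifted by one voxel in each direction.
observer_1_gtv = {(x, y) for x in range(0, 4) for y in range(0, 4)}
observer_2_gtv = {(x, y) for x in range(1, 5) for y in range(1, 5)}
dsc = dice_similarity(observer_1_gtv, observer_2_gtv)
print(f"DSC = {dsc:.4f}")
```

The two contours have 16 voxels each and share 9, giving a DSC of 18/32. A value of 1 would mean identical contours and 0 no overlap at all.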
In breast cancer, there is a growing body of evidence that tumor-infiltrating lymphocytes (TILs) may have clinical utility and may be able to direct clinical decisions for subgroups of patients. Clinical utility is, however, not sufficient for warranting the implementation of a new biomarker in the routine practice, and evaluation of the analytical validity is needed, including testing the reproducibility of decentralized assessment of TILs. The aim of this study was to evaluate the inter-observer agreement of TILs assessment using a standardized method, as proposed by the International TILs Working Group 2014, applied to a cohort of breast cancers reflecting an average breast cancer population ...
The observer reliability of VLS is fair to good with intraobserver reliability being better than interobserver reliability. This supports the use of VLS for detection of gastrointestinal ischemia. ...
The Kappa values for the first series of ten carcinomas of various degrees of differentiation showed good to very good agreement for MIB-1-LI (Kappa 0.56-0.72). However, we found very high inter-observer variabilities (Kappa 0.04-0.14) in the read-outs of the G2 carcinomas. It was not possible to explain the inconsistencies exclusively by any of the following factors: (i) pathologists' divergent definitions of what counts as a positive nucleus, (ii) the mode of assessment (counting vs. eyeballing), (iii) immunostaining technique, and (iv) the selection of the tumor area in which to count. Despite intensive confrontation of all participating pathologists with the problem, inter-observer agreement did not improve when the same slides were re-examined 4 months later (Kappa 0.01-0.04) and intra-observer agreement was likewise poor (Kappa 0.00-0.35 ...
Nonblinded assessors of subjective measurement scale outcomes in randomized clinical trials tended to generate substantially biased effect sizes. Standardized mean differences were exaggerated by a pooled standard deviation of 0.23 (95% CI 0.40 to 0.06) or, in relative terms, by 68% (95% CI 14% to 230%). Observer bias can be perceived as the result of the interaction between observers' predispositions and the subjectivity of the outcome. Predispositions are likely to differ substantially from observer to observer and from trial to trial. In some trials, conscientious nonblinded assessors may overcompensate for an expected bias in favour of the experimental intervention and paradoxically induce a bias favouring the control, whereas other trials will have fairly neutral assessors with no important bias. Thus, the degree of observer bias in trials with clearly predisposed outcome assessors is likely to be considerably higher than the mean we see here, which is based on all of the included trials. ...
Reproducibility of grading H pylori related gastritis is high using the updated Sydney system. Despite the novel criteria for scoring atrophy, there was imperfect agreement on this feature between two independent histopathologists.
PURPOSE: We aimed to determine the intra- and interobserver agreement on the software analysis of very low dose hepatic perfusion CT (pCT). METHODS: A total of 53 pCT examinations were obtained from 21 patients (16 men, 5 women; mean age, 60.4 years) with proven liver metastasis from various primary cancers. The pCT examinations were analyzed by two readers independently and perfusion parameters were noted for whole liver, whole metastasis, metastasis wall, and normal-looking liver (liver tissue without metastasis) in regions of interest (ROIs). Readers repeated the analysis after an interval of one month. Intra- and interobserver agreements were assessed with intraclass correlation coefficients (ICC) and Bland-Altman statistics. RESULTS: The mean ICCs of all ROIs between readers were 0.91, 0.93, 0.86, 0.45, 0.53, and 0.66 for blood flow (BF), blood volume (BV), permeability, arterial liver perfusion (ALP), portal venous perfusion (PVP) and hepatic perfusion index (HPI), respectively. The mean ...
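Bland-Altman statistics summarise paired measurements by the mean difference (bias) and the 95% limits of agreement. The sketch below uses the conventional bias ± 1.96 SD of the differences; the blood flow values are hypothetical, and the abstract does not state exactly how its limits were computed.

```python
from statistics import mean, stdev


def bland_altman(reader_1, reader_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(reader_1) != len(reader_2):
        raise ValueError("both readers must measure the same cases")
    differences = [a - b for a, b in zip(reader_1, reader_2)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample SD of the differences
    return bias, bias - spread, bias + spread


# Hypothetical blood flow readings from two readers (illustrative only).
reader_1_bf = [100.0, 110.0, 120.0, 130.0, 140.0]
reader_2_bf = [98.0, 112.0, 117.0, 131.0, 137.0]
bias, lower, upper = bland_altman(reader_1_bf, reader_2_bf)
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```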
Results: Analysis of diagnosis for all circulations and all readers gave a composite κ value of 0.86 and pairwise-weighted κ (κp-w) value of 0.91, both regarded as almost perfect agreement. This was due to the high proportion of responses that showed partial agreement. Analysis of Gleason Sum Score gave κ=0.38 and κp-w=0.58 over all circulations and all readers, indicating that discrepancies occur at the boundary between adjacent grades and may not be as clinically significant as suggested by composite κ. ...
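The gap between an unweighted and a weighted κ when disagreements fall between adjacent grades can be shown with a generic two-rater example. This is a sketch of linearly weighted Cohen's kappa, which is not necessarily the composite or pairwise-weighted statistic used in the study; the Gleason sum scores below are hypothetical.

```python
def cohen_kappa_ordinal(rater_a, rater_b, categories, linear=False):
    """Cohen's kappa; with linear=True, near misses earn partial credit."""
    k, n = len(categories), len(rater_a)
    index = {c: i for i, c in enumerate(categories)}
    table = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        table[index[a]][index[b]] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Linear weights scale with distance between grades;
        # unweighted kappa treats every mismatch as a full disagreement.
        if linear:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * table[i][j] for i, j in cells) / n
    expected = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells) / (n * n)
    if expected == 0:
        return 1.0  # no disagreement is possible by chance
    return 1.0 - observed / expected


# Hypothetical Gleason sum scores from two readers (illustrative only);
# every disagreement is between adjacent grades.
grades = [6, 7, 8, 9]
reader_1 = [6, 6, 7, 7, 7, 8, 8, 9]
reader_2 = [6, 7, 7, 7, 8, 8, 9, 9]
kappa_unweighted = cohen_kappa_ordinal(reader_1, reader_2, grades)
kappa_weighted = cohen_kappa_ordinal(reader_1, reader_2, grades, linear=True)
print(f"unweighted = {kappa_unweighted:.2f}, weighted = {kappa_weighted:.2f}")
```

Because all three disagreements in this example are off by a single grade, the weighted value is clearly higher than the unweighted one, which mirrors the pattern described for the Gleason Sum Score.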
Objective: To report the agreement between gray-scale intravascular ultrasound (GS-IVUS) and optical coherence tomography (OCT) in assessing the bioresorbable vascular scaffolds (BVS) structures and their respective reproducibility. Background: BVS are composed of an erodible polymer. Ultrasound and light signals backscattered from polymeric material differ from those of metallic stents using GS-IVUS and OCT. Methods: Forty-five patients included in the ABSORB trial were treated with a 3.0 × 18 mm BVS and imaged with GS-IVUS 20 MHz and OCT post-implantation. Qualitative (ISA, side-branch struts, protrusion, and dissections) and quantitative (number of struts, lumen, and scaffold area) measurements were assessed by two investigators. The agreement and the inter- and intraobserver reproducibility were investigated using the kappa (κ) and the intraclass correlation coefficient (ICC). Results: GS-IVUS and OCT agreement was predominantly poor at a lesion, frame, and strut level analysis (κ and ICC <0.4) ...
Abstract. The purpose of this pilot study was to assess whether orthodontic treatment planning is reproducible when carried out using digital records compared with clinical examinations or using standard records. The study also assessed patients' opinions of face-to-face consultations and potential use of teleorthodontics. The study was designed as a prospective observational cross-sectional pilot study and carried out in a UK dental teaching hospital involving 27 subjects. Four consultant orthodontists carried out treatment planning, firstly following a clinical examination, then using standard records, and then using digital records. Each subject completed a questionnaire. Cohen's kappa coefficient and Fleiss' kappa coefficient were used to assess intra-observer reproducibility and inter-observer reproducibility of treatment planning decisions, respectively. A change in the diagnostic information format affected treatment planning reproducibility for half of the observers. Inter-observer ...
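Fleiss' kappa extends chance-corrected agreement to more than two raters. The following is a minimal sketch assuming every subject is rated by the same number of raters; the treatment decisions are hypothetical and do not come from the study.

```python
from collections import Counter


def fleiss_kappa(ratings_per_subject):
    """Fleiss' kappa; each inner list holds one subject's ratings."""
    n_subjects = len(ratings_per_subject)
    n_raters = len(ratings_per_subject[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings_per_subject):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    category_totals = Counter()
    per_subject_agreement = []
    for ratings in ratings_per_subject:
        counts = Counter(ratings)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this subject.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        per_subject_agreement.append(agreeing / (n_raters * (n_raters - 1)))
    p_observed = sum(per_subject_agreement) / n_subjects
    total = n_subjects * n_raters
    p_chance = sum((c / total) ** 2 for c in category_totals.values())
    if p_chance == 1.0:
        return 1.0  # only one category was ever used
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical decisions by four orthodontists for five subjects
# ("E" = extraction plan, "N" = non-extraction plan); illustrative only.
decisions = [
    ["E", "E", "E", "E"],
    ["E", "E", "E", "N"],
    ["N", "N", "N", "N"],
    ["E", "E", "N", "N"],
    ["N", "N", "N", "E"],
]
kappa_fleiss = fleiss_kappa(decisions)
print(f"Fleiss' kappa = {kappa_fleiss:.2f}")
```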
The aim was to assess intraobserver reliability of a new semi-automated technique of embryo volumetry. Power calculations suggested 46 subjects with viable, singleton pregnancies were required for reliability analysis. Crown rump length (CRL) of each
Background: Interim PET/CT is widely performed in lymphoma patients in clinical practice and clinical trials. Visual assessment using a 5-point scale is proposed for PET/CT interpretation, but intra- and inter-observer variation is not fully investigated. Purpose: To investigate intra- and inter-observer variations in the reporting of interim positron emission tomography/computed tomography (PET/CT) in lymphoma patients, and the influence of clinical information on the interpretation. Material and Methods: Three expert readers from different institutions interpreted interim PET/CT images of 42 consecutive patients with malignant lymphoma twice, with and without clinical information ...
We investigated the interrater reliability and accuracy of two independent medical doctors in using NINCDS/ADRDA criteria to classify 82 elderly subjects enrolled in OPTIMA, a longitudinal study investigating dementia. Kappa statistics revealed moderate agreement (0.5) in overall classification of dementia type, and almost perfect agreement (0.9) on the absence or presence of dementia. Combining NINCDS/ADRDA possible and probable Alzheimer's disease (AD) categories produced substantial agreement (0.7). Comparison with CERAD histopathological criteria for AD showed that combining possible and probable AD resulted in a high sensitivity and accuracy, but a low specificity. To increase specificity, the NINCDS/ADRDA probable AD category should be used alone. An important finding was that the accuracy of diagnoses of AD made from the case notes alone was not different from the diagnoses obtained following active involvement with participants.
OBJECTIVES: Lung-RADS represents a categorical system published by the American College of Radiology to standardise management in lung cancer screening. The purpose of the study was to quantify how well readers agree in assigning Lung-RADS categories to screening CTs; secondary goals were to assess causes of disagreement and evaluate its impact on patient management. METHODS: For the observer study, 80 baseline and 80 follow-up scans were randomly selected from the NLST trial covering all Lung-RADS categories in an equal distribution. Agreement of seven observers was analysed using Cohen's kappa statistics. Discrepancies were correlated with patient management, test performance and diagnosis of malignancy within the scan year. RESULTS: Pairwise interobserver agreement was substantial (mean kappa 0.67, 95% CI 0.58-0.77). Lung-RADS category disagreement was seen in approximately one-third (29%, 971/3360) of reading pairs, resulting in different patient management in 8% (278/3360). Out of the 91 ...
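The figure of 3360 reading pairs follows from the study design: seven observers form 21 distinct pairs, and each pair reads all 160 scans. A short check of that arithmetic and of the reported percentages, using only the numbers given in the abstract:

```python
from math import comb

observers = 7
scans = 80 + 80  # baseline plus follow-up scans

pairs_per_scan = comb(observers, 2)     # distinct observer pairs
reading_pairs = pairs_per_scan * scans  # pairwise comparisons overall

category_disagreement = 971 / reading_pairs
management_disagreement = 278 / reading_pairs

print(pairs_per_scan, reading_pairs)
print(f"{category_disagreement:.1%} category disagreement")
print(f"{management_disagreement:.1%} management disagreement")
```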
The results of this study showed a good to fair intraexaminer agreement of the 2 types of landmarks analyzed. The 2 examiners proved to be in good, moderate, and fair agreement when they classified the roof of the mandibular canal and the mental foramen. Because the intraexaminer agreement was classified as fair and moderate in some instances, it was not possible to have an interexaminer agreement. To achieve interexaminer agreement, it is necessary to have at least a good intraexaminer agreement, which was not achieved in this study. The interpretations of both examiners were similar in relation to the detection of MCR and MF with good agreement of the left side. This may be explained by the preradiographic interpretation calibration that was advised by an experienced radiologist and supported by the confidence intervals results. On the other hand, the agreement for both examiners and for both landmarks of the right sides was not similar. The results showed similar tendencies; ...
Several studies [2,7,15,16] have analyzed the intra-rater reliability of the 6MWT; therefore, this test has been considered reliable for assessing functional capacity in patients with COPD after a practice test. However, there is a lack of studies verifying the inter-rater reliability for this population. The intra-rater 6MWT reliability in our study presented ICC values for walked distance >0.75, indicating excellent reliability. This analysis has already been studied in subjects with chronic respiratory disease by many authors, who found ICC values ranging from 0.82 to 0.99 [7,12,14,15,33-35], confirming the findings of our study. The studies mentioned above were conducted with COPD [7,15,34], with obstructive disease and restrictive lung diseases [12], and with lung disease in the final stage [35]. The last 2 studies did not perform the second 6MWT with an interval of 30 min after the first 6MWT, according to the standards of the ATS/ERS [7,14]. Furthermore, we found low coefficient of variation values (0.06), ...
In order to conduct studies on shared decision-making (SDM) and to implement SDM in routine practice, psychometrically tested measures are needed. The development of the short 5-item version of the OPTION scale (Observer OPTION5) allows SDM to be assessed from an observer perspective. Observer OPTION5 is so far only available in English and Dutch. The aim of this study was to translate the Observer OPTION5 rating scale into German and to test its psychometric properties. The German Observer OPTION5 was tested in a secondary data analysis of audio-recordings of patient-physician consultations (N = 79) in German primary care practices. Demographic data were analysed using descriptive statistics. To assess inter- and intra-rater reliability, intraclass correlation coefficients (ICCs) were calculated. For assessing concurrent validity, a correlation (Spearman's rho) of the sum score of Observer OPTION5 and Observer OPTION12 was calculated. The consultations dealt with decisions regarding type 2 diabetes (N = 31)
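Several ICC variants exist, and the abstract does not say which one was used. As one possible illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), from the mean squares of a subjects-by-raters table. The scores are illustrative and are not OPTION5 ratings.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject, one column per rater."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters
    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Illustrative scores: six subjects, each rated by the same four raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(ratings)
print(f"ICC(2,1) = {icc:.2f}")
```

In this example the raters rank subjects fairly consistently but differ systematically in level, so the absolute-agreement ICC is low. A consistency-type ICC on the same table would be higher.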
With more prevalent gastroesophageal reflux disease comes increased cases of Barrett's esophagus and esophageal adenocarcinoma. Image-enhanced endoscopy using linked-color imaging (LCI) differentiates between mucosal colors. We compared LCI, white light imaging (WLI), and blue laser imaging (BLI) in diagnosing reflux esophagitis (RE). Consecutive RE patients (modified Los Angeles [LA] classification system) who underwent esophagogastroduodenoscopy using WLI, LCI, and BLI between April 2017 and March 2019 were selected retrospectively. Ten endoscopists compared WLI with LCI or BLI using 142 images from 142 patients. Visibility changes were scored by endoscopists as follows: 5, improved; 4, somewhat improved; 3, equivalent; 2, somewhat decreased; and 1, decreased. For total scores, ≥40 points was considered improved visibility, 21-39 points was comparable to white light, and ≤20 points equaled decreased visibility. Inter- and intra-rater reliabilities (intraclass correlation coefficient [ICC]) were also
The purpose of this study was to examine the interrater reliability of a new evidence-based classification system for Para Va'a. Twelve Para Va'a athletes were classified by three classifier teams each consisting of a medical and a technical classifier. Interrater reliability was assessed by calculating intraclass correlation for the overall class allocation and total scores of trunk, leg, and on-water test batteries and by calculating Fleiss' kappa and percentage of total agreement in the individual tests of each test battery. All classifier teams agreed with the overall class allocation of all athletes, and all three test batteries exhibited excellent interrater reliability. At a test level, agreement between classifiers was almost perfect in 14 tests, substantial in four tests, moderate in four tests, and fair in one test. The results suggest that a Para Va'a athlete can expect to be allocated to the same class regardless of which classifier team conducts the classification. ...
OBJECTIVES: To assess the degree of interobserver agreement of MRI in the diagnostic assessment of pancreatic cysts (PCs). METHODS: Magnetic resonance imaging sets of images of 62 patients with PCs (32 with histological confirmation and 30 with clinical diagnosis) were reviewed by 4 experienced radiologists. Features scored included septations, nodules, solid components, pancreatic duct communication, and wall thickening (>2 mm). Radiologists were asked whether they considered the PC mucinous and if the PC was suspicious for malignancy. Furthermore, they had to choose a classifying diagnosis. Intraclass correlation coefficient (ICC) was used to measure agreement within the group. RESULTS: Interobserver agreement for septations and nodules was fair (ICC, 0.36 and 0.23, respectively). Agreement for the presence of solid components was fair (ICC, 0.23), agreement for communication with the pancreatic duct was moderate (ICC, 0.53), and agreement for wall thickening was moderate (ICC, 0.44). There ...
Despite the many studies on venous haemodynamics using duplex, only a few evaluated the normal values, variability and reproducibility. Therefore, the range and variability of venous diameter, compressibility, flow and reflux were measured. To obtain normal values, 42 healthy individuals (42 limbs, 714 vein segments) with no history of venous disease were scanned by duplex. To determine the reproducibility the intra-observer variability was measured in 11 healthy individuals (187 vein segments) and the inter-observer variability in 15 healthy individuals (255 vein segments) and 13 patients (169 vein segments) previously diagnosed with deep venous thrombosis. Of the 714 normal vein segments, 708 (99%) were traceable, including the crural veins. Of the traceable vein segments, 675 (95%) were compressible and in 696 (98%) flow was present. Of the 42 common femoral vein segments, in 25 (60%) the reflux duration exceeded 1.0 s, but in the other proximal vein segments the reflux duration was less than ...
In this study, we found a poor correlation between naked-eye assessment of the CR time and qCR time measures in both laymen and clinical staff. Further, we observed poor naked-eye intra-observer repeatability and interobserver agreement by clinical staff in their assessment of CR time. The use of a categorical evaluation of time measurement did not improve agreement between naked-eye estimations and machine-derived classifications. It is self-evident to most clinicians that different observers, not only in regard to the CR test, often disagree in clinical assessments based on naked-eye observation [24]. Previous studies on the reliability of the CR test have partially addressed this by showing a lack of interobserver agreement, but neither performance on the task to actually determine return to normal skin colour, nor the intra-observer repeatability for a group of observers on a standardised set of cases has been assessed previously [11, 28, 29]. We have added the use of an objective technique to ...
Background: Stigmata of hemorrhage predict rebleeding and outcome of patients with bleeding peptic ulcers. There are variabilities in reported incidences of stigmata and their respective rebleeding risks. We sought to study the interobserver agreement among experts. Methods: Between June 1994 and July 1994, 100 consecutive patients with...
JK Olsen et al. have tested the reliability of pain assessment with this new tool. Inter-rater and intra-rater reliabilities were all above the predefined cutoff of 0.75 for excellent results. The DoloCuff is now in production and has great potential for systematic follow-up of sensitivity to pain, both on an individual basis and group-wise. The article has been accepted in Pain Practice. ...
Interobserver agreement for the assessment of handicap in stroke patients was investigated in a group of 10 senior neurologists and 24 residents from two centers. One hundred patients were separately interviewed by two physicians in different combinations. The degree of handicap was recorded by each observer on the modified Rankin scale, which has six grades (0-5). The agreement rates were corrected for chance (kappa statistics). Both physicians agreed on the degree of handicap in 65 patients; they differed by one grade in 32 patients and by two grades in 3 patients. Kappa for all pairwise observations was 0.56; the value for weighted kappa (with quadratic disagreement weights) was 0.91. Our results confirm the value of the modified Rankin scale in the assessment of handicap in stroke patients; nevertheless, further improvements are possible. ...
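The gap between plain and weighted kappa reported above arises because quadratic weights penalise a one-grade disagreement far less than a multi-grade one. The Python sketch below shows one way to compute both for two raters on an ordinal scale. The function name, the argument layout and the sample ratings are assumptions for illustration; the study's full cross-tabulation is not given in the abstract, so its figures cannot be reproduced here.

```python
def cohen_kappa(rater_a, rater_b, n_categories, weights=None):
    """Cohen's kappa for two raters; grades are integers 0..n_categories-1.

    weights=None gives unweighted kappa; weights="quadratic" uses
    disagreement weights ((i - j) / (n_categories - 1)) ** 2.
    """
    n = len(rater_a)
    k = n_categories

    # Observed cross-tabulation as proportions
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[a][b] += 1.0 / n

    # Marginal proportions for each rater
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights == "quadratic":
            return ((i - j) / (k - 1)) ** 2
        return 0.0 if i == j else 1.0

    observed = sum(disagreement(i, j) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))

    # Undefined (division by zero) if both raters use one and the same grade throughout
    return 1.0 - observed / expected


# Hypothetical ratings on a three-grade scale; all disagreements are one grade apart
a = [0, 1, 2, 0, 1, 2]
b = [0, 1, 2, 1, 2, 1]
plain = cohen_kappa(a, b, 3)                          # 0.25
weighted = cohen_kappa(a, b, 3, weights="quadratic")  # 4/7, about 0.57
```

Because every disagreement in the sample is by a single grade, the weighted value comes out well above the unweighted one, which mirrors the pattern in the abstract.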
Purpose: To investigate visual rating of pelvis and knee position in young athletes during lower extremity functional tests. Methods: Pelvis and knee alignment, in 23 athletes, was visually rated by 66 physiotherapists. Peak two-dimensional (2D) and three-dimensional (3D) kinematics were also quantified. Ratings were compared to consensus visual ratings of an expert panel. The consensus ratings were also compared to peak kinematics. Reliability was determined using percentage agreement (PA) and the first order agreement coefficient (AC1). Sensitivity, specificity, diagnostic odds ratio (DOR) and differences in kinematics between groups based on the expert visual ratings were calculated to assess rating validity. Results: Mean intra-rater agreement was substantial (PA: 79-88%, AC1: 0.60-0.78). Inter-rater agreement ranged from fair to substantial (PA: 67-80%; AC1: 0.37-0.61). Sensitivity (≥80%) and specificity (≥50%) were acceptable for all tests except the Drop Jump. Experience (DOR 1.6-2.8 times
Results Intraclass correlation coefficients (ICCs) for intra-rater reliability for CDEIS, SES-CD and GELS (95% CIs) were 0.89 (0.86 to 0.93), 0.91 (0.89 to 0.95) and 0.81 (0.77 to 0.89), respectively, with standard error of measurement (SEM) of 2.10, 2.42 and 1.15. The corresponding ICCs for inter-rater reliability were 0.71 (0.63 to 0.76), 0.83 (0.75 to 0.88) and 0.62 (0.52 to 0.70), with SEM of 3.42, 3.07 and 1.63, respectively. Correlation between CDEIS and GELS was 0.75, between SES-CD and GELS was 0.74 and between CDEIS and SES-CD was 0.92. The most common sources of disagreement were interpretation of superficial ulceration, definition of disease site at the ileocolonic anastomosis, assessment of anorectal lesions and grading severity of stenosis. ...
Purpose: To determine external, middle, and inner ear abnormalities on high-resolution computed tomography (HRCT) of temporal bone in patients with microtia and to predict anatomic external and middle ear anomalies as well as the degree of functional hearing impairment based on clinical grades of microtia. Materials and Methods: This was a retrospective study conducted on an Indian population. Fifty-two patients with microtia were evaluated for external, middle, and inner ear anomalies on HRCT of temporal bone. Clinical grading of microtia was done based on criteria proposed by Weerda et al. in 37 patients and degree of hearing loss was assessed using pure tone audiometry or brainstem-evoked response in 32 patients. Independent statistical correlations of clinical grades of microtia with both external and middle ear anomalies detected on HRCT and the degree of hearing loss were finally obtained. Results: The external, middle, and inner ear anomalies were present in 93.1%, 74.5%, and 2.7% patients, ...
Reproducible and unbiased methods to quantify alveolar structure are important for research on many lung diseases. However, manually estimating alveolar structure through stereology is time consuming and inter-observer variability is high. The objective of this work was to develop and validate a fast, reproducible and accurate (semi-)automatic alternative. A FIJI-macro was designed that automatically segments lung images to binary masks, and counts the number of test points falling on tissue and the number of intersections of the air-tissue interface with a set of test lines. Manual selection remains necessary for the recognition of non-parenchymal tissue and alveolar exudates. Volume density of alveolar septa ([Formula: see text]) and mean linear intercept of the airspaces (Lm) as measured by the macro were compared to theoretical values for 11 artificial test images and to manually counted values for 17 lung slides using linear regression and Bland-Altman plots. Inter-observer agreement ...
We included randomised clinical trials with blinded and non-blinded assessment of the same binary outcome. We excluded trials where it was unclear which group was experimental and which was control as such trials would not allow us to determine the direction of any bias; trials in which only a subgroup of patients had been evaluated by blinded and non-blinded assessors, unless they were selected at random; trials in which blinded and non-blinded assessors had access to each other's results (for example, blinded assessments were provided to non-blinded assessors as a quality enhancement procedure); and trials where initially blinded assessors clearly had become unblinded, for example, when radiographs showed ceramic material indicative of the experimental intervention. Finally, we excluded trials with blinded end point committees adjudicating the assessments made by non-blinded clinicians because such adjudication often involves previous knowledge of the non-blinded assessment or is restricted to ...
Assessment of the Intra- and Inter-Observer Reliabilities of Ultrasonographically Measured Optic Nerve Sheath Diameters in Normal Adults, Li-juan Wang, Li-min Chen, Ying Chen, Yang
Knowledge of the accuracy of chest radiograph findings in acute lower respiratory infection in children is important when making clinical decisions. I conducted a systematic review of agreement between and within observers in the detection of radiographic features of acute lower respiratory infections in children, and described the quality of the design and reporting of studies, whether included or excluded from the review. Included studies were those of observer variation in the interpretation of radiographic features of lower respiratory infection in children (neonatal nurseries excluded) in which radiographs were read independently and a clinical population was studied. I searched MEDLINE, HealthSTAR and HSRPROJ databases (1966 to 1999), handsearched the reference lists of identified papers and contacted authors of identified studies. I performed the data extraction alone. Ten studies of observer interpretation of radiographic features of lower respiratory infection in children were identified. Seven
Objective: Capsule endoscopy is a novel investigation for diagnosing small bowel diseases. However, its interpretation is highly subjective and the potential variability may compromise its accuracy and reliability. Here we studied the potential inter-observer variations on the interpretation of capsule endoscopy. Method: Two residents and one specialist in gastroenterology independently reviewed 58 capsule endoscopy studies in the same sequential order. The gastric transit time, small bowel transit time, and the most significant small bowel lesion were independently recorded. The consensus transit time was determined by the joint review of the three gastroenterologists. The gold standard for small bowel diagnoses was based on final surgical, endoscopic findings or consensus diagnosis. Results: Clinically significant and relevant small bowel lesions were found in 32 (55%) cases by consensus review. The overall mean accuracy in determining gastric emptying time, small bowel transit time and ...
Purpose: To assess reproducibility of central corneal thickness (CCT) measurement by means of ultrasonic pachymetry. Methods: Fifty-one volunteers underwent three sessions of CCT measurements, each consisting of three CCT measurements, performed by each of three different observers. Intra- and interobserver reproducibility was calculated by means of intraclass correlation coefficient (ICC). The expected range of variability between two independent evaluations was calculated using scatter plots of each test-retest difference against their mean. The standard deviation of the mean differences in the test-retest scores was used to describe the differences in the score spread. Results: The ICC ranges of the intra- and interobserver evaluations were 0.95-0.97 and 0.89-0.95 respectively; the expected variability was ⩽±1% and ⩽± 2% respectively (95% confidence interval). Conclusions: The measurement of CCT by means of ultrasonic pachymetry is highly reproducible. ...
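Plotting each test-retest difference against the pair's mean, as described above, resembles a Bland-Altman analysis. The Python sketch below computes the quantities behind such a plot. It assumes 95% limits of agreement defined as the mean difference ± 1.96 standard deviations, a convention the abstract does not state; the function name and the sample thickness values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return the points and summary lines for a difference-versus-mean plot.

    first, second: paired measurements of the same subjects (hypothetical input).
    """
    diffs = [a - b for a, b in zip(first, second)]
    means = [(a + b) / 2 for a, b in zip(first, second)]

    bias = mean(diffs)   # systematic difference between the two sessions
    spread = stdev(diffs)  # sample standard deviation of the differences

    return {
        "means": means,  # x-axis values
        "diffs": diffs,  # y-axis values
        "bias": bias,
        "lower": bias - 1.96 * spread,  # assumed 95% limits of agreement
        "upper": bias + 1.96 * spread,
    }


# Hypothetical CCT readings in micrometres from two sessions
session_1 = [540, 552, 530, 561]
session_2 = [538, 555, 531, 558]
result = bland_altman(session_1, session_2)
```

The returned `means` and `diffs` lists can be passed to any plotting library, with horizontal lines drawn at `bias`, `lower` and `upper`.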
Repeatability and interobserver reproducibility of Artemis-2 high-frequency ultrasound in determination of human corneal thickness. Kelechi C Ogbuehi, Uchechukwu L Osuagwu. Outpatient Clinic, Department of Optometry, King Saud University, Riyadh, Kingdom of Saudi Arabia. Background: The purpose of this study was to assess the repeatability and limits of agreement of corneal thickness values measured by a high-frequency ultrasound (Artemis-2), hand-held ultrasound pachymeter (DGH-500) and a specular microscope (SP-3000P). Methods: Central corneal thickness (CCT) was analyzed in this prospective randomized study that included 32 patients (18 men and 14 women) aged 21–24 years. Measurements were obtained in two sessions, one week apart, by two examiners with three devices in a randomized order. Nine measurements were taken (three with each device) on one randomly selected eye of each patient in each measurement session. The coefficient of repeatability and interobserver reproducibility for the values of
Background: This study, promoted by the Italian Association of Radiotherapy and Clinical Oncology (AIRO) Head and Neck Group, aimed to assess the current national practice of target volume delineation on a case of neck lymph node metastases from unknown primary, evaluating inter-observer variability in a setting of primary radiotherapy. Materials and methods: A case of metastatic neck lymph node from occult primary was proposed to 17 radiation oncologists. A national reference RT center was identified and considered as benchmark. Participants were requested to delineate target volumes. A structured questionnaire was administered. The following parameters of the CTVs were compared: centroid distances, Dice similarity index (DSI), Jaccard index and mean distance to agreement (MDA). Volume expressed in cubic centimeters and CTV cranio-caudal extension were evaluated. Results: Sixteen of 17 radiation oncologists recommended three CTV dose levels (CTV HD, CTV ID and CTV LD); CTV ID ...
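The Dice similarity index and Jaccard index mentioned above both measure the overlap between two delineated volumes. A minimal Python sketch follows, assuming each contour has already been rasterised into a set of voxel coordinates. That representation, the function names and the sample voxels are assumptions for illustration, not the study's software.

```python
def dice_index(volume_a, volume_b):
    """Dice similarity index: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(volume_a), set(volume_b)
    if not a and not b:
        return 1.0  # two empty volumes are treated as identical (a convention)
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard_index(volume_a, volume_b):
    """Jaccard index: |A ∩ B| / |A ∪ B|."""
    a, b = set(volume_a), set(volume_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical voxel sets for a benchmark contour and one participant's contour
benchmark = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
participant = {(0, 0, 1), (0, 1, 1), (1, 0, 0)}

dsi = dice_index(benchmark, participant)         # 2*2 / (4+3) = 4/7
jaccard = jaccard_index(benchmark, participant)  # 2 / 5
```

The two indices are monotonically related (Dice = 2J / (1 + J)), so they rank observers identically and differ only in scale.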
Background: The aim of this study was to develop a research tool to assess the efficiency of a goalkeeper's actions in a game of futsal. Material/Methods: The author's own proposal of an observation sheet was created and subjected to a validation procedure. To assess intra-rater reliability and inter-rater reliability, the ICC test was used. Results: There was strong agreement in ratings for intra-rater reliability, 1.00 (95% CI 1.00-1.00), and inter-rater reliability, 0.99 (95% CI 0.99-1.00), which supports the reliability of the proposed research tool. Conclusions: The developed sheet allows the registration and evaluation of individual performance and cooperation in terms of the goalkeeper's game objectives pursued both in offence and defence ...
TY - JOUR. T1 - Evaluation of interobserver agreement for assessing lymph node staging in pancreatic cancer using current Endoscopic Ultrasound (EUS) criteria. AU - Gress, F.. AU - Schmitt, C.. AU - Catalane, M.. AU - Affronti, John Paul. AU - Binmoeller, K.. AU - Stevens, P.. AU - Savides, T.. AU - Bhutani, M.. AU - Ciaccia, D.. AU - Nicki, N.. AU - Faigel, D.. AU - Birk, J.. AU - Roubein, L.. AU - Lightdale, C.. PY - 1998/12/1. Y1 - 1998/12/1. N2 - Endoscopic Ultrasound (EUS) has been reported to be an accurate modality for T staging of pancreatic cancer (CA). However, lymph node staging has been less accurate. Recent data suggests that current EUS criteria used to determine benign or malignant status of a lymph node might be inadequate. Aim of Study: To determine the effects of interobserver variation on the overall accuracy of lymph node staging in pancreatic CA. Methods: Twelve patients with previously diagnosed pancreatic ductal adenocarcinoma underwent staging with EUS. Surgical ...
TY - JOUR. T1 - INTRA- AND INTEROBSERVER VARIABILITY OF ULTRASONOGRAPHIC MEASUREMENTS OF THE ADRENAL GLANDS IN HEALTHY BEAGLES. AU - Barberet, Virginie. AU - Pey, Pascaline. AU - Duchateau, Luc. AU - Combes, Anais. AU - Daminet, Sylvie. AU - Saunders, Jimmy H.. PY - 2010. Y1 - 2010. N2 - The aim of the present study was to establish which adrenal gland measurement was characterized by the least variation. To do this, we quantified the variability of seven different size measurements of the canine adrenal gland (maximal length, maximal height at the cranial and caudal poles on longitudinal and transverse images, and maximal width of the cranial and caudal poles) within observer, between observers, and between dogs based on three different measurements made by each of the three observers in six healthy Beagle dogs. The height of the caudal pole of both adrenal glands measured on longitudinal images had the lowest intra- and interobserver variability, while measurements of the length had the highest ...
The purpose of this study was to investigate the interexaminer reliability of the McKenzie algorithm. Thirty-one subjects (25 females and 6 males), ages 20 to 77, with reported neck pain participated in this study. Each subject was examined twice by two McKenzie trained physical therapists. The subjects were evaluated separately utilizing standard McKenzie Cervical Assessment formats and procedures. Upon completion of the assessment, each therapist used an adapted McKenzie cervical algorithm to classify each patient into one of the possible syndromes (Postural, Dysfunction, or Derangements 1-7). Only five diagnostic categories contained enough data to accurately examine reliability and, therefore, coefficient alpha was selected to analyze internal consistency between scores. The results of this study demonstrated fair to excellent interexaminer reliability (.736 to 1.00) for dysfunction and derangements 1, 3, and 7. The poor reliability found with derangement 4 may be attributed to difficulty in
TY - JOUR. T1 - Pre- and post-training session evaluation for interobserver agreement and diagnostic accuracy of probe-based confocal laser endomicroscopy for biliary strictures. AU - Talreja, Jayant P.. AU - Turner, Brian G.. AU - Gress, Frank G.. AU - Ho, Sammy. AU - Sarkaria, Savreet. AU - Paddu, Naveen. AU - Natov, Nikola. AU - Bharmal, Sheila. AU - Gaidhane, Monica. AU - Sethi, Amrita. AU - Kahaleh, Michel. PY - 2014/7. Y1 - 2014/7. N2 - Background and Aim Current diagnostic modalities for indeterminate biliary strictures offer low accuracy. Probe-based confocal laser endomicroscopy (pCLE) permits microscopic assessment of mucosal structures by obtaining real-time high-resolution images of the mucosal layers of the gastrointestinal tract. Previously, an interobserver study demonstrated poor to fair agreement even among experienced confocal endomicroscopy operators. Our objective was to assess interobserver agreement and diagnostic accuracy upon completion of a pCLE training session. Methods ...
BACKGROUND AND PURPOSE: The 6-minute walk test (6MWT) is widely used as a clinical outcome measure. However, the reliability of the 6MWT is unknown in individuals who have recently experienced a hip fracture. The aim of this study was to evaluate the relative and absolute interrater reliability of the 6MWT in individuals with hip fracture. METHODS: Two senior physical therapy students independently examined a convenience sample of 20 participants in a randomized order. Their assessments were separated by 2 days and followed the guidelines of the American Thoracic Society. Hip fracture-related pain was assessed with the Verbal Ranking Scale. RESULTS: Participants (all women) with a mean (standard deviation) age of 78.1 (5.9) years performed the test at a mean of 31.5 (5.8) days postsurgery. Of the participants, 10 had a cervical fracture and 10 had a trochanteric fracture. Excellent interrater reliability (intraclass correlation coefficient [ICC2.1] = 0.92; 95% confidence interval, 0.81-0.97) ...
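The "ICC2.1" reported above presumably refers to the two-way random-effects, absolute-agreement, single-rater form, commonly written ICC(2,1). The Python sketch below computes that form from the mean squares of a two-way layout. Reading the abstract's notation this way is an assumption, and the function name and sample scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores[i][j] is the rating of subject i by rater j (complete table assumed).
    """
    n = len(scores)     # subjects
    k = len(scores[0])  # raters
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares for subjects, raters, and the residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical walking distances (metres) for four participants and two raters
distances = [[310, 305], [250, 262], [402, 398], [180, 191]]
icc = icc_2_1(distances)
```

Unlike a consistency-type ICC, this form also penalises a systematic offset between raters, because the between-rater mean square appears in the denominator.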
Studies of elderly patients with Garden-I and Garden-II femoral neck fractures (FNFs) suggest that a preoperative posterior tilt of the femoral head of at least 20° increases the risk of fixation failure. A recently published treatment algorithm recommended hemiarthroplasty over internal fixation for elderly patients with Garden-I and Garden-II FNFs and a preoperative posterior tilt of at least 20°. However, the reliability of the method used to measure the posterior tilt has not been assessed according to recommended standards for reliability trials. Four orthopedic registrars and four consultants measured the posterior tilt angle in 50 preoperative lateral radiographs on two occasions six weeks apart. We estimated inter- and intrarater reliability by intraclass correlation coefficient (ICC). We also assessed repeatability by the repeatability coefficient (RC) and agreement by the minimal detectable change (MDC). Based on the suggested cutoff value of 20°, we reported the overall percentage and
Expert psychiatrists conducting work disability evaluations often disagree on work capacity (WC) when assessing the same patient. More structured and standardised evaluations focusing on function could improve agreement. The RELY studies aimed to establish the inter-rater reproducibility (reliability and agreement) of functional evaluations in patients with mental disorders applying for disability benefits and to compare the effect of limited versus intensive expert training on reproducibility. We performed two multi-centre reproducibility studies on standardised functional WC evaluation (RELY 1 and 2). Trained psychiatrists interviewed 30 and 40 patients respectively and determined WC using the Instrument for Functional Assessment in Psychiatry (IFAP). Three psychiatrists per patient estimated WC from videotaped evaluations. We analysed reliability (intraclass correlation coefficients [ICC]) and agreement (standard error of measurement [SEM] and proportions of comparisons within prespecified limits
Purpose: Develop a direct observation (DO) system to serve as a criterion measure for the calibration of models applied to free-living (FL) accelerometer data. Methods: Ten participants (19.4 ± 0.8 years) were video-recorded during four, one-hour FL sessions in different settings: 1) school, 2) home, 3) community, and 4) physical activity. For each setting, 10-minute clips from three randomly selected sessions were extracted and coded by one expert coder and up to 20 trained coders using the Observer XT software (Noldus, Wageningen, the Netherlands). Coders defined each whole-body movement, which was further described with three modifiers: 1) locomotion, 2) activity type, and 3) MET value (used to categorize intensity level). Percent agreement was calculated for intra- and inter-rater reliability. For intra-rater reliability, the criterion coder coded all 12 clips twice, separated by at least one week between coding sessions. For inter-rater reliability, coded clips by trained coders were ...
Objective To evaluate the applicability, reproducibility, and diagnostic performance of a new 2D-shear wave elastography (SWE) using the comb-push technique (2D CP-SWE) for detection of hepatic fibrosis, using histopathology as the reference standard. Materials and methods This prospective study was approved by the institutional review board, and informed consent was obtained from all patients. The liver stiffness (LS) measurements were obtained from 140 patients, using the new 2D-SWE, which uses comb-push excitation to produce shear waves and a time-aligned sequential tracking method to detect shear wave signals. The applicability rate of 2D CP-SWE was estimated, and factors associated with its applicability were identified. Intraobserver reproducibility was evaluated in the 105 patients with histopathologic diagnosis, and interobserver reproducibility was assessed in 20 patients. Diagnostic performance of the 2D CP-SWE for hepatic fibrosis was evaluated by receiver operating characteristic (ROC)
Purpose: To estimate inter-observer agreement with regard to describing adnexal masses using the International Ovarian Tumor Analysis (IOTA) terminology and the risk of malignancy calculated using IOTA logistic regression models LR1 and LR2, and to elucidate what explained the largest inter-observer differences in calculated risk of malignancy. Experimental Design: 117 women with adnexal masses were examined with transvaginal gray scale and power Doppler ultrasound by two independent experienced sonologists who described the masses using IOTA terminology. The risk of malignancy was calculated using LR1 and LR2. A predetermined risk of malignancy cutoff of 10% indicated malignancy. Results: There were 94 benign, four borderline and 19 invasively malignant tumors. There was substantial variability between the two sonologists in measurement results and some variability in assessment of categorical variables (agreement 40-98%, Kappa 0.30-0.91). Inter-observer agreement when classifying tumors as ...
Examiner A found 25 BC and examiner B found 30 BC in 220 knees examined (κ = 0.35; 95% CI, 0.14-0.56), and inter-observer reliability was moderate. When US examination was taken as the reference, receiver operating characteristic analysis revealed an area under the curve of 0.58 (95% CI, 0.51-0.65) for examiner A and 0.57 (95% CI, 0.50-0.64) for examiner B, showing weak agreement between physical examination and US assessment. ...
TY - JOUR. T1 - Interrater reliability of an etiologic classification of ischemic stroke. AU - Johnson, C. J.. AU - Kittner, C. J.. AU - McCarter, R. J.. AU - Sloan, R. J.. AU - Stern, Barney. AU - Buchholz, David. AU - Price, T. R.. PY - 1995. Y1 - 1995. N2 - Precise identification of the cause of stroke is critical to research and clinical practice. Published series of ischemic stroke show considerable variation in the proportion of cases classified as atherosclerotic large-vessel disease, lacunar infarct, cardioembolic stroke, stroke of other known cause, and stroke of undetermined etiology. We describe the development and use of an etiology-specific classification of ischemic stroke. The interrater reliability of the classification is then evaluated. Methods A total of 160 cases of ischemic strokes in young adults were reviewed by paired neurologists who assigned cases to prioritized categories. The results of paired ratings were evaluated for each of the potential causes. Interrater ...
TY - JOUR. T1 - The joint council on thoracic surgery education coronary artery assessment tool has high interrater reliability. AU - Lee, Richard. AU - Enter, Daniel. AU - Lou, Xiaoying. AU - Feins, Richard H.. AU - Hicks, George L.. AU - Gasparri, Mario. AU - Takayama, Hiroo. AU - Young, J Nilas. AU - Calhoon, John H.. AU - Crawford, Fred A.. AU - Mokadam, Nahush A.. AU - Fann, James I.. PY - 2013/6. Y1 - 2013/6. N2 - Background: Barriers to incorporation of simulation in cardiothoracic surgery training include lack of standardized, validated objective assessment tools. Our aim was to measure interrater reliability and internal consistency reliability of a coronary anastomosis assessment tool created by the Joint Council on Thoracic Surgery Education. Methods: Ten attending surgeons from different cardiothoracic residency programs evaluated nine video recordings of 5 individuals (1 medical student, 1 resident, 1 fellow, 2 attendings) performing coronary anastomoses on two simulation models, ...
BACKGROUND: Accurate limb volume measurement is key in the assessment of outcomes in lymphedema microsurgery. There are two commonly used methods: manual circumferential measurement (tape) and Perometer measurement. There are no data on the intra- and interclass correlation of either method, making it difficult to establish a gold standard of limb volume measurement. We aim to assess the intra- and interclass correlation of each method to establish the most appropriate method for clinical practice and future research studies, and to compare the accuracy and reliability of tape measurement against Perometer measurement. METHODS AND RESULTS: Student volunteers and experts (lymphedema practitioners) were each asked to perform repeat tape and Perometer measurements on the upper or lower limb of one healthy volunteer. Perometer measurements were globally more accurate than tape (average SE [Perometer]: 23.23 vs. 77.21 [tape]). For intraobserver reliability, experts outperformed
The Berlin Definition of ARDS maintains a link to prior definitions with diagnostic criteria of timing, chest imaging, origin of edema, and hypoxemia. Patients may have ARDS if the onset is within 1 week of a known clinical insult or new/worsening respiratory symptoms. For the bilateral opacities on chest radiograph criterion, a reference set of chest radiographs has been developed to enhance inter-observer reliability. The pulmonary artery wedge pressure criterion for hydrostatic edema was removed, and illustrative vignettes were created to guide judgments about the primary cause of respiratory failure. If no risk factor for ARDS is apparent, however, objective evaluation (e.g., echocardiography) is required to help rule out hydrostatic edema. A minimum level of positive end-expiratory pressure and mutually exclusive PaO2/FiO2 thresholds were chosen for the different levels of ARDS severity (mild, moderate, severe) to better categorize patients with different outcomes and potential responses to ...
Differences between IG using 4D-CBCT as gold standard and the two IG techniques using 3D-CBCT were 3.6 mm (IG-3D) and 1.9 mm (IG-ITV) on average. These uncertainties of 3D CBCT IG appear especially large when compared to the average base-line shift of 4.9 mm in our study, the reason for performing soft-tissue IG. Korreman et al. estimated the residual uncertainty of the IG procedure to 20 % of the initial motion [5], which is optimistic based on our results. Differences in the tumor position between 4D-CBCT and 3D-CBCT based IG increased with increasing motion magnitude of the pulmonary targets and increased with worse image quality scores of 3D-CBCT. These results clearly indicate that 3D-CBCT is not fully sufficient for full motion integration into IG. This finding of improved accuracy using 4D-CBCT compared to 3D-CBCT is in contrast to the study by Hugo et al. [7], which could be explained by two reasons. First, our study is based on a larger number of patients and poor image quality of the ...
TY - JOUR. T1 - A comparative analysis of pediatric uroflowmetry curves. AU - Vijverberg, Marianne A W. AU - Klijn, Aart J. AU - Rabenort, Ad. AU - Bransen, Jeroen. AU - Kok, Esther T. AU - Wingens, Johanna P M. AU - de Jong, Tom P V M. N1 - Copyright © 2011 Wiley Periodicals, Inc.. PY - 2011/11. Y1 - 2011/11. N2 - AIMS: This study was conducted to try to objectify assessment of pediatric uroflowmetry curves. MATERIALS AND METHODS: Nine professionals in pediatric incontinence care judged 480 pediatric uroflows. On a 1-5 scale, where 1 = anomalous and 5 = normal, uroflows were assessed on four items: staccato, interrupted, flow time and obstruction. Eighty uroflows were re-evaluated for intra-observer agreement. After staccato and interrupted flow had been defined more sharply, another 100 uroflows were analyzed. Cohen's kappa test for nominally classified data was applied to assess agreement. A kappa value of <0.20 denoted poor agreement, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial ...
INTRODUCTION. Acute lower respiratory tract infections (ALRI) in children are a leading cause of death and constitute a substantial burden of disease in developed and developing countries [1,2]. A significant proportion of children with ALRI presenting to emergency wards may also have concurrent wheezing of varying severity [3-6]. The World Health Organization (WHO) promotes a case-detection and antibiotic management policy for ALRI, particularly pneumonia [7]. Tachypnea and chest retraction are the key findings for making a diagnosis of pneumonia and putting patients on antibiotic therapy. There is strong evidence supporting the effectiveness of this policy in reducing childhood mortality due to pneumonia [8,9]. However, in children with ALRI and wheezing, it is difficult to determine whether the difficulty in breathing is due to pneumonia or to bronchial obstruction underlying the wheezing. Physicians faced with these patients usually prescribe inhaled or nebulized beta-adrenergics and systemic ...
This article highlights the myths and misunderstandings surrounding the straight leg raise (SLR) test for sciatica. Unfortunately, neither intra- nor inter-observer reliability of the passive SLR test has ever been agreed upon. In addition, there is poor consensus about what constitutes a positive SLR test in terms of pain location, leg elevation limitation or clinical significance. Until there are stricter performance standards and uniform agreement, researchers and clinicians should interpret the test with caution. We believe a true positive SLR should be the reproduction or exacerbation of the typical leg dominant pain in the affected limb at any degree of passive elevation. Those with only increased back pain or any leg pain other than that presenting as the chief complaint should be regarded as false positives ...
Background: It is estimated that between 34% and 50% of Australian women entering pregnancy are overweight or obese, which is associated with an increased risk of complications for both the woman and...
The global move towards more conformal radiotherapy for rectal cancer requires better imaging modalities that both visualise the disease accurately and are reproducible, so as to reduce interobserver variation. This review explores the advances in imaging modalities used in target volume delineation, with a view to making recommendations for current clinical practice and proposing future directions for research. A systematic review was conducted using MEDLINE and EMBASE. Articles considered relevant by the authors were included. Planning with orthogonal films is being replaced by computed tomography (CT) simulation. This is now considered the gold standard and allows conformal three-dimensional planning. Magnetic resonance imaging (MRI) has been shown to overcome some of the limitations of CT and can be used either as a diagnostic image to visually aid planning, or as a planning MRI carried out in the treatment position and co-registered with the planning CT. The latter approach has been shown to ...
OBJECTIVE: Magnetic resonance imaging (MRI) of the spine is increasingly important in the assessment of inflammatory activity in clinical trials with patients with ankylosing spondylitis (AS). We investigated feasibility, inter-reader reliability, sensitivity to change, and discriminatory ability of 3 different scoring methods for MRI activity and change in activity of the spine in patients with AS. METHODS: Thirty sets of spinal MRI at baseline and after 24 weeks of followup, derived from a randomized clinical trial comparing a tumor necrosis factor (TNF)-blocking drug (n = 20) with placebo (n = 10) and selected to cover a wide range of activity at baseline and change in activity, were presented electronically in a partial latin-square design to 9 experienced readers from different countries (Europe, Canada). Readers scored each set of MRI 3 times, using 3 different methods including the Ankylosing Spondylitis spine Magnetic Resonance Imaging-activity [ASspiMRI-a, grading activity (0-6) per ...
Purpose: To compare endothelial cell density (ECD) and central corneal thickness (CCT) measurements assessed with the laser-scanning-in-vivo-confocal-microscope HRT II corneal module and noncontact specular microscopy. Intrasession intra-observer and inter-observer agreement of the two instruments in a cohort of normal subjects were also determined. Methods: This prospective observational cross-sectional study included 48 healthy subjects (mean age of 50.9±13.7 years, range 28-82). All subjects underwent ECD and CCT measurement with the laser-scanning-in-vivo-confocal-microscope HRT II corneal module (HRT-CM) and a noncontact-specular-microscope (NCSM) (Tomey EM-3000). The measurements were repeated 3 times with each device by two independent observers during the same session. The differences between the CCT and ECD values recorded by the two instruments were calculated using the paired t-test. The agreement between devices was evaluated using the Bland-Altman method. Intra- and ...
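The Bland-Altman method mentioned in this abstract can be sketched as follows: take the paired differences between the two devices, report their mean (the bias), and place 95% limits of agreement at the bias plus or minus 1.96 standard deviations of the differences. This is a minimal illustration and not the study's analysis; the function name and the CCT readings are hypothetical, and the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired series with at least two subjects")
    # Per-subject difference between the two methods.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    # Sample standard deviation of the differences.
    sd = stdev(diffs)
    # 95% limits of agreement, assuming roughly normal differences.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical CCT readings in micrometres from two devices on the same eyes.
device_1 = [540.0, 552.0, 531.0, 547.0]
device_2 = [538.0, 549.0, 533.0, 545.0]
bias, lower, upper = bland_altman(device_1, device_2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Unlike a paired t-test, which only asks whether the mean difference is zero, the limits of agreement show how far apart two readings on the same subject can plausibly be.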
Background/Purpose: Although ultrasound (US) has been demonstrated to be a sensitive and specific tool, its feasibility in daily clinical practice is still under debate. We have developed and validated a fast 4-joint ultrasonographic (US) score to assess disease activity in RA patients. This score, named REUMA (Rapid Evaluation by US to Monitor Arthritis), showed an excellent correlation with 28-joint US assessment and good responsiveness. In order to generalize the use of this new and simple US score, we evaluated its performance using an external sample of RA patients and assessed the intra- and inter-reader reliability. Material and Methods: We conducted a multicenter cross-sectional study, including ambulatory patients with RA diagnosed according to ACR/EULAR 2010 criteria. Clinical data, demographic and disease characteristics were recorded. The 4-joint US score was calculated for each patient including bilateral radio and intracarpal joint and second metacarpophalangeal. ...
Purpose: The purposes of this study were to: 1) investigate the inter-rater and intra-rater reliability of use of the Flexicurve for measurement of spinal length (L), thoracic (TL) and lumbar length (LL), thoracic (TW) and lumbar width (LW), and 2) q
RESULTS: Five different pain classification schemas, six self-report measures of pain, and two measures of pain impact on functioning were selected based on our inclusion criteria. The majority of the studies identified in these areas reported inter- and intra-rater reliability information. Of the little validity data found for pain screening measures, it was difficult to compare due to the variability of the descriptors used. No data on sensitivity was identified ...
Results: As scored by the readers, the mean chronicity index score varied from 2.3 to 4.8 on a 12-point scale (P = 0.001) and the mean activity index score varied from 5.8 to 11.4 on a 24-point scale (P = 0.0001). Pairs of readers gave scores within 1 point for the chronicity index and within 2 points for the activity index in 50% of cases, and risk group assignments based on chronicity index (three strata) and activity index (two strata) were concordant in 59% and 76% of cases, respectively. Intraclass correlation coefficients for inter-reader agreement were 0.58 for the chronicity index (P < 0.01) and 0.52 for the activity index (P < 0.01). Intrareader agreement was uniformly higher than inter-reader agreement, but mean intraclass correlation coefficients exceeded 0.70 for only 1 of the 10 index components. Repeated readings yielded chronicity index scores that were more than 1 point discordant in 45% of cases and activity index scores that were more than 2 points discordant in 43% of cases. ...
Background: The collapsibility index of inferior vena cava (cIVC) is widely used to decide fluid infusion in spontaneously breathing intensive care unit patients. The authors hypothesized that high inspiratory efforts may induce false-positive high cIVC values. This study aims at determining a value of diaphragmatic motion recorded by echography that could predict a high cIVC (more than or equal to 40%) in healthy volunteers. Methods: The cIVC and diaphragmatic motions were recorded for three levels of inspiratory efforts. Right and left diaphragmatic motions were defined as the maximal diaphragmatic excursions. Receiver operating characteristic curves evaluated the performance of right diaphragmatic motion to predict a cIVC more than or equal to 40% defining the best cutoff value. Results: Among 52 included volunteers, interobserver reproducibility showed a generalized concordance correlation coefficient (ρc) above 0.9 for all echographic parameters. Right diaphragmatic motion correlated with ...
Introduction: This study evaluated the influence of cast-gold posts on the diagnostic ability of a cone beam computed tomography (CBCT) system in assessing longitudinal root fractures. In addition, the influence of gutta-percha and variations in voxel resolution were assessed. Methods: One hundred eighty endodontically prepared teeth were divided into 3 experimental and 3 control groups and placed in a dry human skull. The teeth in the experimental groups were artificially fractured. Certain experimental and control groups were filled with gutta-percha cones. Other experimental and control groups were filled with cast-gold posts. All the teeth were viewed by using a tomography scan with 2 voxel resolution protocols (0.3-mm and 0.2-mm). A calibrated examiner, blinded to the protocol, assessed the images by using the nominated scan software. Results: The kappa values obtained for intraobserver reproducibility were 0.84 and 0.93 for 0.3-mm and 0.2-mm voxel resolution, respectively. The presence of ...
We take measurements every day to control processes and to accept or reject products. Often, there is little thought that goes into understanding the measurement system. We take it for granted that the numbers are good. The quality of a measurement system is determined by the statistical properties of the data that are generated. We know, for example, that when the same person measures the same part with the same instrument that there can be different results. This is termed repeatability - re
Purpose: The purpose of this study was to investigate the correlation between model observer and human observer performance in CT imaging for the task of lesion detection and localization when the lesion location is uncertain. Methods: Two cylindrical rods (3-mm and 5-mm diameters) were placed in a 35 × 26 cm torso-shaped water phantom to simulate lesions with −15 HU contrast at 120 kV. The phantom was scanned 100 times on a 128-slice CT scanner at each of four dose levels (CTDIvol = 5.7, 11.4, 17.1, and 22.8 mGy). Regions of interest (ROIs) around each lesion were extracted to generate images with signal-present, with each ROI containing 128 × 128 pixels. Corresponding ROIs of signal-absent images were generated from images without lesion mimicking rods. The location of the lesion (rod) in each ROI was randomly distributed by moving the ROIs around each lesion. Human observer studies were performed by having three trained observers identify the presence or absence of lesions, indicating the ...
INTRODUCTION: Manual interpretation of immunohistochemistry (IHC) is a subjective, time-consuming and variable process, with an inherent intra-observer and inter-observer variability. Automated image analysis approaches offer the possibility of developing rapid, uniform indicators of IHC staining. In the present article we describe the development of a novel approach for automatically quantifying oestrogen receptor (ER) and progesterone receptor (PR) protein expression assessed by IHC in primary breast cancer. METHODS: Two cohorts of breast cancer patients (n = 743) were used in the study. Digital images of breast cancer tissue microarrays were captured using the Aperio ScanScope XT slide scanner (Aperio Technologies, Vista, CA, USA). Image analysis algorithms were developed using MatLab 7 (MathWorks, Apple Hill Drive, MA, USA). A fully automated nuclear algorithm was developed to discriminate tumour from normal tissue and to quantify ER and PR expression in both cohorts. Random forest clustering was
OBJECTIVES: To test the reproducibility of the ABILOCO questionnaire. To validate the patient self-reporting method and the third-party assessment of the stroke patient's locomotion ability by a treating physical therapist. DESIGN: Prospective study. SETTING: University hospital. PARTICIPANTS: Adult stroke patients (N=28; 59±13 y). The time since stroke ranged from 3 to 253 weeks. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURE: The ABILOCO questionnaire. RESULTS: The results of patient self-assessment and the results of the third-party assessments by the physiotherapists at a 2-week interval were highly correlated (intraclass correlation coefficient [ICC]=.77 and ICC=.89, respectively). The results of the patient self-assessment and the third-party assessment by the physical therapist were both well correlated to assessment by an independent medical examiner who observed the patient during the 13 ABILOCO activities (ICC=.69 and ICC=.87, respectively). CONCLUSIONS: The use of ABILOCO as a ...
Although biplane right anterior oblique-left anterior oblique (RAO/LAO) quantitative left ventricular (LV) angiography is commonly performed, justification of LV volume calculation using the area length method (originally formulated from anteroposterior-lateral (AP/LAT) angiograms) has been limited. To assess whether RAO/LAO and AP/LAT LV volumes are similar when computed by the area length method formula, we performed biplane cine LV angiography in both RAO/LAO and AP/LAT projections in random sequence in 21 patients and four LV models of known volume. LV silhouettes were drawn independently by two trained observers. Calculated angiographic volume of the models correlated almost exactly with their true volume (r = 0.999), establishing the absolute accuracy of this system. Rotation of the LV models through 90 degrees of obliquity at 10 degree increments demonstrated a mean change from true volume of only -5.4 +/- 0.7% (p less than 0.001). In the patient studies, rotation to the 30 degree RAO/60 ...
Background: Staff and relatives often act as advocates for people with severe to profound intellectual disability (ID). Since staff and relatives make proxy judgements about quality of life for people with severe to profound ID, it is important to know how well the perceptions of the two groups correspond with each other. Method: Fifty-one staff-family dyads completed the QOL-PMD questionnaire. Agreement between proxies was assessed using the proportion of observer agreement (Po) and Wilcoxon signed-rank tests. Results: Proxies agreed relatively strongly about the applicability of questionnaire items. There was also relatively strong agreement about the clients' QOL, except for items related to internal, subjective experiences (e.g., sexual fulfillment, pain). Conclusion: People with severe to profound ID are not able to report their QOL well. Because the people making proxy judgements about their QOL are not in good agreement on some of the most critical subjective indicators, careful ...
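The proportion of observer agreement (Po) named in this abstract is the simplest agreement measure: the share of paired judgements on which the two proxies give the same answer. The sketch below assumes one staff answer and one family answer per dyad for a single questionnaire item; the function name and the example scores are hypothetical and are not taken from the QOL-PMD study.

```python
def proportion_agreement(staff_answers, family_answers):
    """Proportion of dyads in which both proxies gave the same answer (Po)."""
    if len(staff_answers) != len(family_answers) or not staff_answers:
        raise ValueError("answers must be paired and non-empty")
    matches = sum(s == f for s, f in zip(staff_answers, family_answers))
    return matches / len(staff_answers)


# Hypothetical answers to one item, one entry per staff-family dyad.
staff = [1, 2, 3, 3]
family = [1, 2, 2, 3]
print(proportion_agreement(staff, family))
```

Po does not correct for agreement expected by chance, which is one reason it is commonly reported alongside other statistics.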
Our current study demonstrated that the Meyer scale has moderate interreader agreement immediately after treatment and strong agreement at follow-up, and for both immediate posttreatment and follow-up, the agreement for the Meyer scale is comparable with that of the Raymond scale. Furthermore, the performance of the Meyer and Raymond scales for predicting major recurrence risk, based on immediate posttreatment results, was fair and similar between scales. These data indicate that the Meyer and Raymond scales have similar performance and consistency levels for aneurysm occlusion evaluation and prediction of a major recurrence. The interobserver agreement in our study for the Raymond scale is higher than that previously reported by Tollard et al,19 who reported a fair κ statistic at 0.276. Interobserver agreement statistics for the Raymond scale ranged from 0.28 to 0.83 in prior studies.19-24 Most interesting, the Meyer scale agreement score was higher than that of the Raymond scale ...
BURGOS D, MARÍA EUGENIA; MANTEROLA D, CARLOS and SANHUEZA C, ANTONIO. Construction of a scale to assess methodological quality of diagnostic tests articles. Rev Chil Cir [online]. 2011, vol.63, n.5, pp.493-494. ISSN 0718-4026. http://dx.doi.org/10.4067/S0718-40262011000500009. Introduction: Although the methodological quality (MQ) of scientific publications is a multidimensional concept that is difficult to understand, its evaluation is essential when making decisions that support our clinical practice. However, in the field of diagnostic tests (DT), which is developing steadily and rapidly, there are no valid and reliable instruments to assess MQ. Aim: To report the results of the generation of items and domains of a scale to determine MQ in studies of DT and to determine interobserver reliability of this scale. Material and Methods: Construction of a scale to assess MQ of DT articles and pilot study to determine interobserver reliability. The designed scale was applied to 20 DT studies ...
Group-sequential testing is widely used in pivotal therapeutic, but rarely in diagnostic research, although it may save studies, time, and costs. The purpose of this paper was to demonstrate a group-sequential analysis strategy in an intra-observer study on quantitative FDG-PET/CT measurements, illuminating the possibility of early trial termination which implicates significant potential time and resource savings. Primary lesion maximum standardised uptake value (SUVmax) was determined twice from preoperative FDG-PET/CTs in 45 ovarian cancer patients. Differences in SUVmax were assumed to be normally distributed, and sequential one-sided hypothesis tests on the population standard deviation of the differences against a hypothesised value of 1.5 were performed, employing an alpha spending function. The fixed-sample analysis (N = 45) was compared with the group-sequential analysis strategies comprising one (at N = 23), two (at N = 15, 30), or three interim analyses (at N = 11, 23, 34), respectively, which
Purpose: Recent advances in medical imaging technologies provide opportunities to quantify the tumor phenotype throughout the course of treatment non-invasively. The emerging field of Radiomics addresses this by converting medical images into minable data by applying a large number of quantitative imaging algorithms. Accurate tumor segmentation is one of the main challenges of Radiomics. It has been shown that semiautomatic segmentation approaches efficiently reduce inter-observer variability as compared to the time consuming manual delineations. In this study, a semiautomatic volumetric segmentation algorithm, implemented in the free and publicly available 3D-Slicer platform, was investigated in terms of its robustness for Radiomics features quantification. Methods: Fifty-six 3D-Radiomics features, quantifying phenotypic differences based on the tumor intensity, shape and texture, were extracted from the computed tomography images of twenty lung cancer patients. These Radiomics features were ...