Application of the velocity profile method is recommended for reliable measurement of flow volume in larger vessels, and ultrasonic flowmetry is a useful clinical tool for this purpose. We used the conventional velocity profile method with a minor modification and examined the reproducibility of flowmetry from color Doppler data. Data from three examiners were used to analyze intraobserver reproducibility and interobserver agreement in the common carotid artery, and we measured flow volume in the peripheral vessels of healthy individuals. Estimated flow volumes in five healthy examinees were 350 to 550 ml/min and did not vary significantly between examiners. Interobserver correlation was good (r 1=0.63), but intraobserver correlations in two sonographers differed: excellent (r 1=0.85) in the one who was experienced in this method and poor (r 1=0.32) in the other. Good interobserver agreement and intraobserver reproducibility of experienced examiners suggest that this
BACKGROUND: The incidence of incidental pulmonary embolism (IPE) in cancer patients is increasing. There is scant information on the interobserver agreement among radiologists about the diagnosis of distal incidental clots and the actual radiologic extension of IPE. METHODS: A total of 88 contrast-enhanced computed tomography (CT) scans of cancer patients with IPE were reassessed blindly by two expert thoracic radiologists. First, 62 scans were reassessed and the interobserver agreement on most proximal extent of IPE was calculated between the two expert radiologists as well as between the initial and expert reading, using the kappa statistic. The sample was enriched with 26 additional scans for a total of 30 segmental and 29 subsegmental IPE to determine the interobserver agreement on distal clots. RESULTS: The level of agreement regarding the most proximal extent of IPE between the expert radiologists was very good (kappa 0.84; 95% CI, 0.73-0.95) and poor between the original radiologist and ...
Five observers using the Jensen modification of the Evans classification and the AO classification (with and without subgroups) classified the radiographs of 88 trochanteric hip fractures. Each observer classified the radiographs independently on two occasions 3 months apart. Kappa statistical analysis was used for determination of intra- and inter-observer variation. For the Jensen classification, the mean kappa value was 0.52 (range: 0.44-0.60) for intra-observer variation and 0.34 (range: 0.17-0.38) for inter-observer variation. For the AO system with subgroups, the mean kappa value was 0.42 (range: 0.20-0.65) for intra-observer variation and 0.33 (range: 0.14-0.48) for inter-observer variation. For the AO classification system without subgroups, the mean kappa value was 0.71 (range: 0.60-0.81) for intra-observer variation and 0.62 (range: 0.50-0.71) for inter-observer variation. We recommend classifying trochanteric fractures into three groups as that of the AO system without the subgroups. For ease
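As an illustration of the kappa statistic referred to above, the following is a minimal sketch of unweighted Cohen's kappa for two raters. The function name `cohen_kappa` and the fracture labels are hypothetical and are not taken from the study, which reported mean kappa values across five observers rather than a single pair.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of ten radiographs into three groups.
first_reading = ["A1", "A1", "A2", "A2", "A3", "A3", "A1", "A2", "A3", "A1"]
second_reading = ["A1", "A1", "A2", "A3", "A3", "A3", "A1", "A2", "A2", "A1"]
kappa = cohen_kappa(first_reading, second_reading)
```

Passing the same rater's two readings gives an intra-observer value; passing two different raters' readings gives an inter-observer value.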
Objectives: To evaluate inter-observer agreement for microscopic measurement of inflammation in synovial tissue using manual quantitative, semiquantitative and computerised digital image analysis. Methods: Paired serial sections of synovial tissue, obtained at arthroscopic biopsy of the knee from patients with rheumatoid arthritis (RA), were stained immunohistochemically for T lymphocyte (CD3) and macrophage (CD68) markers. Manual quantitative and semiquantitative scores for sub-lining layer CD3+ and CD68+ cell infiltration were independently derived in 6 international centres. Three centres derived scores using computerised digital image analysis. Inter-observer agreement was evaluated using Spearman's rho and intraclass correlation coefficients (ICCs). Results: Paired tissue sections from 12 patients were selected for evaluation. Satisfactory inter-observer agreement was demonstrated for all 3 methods of analysis. Using manual methods, ICCs for measurement of CD3+ and CD68+ cell infiltration ...
OBJECTIVE: Accurate and consistent outcome assessment is essential to randomized clinical trials. We aimed to explore observer variation in the assessment of outcome in a recently completed trial of dexanabinol in head injury and to consider steps to reduce such variation. METHODS: Eight hundred sixty-one patients with severe traumatic brain injury who were admitted to 86 centers were included in a multicenter, placebo-controlled, Phase III trial. Outcome was assessed at 3 and 6 months postinjury using the extended Glasgow Outcome Scale; standardized assessment was facilitated by the use of a structured interview. Before initiation of trial centers, outcome ratings were obtained for sample cases to establish initial levels of agreement. Training sessions in outcome assessment were held, and problems in assigning outcome were investigated. During the trial, a process of central review was established to monitor performance. Interobserver variation was analyzed using the κ statistic. RESULTS: ...
Radiological sacroiliitis in Behçet's syndrome (BS) has been a subject of controversy. We have examined pelvic radiographs of 38 patients with BS and 28 age and sex matched controls which we reported previously, and also 17 with ankylosing spondylitis (AS), 27 with non-renal familial Mediterranean fever (FMF), and 33 with primary osteoarthrosis (OA). Initially, five observers assessed radiographs on two different occasions according to the New York criteria for sacroiliitis in a blind protocol. Later, three of them examined the various possible abnormalities of the sacroiliac (SI) joints after training sessions. Although the inter- and intraobserver variation was quite high, all observers found the expected changes in patients with AS. The abnormalities detected in the other diseases were either mild, inconsistent, or both. Erosions were confined to patients with AS, and osteophytes and glenoid sulci to patients with OA. We conclude that high observer variation in interpreting a film of the ...
Inter-observer variation is a well-known problem in medical practice. Gardenia et al. first reported on this issue in the 1950s [3], and it became a subject for discussion in the radiotherapeutic community in the 1970s. In the 1990s, many articles were published about inter-observer variation for a variety of cancers: prostate cancer [4], brain tumors [5], breast cancer [6], head and neck cancer [12, 13], and lung cancer [14, 15]. However, we were unable to find any papers that examined inter-observer variation for pituitary adenoma and meningioma; to the best of our knowledge, this is the first such report. DVH analysis by superimposing different contours from multiple clinicians onto the default treatment plan showed higher maximal dose for optic tract (Figure 3). It was increased to 23.64 Gy (268% higher dose than default plan) for the pituitary adenoma and 19.39 Gy (131%) for the meningioma. These results imply that contour deviations across plans could easily cause unexpectedly higher ...
The variation between two observers in grading 100 biopsies and the corresponding main specimens of rectal carcinomas has been examined. Using kappa statistics, which take account of chance agreement, we found a highly significant level of agreement. As expected, higher levels were obtained for intraobserver agreement. However, disagreements between observers were in many instances "haphazard" and there were differences in bias between them. Fifty paired biopsies and main tumours were graded by five observers and the results analysed for bias and by kappa statistics for overall and conditional agreement. These methods revealed significant overall agreement but the levels for some observer pairs did not differ significantly from chance. Examination for observer bias indicated differing standards of grading, and haphazard disagreements reached high levels for some observer pairs. The intraobserver agreement between the grade of the biopsy and the corresponding main tumour varied from 56-69% but ...
Objective: To analyse inter-observer variation between a neuroradiologist and neurosurgeon in the MRI diagnosis of lumbar nerve root compression. Although lumbar MRI is primarily analyzed and reported by a radiologist, neurosurgeons often analyse it independently as they have sufficient clinical background as well as radiological expertise to diagnose most spinal pathologies on Magnetic Resonance Imaging (MRI). Methods: Retrospective analysis was carried out for images of 54 patients who underwent MRI of the lumbar spine between March and July 2010 with suspected lumbar disc herniation and nerve root compression, at Aga Khan Hospital, Karachi, Pakistan. One fellowship-trained neuroradiologist and one neurosurgeon evaluated the images on the PACS system separately. Both observers were unaware of the patient's clinical history and each other's findings. Lumbar discs at L3-L4, L4-L5 and L5-S1 levels were evaluated by both observers for disc disease and nerve compression. Findings were recorded on a proforma and
In breast cancer, there is a growing body of evidence that tumor-infiltrating lymphocytes (TILs) may have clinical utility and may be able to direct clinical decisions for subgroups of patients. Clinical utility is, however, not sufficient for warranting the implementation of a new biomarker in the routine practice, and evaluation of the analytical validity is needed, including testing the reproducibility of decentralized assessment of TILs. The aim of this study was to evaluate the inter-observer agreement of TILs assessment using a standardized method, as proposed by the International TILs Working Group 2014, applied to a cohort of breast cancers reflecting an average breast cancer population ...
The Kappa values for the first series of ten carcinomas of various degrees of differentiation showed good to very good agreement for MIB-1-LI (Kappa 0.56-0.72). However, we found very high inter-observer variabilities (Kappa 0.04-0.14) in the read-outs of the G2 carcinomas. It was not possible to explain the inconsistencies exclusively by any of the following factors: (i) pathologists' divergent definitions of what counts as a positive nucleus, (ii) the mode of assessment (counting vs. eyeballing), (iii) immunostaining technique, and (iv) the selection of the tumor area in which to count. Despite intensive confrontation of all participating pathologists with the problem, inter-observer agreement did not improve when the same slides were re-examined 4 months later (Kappa 0.01-0.04) and intra-observer agreement was likewise poor (Kappa 0.00-0.35 ...
Nonblinded assessors of subjective measurement scale outcomes in randomized clinical trials tended to generate substantially biased effect sizes. Standardized mean differences were exaggerated by a pooled standard deviation of 0.23 (95% CI 0.40 to 0.06) or, in relative terms, by 68% (95% CI 14% to 230%). Observer bias can be perceived as the result of the interaction between observers' predispositions and the subjectivity of the outcome. Predispositions are likely to differ substantially from observer to observer and from trial to trial. In some trials, conscientious nonblinded assessors may overcompensate for an expected bias in favour of the experimental intervention and paradoxically induce a bias favouring the control, whereas other trials will have fairly neutral assessors with no important bias. Thus, the degree of observer bias in trials with clearly predisposed outcome assessors is likely to be considerably higher than the mean we see here, which is based on all of the included trials. ...
Reproducibility of grading H pylori related gastritis is high using the updated Sydney system. Despite the novel criteria for scoring atrophy, there was imperfect agreement on this feature between two independent histopathologists.
PURPOSE: We aimed to determine the intra- and interobserver agreement on the software analysis of very low dose hepatic perfusion CT (pCT). METHODS: A total of 53 pCT examinations were obtained from 21 patients (16 men, 5 women; mean age, 60.4 years) with proven liver metastasis from various primary cancers. The pCT examinations were analyzed by two readers independently and perfusion parameters were noted for whole liver, whole metastasis, metastasis wall, and normal-looking liver (liver tissue without metastasis) in regions of interest (ROIs). Readers repeated the analysis after an interval of one month. Intra- and interobserver agreements were assessed with intraclass correlation coefficients (ICC) and Bland-Altman statistics. RESULTS: The mean ICCs of all ROIs between readers were 0.91, 0.93, 0.86, 0.45, 0.53, and 0.66 for blood flow (BF), blood volume (BV), permeability, arterial liver perfusion (ALP), portal venous perfusion (PVP) and hepatic perfusion index (HPI), respectively. The mean ...
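The Bland-Altman statistics mentioned above summarise paired measurements by their mean difference (bias) and 95% limits of agreement. The sketch below shows that calculation under the usual assumption of approximately normal differences; the function name `bland_altman` and the perfusion values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman(reader_1, reader_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(reader_1) != len(reader_2) or len(reader_1) < 2:
        raise ValueError("need at least two paired measurements")
    # Per-case difference between the two readers.
    diffs = [x - y for x, y in zip(reader_1, reader_2)]
    bias = mean(diffs)
    # Sample standard deviation of the differences.
    sd = stdev(diffs)
    # 95% limits of agreement: bias +/- 1.96 standard deviations.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical blood-flow readings from two readers for four ROIs.
reader_a = [10.0, 12.0, 14.0, 16.0]
reader_b = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman(reader_a, reader_b)
```

A bias near zero with narrow limits suggests the readers are interchangeable for that parameter; wide limits indicate disagreement even when the ICC looks acceptable.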
Results Analysis of diagnosis for all circulations and all readers gave a composite κ value of 0.86 and pairwise-weighted κ (κp-w) value of 0.91, both regarded as almost perfect agreement. This was due to the high proportion of responses that showed partial agreement. Analysis of Gleason Sum Score gave κ=0.38 and κp-w=0.58 over all circulations and all readers, indicating that discrepancies occur at the boundary between adjacent grades and may not be as clinically significant as suggested by composite κ. ...
The aim was to assess intraobserver reliability of a new semi-automated technique of embryo volumetry. Power calculations suggested 46 subjects with viable, singleton pregnancies were required for reliability analysis. Crown rump length (CRL) of each
Background Interim PET/CT is widely performed in lymphoma patients in clinical practice and clinical trials. Visual assessment using a 5-point scale is proposed for PET/CT interpretation, but intra- and inter-observer variation is not fully investigated. Purpose To investigate intra- and inter-observer variations in the reporting of interim positron emission tomography/computed tomography (PET/CT) in lymphoma patients, and the influence of clinical information on the interpretation. Material and Methods Three expert readers from different institutions interpreted interim PET/CT images of 42 consecutive patients with malignant lymphoma twice, with and without clinical information ...
The results of this study showed a "good" to "fair" intraexaminer agreement of the 2 types of landmarks analyzed. The 2 examiners proved to be in "good," "moderate," and "fair" agreement when they classified the roof of the mandibular canal and the mental foramen. Because the intraexaminer agreement was classified as "fair" and "moderate" in some instances, it was not possible to have an interexaminer agreement. To achieve interexaminer agreement, it is necessary to have at least a "good" intraexaminer agreement, which was not achieved in this study. The interpretations of both examiners were similar in relation to the detection of MCR and MF with "good" agreement of the left side. This may be explained by the preradiographic interpretation calibration that was advised by an experienced radiologist and supported by the confidence intervals results. On the other hand, the agreement for both examiners and for both landmarks of the right sides was not similar. The results showed similar tendencies; ...
Several studies2,7,15,16 have analyzed the intra-rater reliability of the 6MWT; therefore, this test has been considered reliable for assessing functional capacity in patients with COPD after a practice test. However, there is a lack of studies verifying the inter-rater reliability for this population. The intra-rater 6MWT reliability in our study presented ICC values for walked distance >0.75, indicating excellent reliability. This analysis has already been studied in subjects with chronic respiratory disease by many authors, who found ICC values ranging from 0.82 to 0.99,7,12,14,15,33-35 confirming the findings of our study. The studies mentioned above were conducted with COPD,7,15,34 with obstructive disease and restrictive lung diseases,12 and with lung disease in the final stage.35 The last 2 studies did not perform the second 6MWT with an interval of 30 min after the first 6MWT, according to the standards of the ATS/ERS.7,14 Furthermore, we found low coefficient of variation values (0.06), ...
Despite the many studies on venous haemodynamics using duplex, only a few evaluated the normal values, variability and reproducibility. Therefore, the range and variability of venous diameter, compressibility, flow and reflux were measured. To obtain normal values, 42 healthy individuals (42 limbs, 714 vein segments) with no history of venous disease were scanned by duplex. To determine the reproducibility the intra-observer variability was measured in 11 healthy individuals (187 vein segments) and the inter-observer variability in 15 healthy individuals (255 vein segments) and 13 patients (169 vein segments) previously diagnosed with deep venous thrombosis. Of the 714 normal vein segments, 708 (99%) were traceable, including the crural veins. Of the traceable vein segments, 675 (95%) were compressible and in 696 (98%) flow was present. Of the 42 common femoral vein segments, in 25 (60%) the reflux duration exceeded 1.0 s, but in the other proximal vein segments the reflux duration was less than ...
In this study, we found a poor correlation between naked-eye assessment of the CR time and qCR time measures in both laymen and clinical staff. Further, we observed poor naked-eye intra-observer repeatability and interobserver agreement by clinical staff in their assessment of CR time. The use of a categorical evaluation of time measurement did not improve agreement between naked-eye estimations and machine-derived classifications. It is self-evident to most clinicians that different observers, not only in regard to the CR test, often disagree in clinical assessments based on naked-eye observation.24 Previous studies on the reliability of the CR test have partially addressed this by showing a lack of interobserver agreement, but neither performance on the task to actually determine return to normal skin colour, nor the intra-observer repeatability for a group of observers on a standardised set of cases has been assessed previously.11 28 29 We have added the use of an objective technique to ...
Background: Stigmata of hemorrhage predict rebleeding and outcome of patients with bleeding peptic ulcers. There are variabilities in reported incidences of stigmata and their respective rebleeding risks. We sought to study the interobserver agreement among experts. Methods: Between June 1994 and July 1994, 100 consecutive patients with...
Interobserver agreement for the assessment of handicap in stroke patients was investigated in a group of 10 senior neurologists and 24 residents from two centers. One hundred patients were separately interviewed by two physicians in different combinations. The degree of handicap was recorded by each observer on the modified Rankin scale, which has six grades (0-5). The agreement rates were corrected for chance (kappa statistics). Both physicians agreed on the degree of handicap in 65 patients; they differed by one grade in 32 patients and by two grades in 3 patients. Kappa for all pairwise observations was 0.56; the value for weighted kappa (with quadratic disagreement weights) was 0.91. Our results confirm the value of the modified Rankin scale in the assessment of handicap in stroke patients; nevertheless, further improvements are possible. ...
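The gap between the unweighted and the quadratically weighted kappa reported above arises because weighted kappa penalises a one-grade disagreement far less than a disagreement of several grades. The sketch below illustrates this on a hypothetical 3x3 cross-tabulation; the function name `weighted_kappa` and the table are assumptions. The abstract does not give the full 6x6 table of Rankin grades, so its values of 0.56 and 0.91 cannot be reproduced here.

```python
def weighted_kappa(table, weights="quadratic"):
    """Kappa from a square cross-tabulation of two raters' ordinal grades.

    weights: "none" (unweighted), "linear", or "quadratic" disagreement weights.
    """
    k = len(table)
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("table must be square with at least two categories")
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, larger for distant grades.
        if weights == "none":
            return 0.0 if i == j else 1.0
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weight(i, j) * table[i][j] for i, j in cells) / n
    expected = sum(weight(i, j) * row_tot[i] * col_tot[j] for i, j in cells) / (n * n)
    return 1 - observed / expected


# Hypothetical table: rows are rater 1's grades, columns are rater 2's grades.
example = [
    [20, 5, 0],
    [5, 20, 5],
    [0, 5, 40],
]
kappa_plain = weighted_kappa(example, weights="none")
kappa_quadratic = weighted_kappa(example, weights="quadratic")
```

In this hypothetical table every disagreement is between adjacent grades, so the quadratic version is noticeably higher than the unweighted one, which is the same pattern the abstract reports.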
Purpose: To investigate visual rating of pelvis and knee position in young athletes during lower extremity functional tests. Methods: Pelvis and knee alignment, in 23 athletes, was visually rated by 66 physiotherapists. Peak two-dimensional (2D) and three-dimensional (3D) kinematics were also quantified. Ratings were compared to consensus visual ratings of an expert panel. The consensus ratings were also compared to peak kinematics. Reliability was determined using percentage agreement (PA) and the first order agreement coefficient (AC1). Sensitivity, specificity, diagnostic odds ratio (DOR) and differences in kinematics between groups based on the expert visual ratings were calculated to assess rating validity. Results: Mean intra-rater agreement was substantial (PA: 79-88%, AC1: 0.60-0.78). Inter-rater agreement ranged from fair to substantial (PA: 67-80%; AC1: 0.37-0.61). Sensitivity (≥80%) and specificity (≥50%) were acceptable for all tests except the Drop Jump. Experience (DOR 1.6-2.8 times
Results Intraclass correlation coefficients (ICCs) for intra-rater reliability for CDEIS, SES-CD and GELS (95% CIs) were 0.89 (0.86 to 0.93), 0.91 (0.89 to 0.95) and 0.81 (0.77 to 0.89), respectively, with standard error of measurement (SEM) of 2.10, 2.42 and 1.15. The corresponding ICCs for inter-rater reliability were 0.71 (0.63 to 0.76), 0.83 (0.75 to 0.88) and 0.62 (0.52 to 0.70), with SEM of 3.42, 3.07 and 1.63, respectively. Correlation between CDEIS and GELS was 0.75, between SES-CD and GELS was 0.74 and between CDEIS and SES-CD was 0.92. The most common sources of disagreement were interpretation of superficial ulceration, definition of disease site at the ileocolonic anastomosis, assessment of anorectal lesions and grading severity of stenosis. ...
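A common way to relate an ICC to a standard error of measurement is SEM = SD × sqrt(1 − ICC), where SD is the between-subject standard deviation of the scores. The abstract does not state which formula the authors used, so the sketch below only illustrates that conventional relation; the function names and the numbers are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from score SD and reliability (ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def sd_from_sem(sem, icc):
    """Invert the relation above to recover the implied score SD."""
    if not 0.0 <= icc < 1.0:
        raise ValueError("icc must lie in [0, 1)")
    return sem / math.sqrt(1.0 - icc)


# Hypothetical index with SD 10 and ICC 0.75.
example_sem = sem_from_icc(10.0, 0.75)
```

Under this relation a lower ICC at the same SD gives a larger SEM, consistent with the inter-rater SEMs above being larger than the intra-rater ones for each index.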
Purpose: To determine external, middle, and inner ear abnormalities on high-resolution computed tomography (HRCT) of temporal bone in patients with microtia and to predict anatomic external and middle ear anomalies as well as the degree of functional hearing impairment based on clinical grades of microtia. Materials and Methods: This was a retrospective study conducted in an Indian population. Fifty-two patients with microtia were evaluated for external, middle, and inner ear anomalies on HRCT of temporal bone. Clinical grading of microtia was done based on criteria proposed by Weerda et al. in 37 patients and degree of hearing loss was assessed using pure tone audiometry or brainstem-evoked response in 32 patients. Independent statistical correlations of clinical grades of microtia with both external and middle ear anomalies detected on HRCT and the degree of hearing loss were finally obtained. Results: The external, middle, and inner ear anomalies were present in 93.1%, 74.5%, and 2.7% patients, ...
We included randomised clinical trials with blinded and non-blinded assessment of the same binary outcome. We excluded trials where it was unclear which group was experimental and which was control as such trials would not allow us to determine the direction of any bias; trials in which only a subgroup of patients had been evaluated by blinded and non-blinded assessors, unless they were selected at random; trials in which blinded and non-blinded assessors had access to each other's results (for example, blinded assessments were provided to non-blinded assessors as a quality enhancement procedure); and trials where initially blinded assessors clearly had become unblinded, for example, when radiographs showed ceramic material indicative of the experimental intervention. Finally, we excluded trials with blinded end point committees adjudicating the assessments made by non-blinded clinicians because such adjudication often involves previous knowledge of the non-blinded assessment or is restricted to ...
Interobserver agreement for the current ATS/ERS/JRS/ALAT CT criteria for UIP is only moderate among thoracic radiologists, irrespective of their experience, and did not vary with patient age or the MDT diagnosis.
Background and Aims: The recent development of microforceps for EUS through-the-needle biopsy (TTNB) sampling of the wall of pancreatic cystic lesions (PCLs) allows the collection of histologic specimens never handled and evaluated before by pathologists. We aimed to estimate the interobserver agreement among pathologists in evaluating such samples. Methods: TTNB specimen slides from 40 PCLs with worrisome features were retrieved and independently evaluated for specimen adequacy, presence of lining epithelium, grade of epithelial dysplasia, presence of ovarian type stroma, and specific diagnosis by 6 expert pathologists from 6 different tertiary care centers. Gwet's AC1 was used to assess interobserver agreement. Results: An almost perfect agreement was observed for specimen adequacy (AC1, .82; 95% confidence interval [CI], .79-.98), presence of lesional epithelium (AC1, .90; 95% CI, .86-.92), epithelial dysplasia (AC1, .97; 95% CI, .95-.99), and ovarian-like stroma (AC1, .90; 95% CI, ...
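Gwet's AC1 differs from Cohen's kappa in how chance agreement is estimated, which makes it less sensitive to very unbalanced category frequencies. The sketch below shows the two-rater form only; the study above involved six pathologists, which needs the multi-rater generalisation not shown here. The function name `gwet_ac1` and the adequacy labels are hypothetical.

```python
def gwet_ac1(ratings_a, ratings_b, categories):
    """Gwet's first-order agreement coefficient (AC1) for two raters."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    q = len(categories)
    if q < 2:
        raise ValueError("need at least two categories")
    n = len(ratings_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: average of pi * (1 - pi) over categories, where pi is
    # the mean of the two raters' marginal proportions for that category.
    chance = 0.0
    for category in categories:
        pi = (ratings_a.count(category) + ratings_b.count(category)) / (2 * n)
        chance += pi * (1 - pi)
    chance /= q - 1
    return (observed - chance) / (1 - chance)


# Hypothetical adequacy calls by two pathologists on ten specimens.
path_1 = ["adequate"] * 9 + ["inadequate"]
path_2 = ["adequate"] * 8 + ["inadequate"] * 2
ac1 = gwet_ac1(path_1, path_2, ["adequate", "inadequate"])
```

With ratings this skewed toward one category, Cohen's kappa would be pulled down by high expected agreement, whereas AC1 stays closer to the raw agreement rate.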
Human cancers are still diagnosed and classified using the light microscope. The criteria are based upon morphologic observations by pathologists and tend to be subject to interobserver variation. In preoperative biopsies of non-small cell lung cancers, the diagnostic concordance, even amongst experienced pulmonary pathologists, is no better than a coin-toss. Only 25% of cancer patients, on average, benefit from therapy as most therapies do not account for individual factors that influence response or outcome. Unsuccessful first line therapy costs Canada CAN$1.2 billion for the top 14 cancer types, and this extrapolates to $90 billion globally. The availability of accurate drug selection for personalized therapy could better allocate these precious resources to the right therapies. This wasteful situation is beginning to change with the completion of the human genome sequencing project and with the increasing availability of targeted therapies. Both factors are giving rise to attempts to correlate tumor
Breast cancer is the most common form of cancer in women. Clinicians favor 2D ultrasonography for breast tissue abnormality screening due to high sensitivity and specificity compared to competing technologies. However, inter- and intra-observer variability in visual assessment and reporting of lesions often handicaps its performance. In this work we present a completely automatic system for detection and segmentation of breast lesions in 2D ultrasound images. We employ random forests for learning of tissue specific primal to discriminate breast lesions from surrounding normal tissues. This enables it to detect lesions of multiple shapes and sizes, as well as discriminate between hypo-echoic lesion from associated posterior acoustic shadowing ...
Following the review, the researchers found that high concordance and interobserver reproducibility were present with the PD-L1 IHC 28-8 pharmDx, PD-L1 IHC 22C3 pharmDx, and Ventana PD-L1 SP263 clinical trial assays for PD-L1 expression on tumor cell membranes. Lower PD-L1 expression was detected with Ventana PD-L1 SP142. Additionally, immune-cell PD-L1 expression was variable and interobserver concordance was poor. Researchers also noted the variable effects on PD-L1 expression for inter- and intratumoral heterogeneity ...
Improving the accuracy of a survey is the focus of Mark S. Litwin's book, which shows how to assess and interpret the quality of survey data by thoroughly examining the survey instrument used. He explains how to code and pilot test new and established surveys. In addition, he covers issues such as: how to measure reliability (including test-retest, alternate form, internal consistency, inter-observer and intra-observer reliability); how to measure validity (including content, criterion and construct validity); how to address cross-cultural issues in survey research; and how to scale and score a survey. ...
Authors: Radwan, Ahmed; Bigney, Kyle A.; Buonomo, Haily N.; Jarmak, Michael W.; Moats, Shannon M.; Ross, Jaimie K.; Tatarevic, Enida; Tomko, Mary Anne. Article Type: Research Article. Abstract: PURPOSE: To evaluate the extent of intra-subject difference in hamstring flexibility and its possible relationship to the severity of Low Back Pain (LBP). A secondary purpose was to evaluate the extent of intra-rater reliability using both electrogoniometer and conventional goniometer for measuring hamstring tightness. IMPORTANCE: Potential correlations between muscle impairments and LBP may lead to more effective treatments and prevention strategies. METHODS: Seventy-two participants with mechanical LBP were recruited for this study. The sample included 41 females and 31 males with a mean age of 33.69 ± 11.04 years, height of 170 ± 9 cm, and weight of 79.5 ± 1.6 kg. Hamstring length was detected indirectly using the Active Knee Extension method in the 90/90 position from supine. The amount ...
Intra-reader and inter-reader single-slice renal sinus fat measurements. Renal sinus fat measures between one reader (A) and two readers (B) are plotted. Intra-
We previously proposed a novel image-based quality assessment technique1 to assess the perceptual quality of clinical chest radiographs. In this paper, an observer study was designed and conducted to systematically validate this technique. Ten metrics were involved in the observer study, i.e., lung grey level, lung detail, lung noise, rib-lung contrast, rib sharpness, mediastinum detail, mediastinum noise, mediastinum alignment, subdiaphragm-lung contrast, and subdiaphragm area. For each metric, three tasks were successively presented to the observers. In each task, six ROI images were randomly presented in a row and observers were asked to rank the images based only on a designated quality and to disregard the other qualities. A range slider at the top of the images was used by observers to indicate the acceptable range based on the corresponding perceptual attribute. Five board-certified radiologists from Duke participated in this observer study on a DICOM calibrated diagnostic display ...
A benefit of the automatic organ delineation is to minimise inter-observer variation and uncertainty. In order to calculate dose, a pseudo-CT scan would be automatically created from the MRI scan with electron densities mapped to the tissues. This is performed by having a CT electron density atlas that corresponds exactly to the MRI atlas (Fig. 3 and 4). As the deformation from the MRI atlas to the patient's MRI scan is known, the same deformation will work for the CT density atlas. The result is a pseudo-CT scan with electron densities mapped to the patient's MRI scan. Using this pseudo-CT atlas, it is also possible to generate digitally reconstructed radiographs from MRI scans [8]. Work to validate these pseudo-CTs for treatment planning is ongoing [6-9]. ...
The role of imaging in the evaluation of tumor response is expanding rapidly. The current response evaluation criteria in solid tumors (RECIST), based on anatomical changes, suffer from many limitations related mainly to the inter- and intra-observer variability in delineating the tumoral edges. Consequently, there is a need to update and integrate the RECIST criteria beyond the classical anatomical changes with other more sophisticated methods using three-dimensional and functional criteria. The goal of this paper is to review the current criteria of RECIST measurements (RECIST 1.1) with their limitations and to evaluate the emerging solutions available with the new imaging techniques like PET-CT. ...
Background/Aim: The aim of this study was to analyze the inter- and intra-observer variability regarding biopsy technique in bone and soft tissue sarcoma based ...
BACKGROUND: An enlarging aneurysm after endovascular aneurysm repair (EVAR) without clear endoleak is a clinical challenge. Management of this problem is guided by the current evidence for adequate EVAR follow up and recommended thresholds for re-intervention. In a frail patient, careful risk assessment of aneurysm related mortality against the risks associated with examinations and interventions is required. METHODS: The literature was reviewed for imaging modalities for EVAR follow up and their advantages and disadvantages. The current evidence and guideline recommendations regarding follow up and re-intervention after EVAR were assessed in relation to the presented case. RESULTS: To detect sac expansion after EVAR, repeated examinations with the same imaging modality are needed. Verified expansion must be above the inter-observer variation of the method used. Although duplex ultrasound is an excellent modality for EVAR follow up, the finding of a significant expansion on duplex requires ...
In this study, we found a fair-to-moderate inter- and intraobserver reliability for the presence of microbleeds by using 3D T2*-weighted imaging at 1.5T and moderate-to-good reliability by using dual-echo T2*-weighted imaging at 7T. For the number of microbleeds, the reliability was better; it was moderate at 1.5T and good to very good at 7T. Overall, we found an increased reliability for both presence and number of microbleeds at 7T compared with 1.5T. Cerebral microbleeds were detected in more patients at 7T than at 1.5T. Furthermore, the number of microbleeds detected was higher at 7T. Besides the detection of microbleeds in more patients and a higher number of microbleeds at 7T, the reliability of the detection also improves. Compared with 1.5T, both inter- and intraobserver reliability improved at 7T. This might be due to the higher resolution, increased SNR, and the use of the dual-echo sequence at 7T. Recently, we showed that the TE1 image provides a good contrast of the dark ...
Taeymans O, Duchateau L, Schreurs E, et al. Veterinary Radiology & Ultrasound 2005;46:139-142. The repeatability of ultrasonographic measurements of the canine
In Figures 1 and 2 (D, E, and F), Lin's plots compare regression lines of the data with perfect agreement lines. For weight (plots 1D and 2D), the straight line representing agreement between self-reported and measured values is close to the perfect agreement line. However, for height and BMI (plots E and F), data are more scattered and, therefore, further away from perfect agreement. Discussion. When we compared reported and assessed data, we saw a trend for underestimating weight and height in both genders, which was higher in obese adolescents regarding weight (up to 6 kg) and in underweight subjects regarding height (up to 5.9 cm for boys and 3.9 cm for girls). These values were higher than those presented in other studies [1,6-8], where tendencies to underestimate weight (from 0.5 to 2.6 kg), and to overestimate height (from 0.1 to 0.8 cm) were observed. Abraham et al. [1] and Peixoto et al. [16] showed that age, height, and current weight, schooling, income, the frequency these data are ...
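The Lin's plots mentioned above judge how far paired data fall from the line of perfect agreement. One common numerical companion to such a plot is Lin's concordance correlation coefficient. The snippet does not say whether the authors computed it, so the following is only a minimal sketch under that assumption; the function name `lin_ccc` and the sample weights are hypothetical.

```python
from statistics import fmean

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Uses population (divide-by-n) variances and covariance.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("coefficient undefined for constant, identical data")
    # Penalises both scatter (via covariance) and systematic shift (mean gap).
    return 2 * cov_xy / denominator

# Hypothetical measured weights (kg) and self-reports shifted by a constant.
measured = [50.0, 60.0, 70.0, 80.0]
reported = [v + 10.0 for v in measured]
ccc_identical = lin_ccc(measured, measured)
ccc_shifted = lin_ccc(measured, reported)
```

A constant offset leaves the Pearson correlation at 1 but lowers this coefficient, which is why it suits the comparison with a perfect agreement line.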
An attempt is made to characterize quantitatively the several components of variation existing in the determination of total lung capacity (TLC) by a radiological method. An analysis of variance indicates that over 94 percent of the total variation could be attributed to film differences, the remaining 6 percent being a consequence of interobserver and intraobserver variation and the inherent rand
The localization receiver operating characteristic (LROC) curve is a standard method to quantify performance for the task of detecting and locating a signal. This curve is generalized to arbitrary detection/estimation tasks to give the estimation ROC (EROC) curve. For a two-alternative forced-choice study, where the observer must decide which of a pair of images has the signal and then estimate parameters pertaining to the signal, it is shown that the average value of the utility on those image pairs where the observer chooses the correct image is an estimate of the area under the EROC curve (AEROC). The ideal LROC observer is generalized to the ideal EROC observer, whose EROC curve lies above those of all other observers for the given detection/estimation task. When the utility function is nonnegative, the ideal EROC observer is shown to share many mathematical properties with the ideal observer for the pure detection task. When the utility function is concave, the ideal EROC observer makes use ...
Doctoral thesis (2009). Agreement between raters on a categorical scale is not only a subject of scientific research but also a problem frequently encountered in practice. Whenever a new scale is developed to assess individuals or items in a certain context, inter-rater agreement is a prerequisite for the scale to be actually implemented in routine use. Cohen's kappa coefficient is a landmark in the development of rater agreement theory. This coefficient, which brought about a radical change from previously proposed indexes, opened a new field of research in the domain. In the first part of this work, after a brief review of agreement on a quantitative scale, the kappa-like family of agreement indexes is described in various instances: two raters, several raters, an isolated rater and a ...
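For the two-rater case named above, Cohen's kappa compares the observed proportion of agreement with the agreement expected by chance from each rater's marginal label frequencies. The following is a minimal sketch of that standard definition, not code from the thesis; the function name `cohen_kappa` and the ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical ratings of ten items by two raters.
rater_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos", "neg", "neg"]
rater_2 = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "neg"]
kappa = cohen_kappa(rater_1, rater_2)
```

Here the raters agree on 8 of 10 items, chance agreement is 0.52, and kappa is 0.28 / 0.48, roughly 0.58.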
MUNOZ-MOLINA, Maribel; POO-FIGUEROA, Ana María; BUSTOS-MEDINA, Luis and BAEZA-WEINMANN, Bernardita. Agreement among three examiners and one expert in the detection of mother-infant attachment risk during the post-partum period, Temuco, IX región, Chile, 2010. Rev Colomb Obstet Ginecol [online]. 2014, vol.65, n.2, pp.129-138. ISSN 0034-7434. http://dx.doi.org/10.18597/rcog.61. Introduction: It is advisable to assess the bonding between the mother and the newborn in order to identify abnormalities in this interaction and plan for early interventions to facilitate attachment. The examiner must make reliable observations for diagnosis and this requires training and experience. Objective: To describe inter-observer agreement among three examiners and one expert in the detection of mother-infant attachment risk, applying the Kimelman attachment assessment guide. Materials and methods: Cross-sectional study. Mother-infant pairs were assessed during their stay at Hospital Hernán Henríquez ...
Detection of multiple lesions (signals) in images is a medically important task, and Free-response Receiver Operating Characteristic (FROC) analysis and its variants, such as Alternative FROC (AFROC) analysis, are commonly used to quantify performance in such tasks. However, ideal observers that optimize FROC or AFROC performance metrics have not yet been formulated in the general case. If available, such ideal observers may turn out to be valuable for imaging system optimization and in the design of computer aided diagnosis (CAD) techniques for lesion detection in medical images. In this paper we derive ideal AFROC and FROC observers. They are ideal in that they maximize, amongst all decision strategies, the area under the associated AFROC or FROC curve. In addition, these ideal observers minimize Bayes risk for particular choices of cost constraints. Calculation of observer performance for these ideal observers is computationally quite complex. We can reduce this complexity by considering forms ...
Perfusion CT is a technology that allows functional evaluation of tissue vascularity. Because of this potential, it is finding increasing utility in oncology. Although CT technique has advanced continuously since its introduction, some clinical and technical issues remain to be defined. In this study, we compared two widely employed mathematical models (dual input one compartment model - DOCM - and maximum slope - SM -), analyzing their robustness to noise. We carried out a computer simulation to quantify the effect of noise on the evaluation of an important perfusion parameter (Arterial Blood Flow - BFa) in liver tumours. A total of 4500 liver TAC, corresponding to 3 fixed BFa values, were simulated using different arterial and portal TAC (computed from 5 real CT images) at 10 values of signal to noise ratio (SNR). BFa values were calculated by applying four different, specifically developed algorithms to these noisy simulated curves.
RESULTS: Six hundred twenty-two children were included (320 boys, 302 girls), ranging from 1 day to 15 years of age. Normal values (from the 3rd to 97th percentile) are provided for each parameter. All parameters showed rapid growth up to 3 years of age followed by slower (FOD, APD, LCC, GT and ST) or absent (S/T) growth. Growth of BT and IT was completed by 7-8 years. CC modeling (IT/ST) was completed by 3 years. FOD was larger in boys from the age of 1 year (statistically significant). The other parameters did not show any sex effect. Inter- and intraobserver agreement was excellent for all parameters except for IT. ...
Although the reliability of the proposed diagnosis has not yet been established through clinical replication studies published in peer-reviewed journals, this should not be a barrier as field trials are being planned in time to make it into the manual just under the wire. The sites for the field trials will be strategically selected to maximize positive findings. Similarly, high inter-rater reliability will be assured through careful selection, training, and certification of raters by the Bipolar By Proxy Promulgation Association. The journal whose editorial board is dominated by that Association is expected to publish the positive findings. The larger question of validity is not thought to be a problem, as many other current and proposed diagnoses lack real-world validity ...
Limited research on the reliability of cognitive case formulation suggests cognitive therapists can agree about clients' presenting problems but show poor agreement about the inferential aspects of formulation. There has been no research examining the quality of practitioners' case formulations. This study assessed whether participants with different levels of experience could produce reliable cognitive formulations using a systematic cognitive therapy case formulation method: the J. Beck Case Conceptualization Diagram. As part of continuing education workshops on cognitive case formulation, 115 mental health practitioners were given the same case description and asked to provide case formulations. Inter-rater agreement and agreement with a benchmark formulation provided by J. Beck were measured. The results showed that participants were able to agree with each other and with the benchmark on most descriptive aspects of the formulation but rates of agreement decreased for aspects of the formulation
From meningitis in man, encephalitis in cattle and sheep, a myocardial infection in fowl, and a generalized infection in rabbits, different observers have isolated Gram-positive organisms which are closely related. Their cultural and serological properties are described. When injected intravenously into chickens, rabbits, or guinea pigs there is an unusual blood response, the monocytes being markedly increased. The organisms tend to localize in the myocardium with resulting necrosis. ...
We identified 904 unique papers. Of these, 90 were relevant to near patient testing in primary care. Only 26 scored 4 or 5 on the initial assessment of validity. The main reasons for failure to reach the cut off were absence or inadequacy of the reference standard and inappropriate statistical analyses. The additional electronic searches for 1997-9 found 11 more relevant papers, of which six passed the quality filter. The 32 papers described 209 comparisons. Of these, 49 related to repeatability (intraobserver variability) of tests, which is not considered further here. The most interesting data emerged in the comparisons of test performance (n=150) and impact (n=10). Tables giving the key points from these papers are available on the BMJ's website. Test performance and impact were considered separately. Details of the other relevant papers have been published [11]. We extended the traditional view of test performance to include comparisons of the same test when operated and read by different ...
Red cell sizes are described on the scale of micrometers. For a novice it is difficult to imagine the cell size. Measuring cell size requires the use of advanced instruments, which mandates the use of trained personnel. The aim was to measure the red cell size using a custom-built microscope camera and computer-based image analysis software. Sixty peripheral blood smears were photographed using a microscope camera. The images were stored in a computer. The retrieved images were measured offline using the software. Two independent observers recorded the red cell sizes on both randomly selected and tagged red cells of each slide. The mean red cell width of the tagged red cells as measured by the first and second observer was 7.70-7.82 (±0.61 μm) and 7.71-7.76 (±0.61 μm), respectively. The measured red cell width ranges between 7.49-7.67 (±0.54 μm) and 7.34-7.44 (±0.60 μm)