Abstract: Purpose: To evaluate the test-retest reproducibility of ATL HDI-5000 CDI measurements of volumetric blood flow as compared to an in-vitro phantom flow model. Methods: A phantom flow model was constructed using agarose gel to mimic fatty soft tissue. Lumens of 1.57 mm and 2.36 mm were created in the gel. A UHDC flow system pumped blood-mimicking fluid through each tube at three different rates. The ATL HDI-5000 measured the velocity and volumetric flow in the phantom model using cineloops (a cineloop is a rapidly acquired sequence of CDI images). A newly developed software package from ATL calculated both volumetric flow and velocity from the cineloops. Measurements were performed with the probe in four different positions: 1) 45° angle, parallel to the flow, 2) 45° angle, offset to the flow, 3) 75° angle, parallel to the flow, and 4) 75° angle, offset to the flow. The coefficient of variation was then calculated for each of the probe positions. Results: The average coefficients of ...
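The per-position variability statistic described above can be sketched as follows. This is a minimal illustration, assuming the usual definition of the coefficient of variation (sample standard deviation divided by the mean); the function name and the flow readings are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample coefficient of variation (SD / mean) in percent."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; the coefficient of variation is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical repeated volumetric-flow readings at one probe position.
readings = [10.0, 12.0, 11.0, 9.0, 13.0]
print(f"CV = {coefficient_of_variation(readings):.2f}%")
```

A lower value indicates more repeatable readings at that probe position; comparing the value across the four positions is the kind of comparison the abstract describes.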
Test-retest reproducibility study of [C-11]Preladenant. Assessment of stability and variation of the PET measures in healthy volunteers.
Studies of elderly patients with Garden-I and Garden-II femoral neck fractures (FNFs) suggest that a preoperative posterior tilt of the femoral head of at least 20° increases the risk of fixation failure. A recently published treatment algorithm recommended hemiarthroplasty over internal fixation for elderly patients with Garden-I and Garden-II FNFs and a preoperative posterior tilt of at least 20°. However, the reliability of the method used to measure the posterior tilt has not been assessed according to recommended standards for reliability trials. Four orthopedic registrars and four consultants measured the posterior tilt angle in 50 preoperative lateral radiographs on two occasions six weeks apart. We estimated inter- and intrarater reliability by the intraclass correlation coefficient (ICC). We also assessed repeatability by the repeatability coefficient (RC) and agreement by the minimal detectable change (MDC). Based on the suggested cutoff value of 20°, we reported the overall percentage and
Expert psychiatrists conducting work disability evaluations often disagree on work capacity (WC) when assessing the same patient. More structured and standardised evaluations focusing on function could improve agreement. The RELY studies aimed to establish the inter-rater reproducibility (reliability and agreement) of functional evaluations in patients with mental disorders applying for disability benefits and to compare the effect of limited versus intensive expert training on reproducibility. We performed two multi-centre reproducibility studies on standardised functional WC evaluation (RELY 1 and 2). Trained psychiatrists interviewed 30 and 40 patients respectively and determined WC using the Instrument for Functional Assessment in Psychiatry (IFAP). Three psychiatrists per patient estimated WC from videotaped evaluations. We analysed reliability (intraclass correlation coefficients [ICC]) and agreement (standard error of measurement [SEM] and proportions of comparisons within prespecified limits
Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method - Background: Low-level processing and normalization of microarray data are most important steps in microarray analysis, which have profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail and the nature of microarray data in general. Results: A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit (lowess) normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization are
BACKGROUND AND PURPOSE: The 6-minute walk test (6MWT) is widely used as a clinical outcome measure. However, the reliability of the 6MWT is unknown in individuals who have recently experienced a hip fracture. The aim of this study was to evaluate the relative and absolute interrater reliability of the 6MWT in individuals with hip fracture. METHODS: Two senior physical therapy students independently examined a convenience sample of 20 participants in a randomized order. Their assessments were separated by 2 days and followed the guidelines of the American Thoracic Society. Hip fracture-related pain was assessed with the Verbal Ranking Scale. RESULTS: Participants (all women) with a mean (standard deviation) age of 78.1 (5.9) years performed the test at a mean of 31.5 (5.8) days postsurgery. Of the participants, 10 had a cervical fracture and 10 had a trochanteric fracture. Excellent interrater reliability (intraclass correlation coefficient [ICC2.1] = 0.92; 95% confidence interval, 0.81-0.97) ...
Background: This paper presents the first meta-analysis for the inter-rater reliability (IRR) of journal peer reviews. IRR is defined as the extent to which two or more independent reviews of the same scientific document agree. Methodology/Principal Findings: Altogether, 70 reliability coefficients (Cohen's Kappa, intra-class correlation [ICC], and Pearson product-moment correlation [r]) from 48 studies were taken into account in the meta-analysis. The studies were based on a total of 19,443 manuscripts; on average, each study had a sample size of 311 manuscripts (minimum: 28, maximum: 1983). The results of the meta-analysis confirmed the findings of the narrative literature reviews published to date: The level of IRR (mean ICC/r2 = .34, mean Cohen's Kappa = .17) was low. To explain the study-to-study variation of the IRR coefficients, meta-regression analyses were calculated using seven covariates. Two covariates that emerged in the meta-regression analyses as statistically significant to gain an
TY - JOUR. T1 - Assessment of shoulder active range of motion in prone versus supine. T2 - A reliability and concurrent validity study. AU - Furness, James. AU - Johnstone, Scott. AU - Hing, Wayne. AU - Abbott, Allan. AU - Climstein, Mike. N1 - © and inclinometer have been shown to be reliable tools that show good concurrent validity. PY - 2015/10/3. Y1 - 2015/10/3. N2 - BACKGROUND: As swimming and surfing are prone dominant sports, it would be more sport specific to assess shoulder active range of motion in this position. OBJECTIVES: To determine the reliability of the inclinometer and HALO© for assessing shoulder active range of motion in supine and prone and the concurrent validity of the HALO©. Concurrent validity is based on the comparison of the HALO© and inclinometer. To determine if active range of motion (AROM) differences exist between prone and supine when assessing shoulder internal (IR) and external rotation (ER). DESIGN: The design included clinical measurement, reliability and ...
This study comprised two phases. In Phase One, an intensive literature review was performed to facilitate item generation for the initial item pool. This was then subjected to a review by a panel of experts to establish content validity. Phase Two involved the actual testing of the content-validated item pool amongst a sample of ICU nurses from the target population. Ethical approval was obtained from the relevant hospitals. Classical Test Theory was implemented for psychometric evaluation of the instrument. Reliability of the instrument was addressed through the technique of test-retest reliability using Pearson's product-moment correlation coefficient and the intraclass correlation coefficient. Finally, the internal consistency of the instrument was addressed to examine the tool's stability ...
Intraclass test-retest reliability coefficients (one-way ANOVA model for a single measure) ranged from .940 to .996. Validity coefficients determined by Pearson product moment correlation coefficients for males and females, respectively, were as follows: B-90° DTE vs. PRC-DTE = .82, .62 (p < .05); B-90° DTE vs. PRC-STE = .55, .38 (p < .05); B-90° DTE vs. DSBL = −.29, −.23; FG-TE vs. PRC-DTE = .23, −.11; FG-TE vs. PRC-STE = −.15, .33; and FG-TE vs. DSBL = −.04, −.36. ...
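A one-way ANOVA intraclass coefficient for a single measure, as named above, can be sketched as follows. This is a minimal illustration under the standard one-way random-effects definition, ICC = (MSB − MSW) / (MSB + (k − 1) MSW), with equal numbers of trials per subject; the function name and the scores are hypothetical and do not come from the study.

```python
def icc_oneway_single(data):
    """One-way random-effects ICC for a single measure.

    data: list of per-subject lists, each holding k repeated scores.
    """
    n = len(data)
    k = len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 scores")

    grand_mean = sum(sum(row) for row in data) / (n * k)
    subject_means = [sum(row) / k for row in data]

    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(data, subject_means) for x in row
    )

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test and retest scores for three subjects.
scores = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(f"ICC = {icc_oneway_single(scores):.3f}")
```

The coefficient approaches 1 when differences between subjects dominate the trial-to-trial differences within each subject.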
onstruct validity with enjoyment and BMI, and on cross-sectional concurrent validity with objectively measured MVPA (tri-axial accelerometry) over the span of seven consecutive days. Study 3 (n = 58) examined the PAQ-C-It reliability, construct validity with BMI and VO₂ max as the objective measurement among a population of children with congenital heart defects (CHD). In study 2 and 3, the factor structure of the PAQ-C-It was then re-examined with an EFA. The PAQ-C-It showed acceptable to good reliability (alpha .70 to .83). Results on construct validity showed moderate but significant association with enjoyment perception (r = .30 and .36), with BMI (r = -.30 and -.79 for CHD simple form), and with the VO₂ max (r = .55 for CHD simple form). Significant concurrent validity with the objectively measured MVPA was reported (rho = .30, p < .05). Findings of the EFA suggested a two-factor structure for the PAQ-C-It, with items 2, 3, and 4 contributing little to the total score. This study ...
TY - JOUR. T1 - The joint council on thoracic surgery education coronary artery assessment tool has high interrater reliability. AU - Lee, Richard. AU - Enter, Daniel. AU - Lou, Xiaoying. AU - Feins, Richard H.. AU - Hicks, George L.. AU - Gasparri, Mario. AU - Takayama, Hiroo. AU - Young, J Nilas. AU - Calhoon, John H.. AU - Crawford, Fred A.. AU - Mokadam, Nahush A.. AU - Fann, James I.. PY - 2013/6. Y1 - 2013/6. N2 - Background: Barriers to incorporation of simulation in cardiothoracic surgery training include lack of standardized, validated objective assessment tools. Our aim was to measure interrater reliability and internal consistency reliability of a coronary anastomosis assessment tool created by the Joint Council on Thoracic Surgery Education. Methods: Ten attending surgeons from different cardiothoracic residency programs evaluated nine video recordings of 5 individuals (1 medical student, 1 resident, 1 fellow, 2 attendings) performing coronary anastomoses on two simulation models, ...
To the best of our knowledge, this study is the first to validate a questionnaire (translated into Brazilian Portuguese) that measures the quality of life of women diagnosed with cervical intraepithelial neoplasia. The FACIT-CD questionnaire was developed by Rao et al. [6] in 2010. To date, no other studies have evaluated the psychometric properties of this instrument, which means that some comparisons are only exploratory. The first test assessed the reliability of the questionnaire by analysing the internal consistency using Cronbach's alpha coefficient. Results higher than 0.70 indicate that the items on the scales or domains are homogeneous or that they measure the same attribute. In this study, the value on the relationship scale was lower than expected (0.66). However, other authors support the hypothesis that Cronbach's alpha values higher than 0.60 could be acceptable [31]. Despite this assumption, we believe that a value of 0.70 would be more desirable, and thus, we ...
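For readers unfamiliar with the internal-consistency statistic discussed above, the following is a minimal sketch assuming the standard formula, alpha = k / (k − 1) × (1 − sum of item variances / variance of total scores). The function name and the item scores are hypothetical and unrelated to the FACIT-CD data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of k lists, one per item, aligned by respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses of four people to a two-item scale.
item_scores = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(f"alpha = {cronbach_alpha(item_scores):.2f}")
```

Under the thresholds quoted in the text, a result above 0.70 would be read as homogeneous items, and one between 0.60 and 0.70 as borderline.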
The Health Education Impact Questionnaire (heiQ) evaluates the effectiveness of health education and self-management programs provided to people dealing with a wide range of conditions. The aim of this study was to translate, culturally adapt and validate the Dutch translation of the heiQ and to compare the results with the English, German and French translations. A systematic translation process was undertaken. Psychometric properties were studied among patients with arthritis, atopic dermatitis, food allergy and asthma (n = 286). Factorial validity using confirmatory factor analysis, item difficulty (D), item remainder correlation and composite reliability were assessed. Stability was tested using the intra-class correlation coefficient (ICC). Items were well understood and only minor language adjustments were required. Confirmatory fit indices were >0.95 and item difficulty was D ≥ 0.65 for all items in scales showing acceptable fit indices, except for the reversed Emotional distress scale. Composite
Background: Evaluation of physical activity by condition-specific surveys provides more accurate results than generic physical activity questionnaires. The aim of this study was to investigate the reliability and validity of the Kaiser Physical Activity Survey (KPAS) in Turkish pregnant women. Methods: In the translation and cultural adaptation of the KPAS, the 6-phase guidelines recommended in the literature were followed. The study included a total of 151 pregnant women who were assessed using the Turkish version of KPAS, the Pregnancy Physical Activity Questionnaire, and the SenseWear Pro3 Armband. To determine the test-retest reliability, the KPAS was reapplied after 7 days. The psychometric properties of KPAS were analyzed with respect to internal consistency, test-retest reliability, and concurrent validity. Results: Cronbach α coefficient indicating the internal consistency of the Turkish KPAS was found to be .60 to .80, showing moderate reliability. The intraclass correlation ...
NHTSA has previously conducted testing to evaluate the repeatability of the oblique offset moving deformable barrier test procedure. Since this testing, NHTSA has made changes to the test procedure, and changes to regulations and consumer information testing have propagated to the vehicle fleet. Therefore, there is a need to re-evaluate the repeatability of the test procedure. Also, the reproducibility of the test procedure needs to be evaluated to determine the variability of the test results among multiple test facilities. To evaluate the repeatability and reproducibility of the test procedure three tests of a single vehicle model were conducted at three different test facilities for a total of nine tests. The responses of the vehicle and its occupants, THOR 50th percentile male ATDs in the driver and right front passenger seating positions, were evaluated to determine repeatability within a single test facility and for reproducibility among the three test facilities. The results demonstrated ...
BACKGROUND: Accurate limb volume measurement is key in the assessment of outcomes in lymphedema microsurgery. There are two commonly used methods as follows: manual circumferential measurement (tape) or Perometer measurement. There are no data on the intra- and interclass correlation of either method, making it difficult to establish a gold standard of limb volume measurement. We aim to assess the intra- and interclass correlation of each method to establish the most appropriate method for clinical practice and future research studies, aiming to compare the accuracy and reliability of tape measurement as assessed against Perometer measurement. METHODS AND RESULTS: Student volunteers and experts (lymphedema practitioners) were each asked to perform repeat tape and Perometer measurements on the upper or lower limb of one healthy volunteer. Perometer measurements were globally more accurate than tape (average SE [Perometer]: 23.23 vs. 77.21 [tape]). For intraobserver reliability, experts outperformed
OBJECTIVES: To test the reproducibility of the ABILOCO questionnaire. To validate the patient self-reporting method and the third-party assessment of the stroke patient's locomotion ability by a treating physical therapist. DESIGN: Prospective study. SETTING: University hospital. PARTICIPANTS: Adult stroke patients (N=28; 59±13 y). The time since stroke ranged from 3 to 253 weeks. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURE: The ABILOCO questionnaire. RESULTS: The results of patient self-assessment and the results of the third-party assessments by the physiotherapists at a 2-week interval were highly correlated (intraclass correlation coefficient [ICC]=.77 and ICC=.89, respectively). The results of the patient self-assessment and the third-party assessment by the physical therapist were both well correlated to assessment by an independent medical examiner who observed the patient during the 13 ABILOCO activities (ICC=.69 and ICC=.87, respectively). CONCLUSIONS: The use of ABILOCO as a ...
Few studies have evaluated changes in parent-child agreement in HRQOL over time. The objectives of the study were to assess parent-child agreement on the child's HRQOL in a 3-year longitudinal study, and to identify factors associated with possible disagreement. A sample of Spanish children/adolescents aged 8-18 years and their parents both completed the KIDSCREEN-27 questionnaire. Data on age, gender, family socioeconomic status (SES), and mental health (Strengths and Difficulties Questionnaire, SDQ) were also collected at baseline (2003), and again after 3 years (2006). Changes in family composition were collected at follow-up. Agreement was assessed through the intraclass correlation coefficient (ICC) and Bland and Altman plots. Generalized Estimating Equation (GEE) models were built to analyze factors associated with parent-child disagreement. A total of 418 parent-child pairs were analyzed. At baseline the level of agreement on HRQOL was low to moderate and it was related to the level of HRQOL reported.
Purpose: To comprehensively assess the precision and agreement of anterior corneal power measurements using 8 different devices. Methods: Thirty-five eyes from 35 healthy subjects were included in the prospective study. In the first session, a single examiner performed measurements on each subject, in random order, with the RC-5000 (Tomey Corp., Japan), KR-8000 (Topcon, Japan), IOLMaster (Carl Zeiss Meditec, Germany), E300 (Medmont International, Australia), Allegro Topolyzer (Wavelight AG, Germany), Vista (EyeSys, TX), Pentacam (Oculus, Germany) and Sirius (CSO, Italy). Measurements were repeated in the second session (1 to 2 weeks later). Repeatability and reproducibility of corneal power measurements were assessed based on the intrasession and intersession within-subject standard deviation (Sw), repeatability (2.77Sw), coefficient of variation (COV), and intraclass correlation coefficient (ICC). Agreement was evaluated by 95% limits of agreement (LoA). Results: All devices demonstrated high repeatability and
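The precision metrics listed above can be sketched as follows. This is an illustrative computation only, assuming equal numbers of repeated readings per eye, Sw taken as the square root of the mean within-subject variance, and the 2.77 multiplier from the text (approximately 1.96 × √2). The function names and the corneal power values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def within_subject_sd(data):
    """Sw: square root of the mean of the per-subject sample variances."""
    return sqrt(mean([variance(row) for row in data]))


def precision_summary(data):
    """Return (Sw, repeatability limit 2.77 * Sw, COV in percent)."""
    sw = within_subject_sd(data)
    overall_mean = mean([x for row in data for x in row])
    return sw, 2.77 * sw, 100.0 * sw / overall_mean


# Hypothetical repeated corneal power readings (diopters) for three eyes.
readings = [[43.0, 43.2], [44.1, 43.9], [42.5, 42.7]]
sw, repeatability, cov = precision_summary(readings)
print(f"Sw = {sw:.3f} D, 2.77Sw = {repeatability:.3f} D, COV = {cov:.2f}%")
```

Running the same function on intrasession and intersession readings separately would give the two sets of estimates the abstract distinguishes.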
We observed a high correlation between duplicate measurements of cord blood serum estrogen and SHBG levels. Variance component analysis showed that >80% of the variation in assay results could be explained by the variability between babies. To our knowledge, only one study has presented the assay reproducibility of cord blood estrogen levels. In a study of 256 male and female babies by Maccoby et al. (19), Pearson's correlation coefficients between duplicate measurements conducted in three samples of babies ranged from 0.98 to 0.99. A few studies have been conducted to examine the laboratory reproducibility of serum and plasma estrogen levels in adult women. Bolelli et al. (10) evaluated the effects of long-term preservation of frozen plasma and serum samples on the sex hormone assay results including estradiol. When assays were repeated 3 years after baseline, Pearson's correlation coefficient between the two measurements for both serum and plasma estradiol was 0.99 for ...
Purpose: The purposes of this study were to: 1) investigate the inter-rater and intra-rater reliability of use of the Flexicurve for measurement of spinal length (L), thoracic (TL) and lumbar length (LL), thoracic (TW) and lumbar width (LW), and 2) q
Assessing Upper and Lower Extremities Via Tissue Dielectric Constant: Suitability of Single Versus Multiple Measurements Averaged. Harvey N. Mayrovitz, Lymphatic Research and Biology, 2018. Background: Tissue dielectric constant (TDC) measurements as an index of local tissue water are useful in a range of applications, most notably to characterize and assess lymphedema. Once a measuring device is applied to skin, a result is obtained in less than 10 seconds, but multiple sites may be required and use of the standard triplicate measurements may be time prohibitive. Thus, this study's goal was to provide data from which informed judgments could be made as to the impact of making a single measurement to reduce expended clinic time. Methods and Results: Sixty subjects (30 female) were recruited with an average age (mean ± standard deviation) of 30.6 ± 13.4 years. TDC was measured in triplicate bilaterally at forearm, hand palm, lateral calf, medial calf, and foot dorsum. The agreement in absolute TDC ...
In the context of large-scale human system immunology studies, controlling for technical and biological variability is crucial to ensure that experimental data support research conclusions. In this study, we report on a universal workflow to evaluate both technical and biological variation in multiparameter flow cytometry, applied to the development of a 10-color panel to identify all major cell populations and T cell subsets in cryopreserved PBMC. Replicate runs from a control donation and comparison of different gating strategies assessed the technical variability associated with each cell population and permitted the calculation of a quality control score. Applying our panel to a large collection of PBMC samples, we found that most cell populations showed low intraindividual variability over time. In contrast, certain subpopulations such as CD56 T cells and Temra CD4 T cells were associated with high interindividual variability. Age but not gender had a significant effect on the frequency of ...
These findings indicate that this questionnaire has satisfactory reliability and validity. It can detect different levels of satisfaction [12] and is therefore suitable for evaluating out of hours care received by a broad range of patients. The questionnaire has satisfactory internal reliability with Cronbach's α coefficients greater than 0.60 for all scales and greater than 0.70 for five [38]. The test and retest scores were highly correlated, though the regressions show that the retest scores were generally lower, so that there may have been a real fall in satisfaction with time. In a true test of test-retest reliability the variable and measurement technique should be the same on both occasions. The lower retest scores may therefore also reflect the difference in the method of application, with greater expressed satisfaction when the research assistants were present. Nevertheless, these data indicate that the retest reliability of the questionnaire is broadly satisfactory. Content validity was ...
The aim of this study was to develop and validate an asthma-specific quality of life questionnaire for adolescents with asthma. The final version of the AAQOL contains 32 items covering six domains of HRQOL. It is designed for self-administration with most respondents requiring 5-7 min for completion. The AAQOL showed good construct validity given the correlations with other quality of life measures as anticipated. The high test-retest reliability provides the basis for good responsiveness of the AAQOL. As HRQOL may be influenced by the individual's current stage of cognitive, social and emotional development, it has been argued that HRQOL in adolescents needs to be addressed separately [17]. The AAQOL takes into account the key developmental aspects of adolescence as it was specifically designed for the age range 12-17 yrs. The AAQOL is self-completed and focuses on the adolescent's subjective perception. Age appropriateness is ensured by including items which were defined as particularly ...
OBJECTIVES: Responses to health-related items on the Community Health Survey (CHS) provide evidence that is used to develop community-based health policy. This study aimed to assess the test-retest reliability of selected health behavioral items on the CHS according to item category, response period, and response scale. METHODS: A sample of 159 men and women 20 to 69 years of age participated in a test-retest with an interval of 14 to 21 days. A total of 28 items relating to smoking, alcohol consumption, diet and weight control, and mental health were selected. We evaluated the test-retest reliability of the items using kappa statistics. RESULTS: Kappa values ranged from 0.44 to 0.93. Items concerning habits had higher kappa values (mean, 0.7; standard error, 0.05) than items concerning awareness or attitudes (p=0.012). The kappa value of items with two- to four-point scales was 0.63, which was higher than the value of 0.59 for items with scales involving five or more points, although this ...
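Test-retest agreement on a categorical item, as evaluated above with kappa statistics, can be sketched as follows. This is a minimal illustration of unweighted Cohen's kappa, (observed agreement − chance agreement) / (1 − chance agreement); whether the study used weighted or unweighted kappa is not stated in the excerpt, and the function name and responses are hypothetical.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two aligned lists of categorical answers."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty response lists of equal length")
    n = len(first)

    # Observed proportion of identical answers across the two occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n

    # Agreement expected by chance from the marginal category frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(
        counts_first[c] * counts_second[c] for c in counts_first
    ) / n ** 2

    if expected == 1:
        raise ValueError("chance agreement is 1; kappa is undefined")
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no answers to one item at test and at retest.
test_answers = ["yes", "yes", "no", "no"]
retest_answers = ["yes", "no", "no", "no"]
print(f"kappa = {cohens_kappa(test_answers, retest_answers):.2f}")
```

For items with ordered multi-point scales, a weighted kappa is often preferred because it gives partial credit to near-misses; the sketch above treats every disagreement equally.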
Awareness of reproducibility issues in various areas of science has been on the rise in recent years, with systematic replication efforts arising in areas such as psychology, economics, cancer biology and social sciences. The low reproducibility rates in some of these areas raise the question of whether irreproducible results can be predicted from particular features in the original publications. Whether reproducibility can be accurately estimated from published information has major implications not only for choosing what to believe or what is worth replicating, but also for how we assess and fund science. The question of whether researchers can estimate the reproducibility of published findings has been studied in replication initiatives in psychology, economics and social sciences, and the answer is that they are reasonably good at it. The pooled prediction accuracy across these four studies is around 66% for individual surveys and 73% for prediction markets, ...
Twelve healthy recreational male runners participated. The selected muscles were: M. quadriceps-vastus medialis (VM) and rectus femoris (RF), M. biceps femoris (BF), M. tibialis anterior (TA) and the M. gastrocnemius caput mediale (GAS) of the right leg. The MVC testing conditions were: dry land, underwater prior to (Water 1) and following an aquatic exercise trial (Water 2). For each muscle, a one-way analysis of variance with repeated measures was used to compare MVC scores between testing conditions, and the intra-class correlation coefficient (ICC) and typical error (CV%) were calculated to determine the reproducibility and precision of MVC scores, respectively, between conditions. For all muscles, no significant differences were observed between land and water MVC scores (p = 0.88-0.97), and high reliability (ICC = 0.96-0.98) and precision (CV% = 7.4-12.6%) were observed between MVC conditions. Under MMT conditions it appears that comparable MVC sEMG values were achieved on land and in ...
BACKGROUND: Intensive care unit (ICU) stays often lead to reduced physical functioning. Change in physical functioning in patients in the ICU is inadequately assessed through available instruments. The de Morton Mobility Index (DEMMI), developed to assess mobility in elderly hospitalized patients, is promising for use in patients who are critically ill. OBJECTIVE: The aim of this study was to evaluate the clinimetric properties of the DEMMI for patients in the ICU. DESIGN: A prospective, observational reliability and validity study was conducted. METHODS: To evaluate interrater and intrarater reliability (intraclass correlation coefficients), patients admitted to the ICU were assessed with the DEMMI during and after ICU stay. Validity was evaluated by correlating the DEMMI with the Barthel Index (BI), the Katz Index of Independence in Activities of Daily Living (Katz ADL), and manual muscle testing (MMT). Feasibility was evaluated based on the percentage of participants in ...
The limits of agreement will be estimated for the difference between single measurements by each method. This is standard practice when reporting patient results for PEFR. The mean measurements option uses the mean of the replicates to compute the limits of agreement. However, this will lead to narrower limits of agreement (due to the reduction in standard deviation mentioned above) and should only be used when it is standard practice to use the mean of multiple measurements as the patient result ...
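The single-measurement case described above can be sketched as follows. This is a minimal illustration assuming the basic Bland-Altman form, mean difference ± 1.96 × SD of the paired differences, with one reading per method per patient; it does not reproduce any adjustment the software may apply for replicates. The function name and PEFR values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (lower limit, bias, upper limit) for paired single measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias - half_width, bias, bias + half_width


# Hypothetical PEFR readings (L/min) from two meters on four patients.
meter_a = [410.0, 420.0, 440.0, 460.0]
meter_b = [400.0, 420.0, 430.0, 480.0]
lower, bias, upper = limits_of_agreement(meter_a, meter_b)
print(f"bias = {bias:.1f}, limits = ({lower:.1f}, {upper:.1f})")
```

Feeding the function the per-patient means of replicates instead of single readings would shrink the standard deviation of the differences, which is the narrowing effect the text warns about.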
Durability and reliability are crucially linked in product validation testing. Typically the product's life requirement is to be able to withstand specified loading for a given duration with desired reliability and confidence levels. Product validation or durability testing is then used to assess actual product life relative to these requirements. The goal of a validation test is to demonstrate that the part is indeed capable of withstanding the loading that it will see in service. It is desirable that lab loading is representative of and correlates with service loading. Fatigue analysis techniques and material data like the stress-life (SN) curve can be used to define equivalent damage test specifications and accelerate tests so a long service life can be replicated quickly in the test lab. The challenge with typical validation test specs is that while fatigue methodologies can be used to address damage correlation and equivalence, testing a single part does not provide information about product ...
We take measurements every day to control processes and to accept or reject products. Often, there is little thought that goes into understanding the measurement system. We take it for granted that the numbers are good. The quality of a measurement system is determined by the statistical properties of the data that are generated. We know, for example, that when the same person measures the same part with the same instrument that there can be different results. This is termed repeatability - re
We found that the 1-min STS test showed very little learning effect and excellent test-retest reliability in COPD patients. Strong correlations with the 6MWT suggest good cross-sectional construct validity. The 1-min STS test was responsive to two different pulmonary rehabilitation programmes, and the MID for clinical practice was estimated as three repetitions. We also observed similar responses to the 1-min STS test and 6MWT in terms of end-exercise cardiorespiratory values. This is the first study to thoroughly assess the measurement properties of the 1-min STS test in COPD patients. Our findings for cross-sectional validity of the 1-min STS test were in accordance with previous studies that also found moderate to strong correlations (r=0.47-0.75) with the 6MWT [5, 15] and with quadriceps strength (r=0.65) [5]. Many measures met our assumptions about strength of correlation, indicating good cross-sectional construct validity; however, the poor change score correlations with exercise capacity ...
Objective: A large number of tools for assessing the quality of randomized controlled trials are available; however, users have little guidance as to whether a given score represents high or low validity. The purpose of this study is to explore the use of studies identified as having high-internal validity, referred to as the standard studies, to interpret internal validity scores from studies with unknown internal validity. Methods: The standard studies were identified by locating 6 candidate studies reporting the findings of randomized controlled trials from the Journal of American Medicine Association or the New England Journal of Medicine and scoring the studies using 2 scales, the Jadad scale (high score = 5; low = 0) and an internal validity information scale (IVI; high score = 70; low = 0). The 2 studies with the highest average rank were chosen as the standard studies. To determine if the standard studies facilitate interpretation of internal validity scores, 11 randomized controlled ...
Randomizing doctors, medical practices or even entire communities to interventions but taking observations on individual patients or families is not a new idea in public health research but it has received considerable attention in the past few years. Aside from references provided in the Zyzanski et al article, readers will find the recent review papers by Murray et al and by Donner and Klar interesting and informative. Articles in the current issue of the Annals emphasize common facts that every researcher must consider when conducting a group randomized trial (GRT). These center about accounting for the correlation among responses from the same cluster in both the design of the trial as illustrated by Killip et al as well as in the analysis of the data as illustrated by Reed. The key concept in both articles is the product of the intraclass correlation coefficient (ICC) and the average cluster size. The ICC measures the degree of correlation among responses in the same cluster. The product is ...
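One common way the ICC and the average cluster size are combined in group randomized trials is the design effect, 1 + (m − 1) × ICC, which inflates the sample size needed relative to individual randomization. The sketch below assumes that textbook formula; it is not quoted from the articles discussed here, and the function names and numbers are hypothetical.

```python
def design_effect(icc, average_cluster_size):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1 here")
    if average_cluster_size < 1:
        raise ValueError("average cluster size must be at least 1")
    return 1.0 + (average_cluster_size - 1) * icc


def effective_sample_size(total_subjects, icc, average_cluster_size):
    """Number of independent observations the clustered sample is worth."""
    return total_subjects / design_effect(icc, average_cluster_size)


# Hypothetical trial: 20 practices of 21 patients each, ICC = 0.05.
deff = design_effect(0.05, 21)
print(f"design effect = {deff:.2f}")
print(f"effective n = {effective_sample_size(420, 0.05, 21):.0f}")
```

Even a small ICC matters when clusters are large, because the inflation grows with the product of the ICC and the cluster size.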
The COMFORT scale is a measurement tool to assess distress, sedation and pain in nonverbal paediatric patients. Several studies have described the COMFORT scale, but no formal assessment of the methodological quality has been undertaken. Therefore, we performed a systematic review to study the clinimetric properties of the (modified) COMFORT scale in children up to 18 years. We searched Central, CINAHL, Embase, Medline, PsycInfo and Web of Science until December 2014. The selection, data extraction and quality assessment were performed independently by two reviewers. Quality of the included studies was appraised using the COSMIN checklist. We found 30 studies that met the inclusion criteria. Most participants were ventilated children up to 4 years without neurological disorders. The results on internal consistency and interrater reliability showed values of >0.70 in most studies, indicating adequate reliability. Construct validity resulted in correlations between 0.68 and 0.84 for distress, ...
The visual vertical (VV) consists of repeated adjustments of a luminous rod to the earth vertical. How many trials are required to reach consistency in this measure? This question has never been addressed despite the widespread clinical use of the measurement in stroke rehabilitation. VV perception was assessed (10 trials) in 117 patients undergoing rehabilitation after a first hemisphere stroke. The intraclass correlation coefficient (ICC) and standard error of measurement (SEM) were calculated for each patient category: with contralesional VV bias (n = 48), ipsilesional VV bias (n = 17) and normal VV (n = 52). For patients with VV biases, 6 trials were required to reach high inter-trial reliability (contralesional: ICC = 0.9, SEM = 1.36°; ipsilesional: ICC = 0.896, SEM = 0.96°). For patients with normal VV, a minimum of 10 trials was required (ICC = 0.728, SEM = 1.13°). A set of 6 trials correctly classified 96 % of patients. In the literature, 10 is the most frequently used number of trials used
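The relation between the ICC and the standard error of measurement reported above can be sketched with the common formula SEM = SD × sqrt(1 - ICC). The abstract does not say which SEM formula the authors used, so this formula, the function name, and the input values are assumptions for illustration only.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM from the between-subject SD and a reliability coefficient.

    Assumed formula: SEM = SD * sqrt(1 - ICC).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1 for this formula")
    return sd * math.sqrt(1.0 - icc)


if __name__ == "__main__":
    # Hypothetical values: SD of 2.0 degrees and ICC of 0.75.
    print(standard_error_of_measurement(2.0, 0.75))
```

Under this formula, a higher ICC shrinks the SEM for a fixed SD, so the two statistics should be read together rather than in isolation.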
The results of this study for adults in India show evidence of reliability for the IMS-PAQ, with good intraclass correlation and kappa statistics between baseline and retest. The validity coefficients and associations produced between total activity/activity intensity and theoretical constructs of PA were in agreement with those predicted, providing evidence of construct validity for the IMS-PAQ. These findings suggest that the IMS-PAQ is valid for ranking individuals based on reported PA within this population but that further research may be needed for urban residents and women. This study has constructed categories of PA based upon reported time in different activity intensities and used them to predict associations with relevant health outcomes (BMI, percent body fat and pulse rate) in order to provide a more thorough assessment of the validity of the questionnaire. The results show that for the sample as a whole the IMS-PAQ has good reliability with intra-class correlations ranging from ...
Surprisingly, the Odom criteria have never been validated. The aim of this study was to investigate the reliability and validity of the Odom criteria for the evaluation of surgical procedures of the cervical spine. Patients with degenerative cervical spine disease were included in the study and divided into 2 subgroups on the basis of their most predominant symptom: myelopathy or radiculopathy. Reliability was assessed with interrater and test-retest design using quadratic weighted kappa coefficients. Construct validity was assessed by means of hypotheses testing. To evaluate whether the Odom criteria could act as a global perceived effect (GPE) scale, we assessed concurrent validity by comparing area under the curve (AUC) values of receiver operating characteristic (ROC) curves for the set of questionnaires. A total of 110 patients were included in the study; 19 were excluded, leaving 91 in our analysis. Reliability assessments showed κ = 0.77 for overall interrater reliability and κ = 0.93 ...
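A quadratic weighted kappa of the kind mentioned above can be sketched as follows. This is a minimal illustration, assuming ordinal grades coded as integers 0 to k-1 and disagreement weights (i - j)² / (k - 1)²; the function name and the example ratings are hypothetical and are not taken from the study.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Weighted kappa for two raters using quadratic disagreement weights.

    Ratings are assumed to be integers in range(n_categories).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of cases")
    k = n_categories
    n = len(rater_a)
    # Observed cross-tabulation of the two raters.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical grades on a four-level ordinal scale, coded 0..3.
    first_rater = [0, 1, 2, 3, 1, 2]
    second_rater = [0, 1, 2, 2, 1, 3]
    print(quadratic_weighted_kappa(first_rater, second_rater, 4))
```

Quadratic weights penalize a two-grade disagreement four times as heavily as a one-grade disagreement, which suits ordinal outcome scales.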
A template for reliable assessment of resident operative performance: Assessment intervals, numbers of cases and raters. Williams, Reed G.; Verhulst, Steven; Colliver, Jerry A.; Sanfey, Hilary; Chen, Xiaodong; Dunnington, Gary. 2012/10. Background: Operative performance rating (OPR) instruments have been developed to assess operative performance (OP). To guide program implementation, this study determined: 1) Appropriate intervals for OP progress decisions, 2) Number of OPRs and raters required per interval to achieve reproducible results. Methods: 21 surgeons rated 897 OPs (3 procedures) by 36 residents. Six-month PGY intervals were compared to determine length of stable operative performance intervals. Variance component analyses established rating factor importance. Generalizability analyses and decision studies determined number of OPRs required for reproducible OP decisions (reliabilities = 0.80). Results: Resident ...
High precision isotopic measurements of Sn in two commercially available high purity materials and a previously analysed cassiterite from Straits Settlement, Malaysia, are presented as a basis for a new measurement procedure using the Micromass IsoProbe MC-ICP-MS. The results show that under optimised instrumental conditions two laboratory calibration standard solutions (Johnson-Matthey Puratronic Grade 1 Sn metal foil and Specpure ICP/DCP Sn solution) are isotopically identical and an external reproducibility of 0.000017 (2 s.d.) at 150 ppb Sn concentration (Sn-122/Sn-116 0.318597, n = 14) can be achieved. An isotopic fractionation of +0.13 parts per thousand/u (1.3 epsilon units) relative to these in-house standards has been verified for the cassiterite, which indicates a natural isotopic fractionation of approximately 2.8 times greater than the long-term reproducibility of the current optimised measurement procedure.
Several efforts were made to improve on the moderate reliability associated with previously reported chart reviews.13 We developed a computerized data collection form to ensure complete data entry. Data were transferred regularly by phone to a computer at the coordinating centre to minimize data loss and transcription error. Provincial physician and nurse leaders underwent training and used a standard set of hospital charts and a training manual. Reviewer performance was evaluated on a national basis with the use of measures of interrater reliability before data collection was started. Reliability data were reported back to each province. At both stages of the review process, interrater reliability was also assessed on a random sample of 10% of the charts. The kappa statistic for the measurement of agreement on the 10% sample for the first stage of the review process (by nurses or health records professionals) was substantial, 0.70 (95% confidence interval [CI] 0.63-0.76).14 Kappa scores for ...
Application of the velocity profile method is recommended for reliable measurement of flow volume in larger vessels, and ultrasonic flowmetry is a useful clinical tool for this purpose. We used the velocity profile in conjunction with a minor modification in the conventional velocity profile method and examined the reproducibility of flowmetry from color Doppler data. Data from three examiners were used to analyze intraobserver reproducibility and interobserver agreement in the common carotid artery, and we measured flow volume in the peripheral vessels of healthy individuals. Estimated flow volumes in five healthy examinees were 350 to 550 ml/min and did not vary significantly between examiners. Interobserver correlation was good (r 1=0.63), but intraobserver correlations in two sonographers were excellent (r 1=0.85) in the one who was experienced in this method and poor (r 1=0.32) in the other. Good interobserver agreement and intraobserver reproducibility of experienced examiners suggest that this
We measured the long-term test-retest reliability of [C-11]raclopride binding in striatal subregions, the thalamus and the cortex using the bolus-plus-infusion method and a high-resolution positron emission scanner. Seven healthy male volunteers underwent two positron emission tomography (PET) [C-11]raclopride assessments, with a 5-week retest interval. D-2/3 receptor availability was quantified as binding potential using the simplified reference tissue model. Absolute variability (VAR) and intraclass correlation coefficient (ICC) values indicated very good reproducibility for the striatum and were 4.5%/0.82, 3.9%/0.83, and 3.9%/0.82, for the caudate nucleus, putamen, and ventral striatum, respectively. Thalamic reliability was also very good, with VAR of 3.7% and ICC of 0.92. Test-retest data for cortical areas showed good to moderate reproducibility (6.1% to 13.1%). Our results are in line with previous test-retest studies of [C-11]raclopride binding in the striatum. A novel finding is the ...
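The two test-retest statistics named above can be sketched for paired scans as follows. This is an illustration under stated assumptions: absolute variability is taken as the mean of |test - retest| divided by the pairwise mean, in percent, and the ICC is a one-way random-effects ICC computed from ANOVA mean squares. The abstract does not give the exact definitions used, and the function names and data are hypothetical.

```python
from statistics import mean


def absolute_variability(test, retest):
    """Mean absolute test-retest difference as a percentage of the pairwise mean."""
    return mean(
        abs(t - r) / ((t + r) / 2.0) * 100.0 for t, r in zip(test, retest)
    )


def icc_oneway(test, retest):
    """One-way random-effects ICC for two measurements per subject.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW) with k = 2 measurements.
    """
    n = len(test)
    k = 2
    if n < 2 or n != len(retest):
        raise ValueError("need at least two subjects with paired measurements")
    subject_means = [(t + r) / 2.0 for t, r in zip(test, retest)]
    grand_mean = mean(subject_means)
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (t - m) ** 2 + (r - m) ** 2
        for t, r, m in zip(test, retest, subject_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


if __name__ == "__main__":
    # Hypothetical binding potential values for five subjects.
    scan_1 = [3.1, 2.8, 3.5, 3.0, 2.6]
    scan_2 = [3.0, 2.9, 3.4, 3.2, 2.5]
    print(absolute_variability(scan_1, scan_2))
    print(icc_oneway(scan_1, scan_2))
```

Variability describes within-subject change in relative terms, while the ICC also depends on how much subjects differ from one another, so the two can diverge in homogeneous samples.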
Objective: To assess classical psychometric properties of the Spanish versions of the Bech-Rafaelsen mania (MAS) and melancholia (MES) scales. Method: Observational, prospective, and multicentric study in bipolar out-patients. Convergent validity was assessed against the Young Mania Rating Scale and the Montgomery-Åsberg Depression Rating Scale. Discriminant validity, reliability, and sensitivity to change were also assessed. Results: One hundred and thirteen bipolar patients with a manic episode and 102 bipolar patients with a depressive episode were included. Both the MAS and the MES showed appropriate convergent validity (r > 0.90), discriminant validity (P < 0.0001), internal consistency (Cronbach's alpha > 0.80), test-retest reliability [intraclass correlation coefficient (ICC) = 0.69 for the MAS and 0.94 for the MES], inter-rater reliability (ICC > 0.80), and sensitivity to change at 4 weeks since inception (P < 0.0001; within-group effect size ≥1.8). Conclusion: The Spanish ...
tDNA-PCR and capillary electrophoresis of the amplified DNA fragments already have been evaluated for the differentiation of Listeria species (14) and enterococci (1). To enable identification of a large number of strains, a software program which was described previously (1) has been developed at our laboratory. In the present study, the interlaboratory reproducibility of tDNA-PCR was evaluated in order to develop a fully exchangeable digital fingerprint database which can be consecutively extended with new fingerprints of species belonging to a wide array of genera. For S. agalactiae strains, tDNA-PCR resulted in a fingerprint with six reproducibly present peaks. The standard deviation of the amplified tDNA spacer fragment lengths (peak values) was calculated for each of the six peaks obtained in 122 fingerprints of strain LMG14694T. The standard deviation of all samples ranged from 0.19 to 0.38 bp for peaks between 54 and 253 bp, which indicates that reproducibility with regard to peak values ...
Standardization of sonographic lung-to-head ratio measurements in isolated congenital diaphragmatic hernia: Impact on the reproducibility and efficacy to predict outcomes. Britto, Ingrid Schwach Werneck; Sananes, Nicolas; Olutoye, Oluyinka O.; Cass, Darrell L.; Sangi-Haghpeykar, Haleh; Lee, Timothy C.; Cassady, Christopher I.; Mehollin-Ray, Amy; Welty, Stephen; Fernandes, Caraciolo; Belfort, Michael A.; Lee, Wesley; Ruano, Rodrigo. 2015/1/1. Objectives - The purpose of this study was to evaluate the impact of standardization of the lung-to-head ratio measurements in isolated congenital diaphragmatic hernia on prediction of neonatal outcomes and reproducibility. Methods - We conducted a retrospective cohort study of 77 cases of isolated congenital diaphragmatic hernia managed in a single center between 2004 and 2012. We compared lung-to-head ratio measurements that were performed ...
Key concepts in classical test theory are reliability and validity. A reliable measure is one that measures a construct consistently across time, individuals, and situations. A valid measure is one that measures what it is intended to measure. Reliability is necessary, but not sufficient, for validity. Both reliability and validity can be assessed statistically. Consistency over repeated measures of the same test can be assessed with the Pearson correlation coefficient, and is often called test-retest reliability.[14] Similarly, the equivalence of different versions of the same measure can be indexed by a Pearson correlation, and is called equivalent forms reliability or a similar term.[14] Internal consistency, which addresses the homogeneity of a single test form, may be assessed by correlating performance on two halves of a test, which is termed split-half reliability; the value of this Pearson product-moment correlation coefficient for two half-tests is adjusted with the Spearman-Brown ...
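The split-half adjustment mentioned above can be sketched with the Spearman-Brown prophecy formula, r_full = k × r / (1 + (k - 1) × r), where k = 2 when two half-tests are combined into the full test. The passage is truncated before any formula is given, so the function name and example correlation here are illustrative assumptions.

```python
def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Predicted reliability after changing test length by `length_factor`.

    With length_factor = 2 this steps a split-half correlation up to an
    estimate of full-test reliability.
    """
    denominator = 1.0 + (length_factor - 1.0) * r
    if denominator == 0:
        raise ValueError("formula is undefined for this combination of inputs")
    return length_factor * r / denominator


if __name__ == "__main__":
    # Hypothetical split-half correlation of 0.60 between two half-tests.
    print(spearman_brown(0.60))
```

The adjustment is needed because each half is only half as long as the full test, and shorter tests tend to be less reliable than longer ones built from comparable items.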
Objective: To report the agreement between gray-scale intravascular ultrasound (GS-IVUS) and optical coherence tomography (OCT) in assessing the bioresorbable vascular scaffolds (BVS) structures and their respective reproducibility. Background: BVS are composed of an erodible polymer. Ultrasound and light signals backscattered from polymeric material differ from metallic stents using GS-IVUS and OCT. Methods: Forty-five patients included in the ABSORB trial were treated with a 3.0 × 18 mm BVS and imaged with GS-IVUS 20 MHz and OCT post-implantation. Qualitative (ISA, side-branch struts, protrusion, and dissections) and quantitative (number of struts, lumen, and scaffold area) measurements were assessed by two investigators. The agreement and the inter- and intraobserver reproducibility were investigated using the kappa (κ) and the intraclass correlation coefficient (ICC). Results: GS-IVUS and OCT agreement was predominantly poor at a lesion, frame, and strut level analysis (κ and ICC <0.4) ...
We investigated the interrater reliability and accuracy of two independent medical doctors in using NINCDS/ADRDA criteria to classify 82 elderly subjects enrolled in OPTIMA, a longitudinal study investigating dementia. Kappa statistics revealed moderate agreement (0.5) in overall classification of dementia type, and almost perfect agreement (0.9) on the absence or presence of dementia. Combining NINCDS/ADRDA possible and probable Alzheimer's disease (AD) categories produced substantial agreement (0.7). Comparison with CERAD histopathological criteria for AD showed that combining possible and probable AD resulted in a high sensitivity and accuracy, but a low specificity. To increase specificity, the NINCDS/ADRDA probable AD category should be used alone. An important finding was that the accuracy of diagnoses of AD made from the case notes alone was not different from the diagnoses obtained following active involvement with participants.
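The kappa statistic used above for two raters can be sketched as unweighted Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is agreement expected by chance. The abstract does not specify the kappa variant, and the function name, category labels, and ratings below are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters over the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(
        counts_a[c] * counts_b[c] for c in set(rater_a) | set(rater_b)
    ) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical category; treated as full agreement here.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical presence/absence classifications by two raters.
    doctor_1 = ["dementia", "dementia", "none", "none"]
    doctor_2 = ["dementia", "none", "none", "none"]
    print(cohens_kappa(doctor_1, doctor_2))
```

Because kappa corrects for chance agreement, two raters can agree on most cases and still obtain only a moderate kappa when one category dominates.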
Responsiveness of physicians refers to the social actions that physicians take to meet the legitimate expectations of service seekers. Since there is no such scale, this study aimed at developing one for measuring responsiveness of physicians in rural Bangladesh, using a structured observation method. Data were collected from Khulna division of Bangladesh, through structured observation of 393 patient-consultations with physicians. The structured observation tool consisted of 64 items, with four Likert type response categories, each anchored with a defined scenario. Inter-rater reliability was assessed by the same three raters observing 30 consultations. Data were analyzed by exploratory factor analysis (EFA), followed by assessment of internal consistency by ordinal alpha coefficient, inter-rater reliability by intra-class correlation coefficient (ICC), concurrent validity by correlating responsiveness score with waiting time, and known group validity by comparing public and private sector physicians. After removing
RESULTS: Mean age of participants was 38.13 years (SD = 11.45) and all men were married. Cronbach α of the MGSIS-I was 0.89 and intraclass correlation coefficients ranged from 0.70 to 0.94. Significant correlations were found between the MGSIS-I and the International Index of Erectile Function (P < .01), whereas correlation of the scale with non-similar scales was lower than with similar scale (confirming convergent and divergent validity). The scale could differentiate between subgroups in age, smoking status, and income (known-group validity). A single-factor solution that explained 70% variance of the scale was explored using exploratory factor analysis (confirming uni-dimensionality); confirmatory factor analysis indicated better fitness for the five-item version than the seven-item version of the MGSIS-I (root mean square error of approximation = 0.05, comparative fit index , 1.00 vs root mean square error of approximation = 0.10, comparative fit index , 0.97, respectively ...
Cognitive diagnostic classification models (DCMs) have been developed to assess the cognitive processes underlying assessment responses. The current dissertation aims to provide theoretical and practical considerations for estimation of DCMs for educational applications by investigating several important underexplored issues. To avoid problems related to retrofitting of DCMs to already existing data, test construction of the new mathematics assessment for primary school (DMA) was based on a-priori defined Q-matrices. In this dissertation we compared DCMs with established psychometric models and investigated the incremental validity of DCM profiles over traditional IRT scores. Furthermore, we addressed the issue of the verification of the Q-matrix definition. Moreover, we examined the impact of invalid Q-matrix specification on item and respondent parameter recovery, and on the sensitivity of selected fit measures. In order to address these issues one simulation study and two empirical studies illustrating ...
The purpose of the present study was to evaluate the reproducibility and relative validity and calibrate the dietary intake assessment of a food frequency questionnaire (FFQ) using a random sample of 195 adults aged 20 to 50 years from the Central-West Region of Brazil. The reference method used by the study was two 24-hour recalls (24hR) that provided energy-adjusted deattenuated food intake data for comparison purposes. With respect to reproducibility, the average weighted kappa was 0.43 and exact agreement was 41.5%. With regard to relative validity, correlation coefficients ranged from 0.32 (thiamin) to 0.51 (carbohydrates), with a mean of 0.41. Deattenuation and adjustment for energy intake decreased most correlation coefficients in relation to crude values. The food frequency questionnaire showed good reliability and moderate validity for most nutrients based on classification into quartiles of energy and nutrient intake. The calibrated means of the FFQ were more similar to the means ...
Abstract: Motivation: Reproducibility analyses of biologically relevant microarray studies have mostly focused on overlap of detected biomarkers or correlation of differential expression evidences across studies. For clinical utility, direct inter-study prediction (i.e. to establish a prediction model in one study and apply to another) for disease diagnosis or prognosis prediction is more important. Normalization plays a key role for such a task. Traditionally, sample-wise normalization has been a standard for inter-array and inter-study normalization. For gene-wise normalization, it has been implemented for intra-study or inter-study predictions in a few papers while its rationale, strategy and effect remain unexplored. Results: In this article, we investigate the effect of gene-wise normalization in microarray inter-study prediction. Gene-specific intensity discrepancies across studies are commonly found even after proper sample-wise normalization. We explore the rationale and necessity of ...
To err is human. Scientists being human, they make mistakes. Many if not most of the rules for doing science are designed to weed out mistakes. Reproducibility and replicability are recognized as playing a central role in this process. But a lot of confusion remains about the difference between these two labels and the relation between them. In this essay, I will explain why replicability is the foundation on top of which reproducibility can be constructed, and introduce verifiability as the missing link between them, which deserves particular attention in the context of computer-aided research. First, a note about terminology. Some people use reproducible and replicable in the sense I will soon define, whereas others exchange the definitions of the two terms, and yet others seem to consider them synonyms. I hope that the scientific community will ultimately converge to common definitions, but we aren't there yet. To make the relation between replicability, reproducibility, and ...
This study evaluated the reproducibility of 24 soft tissue landmarks on six three-dimensional (3D) facial scans. The scans were taken on a DSP400 facial scanner and were viewed using a customized software program. Intraoperator data were obtained by one researcher placing the 24 landmarks on all six scans a total of 30 times. Thirty different orthodontists of varying experience were then asked to place all 24 landmarks on each of the six facial scans in order to establish interoperator reproducibility. The standard deviations (SDs) from the mean were calculated from the data for each individual landmark in the x-, y-, and z-axes. For the intraoperator data, 12 of the 24 landmarks were found to be reproducible to within a 1 mm SD for each plane of space. The interoperator data showed lower reproducibility with just two landmarks showing less than a 1 mm SD in all three planes of space. Familiarity with 3D facial scans and associated software programs is important in improving reproducibility. ...
Following guidelines from the Patient-Centred Outcomes Research Institute and using a mixed methods study, a new patient-reported outcome measure (PROM) for both nerve trauma and compression affecting the hand, the Impact of a Hand Nerve Disorders (I-HaND) Scale, was developed. Face-to-face interviews with 14 patients and subsequent pilot-testing with 61 patients resulted in the development of the 32-item PROM. A longitudinal validation study with 82 patients assessed the psychometric properties of the I-HaND. Content and construct validity was confirmed by cognitive interviews with patients and through principal component analysis. The I-HaND has high internal consistency (α = 0.98) and excellent test-retest reliability (intraclass correlation coefficient = 0.97). Responsiveness statistics showed that the I-HaND can detect change over 3 months and discriminate between improvers and non-improvers. We conclude that the I-HaND can be used as a PROM for people with a range of hand nerve ...
Scientific research informs decisions that address many pressing issues, but what happens when results from one lab or study cannot be confirmed in another? Inconsistent results undermine the validity of scientific findings and contribute to the growing concern about replicability and reproducibility in science. A widespread strategy involving a variety of stakeholders is essential in order to promote openness and transparency in the research enterprise. Three recent reports from the National Academies identify opportunities for meaningful improvement in research practices and offer guidance toward open, consistent, and objective science. Most recently, our 2019 report, Reproducibility and Replicability in Science, defines the terms reproducibility and replicability as distinct concepts that are each critical in achieving this goal. While many use these terms interchangeably, this differentiation is a critical step towards stronger scientific research practices and more reliable science. To ...
The psychometric properties of the Persian-language version of Obsessive-Compulsive Inventory-Revised (OCI-R) were studied in a sample of Iranian college students (N = 450). The total and each of the subscales of OCI-R-Persian demonstrated very high internal consistency as well as high test-retest reliability. Convergent and divergent validity of the OCI-R-Persian total scale and subscales were satisfactory. In general, the OCI-R-Persian appears to be a reliable and valid measure of obsessive-compulsive symptoms in this non-clinical sample of Iranian college students.
The aim of this study was to evaluate the test-retest reliability and criterion validity of self-measurements taken by novice lay persons using a self-assembled tape measure after viewing a brief online instructional video. Results indicate that participants were able to accurately assemble the tape measure and demonstrate proficiency in measuring themselves when observed by lab technicians. The low technical error measurements and high reliability for duplicate measurements demonstrate excellent intra-observer accuracy and reliability. The high ICCs between participant home and lab waist, hip, and neck circumferences indicate that participant self-measurements are highly reliable over time, which is congruent with the limited research reporting reliability of self-measurements [10, 36]. The high reliability indicates that measurements individuals take over time can help them accurately track physical changes that may enable them, their health care providers, and researchers to better realize ...
In order to conduct studies on shared decision-making (SDM) and to implement SDM in routine practice, psychometrically tested measures are needed. The development of the short 5-item version of the OPTION scale (Observer OPTION5) allows SDM to be assessed from an observer perspective. Observer OPTION5 is so far only available in English and Dutch. The aim of this study was to translate the Observer OPTION5 rating scale into German and to test its psychometric properties. The German Observer OPTION5 was tested in a secondary data analysis of audio-recordings of patient-physician-consultations (N = 79) in German primary care practices. Demographic data were analysed using descriptive statistics. To assess inter- and intra-rater reliability, intraclass correlation coefficients (ICCs) were calculated. For assessing concurrent validity, a correlation (Spearman's rho) of the sum score of Observer OPTION5 and Observer OPTION12 was calculated. The consultations dealt with decisions regarding type 2 diabetes (N = 31)
Using emerging international guidelines, stringent procedures were used to develop and evaluate Canadian-French, German and UK translations/adaptions of the 50 item, parent-completed Child Health Questionnaire (CHQ-PF50). Multitrait analysis was used to evaluate the convergent and discriminant validity of the hypothesized item sets across countries relative to the results obtained for a representative sample of children in the US. Cronbach's alpha coefficient was used to estimate the internal consistency reliability for each of the health scales. Floor and ceiling effects were also examined. Seventy-nine percent of all the item-scale correlations achieved acceptable internal consistency (0.40 or higher). The tests of the item convergent and discriminant validity were successful at least 87% of the time across all scales and countries. Equal item variance was observed 90% of the time across all countries. The reliability coefficients ranged from a low of 0.43 (parental time impact, Canadian English) to a
Several studies2,7,15,16 have analyzed the intra-rater reliability of the 6MWT; therefore, this test has been considered reliable for assessing functional capacity in patients with COPD after a practice test. However, there is a lack of studies verifying the inter-rater reliability for this population. The intra-rater 6MWT reliability in our study presented ICC values for walked distance >0.75, indicating excellent reliability. This analysis has been already studied in subjects with chronic respiratory disease by many authors, who found ICC values ranging from 0.82 to 0.99,7,12,14,15,33-35 confirming the findings of our study. The studies mentioned above were conducted with COPD,7,15,34 with obstructive disease and restrictive lung diseases,12 and with lung disease in the final stage.35 The last 2 studies did not perform the second 6MWT, with an interval of 30 min after the first 6MWT, according to the standards of the ATS/ERS.7,14 Furthermore, we found low coefficient of variation values (0.06), ...
A brief measure is needed to examine the role of hopelessness on mental and physical health outcomes in large population studies. We examined the validity and reliability of two brief measures of hopelessness in a large non-clinical sample, one negatively valenced (Brief-H-Neg) and one positively valenced (Brief-H-Pos). Both were shown to correlate strongly with the longer BHS and mirror the positive correlation seen between the BHS and a measure of depression, providing evidence of concurrent validity, with adequate internal consistency and test-retest reliability. The sizes of the 2-week retest correlations for the brief measures reported in our non-clinical sample (0.67 and 0.72) are similar to those reported for the BHS in a sample of university undergraduates over a 3-week retest interval (0.67, female students) or a 10-week interval (0.75).25,26 Studies assessing the retest reliability of hopelessness instruments have reported varying retest intervals. Hopelessness may be conceptualised ...
OBJECTIVE: To determine the inter-rater reliability and validity of the Netherlands Triage Standard (NTS) for paediatric triage. DESIGN: A cross-sectional study using fictional cases for telephone and physical triage. METHOD: An expert panel established in advance the urgency of 40 cases concerning emergency help requests from non-referred children (the reference standard). These requests were presented in an online survey to triagists from three general practitioner (GP) out-of-hours practices, three ambulance dispatching centres and three hospital emergency departments. Triagists assessed all cases, using the NTS. We determined the agreement on degrees of urgency between different triagists and compared them with the reference standard. The outcome measure for inter-rater reliability was the intraclass correlation coefficient (ICC). The outcome measures for validity were the degree of agreement with the reference standard, under-triage and over-triage, and sensitivity and specificity in ...
Iterative algorithms are widely applied in reliability analysis and design optimization. Nevertheless, phenomena of failed convergence, such as periodic oscillation, bifurcation, and chaos, are oftentimes observed in iterative procedures of solving some nonlinear problems. In the present paper, the essential causes of numerical instabilities including periodic oscillation and chaos of iterative solutions are revealed by the eigenvalue-based stability analysis of iterative schemes. To understand and control these instabilities, the stability transformation method (STM), which is capable of tackling numerical instabilities of iterative algorithms in reliability analysis and design optimization, is proposed. Finally, several benchmark examples of convergence control of PMA (performance measure approach) for probabilistic analysis and the SORA (sequential optimization and reliability assessment) for reliability-based design optimization (RBDO) are presented. The observations from the benchmark ...
We evaluated the reliability of 8-hydroxy-2-deoxyguanosine (8-OHdG), and determined its ability to predict functional outcomes in stroke survivors. The rehabilitation effect on 8-OHdG and functional outcomes were also assessed. Sixty-one stroke patients received a 4-week rehabilitation. Urinary 8-OHdG levels were determined by liquid chromatography-tandem mass spectrometry. The test-retest reliability of 8-OHdG was good (intraclass correlation coefficient = 0.76). Upper-limb motor function and muscle power determined by the Fugl-Meyer Assessment (FMA) and Medical Research Council (MRC) scales before rehabilitation showed significant negative correlation with 8-OHdG (r = −0.38, r = −0.30; p < 0.05). After rehabilitation, we found a fair and significant correlation between 8-OHdG and FMA (r = −0.34) and 8-OHdG and pain (r = 0.26, p < 0.05). Baseline 8-OHdG was significantly correlated with post-treatment FMA, MRC, and pain scores (r = −0.34, −0.31, and 0.25; p < 0.05), indicating its
The aim was to assess intraobserver reliability of a new semi-automated technique of embryo volumetry. Power calculations suggested 46 subjects with viable, singleton pregnancies were required for reliability analysis. Crown rump length (CRL) of each
The psychometric properties of the Chinese version of the SCI Exercise Self-Efficacy Scale in patients with stroke. Xiaofang Dong (1), Yanjin Liu (2), Aixia Wang (3), Min Wang (4). (1) Neurology Department, (2) Nursing Department, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan Province, People's Republic of China. Objective: To test the Chinese version of the SCI Exercise Self-Efficacy Scale (C-ESES) in stroke patients and evaluate its validity and reliability. Background: Physical inactivity is a well-established and changeable risk factor for stroke, and regular exercise of 3-7 days per week is essential for stroke survivors and the general population. Though regular exercise is beneficial, it has been shown that duration, frequency, and intensity of exercise are generally low in stroke survivors. Methods: The performance of the instrument was assessed in 350 Chinese stroke survivors and repeated in 50 patients to examine test-retest reliability. Questionnaires included a form on demographic and
We report results from a worldwide interlaboratory comparison of samples among laboratories that measure (or measured) stable carbon and hydrogen isotope ratios of atmospheric CH4 (δ13C-CH4 and δD-CH4). The offsets among the laboratories are larger than the measurement reproducibility of individual laboratories. To disentangle plausible measurement offsets, we evaluated and critically assessed a large number of intercomparison results, some of which have been documented previously in the literature. The results indicate significant offsets of δ13C-CH4 and δD-CH4 measurements among data sets reported from different laboratories; the differences among laboratories at modern atmospheric CH4 level spread over ranges of 0.5 ‰ for δ13C-CH4 and 13 ‰ for δD-CH4. The intercomparison results summarized in this study may be of help in future attempts to harmonize δ13C-CH4 and δD-CH4 data sets from different laboratories in order to jointly incorporate them into modelling studies. However,
Background. Preclinical perfusion studies are useful for the improvement of diagnosis and therapy in dermatologic, cardiovascular and rheumatic human diseases. The Laser Doppler Perfusion Imaging (LDPI) technique has been used to evaluate superficial alterations of the skin microcirculation in surgically induced murine hindlimb ischemia. We assessed the reproducibility and the accuracy of LDPI acquisitions and identified several critical factors that could affect LDPI measurements in mice. Methods. Twenty mice were analysed. Statistical standardisation and a repeatability and reproducibility analysis were performed on mouse perfusion signals with respect to differences in body temperature, the presence or absence of hair, the type of anaesthesia used for LDPI measurements and the position of the mouse body. Results. We found excellent correlations among measurements made by the same operator (i.e., repeatability) under the same experimental conditions and by two different operators (i.e.,
We have developed a diabetes quality-of-life (DQOL) measure oriented toward the patient with insulin-dependent diabetes mellitus (IDDM). The DQOL was assessed for its reliability and validity in a group of patients with IDDM (n = 192). We found that the DQOL and its four scales had high degrees of internal consistency (Cronbach's r = .66−.92) and excellent test-retest reliability (r = .78−.92). Using conceptually relevant measures of psychiatric symptoms, perceived well-being and adjustment to illness, we also demonstrated convergent validity of the DQOL. This instrument was initially designed for use in the Diabetes Control and Complications Trial, a multicenter controlled clinical trial evaluating the effects of two different diabetes treatment regimens on the appearance and progression of early vascular complications. However, the DQOL may also be useful in evaluating the quality of life in other groups of patients with IDDM. ...
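For readers unfamiliar with the internal-consistency figure quoted above, here is a minimal sketch of how a Cronbach coefficient is commonly computed. It assumes the abstract's "Cronbach's r" refers to the standard alpha statistic, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the sample scores are hypothetical and are not DQOL data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns.

    `items` holds one list per questionnaire item, each with one score
    per respondent (hypothetical layout chosen for this sketch).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("every item needs one score per respondent")
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Sample variance of each respondent's total score.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


# Made-up scores: 3 items answered by 4 respondents.
identical_items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha_identical = cronbach_alpha(identical_items)
print(alpha_identical)
```

Items that rank respondents identically give the maximum value of 1; weakly related items pull the coefficient toward 0.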
I am extremely passionate about both reproducibility and replicability in neuroimaging. Reproducibility is the ultimate goal, while replicability should be the bare minimum that we demand in the 21st century. To this end I am committed to releasing all the analysis code from my peer-reviewed manuscripts. You can find it on my GitHub page. Please do use whatever you…
IgA nephropathy (IgAN) is the most common cause of glomerulonephritis worldwide. The extent of fibrosis, tubular atrophy and glomerulosclerosis predicts renal function decline. The extent of renal fibrosis is assessed with renal biopsy, which is invasive and prone to sampling error. We assessed the utility of non-contrast native T1 mapping of the kidney in patients with IgAN for assessment of renal fibrosis. Renal native T1 mapping was undertaken in 20 patients with IgAN and 10 healthy subjects. Ten IgAN patients had a second scan to assess test-retest reproducibility of the technique. Native T1 times were compared to markers of disease severity including degree of fibrosis, eGFR, rate of eGFR decline and proteinuria. All patients tolerated the MRI scan and analysable quality T1 maps were acquired in at least one kidney in all subjects. Cortical T1 times were significantly longer in patients with IgAN than healthy subjects (1540 ± 110 ms versus 1446 ± 88 ms, p = 0.038). There was excellent test-retest
This study examined the reliability and validity of the Virtual Assessment of Mentalising Ability (VAMA). The VAMA consists of 12 video clips depicting a social drama imposed within an interactive virtual environment with questions assessing the mental states of virtual friends. Response options capture the continuum of ability (i.e., impaired, reduced, accurate, and hypermentalising) within first- and second-order cognitive and affective theory of mind (ToM). Sixty-two healthy participants were administered the VAMA, three other ToM measures, and additional measures of neurocognitive abilities and social functioning. The VAMA had sound internal consistency and high test-retest reliability. Significant correlations between performance on the VAMA and other ToM measures provided preliminary evidence of convergent validity. Small to moderate correlations were observed between performance on the VAMA and neurocognitive tasks. Further, the VAMA was found to correlate significantly with indices of ...
Introduction: It is a common finding that despite high levels of specificity and sensitivity, many medical tests are not highly effective in diagnosing diseases exhibiting a low prevalence within a clinical population. What is not widely known or appreciated is how the results of retesting a patient using the same or a different medical or psychological test impact the estimated probability that a patient has a particular disease. In the absence of a gold standard, special techniques are required to understand the error structure of a medical test. Generalizability can provide guidance as to whether a serial Bayes model accurately updates the positive predictive value of multiple test results. Methods: In order to understand how sources of error impact a test's outcome, test results should be sampled across the testing conditions that may contribute to error. A generalizability analysis of appropriately sampled test results should allow researchers to estimate the influence of each error source as a
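The low-prevalence effect and the serial update described in the abstract above can be made concrete with a short calculation. This is a minimal sketch under a strong assumption: successive test results are treated as conditionally independent given disease status, which is exactly the assumption the abstract suggests needs checking. The function names and the prevalence, sensitivity, and specificity values are illustrative, not taken from the study.

```python
def posterior(prior, sensitivity, specificity, positive=True):
    """Bayes update of disease probability after one test result."""
    if positive:
        p_result_if_disease = sensitivity
        p_result_if_healthy = 1.0 - specificity
    else:
        p_result_if_disease = 1.0 - sensitivity
        p_result_if_healthy = specificity
    numerator = p_result_if_disease * prior
    return numerator / (numerator + p_result_if_healthy * (1.0 - prior))


def serial_posterior(prevalence, results):
    """Chain updates over (sensitivity, specificity, positive) tuples.

    Assumes each result is independent of the others given disease
    status; correlated errors between retests would violate this.
    """
    p = prevalence
    for sensitivity, specificity, positive in results:
        p = posterior(p, sensitivity, specificity, positive)
    return p


# Illustrative numbers: 1% prevalence, 95% sensitivity and specificity.
ppv_one = serial_posterior(0.01, [(0.95, 0.95, True)])
ppv_two = serial_posterior(0.01, [(0.95, 0.95, True), (0.95, 0.95, True)])
print(round(ppv_one, 3), round(ppv_two, 3))
```

With these made-up inputs a single positive result leaves the disease probability near 16%, and a second independent positive raises it to about 78%. If the retest shares error sources with the first test, the true gain would be smaller than this calculation suggests.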
Objective: The assessment of response to lithium maintenance treatment in bipolar disorder (BD) is complicated by variable length of treatment, unpredictable clinical course, and often inconsistent compliance. Prospective and retrospective methods of assessment of lithium response have been proposed in the literature. In this study we report the key phenotypic measures of the Retrospective Criteria of Long-Term Treatment Response in Research Subjects with Bipolar Disorder scale currently used in the Consortium on Lithium Genetics (ConLiGen) study. Materials and Methods: Twenty-nine ConLiGen sites took part in a two-stage case-vignette rating procedure to examine inter-rater agreement [kappa (κ)] and reliability [intra-class correlation coefficient (ICC)] of lithium response. Annotated first-round vignettes and rating guidelines were circulated to expert research clinicians for training purposes between the two stages. Further, we analyzed the distributional properties of the treatment ...
Test-retest repeatability of myocardial oxidative metabolism and efficiency using standalone dynamic 11C-acetate PET and multimodality approaches in healthy controls. Journal of Nuclear Cardiology.
Arterial spin labeling (ASL) sequences that incorporate multiple postlabeling delay (PLD) times allow estimation of when arterial blood signal arrives within a region of interest. Sequences that account for such variability may improve the reliability of ASL and therefore make the technique well suited for future clinical and experimental investigations of cerebral perfusion. This study assessed the within- and between-session reproducibility of an optimized pseudo-continuous ASL (pCASL) functional magnetic resonance imaging (FMRI) sequence that incorporates multiple postlabeling delays (multi-PLD pCASL). Healthy subjects underwent four identical scans separated by 30 minutes, 1 week, and 1 month using multi-PLD pCASL to image absolute perfusion (cerebral blood flow (CBF) and arterial arrival time (AAT)) during both rest and a visual-cued motor task. We show good test-retest reliability, with strong consistency across subjects and sessions during rest (inter-session within-subject coefficient of
The observer reliability of VLS is fair to good with intraobserver reliability being better than interobserver reliability. This supports the use of VLS for detection of gastrointestinal ischemia. ...
