According to modern historiography, around the year 1000 the northern Hungarian duchy with its capital in Nitra was divided into four parts, which can be regarded with more or less certainty as direct descendants of tribes both within and beyond the Great Moravian Empire: Nitra, Hont, Váh, and Borsod. In addition, in the far east of today's Slovak Republic there was a tribe that was not part of the Nitra Duchy. The consideration of tribes is important because a tribe speaks a dialect. However, the division of Slovakia was probably more complicated, and on the basis of both genealogical and linguistic research we have to accept at least seven tribes and dialects in Slovakia around the year 1000. This division is also supported by research on the Proto-Slavic lexis in Old Slovak. The paper deals with hierarchical cluster analysis of this lexis and offers a solution to the genealogical problem of the Slovak language. According to it, Old Slovak can be divided into eastern and west-central ...
The objective of any biclustering algorithm applied to microarray data is to discover a subset of genes that are expressed similarly in a subset of conditions. The boundaries of biclusters usually overlap, as genes and conditions may belong to different biclusters with different membership degrees. Hence the notion of fuzzy sets is useful for discovering such overlapping biclusters. In this article an attempt has been made to develop a multiobjective genetic-algorithm-based approach for probabilistic fuzzy biclustering that minimizes the residual and maximizes cluster size and expression profile variance. A novel variable-string-length encoding is proposed that encodes multiple biclusters in a single string. A new performance measure that reflects how a bicluster is statistically distinguished from the background is also proposed. The performance of the proposed algorithm has been compared with some well-known biclustering algorithms. © 2008 IEEE. ...
Supporting services augment the value of a business's core service, provide points of differentiation, and create a competitive advantage over competitors. Fitness clubs offer a number of supporting services, including sport participation opportunities. Fitness tests are a common supporting service. This study examined interest in fitness tests and related supporting services. Moreover, because customised programs are harder to imitate, optimal combinations of desired services were investigated. Further, K-means cluster analysis identified seven meaningfully differentiated customer groups. MANOVA and chi-square analyses indicated that clustered groups differed based on demographic and psychographic variables. The study demonstrates that (1) consumers desire supporting services, (2) distinct bundles of supporting services can be identified, and (3) consumers desiring distinct bundles of services have distinct demographic and psychographic profiles. Fitness providers
Cluster analysis of gene expression data from a cDNA microarray is useful for identifying biologically relevant groups of genes. However, finding the natural clusters in the data and estimating the correct number of clusters are still two largely unsolved problems. In this paper, we propose a new clustering framework that is able to address both these problems. By using the one-prototype-take-one-cluster (OPTOC) competitive learning paradigm, the proposed algorithm can find natural clusters in the input data, and the clustering solution is not sensitive to initialization. In order to estimate the number of distinct clusters in the data, we propose a cluster splitting and merging strategy. We have applied the new algorithm to simulated gene expression data for which the correct distribution of genes over clusters is known a priori. The results show that the proposed algorithm can find natural clusters and give the correct number of clusters. The algorithm has also been tested on real gene ...
Analysis of Chlorinated Hydrocarbon Concentration Data from Thousands of Groundwater Wells Using a Density-Based Cluster Analysis Approach
In gene expression data, a bicluster is a subset of the genes exhibiting consistent patterns over a subset of the conditions. We propose a new method to detect significant biclusters in large expression datasets. Our approach is graph theoretic coupled with statistical modelling of the data. Under plausible assumptions, our algorithm is polynomial and is guaranteed to find the most significant biclusters. We tested our method on a collection of yeast expression profiles and on a human cancer dataset. Cross validation results show high specificity in assigning function to genes based on their biclusters, and we are able to annotate in this way 196 uncharacterized yeast genes. We also demonstrate how the biclusters lead to detecting new concrete biological associations. In cancer data we are able to detect and relate finer tissue types than was previously possible. We also show that the method outperforms the biclustering
article{c7645192-ef05-4b8e-bbfb-db2b0fc93f4e, abstract = {Objective: Hematopoietic stem cell transplantation (HSCT) is curative in several life-threatening pediatric diseases but may affect children and their families, inducing depression, anxiety, burnout symptoms, and post-traumatic stress symptoms, as well as post-traumatic growth (PTG). The aim of this study was to investigate the co-occurrence of different aspects of such responses in parents of children that had undergone HSCT. Methods: Questionnaires were completed by 260 parents (146 mothers and 114 fathers) 11-198 months after HSCT: the Hospital Anxiety and Depression Scale, the Shirom-Melamed Burnout Questionnaire, the post-traumatic stress disorders checklist, civilian version, and the PTG inventory. Additional variables were also investigated: perceived support, time elapsed since HSCT, job stress, partner-relationship satisfaction, trauma appraisal, and the child's health problems. A hierarchical cluster analysis and a k-means cluster ...
Cluster Analysis Using the Fuzzy C-means and K-means Algorithms for Clustering and Mapping Agricultural Land in Minahasa Tenggara
Background: The timely and accurate identification of symptoms of acute coronary syndrome (ACS) is a challenge for patients and clinicians. It is unknown whether response times and clinical outcomes differ with specific symptoms. We sought to identify which ACS symptoms are related symptom clusters and to determine if sample characteristics, response times, and outcomes differ among symptom cluster groups. Methods: In a multisite randomized clinical trial, 3522 patients with known cardiovascular disease were followed up for 2 years. During follow-up, 331 (11%) had a confirmed ACS event. In this group, 8 presenting symptoms were analyzed using cluster analysis. Differences in symptom cluster group characteristics, delay times, and outcomes were examined. Results: The sample was predominately male (67%), older (mean 67.8, S.D. 11.6 years), and white (90%). Four symptom clusters were identified: Classic ACS characterized by chest pain; Pain Symptoms (neck, throat, jaw, back, shoulder, arm pain); ...
This paper aims to test the hypothesis of consumer price index (CPI) convergence among Iran's provinces over the period from 2003 to 2016 using cluster analysis and a panel unit root test. Studying price index convergence is important in several ways. First, CPI convergence is equivalent in some ways ...
With limited resources available, injury prevention efforts need to be targeted both geographically and to specific populations. As part of a pediatric injury prevention project, data was obtained on all pediatric medical and injury incidents in a fire district to evaluate geographical clustering of pediatric injuries. This will be the first step in attempting to prevent these injuries with specific interventions depending on locations and mechanisms. There were a total of 4803 incidents involving patients less than 15 years of age that the fire district responded to during 2001-2005, of which 1997 were categorized as injuries and 2806 as medical calls. The two cohorts (injured versus medical) differed in age distribution (7.7 ± 4.4 years versus 5.4 ± 4.8 years, p < 0.001) and location type of incident (school or church 12% versus 15%, multifamily residence 22% versus 13%, single family residence 51% versus 28%, sport, park or recreational facility 3% versus 8%, public building 8% versus 7%, and street
A feature selection method in microarray gene expression data should be independent of platform, disease and dataset size. Our hypothesis is that among the statistically significant ranked genes in a gene list, there should be clusters of genes that share similar biological functions related to the investigated disease. Thus, instead of keeping N top ranked genes, it would be more appropriate to define and keep a number of gene cluster exemplars. We propose a hybrid FS method (mAP-KL), which combines multiple hypothesis testing and affinity propagation (AP)-clustering algorithm along with the Krzanowski & Lai cluster quality index, to select a small yet informative subset of genes. We applied mAP-KL on real microarray data, as well as on simulated data, and compared its performance against 13 other feature selection approaches. Across a variety of diseases and number of samples, mAP-KL presents competitive classification results, particularly in neuromuscular diseases, where its overall AUC score was 0
In this thesis, a mixture-model cluster analysis technique under different covariance structures of the component densities is developed and presented, to capture the compactness, orientation, shape, and the volume of component clusters in one expert system to handle Gaussian high dimensional heterogeneous data sets to achieve flexibility in currently practiced cluster analysis techniques. Two approaches to parameter estimation are considered and compared; one using the Expectation-Maximization (EM) algorithm and another following a Bayesian framework using the Gibbs sampler. We develop and score several forms of the ICOMP criterion of Bozdogan (1994, 2004) as our fitness function; to choose the number of component clusters, to choose the correct component covariance matrix structure among nine candidate covariance structures, and to select the optimal parameters and the best fitting mixture-model. We demonstrate our approach on simulated datasets and a real large data set, focusing on early detection
This study links empirical analysis of geographical variations in fertility to ideas of contextualising demography. We examine whether there are statistically significant clusters of fertility in Scotland between 1981 and 2001, controlling for more general factors expected to influence fertility. Our hypothesis, that fertility patterns at a local scale cannot be explained entirely by ecological socio-economic variables, is supported. In fact, there are unexplained local clusters of high and low fertility, which would be masked in analyses at a different scale. We discuss the demographic significance of local fertility clusters as contexts for fertility behaviour, including the role of the housing market and social interaction processes, and the residential sorting of those displaying or anticipating different fertility behaviour. We conclude that greater understanding of local geographical contexts is needed if we are to develop mid-level demographic theories and shift the focus of fertility ...
The k-means algorithm is explained and an implementation is provided in C# and Silverlight. A live Silverlight demo lets users see how the k-means algorithm works by specifying custom data points.
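As a rough illustration of the algorithm this entry refers to, the following is a minimal sketch of the standard k-means iteration (Lloyd's algorithm) in Python rather than the C#/Silverlight implementation mentioned above. The function name `kmeans`, its parameters, and the sample points are assumptions made for this example and do not come from the original article.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Hypothetical helper: Lloyd's algorithm on a list of numeric tuples."""
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, assignment = kmeans(data, k=2)
    print(centers, assignment)
```

The result depends on the random starting centroids, which is why a seed is exposed; on data with less obvious structure, several restarts are commonly used.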
TY - JOUR. T1 - Dietary patterns by cluster analysis in pregnant women. T2 - relationship with nutrient intakes and dietary patterns in 7-year-old offspring. AU - Freitas-Vilela, Ana Amélia. AU - Smith, Andrew D A C. AU - Kac, Gilberto. AU - Pearson, Rebecca M. AU - Heron, Jon. AU - Emond, Alan. AU - Hibbeln, Joseph R. AU - Castro, Maria Beatriz Trindade. AU - Emmett, Pauline M. N1 - © 2016 The Authors. Maternal & Child Nutrition published by John Wiley & Sons Ltd. PY - 2017/4. Y1 - 2017/4. N2 - Little is known about how dietary patterns of mothers and their children track over time. The objectives of this study are to obtain dietary patterns in pregnancy using cluster analysis, to examine women's mean nutrient intakes in each cluster and to compare the dietary patterns of mothers to those of their children. Pregnant women (n = 12 195) from the Avon Longitudinal Study of Parents and Children reported their frequency of consumption of 47 foods and food groups. These data were used to obtain ...
Constrained clustering with a complex cluster structure, Advances in Data Analysis and Classification.
Abstract: Color fusion MRI is being investigated for its value in automatic segmentation of tissues. An existing color fusion MRI data set of the liver, pancreas, and kidney of a normal male volunteer was analyzed both visually and statistically. Automatic tissue segmentation can allow better differentiation of abdominal pathologies, as well as pathologies associated with other organs. My research hypothesis is that fuzzy c-means clustering can be used to quantify the confidence levels of correct classification of renal, pancreatic, and hepatic tissues visualized by the color fusion MRI method. Results from data show that fuzzy c-means clustering can be used to validate the correctness of classification of abdominal tissues that are visualized by color fusion MRI.
Longitudinal data refer to the situation where repeated observations are available for each sampled object. Clustered data, where observations are nested in a hierarchical structure within objects (wi
Forty-two native, new and foreign breeds were analyzed for 18 traits. Principal component (PC) analysis showed that the first three PCs accounted for 82.6% of the total variation. The first PC is a Size and Weight Factor (SWF) and accounts for 50.5% of the total variation. The second PC is a Skin and Bone Factor (SBF) and accounts for 20.8% of the variation. The third PC is a Reproduction and Fat Factor (RFF) and accounts for 11.3% of the total variation. Non-lean meat carcass traits (skin, bone and fat) are associated with reproductive performance. Plotting SBF against SWF is useful in grouping of breed groups. This grouping is in agreement with that obtained by cluster analysis. Breeds from the same geographical area tend to be in the same performance group, suggesting genetic connections in the past. Cluster analysis indicated six genetic types. New breeds showed the shortest genetic distance to the foreign contributor breeds ...
In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we --a team of visualization scientists and meteorologists-- deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows ...
In the present investigation, we sought to refine the classification of urothelial carcinoma by combining information on gene expression, genomic, and gene mutation levels. For these purposes, we performed gene expression analysis of 144 carcinomas, and whole genome array-CGH analysis and mutation analyses of FGFR3, PIK3CA, KRAS, HRAS, NRAS, TP53, CDKN2A, and TSC1 in 103 of these cases. Hierarchical cluster analysis identified two intrinsic molecular subtypes, MS1 and MS2, which were validated and defined by the same set of genes in three independent bladder cancer data sets. The two subtypes differed with respect to gene expression and mutation profiles, as well as with the level of genomic instability. The data show that genomic instability was the most distinguishing genomic feature of MS2 tumors, and that this trait was not dependent on TP53/MDM2 alterations. By combining molecular and pathologic data, it was possible to distinguish two molecular subtypes of T(a) and T(1) tumors, ...
Clustering methods have to assume some cluster relationship among the data objects that they are applied to. Similarity between a pai...
COPD is a highly heterogeneous disease composed of different phenotypes with different aetiological and prognostic profiles and current classification systems do not fully capture this heterogeneity. In this study we sought to discover, describe and validate COPD subtypes using cluster analysis on data derived from electronic health records. We applied two unsupervised learning algorithms (k-means and hierarchical clustering) in 30,961 current and former smokers diagnosed with COPD, using linked national structured electronic health records in England available through the CALIBER resource. We used 15 clinical features, including risk factors and comorbidities and performed dimensionality reduction using multiple correspondence analysis. We compared the association between cluster membership and COPD exacerbations and respiratory and cardiovascular death with 10,736 deaths recorded over 146,466 person-years of follow-up. We also implemented and tested a process to assign unseen patients into clusters
We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multiethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry, which is highly correlated with ...
Compared with individually randomised trials, cluster randomised trials are more complex to design, require more participants to obtain equivalent statistical power, and require more complex analysis. The methodological issues in cluster randomised trials have been widely discussed.7 9 In brief, observations on individuals in the same cluster tend to be correlated (non-independent), and so the effective sample size is less than the total number of individual participants. The reduction in effective sample size depends on average cluster size and the degree of correlation within clusters, known as the intracluster (or intraclass) correlation coefficient (ρ). The intracluster correlation coefficient is the proportion of the total variance of the outcome that can be explained by the variation between clusters. To retain power, the sample size should be multiplied by 1 + (m - 1)ρ, called the design effect, where m is the average cluster size. Hayes and Bennett describe a related coefficient of ...
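The design effect 1 + (m - 1)ρ described above can be worked through in a few lines. This is a minimal sketch; the function names and the example numbers (average cluster size 21, ρ = 0.05, 300 participants for an individually randomised design) are illustrative assumptions, not values taken from the source.

```python
import math


def design_effect(m, rho):
    """Design effect 1 + (m - 1) * rho for average cluster size m and ICC rho."""
    return 1 + (m - 1) * rho


def inflated_sample_size(n_individual, m, rho):
    """Participants needed in a cluster trial to match an individually
    randomised trial that would need n_individual participants."""
    # Rounding first guards against floating-point noise before taking the ceiling.
    return math.ceil(round(n_individual * design_effect(m, rho), 9))


def effective_sample_size(n_total, m, rho):
    """Effective number of independent observations among n_total participants."""
    return n_total / design_effect(m, rho)


if __name__ == "__main__":
    # Illustrative numbers only.
    print(design_effect(21, 0.05))
    print(inflated_sample_size(300, 21, 0.05))
    print(effective_sample_size(600, 21, 0.05))
```

When ρ is 0 or every cluster has a single member, the design effect is 1 and no inflation is needed; it grows with both cluster size and within-cluster correlation.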
FESER E. J. and BERGMAN E. M. (2000) National industry cluster templates: a framework for applied regional cluster analysis, Reg. Studies 34, 1-19. A growing number of cities, states and regions in Europe, North America and elsewhere are designing development strategies around strategic clusters of industries. In many cases, a lack of data on local and interregional industrial linkages, shared business institutions, channels of technology and knowledge transfer, and other dimensions of the cluster concept means that relatively simple measures (location quotients, industry size) are often used to initially detect clusters in subnational regions. In this paper, we suggest a means of using available information on national interindustry linkages to identify potential clusters in subnational areas. Specifically, we derive a set of 23 US manufacturing clusters and employ them as templates in an illustrative analysis of the manufacturing sector in a single US state. The
The distributed K-means cluster algorithm which focused on multidimensional data has been widely used. However, the current distributed K-means clustering algor
In this study we have shown biclustering to be a useful approach to identifying subgroups of tumours, based on the use of stratified biomarkers that are personalised to specific subsets of patients. Biclustering determines gene modules and related clinical features which are important in determining phenotypic and clinical outcomes in those patients, but not in others. In particular, we have applied biclustering to a large breast cancer expression data set that includes careful clinical annotations, and have used this method to identify clusters of breast tumours conditional on common expression profiles across a set of genes. We also demonstrated that biclusters do not simply recapitulate any obvious single, known clinical covariate (Figure 3 and Additional file 1: Figure S1), but instead represent a group of tumours co-expressing a set of genes that are associated with similar clinical presentation and give rise to recurrence risk. We found that biclusters have strong prognostic association ...
FCM differs from k-means in using the membership values \(u_{ij}\) and the fuzzifier \(m\). The membership \(u_{ij}\) is defined as follows: \[ u_{ij} = \frac{1}{\sum\limits_{l=1}^k \left( \frac{\| x_i - c_j \|}{\| x_i - c_l \|}\right)^{\frac{2}{m-1}}} \] The degree of belonging, \(u_{ij}\), is inversely related to the distance from \(x_i\) to the cluster center \(c_j\). The parameter \(m\) is a real number greater than 1 (\(1.0 < m < \infty\)) and defines the level of cluster fuzziness. A value of \(m\) close to 1 gives a cluster solution that becomes increasingly similar to the solution of hard clustering such as k-means, whereas letting \(m\) tend to infinity leads to complete fuzziness. A good choice is m = 2.0 (Hathaway and Bezdek 2001). In fuzzy clustering the centroid of a cluster is the mean of all \(n\) points, weighted by their degree of belonging to the cluster: \[ c_j = \frac{\sum\limits_{i=1}^n u_{ij}^m x_i}{\sum\limits_{i=1}^n u_{ij}^m} \] ...
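The membership and centroid formulas above can be sketched as one FCM update step in plain Python. The function names `memberships` and `update_centers`, the handling of a point that coincides with a center, and the sample data are assumptions made for this illustration; they are not taken from the original tutorial.

```python
import math


def memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to center j."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for x in points:
        d = [math.dist(x, c) for c in centers]
        if any(dj == 0.0 for dj in d):
            # Assumed convention: a point sitting exactly on one or more
            # centers shares full membership among those centers.
            zeros = [dj == 0.0 for dj in d]
            row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
        else:
            # u_ij = 1 / sum_l (d_ij / d_il)^(2 / (m - 1))
            row = [1.0 / sum((dj / dl) ** exponent for dl in d) for dj in d]
        u.append(row)
    return u


def update_centers(points, u, m=2.0):
    """Centroids as means of all points weighted by u_ij ** m."""
    k = len(u[0])
    dim = len(points[0])
    centers = []
    for j in range(k):
        w = [row[j] ** m for row in u]
        total = sum(w)
        centers.append(
            tuple(sum(wi * x[a] for wi, x in zip(w, points)) / total
                  for a in range(dim))
        )
    return centers


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
    ctrs = [(0.0, 0.0), (4.0, 0.0)]
    u = memberships(pts, ctrs)
    print(u)
    print(update_centers(pts, u))
```

A full FCM run would alternate these two functions until the memberships stop changing by more than a small tolerance.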
Let us now take a closer look at the results. Click on the picture on the left to get to an interactive 3D graph of the 4-cluster solution, for which the R code can be found below. The 4-cluster solution yields 4 ellipsoids that aim to reflect the areas with high observation densities for the clusters. These ellipsoids should make the graph easier to read; the actual observations are still represented by differently coloured dots, just as in the 2-dimensional plot we used for exploration. The three upper clusters in the picture share a comparable level of Monetary Value and Recency. The dark blue ellipsoid stands out among the three as it reflects higher Frequency. The lower ellipsoid reflects observations that rank relatively low on all three RFM variables (remember, the higher the recency, the worse, given that we are working with a dataset of good donors). The video below contains a fixed-axis rotation. ...
A comprehensive study of the lattice dynamics, elastic moduli, and liquid metal resistivities for 16 simple metals in the bcc and fcc crystal structures is made using a density-based local pseudopotential. The phonon frequencies exhibit excellent agreement with both experiment and nonlocal pseudopotential theory. The bulk modulus is evaluated by the long wave and homogeneous deformation methods, which agree after a correction is applied to the former. Calculated bulk and Voigt shear moduli are insensitive to crystal structure, and long-wavelength soft modes are found in certain cases. Resistivity calculations confirm that electrons scatter off the whole Kohn-Sham potential, including its exchange-correlation part as well as its Hartree part. All of these results are found in second-order pseudopotential perturbation theory. However, the effect of a nonperturbative treatment on the calculated lattice constant is not negligible, showing that higher-order contributions have been subsumed into the ...
An urban land-cover classification of the 900 km² comprising the UK West Midland metropolitan area was generated for the purpose of facilitating stratified environmental survey and sampling. The classification grouped the 900 km² into eight urban land-cover classes. Input data to the classification algorithms were derived from spatial land-cover data obtained from the UK Centre for Ecology and Hydrology, and from the UK Ordnance Survey. These data provided a description of each km² in terms of the contributions to the land cover of 25 attributes (e.g. open land, urban, villages, motorway, etc.). The dimensionality of the land-cover dataset was reduced using principal component analysis, and eight urban classes were derived by cluster analysis using an agglomeration technique on the extracted components. The resulting urban land-cover classes reflected groupings of 1 km² pixels with similar urban land morphology. Uncertainties associated with this agglomerative classification were ...
This book provides the reader with a basic understanding of the formal concepts of the cluster, clustering, partition, cluster analysis, etc. The book explains feature-based, graph-based and spectral clustering methods and discusses their formal similarities and differences. Understanding the related formal concepts is particularly vital in the epoch of Big Data; due to the volume and characteristics of the data, it is no longer feasible to predominantly rely on merely viewing the data when facing a clustering problem. Usually clustering involves choosing similar objects and grouping them together. To facilitate the choice of similarity measures for complex and big data, various measures of object similarity, based on quantitative (like numerical measurement results) and qualitative features (like text), as well as combinations of the two, are described, as well as graph-based similarity measures for (hyper) linked objects and measures for multilayered graphs. Numerous variants demonstrating ...
Fixes a problem where a clustering model that uses the K-means algorithm generates different results that are affected by PredictOnly columns in SQL Server 2008 R2 Analysis Services.
TEDDER, Michelle J. et al. Classification and mapping of the composition and structure of dry woodland and savanna in the eastern Okavango Delta. Koedoe [online]. 2013, vol.55, n.1, pp.00-00. ISSN 2071-0771. The dry woodland and savanna regions of the Okavango Delta form a transition zone between the Okavango Swamps and the Kalahari Desert and have been largely overlooked in terms of vegetation classification and mapping. This study focused on the species composition and height structure of this vegetation, with the aim of identifying vegetation classes and providing a vegetation map accompanied by quantitative data. Two hundred and fifty-six plots (50 m × 50 m) were sampled and species cover abundance, total cover and structural composition were recorded. The plots were classified using agglomerative, hierarchical cluster analysis using group means and Bray-Curtis similarity, and the groups were described using indicator species analysis. In total, 23 woody species and 28 grass species were recorded. ...
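To make the classification step more concrete, here is a minimal sketch of agglomerative clustering with Bray-Curtis similarity. It assumes that "group means" refers to group-average linkage, and the function names, the treatment of two empty plots, and the toy cover values are all illustrative assumptions rather than the study's actual procedure or data.

```python
from itertools import combinations


def bray_curtis_similarity(a, b):
    """1 - sum|a_i - b_i| / sum(a_i + b_i) for two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 1.0  # assumed convention: two empty plots count as identical
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / total


def group_average_clustering(plots, n_groups):
    """Merge the two most similar groups until n_groups remain."""
    clusters = [[i] for i in range(len(plots))]

    def avg_sim(c1, c2):
        # Group-average linkage: mean similarity over all cross-group pairs.
        sims = [bray_curtis_similarity(plots[i], plots[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > n_groups:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: avg_sim(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


if __name__ == "__main__":
    # Toy species-cover values for four plots (made up for illustration).
    plots = [[10, 0, 0], [9, 1, 0], [0, 0, 10], [0, 1, 9]]
    print(group_average_clustering(plots, n_groups=2))
```

This brute-force version recomputes similarities at every merge, which is acceptable for a few hundred plots but would be replaced by a library routine for larger datasets.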
Chronic pain represents a major health problem among older people. The aims of the present study were to: (i) identify various profiles of pain and distress experiences among older patients; and (ii) compare whether background variables, sense of coherence, functional ability and experiences of interventions aimed at reducing pain and distress varied among the patient profiles. Interviews were carried out with 42 older patients. A cluster analysis yielded three clusters, each representing a different profile of patients. Case illustrations are provided for each profile. There were no differences between the clusters, regarding intensity and duration of pain. One profile, with subjects of advanced age, showed a decreased functional ability and favourable scores in most of the categories of pain and distress. Another profile of patients showed favourable mean scores in all categories. The third cluster of patients showed unfavourable scores in most categories of pain and distress. There appears to ...
This paper deals with several problems in cluster analysis. It appears that the suggested solutions have not been considered in current literature. First, the author proposes the use of a permuted matrix as a tool for interpretation of clusters generated by hierarchical agglomerative clustering algorithms. Second, a new method of defining similarity between a pair of clusters is shown. This method leads to a new class of hierarchical agglomerative clustering. Third, two criteria are defined to optimize dendrograms that are outputs of hierarchical clustering. This paper has been presented at the Task Force Seminar Session on New Advances in Decision Support Systems, Laxenburg, Austria, November 3-5, 1986. ...
In this paper, we illustrate an application of Ascendant Hierarchical Cluster Analysis (AHCA) to complex data taken from the literature (interval data), based on the standardized weighted generalized affinity coefficient, by the method of Wald and Wolfowitz. The probabilistic aggregation criteria used belong to a parametric family of methods under the probabilistic approach of AHCA, named VL methodology. Finally, we compare the results achieved using our approach with those obtained by other authors. ...
Clustering or cluster analysis is a type of data analysis. The analyst groups objects so that objects in the same group (called a cluster) are more similar to each other than to objects in other groups (clusters) in some way. This is a common task in data mining. ...
TY - JOUR
T1 - The XMM Cluster Survey
T2 - X-ray analysis methodology
AU - Lloyd-Davies, E. J.
AU - Romer, A. Kathy
AU - Mehrtens, Nicola
AU - Hosmer, Mark
AU - Davidson, Michael
AU - Sabirli, Kivanc
AU - Mann, Robert G.
AU - Hilton, Matt
AU - Liddle, Andrew R.
AU - Viana, Pedro T. P.
AU - Campbell, Heather C.
AU - Collins, Chris A.
AU - Dubois, E. Naomi
AU - Freeman, Peter
AU - Harrison, Craig D.
AU - Hoyle, Ben
AU - Kay, Scott T.
AU - Kuwertz, Emma
AU - Miller, Christopher J.
AU - Nichol, Robert C.
AU - Sahlén, Martin
AU - Stanford, S. A.
AU - Stott, John P.
PY - 2011/11/21
Y1 - 2011/11/21
N2 - The XMM Cluster Survey (XCS) is a serendipitous search for galaxy clusters using all publicly available data in the XMM-Newton Science Archive. Its main aims are to measure cosmological parameters and trace the evolution of X-ray scaling relations. In this paper we describe the data processing methodology applied to the 5776 XMM observations used to construct the current XCS ...
My presentation aims to show how these limitations can be overcome by means of affinity propagation clustering. This is a mathematical method that uses the phylogenetic distance matrix to allocate sequences to generic clusters. I will show how affinity propagation clustering was applied to the distance matrices derived from the RABV full genome sample sets, resulting in a cluster structure which strongly corresponds to the structure of the Maximum Likelihood-based phylogenetic tree. At the end of my presentation I would like to discuss strategies to implement a workflow based on this method to validate evidence for space-dependent clustering of rabies virus sequences ...
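For readers unfamiliar with the method, here is a minimal sketch of affinity propagation driven by a distance matrix. It is a plain textbook version of the message-passing updates, not the presenter's workflow. The toy positions stand in for sequences, and the use of negated distance as similarity, the median similarity as the shared preference, and the damping and iteration settings are all assumptions made for illustration.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, iterations=200):
    """Affinity propagation on a similarity matrix S whose diagonal holds preferences.

    Returns, for every point, the index of the exemplar it is assigned to.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i, given rivals.
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability: how much support k has collected as an exemplar.
        for k in range(n):
            support = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(support)
            for i in range(n):
                if i == k:
                    new = total - support[k]
                else:
                    new = min(0.0, R[k][k] + total - support[i] - support[k])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        # Assumption for this sketch: with no exemplar, put everything in one cluster.
        return [0] * n
    labels = [max(exemplars, key=lambda k: S[i][k]) for i in range(n)]
    for k in exemplars:
        labels[k] = k  # an exemplar always represents itself
    return labels


# Hypothetical 1-D positions; a real input would be a phylogenetic distance matrix.
positions = [0.0, 1.0, 2.5, 20.0, 21.0, 22.5]
n = len(positions)
D = [[abs(p - q) for q in positions] for p in positions]

# Similarity is negated distance; the diagonal gets a shared preference value.
S = [[-D[i][j] for j in range(n)] for i in range(n)]
preference = median(S[i][j] for i in range(n) for j in range(n) if i != j)
for i in range(n):
    S[i][i] = preference

labels = affinity_propagation(S)
print(labels)
```

The number of clusters is not fixed in advance; it follows from the preference value, with lower preferences generally yielding fewer exemplars.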
Learn Disease Clusters from Johns Hopkins University. Do a lot of people in your neighborhood all seem to have the same sickness? Are people concerned about high rates of cancer? Your community may want to explore the possibility of a disease ...
The aim of this study is to investigate the profiles of students in the MIS department by performing cluster analysis on various dimensions of academic abilities.
To explore the clinical patterns of patients with IgG4-related disease (IgG4-RD) based on laboratory tests and the number of organs involved. Twenty-two baseline variables were obtained from 154 patients with IgG4-RD. Based on principal component analysis (PCA), patients with IgG4-RD were classified into different subgroups using cluster analysis. Additionally, IgG4-RD composite score (IgG4-RD CS) as a comprehensive score was calculated for each patient by principal component evaluation. Multiple linear regression was used to establish the
Cluster analysis is used in data mining and is a common technique for statistical data analysis used in many fields of study, such as the medical & life science
Multimorbidity is highly prevalent in the elderly and relates to many adverse outcomes, such as higher mortality, increased disability and functional decline. Many studies have tried to reduce the heterogeneity of multimorbidity by identifying multimorbidity clusters or disease combinations. However, the internal structure of multimorbidity clusters and the links between disease combinations and clusters are still unknown. The aim of this study was to depict which diseases were associated with each other at the person level within the clusters and which ones were responsible for overlapping multimorbidity clusters. The study analyses insurance claims data of the Gmünder ErsatzKasse from 2006 with 43,632 female and 54,987 male patients who were 65 years and older. The analyses are based on multimorbidity clusters from a previous study and combinations of three diseases (triads) identified by observed/expected ratios ≥ 2 and prevalence rates ≥ 1%. In order to visualise a disease network, an edgelist was
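The triad selection rule can be sketched as follows. The abstract does not say how the expected prevalence is computed, so this example assumes independence, taking the expected prevalence of a triad as the product of the three single-disease prevalences. The function name `select_triads`, the disease codes and the patient records are invented for illustration.

```python
from itertools import combinations


def select_triads(patients, min_ratio=2.0, min_prevalence=0.01):
    """Return (triad, observed prevalence, observed/expected ratio) for kept triads.

    `patients` is a list of sets, one set of diagnosed diseases per person.
    """
    n = len(patients)
    diseases = sorted(set().union(*patients))
    # Prevalence of each single disease in the population.
    single = {d: sum(d in p for p in patients) / n for d in diseases}
    kept = []
    for triad in combinations(diseases, 3):
        observed = sum(set(triad) <= p for p in patients) / n
        # Assumed definition: expected prevalence if the diseases were independent.
        expected = single[triad[0]] * single[triad[1]] * single[triad[2]]
        if observed >= min_prevalence and observed / expected >= min_ratio:
            kept.append((triad, observed, observed / expected))
    return kept


# Hypothetical records: three people share A, B and C; two have only D; five have none.
patients = [{"A", "B", "C"}] * 3 + [{"D"}] * 2 + [set()] * 5

kept = select_triads(patients)
for triad, prevalence, ratio in kept:
    print(triad, round(prevalence, 3), round(ratio, 2))
```

Each kept triad could then contribute its three disease pairs as edges when building a network, though the exact construction used in the study is not stated in the text above.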
Discover how segmentation & cluster analysis can benefit market research for Major Retailers and how Fuel Cycle can help with these techniques today.