When selecting a classifier built with a machine learning algorithm such as the PNN or DT, class skew in the data should be considered. The NB and DT were the optimal classifiers in a prediction task on an imbalanced medical database.
Objective: The current study used the decision tree (DT) method to develop different prediction models for the incidence of type 2 diabetes (T2D) and to explore interactions between predictor variables in those models. Design: Prospective cohort study. Setting: Tehran Lipid and Glucose Study (TLGS). Methods: A total of 6647 participants (43.4% men) aged >20 years, without T2D at baselines (1999–2001 and 2002–2005), were followed until 2012. Two series of models (with and without 2-hour postchallenge plasma glucose (2h-PCPG)) were developed using three types of DT algorithms. The performance of the models was assessed using sensitivity, specificity, area under the ROC curve (AUC), geometric mean (G-Mean) and F-Measure. Primary outcome measure: T2D was the primary outcome, defined as fasting plasma glucose (FPG) ≥7 mmol/L, 2h-PCPG ≥11.1 mmol/L, or use of antidiabetic medication. Results: During a median follow-up of 9.5 years, 729 new cases of T2D were identified. The Quick Unbiased Efficient Statistical Tree (QUEST) algorithm had the highest sensitivity and G-Mean among all the models for men and women. The models that included 2h-PCPG had sensitivity and G-Mean of (78% and 0.75%) and (78% and 0.78%) for men and women, respectively. Both models achieved good discrimination power, with AUC above 0.78. FPG, 2h-PCPG, waist-to-height ratio (WHtR) and mean arterial blood pressure (MAP) were the most important predictors of incident T2D in both genders. Among men, those with an FPG≤4.9 mmol/L and 2h-PCPG≤7.7 mmol/L had the lowest risk, and those with an FPG>5.3 mmol/L and 2h-PCPG>4.4 mmol/L had the highest risk for T2D incidence. In women, those with an FPG≤5.2 mmol/L and WHtR≤0.55 had the lowest risk, and those with an FPG>5.2 mmol/L and WHtR>0.56 had the highest risk for T2D incidence. Conclusions: Our study emphasises the utility of DT for exploring interactions between predictor variables.
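The evaluation measures named in the Methods can all be derived from a 2×2 confusion matrix. The following is a minimal Python sketch, not the study's own code. It assumes the common definitions of G-Mean (square root of sensitivity times specificity) and F-Measure (harmonic mean of precision and sensitivity), since the abstract does not spell out either formula. The function name `imbalance_metrics` and the example counts are hypothetical and are not taken from the TLGS data.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def imbalance_metrics(tp, fn, fp, tn):
    """Compute metrics suited to imbalanced binary outcomes.

    tp, fn, fp, tn are counts from a 2x2 confusion matrix
    (hypothetical inputs; not study data).
    """
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)
    # Assumed definition: harmonic mean of precision and sensitivity (F1).
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "g_mean": g_mean,
        "f_measure": f_measure,
    }


# Illustrative counts only: 100 incident cases among 1100 participants.
metrics = imbalance_metrics(tp=78, fn=22, fp=250, tn=750)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a rare outcome, a model can reach high overall accuracy while missing most cases. G-Mean penalises that behaviour because it falls toward zero whenever either sensitivity or specificity is low, which is presumably why measures of this kind are reported alongside AUC for imbalanced data.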
Background: Histopathologic assessment of liver tissue is an essential step in the management and follow-up of non-alcoholic fatty liver disease (NAFLD), but inter- and intra-observer variations limit the accuracy of these assessments. Objectives: The aim of this study was to assess the inter- and intra-observer reproducibility of histopathologic assessment of liver biopsies based on the NAFLD activity score (NAS) scoring system. Materials and Methods: The anonymous liver biopsy samples of 100 consecutive adults with suspected NAFLD were randomly assigned to four pathologists. The samples were then randomly reassigned to the pathologists a second time, so that each sample was evaluated by two different pathologists. Biopsies were revisited by their first evaluator after two months. The results were reported based on the NAS scoring system. Results: Inter-observer agreement of the pathology scores based on the NAS scoring system was acceptable for steatosis, lobular inflammation, and fibrosis, but not for hepatocyte ballooning. The intra-observer agreement was acceptable in all scales, with the lowest intra-class correlation observed for lobular inflammation. Conclusions: The NAS scoring system has good overall inter- and intra-observer agreement, but more attention should be given to defining hepatocyte ballooning and lobular inflammation, and to training pathologists, to improve the accuracy of pathology reports.
Background: Type 2 diabetes, a common and serious global health concern, had an estimated worldwide prevalence of 366 million in 2011, which is expected to rise to 552 million people by 2030 unless urgent action is taken. Objectives: The aim of this study was to identify risk patterns for type 2 diabetes incidence using association rule mining (ARM). Patients and Methods: A population of 6647 individuals without diabetes, aged ≥ 20 years at inclusion, was followed for 10-12 years to analyze risk patterns for diabetes occurrence. Study variables included demographic and anthropometric characteristics, smoking status, medical and drug history, and laboratory measures. Results: In women, the results showed that impaired fasting glucose (IFG) and impaired glucose tolerance (IGT), in combination with body mass index (BMI) ≥ 30 kg/m², family history of diabetes, wrist circumference > 16.5 cm and waist-to-height ratio ≥ 0.5, can increase the risk of developing diabetes. For men, a combination of IGT, IFG, length of stay in the city (> 40 years), central obesity, total cholesterol to high-density lipoprotein ratio ≥ 5.3, low physical activity, chronic kidney disease and wrist circumference > 18.5 cm was identified as a risk pattern for diabetes occurrence. Conclusions: Our study showed that ARM is a useful approach for determining which combinations of variables or predictors frequently occur together in people who will develop diabetes. ARM focuses on joint exposure to different combinations of risk factors, not on the predictors alone.
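To make the idea of a "risk pattern" concrete, the sketch below scores a single association rule of the form {risk factors} → {diabetes} using the three standard ARM measures: support, confidence and lift. It is an illustration under stated assumptions, not the study's method: the abstract does not say which mining algorithm or thresholds were used, the function name `rule_metrics` is hypothetical, and the eight toy records are invented for the example.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    records    : list of sets, one set of item labels per participant
    antecedent : items on the left-hand side of the rule
    consequent : items on the right-hand side of the rule
    """
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(records)
    if n == 0:
        return 0.0, 0.0, 0.0

    n_a = sum(1 for r in records if antecedent <= r)    # antecedent present
    n_c = sum(1 for r in records if consequent <= r)    # consequent present
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)  # both

    support = n_ac / n                              # P(A and C)
    confidence = n_ac / n_a if n_a else 0.0         # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0   # P(C | A) / P(C)
    return support, confidence, lift


# Invented toy records (not study data): each set lists one person's items.
records = [
    {"IFG", "BMI>=30", "diabetes"},
    {"IFG", "BMI>=30", "diabetes"},
    {"IFG", "BMI>=30"},
    {"IFG"},
    {"BMI>=30"},
    {"diabetes"},
    set(),
    set(),
]

support, confidence, lift = rule_metrics(
    records, antecedent={"IFG", "BMI>=30"}, consequent={"diabetes"}
)
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 means the outcome is more frequent among people carrying the whole combination than in the population overall, which matches the abstract's emphasis on joint exposure rather than on single predictors.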