This paper measures the social media activity of 15 broad scientific disciplines indexed in the Scopus database using Altmetric.com data. First, the presence of Altmetric.com data in the Scopus database is investigated, overall and across disciplines. Second, the correlation between the bibliometric and altmetric indices is examined using Spearman correlation. Third, a zero-truncated negative binomial model is used to measure the extent to which different bibliometric and altmetric factors are associated with increasing or decreasing citation counts. Lastly, the effectiveness of altmetric indices in identifying publications with high citation impact is evaluated using the Area Under the Curve (AUC), an application of the receiver operating characteristic. Results indicate a rapid increase in the presence of Altmetric.com data in the Scopus database, from 10.19% in 2011 to 20.46% in 2015. Blog count appears to be the most important factor, increasing the number of citations by 38.6% in the field of Health Professions and Nursing, followed by Twitter count, increasing the number of citations by 8% in the field of Physics and Astronomy. Interestingly, both Blog count and Twitter count consistently show a positive increase in the number of citations across all fields. While there was a weak positive correlation between bibliometric and altmetric indices, the results show that altmetric indices can be a good indicator for discriminating highly cited publications, with an encouraging AUC = 0.725 between highly cited publications and total altmetric count. Overall, findings suggest that altmetrics could better distinguish highly cited publications.
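To make the AUC evaluation in the abstract above concrete, the following is a minimal sketch of how an AUC between a binary "highly cited" label and a total altmetric count could be computed. It uses the rank interpretation of AUC (the probability that a randomly chosen highly cited publication has a higher score than a randomly chosen other publication). The function name `auc_from_scores` and the toy data are assumptions for illustration only; they are not taken from the paper, and the paper's own tooling is not stated.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: total altmetric count per publication and
# whether the publication is labelled as highly cited (1) or not (0).
altmetric_counts = [12, 0, 3, 40, 1, 7, 0, 25]
highly_cited = [1, 0, 0, 1, 0, 1, 0, 0]

auc = auc_from_scores(altmetric_counts, highly_cited)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.725 should be read.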
Information retrieval systems for scholarly literature rely heavily not only on text matching but also on semantic- and context-based features. Readers nowadays are deeply interested in how important an article is, its purpose, and how influential it is in follow-up research work. Numerous techniques that tap the power of machine learning and artificial intelligence have been developed to enhance retrieval of the most influential scientific literature. In this paper, we compare and improve on four existing state-of-the-art techniques designed to identify influential citations. We consider 450 citations from the Association for Computational Linguistics corpus, classified by experts as either important or unimportant, and further extract 64 features based on the methodology of four state-of-the-art techniques. We apply the Extra-Trees classifier to select the 29 best features and apply the Random Forest and Support Vector Machine classifiers to all selected techniques. Using the Random Forest classifier, our supervised model improves on the state-of-the-art method by 11.25%, with 89% Precision-Recall area under the curve. Finally, we present our deep-learning model, the Long Short-Term Memory network, that uses all 64 features to distinguish important and unimportant citations with 92.57% accuracy.
The pandemic has taken the world by storm. Almost the entire world went into lockdown to save people from the deadly COVID-19. Scientists around the world have come up with several vaccines for the virus. Among them, Pfizer, Moderna, and AstraZeneca have become quite famous. The general public, however, has been expressing its feelings about the safety and effectiveness of the vaccines on social media such as Twitter. In this study, such tweets are extracted from Twitter using a Twitter API authentication token. The raw tweets are stored and processed using NLP. The processed data is then classified using a supervised KNN classification algorithm. The algorithm classifies the data into three classes: positive, negative, and neutral. These classes refer to the sentiment of the members of the public whose tweets are extracted for analysis. From the analysis it is seen that Pfizer shows 47.29% positive, 37.5% negative, and 15.21% neutral sentiment; Moderna shows 46.16% positive, 40.71% negative, and 13.13% neutral sentiment; and AstraZeneca shows 40.08% positive, 40.06% negative, and 13.86% neutral sentiment.
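The KNN classification step described in the abstract above can be sketched as follows. This is a minimal illustration only: it assumes a bag-of-words representation and cosine distance, neither of which is stated in the abstract, and the training tweets, labels, and function names (`vectorize`, `cosine_distance`, `knn_predict`) are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words vector: lowercase whitespace tokens mapped to counts."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, text, k=3):
    """Label a text by majority vote among its k nearest training examples."""
    query = vectorize(text)
    neighbours = sorted(
        train, key=lambda item: cosine_distance(vectorize(item[0]), query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled tweets (already cleaned), for illustration only.
train = [
    ("the vaccine is safe and effective", "positive"),
    ("feeling great after my second dose", "positive"),
    ("so grateful the vaccine is safe", "positive"),
    ("terrible side effects after the shot", "negative"),
    ("the vaccine made me feel terrible", "negative"),
    ("got my appointment scheduled for monday", "neutral"),
    ("vaccine appointment is on monday", "neutral"),
]

print(knn_predict(train, "the vaccine is safe"))
```

In practice the choice of `k`, the text representation, and the distance measure all affect the resulting class shares, so percentages such as those reported above depend on those settings.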
Owing to its nature of scalability and privacy by design, federated learning (FL) has received increasing interest in decentralized deep learning. FL has also facilitated recent research on upscaling and privatizing personalized recommendation services, using on-device data to learn recommender models locally. These models are then aggregated globally to obtain a more performant model, while maintaining data privacy. Typically, federated recommender systems (FRSs) do not take into account the lack of resources and data availability at the end-devices. In addition, they assume that the interaction data between users and items is i.i.d. and stationary across end-devices (i.e., users), and that all local recommender models can be directly averaged without considering the user’s behavioral diversity. However, in real scenarios, recommendations have to be made on end-devices with sparse interaction data and limited resources. Furthermore, users’ preferences are heterogeneous and they frequently visit new items. This makes their personal preferences highly skewed, and the straightforwardly aggregated model is thus ill-posed for such non-i.i.d. data. In this paper, we propose Resource Efficient Federated Recommender System (ReFRS) to enable decentralized recommendation with dynamic and diversified user preferences. On the device side, ReFRS consists of a lightweight self-supervised local model built upon the variational autoencoder for learning a user’s temporal preference from a sequence of interacted items. On the server side, ReFRS utilizes a scalable semantic sampler to adaptively perform model aggregation within each identified cluster of similar users. The clustering module operates in an asynchronous and dynamic manner to support efficient global model update and cope with shifting user interests. As a result, ReFRS achieves superior performance in terms of both accuracy and scalability, as demonstrated by comparative experiments on real datasets.
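The idea of aggregating local models within each cluster of similar users, rather than averaging all of them into one global model, can be sketched as below. This is not the ReFRS implementation: the cluster assignments are assumed to be given (the semantic sampler and the asynchronous clustering module are not reproduced), models are reduced to flat lists of floats, and the names `aggregate_by_cluster`, `local_weights`, and `cluster_of` are hypothetical.

```python
from collections import defaultdict


def aggregate_by_cluster(local_weights, cluster_of):
    """Average client weight vectors separately within each cluster.

    local_weights: dict mapping client id -> list of floats (model parameters)
    cluster_of:    dict mapping client id -> cluster id (assumed precomputed)
    Returns a dict mapping cluster id -> element-wise mean weight vector.
    """
    grouped = defaultdict(list)
    for client_id, weights in local_weights.items():
        grouped[cluster_of[client_id]].append(weights)
    return {
        cluster: [sum(column) / len(members) for column in zip(*members)]
        for cluster, members in grouped.items()
    }


# Hypothetical example: two users with similar preferences share cluster 0,
# while a third user with very different preferences sits in cluster 1.
local_weights = {
    "u1": [1.0, 2.0],
    "u2": [3.0, 4.0],
    "u3": [10.0, 0.0],
}
cluster_of = {"u1": 0, "u2": 0, "u3": 1}

cluster_models = aggregate_by_cluster(local_weights, cluster_of)
print(cluster_models)
```

Averaging all three users together would pull the shared model toward `u3`; keeping the averages per cluster is one simple way to avoid mixing dissimilar, non-i.i.d. users.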
The use of machine learning techniques is increasing rapidly in the field of medical science for detecting various diseases such as liver disease (LD). Around the globe, a large number of people die because of this deadly disease. Diagnosing the disease at an early stage allows early treatment that can help cure the patient. In this research paper, a method is proposed to diagnose LD using supervised machine learning classification algorithms, namely logistic regression, decision tree, random forest, AdaBoost, KNN, linear discriminant analysis, gradient boosting, and support vector machine (SVM). We also apply a least absolute shrinkage and selection operator (LASSO) feature selection technique to our dataset to suggest the most highly correlated attributes of LD. The predictions made by the algorithms with 10-fold cross-validation (CV) are evaluated in terms of accuracy, sensitivity, precision, and f1-score. It is observed that the decision tree algorithm has the best performance, with accuracy, precision, sensitivity, and f1-score values of 94.295%, 92%, 99%, and 96% respectively with the inclusion of LASSO. Furthermore, a comparison with recent studies is shown to demonstrate the significance of the proposed system.
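For reference, the four evaluation measures named in the abstract above (accuracy, precision, sensitivity, and f1-score) can be computed from binary predictions as in the following minimal sketch. The function name `binary_metrics` and the toy labels are assumptions for illustration; they are not the paper's data or code.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity (recall), and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical labels: 1 = liver disease, 0 = no liver disease.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Under 10-fold cross-validation, these measures would typically be computed on each held-out fold and then averaged, although the abstract does not say how the folds were combined.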
Network embedding aims to learn vector representations of vertices that preserve both network structures and properties. However, most existing embedding methods fail to scale to large networks. A few frameworks have been proposed that extend existing methods to cope with network embedding on large-scale networks. These frameworks update the global parameters iteratively or compress the network while learning vector representations. Such network embedding schemes inevitably incur either high communication overhead or sub-optimal embedding quality. In this paper, we propose a novel decentralized large-scale network embedding framework called DeLNE. As the name suggests, DeLNE divides a network into smaller partitions and learns vector representations in a distributed fashion, avoiding any unnecessary communication overhead. First, our proposed framework uses Variational Graph Convolution Auto-Encoders to embed the structure and properties of each sub-network. Second, we propose an embedding aggregation mechanism that captures the global properties of each node. Third, we propose an alignment function that reconciles all sub-network embeddings into the same vector space. Due to the parallel nature of DeLNE, it scales well on large clustered environments. Through extensive experimentation on realistic datasets, we show that DeLNE produces high-quality embeddings and outperforms existing large-scale network embedding frameworks in terms of both efficiency and effectiveness.