A huge number of informal messages are posted every day on social network sites, blogs and discussion forums. Emotions often seem to be important in these texts, whether for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behaviour to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviours. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments, and with a lookup table of term sentiment strengths optimised by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1-5. The former, but not the latter, is better than a baseline and a wide range of general machine learning approaches.
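The dual-scale lookup-table approach can be illustrated with a toy sketch. The lexicon, booster words and emoticon scores below are invented stand-ins for illustration, not SentiStrength's actual term lists or weights:

```python
# Toy lexicon-based sentiment strength detection: each text gets a
# positive score (1 to 5) and a negative score (-1 to -5), following the
# dual-scale idea described above. All term weights here are invented.
import re

LEXICON = {"love": 3, "great": 2, "happy": 2, "hate": -4, "awful": -3, "sad": -2}
BOOSTERS = {"very": 1, "really": 1}   # strengthen the following term
EMOTICONS = {":)": 2, ":(": -2}       # part of the de facto grammar of cyberspace

def sentiment_strength(text):
    # Tokenise into words and the two toy emoticons.
    tokens = re.findall(r"[a-z']+|[:;][()]", text.lower())
    pos, neg = 1, -1                  # neutral baseline on each scale
    boost = 0
    for tok in tokens:
        if tok in BOOSTERS:
            boost = BOOSTERS[tok]
            continue
        score = LEXICON.get(tok, EMOTICONS.get(tok, 0))
        if score > 0:
            pos = max(pos, min(5, score + boost))
        elif score < 0:
            neg = min(neg, max(-5, score - boost))
        boost = 0
    # A fuller implementation would also normalise informal spelling
    # (e.g. "soooo happy") before lookup, as the article describes.
    return pos, neg
```

Here `sentiment_strength("I really love this :)")` yields a strong positive score alongside the neutral negative baseline.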
Altmetric measurements derived from the social web are increasingly advocated and used as early indicators of article impact and usefulness. Nevertheless, there is a lack of systematic scientific evidence that altmetrics are valid proxies of either impact or utility although a few case studies have reported medium correlations between specific altmetrics and citation rates for individual journals or fields. To fill this gap, this study compares 11 altmetrics with Web of Science citations for 76 to 208,739 PubMed articles with at least one altmetric mention in each case and up to 1,891 journals per metric. It also introduces a simple sign test to overcome biases caused by different citation and usage windows. Statistically significant associations were found between higher metric scores and higher citations for articles with positive altmetric scores in all cases with sufficient evidence (Twitter, Facebook wall posts, research highlights, blogs, mainstream media and forums) except perhaps for Google+ posts. Evidence was insufficient for LinkedIn, Pinterest, question and answer sites, and Reddit, and no conclusions should be drawn about articles with zero altmetric scores or the strength of any correlation between altmetrics and citations. Nevertheless, comparisons between citations and metric values for articles published at different times, even within the same year, can remove or reverse this association and so publishers and scientometricians should consider the effect of time when using altmetrics to rank articles. Finally, the coverage of all the altmetrics except for Twitter seems to be low and so it is not clear if they are prevalent enough to be useful in practice.
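The sign-test idea can be sketched as follows. The pairing shown here (caller-supplied pairs of citation counts for a higher-metric and a lower-metric article) is a simplified stand-in for the paper's matching of articles by publication window:

```python
# Sketch of a sign test for the altmetric-citation association: for
# pairs of similar-age articles, count how often the article with the
# higher altmetric score also has more citations, then test the
# win/loss counts against chance (P = 0.5) with an exact two-sided
# binomial test. Ties carry no information and are discarded.
from math import comb

def sign_test(pairs):
    """pairs: [(citations_of_higher_metric_article,
                citations_of_lower_metric_article), ...]"""
    wins = sum(1 for hi, lo in pairs if hi > lo)
    losses = sum(1 for hi, lo in pairs if hi < lo)
    n = wins + losses
    k = max(wins, losses)
    # Exact two-sided binomial p-value under H0: P(win) = 0.5.
    p = min(1.0, 2 * sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n)
    return wins, losses, p
```

Because only the sign of each within-pair difference is used, differing citation and usage windows across articles do not bias the test, which is the motivation given in the abstract.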
Sentiment analysis is concerned with the automatic extraction of sentiment-related information from text. Although most sentiment analysis addresses commercial tasks, such as extracting opinions from product reviews, there is increasing interest in the affective dimension of the social web, and Twitter in particular. Most sentiment analysis algorithms are not ideally suited for this task because they exploit indirect indicators of sentiment that can reflect genre or topic instead. Hence, such algorithms used to process social web texts can identify spurious sentiment patterns caused by topics rather than affective phenomena. This article assesses an improved version of the algorithm SentiStrength for sentiment strength detection across the social web that primarily uses direct indications of sentiment. The results from six diverse social web data sets (MySpace, Twitter, YouTube, Digg, Runners World, BBC Forums) indicate that SentiStrength 2 is successful in the sense of performing better than a baseline approach for all data sets in both supervised and unsupervised cases. SentiStrength is not always better than machine learning approaches that exploit indirect indicators of sentiment, however, and is particularly weaker for positive sentiment in news-related discussions. Overall, the results suggest that, even unsupervised, SentiStrength is robust enough to be applied to a wide variety of different social web contexts.
The microblogging site Twitter generates a constant stream of communication, some of which concerns events of general interest. An analysis of Twitter may, therefore, give insights into why particular events resonate with the population. This article reports a study of a month of English Twitter posts, assessing whether popular events are typically associated with increases in sentiment strength, as seems intuitively likely. Using the top 30 events, determined by a measure of relative increase in (general) term usage, the results give strong evidence that popular events are normally associated with increases in negative sentiment strength and some evidence that peaks of interest in events have stronger positive sentiment than the time before the peak. It seems that many positive events, such as the Oscars, are capable of generating increased negative sentiment in reaction to them. Nevertheless, the surprisingly small average change in sentiment associated with popular events (typically 1% and only 6% for Tiger Woods' confessions) is consistent with events affording posters opportunities to satisfy preexisting personal goals more often than eliciting instinctive reactions.
Despite citation counts from Google Scholar (GS), Web of Science (WoS), and Scopus being widely consulted by researchers and sometimes used in research evaluations, there is no recent or systematic evidence about the differences between them. In response, this paper investigates 2,448,055 citations to 2,299 English-language highly-cited documents from 252 GS subject categories published in 2006, comparing GS, the WoS Core Collection, and Scopus. GS consistently found the largest percentage of citations across all areas (93%-96%), far ahead of Scopus (35%-77%) and WoS (27%-73%). GS found nearly all the WoS (95%) and Scopus (92%) citations. Most citations found only by GS were from non-journal sources (48%-65%), including theses, books, conference papers, and unpublished materials. Many were non-English (19%-38%), and they tended to be much less cited than citing sources that were also in Scopus or WoS. Despite the many unique GS citing sources, Spearman correlations between citation counts in GS and WoS or Scopus are high (0.78-0.99). They are lower in the Humanities, and lower between GS and WoS than between GS and Scopus. The results suggest that in all areas GS citation data is essentially a superset of WoS and Scopus, with substantial extra coverage.
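The Spearman correlations reported here are Pearson correlations of ranks, which suits heavily skewed citation counts. A minimal from-scratch sketch, with tie-aware average ranking since citation data contains many tied counts:

```python
# Spearman's rank correlation for comparing citation counts across
# databases: rank both lists (averaging ranks within tie groups) and
# take the Pearson correlation of the ranks.
def _ranks(xs):
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                      # extend the tie group
        avg = (i + j) / 2 + 1           # average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

A rank-based coefficient is the natural choice here because a handful of very highly cited papers would dominate an ordinary Pearson correlation of raw counts.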
Data collected by social media platforms have been introduced as new sources for indicators to help measure the impact of scholarly research in ways that are complementary to traditional citation analysis. Data generated from social media activities can be used to reflect broad types of impact. This article aims to provide systematic evidence about how often Twitter is used to disseminate information about journal articles in the biomedical sciences. The analysis is based on 1.4 million documents covered by both PubMed and Web of Science and published between 2010 and 2012. The number of tweets containing links to these documents was analyzed and compared to citations to evaluate the degree to which certain journals, disciplines, and specialties were represented on Twitter and how far tweets correlate with citation impact. With less than 10% of PubMed articles mentioned on Twitter, its uptake is low in general but differs between journals and specialties. Correlations between tweets and citations are low, implying that impact metrics based on tweets are different from those based on citations. A framework using the coverage of articles and the correlation between Twitter mentions and citations is proposed to facilitate the evaluation of novel social-media-based metrics.
Background: Researchers and practitioners have developed numerous online interventions that encourage people to reduce their drinking, increase their exercise, and better manage their weight. Motivations to develop eHealth interventions may be driven by the Internet’s reach, interactivity, cost-effectiveness, and studies that show online interventions work. However, when designing online interventions suitable for public campaigns, there are few evidence-based guidelines, taxonomies are difficult to apply, many studies lack impact data, and prior meta-analyses are not applicable to large-scale public campaigns targeting voluntary behavioral change. Objectives: This meta-analysis assessed online intervention design features in order to inform the development of online campaigns, such as those employed by social marketers, that seek to encourage voluntary health behavior change. A further objective was to increase understanding of the relationships between intervention adherence, study adherence, and behavioral outcomes. Methods: Drawing on systematic review methods, a combination of 84 query terms was used in 5 bibliographic databases with additional gray literature searches. This resulted in 1271 abstracts and papers; 31 met the inclusion criteria. In total, 29 papers describing 30 interventions were included in the primary meta-analysis, with the 2 additional studies qualifying for the adherence analysis. Using a random effects model, the first analysis estimated the overall effect size, including groupings by control conditions and time factors. The second analysis assessed the impacts of psychological design features that were coded with taxonomies from evidence-based behavioral medicine, persuasive technology, and other behavioral influence fields. These separate systems were integrated into a coding framework model called the communication-based influence components model.
Finally, the third analysis assessed the relationships between intervention adherence and behavioral outcomes. Results: The overall impact of online interventions across all studies was small but statistically significant (standardized mean difference effect size d = 0.19, 95% confidence interval [CI] = 0.11 - 0.28, P < .001, number of interventions k = 30). The largest impact with a moderate level of efficacy was exerted from online interventions when compared with waitlists and placebos (d = 0.28, 95% CI = 0.17 - 0.39, P < .001, k = 18), followed by comparison with lower-tech online interventions (d = 0.16, 95% CI = 0.00 - 0.32, P = .04, k = 8); no significant difference was found when compared with sophisticated print interventions (d = –0.11, 95% CI = –0.34 to 0.12, P = .35, k = 4), though online interventions offer a small effect with the advantage of lower costs and larger reach. Time proved to be a critical factor, with shorter interventions generally achieving larger impacts and greater adherence. For psychological design, most interventions drew from the transtheoretical approach and were goal orientated, deploying numerous influence compon...
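The standardized mean difference d reported above is computed per study before being pooled by the random effects model. A minimal sketch of Cohen's d with its approximate 95% CI for a single intervention-vs-control comparison, using invented numbers (the random-effects pooling itself is not shown):

```python
# Cohen's d (standardized mean difference) with an approximate 95% CI,
# the per-study input that a random-effects meta-analysis pools.
# Standard large-sample formulae; all example data is invented.
from math import sqrt

def cohens_d(m1, s1, n1, m2, s2, n2):
    # Pooled standard deviation across the two groups.
    s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / s_pooled
    # Large-sample variance of d, then a normal-approximation 95% CI.
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    se = sqrt(var_d)
    return d, (d - 1.96 * se, d + 1.96 * se)
```

A CI that excludes zero, as for the overall d = 0.19 above, is what makes such a pooled effect statistically significant.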