Hongfei Lin scite author profile

BioWordVec, improving biomedical word embeddings with subword information and MeSH

Distributed word representations have become an essential foundation for biomedical natural language processing (BioNLP), text mining and information retrieval. Word embeddings are traditionally computed at the word level from a large corpus of unlabeled text, ignoring the information present in the internal structure of words and any information available in domain-specific structured resources such as ontologies. However, such information holds potential for greatly improving the quality of the word representation, as suggested by some recent studies in the general domain. Here we present BioWordVec: an open set of biomedical word vectors/embeddings that combines subword information from unlabeled biomedical text with a widely used biomedical controlled vocabulary called Medical Subject Headings (MeSH). We assess both the validity and utility of our generated word embeddings over multiple NLP tasks in the biomedical domain. Our benchmarking results demonstrate that our word embeddings yield significantly improved performance over the previous state of the art on these challenging tasks.

An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition

Luo, Yang, et al. 2017

Drug drug interaction extraction from biomedical literature using syntax convolutional neural network

et al. 2016

Motivation: Detecting drug-drug interaction (DDI) has become a vital part of public health safety. Therefore, using text mining techniques to extract DDIs from biomedical literature has received great attention. However, this research is still at an early stage and its performance still has much room for improvement. Results: In this article, we present a syntax convolutional neural network (SCNN) based DDI extraction method. In this method, a novel word embedding, syntax word embedding, is proposed to employ the syntactic information of a sentence. Then position and part-of-speech features are introduced to extend the embedding of each word. Later, an auto-encoder is introduced to encode the traditional bag-of-words feature (a sparse 0–1 vector) as a dense real-valued vector. Finally, a combination of embedding-based convolutional features and traditional features is fed to the softmax classifier to extract DDIs from biomedical literature. Experimental results on the DDIExtraction 2013 corpus show that SCNN achieves better performance (an F-score of 0.686) than other state-of-the-art methods. Availability and Implementation: The source code is available for academic use at http://202.118.75.18:8080/DDI/SCNN-DDI.zip. Contact: yangzh@dlut.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.

Concurrent Non-malleable Commitments from Any One-Way Function

Detection of Depression-Related Posts in Reddit Social Media Forum

et al. 2019

Obfuscation of Probabilistic Circuits and Applications

Canetti, Lin, Tessaro, et al. 2015

This paper studies the question of how to define, construct, and use obfuscators for probabilistic programs. Such obfuscators compile a possibly randomized program into a deterministic one, which achieves computationally indistinguishable behavior from the original program as long as it is run on each input at most once. For obfuscation, we propose a notion that extends indistinguishability obfuscation to probabilistic circuits: It should be hard to distinguish between the obfuscations of any two circuits whose output distributions at each input are computationally indistinguishable, possibly in the presence of some auxiliary input. We call the resulting notion probabilistic indistinguishability obfuscation (pIO).

We define several variants of pIO, using different approaches to formalizing the above security requirement, and study non-trivial relations among them. Moreover, we give a construction of one of our pIO variants from sub-exponentially hard indistinguishability obfuscation (for deterministic circuits) and one-way functions, and conjecture this construction to be a good candidate for other pIO variants. We then move on to show a number of applications of pIO:

• We give a general and natural methodology to achieve leveled homomorphic encryption (LHE) from variants of semantically secure encryption schemes and of pIO. In particular, we propose instantiations from lossy and re-randomizable encryption schemes, assuming the two weakest notions of pIO.
• We enhance the above constructions to obtain a full-fledged (i.e., non-leveled) FHE scheme under the same (or slightly stronger) assumptions. In particular, this constitutes the first construction of full-fledged FHE that does not rely on encryption with circular security.
• Finally, we show that assuming sub-exponentially secure puncturable PRFs computable in NC¹, sub-exponentially secure indistinguishability obfuscation for (deterministic) NC¹ circuits can be bootstrapped to obtain indistinguishability obfuscation for arbitrary (deterministic) poly-size circuits.

Drug–drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths

et al. 2017

Motivation: Adverse events resulting from drug-drug interactions (DDI) pose a serious health issue. The ability to automatically extract DDIs described in the biomedical literature could further efforts for ongoing pharmacovigilance. Most neural network-based methods focus on the sentence sequence to identify these DDIs; however, the shortest dependency path (SDP) between the two entities contains valuable syntactic and semantic information. Effectively exploiting such information may improve DDI extraction. Results: In this article, we present a hierarchical recurrent neural network (RNN)-based method to integrate the SDP and sentence sequence for the DDI extraction task. First, the sentence sequence is divided into three subsequences. Then, the bottom RNN model is employed to learn the feature representation of the subsequences and SDP, and the top RNN model is employed to learn the feature representation of both the sentence sequence and SDP. Furthermore, we introduce the embedding attention mechanism to identify and enhance keywords for the DDI extraction task. We evaluate our approach using the DDI extraction 2013 corpus. Our method is competitive or superior in performance compared with other state-of-the-art methods. Experimental results show that the sentence sequence and SDP are complementary to each other. Integrating the sentence sequence with SDP can effectively improve the DDI extraction performance. Availability and implementation: The experimental data is available at https://github.com/zhangyijia1979/hierarchical-RNNs-model-for-DDI-extraction. Supplementary information: Supplementary data are available at Bioinformatics online.

After-the-Fact Leakage in Public-Key Encryption

Halevi, Lin 2011

Abstract. What does it mean for an encryption scheme to be leakage-resilient? Prior formulations require that the scheme remains semantically secure even in the presence of leakage, but only considered leakage that occurs before the challenge ciphertext is generated. Although seemingly necessary, this restriction severely limits the usefulness of the resulting notion. In this work we study after-the-fact leakage, namely leakage that the adversary obtains after seeing the challenge ciphertext. We seek a "natural" and realizable notion of security, which is usable in higher-level protocols and applications. To this end, we formulate entropic leakage-resilient PKE. This notion captures the intuition that as long as the entropy of the encrypted message is higher than the amount of leakage, the message still has some (pseudo) entropy left. We show that this notion is realized by the Naor-Segev constructions (using hash proof systems). We demonstrate that entropic leakage-resilience is useful by showing a simple construction that uses it to get semantic security in the presence of after-the-fact leakage, in a model of bounded memory leakage from a split state.
