Jacob Engelbrecht scite author profile

Prediction of human mRNA donor and acceptor sites from the DNA sequence

Brunak

Engelbrecht

Knudsen

1991

Journal of Molecular Biology

715

471

A Neural Network Method for Identification of Prokaryotic and Eukaryotic Signal Peptides and Prediction of their Cleavage Sites

Nielsen

Engelbrecht

Brunak

et al. 1997

Int. J. Neur. Syst.

640

458

We have developed a new method for the identification of signal peptides and their cleavage sites based on neural networks trained on separate sets of prokaryotic and eukaryotic sequences. The method performs significantly better than previous prediction schemes, and can easily be applied to genome-wide data sets. Discrimination between cleaved signal peptides and uncleaved N-terminal signal-anchor sequences is also possible, though with lower precision. Predictions can be made on a publicly available WWW server.

Prediction of O-glycosylation of mammalian proteins: specificity patterns of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase

et al. 1995

The specificity of the enzyme(s) catalysing the covalent link between the hydroxyl side chains of serine or threonine and the sugar moiety N-acetylgalactosamine (GalNAc) is unknown. Pattern recognition by artificial neural networks and weight matrix algorithms was performed to determine the exact position of in vivo O-linked GalNAc-glycosylated serine and threonine residues from the primary sequence exclusively. The acceptor sequence context for O-glycosylation of serine was found to differ from that of threonine and the two types were therefore treated separately. The context of the sites showed a high abundance of proline, serine and threonine extending far beyond the previously reported region covering positions -4 through +4 relative to the glycosylated residue. The O-glycosylation sites were found to cluster and to have a high abundance in the N-terminal part of the protein. The sites were also found to have an increased preference for three different classes of beta-turns. No simple consensus-like rule could be deduced for the complex glycosylation sequence acceptor patterns. The neural networks were trained on the hitherto largest data material consisting of 48 carefully examined mammalian glycoproteins comprising 264 O-glycosylation sites. For detection, neural network algorithms were much more reliable than weight matrices. The networks correctly found 60-95% of the O-glycosylated serine/threonine residues and 88-97% of the non-glycosylated residues in two independent test sets of known glycoproteins. A computer server using E-mail for prediction of O-glycosylation sites has been implemented and made publicly available. The Internet address is NetOglyc@cbs.dtu.dk.

Defining a similarity threshold for a functional protein sequence pattern: The signal peptide cleavage site

et al. 1996

G + C-rich tract in 5′ end of human introns

Engelbrecht

Knudsen

Brunak

1992

Journal of Molecular Biology

Protein structure and the sequential structure of mRNA: α‐helix and β‐sheet signals at the nucleotide level

Brunak

Engelbrecht

1996

Proteins

Neural network detects errors in the assignment of mRNA splice sites

Brunak

Engelbrecht

Knudsen

1990

Nucl Acids Res

The use of databanks in genetic research assumes reliability of the information they contain. Currently, error-detection in the manually or electronically entered data contained in the nucleotide sequence databanks at EMBL, Heidelberg and GenBank at Los Alamos is limited. We have used a subset of sequences from these databanks to train neural networks to recognize pre-mRNA splicing signals in human genes. During the training on 33 human genes from the EMBL databank, seven genes appeared to disturb the learning process. Subsequent investigation revealed discrepancies from the original published papers for three genes. In four genes, we found wrongly assigned splicing frames of introns. We believe this to be a reflection of the fact that splicing frames cannot always be unambiguously assigned on the basis of experimental data. Thus, incorrect assignments appear due both to mere typographical misprints and to erroneous interpretation of experiments. Training on 241 human sequences from GenBank revealed nine new errors. We propose that such errors could be detected by computer algorithms designed to check the consistency of data prior to their incorporation in databanks.
