On the properties of bit string-based measures of chemical similarity

DR Flower

Research output: Contribution to journal › Article

Abstract

With the growth of interest in database searching and compound selection, the quantification of chemical similarity has become an area of intense practical and theoretical interest. One of the most widely used methods of measuring chemical similarity is based on mapping fragments within a molecule as bits within a binary string. We present empirical results which suggest that bit strings provide a nonintuitive encoding of molecular size, shape, and global similarity. Other results, this time statistical in nature, suggest that the observed behavior of bit string-based searches has a large nonspecific component. On this basis, we question whether bit string-based similarity methods possess all the features desirable in a quantitative chemical distance measure or metric and suggest that there are instances when they may not be the most appropriate tool for searching or segregating chemical structures.
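As an illustration of the kind of measure the abstract discusses, the following minimal sketch maps fragments to bits in a fixed-length binary string and compares two strings with the Tanimoto coefficient. The abstract does not say which coefficient, fragment definition, or string length the paper uses, so all three are assumptions here: the Tanimoto coefficient is chosen only because it is a common choice for bit strings, and the fragment labels, `N_BITS`, and the CRC-based bit assignment are hypothetical.

```python
import zlib

N_BITS = 64  # hypothetical bit string length, not taken from the paper


def fingerprint(fragments, n_bits=N_BITS):
    """Set one bit per fragment label; distinct fragments may share a bit."""
    bits = 0
    for frag in fragments:
        # CRC32 gives a deterministic bit position (an illustrative choice).
        bits |= 1 << (zlib.crc32(frag.encode("utf-8")) % n_bits)
    return bits


def tanimoto(a, b):
    """Shared on-bits divided by on-bits present in either bit string."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # convention chosen here for two empty bit strings
    return bin(a & b).count("1") / union


# Toy fragment lists (hypothetical labels, not real structural keys).
mol_a = fingerprint(["C-C", "C=O", "C-N"])
mol_b = fingerprint(["C-C", "C=O", "C-Cl"])
score = tanimoto(mol_a, mol_b)

# Hand-checkable case: 1 shared on-bit out of 3 on-bits in the union.
third = tanimoto(0b1100, 0b1010)
```

Because the score depends only on counts of set bits, larger molecules that set more bits can shift it regardless of which fragments they contain, which is one way a size dependence of the sort the abstract mentions could arise.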
Original language: English
Pages (from-to): 379-386
Number of pages: 8
Journal: Journal of Chemical Information and Computer Sciences
Volume: 38
Issue number: 3
Early online date: 4 Apr 1998
DOIs: 10.1021/ci970437z
Publication status: Published - May 1998

Cite this

@article{ab71437294694eaebf6ac27fdce5a85d,
title = "On the properties of bit string-based measures of chemical similarity",
abstract = "With the growth of interest in database searching and compound selection, the quantification of chemical similarity has become an area of intense practical and theoretical interest. One of the most widely used methods of measuring chemical similarity is based on mapping fragments within a molecule as bits within a binary string. We present empirical results which suggest that bit strings provide a nonintuitive encoding of molecular size, shape, and global similarity. Other results, this time statistical in nature, suggest that the observed behavior of bit string-based searches has a large nonspecific component. On this basis, we question whether bit string-based similarity methods possess all the features desirable in a quantitative chemical distance measure or metric and suggest that there are instances when they may not be the most appropriate tool for searching or segregating chemical structures.",
author = "DR Flower",
year = "1998",
month = "5",
doi = "10.1021/ci970437z",
language = "English",
volume = "38",
pages = "379--386",
journal = "Journal of Chemical Information and Computer Sciences",
issn = "0095-2338",
publisher = "American Chemical Society",
number = "3",

}

On the properties of bit string-based measures of chemical similarity. / Flower, DR.

In: Journal of Chemical Information and Computer Sciences, Vol. 38, No. 3, 05.1998, p. 379-386.

TY - JOUR

T1 - On the properties of bit string-based measures of chemical similarity

AU - Flower, DR

PY - 1998/5

Y1 - 1998/5

N2 - With the growth of interest in database searching and compound selection, the quantification of chemical similarity has become an area of intense practical and theoretical interest. One of the most widely used methods of measuring chemical similarity is based on mapping fragments within a molecule as bits within a binary string. We present empirical results which suggest that bit strings provide a nonintuitive encoding of molecular size, shape, and global similarity. Other results, this time statistical in nature, suggest that the observed behavior of bit string-based searches has a large nonspecific component. On this basis, we question whether bit string-based similarity methods possess all the features desirable in a quantitative chemical distance measure or metric and suggest that there are instances when they may not be the most appropriate tool for searching or segregating chemical structures.

UR - http://pubs.acs.org/doi/abs/10.1021/ci970437z

U2 - 10.1021/ci970437z

DO - 10.1021/ci970437z

M3 - Article

VL - 38

SP - 379

EP - 386

JO - Journal of Chemical Information and Computer Sciences

JF - Journal of Chemical Information and Computer Sciences

SN - 0095-2338

IS - 3

ER -