Abstract
As interest in database searching and compound selection has grown, there has been a concomitant growth in interest in the quantification of chemical similarity. Described here is a computer program called DISSIM, which addresses the problem of selecting diverse subsets from larger collections of chemical compounds. It is a pragmatic solution combining a maximum dissimilarity search algorithm and a general multidimensional measure of chemical similarity based on the combination of different molecular descriptors. The problem of correlation between descriptors is addressed and appropriate schemes for weighting and normalisation are described. The specific application of these techniques to the comparative analysis of topological indices and their use in the area of chemical diversity analysis and compound selection are also described.
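The abstract does not give DISSIM's actual normalisation, weighting scheme or selection variant, so the following is only a minimal sketch of one common form of maximum dissimilarity selection (often called MaxMin). It assumes z-score scaling of each descriptor, a weighted Euclidean distance, and an arbitrarily chosen seed compound. The function names `zscore_columns`, `weighted_distance` and `maxmin_select` are hypothetical and are not taken from the program.

```python
import math


def zscore_columns(rows):
    """Scale each descriptor column to zero mean and unit variance (assumed scheme)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Population standard deviation; a constant column gets scale 1.0
    # so that the division below is always defined.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def maxmin_select(rows, k, weights=None, seed=0):
    """Pick up to k mutually dissimilar compounds, returned as row indices.

    Starting from the seed, each step adds the compound whose distance to
    its nearest already-selected compound is largest.
    """
    data = zscore_columns(rows)
    if weights is None:
        weights = [1.0] * len(data[0])  # equal weights unless stated otherwise
    selected = [seed]
    chosen = {seed}
    # min_dist[i] is the distance from compound i to its nearest selected compound.
    min_dist = [weighted_distance(row, data[seed], weights) for row in data]
    while len(selected) < min(k, len(data)):
        nxt = max(
            (i for i in range(len(data)) if i not in chosen),
            key=lambda i: min_dist[i],
        )
        selected.append(nxt)
        chosen.add(nxt)
        for i, row in enumerate(data):
            min_dist[i] = min(min_dist[i], weighted_distance(row, data[nxt], weights))
    return selected


if __name__ == "__main__":
    # Four hypothetical compounds described by two descriptors each.
    descriptors = [[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [5.0, 5.0]]
    print(maxmin_select(descriptors, k=3))
```

Passing unequal `weights` is one simple way to down-weight descriptors that are strongly correlated with others, although the abstract does not say whether DISSIM handles correlation this way.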
Original language | English |
---|---|
Pages (from-to) | 239-253 |
Number of pages | 15 |
Journal | Journal of Molecular Graphics and Modelling |
Volume | 16 |
Issue number | 4-6 |
DOIs | |
Publication status | Published - Aug 1998 |
Keywords
- chemical diversity
- compound selection
- topological index
- maximum dissimilarity search
- variable selection
- molecular descriptors