Adaptive feature selection based on the most informative graph-based features

Lu Bai*, Lixin Cui, Luca Rossi, Edwin R. Hancock, Yuhang Jiao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we propose a novel method to adaptively select the most informative and least redundant feature subset, which has strong discriminating power with respect to the target label. Unlike most traditional methods, which use vectorial features, our approach is based on graph-based features and thus incorporates the relationships between feature samples into the feature selection process. To efficiently encapsulate the main characteristics of the graph-based features, we probe each graph structure using the steady-state random walk and compute a probability distribution of the walk visiting the vertices. Furthermore, we propose a new information-theoretic criterion to measure the joint relevance of different pairwise feature combinations with respect to the target feature, through the Jensen-Shannon divergence between the probability distributions from the random walks on different graphs. By solving a quadratic programming problem, we use the new measure to automatically locate the subset of the most informative features that have both low redundancy and strong discriminating power. Unlike most existing state-of-the-art feature selection methods, the proposed information-theoretic feature selection method can accommodate both continuous and discrete target features. Experiments on the problem of P2P lending platforms in China demonstrate the effectiveness of the proposed method.
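The two building blocks named in the abstract, the steady-state random walk distribution and the Jensen-Shannon divergence, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how a feature is turned into a graph, so `feature_graph` (samples as vertices, absolute differences as edge weights) is a hypothetical construction, and the uniform fallback for an edgeless graph is likewise an assumption. The relevance criterion and the quadratic programming step are not reproduced here.

```python
import math


def feature_graph(values):
    """Hypothetical construction (not from the abstract): one vertex per
    sample, edge weight = absolute difference between sample values."""
    return [[abs(a - b) for b in values] for a in values]


def steady_state_distribution(weights):
    """Steady-state random-walk visiting probabilities on a weighted,
    undirected graph given as a symmetric matrix (list of lists).
    Vertex v is visited with probability degree(v) / sum of all degrees."""
    degrees = [sum(row) for row in weights]
    total = sum(degrees)
    if total == 0:
        # Assumption: fall back to a uniform distribution for an edgeless graph.
        return [1.0 / len(degrees)] * len(degrees)
    return [d / total for d in degrees]


def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)


def jensen_shannon_divergence(p, q):
    """JSD(P, Q) = H((P + Q) / 2) - (H(P) + H(Q)) / 2."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2


if __name__ == "__main__":
    # Two illustrative features observed on the same four samples.
    feature_a = [1.0, 2.0, 4.0, 8.0]
    feature_b = [1.0, 1.5, 1.7, 9.0]
    p = steady_state_distribution(feature_graph(feature_a))
    q = steady_state_distribution(feature_graph(feature_b))
    print(jensen_shannon_divergence(p, q))
```

Because both graphs share the same vertex set (the samples), the two distributions have equal length and can be compared directly. With natural logarithms the divergence lies between 0 and ln 2.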

Original language: English
Title of host publication: Graph-based representations in pattern recognition : 11th IAPR-TC-15 international workshop, GbRPR 2017. Proceedings
Editors: Pasquale Foggia, Cheng-Lin Liu, Mario Vento
Place of Publication: Cham (CH)
Publisher: Springer
Pages: 276-287
Number of pages: 12
ISBN (Electronic): 978-3-319-58961-9
ISBN (Print): 978-3-319-58960-2
DOI: https://doi.org/10.1007/978-3-319-58961-9_25
Publication status: Published - 2017
Event: 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2017 - Anacapri, Italy
Duration: 16 May 2017 to 18 May 2017

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 10310
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2017
Country: Italy
City: Anacapri
Period: 16/05/17 to 18/05/17

Fingerprint

Feature Selection
Feature extraction
Probability distributions
Graph in graph theory
Quadratic programming
Target
Random walk
Redundancy
Labels
Divergence Measure
Subset
Walk
Pairwise
China
Probe
Experiments
Demonstrate

Cite this

Bai, L., Cui, L., Rossi, L., Hancock, E. R., & Jiao, Y. (2017). Adaptive feature selection based on the most informative graph-based features. In P. Foggia, C-L. Liu, & M. Vento (Eds.), Graph-based representations in pattern recognition : 11th IAPR-TC-15 international workshop, GbRPR 2017. Proceedings (pp. 276-287). (Lecture Notes in Computer Science; Vol. 10310). Cham (CH): Springer. https://doi.org/10.1007/978-3-319-58961-9_25
Bai, Lu ; Cui, Lixin ; Rossi, Luca ; Hancock, Edwin R. ; Jiao, Yuhang. / Adaptive feature selection based on the most informative graph-based features. Graph-based representations in pattern recognition : 11th IAPR-TC-15 international workshop, GbRPR 2017. Proceedings. editor / Pasquale Foggia ; Cheng-Lin Liu ; Mario Vento. Cham (CH) : Springer, 2017. pp. 276-287 (Lecture Notes in Computer Science).
@inproceedings{b9e31336b5f34dbeb87ff27f5e4cda98,
title = "Adaptive feature selection based on the most informative graph-based features",
author = "Lu Bai and Lixin Cui and Luca Rossi and Hancock, {Edwin R.} and Yuhang Jiao",
year = "2017",
doi = "10.1007/978-3-319-58961-9_25",
language = "English",
isbn = "978-3-319-58960-2",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "276--287",
editor = "Pasquale Foggia and Cheng-Lin Liu and Mario Vento",
booktitle = "Graph-based representations in pattern recognition : 11th IAPR-TC-15 international workshop, GbRPR 2017. Proceedings",
volume = "10310",
address = "Cham (CH)",
}

Bai, L, Cui, L, Rossi, L, Hancock, ER & Jiao, Y 2017, Adaptive feature selection based on the most informative graph-based features. in P Foggia, C-L Liu & M Vento (eds), Graph-based representations in pattern recognition : 11th IAPR-TC-15 international workshop, GbRPR 2017. Proceedings. Lecture Notes in Computer Science, vol. 10310, Springer, Cham (CH), pp. 276-287, 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2017, Anacapri, Italy, 16/05/17. https://doi.org/10.1007/978-3-319-58961-9_25

TY - GEN
T1 - Adaptive feature selection based on the most informative graph-based features
AU - Bai, Lu
AU - Cui, Lixin
AU - Rossi, Luca
AU - Hancock, Edwin R.
AU - Jiao, Yuhang
PY - 2017
Y1 - 2017
UR - https://link.springer.com/chapter/10.1007%2F978-3-319-58961-9_25
UR - http://www.scopus.com/inward/record.url?scp=85019568244&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-58961-9_25
DO - 10.1007/978-3-319-58961-9_25
M3 - Conference contribution
AN - SCOPUS:85019568244
SN - 978-3-319-58960-2
T3 - Lecture Notes in Computer Science
SP - 276
EP - 287
BT - Graph-based representations in pattern recognition : 11th IAPR-TC-15 international workshop, GbRPR 2017. Proceedings
A2 - Foggia, Pasquale
A2 - Liu, Cheng-Lin
A2 - Vento, Mario
PB - Springer
CY - Cham (CH)
ER -

Bai L, Cui L, Rossi L, Hancock ER, Jiao Y. Adaptive feature selection based on the most informative graph-based features. In Foggia P, Liu C-L, Vento M, editors, Graph-based representations in pattern recognition : 11th IAPR-TC-15 international workshop, GbRPR 2017. Proceedings. Cham (CH): Springer. 2017. p. 276-287. (Lecture Notes in Computer Science). https://doi.org/10.1007/978-3-319-58961-9_25