SciELO - Scientific Electronic Library Online

 

Revista Tecnología en Marcha

Online version ISSN 0379-3982; print version ISSN 0379-3982

Abstract

CALVO-VALVERDE, Luis-Alexánder. Strategy based on machine learning to deal with untagged data sets using rough sets and/or information gain. Tecnología en Marcha [online]. 2016, vol.29, suppl.2, pp.4-15. ISSN 0379-3982.  http://dx.doi.org/10.18845/tm.v29i5.2581.

As the history of humanity has shown, data of various kinds are now collected cheaply: sensors that record information every minute, web pages that store every action a user performs on the page, supermarkets that record everything their customers buy and when they buy it, and many similar examples. These large databases, however, pose a challenge to their owners: how can they take advantage of them, and how can data be turned into information for decision making? This paper presents a strategy based on machine learning to deal with unlabeled data sets using rough sets and/or information gain. A method is proposed that clusters the data using k-means while taking into account how much information each attribute provides (information gain), and that selects which attributes are essential for classifying new data and which are dispensable (rough sets). This is beneficial because it allows decisions to be made in less time.
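
The two attribute measures named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy rows, the labels (standing in for cluster assignments such as k-means would produce), and the function names `entropy`, `information_gain`, and `is_dispensable` are all assumptions made for illustration, and the attributes are assumed to be discrete.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy of `labels` minus the expected entropy after splitting on column `attr`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def is_dispensable(rows, labels, attr):
    """Rough-set style check: True if, after dropping column `attr`, no two rows
    with different labels become indistinguishable on the remaining columns."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(v for i, v in enumerate(row) if i != attr)
        if seen.setdefault(key, label) != label:
            return False
    return True


# Hypothetical toy data: two discrete attributes per row, and labels that
# stand in for cluster assignments (assumed, not taken from the paper).
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "hot"), ("rain", "mild")]
labels = [0, 0, 1, 1]

for attr in range(2):
    print(attr, information_gain(rows, labels, attr), is_dispensable(rows, labels, attr))
```

In this toy data the first attribute alone determines the label, so it carries all the information gain and cannot be dropped, while the second attribute carries none and is dispensable.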

Keywords: Machine Learning; Data Mining; Rough Sets; Entropy; Information Gain; Feature Reduction.
