This package covers two main areas: Rough Set Theory (RST) and Fuzzy Rough Set Theory (FRST). RST was introduced by Z. Pawlak (1982, 1991) and provides mathematical tools for modeling and analyzing information systems that involve uncertainty and imprecision. Because it relies on the indiscernibility relation among objects, RST does not require additional parameters to extract information. FRST, an extension of RST, was introduced by D. Dubois and H. Prade (1990) as a combination of RST and the fuzzy sets proposed by L. A. Zadeh (1965). It allows continuous attributes to be analyzed without first discretizing the data. Based on these concepts, many methods have been proposed and applied in several different domains. To solve the problems in those domains, the methods employ the indiscernibility relation and the concepts of lower and upper approximation. The methods implemented in this package are explained below, grouped by domain. The following is a list of the domains considered in this package:
- Basic concepts of RST and FRST: This part can be divided into four tasks: the indiscernibility relation, lower and upper approximations, the positive region, and the discernibility matrix.
- Discretization: This task converts the real-valued attributes of an information system into nominal/symbolic ones. From the RST point of view, it attempts to maintain the discernibility between objects.
- Feature selection: This is the process of finding a subset of features that attempts to obtain the same quality as the complete feature set. In other words, its purpose is to select the significant features and eliminate the dispensable ones. It is a useful and necessary process for datasets that contain large numbers of features. From the RST and FRST perspective, feature selection refers to searching for superreducts and reducts.
- Instance selection: This process aims to remove noisy, superfluous, or inconsistent instances from training datasets while retaining the consistent ones. The goal is to achieve good classification accuracy by removing instances that do not contribute positively.
- Prediction/classification: This task predicts the decision values of a new dataset (test data). Two kinds of methods are considered for this task: rule-based classifiers and nearest neighbor-based classifiers.
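
The basic RST concepts in the list above (indiscernibility relation, lower and upper approximations, positive region) can be illustrated with a minimal Python sketch. It is not this package's API: the toy decision table, the attribute names (`temp`, `headache`, `flu`), and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Toy decision table (hypothetical data): two condition attributes, one decision.
TABLE = [
    {"temp": "high", "headache": "yes", "flu": "yes"},
    {"temp": "high", "headache": "yes", "flu": "no"},
    {"temp": "normal", "headache": "no", "flu": "no"},
    {"temp": "high", "headache": "no", "flu": "yes"},
    {"temp": "normal", "headache": "yes", "flu": "no"},
]


def indiscernibility_classes(table, attributes):
    """Group object indices that have identical values on `attributes`."""
    groups = defaultdict(set)
    for idx, row in enumerate(table):
        groups[tuple(row[a] for a in attributes)].add(idx)
    return list(groups.values())


def approximations(table, attributes, concept):
    """Return the (lower, upper) approximations of the index set `concept`."""
    lower, upper = set(), set()
    for eq_class in indiscernibility_classes(table, attributes):
        if eq_class <= concept:   # class lies entirely inside the concept
            lower |= eq_class
        if eq_class & concept:    # class overlaps the concept
            upper |= eq_class
    return lower, upper


def positive_region(table, attributes, decision):
    """Union of the lower approximations of all decision classes."""
    region = set()
    for decision_class in indiscernibility_classes(table, [decision]):
        lower, _ = approximations(table, attributes, decision_class)
        region |= lower
    return region


conditions = ["temp", "headache"]
flu_yes = {i for i, row in enumerate(TABLE) if row["flu"] == "yes"}
lower, upper = approximations(TABLE, conditions, flu_yes)
pos = positive_region(TABLE, conditions, "flu")
print(lower, upper, pos)
```

Objects 0 and 1 share the same condition values but differ in their decision, so they are indiscernible and inconsistent: they fall outside the lower approximation of the concept and outside the positive region, but inside the upper approximation.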