

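  1. Wang Guoyin, Yao Yiyu, Yu Hong. A survey on rough set theory and applications [J]. Chinese Journal of Computers, 2009, 32(7).
  2. A plain-language explanation of some rough set concepts
  3. A blog post by an expert, though it is entirely in English
  4. Wikipedia, which can be accessed directly
  5. The "finding a good job" example on Baidu Baike is worth a look
  6. Worth a quick look as well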


Zhang Wenxiu, Wu Weizhi, Liang Jiye, Li Deyu. Rough Set Theory and Methods [M]. Beijing: Science Press, 2001.



  • Information table
Object   Headache   Muscle pain   Temperature   Flu
x1                                normal
x3                                very high
x4                                normal
x6                                very high

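The information table M can be expressed formally as the quadruple $M = (U, A_t, \{V_a \mid a \in A_t\}, \{I_a \mid a \in A_t\})$. In Table 1, $U = \{x_1, x_2, \dots, x_6\}$ is a finite non-empty set of objects, also called the universe, and $A_t = \{\text{Headache}, \text{Muscle pain}, \text{Temperature}, \text{Flu}\}$ is a finite non-empty set of attributes. $V_a$ is the range of values of attribute $a \in A_t$, that is, the domain of $a$, and $I_a : U \to V_a$ is an information function. If $A \subseteq A_t$, then $I_A(x)$ denotes the values of object $x \in U$ on the attributes in $A$.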

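  • Concept, intension (formula), extension
    A concept in the information table M is a pair $(\phi, m(\phi))$. The intension of the concept is $\phi$, a description in M of the object subset $m(\phi)$; the extension of the concept is $m(\phi)$, the set of all objects that satisfy the formula $\phi$.
    Here $\phi$ is a formula and also the intension of the concept. An example is (Headache = yes) ∧ (Muscle pain = no): atomic formulas of the form (attribute = value) are combined with logical connectives such as ∧ and ∨. The extension is the subset of objects determined by the formula. A formula also determines a partition: it splits all objects in the universe into those that satisfy it and those that do not.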
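
    As a rough illustration of how a formula determines its extension, here is a minimal sketch that evaluates $m(\phi)$ for a conjunction of (attribute = value) atoms. The yes/no values in `TABLE` and the name `extension` are assumptions made for this example, not data from the information table.

```python
# Hypothetical table: the yes/no values below are assumed for illustration.
TABLE = {
    "x1": {"headache": "yes", "muscle_pain": "yes"},
    "x2": {"headache": "yes", "muscle_pain": "yes"},
    "x3": {"headache": "yes", "muscle_pain": "yes"},
    "x4": {"headache": "no", "muscle_pain": "yes"},
    "x5": {"headache": "no", "muscle_pain": "no"},
    "x6": {"headache": "no", "muscle_pain": "yes"},
}


def extension(table, formula):
    """Return m(phi), with phi given as a conjunction {attribute: value}."""
    return {
        obj
        for obj, row in table.items()
        if all(row[attr] == value for attr, value in formula.items())
    }


# A concept is the pair (intension, extension).
phi = {"headache": "no", "muscle_pain": "yes"}
concept = (phi, extension(TABLE, phi))
```

    Only conjunctions are handled here; supporting ∨ and ¬ would need a small formula tree instead of a dictionary.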

  • Condition attributes and decision attributes

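  • Definable sets and languages
    We use the symbol $L(A)$ to denote the language defined by the attribute subset A.
    In the information table M, a subset $X \subseteq U$ is said to be definable by the attribute subset $A \subseteq A_t$ if and only if there is a formula $\phi$ in the language $L(A)$ such that $X = m(\phi)$. Otherwise X is said to be undefinable.
    Consider the attribute subset A = {Headache, Muscle pain}. As I understand it, the language $L(A)$ is a set of formulas containing every formula that can be written over A. For the information table above, its atomic formulas are {(Headache = yes), (Headache = no), (Muscle pain = yes), (Muscle pain = no)}, and every other formula in $L(A)$ is built from these with logical connectives.
    The family of all definable sets is then $Def(U, L(A)) = \{m(\phi) \mid \phi \in L(A)\}$. If the extension of a concept can be expressed concisely by a logical formula, it is a definable concept; from this point of view, the extension of a concept is a definable set.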

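  • Equivalence classes
    If two objects $x_i, x_j$ are equivalent, they are described by the same formulas in the language $L(A)$; in other words, they have the same value on every attribute in A. The definable sets obtained above are exactly the unions of blocks of the partition that the equivalence relation E(A) on attribute set A induces on the universe U, written $U/E(A) = \{[x]_{E(A)} \mid x \in U\}$, where $[x]_{E(A)}$ is the equivalence class determined by E(A). Objects in the same equivalence class are indiscernible, so the equivalence relation is sometimes also called the indiscernibility relation. In the example above, for the attribute subset A = {Headache, Muscle pain}, $U/E(A) = \{\{x_1, x_2, x_3\}, \{x_4, x_6\}, \{x_5\}\}$.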
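
    The partition $U/E(A)$ can be computed by grouping objects on their attribute values. The sketch below assumes yes/no values for Headache and Muscle pain that are consistent with the partition stated above; the values themselves and the function name `partition` are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical values, chosen only to be consistent with the stated U/E(A).
TABLE = {
    "x1": {"headache": "yes", "muscle_pain": "yes"},
    "x2": {"headache": "yes", "muscle_pain": "yes"},
    "x3": {"headache": "yes", "muscle_pain": "yes"},
    "x4": {"headache": "no", "muscle_pain": "yes"},
    "x5": {"headache": "no", "muscle_pain": "no"},
    "x6": {"headache": "no", "muscle_pain": "yes"},
}


def partition(table, attrs):
    """Group objects whose values agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


blocks = partition(TABLE, ["headache", "muscle_pain"])
```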

  • Upper and lower approximations
    An undefinable set clearly cannot be described exactly by any formula; it can only be characterized by bounding it from above and below, which is what the upper and lower approximation operators of rough set theory do. Let E(A) be an equivalence relation on the information table M and $X \subseteq U$. The upper and lower approximation operators $\overline{apr(X)}$ and $\underline{apr(X)}$ are defined as:
    $\overline{apr(X)} = \cup\{Y \mid Y \in U/E(A), Y \cap X \neq \emptyset\} = \cap\{Y \mid Y \in Def(U, L(A)), X \subseteq Y\}$
    $\underline{apr(X)} = \cup\{Y \mid Y \in U/E(A), Y \subseteq X\} = \cup\{Y \mid Y \in Def(U, L(A)), Y \subseteq X\}$
    The upper approximation $\overline{apr(X)}$ is the smallest definable set containing X, and the lower approximation $\underline{apr(X)}$ is the largest definable set contained in X.
    By definition, a definable set has identical upper and lower approximations. The pair of approximation operators is built on top of definability; in other words, upper and lower approximations are needed only when a set is undefinable.
    Computing the lower approximation: the lower approximation is the union of all equivalence classes in $[x]_P$ which are contained by (i.e., are subsets of) the target set.
    Computing the upper approximation: the upper approximation is the union of all equivalence classes in $[x]_P$ which have non-empty intersection with the target set.
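
    The two computation rules translate almost directly into code. This sketch reuses the partition $U/E(A)$ given earlier for A = {Headache, Muscle pain}; the target set `X` and the function names are hypothetical choices for the example.

```python
# U/E(A) for A = {Headache, Muscle pain}, as stated in the notes.
PARTITION = [{"x1", "x2", "x3"}, {"x4", "x6"}, {"x5"}]


def lower_approximation(partition, target):
    """Union of the equivalence classes that are subsets of target."""
    result = set()
    for block in partition:
        if block <= target:
            result |= block
    return result


def upper_approximation(partition, target):
    """Union of the equivalence classes that intersect target."""
    result = set()
    for block in partition:
        if block & target:
            result |= block
    return result


X = {"x1", "x2", "x3", "x4"}  # hypothetical target set
lower = lower_approximation(PARTITION, X)  # {x1, x2, x3}
upper = upper_approximation(PARTITION, X)  # {x1, x2, x3, x4, x6}
```

    For a definable set such as {x5} the two functions return the same set, which matches the remark that definable sets have identical upper and lower approximations.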

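  • Positive and negative regions
    For a subset $X \subseteq U$, the universe is divided into three regions:
    (1) the positive region of X:
    $POS(X) = \underline{apr(X)}$
    (2) the negative region of X:
    $NEG(X) = POS(\sim X) = U - \overline{apr(X)}$
    (3) the boundary region of X:
    $BND(X) = \overline{apr(X)} - \underline{apr(X)}$
    If BND(X) is empty, the set X is said to be crisp with respect to the relation E(A); conversely, if BND(X) is non-empty, X is said to be rough with respect to E(A).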
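
    A small sketch of the three regions, assuming the standard identities POS(X) = lower approximation, NEG(X) = U minus the upper approximation, and BND(X) = upper minus lower. The partition is the $U/E(A)$ given earlier; the target set and the name `regions` are hypothetical.

```python
# U/E(A) for A = {Headache, Muscle pain}, as stated in the notes.
PARTITION = [{"x1", "x2", "x3"}, {"x4", "x6"}, {"x5"}]


def regions(partition, target):
    """Return (POS, NEG, BND) of target with respect to the partition."""
    universe = set().union(*partition)
    lower = set().union(*(b for b in partition if b <= target))
    upper = set().union(*(b for b in partition if b & target))
    return lower, universe - upper, upper - lower


X = {"x1", "x2", "x3", "x4"}  # hypothetical target set
pos, neg, bnd = regions(PARTITION, X)
is_rough = bool(bnd)  # a non-empty boundary means X is rough
```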

  • Rough sets
    In Pawlak's definition, the collection of equivalence classes $[x]_{E(A)}$ determined by the equivalence relation forms the P1-rough set collection (P1-Rough Set, PRS1). Clearly, the P1-rough set collection is a collection of subsets, namely $PRS1 = \{[x]_{E(A)} \mid X \subseteq 2^U\}$. An equivalent definition of rough sets can also be given, called the P2-rough set collection: $PRS2 = \{\langle X_1, X_2 \rangle\} = \{\langle \underline{apr(X)}, \overline{apr(X)} \rangle\}$. PRS1 and PRS2 are together called Pawlak rough sets.
    The tuple $\langle \underline{apr(X)}, \overline{apr(X)} \rangle$ composed of the lower and upper approximation is called a rough set; thus, a rough set is composed of two crisp sets, one representing a lower boundary of the target set X, and the other representing an upper boundary of the target set X.

  • Accuracy of a rough set
    The accuracy of the rough-set representation of the set X is $\alpha_P(X) = \frac{|\underline{apr(X)}|}{|\overline{apr(X)}|}$. That is, the accuracy of the rough set representation of X, $\alpha_P(X)$, with $0 \leq \alpha_P(X) \leq 1$, is the ratio of the number of objects which can positively be placed in X to the number of objects that can possibly be placed in X. This provides a measure of how closely the rough set is approximating the target set.
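
    As a worked instance of the ratio, the sketch below computes $\alpha_P(X)$ exactly with `fractions.Fraction`. The partition is the $U/E(A)$ given earlier; the target set is hypothetical, and returning 1 when the upper approximation is empty is an assumed convention rather than something stated in the definition.

```python
from fractions import Fraction

# U/E(A) for A = {Headache, Muscle pain}, as stated in the notes.
PARTITION = [{"x1", "x2", "x3"}, {"x4", "x6"}, {"x5"}]


def accuracy(partition, target):
    """Return |lower approximation| / |upper approximation| of target."""
    lower = set().union(*(b for b in partition if b <= target))
    upper = set().union(*(b for b in partition if b & target))
    if not upper:
        return Fraction(1)  # assumed convention for an empty target set
    return Fraction(len(lower), len(upper))


X = {"x1", "x2", "x3", "x4"}  # hypothetical target set
alpha = accuracy(PARTITION, X)  # lower has 3 objects, upper has 5
```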

  • Reduct and core
    Formally, a reduct is a subset of attributes $\mathrm{RED} \subseteq P$ such that:

  • $[x]_{\mathrm{RED}} = [x]_P$, that is, the equivalence classes induced by the reduced attribute set $\mathrm{RED}$ are the same as the equivalence class structure induced by the full attribute set P.

  • the attribute set $\mathrm{RED}$ is minimal, in the sense that $[x]_{(\mathrm{RED} - \{a\})} \neq [x]_P$ for any attribute $a \in \mathrm{RED}$; in other words, no attribute can be removed from set $\mathrm{RED}$ without changing the equivalence classes $[x]_P$.
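
    The two conditions above can be checked by brute force on a small table. This is only a sketch: it enumerates every attribute subset, which is exponential in the number of attributes, and the table values and function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical table; the attribute values are assumed for illustration.
TABLE = {
    "x1": {"headache": "yes", "muscle_pain": "yes", "temperature": "normal"},
    "x2": {"headache": "yes", "muscle_pain": "yes", "temperature": "high"},
    "x3": {"headache": "yes", "muscle_pain": "yes", "temperature": "very high"},
    "x4": {"headache": "no", "muscle_pain": "yes", "temperature": "normal"},
    "x5": {"headache": "no", "muscle_pain": "no", "temperature": "high"},
    "x6": {"headache": "no", "muscle_pain": "yes", "temperature": "very high"},
}


def partition(table, attrs):
    """Equivalence classes induced by attrs, as a set of frozensets."""
    blocks = {}
    for obj, row in table.items():
        blocks.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return {frozenset(b) for b in blocks.values()}


def reducts(table, attrs):
    """Return every minimal subset of attrs that preserves the partition."""
    full = partition(table, attrs)
    found = []
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            if partition(table, subset) != full:
                continue  # condition 1 fails: classes differ from [x]_P
            # Condition 2: removing any single attribute changes the classes.
            if all(
                partition(table, [a for a in subset if a != drop]) != full
                for drop in subset
            ):
                found.append(set(subset))
    return found


P = ["headache", "muscle_pain", "temperature"]
result = reducts(TABLE, P)
```

    On this assumed table the only reduct is {headache, temperature}: muscle_pain can be dropped without merging any equivalence classes, but neither of the other two can.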

  • Attribute dependency
    In rough set theory, the notion of dependency is defined very simply. Let us take two (disjoint) sets of attributes, set P and set Q, and inquire what degree of dependency obtains between them. Each attribute set induces an (indiscernibility) equivalence class structure, the equivalence classes induced by P given by $[x]_P$, and the equivalence classes induced by Q given by $[x]_Q$.
    Let $[x]_Q = \{Q_1, Q_2, Q_3, \dots, Q_N\}$, where $Q_i$ is a given equivalence class from the equivalence-class structure induced by attribute set Q. Then, the dependency of attribute set Q on attribute set P, $\gamma_P(Q)$, is given by $\gamma_P(Q) = \frac{\sum_{i=1}^{N} |\underline{P}Q_i|}{|\mathbb{U}|} \leq 1$. That is, for each equivalence class $Q_i \in [x]_Q$, we add up the size of its lower approximation by the attributes in P, i.e., $\underline{P}Q_i$. This approximation (as above, for arbitrary set X) is the number of objects which on attribute set P can be positively identified as belonging to target set $Q_i$. Added across all equivalence classes in $[x]_Q$, the numerator above represents the total number of objects which, based on attribute set P, can be positively categorized according to the classification induced by attributes Q. The dependency ratio therefore expresses the proportion (within the entire universe) of such classifiable objects. The dependency $\gamma_P(Q)$ “can be interpreted as a proportion of such objects in the information system for which it suffices to know the values of attributes in P to determine the values of attributes in Q”.
    Put simply: pick two attribute sets P and Q. Q induces a partition of the universe, that is, a set of equivalence classes. Take each of those equivalence classes as the target set and P as the condition attribute set, compute each lower approximation, add up their sizes, and divide by the total number of objects in the universe; the result is the dependency. It follows from this definition that the dependency is not symmetric: the dependency of Q on P generally differs from the dependency of P on Q. There are also other ways to define dependency.
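
    The procedure just described, including the asymmetry, can be sketched as follows. The Temperature and Flu values for x2 and x5 and all Flu values are assumptions made for illustration, and `dependency` is a hypothetical helper name.

```python
from fractions import Fraction

# Hypothetical table; values not listed in the notes are assumed.
TABLE = {
    "x1": {"temperature": "normal", "flu": "no"},
    "x2": {"temperature": "high", "flu": "yes"},
    "x3": {"temperature": "very high", "flu": "yes"},
    "x4": {"temperature": "normal", "flu": "no"},
    "x5": {"temperature": "high", "flu": "no"},
    "x6": {"temperature": "very high", "flu": "yes"},
}


def partition(table, attrs):
    """Equivalence classes induced by attrs, as a list of sets."""
    blocks = {}
    for obj, row in table.items():
        blocks.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return list(blocks.values())


def dependency(table, p_attrs, q_attrs):
    """gamma_P(Q): summed sizes of the P-lower approximations of Q's classes, over |U|."""
    p_blocks = partition(table, p_attrs)
    positive = 0
    for q_block in partition(table, q_attrs):
        # Size of the lower approximation of q_block under P.
        positive += sum(len(b) for b in p_blocks if b <= q_block)
    return Fraction(positive, len(table))


gamma_pq = dependency(TABLE, ["temperature"], ["flu"])
gamma_qp = dependency(TABLE, ["flu"], ["temperature"])
```

    With these assumed values, four of the six objects can be classified for Flu from Temperature alone, while knowing Flu never pins down Temperature, so the two directions give different results.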

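  • Rule Extraction
    As the name suggests, this is about using rough sets for knowledge mining.
    Many methods have been proposed; the one described here is the rule-extraction procedure based on Ziarko & Shan (1995).
    Decision Matrices
    $(P_i = a) \land (P_j = b) \land \dots \land (P_k = c) \to (Q = d)$
    Consider the first case, P4 = 1, for which a decision matrix can be built. The rows are the objects with P4 = 1 and the columns are the objects with P4 ≠ 1. As an example of what a cell means, the entry in cell (O7, O6) says that objects O7 and O6 differ on attributes P1 and P3, and that for O7, P1 = 2 and P3 = 0. Since O7 has P4 = 1 and O6 has P4 ≠ 1, the difference in P4 may be due to just one of P1 and P3, or to both. The other cells are read the same way, which gives the full decision matrix.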


    $\begin{cases}(P_1 = 1) \to (P_4 = 1)\\ (P_2 = 2) \to (P_4 = 1)\\ (P_1 = 2) \land (P_2 = 0) \to (P_4 = 1)\\ (P_3 = 0) \land (P_2 = 0) \to (P_4 = 1)\end{cases}$
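
    To make the cell construction concrete, here is a sketch that builds a decision matrix for one decision value. It does not use the O1..O10 data from the example above; the small flu table, its values, and the name `decision_matrix` are all assumptions, and the Boolean simplification step that turns the matrix into rules is not shown.

```python
# Hypothetical table; all values are assumed for illustration.
TABLE = {
    "x1": {"headache": "yes", "muscle_pain": "yes", "temperature": "normal", "flu": "no"},
    "x2": {"headache": "yes", "muscle_pain": "yes", "temperature": "high", "flu": "yes"},
    "x3": {"headache": "yes", "muscle_pain": "yes", "temperature": "very high", "flu": "yes"},
    "x4": {"headache": "no", "muscle_pain": "yes", "temperature": "normal", "flu": "no"},
    "x5": {"headache": "no", "muscle_pain": "no", "temperature": "high", "flu": "no"},
    "x6": {"headache": "no", "muscle_pain": "yes", "temperature": "very high", "flu": "yes"},
}


def decision_matrix(table, cond_attrs, decision, value):
    """Map (row object, column object) to the row object's differing attribute-value pairs."""
    inside = [o for o, r in table.items() if r[decision] == value]
    outside = [o for o, r in table.items() if r[decision] != value]
    return {
        (i, j): {
            (a, table[i][a]) for a in cond_attrs if table[i][a] != table[j][a]
        }
        for i in inside
        for j in outside
    }


CONDITIONS = ["headache", "muscle_pain", "temperature"]
matrix = decision_matrix(TABLE, CONDITIONS, "flu", "yes")
```

    Each cell lists the attribute values of the row object that distinguish it from the column object, which is the same reading as the (O7, O6) cell described above.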

  • Incomplete data
    Rough set theory is useful for rule induction from incomplete data sets. Using this approach we can distinguish between three types of missing attribute values: lost values (the values that were recorded but currently are unavailable), attribute-concept values (these missing attribute values may be replaced by any attribute value limited to the same concept), and “do not care” conditions (the original values were irrelevant). A concept (class) is a set of all objects classified (or diagnosed) the same way.
    Two special data sets with missing attribute values were extensively studied: in the first case, all missing attribute values were lost (Stefanowski and Tsoukias, 2001), in the second case, all missing attribute values were “do not care” conditions (Kryszkiewicz, 1999).
    In the attribute-concept interpretation of a missing attribute value, the missing value may be replaced by any value of the attribute domain restricted to the concept to which the object with the missing value belongs (Grzymala-Busse and Grzymala-Busse, 2007). For example, if for a patient the value of the attribute Temperature is missing, this patient is sick with flu, and all remaining patients sick with flu have values high or very-high for Temperature, then, interpreting the missing value as an attribute-concept value, we will replace it with high and very-high. Additionally, the characteristic relation (see, e.g., Grzymala-Busse and Grzymala-Busse, 2007) makes it possible to process data sets with all three kinds of missing attribute values at the same time: lost, “do not care” conditions, and attribute-concept values.
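
    The Temperature example can be sketched in a few lines. The patient records, the use of `None` as the missing-value marker, and the function name `attribute_concept_values` are all assumptions for illustration; only the attribute-concept interpretation is covered, not lost values or “do not care” conditions.

```python
MISSING = None  # assumed marker for a missing attribute value

# Hypothetical patient records.
PATIENTS = {
    "p1": {"temperature": "high", "flu": "yes"},
    "p2": {"temperature": "very-high", "flu": "yes"},
    "p3": {"temperature": "normal", "flu": "no"},
    "p4": {"temperature": MISSING, "flu": "yes"},
}


def attribute_concept_values(table, obj, attr, decision):
    """Values of attr seen among other objects in the same concept as obj."""
    concept = table[obj][decision]
    return {
        row[attr]
        for other, row in table.items()
        if other != obj
        and row[decision] == concept
        and row[attr] is not MISSING
    }


candidates = attribute_concept_values(PATIENTS, "p4", "temperature", "flu")
```

    Here the missing Temperature of p4 is replaced by the set {high, very-high}, the values observed for the other flu patients.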



