A small worked example of information theory



The "mystical" trick of knowing your surname without asking

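1. Preface

I don't know whether you have ever been stopped on the street by someone who claims they can tell your surname without asking for it. It happened to me. I was afraid of being asked for money, so I did not take them up on it, but afterwards I kept wondering how it works. Many people have posted their own explanations online, but none of them fully settled my questions. After reading up on information theory recently, I came to see the trick as a rather clever piece of information coding.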

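2. Theoretical analysis

2.1 Information theory

This looks like a mystical puzzle, but at bottom it is an information theory problem. Solving it is not the real goal; the point is to use a simple problem to get at the essence of information theory, so that one small case sheds light on the whole subject.
Information theory quickly puts a bound on a problem by quantifying how much information each message carries. Most striking is that its founder, Claude Shannon, set these bounds in the middle of the last century for every information problem we are likely to meet. The problems we meet are all classical ones, and for them there is an absolute bound that no scheme, however it is designed, can exceed. How far data can be compressed, or how many observations are needed to pin down one outcome among many, is already determined by information theory.
While others are still hunting for a solution, information theory lets you read the answer off from theory. It tells you where the limit of the problem lies, so you do not waste effort on work that cannot succeed.
So let me first introduce information entropy.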

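2.2 Information entropy

Entropy in thermodynamics: in an isolated system on which no external work is done, the degree of disorder of the system (its entropy) tends to increase and never decreases.
Entropy in information theory: the degree of uncertainty of information.

Type                    What it measures
Thermodynamic entropy   Degree of disorder of a system
Information entropy     Degree of uncertainty of information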

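From the definition it follows that the less probable an outcome is, the more information its occurrence carries; entropy is the average of this quantity over all outcomes. An example:
Two players, A and B, compete. If they are evenly matched, each wins with probability 0.5, and the entropy of the result follows from:

$H(X)=\sum_i p(x_i)\log_b\frac{1}{p(x_i)}$

When b=2 the unit is bits.
For the example above the entropy is $H(X)=\frac{1}{2}\log_2\frac{1}{1/2}+\frac{1}{2}\log_2\frac{1}{1/2}=1$ bit.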

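If the two probabilities are not equal, say one player is Michael Jordan and the other a complete novice at basketball, the novice's chance of winning might be about one in a million. If the novice does win, that single event carries a very large amount of information: $\log_2 10^6\approx 19.9$ bits. (The entropy of the match as a whole is then close to zero, because the result is almost certain.)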
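The two examples above can be checked numerically. The following is only a minimal sketch: the helper names `entropy_bits` and `surprisal_bits` are my own and not part of the original program, and the one-in-a-million probability is the rough figure used in the text.

```matlab
% Shannon entropy in bits of a discrete distribution p (entries sum to 1).
% Zero-probability outcomes contribute nothing, so they are dropped first.
entropy_bits = @(p) -sum(p(p > 0) .* log2(p(p > 0)));

% Information in bits carried by one outcome of probability q.
surprisal_bits = @(q) -log2(q);

H_even   = entropy_bits([0.5 0.5]);          % evenly matched players
I_upset  = surprisal_bits(1e-6);             % the novice actually wins
H_uneven = entropy_bits([1e-6, 1 - 1e-6]);   % the lopsided match as a whole

fprintf('H(even match)     = %.4f bits\n', H_even);
fprintf('I(novice wins)    = %.2f bits\n', I_upset);
fprintf('H(lopsided match) = %.6f bits\n', H_uneven);
```

The rare win is highly informative when it happens, yet the lopsided match has far lower entropy than the even one, because its result is almost never in doubt.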

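In summary, the problem that follows can be solved with information theory.

2.3 Encoding the Hundred Family Surnames

With information entropy in hand, we can easily encode the 500 or so Chinese surnames, and then find your surname simply from whether or not each picture contains it.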

Surnames: {"赵","钱","孙","李","周","吴","郑","王","冯","陈","褚","卫","蒋","沈","韩","杨","朱","秦","尤","许","何","吕","施","张","孔","曹","严","华","金","魏","陶","姜","戚","谢","邹","喻","柏","水","窦","章","云","苏","潘","葛","奚","范","彭","郎","鲁","韦","昌","马","苗","凤","花","方","俞","任","袁","柳","酆","鲍","史","唐","费","廉","岑","薛","雷","贺","倪","汤","滕","殷","罗","毕","郝","邬","安","常","乐","于","时","傅","皮","卞","齐","康","伍","余","元","卜","顾","孟","平","黄","和","穆","萧","尹","姚","邵","湛","汪","祁","毛","禹","狄","米","贝","明","臧","计","伏","成","戴","谈","宋","茅","庞","熊","纪","舒","屈","项","祝","董","梁","杜","阮","蓝","闵","席","季","麻","强","贾","路","娄","危","江","童","颜","郭","梅","盛","林","刁","钟","徐","邱","骆","高","夏","蔡","田","樊","胡","凌","霍","虞","万","支","柯","昝","管","卢","莫","经","房","裘","缪","干","解","应","宗","丁","宣","贲","邓","郁","单","杭","洪","包","诸","左","石","崔","吉","钮","龚","程","嵇","邢","滑","裴","陆","荣","翁","荀","羊","於","惠","甄","麴","家","封","芮","羿","储","靳","汲","邴","糜","松","井","段","富","巫","乌","焦","巴","弓","牧","隗","山","谷","车","侯","宓","蓬","全","郗","班","仰","秋","仲","伊","宫","宁","仇","栾","暴","甘","钭","厉","戎","祖","武","符","刘","景","詹","束","龙","叶","幸","司","韶","郜","黎","蓟","薄","印","宿","白","怀","蒲","邰","从","鄂","索","咸","籍","赖","卓","蔺","屠","蒙","池","乔","阴","欎","胥","能","苍","双","闻","莘","党","翟","谭","贡","劳","逄","姬","申","扶","堵","冉","宰","郦","雍","舄","璩","桑","桂","濮","牛","寿","通","边","扈","燕","冀","郏","浦","尚","农","温","别","庄","晏","柴","瞿","阎","充","慕","连","茹","习","宦","艾","鱼","容","向","古","易","慎","戈","廖","庾","终","暨","居","衡","步","都","耿","满","弘","匡","国","文","寇","广","禄","阙","东","殴","殳","沃","利","蔚","越","夔","隆","师","巩","厍","聂","晁","勾","敖","融","冷","訾","辛","阚","那","简","饶","空","曾","毋","沙","乜","养","鞠","须","丰","巢","关","蒯","相","查","後","荆","红","游","竺","权","逯","盖","益","桓","公","万俟","司马","上官","欧阳","夏侯","诸葛","闻人","东方","赫连","皇甫","尉迟","公羊","澹台","公冶","宗政","濮阳","淳于","单于","太叔","申屠","公孙","仲孙","轩辕","令狐","钟离","宇文","长孙","慕容","鲜于","闾丘","司徒","司空","亓官","司寇","仉","督","子车","颛孙","端木","巫马","公西","漆雕","乐正","壤驷","公良","拓跋","夹谷","宰父","谷梁","晋","楚","闫","法","汝","鄢","涂","钦","段干","百里","东郭","南门","呼延","归","海","羊舌","微生","岳","帅","缑","亢","况","后","有","琴","梁丘","左丘","东门","西门","商","牟","佘","佴","伯","赏","南宫","墨","哈","谯","笪","年","爱","阳","佟","第五","言","福"}
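Source: Baidu Zhidao, Hundred Family Surnames.
Assuming every surname is equally likely, the probability of picking one particular surname from the list, say "万", is $P_{\text{万}}(x)=\frac{1}{500}$.
The information needed to single out exactly that surname is then $H(x)=\log_2\frac{1}{1/500}=\log_2 500\approx 8.97$ bits.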

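2.3.1 Choosing the scheme

1. One round: the person picks the single picture that shows their surname. One choice has to distinguish all of the roughly 500 surnames, so it needs about 500 pictures.
Clearly we cannot quiz someone with 500-odd pictures, and there would be nothing mysterious about it anyway.
2. Two rounds:
Singling out a surname such as 万 takes about 8.97 bits ($\log_2 500$).
Each of the two choices must therefore carry at least 4.48 bits.
That takes at least $2^{4.48}\approx 22.4$ pictures per round, rounded up to 23 (since $22^2=484<500$): two 1-of-23 choices, about 46 pictures in total.
3. Three rounds:
Singling out the surname again takes about 8.97 bits.
Each of the three choices must carry at least 2.99 bits.
That takes at least $2^{2.99}\approx 7.9$ pictures per round, rounded up to 8: three 1-of-8 choices, 24 pictures in total. This is much easier on the person choosing.
4. Four rounds:
Singling out the surname again takes about 8.97 bits.
Each of the four choices must carry at least 2.24 bits.
That takes at least $2^{2.24}\approx 4.7$ pictures per round, rounded up to 5: four 1-of-5 choices, 20 pictures in total. This is easier still.
5. In the same way we can keep adding rounds, up to 9:
Each choice is then between just 2 pictures, made 9 times: 18 pictures in total ($2^9=512\ge 500$).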
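The arithmetic in the list above can be reproduced with a few lines. This is a sketch under the same assumption as the text (about 500 equally likely surnames); the variable names are mine. It searches for the smallest whole number of pictures per round instead of rounding a fractional power, which avoids floating-point surprises.

```matlab
N = 500;                      % approximate number of surnames, as in the text
rounds = [1 2 3 4 9];         % numbers of rounds discussed above
pictures_per_round = zeros(size(rounds));

for i = 1:numel(rounds)
    k = rounds(i);
    m = 1;
    while m^k < N             % smallest m such that m^k >= N
        m = m + 1;
    end
    pictures_per_round(i) = m;
    fprintf('%d round(s): %.2f bits each, %d pictures per round, %d in total, %d codes\n', ...
            k, log2(N)/k, m, k*m, m^k);
end
```

Because the number of pictures per round has to be a whole number, most schemes offer slightly more codes than there are surnames; the spare codes are simply left unused.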

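You may wonder why we do not simply use more rounds.
First: with 9 rounds, every picture lists more than 200 surnames (about half of the 500), and you have to search through them each time.
Second: 3 or 4 rounds seems best to me. Each picture then lists a moderate number of surnames (about 64 with three rounds of 8, about 100 with four rounds of 5), which keeps the search manageable, and because you give away so little information, the result feels all the more uncanny.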
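To put rough numbers on this trade-off, the sketch below estimates how many surnames each picture has to list. It assumes about 500 surnames spread evenly over the pictures of a round, and uses the smallest number of pictures per round m for which m^rounds is at least 500; the variable names are illustrative only.

```matlab
N = 500;                                % approximate number of surnames
rounds             = [2 3 4 9];
pictures_per_round = [23 8 5 2];        % smallest m with m^rounds >= N
names_per_picture  = ceil(N ./ pictures_per_round);

for i = 1:numel(rounds)
    fprintf('%d rounds of 1-of-%d: about %d surnames per picture\n', ...
            rounds(i), pictures_per_round(i), names_per_picture(i));
end
```

Fewer rounds mean more pictures to show, while more rounds mean longer lists to read on each picture, so the middle of the range is a reasonable compromise.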

So below I use three rounds of 1-of-8 to find your surname; you can implement other schemes yourself.

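3. Program design

Readers can refer to the code below, which encodes the Hundred Family Surnames so that the program can find your surname (3 rounds of 8).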


clear;
Book_of_Family_Names = ["赵","钱","孙","李","周","吴","郑","王","冯","陈","褚","卫","蒋","沈","韩","杨","朱","秦","尤","许","何","吕","施","张","孔","曹","严","华","金","魏","陶","姜","戚","谢","邹","喻","柏","水","窦","章","云","苏","潘","葛","奚","范","彭","郎","鲁","韦","昌","马","苗","凤","花","方","俞","任","袁","柳","酆","鲍","史","唐","费","廉","岑","薛","雷","贺","倪","汤","滕","殷","罗","毕","郝","邬","安","常","乐","于","时","傅","皮","卞","齐","康","伍","余","元","卜","顾","孟","平","黄","和","穆","萧","尹","姚","邵","湛","汪","祁","毛","禹","狄","米","贝","明","臧","计","伏","成","戴","谈","宋","茅","庞","熊","纪","舒","屈","项","祝","董","梁","杜","阮","蓝","闵","席","季","麻","强","贾","路","娄","危","江","童","颜","郭","梅","盛","林","刁","钟","徐","邱","骆","高","夏","蔡","田","樊","胡","凌","霍","虞","万","支","柯","昝","管","卢","莫","经","房","裘","缪","干","解","应","宗","丁","宣","贲","邓","郁","单","杭","洪","包","诸","左","石","崔","吉","钮","龚","程","嵇","邢","滑","裴","陆","荣","翁","荀","羊","於","惠","甄","麴","家","封","芮","羿","储","靳","汲","邴","糜","松","井","段","富","巫","乌","焦","巴","弓","牧","隗","山","谷","车","侯","宓","蓬","全","郗","班","仰","秋","仲","伊","宫","宁","仇","栾","暴","甘","钭","厉","戎","祖","武","符","刘","景","詹","束","龙","叶","幸","司","韶","郜","黎","蓟","薄","印","宿","白","怀","蒲","邰","从","鄂","索","咸","籍","赖","卓","蔺","屠","蒙","池","乔","阴","欎","胥","能","苍","双","闻","莘","党","翟","谭","贡","劳","逄","姬","申","扶","堵","冉","宰","郦","雍","舄","璩","桑","桂","濮","牛","寿","通","边","扈","燕","冀","郏","浦","尚","农","温","别","庄","晏","柴","瞿","阎","充","慕","连","茹","习","宦","艾","鱼","容","向","古","易","慎","戈","廖","庾","终","暨","居","衡","步","都","耿","满","弘","匡","国","文","寇","广","禄","阙","东","殴","殳","沃","利","蔚","越","夔","隆","师","巩","厍","聂","晁","勾","敖","融","冷","訾","辛","阚","那","简","饶","空","曾","毋","沙","乜","养","鞠","须","丰","巢","关","蒯","相","查","後","荆","红","游","竺","权","逯","盖","益","桓","公","万俟","司马","上官","欧阳","夏侯","诸葛","闻人","东方","赫连","皇甫","尉迟","公羊","澹台","公冶","宗政","濮阳","淳于","单于","太叔","申屠","公孙","仲孙","轩辕","令狐","钟离","宇文","长孙","慕容","鲜于","闾丘","司徒","司空","亓官","司寇","仉","督","子车","颛孙","端木","巫马","公西","漆雕","乐正","壤驷","公良","拓跋","夹谷","宰父","谷梁","晋","楚","闫","法","汝","鄢","涂","钦","段干","百里","东郭","南门","呼延","归","海","羊舌","微生","岳","帅","缑","亢","况","后","有","琴","梁丘","左丘","东门","西门","商","牟","佘","佴","伯","赏","南宫","墨","哈","谯","笪","年","爱","阳","佟","第五","言","福"];
save Book_of_Family_Names.mat  Book_of_Family_Names;
% Information needed: log2(500) is about 8.97 bits, so three 1-of-8 answers
% (3 bits each, 8^3 = 512 codes) are enough: [1-8][1-8][1-8]
% Names_code_N(a,b,c) holds the surname whose three code digits are a, b, c.
Names_code_N = strings(8, 8, 8);
for a = 1:8
    for b = 1:8
        for c = 1:8
            idx = (a-1)*8^2 + (b-1)*8 + c;   % position in the surname list
            if idx <= length(Book_of_Family_Names)
                Names_code_N(a,b,c) = Book_of_Family_Names(idx);
            else
                Names_code_N(a,b,c) = "猫";   % filler for the unused codes
            end
        end
    end
end

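% Round 1: picture c lists every surname whose third code digit is c.
for c = 1:8
    figure(c); clf;
    title(sprintf('Round 1, picture %d', c));
    hold on
    for a = 1:8
        for b = 1:8
            text(a*0.1, b*0.1, Names_code_N(a,b,c));
        end
    end
end
c1 = input('Round 1: which picture (1-8) shows your surname? ');

% Round 2: picture b lists every surname whose second code digit is b.
% Figures 9-16 are used so the round 1 pictures are not drawn over.
for b = 1:8
    figure(8 + b); clf;
    title(sprintf('Round 2, picture %d', b));
    hold on
    for a = 1:8
        for c = 1:8
            text(a*0.1, c*0.1, Names_code_N(a,b,c));
        end
    end
end
b1 = input('Round 2: which picture (1-8) shows your surname? ');

% Round 3: picture a lists every surname whose first code digit is a.
for a = 1:8
    figure(16 + a); clf;
    title(sprintf('Round 3, picture %d', a));
    hold on
    for b = 1:8
        for c = 1:8
            text(b*0.1, c*0.1, Names_code_N(a,b,c));
        end
    end
end
a1 = input('Round 3: which picture (1-8) shows your surname? ');

% The three answers are exactly the three code digits of the surname.
disp(Names_code_N(a1, b1, c1));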
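The index arithmetic behind the program can be checked on its own, without drawing any figures. The sketch below uses two hypothetical helpers, `encode` and `decode`, which are not part of the original program; they follow the same layout as the table-building loop, position = (a-1)*8^2 + (b-1)*8 + c.

```matlab
% Hypothetical helpers: map a 1-based list position to its three 1-of-8
% answers [a b c] and back again.
encode = @(idx) [floor((idx-1)/64) + 1, mod(floor((idx-1)/8), 8) + 1, mod(idx-1, 8) + 1];
decode = @(abc) (abc(1)-1)*64 + (abc(2)-1)*8 + abc(3);

idx = 162;             % example position; by my count this is where "万" sits,
                       % an assumption worth checking against the list itself
abc = encode(idx);     % the three answers that identify this position
fprintf('position %d -> answers [%d %d %d] -> position %d\n', ...
        idx, abc(1), abc(2), abc(3), decode(abc));

% Every one of the 512 codes should survive the round trip.
round_trip_ok = all(arrayfun(@(i) decode(encode(i)) == i, 1:512));
```

In other words, the three answers are just the three base-8 digits of the surname's position minus one, each shifted by one so that pictures are numbered from 1.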

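4. Graphical user interface

  • 1. First selection
    [image: first selection]

  • 2. Second selection
    [image: second selection]

  • 3. Third selection
    [image: third selection]

  • Result:
    [image: result]

  • [x] Is your surname 万?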

gantt
        dateFormat  YYYY-MM-DD
        title  Knowing your surname without being told
        section Completed
        Done                 :done,    des1, 2020-10-06,2020-10-16
        In progress          :active,  des2, 2020-10-23, 3d
        Selection 2          :         des3, after des2, 4d
        Selection 3          :         des4, after des3, 2d

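5. Conclusion

You can modify the program to try a four-round scheme. Note that [4,4,4,4] gives only $4^4=256$ codes, which is not enough for about 500 surnames; four rounds need 5 pictures each ([5,5,5,5], 625 codes), or you can use five rounds of 4 (1024 codes).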
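When trying other schemes, it helps to check first that the scheme has enough codes and that encoding and decoding agree. The following is a sketch of a general mixed-radix version, not the original program; `radices`, `digits` and the example position 437 are illustrative choices. With [4 4 4 4] the first check fails, because 4^4 = 256 is smaller than 500.

```matlab
N = 500;                      % approximate number of surnames
radices = [5 5 5 5];          % pictures per round; needs prod(radices) >= N
assert(prod(radices) >= N, 'only %d codes for %d surnames', prod(radices), N);

idx = 437;                    % arbitrary example position in the surname list

% Encode: peel off one answer per round, the last round varying fastest.
digits = zeros(size(radices));
r = idx - 1;
for k = numel(radices):-1:1
    digits(k) = mod(r, radices(k)) + 1;
    r = floor(r / radices(k));
end

% Decode: rebuild the position from the answers.
back = 0;
for k = 1:numel(radices)
    back = back * radices(k) + (digits(k) - 1);
end
back = back + 1;

fprintf('position %d -> answers [%s] -> position %d\n', idx, num2str(digits), back);
```

The rounds do not have to use the same number of pictures: any vector of radices works as long as their product is at least the number of surnames.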

Source code: 不问知其名-赚点小积分 ("Know the name without asking").


Author: 万鲲鹏
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 万鲲鹏 when reposting!