Category Archives: Writting

Some words

#some words
Sometimes in life,
You find a special friend;
Someone who change your life just be part of it.
Someone who makes you laugh until you can’t stop;
Someone who makes you believe that there really is good in the word.
Someone who convinces you that there really is an unlocked door just waiting for you to open it.
This is [...]

Chinese in American Film

Though many Chinese went to America in the early 20th Century, they didn’t appear in those day’s western films. At that time, there were also some films telling the story about Chinese, but having no Chinese actors. The directors chose some western stars to make them look like Chinese using makeup and costumes in those [...]

Road trip In USA

Road trip is now popular in China. It’s not surprising to read news reporting one guy rides a bike from Beijing to Shanghai. It is also common to see another young man runs a motorcycle from Chengdu to Tibet.
Last Friday, Miss Goldsmith gave us a talk on “Road trip in USA”. From her lecture, I [...]

A snapshot of Collective Intelligence with Web 2.0

 
A snapshot of Collective Intelligence with Web 2.0
Jianfeng Chen
Institute of Web Science, School of Computer Science and Engineering, Southeast University, Nanjing 210096, P.R.China
cjfsmart777@gmail.com
Abstract
During the pass five years, Web 2.0 facilitates interactive information sharing, interoperability, user-centered design and collaboration on the World Wide Web. All of these changes continue to evolve around an intelligent web, which [...]

ANF: a tool for approximate neighourhood function

阅读论文ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs
1. 什么是Neighbourhood Function
1.1 点的neighbourhood function
给定点x,IN(x,h)表表示到x距离小于等于h顶点数目,
对所有的h。
1.2 图的neighbourhood function (也叫hop plot)
N(h)表示距离小于h的点对数目,对所有的h。
给定h,计算N(h),可以先固定点x,计算IN(x,h), 然后对x求和。
1.3 对NF而言,无论是单个点还是整个图,图定义域是h可取值范围,值域是NF(h)可取值范围。
这个函数可以研究的东西包括:
h的范围;
NF(h)随h变化的情况(认为是一种分布,讨论hop exponent表征变化趋势)
2.  Neighbourhood Function放映出来的图的性质
图的连通性(connectivity),h的范围可以放映图的连通性、可达性
图的结构(neighourhood structure),NF函数的变化趋势表征邻居结构的变化
Graph Similarity, Subgraph Similary, Vertex Important
3. ANF(approximate Neighbourhood Function)工具计算邻居函数
ANF是一个近似算法,特点是近似程度有保障,算法复杂度低(O(n+m)d),基于外存能处理的大规模,可以并行。
算法基本思想:
问题1:已知集合包含n个元素,统计他的一个子集中元素的个数
方法一,基于n位bit串。
使用n位bit串标识某元素是否属于该n元集合,每个元素对应于一个bit位,
如果某个元素属于考虑的子集合范畴则将其对应的bit位置为1否则为0。
这样将所有的元素的bit串做Bitwise-OR运算得到一个集合的bit串,统计结果bit串里面的1的个数,其值就是子集合包含的元素的个数。
方法二,probabilistic counting(ANF使用的方法)
每个元素拥有一个logn+r ( < n )位的bit串,
其中bit串的从左往右第0个位置为1的概率是1/2
第1个位置为1的概率是1/4
第2个位置为1的概率是1/2
……………
第i个位置为1的概率是1/(2^(i+1))
然后将子集合中所有元素的bit串做Bitwise-OR运算得到集合的bit串,从左往右看该bit串,找出第一个不是1的位置记为b,则子集合的元素个数大约成比例于2^b
(这种概率近似方法相较于方法一,不关心是哪些元素属于子集合,只关心子集合元素个数与基本集合元素个数之间的比例)
直观解释:将子集合中所有元素的bit串做Bitwise-OR运算得到集合的bit串中位置1表示有25%的可能性是1。如果现在计算结果显示该位置为0,很可能是因为观察到的元素的数量不超过4。
问题2:如何获取某个点的邻居结点的数目,即给点x,h计算IN(x,h)
迭代计算
IN(x,0) = |M(x,0)| = |{x}|,
IN(x,h) = |M(x,h) U M(y,h-1)|, 其中y是x的一步邻居结点
将上面两个问题进行综合,来计算整体图的邻居函数:
给每个点随机产生bit串(使用问题1的方法二)
对h=1, 2, … 做
对x属于顶点集合做
计算邻居函数(bitwise-or, 2^b估计)
对x的邻居函数结果求和作为
算法优化:
1. [...]

A talk before leaving(prepared for the last english lesson)

The day to leave school is coming. Today I am honored to give you a talk here, about my school days and future work.
Different from others’, my childhood days were not very interesting and I usually played with myself. I also don’t like the school days filled with exams. At that time I had no [...]

Some english sentences

In this case an example is worth one thousand words.