◇◇新语丝(www.xys.org)(xys.dxiong.com)(xys.3322.org)(xys.xlogit.com)◇◇

中山大学信息学院黄继武院长等人涉嫌抄袭

最近看到由中山大学康显桂副教授、黄继武教授，以及美国新泽西科技大学
（New Jersey Institute of Technology）教授Yun Q. Shi在国际会议上的
一篇论文，有明显的抄袭。该文章的信息如下：

Xiangui Kang, Yun Q. Shi, Jiwu Huang. "Achieving Non-Ambiguity of
Quantization Based Watermarking."　IS&T/SPIE 18th annual symposium-
Security, Steganography, and Watermarking of Multimedia Contents
VIII (EI121), San Jose, CA, USA, 2006.

该文第3.2节完全抄自以下论文第5节：

Q. Li and E.-C. Chang, "On the possibility of non-invertible
watermarking schemes," Information Hiding Workshop, Lecture notes
in Computer science (LNCS), vol. 3200, pp. 13–24, 2004.

虽然Kang-Shi-Huang的论文在摘要和概论几次引用了Li-Chang的论文，3.2
节里的文字也没有完全逐字逐句的抄，但在3.2节长达一页有余的文字里没
有任何一处哪怕是暗示该节的内容出自Li-Chang的论文。

涉嫌抄袭的部分是关于[Li-Chang]论文中一个数字水印方案的安全性的证明。
在概论中，Kang-Shi-Huang的文章里是这样说的：

"In section 3, we introduce a framework for quantization based
watermarking schemes and prove that it is invertible."

（按：该处invertible应该为non-invertible。如此质量的文章居然也能发表。）

并且在第3.2节的开始，又说：

"Now, we prove that the proposed watermarking schemes can achieve
non-invertibilty by using contradiction method."

并且再未提到过[Li-Chang]这篇文章。

由此可见这三位作者根本是想让人以为这个证明是他们自己提出的，而实
际上完全一样的证明在[Li-Chang]第5节里已经给出过了。

几位作者的基本信息如下：

康显桂：中山大学中山大学信息科学与技术学院副教授，研究生导师。见：
http://sist.zsu.edu.cn/graduate/shuodao/kangxiangui.htm

Yun Q. Shi：美国新泽西科技大学（NJIT）电子与计算机工程系教授，
个人主页：http://web.njit.edu/~shi/

黄继武：中山大学信息科学与技术学院院长兼中山大学软件学院院长，中
山大学信息安全技术研究所所长，教授、博士导师。国家杰出青年科学基
金、教育部跨世纪优秀人才基金获得者。见：
http://sist.sysu.edu.cn/graduate/bodao/huangjiwu.htm

涉嫌抄袭的部分：

[Kang-Shi-Huang Section 3.2]
来源于http://zsu.edu.cn/~isskxg/pdf/ei2005.pdf

Now, we prove that the proposed watermarking schemes can achieve
non-invertibilty by using contradiction method. Suppose, towards
the contradiction, there exists successful attacker B, which is a

probabilistic polynomial-time algorithm. Given any medium ^I,
if ^I is embedded with some valid watermark W or ^I happens to
contain a valid watermark W , B can successfully in polynomial
time find a valid watermark W_v embedded in ^I and output a pair
(W_v, K) with not negligible probability. That is, for any
positive n > n_0 , there exists a positive polynomial p(.), the
probability that B outputs a pair (W_v, K) is more than or equal
to 1/p(n). Note that W and W_v can be different. If ^I does not
contains a valid watermark, B cannot find a valid watermark and
hence always output a symbol \bot to correct claim that ^I does
not contain a valid watermark. The above mentioned probability
distribution is taken over the coin tosses made by B.

Then B can be easily transformed to a distinguisher, denoted as
D, operating as follows. Given a sequence W \in {?? 1,1}^n , D
carries out the following steps:

1. Randomly pick an original signal I.
2. Using the embedding method, embed W in I to get ^I.
3. Pass ^I to B, and ask B for a valid watermark W_v embedded
   in ^I.
4. If the output of B is a pair (W_v, K) such that W_v = g(K), D

declares that W is generated by the PRSG by outputting a 0;
   Otherwise B outputs a \bot, then D declares that W is from a
   random source by outputting a 1.

Next, we calculate the expected output of D for the following
two cases.

Case 1: W is generated by the truly random source. Consider the
unlikely event: a randomly n-bit watermark ^W (^W = extract(^I))
is extracted from ^I and any valid W_v \in W exists such that
w_d = W~ W_v >= T. The probability of that the unlikely event
happens is F (n) from last section. We consider the worst case
(best for ambiguity attacker) that when the unlikely event
happens, B always can output a pair (W_v, K) and D output a 0.
Exclude the unlikely event happens, D always output 1, because
B can not output a pair (W_v , K). Then the expect output E_1(n)
of D is:

    E_1(n) > 1 - F(n)                      (9)

Case2: W is indeed generated from the PRSG. Since B is a
successful attacker, the chances that a valid W_v can be found
is not negligible. So with probability that is not negligible,
more specifically, for any positive n > n_0 , there exists a
positive polynomial p(.), the probability that D declares that
W is from PRSG is more than 1/p(n). Then the expected output
E_2(n) of D is:

    E_2(n) < (1 - 1/p(n))                 (10)

Combining the above two cases leads to the conclusion that D
can distinguish the truly random sequence from the PRSG
sequence because the difference between E_1(n) and E_2(n) is
not negligible of n. This contradicts the hypothesis of the
pseudo random sequence. Therefore, no such B exists, thus the
proposed scheme is non-invertible. From above, we can see that
if the false positive rate F(n) of a watermarking scheme, which
uses a pseudo random sequence as the valid watermark, is
negligible with respect of n, then the watermarking is provably
non-invertible. F(n) decreases exponentially when watermark
detection threshold increase. In the next section, we will
discuss the invertibility of spread spectrum watermarking.

被抄论文[Li-Chang]第5节，来源：
http://www.comp.nus.edu.sg/~changec/publications/invertible.pdf

Now, we are ready to prove that the proposed protocol is secure.
We assume　that the function f is a CSPRNG. Suppose that there
is a successful attacker B　as defined in Definition 1, we want
to extend it to a statistical test T that has　an advantage in
distinguishing sequences produced by f from that by a truly
random source. Since f is a CSPRNG, this leads to a
contradiction, and thus　such a B is impossible.

Given an input W \in {??1, 1}^n, the following steps are carried
out by T:
1. Randomly pick an original work I.
2. Compute I~ = I + kW. That is, embed W into I.
3. Pass I~ to B and obtain an output.
4. If the output of B is a pair (W, K), such that W = f(K), then
   T declares that W is generated by f by outputting a 0.
   Otherwise B outputs a \bot, then T declares that W comes from
   a random source by outputting a 1.

We want to calculate the expected output of T for the following
2 cases. If the difference of the expected outputs of these 2
cases is non-negligible, then by the definitions in Section 3.4,
f is not a CSPRNG, thus leads to a contradiction.

Case 1: W is from a random source. Suppose W is from a random
source, then the probability that there exists a valid watermark
W \in W in I~ is exactly the probability V(n) in (7), which is
negligible with respect to n as we have shown in Section 4.3.
Hence, we know that T will almost always output a 1 to correctly
declare that it is from the random source, except in the unlikely
event E where eI happens to contain a valid watermark. Clearly E
happens with negligible probability V(n). We observe that, when
E happens, T may output a 0 with a probability that is not
negligible (since B is a successful attacker). We consider the
obvious worst case (best case for the attacker) that, T always
output 0 when E happens. In this case, the fraction of 0's output
by T is V (n), which is still negligible. Therefore, let E_1(T)
be the expected output of T , we have

    E_1(T) > 1-V(n)                          (8)

Case 2: W is from the CSPRNG f. Suppose W is generated by f,
then W is a valid watermark. Since B is a successful attacker,
by definition B is able to find a valid watermark W that is
already embedded in eI with a probability that is not negligible.
More specifically, for any positive integer n_0,

    Pr[B(I~) = PASS] > 1/p(n)

for some positive polynomial p(·) and for some n > n_0. Hence,
the probability that T decides that W is from the CSPRNG f is
more than 1/p(n). Hence, let E_2(T) be the expected output of
T in this case, and we have

    E_2(T) < (1 ?? 1/p(n)).                    (9)

Consider the difference between (8) and (9). Since V(n) is
negligible but 1/p(n) is not, the difference cannot be
negligible because the sum of two negligible functions is still
negligible. Hence, the difference between E_1(T) and E_2(T)
is not negligible. Thus T has an advantage in distinguishing
the truly random source from the the output of f, therefore f
by definition is not a CSPRNG, which is a contradiction. As a
result, such a successful attacker B does not exist.

(XYS20070114)

◇◇新语丝(www.xys.org)(xys.dxiong.com)(xys.3322.org)(xys.xlogit.com)◇◇