Math Sup Summary: Determinants (Maths-France)

I - Multilinear maps

1) Definition. Let $E_1,\dots,E_n,F$ be $n+1$ vector spaces and let $f$ be a map from $E_1\times\dots\times E_n$ to $F$. Then $f$ is $n$-linear if and only if $f$ is linear with respect to each variable, i.e. for all $(a_i)_{1\le i\le n}$ and every $i\in[\![1,n]\!]$, the map $E_i\to F$, $x_i\mapsto f(a_1,\dots,a_{i-1},x_i,a_{i+1},\dots,a_n)$, is linear.

Example. In oriented Euclidean $\mathbb{R}^3$, the map $f:\mathbb{R}^3\times\mathbb{R}^3\to\mathbb{R}^3$, $(x,y)\mapsto x\wedge y$, is bilinear.

If $E_1=\dots=E_n=E$ and $F=K$, we obtain the $n$-linear forms on $E$.

2) Symmetric, antisymmetric, alternating forms.

Definition. Let $f$ be an $n$-linear form on $E$.
1) $f$ is symmetric $\iff \forall(x_1,\dots,x_n)\in E^n,\ \forall\sigma\in S_n,\ f(x_{\sigma(1)},\dots,x_{\sigma(n)})=f(x_1,\dots,x_n)$.
2) $f$ is antisymmetric $\iff \forall(x_1,\dots,x_n)\in E^n,\ \forall\sigma\in S_n,\ f(x_{\sigma(1)},\dots,x_{\sigma(n)})=\varepsilon(\sigma)\,f(x_1,\dots,x_n)$.
3) $f$ is alternating $\iff \forall(x_1,\dots,x_n)\in E^n$, $[(\exists(i,j)\in[\![1,n]\!]^2,\ i\ne j$ and $x_i=x_j)\Rightarrow f(x_1,\dots,x_n)=0]$.

Theorem. $f$ is antisymmetric $\iff$ for all $(x_1,\dots,x_n)\in E^n$ and every transposition $\tau$ of $[\![1,n]\!]$, $f(x_{\tau(1)},\dots,x_{\tau(n)})=-f(x_1,\dots,x_n)$.
Proof. For $\sigma\in S_n$, write $\sigma=\tau_1\circ\dots\circ\tau_k$ where the $\tau_i$ are transpositions, and recall that $\varepsilon(\sigma)=(-1)^k$.

Theorem. Let $E$ be a $K$-vector space ($K$ a subfield of $\mathbb{C}$) and let $f$ be an $n$-linear form on $E$. Then $f$ is alternating $\iff$ $f$ is antisymmetric.
Proof. $\Rightarrow$: Let $(x_1,\dots,x_n)\in E^n$, let $i\ne j$ and set $\tau=\tau_{i,j}$. Then
$$0=f(x_1,\dots,x_i+x_j,\dots,x_i+x_j,\dots,x_n)=f(x_1,\dots,x_i,\dots,x_i,\dots,x_n)+f(x_1,\dots,x_i,\dots,x_j,\dots,x_n)+f(x_1,\dots,x_j,\dots,x_i,\dots,x_n)+f(x_1,\dots,x_j,\dots,x_j,\dots,x_n),$$
and the first and last terms vanish because $f$ is alternating, so $f(x_1,\dots,x_i,\dots,x_j,\dots,x_n)=-f(x_1,\dots,x_j,\dots,x_i,\dots,x_n)$. Hence, for all $(x_1,\dots,x_n)\in E^n$ and every transposition $\tau$, $f(x_{\tau(1)},\dots,x_{\tau(n)})=-f(x_1,\dots,x_n)$, and $f$ is antisymmetric.
$\Leftarrow$: Let $(x_1,\dots,x_n)\in E^n$ be such that $x_i=x_j=x$ for some $i\ne j$. The equality $f(x_1,\dots,x_i,\dots,x_j,\dots,x_n)=-f(x_1,\dots,x_j,\dots,x_i,\dots,x_n)$ then reads $f(x_1,\dots,x,\dots,x,\dots,x_n)=-f(x_1,\dots,x,\dots,x,\dots,x_n)$, i.e. $2f(x_1,\dots,x,\dots,x,\dots,x_n)=0$, and finally $f(x_1,\dots,x,\dots,x,\dots,x_n)=0$ (since $K\subset\mathbb{C}$, we may divide by 2).

II - Alternating n-linear forms on a space of dimension n. Definition of the determinant form in a basis

1) Fundamental theorem. Let $E$ be a vector space of finite dimension $n\ge1$. We write $\Lambda_n^*(E)$ for the set of alternating $n$-linear forms on $E$.

Theorem.
1) $\Lambda_n^*(E)$ is a $K$-vector space of dimension 1.
2) If $\mathcal{B}=(e_i)_{1\le i\le n}$ is a given basis of $E$, there is one and only one alternating $n$-linear form $f$ on $E$ such that $f(\mathcal{B})=1$.

Definition. The unique alternating $n$-linear form $f$ on $E$ such that $f(\mathcal{B})=1$ is called the determinant form in the basis $\mathcal{B}$ and is written $\det_{\mathcal{B}}$.

Proof. Let $(x_1,\dots,x_n)\in E^n$. For $j\in[\![1,n]\!]$, write $x_j=\sum_{i=1}^n x_{i,j}e_i$ (with the $x_{i,j}$ in $K$). Let $f$ be an alternating $n$-linear form on $E$. Expanding $f(x_1,\dots,x_n)$ by $n$-linearity yields a sum of $n^n$ terms of the form $x_{\chi(1),1}\cdots x_{\chi(n),n}\,f(e_{\chi(1)},\dots,e_{\chi(n)})$, where $\chi$ is an arbitrary map from $[\![1,n]\!]$ to itself. Since $f$ is alternating, the terms corresponding to maps $\chi$ such that $\chi(i)=\chi(j)$ for some $i\ne j$ are zero; thus all terms with non-injective $\chi$ disappear. Now $[\![1,n]\!]$ is a finite set, so $\chi$ is injective if and only if it is a bijection of $[\![1,n]\!]$ onto itself, i.e. a permutation of $[\![1,n]\!]$. There remain only the terms $x_{\sigma(1),1}\cdots x_{\sigma(n),n}\,f(e_{\sigma(1)},\dots,e_{\sigma(n)})=\varepsilon(\sigma)\,x_{\sigma(1),1}\cdots x_{\sigma(n),n}\,f(e_1,\dots,e_n)$, where $\sigma$ runs through $S_n$. We have thus shown that necessarily
$$\forall(x_1,\dots,x_n)\in E^n,\qquad f(x_1,\dots,x_n)=f(e_1,\dots,e_n)\sum_{\sigma\in S_n}\varepsilon(\sigma)\,x_{\sigma(1),1}\cdots x_{\sigma(n),n}.$$
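(Aside, before the proof resumes: the sum over $S_n$ just displayed can be evaluated directly. Here is a minimal Python sketch, not part of the original summary and with function names of our own choosing; its $O(n\cdot n!)$ cost makes it useful only for small $n$.)

```python
import math
from itertools import permutations

def signature(perm):
    """Signature of a permutation given as a tuple of 0-based indices:
    (-1) raised to the number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_leibniz(A):
    """det A = sum over sigma in S_n of eps(sigma) * prod_j A[sigma(j)][j]."""
    n = len(A)
    return sum(signature(sigma) * math.prod(A[sigma[j]][j] for j in range(n))
               for sigma in permutations(range(n)))

print(det_leibniz([[1, 2], [3, 4]]))  # -2
```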
For $(x_1,\dots,x_n)\in E^n$, set $\varphi(x_1,\dots,x_n)=\sum_{\sigma\in S_n}\varepsilon(\sigma)\,x_{\sigma(1),1}\cdots x_{\sigma(n),n}$.
• $\varphi$ is an $n$-linear form on $E$, since it is linear with respect to each variable.
• $\varphi$ is nonzero, since $\varphi(e_1,\dots,e_n)=\sum_{\sigma\in S_n}\varepsilon(\sigma)\,\delta_{\sigma(1),1}\cdots\delta_{\sigma(n),n}=1$.
• $\varphi$ is alternating. Indeed, let $(x_1,\dots,x_n)\in E^n$ with $x_i=x_j$ for some pair $(i,j)$, $i\ne j$, and set $\tau=\tau_{i,j}$. If $A_n$ denotes the set of even permutations, then $A_n\tau$ is the set of odd permutations, and therefore
$$\varphi(x_1,\dots,x_n)=\sum_{\sigma\in A_n}x_{\sigma(1),1}\cdots x_{\sigma(i),i}\cdots x_{\sigma(j),j}\cdots x_{\sigma(n),n}-\sum_{\sigma\in A_n}x_{\sigma(1),1}\cdots x_{\sigma(j),i}\cdots x_{\sigma(i),j}\cdots x_{\sigma(n),n}.$$
Since $x_i=x_j$, we have $x_{\sigma(j),i}=x_{\sigma(j),j}$ and $x_{\sigma(i),j}=x_{\sigma(i),i}$, so the two sums are equal term by term and $\varphi(x_1,\dots,x_n)=0$.
Finally, $\Lambda_n^*(E)=\mathrm{Vect}(\varphi)$ with $\varphi\ne0$, so $\Lambda_n^*(E)$ is a $K$-vector space of dimension 1. We have seen that $\varphi(\mathcal{B})=1$ and that every $f\in\Lambda_n^*(E)$ satisfies $f=f(\mathcal{B})\,\varphi$; consequently, $f(\mathcal{B})=1\iff f=\varphi$.

2) Properties.
Theorem.
1) $\det_{\mathcal{B}}(\mathcal{B})=1$.
2) $\det_{\mathcal{B}'}=\det_{\mathcal{B}'}(\mathcal{B})\,\det_{\mathcal{B}}$.
3) $\det_{\mathcal{B}}(\mathcal{B}')\times\det_{\mathcal{B}'}(\mathcal{B})=1$.
4) $\det_{\mathcal{B}}(\mathcal{B}')\times\det_{\mathcal{B}'}(\mathcal{B}'')=\det_{\mathcal{B}}(\mathcal{B}'')$.
Proof. Apply $\forall f\in\Lambda_n^*(E),\ f=f(\mathcal{B})\det_{\mathcal{B}}$.

3) Applications.
a) Theorem. Let $\mathcal{B}$ be a basis of $E$ (of finite dimension $n\ge1$) and let $\mathcal{B}'$ be a family of $n$ vectors of $E$. Then $\mathcal{B}'$ is a basis of $E$ if and only if $\det_{\mathcal{B}}(\mathcal{B}')\ne0$.
Proof. If $\mathcal{B}'$ is a basis, then $\det_{\mathcal{B}}(\mathcal{B}')\times\det_{\mathcal{B}'}(\mathcal{B})=1$, so in particular $\det_{\mathcal{B}}(\mathcal{B}')\ne0$. If $\mathcal{B}'$ is not a basis, then since $\mathrm{card}(\mathcal{B}')=n$, the family $\mathcal{B}'$ is linearly dependent, so one of its vectors is a linear combination of the others. By $n$-linearity of $\det_{\mathcal{B}}$, and since $\det_{\mathcal{B}}$ is alternating, $\det_{\mathcal{B}}(\mathcal{B}')=0$.

b) Orientation. Let $\mathcal{B}$ and $\mathcal{B}'$ be two bases of $E\ne\{0\}$. Define the relation: "$\mathcal{B}'$ has the same orientation as $\mathcal{B}$ $\iff \det_{\mathcal{B}}(\mathcal{B}')>0$". This is an equivalence relation with exactly two classes. One of the two classes is arbitrarily called the class of direct bases, the other the class of indirect bases; the space $E$ is then oriented.

III - Determinant of a matrix. Determinant of an endomorphism

1) Determinant of an endomorphism.
a) Definition. (Remark: presenting 1) the determinant of an endomorphism before 2) the determinant of a matrix makes it easy to reach $\det(AB)=\det A\times\det B$.)
Let $f$ be a nonzero alternating $n$-linear form on $E$ (of finite nonzero dimension $n$) and let $u$ be an endomorphism of $E$. Define $f_u$ by $\forall(x_1,\dots,x_n)\in E^n,\ f_u(x_1,\dots,x_n)=f(u(x_1),\dots,u(x_n))$. Then $f_u$ is an alternating $n$-linear form on $E$, so there exists a scalar $\lambda$ such that $\forall(x_1,\dots,x_n)\in E^n,\ f_u(x_1,\dots,x_n)=\lambda f(x_1,\dots,x_n)$, i.e. $f(u(x_1),\dots,u(x_n))=\lambda f(x_1,\dots,x_n)$.
Let us show that $\lambda$ depends only on $u$ and not on $f$. Let $g$ be an alternating $n$-linear form on $E$. Since $\dim\Lambda_n^*(E)=1$, there is a scalar $\mu$ such that $g=\mu f$, so for $(x_1,\dots,x_n)\in E^n$,
$$g(u(x_1),\dots,u(x_n))=\mu f(u(x_1),\dots,u(x_n))=\lambda\mu f(x_1,\dots,x_n)=\lambda g(x_1,\dots,x_n).$$

Definition. Let $u\in\mathcal{L}(E)$. The determinant of $u$ is the scalar, written $\det(u)$, such that $\forall f\in\Lambda_n^*(E),\ \forall(x_1,\dots,x_n)\in E^n,\ f(u(x_1),\dots,u(x_n))=\det(u)\,f(x_1,\dots,x_n)$.

b) Properties.
Theorem. Let $u\in\mathcal{L}(E)$.
1) For every basis $\mathcal{B}$ of $E$ and all $(x_1,\dots,x_n)\in E^n$, $\det_{\mathcal{B}}(u(x_1),\dots,u(x_n))=(\det u)\,\det_{\mathcal{B}}(x_1,\dots,x_n)$.
2) For every basis $\mathcal{B}$ of $E$, $\det_{\mathcal{B}}(u(\mathcal{B}))=\det(u)$.
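A quick numerical illustration of the last two results (a sketch of our own, using numpy; nothing here is from the summary itself): if $P$ holds the coordinates of a family $\mathcal{B}'$ in the basis $\mathcal{B}$, then $\det_{\mathcal{B}}(\mathcal{B}')=\det P$, and $\det(u)$ is the same in every basis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))   # matrix of an endomorphism u in a basis B
P = rng.normal(size=(n, n))   # columns: coordinates of a family B' in B

# B' is a basis iff det_B(B') != 0 (generically true for a random P)
print(abs(np.linalg.det(P)) > 1e-12)

# det(u) = det_B(u(B)) does not depend on the basis:
# in the basis B', u has matrix P^{-1} A P, and the determinant is unchanged.
print(np.isclose(np.linalg.det(np.linalg.inv(P) @ A @ P), np.linalg.det(A)))
```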
Theorem.
1) $\det(\mathrm{Id})=1$.
2) $\forall(u,v)\in(\mathcal{L}(E))^2,\ \det(u\circ v)=(\det u)\times(\det v)$.
3) $\forall u\in\mathcal{L}(E)$, ($u\in GL(E)\iff\det u\ne0$), and in that case $\det(u^{-1})=(\det u)^{-1}$.
Proof. Let $\mathcal{B}$ be a basis of $E$.
1) $\det(\mathrm{Id})=\det_{\mathcal{B}}(\mathcal{B})=1$.
2) $\det(u\circ v)=\det_{\mathcal{B}}(u(v(\mathcal{B})))=(\det u)\det_{\mathcal{B}}(v(\mathcal{B}))=(\det u)(\det v)$.
3) If $u\in GL(E)$, then $(\det u)(\det(u^{-1}))=\det(u\circ u^{-1})=\det(\mathrm{Id})=1\ne0$. Conversely, if $\det u\ne0$, then $\det_{\mathcal{B}}(u(\mathcal{B}))=\det u\ne0$, so $u(\mathcal{B})$ is a basis of $E$ and $u\in GL(E)$.

Theorem. $\forall u\in\mathcal{L}(E),\ \forall\lambda\in K,\ \det(\lambda u)=\lambda^n(\det u)$.
Warning. In general, $\det(u+v)\ne\det u+\det v$.

2) Determinant of a square matrix.
a) Definition. If $A=(a_{i,j})_{1\le i,j\le n}\in M_n(K)$, the determinant of $A$ is the number
$$\det A=\sum_{\sigma\in S_n}\varepsilon(\sigma)\,a_{\sigma(1),1}\cdots a_{\sigma(n),n}.$$
Notation. $\det A=|a_{i,j}|_{1\le i,j\le n}$.

b) Properties.
Theorem. Let $u\in\mathcal{L}(E)$ with matrix $A$ in a given basis $\mathcal{B}$ of $E$. Then $\det(u)=\det(A)$.
Theorem. Two similar matrices have the same determinant.
Theorem.
1) The map $\det:(M_n(K),\times)\to(K,\times)$, $A\mapsto\det A$, is a morphism, and the map $\det:(GL_n(K),\times)\to(K^*,\times)$, $A\mapsto\det A$, is a group morphism.
2) $\forall(A,B)\in(M_n(K))^2,\ \det(AB)=(\det A)(\det B)$.
3) $\det(I_n)=1$.
4) $\forall A\in M_n(K)$, $[A\in GL_n(K)\iff\det A\ne0]$, and in that case $\det(A^{-1})=(\det A)^{-1}$.
The kernel of the group morphism $\det:(GL_n(K),\times)\to(K^*,\times)$, that is, the set of square matrices of determinant 1, is a subgroup of $(GL_n(K),\times)$ written $SL_n(K)$ (the special linear group).

Theorem. $\det A=\det({}^tA)$.
Proof. $\det({}^tA)=\sum_{\sigma\in S_n}\varepsilon(\sigma)\,a_{1,\sigma(1)}\cdots a_{n,\sigma(n)}$. Fix $\sigma\in S_n$ and set $i_1=\sigma(1),\dots,i_n=\sigma(n)$, so that $\sigma^{-1}(i_1)=1,\dots,\sigma^{-1}(i_n)=n$. The monomial $a_{1,\sigma(1)}\cdots a_{n,\sigma(n)}$ can be rewritten $a_{\sigma^{-1}(i_1),i_1}\cdots a_{\sigma^{-1}(i_n),i_n}$, i.e. $a_{\sigma^{-1}(1),1}\cdots a_{\sigma^{-1}(n),n}$ after reordering the $n$ factors. Hence
$$\det({}^tA)=\sum_{\sigma\in S_n}\varepsilon(\sigma)\,a_{\sigma^{-1}(1),1}\cdots a_{\sigma^{-1}(n),n}=\sum_{\sigma\in S_n}\varepsilon(\sigma^{-1})\,a_{\sigma^{-1}(1),1}\cdots a_{\sigma^{-1}(n),n}=\sum_{\sigma'\in S_n}\varepsilon(\sigma')\,a_{\sigma'(1),1}\cdots a_{\sigma'(n),n}=\det(A),$$
because $\varepsilon(\sigma^{-1})=\varepsilon(\sigma)$ and $\sigma\mapsto\sigma^{-1}$ is a permutation of $S_n$ (it is an involution of $S_n$).

IV - Computing determinants

1) Transposition. $\det A=\det({}^tA)$, so every rule on columns is also valid on rows.
2) Triangular matrices. The determinant of a triangular matrix equals the product of its diagonal entries (shown directly from the definition).
3) Elementary operations.
a) $\forall\sigma\in S_n,\ \det(C_{\sigma(1)},\dots,C_{\sigma(n)})=\varepsilon(\sigma)\det(C_1,\dots,C_n)$: when columns are permuted, the determinant is multiplied by the signature of the permutation.
b) Adding to a column a linear combination of the other columns leaves the determinant unchanged.
c) $\det$ is $n$-linear, so $\det(C_1,\dots,C_i+C_i',\dots,C_n)=\det(C_1,\dots,C_i,\dots,C_n)+\det(C_1,\dots,C_i',\dots,C_n)$ and $\det(C_1,\dots,\lambda C_i,\dots,C_n)=\lambda\det(C_1,\dots,C_n)$.
Warning. $\det(A+B)\ne\det A+\det B$ in general, and $\det(\lambda A)=\det(\lambda C_1,\dots,\lambda C_n)=\lambda^n\det(C_1,\dots,C_n)=\lambda^n\det A$.
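These rules are easy to check numerically; a small sketch of our own with numpy (not part of the summary):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.normal(size=(5, 5)), rng.normal(size=(5, 5))
lam = 3.0

print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))
print(np.isclose(np.linalg.det(A.T), np.linalg.det(A)))
print(np.isclose(np.linalg.det(lam * A), lam**5 * np.linalg.det(A)))
# Note: det(A + B) generally differs from det(A) + det(B).
```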
4) Computation by blocks.
Theorem. If the $A_i$ are square matrices,
$$\det\begin{pmatrix} A_1 & \times & \cdots & \times\\ 0 & \ddots & \ddots & \vdots\\ \vdots & \ddots & \ddots & \times\\ 0 & \cdots & 0 & A_p \end{pmatrix}=(\det A_1)(\det A_2)\cdots(\det A_p).$$
Proof. It suffices to treat $p=2$ and then finish by induction. We show that $\det\begin{pmatrix} A & C\\ 0 & B\end{pmatrix}=(\det A)(\det B)$, where $A\in M_n(K)$ and $B\in M_p(K)$. Let $\varphi:M_n(K)\to K$, $X\mapsto\det\begin{pmatrix} X & C\\ 0 & B\end{pmatrix}$. Then $\varphi$ "is" an alternating $n$-linear function of the columns of $X$. Hence there exists a constant $k$, independent of $X$, such that $\forall X\in M_n(K),\ \varphi(X)=k\,\det X$, with $k=k\det(I_n)=\varphi(I_n)$. This shows that
$$\forall X\in M_n(K),\qquad \det\begin{pmatrix} X & C\\ 0 & B\end{pmatrix}=\det X\times\det\begin{pmatrix} I_n & C\\ 0 & B\end{pmatrix}.$$
Similarly, the map $\psi:M_p(K)\to K$, $Y\mapsto\det\begin{pmatrix} I_n & C\\ 0 & Y\end{pmatrix}$, "is" an alternating $p$-linear function of the rows of $Y$, so there exists a constant $k'$, independent of $Y$, such that $\forall Y\in M_p(K),\ \psi(Y)=k'\det Y$, with $k'=k'\det I_p=\psi(I_p)$. This shows that
$$\forall X\in M_n(K),\ \forall Y\in M_p(K),\qquad \det\begin{pmatrix} X & C\\ 0 & Y\end{pmatrix}=\det X\times\det Y\times\det\begin{pmatrix} I_n & C\\ 0 & I_p\end{pmatrix}.$$
Finally, the determinant of a triangular matrix equals the product of its diagonal entries, so $\det\begin{pmatrix} I_n & C\\ 0 & I_p\end{pmatrix}=1$.

5) Expansion along a row or a column.
Theorem. Let $m_{i,j}$ be the minor of $a_{i,j}$ and $A_{i,j}=(-1)^{i+j}m_{i,j}$ the cofactor of $a_{i,j}$. Then, for all $(i,j)\in[\![1,n]\!]^2$,
$$\det A=\sum_{k=1}^n a_{i,k}A_{i,k}\ \text{(expansion along row } i\text{)}=\sum_{k=1}^n a_{k,j}A_{k,j}\ \text{(expansion along column } j\text{)}.$$
Proof. It suffices to prove the column-expansion formula, since $\det A=\det({}^tA)$. Moreover, it suffices to prove the expansion along the first column: to expand along column $j$, perform the column permutation $C_j\to C_1\to C_2\to\dots\to C_{j-1}$, whose signature is $(-1)^{j-1}$ (the signature of a cycle of length $j$), then expand along the first column, which gives
$$\det A=(-1)^{j-1}\sum_{k=1}^n a_{k,j}(-1)^{k+1}m_{k,j}=\sum_{k=1}^n a_{k,j}A_{k,j}.$$
It remains to prove the expansion along the first column. $C_1$ is the sum of $n$ columns of the form ${}^t(0\ \dots\ 0\ a_{i,1}\ 0\ \dots\ 0)$, and by $n$-linearity of the determinant, $\det A=\sum_{i=1}^n\det A_i$, where $A_i$ is obtained from $A$ by replacing its first column by the column whose only nonzero entry is $a_{i,1}$, in row $i$. If $i=1$, a computation by blocks gives $\det A_1=a_{1,1}A_{1,1}$. If $i\ge2$, move $L_i$ to $L_1$, $L_1$ to $L_2$, ..., $L_{i-1}$ to $L_i$; this cycle of length $i$ multiplies the determinant by $(-1)^{i-1}$, and a computation by blocks then gives $\det A_i=(-1)^{i-1}a_{i,1}m_{i,1}=a_{i,1}A_{i,1}$. Finally, $\det A=\sum_{i=1}^n a_{i,1}A_{i,1}$.

V - Comatrix. Inverse of a matrix

The comatrix of a square matrix $A$ of size $n$, written $\mathrm{com}\,A$, is the matrix whose entry in row $i$, column $j$, is the cofactor of the entry $a_{i,j}$ of $A$: if $A=(a_{i,j})_{1\le i,j\le n}$, then $\mathrm{com}(A)=(A_{i,j})_{1\le i,j\le n}$. The transpose of the comatrix of $A$ is called the complementary (adjugate) matrix of $A$ and is written $\tilde A$.

Theorem.
1) $\forall A\in M_n(K),\ A\tilde A=\tilde A A=(\det A)I_n$, i.e. $A({}^t\mathrm{com}\,A)=({}^t\mathrm{com}\,A)A=(\det A)I_n$.
2) $\forall A\in M_n(K)$, ($A\in GL_n(K)\iff\det A\ne0$), and in that case $A^{-1}=\dfrac{1}{\det A}\,{}^t\mathrm{com}\,A$.

Proof. The entry in row $i$, column $j$, of $A\,{}^t\mathrm{com}\,A$ is $\sum_{k=1}^n a_{i,k}A_{j,k}$.
• If $i=j$, this is exactly the expansion of $\det A$ along its $i$-th row, hence equals $\det A$.
• If $i\ne j$, $\sum_{k=1}^n a_{i,k}A_{j,k}$ is the expansion along row $j$ of the determinant obtained from $\det A$ by replacing row $j$ with row $i$ (row $i$ itself being unchanged). This expression is therefore zero, being a determinant with two identical rows.
Next, from the definition of the comatrix it is clear that $\mathrm{com}({}^tA)={}^t(\mathrm{com}\,A)$, and therefore
$$\tilde A A=({}^t\mathrm{com}\,A)A={}^t\big({}^tA\;{}^t\mathrm{com}({}^tA)\big)={}^t\big((\det{}^tA)\,I_n\big)=(\det A)\,I_n,$$
using the first identity applied to ${}^tA$.

© Jean-Louis Rouget, 2009. All rights reserved. http://www.maths-france.fr
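To close the summary, here is a small numpy check (ours, not from the original) of the adjugate identity and of the inverse formula; the helper name `comatrix` is our own.

```python
import numpy as np

def comatrix(A):
    """Matrix of cofactors: com(A)[i, j] = (-1)**(i+j) * minor(i, j)."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

A = np.array([[2., 1., 0.], [0., 3., 1.], [4., 0., 1.]])
adj = comatrix(A).T                # transpose of the comatrix
d = np.linalg.det(A)
print(np.allclose(A @ adj, d * np.eye(3)))      # A . t(com A) = det(A) I
print(np.allclose(np.linalg.inv(A), adj / d))   # inverse formula
```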

Pattern Recognition 40 (2007) 3277-3291, www.elsevier.com/locate/pr, doi:10.1016/j.patcog.2007.02.015

Significant edges in the case of non-stationary Gaussian noise

I. Abraham (a), R. Abraham (b), A. Desolneux (c,*), S. Li-Thiao-Té (d)

a CEA/DIF, 91680 Bruyères le Chatel, France
b Laboratoire MAPMO, Fédération Denis Poisson, Université d'Orléans, B.P. 6759, 45067 Orléans cedex 2, France
c Laboratoire MAP5, Université René Descartes, 45 rue des Saints-Pères, 75270 Paris cedex 06, France
d Laboratoire CMLA, ENS Cachan, 61 avenue du Président Wilson, 94235 Cachan cedex, France

* Corresponding author. Tel.: +33 1 44 55 35 26. E-mail addresses: isabelle.abraham@cea.fr (I. Abraham), romain.abraham@univ-orleans.fr (R. Abraham), agnes.desolneux@math-info.univ-paris5.fr (A. Desolneux), lithiaote@cmla.ens-cachan.fr (S. Li-Thiao-Té).

Received 8 June 2006; received in revised form 27 February 2007; accepted 28 February 2007

Abstract

In this paper, we propose an edge detection technique based on some local smoothing of the image followed by a statistical hypothesis test on the gradient. An edge point being defined as a zero-crossing of the Laplacian, it is said to be a significant edge point if the gradient at this point is larger than a threshold $s(\alpha)$ defined by: if the image $I$ is pure noise, then the probability of $\|\nabla I(x)\|\ge s(\alpha)$ conditionally on $\Delta I(x)=0$ is less than $\alpha$. In other words, a significant edge is an edge which has a very low probability of being there because of noise. We will show that the threshold $s(\alpha)$ can be explicitly computed in the case of a stationary Gaussian noise. In the images we are interested in, which are obtained by tomographic reconstruction from a radiograph, this method fails since the Gaussian noise is no longer stationary. Nevertheless, we are still able to give the law of the gradient conditionally on the zero-crossing of the Laplacian, and thus compute the threshold $s(\alpha)$. We end this paper with some experiments and compare the results with those obtained with other edge detection methods.
© 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Keywords: Edge detection; Significant edges; Inverse problem; Statistical hypothesis testing

1. Introduction

This work stems from the analysis of physical experiments where a test object is submitted to a shock. More precisely, we surround the object with explosives and we monitor the shape of the hull as it first collapses onto itself and then expands. The shape of the inner hole(s) at the very beginning of the re-expansion phase is of particular interest. We use radiography to determine those shapes. Standard tomography cannot be carried out because the inverse Radon transform requires many radiographs (taken from different angles), whereas only one radiograph can be acquired in the lapse of the expansion phase. That is why the hulls are supposed to be radially symmetric at the beginning of the experiment and during the explosion. In that case, a single radiograph is enough to perform the tomographic reconstruction (see [1,2], and Section 4.1). Thus, from now on, we will only consider radially symmetric objects. To describe such an object, it is sufficient to provide the densities of the materials on a slice of the object that contains the symmetry axis. An example of a studied object is given in Fig. 1. The first step of the physical experiment is to take a radiograph of this object
(see Fig. 2(a)); then a tomographic reconstruction is computed (Fig. 2(b)), and finally an edge detection is performed (Fig. 2(c)). As we can see in Fig. 2(c), many detected edges do not correspond to real features. This is due to the high level of noise. Our goal is to perform an automated selection of the "true" edges. For that purpose, the edge detector will not be changed, but we will also compute other significant features that will allow us to select the "true" edges. Let us stress that our goal is not to develop a new edge detector, but to give a method for selecting edges from the map given by an edge detector. We focus here on an edge detector which is certainly not optimal, but for which the computations are easy. However, we think that our method can be applied to other edge detectors, such as the Canny detector [3]. In his paper, Canny points out that "it is very difficult to set a threshold so that there is small probability of marking noise edges while retaining high sensitivity." The solution he proposes to solve this problem is the use of hysteresis. Here, we will use the zero-crossings of the Laplacian to define edge points. We will see that in this case, we can compute a threshold so that there is a small probability of selecting edges due to noise.

The ideas used here mainly come from previous work of Desolneux et al. [4]. Informally speaking, they define the notion of significant edges by computing the probability that some edge-related events appear in an image of pure noise. When this probability is small enough, the edge is probably a feature of the image and not due to noise. Unfortunately, that method assumes that the noise is stationary (i.e., the law of the noise does not depend on the location in the image), which, as is easily seen in Fig. 2(b), is not the case in our study because of the tomographic inversion (see Section 4.5.1 for some examples of results obtained with their method). Moreover, their study is quite general and, apart from the stationarity, no assumption is made on the noise. In our case, as we deal with specific images, the noise is well known and some statistical models can be used. Indeed, we may suppose that the noise on the radiograph (Fig. 2(a)) is a Gaussian white noise with mean zero and with a variance that can easily be estimated. After tomographic inversion, we still obtain Gaussian noise, because that operation is linear, but the noise process is correlated and non-stationary.

Fig. 1. Slice of a studied object.
Fig. 2. (a) Radiograph of the object of Fig. 1. (b) Tomographic reconstruction. (c) Edge detection on the reconstructed image.

The edge detector we will use here consists in estimating the Laplacian at each point; edge points are then defined as the zero-crossings of the Laplacian. As already said, we will compute additional features that discriminate the significant edges. The easiest feature to compute is a kind of contrast measurement $C$ based on a gradient estimate. To be more precise, we consider an image $I$ of pure noise (that is, a realization of our model of noise after tomographic reconstruction), estimate the gradient and the Laplacian of $I$ at a point $(u,v)$ (with an abuse of notation, we will denote by $\nabla I(u,v)$ and $\Delta I(u,v)$ these estimates, and by $C(u,v)$ the contrast value), and we compute, for a fixed $\alpha>0$, the smallest value $s(\alpha)$ such that
$$P\big(C(u,v)\ge s(\alpha)\,\big|\,\Delta I(u,v)=0\big)\le\alpha,\qquad(1)$$
where $P(B|A)$ denotes the conditional probability of $B$ given the event $A$.
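The decision rule this threshold induces (spelled out in the next paragraph) can be sketched in a few lines of Python. This is a hypothetical illustration of ours, not the paper's implementation: the array names and the 4-neighbor sign-change test for zero-crossings are our own choices.

```python
import numpy as np

def significant_edges(lap, contrast, threshold):
    """Keep points that are zero-crossings of the estimated Laplacian
    and whose contrast exceeds the threshold s(alpha).
    lap, contrast: 2-D arrays of estimates; threshold: scalar, or an
    array in the non-stationary case (Section 4)."""
    sign = np.sign(lap)
    zc = np.zeros(lap.shape, dtype=bool)
    # sign change with the neighbor below or to the right
    zc[:-1, :] |= sign[:-1, :] * sign[1:, :] < 0
    zc[:, :-1] |= sign[:, :-1] * sign[:, 1:] < 0
    return zc & (contrast >= threshold)
```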
Then, we perform an edge detection on an image $f$ of interest (where we also estimate $\nabla f$ and $\Delta f$ by the same method) and we keep the points $(u,v)$ of the studied image $f$ that satisfy:
• $\Delta f(u,v)=0$ (an edge is present at point $(u,v)$);
• $C(u,v)\ge s(\alpha)$ (this edge is significant).
From a statistical point of view, this consists in performing a hypothesis test. We consider a point $(u,v)$ where an edge takes place ($\Delta f(u,v)=0$) and we test the null hypothesis "the zero-crossing of the Laplacian is due to the noise." The level of the test $\alpha$ is arbitrarily chosen and related to the number of false detections allowed. It will be set to $\alpha=10^{-5}$ hereafter (in a $512\times512$ image, this corresponds to $512\times512\times10^{-5}\approx2.6$ false detections on average). Let us mention that the threshold value $s(\alpha)$ varies slowly with respect to $\alpha$. For instance, in the case of a white noise (see Section 3), the threshold value can be computed explicitly and is proportional to $\sqrt{-\ln\alpha}$. This implies that when we increase the specificity (fewer false detections), the threshold increases slowly and the method retains much of its sensitivity. When the null hypothesis is rejected, the edge is retained, as it comes from a "true" feature of the image, whereas when the null hypothesis is accepted, the zero-crossing of the Laplacian may come from the noise and the edge is not significant.

Let us mention at this point that such statistical approaches have already been used for edge detection in [5-7]. In [8], Leclerc and Zucker develop a technique to capture discontinuities, defined (in one dimension) as points such that the estimated left- and right-hand limits are statistically significantly different. The main difficulty in their approach is that they need good estimates of the parameters of the fitted curves (generally polynomials) on half-neighborhoods of points in order to apply a statistical test. To solve this problem, they propose a procedure to eliminate points where the estimates are not good, namely points whose half-neighborhoods overlap (unknown) discontinuities. Other methods usually employ estimates of the gradient based on finite differences, which fail in our case because they are not robust enough to high levels of noise. Moreover, the noise is in most cases assumed stationary. Let us also cite Ref. [9], where the authors modified the method of Ref. [4] to take into account the non-stationarity of some images, by a local noise estimate. Their work is still general and does not make any assumption on the noise structure. As we deal with specific experiments, the noise is always the same and well known, and we can take proper advantage of this knowledge. The problem of finding an optimal threshold is also a key point for image thresholding methods [10], and particularly for the binarization of document images [11]. In some cases, the thresholding is not stationary: this happens, for instance, in images with non-uniform illumination. Our framework here is different, since we deal with a well-known noise, and the question we address is: given an edge detection method, what is the probability that an edge point is due to noise?

The paper is organized as follows: in Section 2, we present the edge detector based on the estimates of the gradient and of the Laplacian. Then, in Section 3, our method is presented in the case of a Gaussian white noise.
Of course, this does not correspond to our case, but the computations are easier and demonstrate the performance of this framework. In Section 4.1, we first describe the tomographic inversion and the operators involved, and then describe the noise model we have to deal with. We then apply the significant edges detection method in the framework of this non-stationary noise. We end the section with some experiments and comparisons with other methods.

2. Estimating the gradient and the Laplacian

In this section, we introduce the method for edge detection. We consider that an image is a real-valued function $(u,v)\mapsto f(u,v)$ of two continuous real parameters $u$ and $v$. Then, we say that there exists an edge at point $(u,v)$ if the Laplacian of $f$ is zero at this point. Moreover, the computation of the contrast function $C$ will be based on the gradient of $f$ (see the end of this section for the choice of this function). As the images are often very noisy, these derivatives cannot be estimated by usual finite differences. The method used here, sometimes known as Savitzky-Golay smoothing [12], consists in locally approximating the image by a polynomial. The derivatives of the polynomial are then identified with those of the image.

2.1. An optimization problem

Let $(u,v)$ denote the point where we want to compute the first- and second-order derivatives of the image $f$. We choose two parameters: $d$, the maximum degree of the approximating polynomial, and $r$, the radius of the ball on which we perform the approximation. We denote by $B_r(u,v)$ the ball of radius $r$ centered at point $(u,v)$, and simply write $B_r$ when the center of the ball is the origin $(0,0)$ of $\mathbb{R}^2$. We are then looking for a polynomial $P$ of degree at most $d$ such that
$$E(P)=\int_{B_r}\big(f(u+x,v+y)-P(x,y)\big)^2\,dx\,dy\qquad(2)$$
is minimal among all polynomials of degree at most $d$. In other words, we are looking for the best approximation of $f$ by a polynomial of degree at most $d$ on the ball $B_r(u,v)$ in the sense of the $L^2$-norm. This is an optimization problem whose unknowns are the coefficients of the polynomial. As the problem is convex, there is a unique solution (given by the orthogonal projection of $f$ on the space of polynomials of degree at most $d$), which is easily computed by solving the set of equations $\partial E/\partial a_i=0$, where the $a_i$'s denote the coefficients of the polynomial.

Role of the ball radius: Two parameters are arbitrarily chosen in this method. The first one is the ball radius $r$. The larger $r$ is, the more effective the smoothing. The influence of the noise is therefore attenuated with a large $r$, but the location of the edge is then less precise. We must consequently strike a balance between noise smoothing and edge detection accuracy. For instance, if we have a small level of noise, or if the edges are very complicated (with high curvature), we must choose a small value for $r$.

Role of the polynomial degree: The second parameter is the polynomial degree. Here again, a large value of $d$ gives a better approximation but does not smooth the noise enough. In fact, as we are, in a first step, interested in the points where the Laplacian is zero, it appears that a second-order polynomial is enough. Of course, the estimate of the first-order derivatives with a polynomial of degree 2 is not very good and highly depends on the size of the window $B_r$. But we will see that this drawback can be useful for the choice of a contrast function.
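As a concrete rendering of this least-squares fit in a discrete setting, here is a Python sketch of ours (not the paper's code; the names and axis conventions are our own assumptions). It fits a degree-2 polynomial on a discrete ball and reads off the derivatives at the center, anticipating the estimates of Section 2.2.

```python
import numpy as np

def fit_quadratic_patch(f, u, v, r):
    """Least-squares fit of P(x, y) = a0 + ax*x + ay*y + axy*x*y
    + axx*x**2 + ayy*y**2 on the discrete ball of radius r centered at
    (u, v) in the image f. Returns the estimated gradient (ax, ay) and
    Laplacian 2*(axx + ayy) at the center of the patch."""
    ii, jj = np.mgrid[-r:r + 1, -r:r + 1]
    mask = ii**2 + jj**2 <= r**2
    x, y = ii[mask], jj[mask]
    z = f[u + x, v + y]                 # image samples on the patch
    V = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    coef, *_ = np.linalg.lstsq(V, z, rcond=None)
    a0, ax, ay, axy, axx, ayy = coef
    return (ax, ay), 2.0 * (axx + ayy)

img = np.zeros((64, 64)); img[:, 32:] = 1.0     # a vertical step edge
print(fit_quadratic_patch(img, 32, 32, r=6))    # gradient across the step
```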
Choice of the $L^2$-norm: Approximating a function in an $L^2$ sense, although quite usual, is not always accurate, as some oscillations may appear or locally bad approximations are allowed (this usually occurs near the boundary of the domain). However, as we only look for polynomials of degree 2, the oscillations do not appear, and the $L^2$-norm gives a good enough approximation together with easy computations. In what follows, the approximation is made with a polynomial of degree $d=2$, and the first- and second-order derivatives of the image are identified with those of the approximating polynomial.

2.2. Computations with a second-order polynomial

Let us first introduce some notation. In the following, we set, for $i,j\in\mathbb{N}$,
$$b_{ij}(r)=\int_{B_r}x^i y^j\,dx\,dy.$$
As the ball $B_r$ is symmetric, we have $b_{ij}(r)=0$ if $i$ or $j$ is odd, and $b_{ij}(r)=b_{ji}(r)$ for all $i,j$. In order to have simple expressions, we also set
$$b(r)=b_{20}(r),\qquad \beta(r)=-\frac{2b_{20}(r)}{b_{00}(r)},\qquad \gamma(r)=\frac12\left(b_{40}(r)+b_{22}(r)-\frac{2b_{20}^2(r)}{b_{00}(r)}\right).$$

Lemma 1. The gradient and the Laplacian of the polynomial $P$ of degree 2 which is the best approximation of $f$ on the ball $B_r(u,v)$ for the $L^2$-norm, denoted respectively by $\nabla_r f(u,v)=\big((\partial_r f/\partial x)(u,v),(\partial_r f/\partial y)(u,v)\big)$ and $\Delta_r f(u,v)$, are given by
$$\frac{\partial_r f}{\partial x}(u,v)=\frac{1}{b(r)}\int_{B_r}x\,f(u+x,v+y)\,dx\,dy,\qquad \frac{\partial_r f}{\partial y}(u,v)=\frac{1}{b(r)}\int_{B_r}y\,f(u+x,v+y)\,dx\,dy,$$
$$\Delta_r f(u,v)=\frac{1}{\gamma(r)}\int_{B_r}f(u+x,v+y)\,\big(\beta(r)+x^2+y^2\big)\,dx\,dy.$$

Proof. We consider a polynomial of degree 2, which we write as $P(x,y)=a_{xx}x^2+a_{yy}y^2+a_{xy}xy+a_xx+a_yy+a_0$. The equations obtained by writing $\nabla E(P)=0$, where $E(P)$ is given by Eq. (2), are
$$\begin{cases} b_{40}(r)a_{xx}+b_{22}(r)a_{yy}+b_{20}(r)a_0=\int_{B_r}x^2 f(u+x,v+y)\,dx\,dy,\\ b_{22}(r)a_{xx}+b_{40}(r)a_{yy}+b_{20}(r)a_0=\int_{B_r}y^2 f(u+x,v+y)\,dx\,dy,\\ b_{22}(r)a_{xy}=\int_{B_r}xy\,f(u+x,v+y)\,dx\,dy,\\ b_{20}(r)a_x=\int_{B_r}x\,f(u+x,v+y)\,dx\,dy,\\ b_{20}(r)a_y=\int_{B_r}y\,f(u+x,v+y)\,dx\,dy,\\ b_{20}(r)a_{xx}+b_{20}(r)a_{yy}+b_{00}(r)a_0=\int_{B_r}f(u+x,v+y)\,dx\,dy. \end{cases}$$
We then obtain the following estimates for the derivatives:
$$\frac{\partial P}{\partial x}(0,0)=a_x=\frac{1}{b_{20}(r)}\int_{B_r}x\,f(u+x,v+y)\,dx\,dy,\qquad \frac{\partial P}{\partial y}(0,0)=a_y=\frac{1}{b_{20}(r)}\int_{B_r}y\,f(u+x,v+y)\,dx\,dy,$$
$$\Delta P(0,0)=2(a_{xx}+a_{yy})=\frac{2}{b_{40}+b_{22}-2b_{20}^2/b_{00}}\int_{B_r}f(u+x,v+y)\left(-\frac{2b_{20}}{b_{00}}+x^2+y^2\right)dx\,dy.\qquad\square$$

Fig. 3. Object with an inhomogeneous material.
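In practice, the Lemma 1 estimates are correlations of the image with fixed kernels. Here is an assumed discrete version of the Laplacian kernel built from the moments $b_{ij}$ (our sketch, not the paper's code; the discrete ball replaces the integrals by sums):

```python
import numpy as np

def laplacian_kernel(r):
    """Kernel whose correlation with the image gives the estimated
    Laplacian on the discrete ball of radius r (discrete Lemma 1):
    (1/gamma) * (beta + x**2 + y**2) on the ball, with
    b_ij = sum over the ball of x**i * y**j."""
    ii, jj = np.mgrid[-r:r + 1, -r:r + 1]
    mask = (ii**2 + jj**2 <= r**2).astype(float)
    b00 = mask.sum()
    b20 = (ii**2 * mask).sum()
    b40 = (ii**4 * mask).sum()
    b22 = (ii**2 * jj**2 * mask).sum()
    beta = -2.0 * b20 / b00
    gamma = 0.5 * (b40 + b22 - 2.0 * b20**2 / b00)
    return mask * (beta + ii**2 + jj**2) / gamma

K = laplacian_kernel(6)
ii, jj = np.mgrid[-6:7, -6:7]
print(np.sum(K * ii**2))   # 2.0: the kernel recovers the Laplacian of x**2
```

By the moment identities above, the kernel annihilates constants and planes and maps $x^2$ exactly to 2, which is why the check prints 2.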
Indeed, the estimate of the gradient in the case of the constant slope of Fig. 4 does not depend on the size of the window, whereas, when an edge (a discontinuity) occurs, the estimate does depend on that radius (see Fig. 5). So we can use as a contrast function
$$C_2(u,v)=\big\|\nabla_{r_1}f(u,v)-\nabla_{r_2}f(u,v)\big\|,$$
where $r_1<r_2$ and $\nabla_r f$ denotes the gradient estimate computed with a ball of radius $r$.

Fig. 4. A noisy constant slope: the gradient of the approximating polynomial does not depend on the radius $r$.
Fig. 5. An edge on a noisy slope: the approximating polynomials with two different values of the radius $r$.

3. Significant edges in the case of a Gaussian white noise

3.1. White noise and Wiener integral

We recall here the definition and the main properties of a white noise in a continuous setting and of the Wiener integral. We refer to Refs. [13-15] for more on white noise and the Wiener integral.

Definition 1. A Gaussian white noise on $\mathbb{R}^2$ of variance $\sigma^2$ is a random function $W$ defined on the Borel sets $A$ of $\mathbb{R}^2$ of finite Lebesgue measure (denoted by $|A|$) such that
• $W(A)$ is a Gaussian random variable (r.v.) with mean 0 and variance $\sigma^2|A|$;
• if $A_1\cap A_2=\emptyset$, the r.v. $W(A_1)$ and $W(A_2)$ are independent and $W(A_1\cup A_2)=W(A_1)+W(A_2)$.

Such a function $W$ exists but is not a true measure, since the two-parameter process $B(s,t):=W\big((0,s]\times(0,t]\big)$ (with the convention that $(0,x]$ is $[x,0)$ when $x$ is negative) is of unbounded total variation. This process is usually called the Brownian sheet. Nevertheless, we can define the so-called Wiener integral $\int f\,dW$ for every function $f$ in $L^2(\mathbb{R}^2)$. We can also define the derivatives of the Brownian sheet in the sense of Schwartz distributions (although the Brownian sheet is nowhere differentiable). Thus, we define
$$\dot B(s,t)=\frac{\partial^2 B(s,t)}{\partial s\,\partial t}$$
and we have $\int f\,dW=\int_{\mathbb{R}^2}f(u,v)\dot B(u,v)\,du\,dv$ a.s. for every function $f$ in the Schwartz space. With a slight abuse of notation, we call $\dot B$ a Gaussian white noise and we always denote by $\int_{\mathbb{R}^2}f(u,v)\dot B(u,v)\,du\,dv$ the Wiener integral with respect to this white noise, for every function $f\in L^2$. The main properties of this integral are:
• For every $f$, the r.v. $\int_{\mathbb{R}^2}f(u,v)\dot B(u,v)\,du\,dv$ is a Gaussian r.v. with mean 0 and variance $\sigma^2\int_{\mathbb{R}^2}f(u,v)^2\,du\,dv$.
• For every $f,g$, the random vector $\big(\int_{\mathbb{R}^2}f\dot B\,du\,dv,\ \int_{\mathbb{R}^2}g\dot B\,du\,dv\big)$ is Gaussian with cross-covariance $\sigma^2\int_{\mathbb{R}^2}f(u,v)g(u,v)\,du\,dv$.
We will use these properties to compute the laws of $\nabla I$ and $\Delta I$.

3.2. Laws of the gradient and of the Laplacian

We suppose here that our noise is a Gaussian white noise of variance $\sigma^2$. As we have already said, this case is not the one we are interested in, and our method is probably outperformed by other standard methods in that case. The goal of this section is to present our method in a simple case where the computations are easy and can be carried out in a continuous setting. We only focus here on the case of piecewise constant objects, and therefore we use the contrast function $C_1$.

Lemma 2. If the image $I$ is a Gaussian white noise of variance $\sigma^2$, then $\left(\dfrac{\partial_r I}{\partial x},\dfrac{\partial_r I}{\partial y},\Delta_r I\right)$ is a Gaussian random vector with mean zero and covariance matrix
$$\begin{pmatrix}\sigma^2/b(r) & 0 & 0\\ 0 & \sigma^2/b(r) & 0\\ 0 & 0 & V(r,\sigma)\end{pmatrix},\qquad\text{where } V(r,\sigma)=\frac{\sigma^2}{\gamma^2(r)}\int_{B_r}\big(\beta(r)+x^2+y^2\big)^2\,dx\,dy.$$

Proof. We compute the laws of the approximate derivatives of $I$ when $I=\dot B$.
We recall that these derivatives are given by
$$\frac{\partial_r I}{\partial x}(u,v)=\frac{1}{b(r)}\int_{B_r}x\,\dot B(u+x,v+y)\,dx\,dy,\qquad \frac{\partial_r I}{\partial y}(u,v)=\frac{1}{b(r)}\int_{B_r}y\,\dot B(u+x,v+y)\,dx\,dy,$$
$$\Delta_r I(u,v)=\frac{1}{\gamma(r)}\int_{B_r}\dot B(u+x,v+y)\,\big(\beta(r)+x^2+y^2\big)\,dx\,dy.$$
Because of the stationarity of $\dot B$, they have the same law as the corresponding quantities at $(0,0)$:
$$\frac{\partial_r I}{\partial x}(0,0)=\frac{1}{b(r)}\int_{B_r}x\,\dot B(x,y)\,dx\,dy,\qquad \frac{\partial_r I}{\partial y}(0,0)=\frac{1}{b(r)}\int_{B_r}y\,\dot B(x,y)\,dx\,dy,$$
$$\Delta_r I(0,0)=\frac{1}{\gamma(r)}\int_{B_r}\dot B(x,y)\,\big(\beta(r)+x^2+y^2\big)\,dx\,dy.$$
As we deal with Wiener integrals, we deduce that the random vector $\left(\dfrac{\partial_r I}{\partial x},\dfrac{\partial_r I}{\partial y},\Delta_r I\right)$ is a Gaussian random vector with mean zero. To compute its covariance matrix, recall that if $X$ and $Y$ are random variables defined by $X=\int_{B_r}h_1(x,y)\dot B(x,y)\,dx\,dy$ and $Y=\int_{B_r}h_2(x,y)\dot B(x,y)\,dx\,dy$, then $\mathrm{Cov}(X,Y)=\sigma^2\int_{B_r}h_1(x,y)h_2(x,y)\,dx\,dy$. Consequently, we have for instance
$$\mathrm{Cov}\left(\frac{\partial_r I}{\partial x},\frac{\partial_r I}{\partial y}\right)=\frac{\sigma^2}{b^2(r)}\int_{B_r}xy\,dx\,dy=0.$$
By analogous calculations, we finally get the covariance matrix announced above, with $V(r,\sigma)=\dfrac{\sigma^2}{\gamma^2(r)}\int_{B_r}\big(\beta(r)+x^2+y^2\big)^2\,dx\,dy$. $\square$

Let us recall that two uncorrelated components of a Gaussian random vector are independent random variables (which is not the case in general). Therefore, we immediately have the following properties:
• The random variable $\|\nabla_r I\|^2$ is the sum of two squared independent Gaussian random variables with the same variance. It is therefore distributed as a $\chi^2$ law; more precisely, its law is $\frac{\sigma^2}{b(r)}\chi^2(2)$, where $\chi^2(2)$ denotes a $\chi^2$ law with two degrees of freedom.
• The random variable $\Delta_r I$ is a Gaussian random variable with mean zero and variance $V(r,\sigma)$.
• The random variables $\|\nabla_r I\|^2$ and $\Delta_r I$ are independent.

3.3. Computation of the threshold

Proposition 1. Let $I$ be a Gaussian white noise and let $s(\alpha)$ be the threshold value such that $P\big(\|\nabla_r I\|\ge s(\alpha)\,\big|\,\Delta_r I=0\big)\le\alpha$. Then
$$s(\alpha)=\sqrt{-\frac{2\sigma^2}{b(r)}\ln\alpha}.$$
Proof. To begin with, as the random variables $\|\nabla_r I\|^2$ and $\Delta_r I$ are independent, we can forget the conditioning and only compute $P(\|\nabla_r I\|\ge s(\alpha))=P(\|\nabla_r I\|^2\ge s(\alpha)^2)$. As a consequence of Lemma 2, the law of $\|\nabla_r I\|^2$ is $(\sigma^2/b(r))\chi^2(2)$. Now, a $\chi^2(2)$ law is an exponential law with density $\frac12 e^{-t/2}$ on $(0,+\infty)$, so
$$P(\|\nabla_r I\|^2\ge s^2)=\int_{(b(r)/\sigma^2)s^2}^{+\infty}\frac12 e^{-t/2}\,dt=\exp\left(-\frac{b(r)s^2}{2\sigma^2}\right).$$
This finally leads to the announced threshold value $s(\alpha)$. $\square$

Fig. 6. (a) The noisy image ($\sigma=0.2$). (b) The zero-crossings of the Laplacian with the contrast function $C_1$ visualized in gray level. (c) The extracted significant edges ($\alpha=10^{-5}$).
Fig. 7. (a) The noisy image ($\sigma=0.4$). (b) The zero-crossings of the Laplacian with the contrast function $C_1$ visualized in gray level. (c) The extracted significant edges ($\alpha=10^{-5}$).

Remark. Choice of the ball radius. Conversely, the result of Proposition 1 can also be used in order to fix the ball radius, using some prior knowledge on the edges. Indeed, if we know the signal-to-noise ratio (denoted by SNR), and if we suppose that the edges are not too complicated, so that the image can be locally considered as the characteristic function of a half-plane, we can compute the estimated gradient for a given ball radius $a$, say $g(a)$ (for a step of height 1):
$$g(a)=\frac{1}{b(a)}\int_{B_a}x\,\mathbf{1}_{x\ge0}\,dx\,dy.$$
Then, if we choose the ball radius $a$ such that $\exp\big(-\tfrac12 b(a)\,\mathrm{SNR}^2\,g(a)^2\big)\le\alpha$, the edges are significant. As mentioned before, the ball radius must be chosen close enough to this critical value in order not to smooth the edges too much.
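A direct transcription of Proposition 1 might look as follows. This is our sketch, under the assumption that the continuous moment $b(r)$ is replaced by its discrete analogue, the sum of $x^2$ over the discrete ball:

```python
import numpy as np

def threshold_white_noise(sigma, r, alpha):
    """Proposition 1: s(alpha) = sigma * sqrt(-2 * ln(alpha) / b(r)),
    with b(r) taken as the discrete sum of x**2 over the ball of radius r."""
    ii, jj = np.mgrid[-r:r + 1, -r:r + 1]
    b = float((ii**2)[ii**2 + jj**2 <= r**2].sum())
    return sigma * np.sqrt(-2.0 * np.log(alpha) / b)

print(threshold_white_noise(sigma=0.2, r=12, alpha=1e-5))
```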
3.4. Experiments

We consider the piecewise constant object of Fig. 1 with some additive Gaussian white noise. Note that the noisy images here are obtained simply by summing the image of Fig. 1 and the pure Gaussian white noise image; no tomographic reconstruction procedure is applied, unlike in the experiments described in Section 4.5. The densities of the different materials of the object of Fig. 1 are:
• 1 for the outer material;
• 0.8 for the material inside the circle;
• 0.3 for the other inner material.
The standard deviation of the Gaussian noise is $\sigma=0.2$ in the experiments of Fig. 6 and $\sigma=0.4$ in the experiments of Fig. 7. Both images have the same size, $512\times512$ pixels. The experiments have been carried out with a ball of radius $r=12$ pixels. The different images of Figs. 6 and 7 are, respectively:
• (a) the noisy image;
• (b) the zero-crossings of the Laplacian with the contrast function $C_1$ visualized in gray level (white corresponds to a high value of the contrast function $C_1$);
• (c) the extracted significant edges ($\alpha=10^{-5}$).
In the case of an SNR large enough (Fig. 6), all the edges are well detected and the "false" edges are removed. However, the edges which have a high curvature are smoothed by our algorithm. This drawback is all the more important when the ball radius $r$ is large (the influence of the value of this radius will be studied in the experiments of the next section). When the noise level is rather large (Fig. 7), some edges of the image cannot be extracted from the noise (this happens when the contrast associated with an edge is close to the noise level).

Estimating the noise variance: If the variance of the Gaussian noise is not readily available, standard methods can be used in order to estimate it directly from the image. For instance, as it is assumed that the noise is a centered Gaussian white noise, the image $f$ we are studying can be written as $f(x,y)=f_0(x,y)+\eta(x,y)$, where $f_0$ is a deterministic function and $\eta$ is the additive Gaussian noise (with mean 0 and unknown variance $\sigma^2$). Let $h$ be the pixel size and let us consider the random variable
$$V(x,y)=f(x+h,y)+f(x-h,y)+f(x,y+h)+f(x,y-h)-4f(x,y).$$
Then
$$V(x,y)=h^2\Delta f_0(x,y)+o(h^2)+\eta(x+h,y)+\eta(x-h,y)+\eta(x,y+h)+\eta(x,y-h)-4\eta(x,y).$$
Therefore, if we suppose $h$ small, we have $E[V^2(x,y)]\approx h^4\,\Delta f_0(x,y)^2+20\sigma^2$. Hence, as $\Delta f_0$ is usually small, the quantity
$$\hat\sigma^2=\frac{1}{20N^2}\sum_{1\le x\le N}\ \sum_{1\le y\le N}V^2(x,y)$$
is a good estimator of the noise variance. In any case, the variance is always over-estimated by this method; consequently, the significant edges detected when using $\hat\sigma^2$ are also significant for the variance $\sigma^2$.
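A direct implementation of this estimator (our sketch; we take $h$ to be 1 pixel, and the factor 20 comes from the variance $(1+1+1+1+16)\sigma^2$ of $V$ on pure noise):

```python
import numpy as np

def estimate_noise_variance(f):
    """Estimate sigma**2 of additive Gaussian white noise from one image,
    using the 5-point Laplacian V, whose pure-noise variance is 20*sigma**2."""
    V = (f[2:, 1:-1] + f[:-2, 1:-1] + f[1:-1, 2:] + f[1:-1, :-2]
         - 4.0 * f[1:-1, 1:-1])
    return np.mean(V**2) / 20.0

rng = np.random.default_rng(1)
noise = rng.normal(0.0, 0.4, size=(512, 512))
print(estimate_noise_variance(noise))   # close to 0.16
```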
4. Significant edges in the case of Gaussian white noise on a radiograph

4.1. Tomography

Let us now turn to the more realistic case we are interested in. As mentioned in the Introduction, we first take a radiograph of an object. Tomography is the inverse problem of reconstructing the initial object from its radiograph. This is now a well-known problem, as it is the key tool in medical scanner imagery (and other medical imaging systems). To begin with, let us describe what radiography is from a mathematical point of view. The studied object is exposed to X-rays that traverse it, and some of the X-photons are absorbed. In the output, we observe the quantity of X-photons that have not been absorbed by the material, and we thus measure, in some sense, the "mass" of material the ray went through. More precisely, if the object is described by its density $\rho$ (a function of the space coordinates), what can be measured at any point of the receptor is $\int_{\text{ray}}\rho\,dl$, where "ray" means the straight line that goes from the source to the studied point of the receptor (we suppose that the X-ray source is a point, which implies that this line is unique). We also assume that the X-ray source is far away from the object, so that the rays may be assumed to be all parallel.

Fig. 8. Radiography of a radially symmetric object.

Then, to reconstruct an object from its radiographs, we must rotate the object and take a radiograph for every angle $\theta\in[0,\pi)$. This leads to the so-called Radon transform of the object, which is known to be invertible. This is the principle of the medical scanner. In our case, as the object is radially symmetric, if we rotate it around its symmetry axis, all the radiographs are exactly the same. Consequently, a single radiograph of such an object is enough to perform the tomographic reconstruction. Indeed, if $f(x,y)$ denotes the density along a slice that contains the symmetry axis (see Figs. 1 and 8), then a radiograph of this object is given by
$$g(u,v)=2\int_{|u|}^{+\infty}f(x,v)\,\frac{x}{\sqrt{x^2-u^2}}\,dx.$$
This is a linear transform, which we will denote hereafter by $g=Hf$. As we already said, this linear operator $H$ is invertible, and we in fact know its inverse explicitly on the space of continuously differentiable functions $g$:
$$f(x,y)=(H^{-1}g)(x,y)=-\frac{1}{\pi}\int_x^{+\infty}\frac{1}{\sqrt{u^2-x^2}}\,\frac{\partial g}{\partial u}(u,y)\,du.$$
(The proof of this formula and more details about the operators $H$ and $H^{-1}$ can be found in the book of Bracewell [16, pp. 244-250].)

Our assumption on the noise is that it is an additive Gaussian white noise on the radiograph (i.e., on $g$). But what we want to study is the object given by $f$, so we must transform the white noise by the operator $H^{-1}$. Unfortunately, because of the singularity of the integral at $x=0$, we cannot apply the operator $H^{-1}$ to a white noise $\dot B$, even in an $L^2$ sense. Therefore, we will work in a discrete framework: the images $f$ and $g$ are naturally discretized (as they are numerical images). This leads to a discretization of the operator $H$, which we still denote by $H$ and which may now be viewed as a matrix. The discretization is made in such a way that the symmetry axis ($x=0$) falls between two pixels, so that the previous singularity does not appear. This matrix is then invertible, and we denote by $H^{-1}$ its inverse, which we can now make operate on a discrete Gaussian white noise. To give an idea of the ill-conditioning of the matrix $H$, we can compute its condition number: for an image of size $512\times512$, we find that this condition number is 377, which shows how $H^{-1}$ amplifies noise.
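To make the discretization tangible, here is a Python sketch of ours. The simple midpoint rule and the sample grids are assumptions, not the paper's exact scheme; they merely reproduce the idea of placing the axis between pixels so that the singularity $x=u$ never lands on a sample.

```python
import numpy as np

def abel_projection_matrix(n, dx=1.0):
    """Crude midpoint discretization of
    g(u) = 2 * integral_{u}^{inf} f(x) * x / sqrt(x**2 - u**2) dx.
    f is sampled at x_j = (j+1)*dx and g at u_i = (i+1/2)*dx, so the
    matrix is upper triangular with a nonzero diagonal, hence invertible."""
    x = (np.arange(n) + 1.0) * dx
    u = (np.arange(n) + 0.5) * dx
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if x[j] > u[i]:
                H[i, j] = 2.0 * x[j] / np.sqrt(x[j]**2 - u[i]**2) * dx
    return H

H = abel_projection_matrix(256)
print(np.linalg.cond(H))   # large: the inversion amplifies noise
```

The condition number printed here differs from the paper's value of 377, which refers to its own 512 x 512 discretization; the qualitative conclusion (severe noise amplification) is the same.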
4.2. Law of the noise on the tomographic reconstruction

Let us consider a field $\varepsilon=(\varepsilon_{i,j})_{1\le i\le p,\,1\le j\le n}$ of independent identically distributed (i.i.d.) Gaussian random variables with mean 0 and variance $\sigma^2$. Let $I=(I_{i,j})$ be the random field obtained after tomographic reconstruction, i.e., after making $H^{-1}$ operate on $\varepsilon=(\varepsilon_{i,j})$. In fact, as the X-rays are supposed to be parallel, the reconstruction can be made line by line independently; therefore, if we consider the row vectors $\varepsilon_i=(\varepsilon_{i,1},\dots,\varepsilon_{i,n})$ and $I_i=(I_{i,1},\dots,I_{i,n})$, and if we denote $M=(H^t)^{-1}$ (where $H^t$ is the transpose of $H$), then $I_i=\varepsilon_i M$. (Notice that $M$ is independent of $i$ and that its size is $n\times n$.) Consequently, the law of $I$ is characterized by the following properties:
• $I=(I_{i,j})$ is a Gaussian random field.
• For $i\ne k$, $I_i$ and $I_k$ are independent.
• For each $i$, the vector $I_i$ is a Gaussian random vector with mean 0 and covariance matrix $\Sigma=\sigma^2 M^t M$, where $M^t$ denotes the transpose of $M$.

4.3. Laws of the gradient and of the Laplacian

The expressions obtained in Section 2 for the gradient and the Laplacian of an image in a continuous setting translate easily to the discrete framework we now deal with. Indeed, we have
$$\frac{\partial_r I}{\partial x}(u,v)=\frac{1}{b(r)}\sum_{(i,j)\in B_r}j\,I_{u+i,v+j},\qquad \frac{\partial_r I}{\partial y}(u,v)=\frac{1}{b(r)}\sum_{(i,j)\in B_r}i\,I_{u+i,v+j},$$
$$\Delta_r I(u,v)=\frac{1}{\gamma(r)}\sum_{(i,j)\in B_r}\big(\beta(r)+i^2+j^2\big)\,I_{u+i,v+j},$$
where $B_r$ now denotes the discrete ball of radius $r$, i.e. $B_r=\{(i,j):\,i^2+j^2\le r^2\}$, and where the constants $\beta(r)$, $\gamma(r)$, $b(r)$, etc., are the discrete analogues of the constants of Section 2. With these estimates, the contrast functions $C_1$ and $C_2$ are easily computed. They are both of the form $C(u,v)=\sqrt{C_x^2(u,v)+C_y^2(u,v)}$, with
$$C_x(u,v)=\sum_{i,j}j\,c_{ij}\,I_{u+i,v+j}\qquad\text{and}\qquad C_y(u,v)=\sum_{i,j}i\,c_{ij}\,I_{u+i,v+j},$$
where the coefficients $c_{ij}$ are given by:
(1) in the case of the contrast function $C_1$, $c_{ij}=\frac{1}{b(r)}\mathbf{1}_{(i,j)\in B_r}$;
(2) in the case of the contrast function $C_2$ with two balls of radii $r_1<r_2$, $c_{ij}=\frac{1}{b(r_1)}\mathbf{1}_{(i,j)\in B_{r_1}}-\frac{1}{b(r_2)}\mathbf{1}_{(i,j)\in B_{r_2}}$,
where $\mathbf{1}_A$ denotes the characteristic function of the event $A$ (its value is 1 if $A$ is true and 0 otherwise). Therefore, the computations of the laws are similar and will be treated simultaneously using the coefficients $c_{ij}$. When the contrast function $C_2$ is used with two radii $r_1<r_2$, we compute the Laplacian $\Delta_r I$ with the larger ball radius, that is, with $r=r_2$.

Lemma 3. For both contrast functions $C_1$ and $C_2$, the vector $(C_x(u,v),C_y(u,v),\Delta_r I(u,v))$ is a Gaussian random vector with mean 0 and covariance matrix of the form
$$\begin{pmatrix}\sigma_x^2 & 0 & \sigma_{x,\Delta}\\ 0 & \sigma_y^2 & 0\\ \sigma_{x,\Delta} & 0 & \sigma_\Delta^2\end{pmatrix}.$$
In particular, $C_y$ is independent of $(C_x,\Delta_r I)$.

Proof. The lemma is a consequence of the two following remarks. The first one is that, in both cases for the contrast function, the coefficients $c_{ij}$ are symmetric: $c_{i,j}=c_{-i,j}$ and $c_{i,j}=c_{i,-j}$. Thus, they satisfy that whenever $k$ or $l$ is odd,
$$\sum_{(i,j)\in B_r}i^k j^l c_{ij}=0.\qquad(3)$$
The second remark is that the random vectors $I_i$ and $I_k$ are independent if $i\ne k$, and we thus have $E[I_{i,j}I_{k,l}]=0$ if $i\ne k$ and $E[I_{i,j}I_{k,l}]=\Sigma(j,l)$ if $i=k$, where $E[X]$ denotes the expectation (i.e., the mean value) of the random variable $X$. We can now compute the covariance matrix. For instance, let us start with
$$E[C_xC_y]=\sum_{(i,j,k,l)}jk\,c_{ij}c_{kl}\,E[I_{u+i,v+j}I_{u+k,v+l}]=\sum_{(i,j,l)}ji\,c_{ij}c_{il}\,\Sigma(v+j,v+l)=\sum_{(j,l)}j\,\Sigma(v+j,v+l)\sum_i i\,c_{ij}c_{il}=0.$$
Similar computations give $E[C_y\Delta_r I]=0$ and
$$\sigma_x^2:=E[C_x^2]=\sum_{(i,j,l)}jl\,c_{ij}c_{il}\,\Sigma(v+j,v+l),\qquad \sigma_y^2:=E[C_y^2]=\sum_{(i,j,l)}i^2\,c_{ij}c_{il}\,\Sigma(v+j,v+l),$$
$$\sigma_\Delta^2:=E[(\Delta_r I)^2]=\frac{1}{\gamma^2(r)}\sum_{(i,j,l)\in\Theta_r}\big(\beta(r)+i^2+j^2\big)\big(\beta(r)+i^2+l^2\big)\,\Sigma(v+j,v+l),$$
$$\sigma_{x,\Delta}:=E[C_x\Delta_r I]=\frac{1}{\gamma(r_2)}\sum_{(i,j,l)\in\Theta_{r_2}}j\,c_{ij}\big(\beta(r)+i^2+l^2\big)\,\Sigma(v+j,v+l),$$
where we have set $\Theta_r=\{(i,j,l)\text{ such that }(i,j)\in B_r\text{ and }(i,l)\in B_r\}$. $\square$
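Continuing the earlier sketch (and reusing its `abel_projection_matrix` helper, so this fragment assumes that function is in scope), the covariance $\Sigma=\sigma^2M^tM$ of one reconstructed row can be formed explicitly. This is our illustration of the non-stationarity, not the paper's code:

```python
import numpy as np

sigma = 4.0
H = abel_projection_matrix(256)   # helper from the Section 4.1 sketch above
M = np.linalg.inv(H.T)
# Each reconstructed row is I_i = eps_i @ M, so its covariance is:
Sigma = sigma**2 * M.T @ M
# The diagonal shows a noise standard deviation that varies strongly
# with the distance from the symmetry axis (non-stationarity).
print(np.sqrt(np.diag(Sigma))[:5])
```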
4.4. Computation of the threshold

As the first- and second-order derivatives are no longer independent, we must compute the conditional law of the contrast function knowing that $\Delta_r I=0$.

Proposition 2. Let $C$ be one of the two contrast functions. Then the random variable $C^2$ is distributed, conditionally on $\{\Delta_r I=0\}$, as the sum of the squares of two independent Gaussian random variables with mean zero and respective variances $\sigma_y^2$ and
$$\sigma_{x|\Delta=0}^2=\frac{\sigma_x^2\sigma_\Delta^2-\sigma_{x,\Delta}^2}{\sigma_\Delta^2},$$
that is, a Gamma law with parameters $\frac12$ and $\frac{1}{2(\sigma_y^2+\sigma_{x|\Delta=0}^2)}$.

Remark. The threshold value $s(\alpha)$, defined by $P(C\ge s(\alpha)\,|\,\Delta_r I=0)\le\alpha$, can no longer be computed explicitly, but a numerical approximation is easy to get, as the Gamma density is well known.

Proof. $C_y$ is independent of the pair $(C_x,\Delta_r I)$. Thus, conditionally on $\{\Delta_r I=0\}$, the random variables $C_y$ and $C_x$ are still independent, and the conditional law of $C_y$ is the Gaussian distribution with mean 0 and variance $\sigma_y^2$. Now, if $D^2:=\sigma_x^2\sigma_\Delta^2-\sigma_{x,\Delta}^2\ne0$, then the law of the pair $(C_x,\Delta_r I)$ has a density, given by
$$f_{x,\Delta}(t_1,t_2)=\frac{1}{2\pi D}\,e^{-\frac12(t_1,t_2)\,\Gamma\,(t_1,t_2)^t},$$
where $\Gamma$ is the inverse of the covariance matrix, i.e.
$$\Gamma=\frac{1}{D^2}\begin{pmatrix}\sigma_\Delta^2 & -\sigma_{x,\Delta}\\ -\sigma_{x,\Delta} & \sigma_x^2\end{pmatrix}.$$
Recall that, if $f_\Delta$ denotes the Gaussian density of $\Delta_r I$, then the law of $C_x$ conditionally on $\Delta_r I=0$ has density $f_{x,\Delta}(t_1,0)/f_\Delta(0)$, and so is Gaussian with mean zero and variance $\sigma_{x|\Delta=0}^2=D^2/\sigma_\Delta^2$. This result is still valid when $D=0$, since $D=0$ implies that $C_x$ and $\Delta_r I$ are proportional, and thus the law of $C_x$ conditionally on $\Delta_r I=0$ is Gaussian with mean 0 and variance 0 (it is no longer random). $\square$
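Since the conditional law is explicit, $s(\alpha)$ can be approximated numerically. Here is a simple, if crude, Monte Carlo sketch of ours (the function name and defaults are hypothetical; for very small $\alpha$, increase `n` or integrate the density directly, as the paper suggests):

```python
import numpy as np

def conditional_threshold(sig_y, sig_x_cond, alpha, n=10**6, seed=0):
    """Approximate s(alpha) with P(C >= s | lap = 0) <= alpha, where,
    by Proposition 2, C**2 = Y**2 + X**2 with independent
    Y ~ N(0, sig_y**2) and X ~ N(0, sig_x_cond**2)."""
    rng = np.random.default_rng(seed)
    c = np.hypot(sig_y * rng.standard_normal(n),
                 sig_x_cond * rng.standard_normal(n))
    return np.quantile(c, 1.0 - alpha)

print(conditional_threshold(1.0, 0.5, 1e-3))   # threshold for alpha = 1e-3
```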
4.5. Experiments

4.5.1. Case of a piecewise constant object

To begin with, we still study the piecewise constant object of Fig. 1 described in Section 3.4. Recall that this image represents a slice of the object containing the symmetry axis; the three-dimensional object is obtained by rotation around the vertical axis that goes through the middle of the image. In this case, we use the contrast function $C_1$, which is simply the norm of the gradient. The experiments of Fig. 9 correspond to a ball radius $r=12$ pixels. The size of the images is $512\times512$ pixels.

Fig. 9. (a) Reconstructed object from a single noisy radiograph, (b) contrast value at the zero-crossings of the Laplacian for the contrast function $C_1$, and (c) significant edges.
Fig. 10. Numerical computation of the threshold as a function of the distance from the symmetry axis, for the noise level $\sigma=4$ in the case $r=12$.

We start with the image of the radiograph obtained after the application of the matrix $H$ to our initial image. Then a Gaussian white noise (with $\sigma=4$) is added to this radiograph, and the tomographic inversion (application of the matrix $H^{-1}$) is performed. This gives the image of Fig. 9(a). As we already mentioned, the noise is not stationary: it is now correlated, and its variance depends on the distance from the symmetry axis. For instance, if the standard deviation of the Gaussian white noise on the radiograph is $\sigma=4$, the variance of the noise on the tomography is about $2\sigma^2=32$ near the axis, $0.02\sigma^2=0.32$ at a distance of 65 pixels from the axis, and $8\cdot10^{-3}\sigma^2=0.128$ at the edge of the image, located on the right at 200 pixels from the axis. The threshold value can be numerically computed. As an illustration, we display in Fig. 10 the threshold value for the contrast $C_1$ as a function of the distance from the symmetry axis (ranging from $-256$ to $+256$), in the case of a noise level $\sigma=4$ and a ball radius $r=12$.

Some edges are not declared significant by our algorithm near the symmetry axis; the noise is too pronounced there to distinguish the true edges from the noise. In other words, the threshold value near the symmetry axis (see Fig. 10) can be higher than the contrast associated with small differences in the density of the object. In those images, an edge with a low contrast is significant if it is far enough from the symmetry axis (see also Fig. 11). Even when the edges are significant, a high level of noise leads to a decrease in the location accuracy and to increased raggedness of the edges. Moreover, some details are lost because of the smoothing due to the size of the ball. Since the edges separate two materials, one included in another, they must be closed curves. Currently, a human operator has to close them manually. Our method yields open edges. This does not imply that there is no frontier between the materials, but only that the noise level is too high to give an accurate position of the edge. Therefore, we can close the curves either manually or with standard curve completion methods, but this will not tell which closure is better (i.e., closest to the real shape). Several methods are usual for curve completion. Let us mention:
• methods based on graph exploration [17];
• methods inspired by neurobiology [18];
• snakes with fixed tips [19].

Choice of the ball radius: Let us compare the results obtained with different ball radii (see Fig. 11). When the ball radius is small, the edges are more accurate, but some are not significant: the smoothing of the noise is not enough to get rid of it. On the contrary, when the radius is large, most of the edges are detected, but small details are lost because of this smoothing. The choice of the ball radius depends essentially on the complexity of the edges (or the degree of accuracy we want). If we know (from prior knowledge) that the edges are very smooth, a large radius gives very good results (see the outer edge of our studied object). On the contrary, if we look for very complicated objects, we must choose smaller radii, but some edges (with low contrast) may become non-significant. The choice of radius $r=12$ corresponds to an appropriate balance between the size of typical details and an effective smoothing of the noise in our case. Another strategy is to use both values $r=6$ and $r=12$: the first value gives accurate edges but misses some true edges, and the value $r=12$ gives more significant edges, but smoothed. A difficult question is then how to merge both pieces of information.

Fig. 11. Significant edges obtained with different ball radii; from left to right: $r=6$, 12 and 20.

Let us also mention that the "inverse" use of the formula of Proposition 1 in order to estimate a good value for the ball radius (as explained in the remark of Section 3.3) would lead to a ball radius that depends on the distance from the symmetry axis (large values close to the axis, small ones far from it).

Comparison with other methods: We give here the results obtained with two other methods, which both have the advantage of directly providing closed curves.
• The first method is the one introduced in Ref. [4].
Comparison with other methods: We give here the results obtained with two other methods, both of which have the advantage of directly providing closed curves.

• The first method is the one introduced in Ref. [4]: one keeps only the meaningful level lines of the image, defined as those level lines along which the minimum of the norm of the gradient is larger than a threshold T(ε), computed from the gradient histogram of the image. The meaning of this definition is that such curves have a probability less than ε of appearing in a pure noise image with the same gradient histogram as the original image (a minimal sketch of this selection rule is given at the end of this comparison). The results obtained with this method are shown in Fig. 12: in the first row, the image of Fig. 9(a) is smoothed by convolution with a Gaussian kernel of standard deviation 2 and 4 pixels, respectively; the second row displays the corresponding meaningful level lines. This experiment clearly shows that, since the noise model is not adapted to the image (in particular, the non-stationarity is not taken into account), many false contours are detected.

Fig. 12. First row: the image of Fig. 9(a) smoothed by convolution with a Gaussian kernel of standard deviation 2 and 4 pixels, respectively. Second row: the meaningful level lines of each image.

• The second method is the well-known Mumford–Shah segmentation for piecewise constant images [20]. Given an observed image g₀ defined on a domain D, one looks for the piecewise constant approximation g of g₀ that minimizes the functional

E(g) = ∫_D |g − g₀|² + λ Length(K(g)),

where Length(K(g)) is the one-dimensional measure of the discontinuity set K(g) of g (a set of curves) and λ is a parameter weighting the second term of the functional. The results obtained with this method are shown in Fig. 13 for three different values of λ, namely λ = 42 882, 12 220 and 11 621 (the size of the image is 512 × 512). The implementation used here is the region-merging algorithm described in Ref. [21]. The main drawback of this method is that there is no verification that the obtained contours are not due to the noise; moreover, the user has to supply the value of λ (or the final number of regions).

Fig. 13. Results obtained with the Mumford–Shah segmentation for piecewise constant images, for three different values of λ (see text). From left to right, the number of regions in the segmented image is, respectively, 3, 6 and 7.

We end this subsection with the results of the Canny detector on the original image and on the images obtained after Gaussian filtering with standard deviation σ = 2 and 4, respectively; the results are shown in Fig. 14. No thresholding has been performed on these edge maps. The next step would be to compute the significant edge points. According to the definition of a significant edge point, the question would be: I being a pure noise image (with known parameters), what is the law of |∇I|(x) conditionally on x being a Canny edge point? Such a computation is difficult to perform and should be part of future work. The result would probably be close to the ones obtained with our method (see for instance Fig. 11).

Fig. 14. From left to right: results of the Canny edge detector on, respectively, the original image (Fig. 9(a)), the image smoothed by convolution with a Gaussian kernel of width σ = 2 (middle) and σ = 4 (see first row of Fig. 12).
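As announced in the first bullet above, here is a minimal sketch of the level-line selection rule of Ref. [4]. It assumes the threshold T has already been derived from the gradient histogram (that computation is not reproduced here) and uses skimage's contour extraction; the function name and the choice of levels are ours.

    import numpy as np
    from skimage.measure import find_contours

    def meaningful_level_lines(img, levels, T):
        """Keep the level lines along which min ||grad I|| > T (sketch of the
        rule of Ref. [4]); 'levels' is the list of grey levels to scan."""
        gy, gx = np.gradient(img.astype(float))
        grad_norm = np.hypot(gx, gy)
        kept = []
        for level in levels:
            for line in find_contours(img, level):        # (row, col) polyline
                rows, cols = np.round(line).astype(int).T
                if grad_norm[rows, cols].min() > T:       # minimal gradient test
                    kept.append((level, line))
        return kept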
Fig. 15. (a) Inhomogeneous object, (b) its noisy radiograph (σ = 4) and (c) tomographic reconstruction.

4.5.2. Case of an inhomogeneous material

Let us now turn to a more realistic case: the materials are not homogeneous, and consequently the object is no longer piecewise constant (see Fig. 15(a)). The object is now composed of an inhomogeneous material (with constant gradient) containing homogeneous objects. In the first experiment, the noise level is σ = 4. The results (Figs. 15 and 16) are similar to those obtained for a homogeneous object in the previous section. The contrast function used here is C1; with the contrast function C2, the edges are no longer significant in this case, because the noise level σ is too large.

Fig. 16. (a) Contrast value at the zero-crossings of the Laplacian for the contrast function C1 (r = 12) and (b) significant edges of the inhomogeneous object.

In the second experiment (Fig. 17), the noise level is σ = 0.4. It is lower than in the previous experiments, which is why the artifacts of the tomographic reconstruction are less pronounced here. The noise level is now such that the threshold computed with the contrast function C1 is lower than the gradient of the material. In that case the use of C1 fails, as many detected edges due only to the material gradient remain significant; this is illustrated by Fig. 18(a). Fig. 18(b) gives the significant edges obtained with the contrast function C2 and two ball radii r1 = 6 and r2 = 12; with this contrast function, we eventually obtain only the "true" edges (a sketch of this two-scale contrast is given below).

Fig. 17. (a) Inhomogeneous object, (b) its noisy radiograph (σ = 0.4) and (c) tomographic reconstruction.

Fig. 18. (a) Significant edges with the C1 contrast function (r = 12): there are many false detections. (b) Significant edges with the C2 contrast function (r1 = 6 and r2 = 12): only the "true" edges are obtained.

Choice of the ball radii: Here again, we must discuss how the radii are chosen. The discussion concerning the radius of the outer ball is exactly the same as for the contrast function C1; that is why we kept the value 12. The inner ball radius must not be too close to the outer one, since the difference between the two gradient estimates would then be too small; but it must not be too small either, so that it still smooths the noise effectively. Half of the outer radius is usually a good compromise.
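The precise definition of C2 is given earlier in the paper; assuming it compares the gradient norms estimated at the two radii (so that the constant gradient of the material cancels out while a true edge does not), a minimal sketch reads as follows. The plain uniform smoothing is our stand-in for the paper's local polynomial estimate over a ball of radius r.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def gradient_norm(img, r):
        # Stand-in for the local polynomial gradient estimate on a ball of
        # radius r: smooth over a (2r+1)-pixel window, then differentiate.
        gy, gx = np.gradient(uniform_filter(img.astype(float), size=2 * r + 1))
        return np.hypot(gx, gy)

    def contrast_C2(img, r1=6, r2=12):
        # Assumed form of C2: the difference of the two gradient estimates
        # vanishes on a constant material gradient but not on a true edge.
        return np.abs(gradient_norm(img, r1) - gradient_norm(img, r2))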
5. Discussion and conclusion

In this paper, we have introduced a method for the detection of significant edges in noisy images when the parameters of the noise are known. The general idea of the method is to perform a statistical test to decide whether an edge point is due to the noise or not. The edge detector used here is the zero-crossings of an estimated Laplacian of the image. We did not use a Gaussian filter to smooth the image and estimate its Laplacian (as is done in the Marr–Hildreth detector [22]), but rather local polynomial approximations of the image, because such an estimator is more robust to noise. We think that future extensions of this work should mainly focus on:

• performing the same kind of computations for the Marr–Hildreth detector, for the Canny detector, and more generally for "derivative-based" edge detectors (including non-isotropic ones);
• being able to merge the information obtained at different scales (the different ball radii in our case); indeed, as shown in the experimental section, it is sometimes necessary to use the gradient estimated at two different scales to obtain good results;
• extending the framework to tomography without rotational symmetry, and more generally to other situations of non-stationary noise (including possibly non-Gaussian ones).

References

[1] I. Abraham, R. Abraham, J.-M. Lagrange, F. Lavallou, Méthodes inverses pour la reconstruction tomographique X monovue, Revue Chocs 31 (2005) (chocs@cea.fr).
[2] J.-M. Dinten, Tomographie à partir d'un nombre limité de projections: régularisation par des champs markoviens, Ph.D. Thesis, Université Paris-Sud, 1990.
[3] J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intell. 8 (6) (1986) 679–698.
[4] A. Desolneux, L. Moisan, J.-M. Morel, Edge detection by Helmholtz principle, J. Math. Imaging Vision 14 (2001) 271–284.
[5] R. Touzi, A. Lopes, P. Bousquet, A statistical and geometrical edge detector for SAR images, IEEE Trans. Geosci. Remote Sensing 26 (1988) 764–773.
[6] P. Qiu, S. Bhandarkar, An edge detection technique using local smoothing and statistical hypothesis testing, Pattern Recognition Lett. 17 (1996) 849–872.
[7] D. Marimont, Y. Rubner, A probabilistic framework for edge detection and scale selection, in: Sixth International Conference on Computer Vision, 1998.
[8] Y.G. Leclerc, S.W. Zucker, The local structure of image discontinuities in one dimension, IEEE Trans. Pattern Anal. Mach. Intell. 9 (3) (1987) 341–355.
[9] F. Cao, P. Musé, F. Sur, Extracting meaningful curves from images, J. Math. Imaging Vision 22 (2005) 159–181.
[10] M. Sezgin, B. Sankur, Survey over image thresholding techniques and quantitative performance evaluation, J. Electron. Imaging 13 (1) (2004) 146–165.
[11] O.D. Trier, T. Taxt, Evaluation of binarization methods for document images, IEEE Trans. Pattern Anal. Mach. Intell. 17 (3) (1995) 312–315.
[12] A. Savitzky, M.J.E. Golay, Smoothing and differentiation of data by simplified least squares procedures, Anal. Chem. 36 (1964) 1627–1639.
[13] J.B. Walsh, An introduction to stochastic partial differential equations, in: École d'été de Probabilités de Saint-Flour XIV 1984, Lecture Notes in Mathematics, vol. 1180, Springer, Berlin, 1986.
[14] T. Hida, Brownian Motion, Applications of Mathematics, vol. 11, Springer, Berlin, 1980.
[15] T. Hida, H.H. Kuo, J. Potthoff, L. Streit, White Noise: An Infinite Dimensional Calculus, Mathematics and Its Applications, vol. 253, Kluwer Academic Publishers, Dordrecht, 1993.
[16] R.N. Bracewell, The Fourier Transform and its Applications, second ed., McGraw-Hill, New York, 1978.
[17] R. Deriche, J.P. Cocquerez, G. Almouzni, An efficient method to build early image description, in: 9th International Conference on Pattern Recognition, Rome, 1988.
[18] P. Gaussier, J.P. Cocquerez, Utilisation des réseaux de neurones pour la reconnaissance de scènes complexes: simulation d'un système comprenant plusieurs aires corticales, Traitement du Signal 6 (1991) 441–466.
[19] L.D. Cohen, Études des modèles de contours actifs et autres techniques de traitement d'image, Ph.D. Thesis, Université Paris-Sud, 1990.
[20] D. Mumford, J. Shah, Boundary detection by minimizing functionals, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, 1985.
[21] G. Koepfler, C. Lopez, J.-M. Morel, A multiscale algorithm for image segmentation by variational method, SIAM J. Numer. Anal. 31 (1) (1994) 282–299.
[22] D. Marr, E. Hildreth, Theory of edge detection, Proc. R. Soc. London B 207 (1980) 187–217.

About the Author—ISABELLE ABRAHAM is currently a researcher at the CEA (Commissariat à l'Énergie Atomique) in Bruyères-le-Châtel, near Paris. She works on image processing and tomographic reconstruction.
About the Author—ROMAIN ABRAHAM is a professor of mathematics at the University of Orléans. He works on stochastic processes (such as Brownian snakes) and on some problems in image analysis.

About the Author—AGNÈS DESOLNEUX is a researcher at the CNRS (Centre National de la Recherche Scientifique) at University Paris 5. She works on stochastic models for images and on the geometrical analysis of images.

About the Author—SÉBASTIEN LI-THIAO-TÉ is currently a Ph.D. student at the École Normale Supérieure de Cachan. He works with Bernard Chalmond and Benno Schwikowski (Institut Pasteur, Paris) on applied mathematics for proteomics.