Exploring Photobios

Introduction

Key Challenges

  1. the face appearance space is extremely high dimensional
  2. we generally have access to only a sparse sampling of this space
  3. the mapping of each image to pose, expression, and other parameters is not generally known a priori

Key Issues

  1. define the edge weights in the graph
  2. create a compelling, stabilized output sequence

Method

Automatic alignment and pose estimation

  1. Run a face detector
  2. Apply a fiducial points detector
  3. Use a pre-labeled 3D template model to estimate pose and warp the image to a frontal view for a more consistent computation of similarity

The face graph

Distance between faces

Local Binary Pattern (LBP) Histograms

  1. Divide an image into a grid of cells

  2. Convert each pixel in a cell into a code which encodes the relative brightness patterns in a square neighborhood around the pixel

    Each neighbor is assigned a 1 if it is brighter than the center pixel and a 0 if it is darker

    The pattern of 1’s and 0’s defines a per pixel binary code

    The per-cell histogram of these codes defines the descriptor of a cell

  3. Calculate a separate set of descriptors for the eyes, mouth, and hair regions

    A descriptor for a region is a concatenation of participating cells’ descriptors
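The LBP steps above can be sketched in plain Python as follows. This is a minimal illustration, not the paper's implementation: it assumes a 3×3 neighborhood (8 neighbors, 256 possible codes), treats equal brightness as "brighter", skips border pixels, and represents an image as a list of rows of gray values. The function names and the `(r0, r1, c0, c1)` cell format are hypothetical.

```python
from collections import Counter

# 8 neighbors of a pixel, clockwise from the top-left (assumed 3x3 neighborhood).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the interior pixel (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # Assumption: a neighbor with equal brightness counts as "brighter".
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def cell_histogram(img, r0, r1, c0, c1):
    """Normalized 256-bin histogram of LBP codes for rows [r0, r1), cols [c0, c1).

    Pixels on the image border are skipped because they lack a full neighborhood.
    """
    h, w = len(img), len(img[0])
    counts = Counter(
        lbp_code(img, r, c)
        for r in range(max(r0, 1), min(r1, h - 1))
        for c in range(max(c0, 1), min(c1, w - 1))
    )
    total = sum(counts.values()) or 1  # avoid division by zero for empty cells
    return [counts.get(k, 0) / total for k in range(256)]


def region_descriptor(img, cells):
    """Concatenate the histograms of the cells that make up a region."""
    desc = []
    for r0, r1, c0, c1 in cells:
        desc.extend(cell_histogram(img, r0, r1, c0, c1))
    return desc
```

A region such as the mouth would then be described by calling `region_descriptor` with the list of cells that cover it.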

The distance between two face images $i,j$, denoted $d_{ij}$, is defined as the $\chi^2$-distance between the corresponding descriptors, normalized by a logistic function $L(d)=(1+e^{-\gamma(d-\mu)/\sigma})^{-1}$ with $\gamma=\ln(99)$. With this $\gamma$, $L(\mu)=0.5$, $L(\mu+\sigma)=0.99$, and $L(\mu-\sigma)=0.01$.
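A small sketch of this distance, under stated assumptions: the notes do not say which form of the $\chi^2$-distance is used, so the common $\frac{1}{2}\sum_k (p_k-q_k)^2/(p_k+q_k)$ form is assumed here, and `mu` and `sigma` are passed in as parameters because their values are not given.

```python
import math

GAMMA = math.log(99)  # gamma = ln(99), as in the formula for L(d)


def chi2_distance(p, q):
    """Chi-square distance between two histograms of equal length.

    Assumption: the 1/2 * sum((p - q)^2 / (p + q)) form; bins that are
    empty in both histograms are skipped to avoid dividing by zero.
    """
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def logistic(d, mu, sigma):
    """L(d) = 1 / (1 + exp(-gamma * (d - mu) / sigma))."""
    return 1.0 / (1.0 + math.exp(-GAMMA * (d - mu) / sigma))


def normalized_distance(p, q, mu, sigma):
    """Chi-square distance between descriptors, squashed into (0, 1)."""
    return logistic(chi2_distance(p, q), mu, sigma)
```

Because $\gamma=\ln(99)$, a raw distance one `sigma` above `mu` maps to 0.99 and one `sigma` below maps to 0.01.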

Appearance distance function
$$
D_{appear}(i,j)= 1 - (1-\lambda^md_{ij}^m)(1-\lambda^ed_{ij}^e)(1-\lambda^hd_{ij}^h)
$$

$d^{m,e,h}$: the LBP histogram distances restricted to the mouth, eyes, and hair regions, respectively

$\lambda^{m,e,h}$: the corresponding weights for these regions, fixed to $\lambda^m=0.8,\lambda^e=\lambda^h=0.1$ in this paper
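As an illustrative sketch, $D_{appear}$ can be written directly from the formula. The function name is hypothetical; the default weights are the ones listed above, and the inputs are assumed to be the already normalized distances in $[0,1]$.

```python
def appearance_distance(d_m, d_e, d_h, lam_m=0.8, lam_e=0.1, lam_h=0.1):
    """D_appear = 1 - (1 - lam_m*d_m)(1 - lam_e*d_e)(1 - lam_h*d_h).

    d_m, d_e, d_h: normalized LBP distances for mouth, eyes, and hair.
    """
    return 1.0 - (1 - lam_m * d_m) * (1 - lam_e * d_e) * (1 - lam_h * d_h)
```

With these weights a large mouth distance alone pushes $D_{appear}$ up to 0.8, while a large eye or hair distance alone contributes at most 0.1.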

Differences in pose (yaw and pitch) and time (when timestamps are available), $D_{yaw}, D_{pitch}, D_{time}$, are measured by the $L_2$ distance followed by the logistic function $L(d)$

The face graph

Each edge $(i,j)$ has a weight $D(i,j)$ defined as
$$
D(i,j)=\left[1-\prod_{s\in\{appear,\,yaw,\,pitch,\,time\}}(1-D_s(i,j))\right]^\alpha
$$

$\alpha$: an exponent that nonlinearly scales the distances
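The edge weight can be sketched as below. This is only an illustration: the function name and the dictionary layout are hypothetical, and `alpha` is a required argument because no value for it is given here.

```python
import math


def edge_weight(parts, alpha):
    """D = [1 - prod_s (1 - D_s)]^alpha over the supplied distance terms.

    parts: dict mapping a term name ("appear", "yaw", "pitch", "time")
           to its normalized distance in [0, 1]. A term that is not
           available (e.g. no timestamps) can simply be left out.
    """
    keep = math.prod(1.0 - d for d in parts.values())
    return (1.0 - keep) ** alpha
```

The product form means that a single term close to 1 is enough to make the whole edge weight close to 1, so two faces are near each other only if they are near in every term.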

Two ways of finding paths

  1. Look for the shortest paths using Dijkstra’s algorithm
  2. Produce a smooth path of arbitrary length by taking walks on the graph
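The first option can be sketched with a textbook Dijkstra search using only the standard library. The graph layout (`{node: {neighbor: weight}}`) and the function name are assumptions made for this example; the walk-based option is not covered here.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a graph with non-negative edge weights.

    graph: dict mapping each node to a dict {neighbor: weight}.
    Returns (cost, path); (inf, []) if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]
```

For a face graph, the nodes would be photos and the weights the $D(i,j)$ values, so the returned path is a chain of intermediate photos between a source face and a target face.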

The cross dissolve

$$
I_{out}(t)=(1-t)I_{in_1}+tI_{in_2}
$$

$I_{in_1}, I_{in_2}$: the input images

$I_{out}$: the output sequences
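A minimal sketch of the cross dissolve, assuming two aligned grayscale images of equal size stored as lists of rows, with $t$ sampled uniformly in $[0,1]$. The function names and the uniform sampling are choices made for this example.

```python
def cross_dissolve(img1, img2, t):
    """Pixel-wise blend (1 - t) * img1 + t * img2 for t in [0, 1]."""
    return [
        [(1 - t) * a + t * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(img1, img2)
    ]


def dissolve_sequence(img1, img2, n_frames):
    """Frames from img1 (t = 0) to img2 (t = 1); n_frames must be >= 2."""
    return [cross_dissolve(img1, img2, k / (n_frames - 1)) for k in range(n_frames)]
```

The first frame of the sequence equals the first input and the last frame equals the second input; the frames in between are weighted averages of the two.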