FPVLS

Brief Introduction

As video live streaming becomes prevalent, an online pixelation mechanism, or at least an online face pixelation mechanism, is urgently needed. In this paper, we develop a new method called Face Pixelation in Video Live Streaming (FPVLS) to filter personal privacy automatically during unconstrained streaming activities. Simply applying multi-face trackers runs into problems of computing efficiency, target drifting, and over-pixelation because of the inherent characteristics of live video streaming. Therefore, for fast and accurate pixelation of irrelevant people's faces, FPVLS is organized in a frame-to-video structure with two core stages. On individual frames, our framework exploits the speed and low cost of image-based face detection and embedding networks to yield face vectors. We propose a Positioned Incremental Affinity Propagation (PIAP) clustering algorithm to associate the same person's faces across frames according to the face vectors and their corresponding positions. PIAP also extends classic affinity propagation to an incremental form so that raw face trajectories are generated efficiently. Such frame-wise accumulated raw trajectories are likely to be intermittent and unreliable at the video level. Hence, we further introduce a trajectory refinement stage that combines a proposal network for loose face detection with a two-sample test based on the Empirical Likelihood Ratio (ELR) statistic to compensate for the deep network's missed detections and refine the raw trajectories seamlessly. The shallow proposal network and the ELR test do not add a significant computational burden. A Gaussian filter is applied to the refined trajectories for the final pixelation.
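As a rough illustration of the final step only, the sketch below applies a separable Gaussian filter to one face box of a grayscale frame. It is a minimal sketch under stated assumptions, not the FPVLS implementation: the function names, the `(top, left, bottom, right)` box layout, the list-of-rows frame format, and the choice of `sigma` are all hypothetical.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve a 1-D sequence with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out


def blur_face_region(frame, box, sigma=2.0):
    """Blur the pixels inside box = (top, left, bottom, right), ends exclusive.

    frame is a grayscale image stored as a list of rows (an assumed format).
    The input is left untouched and a new frame is returned.
    """
    top, left, bottom, right = box
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(radius, sigma)
    patch = [row[left:right] for row in frame[top:bottom]]
    # Separable filter: blur along rows first, then along columns.
    patch = [blur_1d(row, kernel) for row in patch]
    columns = [blur_1d(list(col), kernel) for col in zip(*patch)]
    patch = [list(row) for row in zip(*columns)]
    result = [list(row) for row in frame]
    for offset, blurred_row in enumerate(patch):
        result[top + offset][left:right] = blurred_row
    return result
```

Calling `blur_face_region(frame, box)` once per face box on a refined trajectory would blur only those boxes and leave the rest of the frame unchanged. A production system would more likely call an optimized image library than loop in pure Python.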

Graph of Conceptual Working Pipeline

Conceptual Working Pipeline

Toy example of PIAP Clustering

A toy example of how PIAP clustering works is presented below. Panels (a)-(i) correspond to the pictures in left-to-right, then top-to-bottom order. Traditional AP clustering is run on the first batch of objects; it converges in (a), and the clustering result is shown in (b). New objects arrive in (c), and the aggregated affinity is recomputed in (d). Message passing continues in (e) through (h) and reconverges in (i), which shows the final clustering result.

PIAP Example
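The toy example can be mimicked with a small sketch of classic affinity propagation that is warm-started when new faces arrive. This is only an illustration under assumptions, not the PIAP algorithm itself: the similarity (negative squared distance over embedding plus weighted position), the `pos_weight` and `preference` parameters, and the zero-padding of old messages for new objects are hypothetical choices made for the example. It expects at least two faces.

```python
def similarity(face_a, face_b, pos_weight=1.0):
    """Negative squared distance over embedding and position (assumed form)."""
    (vec_a, pos_a), (vec_b, pos_b) = face_a, face_b
    emb = sum((x - y) ** 2 for x, y in zip(vec_a, vec_b))
    pos = sum((x - y) ** 2 for x, y in zip(pos_a, pos_b))
    return -(emb + pos_weight * pos)


def similarity_matrix(faces, preference, pos_weight=1.0):
    """Pairwise similarities, with the preference on the diagonal."""
    n = len(faces)
    return [[preference if i == k else similarity(faces[i], faces[k], pos_weight)
             for k in range(n)] for i in range(n)]


def pad(matrix, n):
    """Copy old messages into an n x n matrix; new rows and columns start at zero."""
    old = len(matrix) if matrix else 0
    return [[matrix[i][k] if i < old and k < old else 0.0 for k in range(n)]
            for i in range(n)]


def propagate(s, r=None, a=None, damping=0.5, iterations=200):
    """Run damped AP message passing, optionally warm-started from old r and a."""
    n = len(s)
    r, a = pad(r, n), pad(a, n)
    for _ in range(iterations):
        # Responsibility update: how well suited k is as exemplar for i.
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)
        # Availability update: how appropriate it is for i to choose k.
        for k in range(n):
            positive = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, r[k][k] + total - positive[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    labels = [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]
    return labels, r, a
```

A first batch would be clustered with `labels, r, a = propagate(similarity_matrix(batch, preference))`. When new faces arrive, the similarity matrix is rebuilt over all faces and `propagate` is called again with the previous `r` and `a`, so message passing continues from the earlier state instead of restarting from zero.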

The Structure of Proposal Net (Compensate Detection through Shallow CNN)

To fill the gaps that false negatives leave in a trajectory, we build a proposal net with the structure shown below. The proposal net resizes frames in the same way as MTCNN and proposes suspicious face areas in the gap frames.

Proposal Net
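The page says only that the proposal net resizes frames as MTCNN does. As a hedged sketch of what that resizing usually looks like, the function below computes the image-pyramid scales in the style of public MTCNN implementations. The function name and the default values (minimum face size 20, scale factor 0.709, network input size 12) are assumptions taken from common MTCNN defaults, not values confirmed for FPVLS.

```python
def pyramid_scales(height, width, min_face_size=20, factor=0.709, net_input=12):
    """Return the scales at which a frame is resized for a 12x12-style proposal net.

    The first scale maps a face of min_face_size pixels onto the network input
    size; each following scale shrinks the frame by `factor` until the shorter
    side would fall below the network input size.
    """
    scale = net_input / min_face_size
    min_side = min(height, width) * scale
    scales = []
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales
```

For a 1080p frame, `pyramid_scales(1080, 1920)` starts at a scale of 0.6 and yields 12 scales with these defaults. A frame smaller than the network input yields an empty list.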

Two-sample test based on ELR (compensate for lost detections)

Relationship of z (solid line) and z' (orange dashed lines). Orange dots are the suspicious faces proposed by the proposal network. Red areas on z are the breaks recovered by interpolation.

Proposal Net
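The caption above mentions breaks recovered by interpolation. The sketch below shows one simple way such a break could be filled, by linearly interpolating face boxes between the detections on either side of a gap. It is an assumption-laden illustration: the `fill_gaps` name, the `frame index -> (x, y, w, h)` trajectory layout, and the use of plain linear interpolation are hypothetical, and the ELR two-sample test that decides which proposals to accept is not reproduced here.

```python
def fill_gaps(trajectory):
    """Linearly interpolate missing frames between known detections.

    trajectory maps a frame index to a face box (x, y, w, h). Frames between
    two known detections receive boxes on the straight line between them.
    Frames before the first or after the last detection are not extrapolated.
    """
    frames = sorted(trajectory)
    filled = dict(trajectory)
    for start, end in zip(frames, frames[1:]):
        gap = end - start
        for frame in range(start + 1, end):
            t = (frame - start) / gap
            filled[frame] = tuple(
                a + t * (b - a)
                for a, b in zip(trajectory[start], trajectory[end])
            )
    return filled
```

For example, `fill_gaps({0: (0, 0, 10, 10), 4: (8, 4, 10, 10)})` adds boxes for frames 1 to 3, and the box at frame 2 is `(4.0, 2.0, 10.0, 10.0)`.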

Video Test Data Results (Naive Cases)

Video Test Data Results (Sophisticated Cases)

The YouTube Studio offline tool failed to produce any mosaics in the first few tens of seconds. Then, after some recalibration, YouTube Studio worked but suffered from heavy drifting.

Our FPVLS results with the PIAP clustering algorithm; the compensation algorithm is not yet applied in the following demo.

Our FPVLS results with the compensation algorithm.

Pixelation Results Analysis#1

In this section, we compare the pixelation results of FPVLS with those of the offline face blur tool offered by YouTube Studio in a 1080p high-resolution (H), multi-people (S) scenario. A thumbnail shows the pixelation results in sequential order from left to right. Since the paper-sized thumbnail cannot present the details, we also show the original pictures one by one under the thumbnail. The upper row of the thumbnail is produced offline by YouTube Studio; FPVLS generates the lower row in real time. This test demonstrates FPVLS's ability to refine raw trajectories. The live stream took place in a crowded street with a noisy and complex background. Except for the main streamer, James Xiao, all other people, including the dancers, are set to be blurred. Under unpredictable camera movements, tracking algorithms cannot handle the drifting and tracking-loss problems because the linkage of tracklets fails. FPVLS, however, can still place mosaics precisely on irrelevant people's faces through compensating detections and the empirical likelihood ratio test.

The Thumbnail of Pixelation Results#1

thumbnail

Original Pictures of Pixelation Results#1

YouTube Studio Pixelation Results	FPVLS Pixelation Results

Pixelation Results Analysis#2

This section shows another comparison of FPVLS and YouTube Studio, this time in a low-resolution (480p) (L), few-people (N) scenario. A thumbnail displays the pixelation results during live streaming from left to right. Since the paper-sized thumbnail cannot present the details, we also show the original pictures one by one under the thumbnail. The upper row of the thumbnail is produced offline by YouTube Studio; FPVLS generates the lower row in real time. This test focuses on the typical over-pixelation problem that current face tracking algorithms cannot handle. James Xiao is playing the piano while his friend is watching. His friend's face is set to be blurred for privacy protection. When two or more faces overlap, we no longer want to pixelate the occluded faces, since they are invisible to the audience. Tracking algorithms, however, keep predicting the movement of such partially or fully occluded faces and retain their tracklets, which produces many annoying and odd mosaics during streaming.

The Thumbnail of Pixelation Results#2

thumbnail

Original Pictures of Pixelation Results#2

YouTube Studio Pixelation Results	FPVLS Pixelation Results

Brief Conclusion

To the best of our knowledge, we are the first to address the face pixelation problem in live video streaming, which we do by building the proposed FPVLS. FPVLS already surpasses the offline tools offered by YouTube and Microsoft and is applicable in real-life scenarios. FPVLS achieves high accuracy and real-time performance on the dataset we collected. In the future, we will extend FPVLS to other privacy-sensitive objects.

#1:Brief Introduction #2:Working Pipeline #3:More Video Pixelation Results on High-resolution Multi-people (HS) Scenario #4:More Video Pixelation Results on Low-resolution Few-people (LN) Scenario #5:Brief Conclusion
