
4.5 Cascade of Strong Classifiers
A boosted strong classifier effectively eliminates a large portion of nonface subwindows while maintaining a high detection rate. Nonetheless, a single strong classifier may not meet the requirement of an extremely low false alarm rate (e.g., 10^-6 or even lower). A solution is to arbitrate between several detectors (strong classifiers) [32], for example, using the "AND" operation.
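The effect of the "AND" operation on the error rates can be illustrated with a short calculation. The following is only a sketch: it assumes that the detectors err independently, which real detectors do not strictly satisfy, and the per-detector rates (0.99 and 0.3) and the function name `combined_rates` are hypothetical values chosen for illustration, not figures from the cited work.

```python
from math import prod

def combined_rates(detection_rates, false_alarm_rates):
    """Rates of an AND-combination of detectors.

    Assumption (not from the text): the detectors err independently,
    so the combined rates are products of the individual rates.
    """
    return prod(detection_rates), prod(false_alarm_rates)

# Hypothetical per-detector figures, for illustration only.
d, f = combined_rates([0.99] * 12, [0.3] * 12)
print(f"detection rate ~ {d:.3f}, false alarm rate ~ {f:.1e}")
```

Under these assumptions, twelve detectors that each pass 30% of nonface subwindows together fall below a false alarm rate of 10^-6, while the combined detection rate drops to roughly 0.89.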

Fig. 2.10. A cascade of n strong classifiers (SC). The input is a subwindow x. It is sent to the next SC for further classification only if it has passed all the previous SCs as the face (F) pattern; otherwise it exits as nonface (N). x is finally considered to be a face when it passes all n SCs.
Viola and Jones [46, 47] further extend this idea by training a cascade of strong classifiers, as illustrated in Figure 2.10. Each strong classifier is trained using bootstrapped nonface examples that pass through the previously trained part of the cascade. Usually, 10 to 20 strong classifiers are cascaded. During face detection, subwindows that fail to pass a strong classifier are not processed by the subsequent strong classifiers. This strategy can significantly speed up detection and reduce false alarms, at a small cost in detection rate.
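The early-exit behavior of the cascade, and the selection of bootstrapped nonface examples for the next stage, can be sketched as follows. This is a minimal illustration and not the implementation of [46, 47]: a strong classifier is modeled as any function returning True for "face", and `make_toy_stage`, with its mean-intensity thresholds, is a hypothetical stand-in for a trained classifier.

```python
from typing import Callable, List, Sequence, Tuple

Subwindow = Sequence[float]
StrongClassifier = Callable[[Subwindow], bool]  # True means "face"

def cascade_classify(x: Subwindow,
                     stages: Sequence[StrongClassifier]) -> Tuple[bool, int]:
    """Return (is_face, number of stages evaluated) for subwindow x."""
    for index, stage in enumerate(stages, start=1):
        if not stage(x):
            return False, index      # rejected: later stages are skipped
    return True, len(stages)         # passed every stage

def bootstrap_nonfaces(nonfaces: Sequence[Subwindow],
                       stages: Sequence[StrongClassifier]) -> List[Subwindow]:
    """Keep the nonface examples that the current cascade still accepts;
    these false alarms are the negatives used to train the next stage."""
    return [x for x in nonfaces if cascade_classify(x, stages)[0]]

def make_toy_stage(threshold: float) -> StrongClassifier:
    """Hypothetical stand-in for a trained strong classifier."""
    return lambda x: sum(x) / len(x) > threshold

stages = [make_toy_stage(t) for t in (0.2, 0.4, 0.6)]
print(cascade_classify([0.1, 0.1], stages))  # (False, 1)
print(cascade_classify([0.5, 0.5], stages))  # (False, 3)
print(cascade_classify([0.9, 0.9], stages))  # (True, 3)
```

The second element of the result shows where the saving comes from: a subwindow rejected by the first stage costs only one classifier evaluation.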
5 Dealing with Head Rotations
Multiview face detection should be able to detect nonfrontal faces. There are three types of head rotation: (1) out-of-plane (left-right) rotation; (2) in-plane rotation; and (3) up-and-down nodding rotation. Adopting a coarse-to-fine view-partition strategy, the detector-pyramid architecture consists of several levels from the coarse top level to the fine bottom level.
Rowley et al. [31] propose to use two neural network classifiers for detection of frontal faces subject to in-plane rotation. The first is the router network, trained to estimate the orientation of an assumed face in the subwindow, though the window may contain a nonface pattern. The inputs to the network are the intensity values in a preprocessed 20×20 subwindow. The angle of rotation is represented by an array of 36 output units, in which each unit represents an angular range. With the orientation estimate, the subwindow is derotated to make the potential face upright. The second neural network is a normal frontal, upright face detector.
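How an array of 36 output units can be turned into a derotation angle may be sketched as below. The text does not state the width of each angular range or the decoding rule, so this sketch assumes equal ranges of 360/36 = 10 degrees and simply picks the most active unit; the function name `decode_orientation` is hypothetical, and the decoding actually used in [31] may differ.

```python
def decode_orientation(outputs):
    """Map 36 router-network outputs to an orientation estimate in degrees.

    Assumptions (not from the text): unit i stands for the range starting
    at i * 10 degrees, and the most active unit gives the estimate.
    """
    if len(outputs) != 36:
        raise ValueError("expected 36 output units")
    bin_width = 360 / len(outputs)                       # 10 degrees per unit
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return best * bin_width

outputs = [0.0] * 36
outputs[3] = 1.0                     # hypothetical network response
angle = decode_orientation(outputs)  # 30.0
derotation = -angle                  # rotate the subwindow back to upright
print(angle, derotation)
```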
Li et al. [18, 20] constructed a detector-pyramid to detect the presence of upright faces subject to out-of-plane rotation in the range Θ = [−90°, +90°] and in-plane rotation in Φ = [−45°, +45°]. The in-plane rotation in Φ may be handled as follows: (1) Divide Φ into three subranges: Φ1 = [−45°, −15°], Φ2 = [−15°, +15°], and Φ3 = [+15°, +45°]. (2) Apply the detector-pyramid to the original image and to two images derived from it by rotating it in the image plane by ±30° (Figure 2.11). This effectively covers in-plane rotation in [−45°, +45°]. The up-and-down nodding rotation is handled by the tolerance of the face detectors to it.
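That a detector tolerant of in-plane rotation in [−15°, +15°] covers the whole range [−45°, +45°] once it is also applied to the two rotated images can be checked with a few lines of arithmetic. The sketch below assumes the sign convention that a face at in-plane angle phi appears at angle phi + r after the image is rotated by r; the names `ROTATIONS`, `TOLERANCE`, and `covering_rotations` are introduced here for illustration only.

```python
ROTATIONS = (-30, 0, 30)   # image rotations in degrees (0 = original image)
TOLERANCE = 15             # the detector handles in-plane angles in [-15, +15]

def covering_rotations(phi):
    """Rotations r for which a face at in-plane angle phi falls inside
    the detector's range, assuming it appears at phi + r after rotation."""
    return [r for r in ROTATIONS if abs(phi + r) <= TOLERANCE]

# Every integer angle in [-45, +45] is covered by at least one image.
assert all(covering_rotations(phi) for phi in range(-45, 46))

print(covering_rotations(40))   # [-30]
print(covering_rotations(-15))  # [0, 30]
print(covering_rotations(50))   # []
```

Angles on a subrange boundary, such as −15°, are covered by two of the three images, which is one reason detections from the different images have to be merged afterwards.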

Fig. 2.11. Middle: An image containing frontal faces subject to in-plane rotation. Left and right: In-plane rotated by ±30°.
The design of the detector-pyramid adopts the coarse-to-fine and simple-to-complex strategies [2, 8]. The architecture is illustrated in Figure 2.12. It is designed for the detection of faces subject to out-of-plane rotation in Θ = [−90°, +90°] and in-plane rotation in Φ2 = [−15°, +15°]. The full in-plane rotation in Φ = [−45°, +45°] is dealt with by applying the detector-pyramid to the images rotated by ±30°, as mentioned earlier.

Fig. 2.12. Detector-pyramid for multiview face detection.
Coarse-to-fine. The partitions of the out-of-plane rotation for the three-level detector-pyramid are illustrated in Figure 2.13. As the level goes from coarse to fine, the full range Θ of out-of-plane rotation is partitioned into increasingly narrower ranges. Although there are no overlaps between the partitioned view subranges at each level, a face detector trained for one view may detect faces of its neighboring views. Therefore, faces detected by the seven channels at the bottom level of the detector-pyramid must be merged to obtain the final result. This is illustrated in Figure 2.14.

Fig. 2.13. Out-of-plane view partition. Out-of-plane head rotation (row 1), the facial view labels (row 2), and the coarse-to-fine view partitions at the three levels of the detector-pyramid (rows 3 to 5).
Simple-to-complex. A large number of subwindows result from the scan of the input image. For example, there can be tens to hundreds of thousands of them for an image of size 320×240, the actual number depending on how the image is scanned (e.g., on the scale increment factor). For efficiency, it is crucial to discard as many nonface subwindows as possible at the earliest possible stage, so that as few subwindows as possible are processed at later stages. Therefore, the detectors in the early stages are designed to be simple so that they can reject nonface subwindows quickly with little computation, whereas those at the later stages are more complex and require more computation.
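The order of magnitude quoted for a 320×240 image can be reproduced by counting the subwindows of a multiscale scan. The sketch below is illustrative only: the base window of 20×20 pixels, the scale increment factor of 1.25, and the fixed shift step are assumed values, and `count_subwindows` is a hypothetical helper rather than part of any cited system.

```python
def count_subwindows(width, height, base=20, scale_factor=1.25, step=1):
    """Count square subwindows visited by a multiscale sliding-window scan.

    Assumed parameters (not from the text): a base window of `base` pixels,
    enlarged by `scale_factor` at each scale and shifted by `step` pixels.
    """
    total = 0
    scale = 1.0
    while True:
        size = round(base * scale)
        if size > min(width, height):
            break                                # window no longer fits
        positions_x = (width - size) // step + 1
        positions_y = (height - size) // step + 1
        total += positions_x * positions_y
        scale *= scale_factor
    return total

dense = count_subwindows(320, 240)               # one-pixel shifts
sparse = count_subwindows(320, 240, step=2)      # two-pixel shifts
coarse = count_subwindows(320, 240, scale_factor=1.5)
print(dense, sparse, coarse)
```

With these settings the dense scan yields several hundred thousand subwindows, and a larger shift step or scale increment factor reduces the count, which is the dependence on the scanning scheme noted above.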

Fig. 2.14. Merging from different channels. From left to right: Outputs of frontal, left, and right view channels and the final result after the merge.
6 Postprocessing
A single face in an image may be detected several times at close locations or on multiple scales. False alarms may also occur, but usually with less consistency than multiple detections of a true face. The number of detections in a neighborhood of a location can therefore be used as an effective indication of the existence of a face at that location. This assumption leads to a heuristic for resolving the ambiguity caused by multiple detections and for eliminating many false detections. A detection is confirmed if the number of multiple detections is greater than a given value; once it is confirmed, the multiple detections are merged into a single consistent one. This is practiced in most face detection systems [32, 41]. Figure 2.15 gives an illustration. The image on the left shows a typical output of initial detection, where the face is detected four times, with four false alarms on the clothing. On the right is the final result after merging: the multiple detections are merged into a single face and the false alarms are eliminated. Figures 2.16 and 2.17 show some typical frontal and multiview face detection examples; the multiview face images are from the Carnegie Mellon University (CMU) face database [45].
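The counting heuristic can be sketched as a simple greedy grouping of detections. This is one possible reading of the heuristic, not the procedure of [32, 41]: detections are (x, y, size) triples, the neighborhood is a square of half-width `radius` around the running cluster center, scale is ignored when grouping, and the function name, parameter values, and coordinates are all hypothetical.

```python
def merge_detections(detections, radius=10.0, min_count=2):
    """Group nearby detections, keep groups with more than `min_count`
    members, and replace each kept group by its average (x, y, size)."""
    clusters = []
    for det in detections:
        for cluster in clusters:
            cx = sum(d[0] for d in cluster) / len(cluster)
            cy = sum(d[1] for d in cluster) / len(cluster)
            if abs(det[0] - cx) <= radius and abs(det[1] - cy) <= radius:
                cluster.append(det)      # joins the first nearby cluster
                break
        else:
            clusters.append([det])       # starts a new cluster
    merged = []
    for cluster in clusters:
        if len(cluster) > min_count:     # confirmed: enough agreement
            n = len(cluster)
            merged.append(tuple(sum(d[i] for d in cluster) / n
                                for i in range(3)))
    return merged

# Hypothetical input: four overlapping detections and four isolated ones.
detections = [(100, 100, 40), (102, 98, 40), (98, 102, 44), (100, 100, 44),
              (30, 200, 40), (200, 30, 40), (250, 220, 40), (60, 60, 40)]
print(merge_detections(detections))  # [(100.0, 100.0, 42.0)]
```

In this toy input the four consistent detections are merged into one face, while each isolated detection forms a cluster of size one and is discarded as a false alarm.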

