| 1 | +# Image flagging |
| 2 | + |
| 3 | +The image flagging system automatically identifies inappropriate or problematic content in product images to help maintain Open Food Facts' image quality standards. |
| 4 | + |
| 5 | +## How it works |
| 6 | + |
| 7 | +Image flagging uses multiple detection methods to identify content that may not be appropriate for a food database: |
| 8 | + |
| 9 | +1. **Face Detection** – Uses Google Cloud Vision's Face Detection API to identify images containing human faces. |
| 10 | +2. **Label Annotation** – Scans for labels indicating the presence of humans, pets, electronics, or other non-food items. |
| 11 | +3. **Safe Search** – Uses Google Cloud Vision's Safe Search API to detect adult content or violence. |
| 12 | +4. **Text Detection** – Analyzes OCR text for keywords related to beauty products or other inappropriate content. |
| 13 | + |
| 14 | +When flagged content is detected, an `image_flag` prediction is generated with details about the issue and the associated confidence level. These predictions trigger notifications to moderation services where humans can review potentially problematic images. |
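The flow described above can be sketched as a loop over detectors that each return prediction data or nothing. This is a minimal illustration, not the project's actual code: `ImageFlagPrediction`, `Detector`, and `run_detectors` are hypothetical names, and the notification step is left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ImageFlagPrediction:
    """Hypothetical container for one `image_flag` prediction."""

    data: dict
    type: str = "image_flag"


# A detector takes the raw Cloud Vision response and returns prediction
# data, or None when nothing should be flagged (assumed interface).
Detector = Callable[[dict], Optional[dict]]


def run_detectors(
    vision_response: dict, detectors: list[Detector]
) -> list[ImageFlagPrediction]:
    """Run every detector and wrap each hit in an image_flag prediction."""
    predictions = []
    for detector in detectors:
        data = detector(vision_response)
        if data is not None:
            predictions.append(ImageFlagPrediction(data=data))
    return predictions
```

Keeping each detection method behind the same small interface makes it straightforward to add or remove a method without touching the others.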
| 15 | + |
| 16 | +## Detection Methods |
| 17 | + |
| 18 | +### Face Detection |
| 19 | + |
| 20 | +The system processes `faceAnnotations` from Google Cloud Vision to detect human faces. If multiple faces are detected, only the one with the highest detection confidence is considered. To minimize false positives, the image is flagged only if that confidence is ≥ 0.6. |
| 21 | + |
| 22 | +The prediction data includes: |
| 23 | + |
| 24 | +- `type`: "face_annotation" |
| 25 | +- `label`: "face" |
| 26 | +- `likelihood`: Detection confidence score |
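As a rough sketch of the rule above, the function below picks the highest-confidence face and applies the 0.6 threshold. It assumes the input is the `faceAnnotations` list from the Cloud Vision JSON response, where each entry carries a `detectionConfidence` field. The name `flag_face` is hypothetical.

```python
from typing import Optional

FACE_CONFIDENCE_THRESHOLD = 0.6  # threshold stated in this section


def flag_face(face_annotations: list[dict]) -> Optional[dict]:
    """Return prediction data for the most confident face, or None."""
    if not face_annotations:
        return None
    # Keep only the highest detection confidence across all faces.
    best = max(a.get("detectionConfidence", 0.0) for a in face_annotations)
    if best < FACE_CONFIDENCE_THRESHOLD:
        return None
    return {"type": "face_annotation", "label": "face", "likelihood": best}
```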
| 27 | + |
| 28 | +### Label Annotation Detection |
| 29 | + |
| 30 | +The system flags an image when Google Cloud Vision returns one of the labels listed below with a confidence score ≥ 0.6. Only the first matching label is flagged per image. |
| 31 | + |
| 32 | +**Human-related labels**: |
| 33 | + |
| 34 | +- Face, Head, Selfie, Hair, Forehead, Chin, Cheek |
| 35 | +- Arm, Tooth, Human Leg, Ankle, Eyebrow, Ear, Neck, Jaw, Nose |
| 36 | +- Facial Expression, Glasses, Eyewear |
| 37 | +- Child, Baby, Human |
| 38 | + |
| 39 | +**Other flagged labels**: |
| 40 | + |
| 41 | +- **Pets**: Dog, Cat |
| 42 | +- **Technology**: Computer, Laptop, Refrigerator |
| 43 | +- **Clothing**: Jeans, Shoe |
| 44 | + |
| 45 | +The prediction data includes: |
| 46 | + |
| 47 | +- `type`: "label_annotation" |
| 48 | +- `label`: The detected label (lowercase) |
| 49 | +- `likelihood`: Label confidence score |
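A minimal sketch of this check follows, assuming the input is the `labelAnnotations` list from the Cloud Vision JSON response (entries with `description` and `score`). It also assumes that "first matching" means the first listed label that clears the threshold. `flag_label` and `FLAGGED_LABELS` are hypothetical names.

```python
from typing import Optional

LABEL_CONFIDENCE_THRESHOLD = 0.6  # threshold stated in this section

# Lowercased labels from the lists above.
FLAGGED_LABELS = {
    "face", "head", "selfie", "hair", "forehead", "chin", "cheek",
    "arm", "tooth", "human leg", "ankle", "eyebrow", "ear", "neck",
    "jaw", "nose", "facial expression", "glasses", "eyewear",
    "child", "baby", "human",
    "dog", "cat",
    "computer", "laptop", "refrigerator",
    "jeans", "shoe",
}


def flag_label(label_annotations: list[dict]) -> Optional[dict]:
    """Return prediction data for the first flagged label, or None."""
    for annotation in label_annotations:
        label = annotation.get("description", "").lower()
        score = annotation.get("score", 0.0)
        if label in FLAGGED_LABELS and score >= LABEL_CONFIDENCE_THRESHOLD:
            # Stop at the first match: one label is flagged per image.
            return {"type": "label_annotation", "label": label, "likelihood": score}
    return None
```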
| 50 | + |
| 51 | +### Safe Search Detection |
| 52 | + |
| 53 | +The system flags the following Safe Search categories only when Google Cloud Vision rates them as "VERY_LIKELY": |
| 54 | + |
| 55 | +- **Adult content** – Sexually explicit material |
| 56 | +- **Violence** – Graphic or violent imagery |
| 57 | + |
| 58 | +The prediction data includes: |
| 59 | + |
| 60 | +- `type`: "safe_search_annotation" |
| 61 | +- `label`: "adult" or "violence" |
| 62 | +- `likelihood`: Likelihood level name (always "VERY_LIKELY", since lower levels are not flagged) |
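The check above might look like the following sketch. It assumes the input is the `safeSearchAnnotation` object from the Cloud Vision JSON response, whose `adult` and `violence` fields hold likelihood names as strings. Whether both categories can be flagged for the same image is an assumption here, and `flag_safe_search` is a hypothetical name.

```python
FLAGGED_CATEGORIES = ("adult", "violence")
REQUIRED_LIKELIHOOD = "VERY_LIKELY"


def flag_safe_search(safe_search_annotation: dict) -> list[dict]:
    """Return prediction data for each category rated VERY_LIKELY."""
    predictions = []
    for category in FLAGGED_CATEGORIES:
        likelihood = safe_search_annotation.get(category)
        if likelihood == REQUIRED_LIKELIHOOD:
            predictions.append(
                {
                    "type": "safe_search_annotation",
                    "label": category,
                    "likelihood": likelihood,
                }
            )
    return predictions
```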
| 63 | + |
| 64 | +### Text-based Detection |
| 65 | + |
| 66 | +The system scans OCR-extracted text for keywords from predefined keyword files. Only the first matching keyword is flagged per image. |
| 67 | + |
| 68 | +- **Beauty products** – Cosmetic-related terms from the beauty keyword file |
| 69 | +- **Miscellaneous** – Other inappropriate-content keywords from the miscellaneous keyword file |
| 70 | + |
| 71 | +The prediction data includes: |
| 72 | + |
| 73 | +- `type`: "text" |
| 74 | +- `label`: "beauty" or "miscellaneous" |
| 75 | +- `text`: The matched text phrase |
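The keyword scan can be sketched as below. The matching strategy is not described here, so this example assumes case-insensitive whole-word matching and checks keywords in list order. The keyword lists are passed in directly rather than loaded from the keyword files, and `flag_text` is a hypothetical name.

```python
import re
from typing import Optional


def flag_text(ocr_text: str, keywords_by_label: dict[str, list[str]]) -> Optional[dict]:
    """Return prediction data for the first keyword found in the OCR text."""
    for label, keywords in keywords_by_label.items():
        for keyword in keywords:
            # Whole-word, case-insensitive match (an assumption of this sketch).
            pattern = rf"\b{re.escape(keyword)}\b"
            match = re.search(pattern, ocr_text, flags=re.IGNORECASE)
            if match:
                # Stop at the first match: one keyword is flagged per image.
                return {"type": "text", "label": label, "text": match.group(0)}
    return None
```

For example, with illustrative keyword lists:

```python
keywords = {"beauty": ["lipstick", "mascara"], "miscellaneous": ["tobacco"]}
flag_text("New Mascara for long lashes", keywords)
```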