GovHack 2025: Protecting Australians through AI-powered government data verification
A comprehensive end-to-end system that transforms government data into life-saving protection tools. Built by a solo developer in one weekend, this project demonstrates how modern AI technologies can address critical social problems at scale.
The Problem: Australians lose billions of dollars annually to fraudsters impersonating trusted organizations like the ATO, Australia Post, and charities. The core issue is the lack of a single, trustworthy source for people to verify if a communication is legitimate.
The Solution: A two-part ecosystem combining sophisticated backend AI with an empathetic, accessible mobile experience, turning doubt into certainty for vulnerable populations.
This project consists of two integrated components:
- 🔧 Backend Data Pipeline: Multi-agent AI system that collects, validates, and categorizes legitimate government contact information
- 📱 iOS Mobile App: Native application that provides real-time scam protection using the verified data
Building safer communities through AI-powered government data verification
A comprehensive multi-agent system that collects, validates, and categorizes legitimate government and charity contact information to protect Australians from scams using Google's Agent2Agent (A2A) protocol.
Australians lose $3.1 billion annually to scams where fraudsters impersonate:
- Government agencies (ATO, Centrelink, Medicare)
- Banks and financial institutions
- Charities and community organizations
- Healthcare providers
Current Challenge: No centralized database exists to quickly verify if a contact claiming to be from a legitimate organization is actually authentic.
We built an AI-powered multi-agent pipeline that:
- Automatically collects verified contact information from official government APIs and websites
- AI-validates data quality using advanced format compliance and consistency checks
- Cross-references legitimate contacts against known scam databases
- Categorizes and prioritizes contacts by risk level and organization type
- Provides real-time verification for incoming calls, emails, and websites
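The verification step in the last bullet can be pictured as a normalized lookup against two sets. The sketch below is a minimal illustration under stated assumptions, not the project's actual implementation: `normalize_phone`, `verify_contact`, and the sample sets are hypothetical names, and the threat number is a placeholder.

```python
import re

def normalize_phone(raw: str) -> str:
    """Strip formatting and convert a +61 prefix to national format."""
    digits = re.sub(r"\D", "", raw)
    # +61 412 345 678 -> 0412345678 (assumes a 9-digit national number)
    if len(digits) == 11 and digits.startswith("61"):
        digits = "0" + digits[2:]
    return digits

def verify_contact(raw: str, verified: set, threats: set) -> str:
    """Return 'scam', 'safe', or 'unknown' for a phone number."""
    number = normalize_phone(raw)
    if number in threats:      # known scam indicators win over everything else
        return "scam"
    if number in verified:
        return "safe"
    return "unknown"

# Hypothetical sample data for illustration only
VERIFIED = {normalize_phone("13 28 61")}      # ATO number quoted later in this README
THREATS = {normalize_phone("0400 000 000")}   # placeholder, not a real scam number

print(verify_contact("132 861", VERIFIED, THREATS))
print(verify_contact("+61 400 000 000", VERIFIED, THREATS))
```

Checking the threat set first is a deliberate choice in this sketch: a number that somehow appears in both sets is treated as unsafe.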
Our system demonstrates Google's Agent Development Kit (ADK) concepts with:
- Coordinator Agent: Orchestrates the entire pipeline
- Collector Agents: Specialized scrapers for different data sources
- Critic Agent: AI-powered quality assessment and validation
- Sorter Agent: Risk categorization and priority assignment
- Standardizer Agent: Data normalization across all sources
- 🤖 Visualization Agent: LLM-enhanced dashboard generation with Claude API
- Real-time agent communication visualization
- Interactive contact verification lookup
- Risk assessment scoring demonstration
- Data quality metrics dashboard
- 🔴 LIVE status dashboard with real-time metrics
- 6 Chart.js visualizations with dynamic data loading
- 415 Total Contact Records processed across 5 proven agents
- 402 Verified Safe Contacts (96.9% safety rate)
- 13 Threat Indicators identified and catalogued
- 100% Pipeline Success Rate across all 5 collector agents
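The headline figures above can be reproduced with a few lines of arithmetic. This is only a sanity-check sketch; the per-source counts are the ones reported in the results table of this README, and the variable names are illustrative.

```python
# Record counts per source, as reported in this README
source_counts = {
    "Federal Government Services": 109,
    "NSW Hospitals": 266,
    "NSW Government Directory": 22,
    "Scamwatch Threat Intel": 13,
    "ACNC Charity Register": 5,
}

total_records = sum(source_counts.values())
threat_indicators = source_counts["Scamwatch Threat Intel"]
verified_safe = total_records - threat_indicators
safety_rate = round(100 * verified_safe / total_records, 1)

print(total_records, verified_safe, safety_rate)
```

With these inputs the total is 415 records, of which 402 are safe contacts, giving the 96.9% safety rate quoted above.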
| Source | Records | Success Rate | Key Data |
|---|---|---|---|
| Federal Government Services | 109 | 100% | Official phone numbers for federal agencies |
| NSW Hospitals | 266 | 100% | Complete NSW health system contacts |
| NSW Government Directory | 22 | 100% | State agency contact information |
| Scamwatch Threat Intel | 13 | 100% | Known scam patterns and indicators |
| ACNC Charity Register | 5 | 100%* | Organizational verification (see limitations) |
- Average Confidence Score: 93% across all collected data
- Data Validation: Format compliance and duplicate detection
- Quality Grading: Grade A (95.4% overall score)
- LLM Integration: Real Claude API powering intelligent analysis
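Format compliance and duplicate detection, as mentioned above, could look roughly like the following. This is a hedged sketch rather than the critic agent's real logic: the `PHONE_RE` pattern is a simplified assumption about Australian number formats, and `check_records` and the record fields are hypothetical.

```python
import re

# Simplified assumption: +61 plus 9 digits, 0 plus 9 digits,
# 1300/1800 plus 6 digits, or a 6-digit 13 number
PHONE_RE = re.compile(r"^(?:\+61\d{9}|0\d{9}|1[38]00\d{6}|13\d{4})$")

def check_records(records):
    """Split records into valid ones and (name, reason) issues."""
    seen = set()
    valid, issues = [], []
    for rec in records:
        phone = re.sub(r"[\s()-]", "", rec["phone"])  # drop spaces, brackets, hyphens
        if not PHONE_RE.match(phone):
            issues.append((rec["name"], "bad_format"))
        elif phone in seen:
            issues.append((rec["name"], "duplicate"))
        else:
            seen.add(phone)
            valid.append(rec)
    return valid, issues

sample = [
    {"name": "ATO", "phone": "13 28 61"},
    {"name": "ATO (copy)", "phone": "132861"},
    {"name": "Broken entry", "phone": "12345"},
]
valid, issues = check_records(sample)
print(len(valid), issues)
```

Duplicates are detected on the normalized number, so the same contact written with different spacing is only kept once.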
COORDINATOR AGENT (Orchestrator)
├── COLLECTOR AGENTS (Data Gathering)
│   ├── government_services_scraper (Federal APIs)
│   ├── nsw_hospitals_agent (Health System)
│   ├── nsw_correct_scraper (State Directory)
│   └── scamwatch_threat_agent (Threat Intel)
├── CRITIC AGENT (AI Quality Assessment)
├── SORTER AGENT (Risk Categorization)
├── STANDARDIZER AGENT (Data Normalization)
└── VISUALIZATION AGENT (LLM Dashboard Generation)
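The coordinator's role in the hierarchy above can be sketched as a loop that runs each phase in order and records the outcome. This is an illustrative sketch only: `run_phases` and the stand-in phase functions are hypothetical and do not reflect how `run_pipeline.py` or the A2A messaging is actually written.

```python
from typing import Callable, Dict, List, Tuple

Phase = Callable[[Dict], Dict]

def run_phases(phases: List[Tuple[str, Phase]]) -> Tuple[Dict, List[Tuple[str, str]]]:
    """Run phases in order, passing shared state along and logging each result."""
    state: Dict = {"records": []}
    log: List[Tuple[str, str]] = []
    for name, phase in phases:
        try:
            state = phase(state)
            log.append((name, "ok"))
        except Exception as exc:
            # Stop at the first failure so later phases never see partial data
            log.append((name, f"failed: {exc}"))
            break
    return state, log

# Stand-in phases (hypothetical); real agents would scrape, validate, and sort
def collect(state: Dict) -> Dict:
    return {**state, "records": [{"name": " ato ", "phone": "13 28 61"}]}

def standardize(state: Dict) -> Dict:
    cleaned = [{**r, "name": r["name"].strip().upper()} for r in state["records"]]
    return {**state, "records": cleaned}

def sort_by_risk(state: Dict) -> Dict:
    return {**state, "safe": list(state["records"]), "threats": []}

state, log = run_phases([
    ("collect", collect),
    ("standardize", standardize),
    ("sort", sort_by_risk),
])
print(log)
```

Keeping each phase as a plain function of the shared state makes it easy to run a single agent in isolation, which matches the per-agent commands described in this README.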
- Real-world Problem Solving: Addresses the $3.1 billion annual scam losses in Australia
- Advanced AI Integration: Google A2A protocol + LLM-enhanced agents
- Government Data Utilization: Leverages official APIs and directories
- Scalable Architecture: Can easily extend to all states/territories
- Production Ready: Grade A data quality with comprehensive error handling
- 🤖 LLM Innovation: Real Claude API integration for intelligent analysis
- Python 3.8+
- Virtual environment support (required)
- Internet connection for data collection
# Clone the repository
git clone https://github.com/vin67/govhack2025.git
cd govhack2025
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

To test the system's live data collection capabilities, first clear existing data files:
# Clear all data files to force fresh collection from web sources
find data/ -name "*.csv" -delete && find data/ -name "*.json" -delete
# Clear generated dashboards to test live visualization agent
rm frontend/live_dashboard.html
# Or remove specific categories
rm -rf data/raw/* data/verified/* data/threats/* data/reports/*
rm data/standardized_contacts.csv data/sorted_contacts_master.csv

This ensures the pipeline fetches fresh data from:
- directory.gov.au (federal services)
- NSW Health API (hospitals)
- service.nsw.gov.au (government agencies)
- scamwatch.gov.au (threat intelligence)
# Activate virtual environment and run the full multi-agent pipeline
source venv/bin/activate # On Windows: venv\Scripts\activate
python backend/run_pipeline.py

This will execute all agents in sequence:
- Data Collection (5 collector agents)
- Data Standardization (410 records normalized)
- Quality Review (AI-powered validation)
- Risk Categorization (Safe vs. threat classification)
- 🤖 LLM-Enhanced Dashboard Generation (Claude AI analysis + Chart.js visualizations)
- Final Reporting (Pipeline summary and A2A communication log)
Expected Results:
✅ Phase 1: Data Collection (5 agents)
✅ Phase 2: Data Standardization
✅ Phase 3: Quality Assessment (AI-powered)
✅ Phase 4: Risk Analysis & Sorting
✅ Phase 5: Live Dashboard Creation
✅ Phase 6: Final Report
# View results:
open frontend/live_dashboard.html # Live agent-generated dashboard
open frontend/dashboard.html # Interactive static dashboard
# Activate virtual environment first
source venv/bin/activate # On Windows: venv\Scripts\activate
# Run specific data collectors
python backend/agents/gov_services_scraper.py # Federal services (109 records)
python backend/agents/nsw_hospitals_agent.py # NSW hospitals (266 records)
python backend/agents/scamwatch_threat_agent.py # Threat intelligence (13 indicators)
# Run data processing agents
python backend/utils/data_standardizer.py # Normalize all datasets
python backend/agents/critic_agent.py # AI quality assessment
python backend/agents/sorter_agent.py # Risk categorization
python backend/agents/visualization_agent.py # 🤖 LLM-enhanced dashboard with Chart.js

After running the pipeline, you'll find organized data in the data/ directory:
- government_contacts.csv - 131 verified government contacts
- hospital_contacts.csv - 266 NSW hospital records
- charity_contacts.csv - 5 Picton-area charity contacts
- safe_contacts.csv - Safe contacts by organization type
- all_safe_contacts.csv - 402 verified legitimate contacts (by risk level)
- high_priority_contacts.csv - 397 priority contacts
- threat_contacts.csv - Threat indicators (duplicate for convenience)

- threat_contacts.csv - 13 known scam indicators

- critic_report.json - Detailed AI quality assessment
- sorter_report.json - Risk categorization analysis
- pipeline_report.json - Complete execution summary
- validation_report.json - Cross-reference validation results

- dashboard.html - 🤖 Interactive dashboard with Claude AI analysis + 6 Chart.js visualizations
- live_dashboard.html - 🔴 LIVE status dashboard with real-time metrics and pulsing animation
- index.html - Project landing page
Dashboard Features:
- Real-time agent communication visualization
- Interactive contact verification lookup
- Risk assessment scoring demonstration
- Data quality metrics with live updates
- Modern dark theme with accessibility compliance (WCAG AA)
- Responsive design with gradient animations
- Agent-generated content using A2A protocol
- data/raw/ - Original scraped data from all 5 agents:
  - government_services.csv (109 federal services)
  - nsw_hospitals.csv (266 hospital records)
  - scamwatch_threats.csv (13 threat indicators)
  - acnc_charities_picton.csv (12 charity records)
  - nsw_correct_directory.csv (9 NSW agency records)
- data/standardized_contacts.csv - All 415 records in common format
- data/sorted_contacts_master.csv - Complete sorted dataset
govhack2025/
├── README.md                      # This file
├── backend/                       # All Python agents & scripts
│   ├── agents/                    # Multi-agent system files
│   │   ├── government_services_scraper.py
│   │   ├── nsw_hospitals_agent.py
│   │   ├── scamwatch_threat_agent.py
│   │   ├── critic_agent.py (LLM-powered)
│   │   ├── sorter_agent.py
│   │   └── visualization_agent.py (LLM-powered)
│   ├── utils/                     # Helper utilities
│   └── run_pipeline.py            # Main orchestrator
├── data/                          # Organized data files
│   ├── raw/                       # Original scraped data
│   ├── verified/                  # Safe, categorized contacts
│   ├── threats/                   # Scam indicators
│   └── reports/                   # Quality & analysis reports
└── frontend/                      # Dashboard & visualization
    ├── live_dashboard.html        # Agent-generated dashboard
    └── dashboard.html             # Interactive results view
Issue: Contact details (phone/email/website) not available from ACNC charity profiles
Reason: ACNC implements JavaScript-rendered content and bot protection mechanisms
Current Status: ✅ Organizational verification available (names, ABNs, addresses, purposes)
Impact: Reduces callback functionality but maintains anti-scam organizational verification
Technical Details: See BUG_REPORT_ACNC.md for complete analysis
Why This Is Good Software Engineering:
- ✅ Respects data protection: ACNC protects charities from automated harvesting
- ✅ Ethical approach: Demonstrates awareness of privacy vs. utility balance
- ✅ Graceful degradation: System works with available data, documents limitations
- ✅ Future-proofed: Alternative solutions identified for production deployment
Issue: Phone numbers can still be spoofed by attackers regardless of database completeness
Mitigation: Our system focuses on organizational verification and behavioral patterns rather than relying solely on caller ID
Current: Focused on NSW/Federal for demonstration purposes
Production: Would require scaling to all states/territories for complete coverage
- Web API endpoint for real-time contact verification
- Mobile app integration for on-the-go scam checking ✅ (Already built! See Part 2 below)
- Machine learning models for predictive scam detection
- Geographic expansion to all Australian states/territories
- International partnerships for cross-border scam prevention
- Natural language processing for scam content analysis
- Blockchain verification for tamper-proof contact records
- Real-time threat intelligence feeds
- Community reporting integration
- Government alert system integration
A comprehensive iOS app that protects users from scam calls and SMS messages using verified government contact data
The Digital Guardian iOS app transforms the backend data into a proactive mobile shield, providing real-time protection through deep iOS integration and on-device AI capabilities.
- 410+ verified contacts from official government directories embedded in app
- Real-time verification for phone numbers, emails, and websites
- Color-coded risk assessment (Red=Scam, Green=Safe, Yellow=Unknown)
- Instant offline responses using embedded CSV database
- Personalized safe word system with unique security questions for each family member
- CallKit integration for real-time call monitoring and identification
- Gentle nudge notifications with 2-second delay after call connects
- Visual indicators (🛡️ for safe, 🚨 for scam contacts)
- Universal app integration - works with Messages, Reminders, Notes, any text app
- Long-press sharing workflow for suspicious SMS analysis without app switching
- Comprehensive threat detection against scam database and verified contacts
- Easy testing capabilities across multiple iOS applications
- OpenELM-270M Core ML Integration: Real on-device AI running on Apple Neural Engine
- Natural language queries like "What is the ATO phone number?", "Call Medicare", "Travel advice"
- RAG System: AI-powered search combining LLM intelligence with 383 verified government contacts
- Digital Guardian AI Analysis: LLM generates intelligent responses with verified data
- Privacy-First: All AI processing happens on-device, no data sent to servers
- Current Limitations: Works best with simple queries; comprehensive improvement roadmap available
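The retrieval half of the RAG approach described above can be illustrated with a simple keyword-overlap search that selects verified contacts before any text is generated. This is a minimal sketch under stated assumptions: the app's real search and prompt format are not shown in this README, so `tokenize`, `retrieve`, `build_prompt`, and the `aliases` field are hypothetical. The two contacts use the numbers quoted in the sample queries later in this document.

```python
import re

CONTACTS = [
    {"name": "Australian Taxation Office", "aliases": ["ATO"], "phone": "13 28 61"},
    {"name": "Medicare General Enquiries", "aliases": [], "phone": "132 011"},
]

def tokenize(text: str) -> set:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, contacts: list, top_k: int = 1) -> list:
    """Rank contacts by how many query words appear in their name or aliases."""
    query_tokens = tokenize(query)
    scored = []
    for contact in contacts:
        searchable = " ".join([contact["name"], *contact["aliases"]])
        overlap = len(query_tokens & tokenize(searchable))
        if overlap:
            scored.append((overlap, contact))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [contact for _, contact in scored[:top_k]]

def build_prompt(query: str, hits: list) -> str:
    """Ground the model by listing only verified facts ahead of the question."""
    facts = "\n".join(f"- {c['name']}: {c['phone']}" for c in hits)
    return f"Answer using only these verified contacts:\n{facts}\n\nQuestion: {query}"

hits = retrieve("What is the ATO phone number?", CONTACTS)
print(build_prompt("What is the ATO phone number?", hits))
```

Because the phone number comes from the retrieved record rather than from the model, a small on-device LLM only has to phrase the answer, not remember the data.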
- TabView Navigation: Dual-tab interface (Protection + Family Circle)
- Background Monitoring: Continuous call monitoring with CallKit
- Notification Actions: Interactive notifications with callback options
- Privacy-First Design: All processing happens on-device
- Debug/Release Modes: Smart data loading for development vs production
- MessageFilter Extension: SMS filtering against verified contacts
- Call Directory Extension: Enhanced caller ID with government contact labels
- Smart Caller ID: Display verified organization names for incoming calls
- Scam Pattern Detection: Intelligent analysis of suspicious message content
- SwiftUI: Modern iOS UI framework for accessible design
- CallKit: Real-time call monitoring and identification
- UserNotifications: Background notification system
- Share Extension: SMS analysis without app switching
- App Groups: Secure data sharing between main app and extension
- Core ML: On-device AI inference (OpenELM-270M model)
- Backend CSV Integration: Uses existing GovHack pipeline data
- Government Contacts: Hospital, agency, service numbers
- Threat Database: Known scam numbers and patterns
- Verification System: Safe contact whitelist
- Share Extensions: Receive text from Messages app
- CallKit Extensions: Call identification and blocking
- MessageFilter: SMS filtering (iOS 14+)
- Accessibility: VoiceOver, Dynamic Type, high contrast
- ContentView.swift: Main app interface with verification tools
- FamilyCircle.swift: Family member management and safe word system
- CallMonitor.swift: CallKit integration for call state monitoring
- NotificationHandler.swift: Processes calls through verification pipeline
- SMSAnalyzer.swift: SMS threat detection engine
- ShareViewController.swift: Share Extension UI for SMS analysis
- LLMService.swift: On-device AI processing with Core ML
- Incoming Call → Family Circle Check → Verified Contacts → Scam Detection → Notification
- SMS Analysis → Text Selection → Share Extension → Threat Database → Risk Assessment
- AI Query → Natural Language → On-Device LLM → Government Contact Search → Verified Response
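The incoming-call flow in the first bullet is essentially an ordered chain of checks. The sketch below shows that ordering in Python for illustration; the app itself is written in Swift, and `assess_call`, the dictionaries, and the message wording are hypothetical. The family entry reuses the sample Family Circle data from this README, and the threat number is a placeholder.

```python
# Hypothetical lookup tables keyed by normalized phone number
FAMILY = {
    "+61412345678": {
        "name": "Vin",
        "security_question": "What was the name of our first pet?",
    },
}
VERIFIED = {"132861": "Australian Taxation Office"}
THREATS = {"0400000000"}  # placeholder, not a real scam number

def assess_call(number: str) -> tuple:
    """Apply the checks in the documented order and return (level, message)."""
    if number in FAMILY:                      # 1. Family Circle check
        member = FAMILY[number]
        return ("family", f"{member['name']} is calling. Ask: {member['security_question']}")
    if number in VERIFIED:                    # 2. Verified contacts
        return ("safe", f"Verified caller: {VERIFIED[number]}")
    if number in THREATS:                     # 3. Scam detection
        return ("scam", "Warning: this number matches a known scam indicator")
    return ("unknown", "This number is not in the verified database")

for caller in ["+61412345678", "132861", "0400000000", "0299990000"]:
    print(assess_call(caller))
```

The returned level maps naturally onto the color-coded risk display (green for safe, red for scam, yellow for unknown), with the message forming the body of the notification.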
- Xcode 15+ (requires recent beta versions for Core ML features)
- iOS 17+ simulator or device for testing
- Apple Developer account (required for device testing and CallKit)
- macOS with Apple Silicon (recommended for Core ML model conversion)
Note: The complete iOS implementation requires significant iOS development expertise and beta toolchain access. The core verification functionality works with standard Xcode, but the on-device AI features require advanced setup.
For Full Feature Access:
- Xcode Beta Access: Core ML model integration requires latest Xcode beta
- iOS Development Experience: CallKit, Share Extensions, and Core ML integration
- Model Conversion Knowledge: Converting LLM models to Core ML format
# 1. Ensure backend data is available
cd govhack2025
python backend/run_pipeline.py # Generate sorted_contacts_master.csv
# 2. Open iOS project
open ios-app-simple/DigitalGuardianSimple.xcodeproj
# 3. Build and run in Xcode
# - Target: iOS 17+ simulator or device
# - Grant notification permissions when prompted
# - Allow CallKit integration for call monitoring

- Open Xcode 26 Beta 7 (or latest available)
- Create New Project:
  - Product Name: DigitalGuardianSimple
  - Bundle ID: com.govhack.digitalguardian.simple
  - Language: Swift
  - Interface: SwiftUI
  - iOS Target: 17.0+
- Configure Claude Integration:
  - Open Xcode Intelligence settings
  - Add Claude Sonnet 4 account
  - Test AI code completion
- Import Source Files:
  - Copy Swift files from ios-app-simple/DigitalGuardianSimple/
  - Build and run in simulator
Try these prompts in Xcode with Claude:
- "Create an accessible button for senior users"
- "Help me parse phone numbers from text"
- "Design a SwiftUI Share Sheet extension"
- "Implement CallKit caller ID display"
# Download pre-converted OpenELM model
huggingface-cli download \
--local-dir ios-app-simple/DigitalGuardianSimple/DigitalGuardianLLM.mlpackage \
corenet-community/coreml-OpenELM-270M-Instruct \
--include "*.mlpackage/"
# Or use conversion scripts (requires compatible PyTorch version)
cd ios-app-simple/
python convert_openelm.py # Convert OpenELM to Core ML
python csv_to_llm_json.py # Convert CSV data to LLM-friendly JSON

# Using APNS (Apple Push Notification Service) files
cd ios-app-simple/apns/
# Drag .apns files to iOS Simulator for testing:
# - vin.apns → Family member recognition with safe word prompt
# - robyn.apns → Security question verification flow
# - adam.apns → Trusted contact identification
# - jordan.apns → Family verification system

Real Messages App Integration:
- Open Messages app in iOS simulator
- Create or receive SMS messages (works with real conversations)
- Long-press on message bubble β Share β Digital Guardian
- Extension analyzes message against 410 verified contacts + 13 threat indicators
- View comprehensive threat analysis with color-coded risk assessment
Alternative Testing Methods:
- Reminders app: Create reminder with test message, select text and share
- Notes app: Paste test message, select and share to Digital Guardian
- Any app with text selection: Use universal iOS share sheet integration
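The analysis the Share Extension performs on shared text can be approximated as: extract phone numbers and link domains, then compare them with the threat and verified lists. The following Python sketch is deliberately naive and is not the `SMSAnalyzer.swift` logic; the regular expressions, function names, and sample values (including the `.example` domain and the placeholder number) are assumptions for illustration.

```python
import re

def extract_indicators(text: str) -> tuple:
    """Pull candidate phone numbers (digits only) and link domains from a message."""
    phones = [re.sub(r"\D", "", m) for m in re.findall(r"\+?\d[\d ]{4,}\d", text)]
    domains = [d.lower() for d in re.findall(r"https?://([^\s/]+)", text)]
    return phones, domains

def analyze_sms(text, verified_phones, threat_phones, threat_domains) -> str:
    """Classify a message as 'scam', 'safe', or 'unknown'."""
    phones, domains = extract_indicators(text)
    if any(p in threat_phones for p in phones) or any(d in threat_domains for d in domains):
        return "scam"
    # Only call a message safe when every number is verified and it carries no links
    if phones and all(p in verified_phones for p in phones) and not domains:
        return "safe"
    return "unknown"

VERIFIED_PHONES = {"132861"}
THREAT_PHONES = {"0400000000"}             # placeholder
THREAT_DOMAINS = {"ato-refund.example"}    # placeholder

scam_text = "ATO refund waiting. Call 0400 000 000 or visit http://ato-refund.example/claim"
ok_text = "Your ATO appointment is confirmed. Questions? Call 13 28 61."
print(analyze_sms(scam_text, VERIFIED_PHONES, THREAT_PHONES, THREAT_DOMAINS))
print(analyze_sms(ok_text, VERIFIED_PHONES, THREAT_PHONES, THREAT_DOMAINS))
```

A verified number inside a message does not prove the sender is genuine, so a real implementation would treat a "safe" result here as a hint to call the verified number back rather than as a guarantee.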
// Sample queries that work with on-device AI:
"ATO number" → "Australian Taxation Office: 13 28 61"
"Medicare phone" → "Medicare General Enquiries: 132 011"
"hospital contacts" → "266 NSW hospitals available"
"travel advice" → "Smartraveller - Department of Foreign Affairs"

{
"family_members": [
{
"name": "Vin",
"phone": "+61412345678",
"security_question": "What was the name of our first pet?"
},
{
"name": "Robyn",
"phone": "+61423456789",
"security_question": "What street did we live on when we first met?"
},
{
"name": "Adam",
"phone": "+61434567890",
"security_question": "What was your favorite childhood movie?"
},
{
"name": "Jordan",
"phone": "+61445678901",
"security_question": "What was the name of your first school?"
}
]
}

See test_sms_messages.txt for comprehensive scam and legitimate message examples covering:
- Government impersonation attempts
- Bank fraud messages
- Charity scam requests
- Healthcare appointment confirmations
- Legitimate government communications
Latest screenshots available in screenprints/ directory:
- Family Circle Identification: Visual call notifications with safe word prompts
- Risk Assessment Display: Green/red visual indicators for threat levels
- SMS Share Extension: Live demonstration of text analysis workflow
- 🧠 Ask Digital Guardian AI Interface: Real OpenELM-270M chat showing natural language queries
- Complete User Flow: From query input to AI-powered government contact search with verified results
- On-Device AI Responses: Screenshots showing "Digital Guardian AI Analysis" with Apple Neural Engine attribution
This iOS app integrates seamlessly with the GovHack 2025 multi-agent pipeline:
- data/verified/government_contacts.csv - Safe government numbers
- data/verified/hospital_contacts.csv - Verified hospital contacts
- data/threats/threat_contacts.csv - Known scam numbers
- data/sorted_contacts_master.csv - Complete verified dataset (410 contacts)
- Real-time Updates: Sync with latest CSV exports from backend agents
- Quality Scores: Use confidence ratings from critic agent
- Risk Assessment: Leverage sorter agent categorizations
- Community Support: Connect to navigator platform
- CSV Data Integration: sorted_contacts_master.csv embedded in app bundle for offline verification
- Senior-Friendly: Large text, simple navigation, clear alerts
- Accessibility First: VoiceOver, Dynamic Type, high contrast support
- Crisis-Ready: Clear, unambiguous warnings and help options
- Privacy-Focused: On-device processing, minimal data collection
- Primary: Older Australians (75+) with low digital ability
- Secondary: CALD communities, people with disabilities
- Support: Family members and community navigators
- Background processing and battery efficiency
- Accessibility Enhancement: VoiceOver and Dynamic Type support
// Core privacy principles implemented:
class DataManager {
    // ✅ No API keys stored in code
    // ✅ No personal data transmitted
    // ✅ All processing happens on-device
    // ✅ App Group sandboxing for data isolation
    func verifyContact(_ contact: String) -> VerificationResult {
        // Local CSV database lookup only - zero network calls
        return localDatabase.verify(contact)
    }
}

Security Implementations:
- 🔒 On-Device Processing: SMS analysis happens locally
- 🚫 No Data Collection: App doesn't store or transmit personal messages
- ✅ Verified Sources: Only uses official government and charity data
- 🛡️ Apple Privacy: Follows iOS security best practices
- App Group Sandboxing: Secure data sharing between main app and extensions
- Local Analysis Only: No user data ever transmitted externally
- Embedded Database: 410 verified contacts stored locally for offline access
- Privacy-First AI: Core ML inference happens entirely on-device
- 🤖 AI Voice Detection: Identify synthetic voice calls
- 🔄 Real-time Updates: Live threat intelligence feeds
- 📱 Cross-Platform: Android version using React Native
- 🏢 Enterprise: Version for aged care facilities and community centers
- MessageFilter Extension: Enhanced SMS filtering
- Call Directory Extension: Improved caller ID with government labels
- Smart Caller ID: Display verified organization names
- Scam Pattern Detection: ML-powered suspicious content analysis
- Use Claude assistance in Xcode for complex iOS features
- Test regularly on iOS simulator and devices
- Commit to Git with descriptive messages
- Focus on accessibility and senior-friendly design
This is a GovHack 2025 project. Development is focused on creating a working prototype that demonstrates the anti-scam protection concept using Australian government data.
The iOS app is seamlessly integrated with the broader anti-scam data pipeline:
- Backend Pipeline generates sorted_contacts_master.csv with 410 verified contacts
- Data Transformation converts government data to mobile-friendly formats
- Real-time Protection uses verified database for instant scam detection
- Community Safety helps citizens identify legitimate vs fraudulent contacts
- Vulnerable Population Focus designed for seniors and accessibility-first usage
Built using verified data from:
- 109 Federal Government services
- 266 NSW hospitals and healthcare providers
- 22 NSW government agency contacts
- 13 threat intelligence indicators from Scamwatch
This is a sophisticated iOS application requiring:
- Advanced Xcode skills for CallKit and Share Extension development
- Core ML expertise for on-device AI model integration
- iOS framework knowledge for background processing and notifications
- Beta toolchain access for latest Core ML features
- Simple queries work best: "ATO number", "Call Medicare"
- Complex sentences may not parse correctly with current vocabulary
- 270M parameter model provides basic but functional language understanding
- Greedy decoding only - deterministic but not conversational
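Greedy decoding, mentioned in the last bullet, means always taking the single highest-scoring next token, which is why the same query always yields the same answer. The toy sketch below illustrates the idea with a hand-written score table; it is not the OpenELM model or the app's Core ML code, and the vocabulary, `TABLE`, and `greedy_decode` are invented for the example.

```python
VOCAB = ["<eos>", "ATO", "13", "28", "61"]

# Toy "model": scores for the next token given only the previous token id
TABLE = {
    1: [0.0, 0.1, 2.0, 0.3, 0.2],  # after "ATO" prefer "13"
    2: [0.0, 0.1, 0.2, 2.0, 0.3],  # after "13" prefer "28"
    3: [0.0, 0.1, 0.2, 0.3, 2.0],  # after "28" prefer "61"
    4: [2.0, 0.1, 0.2, 0.3, 0.4],  # after "61" prefer end of sequence
}

def next_logits(tokens: list) -> list:
    return TABLE[tokens[-1]]

def greedy_decode(prompt: list, eos: int = 0, max_new_tokens: int = 8) -> list:
    """Repeatedly append the argmax token until EOS or the length limit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_logits(tokens)
        best = max(range(len(logits)), key=logits.__getitem__)  # argmax, no sampling
        tokens.append(best)
        if best == eos:
            break
    return tokens

output = greedy_decode([1])
print(" ".join(VOCAB[t] for t in output))
```

Because no random sampling is involved, running the function twice on the same prompt returns identical tokens, at the cost of less varied, less conversational output.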
For vulnerable populations dealing with potential scams:
- Simple, predictable responses reduce confusion
- Deterministic behavior builds trust through consistency
- Fast, offline processing works in crisis situations
- Clear safe/unsafe indicators provide unambiguous guidance
GOVERNMENT DATA → AI PROCESSING → MOBILE PROTECTION → USER SAFETY
↓ ↓ ↓ ↓
410 contacts → Grade A quality → Real-time guard → Scam prevention
5 agents → LLM validation → Privacy-first → Vulnerable protected
- Solo developer achievement: Complete system built in one weekend
- Real AI integration: Both cloud LLM (Claude) and edge AI (Core ML)
- Production-ready quality: 95.4% data quality, comprehensive testing
- Privacy-first approach: All user processing happens on-device
- Accessibility focus: Designed for seniors and vulnerable populations
- Government data utilization: 100% authentic official sources
This project proves that privacy-first AI is not just possible but practical, and often better. As Apple Silicon becomes more powerful and models become more efficient, on-device AI will transition from exception to expectation.
The techniques pioneered here can evolve beyond smartphones:
- Voice-first interfaces for landline integration
- Caregiver-enhanced systems for family support networks
- Progressive assistance that adapts to cognitive decline
- Universal accessibility regardless of technical ability
- GitHub Repository: https://github.com/vin67/govhack2025
- Issues & Features: Report bugs or suggest features
- Documentation:
  - Backend: See approach_backend.md for detailed development notes
  - iOS: See approach_mobile.md for mobile development workflow
  - AI Journey: See APPLE_AI_AT_THE_EDGE.md for on-device AI story
This project is developed for GovHack 2025 and is intended for educational and public benefit purposes.
GovHack 2025 Team Project
Building safer digital experiences for all Australians 🇦🇺🛡️
The Weekend's Achievement: Proving that one developer with determination, community resources, and Apple Silicon can build privacy-first AI that actually helps people, without sending a single byte of user data to the cloud.