Data Integration Architecture
ETL design, API integration priorities, data flow model, and security constraints for Pride’s intelligence dashboard and automation platform.
Strategic Objective
Connect 8 years of historical venue data with real-time operational metrics to support daily decision-making. Three critical deliverables:
- Event P&L reports — revenue and costs per event, available within 24 hours of event close
- Weekly P&L summary — rollup of daily/event P&L, delivered every Monday
- Cash forecasts — visibility of cash position trends, updated daily
API Integration Priorities (Ordered by Impact)
Priority 1: Core Revenue and Attendance
Square POS (OAuth):
- Data: Sales by item, category, register, channel; tab and ticket data; multi-location sales capability
- Refresh rate: Daily, nightly batch pull via Node.js SDK
- Rate limit: 50 req/sec (generous)
- Status: Production since ~2018; 8 years of data available
- Integration approach: Query transactions by date range; group by event/category for P&L; backfill historical data
TryBooking (API key):
- Data: Ticket sales, customer emails, event attendance records, promo code usage, ticket tier performance, package upsells
- Refresh rate: Daily or event-triggered
- Status: Production since 2018; underutilised for reporting
- Integration approach: Webhook on ticket sale; daily event sync; attendance list capture (currently printed and discarded)
- CSV fallback: If API unreliable, manual export process available
Priority 2: Financial and Operational Context
Xero (OAuth):
- Data: Chart of accounts, P&L, Balance Sheet, payroll, invoices, bank transactions
- Refresh rate: Daily snapshot
- Status: Production but data quality concerns; bookkeeper departed Nov 2025
- Integration approach: Query P&L endpoints; link account codes to event categories; aggregate into weekly summary
- Prerequisite: Data cleanup required before dashboard deployment. Bookkeeper support essential.
Deputy (OAuth or API key):
- Data: Staff rosters, clock-in/out times, shift data
- Refresh rate: Daily or shift-triggered
- Status: Production; manual processes downstream
- Integration approach: Daily shift export; link to Xero payroll codes; calculate labour hours per event
Priority 3: Customer and Marketing Context
PinTuna (export TBD):
- Data: Member records, loyalty participation, wallet data, member-to-transaction linkage
- Status: Production; underutilised
- Integration approach: Weekly export; customer segmentation (members vs non-members vs lapsed); loyalty spend analysis
Meta Business Suite (API):
- Data: Post performance, engagement, ad spend, audience demographics
- Refresh rate: Daily
- Integration approach: Pull post/campaign metrics; correlate with ticket sales events and date ranges
Google Business Profile:
- Data: Reviews, search impressions, customer actions
- Refresh rate: Weekly
- Integration approach: Sentiment analysis on recent reviews; reputation tracking
Data Flow Architecture
Daily Batch Pulls (11pm)
├── Square POS (transactions)
├── TryBooking (ticket sales, attendance, events)
├── Xero (P&L snapshot, invoices)
├── Deputy (shifts, clock times)
└── Meta (post performance)
Event-Triggered Pulls
├── TryBooking (new ticket sale, event close)
├── PinTuna (daily member export)
└── Google Business (review alert)
Transformation Layer
├── Link Square transactions to TryBooking events
├── Map Xero accounts to event categories
├── Calculate event P&L (revenue - costs)
├── Aggregate weekly and daily rollups
└── Update forecast based on booking trends
Dashboard Update
└── Serve live data to Humphrey Intelligence App
Phase-by-Phase Implementation
| Phase | Focus | Data Sources | Timeline |
|---|---|---|---|
| Phase 1 | Revenue and attendance baseline | Square, TryBooking | Weeks 1–2 |
| Phase 2 | Financial performance | Xero (post-cleanup) | Weeks 3–4 |
| Phase 3 | Labour and customer insights | Deputy, PinTuna, Instagram | Weeks 5–6 |
| Phase 4 | Decision support | Integrated analysis across all | Weeks 7+ |
Security Model
No API keys in client code: All external API calls server-side (Server Actions or API routes).
OAuth token storage: Supabase vault (encrypted at rest).
Row Level Security: Every table has RLS policies tied to user role:
- Admin: Sees all data
- Operations: Sees operations-relevant data (roster, incidents, performance)
- Events: Sees event-relevant data (attendance, revenue, feedback)
- HR-sensitive: Warnings, medical certificates visible only to Admin
Domain-restricted access: Only @prideofourfootscray.bar and @exudegroup.com email addresses can authenticate.
Gmail service account: Domain-wide delegation for email triage; Mat’s personal credentials never exposed to app.
Environment variables: All secrets stored in env vars, never committed to version control.
Data Quality Issues to Address
TryBooking Attendance Data Loss
Problem: Attendance lists printed and discarded; actual attendance vs tickets sold not tracked digitally.
Solution: Implement automated weekly export of TryBooking attendance data to database; retain for 8-year historical analysis.
Impact: Unlocks customer attendance patterns, event performance analysis, customer acquisition metrics.
Xero Data Integrity
Problem: Bookkeeper departed Nov 2025; data likely contains errors; stakeholders don’t trust Xero P&L.
Solution: Audit chart of accounts; verify account mappings to event categories; clean transaction history. Ongoing governance now handled by Shae (Director, pro bono bookkeeper, ~1hr/week automated).
Timeline: Required before Phase 2 (financial performance reporting).
Square Stock Tracking
Problem: Staff type amounts inconsistently recorded; no staff-level sales attribution; stock tracking inaccurate.
Solution: Implement inventory workflow (weekly counting, automated depletion from POS); requires staff process change and training.
Timeline: Not critical for Phase 1–2 but valuable for unit economics analysis.
Integration Approach Options
Option A: Manual Daily Export (Low complexity, slow)
- User exports CSV from each system
- Uploads to shared folder
- Dashboard pulls from CSVs
- Pros: No API authentication; minimal code
- Cons: Manual, error-prone, slow, not real-time
Option B: API-Driven Batch Pulls (Medium complexity, reliable)
- Nightly scheduled job pulls data from each API
- Transforms and loads into Supabase
- Dashboard queries Supabase
- Pros: Reliable, scheduled, testable; no real-time delay acceptable; most flexible
- Cons: Requires API knowledge, authentication management
Option C: Webhooks + Real-Time Subscriptions (High complexity, live)
- Each system sends webhook on transaction/event update
- Real-time update to Supabase
- Dashboard reflects changes instantly
- Pros: Real-time; efficient (no unnecessary pulls)
- Cons: Webhook management complex; requires reliable infrastructure
Recommendation: Start with Option B (API batch pulls), migrate to Option C (webhooks) after proof of concept.
Cost Estimate
- Railway cron jobs: Free (5 min/month execution limit; adequate for batch pulls)
- Supabase storage: Free tier (500MB database; initial load ~50MB; tracks growth as events accumulate)
- Anthropic API (for email triage + event discovery): $5–15/month
- Third-party APIs: Free tier adequate for all (Square, TryBooking, Xero, Deputy, Meta, Google)
Integration Landscape Update (April 2026)
Key Findings
- Amaka IS the “native” Square→Xero integration — Xero retired its original in November 2021. No separate direct integration exists.
- Deputy→Xero has 13+ documented export error types. Deputy Payroll add-on (June 2025) eliminates the integration but is immature.
- TryBooking has no direct BI integration — event/capacity data is locked unless extracted via API. This confirms the Humphrey Intelligence App as essential.
- No off-the-shelf analytics platform covers all four systems (Square, Xero, TryBooking, Deputy). Entertainment venues are an underserved niche.
Deputy→Xero Integration Detail (April 2026)
The Deputy→Xero integration is now strategically locked in (Xero $25M investment, exclusive embedded partner). Key implications for the data architecture:
Integration is semi-automated, not real-time. A manager must manually trigger each timesheet export. Timesheets do not flow to Xero on approval. This means the Humphrey Intelligence App cannot rely on Xero payroll data being current in real-time — labour cost calculations must account for export lag.
Deputy API supports webhooks for timesheets, employees, leave, and schedules at no extra cost. No dedicated “timesheet approved” event, but timesheet update with TimeApproved=true serves the same purpose. This enables the dashboard to pull labour hours directly from Deputy rather than waiting for Xero export — recommended approach for real-time labour cost visibility.
Xero API rate limit (5,000 calls/day, 60/min) is the binding constraint for any integration consuming Xero data. Deputy exports and Humphrey dashboard pulls share this budget. Batch pulls should be scheduled to avoid conflicts with payroll export windows.
Middleware verdict: Zapier and Make cannot replicate the native batch payroll export. They process events individually with no Australian award awareness. Custom middleware (AWS Lambda, ~$0–5/month) is viable only if zero-touch automation is required.
Integration Health Checklist
- Amaka: verify partnership free tier covers transaction volume
- Xero: must be on Grow plan ($75/month) — Starter breaks integration
- Deputy→Xero: employee profiles must match exactly (names, emails, pay items); enable Timesheet Preflight
- GST tax groups: must be manually configured for alcohol items
- Xero API budget: coordinate Deputy export timing with Humphrey dashboard batch pulls to stay under 5,000 calls/day
See Tech Stack Optimisation for full integration recommendations.
Related Pages
- Humphrey Intelligence App — dashboard consuming this data
- Tech Stack — architecture decisions enabling integration
- Automation Opportunities — automation tasks using integrated data
- Tech Stack Optimisation — integration recommendations
- Deputy Xero Integration Research — source: Deputy→Xero integration detail (April 2026)
- Melbourne Venue Tech Stack Research — source: integration analysis