GET-109 · Technical Spec

Email & Calendar Integration for Relationship Strength

Drafted 2026-04-22 · Status: review · Audience: backend + admin-portal engineers

1. Overview

Ingest email and calendar metadata from each admin user's Google Workspace or Microsoft 365 account and turn it into three transparent relationship indicators on contacts:

Connection strength (per team-member × contact): Warm / Known / Cold
Contact reachability (per team × contact): High / Medium / Low / None
Key connection: the best person on the team to ask for an intro

Anchor decisions

Keep Contact and ContactEmail structure unchanged. Collection scoping stays.
Use Findem enrichment as the dedup oracle (email → canonical identity → Getro Contact).
Two-layer scoring: admin rules assign the tier, weighted score sorts within tier.
Never create contacts from email metadata. Unattributed interactions stage and resolve lazily.
Ship a thin V1 (email + calendar, no Findem, no user enrichment) first; upgrade in place.

2. System architecture

Five pipeline stages — ingestion, resolution, rollup, enrichment, scoring — each isolated, each restartable independently. No stage writes outside its own tables.

flowchart LR
  subgraph Providers["External providers"]
    GM["Gmail<br/>(gmail.metadata)"]
    GC["Google Calendar<br/>(calendar.readonly)"]
    OM["Outlook Mail<br/>(Mail.ReadBasic)"]
    OC["Outlook Calendar<br/>(Calendars.ReadBasic)"]
    FI["Findem"]
  end
  subgraph Ingest["Ingestion (per user)"]
    GMS[GmailSyncer]
    GCS[GoogleCalSyncer]
    OMS[OutlookMailSyncer]
    OCS[OutlookCalSyncer]
  end
  IE[("InteractionEvent<br/>user · kind · direction<br/>occurred_at · thread_id<br/>contact_email · contact_id?")]
  subgraph Resolve["Resolution"]
    FER[FindemEnrichmentResolver]
    CEL[(EnrichedEmailLookup<br/>cache)]
  end
  subgraph Rollup["Rollup"]
    RW[NightlyRollupWorker]
    UCIS[("UserContactInteractionStats")]
  end
  subgraph Enrich["User enrichment"]
    UES[Users::Enrichment::FindemSyncer]
    UEP[("UserEnrichedProfile<br/>UserWorkExperience<br/>UserEducation")]
    WOC[(WorkOverlapCache<br/>EducationOverlapCache)]
  end
  subgraph Score["Scoring"]
    RSS[RelationshipStrengthService]
    RS[ReachabilityService]
    UCC[("UCC: strength_tier<br/>strength_score<br/>strength_override")]
    CR[("ContactReachability<br/>tier · counts<br/>key_user_id")]
  end
  GM --> GMS --> IE
  GC --> GCS --> IE
  OM --> OMS --> IE
  OC --> OCS --> IE
  IE --> FER
  FER -.->|cache| CEL
  FER -.->|lookup| FI
  FER -->|contact_id| IE
  IE --> RW --> UCIS
  FI --> UES --> UEP --> WOC
  UCIS --> RSS
  WOC --> RSS
  RSS --> UCC
  UCC --> RS --> CR
  classDef provider fill:#e0f2fe,stroke:#0369a1
  classDef data fill:#fef3c7,stroke:#b45309
  classDef service fill:#f3e8ff,stroke:#7c3aed
  class GM,GC,OM,OC,FI provider
  class IE,CEL,UCIS,UEP,WOC,UCC,CR data
  class GMS,GCS,OMS,OCS,FER,RW,UES,RSS,RS service

Why one normalized table. InteractionEvent is provider-agnostic by construction — Gmail, Outlook, calendars all land in the same shape. The scoring engine has no idea where an interaction came from. Adding Slack or Zoom later would be an enum value, not a new pipeline. See DR-01.

Stage responsibilities

Stage	Input	Output	Trigger
Ingestion	Provider APIs (metadata only)	`InteractionEvent` rows	Daily cron + on-demand backfill
Resolution	New/unresolved `InteractionEvent` + `Contact` creates	`contact_id` populated	`after_commit` on insert + `Contact` hooks
Rollup	`InteractionEvent`	`UserContactInteractionStats`	Nightly full + incremental on event insert
User enrichment	User email / linkedin handle	`UserEnrichedProfile`, work/edu records	On account connect + weekly refresh + Findem webhook
Scoring	Stats + overlaps	`UCC.strength_tier`, `ContactReachability`	On stats change + on override edit + nightly

3. Data model

Five new tables. No changes to Contact, ContactEmail, or Collection. Three existing tables get additive columns.

erDiagram
  USERS ||--o{ UCC : "has many"
  CONTACTS ||--o{ UCC : "has many"
  COLLECTIONS ||--o{ UCC : "has many"
  USERS ||--o{ INTERACTION_EVENT : "owns"
  CONTACTS ||--o{ INTERACTION_EVENT : "attributed"
  USERS ||--o{ USER_CONTACT_STATS : "rollup"
  CONTACTS ||--o{ USER_CONTACT_STATS : "rollup"
  USERS ||--o| USER_ENRICHED_PROFILE : "has one"
  USER_ENRICHED_PROFILE ||--o{ USER_WORK_EXPERIENCE : "has many"
  USER_ENRICHED_PROFILE ||--o{ USER_EDUCATION : "has many"
  CONTACTS ||--o{ CONTACT_REACHABILITY : "per collection"
  COLLECTIONS ||--o{ CONTACT_REACHABILITY : "per contact"
  UCC {
    int user_id FK
    int contact_id FK
    int collection_id FK
    string source
    string strength_tier
    float strength_score
    string strength_override
    datetime strength_computed_at
  }
  INTERACTION_EVENT {
    bigint id PK
    int user_id FK
    string provider
    string kind
    string direction
    datetime occurred_at
    string thread_id
    string remote_message_id UK
    string contact_email
    int contact_id FK
    boolean newsletter_flag
    string raw_headers
  }
  USER_CONTACT_STATS {
    int user_id FK
    int contact_id FK
    int email_inbound_count_12mo
    int email_outbound_count_12mo
    int email_twoway_count_total
    int email_inbound_nnl_count_24mo
    datetime email_last_at
    datetime email_last_outbound_at
    boolean email_has_response_in_3mo
    int email_activity_bitmap_12q
    int email_active_quarters_consecutive
    int meeting_count_180d
    int meeting_count_24mo
    datetime meeting_last_at
    datetime recomputed_at
  }
  USER_ENRICHED_PROFILE {
    int user_id FK
    string findem_id
    string raw
    datetime refreshed_at
  }
  USER_WORK_EXPERIENCE {
    int user_id FK
    int organization_id FK
    date date_from
    date date_to
    string title
    string location
  }
  USER_EDUCATION {
    int user_id FK
    int school_id FK
    date date_from
    date date_to
  }
  CONTACT_REACHABILITY {
    int contact_id FK
    int collection_id FK
    string tier
    int warm_count
    int known_count
    int cold_count
    int key_user_id FK
    datetime computed_at
  }

New columns on existing UCC table

Column	Type	Notes
`strength_tier`	enum (warm, known, cold, nullable)	Null until first computation
`strength_score`	float, nullable	Within-tier sort, hidden from users in V1
`strength_override`	enum, nullable	User-set override; rule result still stored for audit
`strength_computed_at`	timestamp	Last rule-engine run

Enum values (stored as strings / Rails enums)

INTERACTION_EVENT.provider: gmail, outlook, gcal, ocal
INTERACTION_EVENT.kind: email, meeting
INTERACTION_EVENT.direction: inbound, outbound, attended
UCC.strength_tier / strength_override: warm, known, cold
CONTACT_REACHABILITY.tier: high, medium, low, none

Fields excluded from the diagram for brevity

USER_CONTACT_STATS also has email_inbound_count_24mo, email_outbound_count_24mo, email_last_inbound_at.
USER_ENRICHED_PROFILE.raw is actually jsonb; INTERACTION_EVENT.raw_headers is jsonb.
INTERACTION_EVENT.contact_id is nullable (intentional — unresolved interactions).

Additive columns, not a new table. Strength belongs where the user⇄contact link already lives. strength_override is nullable so the absence of a user decision is explicit — we can always tell whether tier was computed or set manually. See DR-06.

Why these shapes

InteractionEvent is the single normalized store. Gmail, Outlook, calendars all land here. The contact_id being nullable is deliberate: unresolved interactions wait for resolution rather than forcing a premature contact creation.
UserContactInteractionStats is a materialized cache, not the source of truth. Nightly rebuild + incremental updates on event insert. email_activity_bitmap_12q is a 12-bit int encoding 2-way activity per quarter — cheap to maintain, powerful for sustained-relationship detection.
UCC gets four columns, not a new table. Strength lives where the team-member⇄contact link already lives. Override is a nullable enum so the absence of an override is explicit.
ContactReachability is denormalized per (contact, collection). A cheap row-level recompute beats a per-request aggregation over hundreds of UCCs.
UserEnrichedProfile intentionally mirrors ContactEnrichedProfile — same enrichment engine (Findem), same shape, same webhook wiring.
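The activity bitmap mentioned above can be sketched in plain Ruby. This is a minimal illustration, not the shipped implementation: the `ActivityBitmap` module, its method names, and the bit layout (bit 0 is the current quarter, older quarters shift left) are all assumptions, since the spec only says the column is a 12-bit integer.

```ruby
# Hypothetical helper; the bit layout (bit 0 = current quarter) is an assumption.
module ActivityBitmap
  QUARTERS = 12
  MASK = (1 << QUARTERS) - 1 # keep only the 12 most recent quarters

  # Mark the current quarter as having two-way activity.
  def self.mark_current(bitmap)
    (bitmap | 1) & MASK
  end

  # Called when a new quarter starts: shift history one slot older and
  # drop the quarter that falls off the 12-quarter window.
  def self.roll_quarter(bitmap)
    (bitmap << 1) & MASK
  end

  # Number of consecutive active quarters, counting back from the current one.
  def self.consecutive_active(bitmap)
    count = 0
    count += 1 while count < QUARTERS && bitmap[count] == 1
    count
  end
end
```

Under this layout, a value such as `email_active_quarters_consecutive` could be derived from the bitmap with one call to `consecutive_active`, which is what makes a "sustained for 8 quarters" check cheap.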

4. Integration status

Google Workspace · Mostly live

Component	Status	Notes
OAuth client, GoogleAccount, token encryption	Live	`lib/google/oauth_client.rb`
Scopes `gmail.metadata` + `calendar.readonly`	Declared	Not yet ingested; already in `GoogleAccount::GOOGLE_SCOPES`
People API contact sync, daily cron	Live	Shipped Dec 2025
Admin-portal Google card, OAuth flow, polling	Live	`useIntegrationsPage` hook
Per-scope UI toggles	Build	Extend `GoogleIntegrationCard`
Gmail + Calendar metadata clients	Build	New files under `lib/google/`
Syncers, schedulers, workers	Build	Mirror existing `ContactsSyncer` pattern
Newsletter / bulk inbound filter	Build	Header inspection + recipient count
CircuitBox wrap retrofit	Build	Gap in existing integration

Retrofit, not rewrite. Wrap existing client methods in Circuitbox.circuit(:name, ...).run { … } — no changes to business logic, just a guard layer. One named circuit per provider (:google_people, :google_gmail, :google_calendar, :ms_graph_mail, :ms_graph_calendar, :findem_profile) so one provider's outage doesn't cascade to others.

Microsoft 365 · Greenfield

Multi-tenant Azure AD registration. One app registration in Getro's Azure tenant, users sign in from their own Microsoft 365 tenant. Same pattern as multi-org Google OAuth — the MS app is registered with signInAudience: "AzureADMultipleOrgs".

Component	Status
Azure AD app registration (multi-tenant)	Build + ops
`MicrosoftAccount` model + migration	Build
`lib/microsoft/graph_client.rb` base	Build
Mail + calendar metadata clients	Build
OAuth callback controller + refresh flow	Build
Syncers, schedulers, workers for both	Build
Admin-portal Microsoft card + RTK Query service	Build
Env vars, feature flag, egress registry entry	Build + ops

Findem · Pending capabilities

Capability	Consumer	Status
F1 · Person lookup by email	Dedup resolver	Ask Findem
F2 · Enrichment by LinkedIn handle	Contact enrichment	Live
F3 · Enrichment when only email known	Fallback contact enrichment	Ask Findem
F4 · User enrichment	New `UserEnrichedProfile` pipeline	Ask Findem
F5 · Company details (size, stage)	Work-overlap gate	Partial
F6 · Investor / cap-table data	Investor-overlap signal (future)	Ask Findem
F7 · Webhooks on profile updates	Re-trigger rollups	Framework

5. Service layer

Rule engine + scorer

The scorer is stateless: given a User and a Contact, it reads the rollup + overlap caches and returns a deterministic result.

Rules are declared as data, not code. Adding a new signal is a new Rule row — no changes to the evaluator. Makes the V2 admin-tunable-rules work a straightforward extension instead of a refactor. See DR-04.

Result = Struct.new(:tier, :score, :reasons)
# tier:    :warm | :known | :cold
# score:   Float, for within-tier sort only
# reasons: [{ code: :twoway_12mo_5plus, met: true, detail: "7 in / 9 out" }, ...]

class RelationshipStrengthService
  def self.call(user:, contact:)
    stats = UserContactInteractionStats.find_by(user: user, contact: contact)
    overlap = WorkOverlapCache.by_pair(user, contact)
    edu = EducationOverlapCache.by_pair(user, contact)

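    evaluated = RULES.map { |rule| rule.evaluate(stats, overlap, edu) }
    # The strongest tier with at least one matching rule wins; default to :cold when nothing matches.
    tier = %i[warm known cold].find { |t| evaluated.any? { |r| r.tier == t && r.met } } || :cold
    # The weighted score only orders pairs within a tier; it never changes the tier.
    score = WEIGHTED_SCORE.call(evaluated)
    Result.new(tier, score, evaluated.select(&:met))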
  end
end

Rules are declared as data, not code:

RULES = [
  Rule.new(code: :twoway_12mo_5plus, tier: :warm, signal_weight: 3.0,
           predicate: ->(s, _, _) { s && s.email_inbound_count_12mo >= 5 && s.email_outbound_count_12mo >= 5 }),
  Rule.new(code: :sustained_2yr, tier: :warm, signal_weight: 2.5,
           predicate: ->(s, _, _) { s && s.email_active_quarters_consecutive >= 8 }),
  # ... 14 more ...
]

Adding a new signal = adding a Rule row. No changes to the evaluator.
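The spec names `Rule` and `WEIGHTED_SCORE` without defining them. One possible shape is sketched below so the evaluator can be exercised without Rails. Everything here is an assumption for illustration: the `Evaluation` struct, the `Stats` stand-in for a stats row, the `classify` helper, and in particular the scoring formula (sum of the weights of the matched rules). Only the two rules shown above are reproduced.

```ruby
# Hypothetical shapes; the spec does not define Rule, Evaluation or WEIGHTED_SCORE.
Evaluation = Struct.new(:code, :tier, :met, :weight, keyword_init: true)

Rule = Struct.new(:code, :tier, :signal_weight, :predicate, keyword_init: true) do
  # Run the predicate; a nil result (for example a missing stats row) counts as "not met".
  def evaluate(stats, overlap, edu)
    met = predicate.call(stats, overlap, edu) ? true : false
    Evaluation.new(code: code, tier: tier, met: met, weight: signal_weight)
  end
end

# Assumed scoring: sum of the weights of every rule that matched.
WEIGHTED_SCORE = ->(evaluated) { evaluated.select(&:met).sum(&:weight) }

# Stand-in for a UserContactInteractionStats row.
Stats = Struct.new(:email_inbound_count_12mo, :email_outbound_count_12mo,
                   :email_active_quarters_consecutive, keyword_init: true)

RULES = [
  Rule.new(code: :twoway_12mo_5plus, tier: :warm, signal_weight: 3.0,
           predicate: ->(s, _, _) { s && s.email_inbound_count_12mo >= 5 && s.email_outbound_count_12mo >= 5 }),
  Rule.new(code: :sustained_2yr, tier: :warm, signal_weight: 2.5,
           predicate: ->(s, _, _) { s && s.email_active_quarters_consecutive >= 8 })
].freeze

# Returns [tier, score, codes of matched rules] for one stats row.
def classify(stats)
  evaluated = RULES.map { |rule| rule.evaluate(stats, nil, nil) }
  tier = %i[warm known cold].find { |t| evaluated.any? { |e| e.tier == t && e.met } } || :cold
  [tier, WEIGHTED_SCORE.call(evaluated), evaluated.select(&:met).map(&:code)]
end
```

Because the evaluator only iterates `RULES`, appending another `Rule.new(...)` entry changes the outcome without touching `classify`, which is the property the section relies on.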

Reachability service

Pure aggregation — runs once per (contact, collection) whenever any UCC tier in that collection changes.

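class ReachabilityService
  # Checked top to bottom; the first predicate that matches sets the tier.
  THRESHOLDS = {
    high:   ->(c) { c[:warm] >= 1 || c[:known] >= 3 || c[:cold] >= 10 },
    medium: ->(c) { c[:known] >= 1 || c[:cold] >= 5 },
    low:    ->(c) { c[:cold] >= 1 }
  }.freeze

  # strength_tier is stored as a string, so rank it explicitly instead of sorting alphabetically.
  TIER_RANK = Arel.sql("CASE strength_tier WHEN 'warm' THEN 3 WHEN 'known' THEN 2 WHEN 'cold' THEN 1 ELSE 0 END DESC")

  def self.call(contact:, collection:)
    # Skip links whose tier has not been computed yet (strength_tier is nullable).
    uccs = UserContactCollection.where(contact: contact, collection: collection).where.not(strength_tier: nil)
    # Default every tier to 0 so the predicates never compare against nil.
    counts = { warm: 0, known: 0, cold: 0 }.merge(uccs.group(:strength_tier).count.symbolize_keys)
    tier = THRESHOLDS.find { |_, pred| pred.call(counts) }&.first || :none
    key = uccs.order(TIER_RANK, strength_score: :desc).first&.user_id
    # Assumes a unique index on (contact_id, collection_id).
    ContactReachability.upsert(
      { contact_id: contact.id, collection_id: collection.id, tier: tier, key_user_id: key,
        warm_count: counts[:warm], known_count: counts[:known], cold_count: counts[:cold],
        computed_at: Time.current },
      unique_by: %i[contact_id collection_id]
    )
  end
end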
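To make the threshold table concrete, here is a Rails-free sketch of the same aggregation. The predicates are copied from the service; `reachability_tier`, `key_user_id`, `TIER_RANK_SKETCH` and the hash-based link shape are hypothetical names used only for this illustration.

```ruby
# Threshold predicates copied from ReachabilityService; checked top to bottom.
REACHABILITY_THRESHOLDS = {
  high:   ->(c) { c[:warm] >= 1 || c[:known] >= 3 || c[:cold] >= 10 },
  medium: ->(c) { c[:known] >= 1 || c[:cold] >= 5 },
  low:    ->(c) { c[:cold] >= 1 }
}.freeze

TIER_RANK_SKETCH = { warm: 3, known: 2, cold: 1 }.freeze

# counts may omit tiers nobody holds, so default them to 0 first.
def reachability_tier(counts)
  c = { warm: 0, known: 0, cold: 0 }.merge(counts)
  REACHABILITY_THRESHOLDS.find { |_, pred| pred.call(c) }&.first || :none
end

# Key connection: strongest tier first, then highest within-tier score.
# links is an Array of { user_id:, tier:, score: } hashes.
def key_user_id(links)
  best = links.max_by { |l| [TIER_RANK_SKETCH.fetch(l[:tier], 0), l[:score] || 0.0] }
  best && best[:user_id]
end
```

For example, three Known links and no Warm link already yield `:high`, while four Cold links alone yield only `:low`.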

Rollup worker

Why nightly in addition to incremental? Windowed counters need to forget events as they age out (an email just crossed the 12-month boundary). Incremental can add but can't drop — the nightly sweep ages out stale entries and reconciles any drift. See DR-09.

Two entry points:

Incremental: on every InteractionEvent insert via after_commit, bump counters and update the activity bitmap. O(1) per event.
Nightly full: recompute all rows touched in the last 24h. Catches drift, re-bucketizes aging windows (e.g. emails that just fell outside the 12-month window).
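The difference between the two entry points can be shown with a small in-memory sketch. It assumes a 365-day stand-in for the 12-month window and tracks a single counter; `PairStatsSketch`, `Event` and `WINDOW_DAYS` are hypothetical names, and the real worker would operate on database rows rather than arrays.

```ruby
require "date"

WINDOW_DAYS = 365 # stand-in for the 12-month window

Event = Struct.new(:direction, :occurred_on, keyword_init: true)

class PairStatsSketch
  attr_reader :inbound_12mo

  def initialize
    @inbound_12mo = 0
  end

  # Incremental path: O(1) bump on insert. It can add, but it never subtracts.
  def bump(event, today)
    @inbound_12mo += 1 if counts?(event, today)
  end

  # Nightly path: recount from the events so anything that aged out drops off.
  def rebuild(events, today)
    @inbound_12mo = events.count { |e| counts?(e, today) }
  end

  private

  def counts?(event, today)
    event.direction == :inbound && event.occurred_on > today - WINDOW_DAYS
  end
end
```

If an event sits near the edge of the window, the counter stays too high after the incremental path alone; only `rebuild` with a later `today` brings it back down.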

Findem enrichment resolver

Lookup → cache → Contact match. Conservative: ambiguous Findem responses leave contact_id null.

Resolver runs per event; Findem called per unique email. The EnrichedEmailLookup cache absorbs ~99% of calls after day 1. Typical user: ~800 Findem calls on backfill, <10/day steady state. See DR-03 and the FAQ.

flowchart TD
  Start([New InteractionEvent<br/>contact_email = bob@acme.com]) --> Norm[Normalize email<br/>lowercase + trim]
  Norm --> Cache{EnrichedEmailLookup<br/>cache hit?}
  Cache -->|Hit| Apply
  Cache -->|Miss| Findem[Findem F1 lookup]
  Findem --> Amb{Confidence >= threshold?}
  Amb -->|No| LeaveNull[Leave contact_id NULL<br/>store neg cache 7d]
  Amb -->|Yes| Store[Store canonical:<br/>linkedin_handle,<br/>known_emails]
  Store --> Apply[Apply to Getro]
  Apply --> ByHandle{Contact by<br/>linkedin_handle?}
  ByHandle -->|Found| Set[Set contact_id<br/>on InteractionEvent]
  ByHandle -->|None| ByEmail{ExistingContactsByEmailsFinder<br/>on known_emails?}
  ByEmail -->|Found| Set
  ByEmail -->|None| LeaveNull
  LeaveNull --> Done([Done, may resolve later<br/>via Contact create hook])
  Set --> Done
  classDef warn fill:#fef3c7,stroke:#b45309
  classDef ok fill:#dcfce7,stroke:#15803d
  class LeaveNull warn
  class Set ok
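The resolver flow above can be sketched without Rails or a live Findem client by injecting the lookups as lambdas. Treat every name here as hypothetical: `EmailResolverSketch`, the response hash shape (`confidence`, `linkedin_handle`, `known_emails`), and the 0.8 confidence threshold are assumptions, because the Findem F1 contract is still an open question. The TTLs use the low end of the ranges given in the operational envelope.

```ruby
# All names are hypothetical; the Findem F1 response shape is an assumption.
class EmailResolverSketch
  POSITIVE_TTL = 90 * 86_400 # seconds; low end of the 90 to 180 day range
  NEGATIVE_TTL = 7 * 86_400  # seconds; low end of the 7 to 30 day range
  CONFIDENCE_THRESHOLD = 0.8 # assumed; the spec leaves the threshold open

  def initialize(findem:, by_handle:, by_emails:, clock: -> { Time.now })
    @findem = findem       # ->(email)  { { confidence:, linkedin_handle:, known_emails: } or nil }
    @by_handle = by_handle # ->(handle) { contact_id or nil }
    @by_emails = by_emails # ->(emails) { contact_id or nil }
    @clock = clock
    @cache = {}
  end

  # Returns a contact_id, or nil when the event should stay unresolved.
  def resolve(raw_email)
    email = raw_email.strip.downcase
    identity = cached(email) { lookup(email) }
    return nil unless identity

    @by_handle.call(identity[:linkedin_handle]) || @by_emails.call(identity[:known_emails])
  end

  private

  # Ambiguous or missing responses are treated as "no identity".
  def lookup(email)
    response = @findem.call(email)
    return nil unless response && response[:confidence] >= CONFIDENCE_THRESHOLD

    response.slice(:linkedin_handle, :known_emails)
  end

  # Cache positive and negative results alike; negatives expire sooner.
  def cached(email)
    entry = @cache[email]
    return entry[:value] if entry && entry[:expires_at] > @clock.call

    value = yield
    ttl = value ? POSITIVE_TTL : NEGATIVE_TTL
    @cache[email] = { value: value, expires_at: @clock.call + ttl }
    value
  end
end
```

Because the cache is keyed on the normalized address, repeated events from the same correspondent reach the injected `findem` lambda only once until the entry expires.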

6. Operational envelope

Quantitative choices, rate limits, and volume estimates — the numbers reviewers ask about when deciding whether this will scale or what it will cost.

Findem cache

Positive TTL: 90–180 days
Negative TTL: 7–30 days
Day-1 calls/user: ~800 (one per unique correspondent)
Steady-state: 5–10 calls/day/user
Invalidation: Findem F7 webhook on profile update

Backfill horizon

On OAuth connect: 24 months (see DR-08)
Older history: On-demand "load historical" job (future)
First-sync duration: ~minutes, not hours

Gmail / Google Calendar

Rate limit: 600 req/min/user; 1B units/day/project
Sync cadence: Daily, 6am UTC
Incremental: historyId (mail), syncToken (calendar)

Microsoft Graph

Rate limit: 10k req / 10min / user
Sync cadence: Daily, staggered from Google
Incremental: Delta query tokens

Interaction volume (typical user)

Day-1 events: ~20,000 (24-month backfill)
Steady-state: ~50 events/day
Row size: ~500 bytes
10 users × 2 years: ~1.5 GB raw events

CircuitBox (all external clients)

error_threshold: 50%
time_window: 60s
volume_threshold: 5 requests
sleep_window: 120s before half-open probe
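How these four settings interact can be illustrated with a simplified breaker in plain Ruby. This is a teaching sketch and not Circuitbox's implementation: `BreakerSketch`, its in-memory sample list, and the injected `clock` are assumptions; only the option names and default values come from the list above.

```ruby
# Simplified stand-in for illustration only; this is not how Circuitbox is implemented.
class BreakerSketch
  class OpenError < StandardError; end

  def initialize(error_threshold: 50, time_window: 60, volume_threshold: 5,
                 sleep_window: 120, clock: -> { Time.now.to_f })
    @error_threshold = error_threshold   # percent of failures that opens the circuit
    @time_window = time_window           # seconds of history considered
    @volume_threshold = volume_threshold # minimum calls before the rate is trusted
    @sleep_window = sleep_window         # seconds to stay open before a probe
    @clock = clock
    @samples = []                        # [timestamp, success?] pairs
    @opened_at = nil
  end

  def open?
    !@opened_at.nil?
  end

  def run
    now = @clock.call
    probing = false
    if @opened_at
      raise OpenError, "circuit open" if now - @opened_at < @sleep_window
      probing = true # sleep window elapsed: allow one half-open probe
    end

    begin
      result = yield
    rescue StandardError
      record(now, false)
      @opened_at = now if probing || tripped?
      raise
    end

    record(now, true)
    if probing # a successful probe closes the circuit and resets history
      @opened_at = nil
      @samples.clear
    end
    result
  end

  private

  def record(now, success)
    @samples << [now, success]
    @samples.reject! { |at, _| now - at > @time_window }
  end

  def tripped?
    return false if @samples.size < @volume_threshold

    failures = @samples.count { |_, success| !success }
    failures * 100.0 / @samples.size >= @error_threshold
  end
end
```

With the defaults, five failing calls inside a minute open the circuit, every call in the next 120 seconds fails fast with `OpenError`, and the first call after that acts as the half-open probe.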

Rollup worker

Incremental: after_commit per event, O(1)
Nightly rebuild: 00:00 UTC, 36-hour window
Batch size: 500 pairs/job
Uniqueness: sidekiq-unique-jobs :until_and_while_executing

OAuth tokens

Google access: ~1 hour
Google refresh: Long-lived, revoked on disconnect
MS access: ~1 hour
MS refresh: 90 days rolling for web-app (confidential client) flows; the 24h lifetime applies only to single-page apps
Storage: Encrypted at rest via lockbox

Scoring / reachability

Scorer complexity: O(1) per pair — reads one stats row
Reachability aggregate: SQL GROUP BY on UCC strength_tier
Typical cascade on strength change: ~1–10 reachability recomputes

Privacy guardrails

Scopes chosen: gmail.metadata, Mail.ReadBasic — no body access possible
No calendar titles stored: attendee list + times only
Per-user opt-in: Each scope toggleable independently
Disconnect wipe: All InteractionEvents deleted

7. Phased plan

The V1 breakpoint. The "Thin V1 ships" node splits the plan. Phases 1–6 depend only on things we control; Phase 7 onward is gated on Findem capability confirmations, so a capability answer that slips does not block the thin-V1 ship. See DR-05.

flowchart LR
  P1[Phase 1<br/>InteractionEvent<br/>+ simple resolver] --> P2[Phase 2<br/>Gmail + GCal syncers<br/>+ scope UI toggles]
  P1 --> P6[Phase 6<br/>MS Graph integration]
  P2 --> P3[Phase 3<br/>Rollup worker<br/>+ stats table]
  P6 --> P3
  P3 --> P4[Phase 4<br/>Rule engine<br/>+ Reachability]
  P4 --> P5[Phase 5<br/>UI surfaces<br/>+ filters]
  P5 --> V1[[Thin V1 ships<br/>8/16 clauses active<br/>both providers]]
  V1 --> P7[Phase 7<br/>Findem lookup<br/>+ cache]
  P7 --> P8[Phase 8<br/>User enrichment]
  P8 --> P9[Phase 9<br/>Overlap calculators<br/>+ activate clauses]
  P9 --> Full[[Full V1<br/>14/16 clauses active]]
  classDef v1 fill:#dcfce7,stroke:#15803d,stroke-width:2px
  classDef full fill:#ddf4ff,stroke:#0969da,stroke-width:2px
  class V1 v1
  class Full full

Phase	Scope	Depends on	Size	Slice
1	`InteractionEvent` + `ContactEmailResolver` + backfill hooks	—	S	A
2	Gmail + Google Calendar syncers + scope UI + CircuitBox retrofit	1	M	A
3	Rollup worker + `UserContactInteractionStats`	2	M	A
4	Rule engine + reachability + UCC strength columns + override	3	M	A
5	Contact-detail signals + sortable columns + reachability filters + hover reasons	4	M	A
6	MS Graph integration (OAuth, mail+cal syncers, admin-portal card)	1	L	A
7	Findem `lookup_by_email` + `EnrichedEmailLookup` cache; swap resolver	Findem F1	S	B
8	User enrichment via Findem (F4)	Findem F4	M	C
9	Work + education overlap calculators; activate clauses in rule engine	8	M	C
10	(Optional) Investor overlap via F6	Findem F6	S	C+
11	(Nice) Intro-request auto-draft	5	S	—

8. FAQ

The sharp questions reviewers already have.

Won't this hammer Findem with thousands of calls per mailbox?

No. The EnrichedEmailLookup cache keys on normalized email — Findem gets called once per unique correspondent, not once per event. Day-1 backfill is ~800 calls; steady state is <10/day. See DR-03.

What happens when a provider is down?

CircuitBox opens around the failing client, the sync worker exits cleanly, a Slack alert fires. New InteractionEvent rows pause for that provider; others are unaffected. Stats stay consistent with what we've seen. No data loss.

Can email data leak across collections?

InteractionEvent is scoped to the mailbox owner. Stats are per (user, contact). Reachability aggregates only the UCCs in one collection — a user in Collection A and B shares interaction data, but each collection only sees its own breadth counts.

What happens when a user disconnects their mailbox?

OAuth token revoked, provider-account row deleted, cleanup worker removes every InteractionEvent for that user. Dependent stats recompute (counts drop to 0). Strength tiers re-evaluate to Cold or null.

How do we prevent newsletters from being mistaken for real relationships?

At ingestion each email is flagged newsletter if any of: List-Unsubscribe header present, Precedence: bulk/list/junk, or >20 recipients. Newsletter-only inbound matches a Cold clause (C2), never Known.
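The three-part check above fits in one small predicate. The sketch below is illustrative: the `newsletter?` method, its arguments (a header hash and a recipient list), and the constant names are assumptions; the header names and the 20-recipient cutoff come from the rule as stated.

```ruby
# Hypothetical helper; header names come from the rule above, the method shape is assumed.
BULK_PRECEDENCE = %w[bulk list junk].freeze
MAX_PERSONAL_RECIPIENTS = 20

# headers: Hash of header name => value; recipients: Array of addresses.
def newsletter?(headers, recipients)
  # Header names are case-insensitive, so normalize before looking them up.
  normalized = headers.transform_keys { |k| k.to_s.downcase }
  return true if normalized.key?("list-unsubscribe")
  return true if BULK_PRECEDENCE.include?(normalized["precedence"].to_s.strip.downcase)

  recipients.size > MAX_PERSONAL_RECIPIENTS
end
```

A message with exactly 20 recipients and no bulk headers is still treated as personal; the flag flips at 21.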

Do we ever create contacts from email data?

No. That was the failure mode of the earlier G Suite integration (see DR-02). If the sender/recipient doesn't resolve to an existing Contact, the InteractionEvent stays with contact_id = NULL and waits. An after_commit hook on Contact create backfills attributable events later.
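The lazy attribution step can be sketched as a pure function over staged events. This is only an illustration of the idea: `StagedEvent` and `backfill_events` are hypothetical names, and the real hook would run a scoped database update rather than iterate an array.

```ruby
# Hypothetical in-memory stand-in for an InteractionEvent row.
StagedEvent = Struct.new(:contact_email, :contact_id, keyword_init: true)

# Attribute staged (contact_id = nil) events to a newly created contact.
# Already-attributed events are left alone. Returns how many events were attributed.
def backfill_events(events, contact_id:, contact_emails:)
  wanted = contact_emails.map { |e| e.strip.downcase }
  matched = events.select do |ev|
    ev.contact_id.nil? && wanted.include?(ev.contact_email.strip.downcase)
  end
  matched.each { |ev| ev.contact_id = contact_id }
  matched.size
end
```

Calling `backfill_events` a second time with the same arguments attributes nothing new, so the hook is safe to retry.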

Does this work for Outlook / Microsoft 365 users?

Yes — building Microsoft Graph in parallel with Google is the whole point. The scoring engine is provider-agnostic: an Outlook user and a Gmail user who know the same contact produce indistinguishable signals.

Can an individual hide their own signals?

V1: no hide-toggle, but they can override their own strength up or down. Disconnecting their mailbox is the escape hatch — contributions fall to Cold/None.

Why compute a numeric score if we don't show it?

Two places it matters: sorting a Warm list by "warmest first", and picking the Key Connection when multiple team members are all Warm. V1 hides it per product call; V2 exposes it.

What are the thin-V1 ship criteria?

Phases 1–6 complete: both providers connected, 8 of 16 heuristic clauses active, reachability + key connection functional, UI renders signals on contact detail + list. Findem dedup (7) and user enrichment (8–9) arrive incrementally.

9. Open questions

Findem capability confirmations

F1 — person lookup by email: does the endpoint exist? Rate limits, cost per call, response schema, confidence scoring?
F3 — email-only enrichment: fallback when no handle is known?
F4 — user enrichment: any product/TOS concern enriching authenticated users vs. external contacts?
F6 — investor data: does Findem expose company investor / funding rounds?
LinkedIn degree: does Findem track 1st-degree LinkedIn connections? Long shot; would close the last V1 gap.

Team decisions before execution

Ship Slice A (thin V1, 50% heuristic activation) as the first customer-facing release, or wait for Slice C?
"Sustained at lower volume" threshold — recommend 8 of last 8 quarters continuous. Alternative: 6 of 8.
Reachability editable? Recommend no in V1 — it's a pure aggregation, editing would create drift.
Strength-override audit: when a user overrides, keep a record of what the rule engine would have said (for future rule tuning)?
Backfill horizon: 24 months default. Confirm.
Browser-extension LinkedIn capture: worth a short spike to see if 1st-degree data is already being scraped.

10. File references

Key existing Getro code to extend or mirror.

Backend

Purpose	Path
Google OAuth client	`backend/lib/google/oauth_client.rb`
People API client	`backend/app/services/google/people_client.rb`
GoogleAccount model	`backend/app/models/google_account.rb`
Reference syncer	`backend/app/services/contacts/import/google/contacts_syncer.rb`
Reference scheduler	`backend/app/workers/schedulers/contacts/import/google_contacts_daily_sync_scheduler.rb`
Contact model	`backend/app/models/contact.rb`
ContactEmail	`backend/app/models/contact_email.rb`
Dedup intake	`backend/app/services/contacts/contact_creator.rb`
Dedup lookup	`backend/app/services/contacts/existing_contact_finder.rb`
Merge with audit	`backend/app/services/contacts/merge_service.rb`
UCC (extend)	`backend/app/models/user_contact_collection.rb`
Findem base	`backend/lib/findem/client.rb`, `backend/lib/findem/apis/`
CircuitBox config	`backend/config/initializers/circuitbox.rb`

Admin portal

Purpose	Path
Integrations page (per-user)	`admin-portal/src/pages/Settings/integrations/`
Google OAuth flow hook	`admin-portal/src/pages/Settings/integrations/hooks/useIntegrationsPage.jsx`
Google RTK Query service	`admin-portal/src/services/userGoogleAccountsV2.js`
Path-card atoms (reuse for signals)	`admin-portal/src/pages/listDetail/networkConnections/components/pathCard/`
Contact detail page	`admin-portal/src/components/organisms/contactDetail/`
Contact list views	`admin-portal/src/pages/contactsExtended/`

11. Decision records

Short ADRs for the non-obvious choices landed during this spike. The "why we picked this" future maintainers will ask.

DR-01 · InteractionEvent as a single normalized store

Context: Two providers, two data kinds. A naive model would split into GmailInteraction, OutlookInteraction, GoogleCalendar, OutlookCalendar.
Decision: One InteractionEvent table; provider + kind are enum columns.
Consequences: Scoring is provider-agnostic by construction. Adding a third provider (e.g. Slack DMs later) is an enum value, not a new pipeline.
Alternatives: Per-provider tables — rejected, doubles the scoring code path and couples tier logic to source.

DR-02 · Findem as the dedup oracle; Contact model unchanged

Context: V1 G Suite integration created duplicates because Getro's Contact is Collection-scoped — same email in two collections is two rows.
Decision: Resolver passes every unresolved email through Findem's canonical-identity lookup, then matches to Getro Contact via globally-unique linkedin_handle first, then ContactEmail.
Consequences: No schema change to Contact. The identity problem moves outside Getro to Findem's graph.
Alternatives: Global Contact primary key (breaks Collection isolation); per-collection fallback lookup (doesn't close the cross-collection dup gap).

DR-03 · Findem called per unique email, not per event

Context: Per-event calls would cost thousands per user per day and trip rate limits.
Decision: EnrichedEmailLookup cache fronts every Findem call. TTL 90–180d positive, 7–30d negative.
Consequences: ~800 calls per user on backfill, <10/day steady state. Cache invalidation via Findem F7 profile-update webhook.

DR-04 · Two-layer scoring: rules for tier, weighted score for sort

Context: Team thread split on rules-vs-weights. Admins want explainable tiers; product wants within-tier ranking.
Decision: Rules assign the tier (first-match-wins). Weighted score computed independently, used only for within-tier sort + Key Connection tiebreak.
Consequences: V1 tiers are explainable ("you matched W1 and W4"). V2 exposes weights for admin tuning.
Alternatives: Pure weighted score with hard thresholds — rejected, not explainable.

DR-05 · Ship thin V1 first; upgrade in place

Context: Waiting for Findem confirmations blocks ship. Email + calendar alone cover 50% of clauses.
Decision: Phases 1–6 ship without Findem or user enrichment. Findem (Phase 7) and user enrichment (8–9) land incrementally, no data migration required.
Consequences: Dedup coverage is ~40–60% on Slice A vs. ~75–85% with Findem. The architecture is additive at every stage.

DR-06 · UCC extension, not a new strength table

Context: Strength is per (user, contact). UserContactCollection already models that link.
Decision: Add 4 nullable columns to UCC: strength_tier, strength_score, strength_override, strength_computed_at.
Consequences: No new join for every strength read. Migration is purely additive.
Alternatives: Separate UserContactStrength table. Rejected: it would be one-to-one with UCC, adding a join on every strength read with no offsetting benefit.

DR-07 · Reachability not user-editable in V1

Context: Reachability aggregates strength tiers. If users edit it, it drifts from its own inputs.
Decision: V1: strength is overridable (per-user), reachability is not. V2: revisit if users request it.
Consequences: Single source of truth is UCC strength. Reachability always derivable from current state.

DR-08 · 24-month backfill horizon on OAuth connect

Context: Unbounded backfill is slow and expensive. The "sustained 2-year" heuristic defines the practical signal floor.
Decision: Read last 24 months of email + calendar on first connect. Older history via explicit "load historical" button (future).
Consequences: First sync completes in minutes. Signals stabilize within 24h of connection.

DR-09 · Nightly rollup in addition to incremental updates

Context: Windowed counters (email_inbound_count_12mo, email_has_response_in_3mo) need to forget events as they age out. Incremental can add but can't drop.
Decision: Incremental path on event insert (O(1) bump). Nightly full rebuild sweeps the aging tail + reconciles drift.
Consequences: Two code paths for the same stats. Incremental is fast; nightly is safe. If incremental ever lags, nightly heals.